Introduction to Ontology Barry Smith August 11, 2012.


The problem of (big) data

Some questions
- How to find data?
- How to understand data when you find it?
- How to use data when you find it?
- How to integrate with other data?
- How to label the data you are collecting?
- How to build a set of labels for a new domain that will integrate well with labels used in neighboring domains?
Big problem: nearly all of this data is siloed.

Sources: examples of databases containing person data and data pertaining to skills

Db1 (two tables):
PersonID | SkillID

SkillID | Name | Description
222     | Java | Programming

Db2:
ID  | SkillDescr
333 | SQL

Db3:
EmplID | SkillName
444    | Java

The problem: many, many silos
- DoD spends more than $6B annually developing a portfolio of more than 2,000 business systems and Web services.
- These systems are poorly integrated: they deliver redundant capabilities, make data hard to access, foster error and waste, and prevent secondary uses of data.
Based on FY11 DoD Information Technology Portfolio Repository (DITPR) data.

One road to a solution: exploit the network effects of the Web
- You build a site.
- Others discover the site and link to it.
- The more they link to it, the more important and well known the page becomes (this is what Google exploits).
- Your page becomes important, and others begin to rely on it.
- Many people link to the data and use it.
- New ‘secondary uses’ of the data are discovered.
With thanks to Ivan Herman.

Unfortunately the Web is ruled by anarchy. However much we try to link Web content together à la Google, we will still be left with many, many silos. Photo credit: “nepatterson”, Flickr.

The idea of the Semantic Web
- To avoid silos, data must be available on the Web in a standard way.
- Use “ontologies” to capture common meanings with logical definitions that are understandable to both humans and computers, using a common language such as OWL (Web Ontology Language).

Annotate data using ontologies

Source Term         | Ontology Label
Db1.Name            | SE.Skill
Db2.SkillDescr      | SE.ComputerSkill
Db3.SkillName       | SE.ProgrammingSkill
Db1.PersonID        | SE.PersonID
Db2.ID              | SE.PersonID
Db3.EmplID          | SE.PersonID
SE.ComputerSkill    | SE.Skill
SE.ProgrammingSkill | SE.ComputerSkill

Inconsistent and idiosyncratic terms used in source data are associated with single preferred labels from ontologies.
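
The annotation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mapping and the subtype links are copied from the slide, but the function names, the reading of the last two table rows as child-to-parent links, and the sample rows (including person ID 111 and the pairing of values within a row) are hypothetical.

```python
# Source-specific column names mapped to preferred ontology labels
# (copied from the annotation table on the slide).
ANNOTATIONS = {
    "Db1.Name": "SE.Skill",
    "Db2.SkillDescr": "SE.ComputerSkill",
    "Db3.SkillName": "SE.ProgrammingSkill",
    "Db1.PersonID": "SE.PersonID",
    "Db2.ID": "SE.PersonID",
    "Db3.EmplID": "SE.PersonID",
}

# Assumption: the last two table rows are subtype links, read child -> parent.
IS_A = {
    "SE.ComputerSkill": "SE.Skill",
    "SE.ProgrammingSkill": "SE.ComputerSkill",
}


def is_subtype_of(label, ancestor):
    """True if label equals ancestor or reaches it by following IS_A links."""
    while label is not None:
        if label == ancestor:
            return True
        label = IS_A.get(label)
    return False


def annotate(source, row):
    """Re-key one source row by ontology label; unmapped columns are dropped."""
    out = {}
    for column, value in row.items():
        label = ANNOTATIONS.get(f"{source}.{column}")
        if label is not None:
            out[label] = value
    return out


def find_skills(rows, skill_type="SE.Skill"):
    """Yield (person_id, value) for every value whose label falls under skill_type."""
    for row in rows:
        for label, value in row.items():
            if is_subtype_of(label, skill_type):
                yield row.get("SE.PersonID"), value


# Hypothetical rows, one per source database, for illustration only.
rows = [
    annotate("Db1", {"PersonID": 111, "Name": "Java"}),
    annotate("Db2", {"ID": 333, "SkillDescr": "SQL"}),
    annotate("Db3", {"EmplID": 444, "SkillName": "Java"}),
]

all_skills = list(find_skills(rows))
programming_only = list(find_skills(rows, "SE.ProgrammingSkill"))
```

Once every row is keyed by ontology label, a single query for SE.Skill reaches all three sources, while a query for SE.ProgrammingSkill reaches only the source annotated at that level of detail.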

Where we stand today
- HTML demonstrated the power of the Web to allow sharing of information
- increasing availability of semantically enhanced data
- increasing power of semantic software to allow automatic reasoning over online information
- increasing use of OWL in attempts to break down silos and create useful integration of online data and information

Linked Open Data as of September 2010

Ontology success stories, and some reasons for failure: unfortunately, this data is not really linked.

The result: the more Semantic Technology is successful, the more it fails to achieve its goals
- The very success of the approach leads to the creation of ever new controlled vocabularies, that is, semantic silos, because multiple ontologies are being created in ad hoc ways.
- The Semantic Web framework as currently conceived yields minimal standardization.
- It creates semantic silos.

Basic Formal Ontology (BFO): top-level architecture used in over 120 ontology projects worldwide. Next tutorial in this series: August 18-19,

People will tell you, all you need is …
- XML gives you: processable tagging + syntactic interoperability
- RDF gives you: net-centricity (URIs for unique and consistent naming), linked data
- OWL (Web Ontology Language) gives you: RDF + semantic interoperability, richer logic

Levels of coordination
But these are just tools:
- they do not rule out stovepipes
- they do not prevent redundant efforts
- they do not imply high-quality ontologies of the sort that will support reasoning
Even if we all speak Irish, this does not mean that we all understand each other.

Warning 1. OWL implementation is not enough
- The issues we face are not only logical, but also sociological.
- They are the same issues already endemic in the database world: database architecture is inflexible; database systems, once distributed, degrade very quickly and create stovepipes, forking, silos …
How to ensure coordinated ontology development over time?

Suggested principles for an ontologist’s code of ethics
1. I hereby swear that I will reuse existing ontology content wherever possible.
2. I hereby swear that whenever I reuse terms from an existing ontology, I will keep their original source IDs.
3. I hereby swear that before releasing an ontology I will aggressively test it in multiple independent real-world applications.
4. I hereby swear that before committing a new term and definition to an ontology I will always think first.

Some governance principles
- Information sharing: to avoid ontology redundancy and inconsistency, there must be sharing of information at every stage.
- Collaborative development: where ontology development needs overlap, the communities involved must either develop shared resources or agree to a division of labor.
- Leverage of existing resources: ontology development should wherever possible involve reuse of existing ontologies.
- Guiding role of subject-matter experts, who should be involved in the construction and maintenance of all domain ontology content.

Warning 2. Ontology is a multi-disciplinary enterprise, in which the same terms are used in conflicting ways by different communities of ontologies: universal, type, kind, class; instance; concept, model; representation; datum.

The ontology spectrum (data focus)
- glossary: A simple list of terms and their definitions.
- data dictionary: Terms, definitions, naming conventions and representations of the data elements in a computer system.
- data model (e.g. JC3IEDM): Terms, definitions, naming conventions, representations and the beginning of specification of the relationships between data elements.
- taxonomy: A complete data model in an inheritance hierarchy where all data elements inherit their behaviors from a single "super data element".
- ontology: A complete, machine-readable specification of a conceptualization = conceptual data model.

The ontology spectrum (reality focus)
- glossary: A simple list of terms and their definitions.
- controlled vocabulary: A simple list of terms, definitions and naming conventions to ensure consistency.
- taxonomy: A controlled vocabulary in which the terms form a hierarchical representation of the types and subtypes of entities in a given domain. The hierarchy is organized by the is_a (subtype) relation.
- ontology: A controlled vocabulary organized by is_a and by further formally defined relations, for example part_of.

Figure: Foundational Model of Anatomy (FMA), a graph fragment with is_a and part_of edges. Node labels: Anatomical Structure, Anatomical Space, Organ, Organ Part, Organ Subdivision, Organ Component, Tissue, Serous Sac, Pleural Sac, Pleura (Wall of Sac), Parietal Pleura, Visceral Pleura, Mediastinal Pleura, Mesothelium of Pleura, Organ Cavity, Organ Cavity Subdivision, Serous Sac Cavity, Serous Sac Cavity Subdivision, Pleural Cavity, Interlobar recess.

In graph-theoretical terms: ontology components
- alphanumeric IDs form the nodes of the graph
- each node is associated with some single term (preferred label)
- relationships between nodes, such as is_a, form the edges of the graph
- definitions and synonyms are associated with each node
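
The graph view of an ontology maps naturally onto a small data structure. The sketch below is an assumption-laden illustration, not the format of any particular ontology tool: the class names, the `EX:` identifiers, the synonym and the specific edges are hypothetical (the node labels are borrowed from the FMA slide, but the edges shown on that slide cannot be recovered from this transcript).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str                 # alphanumeric ID: the identity of the node
    label: str                   # single preferred term
    definition: str = ""
    synonyms: list = field(default_factory=list)


class OntologyGraph:
    def __init__(self):
        self.nodes = {}          # node_id -> Node
        self.edges = []          # (subject_id, relation, object_id) triples

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, subject_id, relation, object_id):
        # Edges may only connect nodes that already exist in the graph.
        if subject_id not in self.nodes or object_id not in self.nodes:
            raise KeyError("both ends of an edge must be existing nodes")
        self.edges.append((subject_id, relation, object_id))

    def find(self, text):
        """Return the node whose label or synonym matches text, ignoring case."""
        wanted = text.lower()
        for node in self.nodes.values():
            names = [node.label] + node.synonyms
            if any(name.lower() == wanted for name in names):
                return node
        return None

    def targets(self, node_id, relation):
        """Labels of the nodes reached from node_id along edges of one relation."""
        return [self.nodes[o].label
                for s, r, o in self.edges
                if s == node_id and r == relation]


# Hypothetical IDs, synonym and edges, for illustration only.
g = OntologyGraph()
g.add_node(Node("EX:0001", "Pleural Sac"))
g.add_node(Node("EX:0002", "Pleural Cavity", synonyms=["pleural space"]))
g.add_node(Node("EX:0003", "Serous Sac"))
g.add_edge("EX:0001", "is_a", "EX:0003")
g.add_edge("EX:0002", "part_of", "EX:0001")

found = g.find("Pleural Space")
parents = g.targets("EX:0001", "is_a")
```

Keying everything on the ID rather than on the label means a preferred label can be corrected, or a synonym added, without breaking any edge or any external data annotated with that ID.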

Entity =def. anything which exists, including things and processes, functions and qualities, beliefs and actions, documents and software

instances | universals
A515287   | DC3300 Dust Collector Fan
B521683   | Gilmer Belt
C521682   | Motor Drive Belt

Catalog vs. inventory: ontology vs. list of items in your warehouse
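
The catalog/inventory contrast can be made concrete with the three items from the previous slide. This is a minimal sketch that assumes the codes A515287, B521683 and C521682 identify individual items; the function names and the extra serial number are hypothetical.

```python
# Catalog: the universals (kinds of item), each listed exactly once.
CATALOG = {"DC3300 Dust Collector Fan", "Gilmer Belt", "Motor Drive Belt"}

# Inventory: the instances, each identified individually and typed by a catalog entry.
INVENTORY = {
    "A515287": "DC3300 Dust Collector Fan",
    "B521683": "Gilmer Belt",
    "C521682": "Motor Drive Belt",
}


def instances_of(universal, inventory):
    """Identifiers of all items in the inventory that instantiate the universal."""
    return sorted(item for item, kind in inventory.items() if kind == universal)


def untyped_items(inventory, catalog):
    """Items whose stated kind is missing from the catalog."""
    return sorted(item for item, kind in inventory.items() if kind not in catalog)


# Receiving a second belt changes the inventory but leaves the catalog untouched.
inventory = dict(INVENTORY)
inventory["B521684"] = "Gilmer Belt"   # hypothetical serial number

gilmer_belts = instances_of("Gilmer Belt", inventory)
problems = untyped_items(inventory, CATALOG)
```

The ontology corresponds to `CATALOG`: it grows only when a new kind of thing is recognized, not when another item of a known kind arrives.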

Warning 3. Do not confuse things with words and ideas
- Level 1: the entities in reality, both instances and universals
- Level 2: cognitive representations of this reality on the part of scientists ...
- Level 3: publicly accessible concretizations of these cognitive representations in textual and graphical artifacts

Ontology development
- starts with: Level 2 = the cognitive representations of practitioners or researchers in the relevant domain
- results in: Level 3 representational artifacts (comparable to maps, science texts, dictionaries)

Domain =def. a portion of reality that forms the subject matter of a single science or technology or mode of study; for example: proteomics, HIV, demographics ...

Representation =def. an image, idea, map, picture, name or description ... of some entity or entities
Two kinds of representation:
- analogue (photographs)
- digital/composite/syntactically structured

Class =def. a maximal collection of particulars referred to by a general term. The class A =def. the collection of all particular A’s, where ‘A’ is a general term (e.g. ‘brother of Elvis fan’, ‘cell’). Classes are on the same level as the instances which they contain.

(Scientific) Ontology =def. a representational artifact whose representational units (which may be drawn from a natural or from some formalized language) are intended to represent
1. universals in reality
2. those relations between these universals which obtain universally (= for all instances)
Examples: lung is_a anatomical structure; lobe of lung part_of lung
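
What it means for a relation between universals to obtain "for all instances" can be checked against instance-level data. The sketch below assumes one common reading, in which "A part_of B" holds when every instance of A is part of some instance of B. The function name and the sample instances are hypothetical.

```python
def holds_universally(relation_pairs, instance_of, a, b):
    """Check 'a REL b' between universals: every instance of a must stand
    in the relation to at least one instance of b.

    Note: the check is vacuously true when a has no recorded instances.
    """
    a_instances = [x for x, kind in instance_of.items() if kind == a]
    b_instances = {x for x, kind in instance_of.items() if kind == b}
    return all(
        any((x, y) in relation_pairs for y in b_instances)
        for x in a_instances
    )


# Hypothetical instance-level data: which particular is of which kind,
# and which particular is part of which other particular.
instance_of = {
    "lobe_1": "lobe of lung",
    "lobe_2": "lobe of lung",
    "lung_1": "lung",
}
part_of = {("lobe_1", "lung_1"), ("lobe_2", "lung_1")}

complete = holds_universally(part_of, instance_of, "lobe of lung", "lung")

# A recorded lobe with no recorded lung is a counterexample in the data.
instance_of["lobe_3"] = "lobe of lung"
with_counterexample = holds_universally(part_of, instance_of, "lobe of lung", "lung")
```

A single counterexample at the instance level is enough to defeat the universal-level assertion, which is why such assertions belong in the ontology only when they hold without exception in reality, or when the exceptions are data errors to be fixed.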

Ontology (science): the science of the kinds and structures of objects, properties, events, processes and relations in every domain of reality