Ontology and the Semantic Web Barry Smith August 26, 2013 1.

Presentation transcript:

Ontology and the Semantic Web Barry Smith August 26, 2013

Ontologies are computer-tractable representations of types in specific areas of reality. They are more and less general (upper and lower ontologies): upper = organizing ontologies; lower = domain ontologies. 2

Foundational Model of Anatomy (FMA). [Diagram: anatomical terms linked by is_a and part_of relations. Terms shown: Anatomical Structure, Anatomical Space, Organ, Organ Part, Organ Subdivision, Organ Component, Organ Cavity, Organ Cavity Subdivision, Tissue, Serous Sac, Serous Sac Cavity, Serous Sac Cavity Subdivision, Pleural Sac, Pleural Cavity, Pleura (Wall of Sac), Parietal Pleura, Visceral Pleura, Mediastinal Pleura, Mesothelium of Pleura, Interlobar recess.] 3

ontologies = standardized labels designed for use in annotations to make the data cognitively accessible to human beings and algorithmically accessible to computers 4

Ontologies facilitate retrieval of data by allowing grouping of annotations. Annotation counts: brain 20, hindbrain 15, rhombomere 10. Query 'brain' without ontology: 20. Query 'brain' with ontology: 45. 5
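The counts on this slide can be reproduced with a small sketch. It assumes, for illustration only, that rhombomere sits under hindbrain and hindbrain under brain in a single hierarchy; the names PARENT, ANNOTATIONS and both query functions are hypothetical and not part of any ontology tool.

```python
# Hypothetical mini-hierarchy: child term -> parent term.
# (is_a and part_of are collapsed into one link for brevity.)
PARENT = {"hindbrain": "brain", "rhombomere": "hindbrain"}

# Annotation counts as given on the slide.
ANNOTATIONS = {"brain": 20, "hindbrain": 15, "rhombomere": 10}


def descendants(term):
    """Return the term together with every term beneath it in the hierarchy."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in PARENT.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def query_without_ontology(term):
    """Count only the annotations that use exactly this label."""
    return ANNOTATIONS.get(term, 0)


def query_with_ontology(term):
    """Count annotations on the term and on all of its descendants."""
    return sum(ANNOTATIONS.get(t, 0) for t in descendants(term))


print(query_without_ontology("brain"))  # 20
print(query_with_ontology("brain"))     # 45
```

The gain comes entirely from the hierarchy: the data are unchanged, but the query is expanded to the terms grouped under 'brain'.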

ontologies = high quality controlled structured vocabularies used for the annotation (description, tagging) of data, images, emails, documents, … 6

Ontology’s greatest successes are around net-centricity. You build a site. Others discover the site and they link to it. The more they link, the better known the page becomes (Google …). Your data becomes discoverable. Your data becomes more easily discoverable the more you use common vocabularies. 7

The roots of Semantic Technology: 1. Each group creates a controlled vocabulary of the terms commonly used in its domain, and creates an ontology out of these terms using OWL (Web Ontology Language) syntax. 2. It binds this ontology to its data and makes these data available on the Web. 3. The ontologies are linked, e.g. through their use of some common terms. 4. These links create links among all the datasets, thereby creating a ‘web of data’. 5. We can all share the same tags; they are called internet addresses. 8
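How shared tags turn separate datasets into a 'web of data' can be sketched as follows. This is a toy illustration, not OWL tooling: the dataset names and the example.org addresses are invented, and each dataset is reduced to the set of term addresses used to tag it.

```python
from itertools import combinations

# Hypothetical datasets, each reduced to the set of term addresses used to tag it.
DATASET_TERMS = {
    "dataset_a": {"http://example.org/onto/brain", "http://example.org/onto/neuron"},
    "dataset_b": {"http://example.org/onto/neuron", "http://example.org/onto/protein"},
    "dataset_c": {"http://example.org/onto/galaxy"},
}


def links(dataset_terms):
    """Map each pair of datasets to the tags they share, omitting unlinked pairs."""
    result = {}
    for (name_a, terms_a), (name_b, terms_b) in combinations(sorted(dataset_terms.items()), 2):
        shared = terms_a & terms_b
        if shared:
            result[(name_a, name_b)] = shared
    return result


for pair, shared in links(DATASET_TERMS).items():
    print(pair, sorted(shared))
```

Here dataset_a and dataset_b become linked through one common term, while dataset_c, which uses a vocabulary of its own, stays isolated.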

Audio Features Ontology 9

Where we stand today: increasing availability of semantically enhanced data and semantic software; increasing use of OWL (Web Ontology Language) in attempts to create useful integration of on-line data and information; “Linked Open Data” the New Big Thing. 11

as of September

The problem: the more this sort of Semantic Technology is successful, the more it fails. The original idea was to break down silos via common controlled vocabularies for the tagging of data. The very success of the approach leads to the creation of ever new controlled vocabularies (semantic silos) as ever more ontologies are created in ad hoc ways. Every organization and sub-organization now wants to have its own “ontology”. The Semantic Web framework as currently conceived and governed by the W3C yields minimal standardization. 13

Divided we fail 14

United we also fail 15

The problem: many, many silos. DoD spends more than $6B annually developing a portfolio of more than 2,000 business systems and Web services. These systems are poorly integrated; deliver redundant capabilities; make data hard to access; foster error and waste; prevent secondary uses of data. Based on FY11 Defense Information Technology Repository (DITPR) data. 16

what is missing here 17

Syntactic and semantic interoperability. Syntactic interoperability = systems can exchange messages (realized by XML). Semantic interoperability = messages are interpreted in the same way by senders and receivers. In UCore, meanings are specified via natural-language strings. Experience shows that this is not a viable route to achieving semantic interoperability. 18
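The difference between the two kinds of interoperability can be made concrete with a small sketch. Everything in it is assumed for illustration: the lexicons, the word 'tank', and the identifiers EX:0001 and EX:0002 are invented, and JSON stands in for the XML mentioned on the slide. Both systems parse the message successfully, yet they agree on its meaning only when the tag comes from a shared ontology.

```python
import json

# Hypothetical local lexicons: each system attaches its own meaning to the same string.
SYSTEM_A_LEXICON = {"tank": "armoured fighting vehicle"}
SYSTEM_B_LEXICON = {"tank": "liquid storage container"}

# Hypothetical shared ontology: invented identifiers, one agreed meaning each.
SHARED_ONTOLOGY = {
    "EX:0001": "armoured fighting vehicle",
    "EX:0002": "liquid storage container",
}


def interpret(raw_message, lexicon):
    """Parse the message (syntactic step), then look up its meaning (semantic step)."""
    payload = json.loads(raw_message)
    return lexicon[payload["observed"]]


def same_interpretation(raw_message, sender_lexicon, receiver_lexicon):
    """True if sender and receiver assign the same meaning to the message."""
    return interpret(raw_message, sender_lexicon) == interpret(raw_message, receiver_lexicon)


string_message = json.dumps({"observed": "tank"})     # natural-language string
tagged_message = json.dumps({"observed": "EX:0001"})  # shared ontology identifier

print(same_interpretation(string_message, SYSTEM_A_LEXICON, SYSTEM_B_LEXICON))  # False
print(same_interpretation(tagged_message, SHARED_ONTOLOGY, SHARED_ONTOLOGY))    # True
```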

How to avoid the problem of semantic silos: Distributed Development of a Shared Semantic Resource. Pilot testing to demonstrate feasibility for I2WD. 19

An alternative solution: Semantic Enhancement. A distributed incremental strategy of coordinated annotation: data remain in their original state (are treated at ‘arm’s length’); data are ‘tagged’ using interoperable ontologies created in tandem; allows flexible response to new needs, adjustable in real time; can be as complete as needed, lossless, long-lasting because flexible and responsive; big bang for buck, with measurable benefit even from first small investments. The strategy works only to the degree that it rests on shared governance and training. 20

compare: legends for maps 21

compare: legends for maps. Common legends allow (cross-border) integration. 22

The Gene Ontology MouseEcotope GlyProt DiabetInGene GluChem sphingolipid transporter activity 23

The Gene Ontology MouseEcotope GlyProt DiabetInGene GluChem Holliday junction helicase complex 24

The Gene Ontology MouseEcotope GlyProt DiabetInGene GluChem sphingolipid transporter activity 25

Common legends: help human beings use and understand complex representations of reality; help human beings create useful complex representations of reality; help computers process complex representations of reality; help glue data together. But common legends serve these purposes only if the legends are developed in a coordinated, non-redundant fashion. 26

International System of Units 27

The Open Biomedical Ontologies (OBO) Foundry. Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows (GRANULARITY): ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO). CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO). MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO). 28

Rationale of OBO Foundry coverage. Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows (GRANULARITY): ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Organism-Level Process (GO). CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO); Cellular Process (GO). MOLECULE: Molecule (ChEBI, SO, RNAO, PRO); Molecular Function (GO); Molecular Process (GO). 29

Population-level ontologies. Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows (GRANULARITY): COMPLEX OF ORGANISMS: Family, Community, Deme, Population; Population Phenotype; Population Process. ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO). CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO). MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO). 30

Environment Ontology: environments. Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows (GRANULARITY): ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO). CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO). MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO). 31

32 Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows (GRANULARITY): COMPLEX OF ORGANISMS: Family, Community, Deme, Population; Population Phenotype; Population Process. ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO). CELL AND CELLULAR COMPONENT: Cell (CL); Cell Component (FMA, GO); Cellular Function (GO). MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO). ENVIRONMENT

33 Column (RELATION TO TIME): INDEPENDENT CONTINUANT. Rows (GRANULARITY): COMPLEX OF ORGANISMS: Family, Community, Deme, Population; Environment of population. ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Environment of single organism. CELL AND CELLULAR COMPONENT: Cell (CL); Cell Component (FMA, GO); Environment of cell. MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular environment. ENVIRONMENT

The OBO Foundry is based on the idea of annotation = semantic enhancement of data across all of biology. $200 million spent so far on using the GO to annotate (tag) biomedical research data through the manual effort of PhD biologists. 34

OBO Foundry approach extended into other domains. NIF Standard: Neuroscience Information Framework. ISF Ontologies: Integrated Semantic Framework. OGMS and Extensions: Ontology for General Medical Science. IDO Consortium: Infectious Disease Ontology. cROP: Common Reference Ontologies for Plants. 35

What these annotations do: make data retrievable even by those not involved in their creation; allow integration of data deriving from heterogeneous sources; break down the walls of roach motels. 36

Benefits of the Approach: Does not interfere with the source content. Enables content to evolve in a cumulative fashion as it accommodates new kinds of data. Does not depend on the data resources and can be developed independently from them in an incremental and distributed fashion. Provides a more consistent, homogeneous, and well-articulated presentation of the content which originates in multiple internally inconsistent and heterogeneous systems. 37

Benefits of the Approach: Makes management and exploitation of the content more cost-effective. Allows graceful integration with other government initiatives and brings the system closer to the federally mandated net-centric data strategy. Creates incrementally an integrated content that is effectively searchable and that provides content to which more powerful analytics can be applied. 38

Building the Shared Semantic Resource: Methodology of distributed incremental development; Training; Governance; Common Architecture of Ontologies to support consistency, non-redundancy, modularity: Upper Level Ontology (BFO), Mid-Level Ontologies, Low Level Ontologies. 39

Goal: To realize Horizontal Integration (HI) of intelligence data. HI =Def. the ability to exploit multiple data sources as if they are one. Problem: the data coming onstream are out of our control. Any strategy for HI must be agile in the sense that it can be quickly extended to new zones of emerging data according to need. 40

I2WD Strategy: Create an agile strategy for building ontologies within a Shared Semantic Resource (SSR), and apply and extend these ontologies to annotate new source data as they come onstream. Problem: Given the immense and growing variety of data sources, the development methodology must be applied by multiple different groups. How to manage collaboration? 41

Why do large-scale ontology projects fail? Focus on vocabularies, lexicons, with no logical structure; no attention to life cycle. Failure of housekeeping yields redundancy and therefore forking. The same data is annotated in different ways by users of different ontology fragments. Data is siloed as before. HOW TO BUILD THE NEEDED LOGIC INTO THE ARCHITECTURE OF THE ONTOLOGIES? 42

Examples of Principles: All terms in all ontologies should be singular nouns. The same relations between terms should be reused in every ontology. Reference ontologies should be based on single inheritance. All definitions should be of the form: an S =Def. a G which Ds, where ‘G’ (for: genus) is the parent term of S (for: species) in the corresponding reference ontology.
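Two of these principles lend themselves to a mechanical check. The sketch below is one possible reading, not an OBO Foundry tool: the Term class, the check function and the example terms are all hypothetical. It builds a definition of the form 'an S =Def. a G which Ds' from a term's single parent and flags terms that have more than one parent.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical ontology term: label, differentia, and is_a parents."""
    label: str
    differentia: str = ""
    parents: list = field(default_factory=list)


def article(noun):
    """Pick 'a' or 'an' from the first letter (a rough heuristic)."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def check(term):
    """Return the list of principle violations found for this term."""
    problems = []
    if len(term.parents) > 1:
        problems.append("multiple parents: violates single inheritance")
    if term.parents and not term.differentia:
        problems.append("missing differentia")
    return problems


def definition(term):
    """Build 'an S =Def. a G which Ds', where G is the term's single parent."""
    if len(term.parents) != 1 or check(term):
        raise ValueError(f"cannot define {term.label!r}: {check(term) or 'no parent'}")
    genus = term.parents[0]
    return f"{article(term.label)} {term.label} =Def. {article(genus)} {genus} which {term.differentia}"


# Illustrative terms only; they are not taken from any reference ontology.
triangle = Term("triangle", "has exactly three sides", ["polygon"])
bad = Term("amphibious vehicle", "moves on land and water", ["land vehicle", "watercraft"])

print(definition(triangle))  # a triangle =Def. a polygon which has exactly three sides
print(check(bad))            # ['multiple parents: violates single inheritance']
```

Because the genus is read off the hierarchy rather than typed in separately, the definition and the is_a link cannot drift apart.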

Extension Strategy + Modular Organization. Top level: Basic Formal Ontology (BFO). Mid-level: Information Artifact Ontology (IAO); Ontology for Biomedical Investigations (OBI); Spatial Ontology (BSPO). Domain level: Anatomy Ontology (FMA*, CARO); Environment Ontology (EnvO); Infectious Disease Ontology (IDO*); Biological Process Ontology (GO*); Cell Ontology (CL); Cellular Component Ontology (FMA*, GO*); Phenotypic Quality Ontology (PaTO); Subcellular Anatomy Ontology (SAO); Sequence Ontology (SO*); Molecular Function (GO*); Protein Ontology (PRO*). 44

Ontologies are built as orthogonal modules which form an incrementally evolving network. Scientists are motivated to commit to developing ontologies because they will need, in their own work, ontologies that fit into this network. Users are motivated by the assurance that the ontologies they turn to are maintained by experts. 45

More benefits of orthogonality: helps those new to ontology to find what they need; to find models of good practice; ensures mutual consistency of ontologies (trivially), and thereby ensures additivity of annotations. 46

More benefits of orthogonality: No need to reinvent the wheel for each new domain. Can profit from storehouse of lessons learned. Can more easily reuse what is made by others. Can more easily reuse training. Can more easily inspect and criticize results of others’ work. Leads to innovations (e.g. MIREOT, Ontofox) in strategies for combining ontologies. 47