Published by Oscar Snow. Modified over 9 years ago.
1
Ontology and the Semantic Web
Barry Smith
August 26, 2013
2
Ontologies
are computer-tractable representations of types in specific areas of reality
are more and less general (upper and lower ontologies)
– upper = organizing ontologies
– lower = domain ontologies
3
Foundational Model of Anatomy (FMA)
[Figure: graph of part_of and is_a links among the terms Pleural Cavity, Interlobar recess, Mesothelium of Pleura, Pleura (Wall of Sac), Visceral Pleura, Pleural Sac, Parietal Pleura, Mediastinal Pleura, Anatomical Space, Organ Cavity, Serous Sac Cavity, Anatomical Structure, Organ, Serous Sac, Tissue, Organ Part, Organ Subdivision, Organ Component, Organ Cavity Subdivision, Serous Sac Cavity Subdivision]
4
ontologies = standardized labels designed for use in annotations to make the data cognitively accessible to human beings and algorithmically accessible to computers
5
Ontologies facilitate retrieval of data
by allowing grouping of annotations
Annotation counts: brain 20, hindbrain 15, rhombomere 10
Query brain without ontology: 20
Query brain with ontology: 45
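The arithmetic on this slide can be reproduced with a small sketch. The counts are the slide's own; the child-to-parent links (rhombomere under hindbrain, hindbrain under brain) and all function names are assumptions made for illustration, since the slide does not name the relation it uses.

```python
# Annotation counts per term, taken from the slide.
direct_counts = {"brain": 20, "hindbrain": 15, "rhombomere": 10}

# Child -> parent links. Assumed here; the slide does not state the relation.
parent_of = {"hindbrain": "brain", "rhombomere": "hindbrain"}


def descendants(term, parent_of):
    """Return the term plus every term that reaches it by following parent links."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in parent_of.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def count_without_ontology(term):
    """Count only annotations made with exactly this term."""
    return direct_counts.get(term, 0)


def count_with_ontology(term):
    """Count annotations made with this term or any term grouped under it."""
    return sum(direct_counts.get(t, 0) for t in descendants(term, parent_of))


print(count_without_ontology("brain"))  # 20
print(count_with_ontology("brain"))     # 45
```

The second query returns more data only because the hierarchy tells the system that hindbrain and rhombomere annotations are also about the brain.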
6
ontologies = high quality controlled structured vocabularies used for the annotation (description, tagging) of data, images, emails, documents, …
7
Ontology’s greatest successes around net-centricity
You build a site
Others discover the site and they link to it
The more they link, the more well known the page becomes (Google …)
Your data becomes discoverable
Your data becomes more easily discoverable the more you use common vocabularies
8
The roots of Semantic Technology
1. Each group creates a controlled vocabulary of the terms commonly used in its domain, and creates an ontology out of these terms using OWL (Web Ontology Language) syntax
4. Binds this ontology to its data and makes these data available on the Web
5. The ontologies are linked e.g. through their use of some common terms
6. These links create links among all the datasets, thereby creating a ‘web of data’
7. We can all share the same tags – they are called internet addresses
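A minimal sketch of how shared tags link datasets, using plain Python tuples in place of real RDF or OWL tooling. Every address, dataset name and function name below is a hypothetical placeholder; none comes from the presentation.

```python
# Hypothetical base address for a shared ontology.
ONT = "http://example.org/ontology/"

# Each dataset is a set of (subject, predicate, object) triples.
dataset_a = {
    ("http://example.org/a/sample1", ONT + "located_in", ONT + "Hindbrain"),
}
dataset_b = {
    ("http://example.org/b/image7", ONT + "depicts", ONT + "Hindbrain"),
    ("http://example.org/b/image8", ONT + "depicts", ONT + "Pleura"),
}


def ontology_terms(dataset):
    """Collect the ontology addresses a dataset uses as tags."""
    return {obj for (_, _, obj) in dataset if obj.startswith(ONT)}


def links_between(a, b):
    """Pair up records from two datasets that are tagged with the same address."""
    shared = ontology_terms(a) & ontology_terms(b)
    return sorted(
        (subj_a, term, subj_b)
        for (subj_a, _, term) in a
        for (subj_b, _, term_b) in b
        if term == term_b and term in shared
    )


print(links_between(dataset_a, dataset_b))
```

Neither group has to know about the other: the link between sample1 and image7 exists only because both were tagged with the same address.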
9
Audio Features Ontology
10
11
Where we stand today
increasing availability of semantically enhanced data and semantic software
increasing use of OWL (Web Ontology Language) in attempts to create useful integration of on-line data and information
“Linked Open Data” the New Big Thing
12
as of September 2010
13
The problem: the more this sort of Semantic Technology is successful, the more it fails
The original idea was to break down silos via common controlled vocabularies for the tagging of data
The very success of the approach leads to the creation of ever new controlled vocabularies – semantic silos – as ever more ontologies are created in ad hoc ways
Every organization and sub-organization now wants to have its own “ontology”
The Semantic Web framework as currently conceived and governed by the W3C yields minimal standardization
14
Divided we fail
15
United we also fail
16
The problem: many, many silos
DoD spends more than $6B annually developing a portfolio of more than 2,000 business systems and Web services
these systems are poorly integrated
deliver redundant capabilities, make data hard to access, foster error and waste
prevent secondary uses of data
https://ditpr.dod.mil/
Based on FY11 Defense Information Technology Repository (DITPR) data
17
what is missing here
18
Syntactic and semantic interoperability
Syntactic interoperability = systems can exchange messages (realized by XML).
Semantic interoperability = messages are interpreted in the same way by senders and receivers.
In UCore, meanings are specified via natural-language strings.
Experience shows that this is not a viable route to achieving semantic interoperability.
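The difference between the two kinds of interoperability can be illustrated with a toy message. This is a sketch under assumptions: the XML layout, the unit identifiers and the function name are all invented for the example and do not reflect UCore or any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared identifiers for units of speed.
KMH = "http://example.org/unit/kilometre-per-hour"
MPH = "http://example.org/unit/mile-per-hour"
TO_KMH = {KMH: 1.0, MPH: 1.609344}


def read_speed_kmh(xml_text, assumed_unit):
    """Parse a message and return the speed in km/h.

    If the message carries no unit identifier, the receiver falls back
    on its own local assumption about what the number means.
    """
    speed = ET.fromstring(xml_text).find("speed")
    unit = speed.get("unit", assumed_unit)
    return float(speed.text) * TO_KMH[unit]


untagged = "<track><speed>60</speed></track>"
tagged = f'<track><speed unit="{MPH}">60</speed></track>'

# Both receivers can parse the untagged message (syntactic interoperability),
# but they read different meanings into it.
print(read_speed_kmh(untagged, assumed_unit=KMH))
print(read_speed_kmh(untagged, assumed_unit=MPH))

# With a shared identifier in the message, both receivers agree.
print(read_speed_kmh(tagged, assumed_unit=KMH) == read_speed_kmh(tagged, assumed_unit=MPH))
```

A free-text note such as "speed of the vehicle" would not settle the disagreement in the untagged case, which is the slide's point about natural-language strings.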
19
How to avoid the problem of semantic silos
Distributed Development of a Shared Semantic Resource
Pilot testing to demonstrate feasibility for I2WD
20
An alternative solution: Semantic Enhancement
A distributed incremental strategy of coordinated annotation
– data remain in their original state (are treated at ‘arm’s length’)
– ‘tagged’ using interoperable ontologies created in tandem
– allows flexible response to new needs, adjustable in real time
– can be as complete as needed, lossless, long-lasting because flexible and responsive
– big bang for buck – measurable benefit even from first small investments
The strategy works only to the degree that it rests on shared governance and training
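One way to picture the ‘arm’s length’ idea is an annotation layer held separately from the source records. The sketch below is illustrative only: the two sources, their field names, the term ONT:Truck and the function names are hypothetical, not taken from the presentation.

```python
# Two source datasets with different local field names and values.
# They are never modified by the annotation process.
source_a = [{"id": "a1", "veh_type": "truck"}]
source_b = [{"id": "b1", "vehicleKind": "lorry"}]

# Separate annotation layer: (source, field, value) -> shared ontology term.
annotations = {
    ("A", "veh_type", "truck"): "ONT:Truck",
    ("B", "vehicleKind", "lorry"): "ONT:Truck",
}


def tags_for(source_name, record):
    """Look up the ontology terms that annotate a record, without changing it."""
    return {
        annotations[(source_name, field, value)]
        for field, value in record.items()
        if (source_name, field, value) in annotations
    }


def query(term, sources):
    """Find records in every source that are tagged with the given term."""
    hits = []
    for name, records in sources.items():
        for record in records:
            if term in tags_for(name, record):
                hits.append((name, record["id"]))
    return hits


print(query("ONT:Truck", {"A": source_a, "B": source_b}))
```

New sources can be covered by adding entries to the annotation layer, which is what makes the approach incremental; the source records themselves stay untouched.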
21
compare: legends for maps
22
compare: legends for maps
common legends allow (cross-border) integration
23
[Figure: The Gene Ontology shown with the resources MouseEcotope, GlyProt, DiabetInGene, GluChem; term shown: sphingolipid transporter activity]
24
[Figure: The Gene Ontology shown with the resources MouseEcotope, GlyProt, DiabetInGene, GluChem; term shown: Holliday junction helicase complex]
25
[Figure: The Gene Ontology shown with the resources MouseEcotope, GlyProt, DiabetInGene, GluChem; term shown: sphingolipid transporter activity]
26
Common legends
help human beings use and understand complex representations of reality
help human beings create useful complex representations of reality
help computers process complex representations of reality
help glue data together
But common legends serve these purposes only if the legends are developed in a coordinated, non-redundant fashion
27
International System of Units
28
The Open Biomedical Ontologies (OBO) Foundry
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT) by GRANULARITY
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
29
rationale of OBO Foundry coverage
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT) by GRANULARITY
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO); Cellular Process (GO)
MOLECULE: Molecule (ChEBI, SO, RNAO, PRO); Molecular Function (GO); Molecular Process (GO)
30
Population-level ontologies
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT) by GRANULARITY
COMPLEX OF ORGANISMS: Family, Community, Deme, Population; Organ Function (FMP, CPRO); Population Phenotype; Population Process
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Phenotypic Quality (PaTO); Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
31
Environment Ontology
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT) by GRANULARITY
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
environments
32
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT) by GRANULARITY
COMPLEX OF ORGANISMS: Family, Community, Deme, Population; Organ Function (FMP, CPRO); Population Phenotype; Population Process
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); (FMA, CARO); Phenotypic Quality (PaTO); Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cell Component (FMA, GO); Cellular Function (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
ENVIRONMENT
http://obofoundry.org
33
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT) by GRANULARITY
COMPLEX OF ORGANISMS: Family, Community, Deme, Population; Environment of population
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); (FMA, CARO); Environment of single organism
CELL AND CELLULAR COMPONENT: Cell (CL); Cell Component (FMA, GO); Environment of cell
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular environment
ENVIRONMENT
http://obofoundry.org
34
The OBO Foundry
based on the idea of annotation = semantic enhancement of data across all of biology
$200 million spent so far on using the GO to annotate (tag) biomedical research data through manual effort of PhD biologists
35
OBO Foundry approach extended into other domains
NIF Standard: Neuroscience Information Framework
ISF Ontologies: Integrated Semantic Framework
OGMS and Extensions: Ontology for General Medical Science
IDO Consortium: Infectious Disease Ontology
cROP: Common Reference Ontologies for Plants
36
What these annotations do
make data retrievable even by those not involved in their creation
allow integration of data deriving from heterogeneous sources
break down the walls of roach motels
37
Benefits of the Approach
Does not interfere with the source content
Enables content to evolve in a cumulative fashion as it accommodates new kinds of data
Does not depend on the data resources and can be developed independently from them in an incremental and distributed fashion
Provides a more consistent, homogeneous, and well-articulated presentation of the content which originates in multiple internally inconsistent and heterogeneous systems
38
Benefits of the Approach
Makes management and exploitation of the content more cost-effective
Allows graceful integration with other government initiatives and brings the system closer to the federally mandated net-centric data strategy
Creates incrementally an integrated content that is effectively searchable and that provides content to which more powerful analytics can be applied
39
Building the Shared Semantic Resource
Methodology of distributed incremental development
Training
Governance
Common Architecture of Ontologies to support consistency, non-redundancy, modularity
– Upper Level Ontology (BFO)
– Mid-Level Ontologies
– Low Level Ontologies
40
Goal: To realize Horizontal Integration (HI) of intelligence data
HI = Def. the ability to exploit multiple data sources as if they are one
Problem: the data coming onstream are out of our control
Any strategy for HI must be agile in the sense that it can be quickly extended to new zones of emerging data according to need
41
I2WD Strategy
Create an agile strategy for building ontologies within a Shared Semantic Resource (SSR) and apply and extend these ontologies to annotate new source data as they come onstream
– Problem: Given the immense and growing variety of data sources, the development methodology must be applied by multiple different groups
– How to manage collaboration?
42
Why do large-scale ontology projects fail?
focus on vocabularies, lexicons, with no logical structure, no attention to life cycle
failure of housekeeping yields redundancy and therefore forking
the same data is annotated in different ways by users of different ontology fragments
data is siloed as before
– HOW TO BUILD THE NEEDED LOGIC INTO THE ARCHITECTURE OF THE ONTOLOGIES?
43
Examples of Principles
All terms in all ontologies should be singular nouns
Same relations between terms should be reused in every ontology
Reference ontologies should be based on single inheritance
All definitions should be of the form
an S = Def. a G which Ds
where ‘G’ (for: genus) is the parent term of S (for: species) in the corresponding reference ontology
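Two of these principles, single inheritance and the definition form, are mechanical enough to check automatically. The following is a minimal sketch, assuming a term is stored as a label, a list of parents and a definition string; the example terms and their definitions are toy illustrations written for this sketch, not actual FMA entries, and the function name is hypothetical.

```python
import re

# Pattern for "an S = Def. a G which Ds".
DEFINITION = re.compile(
    r"^(?:a|an) (?P<species>.+?) = Def\. (?:a|an) (?P<genus>.+?) which (?P<differentia>.+)$"
)


def check_term(label, parents, definition):
    """Return a list of principle violations for one term (empty if none)."""
    problems = []
    if len(parents) > 1:
        problems.append("more than one parent (single inheritance violated)")
    match = DEFINITION.match(definition)
    if match is None:
        problems.append("definition is not of the form 'an S = Def. a G which Ds'")
    else:
        if match.group("species") != label:
            problems.append("defined term does not match the label")
        if match.group("genus") not in parents:
            problems.append("genus is not the parent term")
    return problems


# Toy terms, invented for illustration.
good = check_term(
    "serous sac",
    ["organ"],
    "a serous sac = Def. an organ which encloses a serous cavity",
)
bad = check_term(
    "pleural sac",
    ["serous sac", "organ part"],
    "the sac around the lung",
)

print(good)  # []
print(bad)
```

The check on singular nouns and on reuse of relations would need linguistic or cross-ontology information and is left out of this sketch.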
44
Extension Strategy + Modular Organization
top level: Basic Formal Ontology (BFO)
mid-level: Information Artifact Ontology (IAO); Ontology for Biomedical Investigations (OBI); Spatial Ontology (BSPO)
domain level: Anatomy Ontology (FMA*, CARO); Environment Ontology (EnvO); Infectious Disease Ontology (IDO*); Biological Process Ontology (GO*); Cell Ontology (CL); Cellular Component Ontology (FMA*, GO*); Phenotypic Quality Ontology (PaTO); Subcellular Anatomy Ontology (SAO); Sequence Ontology (SO*); Molecular Function (GO*); Protein Ontology (PRO*)
45
Ontologies are built as orthogonal modules which form an incrementally evolving network
scientists are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network
users are motivated by the assurance that the ontologies they turn to are maintained by experts
46
More benefits of orthogonality
helps those new to ontology to find what they need
to find models of good practice
ensures mutual consistency of ontologies (trivially)
and thereby ensures additivity of annotations
47
More benefits of orthogonality
No need to reinvent the wheel for each new domain
Can profit from storehouse of lessons learned
Can more easily reuse what is made by others
Can more easily reuse training
Can more easily inspect and criticize results of others’ work
Leads to innovations (e.g. Mireot, Ontofox) in strategies for combining ontologies