Published by Bennett McLaughlin. Modified over 6 years ago.
1
Intelligence Ontology: A Strategy for the Future
Barry Smith University at Buffalo
2
Semantic Web, wikis, statistical textmining, etc.
let a million flowers bloom
how do we create broad-coverage semantic annotation systems which will enable sharing of gigantic bodies of heterogeneous data?
3
let a million flowers (weeds) bloom
how do we create broad-coverage semantic annotation systems which will enable sharing of gigantic bodies of heterogeneous data?
4
let a million microtheories bloom
what about Cyc?
5
why does Cyc not do the job?
#$Configuration
A specialization of both #$StaticSituation and #$SpatialThing-Localized. Each instance of #$Configuration is a static situation consisting of two or more #$PartiallyTangible things of certain types standing in a certain type of spatial relationship (or set of relationships). This (set of) spatial relationship(s) characterizes the #$Configuration's _type_ in the sense that any group of objects of the appropriate types standing in that relationship (or those relationships) correspond to a #$Configuration of that type; and each of these objects, in turn, is said to be configured (see #$objectConfigured) in the (individual) #$Configuration.
6
why does Cyc not do the job?
(speculations)
Cyc doesn’t care about consistency between microtheories, so no progressive cumulation from an established core
too little concern for consistency with basic science (common sense should not wear the trousers)
no perspicuous policies for updating
built by outsiders
7
Unified Medical Language System (National Library of Medicine)
built by trained experts
massively useful for information retrieval and information integration
good versioning and term-ID policies
creates out of literature a semantically searchable space
8
for UMLS
local usage respected
regimentation frowned upon
mappings between ‘synonyms’ full of noise
is_synonymous_with is not transitive
no cross-framework consistency
no concern to establish consistency with basic science
different grades of formal rigor, different degrees of completeness, different update policies
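The point about is_synonymous_with can be made concrete with a minimal sketch. The term names and mapping pairs below are invented for illustration and are not taken from UMLS; the code only checks whether a symmetric 'synonym' mapping is closed under transitivity.

```python
from itertools import permutations

# Hypothetical 'synonym' mappings between terms from different source
# vocabularies. These pairs are invented for illustration, not UMLS content.
SYNONYM_PAIRS = {
    ("cold (temperature)", "cold"),
    ("cold", "common cold"),
}


def symmetric_closure(pairs):
    """Treat each mapping as holding in both directions."""
    return set(pairs) | {(b, a) for a, b in pairs}


def transitivity_violations(pairs):
    """Return (a, c) pairs linked through some b but not mapped directly."""
    rel = symmetric_closure(pairs)
    terms = {term for pair in rel for term in pair}
    violations = set()
    for a, b, c in permutations(terms, 3):
        if (a, b) in rel and (b, c) in rel and (a, c) not in rel:
            violations.add((a, c))
    return violations


violations = transitivity_violations(SYNONYM_PAIRS)
```

If such a mapping were treated as transitive, the two outer terms would be merged even though they mean different things, which is one way noise can spread through chained mappings.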
9
with UMLS-based annotations
we can know what data we have (via term searches)
we can map between data at single granularities (via ‘synonyms’)
how do we combine data across granularities?
how do we resolve logical conflicts?
how do we know what data we don’t have?
how do we reason with data?
10
no evolutionary path towards improvement
with UMLS, Cyc, Web 2.0, ...
11
We will be able to use ontologies to help us share data
only if the ontologies
represent the world correctly
are humanly intelligible and computationally tractable
12
a new approach
prospective standardization based on objective measures of what works
bring together selected influential groups to agree on good terminology / annotation habits preemptively
13
for science
requirements
ensure legacy annotation efforts are not wasted
create an evolutionary path towards improvement, of the sort we find in science
must be a collaborative, community effort to ensure buy-in
ensure future-proofing
14
create a consensus core of interoperable domain ontologies
for science
starting with low hanging fruit and working outwards from there
built and validated by trained experts
backed by persons of influence in different communities
15
for science, geospatial, transport, religion, weather, bacteria, chemicals, politics, law
use common rules drawing on best practices for creating ontologies
... and for linking ontologies
16
for science, geospatial, transport, religion, weather, bacteria, chemicals, politics, law
... exploiting the division of labor
... relying on champions in dispersed communities to spread the word
17
ontology of documents
ontology of provenance
ontology of names
ontology of numbers (IDs)
ontology of signatures
ontology of identity
...
18
that people should use the core to annotate their data
19
and set up feedback mechanisms
annotators discover they need
more terms
more relations between terms
to correct existing relations
the ontology gets better as it is used
20
This process leads to improvements and extensions of the ontology
which in turn leads to better annotations
a virtuous cycle of improvement in the quality and reach of both future annotations and the ontology itself
21
The result
a growing computer-interpretable map of reality within which major databases are automatically integrated in semantically searchable form
22
This solution is already being implemented in the domain of biomedicine
23
First step (2003): create a shared portal (low regimentation)
25
Second step (2004): reform efforts initiated to link OBO ontologies together and to ensure orthogonality
26
Third step (2006): The OBO Foundry http://obofoundry.org/
27
some groups commit to working together and to following common rules
28
The OBO Foundry
a family of interoperable gold standard biomedical reference ontologies to serve the annotation of
scientific literature
model organism databases
clinical data
experimental results
29
A subset of OBO ontologies, whose developers have agreed in advance to accept a common set of principles designed to ensure
tight connection to the biomedical basic sciences
compatibility
interoperability
formal robustness
support for logic-based reasoning
30
A prospective standard
designed to guarantee interoperability of ontologies from the very start (contrast to: post hoc mapping)
established March 2006
12 initial candidate OBO ontologies, focused primarily on basic science domains
several being constructed ab initio
31
Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | | Gene Ontology Consortium
Phenotypic Quality (PaTO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA | | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck
32
Foundry communities include
Transcriptomics (MIAME Working Group)
Proteomics (Proteomics Standards Initiative)
Metabolomics (Metabolomics Standards Initiative)
Genomics and Metagenomics (Genomic Standards Consortium)
In Situ Hybridization and Immunohistochemistry (MISFISHIE Working Group)
Phylogenetics (Phylogenetics Community)
RNA Interference (RNAi Community)
Toxicogenomics (Toxicogenomics WG)
Environmental Genomics (Environmental Genomics WG)
Nutrigenomics (Nutrigenomics WG)
Flow Cytometry (Flow Cytometry Community)
33
OBO Foundry coverage (canonical ontologies)
(columns: RELATION TO TIME; rows: GRANULARITY)
GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function | Cellular Process
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function | Molecular Process
34
CRITERIA
The ontology is in, or can be instantiated in, a common formal language.
The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.
The ontology should be useful (have a plurality of user communities).
35
CRITERIA
UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.
ORTHOGONALITY: They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.
36
ORTHOGONALITY
for science
when communities work together to ensure consistency
orthogonality
additivity of annotation frameworks
ADDITIVITY: if we annotate a database or body of literature with terms from one high-quality ontology, we should be able to add annotations from a second such ontology without conflicts
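A rough sketch of what orthogonality and additivity could look like in practice. The ontologies, identifiers, and record names below are hypothetical, and shared term labels are used only as a crude stand-in for "two vocabularies covering the same domain"; a real check would compare definitions and scope.

```python
# Hypothetical toy ontologies (term ID -> label); not real OBO content.
CELL_PARTS = {"EXA:0001": "nucleus", "EXA:0002": "mitochondrion"}
CELL_TYPES = {"EXB:0001": "hepatocyte", "EXB:0002": "neuron"}
RIVAL_CELL_PARTS = {"EXC:0001": "nucleus"}  # competes with CELL_PARTS


def overlapping_labels(first, second):
    """Labels claimed by both ontologies (a crude orthogonality check)."""
    return set(first.values()) & set(second.values())


def add_annotations(existing, new):
    """Layer a second set of annotations over the first, per record."""
    merged = {record: set(terms) for record, terms in existing.items()}
    for record, terms in new.items():
        merged.setdefault(record, set()).update(terms)
    return merged


# Annotations from two orthogonal ontologies simply accumulate.
first_pass = {"paper-1": {"EXA:0001"}}
second_pass = {"paper-1": {"EXB:0001"}, "paper-2": {"EXB:0002"}}
combined = add_annotations(first_pass, second_pass)
```

When overlapping_labels returns an empty set, the second layer of annotations adds information without competing with the first; a non-empty result flags the kind of overlap the ORTHOGONALITY criterion is meant to prevent.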
37
CRITERIA
IDENTIFIERS: The ontology possesses a unique identifier space within OBO.
VERSIONING: The ontology provider has procedures for identifying distinct successive versions.
The ontology includes textual definitions for all terms.
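These three criteria lend themselves to mechanical checking. The sketch below assumes a hypothetical in-memory record layout (id_space, version, terms) and a PREFIX:digits identifier pattern; neither is prescribed by the slides.

```python
import re


def check_criteria(ontology):
    """Return a list of problems found against the three criteria."""
    problems = []
    id_pattern = re.compile(rf"^{re.escape(ontology['id_space'])}:\d+$")
    if not ontology.get("version"):
        problems.append("no version identifier")
    for term_id, term in ontology["terms"].items():
        if not id_pattern.match(term_id):
            problems.append(f"{term_id}: outside identifier space")
        if not term.get("definition", "").strip():
            problems.append(f"{term_id}: missing textual definition")
    return problems


# Hypothetical ontology record; IDs, version, and text are placeholders.
example = {
    "id_space": "EX",
    "version": "2006-03-01",
    "terms": {
        "EX:0000001": {"name": "cell", "definition": "placeholder text"},
        "EX:0000002": {"name": "nucleus", "definition": ""},
        "OTHER:0000003": {"name": "neuron", "definition": "placeholder text"},
    },
}
problems = check_criteria(example)
```

Running a check like this at submission time is one possible way to keep the criteria from being purely aspirational.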
38
CRITERIA
COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.*
* Smith et al., Genome Biology 2005, 6:R46
39
OBO Relation Ontology
Foundational: is_a, part_of
Spatial: located_in, contained_in, adjacent_to
Temporal: transformation_of, derives_from, preceded_by
Participation: has_participant, has_agent
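Unambiguously defined relations are what make logic-based reasoning possible. As a minimal sketch, assuming is_a and part_of are treated as transitive, the code below infers indirect links from a handful of illustrative edges; the edge list is a toy example, not an extract from any ontology release.

```python
from collections import defaultdict

# Illustrative (subject, relation, object) edges; a toy example only.
EDGES = [
    ("nucleolus", "part_of", "nucleus"),
    ("nucleus", "part_of", "cell"),
    ("hepatocyte", "is_a", "cell"),
]


def transitive_closure(edges, relation):
    """Map each subject to everything reachable via the given relation."""
    direct = defaultdict(set)
    for subject, rel, obj in edges:
        if rel == relation:
            direct[subject].add(obj)
    closure = {}
    for start in list(direct):
        seen = set()
        stack = list(direct[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                # .get avoids adding new keys to the defaultdict
                stack.extend(direct.get(node, ()))
        closure[start] = seen
    return closure


part_of = transitive_closure(EDGES, "part_of")
is_a = transitive_closure(EDGES, "is_a")
```

Here the link from nucleolus to cell is never stated directly; it follows from the two part_of edges, which is the sort of inference that ad hoc 'synonym' mappings cannot support.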
40
1. Create an OIC portal: a list of those ontologies and related artifacts which already exist
2. Find out which groups of ontology developers are willing to commit to working towards interoperability
41
3. Work with these groups in open, on-line and face-to-face, discussions, the records of which are made available on the OIC portal
4. Move towards a suite of authoritative ontologies, one for each domain (stable attractors)
5. Make funding depend on use of authoritative ontologies, because these have been shown to work
42
what is a question?
a representation of reality with a hole in it
43
They will provide a representation of the context of reality
within which the holes can appear
44
The OIC Foundry ontologies
will be stable, maximally open resources
They will be authoritative in light of NCOR’s evaluation measures
They will not reveal sensitive capabilities and interests
They will deal with types (of mobile telephone, of spores, of soil ...), not with instances
45
Some problems (from Jen Williams, OWI)
46
Do we have an instruction manual?
47
create a simple top level framework
Ontology: An Introduction
50
Too few knowledgeable folks, and fewer cleared.
Computer scientists are teaching people ontology tools and ...
Mohammed is_a string
Amount of money is_a integer
51
what we need
Training events (summer school ...)
to teach people to CREATE ONTOLOGY CONTENT
to teach people to USE ONTOLOGY CONTENT
52
Joining up will diminish your fiefdom
If you give everyone the keys to your kingdom, how will you justify your budget?
We can do this already, as a start, with open source resources, plus a few brave champions of good annotation habits
Later we humiliate those who do not join in
53
People use different tools
format/language of the ontology is not easy to understand
the OIC Foundry should use ontologies which are maximally format- and language-neutral
54
if we are to enable sharing of gigantic bodies of heterogeneous data
we need all the help we can get
our computers need all the help we can get