Goal and Status of the OBO Foundry
Barry Smith
How to create broad-coverage semantic annotation systems for biomedicine?
Semantic Web, Moby, wikis, crowdsourcing, NLP, etc.:
- let a million flowers (and weeds) bloom
- to create integration, rely on (automatically generated?) post hoc mappings
The result is noisy
Perhaps even deadly
Foundry alternative: prospective standardization
- develop high-quality annotation resources for science in a collaborative, community effort
- creating an evolutionary path towards improvement of terminologies of the sort we find elsewhere in science
What makes GO so wildly successful?
The methodology of annotations
- science basis of the GO: trained experts curating peer-reviewed literature
- different model organism databases employ scientific curators who use the experimental observations reported in the biomedical literature to associate GO terms with gene products in a coordinated way
A set of standardized textual descriptions of
- cellular locations
- molecular functions
- biological processes
used to annotate the entities represented in the major biochemical databases, thereby creating integration across these databases and making them available to semantic search
- need to extend the GO by engaging ever broader community support for the addition of new terms and for the correction of errors
- and also need to extend the methodology to other domains, including clinical domains
This requires that we
- establish common rules governing best practices for creating ontologies and for using these in annotations
- apply these rules to create a complete suite of orthogonal interoperable biomedical reference ontologies
shared portal + low regimentation
NCBO BioPortal
2003
The OBO Foundry
A prospective standard designed to guarantee interoperability of ontologies from the very start (contrast to: post hoc mapping)
- established March
- initial candidate OBO ontologies – focused primarily on basic science domains
- several being constructed ab initio by influential consortia who have the authority to impose their use on large parts of the relevant communities
Building out from the original GO
Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows: GRANULARITY.
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy?), Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL), Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)
OBO Foundry = a subset of OBO ontologies, whose developers have agreed in advance to accept a common set of principles reflecting best practice in ontology development, designed to ensure
- tight connection to the biomedical basic sciences
- compatibility
- interoperability, common relations
- formal robustness
- support for logic-based reasoning
CRITERIA
- The ontology is OPEN and available to be used by all.
- The ontology is in, or can be instantiated in, a COMMON FORMAL LANGUAGE.
- The developers of the ontology agree in advance to COLLABORATE with developers of other OBO Foundry ontologies where domains overlap.
CRITERIA
- UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.
- ORTHOGONALITY: They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.
Orthogonality of ontologies implies additivity of annotations
If we annotate a database or body of literature with one high-quality biomedical ontology, we should be able to add annotations from a second such ontology without conflicts AND WITHOUT THE NEED FOR MAPPINGS.
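The additivity claim can be illustrated with a minimal sketch. It assumes annotations are stored as a mapping from an entity name to a set of prefixed term identifiers, and it uses disjoint identifier prefixes as a rough stand-in for orthogonality. The function names, the entity names geneA and geneB, and the choice of terms are all hypothetical and serve only as illustration.

```python
from collections import defaultdict


def id_prefixes(annotations):
    """Collect the ontology prefixes (the part before ':') used in an annotation set."""
    return {term.split(":", 1)[0] for terms in annotations.values() for term in terms}


def add_annotations(first, second):
    """Combine two annotation sets by plain set union, with no mapping step.

    Assumption: the two sources count as orthogonal when they share no
    identifier prefix. This is a simplification made for this sketch.
    """
    shared = id_prefixes(first) & id_prefixes(second)
    if shared:
        raise ValueError(f"annotation sources overlap on ontologies: {sorted(shared)}")
    merged = defaultdict(set)
    for source in (first, second):
        for entity, terms in source.items():
            merged[entity] |= set(terms)
    return dict(merged)


# Hypothetical entities annotated from two different ontologies.
go_annotations = {"geneA": {"GO:0006915"}, "geneB": {"GO:0005634"}}
cl_annotations = {"geneA": {"CL:0000540"}}
combined = add_annotations(go_annotations, cl_annotations)
```

Because the two sources never describe the same domain, combining them is a union per entity; no term from one source has to be translated into the other.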
CRITERIA
- IDENTIFIERS: The ontology possesses a unique identifier space within OBO.
- VERSIONING: The ontology provider has procedures for identifying distinct successive versions to ensure BACKWARDS COMPATIBILITY with annotation resources already in common use.
- The ontology includes TEXTUAL DEFINITIONS and where possible equivalent formal definitions of its terms.
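The IDENTIFIERS and VERSIONING criteria lend themselves to simple automated checks. The sketch below assumes identifiers of the form PREFIX:digits (as in GO:0006915) and treats backwards compatibility as "no identifier from the earlier version disappears from the later one". Both are simplifying assumptions, and the helper names and the sample registry are hypothetical.

```python
import re

# Assumed identifier shape: an alphabetic prefix, a colon, then a numeric local part.
OBO_ID = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):(\d+)$")


def split_id(term_id):
    """Split 'GO:0006915' into its identifier space and local number."""
    match = OBO_ID.match(term_id)
    if match is None:
        raise ValueError(f"not a prefixed identifier: {term_id!r}")
    return match.group(1), match.group(2)


def prefix_clashes(registry):
    """Return the prefixes claimed by more than one ontology.

    registry maps an ontology name to the prefix it claims.
    """
    owners = {}
    for ontology, prefix in registry.items():
        owners.setdefault(prefix, []).append(ontology)
    return {prefix: sorted(names) for prefix, names in owners.items() if len(names) > 1}


def dropped_ids(old_version, new_version):
    """List identifiers present in the old version but absent from the new one."""
    return sorted(set(old_version) - set(new_version))


# Hypothetical registry with one deliberate clash.
registry = {"Gene Ontology": "GO", "Cell Ontology": "CL", "Hypothetical Ontology": "GO"}
clashes = prefix_clashes(registry)
missing = dropped_ids({"GO:0000001", "GO:0000002"}, {"GO:0000002", "GO:0000003"})
```

An empty result from both checks would suggest that each ontology owns its identifier space and that existing annotations still resolve after a release.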
CRITERIA
- CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content.
- DOCUMENTATION: The ontology is well-documented.
- USERS: The ontology has a plurality of independent users.
CRITERIA
- COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.*
* Smith et al., Genome Biology 2005, 6:R46
Extension Strategy – Downward Population
top level: Basic Formal Ontology (BFO)
mid-level: Information Artifact Ontology (IAO), Ontology for Biomedical Investigations (OBI), Spatial Ontology (BSPO)
domain level: Anatomy Ontology (FMA*, CARO), Environment Ontology (EnvO), Disease, Disorder and Treatment (OGMS), Biological Process Ontology (GO*), Cell Ontology (CL), Cellular Component Ontology (FMA*, GO*), Phenotypic Quality Ontology (PaTO), CHEBI, Sequence Ontology (SO*), Molecular Function (GO*), Protein Ontology (PRO*)
OGMS Downward Population + Hub-Spokes Strategy
Hub: OGMS
Spokes:
- Cardiovascular Disease Ontology
- Genetic Disease Ontology
- Cancer Disease Ontology
- Immune Disease Ontology
- Environmental Disease Ontology
- Oral Disease Ontology
- Infectious Disease Ontology
- …
BFO, OGMS, and IDO
BFO:  Material Entity | Disposition | Process
OGMS: Disorder | Disease | Disease Course
IDO:  Infection | Infectious Disease | Infectious Disease Course
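The nine terms on this slide can be read as three subtype chains running from BFO through OGMS to IDO. The sketch below encodes that reading as a parent map and walks it upward. The chain structure is an assumption drawn from the slide layout, not something the slide states, and the names PARENT and ancestors are hypothetical.

```python
# Assumed reading: each IDO term specializes an OGMS term, which in turn
# specializes a BFO term. This mapping is illustrative only.
PARENT = {
    "Disorder": "Material Entity",
    "Disease": "Disposition",
    "Disease Course": "Process",
    "Infection": "Disorder",
    "Infectious Disease": "Disease",
    "Infectious Disease Course": "Disease Course",
}


def ancestors(term):
    """Return the chain of more general terms, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


infectious_disease_chain = ancestors("Infectious Disease")
```

Under this reading, a domain ontology such as IDO inherits its top-level category from BFO by way of OGMS rather than declaring it independently.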
Hub: OGMS
Spokes:
- Cardiovascular Disease Ontology
- Genetic Disease Ontology
- Cancer Disease Ontology
- Immune Disease Ontology
- Environmental Disease Ontology
- Oral Disease Ontology
- Infectious Disease Ontology
  - IDO Staph Aureus
  - IDO MRSA
  - IDO Australian MRSA
  - IDO Australian Hospital MRSA
- …
How IDO evolves
IDOCore, IDOSa, IDOHumanSa, IDORatSa, IDOStrep, IDORatStrep, IDOHumanStrep, IDOMRSA, IDOHumanBacterial, IDOAntibioticResistant, IDOMAL, IDOHIV, IDOFLU
CORE and SPOKES: Domain ontologies
SEMI-LATTICE: By subject matter experts in different communities of interest.
Options for multi-species
- GO: scrambled up (Kuśnierczyk W. Taxonomy-based partitioning of the Gene Ontology. Journal of Biomedical Informatics. 2008;41:282–292. doi: /j.jbi)
- PRO: sub-typing within the ontology – species neutral label, mouse label, arabidopsis label, …
- IDO: Hasse diagram
Status
Successes
- New ontologies being added to the OBO library
- Advance in cross-product methodology
Population-level ontologies
Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows: GRANULARITY.
COMPLEX OF ORGANISMS
  Independent continuant: Family, Community, Deme, Population
  Dependent continuant: Population Phenotype
  Occurrent: Population Process
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL), Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (CHEBI, SO, RNAO, PRO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)
Environment (EnvO, EO)
Successes
- The OBO Foundry strategy for ontology collaboration and reuse is being replicated in major grant-funded projects
OBO Foundry approach extended into other domains
- NIF Standard: Neuroscience Information Framework
- ISF Ontologies: Integrated Semantic Framework for Clinical and Translational Science
- ImmPort: Immunology Database and Analysis Portal
- OGMS and Extensions: Ontology for General Medical Science
- IDO Consortium: Infectious Disease Ontology
- cROP: Common Reference Ontologies for Plants
FUNDED
Successes
- Huge and continuing expansion in the awareness of the need for re-using ontologies
- Huge and continuing expansion in ontology software created to support Foundry efforts (Ontobee, MIREOT, …)
Immunology Database and Analysis Portal (ImmPort)
Current status
Coordinating editors:
- Michael Ashburner
- Chris Mungall
- Suzanna Lewis
- Alan Ruttenberg
- Richard Scheuermann
- Barry Smith
New operations committee
operations-committee/wiki/OutreachWG
- Mathias Brochhausen
- Melanie Courtot
- Melissa Haendel
- Janna Hastings
- Chris Mungall
- Alan Ruttenberg
- Ramona Walls
Ontologies admitted to full membership after first phase of reviews
- CHEBI: Chemical Entities of Biological Interest
- GO: Gene Ontology
- PATO: Phenotypic Quality Ontology
- PRO: Protein Ontology
- XAO: Xenopus Anatomy Ontology
- ZFA: Zebrafish Anatomy Ontology
Current status
Next round of candidates for review
- OGMS: Ontology for General Medical Science
- OBI: Ontology for Biomedical Investigations
- CL: Cell Ontology
- IDO: Infectious Disease Ontology
Ontology for General Medical Science
Jobst Landgrebe (former Co-Chair of the HL7 Vocabulary Group, now Head of Datamining at Allianz Healthcare): “the best ontology effort in the whole biomedical domain by far”