OBO Foundry Update: April 2010 Barry Smith
OBO Foundry: The Underlying Idea. We have lots and lots of ontologies; most of them are stand-alone. Yet the biology is inherently interconnected: across granular levels (molecules, cells, organisms, ...) and across different basic types (objects, processes, functions, ...). Rigorously enforced modularity.
Biological Process, Cellular Component, Molecular Function
Coverage matrix. Rows: GRANULARITY. Columns (RELATION TO TIME): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO), Phenotypic Quality (PATO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL), Cellular Component (FMA, GO) | Cellular Function
MOLECULE: Molecule (ChEBI, SO, RNAO, PRO) | Molecular Function | Molecular Process
OBO Foundry (Open Biological/Biomedical Ontologies Foundry). International open initiative since 2006 (Nat Biotech 2007). Coordinating editors: Michael Ashburner, Chris Mungall, Suzanna Lewis, Alan Ruttenberg, Richard H. Scheuermann, Barry Smith. http://obofoundry.org Create a suite of orthogonal and interoperable ontologies; maintain a framework for governance and testing of best practices; initiate a formal peer review process for well-structured, biologically accurate ontologies.
Why OBO Foundry ontology standards? Operational efficiencies: eliminate duplication of effort; code once, retrieve and reuse efficiently. Unbiased, neutral perspective: considering the biology/biomedical domain as a whole. Homesteading by experts who have an incentive to ensure adequate terminology resources for their respective disciplines.
OBO Foundry Principles: Revised Proposals, February 2010
OBO Foundry Principles: open (Creative Commons 3.0 CC-BY license or equivalent); common formal language (OBO Format, OWL 2, Common Logic); commitment to collaboration; maintenance in light of scientific advance; unique identifier space (URIs); naming conventions (Susanna / EBI); versioning (metadata for changes)
OBO Foundry Principles: ontologies should be conceivable as the result of populating downwards from some fragment of BFO 2.0; clearly delineated content (coherent natural language definitions of top-level term(s), incorporating cross-product links to other OBO Foundry ontologies); the ontology is well documented (e.g. in a published paper describing the ontology or in manuals for developers and users); plurality of mutually independent users (documented e.g. via pointers in external URIs, use in cross-products)
OBO Foundry Principles: single locus of authority, tracker (SOP), responsive help desk; textual definitions (SOP) for all terms, plus equivalent formal definitions (for at least a substantial number of terms); all definitions of the genus-species form, utilizing (some) cross-products; single is_a inheritance (each ontology should be conceived as consisting of a core of asserted single-inheritance links, with further is_a relations inferred)
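The single is_a inheritance principle can be checked mechanically. The following is a minimal sketch, assuming the asserted is_a links are available as (child, parent) pairs; the function name `multiple_inheritance_terms` and the sample links are hypothetical and not taken from any Foundry ontology.

```python
def multiple_inheritance_terms(asserted_is_a):
    """Return terms that have more than one asserted is_a parent.

    asserted_is_a: iterable of (child, parent) pairs (asserted links only,
    not inferred ones).
    """
    parents = {}
    for child, parent in asserted_is_a:
        parents.setdefault(child, set()).add(parent)
    # Keep only the terms that violate single asserted inheritance.
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}


# Hypothetical asserted links, for illustration only.
links = [
    ("spiny neuron", "neuron"),
    ("neuron", "cell"),
    ("estradiol secreting cell", "secretory cell"),
    ("estradiol secreting cell", "steroid hormone secreting cell"),
]
violations = multiple_inheritance_terms(links)
print(violations)
```

Under this principle, a term flagged by such a check would keep one asserted parent, with the other is_a relation left to be inferred from its formal definition.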
Results of peer review process. 2nd OBO Foundry meeting in Hinxton, UK, in June 2009. Special mention: CL (Cell Ontology); FMA (Foundational Model of Anatomy Ontology); EnvO (Environment Ontology); HPO (Human Phenotype Ontology); OBI (Ontology for Biomedical Investigations); SO (Sequence Ontology)
Approved for full membership: CHEBI (Chemical Entities of Biological Interest); GO (Gene Ontology); PATO (Phenotypic Quality Ontology); PRO (Protein Ontology); XAO (Xenopus Anatomy Ontology); ZFA (Zebrafish Anatomy Ontology)
Goals of the peer review process: identify problems in ontologies; identify, nurture and recommend best practices; education (provide illustrations of good practice and flagposts for those new to the domain); motivation of ontology authors, of ontology reviewers, and of ontology users, publishers, vendors; analogous to scientific journal peer review
Linking ontologies via cross products. Compound terms are formed out of simpler constituents: {cysteine, leucine, collagen, ...} X {biosynthesis, metabolism, ...}
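As a minimal sketch of the cross-product idea, the compound terms can be enumerated as the Cartesian product of the two constituent sets. The function name `compound_terms` and the use of plain string labels are illustrative assumptions; actual cross-products would pair term identifiers from the participating ontologies.

```python
from itertools import product


def compound_terms(entities, processes):
    """Return every 'entity process' label built from the two constituent lists."""
    return [f"{entity} {process}" for entity, process in product(entities, processes)]


entities = ["cysteine", "leucine", "collagen"]
processes = ["biosynthesis", "metabolism"]
terms = compound_terms(entities, processes)
for term in terms:
    print(term)
```

Three entities and two processes yield six compound terms, none of which has to be authored by hand in either source ontology.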
LEGO examples: GO+CL. GO:germ cell nucleus =def GO:nucleus that is part_of CL:germ cell
[Matrix. Column labels: ChEBI, PRO, CC, cell, anat, envo, PATO, MF, BP. Row labels and values as extracted: GO-CC 16 174; Cell 14 8 70 13; Gross anat; Envo 19 28; phen MP 273 10 ? 654 3300 5815 1040; WP 115 45 432 1450 686; TO 127 2 355 1; disease; GO-MF 2407; GO-BP 511 1085 55]
LEGO examples: CL+PATO. CL:spiny neuron =def CL:neuron that has_quality PATO:spiny
LEGO examples: CL+ChEBI. CL:estradiol secreting cell =def CL:secretory cell that has_output CHEBI:estradiol
LEGO examples: EnvO+ChEBI. ENVO:xylene contaminated soil =def ENVO:soil that has_contaminant CHEBI:xylene
LEGO examples: GO+Anatomy. GO:heart development =def GO:developmental process that results_in_development_of FMA:heart
LEGO examples: Diseases. MP:abnormal circulating glucose =def PATO:concentration that inheres_in MA:blood towards CHEBI:glucose (figure labels: gluconeogenesis, diabetes)
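The LEGO definitions all follow the pattern "term =def genus that relation filler". A minimal sketch of how such a definition might be split into its parts is given below; the `CrossProduct` type, the `parse_definition` function, and the fixed `RELATIONS` set (limited to the relations appearing in these slides) are assumptions made for illustration and are not part of any OBO tool.

```python
from typing import List, NamedTuple, Tuple

# Relations used in the LEGO examples; an assumed, non-exhaustive set.
RELATIONS = {
    "part_of", "has_quality", "has_output", "has_contaminant",
    "results_in_development_of", "inheres_in", "towards",
}


class CrossProduct(NamedTuple):
    defined: str                          # term being defined
    genus: str                            # parent term
    differentiae: List[Tuple[str, str]]   # (relation, filler) pairs


def parse_definition(text: str) -> CrossProduct:
    """Split 'X =def G that R1 Y1 [R2 Y2 ...]' into its components."""
    defined, body = text.split(" =def ", 1)
    genus, rest = body.split(" that ", 1)
    pairs = []
    for token in rest.split():
        if token in RELATIONS:
            pairs.append([token, []])      # start a new differentia
        elif pairs:
            pairs[-1][1].append(token)     # extend the current filler label
        # tokens before the first relation (e.g. the copula "is") are skipped
    differentiae = [(rel, " ".join(words)) for rel, words in pairs]
    return CrossProduct(defined.strip(), genus.strip(), differentiae)


nucleus = parse_definition(
    "GO:germ cell nucleus =def GO:nucleus that is part_of CL:germ cell")
glucose = parse_definition(
    "MP:abnormal circulating glucose =def PATO:concentration "
    "that inheres_in MA:blood towards CHEBI:glucose")
print(nucleus)
print(glucose)
```

The identifier prefixes of the genus and fillers (GO, CL, PATO, MA, CHEBI) show which ontologies each compound term links together.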
Sustainable standards policy requires: Openness (readily available to access and open to participation); Separation of Duties (ontology is vetted by multiple groups); Generational Compliance (initially support backward compatibility so no one is stranded; enforce a more rigorous set of criteria in successive generations of products)
Varieties of application ontology: cross-border national parks; slims; cross-product ontologies; template ontologies (Infectious Disease Ontology)
Cross-border national parks (shown against the granularity / relation-to-time matrix): an ontology for studying the effects of viral infection on cell function in shrimp
http://www.infectiousdiseaseontology.org/