1
Introduction to Biomedical Ontology
Barry Smith
University at Buffalo
http://ontology.buffalo.edu/smith
2
NCBO: National Center for Biomedical Ontology (NIH Roadmap Center)
- Stanford Medical Informatics
- University of San Francisco Medical Center
- The Mayo Clinic
- University of Washington, Department of Structural Biology
- University of Pittsburgh, Department of Biomedical Informatics
- Biomedical Informatics Research Network (BIRN)
- University at Buffalo, Department of Philosophy
3
On June 22, 1799, in Paris, everything changed
4
International System of Units
5
Multiple kinds of data in multiple kinds of silos
- Lab / pathology data
- EHR data
- Clinical trial data
- Patient histories
- Medical imaging
- Microarray data
- Model organism data
- Flow cytometry
- Mass spec
- Genotype / SNP data
6
- How to find data?
- How to find other people’s data?
- How to reason with data when you find it?
- How to work out what data does not yet exist?
7
How to solve the problem of making the data we find queryable and reusable by others?
Part of the solution must involve: standardized terminologies and coding schemes
8
But there are multiple kinds of standardization for biomedical data, and they do not work well together
- Terminologies (SNOMED, UMLS)
- CDEs (Clinical research)
- Information Exchange Standards (HL7 RIM)
- LIMS (LOINC)
- MGED standards for microarray data, etc.
- top-down grid frameworks (caBIG)
9
Most successful thus far: UMLS (Unified Medical Language System)
- collection of separate terminologies built by trained experts
- massively useful for information retrieval and information integration
UMLS Metathesaurus
- a system of post hoc mappings between overlapping source vocabularies developed according to different and sometimes conflicting standards
10
For UMLS:
- local usage respected
- regimentation frowned upon
- cross-framework consistency not important
- no concern to establish consistency with basic science
- different grades of formal rigor, different degrees of completeness, different update policies, capricious policies for empirical testing
11
A good solution to the silo problem must be:
- modular
- incremental
- bottom-up
- evidence-based
- revisable
It must also incorporate a strategy for motivating potential developers and users
12
ontologies = standardized labels designed for use in annotations to make the data cognitively accessible to human beings and algorithmically accessible to computers
13
ontologies = high quality controlled structured vocabularies for the annotation (description) of data
14
Ramirez et al., "Linking of Digital Images to Phylogenetic Data Matrices Using a Morphological Ontology," Syst. Biol. 56(2):283–294, 2007
15
Ontologies used in curation of literature
- what cellular component?
- what molecular function?
- what biological process?
16
Ontologies
- help integrate complex representations of reality
- help human beings find things in complex representations of reality
- help computers reason with complex representations of reality
17
The Gene Ontology
18
Ontologies facilitate grouping of annotations
- brain: 20
- hindbrain: 15
- rhombomere: 10
Query "brain" without ontology: 20
Query "brain" with ontology: 45
But they succeed in this only if there is one consensus ontology for each domain
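The grouping arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration, not the GO tooling: the `PART_OF` links are an assumption (the slide does not name the relation that connects rhombomere, hindbrain and brain), and all function names are hypothetical.

```python
# Assumed toy hierarchy: child -> parent. The slide does not name the relation;
# part_of is used here as an illustrative assumption.
PART_OF = {"hindbrain": "brain", "rhombomere": "hindbrain"}

# Number of annotations attached directly to each term (figures from the slide).
DIRECT_ANNOTATIONS = {"brain": 20, "hindbrain": 15, "rhombomere": 10}


def ancestors(term, part_of):
    """Yield every term that `term` is transitively part of."""
    while term in part_of:
        term = part_of[term]
        yield term


def count_without_ontology(term, direct):
    """Count only the annotations made directly to `term`."""
    return direct.get(term, 0)


def count_with_ontology(term, direct, part_of):
    """Count annotations to `term` and to every term transitively part of it."""
    total = 0
    for candidate, n in direct.items():
        if candidate == term or term in ancestors(candidate, part_of):
            total += n
    return total


print(count_without_ontology("brain", DIRECT_ANNOTATIONS))        # 20
print(count_with_ontology("brain", DIRECT_ANNOTATIONS, PART_OF))  # 45
```

The second query returns more results only because all three terms sit in one shared hierarchy, which is the slide's point about needing one consensus ontology per domain.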
19
20
21
People are extending the GO methodology to other domains of biology and of clinical and translational medicine?
22
The standard engineering methodology
- It is easier to write useful software if one works with a simplified model (“…we can’t know what reality is like in any case; we only have our concepts…”)
- This looks like a useful model to me
- (One week goes by:) This other thing looks like a useful model to him
- Data in Pittsburgh does not interoperate with data in Vancouver
- Science is siloed
23
An analogue of the UMLS problem:
proliferation of tiny ontologies by different groups with urgent annotation needs
25
The solution:
establish common rules governing best practices for creating ontologies in coordinated fashion, with an evidence-based pathway to incremental improvement
26
First step (2001)
- a shared portal for (so far) 58 ontologies (low regimentation)
- http://obo.sourceforge.net
- NCBO BioPortal
27
28
OBO builds on the principles successfully implemented by the GO, recognizing that ontologies need to be developed in tandem
29
The methodology of cross-products
- compound terms in ontologies are to be defined as cross-products of simpler terms
- e.g. elevated blood glucose is a cross-product of PATO: increased concentration with FMA: blood and ChEBI: glucose
- = factoring out of ontologies into discipline-specific modules (orthogonality)
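One way to picture a cross-product definition is as a small record that points at simpler terms in other ontologies. The sketch below assumes a quality / bearer / towards decomposition; those field names and the `Term` and `CrossProduct` classes are hypothetical and are not taken from any OBO tool. Only the three constituent terms come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    ontology: str  # e.g. "PATO", "FMA", "ChEBI"
    label: str


@dataclass(frozen=True)
class CrossProduct:
    """A compound term defined only by reference to simpler terms (hypothetical model)."""
    label: str
    quality: Term   # what is being said (assumed slot name)
    bearer: Term    # where it holds (assumed slot name)
    towards: Term   # what it concerns (assumed slot name)

    def definition(self) -> str:
        """Render the compound term from its constituents."""
        return (f"{self.quality.ontology}:{self.quality.label} "
                f"of {self.towards.ontology}:{self.towards.label} "
                f"in {self.bearer.ontology}:{self.bearer.label}")

    def source_ontologies(self) -> set:
        """The discipline-specific modules this compound term depends on."""
        return {self.quality.ontology, self.bearer.ontology, self.towards.ontology}


elevated_blood_glucose = CrossProduct(
    label="elevated blood glucose",
    quality=Term("PATO", "increased concentration"),
    bearer=Term("FMA", "blood"),
    towards=Term("ChEBI", "glucose"),
)

print(elevated_blood_glucose.definition())
```

Because the compound term stores no content of its own, a revision to any constituent term is picked up automatically, which is one reading of why the slide ties cross-products to modularity.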
30
The methodology of cross-products
- enforcing use of common relations in linking terms drawn from Foundry ontologies serves to ensure that the ontologies are maintained and revised in tandem
- logically defined relations serve to bind terms in different ontologies together to create a network
31
Third step (2006)
The OBO Foundry
http://obofoundry.org/
32
Building out from the original GO
RELATION TO TIME: CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT
GRANULARITY:
- ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO)
- CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
- MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
33
Initial OBO Foundry coverage
RELATION TO TIME: CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT
GRANULARITY:
- ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Organism-Level Process (GO)
- CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO); Cellular Process (GO)
- MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
34
CRITERIA
- openness
- common formal language
- collaborative development
- evidence-based maintenance
- identifiers
- versioning
- textual and formal definitions
35
Orthogonality = modularity
- one ontology for each domain
- no need for mappings (which are in any case too expensive, too fragile, too difficult to keep up-to-date as mapped ontologies change)
- everyone knows where to look to find out how to annotate each kind of data
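The "one ontology for each domain" rule can be read as a simple invariant over a registry of ontologies. The following is a minimal sketch under that reading; the registry contents, the `LocalCellOntology` entry, and both function names are hypothetical and do not describe how the OBO Foundry actually reviews submissions.

```python
from collections import defaultdict


def find_overlaps(coverage):
    """Return the domains claimed by more than one ontology.

    `coverage` maps an ontology name to the set of domains it covers.
    An empty result means the registry is orthogonal in the slide's sense.
    """
    claimed_by = defaultdict(list)
    for ontology, domains in coverage.items():
        for domain in domains:
            claimed_by[domain].append(ontology)
    return {d: sorted(o) for d, o in claimed_by.items() if len(o) > 1}


def where_to_annotate(domain, coverage):
    """With orthogonal modules, each domain resolves to exactly one ontology."""
    owners = [o for o, domains in coverage.items() if domain in domains]
    if len(owners) != 1:
        raise LookupError(f"{domain!r} has {len(owners)} candidate ontologies")
    return owners[0]


# Illustrative registry; the domain labels are simplified for the example.
ORTHOGONAL = {
    "CL": {"cell"},
    "PATO": {"phenotypic quality"},
    "NCBI Taxonomy": {"organism"},
}

# Adding a second, hypothetical cell ontology breaks orthogonality.
OVERLAPPING = dict(ORTHOGONAL, LocalCellOntology={"cell"})

print(find_overlaps(ORTHOGONAL))          # {}
print(find_overlaps(OVERLAPPING))         # {'cell': ['CL', 'LocalCellOntology']}
print(where_to_annotate("cell", ORTHOGONAL))  # CL
```

When the check passes, no mapping table between ontologies is needed, since no two modules describe the same kind of entity.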
36
CRITERIA
COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the Basic Formal Ontology (BFO)
37
OBO Foundry provides guidelines (traffic laws) to new groups of ontology developers in ways which can counteract current dispersion of effort
38
Building out from the original GO
RELATION TO TIME: CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT
GRANULARITY:
- ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO)
- CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
- MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
39
RELATION TO TIME: CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT
GRANULARITY:
- ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Organism-Level Process (GO)
- CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO); Cellular Process (GO)
- MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
40
Basic Formal Ontology
- continuant
  - independent continuant: cellular component
  - dependent continuant: molecular function
- occurrent: biological processes
41
BFO: The Very Top
- continuant
  - independent continuant
  - dependent continuant
    - quality
    - function
    - role
    - disposition
- occurrent
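The top-level categories on this slide form a small is_a hierarchy that is easy to encode and query. The sketch below assumes that quality, function, role and disposition all sit under dependent continuant, which is how BFO arranges them but which the flattened slide text does not itself show; `IS_A`, `lineage` and `is_a` are hypothetical names.

```python
# Parent links for the categories listed on the slide.
# The nesting under "dependent continuant" is an assumption (see lead-in).
IS_A = {
    "independent continuant": "continuant",
    "dependent continuant": "continuant",
    "quality": "dependent continuant",
    "function": "dependent continuant",
    "role": "dependent continuant",
    "disposition": "dependent continuant",
}


def lineage(category):
    """Return the category followed by each of its ancestors, up to the root."""
    path = [category]
    while path[-1] in IS_A:
        path.append(IS_A[path[-1]])
    return path


def is_a(category, ancestor):
    """True if `category` is `ancestor` or falls under it."""
    return ancestor in lineage(category)


print(lineage("function"))         # ['function', 'dependent continuant', 'continuant']
print(is_a("role", "continuant"))  # True
print(is_a("role", "occurrent"))   # False
```

A domain ontology that places each of its terms under one of these categories inherits the same checks, for example that nothing classified as a role can also be a process.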
42
function
- of liver: to store glycogen
- of birth canal: to enable transport
- of eye: to see
- of mitochondrion: to produce ATP
not optional; reflection of physical makeup of bearer
43
role
optional: exists because the bearer is in some special natural, social, or institutional set of circumstances in which the bearer does not have to be
44
role
- bearers can have more than one role
  - person as student and staff member
- roles often form systems of mutual dependence
  - husband / wife
  - first in queue / last in queue
  - doctor / patient
  - host / pathogen
45
role
- of some chemical compound: to serve as analyte in an experiment
- of a dose of penicillin in this human child: to treat a disease
- of this bacterium in a primary host: to cause infection
46
A good solution to the silo problem must be:
- modular
- incremental
- bottom-up
- evidence-based
- revisable
It must also incorporate a strategy for motivating potential developers and users
47
Because the ontologies in the Foundry are built as orthogonal modules which form an incrementally evolving network:
- scientists are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network
- users are motivated by the assurance that the ontologies they turn to are maintained by experts
48
More benefits of orthogonality
- helps those new to ontology to find what they need
- to find models of good practice
- ensures mutual consistency of ontologies (trivially)
- and thereby ensures additivity of annotations
49
More benefits of orthogonality
- it rules out the sorts of simplification and partiality which may be acceptable under more pluralistic regimes
- thereby brings an obligation on the part of ontology developers to commit to scientific accuracy and domain-completeness