What is an ontology? Barry Smith 1.

Presentation transcript:

What is an ontology? Barry Smith 1

You’re interested in which genes control heart muscle development 17,536 results 2

Microarray data shows changed expression of thousands of genes. How will you spot the patterns? 3
[Figure labels: attacked / time / control; puparial adhesion; molting cycle; hemocyanin; defense response; immune response; response to stimulus; Toll regulated genes; JAK-STAT regulated genes; amino acid catabolism; lipid metabolism; peptidase activity; protein catabolism]

Lab / pathology data; EHR data; clinical trial data; family history data; medical image data; microarray data; model organism data; flow cytometry; mass spec; genotype / SNP data. How will you find the data you need? 4

−Human −Mouse −Rat −Fish −Yeast −E. coli How will you compare the data? How will you integrate the data? 5

The GO Idea
Databases: MouseEcotope, GlyProt, DiabetInGene, GluChem
Shared term: sphingolipid transporter activity

annotation using common ontologies yields integration of databases
Databases: MouseEcotope, GlyProt, DiabetInGene, GluChem
Shared term: Holliday junction helicase complex
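The integration idea on these two slides can be sketched in a few lines: once records in different databases carry the same ontology term, a single index over that term links them. This is a minimal illustration, not a description of how any of these databases actually works; the database names and terms come from the slides, while the record ids and the pairing of databases with terms are invented for the example.

```python
from collections import defaultdict

# Hypothetical annotation records: (database, record id, ontology term).
# Record ids and the database/term pairings are invented for illustration.
annotations = [
    ("MouseEcotope", "rec-1", "sphingolipid transporter activity"),
    ("GlyProt", "rec-7", "sphingolipid transporter activity"),
    ("DiabetInGene", "rec-3", "Holliday junction helicase complex"),
    ("GluChem", "rec-9", "Holliday junction helicase complex"),
    ("GlyProt", "rec-8", "Holliday junction helicase complex"),
]


def index_by_term(records):
    """Group (database, record id) pairs under the ontology term they share."""
    index = defaultdict(list)
    for database, record_id, term in records:
        index[term].append((database, record_id))
    return dict(index)


index = index_by_term(annotations)
for term, hits in index.items():
    databases = sorted({database for database, _ in hits})
    print(f"{term}: {', '.join(databases)}")
```

No pairwise mapping between the databases is needed here; the shared term is the only join key.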

ontologies are legends for data 8

they provide a growing set of natural language labels to make the data cognitively accessible to human beings and algorithmically accessible to reasoning systems 9

compare: legends for maps 10

11

legends for textbook diagrams

ontologies as legends for images 13

what lesion ? what brain function ? 14

legends for literature 15

ontologies as legends for mathematical equations 16
x_i = vector of measurements of gene i
k = the state of the gene (as “on” or “off”)
θ_i = set of parameters of the Gaussian model
...

17

Pathway diagrams as ontologically annotated dynamic cartoons 18

two kinds of annotations 19

names of instances 20

names of types 21

Ontologies are representations of types 22

... types which are instantiated e.g. in the lab or clinic 23

multiple kinds of relations between represented types provide a tool for algorithmic reasoning 24

Gene Ontology: The Very Top 25
cellular component
molecular function
biological process

Gene Ontology: The Very Top 26
continuant
  cellular component
  molecular function
occurrent
  biological process

BFO: The Very Top 27
continuant
  independent continuant
    cellular component
  dependent continuant
    molecular function
occurrent
  biological processes
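A minimal sketch of how a type hierarchy like this one supports algorithmic reasoning: if each type records its parent through an is_a link, subsumption questions reduce to walking up the chain. The placement of the three GO categories under the BFO categories is read off this slide; the function names are hypothetical.

```python
# is_a links read off the slide (child -> parent).
IS_A = {
    "independent continuant": "continuant",
    "dependent continuant": "continuant",
    "cellular component": "independent continuant",
    "molecular function": "dependent continuant",
    "biological process": "occurrent",
}


def ancestors(term):
    """Return every type that `term` is a subtype of, nearest first."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def is_a(term, candidate):
    """True if `term` is `candidate` or a (transitive) subtype of it."""
    return term == candidate or candidate in ancestors(term)


print(ancestors("cellular component"))  # ['independent continuant', 'continuant']
print(is_a("molecular function", "continuant"))  # True
print(is_a("biological process", "continuant"))  # False
```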

Basic Formal Ontology 28
continuant
  independent continuant
    organism
  dependent continuant
occurrent

Basic Formal Ontology 29
continuant
  independent continuant
    anatomical structure
  dependent continuant
occurrent

Continuants continue to exist through time, preserving their identity while undergoing different sorts of changes 30
independent continuants – objects, things, ...
dependent continuants – qualities, attributes, shapes, potentialities ...

Qualities (temperature, blood pressure, mass, ...) are continuants: they exist through time while undergoing changes 31

Qualities (temperature / blood pressure / mass ...) are dimensions of variation within the structure of the entity; a quality is something which can change while its bearer remains one and the same 32

Qualities (temperature / blood pressure / mass ...) are dimensions of variation within the structure of the entity; a quality is something which can change while its bearer remains one and the same; hence only independent continuants may have qualities 33

A Chart representing how John’s temperature changes 34

John’s temperature, the temperature he has throughout his entire life, cycles through different determinate temperatures from one time to the next. John’s temperature is a physiological variable which, in thus changing, exerts an influence on other physiological variables through time. 35

BFO: The Very Top 36
continuant
  independent continuant
  dependent continuant
    quality
      temperature
occurrent

Blinding Flash of the Obvious 37
types and their instances:
independent continuant
  organism
    John (instance)
dependent continuant
  quality
    temperature
      John’s temperature (instance)

Blinding Flash of the Obvious 39
types: organism, temperature
instances: John, John’s temperature
John’s temperature inheres_in John

types: temperature, with determinate subtypes 37ºC, 37.1ºC, 37.2ºC, 37.3ºC, 37.4ºC, 37.5ºC 40
instances: John’s temperature, which instantiates a different one of these determinate types at each of the times t1, t2, t3, t4, t5, t6
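The point of this slide is that one and the same quality instance instantiates different determinate types at different times. A minimal sketch of that time-indexed instantiation follows; the class and method names are hypothetical, and the pairing of times with values is illustrative only, since the slide's exact pairing is not recoverable from the transcript.

```python
from dataclasses import dataclass, field


@dataclass
class QualityInstance:
    """An instance of a quality type, such as John's temperature."""

    name: str
    quality_type: str
    bearer: str  # the independent continuant this quality inheres_in
    history: dict = field(default_factory=dict)  # time -> determinate type

    def instantiate_at(self, time, determinate_type):
        """Record which determinate type the instance instantiates at `time`."""
        self.history[time] = determinate_type

    def type_at(self, time):
        """Return the determinate type instantiated at `time`, or None."""
        return self.history.get(time)


johns_temperature = QualityInstance("John's temperature", "temperature", "John")

# Illustrative pairing of times and values (an assumption, not from the slide).
for step, value in enumerate(["37ºC", "37.1ºC", "37.2ºC"], start=1):
    johns_temperature.instantiate_at(f"t{step}", value)

print(johns_temperature.type_at("t2"))
```

The instance keeps its identity (same name, same bearer) while the determinate type it instantiates changes, which is what distinguishes it from a series of separate measurements.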

types: human, with subtypes embryo, fetus, neonate, infant, child, adult 41
instances: John, who instantiates a different one of these subtypes at each of the times t1, t2, t3, t4, t5, t6

the lower level of types does not ‘carry identity’ in OntoClean terms; these types are threshold divisions (hence we do not have sharp boundaries, and we have a certain degree of choice, e.g. in how many subtypes to distinguish, though not in their ordering) 42

types and their instances: 43
independent continuant
  organism
    John (instance)
dependent continuant
  quality
    temperature
      John’s temperature (instance)

44
independent continuant
  organism
    John (instance)
dependent continuant
  quality
    temperature
      John’s temperature (instance)
occurrent
  process
    course of temperature changes
      John’s temperature history (instance)

45
independent continuant
  organism
    John (instance)
dependent continuant
  quality
    temperature
      John’s temperature (instance)
occurrent
  process
    life of an organism
      John’s life (instance)

BFO/GO: The Very Top 46
continuant
  independent continuant
    cellular component
  dependent continuant
    molecular function
occurrent
  biological processes

BFO: The Very Top 47
continuant
  independent continuant
  dependent continuant
    quality
    function
    role
    disposition
occurrent

Function 48
- of liver: to store glycogen
- of birth canal: to enable transport
- of eye: to see
- of mitochondrion: to produce ATP
not optional; reflection of physical makeup of bearer; can malfunction

Role 49
optional: exists because the bearer is in some special natural, social, or institutional set of circumstances in which the bearer does not have to be

Role 50
- bearers can have more than one role: person as student / as staff member
- roles often form systems of mutual dependence: husband / wife; first in queue / last in queue; doctor / patient; host / pathogen

Role 51
- of some chemical compound: to serve as analyte in an experiment
- of a dose of penicillin in this human child: to treat a disease
- of this bacterium in a primary host: to cause infection

Qualities are categorical features of reality – you just have them. Functions, roles and dispositions are potential features of reality: they are realizable dependent continuants, realized in certain associated processes 52

53
independent continuant
  portion of chemical compound
    this portion of aspirin (instance)
dependent continuant
  role
    drug role
      role of this portion of aspirin (instance)
occurrent
  process
    process of drug administration
      John’s taking this portion of aspirin (instance)

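54
independent continuant
  portion of chemical compound
    this portion of aspirin (instance)
dependent continuant
  role
    drug role
      role of this portion of aspirin (instance)
occurrent
  process
    process of drug administration
      John’s taking this portion of aspirin (instance)
role of this portion of aspirin inheres_in this portion of aspirin
role of this portion of aspirin realized_in John’s taking this portion of aspirin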
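The aspirin example can be written down as subject-relation-object triples and queried. This is only a sketch: inheres_in and realized_in are the relations named on the slide, whereas instance_of and the helper function names are assumptions introduced for the example.

```python
# Triples (subject, relation, object) for the aspirin example.
# "instance_of" is an assumed relation name; the slide only shows
# types and instances side by side.
TRIPLES = [
    ("this portion of aspirin", "instance_of", "portion of chemical compound"),
    ("role of this portion of aspirin", "instance_of", "drug role"),
    ("role of this portion of aspirin", "inheres_in", "this portion of aspirin"),
    (
        "role of this portion of aspirin",
        "realized_in",
        "John's taking this portion of aspirin",
    ),
    (
        "John's taking this portion of aspirin",
        "instance_of",
        "process of drug administration",
    ),
]


def objects(subject, relation):
    """All objects linked to `subject` by `relation`."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def realizations_of_roles_borne_by(bearer):
    """Processes that realize any realizable entity inhering in `bearer`."""
    roles = [s for s, r, o in TRIPLES if r == "inheres_in" and o == bearer]
    return [process for role in roles for process in objects(role, "realized_in")]


print(realizations_of_roles_borne_by("this portion of aspirin"))
```

The query crosses all three branches: it starts from an independent continuant, passes through the dependent continuant that inheres in it, and ends at the occurrent in which that role is realized.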

The Open Biomedical Ontologies (OBO) Foundry 55
RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM
  independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  independent continuant: Cell (CL); Cellular Component (FMA, GO)
  dependent continuant: Cellular Function (GO)
MOLECULE
  independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  dependent continuant: Molecular Function (GO)
  occurrent: Molecular Process (GO)

The Road to Convergence 56
All ontologies:
- for each given domain (anatomy, chemistry…) should be part of a single suite of interoperable ontologies
- should use a common top-level core
- for subdomains with many variants, should follow the strategy of canonical ontologies with extensions
- should require acceptance of common, tested guidelines on all subscribing ontology developers

initial OBO Foundry coverage, ontologies automatically semantically coupled 57
RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM
  independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  occurrent: Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT
  independent continuant: Cell (CL); Cellular Component (FMA, GO)
  dependent continuant: Cellular Function (GO)
  occurrent: Cellular Process (GO)
MOLECULE
  independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  dependent continuant: Molecular Function (GO)
  occurrent: Molecular Process (GO)

Disposition (Internally-Grounded Realizable Entity) 58
disposition =def. a realizable entity which if it ceases to exist, then its bearer is physically changed, and whose realization occurs when this bearer is in some special physical circumstances, in virtue of the bearer’s physical make-up

Function 59
a disposition (internally-grounded realizable entity) that is designed or selected for

OGMS Ontology for General Medical Science 60

Physical Disorder – independent continuant, fiat object part 61

Big Picture 62

A disease is a disposition rooted in a physical disorder in the organism and realized in pathological processes. 63
etiological process
–produces disorder
–bears disposition
–realized_in pathological process
–produces abnormal bodily features
–recognized_as signs & symptoms
–used_in interpretive process
–produces diagnosis
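The chain on this slide is in effect a small schema: each kind of entity is linked to the next by one named relation. A minimal sketch of that schema as data, with a helper that follows it from any starting point; the relation names are taken from the slide, the function names are hypothetical.

```python
# Relation chain from the slide, as (source kind, relation, target kind).
CHAIN = [
    ("etiological process", "produces", "disorder"),
    ("disorder", "bears", "disposition"),
    ("disposition", "realized_in", "pathological process"),
    ("pathological process", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "signs & symptoms"),
    ("signs & symptoms", "used_in", "interpretive process"),
    ("interpretive process", "produces", "diagnosis"),
]


def trace(start):
    """Follow the chain from `start`; return the kinds visited, in order."""
    next_step = {source: target for source, _, target in CHAIN}
    path = [start]
    while path[-1] in next_step:
        path.append(next_step[path[-1]])
    return path


def downstream(kind):
    """Everything that lies after `kind` in the chain."""
    return trace(kind)[1:]


print(" -> ".join(trace("disorder")))
```

Keeping disorder, disposition (the disease itself), and pathological process as separate links is what lets a record say that a patient has the disorder and bears the disease even at times when no pathological process is under way.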

Elucidation of Primitive Terms 64
‘bodily feature’ - an abbreviation for a physical component, a bodily quality, or a bodily process.
disposition - an attribute describing the propensity to initiate certain specific sorts of processes when certain conditions are satisfied.
clinically abnormal - some bodily feature that
–(1) is not part of the life plan for an organism of the relevant type (unlike aging or pregnancy),
–(2) is causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and
–(3) is such that the elevated risk exceeds a certain threshold level.*
*Compare: baldness

Definitions - Foundational Terms 65
Disorder =def. – A causally linked combination of physical components that is clinically abnormal.
Pathological Process =def. – A bodily process that is a manifestation of a disorder and is clinically abnormal.
Disease =def. – A disposition (i) to undergo pathological processes that (ii) exists in an organism because of one or more disorders in that organism.

Dispositions and Predispositions 66
All diseases are dispositions; not all dispositions are diseases. A predisposition is a disposition.
Predisposition to Disease of Type X =def. – A disposition in an organism that constitutes an increased risk of the organism’s subsequently developing the disease X.
HNPCC is caused by a
–disorder (mutation) in a DNA mismatch repair gene that
–disposes to the acquisition of additional mutations from defective DNA repair processes, and thus is a
–predisposition to the development of colon cancer.

Cirrhosis - environmental exposure 67
Etiological process - phenobarbital-induced hepatic cell death
–produces Disorder - necrotic liver
–bears Disposition (disease) - cirrhosis
–realized_in Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death
–produces Abnormal bodily features
–recognized_as Symptoms - fatigue, anorexia; Signs - jaundice, splenomegaly
Symptoms & Signs used_in Interpretive process
produces Hypothesis - rule out cirrhosis
suggests Laboratory tests
produces Test results - elevated liver enzymes in serum
used_in Interpretive process
produces Result - diagnosis that patient X has a disorder that bears the disease cirrhosis

Influenza - infectious 68
Etiological process - infection of airway epithelial cells with influenza virus
–produces Disorder - viable cells with influenza virus
–bears Disposition (disease) - flu
–realized_in Pathological process - acute inflammation
–produces Abnormal bodily features
–recognized_as Symptoms - weakness, dizziness; Signs - fever
Symptoms & Signs used_in Interpretive process
produces Hypothesis - rule out influenza
suggests Laboratory tests
produces Test results - elevated serum antibody titers
used_in Interpretive process
produces Result - diagnosis that patient X has a disorder that bears the disease flu
But the disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course).

Huntington’s Disease - genetic 69
Etiological process - inheritance of >39 CAG repeats in the HTT gene
–produces Disorder - chromosome 4 with abnormal mHTT
–bears Disposition (disease) - Huntington’s disease
–realized_in Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum
–produces Abnormal bodily features
–recognized_as Symptoms - anxiety, depression; Signs - difficulties in speaking and swallowing
Symptoms & Signs used_in Interpretive process
produces Hypothesis - rule out Huntington’s
suggests Laboratory tests
produces Test results - molecular detection of the HTT gene with >39 CAG repeats
used_in Interpretive process
produces Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease

HNPCC - genetic pre-disposition 70
Etiological process - inheritance of a mutant mismatch repair gene
–produces Disorder - chromosome 3 with abnormal hMLH1
–bears Disposition (disease) - Lynch syndrome
–realized_in Pathological process - abnormal repair of DNA mismatches
–produces Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2)
–bears Disposition (disease) - non-polyposis colon cancer
–realized_in Symptoms (including pain)

The OBO Foundry Initiative 71

A good solution to the data integration problem must be: 72
- modular
- incremental
- bottom-up
- evidence-based
- revisable
and must incorporate a strategy for motivating potential developers and users

GO is amazingly successful – but covers only three sorts of biological entities: –cellular components –molecular functions –biological processes and does not provide representations of disease-related phenomena 73

The Open Biomedical Ontologies (OBO) Foundry 74
RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM
  independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  independent continuant: Cell (CL); Cellular Component (FMA, GO)
  dependent continuant: Cellular Function (GO)
MOLECULE
  independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  dependent continuant: Molecular Function (GO)
  occurrent: Molecular Process (GO)

OBO Foundry provides: 75
- tested guidelines enabling new groups to develop the ontologies they need in ways which counteract forking and dispersion of effort
- an incremental bottom-up approach to evidence-based terminology practices in medicine that is rooted in basic biology
- automatic web-based linkage between medical terminologies and biological knowledge resources
- traffic laws and traffic police

the strategy: establish common rules governing best practices for creating ontologies in coordinated fashion, with an evidence-based pathway to incremental improvement 76

The methodology of cross-products 77
compound terms in ontologies are to be defined as cross-products of simpler terms, e.g. elevated blood glucose is a cross-product of PATO: increased concentration with FMA: blood and ChEBI: glucose
= factoring out of ontologies into discipline-specific modules (orthogonality)
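A minimal sketch of the cross-product idea, using the slide's own example. The compound term stores no meaning of its own; it points to one simpler term in each discipline-specific ontology. The class names and the connecting words "of" and "in" are placeholders invented for this illustration, and terms are identified by label only because the slide gives no identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    ontology: str  # source ontology, e.g. "PATO"
    label: str

    def qualified(self):
        """Label prefixed with the ontology it is drawn from."""
        return f"{self.ontology}:{self.label}"


@dataclass(frozen=True)
class CrossProduct:
    label: str
    quality: Term
    substance: Term
    location: Term

    def definition(self):
        # "of" and "in" are illustrative connectives, not Foundry relations.
        return (
            f"{self.quality.qualified()} of {self.substance.qualified()} "
            f"in {self.location.qualified()}"
        )

    def modules(self):
        """The ontologies this compound term depends on."""
        parts = (self.quality, self.substance, self.location)
        return sorted({term.ontology for term in parts})


elevated = CrossProduct(
    label="elevated blood glucose",
    quality=Term("PATO", "increased concentration"),
    substance=Term("ChEBI", "glucose"),
    location=Term("FMA", "blood"),
)

print(elevated.definition())
print(elevated.modules())
```

Because each component lives in exactly one module, a revision to the definition of glucose in ChEBI carries over to every compound term built from it.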

The methodology of cross-products 78
- enforcing use of common relations in linking terms drawn from Foundry ontologies serves to ensure that the ontologies are maintained and revised in tandem
- logically defined relations serve to bind terms in different ontologies together to create a network

CRITERIA 79
- openness
- common formal language
- collaborative development
- evidence-based maintenance
- identifiers
- versioning
- textual and formal definitions

Orthogonality = modularity 80
- one ontology for each domain
- no need for mappings (which are in any case too expensive, too fragile, too difficult to keep up-to-date as mapped ontologies change)
- everyone knows where to look to find out how to annotate each kind of data

Ontologies and research groups using BFO and RO 81
–OBO Foundry (60 biomedical ontologies, including GO, OBI, Protein Ontology, Cell Ontology, IDO …)
–National Cancer Institute (BiomedGT)
–NIF (NIH Neuroscience Information Framework)
–Cleveland Clinic Semantic Database
–Siemens
–AstraZeneca
–EU (ACGT Cancer Ontology, RAPS, …)

Because the ontologies in the Foundry are built as orthogonal modules which form an incrementally evolving network: 82
- scientists are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network
- users are motivated by the assurance that the ontologies they turn to are maintained by experts

More benefits of orthogonality 83
- helps those new to ontology to find what they need and to find models of good practice
- ensures mutual consistency of ontologies (trivially)
- and thereby ensures additivity of annotations

More benefits of orthogonality 84
- it rules out the sorts of simplification and partiality which may be acceptable under more pluralistic regimes
- it thereby brings an obligation on the part of ontology developers to commit to scientific accuracy and domain-completeness

More criteria of a successful standard 85
1. intelligibility to users (consistent use of terms like ‘term’, ‘class’, ‘entity’, ‘object’ …)
2. track record of lessons learned (GO has 10 years of hard user testing)
3. lots of existing users (ontologies are like telephone networks)

COMMON ARCHITECTURE 86
The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the Basic Formal Ontology (BFO) including the Relation Ontology (RO)

OBO Foundry Modular Organization 87
top level: Basic Formal Ontology (BFO)
mid-level: Information Artifact Ontology (IAO); Ontology for Biomedical Investigations (OBI); Spatial Ontology (BSPO)
domain level: Anatomy Ontology (FMA*, CARO); Environment Ontology (EnvO); Infectious Disease Ontology (IDO*); Biological Process Ontology (GO*); Cell Ontology (CL); Cellular Component Ontology (FMA*, GO*); Phenotypic Quality Ontology (PaTO); Subcellular Anatomy Ontology (SAO); Sequence Ontology (SO*); Molecular Function (GO*); Protein Ontology (PRO*)

BFO:continuant
continuant
  independent continuant
    portion of material
    object
    fiat object part
    object aggregate
    object boundary
    site
  dependent continuant
    generically dependent continuant
      information artifact
    specifically dependent continuant
      quality
      realizable entity
        function
        role
        disposition
  spatial region
    0D-region
    1D-region
    2D-region
    3D-region

BFO:occurrent
occurrent
  processual entity
    process
    fiat process part
    process aggregate
    process boundary
    processual context
  spatiotemporal region
    scattered spatiotemporal region
    connected spatiotemporal region
      spatiotemporal instant
      spatiotemporal interval
  temporal region
    scattered temporal region
    connected temporal region
      temporal instant
      temporal interval

Example: The Cell Ontology

OBO Foundry Modular Organization 91
top level: Basic Formal Ontology (BFO)
mid-level: Information Artifact Ontology (IAO); Ontology for Biomedical Investigations (OBI); Spatial Ontology (BSPO)
domain level: Anatomy Ontology (FMA*, CARO); Environment Ontology (EnvO); Infectious Disease Ontology (IDO*); Biological Process Ontology (GO*); Cell Ontology (CL); Cellular Component Ontology (FMA*, GO*); Phenotypic Quality Ontology (PaTO); Subcellular Anatomy Ontology (SAO); Sequence Ontology (SO*); Molecular Function (GO*); Protein Ontology (PRO*)