What is an ontology and Why should you care? Barry Smith with thanks to Jane Lomax, Gene Ontology Consortium 1.

Presentation transcript:

What is an ontology and Why should you care? Barry Smith with thanks to Jane Lomax, Gene Ontology Consortium 1

A good solution to the silo / stovepipe problem must be: modular; incremental; bottom-up (not all standards are equal); evidence-based (thoroughly tested); revisable and evolutionary; incorporate a strategy for motivating potential developers and users; cost effective; work with existing ways of collecting data 2

You’re interested in which genes control heart muscle development: 17,536 results 3

[Figure labels: attacked, time, control; puparial adhesion; molting cycle; hemocyanin; defense response; immune response; response to stimulus; Toll regulated genes; JAK-STAT regulated genes; amino acid catabolism; lipid metabolism; peptidase activity; protein catabolism] Microarray data shows changed expression of thousands of genes. How will you spot the patterns? 4

5 You’re interested in which of your hospital’s patient data is relevant to understanding how genes control heart muscle development

6 Lab / pathology data; EHR data; clinical trial data; family history data; medical imaging; microarray data; model organism data; flow cytometry; mass spec; genotype / SNP data. How will you spot the patterns? How will you find the data you need?

Controlled Vocabularies and Common Data Elements will provide a way to capture and represent some of this knowledge in a form that is usable by other clinicians and researchers, by you yourself, but not by computers 7

EPIC will provide a way to capture and represent some of this knowledge in a form that is usable by computers (somewhat), by you yourself, but not by other clinicians and researchers 8

Ontologies will (prospectively) provide a way to capture and represent all this knowledge in a form that is usable by other clinicians and researchers, by you yourself, and by computers. Ontologies provide semantic interoperability 9

Uses of ‘ontology’ in PubMed abstracts 10

11 By far the most successful: GO (Gene Ontology)

12

Definitions 13

Gene products involved in cardiac muscle development in humans 14

How does the Gene Ontology work? 15

1. It provides a controlled vocabulary, contributing to the cumulativity of scientific results achieved by distinct research communities: multi-national, multi-disciplinary, open source (if we all use kilograms, meters, seconds …, our results are calibrated) 16

17 2. It provides a tool for algorithmic reasoning

Hierarchical view representing relations between represented types 18

The massive quantities of annotations linking GO terms to gene products (proteins) are allowing a new kind of clinical research 19
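The way a hierarchy plus annotations supports algorithmic reasoning can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GO Consortium's implementation: the term labels in `IS_A`, the gene symbols in `ANNOTATIONS`, and both function names are made-up placeholders. The point it shows is that a gene product annotated to a specific term is also found by a query on any more general term above it.

```python
# Hypothetical toy hierarchy: child term -> set of parent terms (is_a edges).
IS_A = {
    "cardiac muscle development": {"muscle development"},
    "skeletal muscle development": {"muscle development"},
    "muscle development": {"organ development"},
    "organ development": set(),
}

# Hypothetical annotations: gene product -> terms it is directly annotated to.
ANNOTATIONS = {
    "geneA": {"cardiac muscle development"},
    "geneB": {"skeletal muscle development"},
    "geneC": {"organ development"},
}


def ancestors(term, is_a=IS_A):
    """Return the term itself plus every term reachable through is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, ()))
    return seen


def gene_products_for(term, annotations=ANNOTATIONS):
    """Gene products annotated to the term or to any of its descendants."""
    return sorted(
        gene
        for gene, terms in annotations.items()
        if any(term in ancestors(t) for t in terms)
    )


print(gene_products_for("muscle development"))
```

A query on the general term returns the gene products annotated to either of its more specific children, while `geneC`, annotated only to the broader term, is correctly left out.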

Uses of GO in studies of: pathways associated with heart failure development correlated with cardiac remodeling (PMID ); molecular signature of cardiomyocyte clusters derived from human embryonic stem cells (PMID ); contrast between cardiac left ventricle and diaphragm muscle in expression of genes involved in carbohydrate and lipid metabolism (PMID ); immune system involvement in abdominal aortic aneurysms in humans (PMID ) 20

A value proposition for the clinical terminologies of the future: using the GO terminology standard enables you to do better, fundable, translational research; champions of good practice in terminology will test the tools and resources to the point where they will become reliable and easily usable by those who follow 21

GO is amazingly successful, but covers only three sorts of biological entities: cellular components; molecular functions; biological processes. It does not provide representations of disease-related phenomena 22

23 People are extending the GO methodology to other domains of biology and of clinical and translational medicine

24 The Open Biomedical Ontologies (OBO) Foundry
RELATION TO TIME (columns): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO), Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL), Cellular Component (FMA, GO) | Cellular Function (GO) | (none listed)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

25 initial OBO Foundry coverage, ontologies automatically semantically coupled
RELATION TO TIME (columns): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO), Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL), Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

Jeff Rose: “if you have the structure and the model correct you don’t have to do it all at once”. Start with some test disease domains: Cardiovascular Gene Ontology; Infectious Disease Ontology; Congenital Heart Defect Ontology 26

OBO Foundry provides: tested guidelines enabling new groups to develop the ontologies they need in ways which counteract forking and dispersion of effort; an incremental bottom-up approach to evidence-based terminology practices in medicine that is rooted in basic biology; automatic web-based linkage between medical terminologies and biological knowledge resources 27

But there are multiple kinds of standardization for biomedical data, and they do not work well together: Terminologies (SNOMED, UMLS); CDEs (Clinical research); Information Exchange Standards (HL7 RIM); LIMS (LOINC); MGED standards for microarray data, etc.; top-down grid frameworks (caBIG) 28

29 most successful, thus far: UMLS (Unified Medical Language System). Collection of separate terminologies built by trained experts; massively useful for information retrieval and information integration. UMLS Metathesaurus: a system of post hoc mappings between overlapping source vocabularies developed according to different and sometimes conflicting standards

30 for UMLS: local usage respected; regimentation frowned upon; cross-framework consistency not important; no concern to establish consistency with basic science; different grades of formal rigor, different degrees of completeness, different update policies, capricious policies for empirical testing

A good solution to the silo problem must be: modular; incremental; bottom-up; evidence-based; revisable; incorporate a strategy for motivating potential developers and users 31

The standard engineering methodology: It is easier to write useful software if one works with a simplified model (“…we can’t know what reality is like in any case; we only have our concepts…”). This looks like a useful model to me. (One week goes by:) This other thing looks like a useful model to him. Data in Pittsburgh does not interoperate with data in Vancouver. Science is siloed.

33 an analogue of the UMLS problem: proliferation of tiny ontologies by different groups with urgent annotation needs

35 the solution: establish common rules governing best practices for creating ontologies in coordinated fashion, with an evidence-based pathway to incremental improvement

36 First step (2001): a shared portal for (so far) 58 ontologies (low regimentation); NCBO BioPortal

37

OBO builds on the principles successfully implemented by the GO, recognizing that ontologies need to be developed in tandem 38

The methodology of cross-products: compound terms in ontologies are to be defined as cross-products of simpler terms. E.g., elevated blood glucose is a cross-product of PATO: increased concentration with FMA: blood and ChEBI: glucose. = factoring out of ontologies into discipline-specific modules (orthogonality) 39
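A cross-product definition can be pictured as a small data structure. The sketch below is a hedged illustration in Python, not an OBO format or tool: the class names `Term` and `CrossProduct` and the field names `quality`, `location`, and `substance` are assumptions introduced here, and term identifiers are omitted because the slide gives only labels. It shows the compound term being assembled entirely from terms owned by three separate ontologies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    ontology: str  # source ontology prefix, e.g. "PATO"
    label: str     # human-readable label (no IDs are given in the slide)


@dataclass(frozen=True)
class CrossProduct:
    """A compound term defined from simpler terms, one per source ontology."""
    label: str
    quality: Term    # what is the case (hypothetical field name)
    location: Term   # where it holds (hypothetical field name)
    substance: Term  # what it concerns (hypothetical field name)

    def definition(self) -> str:
        """Compose a readable definition from the component labels."""
        return (f"{self.quality.label} of {self.substance.label} "
                f"in {self.location.label}")

    def sources(self) -> set:
        """The ontologies this compound term depends on."""
        return {self.quality.ontology,
                self.location.ontology,
                self.substance.ontology}


elevated_blood_glucose = CrossProduct(
    label="elevated blood glucose",
    quality=Term("PATO", "increased concentration"),
    location=Term("FMA", "blood"),
    substance=Term("ChEBI", "glucose"),
)

print(elevated_blood_glucose.definition())
```

Because the compound term stores references rather than copies, a revision to a component term in its home ontology carries through to every compound term built from it, which is the practical payoff of factoring ontologies into discipline-specific modules.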

The methodology of cross-products: enforcing use of common relations in linking terms drawn from Foundry ontologies serves to ensure that the ontologies are maintained and revised in tandem; logically defined relations serve to bind terms in different ontologies together to create a network 40

41 The OBO Foundry Third step (2006)

42 Building out from the original GO
RELATION TO TIME (columns): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO), Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL), Cellular Component (FMA, GO) | Cellular Function (GO) | (none listed)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

43 initial OBO Foundry coverage
RELATION TO TIME (columns): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO), Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL), Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

44 CRITERIA: openness; common formal language; collaborative development; evidence-based maintenance; identifiers; versioning; textual and formal definitions

Orthogonality = modularity: one ontology for each domain; no need for mappings (which are in any case too expensive, too fragile, too difficult to keep up-to-date as mapped ontologies change); everyone knows where to look to find out how to annotate each kind of data 45
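One way to picture what orthogonality rules out is a check for terms claimed by more than one ontology. The Python below is a rough sketch under stated assumptions, not a Foundry tool: the function name `overlapping_labels` and the three toy ontologies are hypothetical, and matching on lower-cased labels is a deliberate simplification of what a real review would compare.

```python
from collections import defaultdict


def overlapping_labels(ontologies):
    """Map each label found in more than one ontology to the ontologies claiming it.

    `ontologies` maps an ontology name to an iterable of term labels.
    Labels are compared case-insensitively (a simplifying assumption).
    """
    owners = defaultdict(set)
    for name, labels in ontologies.items():
        for label in labels:
            owners[label.lower()].add(name)
    return {
        label: sorted(names)
        for label, names in owners.items()
        if len(names) > 1
    }


# Hypothetical, hand-made term lists for illustration only.
toy = {
    "anatomy": {"heart", "blood", "cell"},
    "cells": {"cell", "neuron"},
    "chemicals": {"glucose"},
}

print(overlapping_labels(toy))
```

An empty result means every label has a single home, so annotators know where to look; a non-empty result points at exactly the overlaps that would otherwise need mappings.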

46 CRITERIA: COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the Basic Formal Ontology (BFO)

OBO Foundry provides guidelines (traffic laws) to new groups of ontology developers in ways which can counteract current dispersion of effort

48 Building out from the original GO
RELATION TO TIME (columns): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO), Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL), Cellular Component (FMA, GO) | Cellular Function (GO) | (none listed)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

49 RELATION TO TIME (columns): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO), Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL), Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

Basic Formal Ontology: continuant; occurrent (biological processes); independent continuant (cellular component); dependent continuant (molecular function)

BFO: The Very Top: continuant (independent continuant; dependent continuant: quality, function, role, disposition); occurrent
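The top-level categories named on this slide can be mirrored as a small class hierarchy. This Python sketch is only a reading aid, not BFO itself: the root class `Entity` and the helper `lineage` are additions made here, and placing quality, function, role, and disposition under dependent continuant follows the order in which the slide lists them.

```python
class Entity: ...
class Continuant(Entity): ...
class Occurrent(Entity): ...
class IndependentContinuant(Continuant): ...
class DependentContinuant(Continuant): ...
class Quality(DependentContinuant): ...
class Function(DependentContinuant): ...
class Role(DependentContinuant): ...
class Disposition(DependentContinuant): ...


def lineage(cls):
    """Names of a category and its ancestors, most specific first."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


print(lineage(Function))
print(issubclass(Role, Continuant), issubclass(Role, Occurrent))
```

Reading the lineage from right to left gives the path from the most general category down to the one queried, and the subclass test confirms that a role is a kind of continuant and never an occurrent.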

function: of liver: to store glycogen; of birth canal: to enable transport; of eye: to see; of mitochondrion: to produce ATP. Not optional; reflection of physical makeup of bearer

role: optional; exists because the bearer is in some special natural, social, or institutional set of circumstances in which the bearer does not have to be

role: bearers can have more than one role (person as student and staff member); roles often form systems of mutual dependence (husband / wife; first in queue / last in queue; doctor / patient; host / pathogen)

role: of some chemical compound: to serve as analyte in an experiment; of a dose of penicillin in this human child: to treat a disease; of these bacteria in a primary host: to cause infection

A good solution to the silo problem must be: modular; incremental; bottom-up; evidence-based; revisable; incorporate a strategy for motivating potential developers and users 56

Because the ontologies in the Foundry are built as orthogonal modules which form an incrementally evolving network: scientists are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network; users are motivated by the assurance that the ontologies they turn to are maintained by experts 57

More benefits of orthogonality: helps those new to ontology to find what they need and to find models of good practice; ensures mutual consistency of ontologies (trivially) and thereby ensures additivity of annotations 58

More benefits of orthogonality: it rules out the sorts of simplification and partiality which may be acceptable under more pluralistic regimes; it thereby brings an obligation on the part of ontology developers to commit to scientific accuracy and domain-completeness 59