1 Ontologies in Biomedicine: The Good, The Bad and The Ugly Barry Smith

Presentation transcript:

1 Ontologies in Biomedicine: The Good, The Bad and The Ugly Barry Smith

2 The Good Foundational Model of Anatomy (FMA) Pro Very clear statement of scope: structural human anatomy, at all levels of granularity, from the whole organism to the biological macromolecule Powerful treatment of definitions, from which the entire FMA hierarchy is generated – can serve as basis for formal reasoning Con Some unfortunate artifacts in the ontology deriving from its specific computer representation (Protégé)

3 Intermediate GALEN Pro Allows formal representation of clinical information Allows multiple views of relevant detail as needed Uses powerful Description Logic (DL)-based formal structure Con Remains only partially developed Contains errors: Vomitus contains carrot – which DLs did not prevent

4 Intermediate The Gene Ontology Con Poor formal architecture Full of errors (e.g. menopause part_of death) Poor support for automatic reasoning and error-checking Poor treatment of definitions Not trans-granular No relation to time or instances

5 The Gene Ontology Pro Open Source Cross-Species... has recognized the need for reform, including explicit representation of granular levels

6 Problem of Circularity GO: : Protection from natural killer cell mediated cytolysis Definition: The process of protecting a cell from cytolysis by natural killer cells.

7 GO: hemolysis Definition: The processes that cause hemolysis X = def. the Y of X this is worse than circular
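
A crude lexical check for the kind of circularity shown on the two slides above might look like the sketch below. It is only a naive heuristic, not anything GO itself is known to use: the function name `reuses_term` and the whole-word matching rule are assumptions made for illustration, and the second definition is illustrative wording rather than a GO definition.

```python
import re


def reuses_term(term: str, definition: str) -> bool:
    """Return True if the definition repeats the defined term as a whole word or phrase."""
    pattern = r"\b" + re.escape(term.lower()) + r"\b"
    return re.search(pattern, definition.lower()) is not None


# The definition quoted on the slide reuses the term it is meant to define.
circular = reuses_term("hemolysis", "The processes that cause hemolysis")

# Illustrative non-circular wording (not taken from GO).
non_circular = reuses_term(
    "hemolysis", "The rupture of red blood cells and the release of their contents"
)
```

A check like this would not catch the first example (slide 6), where the definition paraphrases the term ("protecting ... from cytolysis") instead of repeating it verbatim.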

8 The Bad Reactome Pro Rich catalogue of biological process Con Incoherent treatment of categories: ReferentEntity (embracing e.g. small molecules) is a sibling of PhysicalEntity (embracing complexes, molecules, ions and particles). Similarly CatalystActivity is a sibling of Event.

9 The Bad National Cancer Institute Thesaurus Pro Open source; ambitiously broad coverage; DL-based Con Poor realization of DL formalism Full of mistakes (many inherited from its UMLS sources): –three disjoint classes of plants: Vascular Plant, Non-vascular Plant, Other Plant –three disjoint kinds of cells: Cell, Normal Cell, Abnormal Cell –Normal Cell is_a Microanatomy See

10 National Cancer Institute Thesaurus Duratec, Lactobutyrin and Stilbene Aldehyde classified as: Unclassified Drugs and Chemicals Pro NCIT, too, has recognized the need for reform (NCIT is part of the OBO library)

11 The Ugly UMLS Semantic Network Pros Broad coverage; no multiple inheritance Cons Incoherent use of ‘conceptual entities’ (e.g. the digestive system as a conceptual part of the organism) Full of errors

12 UMLS Semantic Network Edges in the graph represent merely “possible significant relations”: –Bacterium causes Experimental Model of Disease –Experimental Model of Disease affects Fungus –Experimental model of disease is_a Pathologic Function

13 UMLS Semantic Network Unclear what the nodes of the graph are: Drug Delivery Device contains Clinical Drug Drug Delivery Device narrower_in_meaning_than Manufactured Object The use-mention confusion: “Swimming is healthy and has 8 letters”

14 The Ugly Clinical Terms Version 2 (The Read Codes) Classifies chemicals into: chemicals whose name begins with ‘A’, chemicals whose name begins with ‘B’, chemicals whose name begins with ‘C’,...

15 The Astonishingly (Criminally?) Ugly Health Level 7 HL7 is a UML-based standard for exchange of information between clinical information systems; it has proved very crumbly as a standard. The HL7 Reference Information Model (RIM) is supposed to overcome this problem by defining the universe of healthcare data in a rigorous way

16 HL7-RIM Animal Definition: A subtype of Living Subject representing any animal-of-interest to the Personnel Management domain. Person A subtype of Living Subject representing single human being [sic] who, in the context of the Personnel Management domain, must also be uniquely identifiable through one or more legal documents. LivingSubject Definition: A subtype of Entity representing an organism or complex animal, alive or not.

17 HL7 RIM: The Problem of Circularity Person = Person with documents. This has the form: ‘An A is an A which is B’. Definitions of this form: – are useless in practical terms, since neither we nor the machine can use them to find out what ‘A’ means – incorporate a vicious infinite regress – have the effect of making it impossible to refer to A’s which are not Bs, for example to an undocumented person

18 HL7 Logically Incoherent act = the record of an act This has the form: An X is the Y of an X again worse than circular

19 HL7-RIM: Logically Contradictory Definitions Definition of Act: An Act is an action of interest that has happened, can happen, is happening, is intended to happen, or is requested/demanded to happen. Definition of Act: An Act is the record of something that is being done, has been done, can be done, or is intended or requested to be done.

20 HL7 RIM Ontologically Incoherent The truth about the real world is constructed through a combination and arbitration of attributed statements... As such, there is no distinction between an activity and its documentation.

21 HL7 Incredibly Successful: embraced as US federal standard; central part of $15 billion program to integrate all UK hospital information systems; made mandatory by Canada Health Infoway; adopted by Oracle as basis for its EHR support programs

22 HL7 Merchandizing

23 From molecules to diseases A good ontology should enable us to organize our information resources in such a way that we can bridge the granularity gap between genomics and proteomics data and phenotype (clinical, pharmacological, patient-centered) data

24 good ontologies require: Coherent upper level taxonomy distinguishing: continuants (cells, molecules, organisms...) from occurrents (events, processes); dependent entities (qualities, functions...) from independent entities (their bearers); universals (types, kinds) from instances (tokens, individuals). Coherent relation ontology supporting inference both within and between ontologies.

25 good ontologies require: Consistent use of terms, supported by logically coherent (non-circular) definitions, in both human-readable and computable formats

26 Open Biomedical Ontologies (OBO) Upper Biomedical Ontology (UBO)
root UBO: :top
subclass BFO:continuant:continuant
– subclass BFO:dependent_entity:dependent_entity
subclass UBO: :quality
– subclass UBO: :phenotype
» subclass UBO: :state
– subclass UBO: :disease
» subclass UBO: :function
– subclass GO: :molecular_function
subclass BFO:disposition:disposition
– subclass BFO:independent_entity:independent_entity
subclass UBO: :substance
– subclass UBO: :protein
– subclass GO: :cellular_component
– subclass UBO: :anatomical_entity
» subclass UBO: :gross_anatomical_entity
– subclass UBO: :organism
» subclass UBO: :microbe
» subclass UBO: :plant
» subclass UBO: :animal
subclass BFO:fiat_part_of_substance:fiat_part_of_substance
subclass BFO:boundary_of_substance:boundary_of_substance
subclass BFO:aggregate_of_substances:aggregate_of_substances
subclass BFO:occurrent:occurrent
– subclass BFO:dependent_occurrent:dependent_occurrent
subclass UBO: :process
– subclass GO: :biological_process
subclass BFO:fiat_part_of_process:fiat_part_of_process
– subclass UBO: :life_cycle_stage
subclass BFO:aggregate_of_processes:aggregate_of_processes
– subclass EO: :environment ontology
subclass BFO:temporal_boundary_of_process:temporal_boundary_of_process
– subclass BFO:independent_occurrent:independent_occurrent

27 OBO Relation Ontology (RO) Clear distinction between universals (classes, kinds, types) and instances (individuals, tokens). Precise formal definitions of relations. Automatic applicability to time-indexed instance data, e.g. in Electronic Health Record. Consistency with the Relation Ontology now a criterion for admission to the OBO ontology library; see Genome Biology Apr. 2006

28 Three types of relations: between instances: Mary’s heart part_of Mary; between an instance and a universal: Mary instance_of homo sapiens; between universals: gastrulation part_of embryonic development

29 A suite of primitive instance-level relations identical_to part_of located_in adjacent_to earlier derives_from...

30 A suite of defined relations between universals. Foundational: is_a, part_of. Spatial: located_in, contained_in, adjacent_to. Temporal: transformation_of, derives_from, preceded_by. Participation: has_participant, has_agent

31 GALEN: Vomitus contains carrot All portions of vomit contain all portions of carrot All portions of vomit contain some portion of carrot Some portions of vomit contain some portion of carrot Some portions of vomit contain all portions of carrot
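
The four candidate readings listed above differ only in how the two classes are quantified. A minimal sketch, assuming each class is modelled as a finite set of instance names and the instance-level relation as a set of pairs (the data and function names are invented for illustration):

```python
def all_all(xs, ys, rel):
    """Every x stands in rel to every y."""
    return all((x, y) in rel for x in xs for y in ys)


def all_some(xs, ys, rel):
    """Every x stands in rel to at least one y."""
    return all(any((x, y) in rel for y in ys) for x in xs)


def some_some(xs, ys, rel):
    """At least one x stands in rel to at least one y."""
    return any((x, y) in rel for x in xs for y in ys)


def some_all(xs, ys, rel):
    """At least one x stands in rel to every y."""
    return any(all((x, y) in rel for y in ys) for x in xs)


# Toy instance data: only one portion of vomit contains any carrot.
vomit = {"vomit1", "vomit2"}
carrot = {"carrot1", "carrot2"}
contains = {("vomit1", "carrot1")}

readings = {
    "all-all": all_all(vomit, carrot, contains),
    "all-some": all_some(vomit, carrot, contains),
    "some-some": some_some(vomit, carrot, contains),
    "some-all": some_all(vomit, carrot, contains),
}
```

On this toy data only the some-some reading holds, which is why an unqualified assertion such as "Vomitus contains carrot" is ambiguous until the intended quantification is fixed.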

32 all-some structure A part_of B =def. given any instance a of A there is some instance b of B such that a part_of b on the instance level. Allows automatic ontology integration via cascading reasoning: A R1 B, B R2 C ⇒ A R3 C
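
The cascading pattern can be illustrated for the special case where R1, R2 and R3 are all part_of. The sketch below assumes that instance-level part_of is transitive and uses made-up instance names; it is a toy check over finite sets, not a description of how any OBO tool reasons.

```python
def transitive_closure(pairs):
    """Return the smallest transitive relation containing the given pairs."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


def all_some(xs, ys, rel):
    """Class-level reading: every x stands in rel to at least one y."""
    return all(any((x, y) in rel for y in ys) for x in xs)


# Hypothetical instances of three classes A, B and C.
A = {"a1", "a2"}
B = {"b1", "b2"}
C = {"c1"}

# Asserted instance-level part_of facts, closed under transitivity.
asserted = {("a1", "b1"), ("a2", "b2"), ("b1", "c1"), ("b2", "c1")}
part_of = transitive_closure(asserted)

a_part_of_b = all_some(A, B, part_of)
b_part_of_c = all_some(B, C, part_of)
a_part_of_c = all_some(A, C, part_of)  # follows from the two statements above
```

Because both premises have the all-some form, each instance of A can be chained through some instance of B to some instance of C, so the conclusion again has the all-some form.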

33 adjacent_to cell wall adjacent_to cytoplasm intron adjacent_to exon Golgi apparatus adjacent_to endoplasmic reticulum periplasm adjacent_to plasma membrane presynaptic membrane adjacent_to synaptic cleft

34 A adjacent_to B: every instance of A stands in the instance-level adjacent_to relation to some instance of B

35 adjacent_to as a relation between universals is not symmetric. nucleus adjacent_to cytoplasm; Not: cytoplasm adjacent_to nucleus. seminal vesicle adjacent_to urinary bladder; Not: urinary bladder adjacent_to seminal vesicle
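
The asymmetry comes from the all-some reading, not from the instance-level relation. In the sketch below, instance-level adjacency is assumed to be symmetric (an assumption of this toy model), and one cytoplasm instance is imagined to belong to a cell with no nucleus; all names are invented for illustration.

```python
def all_some(xs, ys, rel):
    """Class-level reading: every x stands in rel to at least one y."""
    return all(any((x, y) in rel for y in ys) for x in xs)


def symmetric(pairs):
    """Close a set of pairs under symmetry."""
    return set(pairs) | {(b, a) for (a, b) in pairs}


nuclei = {"nucleus1"}
# cytoplasm2 is imagined to belong to a cell that has no nucleus.
cytoplasms = {"cytoplasm1", "cytoplasm2"}

# Instance-level adjacency, symmetric by assumption in this toy model.
adjacent_to = symmetric({("nucleus1", "cytoplasm1")})

nucleus_adjacent_to_cytoplasm = all_some(nuclei, cytoplasms, adjacent_to)
cytoplasm_adjacent_to_nucleus = all_some(cytoplasms, nuclei, adjacent_to)
```

Every nucleus here is adjacent to some cytoplasm, but not every cytoplasm is adjacent to some nucleus, so the class-level statement holds in one direction only.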

36 The Granularity Gulf most existing data-sources are of fixed, single granularity many (all?) clinical phenomena cross granularities

37 Main obstacle to integrating genetic and EHR data No facility for dealing with time and instances (particulars, individuals) in current ontologies

38 Key idea To define ontological relations like part_of and develops_from, it is not enough to look just at universals / classes / types / ‘concepts’: we need also to take account of instances and time

39 transformation_of A transformation_of B =def. any instance of A was at some earlier time an instance of B

40 transformation_of [Figure: along a time axis, the same instance c is an instance of C1 at time t1 and of C at time t] mature RNA transformation_of pre-RNA; adult transformation_of child; carcinomatous colon transformation_of colon
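
The definition of transformation_of can be checked against time-indexed instance data. The sketch below assumes records of the form (instance, class, time) with integer time stamps; the record layout, names and data are hypothetical, and the check follows the simplified wording of the slide, not a full formal definition.

```python
def transformation_of(a_class, b_class, records):
    """True if every instance of a_class was, at some earlier time, an instance of b_class.

    records is a set of (instance, class_name, time) triples.
    """
    for inst, cls, t in records:
        if cls != a_class:
            continue
        was_b_earlier = any(
            i == inst and c == b_class and t0 < t for (i, c, t0) in records
        )
        if not was_b_earlier:
            return False
    return True


# Hypothetical time-indexed instance data, as might be drawn from an EHR.
records = {
    ("mary", "child", 1),
    ("mary", "adult", 2),
    ("john", "child", 1),
    ("john", "adult", 3),
}

adult_from_child = transformation_of("adult", "child", records)
child_from_adult = transformation_of("child", "adult", records)
```

The check needs both the identity of the instance and the time stamps, which is the point of the key idea above: the relation cannot be stated by looking at the classes alone.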

41 transformation_of relations cross both time and granularity [Figure: c at t1 as an instance of C1; c at t as an instance of C]

42 Advantages of the methodology of enforcing commonly accepted coherent definitions: promotes quality assurance (better coding); guarantees automatic reasoning across ontologies and across data at different granularities; yields direct connection to times and instances in the EHR