1 The Unbearable Lightness of Biomedical Informatics Barry Smith Saarbrücken/Buffalo

Presentation transcript:

1 The Unbearable Lightness of Biomedical Informatics Barry Smith Saarbrücken/Buffalo

2 If Medical WordNet* is the solution, what is the problem? *Coling Proceedings, Vol. 1, pp

3

4 Cerebellar tumor

5 [Figure: DNA, Protein, Organelle, Cell, Tissue, Organ, Organism]

6 The quantity-quality divide: 30,000 genes in human; 200,000 proteins; 100s of cell types; 100,000s of disease types; 1,000,000s of biochemical pathways (including disease pathways) … legacy of Human Genome Project … and of attempts to institute the electronic health record

7 [Figure: DNA, Protein, Organelle, Cell, Tissue, Organ, Organism]

8 FUNCTIONAL GENOMICS: proteomics, reactomics, metabonomics, toxicopharmacogenomics, phenomics, behaviouromics, …

9 The method of annotations [Figure: DNA, Protein, Organelle, Cell, Tissue, Organ, Organism]

10 The method of indexing [Figure: DNA, Protein, Organelle, Cell, Tissue, Organ, Organism]

11 The Gene Ontology: menopause; sensitivity to blue light; heptolysis

12

13 How to overcome incompatibilities between different scientific index terms? (immunology, genetics, cell biology)

14 One answer: (statistical) computational linguistics. Pattern recognition based on string searches

15 String searches need constraints: we can’t leave it to luck to overcome terminological incompatibilities

16 Remember: different disciplines are using different terminologies to refer to the same objects, processes, features in reality (immunology, genetics, cell biology)

17 An alternative answer: “Ontology”

18 Ontology, roughly: Overcome terminological incompatibilities by creating a standardized framework into which diverse vocabularies can be mapped

19 Kinds of Ontologies [Figure, credited to Michael Gruninger: a spectrum running from Terms to General Logic. Items shown: ‘ordinary’ Glossaries; Data Dictionaries (EDI); ad hoc Hierarchies (Yahoo!); Thesauri; structured Glossaries; XML DTDs; Principled, informal hierarchies; DB Schema; XML Schema; Data Models (UML, STEP); formal Taxonomies; Frames (OKBC); Description Logics (DAML+OIL). Groupings: Glossaries & Data Dictionaries; Thesauri, Taxonomies; MetaData, XML Schemas, & Data Models; Formal Ontologies & Inference]

20 Kinds of Ontologies. Two extremes. A shared vocabulary plus a specification of its intended meaning: meaning specified explicitly in a logically rigorous way

21 Kinds of Ontologies [Figure: a spectrum running from Terms to General Logic. Items shown: ‘ordinary’ Glossaries; Data Dictionaries (EDI); ad hoc Hierarchies (Yahoo!); Thesauri; structured Glossaries; XML DTDs; Principled, informal hierarchies; DB Schema; XML Schema; Data Models (UML, STEP); formal Taxonomies; Frames (OKBC); Description Logics (DAML+OIL). Groupings: Glossaries & Data Dictionaries; Thesauri, Taxonomies; MetaData, XML Schemas, & Data Models; Formal Ontologies & Inference]

22 Kinds of Ontologies. A shared vocabulary plus a specification of its intended meaning: meaning specified explicitly in a logically rigorous way. Too expensive

23 Kinds of Ontologies. Two extremes. A shared vocabulary plus a specification of its intended meaning: meaning specified informally via natural language

24 Work on biomedical ontologies grew out of work on medical thesauri and nomenclatures

25 Kinds of Ontologies [Figure: a spectrum running from Terms to General Logic. Items shown: ‘ordinary’ Glossaries; Data Dictionaries (EDI); ad hoc Hierarchies (Yahoo!); Thesauri; structured Glossaries; XML DTDs; Principled, informal hierarchies; DB Schema; XML Schema; Data Models (UML, STEP); formal Taxonomies; Frames (OKBC); Description Logics (DAML+OIL). Groupings: Glossaries & Data Dictionaries; Thesauri, Taxonomies; MetaData, XML Schemas, & Data Models; Formal Ontologies & Inference]

26 [Figure: nodes Fruit, Orange, Vegetable, Apfelsine; edge labels similarTo, synonymWith, NarrowerTerm] Graph with labelled edges (similarTo, Narrower, synonymWith). Fixed set of edge labels (a.k.a. relations). Goble & Shadbolt
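
The thesaurus picture on this slide can be sketched as a small graph whose edges carry labels drawn from a fixed set. This is only an illustrative sketch: the class names `Relation` and `Thesaurus` are hypothetical, and the assignment of labels to particular pairs of nodes is an assumption, since the slide's figure does not survive in the transcript.

```python
from enum import Enum


class Relation(Enum):
    """The fixed set of edge labels named on the slide."""
    SIMILAR_TO = "similarTo"
    NARROWER_TERM = "NarrowerTerm"
    SYNONYM_WITH = "synonymWith"


class Thesaurus:
    """A graph whose edges are (source, relation, target) triples."""

    def __init__(self):
        self.edges = set()

    def add(self, source, relation, target):
        # Only labels from the fixed set are accepted.
        if not isinstance(relation, Relation):
            raise ValueError(f"unknown edge label: {relation!r}")
        self.edges.add((source, relation, target))

    def related(self, source, relation):
        """Return the targets reachable from source via the given label."""
        return sorted(t for s, r, t in self.edges if s == source and r is relation)


# Assumed wiring of the nodes; the original figure may have differed.
thesaurus = Thesaurus()
thesaurus.add("Orange", Relation.NARROWER_TERM, "Fruit")
thesaurus.add("Orange", Relation.SYNONYM_WITH, "Apfelsine")
thesaurus.add("Fruit", Relation.SIMILAR_TO, "Vegetable")

print(thesaurus.related("Orange", Relation.SYNONYM_WITH))
```

Restricting labels to an enumeration makes the "fixed set of edge labels" explicit, but note that nothing in such a structure says what the labels mean; that is left entirely to the reader.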

27 Unified Medical Language System (UMLS). UMLS Metathesaurus: 1 million biomedical concepts; 2.8 million concept names from more than 100 controlled vocabularies and classifications; built by US National Library of Medicine

28 UMLS Source Vocabularies MeSH – Medical Subject Headings … ICD International Classification of Diseases … GO – Gene Ontology … FMA – Foundational Model of Anatomy …

29 To reap the benefits of standardization we need to make ONE SYSTEM out of many different terminologies = UMLS “Semantic Network” nearest thing to an “ontology” in the UMLS

30 UMLS SN Alexa McCray, “An Upper Level Ontology for the Biomedical Domain”, Comparative and Functional Genomics, 4 (2003),

31 UMLS SN: 134 Semantic Types; 54 types of edges (relations), yielding a graph containing more than 6,000 edges

32 Fragment of UMLS SN

33

34

35 UMLS SN Top Level: entity, event, physical object, conceptual entity, organism

36 conceptual entity Organism Attribute Finding Idea or Concept Occupation or Discipline Organization Group Group Attribute Intellectual Product Language

37 conceptual entity Organism Attribute Finding Idea or Concept Occupation or Discipline Organization Group Group Attribute Intellectual Product Language

38 Idea or Concept Functional Concept Qualitative Concept Quantitative Concept Spatial Concept Body Location or Region Body Space or Junction Geographic Area Molecular Sequence Amino Acid Sequence Carbohydrate Sequence Nucleotide Sequence

39 Idea or Concept Functional Concept Qualitative Concept Quantitative Concept Spatial Concept Body Location or Region Body Space or Junction Geographic Area Molecular Sequence Amino Acid Sequence Carbohydrate Sequence Nucleotide Sequence

40 Idea or Concept Functional Concept Qualitative Concept Quantitative Concept Spatial Concept Body Location or Region Body Space or Junction Geographic Area Molecular Sequence Amino Acid Sequence Carbohydrate Sequence Nucleotide Sequence

41 Idea or Concept Functional Concept Qualitative Concept Quantitative Concept Spatial Concept Body Location or Region Body Space or Junction Geographic Area Molecular Sequence Amino Acid Sequence Carbohydrate Sequence Nucleotide Sequence

42 Lake Geneva is an Idea or Concept

43 Idea or Concept Functional Concept Qualitative Concept Quantitative Concept Spatial Concept Body Location or Region Body Space or Junction Geographic Area Molecular Sequence Amino Acid Sequence Carbohydrate Sequence Nucleotide Sequence

44 UMLS: Fingers is_a Body Location or Region; Hand is_a Body Part, Organ, or Organ Component; hand part_of body BUT NOT fingers part_of hand

45 Problem: Running together of concepts and entities in reality. Bioinformatics à la UMLS SN (like many “knowledge engineering” disciplines) floats free from reality in a conceptual world of its own creation

46 Blood Pressure Ontology. The hydraulic equation: BP = CO*PVR. Arterial blood pressure (BP) is directly proportional to the product of blood flow (cardiac output, CO) and peripheral vascular resistance (PVR).

47 UMLS SN: blood pressure is an Organism Function; cardiac output is a Laboratory or Test Result or Diagnostic Procedure

48 BP = CO*PVR thus asserts that blood pressure is proportional either to a laboratory or test result or to a diagnostic procedure

49 Problem: Confusion of reality with our (ways of gaining) knowledge about reality

50 UMLS Semantic Network: entity, physical object, conceptual entity

51 Physical Object Substance Food Chemical Body

52 Chemical Viewed Structurally Functionally

53 Problem: Confusion of objects with our ways of referring to objects

54 Chemical Viewed Structurally Functionally Inorganic Organic Enzyme Biomedical or Chemical Chemical Dental Material

55 This multiple inheritance leads to errors in coding. Gene Ontology will eliminate multiple inheritance

56 UMLS Semantic Network: entity, physical object, conceptual entity, organism (linked by is_a)

57 UMLS SN is_a = def. If one item ‘is_a’ another item then the first item is more specific in meaning than the second item. (Italics added)

58 fish is_a vertebrate; copulation is_a biological process; both testes is_a testis; Nazi is_a Nazism; plant parts is_a plant

59

60 What are the nodes in this graph? Almost all nodes are linked to other nodes by a multiplicity of different types of edges. Compare: swimming is healthy; swimming has 8 letters

61 Semantic Network Definition: Concept =def. An abstract concept, such as a social, religious, or philosophical concept. UMLS Definition: Concept =def. A class of synonymous terms

62

63 How can concepts figure as relata of these relations? part_of =def. Composes, with one or more other physical units, some larger whole. causes =def. Brings about a condition or an effect. contains =def. Holds or is the receptacle for fluids or other substances.

64 How can a set of synonymous terms serve as a receptacle for fluids or other substances? How can sets of synonymous terms stand in relations such as affects or causes?

65 connected_to =def. Directly attached to another physical unit as tendons are connected to muscles. How can a concept be directly attached to another physical unit?

66 What are the relata which are linked by the edges in the SN graph?

67 To answer this question we need to distinguish clearly between concepts and classes: concepts are creatures of cognition; classes are invariants (types, kinds, universals) out there in reality

68 If ontologies are about meanings / concepts it becomes impossible to deal coherently with those relations between entities in reality which involve appeal to both classes and their instances.

69 Illustration re: part_of: heart part_of human; human heart part_of human; testis part_of human; human testis part_of human

70 For instances: part_of = instance-level parthood (for example between Mary and her heart). For classes: A part_of B =def. given any instance a of A there is some instance b of B such that a part_of b. This is an assertion about As.
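
The class-level definition above can be checked mechanically over a finite set of instances. The following is a minimal sketch under stated assumptions: the function name `class_part_of` and the toy instances (Mary, John and their parts) are hypothetical and serve only to illustrate the definition.

```python
def class_part_of(instances_a, instances_b, instance_part_of):
    """A part_of B: every instance a of A is part of some instance b of B.

    instance_part_of is a set of (part, whole) pairs recording
    instance-level parthood.
    """
    return all(
        any((a, b) in instance_part_of for b in instances_b)
        for a in instances_a
    )


# Hypothetical toy data.
human_hearts = {"marys_heart", "johns_heart"}
human_testes = {"johns_testis"}
humans = {"mary", "john"}
instance_part_of = {
    ("marys_heart", "mary"),
    ("johns_heart", "john"),
    ("johns_testis", "john"),
}

heart_result = class_part_of(human_hearts, humans, instance_part_of)
testis_result = class_part_of(human_testes, humans, instance_part_of)
print(heart_result, testis_result)  # True True
```

Both checks succeed even though Mary has no testis among her parts: the definition quantifies over the instances of A only, which is the sense in which it is "an assertion about As".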

71 For instances: a adjacent_to b (instance-level adjacency, for example between Mary’s head and Mary’s neck). For classes: A adjacent_to B =def. given any instance a of A there is some instance b of B which is such that a adjacent_to b

72 A adjacent_to B as an assertion about classes is never an assertion about As exclusively

73 A adjacent_to B =def. given any instance a of A there is some instance b of B which is such that a adjacent_to b and given any instance b of B there is some instance a of A which is such that a adjacent_to b
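
The difference between the one-sided definition and the two-sided definition just given can be made concrete on a small example. This is a sketch only: the function names and the instances `a1`, `a2`, `b1`, `b2`, `b3` are hypothetical.

```python
def adjacent_one_sided(instances_a, instances_b, adjacent):
    """Every instance a of A is adjacent to some instance b of B."""
    return all(
        any((a, b) in adjacent for b in instances_b)
        for a in instances_a
    )


def adjacent_two_sided(instances_a, instances_b, adjacent):
    """The one-sided condition, plus: every b of B is adjacent to some a of A."""
    every_b_has_some_a = all(
        any((a, b) in adjacent for a in instances_a)
        for b in instances_b
    )
    return adjacent_one_sided(instances_a, instances_b, adjacent) and every_b_has_some_a


# Hypothetical instances: b3 is adjacent to no instance of A.
instances_a = {"a1", "a2"}
instances_b = {"b1", "b2", "b3"}
adjacent = {("a1", "b1"), ("a2", "b2")}

one_sided = adjacent_one_sided(instances_a, instances_b, adjacent)
two_sided = adjacent_two_sided(instances_a, instances_b, adjacent)
print(one_sided, two_sided)  # True False
```

A single instance of B with no adjacent A leaves the one-sided reading true and makes the two-sided reading false, so the two definitions are not interchangeable.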

74 Almost all of the 54 types of edges in SN are dealt with incoherently. part_of HAS INVERSE has_part: nucleus part_of cell; cell has_part nucleus
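
One way to see why an inverse that holds between instances need not carry over to classes is to read both class-level statements in the "given any instance … there is some instance" form used on the earlier slides. The sketch below assumes that reading; the helper `all_some` and the instance names are hypothetical, and `c2` stands in for a cell without a nucleus (mature mammalian red blood cells are an example).

```python
def all_some(instances_x, instances_y, related):
    """Every instance x of X stands in the relation to some instance y of Y."""
    return all(
        any((x, y) in related for y in instances_y)
        for x in instances_x
    )


# Hypothetical instances: c2 is a cell that has no nucleus.
nuclei = {"n1"}
cells = {"c1", "c2"}
part_of = {("n1", "c1")}

# Between instances, has_part is simply part_of with the pairs reversed.
has_part = {(whole, part) for part, whole in part_of}

nucleus_part_of_cell = all_some(nuclei, cells, part_of)
cell_has_part_nucleus = all_some(cells, nuclei, has_part)
print(nucleus_part_of_cell, cell_has_part_nucleus)  # True False
```

Under this reading "nucleus part_of cell" holds while "cell has_part nucleus" fails, so declaring has_part the inverse of part_of at the class level licenses an inference that the instances do not support.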

75

76 Acquired Abnormality affects Fish; Experimental Model of Disease affects Fungus; Food causes Experimental Model of Disease; Bacterium causes Experimental Model of Disease; Biomedical or Dental Material causes Mental or Behavioral Dysfunction; Manufactured Object causes Disease or Syndrome; Vitamin causes Injury or Poisoning

77 How to do better?

78 How to do better? How to create a network of biomedically relevant terms/classes, with coherently defined relations between them, to which expert terms of the UMLS can be assigned in a maximally intelligible way?

79 What linguistic framework is shared in common by immunologists, geneticists and cell biologists, by phenobehavioromists and by toxicopharmacogenomists?

80 Answer: the natural language they all use to talk about biological (biomedical) phenomena

81 BioWordNet joint work with Christiane Fellbaum (see paper in Proceedings)

82 BioWordNet: use WordNet’s biomedical vocabulary to create a better alternative to UMLS SN

83 Strengths of WordNet 2.0: open source; very broad coverage; is-a / part-of architecture; tool for automatic sense disambiguation

84 Weaknesses of WordNet 2.0: problems with relations; mixes up expert and non-expert vocabulary; errors; gaps; noise. All prevent WordNet’s being used in scientific contexts as a substitute for UMLS SN

85 Fix WordNet’s relations by using the methodology outlined above, already applied to: Foundational Model of Anatomy; Gene Ontology; Open Biological Ontologies

86 Institute for Formal Ontology and Medical Information Science Saarbrücken

87 WordNet mixes up expert and non-expert vocabulary, both current and medieval: suppuration#2 {pus, purulence, suppuration, ichor, sanies, festering}

88 WordNet contains biomedically relevant errors. Example: snore-sleep. WordNet: if someone snores, then he necessarily also sleeps. But snoring = the respiratory-induced vibration of glottal tissues, associated not only with sleep but also with relaxation or obesity

89 WordNet has too much noise for purposes of scientific applications

90 13 senses for ‘feel’ as a verb: experience – She felt resentful; find – I feel that he doesn't like me; feel – She felt small and insignificant; feel – We felt the effects of inflation; feel – The sheets feel soft; grope – He felt for his wallet; finger – Feel this soft cloth!; explore – He felt his way around the dark room; feel – It feels nice to be home again; feel – He felt the girl in the movie theater

91 Medical senses of ‘feel’: palpate – examine a body part by palpation: The runner felt her pulse; sense – perceive by a physical sensation, e.g. coming from the skin or muscles: He felt his flesh crawl; feel – seem with respect to a given sensation: My cold is gone – I feel fine today

92 WordNet has gaps even in its coverage of biomedical natural language

93 WordNet senses of ‘regulation’: 1. regulation (ordinance, rule) 2. rule, regulation -- (a principle that customarily governs behavior; "short haircuts were the regulation") 3. regulation -- (the state of being controlled or governed) 4. regulation -- (the ability of an early embryo to continue normal development after its structure has been somehow damaged) 5. regulation, regularization, regularisation -- (the act of bringing to uniformity) 6. regulation, regulating -- (the act of controlling according to rule; "fiscal regulations are in the hands of politicians")

94 Biological sense of ‘regulation’: A process that modulates the frequency, rate or extent of behavior (Gene Ontology)

95 WordNet senses of ‘inhibition’: 1. inhibition, suppression -- ((psychology) the conscious exclusion of unacceptable thoughts or desires) 2. inhibition -- (the quality of being inhibited) 3. inhibition -- (the process whereby nerves can retard or prevent the functioning of an organ or part; "the inhibition of the heart by the vagus nerve") 4. prohibition, inhibition, forbiddance -- (the action of prohibiting or forbidding)

96 Biological senses of ‘inhibition’ are much broader: inhibition = negative regulation; enzymes can be inhibited; reactions can be inhibited … and not only by nerves

97 WordNet senses of ‘binding’: 1. binding -- (the capacity to attract and hold something) 2. binding -- (a strip sewn over or along an edge for reinforcement or decoration) 3. dressing, bandaging -- (the act of applying a bandage) 4. binding, book binding -- ("the book had a leather binding")

98 Biological sense of ‘binding’: interacting selectively with (Gene Ontology)

99 Remove errors, noise and gaps in a two-stage process: 1. select biomedically relevant natural-language terms from WordNet 2.0 extended by standard biomedical information sources; 2. validate these terms and the relations between them

100 Validation: each arc in BWN is converted into a natural-language sentence, e.g. ‘mumps is an inflammation’. Via controlled human-subjects experiments, these sentences are accredited 1. as intelligible by non-experts, 2. as true by experts
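
The conversion of arcs into sentences might be sketched as follows. This is an illustrative assumption, not the BWN procedure itself: the templates, the article heuristic, and the names `arc_to_sentence` and `is_accredited` are all hypothetical, and the two boolean inputs stand in for the outcomes of the human-subjects experiments.

```python
# Hypothetical sentence templates, one per relation label.
TEMPLATES = {
    "is_a": "{subject} is {article} {object}",
    "part_of": "{subject} is part of {article} {object}",
}


def article_for(noun):
    """Crude indefinite-article heuristic based on the first letter."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def arc_to_sentence(subject, relation, obj):
    """Render one (subject, relation, object) arc as a natural-language sentence."""
    template = TEMPLATES[relation]
    return template.format(subject=subject, article=article_for(obj), object=obj)


def is_accredited(intelligible_to_non_experts, true_for_experts):
    """A sentence is accredited only if it passes both checks."""
    return intelligible_to_non_experts and true_for_experts


sentence = arc_to_sentence("mumps", "is_a", "inflammation")
print(sentence)  # mumps is an inflammation
```

Keeping the two judgments separate mirrors the slide: intelligibility is assessed by non-experts and truth by experts, and an arc survives only if its sentence passes both.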

101 we use logical methods to ensure a coherent treatment of BWN’s upper-level classes and relations and thereby also bring logical rigor in a practical fashion to the whole of the UMLS Metathesaurus

102 Bring ontological rigour to BWN [Figure: a spectrum running from Terms to General Logic. Items shown: ‘ordinary’ Glossaries; Data Dictionaries (EDI); ad hoc Hierarchies (Yahoo!); Thesauri; structured Glossaries; XML DTDs; Principled, informal hierarchies; DB Schema; XML Schema; Data Models (UML, STEP); formal Taxonomies; Frames (OKBC); Description Logics (DAML+OIL). Groupings: Glossaries & Data Dictionaries; Thesauri, Taxonomies; MetaData, XML Schemas, & Data Models; Formal Ontologies & Inference]

103 The long-term goal: BWN should serve as scaffolding/indexing system for the much larger and denser net of expert biomedical terminology which is the UMLS Metathesaurus

104 The End