STOP Barry Smith Smart Terminologies via Ontological Principles.

Presentation transcript:

STOP Barry Smith

Smart Terminologies via Ontological Principles

ifomis.de 3 Thanks to Anand Kumar Steffen Schulze-Kremer Jane Lomax

ifomis.de 4 Part One Introduction

ifomis.de 5 GO is here an example a. of the sorts of problems confronting life science data integration b. of the degree to which philosophy and logic are relevant to the solution of these problems

ifomis.de 6 When a gene is identified three important types of questions need to be addressed: 1. Where is it located in the cell? 2. What functions does it have on the molecular level? 3. To what biological processes do these functions contribute?

ifomis.de 7 GO’s three ontologies molecular functions cellular components biological processes

ifomis.de 8 Each of GO’s ontologies is organized in a graph-theoretical structure involving two sorts of links or edges: is-a (= is a subtype of), e.g. copulation is-a biological process; part-of, e.g. cell wall part-of cell
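A minimal sketch of the structure described on this slide, assuming a plain list of (child, relation, parent) triples. The names `EDGES` and `parents_of` are hypothetical, and this is not GO's actual storage format; the two edges are the slide's own examples.

```python
# Illustrative only: edges are (child, relation, parent) triples.
# EDGES and parents_of are hypothetical names, not part of any GO tool.
EDGES = [
    ("copulation", "is_a", "biological process"),
    ("cell wall", "part_of", "cell"),
]

def parents_of(term, relation, edges=EDGES):
    """Return the parents of `term` reachable by one edge of the given relation."""
    return {parent for child, rel, parent in edges if child == term and rel == relation}

print(parents_of("cell wall", "part_of"))  # {'cell'}
print(parents_of("cell wall", "is_a"))     # set()
```

Keeping the relation as part of each edge means an is-a link and a part-of link between the same two terms are never conflated.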

ifomis.de 9 Part Two GO as ‘Controlled Vocabulary’

ifomis.de 10 Principle of Univocity terms should have the same meanings (and thus point to the same referents) on every occasion of use

ifomis.de 11 Principle of Compositionality The meanings of compound terms should be determined 1. by the meanings of component terms together with 2. the rules governing syntax

ifomis.de 12 Principle of Syntactic Separateness: Do not confuse sentences with terms. If you want to say ‘No As are Bs’, do not invent a new class of non-Bs and say ‘A is_a non-B’. Example: Holliday junction helicase complex is-a unlocalized

ifomis.de 13 Principle of Objectivity which classes exist in reality is not a function of our biological knowledge. (Terms such as ‘unclassified’ or ‘unknown ligand’ or ‘not otherwise classified as peptides’ do not designate biological natural kinds, and nor do they designate differentia of biological natural kinds)

ifomis.de 14 Keep Epistemology Separate from Ontology. If you want to say ‘We do not know where As are located’, do not invent a new class of As with unknown locations. (A well-constructed ontology should grow linearly; it should not need to delete classes or relations because of increases in knowledge)

ifomis.de 15 GO: cellular component unknown cellular component unknown is-a cellular component
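One way such terms could be flagged automatically is sketched below. The marker substrings are taken from the terms quoted on these slides; the list, the function name `epistemic_terms`, and the sample `TERMS` are assumptions made for illustration, and a plain substring test like this will miss many cases.

```python
# Marker substrings taken from the terms quoted on the slides; the list is
# an assumption and is certainly not exhaustive.
EPISTEMIC_MARKERS = ("unknown", "unclassified", "not otherwise classified")

def epistemic_terms(terms):
    """Return the terms whose names encode a state of knowledge rather than a kind."""
    return [t for t in terms if any(m in t.lower() for m in EPISTEMIC_MARKERS)]

# Hypothetical sample of term names.
TERMS = ["cellular component unknown", "unknown ligand", "cell wall", "nucleus"]
print(epistemic_terms(TERMS))  # ['cellular component unknown', 'unknown ligand']
```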

ifomis.de 16 binding is_a molecular function binding is_a English noun

ifomis.de 17 Principle of Meta-Data Do not include meta-data as if it were just more data Do not confuse meta-data with data about classes in the ontology itself

ifomis.de 18 Principle of Meta-Data obsolete molecular function - list of molecular function terms declared obsolete obsolete molecular function is_a molecular function obsolete molecular function (obsolete)

ifomis.de 19 obsolete molecular function (obsolete) (obsolete)

ifomis.de 20 meta-data data reality

ifomis.de 21 meta-data comments on terms data terms reality natural kinds

ifomis.de 22 meta-data comments on terms data terms ‘is_a’, ‘part_of ’ reality natural kinds is_a, part_of

ifomis.de 23 data: nucleus part_of cell reality: < cellular component part_of Gene Ontology reality: <

ifomis.de 24 data: nucleus part_of cell reality: < cellular component part_of Gene Ontology reality: <

ifomis.de 25 Russell’s Paradox: GO names itself. SwissProt does not name itself. Consider: the database of all biological databases that do not name themselves. This names itself if and only if it does not name itself
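The contradiction can be written out in one line. The notation is an assumption introduced here, not the slide's: N(x, y) reads "x names y", x ranges over biological databases, and d is the database described on the slide.

```latex
\[
\forall x\,\bigl(N(d,x) \leftrightarrow \neg N(x,x)\bigr)
\quad\Longrightarrow\quad
N(d,d) \leftrightarrow \neg N(d,d)
\]
```

The right-hand side follows by instantiating x with d itself, which is legitimate only because d is assumed to be one of the biological databases.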

ifomis.de 26 Part Three GO’s Relation

ifomis.de 27 Principle of Single Inheritance every non-root class in a classificatory hierarchy has exactly one parent no classificatory diamonds:
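A check for this principle might look like the following sketch, assuming the hierarchy is available as (child, parent) is_a pairs. The function name and the toy hierarchy are hypothetical and not taken from GO. Classes with no parent are treated as roots, so only classes with more than one parent are reported.

```python
from collections import defaultdict

def multiple_parent_classes(is_a_edges):
    """Return {class: sorted parents} for every class with more than one is_a parent."""
    parents = defaultdict(set)
    for child, parent in is_a_edges:
        parents[child].add(parent)
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}

# Hypothetical toy hierarchy containing one classificatory diamond (A, B, C, D).
EDGES = [("B", "A"), ("C", "A"), ("D", "B"), ("D", "C")]
VIOLATIONS = multiple_parent_classes(EDGES)
print(VIOLATIONS)  # {'D': ['B', 'C']}
```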

ifomis.de 28 Linnaeus

ifomis.de 29

ifomis.de 30 Uses of multiple inheritance are associated with errors in coding, because ‘is-a’ is no longer univocal. (Diagram: A is-a1 B, A is-a2 C.)

ifomis.de 31 e.g. is_a is pressed into service to express location: is-located-at and similar relations are expressed by creating special compound terms using ‘site of …’, ‘… within …’, ‘… in …’, ‘extrinsic to …’, yielding associated errors

ifomis.de 32 ‘is-a’ overloading is an obstacle to integration with other ontologies and causes other problems

ifomis.de 33 e.g. problems with ‘within’. Term: lytic vacuole within a protein storage vacuole. GO has: lytic vacuole within a protein storage vacuole is-a protein storage vacuole. Compare: time-out within a baseball game is-a baseball game; embryo within a uterus is-a uterus
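The pattern criticized on this slide is regular enough to search for. The sketch below assumes is_a links given as (child, parent) name pairs and flags any child of the form "X within Y" whose parent is Y. The function name and the second sample edge are hypothetical, and the article stripping is a rough heuristic.

```python
def within_is_a_suspects(is_a_edges):
    """Flag 'X within Y is_a Y' links, where a container is treated as a supertype."""
    suspects = []
    for child, parent in is_a_edges:
        _head, sep, container = child.partition(" within ")
        if not sep:
            continue  # not a 'within' compound term
        # Drop a leading article so 'a protein storage vacuole' matches the parent name.
        for article in ("a ", "an ", "the "):
            if container.startswith(article):
                container = container[len(article):]
                break
        if container == parent:
            suspects.append((child, parent))
    return suspects

EDGES = [
    ("lytic vacuole within a protein storage vacuole", "protein storage vacuole"),
    ("cell wall", "cellular component"),  # hypothetical control edge
]
SUSPECTS = within_is_a_suspects(EDGES)
print(SUSPECTS)
```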

ifomis.de 34 similar problems with part_of: extrinsic to membrane part_of membrane.

ifomis.de 35 two distinct terms in GO’s cellular component ontology GO: synaptonemal complex (obsolete) GO: : synaptonemal complex

ifomis.de 36 ‘synaptonemal complex’ GO: synaptonemal complex Definition OBSOLETE. A structure that holds paired chromosomes together during prophase I of meiosis and that promotes genetic recombination.

ifomis.de 37 GO: synaptonemal complex This term was made obsolete because the definition is not true for every organism. To update annotations, use the cellular component term ‘synaptonemal complex ; GO: ’.

ifomis.de 38 ‘synaptonemal complex’ GO: synaptonemal complex Definition: A proteinaceous scaffold found between homologous chromosomes during meiosis. Yet still: synaptonemal complex part_of chromosome

ifomis.de 39 Examples of GO Functions: structural constituent of bone; structural constituent of chorion (sensu Insecta); structural constituent of chromatin; structural constituent of cuticle; structural constituent of cytoskeleton; structural constituent of epidermis; structural constituent of eye lens; structural constituent of muscle; structural constituent of myelin sheath; structural constituent of nuclear pore; structural constituent of peritrophic membrane (sensu Insecta); structural constituent of ribosome (note possibility of confusion with ‘major ribosome unit’ (check)); structural constituent of tooth enamel; structural constituent of vitelline membrane (sensu Insecta)

ifomis.de 40 structural constituent of bone structural constituent of tooth enamel are molecular functions Not biological processes Not cellular components

ifomis.de 41 structural constituent of bone; structural constituent of chorion (sensu Insecta); structural constituent of chromatin; structural constituent of cuticle; structural constituent of cytoskeleton; structural constituent of epidermis; structural constituent of eye lens; structural constituent of muscle; structural constituent of myelin sheath; structural constituent of nuclear pore; structural constituent of peritrophic membrane (sensu Insecta); structural constituent of ribosome (note possibility of confusion with ‘major ribosome unit’ (check)); structural constituent of tooth enamel; structural constituent of vitelline membrane (sensu Insecta). What is the relation between ‘constituent’ and ‘component’?

ifomis.de 42 Units, constituents, components, parts, … What is the relation between structural constituent of ribosome and large ribosomal subunit? How does process relate to activity? These are questions of ontology in the philosophical sense

ifomis.de 43 Part Four GO’s Definitions

ifomis.de 44 Judith Blake: The use of bio-ontologies … ensures consistency of data curation, supports extensive data integration, and enables robust exchange of information between heterogeneous informatics systems... ontologies … formally define relationships between the concepts.

ifomis.de 45 "Gene Ontology: Tool for the Unification of Biology" an ontology "comprises a set of well-defined terms with well-defined relationships" (Ashburner et al., 2000, p. 27)

ifomis.de 46 GO’s term definitions. First problem: circularity (and worse). Term: hemolysis. Definition: The processes that cause hemolysis …
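Direct circularity of this kind can be detected lexically. The sketch below is an assumption about how one might do it, not a description of any GO tooling: it reports a definition as circular when the defined term reappears in it as a whole phrase. Only the visible part of the slide's truncated definition is used, and the second example is hypothetical.

```python
import re

def is_circular(term, definition):
    """True if the term being defined reappears as a whole phrase in its own definition."""
    pattern = r"\b" + re.escape(term.lower()) + r"\b"
    return re.search(pattern, definition.lower()) is not None

print(is_circular("hemolysis", "The processes that cause hemolysis"))  # True
print(is_circular("nucleus", "A membrane-bounded organelle"))          # False
```

A check like this catches only the crudest case; circularity that runs through two or more definitions needs a graph search over the terms each definition uses.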

ifomis.de 47 OBO Definition of ‘part_of’: Used for representing partonomies The subject (child node) of the relationship is the subpart; the object (parent node) is the superpart.

ifomis.de 48 Principle of Intelligibility The terms used in a definition should be simpler (more intelligible, more logically or ontologically basic) than the term to be defined – for otherwise the definition would provide no assistance to the understanding -- not enough just to avoid circularity

ifomis.de 49 Example: GO: : endonuclease activity, active with either ribo- or deoxyribonucleic acids and producing 3'-phosphomonoesters Definition: Catalysis of the hydrolysis of ester linkages within nucleic acids by creating internal breaks to yield 3'-phosphomonoesters,

ifomis.de 50 Problems with GO’s definitions GO: : cell fate commitment Definition: The commitment of cells to specific cell fates and their capacity to differentiate into particular kinds of cells. x is a cell fate commitment =def x is a cell fate commitment and p

ifomis.de 51 Principle: Don’t confuse defining the meaning of a term with providing extra information about the world

ifomis.de 52 Request If GO is to introduce logical definitions, please make sure that people are involved who know some logic.

ifomis.de 53 Part Five Is this all just PHILOSOPHY?

ifomis.de 54 Is this all just philosophy?

ifomis.de 55 CONCLUSION (1) Problems caused by GO’s problems with formal rigor: 1. Coding errors → constant updating 2. Obstacles to ontology integration 3. Unclear what kinds of reasoning are permitted

ifomis.de 56 Conclusion (2) Quality assurance and ontology maintenance must be automated. Automation requires robust formal architecture. Robust formal architecture requires that one respects ontological principles. (DL will go only some way to solving these problems.)

ifomis.de 57 The End

ifomis.de 58 Why Description Logic is not enough. First reason: semantics for DL is exclusively set-theoretic; is_a is not set-theoretic inclusion. NOT: adult is_a child. NOT: animal owned by the emperor is_a animal weighing less than 200 Kg. NOT: animal in Leipzig is_a animal
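The distinction can be made concrete with a small sketch. All data here are hypothetical: the individuals, the curated `IS_A` table, and the helper names are invented for illustration. The point is only that inclusion between extensions can hold while no subtype link is asserted.

```python
# Hypothetical individuals and a hypothetical curated is_a table.
ANIMALS = {"rex", "tom", "daisy"}
ANIMALS_IN_LEIPZIG = {"rex", "tom"}
IS_A = {("dog", "animal"), ("cat", "animal")}

def extension_included(sub, sup):
    """Set-theoretic inclusion between two extensions."""
    return sub <= sup

def asserted_is_a(child, parent, table=IS_A):
    """True only if the subtype link has been explicitly asserted."""
    return (child, parent) in table

print(extension_included(ANIMALS_IN_LEIPZIG, ANIMALS))  # True
print(asserted_is_a("animal in Leipzig", "animal"))     # False
```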

ifomis.de 59 Why Description Logic is not enough. Second reason: DL will not tell you how complex, unit, subunit, constituent, component, part, … are related to each other – for that you need a philosophical analysis

ifomis.de 60 GO’s three ontologies are separate No links or edges defined between them molecular functions cellular components biological processes

ifomis.de 61 Three granularities: Molecular (for ‘functions’) Cellular (for components) Whole organism (for processes)

ifomis.de 62 GO has cells but it does not include terms for molecules or organisms within any of its three ontologies except when it makes mistakes, e.g. GO: host =Df Any organism in which another organism spends part or all of its life cycle

ifomis.de 63 Are the relations between functions and processes a matter of granularity? Molecular activities are the ‘building blocks’ of biological processes? But they are not allowed to be represented in GO as parts of biological processes

ifomis.de 64 GO’s three ontologies molecular functions cellular components biological processes

ifomis.de 65 GO’s three ontologies: molecular functions, cellular components, organism-level biological processes, cellular processes

ifomis.de 66 ‘part-of’; ‘is dependent on’: molecular functions, molecule complexes, cellular processes, cellular components, organism-level biological processes, organisms

ifomis.de 67 molecular functions, molecule complexes, cellular processes, cellular components, organism-level biological processes, organisms

ifomis.de 68 molecule complexes, cellular components, molecular functions, cellular functions, organism-level biological functions, organisms, molecular processes, cellular processes, organism-level biological processes

ifomis.de 69 molecule complexes, cellular components, molecular functions, cellular functions, organism-level biological functions, organisms, molecular processes, cellular processes, organism-level biological processes, functionings

ifomis.de 70 molecule complexes, cellular components, molecular functions, cellular functions, organism-level biological functions, organisms, molecular processes, cellular processes, organism-level biological processes, functionings, molecular locations, cellular locations, organism-level locations