The Ontology of the Gene Ontology Barry Smith Jennifer Williams Steffen Schulze-Kremer



2 The Prime Directive: As the right of each sentient species to live in accordance with its normal cultural evolution is considered sacred, no Star Fleet personnel may interfere with the healthy development of alien life and culture. Such interference includes the introduction of superior knowledge, strength, or technology to a world whose society is incapable of handling such advantages wisely.

ifomis.de 3 The Bioinformatics Prime Directive: No computer scientist may interfere with the information resources provided by biologists.

4 The Story of GONG: Computer scientists develop browsers, query interfaces, and tools for statistical analysis or for cross-ontology mapping, all of which take the biological information as something inviolable.

5 IFOMIS: Renegade StarTroop. Institute for Formal Ontology and Medical Information Science, Faculty of Medicine, University of Leipzig, ifomis.de

6 The Gene Statistic The Gene Ontology

7 GO: the Gene Ontology. 3 large telephone directories of standardized designations for gene functions and products; designed to cover the whole of biology; model for fungal ontology, plant ontology, drosophila ontology, etc.

8 Primary aim of GO: not rigorous definition and principled classification, but rather providing a practically useful framework for keeping track of the biological annotations that are applied to gene products. Thesis: GO can realize its goal more adequately (and avoid many coding errors) by taking ontology (especially the logic of classifications and definitions) seriously.

9 GO: the Gene Ontology. GO is divided into 3 separate hierarchies, each organized via is_a and part_of.

10 Problems with is_a: A is_a B = every instance of A is an instance of B
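The instance-level reading of is_a on this slide can be sketched in a few lines of Python. This is a minimal illustration, not GO's own machinery: the class names, the instance identifiers, and the helper names `is_a_holds` and `is_a_closure` are all assumptions made for the example.

```python
# Toy extensions: which (hypothetical) instances fall under which class.
instances = {
    "vacuole": {"v1", "v2", "v3"},
    "storage vacuole": {"v1", "v2"},
    "protein storage vacuole": {"v1"},
}


def is_a_holds(sub, sup, extension):
    """A is_a B holds iff every instance of A is also an instance of B."""
    return extension[sub] <= extension[sup]


def is_a_closure(edges):
    """Transitive closure of asserted (child, parent) is_a edges."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


asserted = {
    ("protein storage vacuole", "storage vacuole"),
    ("storage vacuole", "vacuole"),
}
inferred = is_a_closure(asserted)
```

On this reading is_a is transitive, so any asserted edge that fails the subset test also corrupts everything inferred through it.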

11 Problems with is_a: Holliday junction helicase complex is_a unlocalized protein storage vacuole is_a vacuole (sensu Streptophyta)

12 Problems with part_of: ‘part_of’ = ‘can be part of’ (flagellum part_of cell); ‘part_of’ = ‘is sometimes part of’ (replication fork part_of the nucleoplasm); ‘part_of’ = ‘is included as a sublist in’
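One way to see why these readings cannot be interchanged is to write them as instance-level formulas and compare them with a strict "every A is part of some B" reading. The formalization below is only a sketch: the symbols $\le$ (instance-level parthood) and $\le_t$ (parthood at time $t$) are notation assumed here, not taken from GO, and the third reading (‘is included as a sublist in’) is left out because it relates lists of terms rather than biological instances.

```latex
% A, B: classes; x, y: instances; t: a time. Notation assumed for illustration.
\begin{align*}
\text{strict (all-some) reading:} \quad
  & \forall x\,\bigl(A(x) \rightarrow \exists y\,(B(y) \land x \le y)\bigr) \\
\text{``can be part of'':} \quad
  & \exists x\,\exists y\,\bigl(A(x) \land B(y) \land x \le y\bigr) \\
\text{``is sometimes part of'':} \quad
  & \forall x\,\bigl(A(x) \rightarrow \exists t\,\exists y\,(B(y) \land x \le_t y)\bigr)
\end{align*}
```

Only the first formula licenses the inference from "x is an A" to "x is part of some B", which is the kind of inference a reasoner would need part_of to support.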

13 GO is divided into three disjoint term hierarchies: cellular component ontology (flagellum, chromosome, cell); molecular function ontology (ice nucleation, binding, protein stabilization); biological process ontology (glycolysis, death)

14 Three separate hierarchies = no is_a and no part_of relations defined between them. PUZZLE: How are the classes in the three separate hierarchies linked together? (cellular component ontology, molecular function ontology, biological process ontology)
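The puzzle can be made concrete with a small graph sketch. The term names come from the slides; the namespace labels, the single part_of edge, and the helper names `cross_hierarchy_edges` and `reachable` are assumptions chosen for illustration.

```python
from collections import deque

# Which hierarchy each (toy) term belongs to.
namespace = {
    "flagellum": "cellular_component",
    "cell": "cellular_component",
    "binding": "molecular_function",
    "protein stabilization": "molecular_function",
    "glycolysis": "biological_process",
    "death": "biological_process",
}

# (source, relation, target) edges; only relations within a hierarchy exist.
edges = [("flagellum", "part_of", "cell")]


def cross_hierarchy_edges(edges, namespace):
    """Return every edge whose endpoints lie in different hierarchies."""
    return [(s, r, t) for s, r, t in edges if namespace[s] != namespace[t]]


def reachable(start, edges):
    """Terms reachable from start, treating edges as undirected links."""
    neighbours = {}
    for s, _, t in edges:
        neighbours.setdefault(s, set()).add(t)
        neighbours.setdefault(t, set()).add(s)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


linked_to_flagellum = reachable("flagellum", edges)
```

With no cross-hierarchy edges, nothing in the graph itself connects a component such as the flagellum to any function or process.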

15 Component: Component is easy to understand. A component is a 3-dimensional entity which endures through time.

16 Process: Process is easy to understand. A process is an occurrent entity = an entity which unfolds itself through time in successive temporal parts.

17 What is a function?

18 Definition of «Function». UMLS Semantic Network: Functional Concept =df a concept which is of interest because it pertains to the carrying out of a process or activity. GO: Molecular Function =df the action characteristic of a gene product.

19 How are the 3 ontologies related? Function = “the action characteristic of a gene product”. Process = “phenomenon marked by changes that lead to a particular result, mediated by one or more gene products”. NO PART-WHOLE RELATIONS BETWEEN FUNCTION AND PROCESS ONTOLOGIES.

20 The True Story about Process and Function: A process is an occurrent entity. A component is a continuant entity.

21 The True Story about Function and Process: A process is an occurrent entity. A component is an independent continuant entity. There are also dependent continuant entities: qualities, roles, dispositions, powers … and functions.

22 The function of your heart is: to pump blood. This function endures through time and gets exercised. This function exists even when it is not being exercised. The exercise of a function is a process.
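The distinction drawn on slides 20 to 22 can be sketched as a small data model in which a function is a continuant borne by a component and its exercise is a separate, time-bounded process. The class and method names below are hypothetical and the time values are arbitrary; this is an illustration of the distinction, not a proposed GO schema.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    """Dependent continuant: endures through time and inheres in a bearer."""
    description: str


@dataclass
class Process:
    """Occurrent: unfolds through time; here, one exercise of a function."""
    realizes: Function
    start: float
    end: float


@dataclass
class Component:
    """Independent continuant that bears functions."""
    name: str
    functions: list = field(default_factory=list)
    history: list = field(default_factory=list)

    def exercise(self, function, start, end):
        """Record a process that realizes one of this component's functions."""
        if function not in self.functions:
            raise ValueError(f"{self.name} does not bear this function")
        process = Process(function, start, end)
        self.history.append(process)
        return process

    def is_functioning(self, function, t):
        """True if some recorded exercise of the function covers time t."""
        return any(
            p.realizes == function and p.start <= t < p.end
            for p in self.history
        )


pump = Function("to pump blood")
heart = Component("heart", functions=[pump])
heart.exercise(pump, start=0.0, end=10.0)
```

At time 20.0 the heart in this toy model is not functioning, yet `pump` is still in `heart.functions`: the function exists whether or not it is being exercised.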

23 Functions exist even when they are not being expressed. Functions exist even when there is no functioning.

24 Constituent-Process-Function: Processes depend on constituents. Processes realize functions. Constituents have functions.

25 Dependent continuants are realized through occurrent processes: the exercise of a function, the performance of a role, the execution of a plan, the application of a therapy, the realization of a disposition, the course of a disease.

26 GO: “A biological process is accomplished via one or more ordered assemblies of molecular functions.”

27 But no: “GO molecular functions are occurrent rather than continuant. The terminology we've used to date is, I agree, confusing but the activities described in the molecular function ontology are events -- they represent the function as it is exercised rather than the potential to exercise that function.”

28 “The definitions you cite are certainly inconsistent with this at the moment, but this is a temporary situation. … true path violations … do crop up fairly regularly, but are always fixed.”

29 Confusion of Function and Activity: If function = activity (= functioning), how can GO deal with dormant/suppressed functions? How can GO deal with the relation of expression, which involves a function and its exercise?

30 A step towards clarity: In March 2003 (nearly) all nodes in the Molecular Function ontology (except the root) had ‘activity’ added to their names. Function = activity. How does ‘process’ relate to ‘activity’?

31 GO’s answer: “A biological process is accomplished via one or more ordered assemblies of molecular functions.” BUT: there are no part-whole relations across ontologies. Result: constant coding errors resulting from the lack of clear principles as to what the basic notions of ‘function’ and ‘process’ mean.

32 Examples of GO Molecular Functions: anti-coagulant activity (defined as: “a substance that retards or prevents coagulation”); enzyme activity (defined as: “a substance that catalyzes”); structural molecule (defined as: “the action of a molecule that contributes to structural integrity”)
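The mismatch in these three examples, where a term named as an activity is defined as a substance and a term named as a molecule is defined as an action, is the sort of thing a simple check could flag. The sketch below is a crude heuristic based only on how the name ends and how the definition begins; the function names and the phrase lists are assumptions for illustration and not part of any GO tooling.

```python
# Term name -> definition, as quoted on the slide.
terms = {
    "anti-coagulant activity": "a substance that retards or prevents coagulation",
    "enzyme activity": "a substance that catalyzes",
    "structural molecule": "the action of a molecule that contributes to structural integrity",
}


def category_of_name(name):
    """Guess what kind of thing the term name denotes."""
    return "activity" if name.lower().endswith("activity") else "entity"


def category_of_definition(definition):
    """Guess what kind of thing the definition describes."""
    text = definition.lower()
    if text.startswith(("the action of", "an action")):
        return "activity"
    if text.startswith(("a substance", "a molecule")):
        return "entity"
    return "unknown"


def mismatches(terms):
    """Names whose definition describes a different kind of thing."""
    flagged = []
    for name, definition in terms.items():
        kind = category_of_definition(definition)
        if kind != "unknown" and kind != category_of_name(name):
            flagged.append(name)
    return sorted(flagged)


flagged = mismatches(terms)
```

All three slide examples are flagged by this heuristic, which suggests that even very shallow consistency checks between names and definitions would catch errors of this kind.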

33 GO: : structural constituent of cell wall. Definition: The action of a molecule that contributes to the structural integrity of a cell wall. This confuses constituents with actions, which GO includes in its function ontology.

34 extracellular matrix structural constituent +; puparial glue (sensu Diptera); structural constituent of bone; structural constituent of chorion (sensu Insecta); structural constituent of chromatin; structural constituent of cuticle +; structural constituent of cytoskeleton; structural constituent of epidermis +; structural constituent of eye lens; structural constituent of muscle; structural constituent of myelin sheath; structural constituent of nuclear pore; structural constituent of peritrophic membrane (sensu Insecta); structural constituent of ribosome; structural constituent of tooth enamel; structural constituent of vitelline membrane (sensu Insecta)

35 Problems caused by the lack of intuitive formal understandings of GO's basic ontological terms: The need for expert knowledge places severe obstacles in the way of using GO as a basis for computer applications, because computers do not have access to expert biological knowledge.

36 As GO increases in size and scope it will “be increasingly difficult to maintain the semantic consistency we desire without software tools that perform consistency checks and controlled updates”. The addition of each new term will require the curator to understand the entire structure of GO in order to avoid redundancy and to ensure that all appropriate linkages are made with other terms.
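Two of the checks this slide implies, avoiding redundant terms and making sure every new term is linked to the rest of the structure, can be sketched as follows. The term names, the `(child, parent)` edge format, and the helper names are hypothetical, and real tooling would need far more than string normalization to detect redundancy.

```python
def normalize(name):
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def find_redundant(names):
    """Pairs of term names that are identical after normalization."""
    seen = {}
    duplicates = []
    for name in names:
        key = normalize(name)
        if key in seen:
            duplicates.append((seen[key], name))
        else:
            seen[key] = name
    return duplicates


def find_unlinked(names, edges, roots):
    """Non-root terms that have no parent in the (child, parent) edge list."""
    has_parent = {child for child, _ in edges}
    return [n for n in names if n not in has_parent and n not in roots]


# Toy vocabulary for illustration only.
names = ["cell", "flagellum", "Flagellum ", "replication fork"]
edges = [("flagellum", "cell")]
roots = {"cell"}

redundant = find_redundant(names)
unlinked = find_unlinked(names, edges, roots)
```

Checks like these run in time proportional to the size of the vocabulary, so a curator adding a term would not need to hold the whole structure in mind to catch the simplest errors.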

37 Benefits of the GO Approach: 1) Work on populating GO could start immediately, without its authors needing to solve some of the intricate problems which face ontologies when formalized as logical theories. 2) Populating GO does not require the completion of complex protocols of formally determined steps but can be done intuitively by the expert biologist. 3) There are few formal constraints standing in the way of easy incorporation of existing controlled vocabularies from the biological domain.

38 Drawbacks: 1) It is unclear what kinds of reasoning are permissible on the basis of GO’s hierarchies. 2) The rationale of GO’s subclassifications is unclear. 3) No procedures are offered by which GO can be validated. 4) There are insufficient rules for determining how to recognize whether a given concept is or is not present in GO.

39 GO DOES NOT COMPUTE. Solution: Rebuild from scratch before it is too late. MANGO