On the Application of Formal Principles to Life Science Data: A Case Study in the Gene Ontology Barry Smith * Jacob Köhler † Anand Kumar * *

Presentation transcript:

On the Application of Formal Principles to Life Science Data: A Case Study in the Gene Ontology Barry Smith * Jacob Köhler † Anand Kumar * * †

ifomis.de 2 Part One Survey of GO

ifomis.de 3 GO is a ‘controlled vocabulary’ designed to standardize annotation of genes

ifomis.de 4 GO is very successful: it is used by over 20 genome databases and by many other groups in academia and industry, and its methodology is much imitated

ifomis.de 5 GO is here an example (a) of the sorts of problems confronting life science data integration and (b) of the degree to which philosophy and logic are relevant to the solution of these problems

ifomis.de 6 GO: three large telephone directories of terms used in annotating genes and gene products

ifomis.de 7 When a gene is identified three important types of questions need to be addressed: 1. Where is it located in the cell? 2. What functions does it have on the molecular level? 3. To what biological processes do these functions contribute?

ifomis.de 8 GO’s three ontologies: cellular components, molecular functions, biological processes. March 15, 2004: 1395 component terms, 7291 function terms, 8479 process terms

ifomis.de 9 Cellular Component Ontology flagellum chromosome membrane cell wall nucleus (counterpart of anatomy)

ifomis.de 10 Molecular Function Ontology ice nucleation protein stabilization kinase activity binding

ifomis.de 11 Biological Process Ontology glycolysis death adult walking behavior

ifomis.de 12 Part Two GO as ‘Controlled Vocabulary’

ifomis.de 13 Principle of Univocity terms should have the same meanings (and thus point to the same referents) on every occasion of use

ifomis.de 14 Principle of Compositionality The meanings of compound terms should be determined 1. by the meanings of component terms together with 2. the rules governing syntax

ifomis.de 15 The story of ‘ / ’

ifomis.de 16 / GO: calcium/calmodulin-dependent protein kinase complex =Df An enzyme that catalyzes the phosphorylation of a protein; it requires calmodulin and calcium.

ifomis.de 17 / GO: ciliary/flagellar motility =df Locomotion due to movement of cilia or flagella.

ifomis.de 18 / GO: negative regulation of chromatin assembly/disassembly =df Any process that stops, prevents or reduces the rate of chromatin assembly and/or disassembly

ifomis.de 19 / GO: microtubule/kinetochore interaction =df Physical interaction between microtubules and chromatin via proteins making up the kinetochore complex

ifomis.de 20 / GO: G1/S transition of mitotic cell cycle =df Progression from G1 phase to S phase of the standard mitotic cell cycle.

ifomis.de 21 / GO: interpretation of nuclear/cytoplasmic to regulate cell growth =df The process where the size of the nucleus with respect to its cytoplasm signals the cell to grow or stop growing.

ifomis.de 22 / GO: hexuronate (glucuronate/galacturonate) porter activity =df Catalysis of the reaction: hexuronate(out) + cation(out) = hexuronate(in) + cation(in)
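The ‘/’ examples above can be summarized in a small sketch. The reading labels below are our own paraphrases of the quoted definitions (illustrative assumptions, not part of GO); the check simply asks whether the operator has a single reading across terms, as the Principle of Univocity requires.

```python
# Readings of '/' paraphrased from the definitions quoted in the slides.
# The label strings are illustrative assumptions, not GO content.
SLASH_READINGS = {
    "calcium/calmodulin-dependent protein kinase complex": "and",
    "ciliary/flagellar motility": "or",
    "negative regulation of chromatin assembly/disassembly": "and/or",
    "microtubule/kinetochore interaction": "interaction between",
    "G1/S transition of mitotic cell cycle": "from ... to",
    "interpretation of nuclear/cytoplasmic to regulate cell growth": "with respect to",
    "hexuronate (glucuronate/galacturonate) porter activity": "not explained by the definition",
}


def is_univocal(readings):
    """Return True if the operator has one and the same reading in every term."""
    return len(set(readings.values())) == 1


# Distinct readings found for the single symbol '/'.
distinct_readings = sorted(set(SLASH_READINGS.values()))
```

On these seven terms `is_univocal(SLASH_READINGS)` is false: each term uses ‘/’ differently, so the meaning of a compound term cannot be computed from its parts.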

ifomis.de 23 comma male courtship behavior (sensu Insecta), wing vibration

ifomis.de 24 Part Three GO’s Formal Architecture

ifomis.de 25 Each of GO’s ontologies is organized in a graph-theoretical data structure involving two sorts of links or edges: is-a (= is a subtype of ) (copulation is-a biological process) part-of (cell wall part-of cell)
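A minimal sketch of such a graph structure with two sorts of edges, assuming a hypothetical `OntologyGraph` class of our own (it is not GO's actual file format or tooling); the two links are the examples given on the slide.

```python
from collections import defaultdict


class OntologyGraph:
    """Directed graph whose edges are labelled 'is-a' or 'part-of' (hypothetical helper)."""

    RELATIONS = ("is-a", "part-of")

    def __init__(self):
        # Maps (child term, relation) to the set of parent terms.
        self.edges = defaultdict(set)

    def add(self, child, relation, parent):
        if relation not in self.RELATIONS:
            raise ValueError(f"unknown relation: {relation!r}")
        self.edges[(child, relation)].add(parent)

    def parents(self, term, relation):
        # .get avoids creating empty entries for unknown terms.
        return set(self.edges.get((term, relation), set()))


go = OntologyGraph()
go.add("copulation", "is-a", "biological process")
go.add("cell wall", "part-of", "cell")
```

Keeping the relation in the edge key makes it impossible to follow a part-of link while believing it is an is-a link, which is the kind of confusion the later slides describe.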

ifomis.de 26 GO’s graph-theoretic data structure designed to help human annotators to locate the designated terms for the features associated with specific genes

ifomis.de 27 GO allows Multiple Inheritance its classes may have more than one parent

ifomis.de 28

ifomis.de 29 Uses of multiple inheritance are associated with errors in coding. [Diagram: A is-a 1 B; A is-a 2 C] ‘is-a’ is no longer univocal
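Because multiple inheritance is where the slide locates coding errors, a curator might start by listing the terms that have more than one is-a parent. A minimal sketch, using the abstract A, B, C of the slide; the function name is our own.

```python
# is-a links as (child, parent) pairs; A, B, C are the abstract classes of the slide.
is_a_edges = [("A", "B"), ("A", "C")]


def multi_parent_terms(edges):
    """Return {child: sorted parents} for every child with more than one is-a parent."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}
```

Each term returned is a candidate for review: the two links may be using ‘is-a’ in two different senses.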

ifomis.de 30 ‘is-a’ is pressed into service to mean a variety of different things no rules for correct coding ambiguities serve as obstacles to integration

ifomis.de 31

ifomis.de 32 storage vacuole is-a vacuole. Is a storage vacuole a special kind of vacuole? Is a box used for storage a special kind of box?

ifomis.de 33

ifomis.de 34 ‘within’: the GO term ‘lytic vacuole within a protein storage vacuole’. lytic vacuole within a protein storage vacuole is-a protein storage vacuole; compare: time-out within a baseball game is-a baseball game; embryo within a uterus is-a uterus
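The pattern criticized here is mechanical enough to detect. The sketch below flags any is-a link whose child has the form ‘X within a Y’ and whose parent is Y; the function name and the regular expression are our own assumptions, and the inputs are the slides' examples.

```python
import re

# Child terms of the form "<X> within a <Y>" or "<X> within an <Y>".
WITHIN = re.compile(r"(.+) within an? (.+)")


def suspicious_within_links(is_a_links):
    """Return is-a links where 'X within a Y' is coded as a subtype of Y."""
    flagged = []
    for child, parent in is_a_links:
        match = WITHIN.fullmatch(child)
        if match and match.group(2) == parent:
            flagged.append((child, parent))
    return flagged


links = [
    ("lytic vacuole within a protein storage vacuole", "protein storage vacuole"),
    ("time-out within a baseball game", "baseball game"),
    ("embryo within a uterus", "uterus"),
    ("storage vacuole", "vacuole"),
]
flagged = suspicious_within_links(links)
```

The first three links are flagged, the last is not: being located within a Y does not make something a kind of Y.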

ifomis.de 35 Problems with Location is-located-at / is-located-in and similar relations need to be expressed in GO via some combination of ‘is-a’ and ‘part-of’ … is-a unlocalized … is-a site of … is-a … within … etc.

ifomis.de 36 Problems with location extrinsic to membrane part-of membrane

ifomis.de 37 Old GO: part-of = can be part of GO : nucleus part-of GO : cell

ifomis.de 38 Old GO: Three meanings of ‘part-of ’ ‘part-of’ = ‘can be part of’ (flagellum part-of cell) ‘part-of’ = ‘is sometimes part of’ (replication fork part-of the nucleoplasm) ‘part-of’ = ‘is included as a sublist in’

ifomis.de 39 New GO: part-of = is necessarily part of larval fat body development is necessarily part-of larval development (sensu Insecta) (seems wrong)
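One way to keep the different readings of ‘part-of’ apart is to record the reading on each link. This is a sketch under our own assumptions: the enum and the inference rule are not GO's. The rule is that only the ‘necessarily part of’ reading lets us infer that every instance of the child is part of some instance of the parent.

```python
from enum import Enum


class PartOfReading(Enum):
    CAN_BE = "can be part of"
    SOMETIMES = "is sometimes part of"
    SUBLIST = "is included as a sublist in"
    NECESSARILY = "is necessarily part of"


# (child, parent, reading) triples taken from the examples on the slides.
LINKS = [
    ("flagellum", "cell", PartOfReading.CAN_BE),
    ("replication fork", "nucleoplasm", PartOfReading.SOMETIMES),
    ("larval fat body development", "larval development (sensu Insecta)",
     PartOfReading.NECESSARILY),
]


def supports_all_some_inference(reading):
    """Assumed rule: only 'necessarily part of' licenses the every-instance inference."""
    return reading is PartOfReading.NECESSARILY


def safe_links(links):
    """Return the (child, parent) pairs on which the inference may be drawn."""
    return [(c, p) for c, p, r in links if supports_all_some_inference(r)]
```

Whether a given link really deserves the ‘necessarily’ reading is a biological question; the slide itself doubts the larval fat body example.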

ifomis.de 40 Part Four GO and Life Science Data Integration

ifomis.de 41 GO’s three ontologies are separate No links or edges defined between them molecular functions cellular components biological processes

ifomis.de 42 [Figure: levels of granularity, with length scale in m: DNA, Protein, Organelle, Cell, Tissue, Organ, Organism]

ifomis.de 43 Three granularities: Molecular (for ‘functions’) Cellular (for components) Whole organism (for processes)

ifomis.de 44 GO has cells but it does not include terms for molecules or organisms within any of its three ontologies except when it makes mistakes, e.g. GO: host =Df Any organism in which another organism spends part or all of its life cycle

ifomis.de 45 [Figure: levels of granularity, with length scale in m: DNA, Protein, Organelle, Cell, Tissue, Organ, Organism]

ifomis.de 46 GO’s three ontologies are in fact four: molecular functions, cellular components, organism-level biological processes, cellular processes

ifomis.de 47 ‘part-of’; ‘is dependent on’: molecular functions, molecule complexes, cellular processes, cellular components, organism-level biological processes, organisms

ifomis.de 48 molecular functions, molecule complexes, cellular processes, cellular components, organism-level biological processes, organisms

ifomis.de 49 molecule complexes, cellular components, molecular functions, cellular functions, organism-level biological functions, organisms, molecular processes, cellular processes, organism-level biological processes

ifomis.de 50 Human beings know what ‘walking’ means. Human beings know that adults are older than embryos. GO needs to be linked to an ontology of development and in general to resources for reasoning about time and change

ifomis.de 51 but such linkages are possible only if GO itself has a coherent formal architecture

ifomis.de 52

ifomis.de 53 Is this just philosophy ?

ifomis.de 54 Human consequences of inconsistent and/or indeterminate use of syntactic operators: 29% of GO’s terms contain one or more problematic syntactic operators, but these terms are used in only 14% of annotations

ifomis.de 55 Computational consequences much information not available for purposes of automatic information retrieval

ifomis.de 56 Inconsistent use of ‘is-a’ and ‘part-of’: 1. leads to coding errors → constant updating; 2. makes it unclear what kinds of reasoning are permissible on the basis of GO’s hierarchies; 3. creates obstacles to ontology alignment and thus also to data integration
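To see why the permissible reasoning becomes unclear, consider the simplest inference a hierarchy is expected to support: transitivity of is-a. In this minimal sketch the middle link is an assumption added for illustration; the other two links appear in the slides.

```python
# is-a links as {child: [parents]}.
IS_A = {
    "lytic vacuole within a protein storage vacuole": ["protein storage vacuole"],
    "protein storage vacuole": ["storage vacuole"],  # assumed link, for illustration only
    "storage vacuole": ["vacuole"],
}


def ancestors(term, is_a):
    """Collect every term reachable from `term` by following is-a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


inferred = ancestors("lytic vacuole within a protein storage vacuole", IS_A)
```

A reasoner that trusts every link will conclude that the lytic vacuole is a storage vacuole. One miscoded is-a link is thus inherited by everything above it in the hierarchy.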

ifomis.de 57 The End Workshop: The Formal Architecture of the Gene Ontology Leipzig, May Guest Speaker: Michael Ashburner