Towards a Top-Level Ontology for Molecular Biology Stefan Schulz 1, Elena Beisswanger 2, Udo Hahn 2, Joachim Wermter 2, 1 Freiburg University Hospital,


Towards a Top-Level Ontology for Molecular Biology
Stefan Schulz (1), Elena Beisswanger (2), Udo Hahn (2), Joachim Wermter (2)
(1) Freiburg University Hospital, Department of Medical Informatics, Germany
(2) Jena University Language and Information Engineering (JULIE) Lab

What’s an ontology…?

Our notion of ontology
Domain knowledge comprises:
- Universally accepted truths (Ontology!)
- Consolidated but context-dependent facts
- Hypotheses, beliefs, statistical associations

Data Sources in Biomedicine
- Ontologies
- Literature collection (PubMed)
- Raw data, experimental data
- Databases (Yeast, UniProt, FlyBase, Mouse)
- Clinical data
Bioontologies are devised to support data annotation and data integration.

How can ontologies be structured?

Ontological Layers
- Upper Ontology / Top Ontology: DOLCE, BFO
  Example (from BFO): Entity; Continuant; Dependent Continuant; Realizable Entity; Function; Role; Independent Continuant; Object; Object Aggregate; Occurrent
- Domain Upper Ontology / Top Domain Ontology: Simple Bio Upper Ontology, GFO-Bio, GENIA, BioTop
  Biology domain: Organism, Body Part, Cell, Cell Component, Tissue, Protein, Nucleic Acid, DNA, RNA, Biological Function, Biological Process
- Domain Ontology: OBO (Gene Ontology, Sequence Ontology, Cell Ontology, Mouse Anatomy, ChEBI, …)
  Example (from Gene Ontology): Transcription; DNA-dependent transcription; antisense RNA transcription; mRNA transcription; rRNA transcription; tRNA transcription; …

GENIA as an example of a top-level domain ontology

The GENIA Ontology
- Developed at Tsujii Laboratory, University of Tokyo
- Taxonomy, 48 classes
- Verbal definitions (“scope notes”)
- No relations other than subclass relations
- Purpose: semantic annotation of biological papers
“The GENIA ontology is intended to be a formal model of cell signaling reactions in human. It is to be used as a basis of thesauri and semantic dictionaries for natural language processing applications […] Another use of the GENIA ontology is to provide a basis for integrated view of multiple databases”

corpus/GENIAontology.owl GENIA Ontology

GENIA Critique

Insufficient Definitions
- Amino Acid Monomer: “An amino acid monomer, e.g. tyrosine, serine, tyr, ser”
- DNA: “DNAs include DNA groups, families, molecules, domains, and regions”

False Taxonomic Parent
GENIA: Source
“Sources are biological locations where substances are found …”
- ‘Organism’, ‘Tissue’, … are objects and not primarily sources!
- ‘Source’ is a role …

Misconception of “Type”
GENIA: Cell Type
“A cell type, e.g. T-Lymphocyte, T-cell, astrocyte, fibroblast”
- Are the instances of ‘Cell Type’ classes?
- Otherwise rename the class as ‘Cell’!

Non-Conformant Naming Policy
GENIA: Amino Acid: “An amino acid molecule or the compounds that consist of amino acids.”
GENIA: Protein: “Proteins include protein groups, families, molecules, complexes, and substructures.”
- The names suggest single molecules
- Don’t work against biologists’ intuition!

GENIA Redesign: BioTop

BioTop
- Ontologically founded top-level ontology for biology
- Extensive use of formal relations (OBO)
- Formal semantics (OWL-DL)
- Use as semantic glue between existing ontologies
- Scope: same as GENIA (in a first phase)

BioTop Relations
- partOf
- properPartOf
- locatedIn
- derivesFrom
- hasParticipant
- hasFunction (functionOf)
- grainOf (hasGrain)
- componentOf (hasComponent)

Collectives and hasGrain
- hasGrain: non-transitive subrelation of hasPart
- Example: “Collective of Cell hasGrain only Cell”
- Collectives can gain or lose grains without changing their identity
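The behaviour of collectives can be illustrated with a minimal Python sketch. It is not taken from BioTop: the class `Collective` and all individual names are hypothetical. It shows the two points made on the slide, namely the restriction “hasGrain only Cell” and the fact that a collective keeps its identity when grains are added or removed.

```python
class Collective:
    """Toy model of a collective whose grains must all belong to one class."""

    def __init__(self, name, grain_class):
        self.name = name                # identity of the collective (hypothetical id)
        self.grain_class = grain_class  # e.g. "Cell" for "hasGrain only Cell"
        self.grains = {}                # grain id -> class of that grain

    def add_grain(self, grain_id, grain_type):
        # Enforce the universal restriction "hasGrain only <grain_class>".
        if grain_type != self.grain_class:
            raise ValueError(f"{grain_id} is a {grain_type}, not a {self.grain_class}")
        self.grains[grain_id] = grain_type

    def remove_grain(self, grain_id):
        # Removing a grain changes the membership, not the collective itself.
        self.grains.pop(grain_id, None)


cells = Collective("cell_collective_1", "Cell")
cells.add_grain("cell_a", "Cell")
cells.add_grain("cell_b", "Cell")
name_before = cells.name
cells.remove_grain("cell_a")        # the collective loses a grain
cells.add_grain("cell_c", "Cell")   # and gains another one
same_collective = cells.name == name_before  # its identity is unchanged
```

Trying to add something that is not a cell, for example `cells.add_grain("nucleus_1", "Cell Component")`, raises a `ValueError` in this sketch.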

Compounds and hasComponent
- hasComponent: non-transitive subrelation of hasPart
- Example: definition of Protein
- Compounds are determined by their components
(Figure labels: Protein, Arginine, Carbon, Oxygen, NH2)

Full Definitions
- Classes are defined by necessary and sufficient conditions whenever possible
- Rationales:
  - Precise understanding of meaning
  - Empowering the classifier for automated validation processes
- Required introduction of new classes, e.g. Phosphate, Ribose
(Figure: sufficient conditions for class ‘Nucleotide’; labels: Nucleotide, Ribose, Base, Phosphate)
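A full definition lets a classifier recognise members of a class from their description alone. The following Python sketch mimics this for the Nucleotide example. The exact BioTop axiom is not given on the slide, so the rule used here (at least one component of each of Ribose, Base and Phosphate) and the names `DEFINITIONS` and `classify` are assumptions made for illustration.

```python
# Assumed sufficient conditions: class name -> component classes that must all be present.
DEFINITIONS = {
    "Nucleotide": {"Ribose", "Base", "Phosphate"},
}


def classify(component_classes):
    """Return the sorted names of all defined classes whose conditions are met."""
    present = set(component_classes)
    return sorted(name for name, required in DEFINITIONS.items() if required <= present)


print(classify(["Ribose", "Base", "Phosphate"]))  # ['Nucleotide']
print(classify(["Ribose", "Base"]))               # [] (phosphate is missing)
```

Because the conditions are treated as sufficient, the first description is classified as a Nucleotide without that class being asserted; the second is not.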

Rearranged Classes
(Figure labels: GENIA, BioTop, hasComponent, componentOf, properPartOf)

BioTop Figures
BioTop.owl:
- 143 non-taxonomic relation instances (seven relation types)
- 128 classes
BioTopGenia.owl:
- Imports GENIA
- Maps BioTop classes to GENIA classes

BioTop: Interfacing with OBO Ontologies
BioTop class ↔ OBO class (ontology)
- Biological Process ↔ Biological Process (Gene Ontology)
- Protein Function ↔ Molecular Function (Gene Ontology)
- Cell Component ↔ Cellular Component (Gene Ontology)
- Cell ↔ Cell (Cell Ontology) and Cell (FMA)
- Atom ↔ Atoms (ChEBI)
- Organic Compound ↔ Organic Molecular Entities (ChEBI)
- Tissue ↔ Tissue (FMA)
- DNA, RNA ↔ DNA (Sequence Ontology), RNA (Sequence Ontology)
- Protein ↔ Protein (Sequence Ontology)

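Proposal: BioTop to enrich OBO Foundry
Matrix of granularity (rows) against relation to time (columns: independent continuant, dependent continuant, occurrent):
ORGAN AND ORGANISM
- Independent continuant: Organism (NCBI Taxonomy?), Anatomical Entity (FMA, CARO)
- Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO)
- Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
- Independent continuant: Cell (CL), Cellular Component (FMA, GO)
- Dependent continuant: Cellular Function (GO)
MOLECULE
- Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
- Dependent continuant: Molecular Function (GO)
- Occurrent: Molecular Process (GO)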

OBO Foundry and BioTop
(The same matrix as on the previous slide, now with the label BioTop added.)

(Figure labels: BFO, BioTop, The OBO Foundry)

Outlook
- Integration with BFO (work in progress)
- Completion of textual and formal definitions
- Extension of the process and function branches
- Integration of BioTop into the OBO Foundry
- Mapping BioTop to the UMLS Semantic Network
- Use of BioTop in text mining: experimental validation of added value compared to the GENIA / thesaurus approach
- Empirical studies to create evidence of whether
  - formal ontologies better serve the needs of knowledge annotation / processing in biomedicine, or
  - informal thesauri are sufficient

BioTop Redesign is going on
Steps shown in the workflow diagram:
- Analyze / complete textual class definitions
- Analyze / clean taxonomic relations
- Check ontological nature of classes
- Identify interfaces to existing biomedical ontologies
- Make classes disjoint (wherever possible)
- Check consistency
- Check inferred hierarchy for adequacy
- Add / change necessary and (wherever possible) sufficient conditions
- Select formal relations
- Connect classes to upper ontology
- Analyze / correct non-taxonomic relations, add descriptions
Loop-back conditions in the diagram: “if inadequate”, “if inconsistent”
Contributions / Critique welcome: saarland.de


Collectives and Relation hasGrain
- hasGrain: non-transitive subrelation of hasPart
- “Collective of x hasGrain x”
- Collectives do not depend on single elements
- Controversial: are the referents in biological texts collectives (pluralities) or count objects?
  “Emerin is phosphorylated on serine 49 by protein kinase A”
(Figure label: PKA or Emerin)

Ontology Layers
- Upper Ontology / Top Ontology: DOLCE, BFO
- Domain Upper Ontology / Top Domain Ontology: Simple Bio Upper Ontology, GFO-Bio, GENIA, BioTop
- Domain Ontology: OBO (GO, SO, ChEBI, …)

GENIA Ontology

topics/Corpus/genia-ontology.html GENIA Ontology

GENIA Ontology
- 45 classes
- Taxonomy (taxonomic relations not clearly defined)
- Text definitions (“scope notes”)
- Developed for semantic annotation of biological papers
“The GENIA ontology is intended to be a formal model of cell signaling reactions in human.... use of the GENIA ontology is to provide a basis for integrated view of multiple databases”

GENIA Critique
1. Incomplete Textual Definitions
2. Modeling Errors

GENIA Scope Notes
1. Amino Acid Monomer: “An amino acid monomer, e.g. tyrosine, serine, tyr, ser”
2. DNA: “DNAs include DNA groups, families, molecules, domains, and regions”

False Taxonomic Parent
GENIA: Source
“… substances involved in biochemical reactions.”
“Sources are biological locations where substances are found and their reactions take place, …”
- Critique: Organism, body part, cell type are not primarily ‘sources’ …
- Suggestion: treat ‘Source’ as a role and use ‘object’ as the superclass instead

Misconception of Type
GENIA: Cell Type
“A cell type, e.g. T-Lymphocyte, T-cell, astrocyte, fibroblast [sic]”
Critique: What is meant by ‘Cell Type’?
1. The universal cell (instantiated by e.g. a particular T cell)
2. A meta-level category (instantiated by e.g. the universal T cell)

Misclassification
GENIA: Protein Family or Group
“A protein family or group, e.g. STATs”
Critique:
1. A protein family is not a ‘protein’
2. It refers to a human-made classification grouping proteins of the same function / location / structure
Suggestion: Introduce classes ‘Protein Function’, ‘Protein Role’ and subclasses to classify proteins

Inconsistent Use of Residual Classes
Residual classes are ontologically irrelevant, but necessary for exhaustively partitioning a domain.
- Critique: inconsistent use, missing definitions
- Suggestion: consistent use of defined residual classes

Collectives vs. their Elements
GENIA: Tissue
“Tissue, e.g. peripheral blood, lymphoid tissue, vascular endothelium [sic]”
Critique: It is not clear what counts as an instance of tissue:
1. The totality of the mass/collective, e.g. all red blood cells (RBCs)
2. Any portion of it, e.g. the RBCs in a lab sample
3. The minimal constituent, e.g. a single RBC
Suggestion: Strict distinction between singletons and collectives. New class ‘collective’ and respective subclasses

Non-Conformant Naming Policy
GENIA: Amino Acid: “An amino acid molecule or the compounds that consist of amino acids.”
GENIA: Amino Acid Monomer: “An amino acid monomer e.g., tyrosine, serine, tyr, ser”
- Critique: ‘Amino Acid’ is a counterintuitive name (it suggests a single molecule, not a chain)
- Suggestion: Remove class ‘Amino Acid’; use classes ‘Amino Acid Monomer’ and ‘Amino Acid Polymer’ with proper definitions

Lack of Non-Taxonomic Relations
- Critique: Many taxonomic links are ontologically incorrect. Example: part-of mistaken for is-a
- Suggestion: Reduce is-a to its proper meaning (subclass-of) and introduce part-of and other relations
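Why a part-of link must not be encoded as is-a can be shown with a short Python sketch. The class names and link tables below are hypothetical examples, not GENIA or BioTop content. Subclass links are inherited upwards, so a misplaced is-a link yields a false classification.

```python
def ancestors(cls, is_a):
    """Follow subclass-of links upwards and collect all superclasses."""
    found = []
    while cls in is_a:
        cls = is_a[cls]
        found.append(cls)
    return found


# Hypothetical, faulty taxonomy: the nucleus is filed under Cell via is-a.
faulty_is_a = {"T Cell": "Cell", "Cell Nucleus": "Cell"}

# Repaired version: is-a is reduced to subclass-of, parthood gets its own relation.
is_a = {"T Cell": "Cell"}
part_of = {"Cell Nucleus": "Cell"}

print(ancestors("Cell Nucleus", faulty_is_a))  # ['Cell']: wrongly infers that a nucleus is a cell
print(ancestors("Cell Nucleus", is_a))         # []: no such inference after the repair
```

In the repaired version the link between nucleus and cell is still recorded, but in `part_of`, where it no longer feeds the subclass hierarchy.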

From GENIA to BioTop

BioTop
- Upper-level ontology for biology (focusing on biomedicine and molecular biology)
- Ontologically founded classes and relations
- Adds formal semantics to classes that were only verbally defined
- Language: OWL-DL
- Editing environment: Protégé

BioTop Relations
Basis: OBO Relation Ontology
- properPartOf and hasProperPart
- locatedIn and locationOf
- derivesFrom
- hasParticipant and participatesIn
Refined partOf:
- properPartOf and hasProperPart: irreflexive
- componentOf and hasComponent: non-transitive
- grainOf and hasGrain: non-transitive
Additional:
- hasFunction and functionOf

Collectives and Relation hasGrain
- Controversial: are the referents in biological texts collectives (pluralities) or count objects?
  “Emerin is phosphorylated on serine 49 by protein kinase A”
- BioTop: “Collective of x hasGrain x”
- hasGrain: non-transitive subrelation of hasPart
- cf. Schulz & Jansen, KR-MED 2006
(Figure label: PKA or Emerin)

Relation hasComponent
- Non-transitive subrelation of hasPart
- Example: definition of Protein
  - “Protein hasPart only amino acid” is not true (hasPart is transitive)
  - “Protein hasComponent only amino acid” is true (hasComponent is not transitive)
(Figure labels: Protein, Arginine, Carbon, Oxygen, NH2)
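The difference between the two restrictions can be checked on toy data. The following Python sketch is an illustration only: the individuals, the class labels and the helper names `has_part` and `only` are hypothetical. hasPart is computed as the transitive closure of the asserted hasComponent links, since hasComponent is a subrelation of the transitive hasPart.

```python
# Directly asserted hasComponent links between individuals (illustrative names).
HAS_COMPONENT = {
    "protein_1": {"arginine_1"},
    "arginine_1": {"carbon_1", "oxygen_1", "amino_group_1"},
}

# Class of each individual (hypothetical toy data).
TYPE_OF = {
    "protein_1": "Protein",
    "arginine_1": "AminoAcid",
    "carbon_1": "Atom",
    "oxygen_1": "Atom",
    "amino_group_1": "FunctionalGroup",
}


def has_part(whole):
    """All parts of `whole`, following component links to any depth (transitive)."""
    parts, stack = set(), [whole]
    while stack:
        for part in HAS_COMPONENT.get(stack.pop(), ()):
            if part not in parts:
                parts.add(part)
                stack.append(part)
    return parts


def only(fillers, cls):
    """Universal restriction: every filler must be an instance of `cls`."""
    return all(TYPE_OF[f] == cls for f in fillers)


# hasComponent is not transitive: only the direct links count.
component_only_amino_acid = only(HAS_COMPONENT["protein_1"], "AminoAcid")  # True
# hasPart is transitive: atoms of the amino acid are parts of the protein too.
part_only_amino_acid = only(has_part("protein_1"), "AminoAcid")            # False
```

The restriction on hasPart fails because the carbon and oxygen atoms are reached through the amino acid, which is exactly why the slide restricts hasComponent instead.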

Full Definitions
- Classes defined by necessary and sufficient criteria
- Rationales:
  - Precise understanding of meaning
  - More rigor in the automated validation process (terminological classifier)
- Introduction of new classes required
- Example: Nucleotide

Snapshot Rearranged Classes

New Branches in BioTop
- Non-Physical Continuant
  - Biological Function
  - Biological Location
- Process
  - Biological Process

BioTop Results
- ??# classes, ??# fully defined
- ??# relations (possibly to be broken down further)

Ontology Integration
BioTop class ↔ class in other ontology
- Protein Function ↔ Molecular Function (GO)
- Cell Component ↔ Cellular Component (GO)
- Biological Process ↔ Biological Process (GO)
- Cell ↔ Cell (Cell Ontology) and Cell (FMA)
- Atom ↔ Atoms (ChEBI)
- Organic Compound ↔ Organic Molecular Entities (ChEBI)
- Tissue ↔ Tissue (FMA)
- DNA, RNA ↔ DNA (SO), RNA (SO)
- Protein ↔ Protein (SO)
- Organism ↔ ?? (NCBI Taxonomy)

Ontology Integration
(Figure labels: NCBI Taxonomy, BioTop, ChEBI, Cell Ontology, Molecular Function (GO), Biological Process (GO), FMA, Sequence Ontology, Cellular Component (GO))

Mapping to GENIA
BioTopGenia.owl:
- Imports GENIA
- Provides links from BioTop to GENIA classes

Outlook
- Integration with BFO (work in progress)
- Completion of textual and formal definitions
- Extension of the process and function branches
- Integration of BioTop into the OBO Foundry
- Use of BioTop in text mining: experimental validation of added value compared to the GENIA / thesaurus approach
- Create evidence of whether
  - formal ontologies better serve the needs of knowledge annotation / processing in biomedicine, or
  - informal thesauri are sufficient

Ontology Redesign is going on
Steps shown in the workflow diagram:
- Analyze / complete textual class definitions
- Analyze / clean taxonomic relations
- Check ontological nature of classes
- Identify interfaces to existing biomedical ontologies
- Make classes disjoint (wherever possible)
- Check consistency
- Check inferred hierarchy for adequacy
- Add / change necessary and (wherever possible) sufficient conditions
- Select formal relations
- Connect classes to upper ontology
- Analyze / correct non-taxonomic relations, add descriptions
Loop-back conditions in the diagram: “if inadequate”, “if inconsistent”