Stefan Schulz – Medical University of Graz (Austria) with input from

Slides:



Advertisements
Similar presentations
Annotation of Gene Function …and how thats useful to you.
Advertisements

ARCHITECTURES FOR ARTIFICIAL INTELLIGENCE SYSTEMS
FP7 meeting - Gent - Carlos Rodríguez - April 18 WP4: Conceptual Mining from Text for Knowledge Engineering State of the Art WP Coordinators: Alfonso Valencia.
Catalina Martínez-Costa, Stefan Schulz: Ontology-based reinterpretation of the SNOMED CT context model Ontology-based reinterpretation of the SNOMED CT.
Schulz S, Boeker M – BioTopLite – An Upper Level Ontology for the Life Sciences – ODLS 2013 BioTopLite : An Upper Level Ontology for the Life Sciences.
Gene Ontology John Pinney
Developing a protein-interactions ontology Esther Ratsch European Media Laboratory.
Research topics Semantic Web - Spring 2007 Computer Engineering Department Sharif University of Technology.
9/6/2001Database Management – Fall 2000 – R. Larson Information Systems Planning and the Database Design Process University of California, Berkeley School.
Knowledge Acquisitioning. Definition The transfer and transformation of potential problem solving expertise from some knowledge source to a program.
What is an ontology and Why should you care? Barry Smith with thanks to Jane Lomax, Gene Ontology Consortium 1.
Use of Ontologies in the Life Sciences: BioPax Graciela Gonzalez, PhD (some slides adapted from presentations available at
10 December, 2013 Katrin Heinze, Bundesbank CEN/WS XBRL CWA1: DPM Meta model CWA1Page 1.
Developing an OWL-DL Ontology for Research and Care of Intracranial Aneurysms – Challenges and Limitations Holger Stenzhorn, Martin Boeker, Stefan Schulz,
Computational Biology and Informatics Laboratory Development of an Application Ontology for Beta Cell Genomics Based On the Ontology for Biomedical Investigations.
Ontology Development in the Sciences Some Fundamental Considerations Ontolytics LLC Topics:  Possible uses of ontologies  Ontologies vs. terminologies.
Stefan Schulz Medical Informatics Research Group
Applying Belief Change to Ontology Evolution PhD Student Computer Science Department University of Crete Giorgos Flouris Research Assistant.
Open Biomedical Ontologies. Open Biomedical Ontologies (OBO) An umbrella project for grouping different ontologies in biological/medical field –a repository.
The Chronious Ontology Suite: Methodology and Design Principles Luc Schneider[1], Mathias Brochhausen[1,2] [1] Institute for Formal Ontology and Medical.
The Gene Ontology project Jane Lomax. Ontology (for our purposes) “an explicit specification of some topic” – Stanford Knowledge Systems Lab Includes:
A School of Information Science, Federal University of Minas Gerais, Brazil b Medical University of Graz, Austria, c University Medical Center Freiburg,
Ontologies GO Workshop 3-6 August Ontologies  What are ontologies?  Why use ontologies?  Open Biological Ontologies (OBO), National Center for.
Advanced topics in software engineering (Semantic web)
From GENIA to BioTop – Towards a Top-Level Ontology for Biology Stefan Schulz, Elena Beisswanger, Udo Hahn, Joachim Wermter, Anand Kumar, Holger Stenzhorn.
Using Several Ontologies for Describing Audio-Visual Documents: A Case Study in the Medical Domain Sunday 29 th of May, 2005 Antoine Isaac 1 & Raphaël.
1 Gene function annotation. 2 Outline  Functional annotation  Controlled vocabularies  Functional annotation at TAIR  Resources and tools at TAIR.
Other biological databases and ontologies. Biological systems Taxonomic data Literature Protein folding and 3D structure Small molecules Pathways and.
A View-based Methodology for Collaborative Ontology Engineering (VIMethCOE) Ernesto Jiménez Ruiz Rafael Berlanga Llavorí Temporal Knowledge Bases Group.
Master headline RDFizing the EBI Gene Expression Atlas James Malone, Electra Tapanari
Mining the Biomedical Research Literature Ken Baclawski.
Towards a Top-Domain Ontology for Linking Biomedical Ontologies Holger Stenzhorn a,b Elena Beißwanger c Stefan Schulz a a Department of Medical Informatics,
Approach to building ontologies A high-level view Chris Wroe.
Clinical research data interoperbility Shared names meeting, Boston, Bosse Andersson (AstraZeneca R&D Lund) Kerstin Forsberg (AstraZeneca R&D.
Of 24 lecture 11: ontology – mediation, merging & aligning.
Informatics for Scientific Data Bio-informatics and Medical Informatics Week 9 Lecture notes INF 380E: Perspectives on Information.
Stefan Schulz, Martin Boeker, Holger Stenzhorn: How Granularity Issues Concern Biomedical Ontology Integration Stefan Schulz, Martin Boeker, Holger Stenzhorn.
Knowledge Representation Part I Ontology Jan Pettersen Nytun Knowledge Representation Part I, JPN, UiA1.
UNIFIED MEDICAL LANGUAGE SYSTEMS (UMLS)
Appendix 2 Automated Tools for Systems Development
Integrating SysML with OWL (or other logic based formalisms)
Formal ontologies vs. triple based KR gap or convergence?
The BioTop Family of Upper Level Ontological Resources for Biomedicine
Can SNOMED CT be harmonized with an upper-level ontology?
Modern Systems Analysis and Design Third Edition
Development of the Amphibian Anatomical Ontology
DEVELOPING AN OWL-DL ONTOLOGY FOR RESEARCH AND CARE OF
The Systems Engineering Context
Complexity Time: 2 Hours.
Kenneth Baclawski et. al. PSB /11/7 Sa-Im Shin
Web Engineering.
Developing a protein-interactions ontology
Modified from slides from Jim Hu and Suzi Aleksander Spring 2016
Architecture for ICD 11 and SNOMED CT Harmonization
Tools of Software Development
ro.owl and shortcut relations
Harmonizing SNOMED CT with BioTopLite
What is an Ontology An ontology is a set of terms, relationships and definitions that capture the knowledge of a certain domain. (common ontology ≠ common.
CCO: concept & current status
2. An overview of SDMX (What is SDMX? Part I)
The Gene Ontology: an evolution
Ontology-Based Approaches to Data Integration
CA COST Action CA15205 Gene Regulation Ensemble Effort for the Knowledge Commons From GRO (Gene regulation ontology) to GRAO (Gene regulation application.
Stefan SCHULZ IMBI, University Medical Center, Freiburg, Germany
Chapter 11 user support.
Leadership Chapter 7 – Path-Goal Theory Northouse, 4th edition.
Department of Computer Science Abdul Wali Khan University Mardan
Chapter 4 System Modeling.
ODLS 2016 Workshop on Ontologies and Data in Life Sciences, Sep 29-30, Halle (Saale), Germany Ontological interpretation of biomedical database annotations.
Modern Systems Analysis and Design Third Edition
Presentation transcript:

The Gene Regulation Ontology (GRO): Quality, Requirements, Scope, Redesign Principles, Stefan Schulz – Medical University of Graz (Austria) with input from Astrid Lægreid, NTNU (Norway) Martin Kuiper, NTNU (Norway) Jesualdo Tomás Fernández Breis, University of Murcia (Spain) Chris Mungall, Bey Lab (U.S.) Michel Dumontier, University of Maastricht (The Netherlands)

History of GRO FP6 BootSTREP (2006 – 09) Design: Resources: Purpose: Focus: E. Coli gene regulation Design: OBO Foundry principles OWL-DL (Tbox only) Resources: GO, SO, ChEBI, NCBI, TransFac, RO UMLS and 150 MEDLINE abstracts Purpose: Semantic annotation of documents Basis for SWRL rules to derive biological knowledge Use: ISA software suite KINO search engine Annotations of GREC BioNLP Shared Task 2013

Revisiting GRO in GREEKC Creation of Google document on March 8, 2017: Issues addressed: Analysis of existing GRO 0.5 according to the OQUARE quality framework (Jesualdo Tomás Fernández Breis) Analysis of overlap of GRO 0.5 with other ontologies using OntoEnrich (Jesualdo Tomás Fernández Breis) Requirement analysis (Martin Kuiper, Astrid Lægreid) Start of redesign: GRO2, aligned with upper-level ontology BTL2 (Stefan Schulz) Scoping / Limitations (Chris Mungall)

GRO 0.5 x OQuaRE Ontology metrics: 507 classes, 24 object properties 9 datatype properties total of 2736 axioms OQuaRE metrics (detailed characteristics: 1  … 5): Very good (> 4): controlled vocabulary; consistency, redundancy Good (> 3 – 4): adaptability, analysability, testeability, changeability Fair (> 2 – 3): results representation, formal relation support, reusability, changeability, replaceability Poor (> 1 - 2): cohesion, availability OQuaRE assessment: GRO should be better related to other ontologies. Maintaineability should be improved 759 subclassOf axioms 65 equivalentClasses 91 disjointClasses

GRO 0.5 overlap (ontoEnrich) results of lexical alignment of GRO with OBO ontologies. Constraints: only full matches and a minimum coverage of 3%

GRO 0.5 overlap (ontoEnrich) Top 20 GRO classes with largest amount of full matches

GRO Requirement analysis (I) What GRO needs to cover Acknowledgements: Dietrich Rebholz-Schuhmann

GRO Content Requirements (I) Transcription factors (TFs) DNA-binding or protein-binding Complex (dimer, other) TF/Complex activity (target gene (TG) specificity, activating/repressing, other; - all these change with nature of complex; each TF promiscuous regarding complex formations) Posttranslational modifications (PTMs) (mostly phosphorylations), effect on activity Ligand association (for TFs that are ligand-regulated nuclear receptors) Localization (nuclear, cytoplasmic, other) Chromatin modifiers (proteins, non-coding RNAs, others) Similar list to the above for TFs

GRO Content Requirements (II) Chromatin 3D structure (nucleome) and dynamics (responses to internal and external cues), including interactions with nuclear lamina Nucleome functional consequences for gene regulation (which TFs can bind where, can participate in which complexes, etc.) Cellular context Cell type, origin (tissue, organ, developmental stage, etc.) Cell status (primary, cancer cell line, other) Cell state (proliferating, migrating, etc.)

GRO Functional requirements Structuring of content of knowledge bases Manual biocuration Text mining assisted biocuration Querying of knowledge bases Logical reasoning Building of molecular pathways linking intracellular signalling and gene regulation Taking cellular (tissue-/organ-/organismal-?) context into account Building and analysing directed networks (large, small) accounting for the integrated signalling and gene regulation molecular mechanisms in a cellular (tissue-/organ-/organismal-?) context

GRO Limitations Inherent limitations of ontologies Express what is universally true Borderline: including dispositions (some continuant has the ability to participate in processes of a certain kind) Representation of contingent / probabilistic / default knowledge outside the scope of ontologies Representation of quantitative information Representation of sequences – relation to SO? sequences as physical segments of polymers vs. sequences as abstract (information) entities Abox entities (individuals, instances)? Datatype properties? OWL full?

GRO2 methodology Goal: alignment with upper-level ontology BTL2 (BioTopLite v2) Reasons for upper-level alignment in general avoids creating an specific upper level for every ontology reduces the freedom of the modeller: improves interoperability reduced the number of primitives: less ambiguities Reasons for BTL2 in particular: good experiences with BTL2 in numerous projects aligned with BFO, but with relations highly axiomatised more "pragmatic" than BFO Alignment done by ontology expert (Stefan Schulz) Time needed (including documentation): 5 hours Tools: Protégé 5; text editor

GRO GRO2: steps 1-10 Import of BTL2 gro:roles under btl2:role molecular entities under btl2:monomolecular entity removal of disjoints that cause inconsistencies, e.g. complex molecular entity disjoint with protein complex molecular entity disjoint with protein domain molecular function under btl2:process Substitution of hasFunction (regarding BFO “activities”) by 'btl2:is bearer of' some   (btl2:disposition and ('btl2:has realization' only 'X activity'))) Function-activities related to their agents via ‘btl2:has agent’ substitution of other homonymous relations, e.g. hasPart by ‘btl2:has part’ (using text editor) processes that give rise to other processes: ‘btl2:has outcome’, e.g. ‘has patient’ some ‘gene expression’  'btl2:has outcome' some (btl2:causes some 'gene expression') Some category shifts, e.g. 'experimental method' subclass of btl2:action

GRO GRO2: Steps 11-14 btl2 agents and patients haven’t processes in the range, therefore, e.g. ‘has agent’ some transcription  ‘btl2:is caused by’ some transcription removal of datatype properties, e.g. hasPolarity; Values are modelled by DOLCE-style value regions, e.g. Decrease subclassOf 'btl2:is bearer of' some    (btl2:quality and ('btl2:projects onto' some 'negative polarity value region')) removal of redundant object properties:        fromSpecies some Eukaryote         ‘btl2:at some time’ some (‘btl2:is included in’ some Eukaryote) (there are alternative ways to express this, e.g. taxon qualities) encoding: like realization of functions / dispositions: existential quantification too strong ORG subclassOf encodes some Protein  ORG subclassOf 'btl2:is bearer of' some ('genetic information' and (btl2:represents only protein)) reduction to bearer of protein-representing information sufficient?

Ideal world / blue sky Computable, well-performing ontology that covers all aspects of gene regulation Addressing a broad range of use cases Well connected with other ontologies Upper level ontologies Domain ontologies Following standards and good design principles Self explaining labels Formal and textual definitions / elucidations Supporting synonyms Maintained by GR community / agreed quality criteria Tools (content maintenance, content acquisition), Benchmarks, competency questions Computable, ontology-based framework integrating bioontology and database content

Gaps Unclear scoping and specification for a Gene Regulation Ontology Better solved by combining existing ontologies OBO principles (?) Overlap with existing resources, risk of creating / maintaining redundant content Different opinions regarding importance and choice of an upper level ontology and design principles current upper level ontologies: e.g. lexical polysemy: "sequence": information entity vs. material entity? e.g. what are classes, what are relations, what are individuals? User friendly tooling Content acquisition, ontology creation and curation Quality assessment

GRO: Long-term perspective to discuss!