+ Introduction to anatomy ontology building. David Osumi-Sutherland, FlyBase (www.flybase.org), Virtual Fly Brain (www.virtualflybrain.org).


+ Take home messages: An ontology is a classification. There are lots of useful ways to classify stuff. Maintaining multiple classification schemes by hand is impractical, so you should automate it. Everybody makes mistakes, so you should get the computer to find errors for you. Re-use other people’s work where possible: import class hierarchies; use common patterns. Cautionary note: formal languages have limitations. Don’t expect to be able to express everything!

+ What is an ontology ? A set of defined, inter-related terms to use in annotation/metadata/knowledge bases. A classification A query-able store of (scientific) knowledge that uses logical inference.

+ What is an ontology ? A set of defined, inter-related terms to use in annotation/metadata/knowledge bases. A classification A query-able store of (scientific) knowledge that uses logical inference. depends on

+ What (use) is an ontology? A set of defined, inter-related terms to use in annotation. Annotation of papers; specimens; gene expression; phenotype… Use of common annotation terms across multiple databases allows easy shared integration. Relations between terms allow annotations to be grouped in scientifically meaningful ways. This requires an ontology to be an accurate and scientifically meaningful classification and store of scientific knowledge.

+ What is an ontology ? A classification There are lots of scientifically useful ways to classify a bit of anatomy. its parts and their arrangement its relation to other structures what is it: part of; connected to; adjacent to, overlapping? its shape its function its developmental origins its species or clade its evolutionary history?

+ What is an ontology ? The scientific knowledge an ontology contains can make the reasons for classification explicit. e.g. Any sense organ that functions in the detection of smell is an olfactory sense organ. All large basiconic sensilla of the antenna function in detection of smell. Therefore all large basiconic sensilla of the antenna are olfactory sense organs.

+ Virtual Fly Brain Demo

+ Why ontology development is like software or database development Ideal case – maintainable basic maintenance (e.g. correcting simple errors) is easy scalable grow your project as large as you need without breaking extensible easy to add new functionality without breaking existing integrate-able Can integrate easily with work of others – so you don’t have to solve all problems yourself

+ Why ontology development is like software or database development Ideal case – Future editors can build on your work maintainable – By multiple editors basic maintenance (e.g. correcting simple errors) is easy scalable – By multiple editors grow your project as large as you need without breaking extensible – By multiple editors easy to add new functionality without breaking existing integrate-able Can integrate easily with work of others – so you don’t have to solve all problems yourself

+ How not to build ontologies - The trap A small, simple ontology or program with one developer can get away with practices that a large one cannot. Given: shallow, single inheritance classification (each class has 0-1 superclasses); very few relationship types; < 1000 terms, it is feasible to: have little annotation/documentation; have no automated error checking; have no automated classification; keep redundancy to a minimum by hand.

+ How not to build ontologies - The trap Small, simple ontologies and programs have a habit of growing large and complicated. Users demand lots more terms for annotation. Users demand multiple axes of classification: there is no scientific reason to favor one over another. Users demand, and editors favor, multiple relationship types to record information they believe scientifically important. Editors/coders move on and someone else has to continue their work. Is the documentation mainly in the old developer's head?

+ How not to build ontologies - The trap Worst case scenario – the tangled pit of misery: Difficult, perhaps impossible to maintain or extend Tangled, convoluted, redundant structure with little or no documentation or annotation. Editing tends to inadvertently break previous functionality. Little or no error checking means you don't even notice when you break stuff. Users find out later. Even you can't easily edit what you built 6 months ago without getting confused and making a mess.

+ Avoiding tangled pits of misery There are no perfect answers, but these might help: good annotation and documentation; good, consistent style; avoidance of redundancy; let the computer keep track of things for you; modularity; automate a consistent set of tests of existing functionality (j-unit / consistency); constant testing during development; design patterns.

+ Good Practice 1: Good annotation and documentation Clear textual definitions with references ensure accurate manual annotation make assertions of scientific fact trace-able serve as documentation for future ontology developers Also useful to record – for users and future developers: Experimental evidence for assertions of scientific fact Notes on confusing or conflicting usage of terms Reasons for design choices/compromises

+ Options for formalization OWL W3C standard Decidable Big open source community of tool developers multiple fast reasoners – getting better all the time Easy to read syntax – OWL Manchester syntax (OWL MS) OBO Best thought of as a subset of OWL, with which it is increasingly integrated Limited community of tool developers Easy(ish) to read syntax Common logic Very powerful. But easy to come up with solutions that can’t be usefully reasoned with.

+ Relationships are the formalized part of a definition. The criteria for class membership are recorded using textual definitions, at least some elements of which are formalized as relationships. name: insect wing def: “A membranous dorsal appendage of the meso- or metathorax that functions in flight.” [Snodgrass, 1935] is_a: appendage relationship: part_of thoracic segment relationship: has_function_in flight
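
The stanza on this slide can be read mechanically into a small data structure. The following is a minimal sketch, assuming the simplified tag-value layout shown on the slide (real OBO files also carry `[Term]` headers and identifiers, which are not handled here); `parse_stanza` is a hypothetical helper, not part of any OBO library.

```python
# Simplified stanza copied from the slide (textual definition omitted).
STANZA = """\
name: insect wing
is_a: appendage
relationship: part_of thoracic segment
relationship: has_function_in flight
"""

def parse_stanza(text):
    """Split 'tag: value' lines; relationships become (relation, target) pairs."""
    term = {"relationship": []}
    for line in text.splitlines():
        tag, _, value = line.partition(": ")
        if tag == "relationship":
            relation, _, target = value.partition(" ")
            term["relationship"].append((relation, target))
        else:
            term[tag] = value
    return term

term = parse_stanza(STANZA)
print(term)
```

The relationship pairs are the formalized part of the definition; the free-text `def:` line stays as documentation for human readers.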

+ Classification is transitive If A SubClassOf* B and B SubClassOf C then A SubClassOf C. All members of class A are members of class C. So, the definition of class C must apply to class A. * OWL (MS) SubClassOf ≅ OBO is_a

+ Classification is transitive ‘material anatomical entity’ <- is_a ‘sense organ’ <- is_a sensillum <- is_a ‘olfactory sensillum’ <- is_a ‘antennal basiconic sensillum’ ‘material anatomical entity’: “… has mass.” ‘sense organ’: “… functions in the detection of a stimulus involved in sensory perception.” sensillum: “A sense organ consisting of a small cluster of cells of various types.” ‘olfactory sensillum’: “… functions in the detection of smell” * OWL (MS) SubClassOf ≅ OBO is_a
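
The chain on this slide can be walked programmatically to list every definition a class inherits. This is a minimal sketch, assuming a single-inheritance toy hierarchy copied from the slide; `IS_A` and `ancestors` are illustrative names, and a real ontology would allow several parents per class.

```python
# Toy is_a hierarchy from the slide (child -> parent).
IS_A = {
    "antennal basiconic sensillum": "olfactory sensillum",
    "olfactory sensillum": "sensillum",
    "sensillum": "sense organ",
    "sense organ": "material anatomical entity",
}

def ancestors(term, is_a=IS_A):
    """Return every superclass of `term`, nearest first."""
    chain = []
    while term in is_a:
        term = is_a[term]
        chain.append(term)
    return chain

basiconic_ancestors = ancestors("antennal basiconic sensillum")
print(basiconic_ancestors)
```

Every class in the returned list contributes its definition, so an ‘antennal basiconic sensillum’ must have mass, detect a stimulus, consist of a small cluster of cells, and function in the detection of smell.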

+ Class – class relationships are quantified. Class:Class relationships are many to many. Does the relation apply to all or just some of the class? We specify this with quantifiers: ∀ : for all, all, only, every; ∃ : there exists, some. Cautionary note – Modeling knowledge as class hierarchies defined with quantified logic is extremely useful but limited. Don’t expect to be able to use it for everything you know! Expressivity of OWL is more limited still.

+ Relationships specify necessary conditions for class membership. Being part of an insect thorax is a necessary condition of being in the class ‘insect leg’. English: All insect legs are part of some (type of) insect thorax. OBO (quantifiers hidden): name: insect leg relationship: part_of thorax. OWL (MS): ‘insect leg’ SubClassOf part_of some thorax. PL: ∀x (leg(x) → ∃y (thorax(y) ∧ part_of(x,y))) * * ignoring time argument from OBO RO 2005

+ Classification is transitive If A SubClassOf* B and B SubClassOf C then A SubClassOf C. All members of class A are members of class C. So, the definition of class C must apply to class A. * OWL (MS) SubClassOf ≅ OBO is_a. (all) leg part_of some thorax; ‘front leg’ SubClassOf leg; therefore (all) ‘front leg’ part_of some thorax.

+ Directionality and quantifiers True: all ‘insect wing’ part_of some ‘insect thorax’ False: all ‘insect thorax’ has_part some ‘insect wing’ True: all ‘claw’ connected_to some ‘tarsal segment’ False: all ‘tarsal segment’ connected_to some claw
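
Why the "all ... some" reading is directional can be checked on a handful of individuals. The sketch below is an illustration only: the individuals (`wing1`, `thorax_b`, and so on) are hypothetical, and `all_some` is a made-up helper that tests the pattern "every subject is related to at least one object".

```python
def all_some(relation_pairs, subjects, objects):
    """True if every subject is related to at least one of the objects."""
    return all(
        any((s, o) in relation_pairs for o in objects)
        for s in subjects
    )

# Hypothetical individuals: thorax_a carries two wings, thorax_b is wingless.
wings = {"wing1", "wing2"}
thoraxes = {"thorax_a", "thorax_b"}
part_of = {("wing1", "thorax_a"), ("wing2", "thorax_a")}
has_part = {(whole, part) for part, whole in part_of}

wing_part_of_thorax = all_some(part_of, wings, thoraxes)    # every wing is in some thorax
thorax_has_part_wing = all_some(has_part, thoraxes, wings)  # fails: thorax_b has no wing
print(wing_part_of_thorax, thorax_has_part_wing)
```

A single wingless thorax is enough to falsify the has_part statement while leaving the part_of statement true, which is why the inverse of a true all-some relationship cannot be assumed.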

+ It is difficult to keep track of multiple classification chains to: ensure completeness; avoid redundancy; avoid introducing error due to inheritance of classification criteria from a distant ancestor Manually maintaining an ontology with multiple classification schemes is impractical

+ Automating multiple classification. The scientific knowledge an ontology contains can make the reasons for classification explicit. e.g. Any sense organ that functions in the detection of smell is an olfactory sense organ. All large basiconic sensilla of the antenna function in detection of smell. Therefore all large basiconic sensilla of the antenna are olfactory sense organs.

+ Automating multiple classification. We can specify that some set of necessary conditions for class membership is sufficient to determine class membership. English: Any sense organ that functions in the detection of smell is an olfactory sense organ. OWL (MS): ‘olfactory sense organ’ EquivalentTo: ‘sense organ’ that has_function_in some ‘detection of chemical stimulus involved in sensory perception of smell’ OBO: name: olfactory sense organ intersection_of: sense organ intersection_of: has_function_in ‘detection of chemical stimulus involved in sensory perception of smell’

+ Automating multiple classification. ‘olfactory sense organ’ EquivalentTo: sense organ that has_function_in some ‘detection of chemical stimulus involved in sensory perception of smell’ ‘large basiconic sensillum of antenna’ SubClassOf: ‘sense organ’; SubClassOf has_function_in some ‘detection of chemical stimulus involved in sensory perception of smell’ Reasoner concludes: ‘large basiconic sensillum of antenna’ SubClassOf ‘olfactory sense organ’ Keene & Waddell, 2007
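
The inference on this slide can be imitated with a few lines of code. This is a toy sketch, not how an OWL reasoner works internally: it only matches a directly asserted genus and a directly asserted relationship, and the names `ASSERTED`, `DEFINED` and `infer_subclasses` are assumptions made for illustration. The second asserted class, ‘mechanosensory bristle’, is included only as a contrast case with no olfactory function recorded.

```python
SMELL = "detection of chemical stimulus involved in sensory perception of smell"

# Asserted necessary conditions: class -> (named superclasses, relationships).
ASSERTED = {
    "large basiconic sensillum of antenna": (
        {"sense organ"},
        {("has_function_in", SMELL)},
    ),
    "mechanosensory bristle": ({"sense organ"}, set()),
}

# Defined classes (EquivalentTo): class -> (genus, differentia).
DEFINED = {
    "olfactory sense organ": ("sense organ", ("has_function_in", SMELL)),
}

def infer_subclasses(asserted, defined):
    """Classify each asserted class under every defined class whose conditions it meets."""
    inferred = set()
    for cls, (supers, rels) in asserted.items():
        for defined_cls, (genus, differentia) in defined.items():
            if genus in supers and differentia in rels:
                inferred.add((cls, defined_cls))
    return inferred

inferred = infer_subclasses(ASSERTED, DEFINED)
print(inferred)
```

Only the sensillum with the recorded smell-detection function is classified under ‘olfactory sense organ’; nobody had to assert that link by hand.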

+ Use other people’s work to build your classification Gene Ontology classification of sensory processes:

+ Automating multiple classification.

+ Some extra OWL expressivity In OWL we can also specify number (cardinality): (all) insect: SubClassOf has_component exactly 6 leg

+ Error checking is essential – everybody makes mistakes Some classes don’t have instances in common. Nothing can be an oak tree and a fruit fly; an anatomical structure and a biological process. We say that such classes are disjoint Declaring classes to be disjoint allows reasoners to find contradictions. This is especially powerful when combined with domain and range constraints. This is your main means of error checking. Use it extensively. It also speeds up some reasoners.

+ Error checking - domain and range constraints ‘cortisol secretion’ SubClassOf ‘endocrine hormone secretion’ SubClassOf process. ‘adrenal gland’ SubClassOf ‘endocrine gland’ SubClassOf structure. structure DisjointWith process (nothing can be both a structure, e.g. adrenal gland, and a process, e.g. cortisol secretion). has_function_in domain: structure* range: process*. If x has_function_in y then x must be a structure and y must be a process. Now if I mistakenly add: cortisol secretion has_function_in some adrenal gland. Inconsistency: cortisol secretion SubClassOf structure and process. * more strictly, domain = continuant; range = occurrent
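
The error on this slide can be caught by a very small checker. The following is a sketch under simplifying assumptions: single inheritance, one disjointness axiom, and one relation with a declared domain and range. The names `SUBCLASS_OF`, `DISJOINT`, `DOMAIN_RANGE` and `check` are hypothetical, and a real reasoner derives such contradictions from the axioms rather than from hand-written rules.

```python
SUBCLASS_OF = {
    "cortisol secretion": "endocrine hormone secretion",
    "endocrine hormone secretion": "process",
    "adrenal gland": "endocrine gland",
    "endocrine gland": "structure",
}
DISJOINT = {frozenset({"structure", "process"})}
DOMAIN_RANGE = {"has_function_in": ("structure", "process")}

def superclasses(cls):
    """Return `cls` together with all of its asserted ancestors."""
    result = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.add(cls)
    return result

def check(subject, relation, filler):
    """List domain/range violations for 'subject relation some filler'."""
    domain, range_ = DOMAIN_RANGE[relation]
    problems = []
    for cls, required, role in ((subject, domain, "domain"), (filler, range_, "range")):
        for ancestor in superclasses(cls):
            if frozenset({ancestor, required}) in DISJOINT:
                problems.append(
                    f"{role} violation: {cls} is a {ancestor}, "
                    f"but {relation} requires a {required}"
                )
    return problems

good = check("adrenal gland", "has_function_in", "cortisol secretion")
bad = check("cortisol secretion", "has_function_in", "adrenal gland")
print(good)
print(bad)
```

The correct axiom passes silently, while the reversed one is flagged on both its domain and its range.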


+ Reasoner assisted error checking by eye Keep an eye on classification inferred by the reasoner. Protégé shows inferred classification and inherited relationships – keep an eye on these

+ Reasoner assisted error checking by eye Run some test queries – do they give the answers you expect?

+ Mereology part_of is transitive: If A part_of B part_of C part_of D, then A part_of D. overlap is not transitive: If A overlaps B overlaps C, then A may or may not overlap C. [Diagrams: nested regions A, B, C, D; overlapping regions A, B, C]

+ Transitivity of part_of Given (All) ‘insect coxa’ part_of some ‘insect leg’ (All) ‘insect leg’ part_of some ‘insect thoracic segment’ (All) ‘insect thoracic segment’ part_of some ‘insect thorax’ Then (All) ‘insect coxa’ part_of some ‘insect thorax’
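
The conclusion on this slide is what a transitive closure over the asserted part_of statements produces. A minimal sketch follows, treating each "(All) X part_of some Y" statement as a pair; `transitive_closure` is an illustrative helper that repeats the composition step until nothing new appears.

```python
def transitive_closure(pairs):
    """Repeatedly compose (a, b) and (b, d) into (a, d) until no new pair appears."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

# Asserted statements from the slide, as (part, whole) pairs.
asserted_part_of = {
    ("insect coxa", "insect leg"),
    ("insect leg", "insect thoracic segment"),
    ("insect thoracic segment", "insect thorax"),
}

all_part_of = transitive_closure(asserted_part_of)
print(sorted(all_part_of - asserted_part_of))
```

Three asserted statements yield three further inferred ones, including ‘insect coxa’ part_of ‘insect thorax’.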

+ Automating partonomy As for class – maintaining multiple overlapping part hierarchies by hand is hard. Some scope for auto-populating partonomies, e.g. English: Any anatomical structure that functions in endocrine hormone secretion is part of some endocrine system. OWL: (‘anatomical structure’ that has_function_in some ‘endocrine hormone secretion’) SubClassOf (part_of some ‘endocrine system’) OBO: name: endocrine system component intersection_of: anatomical structure intersection_of: has_function_in ‘endocrine hormone secretion’ relationship: part_of endocrine system

+ Declaring spatial disjointness provides error checking for partonomy In OWL: part_of some X DisjointWith part_of some Y

+ Reasoning with overlap A overlaps B if and only if there exists some X such that X part_of A and X part_of B. Rules: If X part_of A then X overlaps A. If A has_part X then A overlaps X. Relation hierarchy: part_of * overlaps; has_part * overlaps. In OWL (MS) * = SubPropertyOf. In OBO * = is_a.

+ Reasoning with overlap More rules: If A has_part X and X part_of B then A overlaps B. If C has_part A and A overlaps B then C overlaps B. If B overlaps A and A part_of C then B overlaps C. In OWL (MS): has_part o part_of -> overlaps. In OBO: name: overlaps holds_over_chain: has_part part_of
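
The chain `has_part o part_of -> overlaps` can be applied directly to a set of parthood statements. The sketch below works on hypothetical regions A, B, C and X, assumes the part_of pairs are already transitively closed, and also treats part_of and has_part as sub-relations of overlaps; `overlaps_from_parts` is an illustrative name.

```python
from itertools import product

def overlaps_from_parts(part_of):
    """Derive overlaps pairs from (part, whole) pairs."""
    has_part = {(whole, part) for part, whole in part_of}
    # Property chain: A has_part X and X part_of B  =>  A overlaps B.
    chain = {(a, b) for (a, x), (y, b) in product(has_part, part_of) if x == y}
    # part_of and has_part are themselves sub-relations of overlaps.
    return chain | part_of | has_part

# Hypothetical regions: X lies in both A and B; A (and so X) is part of C.
part_of = {("X", "A"), ("X", "B"), ("A", "C"), ("X", "C")}
overlaps = overlaps_from_parts(part_of)
print(sorted(overlaps))
```

Because X is shared, A overlaps B, and because A is part of C, C overlaps B as well; the derived relation comes out symmetric even though part_of is not.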

+ Image - Greg Jefferis Keene & Waddell, 2007

+ Shortcut relations In OWL, we can write compound class expressions: ‘antennal lobe projection neuron’ has_part some (soma that part_of some ‘antennal lobe cortex’). But these can quickly get long and verbose: ‘DL1 adPN’ has_part some (postsynaptic membrane (GO) that part_of some (synapse (GO) that part_of some ‘DL1 glomerulus’))

+ Shortcut relations Shortcut relations stand in for compound class expressions. ‘DL1 adPN’ has_part some (postsynaptic membrane (GO) that part_of some (synapse (GO) that part_of some ‘DL1 glomerulus’)) > ‘DL1 adPN’ has_postsynaptic_terminal_in some ‘DL1 glomerulus’. Can be expanded if detail is needed. Provides rigorous documentation of meaning.
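
Expanding a shortcut relation is essentially template substitution. This is a minimal sketch assuming the expansion is stored as a text template in a Manchester-like syntax; the `SHORTCUTS` table and `expand` function are hypothetical, and the "(GO)" source annotations from the slide are left out of the template for brevity.

```python
# Hypothetical expansion table: shortcut relation -> class-expression template.
SHORTCUTS = {
    "has_postsynaptic_terminal_in": (
        "has_part some ('postsynaptic membrane' that part_of some "
        "(synapse that part_of some {filler}))"
    ),
}

def expand(relation, filler):
    """Replace a shortcut relation by the compound expression it stands for."""
    return SHORTCUTS[relation].format(filler=filler)

expanded = expand("has_postsynaptic_terminal_in", "'DL1 glomerulus'")
print(expanded)
```

Keeping the template next to the relation is one way to make the short form easy to annotate with while the long form remains available as documentation of its meaning.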

+ Where to start? Make a flat list of the terms you need and list the types of classification you want to use to link them together. Has someone already formalized this type of classification? If so, use their pattern. If not – draft some formalizations yourself: Are any simplifications justifiable – or likely to be too misleading? DON’T FORMALIZE FOR THE SAKE OF IT! Some classifications are hard to formalize well – or may be best left to human judgment. Import upper classifications and relations Import classifications to root for all foreign terms used. Work with ontologists to formally define relations where possible But don’t let this become a road block!

+ Technical issues Imports: Importing whole ontologies is easy in both OBO and OWL But importing large ontologies is impractical in both Generating simple slices of OBO ontologies is easy (have perl scripts, happy to share) Generating slices of OWL ontologies – some tools (Ontofox), but still need work.

+ Developing nested ontologies [Diagram labels: CARO; VAO; TAO; Present; Modularized ontology]

+ Resources CARO: upper ontology; new version being prepared, out soon. Some standard patterns using qualities. FUNCARO: provides standard patterns for representing function using CARO + GO. ro.owl: new home for OBO relations, particularly shortcut relations. Imports fundamental relations from BFO (Basic Formal Ontology).

+ There are lots of scientifically useful ways to classify a bit of anatomy: parts and their arrangement - its relation to other structures what is it: part of; connected to; adjacent to, overlapping? its shape its function its developmental origins its species or clade its evolutionary history? Multiple classification

+ type of classification | relation | object of relation
what parts does it have? | has_part; has_component (for counts) | anatomical entity
what is it part of? | part_of | anatomical entity
quality (e.g. shape) | has_quality | PATO term
function | has_function_in; capable_of (?) | GO; perhaps behavior ontologies?
developmental origin | develops_from | anatomical entity
developmental fate | develops_into | anatomical entity
connectivity (e.g. muscle/tendon to bone) | connected_to | anatomical entity
evolutionary origin | derived_by_descent_from ?; homologous_to ? | anatomical entity
species/clade/taxon | in_taxon ? | species/clade/taxon

+ Avoiding tangled pits of misery There are no perfect answers, but these might help. You do this by hand: good annotation and documentation; good, consistent style. Automated classification and consistency checking gives: avoidance of redundancy (the computer keeps track of things for you); an automated, consistent set of tests of existing functionality (j-unit / consistency); constant testing during development. Importing useful slices of other ontologies gives you: modularity. Upper ontologies give you: design patterns.

+ Take home messages An ontology is a classification There are lots of useful ways to classify stuff Maintaining multiple classification schemes by hand is impractical So you should automate it. Everybody makes mistakes Let the computer find errors for you Use the reasoner to test as you build Re-use other people’s work where possible import class hierarchies use common patterns Cautionary note – formal languages have limitations. Don’t expect to be able to express everything!

+ Acknowledgments Virtual Fly Brain - Michael Ashburner, Cahir O’Kane, Douglas Armstrong, Simon Reeve, Nestor Milyaev FlyBase HAO – Andy Deans/Matt Yoder/Jim Balhoff Chris Mungall, LBL Berkeley Melissa Haendel, eagle-I Alan Ruttenberg, SUNY Buffalo Barry Smith, SUNY Buffalo Robert Stevens, (Co-ode; OWL-API) Manchester University BBSRC (grant award BB/G02247X/1)


+ Drosophila anatomy ontology as an example Circa 2006: tangled pit of misery term. 6% definitions. Many of them not suitable (give example). Sufficient inconsistency that not reliable for grouping terms / reasoning - give examples Sufficiently incomplete that most queries/groupings missed very many terms - use mechanosensory bristle (or something similar) as example. Editing a nightmare - unclear what original reasons were for relationships. For any term - not clear what relations already inferred or how to

class | (0% inferred) | 2011 (100% inferred)
sense organ | |
chemosensory organ | 14 | 96
gustatory organ | 0 | 49
olfactory organ | 0 | 37