Introduction to anatomy ontology building. David Osumi-Sutherland, FlyBase (www.flybase.org), Virtual Fly Brain (www.virtualflybrain.org)


1 + Introduction to anatomy ontology building David Osumi-Sutherland FlyBase (www.flybase.org) Virtual Fly Brain (www.virtualflybrain.org)

2 + Take home messages An ontology is a classification. There are lots of useful ways to classify stuff. Maintaining multiple classification schemes by hand is impractical, so you should automate it. Everybody makes mistakes, so you should get the computer to find errors for you. Re-use other people’s work where possible: import class hierarchies; use common patterns. Cautionary note – formal languages have limitations. Don’t expect to be able to express everything!

3 + What is an ontology ? A set of defined, inter-related terms to use in annotation/metadata/knowledge bases. A classification A query-able store of (scientific) knowledge that uses logical inference.

4 + What is an ontology ? A set of defined, inter-related terms to use in annotation/metadata/knowledge bases. A classification A query-able store of (scientific) knowledge that uses logical inference. depends on

5 + What (use) is an ontology? A set of defined, inter-related terms to use in annotation. Annotation of papers; specimens; gene expression; phenotype… Use of common annotation terms across multiple databases allows easy integration. Relations between terms allow annotations to be grouped in scientifically meaningful ways. This requires an ontology to be an accurate and scientifically meaningful classification and store of scientific knowledge.

6 + What is an ontology ? A classification There are lots of scientifically useful ways to classify a bit of anatomy. its parts and their arrangement its relation to other structures what is it: part of; connected to; adjacent to, overlapping? its shape its function its developmental origins its species or clade its evolutionary history?

7 + What is an ontology? The scientific knowledge an ontology contains can make the reasons for classification explicit. e.g. Any sense organ that functions in the detection of smell is an olfactory sense organ. All large basiconic sensilla of the antenna function in detection of smell. Therefore all large basiconic sensilla of the antenna are olfactory sense organs.

8 + Virtual Fly Brain Demo

9 + Why ontology development is like software or database development Ideal case – maintainable basic maintenance (e.g. correcting simple errors) is easy scalable grow your project as large as you need without breaking extensible easy to add new functionality without breaking existing integrate-able Can integrate easily with work of others – so you don’t have to solve all problems yourself

10 + Why ontology development is like software or database development Ideal case – Future editors can build on your work maintainable – By multiple editors basic maintenance (e.g. correcting simple errors) is easy scalable – By multiple editors grow your project as large as you need without breaking extensible – By multiple editors easy to add new functionality without breaking existing integrate-able Can integrate easily with work of others – so you don’t have to solve all problems yourself

11 + How not to build ontologies - The trap A small, simple ontology or program with one developer can get away with practices that a large one cannot. Given: shallow, single inheritance classification (each class has 0-1 superclasses); very few relationship types; < 1000 terms, it is feasible to: have little annotation/documentation; have no automated error checking; have no automated classification; keep redundancy to a minimum by hand.

12 + How not to build ontologies - The trap Small, simple ontologies and programs have a habit of growing large and complicated. Users demand lots more terms for annotation. Users demand multiple axes of classification: there is no scientific reason to favor one over another. Users demand, and editors favor, multiple relationship types to record information they believe scientifically important. Editors/coders move on; someone else has to continue their work. Is the documentation mainly in the old developer's head?

13 + How not to build ontologies - The trap Worst case scenario – the tangled pit of misery: Difficult, perhaps impossible to maintain or extend Tangled, convoluted, redundant structure with little or no documentation or annotation. Editing tends to inadvertently break previous functionality. Little or no error checking means you don't even notice when you break stuff. Users find out later. Even you can't easily edit what you built 6 months ago without getting confused and making a mess.

14 + Avoiding tangled pits of misery There are no perfect answers, but these might help: good annotation and documentation; good, consistent style; avoidance of redundancy; let the computer keep track of things for you; modularity; automate a consistent set of tests of existing functionality (j-unit / consistency); constant testing during development; design patterns.

15 + Good Practice 1: Good annotation and documentation Clear textual definitions with references ensure accurate manual annotation make assertions of scientific fact trace-able serve as documentation for future ontology developers Also useful to record – for users and future developers: Experimental evidence for assertions of scientific fact Notes on confusing or conflicting usage of terms Reasons for design choices/compromises

16 + Options for formalization OWL: W3C standard; decidable; big open source community of tool developers; multiple fast reasoners, getting better all the time; easy to read syntax – OWL Manchester syntax (OWL MS). OBO: best thought of as a subset of OWL, with which it is increasingly integrated; limited community of tool developers; easy(ish) to read syntax. Common Logic: very powerful, but easy to come up with solutions that can’t be usefully reasoned with.

17 + Relationships are the formalized part of a definition. The criteria for class membership are recorded using textual definitions, at least some elements of which are formalized as relationships. name: insect wing def: “A membranous dorsal appendage of the meso- or metathorax that functions in flight.” [Snodgrass, 1935] is_a: appendage relationship: part_of thoracic segment relationship: has_function_in flight

18 + Classification is transitive If A SubClassOf* B and B SubClassOf C then A SubClassOf C. All members of class A are members of class C. So, the definition of class C must apply to class A. * OWL (MS) SubClassOf ≅ OBO is_a

19 + Classification is transitive ‘material anatomical entity’ <- is_a ‘sense organ’ <- is_a sensillum <- is_a ‘olfactory sensillum’ <- is_a ‘antennal basiconic sensillum’ ‘material anatomical entity’: “… has mass.” ‘sense organ’: “… functions in the detection of a stimulus involved in sensory perception.” sensillum: “A sense organ consisting of a small cluster of cells of various types.” ‘olfactory sensillum’: “… functions in the detection of smell” * OWL (MS) SubClassOf ≅ OBO is_a
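
A minimal Python sketch of the transitivity described on slides 18 and 19, assuming the is_a chain above is stored as a plain dictionary. The `IS_A` table and the `ancestors` helper are illustrative names, not part of any OBO or OWL tool.

```python
# is_a (SubClassOf) edges copied from the sense-organ chain on the slide.
IS_A = {
    "antennal basiconic sensillum": ["olfactory sensillum"],
    "olfactory sensillum": ["sensillum"],
    "sensillum": ["sense organ"],
    "sense organ": ["material anatomical entity"],
}


def ancestors(term, is_a=IS_A):
    """Return every superclass reachable from `term` by following is_a edges."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


# Every definition attached to one of these classes must also hold for the term.
SUPERCLASSES = ancestors("antennal basiconic sensillum")
```

Because the definitions of all four ancestors apply to 'antennal basiconic sensillum', an error in a distant ancestor's definition silently propagates to every descendant.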

20 + class – class relationships are quantified Class:Class relationships are many to many. Does the relation apply to all or just some of the class? We specify this with quantifiers: ∀ : for all, all, only, every; ∃ : there exists, some. Cautionary note – Modeling knowledge as class hierarchies defined with quantified logic is extremely useful but limited. Don’t expect to be able to use it for everything you know! Expressivity of OWL is more limited still.

21 + relationships specify necessary conditions for class membership Being part of an insect thorax is a necessary condition of being in the class ‘insect leg’. English: All insect legs are part of some (type of) insect thorax. OBO (quantifiers hidden): name: insect leg relationship: part_of thorax. OWL (MS): ‘insect leg’ SubClassOf part_of some thorax. PL*: ∀x (leg(x) → ∃y (thorax(y) ∧ part_of(x, y))) * ignoring time argument from OBO RO 2005

22 + Classification is transitive If A SubClassOf* B and B SubClassOf C then A SubClassOf C. All members of class A are members of class C. So, the definition of class C must apply to class A. * OWL (MS) SubClassOf ≅ OBO is_a. (all) leg part_of some thorax; ‘front leg’ SubClassOf leg; therefore (all) ‘front leg’ part_of some thorax
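
The inheritance of relationships down the class hierarchy can be sketched in a few lines of Python. This is an illustration only: `IS_A`, `RELATIONSHIPS` and `inherited_relationships` are hypothetical names, and each relationship pair is read as "relation some filler".

```python
# Asserted axioms from the slide: 'front leg' is_a leg; leg part_of some thorax.
IS_A = {"front leg": ["leg"]}
RELATIONSHIPS = {"leg": [("part_of", "thorax")]}


def inherited_relationships(term, is_a, relationships):
    """Collect relationships asserted on `term` or on any of its superclasses."""
    result = set()
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        result.update(relationships.get(current, []))
        stack.extend(is_a.get(current, []))
    return result


FRONT_LEG_RELATIONSHIPS = inherited_relationships("front leg", IS_A, RELATIONSHIPS)
```

Nothing is asserted directly on 'front leg', yet it ends up with `part_of some thorax` because every front leg is a leg.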

23 + Directionality and quantifiers True: all ‘insect wing’ part_of some ‘insect thorax’ False: all ‘insect thorax’ has_part some ‘insect wing’ True: all ‘claw’ connected_to some ‘tarsal segment’ False: all ‘tarsal segment’ connected_to some claw
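
The asymmetry between the two directions can be checked on a toy set of individuals. The sketch below is an assumption-laden illustration: the individuals (`wing_1`, `thorax_wingless`, and so on) are invented, and `all_some` is a hypothetical helper that tests an "all A relation some B" statement against explicit instance data.

```python
def all_some(instances_a, relation_pairs, instances_b):
    """True if every a is related to at least one b (all A relation some B)."""
    return all(
        any((a, b) in relation_pairs for b in instances_b)
        for a in instances_a
    )


# Hypothetical individuals: one winged thorax and one wingless thorax.
WINGS = {"wing_1", "wing_2"}
THORACES = {"thorax_winged", "thorax_wingless"}
PART_OF = {("wing_1", "thorax_winged"), ("wing_2", "thorax_winged")}
HAS_PART = {(whole, part) for part, whole in PART_OF}

WING_PART_OF_THORAX = all_some(WINGS, PART_OF, THORACES)
THORAX_HAS_PART_WING = all_some(THORACES, HAS_PART, WINGS)
```

A single wingless thorax is enough to falsify "all 'insect thorax' has_part some 'insect wing'", while "all 'insect wing' part_of some 'insect thorax'" still holds.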

24 + It is difficult to keep track of multiple classification chains to: ensure completeness; avoid redundancy; avoid introducing error due to inheritance of classification criteria from a distant ancestor Manually maintaining an ontology with multiple classification schemes is impractical

25 + Automating multiple classification. The scientific knowledge an ontology contains can make the reasons for classification explicit. e.g. Any sense organ that functions in the detection of smell is an olfactory sense organ. All large basiconic sensilla of the antenna function in detection of smell. Therefore all large basiconic sensilla of the antenna are olfactory sense organs.

26 + Automating multiple classification. We can specify that some set of necessary conditions for class membership is sufficient to determine class membership. English: Any sense organ that functions in the detection of smell is an olfactory sense organ. OWL (MS): ‘olfactory sense organ’ EquivalentTo: ‘sense organ’ that has_function_in some ‘detection of chemical stimulus involved in sensory perception of smell’. OBO: name: olfactory sense organ intersection_of: sense organ intersection_of: has_function_in ‘detection of chemical stimulus involved in sensory perception of smell’

27 + Automating multiple classification. ‘olfactory sense organ’ EquivalentTo: sense organ that has_function_in some ‘detection of chemical stimulus involved in sensory perception of smell’ ‘large basiconic sensillum of antenna’ SubClassOf: ‘sense organ’; SubClassOf has_function_in some ‘detection of chemical stimulus involved in sensory perception of smell’ Reasoner concludes: ‘large basiconic sensillum of antenna’ SubClassOf ‘olfactory sense organ’ Keene & Waddell, 2007
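
What the reasoner does here can be approximated by a toy classifier in Python. This is a deliberately simplified sketch, not how an OWL reasoner is implemented: it only matches directly asserted superclasses and identical fillers, and the second class in `ASSERTED` is hypothetical, included only to show a non-match.

```python
# The GO term name is copied from the slide; all other names are illustrative.
SMELL = "detection of chemical stimulus involved in sensory perception of smell"

# Necessary conditions: named superclasses plus (relation, filler) pairs,
# each pair read as "relation some filler".
ASSERTED = {
    "large basiconic sensillum of antenna": {
        "is_a": {"sense organ"},
        "some": {("has_function_in", SMELL)},
    },
    # Hypothetical class used only as a negative example.
    "example organ without smell function": {
        "is_a": {"sense organ"},
        "some": set(),
    },
}

# Necessary and sufficient conditions (EquivalentTo / intersection_of).
DEFINED = {
    "olfactory sense organ": {
        "genus": "sense organ",
        "some": {("has_function_in", SMELL)},
    },
}


def infer_subclasses(asserted, defined):
    """Return (class, defined class) pairs where the class meets the definition."""
    inferred = set()
    for cls, axioms in asserted.items():
        for name, definition in defined.items():
            genus_ok = definition["genus"] in axioms["is_a"]
            restrictions_ok = definition["some"] <= axioms["some"]
            if genus_ok and restrictions_ok:
                inferred.add((cls, name))
    return inferred


INFERRED = infer_subclasses(ASSERTED, DEFINED)
```

The editor never asserts that the sensillum is an olfactory sense organ; the classification falls out of the recorded function, which is what keeps multiple classification axes maintainable.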

28 + Use other people’s work to build your classification Gene Ontology classification of sensory processes:

29 + Automating multiple classification.

30 + Some extra OWL expressivity In OWL we can also specify number (cardinality): (all) insect: SubClassOf has_component exactly 6 leg

31 + Error checking is essential – everybody makes mistakes Some classes don’t have instances in common. Nothing can be an oak tree and a fruit fly; an anatomical structure and a biological process. We say that such classes are disjoint Declaring classes to be disjoint allows reasoners to find contradictions. This is especially powerful when combined with domain and range constraints. This is your main means of error checking. Use it extensively. It also speeds up some reasoners.

32 + Error checking - domain and range constraints ‘cortisol secretion’ SubClassOf ‘endocrine hormone secretion’ SubClassOf process. ‘adrenal gland’ SubClassOf ‘endocrine gland’ SubClassOf structure. structure DisjointWith process (nothing can be both a structure, e.g. adrenal gland, and a process, e.g. cortisol secretion). has_function_in domain: structure*; range: process*. If x has_function_in y then x must be a structure and y must be a process. Now if I mistakenly add: ‘cortisol secretion’ has_function_in some ‘adrenal gland’. Inconsistency: ‘cortisol secretion’ SubClassOf structure and process. * more strictly, structure = continuant; process = occurrent
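
The way disjointness combines with domain and range constraints can be sketched in Python. This is a hand-rolled illustration under simplifying assumptions (single inheritance, one relation); `check_axiom` and the tables around it are hypothetical names, not a reasoner API.

```python
# Class hierarchy, disjointness and domain/range constraints from the slide.
IS_A = {
    "cortisol secretion": "endocrine hormone secretion",
    "endocrine hormone secretion": "process",
    "adrenal gland": "endocrine gland",
    "endocrine gland": "structure",
}
DISJOINT = {frozenset({"structure", "process"})}
DOMAIN_RANGE = {"has_function_in": ("structure", "process")}


def superclasses(term):
    """Return `term` together with all of its is_a ancestors."""
    result = {term}
    while term in IS_A:
        term = IS_A[term]
        result.add(term)
    return result


def check_axiom(subject, relation, obj):
    """Return (term, disjoint pair) problems caused by 'subject relation some obj'."""
    domain, range_ = DOMAIN_RANGE[relation]
    # The domain constraint types the subject; the range constraint types the object.
    typed = (
        (subject, superclasses(subject) | {domain}),
        (obj, superclasses(obj) | {range_}),
    )
    problems = []
    for term, types in typed:
        for pair in DISJOINT:
            if pair <= types:
                problems.append((term, tuple(sorted(pair))))
    return problems


MISTAKE = check_axiom("cortisol secretion", "has_function_in", "adrenal gland")
CORRECT = check_axiom("adrenal gland", "has_function_in", "cortisol secretion")
```

The reversed axiom forces 'cortisol secretion' to be both a process and a structure, so the check reports it; the axiom written the right way round produces no problems.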

33 + Error checking is essential – everybody makes mistakes Some classes don’t have instances in common. Nothing can be an oak tree and a fruit fly; an anatomical structure and a biological process. We say that such classes are disjoint Declaring classes to be disjoint allows reasoners to find contradictions. This is especially powerful when combined with domain and range constraints. This is your main means of error checking. Use it extensively. It also speeds up some reasoners.

34 + Reasoner assisted error checking by eye Keep an eye on classification inferred by the reasoner. Protégé shows inferred classification and inherited relationships – keep an eye on these

35 + Reasoner assisted error checking by eye Run some test queries – do they give the answers you expect?

36 + Mereology part_of is transitive: if A part_of B part_of C part_of D, then A part_of D. overlaps is not transitive: if A overlaps B overlaps C, then A may or may not overlap C. [Figure: regions A, B, C, D illustrating part_of and overlap]

37 + Transitivity of part_of Given (All) ‘insect coxa’ part_of some ‘insect leg’ (All) ‘insect leg’ part_of some ‘insect thoracic segment’ (All) ‘insect thoracic segment’ part_of some ‘insect thorax’ Then (All) ‘insect coxa’ part_of some ‘insect thorax’

38 + Automating partonomy As for class – maintaining multiple overlapping part hierarchies by hand is hard. Some scope for auto-populating partonomies, e.g. English: Any anatomical structure that functions in endocrine hormone secretion is part of some endocrine system. OWL: (‘anatomical structure’ that has_function_in some ‘endocrine hormone secretion’) SubClassOf (part_of some ‘endocrine system’). OBO: name: endocrine system component intersection_of: ‘anatomical structure’ intersection_of: has_function_in ‘endocrine hormone secretion’ relationship: part_of endocrine system

39 + Declaring spatial disjointness provides error checking for partonomy In OWL: part_of some X DisjointWith part_of some Y

40 + Reasoning with overlap A overlaps B if and only if there exists some X such that X part_of A and X part_of B. Rules: if X part_of A then X overlaps A; if A has_part X then A overlaps X. Relation hierarchy: overlaps, with sub-relations (*) part_of and has_part. In OWL (MS) * = SubPropertyOf; in OBO * = is_a.

41 + Reasoning with overlap More rules: If A has_part X and X part_of B then A overlaps B. If C has_part A and A overlaps B then C overlaps B. If B overlaps A and A part_of C then B overlaps C. In OWL (MS): has_part o part_of -> overlaps. In OBO: name: overlaps holds_over_chain: has_part part_of
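
The chain has_part o part_of -> overlaps can be sketched in Python over a toy set of regions. The region names A, B, C and X follow the slides; the functions and the `PART_OF` data are illustrative assumptions, and the code treats the regions as individuals rather than classes.

```python
from itertools import product


def part_of_closure(part_of):
    """Transitive closure of a set of (part, whole) pairs."""
    closure = set(part_of)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def overlaps(part_of):
    """Return unordered pairs of regions that overlap, given part_of pairs."""
    closure = part_of_closure(part_of)
    result = set()
    # part_of is a sub-relation of overlaps: X part_of A => X overlaps A.
    for part, whole in closure:
        result.add(frozenset({part, whole}))
    # Chain: A has_part X and X part_of B => A overlaps B.
    for (x, a), (y, b) in product(closure, repeat=2):
        if x == y and a != b:
            result.add(frozenset({a, b}))
    return result


# X is a shared part of A and B; A is in turn part of C.
PART_OF = {("X", "A"), ("X", "B"), ("A", "C")}
OVERLAPS = overlaps(PART_OF)
```

Storing each result as an unordered pair reflects the fact that overlap is symmetric; note that the function never chains overlaps to overlaps, because overlap is not transitive.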

42 + Image - Greg Jefferis Keene & Waddell, 2007

43 + Shortcut relations In OWL, we can write compound class expressions: ‘antennal lobe projection neuron’ has_part some (soma that part_of some ‘antennal lobe cortex’). But these can quickly get long and verbose: ‘DL1 adPN’ has_part some (postsynaptic membrane (GO) that part_of some (synapse (GO) that part_of some ‘DL1 glomerulus’))

44 + Shortcut relations Shortcut relations stand in for compound class expressions. ‘DL1 adPN’ has_part some (postsynaptic membrane (GO) that part_of some (synapse (GO) that part_of some ‘DL1 glomerulus’)) > ‘DL1 adPN’ has_postsynaptic_terminal_in some ‘DL1 glomerulus’. Can be expanded if detail needed. Provides rigorous documentation of meaning.
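
Expanding a shortcut relation back into its compound class expression is essentially template substitution. The Python sketch below assumes a hypothetical `SHORTCUTS` table and `expand` function; the expansion text mirrors the expression on the slide, with the "(GO)" source tags dropped for brevity.

```python
# Hypothetical table mapping a shortcut relation to its expansion template.
SHORTCUTS = {
    "has_postsynaptic_terminal_in": (
        "has_part some ('postsynaptic membrane' that part_of some "
        "(synapse that part_of some {filler}))"
    ),
}


def expand(subject, relation, filler):
    """Render 'subject relation some filler', expanding known shortcut relations."""
    template = SHORTCUTS.get(relation)
    if template is None:
        # Not a shortcut: emit the axiom unchanged.
        return f"{subject} SubClassOf {relation} some {filler}"
    return f"{subject} SubClassOf " + template.format(filler=filler)


EXPANDED = expand("'DL1 adPN'", "has_postsynaptic_terminal_in", "'DL1 glomerulus'")
```

Editors work with the short form, and the long form is generated only when a query or reasoner needs the detail.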

45 + Where to start? Make a flat list of the terms you need and list the types of classification you want to use to link them together. Has someone already formalized this type of classification? If so, use their pattern. If not – draft some formalizations yourself: Are any simplifications justifiable – or likely to be too misleading? DON’T FORMALIZE FOR THE SAKE OF IT! Some classifications are hard to formalize well – or may be best left to human judgment. Import upper classifications and relations Import classifications to root for all foreign terms used. Work with ontologists to formally define relations where possible But don’t let this become a road block!

46 + Technical issues Imports: Importing whole ontologies is easy in both OBO and OWL But importing large ontologies is impractical in both Generating simple slices of OBO ontologies is easy (have perl scripts, happy to share) Generating slices of OWL ontologies – some tools (Ontofox), but still need work.

47 + Developing nested ontologies [Figure: CARO, VAO and TAO, with labels "Present" and "Modularized ontology"]

48 + Resources CARO – upper ontology; new version being prepared, out soon. Some standard patterns using qualities. FUNCARO – provides standard patterns for representing function using CARO + GO. ro.owl – new home for OBO relations, particularly shortcut relations. Imports fundamental relations from BFO (Basic Formal Ontology).

49 + There are lots of scientifically useful ways to classify a bit of anatomy: parts and their arrangement - its relation to other structures what is it: part of; connected to; adjacent to, overlapping? its shape its function its developmental origins its species or clade its evolutionary history? Multiple classification

50 + type of classification | relation | object of relation
what parts does it have? | has_part; has_component (for counts) | anatomical entity
what is it part of? | part_of | anatomical entity
quality (e.g. shape) | has_quality | PATO term
function | has_function_in; capable_of (?) | GO; perhaps behavior ontologies?
developmental origin | develops_from | anatomical entity
developmental fate | develops_into | anatomical entity
connectivity (e.g. muscle/tendon to bone) | connected_to | anatomical entity
evolutionary origin | derived_by_descent_from ?; homologous_to ? | anatomical entity
species/clade/taxon | in_taxon ? | species/clade/taxon

51 + Avoiding tangled pits of misery There are no perfect answers, but these might help. You do this by hand: good annotation and documentation; good, consistent style. Automated classification and consistency checking gives: avoidance of redundancy (the computer keeps track of things for you); automation of a consistent set of tests of existing functionality (j-unit / consistency); constant testing during development. Importing useful slices of other ontologies gives you: modularity. Upper ontologies give you: design patterns.

52 + Take home messages An ontology is a classification There are lots of useful ways to classify stuff Maintaining multiple classification schemes by hand is impractical So you should automate it. Everybody makes mistakes Let the computer find errors for you Use the reasoner to test as you build Re-use other people’s work where possible import class hierarchies use common patterns Cautionary note – formal languages have limitations. Don’t expect to be able to express everything!

53 + Acknowledgments Virtual Fly Brain - Michael Ashburner, Cahir O’Kane, Douglas Armstrong, Simon Reeve, Nestor Milyaev FlyBase HAO – Andy Deans/Matt Yoder/Jim Balhoff Chris Mungall, LBL Berkeley Melissa Haendel, eagle-I Alan Ruttenberg, SUNY Buffalo Barry Smith, SUNY Buffalo Robert Stevens, (Co-ode; OWL-API) Manchester University BBSRC (grant award BB/G02247X/1)

54 +

55 + Drosophila anatomy ontology as an example Circa 2006: tangled pit of misery. 6500 terms. 6% with definitions, many of them not suitable (give example). Sufficient inconsistency that it was not reliable for grouping terms / reasoning - give examples. Sufficiently incomplete that most queries/groupings missed very many terms - use mechanosensory bristle (or something similar) as example. Editing a nightmare - unclear what original reasons were for relationships. For any term - not clear what relations already inferred or how to

56 + Columns: 2008 (0% inferred) | 2011 (100% inferred). sense organ: 835759; chemosensory organ: 1496; gustatory organ: 0 | 49; olfactory organ: 0 | 37

