Published by Maximillian Walters. Modified over 9 years ago.
1
Ontologies in Ecology and Biodiversity Informatics
Dave Thau
PASI, Costa Rica, June 7, 2008
With some slides by Shawn Bowers and Josh Madin, gratefully reused with permission
2
Four Chapters
I. What are ontologies and why should we care?
II. Some nitty gritty
III. Ontologies in ecology and biodiversity informatics
IV. Tools
3
Talk Goals
Learn about ontology successes
Learn basic terminology / buzz words
Get a sense for ontology development
See how they apply to ecology and biodiversity
Learn what remains to be done
–Bottom line: A LOT!
4
Ontology Defined
[Diagram: Trapeziid Crab has part Pincer; Trapeziid Crab lives in Acropora; Acropora lives in Ocean]
5
The Way It’s Been
[Image: notebook]
6
The Plan
How are the finches doing these days?
1. Find data sets: “give me all data sets describing finch abundance”
–Finches R’ Us
–World Finch Database
–Finch Fancy Repository
2. Find analysis: “find a way to plot their distribution”
–Plotter Workflow
3. Integrate data, plug it in, get results
7
Where Ontologies Can Help
[Diagram: Finches R’ Us, World Finch Database, Finch Fancy Repository, Plotter Workflow]
Finding the right data sets
Integrating the data
Finding a good analysis and making sure the data fits the analysis
Making the results discoverable
8
Other Ways Ontologies Help
Crystallize knowledge
Lay open assumptions
Make for great parties
9
Successes
[Diagram labels: Simple Assembly, Assembly With Switch, Assembly-1; relations: Instance Of, Subclass Of]
10
The GO Ontology: www.geneontology.org
11
Gene Ontology widely adopted
[Image label: AgBase]
12
GO Stats I
GO
–Over 25,000 terms
–19 contributing groups
GO Annotations
UniProtKB   O13035   GO:0004098
UniProtKB   O13035   GO:0004336
UniProtKB   O13035   GO:0004348
Total manual GO annotations: 388,633
Total proteins with manual annotations: 80,402
Total number of distinct proteins: 2,971,374
Total number of taxa: 129,318
13
Ontologies and You
User of “invisible” ontologies
–like search
User of created ontologies
–annotating data sets
Collaborator in ontology creation
–biologist working with ontologist
Hands-on ontology builder
–you’ll need more than a 1 hour talk…
14
Chapter I Summary
Ontologies can help
–Locate data
–Add semantics to data
–Integrate data
–Clarify domains
There are already good examples
–In genomics
–In the biomedical field
–In engineering
15
The Nitty Gritty
XML, RDF, OWL and other 3 letter words
Ontology Basics
Reasoning with Ontologies
16
XML, DTDs, XML Schema
Not good for machines
–tools can’t automatically process
–how do you know it’s valid?
17
XML, XML Schema
Raw data:
Col.,Ht.,Crabs
hya,1.5,11
XML: hya 1.5 11 … [element markup lost in extraction]
XML Schema types: string, float, integer
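The idea on this slide can be sketched in a few lines of Python: wrap the comma-separated row in XML elements, then check each element against a declared type. This is only an illustration. The element names (`record`, `col`, `ht`, `crabs`) and the `SCHEMA` dictionary are assumptions, not the markup from the original slide, and the type check is a hand-written stand-in for real XML Schema validation, which the Python standard library does not provide.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical "schema": element name -> Python type standing in for
# the XML Schema types string, float and integer named on the slide.
SCHEMA = {"col": str, "ht": float, "crabs": int}


def row_to_xml(row):
    """Wrap one CSV row in a <record> element, one child per column."""
    record = ET.Element("record")
    for name, value in row.items():
        ET.SubElement(record, name).text = value
    return record


def validate(record, schema=SCHEMA):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, expected in schema.items():
        child = record.find(name)
        if child is None:
            problems.append(f"missing element: {name}")
            continue
        try:
            expected(child.text)  # try to parse the text as the declared type
        except (TypeError, ValueError):
            problems.append(f"{name}: {child.text!r} is not a valid {expected.__name__}")
    return problems


# The row from the slide, with assumed lowercase column names.
raw = "col,ht,crabs\nhya,1.5,11\n"
rows = list(csv.DictReader(io.StringIO(raw)))
record = row_to_xml(rows[0])
xml_text = ET.tostring(record, encoding="unicode")

# A deliberately broken record to show what validation catches.
bad_record = row_to_xml({"col": "hya", "ht": "1.5", "crabs": "eleven"})
```

Calling `validate(record)` returns an empty list, while `validate(bad_record)` reports that `crabs` is not a valid integer. That is the point of a schema: any tool can run the same check without knowing anything about crabs.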
18
XML and XML Schema
Now any machine can validate an XML document, given a schema
Languages to translate XML to PDF or HTML exist
But…
Can’t relate things
–Like “the data in this file relates to study X”
19
RDF and RDF Schema
The Resource Description Framework (RDF)
–individuals (objects), properties, and classes
[Diagram: My Crab livesIn That Coral; My Crab type T.Crab; That Coral type A. Coral; T.Crab livesIn A. Coral; T.Crab subClassOf Crab; A. Coral subClassOf Coral]
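A rough way to see what RDF and RDF Schema add is to store the diagram as subject-property-object triples and follow `subClassOf` links to infer extra types. The sketch below uses plain Python sets rather than an RDF library, and the identifiers (`MyCrab`, `ThatCoral`, `TCrab`, `ACoral`) are shortened stand-ins for the labels on the slide, chosen here for illustration.

```python
# Each triple is (subject, property, object), as in RDF.
TRIPLES = {
    ("MyCrab", "type", "TCrab"),
    ("ThatCoral", "type", "ACoral"),
    ("MyCrab", "livesIn", "ThatCoral"),
    ("TCrab", "subClassOf", "Crab"),
    ("ACoral", "subClassOf", "Coral"),
}


def superclasses(cls, triples):
    """All classes reachable from cls via subClassOf, including cls itself."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(individual, triples):
    """Direct types of an individual plus every superclass of those types."""
    direct = {o for s, p, o in triples if s == individual and p == "type"}
    inferred = set()
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred
```

With these triples, `types_of("MyCrab", TRIPLES)` contains both `TCrab` and `Crab`, even though only the first was stated directly. A query for "all crabs" would therefore find the individual.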
20
RDF is Useful
GO is available in RDF
FOAF - Friend of a Friend
–For example, go to http://xml.mfd-consult.dk/foaf/explorer/
–Enter: http://hello.typepad.com/foaf.rdf
RSS - Really Simple Syndication
–It’s probably in your browser
–Yahoo Pipes RSS blender
[FOAF diagram labels: Person, knows, name, img, homepage, seeAlso; David Jacobs (randomwalks.com, “I work in New York City with filmmakers, activists and educators.”); Jesse James Garrett (blog.jjg.net)]
21
Basic Ontology Building Blocks
Instances
–The actual things of interest
–For example, a specimen (that crab)
Classes (concepts)
–A set of instances that share certain characteristics
–For example, the set of all crabs
is-a
–A is-a B means every instance of A is also an instance of B
–A might have additional characteristics; more restrictions
Properties (has-a / part-of)
–Represent a characteristic
–e.g., has Wings, has-color Yellow
[Diagram labels: crab, T.crab, crab color, The crab that bit me; relations: isa, has-color]
22
Example of Pollution Ontology
23
Classes versus Instances - tricky!
–If A is-a B, then every A is B
–Every human, in this case, must also be a species
–But “John” is not a species
[Diagram (Guarino): alternative arrangements of Species, Human and John connected by is-a and instance links]
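The same trap shows up in ordinary object-oriented code, so a small Python sketch may help. It contrasts a tempting model, where Human is a subclass of Species, with one possible alternative, where Human is an instance of Species and John is an organism that belongs to it. The class names `Organism` and `WrongHuman` are invented for this illustration and are not taken from Guarino's work.

```python
class Species:
    """A species is itself an individual thing with a name."""

    def __init__(self, name):
        self.name = name


class Organism:
    """An individual organism that belongs to exactly one species."""

    def __init__(self, name, species):
        self.name = name
        self.species = species


# One reasonable model: Human is an instance of Species,
# and John is an organism whose species is Human.
human = Species("Human")
john = Organism("John", human)


# The tempting but wrong model: Human as a subclass (is-a) of Species.
class WrongHuman(Species):
    pass


wrong_john = WrongHuman("John")

# Under is-a, every instance of the subclass is an instance of the superclass,
# so the wrong model concludes that John is a species.
john_is_species_in_wrong_model = isinstance(wrong_john, Species)
john_is_species_in_better_model = isinstance(john, Species)
```

Here `john_is_species_in_wrong_model` is `True` and `john_is_species_in_better_model` is `False`, which mirrors the warning on the slide.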
24
is-a is not part-of
–What are essential properties of Cars? E.g., that they accommodate people?
–Are these also essential for Engines?
[Diagram [Guarino]: Car, Engine and Wheel connected by part-of links]
25
Limitations of RDF-based Ontologies
No constraints
–“all red things have the color property with value red”
–“Costa Rica has only one President”
Can’t create definitions by combining other definitions
–Mother = Parent and Female
Can’t say concepts are equivalent or disjoint
26
OWL - The Web Ontology Language
Three different kinds
–Lite - limited, but still powerful
–DL - very expressive, can still reason
–Full - extremely expressive, but unreasonable (reasoning is undecidable)
Example of reasoning in OWL
–If all apples are red, and apples and manzanas are the same, then all manzanas are red
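The apples and manzanas inference can be imitated with a toy propagation rule: whatever is asserted about a class also holds for every class declared equivalent to it. The sketch below is not an OWL reasoner. The data structures, the property name `hasColor` and the function names are assumptions made for illustration.

```python
def equivalence_groups(pairs):
    """Merge pairwise equivalences into groups of mutually equivalent classes."""
    groups = []
    for a, b in pairs:
        merged = {a, b}
        rest = []
        for group in groups:
            if group & merged:
                merged |= group  # overlapping groups collapse into one
            else:
                rest.append(group)
        groups = rest + [merged]
    return groups


def propagate(restrictions, pairs):
    """Copy 'all members have property = value' restrictions across equivalent classes."""
    result = {cls: dict(props) for cls, props in restrictions.items()}
    for group in equivalence_groups(pairs):
        shared = {}
        for cls in group:
            shared.update(restrictions.get(cls, {}))
        for cls in group:
            result.setdefault(cls, {}).update(shared)
    return result


# "All apples are red" and "apples and manzanas are the same".
restrictions = {"Apple": {"hasColor": "Red"}}
equivalences = [("Apple", "Manzana")]
inferred = propagate(restrictions, equivalences)
```

After the call, `inferred["Manzana"]` is `{"hasColor": "Red"}`: nobody stated that manzanas are red, but it follows from the two assertions.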
27
Reasoning about Taxonomy
Peet’s 2005 Ranunculus data set:
–9 Taxonomies
–654 Taxa
–704 Relations
[Visualization by Martin Graham]
28
Is This Right?
Peet, 2005:
–B.1948:R.h.stolonifer is congruent to K.2004:R.h.stolonifer
–B.1948:R.h.typicus is congruent to K.2004:R.h.typicus
–B.1948:R. hydrocharoides is congruent to K.2004:R. hydrocharoides
Benson, 1948: Ranunculus hydrocharoides, with children R.h. var natans, R.h. var stolonifer, R.h. var typicus
Kartesz, 2004: Ranunculus hydrocharoides, with children R.h. var stolonifer, R.h. var typicus
Assuming disjoint children and complete partitioning of parents
The most likely fix here is to change the congruence relation between the top two nodes to instead state that Benson's R. hydrocharoides includes (⊋) Kartesz's R. hydrocharoides
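One way to check the three congruence statements is to treat each taxon as a non-empty set of specimens and search for any assignment of sets that satisfies every constraint. The brute-force sketch below does this over a tiny universe of three hypothetical specimens, with sets encoded as bitmasks. It only demonstrates the problem within that small universe; a real reasoner would establish it in general. All names in the code are chosen for this illustration.

```python
from itertools import product

ATOMS = 3                        # hypothetical universe of three specimens
NONEMPTY = range(1, 2 ** ATOMS)  # bitmasks for every non-empty set of specimens


def pairwise_disjoint(sets):
    """True if no specimen appears in more than one of the given sets."""
    seen = 0
    for s in sets:
        if seen & s:
            return False
        seen |= s
    return True


def has_model(top_relation):
    """Search for sets that satisfy the articulations plus the given parent relation."""
    for b_nat, b_sto, b_typ, k_sto, k_typ in product(NONEMPTY, repeat=5):
        # Children are disjoint within each taxonomy.
        if not pairwise_disjoint((b_nat, b_sto, b_typ)):
            continue
        if not pairwise_disjoint((k_sto, k_typ)):
            continue
        # Complete partitioning: each parent is exactly the union of its children.
        b_parent = b_nat | b_sto | b_typ
        k_parent = k_sto | k_typ
        # The two variety-level congruences, plus the relation between the parents.
        if b_sto == k_sto and b_typ == k_typ and top_relation(b_parent, k_parent):
            return True
    return False


def congruent(a, b):
    return a == b


def properly_includes(a, b):
    return a != b and (a | b) == a


original_claim_consistent = has_model(congruent)
proposed_fix_consistent = has_model(properly_includes)
```

The search finds no model when the parents are congruent, because Benson's var. natans would have to be empty. It does find one when Benson's parent properly includes Kartesz's, which matches the fix suggested on the slide.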
29
Getting Crazy with Properties
Properties can be:
–Transitive (a is inCountry b, b is inCountry c..)
–Inverse (a partOf b, b has_part a)
–Functional (dave’s birthMother is vera)
–Inverse functional (dave’s ssn is ….)
And you can say stuff like
–Apples are only red
–Some apples are red
–Crabs have 2 pincers
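To make the property characteristics concrete, here is a minimal Python sketch that treats a property as a set of (subject, object) pairs. The function names are invented for this illustration. One simplification is worth flagging: the functional check reports a second value as a violation, whereas an OWL reasoner would normally infer that the two values denote the same individual unless they are declared different.

```python
def transitive_closure(pairs):
    """Add (a, c) whenever (a, b) and (b, c) are present, until nothing changes."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def inverse(pairs):
    """Swap subject and object, e.g. partOf becomes has_part."""
    return {(o, s) for s, o in pairs}


def functional_violations(pairs):
    """Subjects with more than one distinct value (simplified: treated as errors)."""
    values = {}
    for s, o in pairs:
        values.setdefault(s, set()).add(o)
    return {s for s, objs in values.items() if len(objs) > 1}


# The examples from the slide, using its placeholder names.
in_country = transitive_closure({("a", "b"), ("b", "c")})
has_part = inverse({("a", "b")})                       # a partOf b  =>  b has_part a
birth_mother_ok = functional_violations({("dave", "vera")})
```

After running this, `in_country` also contains `("a", "c")`, `has_part` is `{("b", "a")}`, and `birth_mother_ok` is empty because dave has exactly one birthMother.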
30
Chapter II Summary
XML is about syntax
RDF is about relationships
OWL is about more complex constraints
Tips:
–If A is-a B, then every instance of A is also an instance of B
–Keep classes and instances separate
–is-a is not part-of
31
Chapter III: Ontologies in Ecology
GO and friends are successful, but…
Hard to represent processes
–Show me studies about the flow of nitrogen in highly saline lakes, starting with lake-side nitrate
Can’t be used for data integration
Ecologists use complex models that involve many relations beyond is-a and part-of
32
Reminder: Where Ontologies Can Help
Crystallizing domain knowledge
Marking up metadata and data sets
Marking up analyses and analysis components
33
Marking Up Metadata and Data
Taxonomic Working Group Standards: http://rs.tdwg.org/ontology/voc/
Geo.owl, Species.owl, Vegetation.owl, Geography.owl, Water.owl, Ecosystem.owl
Alternet: http://www5.umweltbundesamt.at/ALTERNet
34
Metadata and Data with OBOE
Example data set: the abundance of Trapeziid crabs in coral colonies (Stewart et al. 2006)
35
Metadata and Data with OBOE
Two measurements of the organism: the name …
Observation
–ofEntity: Organism
–hasMeasurement: Measurement
  –ofCharacteristic: TaxonName
  –usesStandard: TaxonCatalog
  –hasValue: “Acropora hyacinthus”
36
Metadata and Data with OBOE
Two measurements of the organism: the name … height
Observation
–ofEntity: Organism
–hasMeasurement: Measurement
  –ofCharacteristic: TaxonName
  –usesStandard: TaxonCatalog
  –hasValue: “Acropora hyacinthus”
–hasMeasurement: Measurement
  –ofCharacteristic: Height
  –usesStandard: Meter
  –hasValue: “1.25”
  –hasPrecision: “0.01”
37
Metadata and Data with OBOE
Observation
–ofEntity: Organism
–hasMeasurement: Measurement
  –ofCharacteristic: TaxonName
  –usesStandard: TaxonCatalog
  –hasValue: “Acropora hyacinthus”
–hasMeasurement: Measurement
  –ofCharacteristic: Height
  –usesStandard: Meter
  –hasValue: “1.25”
  –hasPrecision: “0.01”
Observation (linked to the first observation by hasContext)
–hasMeasurement: Measurement
  –ofCharacteristic: TaxonName
  –usesStandard: TaxonCatalog
  –hasValue: “Trapeziid crab”
–hasMeasurement: Measurement
  –ofCharacteristic: Abundance
  –usesStandard: Individual
  –hasValue: “11”
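The observation structure above maps naturally onto two small record types. The following Python sketch is one possible encoding, not the OBOE ontology itself: the class and field names mirror the property names on the slide. Two details are assumptions, since the slide does not state them: the crab observation is given the coral observation as its context, and the crab observation's entity is labeled Organism.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Measurement:
    characteristic: str               # ofCharacteristic
    standard: str                     # usesStandard
    value: str                        # hasValue
    precision: Optional[str] = None   # hasPrecision, when recorded


@dataclass
class Observation:
    entity: str                                            # ofEntity
    measurements: List[Measurement] = field(default_factory=list)
    context: Optional["Observation"] = None                # hasContext

    def value_of(self, characteristic):
        """Return the value measured for a characteristic, or None if absent."""
        for m in self.measurements:
            if m.characteristic == characteristic:
                return m.value
        return None


coral = Observation(
    entity="Organism",
    measurements=[
        Measurement("TaxonName", "TaxonCatalog", "Acropora hyacinthus"),
        Measurement("Height", "Meter", "1.25", precision="0.01"),
    ],
)

# Assumption: the crab observation is made in the context of the coral observation.
crab = Observation(
    entity="Organism",
    measurements=[
        Measurement("TaxonName", "TaxonCatalog", "Trapeziid crab"),
        Measurement("Abundance", "Individual", "11"),
    ],
    context=coral,
)
```

With this structure, a question such as "how many crabs were counted, and on which coral?" becomes `crab.value_of("Abundance")` together with `crab.context.value_of("TaxonName")`.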
38
Data Integration with OBOE
Integration of data sets given their observation semantics
(a) Observation
–ofEntity: Coral
–hasMeasurement: Measurement
  –ofCharacteristic: Diameter
  –usesStandard: Meter
  –hasValue: “1.25”
  –hasPrecision: “0.01”
(b) Observation
–ofEntity: Animal
–hasMeasurement: Measurement
  –ofCharacteristic: ColonyDiameter
  –usesStandard: Centimeter
  –hasValue: “320”
  –hasPrecision: “10”
39
Data Integration with OBOE
Integration involves data set observation structures
(a) Observation
–ofEntity: Coral
–hasMeasurement: Measurement
  –ofCharacteristic: Diameter
  –usesStandard: Meter
  –hasValue: “1.25”
  –hasPrecision: “0.01”
(b) Observation
–ofEntity: Animal
–hasMeasurement: Measurement
  –ofCharacteristic: ColonyDiameter
  –usesStandard: Centimeter
  –hasValue: “320”
  –hasPrecision: “10”
[Diagram additions: is-a links between terms of (a) and (b); hasDimension: Length]
40
Data Integration with OBOE
And then applying appropriate conversions, etc.
(c) Observation
–ofEntity: Animal
–hasMeasurement: Measurement
  –ofCharacteristic: Diameter
  –usesStandard: Meter
  –hasValue: “1.3” (from a), “3.2” (from b)
  –hasPrecision: “0.1”
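The conversion step can be worked through numerically. The sketch below converts both measurements to meters, takes the coarser of the two precisions, and rounds every value to it. The function names and the unit table are assumptions made for this illustration. The sketch uses `Decimal` with half-up rounding because that reproduces the “1.3” shown on the slide, and it assumes precisions are powers of ten.

```python
from decimal import Decimal, ROUND_HALF_UP

# Conversion factors to meters for units that share the dimension Length.
TO_METERS = {"Meter": Decimal("1"), "Centimeter": Decimal("0.01")}


def to_meters(value, unit):
    """Convert a decimal string in the given unit to a Decimal in meters."""
    return Decimal(value) * TO_METERS[unit]


def integrate(measurements):
    """Convert (value, precision, unit) triples to meters at the coarsest precision."""
    converted = [(to_meters(v, u), to_meters(p, u)) for v, p, u in measurements]
    # normalize() strips trailing zeros so that 0.10 rounds to one decimal place.
    coarsest = max(p for _, p in converted).normalize()
    values = [str(v.quantize(coarsest, rounding=ROUND_HALF_UP)) for v, _ in converted]
    return values, str(coarsest)


# (a) 1.25 m with precision 0.01 m; (b) 320 cm with precision 10 cm.
values, precision = integrate([
    ("1.25", "0.01", "Meter"),
    ("320", "10", "Centimeter"),
])
```

The result is `values == ["1.3", "3.2"]` with `precision == "0.1"`, matching observation (c). The ontology's role is to license the conversion in the first place: it records that both units measure Length and that the two characteristics are comparable.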
41
Marking up Analyses
Scientific Workflow Systems help:
–Make analyses reproducible
–Make parts of analyses reusable
But…
–100’s of workflows and templates
–1000’s of actors (e.g. actors for web services, data analytics, …)
Need to find what you want
42
Semantic Type Annotation in Kepler
Component input and output port annotation
Each port can be annotated with multiple terms from multiple ontologies
Annotations are stored within the actor metadata
43
Chapter III Summary
Taxonomies and partonomies are useful but limiting
We saw a couple of ontologies for
–Representing a domain
–Describing data
Again, the focus is always on discovery, integration and reuse
44
Tools
For RDF:
–Simile: simile.mit.edu - nice RDF tools
For OWL:
–Protégé: protege.stanford.edu
For reasoning:
–Pellet: http://www.mindswap.org/2003/pellet/
–Jena: http://jena.sourceforge.net/inference/
45
Protégé
46
OWLViz Tab
47
Summing Up
Ontologies are useful for
–Data discovery
–Data integration
–Terminology regulation
–Analysis reuse
Ontology in ecology and biodiversity is just getting started
48
Lastly: Back to the Goals
Learn about ontology successes
Learn basic terminology / buzz words
Get a sense for ontology development
See how and where they apply to ecology and biodiversity studies
Learn what remains to be done
–Bottom line: A LOT!
49
Some References
Practical guides/references
–Protégé. Open source ontology editor. http://protege.stanford.edu/
–CO-ODE. Various resources on ontologies, tutorials, best practices, etc. http://www.co-ode.org/
–W3C Semantic Web Activity. Various pointers, standardization efforts, etc. http://www.w3.org/2001/sw/
–OWL Resources: OWL-Guide (http://www.w3.org/TR/owl-guide/), OWL-Reference (http://www.w3.org/TR/owl-ref/)
–Pizza Tutorials. http://www.co-ode.org/resources/tutorials/
Academic Papers/Collections
–Bard and Rhee. Ontologies in biology: Design, applications and future challenges. Nature Reviews Genetics, vol. 5, 2004.
–The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nature Genet. 25: 25-29, 2000.
–Barry Smith, http://ontology.buffalo.edu/smith/, various papers on ontologies (even for ecology).
–Sowa, J. F. Knowledge Representation: Logical, Philosophical, and Computational Foundations. PWS Publishing Co., Boston, 1999.
–Baader F., Calvanese D., McGuinness D., Nardi D., and Patel-Schneider P. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge Univ. Press, 2003.
–Thomas R. Gruber. Toward principles for the design of ontologies used for knowledge sharing. In Formal Ontology in Conceptual Analysis and Knowledge Representation, Kluwer Academic Publishers, 1993.
–Nicola Guarino. Formal ontology and information systems. In Proc. of Formal Ontology in Information Systems, IOS Press, pp. 3-15, 1998.
50
Exercise: Ontology Engineering
1. Choose the specific “domain” you want to tackle:
–Based on a specific collection of data that you are familiar with
–Based on an existing project/experiment you are working on or understand
–Focus on use: data set markup or describing a domain
2. Define (a part of) an ontology for the domain
–Start with the classes
–Then arrange into an is-a hierarchy
–Then add properties between the classes
–If you feel mighty, try some property constraints
3. Capture your ontology on whiteboard, poster board, or cmap tool as one or more diagrams
Transitive, Inverse, Functional, Inverse Functional
–All apples have a color
–Some apples have a color
–All apples are red
–Some apples are red
–Crabs have 2 pincers