Toward Using Ontologies to Reason About Disagreeing Taxonomic Experts
Dave Thau, UC Davis
NeSC RDF Workshop, June 8, 2006
Why Did the Chicken Cross the Road?
  To get to the other side.
  To boldly go where no chicken has gone before.
  To prove it could never reach the other side. (Zeno of Elea)
  Chickens, over great periods of time, have been naturally selected so that they are now predisposed to cross roads.
Why Did the Taxonomists Cross the Road?
  So they could properly identify the chicken.
Overview
  Quick primer on taxonomy
  Some types of disagreements between experts
  Problems this causes
  Using an ontology to represent taxonomic opinions
  Using the ontology to compare experts’ theories
Linnaean Taxonomy Basics
  Ranks: kingdom, phylum, class, order, family, genus, species, variety (and others!)
  Family rank: Canidae
  Genus rank: Canis, Vulpes, Nyctereutes
  Species rank: Canis familiaris, Canis lupus, Canis latrans
Things You May Not Know
  There is no big list of all the known species in the world.
  This is partly because people don’t agree on the definitions of the species, genera, etc.
  Estimates are that 6% of the known taxa are changed every year.
  This has been going on since Linnaeus published his classification scheme in 1735.
Types of Disagreement: The Basics
  FNA-03, 1997: Ranunculus aquatilis; R.a. var aquatilis; R.a. var diffusus
  Benson, 1948: Ranunculus aquatilis; R.a. var capillaceus; R.a. var hispidulus; R.a. var calvescens
  Possible relationships between two taxa A and B: A is congruent to B, A includes B, A is included in B, A and B overlap, A and B are disjoint
  With five possible relationships for each of the 12 pairs of taxa, this results in 5^12 (more than 240 million) possible sets of relationships.
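The count on this slide can be checked with a short calculation. This is only a sketch: it assumes the garbled figure reads 5 to the power 12, that is, five possible relationships for each of 12 pairs of taxa. The names `RELATIONS`, `NUM_PAIRS`, and `count_relationship_sets` are illustrative and not from the talk.

```python
# Assumed reading of the slide: 5 possible relationships per pair of taxa,
# and 12 pairs of taxa to relate across the two authorities.
RELATIONS = ("congruent", "includes", "is included in", "overlaps", "disjoint")
NUM_PAIRS = 12


def count_relationship_sets(num_relations: int, num_pairs: int) -> int:
    """Each pair independently takes one relation, so the choices multiply."""
    return num_relations ** num_pairs


total = count_relationship_sets(len(RELATIONS), NUM_PAIRS)
print(f"{total:,}")  # 244,140,625
```

The result is consistent with the slide's "more than 240 million".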
Types of Disagreement - Splitting and Lumping
  Benson, 1948: Ranunculus flammula; R.f. var genuinus; R.f. var ovalis; R.f. var filiformis
  Kartesz, 2004: Ranunculus flammula; R.f. var filiformis; R.f. var flammula
  Peet, 2005:
    –B.1948:R.flammula is congruent to K.2004:R.flammula
    –B.1948:R.f.genuinus is included in K.2004:R.f.flammula
    –B.1948:R.f.ovalis is included in K.2004:R.flammula
    –B.1948:R.f.filiformis is congruent to K.2004:R.f.filiformis
Types of Disagreement – Differing Extents
  Benson, 1948: Ranunculus glaberrimus; R.g. var reconditus; R.g. var ellipticus; R.g. var typicus
  Kartesz, 2004: Ranunculus glaberrimus; R.g. var ellipticus; R.g. var glaberrimus
  Peet, 2005:
    –B.1948:R. glaberrimus contains K.2004:R. glaberrimus
    –B.1948:R.g.ellipticus is congruent to K.2004:R.g.ellipticus
    –B.1948:R.g.typicus is congruent to K.2004:R.g.glaberrimus
    –B.1948:R.g.reconditus is congruent to K.2004:R. triternatus
Impact on Data Analysis
  Can’t find data
    –If A ⊇ B (A contains B), a search on A should retrieve B
  Can’t aggregate data
    –If B ⊆ A (B is contained in A), you should be able to combine data from B into A
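The retrieval requirement can be illustrated with a minimal sketch, assuming containment between concepts is already known. The concept names, the `CONTAINS` and `RECORDS` tables, and both functions are hypothetical and only serve the illustration.

```python
# Hypothetical containment facts: concept -> concepts it directly contains.
CONTAINS = {"A": {"B"}, "B": {"C"}}

# Hypothetical data records, keyed by the concept they were filed under.
RECORDS = {"A": ["rec-a1"], "B": ["rec-b1", "rec-b2"], "C": ["rec-c1"]}


def contained_concepts(concept, contains):
    """Return the concept plus everything it transitively contains."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(contains.get(current, ()))
    return seen


def search(concept, contains, records):
    """A search on a concept also retrieves data filed under contained concepts."""
    hits = []
    for name in sorted(contained_concepts(concept, contains)):
        hits.extend(records.get(name, []))
    return hits


print(search("A", CONTAINS, RECORDS))  # ['rec-a1', 'rec-b1', 'rec-b2', 'rec-c1']
print(search("B", CONTAINS, RECORDS))  # ['rec-b1', 'rec-b2', 'rec-c1']
```

Aggregation works the same way: the records returned for A are exactly the data that can be combined into A.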
What to Do in Case of Conflicting Experts?
  Just listen to one expert you like
  Pick an expert you like and everyone who agrees with this expert (and each other)
  Choose experts who form the largest set of agreeing experts
  Choose experts whose opinions encompass the smallest or largest number of taxa
How Can We Find Out Which Experts Agree?
  Represent taxonomy using logic
  Use the logic to determine relations between expert opinions (theories)
    –Two theories may conflict
    –Two theories may be equivalent
    –One theory may encompass another
Representation Details
  Based on the Taxon Concept Schema (TCS)
  Represented using Description Logic
    –(OWL DL)
Example Ontology
  [Figure: example ontology. Labels: Specimen; Taxon; Taxon Description; hasGenus; hasSpecies; Ranunculus (Kartesz, 2004); Ranunculus glaberrimus (Kartesz, 2004); Things in the genus Ranunculus; Things in the species Ranunculus glaberrimus]
Fundamental Assumptions
  Each taxon class has at least one instance
  Each taxon class is defined as the union of its subclasses
  A class’s subclasses are defined to be mutually disjoint
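The three assumptions can be made concrete on a toy model in which each class is a plain set of specimens. This is a sketch only: the talk encodes the assumptions in OWL DL, whereas `check_assumptions`, the specimen ids, and the example data below are hypothetical.

```python
def check_assumptions(extents, children):
    """Return a list of violated assumptions for a toy taxonomy.

    extents:  class name -> set of instances (hypothetical specimen ids)
    children: class name -> list of direct subclass names
    """
    problems = []
    # Assumption 1: every class has at least one instance.
    for name, members in extents.items():
        if not members:
            problems.append(f"{name} has no instances")
    for parent, subs in children.items():
        # Assumption 2: a class is the union of its subclasses.
        union = set().union(*(extents[s] for s in subs))
        if union != extents[parent]:
            problems.append(f"{parent} is not the union of its subclasses")
        # Assumption 3: sibling subclasses are mutually disjoint.
        for i, first in enumerate(subs):
            for second in subs[i + 1:]:
                if extents[first] & extents[second]:
                    problems.append(f"{first} and {second} overlap")
    return problems


children = {"Canis": ["Canis lupus", "Canis latrans", "Canis familiaris"]}
good = {
    "Canis": {"s1", "s2", "s3"},
    "Canis lupus": {"s1"},
    "Canis latrans": {"s2"},
    "Canis familiaris": {"s3"},
}
bad = dict(good, **{"Canis latrans": {"s1", "s2"}})

print(check_assumptions(good, children))  # []
print(check_assumptions(bad, children))   # ['Canis lupus and Canis latrans overlap']
```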
Questions the Ontology Can Answer
  Find the subclasses of a class
  Make sure the taxonomy is consistent
  See if two classes are equivalent
  Can also use it to compare expert opinions
Compatible Theories
  A theory is one expert’s set of classes and relations and all they imply.
  A set of theories is compatible if
    –Each theory is consistent and
    –The correspondences between classes in the theories do not cause inconsistency.
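Example Incompatibility
  Benson, 1948: Ranunculus hydrocharoides; R.h. var natans; R.h. var stolonifer; R.h. var typicus
  Kartesz, 2004: Ranunculus hydrocharoides; R.h. var stolonifer; R.h. var typicus
  Peet, 2005:
    –B.1948:R.h.stolonifer is congruent to K.2004:R.h.stolonifer
    –B.1948:R.h.typicus is congruent to K.2004:R.h.typicus
    –B.1948:R. hydrocharoides is congruent to K.2004:R. hydrocharoides
  Under the fundamental assumptions, B.1948:R.h.natans is non-empty and disjoint from its sibling varieties, so B.1948:R. hydrocharoides must hold something that K.2004:R. hydrocharoides does not, and the two species cannot be congruent.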
Example Incompatibility
  Benson, 1948: Ranunculus petiolaris …; Ranunculus macranthus …
  Kartesz, 2004: Ranunculus petiolaris
  Peet, 2005:
    –B.1948:R. macranthus contains K.2004:R. petiolaris
    –B.1948:R. petiolaris is contained by K.2004:R. petiolaris
  B.48:R. petiolaris ⊆ K.04:R. petiolaris ⊆ B.48:R. macranthus contradicts: B.48:R. macranthus and B.48:R. petiolaris are disjoint.
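The contradiction in this example can be reproduced with a brute-force search for a set-based model. This is a sketch under stated assumptions: a real system would use an OWL DL reasoner, while this code only tries every assignment of non-empty extents over a three-element universe. `UNIVERSE`, `find_model`, and the shortened class names are hypothetical.

```python
from itertools import combinations, product

UNIVERSE = (0, 1, 2)  # a small, arbitrary pool of hypothetical specimens


def nonempty_subsets(items):
    """All non-empty subsets, since every taxon class needs an instance."""
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def find_model(names, constraints):
    """Search for extents satisfying every constraint; None if none exists."""
    for choice in product(nonempty_subsets(UNIVERSE), repeat=len(names)):
        extents = dict(zip(names, choice))
        if all(check(extents) for check in constraints):
            return extents
    return None


names = ["B.macranthus", "B.petiolaris", "K.petiolaris"]
constraints = [
    lambda e: e["B.macranthus"] >= e["K.petiolaris"],       # contains
    lambda e: e["B.petiolaris"] <= e["K.petiolaris"],       # is contained by
    lambda e: not (e["B.macranthus"] & e["B.petiolaris"]),  # siblings disjoint
]

print(find_model(names, constraints))                  # None
print(find_model(names, constraints[:2]) is not None)  # True
```

Dropping the disjointness constraint makes a model appear, which shows that the two correspondences only clash with Benson's own claim that the species are disjoint. A bounded search like this is not a proof in general, but here the set-theoretic argument on the slide holds for any universe.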
Inferring Unstated Correspondences
  Benson, 1948: Ranunculus arizonicus; R.a. var chihuahua; R.a. var typicus
  Kartesz, 2004: Ranunculus arizonicus
  Peet, 2005:
    –B.1948:R.a.typicus is included in K.2004:R. arizonicus
    –B.1948:R. arizonicus is congruent to K.2004:R. arizonicus
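One way to see which correspondence is implied here is to enumerate small set-based models and record how B.1948:R.a.chihuahua relates to K.2004:R. arizonicus in each. This is a sketch, not the reasoning procedure from the talk: it assumes the fundamental assumptions (non-empty classes, parent equals union of subclasses, disjoint siblings) and a hypothetical three-element universe, and all function names are illustrative.

```python
from itertools import combinations, product

UNIVERSE = (0, 1, 2)  # hypothetical specimens


def nonempty_subsets(items):
    """All non-empty subsets of the universe."""
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def relation(a, b):
    """Name the relationship between two non-empty extents."""
    if a == b:
        return "congruent"
    if a < b:
        return "is included in"
    if a > b:
        return "includes"
    if a & b:
        return "overlaps"
    return "disjoint"


def possible_relations():
    """Relations between B.chihuahua and K.arizonicus across all models."""
    found = set()
    subsets = nonempty_subsets(UNIVERSE)
    for chihuahua, typicus, k_arizonicus in product(subsets, repeat=3):
        b_arizonicus = chihuahua | typicus  # parent is the union of its subclasses
        if chihuahua & typicus:             # siblings must be disjoint
            continue
        if not typicus <= k_arizonicus:     # stated: is included in
            continue
        if b_arizonicus != k_arizonicus:    # stated: is congruent to
            continue
        found.add(relation(chihuahua, k_arizonicus))
    return found


print(possible_relations())  # {'is included in'}
```

Only one relation survives, so in this toy setting the unstated correspondence is that B.1948:R.a.chihuahua is included in K.2004:R. arizonicus.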
Comparing Theories
  Given two compatible theories, T and T’:
    –The theories are equivalent if each class in theory T is equivalent to one class in T’ (and vice versa).
    –T is smaller than T’ if each class in T either equals or is contained by a class in T’.
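The two definitions can be sketched over toy theories in which each class is a frozen set of specimens. This is an extent-based stand-in for the logical comparison described in the talk; the function names and example theories are hypothetical.

```python
def is_equivalent(t, t_prime):
    """Each class in t equals one class in t_prime, and vice versa."""
    return set(t.values()) == set(t_prime.values())


def is_smaller_or_equal(t, t_prime):
    """Each class in t equals or is contained by some class in t_prime."""
    return all(any(c <= d for d in t_prime.values()) for c in t.values())


# Hypothetical theories: class name -> frozenset of specimen ids.
t1 = {"A": frozenset({1, 2, 3}), "B": frozenset({1}), "C": frozenset({2, 3})}
t2 = {"A'": frozenset({1, 2, 3}), "B'": frozenset({1}), "C'": frozenset({2, 3})}
t3 = {"X": frozenset({1, 2, 3, 4}), "Y": frozenset({1, 2, 3}), "Z": frozenset({4})}

print(is_equivalent(t1, t2))        # True
print(is_smaller_or_equal(t1, t3))  # True
print(is_smaller_or_equal(t3, t1))  # False
```

Class names play no role in the comparison; only the extents matter, which mirrors the idea that two experts can use different names for the same concept.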
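Example of Theory Ordering
  [Figure: three theories drawn as trees, labeled T1, T2, and T3. Trees shown: A with children B, C, D; A with children B, C; A with children B, C, E. An ordering over T1, T2, and T3 is shown beneath the trees.]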
Whom to Believe?
  Just listen to one expert you like
    –Easy! Don’t need any reasoning
  Pick an expert you like and everyone who can agree with this expert
    –Choose all experts with theories equivalent to the expert you like
  Choose experts who form the largest set of agreeing experts
    –Find largest equivalence class
  Choose experts whose opinions form the smallest or largest number of taxa
    –Bigger theories account for more taxa
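Finding the largest equivalence class can be sketched by grouping experts whose theories have identical class extents. This assumes the same toy representation of a theory as a mapping from class names to frozen sets of specimens; the expert names, data, and `largest_agreeing_group` are hypothetical.

```python
from collections import defaultdict


def largest_agreeing_group(theories):
    """Group experts with identical class extents and return the biggest group."""
    groups = defaultdict(list)
    for expert, theory in theories.items():
        # Two theories are treated as equivalent when their sets of extents match.
        key = frozenset(theory.values())
        groups[key].append(expert)
    return max(groups.values(), key=len)


theories = {
    "expert_a": {"A": frozenset({1, 2}), "B": frozenset({1}), "C": frozenset({2})},
    "expert_b": {"A": frozenset({1, 2, 3}), "B": frozenset({1}), "D": frozenset({2, 3})},
    "expert_c": {"P": frozenset({1, 2}), "Q": frozenset({1}), "R": frozenset({2})},
}

print(largest_agreeing_group(theories))  # ['expert_a', 'expert_c']
```

When two groups tie in size, `max` returns the first one encountered, so a real authority picker would need an explicit tie-breaking rule.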
Future Work
  Vetting the ontology
  Adding ‘intelligence’ to tools which build correspondences
  Implementing authority picker in a workflow system
  Efficient algorithm for determining theory hierarchy
Thanks! Questions?
  I’d like to acknowledge:
    –Bertram Ludäscher, Shawn Bowers, Serguei Krivov, Richard Waldinger for many discussions on this topic.
    –Jessie Kennedy, Robert Kukla, Trevor Patterson, Martin Graham for their work on the Taxon Concept Schema
    –Bob Peet for the Ranunculus data set
    –Kirsten Menger-Anderson for Chicken Drawing
    –NSF, under SEEK awards , , , and
Where in Greece Can I Find Ranunculus aquatilis?
  [Figure labels: R. aquatilis; R. trichophyllus]
Beginnings of Biological Taxonomy
  Egypt, 1500 BC: Ebers medical papyrus, classification of medical plants
  Greece, 300 BC: Aristotle and Theophrastus
  China, 200 BC: Erh-ya dictionary (second century BC)