First Insights into the Library Track of the OAEI
  Dominique Ritze
  Mannheim University Library
Motivation
  Publication x, subject (thesaurus 2): ontology alignment
  Search for "Ontology Mapping" (thesaurus 1): 0 results
  With the correspondence Ontology Mapping (thesaurus 1) = Ontology Alignment (thesaurus 2), the search for "Ontology Mapping" also finds publication x
Overview
  Ontology Matching
  OAEI
  Thesaurus vs. Ontology
  OAEI Library Track 2012
  Lessons Learned and Future Work
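Ontology Matching
  (Figure: two example ontologies to be matched. One contains Person, Author, PCMember, Document, Paper and Review; the other contains People, Author, Reviewer, Doc, Paper, … and CommitteeMember. The classes are connected by the relations "writes" and "reviews".)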
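Ontology Matching Evaluation
  (Figure: a matching tool m takes the ontologies O1 and O2 and produces an alignment A; a test compares A with the reference alignment R and yields the result.)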
Ontology Alignment Evaluation Initiative (OAEI)
  Annual campaign, started in 2005
  Different tracks/datasets: Benchmark, Anatomy, Conference, Multifarm, Large BioMed, Library, Instance Matching
  21 submitted systems (2012)
  Goal: improve the performance of the ontology matching field
    through comparison of algorithms
    through new challenges for the systems
Thesaurus = Ontology?
  SKOS                             OWL
  skos:Concept                     owl:Class
  skos:prefLabel, skos:altLabel    rdfs:label
  skos:scopeNote, skos:notation    rdfs:comment
  A skos:narrower B                B rdfs:subClassOf A
  A skos:broader B                 A rdfs:subClassOf B
  skos:related                     rdfs:seeAlso
  Example concepts: Commodities, Germany, Ananas, Tropical Fruit, Metal Product -> Metal
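The SKOS-to-OWL mapping on this slide can be sketched in a few lines of Python. This is only an illustration, not the conversion actually used for the track: triples are plain tuples, the `ex:` identifiers and the "Pineapple" label are made up, and the hierarchy follows the SKOS reference reading, in which `A skos:broader B` states that B is the broader concept.

```python
RDF_TYPE = "rdf:type"

# Property mapping as listed on the slide (skos:altLabel is the SKOS
# property for alternative labels).
PROPERTY_MAP = {
    "skos:prefLabel": "rdfs:label",
    "skos:altLabel": "rdfs:label",
    "skos:scopeNote": "rdfs:comment",
    "skos:notation": "rdfs:comment",
    "skos:related": "rdfs:seeAlso",
}


def skos_to_owl(triples):
    """Translate (subject, predicate, object) SKOS triples to OWL/RDFS.

    Predicates without a mapping are dropped.
    """
    result = []
    for s, p, o in triples:
        if p == RDF_TYPE and o == "skos:Concept":
            result.append((s, RDF_TYPE, "owl:Class"))
        elif p == "skos:broader":
            # o is the broader concept, so s becomes its subclass.
            result.append((s, "rdfs:subClassOf", o))
        elif p == "skos:narrower":
            # o is the narrower concept, so o becomes a subclass of s.
            result.append((o, "rdfs:subClassOf", s))
        elif p in PROPERTY_MAP:
            result.append((s, PROPERTY_MAP[p], o))
    return result


# Hypothetical mini-thesaurus built from the slide's example concepts.
thesaurus = [
    ("ex:TropicalFruit", RDF_TYPE, "skos:Concept"),
    ("ex:Ananas", RDF_TYPE, "skos:Concept"),
    ("ex:Ananas", "skos:prefLabel", "Ananas"),
    ("ex:Ananas", "skos:altLabel", "Pineapple"),
    ("ex:Ananas", "skos:broader", "ex:TropicalFruit"),
]

owl_triples = skos_to_owl(thesaurus)
for triple in owl_triples:
    print(triple)
```

Note that both label types end up as `rdfs:label`, so after the conversion a matcher can no longer tell a preferred label from an alternative one.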
OAEI Library Track
  Are current state-of-the-art ontology matching tools able to match thesauri?
  Dominique Ritze, Kai Eckert, Benjamin Zapilko, Joachim Neubert
Data Set
  Thesaurus for Economics (STW): concepts with additional keywords (EN, DE)
  Thesaurus for the Social Sciences (TheSoz): concepts with additional keywords (EN, DE, FR)
  Reference alignment manually created in 2006
  Both thesauri are actively used in libraries for keyword indexing
Execution
  7 GB Debian machine
  Time frame: 1 week
  13 of the 21 submitted systems were able to generate an alignment
  No system had a heap space problem
  Evaluation: precision, recall, F-measure, runtime
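As a reminder of how the three quality measures are computed, here is a minimal sketch that compares a generated alignment with a reference alignment. Correspondences are modelled as simple pairs of concept identifiers; the identifiers below are invented for illustration and do not come from STW or TheSoz.

```python
def evaluate(alignment, reference):
    """Return (precision, recall, f_measure) of an alignment against a reference."""
    alignment, reference = set(alignment), set(reference)
    correct = len(alignment & reference)
    precision = correct / len(alignment) if alignment else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical correspondences as (concept in thesaurus 1, concept in thesaurus 2).
reference = {
    ("stw:a", "thesoz:1"),
    ("stw:b", "thesoz:2"),
    ("stw:c", "thesoz:3"),
    ("stw:d", "thesoz:4"),
}
alignment = {
    ("stw:a", "thesoz:1"),
    ("stw:b", "thesoz:2"),
    ("stw:c", "thesoz:9"),  # not in the reference, counted as wrong
}

precision, recall, f_measure = evaluate(alignment, reference)
print(f"P={precision:.2f} R={recall:.2f} F={f_measure:.2f}")
```

With this definition, a correspondence that is correct but missing from the reference alignment lowers precision in the same way as a wrong one.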
Results
  How should the results be evaluated? Is an F-measure of 0.67 good?
  Table columns: System, Precision, Recall, F-Measure, Time (s), Size, 1:1
  Systems listed: GOMMA, ServOMapLt, LogMap, ServOMap, YAM, LogMapLt, G02A, Hertuda, WeSeE, HotMatch, CODI, MapSSS, AROMA, Optima
  1:1 marked "yes": ServOMap, WeSeE, HotMatch, CODI, MapSSS
Results
  Table columns: System, Precision, Recall, F-Measure, Time (s), Size, 1:1
  Systems listed: MatcherPref, MatcherDE, MatcherAll, GOMMA, ServOMapLt, LogMap, …, MatcherEN, CODI, MapSSS, AROMA, Optima
  1:1 marked "yes": CODI, MapSSS
Manual Evaluation
  Between 38 and 269 new correct correspondences found per matcher
  Up to half of the correspondences are correct
  Many new correspondences are quite simple
  Some are more "complex" and interesting, e.g. Automated production = CAM
  Several incorrect ones where the labels are quite similar
    Difficult to distinguish the names of countries, their inhabitants and the languages
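The problem with similar labels can be reproduced with a naive string-based matcher. The sketch below is not the method of any participating system; it uses `difflib.SequenceMatcher` from the Python standard library, and the concept identifiers, labels and the threshold of 0.8 are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def label_similarity(a, b):
    """Case-insensitive string similarity between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_by_label(labels1, labels2, threshold=0.8):
    """Return all concept pairs whose labels are at least `threshold` similar.

    labels1 and labels2 map concept identifiers to a single label.
    """
    return [
        (c1, c2)
        for c1, l1 in labels1.items()
        for c2, l2 in labels2.items()
        if label_similarity(l1, l2) >= threshold
    ]


# Hypothetical concepts: a country, a language, its inhabitants, another country.
thesaurus1 = {"t1:germany": "Germany"}
thesaurus2 = {
    "t2:german": "German",
    "t2:germans": "Germans",
    "t2:france": "France",
}

found = match_by_label(thesaurus1, thesaurus2)
print(found)
```

Here the country is matched to both the language and the inhabitants, although neither correspondence is correct; only the unrelated country is rejected.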
Lessons Learned
  Transformation from SKOS to OWL causes some problems, especially regarding the labels
  Ontology matching systems are nevertheless able to match the thesauri and even discover unknown correct correspondences
  Interest of the community in this topic
Future Work
  Update the reference alignment, leading to adapted results
  SKOS import for matching systems
  Use instance data to match thesauri?
  Other thesauri?
Thank you for your attention!