12th of October, 2006, KEG seminar
Combining Ontology Mapping Methods Using Bayesian Networks
Ontology Alignment Evaluation Initiative 'Conference' Track
Ondřej Šváb, Vojtěch Svátek
Overview
Ontology Mapping
Combining Ontology Mapping Methods Using Bayesian Networks
– String distance metrics
– Mapping patterns
OAEI
– Our track: conference domain
– Evaluation
Ontology Mapping
Ontology mapping = discovery of semantic correspondences (equivalence, subsumption)
Classification of ontology mapping techniques
Modelling of interdependencies (1)
Using Bayesian networks
String distance metrics from the SecondString library (mapping methods)
Training data: pairs of concepts from the ontologies ekaw.owl and confOf.owl from the OntoFarm collection
– 798 pairs
Bayesian network
– Nodes: mapping justification by each mapping method
– Classification node: "align" (true, false)
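To make the idea of a per-method "mapping justification" concrete, here is a minimal sketch in Python. It is not the SecondString API (SecondString is a Java library); the two metrics are re-implemented in a simplified form, and the names `justifications` and `threshold`, as well as the step of turning each score into a true/false value, are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1]; this normalisation is an assumption."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def char_jaccard(a: str, b: str) -> float:
    """Jaccard coefficient over the sets of characters of both names."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def justifications(name_a: str, name_b: str, threshold: float = 0.8) -> dict:
    """One boolean per method: does the method 'justify' the mapping?

    The threshold is hypothetical; the slides do not say how scores
    were turned into node values.
    """
    a, b = name_a.lower(), name_b.lower()
    scores = {
        "levenshtein": levenshtein_similarity(a, b),
        "char_jaccard": char_jaccard(a, b),
    }
    return {method: score >= threshold for method, score in scores.items()}
```

Each pair of concept names then yields one row of values, one per method, which is the kind of input the network nodes above would consume.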
Modelling of interdependencies (2)
Two tested Bayesian networks (two corresponding classifiers)
– Naive Bayesian structure: probability distributions learned from data
– Learned Bayesian structure: both the CPTs and the structure learned from data
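The naive Bayesian variant can be sketched as follows: the structure is fixed (the "align" node is the only parent of every method node) and only the probability tables are estimated from the training pairs. This is an illustrative sketch, not the authors' implementation; boolean method nodes, Laplace smoothing (`alpha`) and the function names are assumptions.

```python
def train_naive_bayes(rows, labels, alpha=1.0):
    """Estimate P(align) and P(method_i = true | align) from training pairs.

    rows   -- list of tuples of booleans, one value per mapping method
    labels -- list of booleans, the 'align' value for each pair
    alpha  -- Laplace smoothing constant (an assumption, not from the slides)
    """
    n_features = len(rows[0])
    class_counts = {True: 0, False: 0}
    # true_counts[c][i] = number of class-c pairs where method i says "true"
    true_counts = {True: [0] * n_features, False: [0] * n_features}
    for row, label in zip(rows, labels):
        class_counts[label] += 1
        for i, value in enumerate(row):
            if value:
                true_counts[label][i] += 1
    total = len(labels)
    prior = {c: (class_counts[c] + alpha) / (total + 2 * alpha)
             for c in (True, False)}
    cpt = {c: [(true_counts[c][i] + alpha) / (class_counts[c] + 2 * alpha)
               for i in range(n_features)]
           for c in (True, False)}
    return prior, cpt


def posterior_align(prior, cpt, row):
    """P(align = true | method values), assuming methods independent given align."""
    joint = {}
    for c in (True, False):
        p = prior[c]
        for i, value in enumerate(row):
            p *= cpt[c][i] if value else 1.0 - cpt[c][i]
        joint[c] = p
    return joint[True] / (joint[True] + joint[False])


# Toy data (invented for illustration): two methods, six concept pairs.
rows = [(True, True), (True, False), (False, False),
        (False, False), (False, True), (False, False)]
labels = [True, True, False, False, False, False]
prior, cpt = train_naive_bayes(rows, labels)
```

A pair would be reported as a mapping when `posterior_align(...)` reaches a chosen threshold. The learned-structure variant additionally searches for dependencies between the method nodes, which this sketch does not attempt.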
Evaluation of models
Leave-one-out method (798×)
Evaluation: precision, recall
Precision more important than recall
– 3:2 (precision weight 0.6), 4:1 (0.8)
– C = P*a + R*b, where a and b are the weights
– The higher C, the better the classifier
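The scoring rule can be written out as a short Python sketch. It assumes that the two weights sum to 1 (b = 1 - a), which matches the ratios 3:2 and 4:1 given on the slide; the function names are illustrative.

```python
def precision_recall(predicted, actual):
    """Precision and recall of boolean 'align' predictions against gold labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def weighted_score(precision, recall, a):
    """C = P*a + R*b with b = 1 - a (assumption: the weights sum to 1)."""
    return precision * a + recall * (1.0 - a)


# Invented toy predictions, only to show the call sequence.
predicted = [True, True, False, True]
actual = [True, False, True, True]
p, r = precision_recall(predicted, actual)
c_3_to_2 = weighted_score(p, r, a=0.6)
c_4_to_1 = weighted_score(p, r, a=0.8)
```

For example, a classifier with 73% precision and 60% recall scores about 0.678 under the 3:2 weighting and about 0.704 under 4:1.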
73% precision, 60% recall, 88% accuracy at 80% threshold
84% precision, 53% recall, 89% accuracy at 60% threshold
Align ci. CharJaccard, Monge-Elkan, Levenshtein | TFIDF, SmithWaterman, Jaccard, Jaro, SLIM
Evaluation (C = P*a + R*b)
Chart series: Naive Bayes, Jaccard, BN 2
Mapping patterns (1)
Capturing structures using mapping patterns
Figure: mapping pattern between ontologies
Mapping patterns (2)
Figure: mapping pattern; part of Bayesian network
Conclusions & Future work
Combination of string-based methods is not promising
Implementation of low-level "string-based justifications" of mapping: suffix, prefix, identical names
Capturing context: employ methods working with the structure of ontologies (graph-based), mapping patterns
Not only equivalence relations, but also discovery of subsumption relations, using linguistic sources such as WordNet
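The low-level justifications named above (suffix, prefix, identical names) could look roughly like this in Python. This is a sketch of one possible reading, not the authors' implementation; case-insensitive comparison and the function name are assumptions.

```python
def string_justifications(name_a: str, name_b: str) -> dict:
    """Boolean string-based justifications for a candidate mapping.

    Names are compared case-insensitively (an assumption). 'prefix' and
    'suffix' mean that the shorter name starts or ends the longer one.
    """
    a, b = name_a.lower(), name_b.lower()
    shorter, longer = sorted((a, b), key=len)
    identical = a == b
    proper = bool(shorter) and not identical
    return {
        "identical": identical,
        "prefix": proper and longer.startswith(shorter),
        "suffix": proper and longer.endswith(shorter),
    }


# Concept names taken from examples elsewhere in this deck.
author_pair = string_justifications("Author", "Paper_Author")
program_pair = string_justifications("program", "Program_chair")
```

Both example pairs appear in the deck as mapping errors (subsumption and lexical confusion), so a true flag is best treated as one piece of evidence for the network rather than as a decision on its own.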
Ontology Alignment Evaluation Initiative 'Conference' Track
OAEI 2006 at ISWC'06
Evaluation initiative in ontology matching
Since 2004
In 2006, OAEI workshop at the Ontology Matching workshop, ISWC
Four tracks (six data sets)
– Benchmark (biblio)
– Expressive ontologies: anatomy (2 ontologies, 10k classes), jobs (jobs and job seekers, real-world case)
– Directory (web site directory): 4 thousand elementary tests; Food data set: SKOS thesaurus about food with other food ontologies
Conference track
Coordinated by UEP
Free exploration by participants within 10 ontologies
Domain: conference organisation
No a priori reference alignment
Participants: 6 research groups
Ontologies in the track
Participants (1)
Combination of methods: lexicographic and contextual
ISLab
– 1:1 matching approach
– Linguistic technique: a thesaurus of terms and weighted terminological relationships is exploited
– Contextual technique: semantic relations in an ontology
RiMOM
– Ontology alignment defined as a directional one
– Matchers: name-based (also NLP methods), instance-based, description-based, taxonomy context-based, constraints-based
CtxMatch
– DL formulas
– Not only equivalence, but also subsumption, disjointness, intersection
Participants (2)
COMA++
– Extension of COMA
Automs
– Lexical matching method, LSI, structural matching algorithm
Falcon
– Elementary matchers: string-based, graph-based
Evaluation (1)
Personal judgement of organisers
Interesting individual correspondences (inverse compound names, e.g. PC_Member = Member_PC), synonyms
Mapping errors: subsumption, inversion role, siblings, lexical confusion
Mapping between relation and class, e.g. has_an_ and
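Correspondences of the "inverse compound name" kind can be spotted mechanically by comparing name tokens regardless of order. The Python sketch below is only an illustration of that idea; the track organisers judged correspondences by hand, and the tokenisation rule (split on underscores and camelCase) and function names are assumptions.

```python
import re

# Tokens: an acronym (capitals not followed by lowercase), a word with an
# optional leading capital, or a run of digits. Underscores are skipped.
_TOKEN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def tokens(name: str) -> list:
    """Split an entity name into lowercase tokens."""
    return [t.lower() for t in _TOKEN.findall(name)]


def is_inverse_compound(name_a: str, name_b: str) -> bool:
    """True if both names use the same tokens in a different order."""
    ta, tb = tokens(name_a), tokens(name_b)
    return len(ta) > 1 and ta != tb and sorted(ta) == sorted(tb)


inverse = is_inverse_compound("PC_Member", "Member_PC")
not_inverse = is_inverse_compound("hasTitle", "Title")
```

The first pair is flagged because both names consist of the tokens "pc" and "member"; the second is not, since the token sets differ.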
Evaluation (2)
Evaluation (3)
Subsumption error
– Author, Paper_Author
– Conference_Trip, Conference_part
Inversion role error
– abstract_of_paper, reviewerOfPaper
Siblings error
– ProgramCommittee, Technical_commitee
Lexical confusion error
– program, Program_chair
Relation–class mapping
– has_enddate, Date
– hasTitle, Title; hasSurname, Surname
Evaluation (4)
Summary
How to evaluate this track?
– Interesting mappings
Recall?