Ontology Matching Basics
Ontology Matching by Jerome Euzenat and Pavel Shvaiko, Parts I and II
11/6/2012, Ontology Matching Basics - PL, CS 652

1 - Applications
1.1 Ontology engineering
1.2 Information integration
1.3 Peer-to-peer information sharing
1.4 Web service composition
1.5 Autonomous communication systems
1.6 Navigation and query answering on the web

2 – The matching problem
2.1 Vocabularies, schemas and ontologies
2.2 Ontology language
2.3 Types of heterogeneity
2.4 Terminology
2.5 The ontology matching problem

2.1 Vocabularies, schemas and ontologies
Tags and folksonomies
Directories
Relational database schemas
XML schemas
Conceptual models
Ontologies
  - Model-theoretic semantics, “ontologies are logic theories”

2.2 Ontology language (OWL)
Entities:
  - Classes
  - Individuals
  - Relations
  - Datatypes
  - Data values
Entity relations:
  - Specialization
  - Exclusion
  - Instantiation

2.4 - Terminology

2.5 – The ontology matching problem

2.3 – Types of heterogeneity
Syntactic heterogeneity
  - Not expressed in the same ontology language
Terminological heterogeneity
  - Variation in names for the same entity
Conceptual heterogeneity
  - Differences in coverage, granularity, or perspective
Semiotic (pragmatic) heterogeneity
  - How entities are interpreted by people

3 – Classification of ontology matching techniques
3.1 Matching dimensions
  - Input dimensions
  - Process dimensions
  - Output dimensions
3.2 Classification of matching approaches
  - Exhaustivity
  - Disjointedness
  - Homogeneity
  - Saturation
3.3 Other classifications
  - Horizontal: data, ontology, and context layers
  - Vertical: syntactic, pragmatic, conceptual

Element-level techniques
String-based techniques
Language-based techniques
Constraint-based techniques
Linguistic resources
Alignment reuse
Upper level and domain specific formal ontologies

Structure-level techniques
Graph-based techniques
Taxonomy-based techniques
Repository of structures
Model-based techniques
Data analysis and statistical techniques

4 – Basic techniques
4.1 Similarity, distances and other measures
4.2 Name-based techniques
4.3 Structure-based techniques
4.4 Extensional techniques
4.5 Semantic-based techniques

4.2 – Name-based techniques
Problem: synonyms and homonyms (polysemy)
String-based methods
  - Normalization
  - String equality
  - Substring test
  - Edit, token-based, and path distances
Language-based methods
  - Intrinsic methods
  - Extrinsic methods

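The string-based methods listed above can be sketched in a few lines of Python. This is only an illustration, not the book's algorithms: the function names (`normalize`, `edit_distance`, `edit_similarity`, `substring_match`) are hypothetical, the normalization steps chosen here (camelCase splitting, diacritic removal, separator collapsing) are one plausible selection, and the edit distance shown is the classic Levenshtein distance, scaled by the longer label to give a similarity in [0, 1].

```python
import re
import unicodedata


def normalize(label: str) -> str:
    """Lowercase, strip diacritics, split camelCase, and collapse separators."""
    # Split camelCase before lowercasing: "hasAuthor" -> "has Author"
    label = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    # Decompose accented characters and drop the combining marks
    label = unicodedata.normalize("NFKD", label)
    label = "".join(ch for ch in label if not unicodedata.combining(ch))
    # Replace punctuation, underscores, and hyphens with single spaces
    label = re.sub(r"[^a-z0-9]+", " ", label.lower())
    return label.strip()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] computed on the normalized labels."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # assumption: two empty labels count as identical
    return 1.0 - edit_distance(a, b) / longest


def substring_match(a: str, b: str) -> bool:
    """Substring test: one normalized label is contained in the other."""
    a, b = normalize(a), normalize(b)
    return bool(a) and bool(b) and (a in b or b in a)


print(normalize("hasAuthor"))                       # has author
print(edit_distance("kitten", "sitting"))           # 3
print(edit_similarity("hasAuthor", "has_author"))   # 1.0
print(substring_match("Book", "TextBook"))          # True
```

Normalizing first makes string equality far more useful: `hasAuthor` and `has_author` differ as raw strings but are identical after normalization. None of these measures addresses the synonym and homonym problem, which is why the language-based methods are listed separately.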
4.3 – Structure-based techniques
Internal structure
  - Property comparison
  - Datatype comparison
  - Domain comparison
  - Comparing multiplicities and properties
  - Other features
Relational structure
  - Maximum common directed subgraph problem
  - Taxonomic structure
  - Mereologic structure
  - Relation similarities

4.4 – Extensional techniques
Common extension comparison
  - Hamming distance
  - Jaccard similarity
  - Formal concept analysis: intent and extent
Instance identification techniques
Disjoint extension comparison
  - Statistical approach
  - Similarity-based extension comparison
  - Matching-based comparison

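For the common extension case, where both ontologies classify the same instances, the two set measures named above can be sketched as follows. The function names and the sample instance identifiers are made up for illustration, and the Hamming measure is shown in one common normalized form (size of the symmetric difference over the size of the union); other normalizations exist.

```python
def jaccard_similarity(ext_a: set, ext_b: set) -> float:
    """Shared instances divided by all instances of either class."""
    union = ext_a | ext_b
    if not union:
        return 1.0  # assumption: two empty extensions count as identical
    return len(ext_a & ext_b) / len(union)


def hamming_dissimilarity(ext_a: set, ext_b: set) -> float:
    """Instances belonging to exactly one class, divided by all instances."""
    union = ext_a | ext_b
    if not union:
        return 0.0  # assumption: two empty extensions have zero distance
    return len(ext_a ^ ext_b) / len(union)


# Hypothetical extensions of two classes over a shared set of instances
book = {"i1", "i2", "i3", "i4"}
volume = {"i3", "i4", "i5"}

print(jaccard_similarity(book, volume))     # 0.4  (2 shared out of 5)
print(hamming_dissimilarity(book, volume))  # 0.6  (3 unshared out of 5)
```

With this normalization the two measures are complementary (they sum to 1). Both depend on the extensions actually overlapping, which is why disjoint extensions need the separate techniques listed above.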
4.5 – Semantic-based techniques
Model-theoretic, deductive methods
Act to amplify seeding alignments
Techniques based on external ontologies
Deductive techniques
  - Propositional satisfiability
  - Modal satisfiability
  - Description logic techniques

5 – Matching strategies
5.1 Matcher composition
5.2 Similarity aggregation
5.3 Global similarity computation
5.4 Learning methods
5.5 Probabilistic methods
5.6 User involvement and dynamic composition
5.7 Alignment extraction

5.4 – Learning methods
Bayes learning
WHIRL learner
Neural networks
Decision trees
Stacked generalization

5.5 Probabilistic methods
Bayesian networks

5.6 – User involvement and dynamic composition
Providing input
  - Ontologies, parameters, initial alignment
Manual matcher composition
  - Assemble from libraries
  - Examine results and iterate
  - Apply to application
Relevance feedback

5.7 – Alignment extraction
Select on similarity, extract, and filter
Thresholds
Strengthening and weakening
Optimizing the result

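A minimal sketch of the select-then-extract steps, assuming the similarities are already available as a dictionary keyed by entity pairs. The names (`apply_threshold`, `greedy_one_to_one`), the sample scores, and the threshold value are all hypothetical. The filter is a simple hard threshold, and the extraction is a greedy one-to-one pass, which is easy to follow but is not guaranteed to find the alignment with the highest total similarity.

```python
from typing import Dict, List, Tuple

Similarity = Dict[Tuple[str, str], float]


def apply_threshold(sim: Similarity, threshold: float) -> Similarity:
    """Hard threshold: keep only pairs whose similarity reaches the cutoff."""
    return {pair: score for pair, score in sim.items() if score >= threshold}


def greedy_one_to_one(sim: Similarity) -> List[Tuple[str, str, float]]:
    """Take pairs by decreasing similarity, using each entity at most once."""
    alignment = []
    used_left, used_right = set(), set()
    # Sort by descending score; ties are broken by pair name for determinism
    for (left, right), score in sorted(sim.items(),
                                       key=lambda item: (-item[1], item[0])):
        if left in used_left or right in used_right:
            continue  # one side is already matched to a better candidate
        alignment.append((left, right, score))
        used_left.add(left)
        used_right.add(right)
    return alignment


# Made-up similarity values between entities of two ontologies
sim = {
    ("Book", "Volume"): 0.8,
    ("Book", "Essay"): 0.6,
    ("Person", "Human"): 0.9,
    ("Person", "Volume"): 0.3,
    ("Author", "Writer"): 0.4,
}

selected = apply_threshold(sim, 0.5)
print(greedy_one_to_one(selected))
# [('Person', 'Human', 0.9), ('Book', 'Volume', 0.8)]
```

Where the total similarity of the alignment matters ("optimizing the result"), the greedy pass could be replaced by an assignment-problem solver; that variant is not shown here.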
The figure displays a fictitious example involving several of the methods. It (i) runs several basic matchers in parallel, (ii) aggregates their results, (iii) selects some correspondences on the basis of their (dis)similarity, (iv) extracts an alignment, (v) uses a semantic algorithm to amplify the selected alignment, and (vi) reiterates this process if necessary.
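Steps (i) to (iv) of such a pipeline can be sketched as below. Everything here is an assumption made for illustration: the two basic matchers (a `difflib` sequence ratio and a character-bigram Jaccard measure), the weighted-average aggregation, the threshold, and the entity names. The semantic amplification of step (v) and the iteration of step (vi) are not implemented; they would wrap around the `match` function.

```python
from difflib import SequenceMatcher
from itertools import product
from typing import Callable, List, Sequence, Tuple

Matcher = Callable[[str, str], float]


def string_matcher(a: str, b: str) -> float:
    """Basic matcher 1: difflib's sequence similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def bigram_matcher(a: str, b: str) -> float:
    """Basic matcher 2: Jaccard similarity over character bigrams."""
    def bigrams(s: str) -> set:
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    grams_a, grams_b = bigrams(a), bigrams(b)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def match(entities_a: Sequence[str], entities_b: Sequence[str],
          matchers: Sequence[Matcher], weights: Sequence[float],
          threshold: float) -> List[Tuple[str, str, float]]:
    total_weight = sum(weights)
    scores = {}
    for a, b in product(entities_a, entities_b):
        # (i) run every basic matcher, (ii) aggregate by weighted average
        scores[(a, b)] = sum(w * m(a, b)
                             for m, w in zip(matchers, weights)) / total_weight
    # (iii) select correspondences whose aggregated similarity is high enough
    selected = {pair: s for pair, s in scores.items() if s >= threshold}
    # (iv) extract a one-to-one alignment greedily, best pairs first
    alignment, used_a, used_b = [], set(), set()
    for (a, b), s in sorted(selected.items(), key=lambda kv: (-kv[1], kv[0])):
        if a not in used_a and b not in used_b:
            alignment.append((a, b, s))
            used_a.add(a)
            used_b.add(b)
    return alignment


result = match(["Book", "Author", "Publisher"],
               ["Books", "Writer", "Publishers"],
               matchers=[string_matcher, bigram_matcher],
               weights=[1.0, 1.0],
               threshold=0.6)
for a, b, score in result:
    print(f"{a} = {b} ({score:.2f})")
```

On this toy input the purely string-based matchers align `Book` with `Books` and `Publisher` with `Publishers` but miss the synonym pair `Author` and `Writer`, the kind of gap that language-based or semantic techniques are meant to close.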