Published by Simon Carpenter. Modified over 9 years ago.
1
Ontology Alignment/Matching Prafulla Palwe
2
Agenda ► Introduction Being serious about the semantic web Living with heterogeneity Heterogeneity problem I have a plan for you ► Matching Problem Matching Operation Motivation Schema Matching Vs Ontology Matching Correspondence Alignment ► Matching Process Sequential composition Parallel composition ► Application Domains Traditional Emergent ► Classification Matching Dimensions ► Basic Techniques Element Level Structure Level ► Summary and Challenges
3
Introduction ► Being serious about the semantic web - It is not one guy's ontology. It is not several guys' common ontology. It is many guys' and girls' many ontologies. So it is a mess, but a meaningful mess.
4
Introduction ► Living with heterogeneity - The semantic web will be: ► Huge ► Dynamic ► Heterogeneous These are not bugs, they are features. We must learn to live with them.
5
Introduction ► Heterogeneity problem – Resources expressed in different ways must be reconciled before being used. Mismatches between formalized knowledge can occur when: ► different languages are used; ► different terminologies are used; ► different modeling is used.
6
Introduction ► I have a plan for you – Reconciliation
7
Matching Problem ► Matching Operation Definition – The matching operation takes ontologies as input, each consisting of a set of discrete entities (e.g., tables, XML elements, classes, properties), and determines as output the relationships (e.g., equivalence, subsumption) holding between these entities.
8
Matching Problem ► Motivation – 2 XML Schemas 2 Ontologies
9
Matching Problem
17
► Schema mapping Vs ontology mapping Differences - ► Schemas often do not provide explicit semantics for their data; relational schemas provide no generalization. ► Ontologies are logical systems that constrain the meaning; an ontology is defined as a set of logical axioms. Commonalities - ► Schemas and ontologies provide a vocabulary of terms that describes the domain of interest. ► Schemas and ontologies constrain the meaning of terms used in the vocabulary.
18
Matching Problem ► Correspondence Definition – Given 2 ontologies O and O’, a correspondence M between O and O’ is a 5-tuple ⟨id, e, e’, R, n⟩ such that: ► id is a unique identifier of the correspondence. ► e and e’ are entities of O and O’ (e.g. XML elements, classes). ► R is a relation (e.g. equivalence (=), disjointness (⊥)). ► n is a confidence measure in some mathematical structure (typically in the [0,1] range).
19
Matching Problem ► Alignment Definition – Given 2 ontologies O and O’, an alignment A between O and O’: ► Is a set of correspondences on O and O’ ► Has some cardinality: 1-1, 1-* etc. ► Carries some additional metadata (method, date, properties etc.)
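The two definitions above can be sketched as plain data structures. This is a minimal illustration, not a standard API: the class names, the field names (`e_prime`, `relation`, `confidence`), the use of strings for entities and relations, and the `is_one_to_one` helper are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Correspondence:
    """One correspondence <id, e, e', R, n> between entities of O and O'."""
    id: str            # unique identifier of the correspondence
    e: str             # entity of O (e.g. a class or XML element name)
    e_prime: str       # entity of O'
    relation: str      # e.g. "=" for equivalence (hypothetical encoding)
    confidence: float  # confidence measure, assumed here to lie in [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


@dataclass
class Alignment:
    """A set of correspondences plus free-form metadata (method, date, ...)."""
    correspondences: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

    def is_one_to_one(self) -> bool:
        """True if every entity appears in at most one correspondence."""
        left = Counter(c.e for c in self.correspondences)
        right = Counter(c.e_prime for c in self.correspondences)
        return all(n == 1 for n in left.values()) and all(n == 1 for n in right.values())


# Toy data, invented for the example.
alignment = Alignment(
    correspondences=[
        Correspondence("c1", "Book", "Volume", "=", 0.9),
        Correspondence("c2", "Author", "Writer", "=", 0.8),
    ],
    metadata={"method": "manual example"},
)
print(alignment.is_one_to_one())
```

Adding a second correspondence for `Book` would turn this into a 1-* alignment, which `is_one_to_one` reports as `False`.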
20
Matching Process
26
► General Basic Matching Process
27
Matching Process ► Sequential Composition
28
Matching Process ► Parallel composition
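The slides give only the titles for the two composition styles, so the following is a sketch under a common reading of them: in sequential composition each matcher receives the previous matcher's output, while in parallel composition the matchers run independently and their results are aggregated. The matcher signature, the score dictionaries, and the toy matchers (`exact_name`, `same_initial`, `keep_confident`) are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Scores = Dict[Tuple[str, str], float]          # (entity of O, entity of O') -> score
Matcher = Callable[[List[str], List[str], Scores], Scores]


def sequential(matchers: List[Matcher], o1, o2, initial: Optional[Scores] = None) -> Scores:
    """Feed the output of each matcher into the next one."""
    current = dict(initial or {})
    for matcher in matchers:
        current = matcher(o1, o2, current)
    return current


def parallel(matchers: List[Matcher], o1, o2, aggregate, initial: Optional[Scores] = None) -> Scores:
    """Run all matchers on the same input, then aggregate their scores per pair."""
    results = [m(o1, o2, dict(initial or {})) for m in matchers]
    pairs = set().union(*results) if results else set()
    return {p: aggregate([r.get(p, 0.0) for r in results]) for p in pairs}


# Toy matchers, invented for the example.
def exact_name(o1, o2, prior):
    return {(a, b): 1.0 if a.lower() == b.lower() else 0.0 for a in o1 for b in o2}


def same_initial(o1, o2, prior):
    return {(a, b): 1.0 if a[0].lower() == b[0].lower() else 0.0 for a in o1 for b in o2}


def keep_confident(o1, o2, prior):
    return {pair: s for pair, s in prior.items() if s >= 0.5}


o1, o2 = ["Book", "Author"], ["book", "Binder"]
seq = sequential([exact_name, keep_confident], o1, o2)
par = parallel([exact_name, same_initial], o1, o2, lambda v: sum(v) / len(v))
print(seq)
print(par)
```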
29
Matching Process ► Similarity Filter, alignment extractor and alignment filter –
30
Matching Process ► Aggregation Operations – There are many different ways to aggregate matcher results, usually depending on confidence/similarity: ► Triangular norms (min, weighted products) useful for selecting only the best results ► Multidimensional distances (Euclidean distance, weighted sum) useful for taking into account all dimensions ► Fuzzy aggregation (min, weighted average) useful for aggregating competing algorithms and averaging their results ► Other specific measures (e.g., ordered weighted average)
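The aggregation families listed above can be illustrated with a few small functions. This is a sketch assuming each matcher returns one value in [0, 1] for a given entity pair; the function names are made up for the example, and the weighted product is taken in the usual form where each weight is an exponent.

```python
import math


def t_norm_min(scores):
    """Triangular norm: the pair is only as good as its weakest score."""
    return min(scores)


def weighted_product(scores, weights):
    """Triangular-norm style product; a single zero score gives zero."""
    return math.prod(s ** w for s, w in zip(scores, weights))


def euclidean(distances):
    """Multidimensional distance over per-matcher dissimilarities."""
    return math.sqrt(sum(d * d for d in distances))


def weighted_sum(values, weights):
    return sum(w * v for v, w in zip(values, weights))


def weighted_average(scores, weights):
    """Fuzzy aggregation: average competing matchers by their weights."""
    return weighted_sum(scores, weights) / sum(weights)


def ordered_weighted_average(scores, weights):
    """Weights attach to rank positions (largest score first), not to matchers."""
    ranked = sorted(scores, reverse=True)
    return sum(w * s for s, w in zip(ranked, weights))


scores = [0.9, 0.6, 0.3]
print(t_norm_min(scores))
print(weighted_average(scores, [1, 1, 2]))
print(ordered_weighted_average(scores, [0.5, 0.3, 0.2]))
```

The minimum keeps a pair only if every matcher agrees, whereas the averages let a strong matcher compensate for a weak one.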
31
Application Domains ► Traditional - Ontology evolution Schema integration Catalog integration Data integration
32
Application Domains ► Ontology Evolution
33
Application Domains ► Catalog Integration
34
Application Domains ► Emergent P2P information sharing Agent communication Web service composition Query answering on the web
35
Application Domains ► P2P information sharing
36
Application Domains ► Web Service Composition
37
Application Domains ► Agent communication
38
Classifications ► Matching Dimensions Input Dimensions ► Underlying models (e.g. XML, OWL) ► Schema Level Vs Instance Level Process Dimensions ► Approximate Vs Exact ► Interpretation of the input Output Dimensions ► Cardinality ► Equivalence Vs Diverse relations ► Graded Vs Absolute Confidence
39
Classifications ► Three Layers Upper Layer ► Granularity of match ► Interpretation of the input information Middle Layer ► Represents classes of elementary (basic) matching techniques Lower Layer ► Based on the kind of input which is used by elementary matching techniques
40
Classifications ► Classification of schema based techniques
41
Basic Techniques ► Element Level Techniques String based – Prefix - ► Takes as input 2 strings and checks whether the first string starts with the second ► e.g. net = network but also hot = hotel Suffix – ► Takes as input 2 strings and checks whether the first string ends with the second ► e.g. ID = PID but also word = sword Edit Distance – ► Takes as input 2 strings and calculates the number of edit operations (insertion, deletion, substitution) of characters required to transform one string into the other, normalized by the length of the longer string. ► editDistance(NKN, Nikon) = 0.4
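The three string-based techniques above can be sketched in a few lines. Two assumptions are made here: the comparison ignores case (the value 0.4 for NKN and Nikon only comes out that way), and the prefix and suffix tests check both argument orders so that the slide's examples hold whichever string is given first. The function names are illustrative.

```python
def prefix_match(a: str, b: str) -> bool:
    """True if one string starts with the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    return a.startswith(b) or b.startswith(a)


def suffix_match(a: str, b: str) -> bool:
    """True if one string ends with the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    return a.endswith(b) or b.endswith(a)


def levenshtein(a: str, b: str) -> int:
    """Number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or keep
        prev = cur
    return prev[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance divided by the length of the longer string."""
    a, b = a.lower(), b.lower()
    if not a and not b:
        return 0.0
    return levenshtein(a, b) / max(len(a), len(b))


print(prefix_match("net", "network"), prefix_match("hot", "hotel"))
print(suffix_match("ID", "PID"), suffix_match("word", "sword"))
print(normalized_edit_distance("NKN", "Nikon"))
```

The hot/hotel and word/sword pairs show why these tests are only hints: they fire on unrelated terms as readily as on abbreviations.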
42
Basic Techniques Language based – Tokenization – ► Parses names into tokens by recognizing punctuation, cases ► Hands-Free_Kits → ⟨hands, free, kits⟩ Lemmatization – ► Morphologically analyses tokens in order to find all their possible basic forms ► Kits → Kit Elimination – ► Discards empty tokens such as articles, prepositions, conjunctions ► a, the, by, type of, their, from
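The language-based steps above can be chained into a small normalization pipeline. This is only a sketch: the regular expression, the stop-word set, and in particular the plural-stripping rule are simplifying assumptions, and a real lemmatizer would rely on a morphological dictionary rather than on suffix rules.

```python
import re

# Stop words taken from the slide's list; the phrase "type of" is reduced to "of" here.
STOP_WORDS = {"a", "the", "by", "of", "their", "from"}


def tokenize(name: str) -> list:
    """Split on punctuation, underscores, digits and case changes."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def lemmatize(token: str) -> str:
    """Crude stand-in for lemmatization: strip a plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def eliminate(tokens: list) -> list:
    """Drop empty tokens (articles, prepositions, conjunctions)."""
    return [t for t in tokens if t not in STOP_WORDS]


def normalize(name: str) -> list:
    return [lemmatize(t) for t in eliminate(tokenize(name))]


print(tokenize("Hands-Free_Kits"))
print(normalize("Hands-Free_Kits"))
print(normalize("Kits_by_the_Author"))
```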
43
Basic Techniques ► Structure Level Techniques Ontologies are viewed as graph-like structures containing terms and their inter-relationships. Taxonomy based ► Bounded path matching These take 2 paths with links between classes defined by the hierarchical relations, compare terms and their positions along these paths, and identify similar terms. ► Super(sub)-concept rules If the super-concepts are the same, the actual concepts are similar to each other.
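The super-concept rule above can be sketched as follows, assuming each taxonomy is given as a child-to-parent dictionary and that some pairs of concepts have already been matched. The function name, the data layout, and the toy taxonomies are assumptions for the example; the rule only proposes candidates, which a real matcher would still score.

```python
def super_concept_candidates(parent1: dict, parent2: dict, matched: set) -> set:
    """Propose (c1, c2) pairs whose direct super-concepts are already matched."""
    candidates = set()
    for c1, p1 in parent1.items():
        for c2, p2 in parent2.items():
            if (p1, p2) in matched and (c1, c2) not in matched:
                candidates.add((c1, c2))
    return candidates


# Toy taxonomies (child -> parent), invented for the example.
parent1 = {"Novel": "Book", "Book": "Product"}
parent2 = {"Fiction": "Volume", "Volume": "Item"}
matched = {("Book", "Volume")}

print(super_concept_candidates(parent1, parent2, matched))
```

The sub-concept variant is symmetric: swap the roles of children and parents so that matched sub-concepts suggest their parents as candidates.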
44
Basic Techniques Tree based Children ► Two non-leaf schema elements are structurally similar if their immediate children sets are highly similar. Leaves ► Two non-leaf schema elements are structurally similar if their leaf sets are highly similar, even if their immediate children are not.
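The difference between the children rule and the leaves rule shows up on a small example. The sketch below assumes each schema is a dictionary from a node to its list of children, and it measures "highly similar" with a Jaccard overlap on exact labels; both choices, and the function names, are simplifications made for the illustration.

```python
def jaccard(a, b) -> float:
    """Overlap of two label sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def leaves(tree: dict, node: str) -> set:
    """All leaf labels reachable from node."""
    kids = tree.get(node, [])
    if not kids:
        return {node}
    found = set()
    for kid in kids:
        found |= leaves(tree, kid)
    return found


def children_similarity(t1, n1, t2, n2) -> float:
    return jaccard(t1.get(n1, []), t2.get(n2, []))


def leaves_similarity(t1, n1, t2, n2) -> float:
    return jaccard(leaves(t1, n1), leaves(t2, n2))


# Toy schemas: the second nests Street and City under an extra Location element.
t1 = {"Address": ["Street", "City", "Zip"]}
t2 = {"Address": ["Location", "Zip"], "Location": ["Street", "City"]}

print(children_similarity(t1, "Address", t2, "Address"))
print(leaves_similarity(t1, "Address", t2, "Address"))
```

Here the immediate children overlap only on `Zip`, yet the leaf sets are identical, which is exactly the case the leaves rule is meant to catch.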
45
Basic Techniques
48
Summary and Challenges ► Summary Ontology matching and alignment is the process of finding the common, or most closely related, structural and semantic terms across 2 or more different ontologies, structures, or schemas. Efficient and more complex algorithms for matching and alignment generation can be built from the basic matching techniques. ► Challenges Developing generic and highly efficient matching and alignment generation algorithms.
49
Thank You