Ontology Alignment/Matching Prafulla Palwe. Agenda ► Introduction  Being serious about the semantic web  Living with heterogeneity  Heterogeneity problem.

Ontology Alignment/Matching Prafulla Palwe

Agenda
► Introduction: Being serious about the semantic web; Living with heterogeneity; Heterogeneity problem; I have a plan for you
► Matching Problem: Matching Operation; Motivation; Schema Matching Vs Ontology Matching; Correspondence; Alignment
► Matching Process: Sequential composition; Parallel composition
► Application Domains: Traditional; Emergent
► Classification: Matching Dimensions
► Basic Techniques: Element Level; Structure Level
► Summary and Challenges

Introduction ► Being serious about the semantic web
It is not one guy's ontology. It is not several guys' common ontology. It is many guys and girls' many ontologies. So it is a mess, but a meaningful mess.

Introduction ► Living with heterogeneity
The semantic web will be: ► Huge ► Dynamic ► Heterogeneous
These are not bugs, they are features. We must learn to live with them.

Introduction ► Heterogeneity problem
Resources expressed in different ways must be reconciled before being used.
Mismatches between formalized knowledge can occur when: ► different languages are used; ► different terminologies are used; ► different modeling is used.

Introduction ► I have a plan for you – Reconciliation

Matching Problem ► Matching Operation
Definition: The matching operation takes as input ontologies, each consisting of a set of discrete entities (e.g., tables, XML elements, classes, properties), and determines as output the relationships (e.g., equivalence, subsumption) holding between these entities.

Matching Problem ► Motivation: 2 XML Schemas; 2 Ontologies

Matching Problem ► Schema matching Vs ontology matching
Differences: ► Schemas often do not provide explicit semantics for their data (relational schemas provide no generalization) ► Ontologies are logical systems that constrain the meaning (an ontology is defined as a set of logical axioms)
Commonalities: ► Schemas and ontologies provide a vocabulary of terms that describes the domain of interest ► Schemas and ontologies constrain the meaning of terms used in the vocabulary

Matching Problem ► Correspondence
Definition: Given 2 ontologies O and O’, a correspondence M between O and O’ is a 5-tuple ⟨id, e, e’, R, n⟩ such that: ► id is a unique identifier of the correspondence ► e and e’ are entities of O and O’ (e.g. XML elements, classes) ► R is a relation (e.g. equivalence (=), disjointness (⊥)) ► n is a confidence measure in some mathematical structure (typically in the [0,1] range)
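The 5-tuple above can be written down as a small record type. This is only a sketch: the class name, the field names and the choice to reject confidences outside [0,1] are assumptions made for illustration, not part of the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """One correspondence <id, e, e', R, n> between ontologies O and O'."""
    id: str            # unique identifier of the correspondence
    e: str             # entity of O (e.g. a class or XML element name)
    e_prime: str       # entity of O'
    relation: str      # relation R, e.g. "=" (equivalence) or "⊥" (disjointness)
    confidence: float  # confidence n, assumed here to lie in [0, 1]

    def __post_init__(self):
        # Assumption: confidences are plain floats in the [0, 1] range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


# Hypothetical example entities, not taken from the slides.
example = Correspondence("c1", "Book", "Volume", "=", 0.9)
```

Freezing the dataclass makes correspondences hashable, so they can be collected in a set.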

Matching Problem ► Alignment
Definition: Given 2 ontologies O and O’, an alignment A between O and O’: ► Is a set of correspondences on O and O’ ► Has some cardinality: 1-1, 1-* etc. ► Carries some additional metadata (method, date, properties etc.)
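A minimal sketch of an alignment as a collection of correspondences plus metadata, with a helper that reports the cardinality. Correspondences are represented as plain tuples (id, e, e', R, n); the class name, the metadata keys and the way cardinality is derived are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Alignment:
    # Each correspondence is a tuple (id, e, e_prime, relation, confidence).
    correspondences: list
    metadata: dict = field(default_factory=dict)  # e.g. method, date

    def cardinality(self) -> str:
        """Return '1-1', '1-*', '*-1' or '*-*' for this alignment."""
        per_left = Counter(c[1] for c in self.correspondences)
        per_right = Counter(c[2] for c in self.correspondences)
        # An entity of O matched to several entities of O' gives '*' on the right.
        many_right = any(n > 1 for n in per_left.values())
        # An entity of O' matched by several entities of O gives '*' on the left.
        many_left = any(n > 1 for n in per_right.values())
        return f"{'*' if many_left else '1'}-{'*' if many_right else '1'}"


# Hypothetical example data, not taken from the slides.
a = Alignment(
    [("c1", "Book", "Volume", "=", 0.9), ("c2", "Book", "Tome", "=", 0.7)],
    metadata={"method": "string-based"},
)
print(a.cardinality())
```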

Matching Process ► General basic matching process

Matching Process ► Sequential Composition

Matching Process ► Parallel composition

Matching Process ► Similarity filter, alignment extractor and alignment filter

Matching Process ► Aggregation Operations
There are many different ways to aggregate matcher results, usually depending on confidence/similarity: ► Triangular norms (min, weighted products), useful for selecting only the best results ► Multidimensional distances (Euclidean distance, weighted sum), useful for taking into account all dimensions ► Fuzzy aggregation (min, weighted average), useful for aggregating competing algorithms and averaging their results ► Other specific measures (e.g., ordered weighted average)
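The aggregation families above can be sketched for one pair of entities scored by several matchers. The function names, the weights and the conversion between similarity and distance (d = 1 - s, normalized by the number of matchers) are assumptions made for illustration.

```python
import math


def agg_min(scores):
    """Triangular norm: keep the most pessimistic matcher."""
    return min(scores)


def agg_weighted_product(scores, weights):
    """Triangular norm: product of scores raised to their weights."""
    return math.prod(s ** w for s, w in zip(scores, weights))


def agg_euclidean(scores):
    """Multidimensional distance, turned back into a similarity.

    Assumption: each similarity s becomes a distance 1 - s, and the
    Euclidean norm is divided by sqrt(n) so the result stays in [0, 1].
    """
    dist = math.sqrt(sum((1.0 - s) ** 2 for s in scores))
    return 1.0 - dist / math.sqrt(len(scores))


def agg_weighted_sum(scores, weights):
    """Weighted sum; stays in [0, 1] only if the weights sum to 1."""
    return sum(s * w for s, w in zip(scores, weights))


def agg_weighted_average(scores, weights):
    """Fuzzy aggregation: weighted average of competing matchers."""
    return agg_weighted_sum(scores, weights) / sum(weights)


# Hypothetical scores from two matchers for the same entity pair.
scores = [0.8, 0.6]
print(agg_min(scores), agg_weighted_average(scores, [3, 1]))
```

The min and product variants penalize any weak matcher, while the sum and average variants let a strong matcher compensate for a weak one.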

Application Domains ► Traditional -  Ontology evolution  Schema integration  Catalog integration  Data integration

Application Domains ► Ontology Evolution

Application Domains ► Catalog Integration

Application Domains ► Emergent  P2P information sharing  Agent communication  Web service composition  Query answering on the web

Application Domains ► P2P information sharing

Application Domains ► Web Service Composition

Application Domains ► Agent communication

Classifications ► Matching Dimensions
Input Dimensions: ► Underlying models (e.g. XML, OWL) ► Schema Level Vs Instance Level
Process Dimensions: ► Approximate Vs Exact ► Interpretation of the input
Output Dimensions: ► Cardinality ► Equivalence Vs Diverse relations ► Graded Vs Absolute Confidence

Classifications ► Three Layers
Upper Layer: ► Granularity of match ► Interpretation of the input information
Middle Layer: ► Represents classes of elementary (basic) matching techniques
Lower Layer: ► Based on the kind of input which is used by elementary matching techniques

Classifications ► Classification of schema based techniques

Basic Techniques ► Element Level Techniques: String based
Prefix: ► Takes as input 2 strings and checks whether the first string starts with the second ► e.g. net = network, but also hot = hotel
Suffix: ► Takes as input 2 strings and checks whether the first string ends with the second ► e.g. ID = PID, but also word = sword
Edit Distance: ► Takes as input 2 strings and calculates the number of edit operations (insertion, deletion, substitution) of characters required to transform one string into the other, normalized by the length of the longer string ► editDistance(NKN, Nikon) = 0.4
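The three string-based techniques can be sketched as follows. Two assumptions are made to reproduce the slide's examples: comparison ignores case (otherwise NKN and Nikon would differ in more than 2 characters), and the prefix and suffix tests are symmetric, so the order of the two strings does not matter. The function names are illustrative.

```python
def prefix_match(a: str, b: str) -> bool:
    """True if one string starts with the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    return a.startswith(b) or b.startswith(a)


def suffix_match(a: str, b: str) -> bool:
    """True if one string ends with the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    return a.endswith(b) or b.endswith(a)


def levenshtein(a: str, b: str) -> int:
    """Number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def edit_distance(a: str, b: str) -> float:
    """Levenshtein distance normalized by the length of the longer string."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


print(prefix_match("net", "network"), suffix_match("ID", "PID"))
print(edit_distance("NKN", "Nikon"))  # 2 insertions over length 5
```

The false positives from the slide (hot = hotel, word = sword) are accepted by this sketch as well, which is why these tests are usually combined with other techniques.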

Basic Techniques ► Element Level Techniques: Language based
Tokenization: ► Parses names into tokens by recognizing punctuation and case ► Hands-Free_Kits → ⟨hands, free, kits⟩
Lemmatization: ► Morphologically analyses tokens in order to find all their possible basic forms ► Kits → Kit
Elimination: ► Discards empty tokens such as articles, prepositions and conjunctions ► a, the, by, type of, their, from
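A rough sketch of the three language-based steps chained together. The regular expression, the stop-word set and especially the plural-stripping rule are assumptions; real lemmatization needs a morphological analyser or a dictionary, which this placeholder does not attempt.

```python
import re

# Assumption: stop words taken from the slide's list, with "type of" split in two.
STOP_WORDS = {"a", "the", "by", "type", "of", "their", "from"}


def tokenize(name: str) -> list:
    """Split a name on punctuation and on lower/upper case boundaries."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def naive_lemma(token: str) -> str:
    """Placeholder lemmatizer: only strips a plain plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def eliminate(tokens: list) -> list:
    """Discard tokens that carry no meaning of their own."""
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(name: str) -> list:
    """Tokenization, then elimination, then (naive) lemmatization."""
    return [naive_lemma(t) for t in eliminate(tokenize(name))]


print(tokenize("Hands-Free_Kits"))
print(preprocess("type_of_the_Kits"))
```

Names normalized this way can then be compared with the string-based techniques instead of comparing the raw labels.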

Basic Techniques ► Structure Level Techniques
Ontologies are viewed as graph-like structures containing terms and their inter-relationships.
Taxonomy based: ► Bounded path matching: takes 2 paths with links between classes defined by the hierarchical relations, compares terms and their positions along these paths, and identifies similar terms ► Super(sub)-concept rules: if the super-concepts are the same, the actual concepts are similar to each other

Basic Techniques ► Structure Level Techniques: Tree based
Children: ► 2 non-leaf schema elements are structurally similar if their immediate children sets are highly similar
Leaves: ► 2 non-leaf schema elements are structurally similar if their leaf sets are highly similar, even if their immediate children are not
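The children and leaves rules can be sketched on small trees. Several assumptions are made here: a tree is a dict mapping each non-leaf element to its list of children, labels are compared by exact equality, and "highly similar" is measured with the Jaccard coefficient over the two label sets. The example schemas are hypothetical.

```python
def jaccard(a, b) -> float:
    """Size of the intersection over size of the union of two label sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def leaves(tree: dict, node: str) -> set:
    """All leaf labels below node; a node without children is its own leaf."""
    kids = tree.get(node, [])
    if not kids:
        return {node}
    found = set()
    for kid in kids:
        found |= leaves(tree, kid)
    return found


def children_similarity(t1, n1, t2, n2) -> float:
    """Children rule: compare the immediate children sets."""
    return jaccard(t1.get(n1, []), t2.get(n2, []))


def leaves_similarity(t1, n1, t2, n2) -> float:
    """Leaves rule: compare the leaf sets, ignoring intermediate levels."""
    return jaccard(leaves(t1, n1), leaves(t2, n2))


# Two hypothetical schemas: the second nests the same leaves one level deeper.
s1 = {"Address": ["Street", "City"]}
s2 = {"Address": ["Location"], "Location": ["Street", "City"]}
print(children_similarity(s1, "Address", s2, "Address"))
print(leaves_similarity(s1, "Address", s2, "Address"))
```

In this example the children rule finds nothing in common while the leaves rule finds identical leaf sets, which is the case the leaves rule is meant to cover.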

Basic Techniques

Summary and Challenges
► Summary: Ontology matching and alignment is the process of finding the common, or most closely corresponding, structural and semantic terms across 2 or more different ontologies, structures or schemas. Efficient and more complex algorithms for matching and alignment generation can be developed from the basic techniques of the matching process.
► Challenges: Developing generic and highly efficient matching and alignment generation algorithms.

Thank You
