Logics for Data and Knowledge Representation (LDKR): Semantic Matching


Outline
- Introduction
- Matching Problems
- Graph Matching
- Syntactic vs. Semantic
- Semantic Relations
- Semantic Matching in ClassL

Introduction
- An important problem on graphs, which we will apply to richer representations such as concept hierarchies, classifications, schemas and ontologies, is “matching.”
- This problem is also popular in semantic networks, where one may want to check whether a particular concept is present.

Date: a matching for marriage?

Introduction (cont’d)
- Some popular situations that can be modeled as a matching problem are:
  - Concept matching in semantic networks.
  - Schema matching in distributed databases.
  - Ontology matching (ontology “alignment”) in the Semantic Web.

The Semantic Matching Problem (Example)
[Figure: example structures whose candidate correspondences are marked “?”.]

Outline
- Introduction
- Matching Problems
- Graph Matching
- Syntactic vs. Semantic
- Semantic Relations
- Classification matching
- L.O. and ClassL
- Semantic Matching in ClassL

Matching Problems
- There are two kinds of matching:
  - Syntactic: matching of nodes as objects or strings (that is, as such, without meaning).
  - Semantic: matching of nodes as concepts.
- A Matching Problem (syntactic or semantic) is a problem on graphs, summarized as: given two finite graphs, is there a matching between the (nodes of the) two graphs?

Matching Problems
- A matching problem can be decomposed into two steps:
  1. Extract the graphs from the conceptual models under consideration;
  2. Match the resulting graphs.
- Below we show some examples of step 1. (We follow [Giunchiglia & Shvaiko, 2007].)

Relational DB Schemas
- Let us consider the following relational database (RDB) model, say “BANK” (Giunchiglia & Shvaiko, 2007):
[Figure: the RDB model “BANK”.]

Relational DB Schemas: Representation 1
- We can represent the RDB model “BANK” as a graph (a tree) with root “BANK” (Giunchiglia & Shvaiko, 2007).
- The RDB model is first partitioned into relations, then into attributes and data instances.

Relational DB Schemas: Representation 2
- We can represent the RDB model “BANK” as a graph (a tree) with root “BANK” (Giunchiglia & Shvaiko, 2007).
- The model is partitioned into relations, then into tuples, attributes and data instances.

Relational DB Schemas: NOTEs
- Which of the two representations is preferable depends on the concrete task.
- It is always possible to transform one representation into the other.
- In contrast to the example of RDB “BANK”, DB schemas are seldom trees. More often, DB schemas are translated into Directed Acyclic Graphs (DAGs).

OODB Schemas
- Let us consider the RDB “BANK” in terms of an object-oriented DB (OODB) schema (Giunchiglia & Shvaiko, 2007):
  BRANCH (Street, City, Zip)
  PERSON (F_Name, L_Name)
  STAFF : PERSON (Position, Salary, Manager)
- The resulting graph is:
[Figure: graph of the OODB schema.]

OODB Schemas: NOTEs
- OODB schemas capture more semantics than relational DB schemas.
- In particular, an OODB schema:
  - explicitly expresses subsumption relations between elements;
  - admits special types of arcs for part/whole relationships in terms of aggregation and composition.

RDB vs. OODB

Semi-structured Data
- Neither RDBs nor OODBs capture all the features of semi-structured or unstructured data (Buneman, 1997):
  - semi-structured data do not possess a regular structure (they are schemaless);
  - the “structure” of semi-structured data could be partial or even implicit.
- Typical examples are HTML and XML.

XML Schemas
- XML schemas can be represented as DAGs.
- The graph from the RDB “BANK” could also be obtained from an XML schema (Giunchiglia & Shvaiko, 2007).

XML Schemas: NOTEs
- Often XML schemas represent hierarchical data models.
- In this case the only relationships between the elements are {is-a}.
- Attributes in XML are used to represent extra information about data. There are no strict rules telling us when data should be represented as elements or as attributes.

Concept Hierarchies
- Def. A concept hierarchy is a semi-formal conceptualization of an application domain in terms of concepts and relationships (Giunchiglia & Shvaiko, 2007).

Concept Hierarchies: NOTEs
- Examples are classification hierarchies, e.g., Web directories (catalogs).
- Classification hierarchies / Web directories are sometimes referred to as lightweight ontologies (Uschold & Gruninger, 2004). However:
  - They are not ontologies, as they lack a formal semantics (semi-formal vs. formal).
  - They don’t formalize class instances.

Semantic Matching
- Introduction
- Matching Problems
- Graph Matching
- Syntactic vs. Semantic
- Semantic Relations
- Semantic Matching via SAT in ClassL

How to deal with this heterogeneity?
- Matching: given two graph-like structures (e.g., concept hierarchies or ontologies), produce a mapping between the nodes of the graphs that semantically correspond to each other.

Syntactic vs. Semantic
Matching is either syntactic or semantic:
- Syntactic Matching: relations are computed between labels at nodes; R = {x ∈ [0,1]}. Note: all previous systems are syntactic…
- Semantic Matching: relations are computed between concepts at nodes; R = { =, ⊑, ⊒, ⊥, ⊓ }. Note: needed for proper labeling of context mappings.
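The difference between the two kinds of result can be sketched in a few lines. This is only an illustration under stated assumptions: string similarity from Python's standard `difflib` stands in for a syntactic matcher, and the tiny `SYNONYMS` table is a hypothetical substitute for real background knowledge.

```python
from difflib import SequenceMatcher


def syntactic_match(label1: str, label2: str) -> float:
    """Similarity of two labels as strings: a coefficient in [0, 1]."""
    return SequenceMatcher(None, label1.lower(), label2.lower()).ratio()


# Hypothetical background knowledge: pairs of labels known to be synonyms.
SYNONYMS = {frozenset({"images", "pictures"})}


def semantic_match(label1: str, label2: str) -> str:
    """Relation between the concepts denoted by the labels ('=' or 'unknown')."""
    a, b = label1.lower(), label2.lower()
    if a == b or frozenset({a, b}) in SYNONYMS:
        return "="
    return "unknown"


score = syntactic_match("Images", "Pictures")   # a number below 1
relation = semantic_match("Images", "Pictures")  # a relation symbol
```

The syntactic matcher only reports how alike the strings look, while the semantic matcher returns a relation between meanings, which is why the two labels can be equivalent despite a low string similarity.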

Semantic Matching
A mapping element is a 4-tuple <ID_ij, n1_i, n2_j, R>, where
- ID_ij is a unique identifier of the given mapping element;
- n1_i is the i-th node of the first graph;
- n2_j is the j-th node of the second graph;
- R specifies a semantic relation between the concepts at the given nodes.
Semantic Matching: given two graphs G1 and G2, for any node n1_i ∈ G1, find the strongest semantic relation R’ holding with node n2_j ∈ G2.
Computed R’s, listed in decreasing binding strength order:
- equivalence { = };
- more general/specific { ⊒, ⊑ };
- mismatch { ⊥ };
- overlapping { ⊓ }.
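A mapping element and the strength ordering can be written down directly as a data structure. The sketch below is illustrative only: the class name, field names and the `strongest` helper are assumptions, and ⊒ and ⊑ (which have equal strength in the list above) are simply given adjacent positions.

```python
from dataclasses import dataclass

# Relations in decreasing binding strength, following the list above.
STRENGTH_ORDER = ["=", "⊒", "⊑", "⊥", "⊓"]


@dataclass(frozen=True)
class MappingElement:
    id: str        # unique identifier ID_ij
    node1: str     # i-th node of the first graph
    node2: str     # j-th node of the second graph
    relation: str  # semantic relation R between the concepts at the nodes


def strongest(relations):
    """Pick the strongest relation among those found to hold."""
    return min(relations, key=STRENGTH_ORDER.index)


element = MappingElement("1_1", "Images", "Pictures", strongest(["⊓", "="]))
```

Keeping the order in one list means a matcher can test relations from strongest to weakest and stop at the first one that holds.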

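Example
[Figure: two trees to be matched, one with nodes Images, Europe, Italy, Austria and one with nodes Pictures, Europe, Wine and Cheese, Italy, Austria; candidate relations between nodes are marked “?” and “=”.]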

S-Match Algorithm: Four Macro Steps
Given two labeled trees T1 and T2, do:
1. For all labels in T1 and T2, compute concepts at labels.
2. For all nodes in T1 and T2, compute concepts at nodes.
3. For all pairs of labels in T1 and T2, compute relations between concepts at labels.
4. For all pairs of nodes in T1 and T2, compute relations between concepts at nodes.
Steps 1 and 2 constitute the preprocessing phase, and are executed once and then each time after the schema/ontology is changed (OFF-LINE part).
Steps 3 and 4 constitute the matching phase, and are executed every time the two schemas/ontologies are to be matched (ON-LINE part).

Step 1: compute concepts at labels
- The idea:
  - Translate natural language expressions into an internal formal language.
  - Compute concepts based on the possible senses of the words in a label and their interrelations.
- Preprocessing:
  - Tokenization. Labels are parsed into tokens (according to punctuation, spaces, etc.). E.g., Wine and Cheese → <Wine, and, Cheese>;
  - Lemmatization. Tokens are morphologically analyzed in order to find all their possible basic forms. E.g., Images → Image;
  - Building atomic concepts. An oracle (WordNet) is used to extract the senses of lemmatized tokens. E.g., Image has 8 senses, 7 as a noun and 1 as a verb;
  - Building complex concepts. Prepositions, conjunctions, etc. are translated into logical connectives and used to build complex concepts out of the atomic concepts. E.g., C_Wine and Cheese = C_Wine ⊔ C_Cheese, where each atomic concept is built from the union ⋃ of the senses that WordNet attaches to the lemmatized token.
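The preprocessing chain can be sketched as follows. This is a toy version under explicit assumptions: the `LEMMAS` table is a hypothetical stand-in for a real lemmatizer, WordNet sense lookup is omitted entirely, and only the connective “and” is handled, so it covers labels of the form used in the example and nothing more.

```python
import re

# Hypothetical stand-in for a morphological analyzer.
LEMMAS = {"images": "image", "pictures": "picture"}
# In labels, "and" is translated into disjunction, as in the example above.
CONNECTIVES = {"and": "⊔"}


def tokenize(label: str) -> list[str]:
    """Split a label into lowercase tokens on spaces and punctuation."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", label)]


def lemmatize(token: str) -> str:
    """Return the basic form of a token, or the token itself if unknown."""
    return LEMMAS.get(token, token)


def concept_at_label(label: str) -> str:
    """Build a formula string: content words become atoms, connectives operators."""
    parts = []
    for token in tokenize(label):
        if token in CONNECTIVES:
            parts.append(CONNECTIVES[token])
        else:
            parts.append(f"C_{lemmatize(token)}")
    return " ".join(parts)


wine_and_cheese = concept_at_label("Wine and Cheese")
```

In a fuller implementation each atom `C_x` would carry the set of WordNet senses of `x` rather than just a name.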

Step 2: compute concepts at nodes
- The idea: extend concepts at labels by capturing the knowledge residing in the structure of the graph, in order to define the context in which the given concept at a label occurs.
- Computation (basic case): the concept at a node n is computed as the intersection of the concepts at labels located above the given node, including the node itself.
- Example (tree with nodes Pictures, Europe, Wine and Cheese, Italy, Austria): C4 = C_Europe ⊓ C_Pictures ⊓ C_Italy
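The basic case amounts to walking from a node up to the root and conjoining the labels met on the way. In the sketch below the path Pictures, Europe, Italy follows the C4 example; the placement of the other nodes under Europe and the `PARENT` encoding are assumptions made for illustration.

```python
# Tree given as child -> parent; the root maps to None.
# Only the path Pictures > Europe > Italy is confirmed by the C4 example;
# the positions of the remaining nodes are assumed.
PARENT = {
    "Pictures": None,
    "Europe": "Pictures",
    "Wine and Cheese": "Europe",
    "Italy": "Europe",
    "Austria": "Europe",
}


def path_to_root(node):
    """Labels from the root down to `node`, including `node` itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def concept_at_node(node):
    """Conjunction of the concepts at labels on the path from the root."""
    return " ⊓ ".join(f"C_{label}" for label in path_to_root(node))


c4 = concept_at_node("Italy")
```

Since ⊓ is commutative, the order of the conjuncts differs from the slide's C4 only in presentation.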

Step 3: compute relations between concepts at labels
- The idea: exploit a priori knowledge (e.g., lexical or domain knowledge) with the help of element-level semantic matchers.
- Results of step 3:
[Figure: the two trees T1 and T2 and the matrix of relations between their concepts at labels (C_Images, C_Pictures, C_Europe, C_Italy, C_Austria, C_Wine, C_Cheese). The matrix contains “=” entries, e.g., C_Images = C_Pictures and C_Europe = C_Europe; the remaining entries are not recoverable from the transcript.]

Step 4? Wait!
- What have we done in Step 3? Found semantic relations.
- What for? To build the base of semantic relations for further matching.
- What is a ‘relation base’ in the logical sense? A TBox! We are building the TBox!

Step 4: compute relations between concepts at nodes
The idea: reduce the matching problem to a validity problem.
- Context: the relations between concepts at labels (from step 3).
- Wffrel(C1i, C2j): the relation to be proved (with C1i in Tree1 and C2j in Tree2).
- Translate into propositional logic:
  - C1i = C2j is translated into C1i ↔ C2j;
  - C1i ⊑ C2j (C1i is less general than C2j) is translated into C1i → C2j, and C1i ⊒ C2j into C2j → C1i;
  - C1i ⊥ C2j is translated into ¬(C1i ∧ C2j).
- Prove that Context → Wffrel(C1i, C2j) is valid, with Rel = {=, ⊑, ⊒, ⊥, ⊓}.
- A propositional formula is valid iff its negation is unsatisfiable (SAT deciders are sound and complete…).
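The reduction can be tried out on small formulas with a brute-force truth table in place of a SAT solver. This is a sketch under that assumption: `is_valid`, `holds` and the `WFF` table are illustrative names, formulas are encoded as Python functions over an assignment dictionary, and the approach is exponential in the number of variables.

```python
from itertools import product


def is_valid(formula, variables):
    """A formula is valid iff no assignment satisfies its negation."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if not formula(env):  # this assignment satisfies the negation
            return False
    return True


def implies(a, b):
    return (not a) or b


# Propositional translation of each candidate relation between C1 and C2.
WFF = {
    "=": lambda c1, c2: c1 == c2,          # C1 ↔ C2
    "⊑": lambda c1, c2: implies(c1, c2),   # C1 → C2
    "⊒": lambda c1, c2: implies(c2, c1),   # C2 → C1
    "⊥": lambda c1, c2: not (c1 and c2),   # ¬(C1 ∧ C2)
}


def holds(relation, context, c1, c2, variables):
    """Check that Context → Wffrel(C1, C2) is valid."""
    return is_valid(
        lambda env: implies(context(env), WFF[relation](c1(env), c2(env))),
        variables,
    )


# Toy usage: the context states A ⊑ B for two atomic concepts A and B.
ctx = lambda env: implies(env["A"], env["B"])
a = lambda env: env["A"]
b = lambda env: env["B"]
a_below_b = holds("⊑", ctx, a, b, ["A", "B"])
```

A real matcher would hand the negated formula to a SAT decider instead of enumerating assignments.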

Pseudo code of Step 4
i, j, N1, N2: int;
context, goal: wff;
n1, n2: node;
T1, T2: tree of (node);
relation = {=, ⊑, ⊒, ⊥};
ClabMatrix(N1, N2), CnodMatrix(N1, N2), relation: relation

function mkCnodMatrix(T1, T2, ClabMatrix) {
  for (i = 0; i < N1; i++) do
    for (j = 0; j < N2; j++) do
      CnodMatrix(i, j) := NodeMatch(T1(i), T2(j), ClabMatrix)
}

function NodeMatch(n1, n2, ClabMatrix) {
  context := mkcontext(n1, n2, ClabMatrix, context);
  foreach (relation in <=, ⊑, ⊒, ⊥>) do {
    goal := w2r(mkwff(relation, GetCnod(n1), GetCnod(n2)));
    if VALID(mkwff(→, context, goal))
      return relation;
  }
  return IDK;
}

Validity reasoning: Context ⊨ goal?

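Example
- Suppose we want to check if C1_2 = C2_2.
- Context: (C1_Images ↔ C2_Pictures) ∧ (C1_Europe ↔ C2_Europe)
- Goal: C1_2 ↔ C2_2
- Formula to be proved valid: (C1_Images ↔ C2_Pictures) ∧ (C1_Europe ↔ C2_Europe) → (C1_2 ↔ C2_2)
[Figure: trees T1 (nodes C1_1 to C1_4) and T2 (nodes C2_1 to C2_5) with the relations found so far.]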
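The check in this example can be replayed with a small node-matching loop that tries relations from strongest to weakest. The sketch makes several assumptions: validity is tested by enumerating truth assignments rather than by a SAT solver, node 2 of each tree is taken to be the Europe node under the root (so C1_2 = C1_Images ⊓ C1_Europe and C2_2 = C2_Pictures ⊓ C2_Europe), and all Python names are illustrative.

```python
from itertools import product

VARS = ["Images", "Pictures", "Europe1", "Europe2"]


def context(e):
    # Relations between concepts at labels, as produced by step 3.
    return (e["Images"] == e["Pictures"]) and (e["Europe1"] == e["Europe2"])


def c1_2(e):  # assumed concept at node 2 of T1: Images ⊓ Europe
    return e["Images"] and e["Europe1"]


def c2_2(e):  # assumed concept at node 2 of T2: Pictures ⊓ Europe
    return e["Pictures"] and e["Europe2"]


GOALS = [  # tried in decreasing binding strength
    ("=", lambda a, b: a == b),
    ("⊑", lambda a, b: (not a) or b),
    ("⊒", lambda a, b: (not b) or a),
    ("⊥", lambda a, b: not (a and b)),
]


def node_match(context, n1, n2):
    """Return the first relation for which context → goal is valid, else IDK."""
    assignments = [
        dict(zip(VARS, values))
        for values in product([False, True], repeat=len(VARS))
    ]
    for name, goal in GOALS:
        if all((not context(e)) or goal(n1(e), n2(e)) for e in assignments):
            return name
    return "IDK"


result = node_match(context, c1_2, c2_2)
```

With this context the loop stops at the first candidate, so `result` is `"="`; with an empty context and two unrelated atoms it falls through to `"IDK"`.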

Examples
[Figure: four copies of the two trees T1 and T2 (nodes Images, Europe, Italy, Austria and Pictures, Europe, Wine and Cheese, Italy, Austria), each highlighting the relation computed for one pair of nodes; only the “=” symbol is recoverable from the transcript.]

Another Example: Concept Hierarchies (Serafini et al., 2003)

Example (cont’d)
- Suppose we want to discover the relation R between Chat and Forum in the Google directory (left) and Chat and Forum in the Yahoo directory (right).

Example (cont’d)
- Step 1: transformation of the nodes n1 = Chat and Forum and n2 = Chat and Forum into propositions P and Q; selection of the portion T of knowledge (background theory) relevant to the application of a SAT solver.
- WordNet is used at this step to build T.
- Step 2: selection of the relation R between n1 and n2 among the semantic relations of interest.

Example (cont’d)
- Step 3: The question “Is Chat and Forum less general than Chat and Forum?” becomes the SAT problem “Is T |= P ⊑ Q?”, where:
  P = art#1 ⊓ literature#2 ⊓ (chat#1 ⊔ forum#1),
  Q = (art#1 ⊔ humanities#1) ⊓ humanities#1 ⊓ (chat#1 ⊔ forum#1),
  T = { art#1 ⊑ humanities#1, humanities#1 ⊒ literature#2 }
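The entailment in this last step is small enough to check by enumerating all truth assignments. The sketch below assumes the propositional reading used earlier (⊓ as “and”, ⊔ as “or”, ⊑ as implication), drops the WordNet sense numbers from the atom names, and uses a hypothetical `counter_models` helper in place of a SAT solver.

```python
from itertools import product

ATOMS = ["art", "literature", "humanities", "chat", "forum"]


def T(e):  # background theory: art ⊑ humanities, literature ⊑ humanities
    return ((not e["art"]) or e["humanities"]) and (
        (not e["literature"]) or e["humanities"]
    )


def P(e):
    return e["art"] and e["literature"] and (e["chat"] or e["forum"])


def Q(e):
    return (
        (e["art"] or e["humanities"])
        and e["humanities"]
        and (e["chat"] or e["forum"])
    )


def counter_models(premise, conclusion):
    """Assignments that satisfy T and the premise but falsify the conclusion."""
    found = []
    for values in product([False, True], repeat=len(ATOMS)):
        e = dict(zip(ATOMS, values))
        if T(e) and premise(e) and not conclusion(e):
            found.append(e)
    return found


p_below_q = not counter_models(P, Q)  # T ⊨ P ⊑ Q ?
q_below_p = not counter_models(Q, P)  # T ⊨ Q ⊑ P ?
```

Under this encoding `p_below_q` is `True` and `q_below_p` is `False`, so P is less general than Q without being equivalent to it; any element of `counter_models(Q, P)` is a concrete witness for the failed direction.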

References
- F. Giunchiglia, P. Shvaiko. “Semantic Matching.” Knowledge Engineering Review, 18(3).
- L. Serafini, P. Bouquet, B. Magnini, S. Zanobini. “An Algorithm for Matching Contextualized Schemas via SAT.” IRST Technical Report, ITC, January.
- F. Giunchiglia, M. Marchese, I. Zaihrayeu. “Encoding Classifications into Lightweight Ontologies.” Journal of Data Semantics VIII, Springer-Verlag LNCS 4380, pp. 57-81.