Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Eduard C. Dragut Ramon Lawrence Eduard C. Dragut Ramon Lawrence.

Slides:



Advertisements
Similar presentations
Schema Matching and Query Rewriting in Ontology-based Data Integration Zdeňka Linková ICS AS CR Advisor: Július Štuller.
Advertisements

So What Does it All Mean? Geospatial Semantics and Ontologies Dr Kristin Stock.
Presented by: Thabet Kacem Spring Outline Contributions Introduction Proposed Approach Related Work Reconception of ADLs XTEAM Tool Chain Discussion.
Page 1 Integrating Multiple Data Sources using a Standardized XML Dictionary Ramon Lawrence Integrating Multiple Data Sources using a Standardized XML.
ISBN Chapter 3 Describing Syntax and Semantics.
An Extensible System for Merging Two Models Rachel Pottinger University of Washington Supervisors: Phil Bernstein and Alon Halevy.
Interactive Generation of Integrated Schemas Laura Chiticariu et al. Presented by: Meher Talat Shaikh.
Semantic description of service behavior and automatic composition of services Oussama Kassem Zein Yvon Kermarrec ENST Bretagne France.
Merging Models Based on Given Correspondences Rachel A. Pottinger Philip A. Bernstein.
Dynamic Ontologies on the Web Jeff Heflin, James Hendler.
1 Draft of a Matchmaking Service Chuang liu. 2 Matchmaking Service Matchmaking Service is a service to help service providers to advertising their service.
A Review of Ontology Mapping, Merging, and Integration Presenter: Yihong Ding.
PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment Natalya Fridman Noy and Mark A. Musen.
Integrating data sources on the World-Wide Web Ramon Lawrence and Ken Barker U. of Manitoba, U. of Calgary
Annotating Documents for the Semantic Web Using Data-Extraction Ontologies Dissertation Proposal Yihong Ding.
SEQUOIAS YR-SOC'07 - Leicester June A NOVEL APPROACH TO WEB SERVICES DISCOVERY Marco Comerio Università di Milano-Bicocca
By ANDREW ZITZELBERGER A Framework for Extraction Ontology Based Information Management.
1 CIS607, Fall 2005 Semantic Information Integration Presentation by Zebin Chen Week 7 (Nov. 9)
PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment Natalya F. Noy and Mark A. Musen.
Kmi.open.ac.uk Semantic Execution Environments Service Engineering and Execution Barry Norton and Mick Kerrigan.
Page 1 Multidatabase Querying by Context Ramon Lawrence, Ken Barker Multidatabase Querying by Context.
1 CIS607, Fall 2004 Semantic Information Integration Presentation by Xiaofang Zhang Week 7 (Nov. 10)
Describing Syntax and Semantics
Automatic Data Ramon Lawrence University of Manitoba
INTEGRATION INTEGRATION Ramon Lawrence University of Iowa
ANHAI DOAN ALON HALEVY ZACHARY IVES Chapter 6: General Schema Manipulation Operators PRINCIPLES OF DATA INTEGRATION.
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
10 December, 2013 Katrin Heinze, Bundesbank CEN/WS XBRL CWA1: DPM Meta model CWA1Page 1.
Knowledge Mediation in the WWW based on Labelled DAGs with Attached Constraints Jutta Eusterbrock WebTechnology GmbH.
OMAP: An Implemented Framework for Automatically Aligning OWL Ontologies SWAP, December, 2005 Raphaël Troncy, Umberto Straccia ISTI-CNR
Semantic Interoperability Jérôme Euzenat INRIA & LIG France Natasha Noy Stanford University USA.
RDF (Resource Description Framework) Why?. XML XML is a metalanguage that allows users to define markup XML separates content and structure from formatting.
Ontology Alignment/Matching Prafulla Palwe. Agenda ► Introduction  Being serious about the semantic web  Living with heterogeneity  Heterogeneity problem.
 Copyright 2005 Digital Enterprise Research Institute. All rights reserved. Towards Translating between XML and WSML based on mappings between.
BACKGROUND KNOWLEDGE IN ONTOLOGY MATCHING Pavel Shvaiko joint work with Fausto Giunchiglia and Mikalai Yatskevich INFINT 2007 Bertinoro Workshop on Information.
An Introduction to Description Logics. What Are Description Logics? A family of logic based Knowledge Representation formalisms –Descendants of semantic.
Automatic Lexical Annotation Applied to the SCARLET Ontology Matcher Laura Po and Sonia Bergamaschi DII, University of Modena and Reggio Emilia, Italy.
A Z Approach in Validating ORA-SS Data Models Scott Uk-Jin Lee Jing Sun Gillian Dobbie Yuan Fang Li.
PART IV: REPRESENTING, EXPLAINING, AND PROCESSING ALIGNMENTS & PART V: CONCLUSIONS Ontology Matching Jerome Euzenat and Pavel Shvaiko.
Ming Fang 6/12/2009. Outlines  Classical logics  Introduction to DL  Syntax of DL  Semantics of DL  KR in DL  Reasoning in DL  Applications.
Page 1 Composing Mappings between Schemas using a Reference Ontology - ODBASE’04 - Eduard Dragut, Ramon Lawrence Composing Mappings between Schemas using.
Building an Ontology of Semantic Web Techniques Utilizing RDF Schema and OWL 2.0 in Protégé 4.0 Presented by: Naveed Javed Nimat Umar Syed.
Intelligent Database Systems Lab Presenter: WU, JHEN-WEI Authors: Rodrigo RizziStarr, Jose´ Maria Parente de Oliveira IS Concept maps as the first.
Mining and Analysis of Control Structure Variant Clones Guo Qiao.
More on “The Huddersfield Method” A lightweight, pattern-driven method based on SSM, Domain Driven Design and Naked Objects.
Semantic Enrichment of Ontology Mappings: A Linguistic-based Approach Patrick Arnold, Erhard Rahm University of Leipzig, Germany 17th East-European Conference.
Methodology - Conceptual Database Design. 2 Design Methodology u Structured approach that uses procedures, techniques, tools, and documentation aids to.
Metadata. Generally speaking, metadata are data and information that describe and model data and information For example, a database schema is the metadata.
Dimitrios Skoutas Alkis Simitsis
Merging Source Query Interfaces on Web Databases Eduard C. Dragut (speaker) Wensheng Wu Prasad Sistla Clement Yu Weiyi Meng Eduard C. Dragut (speaker)
A Classification of Schema-based Matching Approaches Pavel Shvaiko Meaning Coordination and Negotiation Workshop, ISWC 8 th November 2004, Hiroshima, Japan.
Object Oriented Multi-Database Systems An Overview of Chapters 4 and 5.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Knowledge Representation. Keywordsquick way for agents to locate potentially useful information Thesaurimore structured approach than keywords, arranging.
ece 627 intelligent web: ontology and beyond
Presented by Kyumars Sheykh Esmaili Description Logics for Data Bases (DLHB,Chapter 16) Semantic Web Seminar.
Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,
Ontology Technology applied to Catalogues Paul Kopp.
Of 24 lecture 11: ontology – mediation, merging & aligning.
Mechanisms for Requirements Driven Component Selection and Design Automation 최경석.
COP Introduction to Database Structures
The Semantic Web By: Maulik Parikh.
Cross-Ontological Relationships
Web Ontology Language for Service (OWL-S)
Ontology.
CSc4730/6730 Scientific Visualization
[jws13] Evaluation of instance matching tools: The experience of OAEI
Block Matching for Ontologies
Information Networks: State of the Art
SECTION 4: OO METHODOLOGIES
Presentation transcript:

Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Eduard C. Dragut Ramon Lawrence Eduard C. Dragut Ramon Lawrence University of Illinois at Chicago University of British Columbia Okanagan University of Illinois at Chicago University of British Columbia Okanagan ODBASE 2006, Montpellier, France

Page 2 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Talk Overview  Introduction  Background  Model and Mapping representation systems  Proposed Mapping Representation System  Invert and Compose operator definitions and properties  Mappings Composition  Experiment  Estimate the quality of the proposed system

Page 3 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships  Models  denote a representation of a domain in a formal language (e.g., EER, Relational, Description Logic)  has two components [Russell et al 2003]  terminological (or metadata)  This is the focus of this work and talk.  extensional (i.e. facts or instances)  Mappings  describe how two models are related to each other Introduction - Terminology

Page 4 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships  Ways to define mappings between models  binary relationships morphismsinter-schema correspondences  called morphisms [Melnik et al 2003] or inter-schema correspondences [Popa et al. 2002]  mapping using a helper model  [Bernstein et al. 2003]  mapping as queries  [Madhavan et al. 2003, Berstein et al. 2006]  Our work falls in the class of the first two types of mappings. metadata level mappings.  We call them metadata level mappings.  They are not concerned with the instances of a model. Introduction - Mappings

Page 5 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships  Examples of models  diagrams, interface definitions, database schemas, web site layouts, control flow, XML schemas  Applications of mappings  mapping between XML schemas to drive message translation;  schema and database integration;  mapping between ontologies to help in the process of merging and alignment Introduction - Examples

Page 6 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships  The creation will be rarely completely automated.  General strategy is to semi-automatically build mappings matchings  use heuristics to generate matchings (e.g. name similarity)  [Rahm and Bernstein 2001, Shvaiko and Euzenat 2005] (surveys)  translate matches into formulas  E.g., Clio project [Popa et al. 2002]  generate new mappings from existing mappings  Composition  E.g, [Madhavan et al. 2003, Berstein et al. 2006]  Invert  E.g, [Fagin 2006]  Semi-automatic tools can significantly speed up the process. Background - Mapping creation

Page 7 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Background - Morphisms  Mapping:  is just a set of binary relations between the elements of two models  is a set of pairs  Advantages/Disadvantages  their expressiveness is enough for certain classes of problems and they exhibit certain mathematical properties [Melnik et al. 2003]  main drawback  assumes similarity to be transitive

Page 8 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Background – Morphisms problems  Composition  ○ =  due to transitivity assumption  Problems with this technique  Whenever m:1 correspondence is composed with a 1:n correspondence, the composition result is a cross-product; many being false positives.  It may miss or suggest false relationships.  Legend:  Blue  correct  Red  false positive or missed

Page 9 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Background – Mapping with helper models  Algorithm (for right compose) [Bernstein et al 2002]  copy the right hand side mapping  for each mapping element, m, on the right, i.e. in map2  compute its Input(m)  for each mapping element, m, on the right, i.e. in map2  set its domain to the union of the domains of Input(m)  Example:

Page 10 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Background – Mapping with helper models  Composition result  Problems with this technique  It may miss or suggest false relationships.  Legend:  Red  missed relationships

Page 11 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships The Objectives driving motivation  The driving motivation  The need for a mapping definition subsuming the relationship kinds that the state of the art matching algorithms discover with high precision.  Investigate to what extent a set of operations over this mapping definition can be defined. a mapping representation  Provide a mapping representation at the metadata level combining the advantages of morphisms and mappings with helper models.  The former has good mathematical properties.  The latter is more expressive. a compose algorithm  Provide a compose algorithm that exploits the semantic relationships within the mapping expression to produce correct semantic relationships whenever these can be determined automatically and to isolate those instances that require human intervention.

Page 12 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Proposed Mappings Representation  Model  A model has similar expressiveness as an EER model and is consistent with the definition of model used in previous work on model management. [Bernstein et al 2002, Pottinger and Bernstein 2003]  Mapping Representation directed, kinded binary relationship  A mapping consists of a set of mapping elements, each mapping element is a directed, kinded binary relationship between a pair of elements not in the same model:  Triplets of form, type = {IsA, AKindOf, HasA, PartOf, =, Contains, ContainedBy, Unknown, Complex}  Comments  Some of these types were introduced in other works.  E.g, [Euzenat 2004, Giunchiglia et al. 2004, Pottinger and Bernstein 2003, Xu and Embley 2003, Wu et al. 2004]

Page 13 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Proposed Mappings Representation  An example Unknown Complex  Most of the relationship kinds in the mapping representation are well- known except for Unknown and Complex , means that the relationship between concept a and b is not precisely known. , the relationship between concept a and b may require a functional specification: a = f(b)  e.g., Price = PriceVat(VAT + 1)

Page 14 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Operators - Invert  Each of the relationship types introduced have well defined inversion properties:  IsA inverted is AKindOf, HasA inverted is PartOf, Contains inverted is ContainedBy invert for mapping elements  Definition [ invert for mapping elements ]:  Consider m =. Then its corresponding inverted mapping element, denoted m -1, is given by the following expression:  Mathematical form: -1 =  E.g. -1 = invert for mappings  Definition [ invert for mappings ]:  Given two models A and B and a mapping, map, between them, the invert of map denoted by map -1, is defined from B to A and its expression is given by: map -1 = { | map}

Page 15 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Operators - Compose  Composing two mappings involves defining a composition operation between the elements of the mappings (i.e. between triplets of form )  Example  ○ =

Page 16 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Compose Properties  Remarks:  The result of composing two mappings where mapping elements are expressed as triplets is closed.  Mapping composition is symmetric in this framework: ( ○ ) -1 = ○ does not produce false correspondences  The result of composing two mappings does not produce false correspondences between the elements of the two models, i.e. it does not suggest false directed, kinded relationships.  The Compose operator uses the Unknown relationship to indicate when it is not possible (in general) to suggest a relationship type given only the information expressed in the two mappings.

Page 17 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Experiment - Setup  Experiment goal:  Show that the composition framework is robust when applied to real world application and that we are able to correctly identify problematic cases.  We compare it against mappings as morphisms.  Five real-world XML schemas in the purchase order domain: CIDR, Excel, Noris, Paragon, and Apertum from  They were used in other projects:  [Dragut and Lawrence 2004, Madhavan et al. 2001]  And a reference ontology  to which each XML schema is manually mapped both using morphisms [Dragut and Lawrence 2004] and using the new mapping definition.

Page 18 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Experiment - Setup  Example of XML schemas:  XML Excel and CIDR schemas

Page 19 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Experiment - Intermediary model  Comments:  The intermediary model does not have all concepts in the schemas (e.g. unitOfMeasure, count, and VAT).  The intermediary model is structurally different from the five schemas considered and it is defined using OWL.

Page 20 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Experiment - Methodology  Step 1: map the five schemas to the intermediary model:  First, using morphisms  Second, using the proposed mapping  Step 2: apply the compose operators to compute direct mappings between the schemas  First, employing composition over morphisms  Second, using the new compose operator  Step 3: measure the quality of the two compositions in terms of Precision, Recall, and Overall User Effort.  A new metric is introduced User Effort.

Page 21 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Experiment - Stats  Overall after composition was computed  CIDR, Excel, Noris, Paragon, and Apertum are assigned numbers 1, 2, 3, 4, and 5 respectively.

Page 22 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships  User effort  User effort is the % of mappings that must be validated by a user.  For morphisms, user effort is 100% as there is no way to distinguish true over false relationships. average  In our framework, it is the ratio of the number of Unknown relationships to the number of all produced relationships. On average it is only 19%. Experiment - Stats

Page 23 E. Dragut and R. Lawrence - Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships End Thank you for your time and patience!