A SEMANTIC APPROACH TO DISCOVERING SCHEMA MAPPING
Yuan An, Alex Borgida, Renee J. Miller, and John Mylopoulos
Presented by: Kristine Monteith



OVERVIEW
Goal of the paper: match schemas using more than simple element correspondences (e.g., can we improve on a naïve mapping?)

OVERVIEW
Approach: derive a conceptual model for the semantics of each table, then match the conceptual model of the source schema to the conceptual model of the target schema
e.g., can we figure out that a source schema like this: [figure: source schema] can match a target schema like this: hasBookSoldAt(aname, sid)?

EXAMPLE 1

BASELINE SOLUTION: REFERENTIAL INTEGRITY CONSTRAINTS
Find correspondences
  v1: connect person.pname to hasBookSoldAt.aname
  v2: connect bookstore.sid to hasBookSoldAt.sid
Create logical relations using referential constraints
  S1: person(pname) |X| writes(pname, bid) |X| book(bid)
  S2: book(bid) |X| soldAt(bid, sid) |X| bookstore(sid)
  S3: person(pname)
  S4: bookstore(sid)
Look at target
  T1: hasBookSoldAt(aname, sid)
Look at each pair of source and target logical relations and check which correspondences it “covers”
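The coverage check on this slide can be sketched roughly as follows. This is a minimal illustration, not the implementation of any real mapping tool: the dictionaries and the `covered` helper are hypothetical names, and each logical relation is reduced to the set of qualified columns it exposes.

```python
# Logical relations from the slide, reduced to the columns they expose.
# (Hypothetical encoding, for illustration only.)
source_relations = {
    "S1": {"person.pname", "writes.pname", "writes.bid", "book.bid"},
    "S2": {"book.bid", "soldAt.bid", "soldAt.sid", "bookstore.sid"},
    "S3": {"person.pname"},
    "S4": {"bookstore.sid"},
}
target_relations = {"T1": {"hasBookSoldAt.aname", "hasBookSoldAt.sid"}}

# Correspondences v1 and v2 as (source column, target column).
correspondences = {
    "v1": ("person.pname", "hasBookSoldAt.aname"),
    "v2": ("bookstore.sid", "hasBookSoldAt.sid"),
}

def covered(source_cols, target_cols):
    """Return the correspondences whose two ends both appear in the pair."""
    return {
        name
        for name, (src, tgt) in correspondences.items()
        if src in source_cols and tgt in target_cols
    }

# Coverage for every (source, target) pair of logical relations.
coverage = {
    (s, t): covered(s_cols, t_cols)
    for s, s_cols in source_relations.items()
    for t, t_cols in target_relations.items()
}
```

In this encoding no single pair covers both v1 and v2, so the baseline cannot on its own populate a complete hasBookSoldAt tuple.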

ASK THE USER ABOUT THE FOLLOWING:
The baseline's suggestions do not present an entire tuple to match the target query: hasBookSoldAt(aname, sid)

WHAT THIS PAPER SEEKS TO ACCOMPLISH:
Generate the following: compose “writes” and “soldAt” to produce a new semantic connection between “person” and “bookstore”
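The composition described here amounts to joining the two relationship tables on the book identifier and projecting onto the author and store columns. A minimal sketch, with rows invented purely for illustration and a hypothetical `compose` helper:

```python
# Toy rows, invented for illustration only.
writes = [("alice", "b1"), ("bob", "b2")]              # (pname, bid)
sold_at = [("b1", "s1"), ("b1", "s2"), ("b2", "s2")]   # (bid, sid)

def compose(writes_rows, sold_at_rows):
    """Join writes and soldAt on bid, then project onto (pname, sid)."""
    return sorted({
        (pname, sid)
        for pname, bid_w in writes_rows
        for bid_s, sid in sold_at_rows
        if bid_w == bid_s
    })

# Each result row has the shape of a hasBookSoldAt(aname, sid) tuple.
has_book_sold_at = compose(writes, sold_at)
```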

APPROACH: REPRESENTING SEMANTICS OF SCHEMAS
Create a Conceptual Model (CM) graph
  Create nodes for classes and attributes
  Create directed edges for relationships and inverses
    C1 ---ISA--- C2     subclasses
    C ---p--- D         relationships
    C ---p->-- D        functional relationships
  Duplicate concept nodes to represent recursive relationships
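One way to hold such a graph in memory is sketched below. The `CMGraph` class and its method names are assumptions made for this illustration, not the authors' data structure; the class and relationship names are borrowed from the Project/Department/Employee example later in the slides.

```python
from dataclasses import dataclass, field

@dataclass
class CMGraph:
    """Hypothetical container for a conceptual model graph."""
    attributes: dict = field(default_factory=dict)  # class node -> set of attribute names
    edges: list = field(default_factory=list)       # (source, label, target, is_functional)

    def add_class(self, name, attrs=()):
        self.attributes.setdefault(name, set()).update(attrs)

    def add_relationship(self, src, label, dst,
                         functional=False, inverse_functional=False):
        # Store the relationship and its inverse as two directed edges.
        self.edges.append((src, label, dst, functional))
        self.edges.append((dst, label + "^-1", src, inverse_functional))

g = CMGraph()
for name in ("Project", "Department", "Employee"):
    g.add_class(name)
g.add_relationship("Project", "controlledBy", "Department", functional=True)
g.add_relationship("Department", "hasManager", "Employee", functional=True)

# Only the functional direction of each relationship is kept here.
functional_edges = [(s, p, d) for s, p, d, is_functional in g.edges if is_functional]
```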

GENERATING MAPPING CANDIDATES
Problem description
Inputs:
  A source relational schema S and a target relational schema T
  A conceptual model (G_S and G_T respectively) associated with each relational schema via table semantic mappings
  A set of correspondences L linking a set L(S) of columns in S to a set L(T) of columns in T
Goal:
  A pair of expressions which are “semantically similar” in terms of modeling the subject matter

MARKED NODES
The set L(S) of columns gives rise to a set C_S of marked class nodes in the graph G_S
Likewise, the set L(T) gives rise to a set C_T of marked class nodes in the graph G_T
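Deriving the marked nodes is essentially a lookup through the table semantics. In the sketch below the `table_semantics_S` dictionary, the class names, and the `marked_nodes` helper are all assumptions made for illustration:

```python
# Hypothetical table semantics: column -> (class node, attribute) in G_S.
table_semantics_S = {
    "person.pname": ("Person", "pname"),
    "book.bid": ("Book", "bid"),
    "bookstore.sid": ("Bookstore", "sid"),
}

# L(S): the source columns that take part in some correspondence.
L_S = {"person.pname", "bookstore.sid"}

def marked_nodes(columns, semantics):
    """Collect the class nodes whose attributes back the given columns."""
    return {semantics[col][0] for col in columns}

C_S = marked_nodes(L_S, table_semantics_S)
```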

BASIC ALGORITHM
Create conceptual subgraphs
  Find a subgraph D_1 connecting concept nodes in C_S, and a subgraph D_2 connecting concept nodes in C_T, such that D_1 and D_2 are “semantically similar”
Suggest possible mapping candidates
  Translate D_1 and D_2 into algebraic expressions E_1 and E_2 and return the triple as a mapping candidate

CREATING CONCEPTUAL SUBGRAPHS
Notice simple matches
  A node v in C_S corresponds to a node u in C_T when v and u have attributes that are associated with corresponding columns via the table semantics
More complicated rules
  The connections (v_1, v_2) and (u_1, u_2) should be “semantically similar” or at least “compatible” (cardinality constraints, relationships like “is-a” or “part of”)
Use edges from pre-selected trees
  Represent “intuitively meaningful” concepts
  Favor smaller trees (Occam’s razor)
Other considerations
  Favor lossless joins
  Reject contradictions

EXAMPLE
Looking for a functional tree with a root corresponding to the anchor Proj

EXAMPLE
Notice simple matches
Find a tree with minimal cost (edges in pre-selected trees don’t contribute to cost)
Find a tree containing the largest number of edges from the pre-selected trees
Project ---controlledBy->-- Department ---hasManager->-- Employee
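The cost rule on this slide can be illustrated with a shortest-path search in which pre-selected edges cost 0 and all other edges cost 1. This is a simplification chosen for the sketch (a single root-to-node path found with Dijkstra's algorithm), not the paper's tree-construction procedure, and the `hasLead` edge is invented to give the search an alternative to reject.

```python
import heapq

# Functional edges as (source, label, target).
edges = [
    ("Project", "controlledBy", "Department"),
    ("Department", "hasManager", "Employee"),
    ("Project", "hasLead", "Employee"),  # hypothetical edge, not from the slides
]
# Edges that belong to pre-selected trees contribute nothing to the cost.
preselected = {
    ("Project", "controlledBy", "Department"),
    ("Department", "hasManager", "Employee"),
}

def cheapest_path(root, goal):
    """Dijkstra over functional edges; pre-selected edges cost 0, others 1."""
    heap = [(0, 0, root, [])]  # (cost, tie-breaker, node, path so far)
    settled = set()
    counter = 1
    while heap:
        cost, _, node, path = heapq.heappop(heap)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for edge in edges:
            if edge[0] == node:
                step = 0 if edge in preselected else 1
                heapq.heappush(heap, (cost + step, counter, edge[2], path + [edge]))
                counter += 1
    return None

cost, path = cheapest_path("Project", "Employee")
labels = [label for _, label, _ in path]
```

With these inputs the search prefers the two pre-selected edges through Department over the single invented direct edge, matching the tree shown on the slide.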

MORE COMPLICATED EXAMPLE
Same answer: Project ---controlledBy->-- Department ---hasManager->-- Employee
Still looking for low-cost, minimal trees to connect Employee to Project

DEALING WITH N-ARY RELATIONS
StoreSells(Person, Product)

CONSIDERATIONS FOR REIFIED RELATIONSHIPS
A path of length 2 passing through a reified relationship node should be considered to be of length 1
The semantic category of a target tree rooted at a reified relationship induces preferences for similarly rooted (minimal) functional trees in the source (cardinality restrictions, number of roles, subclass relationship to a top-level ontology concept)
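The first rule can be sketched as an adjusted path-length function. The `effective_length` helper is a hypothetical name, and treating StoreSells as the reified node between Person and Product is an assumption based on the previous slide.

```python
def effective_length(path_nodes, reified):
    """Count edges on a path, counting the two edges through a reified
    relationship node as a single edge."""
    edge_count = len(path_nodes) - 1
    # Every interior reified node merges its incoming and outgoing edge.
    interior = path_nodes[1:-1]
    return edge_count - sum(1 for node in interior if node in reified)

reified_nodes = {"StoreSells"}
via_reified = effective_length(["Person", "StoreSells", "Product"], reified_nodes)
via_ordinary = effective_length(["Project", "Department", "Employee"], reified_nodes)
```

Here the path through StoreSells counts as length 1, while the ordinary two-edge path keeps length 2.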

OBTAINING RELATIONAL EXPRESSIONS

EXPERIMENTAL RESULTS

AVERAGE PRECISION

AVERAGE RECALL
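Only the slide titles survive for these two measures. Assuming the standard definitions (precision is the fraction of generated mappings that are correct, recall is the fraction of correct mappings that are generated), they could be computed as in the sketch below. The mapping names are invented, and the slides do not confirm that the paper uses exactly these formulas.

```python
def precision_recall(generated, correct):
    """Standard set-based precision and recall (assumed definitions)."""
    hits = len(generated & correct)
    precision = hits / len(generated) if generated else 0.0
    recall = hits / len(correct) if correct else 0.0
    return precision, recall

# Invented mapping identifiers, for illustration only.
generated = {"m1", "m2", "m3", "m4"}
correct = {"m1", "m2", "m5"}
precision, recall = precision_recall(generated, correct)
```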

CONCLUSIONS
The semantic approach performs at least as well as the RIC-based approach on the datasets studied
The semantic approach made significant improvements in some cases
Many of the datasets did not have complicated schemas; the semantic approach didn’t provide as much benefit in those cases

STRENGTHS/WEAKNESSES
Strengths
  Lots of examples
  Provides a useful solution to a common problem
Weaknesses
  Formalism sometimes made things more complicated rather than clearer
  Assumes a lot of background knowledge

FUTURE WORK
Embed this functionality into pre-existing mapping tools (the authors suggest Clio, since much of their work builds on it)
Add negation to the semantic representation
Investigate more complex semantic mappings

QUESTIONS?