Schema Mapping as Query Discovery. Renee J. Miller, Laura M. Haas, Mauricio A. Hernandez. Presented by: Helen Chen.

Presentation transcript:

Introduction
- Modern applications need schema mappings
- The current schema mapping process is done manually
- In Clio, schema mapping = query discovery
  - Modern DBMSs manage not only data but also queries

Introduction (cont'd)
- Schema mapping cannot be fully automated
- Outside sources are needed
- Clio is a prototype tool for semi-automated schema mapping/query discovery

Characteristics of Clio
- Clio is driven by value correspondences (VCs)
- VCs are an appropriate abstraction for eliciting information from the user or DBA
- Reasoning about queries and query containment can help the user derive correct schema mappings

Principles in Mapping Construction
- All possible values in the source → target
  - Use union rather than join
- A value from the source → target
  - Use join rather than cross product
- Overriding the principles is permitted once

Search Space
- Vertical compositions (join)
  - Require considering mappings between schemas with constraints and dependencies
- Horizontal compositions (set operators)
  - Source and target schemas do not represent the same information

Query Discovery Notation
- Let S_1, …, S_n represent the n source relations
- Let T_1, …, T_m represent the m target relations
- Use the symbol A to denote source attributes
  - The domain of an attribute A is denoted dom(A)
  - The meta-data associated with A is denoted (A)
- Use the symbol B to denote target attributes

Query Discovery Notation (cont'd)
- Value correspondence v_i = <f_i, p_i>
  - A function f_i (q >= 1)
    f_i : dom(A_1) × … × dom(A_q) × (A_1) × … × (A_q) → dom(B)
  - A filter p_i
    p_i : dom(A_1) × … × dom(A_r) × (A_1) × … × (A_r) → boolean
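A value correspondence can be pictured as a pair: a function that computes a target value and a filter that decides whether a source row contributes at all. The following is a minimal Python sketch under stated assumptions, not Clio's implementation. The class name `ValueCorrespondence`, its fields, and the sample row are hypothetical, and the meta-data arguments are left out for brevity.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional

Row = Mapping[str, Any]  # hypothetical: source attribute name -> value


@dataclass(frozen=True)
class ValueCorrespondence:
    """Hypothetical model of a value correspondence: a function plus a filter."""

    target: str                                           # target attribute B
    function: Callable[[Row], Any]                        # f_i: source values -> dom(B)
    predicate: Callable[[Row], bool] = lambda row: True   # p_i, defaults to True

    def apply(self, row: Row) -> Optional[Any]:
        """Return the target value, or None when the filter rejects the row."""
        return self.function(row) if self.predicate(row) else None


# Illustrative correspondence: hourly rate times hours gives a salary.
v1 = ValueCorrespondence(
    target="Personnel.Sal",
    function=lambda r: r["HrRate"] * r["Hrs"],
)
salary = v1.apply({"HrRate": 10.0, "Hrs": 20})
```

Keeping the filter separate from the function mirrors the definition: the function says how to build a value, the filter says when to build it.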

Core Query Discovery Algorithm
[Diagram labels: Potential Sets P; Candidate Sets G; a Cover; all f_i; all source relations; all p_i]

Example
Consider the following value correspondences:
- f_1 : S_1.A → T.C
- f_2 : S_2.A → T.D
- f_3 : S_2.B → T.C
- All three filters are True

Example (cont'd)
P = {{v_1, v_2}, {v_2, v_3}, {v_1}, {v_2}, {v_3}}
G = {{v_1, v_2}, {v_2, v_3}, {v_1}, {v_2}, {v_3}}
Covers:
- Cover 1 = {{v_1, v_2}, {v_2, v_3}}
- Cover 2 = {{v_1}, {v_2, v_3}}
- …
SQL Query
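The sets in this example can be reproduced with a short enumeration. The sketch below is illustrative only and rests on two assumptions that are consistent with the slide but not stated on it: a potential set holds at most one value correspondence per target attribute, and a cover is a collection of candidate sets in which every value correspondence appears at least once. The names `v1`, `v2`, `v3` and both helper functions are hypothetical.

```python
from itertools import combinations

# Value correspondences from the example: name -> target attribute.
targets = {"v1": "T.C", "v2": "T.D", "v3": "T.C"}


def potential_sets(targets):
    """Subsets of correspondences with at most one per target attribute."""
    names = sorted(targets)
    result = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            attrs = [targets[v] for v in subset]
            if len(attrs) == len(set(attrs)):  # no target attribute repeated
                result.append(frozenset(subset))
    return result


def covers(candidates, all_names):
    """Collections of candidate sets that together include every correspondence."""
    found = []
    for size in range(1, len(candidates) + 1):
        for group in combinations(candidates, size):
            if frozenset().union(*group) == all_names:
                found.append(frozenset(group))
    return found


P = potential_sets(targets)
G = P  # assumption: every potential set survives pruning, as on the slide
all_covers = covers(G, frozenset(targets))
```

Here `{v1, v3}` is excluded from `P` because both correspondences map to `T.C`. The enumeration returns every cover, including non-minimal ones, and does not attempt to rank them.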

Another Example
f_1 : PayRate(HrRate) * WorksOn(Hrs) → Personnel(Sal)
q_1 : SELECT P.HrRate * W.Hrs
      FROM PayRate P, WorksOn W
      WHERE P.Rank = W.ProjRank
q_2 : SELECT P.HrRate * W.Hrs
      FROM PayRate P, WorksOn W, Student S
      WHERE P.Rank = W.ProjRank AND S.Yr = P.Rank

Another Example (cont'd)
q_3 : SELECT P.HrRate * W.Hrs
      FROM PayRate P, WorksOn W, Student S
      WHERE P.Rank = W.ProjRank AND S.Yr = P.Rank
      UNION ALL
      SELECT Sal FROM Professor
f_1 : PayRate(HrRate) * WorksOn(Hrs) → Personnel(Sal)
p_1 : True
f_2 : Professor(Sal) → Personnel(Sal)
p_2 : True
Cover = {{v_1}, {v_2}}
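To see how the two correspondences combine, q_3 can be run against a toy database. The sketch below uses Python's built-in `sqlite3` module. The column lists beyond those named in the queries, and all of the data, are invented for illustration and are not taken from the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schemas: only the columns used by the queries are certain.
    CREATE TABLE PayRate   (Rank INTEGER, HrRate REAL);
    CREATE TABLE WorksOn   (Name TEXT, Hrs INTEGER, ProjRank INTEGER);
    CREATE TABLE Student   (Name TEXT, Yr INTEGER);
    CREATE TABLE Professor (Name TEXT, Sal REAL);

    -- Made-up sample rows.
    INSERT INTO PayRate   VALUES (1, 10.0), (2, 15.0);
    INSERT INTO WorksOn   VALUES ('ann', 20, 1), ('bob', 10, 2);
    INSERT INTO Student   VALUES ('ann', 1), ('bob', 2);
    INSERT INTO Professor VALUES ('carol', 5000.0);
    """
)

# q_3: the join branch computes student salaries (first correspondence);
# UNION ALL appends professor salaries (second correspondence).
q3 = """
    SELECT P.HrRate * W.Hrs
    FROM PayRate P, WorksOn W, Student S
    WHERE P.Rank = W.ProjRank AND S.Yr = P.Rank
    UNION ALL
    SELECT Sal FROM Professor
"""
salaries = sorted(row[0] for row in conn.execute(q3))
conn.close()
```

The join keeps each hourly rate paired with the matching hours, while the union lets both source relations contribute values to the same target attribute, in line with the cover {{v_1}, {v_2}}.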

Incremental Query Discovery Algorithm
[Diagram labels: SQL Query; cover i; cover i+1; Add/Delete a Value Correspondence]

Conclusion
- Constructing a schema mapping is a search for the most reasonable mapping
- Clio uses VCs to help users create schema mappings
- Clio can produce both flat and nested relational targets
- The VC framework can be extended to both GAV (global-as-view) and LAV (local-as-view)

Limitation
- VCs are entered by the user or derived by linguistic techniques, so the process is only semi-automated