Data Integration by Bi-Directional Schema Transformation Rules By Peter McBrien and Alexandra Poulovassilis.

Presentation transcript:

Data Integration by Bi-Directional Schema Transformation Rules
By Peter McBrien and Alexandra Poulovassilis
Presented by Suman Paladugu

Introduction
Both-as-view (BAV) is a new approach to data integration.
BAV is based on the use of reversible sequences of schema transformations.
GAV and LAV view definitions can be derived from BAV schema transformation sequences.
BAV supports the evolution of both global and local schemas.
The BAV approach is implemented within the AutoMed system.

Example Local and Global Schemas
S_g: student(id, name, left#, degree), monitors(sno, id), staff(sno, sname, dept#)
S_1: ug(id, name, left#, degree, sno), tutor(sno, sname)
S_2: phd(id, name, left#, title), supervises(sno, id), supervisor(sno, sname, dept)

GAV rule G1: student(id, name, left, degree) = {x, y, z, w | (x, y, z, w, -) ∈ ug ∧ (x, -, -, -) ∉ phd ∨ (x, y, z, -) ∈ phd ∧ w = 'phd'}

GAV rule G2: monitors(sno, id) = {x, y | (y, -, -, -, x) ∈ ug ∧ (y, -, -, -) ∉ phd ∨ (x, y) ∈ supervises}

GAV rule G3: staff(sno, sname, dept) = {x, y, z | (x, y) ∈ tutor ∧ (x, -, -) ∉ supervisor ∨ (x, y, z) ∈ supervisor}
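To make the GAV reading concrete, here is a minimal Python sketch that evaluates rule G1 over toy data. It assumes G1 means "ug students who do not appear in phd, plus phd students given the degree 'phd'"; the sample tuples and the function name `student_view` are invented for illustration and do not come from the slides.

```python
# Toy extents for the local relations ug and phd (all values are made up).
# ug(id, name, left, degree, sno); phd(id, name, left, title)
ug = {(1, "Ann", None, "bsc", 10), (2, "Bob", 2002, "msc", 11)}
phd = {(2, "Bob", 2002, "Schema Evolution"), (3, "Cat", None, "Query Rewriting")}


def student_view(ug, phd):
    """Global student relation as a view over the local relations (reading of G1)."""
    phd_ids = {row[0] for row in phd}
    # First disjunct: ug tuples whose id does not occur in phd; the sno column is dropped.
    from_ug = {(i, n, l, d) for (i, n, l, d, _sno) in ug if i not in phd_ids}
    # Second disjunct: phd tuples, with the degree fixed to 'phd' and the title dropped.
    from_phd = {(i, n, l, "phd") for (i, n, l, _title) in phd}
    return from_ug | from_phd


student = student_view(ug, phd)
```

Bob appears in both sources, so only his phd tuple contributes to the global relation.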

LAV rule L1: tutor(sno, sname) = {x, y | (x, y, -) ∈ staff ∧ (x, z) ∈ monitors ∧ (z, -, -, w) ∈ student ∧ w ≠ 'phd'}

LAV rule L2: ug(id, name, left, degree, sno) = {x, y, z, w, v | (x, y, z, w) ∈ student ∧ (v, x) ∈ monitors ∧ w ≠ 'phd'}

LAV rule L3: phd(id, name, left, title) = {x, y, z, w | (x, y, z, v) ∈ student ∧ v = 'phd' ∧ w = null}

LAV rule L4: supervises(sno, id) = {x, y | (x, y) ∈ monitors ∧ (y, -, -, z) ∈ student ∧ z = 'phd'}

LAV rule L5: supervisor(sno, sname, dept) = {x, y, z | (x, y, z) ∈ staff ∧ (x, w) ∈ monitors ∧ (w, -, -, v) ∈ student ∧ v = 'phd'}
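The LAV direction can be sketched in the same way: local relations are computed as views over the global schema. The Python below is an illustrative sketch only. It assumes that rule L4 joins monitors to student on the student id, and the sample data and function names are hypothetical.

```python
# Toy extents for the global relations (all values are made up).
# student(id, name, left, degree); monitors(sno, id)
student = {(1, "Ann", None, "bsc"), (2, "Bob", 2002, "phd")}
monitors = {(10, 1), (11, 2)}


def phd_view(student):
    """Reading of L3: students with degree 'phd'; the title is unknown, so it is null (None)."""
    return {(i, n, l, None) for (i, n, l, d) in student if d == "phd"}


def supervises_view(monitors, student):
    """Reading of L4: monitoring pairs whose student has degree 'phd'."""
    phd_ids = {i for (i, _n, _l, d) in student if d == "phd"}
    return {(sno, sid) for (sno, sid) in monitors if sid in phd_ids}


phd = phd_view(student)
supervises = supervises_view(monitors, student)
```

The null title in `phd_view` shows why a LAV view can be less informative than the source it describes: the global schema has no place to keep the thesis title.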


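Evolution Problems of GAV and LAV
GAV does not readily support the evolution of local schemas.
In LAV, changes to a local schema affect only the derivation rules defined for that schema.
LAV, however, has a corresponding problem when the global schema evolves, because its local constructs are defined in terms of global ones.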

BAV Integration
Common data model: HDM (the hypergraph data model).
Schemas are incrementally transformed by applying to them a sequence of primitive transformation steps t_1, t_2, …, t_n.
Intermediate (and final) schemas may contain constructs of more than one modeling language.

BAV Integration (contd.)
Each add or del transformation is accompanied by a query specifying the extent of the new or deleted construct in terms of the rest of the constructs in the schema.
This allows automatic translation of data and queries between schemas linked by a transformation pathway, e.g. for global query processing.

Example: A Simple Relational Model
k_1, k_2, …, k_n, n ≥ 1, are the primary key attributes.
a_1, a_2, …, a_m, m ≥ 0, are the non-primary-key attributes.

Primitive Transformations of this Model
addRel(⟨⟨R, k_1, …, k_n⟩⟩, q) adds to the schema a new relation R.
addAtt(⟨⟨R, a⟩⟩, c, q) adds to the schema a non-primary-key attribute a for relation R.
delRel(⟨⟨R, k_1, …, k_n⟩⟩, q) deletes relation R.
delAtt(⟨⟨R, a⟩⟩, c, q) deletes attribute a of relation R.
Here q is the query specifying the extent of the construct, and c is a constraint on the attribute such as notnull.

Primitive Transformations of this Model
extRel(⟨⟨R, k_1, …, k_n⟩⟩, q)
extAtt(⟨⟨R, a⟩⟩, c, q)
conRel(⟨⟨R, k_1, …, k_n⟩⟩, q)
conAtt(⟨⟨R, a⟩⟩, c, q)
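The reversibility that BAV relies on can be illustrated with a small data structure. The following Python sketch assumes the usual pairing in which an add step is undone by a del step with the same arguments, and an ext step by a con step. The `Step` class, its field names, and the query strings are hypothetical and are not AutoMed's API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed pairing: each kind of step is undone by the opposite kind.
REVERSE_KIND = {"add": "del", "del": "add", "ext": "con", "con": "ext"}


@dataclass(frozen=True)
class Step:
    kind: str                         # "add", "del", "ext" or "con"
    construct: str                    # "Rel" or "Att"
    scheme: Tuple[str, ...]           # e.g. ("student", "id") for <<student, id>>
    query: str                        # extent query, kept as opaque text in this sketch
    constraint: Optional[str] = None  # e.g. "notnull"; only meaningful for attributes

    def reverse(self) -> "Step":
        """Return the step that undoes this one (same arguments, opposite kind)."""
        return Step(REVERSE_KIND[self.kind], self.construct,
                    self.scheme, self.query, self.constraint)


def reverse_pathway(steps: List[Step]) -> List[Step]:
    """Reverse a pathway by undoing its steps in the opposite order."""
    return [step.reverse() for step in reversed(steps)]


pathway = [
    Step("add", "Rel", ("student", "id"), "q1"),
    Step("add", "Att", ("student", "name"), "q2", "notnull"),
    Step("con", "Rel", ("phd", "id"), "void"),
]
backward = reverse_pathway(pathway)
```

Reversing twice gives back the original pathway, which is the property that lets a single pathway be read in either direction.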

BAV integration of S_1 and S_2 into S_g: ‘add’ steps

BAV integration of S_1 and S_2 into S_g: ‘delete’ and ‘contract’ steps

Correspondence between BAV and GAV/LAV
The ‘add’ steps correspond to GAV, since global schema constructs are being defined in terms of local ones.
The ‘del’ and ‘con’ steps correspond to LAV, since local schema constructs are being defined in terms of global ones.

Correspondence between BAV and GAV/LAV (contd.)
A GAV or LAV definition can be converted into a partial BAV definition.
A complete GAV or LAV definition can be derived from a BAV definition.
BAV thus combines the benefits of GAV and LAV, in the sense that any reasoning or processing that is possible with the view definitions of GAV or LAV is also possible with the BAV definition.

Deriving BAV from GAV
A GAV definition carries only some of the information present in a BAV definition.
First, a decomposition rule is applied to each GAV rule: G1 generates steps 1-4, G2 generates step 8, and G3 generates steps 5-7.
Second, each construct c of type T in the source schemas is removed using a transformation step of the form conT(c, void):
conAtt(⟨⟨tutor, sname⟩⟩, notnull, void)
conAtt(⟨⟨phd, title⟩⟩, notnull, void)
conRel(⟨⟨phd, id⟩⟩, void)

Deriving BAV from LAV
A LAV definition likewise carries only some of the information present in a BAV definition.
Rules L1 to L5 generate reverse transformation steps.
All the BAV transformation steps generated must be ‘extend’ rather than ‘add’ ones:
extRel(⟨⟨phd, id⟩⟩, {x | x ∈ ⟨⟨student, id⟩⟩ ∧ (x, 'phd') ∈ ⟨⟨student, degree⟩⟩})
extAtt(⟨⟨tutor, sname⟩⟩, notnull, {x, y | (x, y) ∈ ⟨⟨staff, sname⟩⟩ ∧ x ∈ ⟨⟨tutor, sno⟩⟩})

Deriving GAV from BAV
Take the subset, G, of the add and ext steps in the transformation sequence from S_1 ∪ S_2 ∪ … to S_g.
Take each addRel/extRel step in G, together with all addAtt/extAtt steps for the same relation.
Form a join of the schemes ⟨⟨R, a_1⟩⟩, …, ⟨⟨R, a_m⟩⟩ to restore relation R.
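The procedure above can be sketched in Python as a filter followed by a join on the key. This is a simplified illustration, not the paper's algorithm: steps are plain tuples of (name, scheme, query), the key is assumed to be a single attribute, and a missing attribute value becomes None (an outer-join choice made here for nullable attributes). All names and data are hypothetical.

```python
def gav_steps(pathway, relation):
    """Select the add/ext steps whose scheme belongs to the given global relation."""
    return [step for step in pathway
            if step[0].startswith(("add", "ext")) and step[1][0] == relation]


def join_on_key(keys, *attribute_extents):
    """Restore relation R by joining per-attribute extents on the key.

    keys: set of key values (extent of <<R, k>>).
    attribute_extents: one dict per <<R, a_i>>, mapping key value to attribute value.
    """
    return {(k, *(extent.get(k) for extent in attribute_extents)) for k in keys}


pathway = [
    ("addRel", ("student", "id"), "q_id"),
    ("addAtt", ("student", "name"), "q_name"),
    ("delRel", ("ug", "id"), "q_ug"),
    ("extAtt", ("staff", "dept"), "void"),
]
student_steps = gav_steps(pathway, "student")

# Toy extents standing in for the results of the queries q_id, q_name and a degree query.
ids = {1, 2}
names = {1: "Ann", 2: "Bob"}
degrees = {1: "bsc"}
student = join_on_key(ids, names, degrees)
```

Keeping one extent per attribute mirrors the scheme-per-attribute view of a relation used in the slides, so the join is the only place where the relation is reassembled.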

Example

Example Contd …

Deriving LAV from BAV
Take the subset, L, of the del and con steps on constructs of S_i in the transformation sequence from S_1 ∪ S_2 ∪ … to S_g.
Construction of the LAV view definitions from L proceeds in a similar fashion to the construction of GAV view definitions.
E.g. the steps forming L for schema S_1 are 9-15 above.
Rule L1 can then be derived from steps 9-10, and rule L2 from steps 11-15.

BAV Support for Global Schema Evolution
If a global schema S evolves to a new schema S', the evolution is specified as a transformation pathway S → S'.
Each step t in this pathway falls into one of three cases:
1. If t is an add or del, then S' is semantically equivalent to S.
2. If t is a contract, then some information that used to be present in S is no longer available from S'.
3. If t is an extend, then domain knowledge is required to determine whether the new construct in S' can in fact be completely derived from the local data sources.
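The three cases amount to a lookup on the kind of each step. A minimal Python sketch follows; the function name and the outcome labels are illustrative choices, not terminology from the slides.

```python
# Outcome of each kind of evolution step, following the three cases for global schema evolution.
EVOLUTION_OUTCOME = {
    "add": "equivalent",                 # case 1: S' is semantically equivalent to S
    "del": "equivalent",                 # case 1
    "con": "information lost",           # case 2: something in S is no longer available
    "ext": "domain knowledge required",  # case 3: derivability must be checked by hand
}


def classify_evolution(pathway_kinds):
    """Return the outcome for each step kind in an evolution pathway S -> S'."""
    return [EVOLUTION_OUTCOME[kind] for kind in pathway_kinds]


outcomes = classify_evolution(["add", "con", "ext", "del"])
needs_review = "domain knowledge required" in outcomes
```

Only the extend case needs human input, so a tool could process every other step automatically and flag the rest.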

BAV Support for Local Schema Evolution
This case is slightly more complex than global schema evolution.
Suppose that some local schema S evolves to S'.
The evolution is again defined as a transformation pathway S → S'.
Each transformation step t in this pathway is again considered in turn.
As with global schema evolution, domain knowledge is required only if t is an extend.

The AutoMed Architecture

Conclusions
GAV and LAV views can be derived from a BAV specification.
BAV thus combines the benefits of GAV and LAV, in that any reasoning or processing which is possible with GAV or LAV view definitions will also be possible with a BAV specification.
A key advantage of BAV is that it readily supports the evolution of both local and global schemas, allowing transformation pathways and schemas to be incrementally modified.

Questions?

Thank You