Schema Mapping as Query Discovery
Renee J. Miller, Laura M. Haas, Mauricio A. Hernandez
Presented by: Helen Chen
Introduction
– Modern applications need schema mappings
– The current schema mapping process is done manually
– In Clio, schema mapping = query discovery
  – Modern DBMSs manage not only data but also queries
Introduction (cont'd)
– Schema mappings cannot be fully automated
– Outside sources are needed
– Clio is a prototype tool for semi-automated schema mapping/query discovery
Characteristics of Clio
– Clio is driven by value correspondences (VCs)
– VCs are an appropriate abstraction for eliciting information from the user or DBA
– Reasoning about queries and query containment can help the user derive correct schema mappings
Principles in Mapping Construction
– All possible values in the source → target
  – Use union rather than join
– A value from the source → target
  – Use join rather than cross product
– Overriding the principles is permitted once
Search Space
– Vertical compositions (join)
  – Require considering mappings between schemas with constraints and dependencies
– Horizontal compositions (set operators)
  – Source and target schemas do not represent the same information
Query Discovery Notation
– Let S_1, …, S_n represent the n source relations
– Let T_1, …, T_m represent the m target relations
– Use symbol A to denote source attributes
  – The domain of an attribute A is denoted dom(A)
  – The meta-data associated with A is denoted (A)
– Use symbol B to denote target attributes
Query Discovery Notation (cont'd)
Value correspondence i is a pair of:
– A function f_i, with q ≥ 1
  f_i : dom(A_1) × … × dom(A_q) × (A_1) × … × (A_q) → dom(B)
– A filter p_i
  p_i : dom(A_1) × … × dom(A_r) × (A_1) × … × (A_r) → boolean
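As a rough illustration of the notation above, a value correspondence can be modelled as a function/filter pair. This is only a minimal Python sketch, not Clio's implementation: the class `ValueCorrespondence`, the helper `apply_vc`, and the dictionary-based row are hypothetical names introduced here, the meta-data arguments are left out, and the filter is assumed to read the same attributes as the function.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class ValueCorrespondence:
    """Hypothetical model of one value correspondence (function plus filter)."""
    sources: Tuple[str, ...]      # source attributes A_1 .. A_q
    target: str                   # target attribute B
    f: Callable[..., object]      # f_i: builds a target value from source values
    # p_i: decides whether the source values qualify; defaults to "True"
    p: Callable[..., bool] = lambda *values: True


def apply_vc(vc: ValueCorrespondence, row: dict) -> Optional[object]:
    """Return the target value for one row, or None if the filter rejects it."""
    values = [row[attr] for attr in vc.sources]
    if not vc.p(*values):
        return None
    return vc.f(*values)


# Illustrative correspondence: an hourly rate times hours gives a salary.
salary = ValueCorrespondence(
    sources=("PayRate.HrRate", "WorksOn.Hrs"),
    target="Personnel.Sal",
    f=lambda rate, hrs: rate * hrs,
)

# Same function, but with a filter that only accepts positive hours.
salary_positive_hours = ValueCorrespondence(
    sources=salary.sources,
    target=salary.target,
    f=salary.f,
    p=lambda rate, hrs: hrs > 0,
)
```

Calling `apply_vc(salary, {"PayRate.HrRate": 10, "WorksOn.Hrs": 20})` multiplies the two source values; the second correspondence returns `None` for a row with zero hours because its filter rejects it.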
Core Query Discovery Algorithm
[Diagram: Potential Sets P → Candidate Sets G → A Cover; labels: all f_i, all source relations, all p_i]
Example
Consider the following value correspondences:
– f_1 : S_1.A → T.C
– f_2 : S_2.A → T.D
– f_3 : S_2.B → T.C
– All three filters are True
Example (cont'd)
The numbers below index the value correspondences.
P = {{1, 2}, {2, 3}, {1}, {2}, {3}}
G = {{1, 2}, {2, 3}, {1}, {2}, {3}}
Covers:
– Cover 1 = {{1, 2}, {2, 3}}
– Cover 2 = {{1}, {2, 3}}
– …
A cover → SQL query
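The sets in this example can be reproduced with a short Python sketch. It rests on assumptions that the slides do not state explicitly: a potential set is taken to be any non-empty set of correspondences with at most one correspondence per target attribute, every potential set is assumed to be joinable (so G = P, as on the slide), and only covers with no redundant member are listed. The function names are hypothetical, and this is not Clio's actual algorithm or ranking.

```python
from itertools import combinations

# Value correspondences from the example: name -> (source relation, target attribute).
VCS = {
    "v1": ("S1", "T.C"),
    "v2": ("S2", "T.D"),
    "v3": ("S2", "T.C"),
}


def potential_sets(vcs):
    """Non-empty sets of correspondences with at most one per target attribute."""
    names = sorted(vcs)
    found = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            targets = [vcs[name][1] for name in combo]
            if len(targets) == len(set(targets)):
                found.append(frozenset(combo))
    return found


def minimal_covers(candidates, all_names):
    """Collections of candidate sets that mention every correspondence
    and contain no member that could be dropped."""
    everything = frozenset(all_names)
    covers = []
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            if frozenset().union(*combo) != everything:
                continue
            redundant = any(
                frozenset().union(*(s for s in combo if s != member)) == everything
                for member in combo
            )
            if not redundant:
                covers.append(set(combo))
    return covers


P = potential_sets(VCS)
G = list(P)  # assumption: every potential set has a usable join path
COVERS = minimal_covers(G, VCS)

for cover in COVERS:
    print(sorted(sorted(group) for group in cover))
```

Under these assumptions {v1, v3} is excluded from P because both correspondences populate T.C, and the two covers shown on the slide appear among the results.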
Another Example
f_1 : PayRate(HrRate) * WorksOn(Hrs) → Personnel(Sal)
q_1:
SELECT P.HrRate * W.Hrs
FROM PayRate P, WorksOn W
WHERE P.Rank = W.ProjRank
q_2:
SELECT P.HrRate * W.Hrs
FROM PayRate P, WorksOn W, Student S
WHERE P.Rank = W.ProjRank AND S.Yr = P.Rank
Another Example (cont'd)
q_3:
SELECT P.HrRate * W.Hrs
FROM PayRate P, WorksOn W, Student S
WHERE P.Rank = W.ProjRank AND S.Yr = P.Rank
UNION ALL
SELECT Sal FROM Professor
Value correspondences:
– f_1 : PayRate(HrRate) * WorksOn(Hrs) → Personnel(Sal), p_1 : True
– f_2 : Professor(Sal) → Personnel(Sal), p_2 : True
Cover = {{1}, {2}}
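To see how the three candidate queries differ, they can be run against a toy database. The sketch below uses Python's built-in `sqlite3` module; the tables contain only the columns named in the queries, and every row is invented purely for illustration, not taken from the slides or the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PayRate (Rank INTEGER, HrRate REAL);
    CREATE TABLE WorksOn (Hrs INTEGER, ProjRank INTEGER);
    CREATE TABLE Student (Yr INTEGER);
    CREATE TABLE Professor (Sal REAL);
    -- Invented toy rows (assumption, for illustration only).
    INSERT INTO PayRate VALUES (1, 10.0), (2, 15.0);
    INSERT INTO WorksOn VALUES (20, 1), (10, 2);
    INSERT INTO Student VALUES (1);
    INSERT INTO Professor VALUES (5000.0);
""")

Q1 = """
    SELECT P.HrRate * W.Hrs
    FROM PayRate P, WorksOn W
    WHERE P.Rank = W.ProjRank
"""

Q2 = """
    SELECT P.HrRate * W.Hrs
    FROM PayRate P, WorksOn W, Student S
    WHERE P.Rank = W.ProjRank AND S.Yr = P.Rank
"""

Q3 = Q2 + """
    UNION ALL
    SELECT Sal FROM Professor
"""


def run(query):
    """Return the single result column of a query as a sorted list."""
    return sorted(row[0] for row in conn.execute(query))


print("q1:", run(Q1))  # join of PayRate and WorksOn only
print("q2:", run(Q2))  # additionally restricted by the join with Student
print("q3:", run(Q3))  # q2 plus professor salaries, combined by union
```

On this toy data q_2 returns fewer rows than q_1 because the extra join with Student filters out pay ranks without a matching student, and q_3 adds the professor salary through the union rather than through another join.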
Incremental Query Discovery Algorithm
[Diagram: adding or deleting a value correspondence takes the current state i to state i+1 and yields an updated SQL query]
Conclusion
– The schema mapping construction process is a search for the most reasonable mapping
– Clio uses VCs to help users create schema mappings
– Clio can produce both flat and nested relational targets
– The VC framework can be extended to both GAV and LAV
Limitation
– VCs are entered by the user or suggested by linguistic techniques, so the process is only semi-automated