Answering queries across mappings. Grigoris Karvounarakis, University of Pennsylvania. WPE-II Presentation, June 05.

Presentation transcript:

1 Answering queries across mappings
Grigoris Karvounarakis, University of Pennsylvania, June 05
WPE-II Presentation

2 Data integration
[Figure: heterogeneous data sources S1, S2, ..., Sn with instances I1, I2, ..., In are related by mappings M1, M2, ..., Mn to a (virtual) global mediated schema T, over which a query Q is posed.]

3 Data exchange
[Figure: a source instance I over schema S is related by mapping M to a target instance J over schema T, which carries constraints Σ_T.]
J is a data exchange solution if: ⟨I,J⟩ ⊨ M and J ⊨ Σ_T

4 Query answering (basic problem setting)
[Figure: source schema S with instance I, mapping M, target schema T, query Q.]
Given source and target schemas (S, T), a mapping M, source instance(s) I and a query Q_T (over the target), evaluate Q_T (using data from I).
• Query reformulation: compute a reformulation Q' of Q_T that only refers to source relations
• Data exchange: compute a data exchange solution J, such that Q_T can be evaluated directly on J

5 Outline
Preliminaries
• Mapping languages
• Semantics of query answering
Query reformulation
Query answering using data exchange
Comparison

6 Mapping languages
Two approaches:
• Containment between conjunctive queries
• Dependencies (logical assertions)

7 Query containment
Definition: A query Q1 is contained in a query Q2, denoted by Q1 ⊑ Q2, if for all database instances I: Q1(I) ⊆ Q2(I).
Two queries Q1 and Q2 are equivalent if Q1 ⊑ Q2 and Q2 ⊑ Q1.
In the case where Q1 and Q2 are over different schemas, related through a mapping M:
• M ⊨ Q1 ⊑ Q2 if, for all I, J such that ⟨I,J⟩ ⊨ M: Q1(I) ⊆ Q2(J)

8 Containment mappings
General form (GLAV):
• Q_S(x,y) ⊑ Q_T(x,z) (sound, Open World Assumption)
• Q_S(x,y) ≡ Q_T(x,z) (exact, Closed World Assumption)
• Q_S, Q_T are conjunctions of relational atoms over S, T respectively
Special cases:
• GAV (global-as-view): the target is specified as a view of the source(s)
  Q_S(x,y) ⊑ T(x) (sound, OWA)
  Q_S(x,y) ≡ T(x) (exact, CWA)
• LAV (local-as-view): sources are specified as views of the virtual mediated schema
  S(x) ⊑ Q_T(x,y) (sound, OWA)
  S(x) ≡ Q_T(x,y) (exact, CWA)

9 Dependencies
Tuple-generating dependencies (tgds): ∀x,z φ(x,z) → ∃y ψ(x,y)
(where φ, ψ are conjunctions of relational atoms and x, y, z are vectors of variables)
Equality-generating dependencies (egds): ∀x φ(x) → x_i = x_j

10 Data exchange schema mappings
Source-to-target tgds: ∀x,z φ(x,z) → ∃y ψ(x,y)
• φ is a conjunction of atoms over S and ψ is a conjunction of atoms over T
Target tgds
• Both φ, ψ are conjunctions of atoms over T
Target egds: ∀x φ(x) → x_i = x_j
• φ is a conjunction of atoms over T

11 Containment mappings vs. source-to-target tgds
A source-to-target tgd of the form ∀x,z Q_S(x,z) → ∃y Q_T(x,y) is equivalent to the sound GLAV mapping Q_S(x,z) ⊑ Q_T(x,y).
Sound GAV and LAV mappings can also be expressed by source-to-target tgds.
But exact mappings also include a target-to-source direction:
• E.g.: S(x,z) ≡ T1(x,y), T2(y,z) is equivalent to:
  ∀x,z S(x,z) → ∃y T1(x,y) ∧ T2(y,z) (source-to-target) and
  ∀x,y,z T1(x,y) ∧ T2(y,z) → S(x,z) (target-to-source)

12 Incompleteness
Mappings do not specify the target instance completely
• E.g.: ∀x,z S(x,z) → ∃y T(x,y) ∧ T(y,z) does not specify the values of y
[Figure: a source instance I over S is related by M to many possible target instances J1, J2, J3, ... over T.]
E.g., if I = {S(a,b)}:
J1 = {T(a,a), T(a,b)}
J2 = {T(a,b), T(b,b)}
J3 = {T(a,X), T(X,b)}
J4 = {T(a,X), T(X,b), T(a,Y), T(Y,b)}
...

13 Semantics of query answering
What do we expect as answers to queries over the target schema?
“Possible worlds” semantics: for every instance I of S, consider all possible instances J of the target schema T such that ⟨I,J⟩ ⊨ M
Convention: certain answers
certain_{M,I}(Q_T) = ⋂_{J: ⟨I,J⟩ ⊨ M} Q_T(J)
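The intersection in the definition of certain answers can be illustrated on the four sample solutions of slide 12. The sketch below is only an illustration under stated assumptions: the query Q(x,z) :- T(x,y), T(y,z) is a hypothetical example not taken from the slides, labeled nulls are written as the strings "X" and "Y", and the intersection ranges over four solutions only, whereas the definition ranges over all solutions. The result is therefore an upper bound on the certain answers, not a computation of them.

```python
from functools import reduce

def q(instance):
    """Hypothetical query Q(x, z) :- T(x, y), T(y, z) over a set of T-tuples."""
    return {(x, z) for (x, y1) in instance for (y2, z) in instance if y1 == y2}

def intersect_answers(query, solutions):
    """Intersect the answers of `query` over the given (finite) list of solutions."""
    return reduce(set.intersection, [query(j) for j in solutions])

# The sample solutions for I = {S(a,b)} from slide 12; "X" and "Y" are labeled nulls.
J1 = {("a", "a"), ("a", "b")}
J2 = {("a", "b"), ("b", "b")}
J3 = {("a", "X"), ("X", "b")}
J4 = {("a", "X"), ("X", "b"), ("a", "Y"), ("Y", "b")}

sample_certain = intersect_answers(q, [J1, J2, J3, J4])
print(sample_certain)
```

Only the tuple (a,b) survives the intersection: J1 also returns (a,a) and J2 also returns (b,b), but neither holds in every sampled solution.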

14 Outline
Preliminaries
• Mapping languages
• Semantics of query answering
Query reformulation
Query answering using data exchange
Comparison

15 Equivalent reformulation
Definition: Q'_S is an equivalent reformulation of Q_T across M (denoted M ⊨ Q_T ≡ Q'_S) if, for every pair of instances I, J of S, T s.t. ⟨I,J⟩ ⊨ M: Q'_S(I) = Q_T(J)

16 Equivalent reformulations may not exist
… even if the mapping is exact
Mapping: ∀x S(x) ↔ T(x,x)
Query: Q(x) :- T(x,y)
[Figure: source tuple S(c) and target tuple T(a,b).]
Any reformulation over S can only return values v such that T(v,v).
But there are instances J s.t. T contains tuples T(a,b) in which a ≠ b.

17 Contained reformulation
Definition: Q'_S is a contained reformulation of Q_T across M (denoted M ⊨ Q'_S ⊑ Q_T) if, for every pair of instances I, J of S, T s.t. ⟨I,J⟩ ⊨ M: Q'_S(I) ⊆ Q_T(J)

18 Maximally-contained reformulation
Definition: Q_S^max is a maximally-contained reformulation of Q_T across M if:
• M ⊨ Q_S^max ⊑ Q_T and
• Q'_S ⊑ Q_S^max, for every Q'_S s.t. M ⊨ Q'_S ⊑ Q_T
The union of all contained reformulations is a maximally-contained reformulation:
Q_S^max ≡ reform_M(Q_T) ≡ ⋃_{Q'_S : M ⊨ Q'_S ⊑ Q_T} Q'_S

19 Maximally-contained reformulations compute certain answers
Proposition ([AD98],[FKMP03],[T05]): Let certain_M(Q) = λI. certain_{M,I}(Q). Then:
certain_M(Q) ≡ reform_M(Q)
(i.e., ∀I: reform_M(Q)(I) = certain_{M,I}(Q))
Note that the above holds for any mapping (i.e., not necessarily conjunctive)

20 Reformulation algorithms (GAV)
Sound/exact GAV mappings: e.g. Q_S(x,y) ⊑ T(x)
Reformulation:
• For every relation T_i(x) of the target schema, let r_i be the set of rules with T_i in their head (possibly more than one).
• Let Q_{T_i}(x) be the union of the conjunctive queries in the bodies of the rules in r_i
• Substitute T_i(x) atoms in Q by Q_{T_i}(x)
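The three GAV steps above amount to view unfolding, which can be sketched as follows. This is a minimal illustration, not the presentation's implementation: atoms are encoded as (relation, argument-tuple) pairs, rule heads are assumed to contain only distinct variables, the query is assumed to mention target relations only, and the relation names T1, T2, S1..S4 and the function name unfold are made up for the example.

```python
from itertools import count, product

def unfold(query_body, gav_rules):
    """Unfold target atoms using GAV rules.

    query_body: list of (relation, args) atoms over target relations.
    gav_rules:  dict mapping a target relation to a list of
                (head_args, body_atoms) rules defining it.
    Returns a union of conjunctive queries, as a list of atom lists.
    """
    fresh = count()
    choices = []
    for rel, args in query_body:
        alternatives = []
        for head_args, body in gav_rules.get(rel, []):
            subst = dict(zip(head_args, args))  # head variable -> query argument
            suffix = next(fresh)                # keeps non-head variables apart

            def rename(v):
                return subst.get(v, f"{v}_{suffix}")

            alternatives.append([(r, tuple(rename(v) for v in a)) for r, a in body])
        choices.append(alternatives)
    # One conjunctive query per way of picking a rule for every query atom.
    return [[atom for part in combo for atom in part] for combo in product(*choices)]

# Hypothetical rules: T1(x) :- S1(x,y);  T1(x) :- S2(x);  T2(x,z) :- S3(x,y), S4(y,z)
rules = {
    "T1": [(("x",), [("S1", ("x", "y"))]), (("x",), [("S2", ("x",))])],
    "T2": [(("x", "z"), [("S3", ("x", "y")), ("S4", ("y", "z"))])],
}
# Hypothetical query: Q(u) :- T1(u), T2(u,v)
result = unfold([("T1", ("u",)), ("T2", ("u", "v"))], rules)
for cq in result:
    print(cq)
```

Because T1 has two defining rules and T2 has one, the unfolding is a union of two conjunctive queries over the source relations.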

21 Reformulation algorithms (LAV/GLAV)
Sound LAV/GLAV mappings:
r: S1(x,y),…,Sn(x,y) ⊑ T1(x,z),…,Tm(x,z)
(note: the Ti’s are not necessarily distinct relational atoms)
(equivalent tgd: ∀x,y S1(x,y),…,Sn(x,y) → ∃z T1(x,z),…,Tm(x,z))
Inverse rules ([DG97]):
• For every rule r and every i ∈ [1..m] define a rule:
  Ti(x, f_{r,z1}(x,y), …, f_{r,zk}(x,y)) :- S1(x,y),…,Sn(x,y)
  (tgd: ∀x,y S1(x,y),…,Sn(x,y) → Ti(x, f_{r,z1}(x,y), …, f_{r,zk}(x,y)); skolemization of existential variables)

22 Inverse rules: Example
r: S1(x,y), S2(y,w) ⊑ T1(x,z), T1(z,w)
Inverse rules:
• T1(x, f_{r,z}(x,y,w)) :- S1(x,y), S2(y,w)
• T1(f_{r,z}(x,y,w), w) :- S1(x,y), S2(y,w)
Observe that the same Skolem term (f_{r,z}(x,y,w)) represents the common existential variable (z) of the two atoms
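The construction in this example can be sketched in a few lines. The encoding is an assumption made for illustration: atoms are (relation, argument-tuple) pairs, a Skolem term is a pair of a function name and its argument variables, and the function name inverse_rules and the naming scheme f_r_z are hypothetical.

```python
def inverse_rules(rule_name, source_atoms, target_atoms):
    """Build one inverse rule per target atom of a sound LAV/GLAV mapping.

    Every existential variable (one that occurs only on the target side)
    is replaced by a Skolem term over all variables of the source side.
    Returns a list of (head_atom, body_atoms) pairs.
    """
    universal = []
    for _, args in source_atoms:
        for v in args:
            if v not in universal:
                universal.append(v)

    def skolemize(v):
        if v in universal:
            return v
        # Same function name for the same variable of the same rule,
        # so shared existential variables get the same Skolem term.
        return (f"f_{rule_name}_{v}", tuple(universal))

    return [
        ((rel, tuple(skolemize(v) for v in args)), list(source_atoms))
        for rel, args in target_atoms
    ]

# r: S1(x,y), S2(y,w) ⊑ T1(x,z), T1(z,w)
body = [("S1", ("x", "y")), ("S2", ("y", "w"))]
rules = inverse_rules("r", body, [("T1", ("x", "z")), ("T1", ("z", "w"))])
for head, rule_body in rules:
    print(head, ":-", rule_body)
```

Both generated heads contain the identical term for z, which is what lets a later join on z succeed across the two T1 atoms.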

23 Query reformulation using inverse rules
Create a logic program P_Q composed of:
• the query Q
• the inverse rules of all mappings in M
Let P(I) be the result of evaluating a logic program P on a set of facts I
Theorem ([DG97,AD98]): Let P_Q^+ be a logic program s.t. for every set of facts I, P_Q^+(I) is the result of discarding all tuples that contain Skolem terms from P_Q(I). Then:
• P_Q^+ is a maximally-contained reformulation
• P_Q^+(I) = certain_{M,I}(Q)
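A small bottom-up evaluation makes the role of the discarded Skolem tuples concrete. The sketch below is an assumption-laden illustration, not the algorithm of [DG97]: it handles only non-recursive programs whose inverse-rule bodies mention source relations, rule and query atoms contain variables only, Skolem terms are encoded as (function name, arguments) tuples, and the helper names match, instantiate and answer are invented here. It uses the inverse rules of slide 22 and a hypothetical query Q(u,v) :- T1(u,t), T1(t,v).

```python
def match(body, facts, binding):
    """Yield every extension of `binding` that maps all body atoms into facts."""
    if not body:
        yield binding
        return
    (rel, args), rest = body[0], body[1:]
    for frel, fargs in facts:
        if frel != rel or len(fargs) != len(args):
            continue
        new = dict(binding)
        if all(new.setdefault(v, c) == c for v, c in zip(args, fargs)):
            yield from match(rest, facts, new)

def instantiate(term, binding):
    """Replace variables by values; a tuple is a Skolem term (name, parameters)."""
    if isinstance(term, tuple):
        name, params = term
        return (name, tuple(binding[p] for p in params))
    return binding[term]

def answer(query_head, query_body, inv_rules, source_facts):
    """Evaluate the query over the facts derived by the inverse rules,
    then discard every answer tuple that contains a Skolem term."""
    target_facts = set()
    for (rel, args), body in inv_rules:
        for b in match(body, source_facts, {}):
            target_facts.add((rel, tuple(instantiate(t, b) for t in args)))
    answers = {tuple(b[v] for v in query_head) for b in match(query_body, target_facts, {})}
    return {t for t in answers if not any(isinstance(c, tuple) for c in t)}

# Inverse rules of slide 22, with f standing for f_{r,z}.
f = ("f", ("x", "y", "w"))
src_body = [("S1", ("x", "y")), ("S2", ("y", "w"))]
inv = [(("T1", ("x", f)), src_body), (("T1", (f, "w")), src_body)]
facts = {("S1", ("a", "b")), ("S2", ("b", "c"))}

joined = answer(("u", "v"), [("T1", ("u", "t")), ("T1", ("t", "v"))], inv, facts)
single = answer(("u", "t"), [("T1", ("u", "t"))], inv, facts)
print(joined, single)
```

The join query returns (a,c) because the shared Skolem term links the two derived T1 facts, while the single-atom query returns nothing: each of its candidate answers exposes a Skolem term and is discarded.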

24 Peer Data Management Systems
[Figure: peers P1, P2, P3, ..., Pn with sources S1, S2, S3, ..., Sn and instances I1, I2, I3, ..., In; peer mappings M12, M23, M31, Mn3.]
LAV source-to-peer mappings
P2P mappings: inclusion (sound) or equality (exact) GLAV + definitional (GAV)
Queries can be issued at any peer
Every peer can be both source and target w.r.t. different mappings
Pairs of peers may be indirectly connected (by paths of mappings)

25 Simple PDMS example
[Figure: peers P1 and P2 with peer relations ProjMem, Area, SameProj, Author; sources S1 and S2 with instances I1 and I2.]
r1: S1(n,p,a) ⊆ ProjMem(n,p), Area(p,a)
r2: S2(n1,n2) ⊆ Author(n1,p), Author(n2,p)
r0: SameProj(n1,n2,p) = ProjMem(n1,p), ProjMem(n2,p)
Q(n1,n2) :- SameProj(n1,n2,p), Author(n1,p), Author(n2,p)

26 Mapping Graph
[Figure: mapping graph over ProjMem, Area, SameProj, Author and the sources S1 (instance I1) and S2 (instance I2) of peers P1 and P2, with edges labeled r1, r2, r0a, r0b.]
r1: S1(n,p,a) ⊆ ProjMem(n,p), Area(p,a)
r2: S2(n1,n2) ⊆ Author(n1,p), Author(n2,p)
r0a: SameProj(n1,n2,p) ⊇ ProjMem(n1,p), ProjMem(n2,p)
r0b: SameProj(n1,n2,p) ⊆ ProjMem(n1,p), ProjMem(n2,p)

27 Query answering in PDMS
Theorem ([HIST05]): In general, query answering in PDMS is undecidable
• Reason: cycles in the mapping graph
For an acyclic mapping graph, query answering is in PTIME
Still in PTIME for a limited form of cycles (i.e., exact mappings with some restrictions)
• Allows chains of sound (“LAV”) mappings and exact (“GAV”) mappings without projections
Piazza reformulation algorithm
• Sound and complete for an acyclic mapping graph and the limited form of cycles
• Sound, in general (computes a subset of the certain answers)

28 Piazza reformulation algorithm (1)
q: Q(n1,n2) :- SameProj(n1,n2,p), Author(n1,w), Author(n2,w)
r1: S1(n,p,a) ⊆ ProjMem(n,p), Area(p,a)
r0: SameProj(n1,n2,p) :- ProjMem(n1,p), ProjMem(n2,p)
ir1a: ProjMem(n,p) :- S1(n,p,a)
r2: S2(n1,n2) ⊆ Author(n1,p), Author(n2,p)
ir2a: Author(n1,f(n1,n2)) :- S2(n1,n2)
ir2b: Author(n2,f(n1,n2)) :- S2(n1,n2)
[Figure: rule-goal tree. q expands to the goals SameProj(n1,n2,p), Author(n1,w), Author(n2,w). SameProj(n1,n2,p) expands via r0 to ProjMem(n1,p) and ProjMem(n2,p), which expand via ir1a to S1(n1,p,_) and S1(n2,p,_). Each Author goal expands via ir2a and ir2b to S2(n1,n2) and S2(n2,n1).]

29 Piazza reformulation algorithm (2)
[Figure: the rule-goal tree of the previous slide, rooted at Q(n1,n2), with leaves S1(n1,p,_), S1(n2,p,_), S2(n1,n2), S2(n2,n1).]
Q(n1,n2) :- (S1(n1,p,_) ∧ S1(n2,p,_)) ∧ (S2(n1,n2) ∪ S2(n2,n1)) ∧ (S2(n2,n1) ∪ S2(n1,n2))


32 Piazza reformulation algorithm (2)
[Figure: the same rule-goal tree, rooted at Q(n1,n2), with leaves S1(n1,p,_), S1(n2,p,_), S2(n1,n2), S2(n2,n1).]
Q(n1,n2) :- (S1(n1,p,_) ∧ S1(n2,p,_)) ∧ (S2(n1,n2) ∪ S2(n2,n1)) ∧ (S2(n2,n1) ∪ S2(n1,n2))
≡ (S1(n1,p,_) ∧ S1(n2,p,_) ∧ S2(n1,n2)) ∪ (S1(n1,p,_) ∧ S1(n2,p,_) ∧ S2(n2,n1))

33 Outline
Preliminaries
• Mapping languages
• Semantics of query answering
Query reformulation
Query answering using data exchange
Comparison

34 Universal solutions
Data exchange setting S, T, M, instance I of S
• A solution J (an instance of T) is a universal solution of the data exchange setting above if it has homomorphisms to all other solutions
• Solutions contain constants (i.e., values that appear in I) and variables (labeled nulls)
• Homomorphism h: J1 → J2 between target instances:
  h(c) = c, for every constant c
  If R(a1,…,am) is in J1, then R(h(a1),…,h(am)) is in J2

35 Universal solutions
[Figure: a source instance I over S is related by M to the target schema T; the universal solution J has homomorphisms h1, h2, h3, ... to the solutions J1, J2, J3, ...]

36 Universal solutions example
M: ∀x,z S(x,z) → ∃y T(x,y) ∧ T(y,z)
I = {S(a,b)}
Solutions:
J1 = {T(a,a), T(a,b)} is not universal
J2 = {T(a,b), T(b,b)} is not universal
J3 = {T(a,X), T(X,b)} is universal
J4 = {T(a,X), T(X,b), T(a,Y), T(Y,b)} is universal
J5 = {T(a,X), T(X,b), T(Y,Y)} is not universal
...
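The homomorphism condition of slide 34 can be checked by brute force on these five instances. The sketch below is illustrative only: instances are sets of T-tuples, values whose name starts with an uppercase letter are treated as labeled nulls (an encoding convention chosen here), and the function names are hypothetical. Finding homomorphisms into the listed solutions is evidence for universality, not a proof, since the definition quantifies over all solutions; a missing homomorphism into some solution does refute it.

```python
from itertools import product

def is_null(value):
    """Encoding convention of this sketch: uppercase initial = labeled null."""
    return value[:1].isupper()

def has_homomorphism(j1, j2):
    """True if some h with h(c) = c on constants maps every fact of j1 into j2."""
    nulls = sorted({v for fact in j1 for v in fact if is_null(v)})
    targets = sorted({v for fact in j2 for v in fact})
    for image in product(targets, repeat=len(nulls)):
        h = dict(zip(nulls, image))
        if all(tuple(h.get(v, v) for v in fact) in j2 for fact in j1):
            return True
    return False

J1 = {("a", "a"), ("a", "b")}
J2 = {("a", "b"), ("b", "b")}
J3 = {("a", "X"), ("X", "b")}
J4 = {("a", "X"), ("X", "b"), ("a", "Y"), ("Y", "b")}
J5 = {("a", "X"), ("X", "b"), ("Y", "Y")}

solutions = {"J1": J1, "J2": J2, "J3": J3, "J4": J4, "J5": J5}
for name, j in solutions.items():
    maps_to_all = all(has_homomorphism(j, other) for other in solutions.values())
    print(name, "maps into every listed solution:", maps_to_all)
```

J3 and J4 map into every listed solution, whereas J1, J2 and J5 each fail to map into J3: T(a,a), T(b,b) and T(Y,Y) would all need a tuple of the form T(v,v), which J3 does not contain.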

37 Computing universal solutions
Apply the chase procedure on the joint instance ⟨I, ∅⟩
Source-to-target dependencies only: terminates in PTIME and produces a joint instance ⟨I,J⟩, where J is a universal solution (chase(I))
Target dependencies: not guaranteed to terminate
• If it does, it computes a universal solution
• If it fails, no universal solution exists


40 Example chase sequence
d1: ∀x,y,z S(x,y) ∧ S(y,z) → ∃w T(x,z,w)
Start: ⟨{S(a,b),S(b,c),S(b,d),S(a,e),S(e,c)}, ∅⟩
h1: x → a, y → b, z → c; extend to h1’: w → X1
⇒ (h1) ⟨{S(a,b),S(b,c),S(b,d),S(a,e),S(e,c)}, {T(a,c,X1)}⟩
h2: x → a, y → b, z → d; extend to h2’: w → X2
⇒ (h2) ⟨{S(a,b),S(b,c),S(b,d),S(a,e),S(e,c)}, {T(a,c,X1),T(a,d,X2)}⟩
h3: x → a, y → e, z → c can be extended to h3’: w → X1, so it is not applicable!
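The chase sequence above can be replayed with a short program. This is a sketch restricted to source-to-target tgds, so one pass over the homomorphisms from each premise suffices; target tgds and egds are not handled. The encoding (atoms as (relation, argument-tuple) pairs, nulls named X1, X2, ..., lists instead of sets to keep the firing order reproducible) and the function names are assumptions of the sketch.

```python
from itertools import count

def match(atoms, facts, binding):
    """Yield every extension of `binding` that maps all atoms into facts."""
    if not atoms:
        yield binding
        return
    (rel, args), rest = atoms[0], atoms[1:]
    for frel, fargs in facts:
        if frel != rel or len(fargs) != len(args):
            continue
        new = dict(binding)
        if all(new.setdefault(v, c) == c for v, c in zip(args, fargs)):
            yield from match(rest, facts, new)

def chase(source, tgds):
    """Chase source-to-target tgds given as (premise, conclusion, existential_vars)."""
    target = []
    fresh = count(1)
    for premise, conclusion, existential in tgds:
        for h in list(match(premise, source, {})):
            # Not applicable if h already extends to the conclusion in the target.
            if next(match(conclusion, target, h), None) is not None:
                continue
            extended = dict(h)
            for v in existential:
                extended[v] = f"X{next(fresh)}"  # fresh labeled null
            for rel, args in conclusion:
                fact = (rel, tuple(extended[v] for v in args))
                if fact not in target:
                    target.append(fact)
    return target

# d1: S(x,y) ∧ S(y,z) → ∃w T(x,z,w)
d1 = ([("S", ("x", "y")), ("S", ("y", "z"))], [("T", ("x", "z", "w"))], ["w"])
I = [("S", ("a", "b")), ("S", ("b", "c")), ("S", ("b", "d")),
     ("S", ("a", "e")), ("S", ("e", "c"))]
J = chase(I, [d1])
print(J)
```

With the facts listed in this order the homomorphisms are found as h1, h2, h3, and h3 is skipped because T(a,c,X1) already satisfies the conclusion.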

41 Universal solutions and query answering
Theorem ([FKMP]): If Q is a conjunctive query, I is a source instance and J is a universal solution:
Q(J)^+ = certain_{M,I}(Q)
Any solution J for which the above holds for every conjunctive query is universal

42 Outline
Preliminaries
• Mapping languages
• Semantics of query answering
Query reformulation
Query answering using data exchange
Comparison

43 Using inverse rules to compute universal solutions
For every relation T_i of T, let P_{M,T_i} be the reformulation of the query Q(x) :- T_i(x), using the inverse rules algorithm.
Proposition: ⋃_i P_{M,T_i}(I) […] chase(I)
• Crux: every step of a chase sequence corresponds to a step in the evaluation of the logic program using SLD resolution
Corollary: ⋃_i P_{M,T_i}(I) is a universal solution

44 Applying data exchange in GAV/LAV settings
[Figure: sources S1, S2, ..., Sn (instances I1, I2, ..., In), mappings M1, M2, ..., Mn, target T and query Q; also labeled: S, I, J1, J2, ..., Jn, J.]

45 Performance tradeoffs
Data exchange:
- Requires the computation of a solution (polynomial in the size of the instance I)
- Needs to propagate updates made at the source
- May require recomputing the whole universal solution
+ But then query evaluation is easy and efficient
+ If the query load is large, the cost of computing the solution may be amortized

46 Performance tradeoffs
Reformulation:
+ No “startup” cost
+ No need to propagate updates
- Adds overhead to query processing (although reformulations for “common” queries can be precomputed/cached)
- Requires a distributed query evaluation engine (but there is room for optimization, e.g., adaptive query processing)
- Generated reformulations are generally not minimal

47 Conclusions
Two approaches for answering queries across mappings
• Reformulation (data integration)
• Universal solutions (data exchange)
Different problems
• Data exchange is concerned with other aspects, e.g., identifying the appropriate solution to materialize
Same answers (certain answers)
Performance tradeoffs
Tight relationship between chase and inverse rules techniques

48 Dependencies
Tuple-generating dependencies (tgds): ∀x,z φ(x,z) → ∃y ψ(x,y)
(where φ, ψ are conjunctions of relational atoms and x, y, z are vectors of variables)
• Inclusion and multi-valued dependencies are a special case
Equality-generating dependencies (egds): ∀x φ(x) → x_i = x_j
• Functional dependencies are a special case

49 Containment mappings vs. source-to-target tgds
A source-to-target tgd of the form ∀x,z Q_S(x,z) → ∃y Q_T(x,y) is equivalent to the sound GLAV mapping Q_S(x,z) ⊑ Q_T(x,y).
Sound GAV and LAV mappings can also be expressed by source-to-target tgds:
• E.g.: S(x,z) ⊑ T1(x,y), T2(y,z) can be expressed as: ∀x,z S(x,z) → ∃y T1(x,y) ∧ T2(y,z)
But exact mappings also involve a target-to-source direction:
• E.g.: S(x,z) ≡ T1(x,y), T2(y,z) is equivalent to:
  ∀x,z S(x,z) → ∃y T1(x,y) ∧ T2(y,z) (source-to-target) and
  ∀x,y,z T1(x,y) ∧ T2(y,z) → S(x,z) (target-to-source)

50 Computing universal solutions
Apply the chase procedure on the joint instance ⟨I, ∅⟩
• Look for a homomorphism h from the premise of a dependency d to the joint instance that preserves the relations
• Apply the chase with h if there is no homomorphism h’ that extends h from the atoms of both sides of d to the joint instance
• Add the image of the conclusion of d under the extension of h to the joint instance
• When the chase terminates (always, for s-to-t dependencies, but not necessarily so for target dependencies), it produces a joint instance ⟨I,J⟩, where J is a universal solution

51 Example chase sequence
d1: ∀x,y,z S(x,y) ∧ S(y,z) → T(x,z)
Start: ⟨{S(a,b),S(b,c),S(b,d),S(a,e),S(e,c)}, ∅⟩
h1: x → a, y → b, z → c
⇒ (h1) ⟨{S(a,b),S(b,c),S(b,d),S(a,e),S(e,c)}, {T(a,c)}⟩
h2: x → a, y → b, z → d
⇒ (h2) ⟨{S(a,b),S(b,c),S(b,d),S(a,e),S(e,c)}, {T(a,c),T(a,d)}⟩
h3: x → a, y → e, z → c: not applicable


55 Example chase sequence with labeled nulls
d1: ∀x,y S(x,y) → ∃z T(x,z)
Start: ⟨{S(a,b),S(a,c)}, ∅⟩
h1: x → a, y → b; extend to h’: z → X
⇒ (h1) ⟨{S(a,b),S(a,c)}, {T(a,X)}⟩
h2: x → a, y → c can be extended to h’’: z → X, so it is not applicable!