Credit: Slides are an adaptation of slides from Jeffrey D. Ullman 1.

Slides:



Advertisements
Similar presentations
Boolean Algebra and Logic Gates
Advertisements

1 Datalog: Logic Instead of Algebra. 2 Datalog: Logic instead of Algebra Each relational-algebra operator can be mimicked by one or several Database Logic.
Containment of Conjunctive Queries on Annotated Relations TJ Green University of Pennsylvania Symposium on Database Provenance University of Edinburgh.
Completeness and Expressiveness
CSE 636 Data Integration Conjunctive Queries Containment Mappings / Canonical Databases Slides by Jeffrey D. Ullman.
1 Extended Conjunctive Queries Unions Arithmetic Negation.
2005conjunctive-ii1 Query languages II: equivalence & containment (Motivation: rewriting queries using views)  conjunctive queries – CQ’s  Extensions.
Lecture 11: Datalog Tuesday, February 6, Outline Datalog syntax Examples Semantics: –Minimal model –Least fixpoint –They are equivalent Naive evaluation.
CPSC 504: Data Management Discussion on Chandra&Merlin 1977 Laks V.S. Lakshmanan Dept. of CS UBC.
1 Conjunctions of Queries. 2 Conjunctive Queries A conjunctive query is a single Datalog rule with only non-negated atoms in the body. (Note: No negated.
Copyright © Cengage Learning. All rights reserved. CHAPTER 5 SEQUENCES, MATHEMATICAL INDUCTION, AND RECURSION SEQUENCES, MATHEMATICAL INDUCTION, AND RECURSION.
ICDT'2001, London, UK1 On Answering Queries in the Presence of Limited Access Patterns Chen Li Stanford University joint work with Edward Chang, UC Santa.
Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston.
Efficient Query Evaluation on Probabilistic Databases
1 A Scalable Algorithm for Answering Queries Using Views Rachel Pottinger Qualifying Exam October 29, 1999 Advisor: Alon Levy.
1 Answering Queries Using Views Alon Y. Halevy Based on Levy et al. PODS ‘95.
CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 52 Database Systems I Relational Algebra.
1 9. Evaluation of Queries Query evaluation – Quantifier Elimination and Satisfiability Example: Logical Level: r   y 1,…y n  r’ Constraint.
Conversion of CFG to PDA Conversion of PDA to CFG
Normal Forms for CFG’s Eliminating Useless Variables Removing Epsilon
2005certain1 Views as Incomplete Databases – Certain & Possible Answers  Views – an incomplete representation  Certain and possible answers  Complexity.
CSE 636 Data Integration Datalog Rules / Programs / Negation Slides by Jeffrey D. Ullman.
Local-as-View Mediators Priya Gangaraju(Class Id:203)
Inverses and GCDs Supplementary Notes Prepared by Raymond Wong
Credit: Slides are an adaptation of slides from Jeffrey D. Ullman 1.
CSE 636 Data Integration Answering Queries Using Views Overview.
1 Decidability Turing Machines Coded as Binary Strings Diagonalizing over Turing Machines Problems as Languages Undecidable Problems.
1 Semantics of Datalog With Negation Local Stratification Stable Models Well-Founded Models.
2005lav-iii1 The Infomaster system & the inverse rules algorithm  The InfoMaster system  The inverse rules algorithm  A side trip – equivalence & containment.
Monadic Predicate Logic is Decidable Boolos et al, Computability and Logic (textbook, 4 th Ed.)
Matrix Algebra THE INVERSE OF A MATRIX © 2012 Pearson Education, Inc.
Decision Properties of Regular Languages
1 Searching for Solutions Careful Analysis of Expansions The Bucket Algorithm.
Credit: Slides are an adaptation of slides from Jeffrey D. Ullman 1.
CSE 311 Foundations of Computing I Lecture 6 Predicate Logic, Logical Inference Spring
1 Data integration Most slides are borrowed from Dr. Chen Li, UC Irvine.
MATH 224 – Discrete Mathematics
Recursive query plans for Data Integration Oliver Michael By Rajesh Kanisetti.
Algebra Form and Function by McCallum Connally Hughes-Hallett et al. Copyright 2010 by John Wiley & Sons. All rights reserved. 3.1 Solving Equations Section.
Module 4.  Boolean Algebra is used to simplify the design of digital logic circuits.  The design simplification are based on: Postulates of Boolean.
1 Cluster Computing and Datalog Recursion Via Map-Reduce Seminaïve Evaluation Re-engineering Map-Reduce for Recursion.
Datalog Inspired by the impedance mismatch in relational databases. Main expressive advantage: recursive queries. More convenient for analysis: papers.
Answering Queries Using Views LMSS’95 Laks V.S. Lakshmanan Dept. of Comp. Science UBC.
CSE 636 Data Integration Conjunctive Queries Containment Mappings / Canonical Databases Slides by Jeffrey D. Ullman Fall 2006.
CS Introduction to AI Tutorial 8 Resolution Tutorial 8 Resolution.
Chapter 5 Notes. P. 189: Sets, Bags, and Lists To understand the distinction between sets, bags, and lists, remember that a set has unordered elements,
CSE 311 Foundations of Computing I Lecture 7 Logical Inference Autumn 2012 CSE
Chinese Remainder Theorem Dec 29 Picture from ………………………
1 Introduction to Abstract Mathematics Chapter 2: The Logic of Quantified Statements. Predicate Calculus Instructor: Hayk Melikya 2.3.
Extra slides for Chapter 3: Propositional Calculus & Normal Forms Based on Prof. Lila Kari’s slides For CS2209A, 2009 By Dr. Charles Ling;
1 Georgia Tech, IIC, GVU, 2006 MAGIC Lab Rossignac Lecture 02: QUANTIFIERS Sections 1.3 and 1.4 Jarek Rossignac CS1050:
1 Undecidability Everything is an Integer Countable and Uncountable Sets Turing Machines Recursive and Recursively Enumerable Languages.
Introduction to Finite Automata
Lu Chaojun, SJTU 1 Extended Relational Algebra. Bag Semantics A relation (in SQL, at least) is really a bag (or multiset). –It may contain the same tuple.
1 Finite Model Theory Lecture 16 L  1  Summary and 0/1 Laws.
Overview of the theory of computation Episode 3 0 Turing machines The traditional concepts of computability, decidability and recursive enumerability.
Process Algebra (2IF45) Basic Process Algebra Dr. Suzana Andova.
2 2.2 © 2016 Pearson Education, Ltd. Matrix Algebra THE INVERSE OF A MATRIX.
Extensions of Datalog Wednesday, February 13, 2001.
1 Datalog with negation Adapted from slides by Jeff Ullman.
Datalog Rules / Programs / Negation Slides by Jeffrey D. Ullman
Containment Mappings Canonical Databases Sariaya’s Algorithm
CSE 311: Foundations of Computing
Dealing with Uniqueness Constraint in Query Optimization
Local-as-View Mediators
Logic Based Query Languages
ECE 352 Digital System Fundamentals
Datalog Inspired by the impedance mismatch in relational databases.
This Lecture Substitution model
Matrix Algebra THE INVERSE OF A MATRIX © 2012 Pearson Education, Inc.
Presentation transcript:

Credit: Slides are an adaptation of slides from Jeffrey D. Ullman 1

2 Conjunctive Queries Containment Mappings Canonical Databases

3 Conjunctive Queries uA CQ is a single Datalog rule, with all subgoals assumed to be EDB. uMeaning of a CQ is the mapping from databases (the EDB) to the relation produced for the head predicate by applying that rule to the EDB.

4 Containment of CQ’s uQ1  Q2 iff for all databases D, Q1(D)  Q2(D). uExample:  Q1: p(X,Y) :- arc(X,Z) & arc(Z,Y)  Q2: p(X,Y) :- arc(X,Z) & arc(W,Y) uDo you think that Q1 is contained in Q2? uDo you think that Q2 is contained in Q1?

5 Why Care About CQ Containment? uImportant optimization: if we can break a query into terms that are CQ’s, we can eliminate those terms contained in another. uContainment tests imply equivalence- of-programs tests. uCQ’s, or similar logics like description logic, are used in a number of AI applications.

6 Testing Containment uTwo approaches: 1.Containment mappings. 2.Canonical databases. uReally the same in the simple CQ case covered so far. uContainment is NP-complete, but CQ’s tend to be small so here is one case where intractability doesn’t hurt you.

7 Containment Mappings uA mapping from the variables of CQ Q2 to the variables of CQ Q1, such that: 1.The head of Q2 is mapped to the head of Q1. 2.Each subgoal of Q2 is mapped to some subgoal of Q1 with the same predicate.

8 Important Theorem uThere is a containment mapping from Q2 to Q1 if and only if Q1  Q2. uNote that the containment mapping is opposite the containment --- it goes from the larger (containing CQ) to the smaller (contained CQ).

9 Example Q1: p(X,Y):-r(X,Z) & g(Z,Z) & r(Z,Y) Q2: p(A,B):-r(A,C) & g(C,D) & r(D,B) Q1 looks for: Q2 looks for: XYZ ABDC Since C=D is possible, expect Q1  Q2. Is Q2  Q1? Is Q1  Q2?

10 Example --- Continued Q1: p(X,Y):-r(X,Z) & g(Z,Z) & r(Z,Y) Q2: p(A,B):-r(A,C) & g(C,D) & r(D,B) Containment mapping: m(A)=X; m(B)=Y; m(C)=m(D)=Z.

11 Example ---Concluded Q1: p(X,Y):-r(X,Z) & g(Z,Z) & r(Z,Y) Q2: p(A,B):-r(A,C) & g(C,D) & r(D,B) uNo containment mapping from Q1 to Q2. wg(Z,Z) can only be mapped to g(C,D). No other g subgoals in Q2. wBut then Z must map to both C and D --- impossible. uThus, Q1 properly contained in Q2.

12 Another Example Q1: p(X,Y):-r(X,Y) & g(Y,Z) Q2: p(A,B):-r(A,B) & r(A,C) Q1 looks for: Q2 looks for: XZY AB C Is Q2  Q1? Is Q1  Q2?

13 Example --- Continued Q1: p(X,Y):-r(X,Y) & g(Y,Z) Q2: p(A,B):-r(A,B) & r(A,C) uCan you find a containment mapping from Q2 to Q1? From Q2 to Q1? uIs every subgoal in Q1 in the target? uCan different atoms map to the same atom?

14 Proof of Containment-Mapping Theorem --- (1) uFirst, assume there is a CM m : Q2->Q1. uLet D be any database; we must show that Q1(D)  Q2(D). uSuppose t is a tuple in Q1(D); we must show t is also in Q2(D).

15 Proof --- (2) uSince t is in Q1(D), there is a substitution s from the variables of Q1 to values that: 1.Makes every subgoal of Q1 a fact in D. uMore precisely, if p(X,Y,…) is a subgoal, then [s(X),s(Y),…] is a tuple in the relation for p. 2.Turns the head of Q1 into t.

16 Proof --- (3) uConsider the effect of applying m and then s to Q2. head of Q2 :-subgoal of Q2m head of Q1 :-subgoal of Q1s t tuple of D s  m maps each sub- goal of Q2 to a tuple of D. And the head of Q2 becomes t, proving t is also in Q2(D); i.e., Q1  Q2.

17 Proof of Converse --- (1) uNow, we must assume Q1  Q2, and show there is a containment mapping from Q2 to Q1. uKey idea --- frozen CQ Q : 1.For each variable of Q, create a corresponding, unique constant. 2.Frozen Q is a DB with one tuple formed from each subgoal of Q, with constants in place of variables.

18 Example: Frozen CQ p(X,Y):-r(X,Z) & g(Z,Z) & r(Z,Y) uLet’s use lower-case letters as constants corresponding to variables. uThen frozen CQ is: Relation R for predicate r = {(x,z), (z,y)}. Relation G for predicate g = {(z,z)}.

19 Converse --- (2) uSuppose Q1  Q2, and let D be the frozen Q1. uClaim: Q1(D) contains the frozen head of Q1 --- that is, the head of Q1 with variables replaced by their corresponding constants. wProof: the “freeze” substitution makes all subgoals in D, and makes the head become the frozen head.

20 Converse --- (3) uSince Q1  Q2, the frozen head of Q1 must also be in Q2(D). uThus, there is a mapping s from variables of Q2 to D that turns subgoals of Q2 into tuples of D and turns the head of Q2 into the frozen head of Q1. uBut tuples of D are frozen subgoals of Q1, so s followed by “unfreeze” is a containment mapping from Q2 to Q1.

21 In Pictures Q2: h(X,Y) :- … p(Y,Z) … s h(u,v)p(a,b)D freeze Q1: h(U,V) :- … p(A,B) … s followed by inverse of freeze maps each subgoal p(Y,Z) of Q2 to a subgoal p(A,B) of Q1 and maps h(X,Y) to h(U,V).

22 Dual View of CM’s uInstead of thinking of a CM as a mapping on variables, think of a CM as a mapping from atoms to atoms. uRequired conditions: 1.The head must map to the head. 2.Each subgoal maps to a subgoal. 3.As a consequence, no variable is mapped to two different variables.

23 Canonical Databases uGeneral idea: test Q1  Q2 by checking that Q1(D 1 )  Q2(D 1 ),…, Q1(D n )  Q2(D n ), where D 1,…,D n are the canonical databases. uFor the standard CQ case, we only need one canonical DB --- the frozen Q1. uBut in more general forms of queries, larger sets of canonical DB’s are needed. uProve that the canonical DB test works (HW)

24 Constants uCQ’s are often allowed to have constants in subgoals. wCorresponds to selection in relational algebra. uCM’s and CM test are the same, but: 1.A variable can map to one variable or one constant. 2.A constant can only map to itself.

25 Example Q2: p(X) :- e(X,Y) Q1: p(A) :- e(A,10) CM from Q2 -> Q1 maps X->A and Y->10. Thus, Q1  Q2. A CM from Q1 -> Q2 would have to map constant 10 to variable Y; hence no such mapping exists.

26 Complexity of Containment uContainment of CQ’s is NP-complete. uWhat about determining whether Q1  Q2, when Q1 has no two subgoals with the same predicate? uWhat about determining whether Q1  Q2, when Q1 has no three subgoals with the same predicate? wSariaya’s algorithm is a linear-time test

Think about it uHow can we determine containment of queries with comparisons? uHow can we determine containment of conjunctive queries under bag semantics wA solution to this problem can be given instead of taking the final exam 27