2005conjunctive-ii1 Query languages II: equivalence & containment
(Motivation: rewriting queries using views)
  Conjunctive queries (CQ’s)
  Extensions of CQ’s
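2005conjunctive-ii2 Conjunctive queries: equivalence & containment
For CQ’s q1, q2 with the same head predicate:
  Containment: $q_1 \subseteq q_2$ if $q_1(D) \subseteq q_2(D)$ for every db $D$
  Equivalence: $q_1 \equiv q_2$ if $q_1(D) = q_2(D)$ for every db $D$
Decision problems: given q1 and q2, decide whether $q_1 \subseteq q_2$; decide whether $q_1 \equiv q_2$
The two problems are equivalent: a solution for one yields a solution for the other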
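2005conjunctive-ii3 A solution for containment gives a solution for equivalence:
  $q_1 \equiv q_2$ iff $q_1 \subseteq q_2$ and $q_2 \subseteq q_1$
A solution for equivalence gives a solution for containment: for
  q1: p(x) :- r1(..), …, rk(..)
  q2: p(x) :- s1(..), …, sm(..)
let $q_1 \wedge q_2$ be the query p(x) :- r1(..), …, rk(..), s1(..), …, sm(..), with the non-head variables of q2 renamed apart from those of q1. Then
  $q_1 \subseteq q_2$ iff $q_1 \equiv q_1 \wedge q_2$
(here, the ri and sj are db predicates, not necessarily different)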
2005conjunctive-ii4 Characterizations for containment: assume q1, q2 are given
A mapping h from the variables of q2 to the variables/constants of q1 (extended naturally to constants, as the identity, and to atoms) is a homomorphism from q2 to q1 if:
1) h maps head(q2) to head(q1) (assuming the same heads: h is the identity on head vars)
2) h maps each atom of body(q2) to an atom of body(q1)
3) If there are constraints on the side, Ci in qi, then h(C2) is implied by C1
Notation:
2005conjunctive-ii5 Thm: The following are equivalent, for CQ’s w/o built-in preds:
(i) $q_1 \subseteq q_2$
(ii) there is a homomorphism from q2 to q1
Proof: (ii) $\Rightarrow$ (i) is easy (and holds even with b.i. preds): every valuation v from q1 into a db D can be composed with h to give a valuation $v \circ h$ from q2 into D. Hence, every answer of q1 on D is also an answer of q2 on D
2005conjunctive-ii6 For (i) $\Rightarrow$ (ii): the body of a CQ (w/o b.i.’s) can be viewed as a db:
  consider each variable as a constant, different from all constants in the CQ and from the other variables
  or, replace each variable x by a distinct constant c_x
Denote this db by db(q)
Obviously, q(db(q)) contains the head of q (or its image)
Example:
  Q: q(d) :- movies(t,d,a), directory(‘Plaza’, t, 19:30)
  db(Q): movies(c_t,c_d,c_a), directory(‘Plaza’, c_t, 19:30)
Obviously, applying Q to this db, one obtains q(c_d) (use the “identity” valuation)
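The canonical db construction can be tried out directly. The following is a minimal sketch, not part of the original slides. It assumes a term is a variable exactly when it is a string starting with a lowercase letter, writes the frozen constant c_x as `C_x` so that it is not mistaken for a variable, and uses a deliberately naive evaluator; the names `canonical_db` and `evaluate` are hypothetical.

```python
from itertools import product

# Assumption of this sketch: lowercase-initial strings are variables,
# everything else (quoted names, times, C_x) is a constant.
def is_var(term):
    return isinstance(term, str) and term[:1].islower()

def canonical_db(body):
    """Freeze each variable x into a fresh constant C_x (the slides' c_x)."""
    def freeze(t):
        return "C_" + t if is_var(t) else t
    return {(pred, tuple(freeze(a) for a in args)) for pred, args in body}

def evaluate(head, body, db):
    """Naive CQ evaluation: try every valuation over the active domain."""
    variables = sorted({a for _, args in body for a in args if is_var(a)})
    domain = sorted({c for _, args in db for c in args})
    answers = set()
    for values in product(domain, repeat=len(variables)):
        v = dict(zip(variables, values))
        def ground(atom):
            return (atom[0], tuple(v.get(a, a) for a in atom[1]))
        if all(ground(atom) in db for atom in body):
            answers.add(ground(head))
    return answers

# Q: q(d) :- movies(t,d,a), directory('Plaza', t, 19:30)
head = ("q", ("d",))
body = [("movies", ("t", "d", "a")),
        ("directory", ("'Plaza'", "t", "19:30"))]

db = canonical_db(body)
print(evaluate(head, body, db))  # the frozen head q(C_d)
```

The only valuation that succeeds here is the one sending each variable x to its own frozen constant, which is the "identity" valuation of the slide.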
2005conjunctive-ii7 (i) $\Rightarrow$ (ii) (q2 contains q1 $\Rightarrow$ there is a homomorphism from q2 to q1)
Clearly, q1(db(q1)) contains head(q1)
Since q1 is contained in q2, q2(db(q1)) also contains head(q1)
The valuation from q2 to db(q1) that yields this answer is a homomorphism (after replacing each frozen constant back by its variable)
Example:
  q1: p(d) :- movies(t,d,‘Jane’), directory(‘Plaza’, t, 19:30), location(‘Plaza’, a, )
  q2: p(z) :- movies(t,z,a), directory(‘Plaza’, t, 19:30)
Obviously, q1 is contained in q2, with h: $t \mapsto t$, $z \mapsto d$, $a \mapsto$ ‘Jane’, which maps the two atoms of body(q2) to the first two of body(q1), and head(q2) to head(q1)
2005conjunctive-ii8 Because of this characterization, such a homomorphism is also called a containment mapping from q2 to q1
Intuition: q1 is contained in q2 iff
  it has the ‘same or more atoms’
  it may have some constants where q2 has variables
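Checking that a given mapping is a containment mapping is mechanical. The sketch below is an illustration only, under the assumption that variables are lowercase-initial strings and queries are (head, body) pairs. It covers conditions 1 and 2 of the definition and ignores side constraints. The queries are the movies example, with the location atom of q1 left out because its last argument is missing in the source; `apply_mapping` and `is_homomorphism` are hypothetical names.

```python
# Assumption of this sketch: lowercase-initial strings are variables.
def is_var(term):
    return isinstance(term, str) and term[:1].islower()

def apply_mapping(h, atom):
    """Extend h from variables to an atom; constants map to themselves."""
    pred, args = atom
    return (pred, tuple(h.get(a, a) if is_var(a) else a for a in args))

def is_homomorphism(h, q2, q1):
    """Conditions 1 and 2 for h from q2 to q1 (no side constraints)."""
    head2, body2 = q2
    head1, body1 = q1
    if apply_mapping(h, head2) != head1:
        return False
    return all(apply_mapping(h, atom) in body1 for atom in body2)

# q1 without its location atom (an omission made only for this sketch)
q1 = (("p", ("d",)),
      [("movies", ("t", "d", "'Jane'")),
       ("directory", ("'Plaza'", "t", "19:30"))])
q2 = (("p", ("z",)),
      [("movies", ("t", "z", "a")),
       ("directory", ("'Plaza'", "t", "19:30"))])

good = {"t": "t", "z": "d", "a": "'Jane'"}
bad = {"t": "t", "z": "d", "a": "a"}   # leaves a unmapped to 'Jane'
print(is_homomorphism(good, q2, q1))
print(is_homomorphism(bad, q2, q1))
```

The second mapping fails because movies(t,d,a) is not an atom of body(q1): q1 has the constant where q2 has a variable, so the variable must be sent to that constant.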
2005conjunctive-ii9 Another characterization: For a rule p(..) :- r1(..), …, rk(..) a model is a set of facts over p, r1,.., rk that satisfies the rule as a logical formula (assuming all variables are universally quantified) Thm: the following are equivalent: The important & useful characterization: homomorphism, i.e., containment mapping
2005conjunctive-ii10 Algorithm and complexity:
To decide if q1 is contained in q2, search for a containment mapping from the variables of q2 to the variables and constants of q1: easy & fast in many cases, exponential in the worst case
Containment is in NP: given a mapping on the variables of q2, it is easy (polynomial time) to check that it is a homomorphism to q1
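One way to organize the search is plain backtracking over the atoms of body(q2). This is a minimal sketch under stated assumptions (lowercase-initial strings are variables, no built-in predicates, queries are (head, body) pairs), not the slides' own algorithm; the example queries and the function names are hypothetical.

```python
# Assumption of this sketch: lowercase-initial strings are variables.
def is_var(term):
    return isinstance(term, str) and term[:1].islower()

def extend(h, atom2, atom1):
    """Extend h so that atom2 maps onto atom1; return None on a conflict."""
    (p2, args2), (p1, args1) = atom2, atom1
    if p2 != p1 or len(args2) != len(args1):
        return None
    h = dict(h)
    for a2, a1 in zip(args2, args1):
        if is_var(a2):
            if h.setdefault(a2, a1) != a1:
                return None
        elif a2 != a1:          # constants must map to themselves
            return None
    return h

def containment_mapping(q2, q1):
    """Return a homomorphism from q2 to q1, or None if there is none."""
    (head2, body2), (head1, body1) = q2, q1
    start = extend({}, head2, head1)
    if start is None:
        return None

    def search(i, h):
        if i == len(body2):
            return h
        for target in body1:    # worst case: exponentially many branches
            h2 = extend(h, body2[i], target)
            if h2 is not None:
                found = search(i + 1, h2)
                if found is not None:
                    return found
        return None

    return search(0, start)

def contained_in(q1, q2):
    """q1 is contained in q2 iff a containment mapping q2 -> q1 exists."""
    return containment_mapping(q2, q1) is not None

# Hypothetical queries: q1: p(x) :- e(x,y), e(y,'c')    q2: p(x) :- e(x,z)
q1 = (("p", ("x",)), [("e", ("x", "y")), ("e", ("y", "'c'"))])
q2 = (("p", ("x",)), [("e", ("x", "z"))])
print(containment_mapping(q2, q1))
print(contained_in(q2, q1))
```

Mapping the head first fixes the head variables early, which prunes many branches before any body atom is tried.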
2005conjunctive-ii11 It is NP-hard: a graph G is 3-colorable iff there is a homomorphism from G (represented as an edge relation) to the 3-clique
  one can represent G as the body of q2 (using distinct variables for distinct nodes), and the 3-clique as the body of q1
  for both, the head can be q( )
  then G is 3-colorable iff q1 is contained in q2
Hence, containment & equivalence are NP-complete (even for queries with no head variables*)
Note: this is expression complexity, not data complexity (here there is actually no db)
* (when such a query is applied to a db, it returns either {()} or {})
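The reduction can be made concrete with a brute-force sketch: build body(q2) from the edges of G, build body(q1) from the 3-clique, and look for a homomorphism from q2 to q1. The function name, the clique vertex names r, g, b, and the sample graphs are assumptions of this illustration.

```python
from itertools import product

def three_coloring_via_containment(nodes, edges):
    """Return a homomorphism from q2 (graph G) to q1 (3-clique), or None."""
    # body(q2): one atom e(u, v) per edge, distinct variables for distinct nodes
    body2 = [("e", (u, v)) for u, v in edges]
    # body(q1): the 3-clique on r, g, b as a symmetric edge relation
    colors = ("r", "g", "b")
    body1 = {("e", (c, d)) for c in colors for d in colors if c != d}
    # Both heads are q(), so only the body condition has to be checked.
    for choice in product(colors, repeat=len(nodes)):   # 3^n candidates
        h = dict(zip(nodes, choice))
        if all(("e", (h[u], h[v])) in body1 for _, (u, v) in body2):
            return h
    return None

triangle = (["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
k4_nodes = ["a", "b", "c", "d"]
k4 = (k4_nodes, [(u, v) for i, u in enumerate(k4_nodes) for v in k4_nodes[i + 1:]])

print(three_coloring_via_containment(*triangle))
print(three_coloring_via_containment(*k4))
```

A mapping returned by this function is a proper 3-coloring read off directly: the clique vertex assigned to a node is its color.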
2005conjunctive-ii12 Minimization of CQ’s: For q, define a minimal equivalent query as any equivalent q’ with a minimal number of body atoms Thm: the minimal equivalent query of q is unique up to isomorphism, and can be obtained by removing some atoms from body(q) Proof:
2005conjunctive-ii13 Thus, for every CQ Q, there is a subset of the body that gives a minimal equivalent query
This query is called a core of Q
It is not necessarily unique (different subsets may yield cores), but all cores are isomorphic
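A core can be computed by repeatedly dropping an atom whenever the query still maps homomorphically into what remains. The following is a sketch of that greedy procedure under the assumptions that variables are lowercase-initial strings, there are no built-in predicates, and the body has no duplicate atoms; `minimize` and the sample queries are hypothetical.

```python
# Assumption of this sketch: lowercase-initial strings are variables.
def is_var(term):
    return isinstance(term, str) and term[:1].islower()

def extend(h, atom2, atom1):
    """Extend h so that atom2 maps onto atom1; return None on a conflict."""
    (p2, args2), (p1, args1) = atom2, atom1
    if p2 != p1 or len(args2) != len(args1):
        return None
    h = dict(h)
    for a2, a1 in zip(args2, args1):
        if is_var(a2):
            if h.setdefault(a2, a1) != a1:
                return None
        elif a2 != a1:
            return None
    return h

def containment_mapping(q2, q1):
    """Return a homomorphism from q2 to q1, or None if there is none."""
    (head2, body2), (head1, body1) = q2, q1
    start = extend({}, head2, head1)
    if start is None:
        return None
    def search(i, h):
        if i == len(body2):
            return h
        for target in body1:
            h2 = extend(h, body2[i], target)
            if h2 is not None:
                found = search(i + 1, h2)
                if found is not None:
                    return found
        return None
    return search(0, start)

def minimize(q):
    """Drop atoms one at a time while the smaller query stays equivalent."""
    head, body = q[0], list(q[1])
    changed = True
    while changed:
        changed = False
        for atom in body:
            smaller = [a for a in body if a != atom]
            # smaller contains q trivially; equivalence needs q -> smaller
            if containment_mapping((head, body), (head, smaller)) is not None:
                body, changed = smaller, True
                break
    return (head, body)

# Hypothetical query: p(x) :- e(x,y), e(x,z); either atom alone is a core
print(minimize((("p", ("x",)), [("e", ("x", "y")), ("e", ("x", "z"))])))
# Hypothetical query that is already minimal: p(x) :- e(x,y), e(y,x)
print(minimize((("p", ("x",)), [("e", ("x", "y")), ("e", ("y", "x"))])))
```

Which core comes out depends on the order in which atoms are tried, which matches the remark that different subsets may yield cores.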
2005conjunctive-ii14 Containment & equivalence for extensions of CQ’s
Extension to UCQ’s: let
  q1:  r1: p(x) :- body1,1  …  rk: p(x) :- body1,k
  q2:  s1: p(x) :- body2,1  …  sm: p(x) :- body2,m
Thm: $q_1 \subseteq q_2$ iff for each ri there is an sj such that $r_i \subseteq s_j$
Proof: $\Leftarrow$ is obvious
$\Rightarrow$: if q1 is contained in q2, then each ri is contained in q2
  $\Rightarrow$ q2(db(ri)) contains p(x)
  $\Rightarrow$ for some sj, sj(db(ri)) contains p(x)
  $\Rightarrow$ sj contains ri
2005conjunctive-ii15 Containment algorithm : For each ri, loop over sj, and search for a containment mapping from sj to ri Still exponential in size (of both queries) Complexity : The containment problem is now Explanation: A relation R(..) is ptime if membership can be verified in ptime
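The nested loop over the ri and sj can be sketched as follows. This is an illustration under the same assumptions as before (lowercase-initial strings are variables, no built-in predicates); a UCQ is represented as a list of (head, body) rules, and `ucq_contained_in` and the sample queries are hypothetical.

```python
# Assumption of this sketch: lowercase-initial strings are variables.
def is_var(term):
    return isinstance(term, str) and term[:1].islower()

def extend(h, atom2, atom1):
    """Extend h so that atom2 maps onto atom1; return None on a conflict."""
    (p2, args2), (p1, args1) = atom2, atom1
    if p2 != p1 or len(args2) != len(args1):
        return None
    h = dict(h)
    for a2, a1 in zip(args2, args1):
        if is_var(a2):
            if h.setdefault(a2, a1) != a1:
                return None
        elif a2 != a1:
            return None
    return h

def containment_mapping(q2, q1):
    """Return a homomorphism from q2 to q1, or None if there is none."""
    (head2, body2), (head1, body1) = q2, q1
    start = extend({}, head2, head1)
    if start is None:
        return None
    def search(i, h):
        if i == len(body2):
            return h
        for target in body1:
            h2 = extend(h, body2[i], target)
            if h2 is not None:
                found = search(i + 1, h2)
                if found is not None:
                    return found
        return None
    return search(0, start)

def ucq_contained_in(Q1, Q2):
    """Each rule r of Q1 needs some rule s of Q2 with a mapping s -> r."""
    return all(any(containment_mapping(s, r) is not None for s in Q2)
               for r in Q1)

# Hypothetical UCQ's
Q1 = [(("p", ("x",)), [("e", ("x", "y")), ("e", ("y", "x"))])]
Q2 = [(("p", ("x",)), [("e", ("x", "y"))]),
      (("p", ("x",)), [("f", ("x",))])]
print(ucq_contained_in(Q1, Q2))
print(ucq_contained_in(Q2, Q1))
```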
2005conjunctive-ii16 For a UCQ Q we can also consider the canonical db of Q, denoted db(Q), obtained by taking the bodies of all the rules together as a db (with different existential variables in different rules ) Here also: Thm: Q1 is contained in Q2 iff Q2(db(Q1)) contains head(Q1) (this also gives an algorithm for checking containment, which boils down to finding containment mappings)
2005conjunctive-ii17 Another extension of CQ’s: b.i. preds in the body
Example:
  Q1: p(x, y) :- q(x, y), r(u, v), u <= v
  Q2: p(x, y) :- q(x, y), r(u, v), r(v, u)
Is Q2 contained in/equivalent to Q1?
Q2 is equivalent to the union of
  Q2,1: p(x, y) :- q(x, y), r(u, v), r(v, u), u <= v
  Q2,2: p(x, y) :- q(x, y), r(u, v), r(v, u), v < u
Clearly, Q2,1 and Q2,2 are both contained in Q1 (for Q2,1 use the identity mapping; for Q2,2 swap u and v, since v < u implies v <= u), so Q2 is contained in Q1
This can be generalized to an algorithm that reduces containment to that of UCQ’s (omitted)
2005conjunctive-ii18 Containment of a UCQ Q and a (recursive) Datalog program P:
Still decidable, but double exponential time (upper & lower bound)
Here also:
Thm: P contains Q iff P(db(Q)) contains head(Q)
  this gives an algorithm for checking containment: apply P to db(Q), and see if you obtain head(Q)
  (do you see exponentials in this algorithm?)
Containment of Datalog programs: undecidable
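The test "apply P to db(Q), see if you obtain head(Q)" can be sketched with a naive bottom-up evaluator. Everything below is an assumption made for illustration: variables are lowercase-initial strings, the frozen constant c_x is written `C_x`, Q is a single CQ rather than a general UCQ, and the transitive-closure program and the two queries are hypothetical examples.

```python
# Assumption of this sketch: lowercase-initial strings are variables.
def is_var(term):
    return isinstance(term, str) and term[:1].islower()

def matches(body, facts, v=None):
    """Yield every valuation that maps all atoms of body into facts."""
    v = v or {}
    if not body:
        yield v
        return
    (pred, args), rest = body[0], body[1:]
    for fpred, fargs in facts:
        if fpred != pred or len(fargs) != len(args):
            continue
        v2 = dict(v)
        if all((v2.setdefault(a, c) == c) if is_var(a) else a == c
               for a, c in zip(args, fargs)):
            yield from matches(rest, facts, v2)

def naive_eval(program, db):
    """Apply all rules until no new fact is derived (least fixpoint)."""
    facts = set(db)
    while True:
        new = {(head[0], tuple(v.get(a, a) for a in head[1]))
               for head, body in program for v in matches(body, facts)}
        if new <= facts:
            return facts
        facts |= new

def freeze(atom):
    """Replace each variable x by the constant C_x."""
    pred, args = atom
    return (pred, tuple("C_" + a if is_var(a) else a for a in args))

def program_contains(program, q):
    """P contains the CQ q iff P(db(q)) contains the frozen head of q."""
    head, body = q
    return freeze(head) in naive_eval(program, {freeze(a) for a in body})

# Hypothetical program P: transitive closure of edge
P = [(("path", ("x", "y")), [("edge", ("x", "y"))]),
     (("path", ("x", "y")), [("edge", ("x", "z")), ("path", ("z", "y"))])]
two_step = (("path", ("x", "y")), [("edge", ("x", "u")), ("edge", ("u", "y"))])
backward = (("path", ("x", "y")), [("edge", ("y", "x"))])

print(program_contains(P, two_step))
print(program_contains(P, backward))
```

For a UCQ the same check would be repeated for each rule. The fixpoint loop runs over the constants of db(Q), which is one place to look when answering the slide's question about exponentials.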