2005conjunctive-ii1 Query languages II: equivalence & containment (Motivation: rewriting queries using views)  conjunctive queries – CQ’s  Extensions.

Presentation transcript:

Query languages II: equivalence & containment
(Motivation: rewriting queries using views)
- Conjunctive queries (CQ's)
- Extensions of CQ's

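Conjunctive queries: equivalence & containment
For CQ's q1, q2 with the same head predicate:
Decision problems:
- Containment: is q1 ⊆ q2, i.e., is q1(D) ⊆ q2(D) for every db D?
- Equivalence: is q1 ≡ q2, i.e., is q1(D) = q2(D) for every db D?
The two problems reduce to each other: having solved one, we have solved the other.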

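Solution for containment ⇒ solution for equivalence:
q1 ≡ q2 iff q1 ⊆ q2 and q2 ⊆ q1
Solution for equivalence ⇒ solution for containment:
for q1: p(x) :- r1(..), …, rk(..) and q2: p(x) :- s1(..), …, sm(..), with the non-head variables of q2 renamed apart from those of q1,
q1 ⊆ q2 iff q1 ≡ q1 ∧ q2, where q1 ∧ q2 is p(x) :- r1(..), …, rk(..), s1(..), …, sm(..)
(here, the ri and sj are db predicates, not necessarily different)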

Characterizations for containment: assume q1, q2 are given.
A mapping h from the variables of q2 to the variables/constants of q1 (extended naturally to constants, as the identity, and to atoms) is a homomorphism from q2 to q1 if:
1) it maps head(q2) to head(q1) (assuming same heads ⇒ h is the identity on head vars)
2) it maps each atom of q2 to an atom of q1
3) if there are constraints on the side, Ci in qi, then h(C2) is implied by C1
Notation:

Thm: The following are equivalent, for CQ's w/o built-in preds:
(i) q1 ⊆ q2
(ii) there is a homomorphism from q2 to q1
Proof: (ii) ⇒ (i) is easy (and holds even with b.i. preds): every valuation v from q1 into a db D can be composed with h to give a valuation v∘h from q2 into D. Hence, every answer of q1 on D is also an answer of q2 on D.
(Diagram: q2 → q1 via h, q1 → D via v.)

For (i) ⇒ (ii):
The body of a CQ (w/o b.i.'s) can be viewed as a db:
- consider each variable as a constant, different from all constants in the CQ and from the other variables,
- or, replace each variable x by a distinct constant c_x.
Denote this db by db(q).
Obviously, q(db(q)) contains the head of q (or its image).
Example:
Q: q(d) :- movies(t, d, a), directory('Plaza', t, 19:30)
db(Q): movies(c_t, c_d, c_a), directory('Plaza', c_t, 19:30)
Obviously, applying Q to this db, one obtains q(c_d) (use the "identity" valuation).
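The canonical db construction can be sketched in a few lines of Python. This is only an illustration under assumed conventions that are not part of the slides: an atom is a pair (predicate, argument tuple), a variable is a string starting with a lowercase letter, any other term is a constant, and the frozen constant c_x is written as the tuple ("c", "x"). The names `canonical_db` and `evaluate` are hypothetical helpers.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with a lowercase letter.
    return isinstance(term, str) and term[:1].islower()

def freeze(term):
    # Replace variable x by the distinct constant c_x, written ("c", x).
    return ("c", term) if is_var(term) else term

def canonical_db(query):
    """Return db(q): the body of q with every variable frozen into a constant."""
    _head, body = query
    return {(pred, tuple(freeze(t) for t in args)) for pred, args in body}

def evaluate(query, db):
    """Return the set of answer tuples of a CQ on a db (set of facts)."""
    (_hpred, hargs), body = query
    answers = set()

    def search(i, val):
        if i == len(body):
            answers.add(tuple(val.get(t, t) for t in hargs))
            return
        pred, args = body[i]
        for fpred, fargs in db:
            if fpred != pred or len(fargs) != len(args):
                continue
            new = dict(val)
            ok = True
            for t, c in zip(args, fargs):
                if is_var(t):
                    if new.setdefault(t, c) != c:
                        ok = False
                        break
                elif t != c:
                    ok = False
                    break
            if ok:
                search(i + 1, new)

    search(0, {})
    return answers

# The example query Q from the slide.
Q = (("q", ("d",)),
     [("movies", ("t", "d", "a")),
      ("directory", ("'Plaza'", "t", "19:30"))])

print(evaluate(Q, canonical_db(Q)))
```

Applying Q to db(Q) yields the single answer (c_d), matching the "identity" valuation mentioned on the slide.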

(i) ⇒ (ii) (q2 contains q1 ⇒ there is a homomorphism from q2 to q1)
Clearly, q1(db(q1)) contains head(q1).
Since q1 ⊆ q2, q2(db(q1)) also contains head(q1).
The valuation from q2 to db(q1) that yields this answer is a homomorphism.
Example:
q1: p(d) :- movies(t, d, 'Jane'), directory('Plaza', t, 19:30), location('Plaza', a, )
q2: p(z) :- movies(t, z, a), directory('Plaza', t, 19:30)
Obviously, q1 is contained in q2, with h: t → t, z → d, a → 'Jane', which maps the two atoms of body(q2) to the first two of body(q1), and head(q2) to head(q1).

Because of this characterization, such a homomorphism is also called a containment mapping from q2 to q1.
Intuition: q1 is contained in q2 iff
- it has the 'same or more' atoms
- it may have some constants where q2 has variables

Another characterization:
For a rule p(..) :- r1(..), …, rk(..), a model is a set of facts over p, r1, …, rk that satisfies the rule as a logical formula (assuming all variables are universally quantified).
Thm: the following are equivalent:
The important & useful characterization: homomorphism, i.e., containment mapping.

Algorithm and complexity:
To decide if q1 is contained in q2, search for a containment mapping from the variables of q2 to the variables and constants of q1: easy & fast in many cases, exponential in the worst case.
Containment is in NP: given a mapping on the variables of q2, it is easy (polynomial time) to check that it is a homomorphism to q1.
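The search for a containment mapping can be sketched as a simple backtracking procedure. This is a minimal illustration, not the slides' own code: it assumes CQ's without built-in predicates, a query written as (head atom, list of body atoms), atoms written as (predicate, argument tuple), and variables written as strings starting with a lowercase letter. The function names are hypothetical. In the example, the third argument of `location` is missing in the transcript, so the variable `s` is a placeholder.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with a lowercase letter.
    return isinstance(term, str) and term[:1].islower()

def match_atom(atom2, atom1, h):
    """Extend h so that atom2 (of q2) is mapped onto atom1 (of q1), or return None."""
    (pred2, args2), (pred1, args1) = atom2, atom1
    if pred2 != pred1 or len(args2) != len(args1):
        return None
    h = dict(h)
    for t2, t1 in zip(args2, args1):
        if is_var(t2):
            if h.setdefault(t2, t1) != t1:   # variable already mapped elsewhere
                return None
        elif t2 != t1:                        # constants must map to themselves
            return None
    return h

def find_containment_mapping(q2, q1):
    """Return a containment mapping from q2 to q1 (a dict), or None if there is none."""
    (head2, body2), (head1, body1) = q2, q1

    def extend(i, h):
        if i == len(body2):
            return h
        for atom1 in body1:                   # try every target atom: exponential worst case
            h2 = match_atom(body2[i], atom1, h)
            if h2 is not None:
                found = extend(i + 1, h2)
                if found is not None:
                    return found
        return None

    h0 = match_atom(head2, head1, {})         # head(q2) must map to head(q1)
    return None if h0 is None else extend(0, h0)

q1 = (("p", ("d",)),
      [("movies", ("t", "d", "'Jane'")),
       ("directory", ("'Plaza'", "t", "19:30")),
       ("location", ("'Plaza'", "a", "s"))])   # 's' is a placeholder argument
q2 = (("p", ("z",)),
      [("movies", ("t", "z", "a")),
       ("directory", ("'Plaza'", "t", "19:30"))])

print(find_containment_mapping(q2, q1))   # a mapping exists, so q1 is contained in q2
print(find_containment_mapping(q1, q2))   # no mapping in the other direction
```

Checking a given mapping is cheap (one pass over the atoms of q2); the cost lies in the nested choice of target atoms, which is where the worst-case exponential behaviour comes from.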

It is NP-hard: a graph G is 3-colorable iff there is a homomorphism from G (represented as an edge relation) to the 3-clique.
- One can represent G as the body of q2 (using distinct variables for distinct nodes) and the 3-clique as the body of q1.
- For both, the head can be q( ) *
Hence, containment & equivalence are NP-complete (even for queries with no head variables).
Note: this is expression complexity, not data complexity (here there is actually no db).
* (when such a query is applied to a db, it returns either {()} or {})
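The reduction from 3-colorability can be tried out on small graphs. The sketch below is an assumption-laden illustration: edges are given as pairs of node ids, an undirected edge is written once as an atom e(u, v), the 3-clique body lists both directions of every edge, and all terms are treated as variables. The helper names are hypothetical.

```python
from itertools import permutations

def find_homomorphism(body2, body1, h=None):
    """Backtracking search for a map sending every atom of body2 onto an atom of body1."""
    h = h or {}
    if not body2:
        return h
    (pred, args), rest = body2[0], body2[1:]
    for pred1, args1 in body1:
        if pred1 != pred or len(args1) != len(args):
            continue
        new = dict(h)
        # Every term is a variable here, so each argument just needs a consistent image.
        if all(new.setdefault(a, b) == b for a, b in zip(args, args1)):
            found = find_homomorphism(rest, body1, new)
            if found is not None:
                return found
    return None

def three_colorable(edges):
    """G is 3-colorable iff body(q2), built from G, maps homomorphically to the 3-clique."""
    q2_body = [("e", (f"n{u}", f"n{v}")) for u, v in edges]
    q1_body = [("e", pair) for pair in permutations(("red", "green", "blue"), 2)]
    return find_homomorphism(q2_body, q1_body) is not None

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(three_colorable(cycle4), three_colorable(k4))
```

Both queries have the empty head q( ), so the head condition is trivially satisfied and only the bodies matter.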

Minimization of CQ's:
For q, define a minimal equivalent query as any equivalent q' with a minimal number of body atoms.
Thm: the minimal equivalent query of q is unique up to isomorphism, and can be obtained by removing some atoms from body(q).
Proof:

Thus, for every CQ Q, there is a subset of the body that gives a minimal equivalent query, called a core of Q.
It is not necessarily unique (different subsets may yield cores), but all cores are isomorphic.
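One way to compute a core, sketched here under the same assumed encoding (query = (head atom, body list), lowercase strings are variables), is to drop body atoms greedily: an atom can be removed whenever there is a containment mapping from the current query to the query without that atom, since the reverse containment always holds. The slides do not spell out a procedure, so this greedy loop and its function names are an illustration, not the author's algorithm.

```python
def is_var(term):
    return isinstance(term, str) and term[:1].islower()

def match_atom(atom2, atom1, h):
    """Extend h so that atom2 is mapped onto atom1, or return None."""
    (pred2, args2), (pred1, args1) = atom2, atom1
    if pred2 != pred1 or len(args2) != len(args1):
        return None
    h = dict(h)
    for t2, t1 in zip(args2, args1):
        if is_var(t2):
            if h.setdefault(t2, t1) != t1:
                return None
        elif t2 != t1:
            return None
    return h

def find_containment_mapping(q2, q1):
    (head2, body2), (head1, body1) = q2, q1

    def extend(i, h):
        if i == len(body2):
            return h
        for atom1 in body1:
            h2 = match_atom(body2[i], atom1, h)
            if h2 is not None:
                found = extend(i + 1, h2)
                if found is not None:
                    return found
        return None

    h0 = match_atom(head2, head1, {})
    return None if h0 is None else extend(0, h0)

def minimize(query):
    """Greedily remove body atoms while the query stays equivalent."""
    head, body = query
    body = list(body)
    changed = True
    while changed:
        changed = False
        for i in range(len(body)):
            smaller = body[:i] + body[i + 1:]
            # A mapping from the current query into the smaller one shows equivalence.
            if find_containment_mapping((head, body), (head, smaller)) is not None:
                body, changed = smaller, True
                break
    return head, body

# q(x) :- r(x, y), r(x, z), s(y): the atom r(x, z) is redundant.
q = (("q", ("x",)), [("r", ("x", "y")), ("r", ("x", "z")), ("s", ("y",))])
print(minimize(q))
```

Which atoms get removed depends on the order in which they are tried, which is consistent with the remark that different subsets may yield cores.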

Containment & equivalence for extensions of CQ's
Extension to UCQ's: let
q1: r1: p(x) :- body1,1  …  rk: p(x) :- body1,k
q2: s1: p(x) :- body2,1  …  sm: p(x) :- body2,m
Thm: q1 ⊆ q2 iff for each ri there is an sj such that ri ⊆ sj
Proof:
⇐ is obvious.
⇒: if q1 is contained in q2, then each ri is contained in q2 ⇒ q2(db(ri)) contains p(x) ⇒ for some sj, sj(db(ri)) contains p(x) ⇒ sj contains ri.

Containment algorithm:
For each ri, loop over the sj, and search for a containment mapping from sj to ri.
Still exponential in the size (of both queries).
Complexity: The containment problem is now
Explanation: A relation R(..) is ptime if membership can be verified in ptime.
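The loop over the ri and sj can be sketched directly. As before, this is a hedged illustration with an assumed encoding: a UCQ is a list of CQ's, each a pair (head atom, body list), and lowercase strings are variables. The function names and the example rules are hypothetical.

```python
def is_var(term):
    return isinstance(term, str) and term[:1].islower()

def match_atom(atom2, atom1, h):
    """Extend h so that atom2 is mapped onto atom1, or return None."""
    (pred2, args2), (pred1, args1) = atom2, atom1
    if pred2 != pred1 or len(args2) != len(args1):
        return None
    h = dict(h)
    for t2, t1 in zip(args2, args1):
        if is_var(t2):
            if h.setdefault(t2, t1) != t1:
                return None
        elif t2 != t1:
            return None
    return h

def find_containment_mapping(q2, q1):
    (head2, body2), (head1, body1) = q2, q1

    def extend(i, h):
        if i == len(body2):
            return h
        for atom1 in body1:
            h2 = match_atom(body2[i], atom1, h)
            if h2 is not None:
                found = extend(i + 1, h2)
                if found is not None:
                    return found
        return None

    h0 = match_atom(head2, head1, {})
    return None if h0 is None else extend(0, h0)

def ucq_contained(q1_rules, q2_rules):
    """q1 is contained in q2 iff every rule ri of q1 is contained in some rule sj of q2."""
    return all(
        any(find_containment_mapping(s, r) is not None for s in q2_rules)
        for r in q1_rules
    )

# Hypothetical example rules, all with head p(x).
q1 = [(("p", ("x",)), [("r", ("x", "x"))]),
      (("p", ("x",)), [("s", ("x",)), ("r", ("x", "y"))])]
q2 = [(("p", ("x",)), [("r", ("x", "y"))]),
      (("p", ("x",)), [("s", ("x",))])]

print(ucq_contained(q1, q2), ucq_contained(q2, q1))
```

Each inner call is a CQ containment test, so the whole check performs at most k·m mapping searches for k rules in q1 and m rules in q2.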

For a UCQ Q we can also consider the canonical db of Q, denoted db(Q), obtained by taking the bodies of all the rules together as a db (with different existential variables in different rules).
Here also:
Thm: Q1 is contained in Q2 iff Q2(db(Q1)) contains head(Q1)
(this also gives an algorithm for checking containment, which boils down to finding containment mappings)

Another extension of CQ's: b.i. preds in the body
Example:
Q1: p(x, y) :- q(x, y), r(u, v), u <= v
Q2: p(x, y) :- q(x, y), r(u, v), r(v, u)
Is Q2 contained in/equivalent to Q1?
Q2 is equivalent to the union of
Q2,1: p(x, y) :- q(x, y), r(u, v), r(v, u), u <= v
Q2,2: p(x, y) :- q(x, y), r(u, v), r(v, u), v < u
Clearly, Q2,1 and Q2,2 are both contained in Q1, hence so is Q2. Note that no single homomorphism from Q1 to Q2 shows this: Q2 has no constraints, so it implies neither u <= v nor v <= u.
This can be generalized to an algorithm that reduces containment to that of UCQ's (omitted).
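The case split on the order of u and v can be mimicked by brute force: instantiate the variables of the contained query with every pattern of small integers, which covers every relative order, and check that the containing query returns the head tuple on each resulting db. This is only a sketch of that idea, not the omitted algorithm from the slides. It assumes queries without constants, with constraints restricted to `<=` and `<` between variables, written as (head variables, body atoms, constraints). All names are hypothetical.

```python
from itertools import product

def valuations(body, db, val=None):
    """Yield every assignment of the body variables that maps all atoms into db."""
    val = val or {}
    if not body:
        yield val
        return
    (pred, args), rest = body[0], body[1:]
    for fpred, fargs in db:
        if fpred != pred or len(fargs) != len(args):
            continue
        new = dict(val)
        if all(new.setdefault(a, c) == c for a, c in zip(args, fargs)):
            yield from valuations(rest, db, new)

def holds(constraints, val):
    """Check a list of (a, op, b) constraints, op being '<=' or '<'."""
    return all(val[a] <= val[b] if op == "<=" else val[a] < val[b]
               for a, op, b in constraints)

def contained(qa, qb):
    """Test qa ⊆ qb on every ordered instantiation of the body of qa."""
    head_a, body_a, cons_a = qa
    head_b, body_b, cons_b = qb
    variables = sorted({t for _, args in body_a for t in args})
    for values in product(range(len(variables)), repeat=len(variables)):
        val = dict(zip(variables, values))
        if not holds(cons_a, val):
            continue                      # this ordering is excluded by qa itself
        db = {(p, tuple(val[t] for t in args)) for p, args in body_a}
        target = tuple(val[t] for t in head_a)
        if not any(holds(cons_b, v) and tuple(v[t] for t in head_b) == target
                   for v in valuations(body_b, db)):
            return False
    return True

Q1 = (("x", "y"), [("q", ("x", "y")), ("r", ("u", "v"))], [("u", "<=", "v")])
Q2 = (("x", "y"), [("q", ("x", "y")), ("r", ("u", "v")), ("r", ("v", "u"))], [])

print(contained(Q2, Q1), contained(Q1, Q2))
```

On this example the check reports that Q2 is contained in Q1 but not the other way round, so the two queries are not equivalent.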

Containment of a UCQ Q and a (recursive) Datalog program P:
Still decidable, but double exponential time (upper & lower bound).
Here also:
Thm: P contains Q iff P(db(Q)) contains head(Q)
This gives an algorithm for checking containment: apply P to db(Q), and see if you obtain head(Q) (do you see exponentials in this algorithm?).
Containment of Datalog programs: undecidable.
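For a single CQ Q, the test "apply P to db(Q) and look for head(Q)" can be sketched with naive bottom-up evaluation. The encoding is assumed, not taken from the slides: a rule is (head atom, body list), lowercase strings are variables, and the frozen constant c_x is written ("c", "x"). The transitive-closure program and the function names are hypothetical examples.

```python
def is_var(term):
    return isinstance(term, str) and term[:1].islower()

def matches(body, facts, val):
    """Yield valuations extending val that map every body atom to a fact."""
    if not body:
        yield val
        return
    (pred, args), rest = body[0], body[1:]
    for fpred, fargs in facts:
        if fpred != pred or len(fargs) != len(args):
            continue
        new = dict(val)
        if all((new.setdefault(a, c) == c) if is_var(a) else a == c
               for a, c in zip(args, fargs)):
            yield from matches(rest, facts, new)

def naive_eval(program, facts):
    """Apply all rules repeatedly until no new fact is derived (least fixpoint)."""
    facts = set(facts)
    while True:
        derived = set()
        for (hpred, hargs), body in program:
            for val in matches(body, facts, {}):
                derived.add((hpred, tuple(val.get(t, t) for t in hargs)))
        if derived <= facts:
            return facts
        facts |= derived

def freeze(term):
    return ("c", term) if is_var(term) else term

def program_contains(program, query):
    """P contains the CQ Q iff P(db(Q)) contains the frozen head of Q."""
    (hpred, hargs), body = query
    db = {(p, tuple(freeze(t) for t in args)) for p, args in body}
    goal = (hpred, tuple(freeze(t) for t in hargs))
    return goal in naive_eval(program, db)

# P: transitive closure t of the edge relation e.
P = [(("t", ("x", "y")), [("e", ("x", "y"))]),
     (("t", ("x", "y")), [("e", ("x", "z")), ("t", ("z", "y"))])]
# Q: paths of length three.
Q = (("t", ("x", "y")), [("e", ("x", "z")), ("e", ("z", "w")), ("e", ("w", "y"))])
# Q_rev: an edge in the opposite direction.
Q_rev = (("t", ("x", "y")), [("e", ("y", "x"))])

print(program_contains(P, Q), program_contains(P, Q_rev))
```

The fixpoint loop terminates because only finitely many facts can be built from the constants of db(Q); the number of derivable facts is polynomial in the size of db(Q) but exponential in the arity of the program's predicates.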