

Containment CSE 590 DB Rachel Pottinger

Outline
- Introduction
- Motivation
- Formal definition
- Algorithms for different complexities
- An application: rewriting queries using views

Containment, what is it?
- For two queries Q1 and Q2, if the answers to Q1 are a subset of the answers to Q2 on every database, then Q1 is contained in Q2.
- Denoted Q1 ⊆ Q2.
- For general Datalog, containment is undecidable (by reduction from decision problems for context-free languages).

Why should I care?
- Containment is useful in a number of situations, including:
  - Query minimization
  - Independence of queries from updates
  - Rewriting queries using views
- It is an interesting logic problem

More definitions
- Equivalence of queries: Q1 ≡ Q2 if they return the same answers on all databases. This is the same as Q1 ⊆ Q2 and Q2 ⊆ Q1.
- Conjunctive query: a query formed only of conjunctions of predicates.
- Example: Q(X,Y) :- e(X,Z), e(Z,Y)

Containment Mapping
- Let Q1 and Q2 be two conjunctive queries:
  - Q1: I :- J1, …, Jl
  - Q2: H :- G1, …, Gk
- A symbol mapping h is a containment mapping if h turns Q2 into Q1; that is, h(H) = I, and for each i = 1, 2, …, k there is some j such that h(Gi) = Jj.
- There is no requirement that each Jj be the target of some Gi.

Proof Sketch
- If there is a containment mapping from Q2 to Q1, then Q1 ⊆ Q2.
  - Suppose h maps Vars(Q2) → Vars(Q1).
  - Let D be a database and let σ be an assignment that yields an answer of Q1 on D.
  - σ is a mapping Vars(Q1) → D.
  - σ ∘ h maps Vars(Q2) → D, so it yields the same answer for Q2 on D.
- The rest of the proof follows later.

Example of homomorphism rules
- Q1: fp(X,Y) :- e(Y,X), e(X,Z)
- Q2: fp(A,B) :- e(B,A), e(C,A), e(A,D)
- To show Q1 ⊆ Q2, map from Q2 to Q1.
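
The mapping for this example can be found by brute force. The sketch below is only an illustration, not part of the original slides: the tuple encoding of queries and the helper names (`variables`, `apply_mapping`, `containment_mappings`) are assumptions made here. It tries every assignment of Vars(Q2) to Vars(Q1) and keeps those that send the head to the head and every subgoal to some subgoal.

```python
from itertools import product

# Assumed encoding: a query is (head, body); an atom is (predicate, variables).
Q1 = (("fp", ("X", "Y")), [("e", ("Y", "X")), ("e", ("X", "Z"))])
Q2 = (("fp", ("A", "B")), [("e", ("B", "A")), ("e", ("C", "A")), ("e", ("A", "D"))])


def variables(query):
    """List the variables of a query in order of first appearance."""
    head, body = query
    seen = []
    for _, args in [head] + body:
        for v in args:
            if v not in seen:
                seen.append(v)
    return seen


def apply_mapping(h, atom):
    """Replace every variable of an atom by its image under h."""
    pred, args = atom
    return (pred, tuple(h[v] for v in args))


def containment_mappings(src, dst):
    """Yield each h: Vars(src) -> Vars(dst) with h(head) = head and
    every subgoal of src mapped onto some subgoal of dst."""
    src_vars, dst_vars = variables(src), variables(dst)
    dst_body = set(dst[1])
    for image in product(dst_vars, repeat=len(src_vars)):
        h = dict(zip(src_vars, image))
        if apply_mapping(h, src[0]) == dst[0] and all(
            apply_mapping(h, g) in dst_body for g in src[1]
        ):
            yield h


# Mappings from Q2 to Q1 witness Q1 ⊆ Q2.
mappings = list(containment_mappings(Q2, Q1))
print(mappings)
```

The search is exponential in the number of variables of the source query, which is consistent with containment of conjunctive queries being NP-complete in general; it is meant for slide-sized examples only. Running the same search from Q1 to Q2 also finds a mapping, so under this encoding the two queries come out equivalent.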

Test for containment of a conjunctive query (Q1 ⊆ Q2)
1. Freeze the body of Q1 and put the frozen subgoals into a canonical database.
2. Apply Q2 to the canonical database.
3. If the frozen head of Q1 can be derived by applying Q2 to the canonical database, then Q1 ⊆ Q2; otherwise not.

A chilling example
- Q1: p(X,Z) :- a(X,Y), a(Y,Z)
- Q2: p(X,Z) :- a(X,U), a(V,Z)
- Canonical database of Q1: a(x,y), a(y,z)
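
The canonical-database test can be run on this example with a few lines of Python. This is a minimal sketch under assumptions made here, not the slides' own code: queries are encoded as (head, body) tuples, freezing simply lowercases each variable name (so variable names must stay distinct when lowercased), and `evaluate` is a naive nested-loop evaluator.

```python
def freeze(var):
    # Assumption: the frozen constant for a variable is its lowercase name.
    return var.lower()


def evaluate(query, db):
    """Return every head tuple derivable by applying a conjunctive query
    to db, a set of (predicate, constants) facts."""
    head, body = query
    results = set()

    def extend(i, binding):
        if i == len(body):
            results.add(tuple(binding[v] for v in head[1]))
            return
        pred, args = body[i]
        for fact_pred, fact_args in db:
            if fact_pred != pred or len(fact_args) != len(args):
                continue
            new = dict(binding)
            # Bind unbound variables; reject facts that clash with a binding.
            if all(new.setdefault(v, c) == c for v, c in zip(args, fact_args)):
                extend(i + 1, new)

    extend(0, {})
    return results


def contained_in(q1, q2):
    """Freeze q1, apply q2, and look for the frozen head of q1."""
    head1, body1 = q1
    canonical_db = {(p, tuple(freeze(v) for v in args)) for p, args in body1}
    frozen_head = tuple(freeze(v) for v in head1[1])
    return frozen_head in evaluate(q2, canonical_db)


Q1 = (("p", ("X", "Z")), [("a", ("X", "Y")), ("a", ("Y", "Z"))])
Q2 = (("p", ("X", "Z")), [("a", ("X", "U")), ("a", ("V", "Z"))])

print(contained_in(Q1, Q2))  # Q1 ⊆ Q2 ?
print(contained_in(Q2, Q1))  # Q2 ⊆ Q1 ?
```

On the canonical database of Q1, Q2 can match a(X,U) against a(x,y) and a(V,Z) against a(y,z), which derives the frozen head p(x,z). In the other direction the canonical database of Q2 contains no two-step path, so Q1 derives nothing from it.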

Proof continued
- If Q1 ⊆ Q2, then there is a containment mapping from Q2 to Q1.
  - Since Q1 ⊆ Q2, applying Q2 to the canonical database formed from Q1 yields the same fact that Q1 yields on it, namely the frozen head of Q1. The assignment that derives this fact is a mapping from Q2 to Q1.

Conjunctive queries with negation
- Negation may appear on subgoals in the body, e.g. Q(X,Y) :- e(X,Z), ¬e(Z,Y)
- The Levy and Sagiv test looks at an exponential number of canonical databases; containment is Π₂ᵖ-complete.
1. Consider all partitions of the variables of Q1; form a canonical database for each of them, D1, …, Dk.
2. For each database Di, see if it makes all subgoals of Q1 true.
3. For each Di passing step 2, see if the head of Q1 can be derived by applying Q2.
4. If so, then Q1 ⊆ Q2; else not.

A negative example
- Q1: p(X,Z) :- a(X,Y), a(Y,Z), ¬a(X,Z)
- Q2: p(A,C) :- a(A,B), a(B,C), ¬a(A,D)
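
Steps 1 and 2 of the test can be illustrated on Q1 of this example. The sketch below covers only those two steps (enumerating partitions of the variables and keeping the canonical databases that satisfy Q1's negated subgoal); it does not decide the containment itself. The encoding and the helper names `partitions` and `canonical_database` are assumptions made here.

```python
def partitions(items):
    """Yield every partition of a list as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing block in turn...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ...or into a new block of its own.
        yield [[first]] + part


# Q1: p(X,Z) :- a(X,Y), a(Y,Z), ¬a(X,Z)
positive = [("a", ("X", "Y")), ("a", ("Y", "Z"))]
negated = [("a", ("X", "Z"))]


def canonical_database(part):
    """Freeze the positive subgoals, using one constant per block.
    Returns the database and the variable-to-constant map."""
    const = {v: block[0].lower() for block in part for v in block}
    db = {(p, tuple(const[v] for v in args)) for p, args in positive}
    return db, const


passing = []
for part in partitions(["X", "Y", "Z"]):
    db, const = canonical_database(part)
    frozen_negated = [(p, tuple(const[v] for v in args)) for p, args in negated]
    # Step 2: the positive subgoals hold by construction, so only the
    # negated subgoals can fail.
    if all(atom not in db for atom in frozen_negated):
        passing.append(part)

print(passing)
```

Of the five partitions of {X, Y, Z}, only the one that keeps all variables distinct and the one that equates X with Z leave a(x,z) out of the database, so only those two canonical databases go on to step 3.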

Conjunctive Queries with Arithmetic Comparisons
- Example: Q(X,Y) :- e(X,Z), e(Z,Y), Z < Y
- Treated like negated subgoals, except that a check must be made for each ordering of each partition.
- Also Π₂ᵖ-complete for dense domains such as the reals.

Example with arithmetic comparisons
- Q1: p(X,Z) :- a(X,Y), a(Y,Z), X < Y
- Q2: p(A,C) :- a(A,B), a(B,C), A < C
- Q1 ⊆ Q2 is false: consider X = Z = 0, Y = 1.
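
The counterexample can be checked directly. The sketch below assumes the database consists of exactly the two facts that Q1 needs under X = Z = 0, Y = 1, namely a(0,1) and a(1,0), and evaluates both queries with set comprehensions.

```python
# Database built from the counterexample X = Z = 0, Y = 1: a(0,1) and a(1,0).
a = {(0, 1), (1, 0)}

# Q1: p(X,Z) :- a(X,Y), a(Y,Z), X < Y
q1_answers = {(x, z) for (x, y) in a for (y2, z) in a if y == y2 and x < y}

# Q2: p(A,C) :- a(A,B), a(B,C), A < C
q2_answers = {(A, C) for (A, B) in a for (B2, C) in a if B == B2 and A < C}

print(q1_answers, q2_answers)
print(q1_answers <= q2_answers)  # would have to be True for Q1 ⊆ Q2
```

Q1 returns p(0,0) on this database, while Q2 returns nothing because every path of length two starts and ends at the same value, so A < C never holds.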

Other complexity results
- Queries restricted to queries Q1 and Q2 such that all database predicates have arity at most 2 and every database predicate occurs at most three times in the body of Q1: Π₂ᵖ
- Conjunctive queries where Q1 is fixed: NP-complete
- Conjunctive queries where Q2 is fixed: polynomial time
- Conjunctive query containment where Q2 is an acyclic query: polynomial time
- Conjunctive queries where every database predicate occurs at most twice in the body of Q1: linear time

Rewriting Queries Using Views
- Useful in query optimization
- Good for query minimization
- Needed to make the best use of cached information
- Necessary in data integration

Views
- A view is a relation that is not part of the conceptual model, but is visible to the user.
- Useful for common expressions, or for protecting data.
- Example: if you had faculty(name, office, ssn), you may want students to access faculty_office(name, office).
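
The faculty example can be reproduced with Python's built-in `sqlite3` module. This is only a sketch: the SQL column types and the single sample row are made up here for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faculty (name TEXT, office TEXT, ssn TEXT)")
# Made-up sample row, for illustration only.
conn.execute("INSERT INTO faculty VALUES ('Example Prof', 'Room 101', '000-00-0000')")

# The view exposes name and office but hides ssn.
conn.execute("CREATE VIEW faculty_office AS SELECT name, office FROM faculty")

cursor = conn.execute("SELECT * FROM faculty_office")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```

A view created this way is virtual: SQLite stores only the defining query and re-evaluates it on each access.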

Views (cont.)
- Views can be either materialized or virtual.
- In data integration, data sources can be thought of as views.

An example of rewriting queries using views
- Suppose you had two databases:
  - One has famous people and whether they are right or left handed
  - One has the birthdays of famous people
- You want the birthdays of all of the lefties
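
Treating each database as a view, the request becomes a join over the two sources. The sketch below uses placeholder names and dates invented here, and assumes each source is available as a set of tuples.

```python
# Source 1 (assumed shape): (person, handedness). Placeholder data.
handedness = {("person_a", "left"), ("person_b", "right"), ("person_c", "left")}
# Source 2 (assumed shape): (person, birthday). Placeholder data.
birthdays = {("person_a", "01-15"), ("person_b", "07-04")}

# q(N, B) :- handedness(N, left), birthdays(N, B)
lefty_birthdays = {
    (n, b)
    for (n, h) in handedness
    if h == "left"
    for (n2, b) in birthdays
    if n == n2
}
print(lefty_birthdays)
```

Only people present in both sources can appear in the answer, so a lefty with no recorded birthday (person_c here) is missing from the result.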

Containment in rewriting
- Query: q(X) :- e(X,Y), e(Y,X)
- View: v(A,B) :- e(A,C), e(C,B)

A more complicated example
- Q(x,u) :- p(x,y), p0(y,z), p1(x,w), p2(w,u)
- V1(a,b) :- p(a,c), p0(c,b), p1(a,d)
- V2(a,b) :- p1(a,b)
- V3(a,b) :- p2(a,b)
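
One way to inspect a candidate rewriting is to unfold it, replacing each view atom by the view's body and giving the view's non-head variables fresh names. The sketch below is an illustration only: the candidate Q'(x,u) :- V1(x,z), V2(x,w), V3(w,u) is an assumption made here and is not stated in the slides, and the helper name `unfold` is likewise hypothetical.

```python
from itertools import count

# View definitions from the example: name -> (head variables, body atoms).
views = {
    "V1": (("a", "b"), [("p", ("a", "c")), ("p0", ("c", "b")), ("p1", ("a", "d"))]),
    "V2": (("a", "b"), [("p1", ("a", "b"))]),
    "V3": (("a", "b"), [("p2", ("a", "b"))]),
}

# Hypothetical candidate rewriting: Q'(x,u) :- V1(x,z), V2(x,w), V3(w,u)
rewriting_body = [("V1", ("x", "z")), ("V2", ("x", "w")), ("V3", ("w", "u"))]


def unfold(body, views):
    """Replace each view atom by the view's body, renaming the view's
    existential variables to fresh names."""
    fresh = count(1)
    expanded = []
    for name, args in body:
        head_vars, view_body = views[name]
        subst = dict(zip(head_vars, args))
        for pred, pred_args in view_body:
            new_args = []
            for v in pred_args:
                if v not in subst:
                    # Fresh name for a variable that is not in the view head.
                    subst[v] = f"_{v}{next(fresh)}"
                new_args.append(subst[v])
            expanded.append((pred, tuple(new_args)))
    return expanded


expansion = unfold(rewriting_body, views)
for atom in expansion:
    print(atom)
```

The unfolding contains the four subgoals of Q up to renaming of y, plus one extra p1 subgoal contributed by V1. Checking that the unfolded query is equivalent to Q is then a containment test in both directions.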