CS589 Principles of DB Systems Fall 2008 Lecture 4c: Query Language Equivalence Lois Delcambre 503 725-2405.

Slides:



Advertisements
Similar presentations
1 Datalog: Logic Instead of Algebra. 2 Datalog: Logic instead of Algebra Each relational-algebra operator can be mimicked by one or several Database Logic.
Advertisements

Relational Calculus and Datalog
Lecture 11: Datalog Tuesday, February 6, Outline Datalog syntax Examples Semantics: –Minimal model –Least fixpoint –They are equivalent Naive evaluation.
Knowledge & Reasoning Logical Reasoning: to have a computer automatically perform deduction or prove theorems Knowledge Representations: modern ways of.
CPSC 504: Data Management Discussion on Chandra&Merlin 1977 Laks V.S. Lakshmanan Dept. of CS UBC.
Chapter 3 Tuple and Domain Relational Calculus. Tuple Relational Calculus.
©Silberschatz, Korth and Sudarshan5.1Database System Concepts Chapter 5: Other Relational Languages Query-by-Example (QBE) Datalog.
1 Conjunctions of Queries. 2 Conjunctive Queries A conjunctive query is a single Datalog rule with only non-negated atoms in the body. (Note: No negated.
1 CHAPTER 4 RELATIONAL ALGEBRA AND CALCULUS. 2 Introduction - We discuss here two mathematical formalisms which can be used as the basis for stating and.
Ver 1,12/09/2012Kode :CCs 111,Sistem basis DataFASILKOM Chapter 5: Other Relational Languages Database System Concepts, 5th Ed. ©Silberschatz, Korth and.
Answer Set Programming Overview Dr. Rogelio Dávila Pérez Profesor-Investigador División de Posgrado Universidad Autónoma de Guadalajara
1 9. Evaluation of Queries Query evaluation – Quantifier Elimination and Satisfiability Example: Logical Level: r   y 1,…y n  r’ Constraint.
Winter 2002Arthur Keller – CS 18014–1 Schedule Today: Feb. 26 (T) u Datalog. u Read Sections Assignment 6 due. Feb. 28 (TH) u Datalog and SQL.
Winter 2004/5Pls – inductive – Catriel Beeri1 Inductive Definitions (our meta-language for specifications)  Examples  Syntax  Semantics  Proof Trees.
UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering Using Definite Knowledge Notes for Ch.3 of Poole et al. CSCE 580 Marco Valtorta.
Deductive Databases Chapter 25
DEDUCTIVE DATABASE.
1 Relational Algebra and Calculus Chapter 4. 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database.
The Relational Model: Relational Calculus
1 CS 430 Database Theory Winter 2005 Lecture 6: Relational Calculus.
CSE 544 Relational Calculus Lecture #2 January 11 th, Dan Suciu , Winter 2011.
CS Introduction to AI Tutorial 8 Resolution Tutorial 8 Resolution.
Computing & Information Sciences Kansas State University Wednesday, 17 Sep 2008CIS 560: Database System Concepts Lecture 9 of 42 Wednesday, 18 September.
Lu Chaojun, SJTU 1 Extended Relational Algebra. Bag Semantics A relation (in SQL, at least) is really a bag (or multiset). –It may contain the same tuple.
603 Database Systems Senior Lecturer: Laurie Webster II, M.S.S.E.,M.S.E.E., M.S.BME, Ph.D., P.E. Lecture 26 A First Course in Database Systems.
CS589 Principles of DB Systems Fall 2008 Lecture 4d: Recursive Datalog with Negation – What is the query answer defined to be? Lois Delcambre
Extensions of Datalog Wednesday, February 13, 2001.
CS589 Principles of DB Systems Fall 2008 Lecture 4b: Domain Independence and Safety Lois Delcambre
Relational Calculus Database Management Systems, 3rd ed., Ramakrishnan and Gehrke, Chapter 4.
Relational Calculus Chapter 4, Section 4.3.
By P. S. Suryateja Asst. Professor, CSE Vishnu Institute of Technology
CS589 Principles of DB Systems Spring 2014 Unit 2: Recursive Query Processing Lecture 2-1 – Naïve algorithm for recursive queries Lois Delcambre (slides.
Introduction to Logic for Artificial Intelligence Lecture 2
Module 2: Intro to Relational Model
Propositional Calculus: Boolean Functions and Expressions
Relational Model By Dr.S.Sridhar, Ph.D.(JNUD), RACI(Paris, NICE), RMR(USA), RZFM(Germany)
Relational Calculus Chapter 4, Part B
Goal for this lecture Demonstrate how we can prove that one query language is more expressive than (i.e., “contained in” as described in the book) another.
Chapter 6: Formal Relational Query Languages
Predicate Calculus Validity
First-Order Logic and Inductive Logic Programming
Chapter 2: Intro to Relational Model
Elmasri/Navathe, Fundamentals of Database Systems, 4th Edition
Elementary Metamathematics
Database Applications (15-415) Relational Calculus Lecture 6, September 6, 2016 Mohammad Hammoud.
Logic Programming Languages
Chapter 5: Other Relational Languages
Relational Calculus.
Motivation for Datalog
MA/CSSE 474 More Math Review Theory of Computation
Chapter 2: Intro to Relational Model
Chapter 2: Intro to Relational Model
CS 186, Fall 2002, Lecture 8 R&G, Chapter 4
Logic Based Query Languages
Chapter 2: Intro to Relational Model
Chapter 6: Formal Relational Query Languages
Example of a Relation attributes (or columns) tuples (or rows)
Chapter 2: Intro to Relational Model
Datalog Inspired by the impedance mismatch in relational databases.
This Lecture Substitution model
From Relational Calculus to Relational Algebra
Relational Algebra & Calculus
Chapter 27: Formal-Relational Query Languages
Relational Calculus Chapter 4, Part B 7/1/2019.
Representations & Reasoning Systems (RRS) (2.2)
CS589 Principles of DB Systems Fall 2008 Lecture 4a: Introduction to Datalog Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4a: Introduction to Datalog Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4e: Logic (Model-theoretic view of a DB) Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4b: Domain Independence and Safety Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4d: Recursive Datalog with Negation – What is the query answer defined to be? Lois Delcambre
Presentation transcript:

CS589 Principles of DB Systems Fall 2008 Lecture 4c: Query Language Equivalence Lois Delcambre

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 2 Goal for this lecture Demonstrate how we can prove that one query language is more expressive than (what the book calls “contained in”) another. Introduce the way the proofs use mathematical induction Walk through some of the proofs from the book Summarize the results of QL equivalence

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 3 Equivalence of relational query languages Two queries are equivalent if they return the same answer for any possible DB state. One QL 2 is more expressive than QL 1 if we can prove that every query expressible in QL 1 can be expressed in QL 2. QL 1 and QL 2 are equivalent if you can prove “more expressive” or “contained in” in both directions. The following four query languages are equivalent. Relational algebra Safe, non-recursive datalog programs with negation Allowed domain calculus (and allowed tuple calculus)

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 4 How do we prove it? Complete the circle 1. Prove relational algebra is contained in safe, non-recursive Datalog with negation 2. Prove safe, non-recursive Datalog is contained in allowable domain calculus with negation 3. Prove allowable domain relational calculus is contained in relational algebra Then we will know that all three languages are equivalent.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 5 1. Prove relational algebra is contained in safe, non-recursive Datalog We need to take an arbitrary relational algebra query and show how to construct an equivalent safe, non- recursive Datalog program. How shall we frame the proof? How do we take an arbitrary relational algebra query expression? By induction on the number of operators that appear in the query expression. We need: a (minimal) list of the operators in relational algebra. a base case. the inductive hypotheses.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 6 1. Prove relational algebra is contained in safe, non-recursive Datalog (continued) Minimal set of operators (with the full expressive power of the relational algebra): , -, , ⋈, s ( By the way, how would you prove that this is, in fact, a minimal set of operators?) Base case for the induction: a relational algebra expression with zero operators. that is, a query of the form: R What is the equivalent Datalog program? answer(x 1, x 2, …, x n ) :- R(x 1, x 2, …, x n ). or just R(x 1, x 2, …, x n ).

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 7 The induction step Assume that the theorem is true for all relational algebra expression with q or fewer operators. Then consider a relational algebra expression with q+1 operators. What can that (q+1) st operator be? One of the operators in our minimal set. UNION: if the query expression Q is Q 1  Q 2 then the equivalent safe, non-recursive Datalog program is: Q(x 1, x 2, …, x n ) :- Q 1 (x 1, x 2, …, x n ). Q(x 1, x 2, …, x n ) :- Q 2 (x 1, x 2, …, x n ). How do we know this Datalog query is safe? We are assuming Q 1 and Q 2 are union-compatible. Is that okay? (QED for Union)

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 8 The remaining operators: DIFFERENCE: if the query Q is Q 1 ̶ Q 2 then the equivalent safe, non-recursive Datalog program is: Q(x 1, x 2, …, x n ) :- Q 1 (x 1, x 2, …, x n ),  Q 2 (x 1, x 2, …, x n ). How do we know that Q, Q1, and Q2 have same attribute and thus the same number of attributes? How do we know that this is safe? (QED for Diff.) PROJECT: NATURAL JOIN: SELECT:

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 9 Prove that the theorem holds for the remaining operators: PROJECT: NATURAL JOIN: SELECT:

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 10 Reminder: Which Datalog language? Datalog – one rule, no negation, no recursion. (conjunctive queries) Datalog – multiple rules, no negation, no recursion. Datalog – multiple rules, no negation, with recursion. Recursion but not relationally complete. Datalog – multiple rules, with negation, no recursion. Relationally complete but no recursion. Datalog – multiple rules, with negation, with recursion. Relationally complete with recursion but some queries are ambiguous! Our focus here

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 11 Prove safe, non-recursive Datalog is contained in allowable domain calculus How would you get started?

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 12 (Reminder) Datalog syntax Atomic formulas: R(y 1, y 2, …, y k ) – a predicate formula x = y (Note: this is syntatic sugar for equal(x,y).) R(v 1, v 2, …, v k ) – a ground atomic formula Literal: an atomic formula (positive literal) or the negation of an atomic formula (negative literal) Clause (Datalog rule): L :- L 1, L 2, …, L n. where L is a predicate formula and Li, i = 1, …, n, is a positive or negative literal.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 13 Prove safe, non-recursive Datalog is contained in allowable domain calculus Induction on the number of rules that appear in the Datalog program that have the relation symbol R in it’s head. Base case: In the book, they use the case of zero rules. This means that the query answer is simply a relation from the existing database. Inductive step: Assume that we can express any Datalog program with q rules in an equivalent allowable domain calculus expression. Consider a Datalog program consisting of q+1 rules. Choose a rule and construct an allowable domain calculus expression.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 14 Proving Safe Datalog is contained in Allowable Domain Calculus Inductive step: introduce new variables – as many as there are attributes in the query answer. introduce x i =y j (for repeated variables that occur in literal R) and x i =v (for constants that appear in literal R) Introduce “  z i ” for all variables in the body but not in head Introduce R1(…)^R2(…)^  R3(…) to represent the body of the rule Find all other rules with same head, construct a body (as above), and then create one big disjunction.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier Prove allowable domain relational calculus is contained in relational algebra This proof may be a little more interesting than the first two. In order to construct a relational algebra expression, you need to build some useful relations – to be used as input to relational algebra operators – based on what is present in the domain calculus expression.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 16 Proving Allowable Domain Calculus is contained in Relational Algebra Induction on the number of logical connectors in the allowed domain calculus formula. Minimal set of logical connectors ( , v,  ) For each R(A 1, …, A m ) construct an expression that gives the active domain of R. Do that for all R. Construct a dom. calculus Fdom(x) expression that comprises the union of all such domains or constants from the query. { x 1, …, x n | F ^ Fdom(x 1 )^ … ^ Fdom(x n )} We know how to construct rel. alg expressions for Fdom. We form the cross product of Reldom(F) n times and intersect is with the rel. alg. expression we need to express F – the original expression in Q.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 17 Rest of the proof sketch Base case: zero logical connectors. Then F is just one relation predicate. The rel alg expression is  …(σ…R) to accommodate any constants or repeated variables in R(x1, …, xn) and to account for R having more variables than the desired query answer. Induction: Assume it’s true for q logical connectors. F 1 v F 2 : use  …(E 1 ∩ RelDom(F 1 ) n-m ) U  …(E 2 ∩ RelDom(F 2 ) n-k )  F: use RelDom(F) n – E  x (F): use  …(E)