CS589 Principles of DB Systems Fall 2008 Lecture 4d: Recursive Datalog with Negation – What is the query answer defined to be? Lois Delcambre

Slides:



Advertisements
Similar presentations
1 Datalog: Logic Instead of Algebra. 2 Datalog: Logic instead of Algebra Each relational-algebra operator can be mimicked by one or several Database Logic.
Advertisements

Relational Calculus and Datalog
CSE 636 Data Integration Conjunctive Queries Containment Mappings / Canonical Databases Slides by Jeffrey D. Ullman.
Lecture 11: Datalog Tuesday, February 6, Outline Datalog syntax Examples Semantics: –Minimal model –Least fixpoint –They are equivalent Naive evaluation.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Relational Calculus Chapter 4, Part B.
CPSC 504: Data Management Discussion on Chandra&Merlin 1977 Laks V.S. Lakshmanan Dept. of CS UBC.
1 CHAPTER 4 RELATIONAL ALGEBRA AND CALCULUS. 2 Introduction - We discuss here two mathematical formalisms which can be used as the basis for stating and.
1 541: Relational Calculus. 2 Relational Calculus  Comes in two flavours: Tuple relational calculus (TRC) and Domain relational calculus (DRC).  Calculus.
Answer Set Programming Overview Dr. Rogelio Dávila Pérez Profesor-Investigador División de Posgrado Universidad Autónoma de Guadalajara
1 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke Deductive Databases Chapter 25.
1 9. Evaluation of Queries Query evaluation – Quantifier Elimination and Satisfiability Example: Logical Level: r   y 1,…y n  r’ Constraint.
Winter 2002Arthur Keller – CS 18014–1 Schedule Today: Feb. 26 (T) u Datalog. u Read Sections Assignment 6 due. Feb. 28 (TH) u Datalog and SQL.
CSE 636 Data Integration Datalog Rules / Programs / Negation Slides by Jeffrey D. Ullman.
Relational Calculus. Another Theoretical QL-Relational Calculus n Comes in two flavors: Tuple relational calculus (TRC) and Domain relational calculus.
Winter 2004/5Pls – inductive – Catriel Beeri1 Inductive Definitions (our meta-language for specifications)  Examples  Syntax  Semantics  Proof Trees.
Relational Calculus CS 186, Spring 2007, Lecture 6 R&G, Chapter 4 Mary Roth   We will occasionally use this arrow notation unless there is danger of.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 6 The Relational Algebra and Relational Calculus.
Relational Calculus R&G, Chapter 4   We will occasionally use this arrow notation unless there is danger of no confusion. Ronald Graham Elements of Ramsey.
Relational Calculus CS 186, Fall 2003, Lecture 6 R&G, Chapter 4   We will occasionally use this arrow notation unless there is danger of no confusion.
CSE115/ENGR160 Discrete Mathematics 01/20/11 Ming-Hsuan Yang UC Merced 1.
Deductive Databases Chapter 25
CS 286, UC Berkeley, Spring 2007, R. Ramakrishnan 1 Introduction to Logic and Databases R. Ramakrishnan Database Management Systems, Gehrke- Ramakrishnan,
Rutgers University Relational Calculus 198:541 Rutgers University.
1 Relational Algebra and Calculus Chapter 4. 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database.
Lecture 3 [Self Study] Relational Calculus
The Relational Model: Relational Calculus
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Relational Calculus Chapter 4, Section 4.3.
1 CS 430 Database Theory Winter 2005 Lecture 6: Relational Calculus.
CSE314 Database Systems The Relational Algebra and Relational Calculus Doç. Dr. Mehmet Göktürk src: Elmasri & Navanthe 6E Pearson Ed Slide Set.
Logical Query Languages Motivation: 1.Logical rules extend more naturally to recursive queries than does relational algebra. u Used in SQL recursion. 2.Logical.
Database Management Systems, R. Ramakrishnan1 Relational Calculus Chapter 4.
Database Management Systems,1 Relational Calculus.
Relational Calculus Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems September 17, 2007 Some slide content courtesy.
Datalog Inspired by the impedance mismatch in relational databases. Main expressive advantage: recursive queries. More convenient for analysis: papers.
Relational Calculus R&G, Chapter 4. Relational Calculus Comes in two flavors: Tuple relational calculus (TRC) and Domain relational calculus (DRC). Calculus.
Relational Calculus CS 186, Spring 2005, Lecture 9 R&G, Chapter 4   We will occasionally use this arrow notation unless there is danger of no confusion.
CS Introduction to AI Tutorial 8 Resolution Tutorial 8 Resolution.
Automated Reasoning Early AI explored how to automated several reasoning tasks – these were solved by what we might call weak problem solving methods as.
Lecture 7: Foundations of Query Languages Tuesday, January 23, 2001.
1 Reasoning with Infinite stable models Piero A. Bonatti presented by Axel Polleres (IJCAI 2001,
Relational Calculus Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY courtesy of Joe Hellerstein for some slides.
Copyright © 2004 Pearson Education, Inc.. Chapter 6 The Relational Algebra and Relational Calculus.
Albert Gatt LIN3021 Formal Semantics Lecture 3. Aims This lecture is divided into two parts: 1. We make our first attempts at formalising the notion of.
Database Management Systems, R. Ramakrishnan1 Relational Calculus Chapter 4, Part B.
Section 1.4. Propositional Functions Propositional functions become propositions (and have truth values) when their variables are each replaced by a value.
Relational Calculus. Relational calculus query specifies what is to be retrieved rather than how to retrieve it. – No description of how to evaluate a.
Extensions of Datalog Wednesday, February 13, 2001.
Lecture 9: Query Complexity Tuesday, January 30, 2001.
CS589 Principles of DB Systems Fall 2008 Lecture 4c: Query Language Equivalence Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4b: Domain Independence and Safety Lois Delcambre
Relational Calculus Database Management Systems, 3rd ed., Ramakrishnan and Gehrke, Chapter 4.
Relational Calculus Chapter 4, Section 4.3.
CS589 Principles of DB Systems Spring 2014 Unit 2: Recursive Query Processing Lecture 2-1 – Naïve algorithm for recursive queries Lois Delcambre (slides.
Lois Delcambre (slides from Dave Maier)
Formal Modeling Concepts
Goal for this lecture Demonstrate how we can prove that one query language is more expressive than (i.e., “contained in” as described in the book) another.
CS 186, Spring 2007, Lecture 6 R&G, Chapter 4 Mary Roth
CS 186, Fall 2002, Lecture 8 R&G, Chapter 4
Logic Based Query Languages
Predicates and Quantifiers
Datalog Inspired by the impedance mismatch in relational databases.
CS 186, Spring 2007, Lecture 6 R&G, Chapter 4 Mary Roth
Chapter 27: Formal-Relational Query Languages
Relational Calculus Chapter 4, Part B 7/1/2019.
CS589 Principles of DB Systems Fall 2008 Lecture 4a: Introduction to Datalog Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4a: Introduction to Datalog Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4e: Logic (Model-theoretic view of a DB) Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4b: Domain Independence and Safety Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4d: Recursive Datalog with Negation – What is the query answer defined to be? Lois Delcambre
Presentation transcript:

CS589 Principles of DB Systems Fall 2008 Lecture 4d: Recursive Datalog with Negation – What is the query answer defined to be? Lois Delcambre

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 2 Goals for today Briefly discuss negative occurrences of variables with universal quantification in domain calculus. Discuss proofs of equivalence/test 1, as desired. Introduce the problem with recursion and negation in Datalog. Introduce stratification – a syntactic restriction that avoids the problem. Mention several ways to define the semantics of a Datalog program.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 3 Consult handout from Ramakrishnan/Gehrke We want values in the query answer to come from the DB or constants in the query. In addition, if we use an existential quantifier, we want the value substituted in for the existential quantifier to come from the DB or the constants in the query. Finally, if we use a universal quantifier, we want to find any value that makes the formula false by only checking the tuples that use values from the DB or the constants in the query.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 4 What about the 3 rd bullet? A typical tuple calculus query with universal quantification: { S | S ε Sailors ^  B ε Boats (B.color=‘red ’ →(  R ε Reserves(S.sid=R.sid^R.bid=B.bid)} which is equivalent to: { S | S ε Sailors ^  B ε Boats (  (B.color=‘red ’) v (  R ε Reserves(S.sid=R.sid^R.bid=B.bid)}

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 5 Proving Safe Datalog is contained in Allowable Domain Calculus Induction on the number of rules with the answer predicate in the head.’ Base case: zero rules. (The book assumes that the answer predicate is a DB relation name.) Inductive step: introduce additional variables & introduce x i =y j and x i =v (for repeated var. & constants) Introduce “  z i ” for all var. in body but not in head Introduce R1(…)^R2(…)^  R3(…) for body Find all other rules with same head, construct a body (as above), create one big disjunction.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 6 Proving Allowable Domain Calculus is contained in Relational Algebra Induction on the number of logical connectors in the allowed domain calculus formula. Minimal set of logical connectors ( , v,  ) For each R(A 1, …, A m ) construct an expression that gives the active domain of R. Do that for all R. Construct a dom. calculus Fdom(x) expression that comprises the union of all such domains or constants from the query. { x 1, …, x n | F ^ Fdom(x 1 )^ … ^ Fdom(x n )} We know how to construct rel. alg expressions for Fdom. We form the cross product of Reldom(F) n times and intersect is with the rel. alg. expression we need to express F – the original expression in Q.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 7 Rest of the proof sketch Base case: zero logical connectors. Then F is just one relation predicate. The rel alg expression is  …(σ…R) to accommodate any constants or repeated variables in R(x1, …, xn) and to account for R having more variables than the desired query answer. Induction: Assume it’s true for q logical connectors. F 1 v F 2 : use  …(E 1 ∩ RelDom(F 1 ) n-m ) U  …(E 2 ∩ RelDom(F 2 ) n-k )  F: use RelDom(F) n – E  x (F): use  …(E)

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 8 Several Datalog languages Datalog – one rule, no negation, no recursion. (conjunctive queries) Datalog – multiple rules, no negation, no recursion. Datalog – multiple rules, no negation, with recursion. Recursion but not relationally complete. Datalog – multiple rules, with negation, no recursion. Relationally complete but no recursion. Datalog – multiple rules, with negation, with recursion. Relationally complete with recursion but some queries are ambiguous! Our focus in Unit 1 Our focus in Unit 3

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 9 Datalog program and it’s dependency graph Given a DB with two relations: Topics(Topic) and Interests(Person,Topic) Result(a):-Interests(a,b,),  DIFF(a). Diff(a) :- Prod(a,b),  Interests(a,b). Prod(a,b) :- Interests(a,c), Topics(b). Draw an arrow from body predicate to corresponding head predicate. Acyclic dependency graph = not recursive. Result Diff Prod InterestsTopics

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 10 Another example Anc(x,y) :- Parent-of(x,y). Anc(x,z) :- Anc(x,y), Parent-of(y,z). Recursive Datalog program has a cycle in the dependency graph. Anc Parent-of

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 11 Now consider Datalog with recursion & negation Person(Dan). (one tuple in base relation) Student(x) :- Person(x),  Employee(x). Employee(x) :- Person(x),  Student(x). What is the query answer?

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 12 What does the dependency graph look like? Person(Dan). (one tuple in base relation) Student(x) :- Person(x),  Employee(x). Employee(x) :- Person(x),  Student(x). This is a recursive program. Employee Person Student

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 13 More about this dependency graph Person(Dan). (one tuple in base relation) Student(x) :- Person(x),  Employee(x). Employee(x) :- Person(x),  Student(x). Cycle in dependency graph → recursion. Label negative predicates in dependency graph. Employee Person Student  

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 14 Now consider recursion & negation Person(Dan). (one tuple in base relation) Student(x) :- Person(x),  Employee(x). Employee(x) :- Person(x),  Student(x). We have a problem when a cycle in a dependency graph includes (at least) one negative edge. Employee Person Student  

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 15 Another example with negation & recursion Node(a). Node(b). Node(c). Node(d). Arc(a,b). Arc(c,d). Reachable(a). Reachable(y) :- Reachable(x), Arc(x,y). Unreachable(x) :- Node(x),  Reachable(x). These are facts from the database.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 16 Stratification – for Datalog with negation and recursion A stratification of a Datalog program P is a partition of P into strata or layers where, for every rule, H :- A 1, …, A m,  B 1, …, B q The rules that define H are all in the same strata. For all positive predicates in this rule, their strata is <= the strata of this rule. Strata(A i ) <= Strata(H). For all negative predicates in this rule, their strata is < the strata of this rule. Strata(A i ) < Strata(H).

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 17 Exercise: Define a stratification for this datalog program. Node(a). Node(b). Node(c). Node(d). Arc(a,b). Arc(c,d). Reachable(a). Reachable(y) :- Reachable(x), Arc(x,y). Unreachable(x) :- Node(x),  Reachable(x). These are facts from the database.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 18 Can you define a stratification for this program? Person(Dan). (one tuple in base relation) Student(x) :- Person(x),  Employee(x). Employee(x) :- Person(x),  Student(x).

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 19 Comments Stratified Datalog programs have a unique answer. Not all Datalog programs can be stratified.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 20 Slide repeated from Unit 1 (emphasis added) Datalog with recursion Faculty (f-id, name, major-professor) Academic-descendant(x,y) :- Faculty(x,a,y). Academic-descendant(x,z) :- Academic-descendant(x,y), Faculty(y,b,z). How does this Datalog program (without negation) get evaluated? Fire all rules (from right to left) until you don’t produce any new tuples in Academic-descendant. Note each Datalog rule is independent. The variable names in separate rules have no connection.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 21 Semantics of a Datalog Program The fixed point of the “immediate consequence” operator, applied to the Datalog program. The minimal model for the Datalog program. Plus others, particularly for Datalog with negation.

CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier 22 Comments For Datalog with recursion, but NO negation, then: The minimal model is unique. The minimal model is always the intersection of all the models. The minimal model is the same as the fixed point of the immediate consequence operator. This language is monotonic (rules only add facts) For Datalog with recursion & negation: There may not be a unique minimal model.