CS589 Principles of DB Systems Fall 2008 Lecture 4d: Recursive Datalog with Negation – What is the query answer defined to be? Lois Delcambre

1 CS589 Principles of DB Systems Fall 2008 Lecture 4d: Recursive Datalog with Negation – What is the query answer defined to be? Lois Delcambre lmd@cs.pdx.edu 503 725-2405

2 Goals for today (CS 510 Principles of DB Systems, Fall 2006 © Lois Delcambre, Dave Maier)
Briefly discuss negative occurrences of variables with universal quantification in domain calculus.
Discuss proofs of equivalence/test 1, as desired.
Introduce the problem with recursion and negation in Datalog.
Introduce stratification, a syntactic restriction that avoids the problem.
Mention several ways to define the semantics of a Datalog program.

3 Consult handout from Ramakrishnan/Gehrke
We want values in the query answer to come from the DB or from constants in the query.
In addition, if we use an existential quantifier, we want the value substituted for the quantified variable to come from the DB or from the constants in the query.
Finally, if we use a universal quantifier, we want to be able to find any value that makes the formula false by checking only the tuples that use values from the DB or from the constants in the query.

4 What about the 3rd bullet?
A typical tuple calculus query with universal quantification:
  { S | S ∈ Sailors ∧ ∀B ∈ Boats (B.color = 'red' → ∃R ∈ Reserves (S.sid = R.sid ∧ R.bid = B.bid)) }
which is equivalent to:
  { S | S ∈ Sailors ∧ ∀B ∈ Boats (¬(B.color = 'red') ∨ ∃R ∈ Reserves (S.sid = R.sid ∧ R.bid = B.bid)) }
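To make the equivalence on this slide concrete, here is a minimal Python sketch that evaluates the query in both forms, plus a third form that searches for a counterexample boat. The sample sailors, boats, and reservations are hypothetical and are not taken from the lecture; only the shape of the query follows the slide.

```python
# Hypothetical sample data for illustration; not taken from the lecture.
sailors = {1, 2, 3}                                   # sids
boats = {101: "red", 102: "green", 103: "red"}        # bid -> color
reserves = {(1, 101), (1, 103), (2, 101), (3, 102)}   # (sid, bid) pairs


def implies(p, q):
    """Material implication p -> q, written out case by case."""
    return q if p else True


def reserved(sid, bid):
    """The inner existential: some Reserves tuple matches (sid, bid)."""
    return any(r == (sid, bid) for r in reserves)


# Form 1: for all B in Boats (B.color = 'red' -> exists R ...)
form_implication = {
    s for s in sailors
    if all(implies(color == "red", reserved(s, bid))
           for bid, color in boats.items())
}

# Form 2: for all B in Boats (not (B.color = 'red') or exists R ...)
form_disjunction = {
    s for s in sailors
    if all((not color == "red") or reserved(s, bid)
           for bid, color in boats.items())
}

# Form 3: no boat in the DB falsifies the formula (red and not reserved).
form_counterexample = {
    s for s in sailors
    if not any(color == "red" and not reserved(s, bid)
               for bid, color in boats.items())
}

print(form_implication, form_disjunction, form_counterexample)
```

All three forms range only over boats stored in the database, which is the point of the third bullet: a falsifying value, if one exists, is found among the DB tuples.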

5 Proving Safe Datalog is contained in Allowable Domain Calculus
Induction on the number of rules with the answer predicate in the head.
Base case: zero rules. (The book assumes that the answer predicate is a DB relation name.)
Inductive step:
Introduce additional variables, and introduce x_i = y_j and x_i = v (for repeated variables and constants).
Introduce "∃z_i" for every variable in the body but not in the head.
Introduce R1(…) ∧ R2(…) ∧ ¬R3(…) for the body.
Find all other rules with the same head, construct a body for each (as above), and create one big disjunction.

6 Proving Allowable Domain Calculus is contained in Relational Algebra
Induction on the number of logical connectors in the allowed domain calculus formula.
Minimal set of logical connectors: (¬, ∨, ∃).
For each R(A_1, …, A_m), construct an expression that gives the active domain of R. Do that for all R.
Construct a domain calculus expression Fdom(x) that comprises the union of all such domains and the constants from the query.
  { x_1, …, x_n | F ∧ Fdom(x_1) ∧ … ∧ Fdom(x_n) }
We know how to construct rel. alg. expressions for Fdom. We form the cross product of RelDom(F) n times and intersect it with the rel. alg. expression we need to express F, the original expression in Q.

7 Rest of the proof sketch
Base case: zero logical connectors. Then F is just one relation predicate. The rel. alg. expression is π…(σ…R), to accommodate any constants or repeated variables in R(x_1, …, x_n) and to account for R having more variables than the desired query answer.
Induction: assume it's true for q logical connectors.
F_1 ∨ F_2: use π…(E_1 ∩ RelDom(F_1)^(n-m)) ∪ π…(E_2 ∩ RelDom(F_2)^(n-k))
¬F: use RelDom(F)^n − E
∃x (F): use π…(E)

8 Several Datalog languages
Datalog: one rule, no negation, no recursion (conjunctive queries).
Datalog: multiple rules, no negation, no recursion.
Datalog: multiple rules, no negation, with recursion. Recursion but not relationally complete.
Datalog: multiple rules, with negation, no recursion. Relationally complete but no recursion.
Datalog: multiple rules, with negation, with recursion. Relationally complete with recursion, but some queries are ambiguous!
Slide callouts: "Our focus in Unit 1" and "Our focus in Unit 3".

9 Datalog program and its dependency graph
Given a DB with two relations: Topics(Topic) and Interests(Person, Topic)
  Result(a) :- Interests(a,b), ¬Diff(a).
  Diff(a) :- Prod(a,b), ¬Interests(a,b).
  Prod(a,b) :- Interests(a,c), Topics(b).
Draw an arrow from each body predicate to the corresponding head predicate.
Acyclic dependency graph = not recursive.
[Figure: dependency graph over Result, Diff, Prod, Interests, and Topics]
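The construction on this slide (one arrow from each body predicate to the head predicate, then a check for cycles) can be sketched in Python as follows. The rule encoding as `(head, [(body_predicate, negated), ...])` pairs and the function names are assumptions made for this illustration, not notation from the lecture.

```python
# Rules encoded as (head, [(body predicate, is_negated), ...]).
# This encoding is an assumption made for the sketch.
slide9_rules = [
    ("Result", [("Interests", False), ("Diff", True)]),
    ("Diff", [("Prod", False), ("Interests", True)]),
    ("Prod", [("Interests", False), ("Topics", False)]),
]

ancestor_rules = [
    ("Anc", [("Parent-of", False)]),
    ("Anc", [("Anc", False), ("Parent-of", False)]),
]


def dependency_graph(rules):
    """Map each body predicate to the set of head predicates it feeds."""
    graph = {}
    for head, body in rules:
        graph.setdefault(head, set())
        for pred, _negated in body:
            graph.setdefault(pred, set()).add(head)
    return graph


def is_recursive(rules):
    """Return True if the dependency graph has a cycle (depth-first search)."""
    graph = dependency_graph(rules)
    state = {}  # node -> "active" while on the DFS stack, "done" afterwards

    def visit(node):
        state[node] = "active"
        for succ in graph[node]:
            if state.get(succ) == "active":
                return True          # back edge: a cycle exists
            if succ not in state and visit(succ):
                return True
        state[node] = "done"
        return False

    return any(visit(n) for n in graph if n not in state)


print(is_recursive(slide9_rules))    # the Result/Diff/Prod program
print(is_recursive(ancestor_rules))  # the Anc/Parent-of program
```

The negation flag is carried in the encoding but ignored here, since recursion depends only on whether a cycle exists, not on the sign of its edges.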

10 Another example
  Anc(x,y) :- Parent-of(x,y).
  Anc(x,z) :- Anc(x,y), Parent-of(y,z).
A recursive Datalog program has a cycle in its dependency graph.
[Figure: dependency graph over Anc and Parent-of]

11 Now consider Datalog with recursion & negation
  Person(Dan). (one tuple in base relation)
  Student(x) :- Person(x), ¬Employee(x).
  Employee(x) :- Person(x), ¬Student(x).
What is the query answer?

12 What does the dependency graph look like?
  Person(Dan). (one tuple in base relation)
  Student(x) :- Person(x), ¬Employee(x).
  Employee(x) :- Person(x), ¬Student(x).
This is a recursive program.
[Figure: dependency graph over Employee, Person, and Student]

13 More about this dependency graph
  Person(Dan). (one tuple in base relation)
  Student(x) :- Person(x), ¬Employee(x).
  Employee(x) :- Person(x), ¬Student(x).
Cycle in dependency graph → recursion.
Label negative predicates in the dependency graph.
[Figure: dependency graph over Employee, Person, and Student, with two edges labeled ¬]

14 Now consider recursion & negation
  Person(Dan). (one tuple in base relation)
  Student(x) :- Person(x), ¬Employee(x).
  Employee(x) :- Person(x), ¬Student(x).
We have a problem when a cycle in a dependency graph includes at least one negative edge.
[Figure: dependency graph over Employee, Person, and Student, with two edges labeled ¬]

15 Another example with negation & recursion
  Node(a). Node(b). Node(c). Node(d).
  Arc(a,b). Arc(c,d).
  Reachable(a).
  Reachable(y) :- Reachable(x), Arc(x,y).
  Unreachable(x) :- Node(x), ¬Reachable(x).
Slide callout: "These are facts from the database."

16 Stratification: for Datalog with negation and recursion
A stratification of a Datalog program P is a partition of P into strata (layers) where, for every rule
  H :- A_1, …, A_m, ¬B_1, …, ¬B_q
the rules that define H are all in the same stratum.
For all positive predicates in this rule, their stratum is <= the stratum of this rule: Strata(A_i) <= Strata(H).
For all negative predicates in this rule, their stratum is < the stratum of this rule: Strata(B_j) < Strata(H).
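The two stratum conditions can be checked mechanically. The sketch below is one common way to do it, not necessarily the procedure used in the course: start every predicate at stratum 0, raise the head's stratum until both conditions hold for every rule, and give up once a stratum reaches the number of predicates (which can only happen when a cycle contains a negative edge). The rule encoding, the function name `stratify`, and the `Win`/`Move` program are assumptions introduced for illustration; the first program is the Result/Diff/Prod example from slide 9.

```python
# Rules encoded as (head, [(body predicate, is_negated), ...]).
# This encoding is an assumption made for the sketch.
slide9_rules = [
    ("Result", [("Interests", False), ("Diff", True)]),
    ("Diff", [("Prod", False), ("Interests", True)]),
    ("Prod", [("Interests", False), ("Topics", False)]),
]

# Hypothetical program, not from the lecture: Win(x) :- Move(x,y), not Win(y).
win_rules = [("Win", [("Move", False), ("Win", True)])]


def stratify(rules):
    """Return {predicate: stratum} meeting both conditions, or None if impossible."""
    preds = {head for head, _ in rules} | {p for _, body in rules for p, _ in body}
    stratum = {p: 0 for p in preds}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for pred, negated in body:
                # Positive: Strata(A_i) <= Strata(H); negative: Strata(B_j) < Strata(H).
                needed = stratum[pred] + 1 if negated else stratum[pred]
                if stratum[head] < needed:
                    stratum[head] = needed
                    changed = True
                    # With n predicates, a valid stratum never reaches n.
                    if needed >= len(preds):
                        return None
    return stratum


print(stratify(slide9_rules))
print(stratify(win_rules))
```

Because each predicate gets a single stratum number, all rules defining the same head automatically land in the same stratum.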

17 Exercise: Define a stratification for this Datalog program.
  Node(a). Node(b). Node(c). Node(d).
  Arc(a,b). Arc(c,d).
  Reachable(a).
  Reachable(y) :- Reachable(x), Arc(x,y).
  Unreachable(x) :- Node(x), ¬Reachable(x).
Slide callout: "These are facts from the database."

18 Can you define a stratification for this program?
  Person(Dan). (one tuple in base relation)
  Student(x) :- Person(x), ¬Employee(x).
  Employee(x) :- Person(x), ¬Student(x).

19 Comments
Stratified Datalog programs have a unique answer.
Not all Datalog programs can be stratified.
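As an illustration of why a stratified program has a unique answer, the following Python sketch evaluates the Reachable/Unreachable program from the earlier slides one stratum at a time. It assumes the lower stratum (Reachable) is computed to completion before the negation in the Unreachable rule is evaluated; the variable names are chosen for this sketch.

```python
# Facts from the Reachable/Unreachable example.
nodes = {"a", "b", "c", "d"}
arcs = {("a", "b"), ("c", "d")}

# Lower stratum: Reachable(a).  Reachable(y) :- Reachable(x), Arc(x,y).
reachable = {"a"}
while True:
    new = {y for (x, y) in arcs if x in reachable} - reachable
    if not new:
        break            # no new tuples: Reachable is complete
    reachable |= new

# Higher stratum: Unreachable(x) :- Node(x), not Reachable(x).
# Reachable is final here, so the negation has a fixed meaning.
unreachable = {x for x in nodes if x not in reachable}

print(sorted(reachable), sorted(unreachable))
```

Since Reachable can no longer change when Unreachable is computed, no later step can invalidate a tuple that was derived through negation.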

20 Slide repeated from Unit 1 (emphasis added)
Datalog with recursion
Faculty(f-id, name, major-professor)
  Academic-descendant(x,y) :- Faculty(x,a,y).
  Academic-descendant(x,z) :- Academic-descendant(x,y), Faculty(y,b,z).
How does this Datalog program (without negation) get evaluated?
Fire all rules (from right to left) until you don't produce any new tuples in Academic-descendant.
Note that each Datalog rule is independent: the variable names in separate rules have no connection.
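The "fire all rules until no new tuples appear" procedure can be sketched as a naive bottom-up evaluation in Python. The Faculty tuples below are hypothetical sample data, and the function names are assumptions made for this sketch; only the two rules come from the slide.

```python
# Hypothetical Faculty(f-id, name, major-professor) tuples; not from the lecture.
faculty = {(1, "Ann", 2), (2, "Bob", 3), (3, "Cy", 4)}


def fire_rules(desc):
    """Fire both rules once against the current Academic-descendant tuples."""
    # Academic-descendant(x,y) :- Faculty(x,a,y).
    derived = {(x, y) for (x, _a, y) in faculty}
    # Academic-descendant(x,z) :- Academic-descendant(x,y), Faculty(y,b,z).
    derived |= {(x, z) for (x, y) in desc
                for (y2, _b, z) in faculty if y == y2}
    return derived


def naive_evaluation():
    """Repeat until a round produces no new tuples."""
    desc = set()
    while True:
        new = desc | fire_rules(desc)
        if new == desc:
            return desc
        desc = new


print(sorted(naive_evaluation()))
```

Each round can only add tuples, and there are finitely many pairs of values from the database, so the loop is guaranteed to stop.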

21 Semantics of a Datalog Program
The fixed point of the "immediate consequence" operator, applied to the Datalog program.
The minimal model for the Datalog program.
Plus others, particularly for Datalog with negation.

22 Comments
For Datalog with recursion but NO negation:
The minimal model is unique.
The minimal model is always the intersection of all the models.
The minimal model is the same as the fixed point of the immediate consequence operator.
This language is monotonic (rules only add facts).
For Datalog with recursion & negation:
There may not be a unique minimal model.
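The last comment can be checked by brute force on the Student/Employee program from the earlier slides. The sketch below enumerates every interpretation of Student(Dan) and Employee(Dan), with Person(Dan) fixed as true, keeps those that satisfy both rules read as implications, and then selects the minimal ones. The string encoding of atoms and the function name `is_model` are assumptions made for this illustration.

```python
from itertools import combinations

# Derived atoms that an interpretation may or may not contain.
atoms = ["Student(Dan)", "Employee(Dan)"]


def is_model(interp):
    """Check both rules as implications, with Person(Dan) true."""
    person = True
    student = "Student(Dan)" in interp
    employee = "Employee(Dan)" in interp
    # Student(x) :- Person(x), not Employee(x).
    rule1 = (not (person and not employee)) or student
    # Employee(x) :- Person(x), not Student(x).
    rule2 = (not (person and not student)) or employee
    return rule1 and rule2


# All subsets of the derived atoms.
candidates = [frozenset(c)
              for r in range(len(atoms) + 1)
              for c in combinations(atoms, r)]
models = [m for m in candidates if is_model(m)]

# A model is minimal if no other model is a proper subset of it.
minimal = [m for m in models if not any(other < m for other in models)]

for m in minimal:
    print(sorted(m))
```

On this program the search finds two incomparable minimal models, one where Dan is only a student and one where Dan is only an employee, and their intersection (the empty set) is not a model at all.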

