
1 CS240A: Databases and Knowledge Bases
From Differential Fixpoints to Magic Sets
Carlo Zaniolo
Department of Computer Science, University of California, Los Angeles
January 2002
Notes from Chapter 9 of Advanced Database Systems by Zaniolo, Ceri, Faloutsos, Snodgrass, Subrahmanian and Zicari, Morgan Kaufmann, 1997

2 Recursive Predicates
r1: anc(X, Y) ← parent(X, Y).
r2: anc(X, Z) ← anc(X, Y), parent(Y, Z).
r2 is a recursive rule, a left-linear one. r1 is a nonrecursive rule defining a recursive predicate; this is called an exit rule.
An alternative definition for anc:
r3: anc(X, Y) ← parent(X, Y).
r4: anc(X, Z) ← anc(X, Y), anc(Y, Z).
Here r4 is a quadratic rule.

3 Fixpoint Computation
The inflationary immediate consequence operator for P:
Γ_P(I) = T_P(I) ∪ I
We have: Γ_P↑n(∅) = T_P↑n(∅)
lfp(T_P) = T_P↑ω(∅) = lfp(Γ_P) = Γ_P↑ω(∅)

4 Fixpoint Computation (cont.)
Naïve Fixpoint Algorithm for P (M = ∅ for now):
  S' := M; S := Γ_P(M);
  while S ≠ S' { S' := S; S := Γ_P(S) }
We can replace the first Γ_P with Γ_E and the second one with Γ_R, respectively denoting the immediate consequence operators for the exit rules and the recursive ones.
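The naïve algorithm above can be sketched in Python for the left-linear anc program. This is only an illustration: the function names and the parent facts are made up, and relations are modeled as sets of tuples.

```python
def t_p(parent, atoms):
    """Immediate consequence operator T_P for:
         anc(X, Y) <- parent(X, Y).
         anc(X, Z) <- anc(X, Y), parent(Y, Z)."""
    derived = set(parent)                                  # exit rule
    derived |= {(x, z) for (x, y) in atoms                 # recursive rule:
                for (y2, z) in parent if y2 == y}          # join on Y
    return derived

def naive_fixpoint(parent):
    """Iterate the inflationary operator Gamma_P until a fixpoint."""
    s_old, s = set(), t_p(parent, set())
    while s != s_old:
        s_old, s = s, t_p(parent, s) | s                   # Gamma_P(S) = T_P(S) U S
    return s

parent = {("anne", "silvia"), ("silvia", "marc")}
print(sorted(naive_fixpoint(parent)))
```

Note the redundancy the next slide addresses: every iteration re-derives all previously found anc atoms before discovering any new ones.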

5 Differential Fixpoint (a.k.a. Seminaive Computation)
Redundant computation: the j-th iteration step also re-computes all atoms obtained in the (j−1)-th step.
Finite-difference techniques trace the derivations over two steps:
1. S: the set of atoms obtained up to step j−1
2. S': the set of atoms obtained up to step j
3. δS = Γ_R(S) − S = T_R(S) − S denotes the new atoms at step j (i.e., the atoms that were not in S at step j−1)
4. δS' = Γ_R(S') − S' = T_R(S') − S' are the new atoms obtained at step j+1.

6 Differential Fixpoint Algorithm (M = ∅ for now)
  S := M; δS := T_E(M); S' := S ∪ δS;
  while δS ≠ ∅ {
    δS' := T_R(S') − S';
    S := S'; δS := δS'; S' := S ∪ δS
  }
anc, δanc, and anc', respectively, denote ancestor atoms that are in S, δS, and S' = S ∪ δS.

7 Rule Differentiation
To compute δS' := T_R(S') − S' we can use a T_R defined by the following rule:
  anc'(X, Z) ← anc'(X, Y), parent(Y, Z).
This can be rewritten as:
  δanc(X, Z) ← δanc(X, Y), parent(Y, Z).
  δanc(X, Z) ← anc(X, Y), parent(Y, Z).
The second rule can now be eliminated, since it produces only atoms that were already contained in anc, i.e., in the S computed in the previous iteration. Thus, for linear rules, replace
  δS' := T_R(S') − S'
by
  δS' := T_R(δS) − S'.
For nonlinear rules the rewriting is more complex.
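For the linear anc rule, the seminaive loop then only joins the delta with parent at each step. A minimal Python sketch (illustrative names and facts, not from the slides):

```python
def seminaive_anc(parent):
    """Differential fixpoint for the differentiated linear rule
         delta_anc(X, Z) <- delta_anc(X, Y), parent(Y, Z)."""
    s = set()                         # atoms found so far
    delta = set(parent)               # T_E: the exit rule seeds the delta
    s |= delta
    while delta:
        # T_R applied to the delta only, minus atoms already known
        delta = {(x, z) for (x, y) in delta
                 for (y2, z) in parent if y2 == y} - s
        s |= delta
    return s

parent = {("anne", "silvia"), ("silvia", "marc"), ("marc", "tina")}
print(sorted(seminaive_anc(parent)))
```

Each iteration now joins a small delta relation with parent, instead of re-joining the full anc relation as the naïve algorithm does.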

8 Non-Linear Rules
ancs(X, Y) ← parent(X, Y).
ancs(X, Z) ← ancs(X, Y), ancs(Y, Z).
r: δancs(X, Z) ← ancs(X, Y), ancs(Y, Z).
r1: δancs(X, Z) ← δancs(X, Y), ancs(Y, Z).
r2: δancs(X, Z) ← ancs(X, Y), ancs(Y, Z).
Now, we can rewrite r2 as:
r2,1: δancs(X, Z) ← ancs(X, Y), δancs(Y, Z).
r2,2: δancs(X, Z) ← ancs(X, Y), ancs(Y, Z).
Rule r2,2 produces only 'old' values, and can be eliminated. We are left with rules r1 and r2,1:
δancs(X, Z) ← δancs(X, Y), ancs(Y, Z).
δancs(X, Z) ← ancs(X, Y), δancs(Y, Z).
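The two surviving delta rules can be evaluated bottom-up as follows. This Python sketch is an illustration under one assumption: ancs is updated with the delta before the joins, so the joins over ancs also cover derivations that use new atoms on both sides.

```python
def quadratic_step(ancs, delta):
    """Apply the two delta rules of the quadratic ancs program:
         delta_ancs(X, Z) <- delta_ancs(X, Y), ancs(Y, Z).
         delta_ancs(X, Z) <- ancs(X, Y), delta_ancs(Y, Z).
    ancs already contains delta, covering delta-join-delta derivations."""
    new = {(x, z) for (x, y) in delta for (y2, z) in ancs if y2 == y}
    new |= {(x, z) for (x, y) in ancs for (y2, z) in delta if y2 == y}
    return new - ancs                  # keep only genuinely new atoms

def seminaive_ancs(parent):
    ancs, delta = set(), set(parent)   # the exit rule seeds the delta
    while delta:
        ancs |= delta
        delta = quadratic_step(ancs, delta)
    return ancs

parent = {("a", "b"), ("b", "c"), ("c", "d")}
print(sorted(seminaive_ancs(parent)))
```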

9 Seminaive Fixpoint (cont.)
Analogy with symbolic differentiation.
Performance improvements: it is typically the case that n = |δS| << N = |S ∪ δS|.
The original ancs rule, for instance, requires the equijoin of two relations of size N; after the differentiation we need to compute two equijoins, each joining a relation of size n with one of size N.

10 General Nonlinear Rules
A recursive rule of rank k is as follows:
r: Q0 ← c0, Q1, c1, Q2, …, Qk, ck
and is rewritten as follows:
r1: δQ0 ← c0, δQ1, c1, Q2, …, Qk, ck
r2: δQ0 ← c0, Q1, c1, δQ2, …, Qk, ck
…
rk: δQ0 ← c0, Q1, c1, Q2, …, δQk, ck
Thus the j-th rule places the δ on the j-th recursive goal, leaving the others undifferentiated:
rj: δQ0 ← c0, Q1, …, δQj, …, Qk, ck

11 Iterated Fixpoint Computation for a program P stratified in n strata
Let Pj, 1 ≤ j ≤ n, denote the rules with their head in the j-th stratum. Then Mj is inductively constructed as follows:
1. M0 = ∅, and
2. Mj = Γ_Pj↑ω(Mj−1).
The naïve fixpoint algorithm remains the same, but M := Mj−1 and Γ_P is replaced by Γ_Pj.
Theorem: Let P be a positive program stratified in n strata, and let Mn be the result produced by the iterated fixpoint computation. Then Mn = lfp(T_P).
For programs with negated goals the computation by strata is necessary to produce the correct result (i.e., Mn is the stable model for P; not discussed here).
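The stratum-by-stratum computation can be illustrated with a hypothetical two-stratum program (not from the slides): stratum 1 computes reach by recursion, stratum 2 then uses negation over the completed reach relation.

```python
def stratum1(source, edge):
    """Fixpoint of stratum 1:
         reach(X) <- source(X).
         reach(Y) <- reach(X), edge(X, Y)."""
    reach = set(source)
    while True:
        new = {y for (x, y) in edge if x in reach} - reach
        if not new:
            break
        reach |= new
    return reach

def stratum2(node, reach):
    """Stratum 2: unreach(X) <- node(X), not reach(X).
    The negation is safe because reach is fully computed first."""
    return {x for x in node if x not in reach}

node = {"a", "b", "c", "d"}
edge = {("a", "b"), ("b", "c")}
reach = stratum1({"a"}, edge)         # M1: fixpoint of the first stratum
unreach = stratum2(node, reach)       # M2: built on top of M1
print(sorted(reach), sorted(unreach))
```

Evaluating the two strata in the opposite order, or interleaving them, could consult an incomplete reach and derive wrong unreach atoms; this is why the computation by strata is required in the presence of negation.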

12 Bottom-Up versus Top-Down Computation
Compiled rules:
anc(X, Y) ← parent(X, Y).
anc(X, Z) ← anc(X, Y), parent(Y, Z).
parent(X, Y) ← father(X, Y).
parent(X, Y) ← mother(X, Y).
Database:
mother(anne, silvia).
mother(silvia, marc).
The differential fixpoint is computed in a bottom-up fashion. For a query ?anc(X, Y) this is optimal.
But for many queries, such as ?anc(marc, Y), we want to propagate down the 'marc' constraint. The same holds for the query forms ?anc($X, Y), ?anc(X, $Y), and ?anc($X, $Y).

13 Specialization for Left-Linear Recursive Rules
?anc(tom, Desc).
anc(Old, Young) ← parent(Old, Young).
anc(Old, Young) ← anc(Old, Mid), parent(Mid, Young).
This is changed into:
?anc(tom, Desc).
anc(Old/tom, Young) ← parent(Old/tom, Young).
anc(Old/tom, Young) ← anc(Old/tom, Mid), parent(Mid, Young).
This is similar to pushing selections inside recursion in query optimizers. It works for left-linear rules with the query form ?anc($Someone, Desc).

14 Right-Linear Rules
anc(Old, Young) ← parent(Old, Young).
anc(Old, Young) ← parent(Old, Mid), anc(Mid, Young).
Descendants of Tom: ?anc(tom, X)
This query can no longer be implemented by specializing the program. Solution: turn the rules into equivalent left-recursive ones!
The situation is symmetric: a query such as ?anc(X, $Y) cannot be supported on the left-linear version of the program, but the program can be transformed into the right-linear rules above, to which specialization applies.
For each left(right)-linear rule there exists an equivalent right(left)-linear program, similar to regular grammars in programming languages.
Deductive database compilers perform this transformation.

15 The Magic Set Method
Specialization only works for left/right-linear programs. It does not work in general, even for linear rules. Consider the same-generation example:
sg(A, A).
sg(X, Y) ← parent(XP, X), sg(XP, YP), parent(YP, Y).
?sg(marc, Who).
This program cannot be computed in a bottom-up fashion because the exit rule is not safe.
We can compute a "magic" set containing all the ancestors of marc and add them to the two rules.

16 Magic Sets for Non-Recursive Rules
Find the graduating seniors and their parents' addresses:
spa(SN, PN, Paddr) ← senior(SN), parent(SN, PN), address(PN, Paddr).
senior(SN) ← student(SN, _, senior), graduating(SN).
To find the address of the parent named 'Joe Doe':
?spa(SN, 'Joe Doe', Paddr)
Suppose that computing parent(X, $Y) is safe and not too expensive.

17 Magic Set Rewriting
spa_q('Joe Doe').
m.senior(SN) ← spa_q(PN), parent(SN, PN).
senior(SN) ← m.senior(SN), student(SN, _, senior), graduating(SN).
The rest remains unchanged:
spa(SN, PN, Paddr) ← senior(SN), parent(SN, PN), address(PN, Paddr).
?spa(SN, 'Joe Doe', Paddr).
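The effect of the rewriting can be sketched in Python: the magic predicate restricts senior to the children of 'Joe Doe' before the (assumed expensive) graduating test is applied. All relation contents below are made up for illustration.

```python
# Hypothetical database relations (sets of tuples / atoms).
student = {("ann", "cs", "senior"), ("bob", "ee", "junior"),
           ("carl", "math", "senior")}
graduating = {"ann", "carl"}
parent = {("ann", "Joe Doe"), ("bob", "Joe Doe")}
address = {("Joe Doe", "12 Elm St")}

# spa_q('Joe Doe').
spa_q = {"Joe Doe"}
# m.senior(SN) <- spa_q(PN), parent(SN, PN).
m_senior = {sn for (sn, pn) in parent if pn in spa_q}
# senior(SN) <- m.senior(SN), student(SN, _, senior), graduating(SN).
senior = {sn for (sn, _, lvl) in student
          if lvl == "senior" and sn in m_senior and sn in graduating}
# spa(SN, PN, Paddr) <- senior(SN), parent(SN, PN), address(PN, Paddr).
spa = {(sn, pn, ad) for sn in senior
       for (sn2, pn) in parent if sn2 == sn
       for (pn2, ad) in address if pn2 == pn}
print(spa)
```

Without the magic predicate, senior would be computed for every student; with it, graduating is only tested for Joe Doe's children.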

18 The Same Generation Example
sg(A, A).
sg(X, Y) ← parent(XP, X), sg(XP, YP), parent(YP, Y).
?sg(marc, Who).
This program cannot be computed in a bottom-up fashion because the exit rule is not safe.
We can compute a "magic" set containing all the ancestors of marc and add them to the two rules.
The magic set computation utilizes the bound arguments and goals in the rules. The first argument of sg is bound in the query. Thus X is bound, and through the goal parent(XP, X) the binding is passed to XP in the recursive goal. The variables Y and YP remain unbound.

19 Magic Sets (cont.)
Magic set rules:
m.sg(marc).
m.sg(XP) ← m.sg(X), parent(XP, X).
Transformed rules:
sg(X, X) ← m.sg(X).
sg(X, Y) ← parent(XP, X), sg(XP, YP), parent(YP, Y), m.sg(X).
Query: ?sg(marc, Who).
The rules for the magic predicates are built by using: (1) the query constant as the exit rule (a fact), and (2) the bound arguments and predicates from the recursive rules, but with head and tail switched!
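The transformed program is now safe to evaluate bottom-up. A Python sketch of that evaluation follows; the parent facts are invented for illustration (p is the parent of marc and ann, g of p and q, q of bill).

```python
def magic_sg(parent, const):
    """Bottom-up evaluation of the magic-set-transformed sg program."""
    # Magic rules: m.sg(const).  m.sg(XP) <- m.sg(X), parent(XP, X).
    magic = {const}
    while True:
        new = {xp for (xp, x) in parent if x in magic} - magic
        if not new:
            break
        magic |= new
    # Transformed rules:
    #   sg(X, X) <- m.sg(X).
    #   sg(X, Y) <- parent(XP, X), sg(XP, YP), parent(YP, Y), m.sg(X).
    sg = {(x, x) for x in magic}
    while True:
        new = {(x, y)
               for (xp, x) in parent if x in magic
               for (a, yp) in sg if a == xp
               for (b, y) in parent if b == yp} - sg
        if not new:
            break
        sg |= new
    return sg

parent = {("p", "marc"), ("p", "ann"), ("g", "p"), ("g", "q"), ("q", "bill")}
answers = {y for (x, y) in magic_sg(parent, "marc") if x == "marc"}
print(sorted(answers))
```

Only atoms whose first argument lies in the magic set (the ancestors of marc) are ever derived, instead of the same-generation pairs of the whole database.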

20 Recursive Methods
There are many other recursive methods, but magic sets is the most general and the most widely used in deductive systems, including LDL++.

