2005lav-iv1  On the inverse rules algorithm
- It is guaranteed to compute the certain answers.
- But what about its efficiency?
- As presented, it computes tuples using views that cannot contribute to the rewriting, and then discards these tuples.
- We show examples, and then how to address the problems.

2005lav-iv2  Example:
- A db: parenthood relation par(c, p)
- A view: v(C, G) :- par(C, P), par(P, G)   // only grandchildren
- A query: Q: q(X, Y) :- par(X, Z), par(Z, Y)   // find grandchildren
- The algorithm inverts the view:
    par(C, f(C, G)), par(f(C, G), G) :- v(C, G)
- Given n tuples in the view, it produces 2n tuples, joins them, then discards the results that contain f(-,-).
- The bucket algorithm spends more time on rewriting and finds Q’(X, Y) :- v(X, Y), then outputs the n results.
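The cost described on this slide can be made concrete with a minimal sketch. It assumes Skolem terms are encoded as Python tuples ("f", C, G); the function names and the sample tuples are illustrative and not part of the slides.

```python
from itertools import product

def invert_view(v_tuples):
    """Apply par(C, f(C,G)), par(f(C,G), G) :- v(C,G) to every view tuple."""
    par = set()
    for c, g in v_tuples:
        skolem = ("f", c, g)        # Skolem term f(C, G): the unknown parent
        par.add((c, skolem))
        par.add((skolem, g))
    return par

def is_skolem(term):
    return isinstance(term, tuple) and term[0] == "f"

def grandchildren(par):
    """Evaluate q(X, Y) :- par(X, Z), par(Z, Y); drop answers with Skolem terms."""
    answers = set()
    for (x, z1), (z2, y) in product(par, par):
        if z1 == z2 and not (is_skolem(x) or is_skolem(y)):
            answers.add((x, y))
    return answers

v = {("ann", "carl"), ("bob", "dina")}   # made-up view extension
par = invert_view(v)                      # 2n derived par facts
result = grandchildren(par)               # join, then filter
```

The join over the 2n derived facts only reconstructs the n view tuples, which is exactly what the rewriting q(X, Y) :- v(X, Y) returns directly.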

2005lav-iv3  Example (university db):
Views:
  v1(s, c, q, t) :- registered(s, c, q), course(c, t), c>=500, q>=a98
  v2(s, p, c, q) :- registered(s, c, q), teaches(p, c, q)
  v3(s, c) :- registered(s, c, q), q<=a94
  v4(p, c, t, q) :- registered(s, c, q), teaches(p, c, q), course(c, t), q<=a97
Query:
  q(s, p, c) :- registered(s, c, q), teaches(p, c, q), course(c, t), c>=300, q>=a95
Inverting v3:
  registered(s, c, f(s, c)) :- v3(s, c)
- This may produce any number of facts for registered, but for this query none can be used. Why?

2005lav-iv4
  v3(s, c) :- registered(s, c, q), q<=a94
  q(s, p, c) :- registered(s, c, q), teaches(p, c, q), course(c, t), c>=300, q>=a95
- How should the constraint on q in v3 be represented?
  - Could export it as f(s, c) <= a94, which contradicts q >= a95 in the query (how is q in the query transformed to f(s, c)?)
  - But what if the view contained no constraint?
  ⇒ The view must export variables constrained in the query.
- The query has a join on q with teaches; teaches facts are derived only from other views, so q will be exported as a different function symbol, or as q (which of these here?)
  ⇒ The join will fail (cannot join f1(-,-) with f2(-,-) or a regular variable).
  ⇒ The view must export join variables of the query.

2005lav-iv5
- The factors that determine usability of a view are the same as in the bucket algorithm, but the inverse rules algorithm tries to use all views anyway.
- Solution: compose the query with the inverse rules, to obtain a new query that uses the views directly.
- Composition:
  - Consider the heads of the inverse rules as a db, i.e. a collection of facts.
  - Look for valuations: mappings of query variables that map the query atoms to this db.
  - Then replace query goals by views.

2005lav-iv6  Example:
- A db: parenthood relation par(c, p)
- A view: v(C, G) :- par(C, P), par(P, G)   // only grandchildren
- A query: Q: q(X, Y) :- par(X, Z), par(Z, Y)   // find grandchildren
- The algorithm inverts the view; the rule heads form the ‘db’:
    par(C, f(C, G)), par(f(C, G), G) :- v(C, G)
- Two candidate valuation mappings:
  1. X → C, Z → f(C, G), Y → G  ⇒  q(C, G) :- v(C, G), v(C, G)
  2. X → f(C, G), Z → G, Y → f(C, G)  ⇒  (assuming we add C=G)  q(f(G, G), f(G, G)) :- v(G, G), v(G, G)
- The 2nd is discarded: no function symbols are allowed in the result.
- Minimization of the 1st gives q(C, G) :- v(C, G), the same as bucket.

2005lav-iv7
  q(s, p, c) :- registered(s, c, q), teaches(p, c, q), course(c, t), c>=300, q>=a95
  registered(s, c, f(s, c)), f(s, c)<=a94 :- v3(s, c)
- Any valuation that uses this fact must map q → f(s, c).
- The constraint f(s, c) <= a94 contradicts f(s, c) >= a95; but what if there is no constraint to export?
- The mapping q → f(s, c) cannot be used to map teaches to any fact derived from other views ⇒ v3 cannot be used.
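The constraint clash between v3 and the query can be checked mechanically. The sketch below assumes quarters such as a94 and a95 are encoded as the integers 94 and 95 (an assumption about the notation), and it only tests the quarter constraint, not the join condition that also rules views out.

```python
import math

def bounds(constraints):
    """Collapse (op, bound) constraints on one variable into an interval."""
    lo, hi = -math.inf, math.inf
    for op, bound in constraints:
        if op == ">=":
            lo = max(lo, bound)
        elif op == "<=":
            hi = min(hi, bound)
        else:
            raise ValueError(f"unsupported operator: {op}")
    return lo, hi

def satisfiable(constraints):
    lo, hi = bounds(constraints)
    return lo <= hi

# Constraints on the quarter q; a94 is encoded as 94 (assumed encoding).
query_q = [(">=", 95)]
view_q = {"v1": [(">=", 98)], "v2": [], "v3": [("<=", 94)], "v4": [("<=", 97)]}

# A view whose quarter constraint contradicts the query's cannot contribute.
quarter_compatible = {name: satisfiable(cs + query_q) for name, cs in view_q.items()}
```

Only v3 is excluded by this test; a view with no constraint to export, as the slide asks, would pass it and must be ruled out by the join on q instead.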

2005lav-iv8
- A mapping will fail to define a valuation if:
  - the view does not export a join variable, and does not contain the join (why?)
  - the view does not export a variable that is constrained in the query (cannot ‘check’ the constraint in the ‘db’)
- Thus, the results (for a CQ query, possibly with constraints) will be the same as for bucket (assuming it is correct & complete).
- The amount of work invested will probably be similar.
- Composition can also be performed for Datalog queries, but weeding out useless mappings is more difficult.

2005lav-iv9  The MiniCon algorithm: the final one?
- Motivation
- Preliminaries
- The MiniCon algorithm

2005lav-iv10  Motivation
- Previous algorithms (bucket, inverse rules) may be quite expensive to use, especially for systems with many views.
- The bucket algorithm has a narrow peephole in the 1st stage: each bucket is for a single atom
  ⇒ global constraints are treated only in the 2nd stage
  ⇒ many useless combinations may be examined.
- The inverse rules algorithm, improved by composition, seems to perform similar work.
- The motivation: find an algorithm that does more work in preliminary filtering, and scales up to hundreds of views.

2005lav-iv11  Preliminaries
The idea:
- Once a view is put in the bucket of a query atom, switch to considering join variables, and find which other atoms are necessarily covered by the view.
- Along the way, find out also which view head variables need to be equated.
- Given coverage by views, combine views with disjoint covers.
Expected gain: more filtering in the 1st stage, better representation of information
  ⇒ a smaller number of combinations, and a reduced number of containment checks in the 2nd stage.

2005lav-iv12  Example:
- A db: parenthood relation par(c, p)
- A view: v(C, G) :- par(C, P), par(P, G)   // only grandchildren
- A query: Q: q(X, Y) :- par(X, Z), par(Z, Y)
- Bucket: one view in each bucket
    par(X, Z): {v(X, G)}
    par(Z, Y): {v(P, Y)}
  When the two view atoms are combined, a containment check discovers that G=Y ⇒ containment, and redundancy of the 2nd atom.
- Alternative: given par(X, Z): v(X, G), since Z (a join var) occurs in the 2nd atom of the query, add par(Z, Y) to the coverage of v(X, G), with G=Y.
- In the 2nd stage, just use v(X, Y).

2005lav-iv13  Assumptions, terminology:
- CQ queries and views; for now, no constants / constraints in query/views.
- View definitions use variables different from those in the query or other views (disjoint sets of variables).
- b(Q): body atoms of Q; b(V): body atoms of view V.
- A mapping from vars(Q) to vars(V) is interesting only if it maps a non-empty subset of b(Q) to b(V).
- Considered mappings always map Q head vars to V head vars: head variable preservation (hvp).
- If h maps x in vars(Q) to an existential var in some V, then all atoms of b(Q) that contain x must be mapped to the same V: join variable condition (jvc).
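The two conditions can be written as small predicates. This is a sketch under assumed encodings: atoms are tuples such as ("par", "X", "Z"), a mapping h is a dict from query variables to view variables, and the function names are illustrative. The sample view v(C, P) :- par(C, P), par(P, G) is used only to exercise the checks.

```python
def satisfies_hvp(h, q_head, view_head):
    """(hvp): every mapped query head variable goes to a view head variable."""
    return all(h[x] in view_head for x in q_head if x in h)

def satisfies_jvc(h, covered, q_body, view_head):
    """(jvc): if x maps to an existential view variable, every query atom
    containing x must be among the covered atoms."""
    for x, y in h.items():
        if y in view_head:
            continue                     # x is exported by the view
        for atom in q_body:
            if x in atom[1:] and atom not in covered:
                return False
    return True

# Query q(X, Y) :- par(X, Z), par(Z, Y); view v(C, P) :- par(C, P), par(P, G)
q_head = {"X", "Y"}
g1, g2 = ("par", "X", "Z"), ("par", "Z", "Y")
q_body = [g1, g2]
view_head = {"C", "P"}                   # G is existential in this view

h11 = {"X": "C", "Z": "P"}               # g1 onto par(C, P)
h12 = {"X": "P", "Z": "G"}               # g1 onto par(P, G)
h22 = {"Z": "P", "Y": "G"}               # g2 onto par(P, G)
```

Here h11 passes both checks, h12 violates (jvc) because Z is mapped to the existential G while par(Z, Y) is not covered, and h22 violates (hvp) because the head variable Y is mapped to G.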

2005lav-iv14
- Given Q(X), assume Q’ is a rewriting in terms of views:
    Q’: q(X) :- v1(X1), …, vn(Xn)   (some vi, vj may be occurrences of the same view v)
  ⇒ There exists a containment mapping h from Q to exp(Q’) (satisfying hvp).
- Let Gi be the set of atoms of b(Q) mapped to b(exp(vi)), and h/i be h restricted to vars(Gi).
- Then the Gi are pairwise disjoint and their union is b(Q).
- And Gi satisfies (jvc): if h/i maps x of vars(Gi) to an existential variable of vi, then every atom g in b(Q) that contains x is in Gi.

2005lav-iv15
- The occurrence of vi in Q’ may have some head variables equated.
  Example: the original head might be vi(A, B, C), and the head in Q’ vi(X, X, Z).
- These equalities are given by a unique least set of equality constraints Ei (v/E: the view v, with head variables equated as specified by E).
- Summary (so far): the containment mapping can be decomposed into “disjoint” components (vi, Ei, h/i, Gi).
- All we need to do is find such components, then combine them.
- What is the condition for successful combination? Does a combination (s.t. the Gi are disjoint and cover b(Q)) ever fail?

2005lav-iv16
- To find such components, we must use the given view definitions (whose variables are different from those of Q or exp(Q’)).
- Answer: a component and its mapping can be expressed as:
    Gi --hi--> vi/E’i --h’i--> exp(vi(Xi)),   with h/i = h’i ∘ hi
  Here:
  - hi is a mapping from Q to the given view definition for vi
  - E’i is the least set of equalities that make hi a good mapping
  - h’i is a variable renaming
- E’i and hi depend only on Q and the definition of vi
  ⇒ We can find component mappings from Q to the view defs, then combine & rename, possibly equating more head vars.

2005lav-iv17  One more step:
- A component (vi, Ei, hi, Gi) may be further decomposed into smaller components (vi, Ei1, hi1, Gi1), (vi, Ei2, hi2, Gi2), provided each of Gi1, Gi2 satisfies (jvc) and they are disjoint.
- Each of Ei1, Ei2 is a subset of Ei: the least set for the mapping hi1 (resp. hi2) to be ok.
- When these are combined, Ei1 ∪ Ei2 is augmented with the remaining equalities of Ei.
- Minimal such components:
  - are easier to find
  - can be re-used for different combinations.

2005lav-iv18  What is a minimal component?
- C = (vi, Ei, hi, Gi) is minimal if:
  - hi satisfies (hvp) + (jvc) (assuming the equalities in Ei)
  - there is no component C1 whose last three components are contained in C’s last three components (with at least one proper containment)
- A component: MiniCon (mini containment) description, MCD.
- The algorithm constructs and combines minimal MCDs.

2005lav-iv19  The MiniCon Algorithm
Minimal MCD construction algorithm:
  For each g in b(Q), each k in each b(vi):
    Let E(g,k) be the least set of equalities s.t. a mapping h(g,k) from g to k that satisfies (hvp) exists
      // E(g,k) and h(g,k), if they exist, are uniquely determined by g, k
    If E(g,k) and h(g,k) exist, find all minimal MCDs that extend them:
      (vi, Ei, hi, Gi) extends them if Ei contains E(g,k), hi contains h(g,k), Gi contains g
  For the final set of MCDs, remove duplicates

2005lav-iv20  How do we find minimal MCDs that extend a given mapping?
I. Extension to one more query atom, one view atom
  extend(vi, E, h, g, k)
    // E: equalities on head vars of vi
    // h: vars(Q) → vars(vi), partial, hvp with E
    // g in b(Q), k in b(vi)
    try to extend h to map g to k, with hvp, by adding equalities to E
    return fail, or the (uniquely determined) E’, h’
(The first step in the algorithm of the previous slide is this one, given empty E and h.)

2005lav-iv21  How do we find minimal MCDs that extend a given mapping?
II. Extend repeatedly, as long as needed and successful
  Given vi, g, k, E(g,k) and h(g,k):
  Let C = {(vi, E(g,k), h(g,k), {g})}, MC = {}
    // C: initial component, (jvc) possibly not satisfied
  While C is not empty:
    - remove some c = (vi, E, h, G) from C
    - if (jvc) is satisfied, put c in MC
    - if not, there exist x in vars(Q) s.t. h(x) is existential, and g’ that contains x, g’ not in G
    - for each k’ in b(vi): if extend(vi, E, h, g’, k’) succeeds, put the extension in C
  Remove duplicates from MC
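Steps I and II can be sketched together as follows. This is a minimal illustration, not the paper's implementation: it assumes CQs without constants, atoms encoded as tuples, h as a dict, and E as a dict that points a head variable to the head variable it is equated with. The extra head-variable arguments and the name minimal_mcds are assumptions of this sketch.

```python
def find(E, v):
    """Follow E (head var -> equated head var) to a representative."""
    while E.get(v, v) != v:
        v = E[v]
    return v

def extend(view_head, q_head, E, h, g, k):
    """Step I: try to extend h so that query atom g maps onto view atom k.
    Returns (E', h'), or None on failure. Inputs are not modified."""
    if g[0] != k[0] or len(g) != len(k):
        return None
    E, h = dict(E), dict(h)
    for x, y in zip(g[1:], k[1:]):
        if x in q_head and y not in view_head:
            return None                        # (hvp) would be violated
        if x not in h:
            h[x] = y
        elif find(E, h[x]) != find(E, y):
            if h[x] in view_head and y in view_head:
                E[find(E, h[x])] = find(E, y)  # equate two head variables
            else:
                return None                    # cannot equate an existential
    return E, h

def minimal_mcds(q_head, q_body, view_head, view_body, g, k):
    """Step II: worklist loop starting from the mapping of g onto k."""
    start = extend(view_head, q_head, {}, {}, g, k)
    if start is None:
        return []
    work, done = [(start[0], start[1], frozenset([g]))], []
    while work:
        E, h, G = work.pop()
        # an uncovered query atom sharing a variable mapped to an existential
        pending = next((a for a in q_body if a not in G and
                        any(x in h and h[x] not in view_head for x in a[1:])), None)
        if pending is None:                    # (jvc) holds
            if (E, h, G) not in done:
                done.append((E, h, G))
            continue
        for k2 in view_body:
            ext = extend(view_head, q_head, E, h, pending, k2)
            if ext is not None:
                work.append((ext[0], ext[1], G | {pending}))
    return done

# Query q(X, Y) :- par(X, Z), par(Z, Y); view v(C, G) :- par(C, P), par(P, G)
g1, g2 = ("par", "X", "Z"), ("par", "Z", "Y")
k1, k2 = ("par", "C", "P"), ("par", "P", "G")
mcds = minimal_mcds({"X", "Y"}, [g1, g2], {"C", "G"}, [k1, k2], g1, k1)
```

Starting from the first query atom and the first view atom, the loop is forced to pull in par(Z, Y) because Z is mapped to the existential P, and it ends with a single component covering the whole query body.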

2005lav-iv22  Example:
- A db: parenthood relation par(c, p)
- A view: v(C, G) :- par(C, P), par(P, G)   // only grandchildren
- A query: Q: q(X, Y) :- par(X, Z), par(Z, Y)
- MCDs:
  - 1st query atom, 1st view atom: h(1,1) = {X → C, Z → P}, E(1,1) = {}
    need to extend to par(Z, Y), which can only map to the 2nd view atom
    MCD: (v, E={}, h={X → C, Z → P, Y → G}, b(Q))
  - 1st query atom, 2nd view atom: no mapping
  - …
- The only MCD is the above.

2005lav-iv23  Comment:
- In the paper, if (vi, Ei1, hi1, Gi1) and (vi, Ei2, hi2, Gi2) are both minimal extensions, and Gi1 is contained in Gi2, then the 2nd is thrown away (another minimization).
- I do not know how to explain this optimization, or prove that with it the algorithm is still complete.

2005lav-iv24  2nd phase: MCD combination, and variable renaming:
- A set of MCDs {(vi, Ei, hi, Gi)} is a candidate if the Gi are pairwise disjoint and their union is b(Q).
- For each candidate set, rename variables: for each view variable y:
    if hi(x) = y (y a view variable), rename y to x
    else rename y to a fresh distinct variable
- Note: if x is in the domain of both hi and hj, then hi(x), hj(x) are head variables of vi, vj (by def. of MCD) ⇒ renaming makes them equal.
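The candidate test can be sketched as a small backtracking search. The encoding is an assumption of this sketch: each MCD is a tuple whose last element is the frozenset Gi of covered query atoms, and candidate_sets is an illustrative name. The two sample MCDs come from the view v(C, P) :- par(C, P), par(P, G), where each covers one atom of q(X, Y) :- par(X, Z), par(Z, Y).

```python
def candidate_sets(mcds, q_body):
    """Yield lists of MCDs whose covered sets are pairwise disjoint
    and whose union is exactly the query body."""
    target = frozenset(q_body)

    def search(start, chosen, covered):
        if covered == target:
            yield list(chosen)
            return
        for i in range(start, len(mcds)):
            G = mcds[i][-1]
            if covered.isdisjoint(G):          # only combine disjoint covers
                chosen.append(mcds[i])
                yield from search(i + 1, chosen, covered | G)
                chosen.pop()

    yield from search(0, [], frozenset())

g1, g2 = ("par", "X", "Z"), ("par", "Z", "Y")
A1 = ("v", {"X": "C", "Z": "P"}, frozenset({g1}))
A2 = ("v", {"Z": "C", "Y": "P"}, frozenset({g2}))
candidates = list(candidate_sets([A1, A2], [g1, g2]))
```

Because only disjoint covers are ever combined, no containment check is needed at this point; the remaining work is the variable renaming.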

2005lav-iv25  Example (cont’d):
- A db: parenthood relation par(c, p)
- A view: v(C, G) :- par(C, P), par(P, G)   // only grandchildren
- A query: Q: q(X, Y) :- par(X, Z), par(Z, Y)
- MCD: (v, E={}, h={X → C, Z → P, Y → G}, b(Q))
- Rename in v: C to X, G to Y
- Rewriting: q(X, Y) :- v(X, Y)

2005lav-iv26  Example:
- A db: parenthood relation par(c, p)
- A view: v(C, G) :- par(C, P), par(P, G)   // only grandchildren
- A query: Q: q(X, X) :- par(X, Z), par(Z, X)   // I am my own grandpa
- MCDs:
  - 1st query atom, 1st view atom: h(1,1) = {X → C, Z → P}, E(1,1) = {}
    need to extend to par(Z, X), which can only map to the 2nd view atom
    MCD: (v, {C=G}, {X → C, Z → P}, b(Q))
  - 1st query atom, 2nd view atom: no mapping
  - …
- The only MCD is the above.

2005lav-iv27  Example:
- A db: parenthood relation par(c, p)
- A view: v(C, P) :- par(C, P), par(P, G)   // parents where grandparents exist
- A query: Q: q(X, Y) :- par(X, Z), par(Z, Y)
- MCDs:
  - h(1,1) = {X → C, Z → P}, E(1,1) = {} ⇒ MCD A1 = (v(C, P), {}, h(1,1), {par(X,Z)})
  - h(1,2) = {X → P, Z → G}, E(1,2) = {}, fails (why?)
  - h(2,1) = {Z → C, Y → P}, E(2,1) = {} ⇒ MCD A2 = (v(C, P), {}, h(2,1), {par(Z,Y)})
  - h(2,2) = {Z → P, Y → G}, fails (why?)

2005lav-iv28
- A view: v(C, P) :- par(C, P), par(P, G)
- A query: Q: q(X, Y) :- par(X, Z), par(Z, Y)
- MCDs:
    A1 = (v(C, P), {}, h(1,1), {par(X,Z)})
    A2 = (v(C, P), {}, h(2,1), {par(Z,Y)})
- Rewritings (rename views to have distinct vars):
    A1+A2: X → C1, Z → P1, Z → C2, Y → P2: add P1 (in 1st v) = C2 (in 2nd v)
    rewriting: v(C1, P1), v(P1, P2)
    renaming: v(X, Z), v(Z, Y), a correct rewriting
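The renaming step for A1 + A2 can be sketched as below. It is a simplified illustration that assumes empty equality sets Ei and that no two query variables are mapped to the same view variable; the tuple encoding of MCDs, the name rewrite and the fresh-variable prefix _F are assumptions of this sketch.

```python
from itertools import count

def rewrite(q_head_atom, candidate, view_heads):
    """Build the rewriting body: each view head variable y is renamed to the
    query variable x with h(x) = y, or to a fresh variable if there is none."""
    fresh = count(1)
    body = []
    for view, h, _covered in candidate:
        inverse = {}
        for x, y in h.items():
            inverse.setdefault(y, x)           # view variable -> query variable
        args = [inverse.get(y) or f"_F{next(fresh)}" for y in view_heads[view]]
        body.append((view, *args))
    return q_head_atom, body

g1, g2 = ("par", "X", "Z"), ("par", "Z", "Y")
A1 = ("v", {"X": "C", "Z": "P"}, frozenset({g1}))
A2 = ("v", {"Z": "C", "Y": "P"}, frozenset({g2}))
head, body = rewrite(("q", "X", "Y"), [A1, A2], {"v": ("C", "P")})
```

Renaming both occurrences of v through their own mappings makes the shared query variable Z appear in both atoms, which is the equality P1 = C2 from the slide.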

2005lav-iv29  When Q or views contain constants:
- MCD formation:
  - a constant a of Q must be mapped to a head variable of vi, or to itself
  - if x is in headvar(Q), it can be mapped to headvar(vi) or to a constant a
  - whenever x is mapped to a, hi records this fact
- MCD combination: if A1, A2 are both defined on x, then allow also:
  - both map x to a
  - one maps x to a, the other to a head var of the view
  In either case, rename x to a in the rewriting.

2005lav-iv30  When Q or views contain comparisons:
- If views contain comparisons, no change to the algorithm (it finds contained rewritings anyway).
- If Q contains comparisons, then there may be no Datalog program that computes the certain answers (can express x != y).
- But we can expect that extending the algorithm for comparisons will be a good heuristic, and will find certain answers in many cases.

2005lav-iv31  When Q or views contain comparisons:
- C(Q): constraints of Q (closed under inference)
- MCD formation: (vi, Ei, hi, Gi) (extend the join variable condition)
  - if hi(x) is an existential of vi, and c(x, y) is in C(Q), then hi(y) is defined
  - C(vi) must imply all constraints in hi(C(Q)) that involve at least one existential of vi
- MCD combination: add all constraints of C(Q) not covered by those of the views.
