CS4432: Database Systems II
    Lecture #15: Logical Query Rewriting
    Professor Elke A. Rundensteiner
Query in SQL → Query Plan in Algebra (logical) → Other Query Plan in Algebra (logical)
Query plan 1 (in relational algebra)
    $\pi_{B,D}\,[\,\sigma_{R.A=\text{"c"} \wedge S.E=2 \wedge R.C=S.C}\,(R \times S)\,]$
Query plan 2 (in relational algebra)
    $\pi_{B,D}\,[\,\sigma_{R.A=\text{"c"}}(R) \bowtie \sigma_{S.E=2}(S)\,]$, with the natural join on R.C = S.C
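The two plans can be compared on a small example. The following is a minimal sketch, assuming hypothetical sample tuples for R(A, B, C) and S(C, D, E); the function names `plan1` and `plan2` are illustrative, not from the lecture.

```python
# Hypothetical sample data; only the attribute names follow the slides.
R = [
    {"A": "c", "B": 1, "C": 10},
    {"A": "a", "B": 2, "C": 10},
    {"A": "c", "B": 3, "C": 20},
]
S = [
    {"C": 10, "D": "x", "E": 2},
    {"C": 20, "D": "y", "E": 3},
    {"C": 10, "D": "z", "E": 2},
]

def plan1(R, S):
    """Cross product first, then a single selection, then project on B, D."""
    out = []
    for r in R:
        for s in S:
            if r["A"] == "c" and s["E"] == 2 and r["C"] == s["C"]:
                out.append((r["B"], s["D"]))
    return sorted(out)

def plan2(R, S):
    """Select on each input first, then join on C, then project on B, D."""
    r_sel = [r for r in R if r["A"] == "c"]
    s_sel = [s for s in S if s["E"] == 2]
    out = [(r["B"], s["D"]) for r in r_sel for s in s_sel if r["C"] == s["C"]]
    return sorted(out)

result1 = plan1(R, S)
result2 = plan2(R, S)
print(result1 == result2)
```

Both plans return the same tuples on this data; they differ only in how many tuple pairs are examined along the way.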
Relational algebra optimization
    What are transformation rules?
        – preserve equivalence
    What are good transformations?
        – reduce query execution costs
Rules: Natural join rewriting
    $R \bowtie S = S \bowtie R$
    $(R \bowtie S) \bowtie T = R \bowtie (S \bowtie T)$
    Can also write as trees, e.g.: [figure: join trees for $(R \bowtie S) \bowtie T$ and $R \bowtie (S \bowtie T)$]
Rules: Other binary operators?
    $R \bowtie S = S \bowtie R$
    $(R \bowtie S) \bowtie T = R \bowtie (S \bowtie T)$
    What about: cross product? Condition join? Union? Intersection? Difference?
Note: Carry attribute names in results, so order is not important
    [figure: join trees for $(R \bowtie S) \bowtie T$ and $R \bowtie (S \bowtie T)$]
Rules: Natural joins & cross products & union
    $R \bowtie S = S \bowtie R$
    $(R \bowtie S) \bowtie T = R \bowtie (S \bowtie T)$
    $R \times S = S \times R$
    $(R \times S) \times T = R \times (S \times T)$
    $R \cup S = S \cup R$
    $R \cup (S \cup T) = (R \cup S) \cup T$
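The join rules can be checked on a small example. This is a minimal sketch, assuming tuples are represented as dicts keyed by attribute name (so column order does not matter) and using hypothetical data; `natural_join` and `canon` are illustrative helper names.

```python
def natural_join(R, S):
    """Natural join of two lists of dicts on their shared attribute names."""
    return [
        {**r, **s}
        for r in R
        for s in S
        if all(r[k] == s[k] for k in r.keys() & s.keys())
    ]

def canon(rel):
    """Order-independent form of a relation, keeping duplicates."""
    return sorted(tuple(sorted(t.items())) for t in rel)

# Hypothetical relations R(A, B), S(B, C), T(C, D).
R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5}, {"B": 4, "C": 6}, {"B": 2, "C": 7}]
T = [{"C": 5, "D": 8}, {"C": 7, "D": 9}]

commutes = canon(natural_join(R, S)) == canon(natural_join(S, R))

left_first = canon(natural_join(natural_join(R, S), T))
right_first = canon(natural_join(R, natural_join(S, T)))
associates = left_first == right_first

print(commutes, associates)
```

Because each tuple carries its attribute names, the two join orders produce the same tuples even though the dicts are built in a different key order.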
Rules: Selects
    $\sigma_{p_1 \wedge p_2}(R) = \sigma_{p_1}[\sigma_{p_2}(R)]$
    $\sigma_{p_1 \vee p_2}(R) = [\sigma_{p_1}(R)] \cup [\sigma_{p_2}(R)]$
Bags vs. Sets
    R = {a,a,b,b,b,c}
    S = {b,b,c,c,d}
    What about the union $R \cup S$?
    Option 1 (SUM): $R \cup S$ = {a,a,b,b,b,b,b,c,c,c,d}
    Option 2 (MAX): $R \cup S$ = {a,a,b,b,b,c,c,d}
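Both options can be reproduced with Python's `collections.Counter`, used here as a stand-in for a bag: `+` adds multiplicities (SUM) and `|` takes the larger multiplicity (MAX). A minimal sketch; the variable names are illustrative.

```python
from collections import Counter

# Bags from the slide, with one character per tuple.
R = Counter("aabbbc")
S = Counter("bbccd")

union_sum = R + S  # Option 1: multiplicities are added
union_max = R | S  # Option 2: the larger multiplicity wins

print(sorted(union_sum.elements()))
print(sorted(union_max.elements()))
```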
Which option makes this rule work? Let us try MAX():
    $\sigma_{p_1 \vee p_2}(R) = \sigma_{p_1}(R) \cup \sigma_{p_2}(R)$
    Example: R = {a,a,b,b,b,c}; p1 satisfied by a, b; p2 satisfied by b, c
    $\sigma_{p_1 \vee p_2}(R)$ = {a,a,b,b,b,c}
    $\sigma_{p_1}(R)$ = {a,a,b,b,b}
    $\sigma_{p_2}(R)$ = {b,b,b,c}
    $\sigma_{p_1}(R) \cup \sigma_{p_2}(R)$ = {a,a,b,b,b,c}
Which option makes this rule work? What about SUM()?
    $\sigma_{p_1 \vee p_2}(R) = \sigma_{p_1}(R) \cup \sigma_{p_2}(R)$
    Example: R = {a,a,b,b,b,c}; p1 satisfied by a, b; p2 satisfied by b, c
    $\sigma_{p_1 \vee p_2}(R)$ = {a,a,b,b,b,c}
    $\sigma_{p_1}(R)$ = {a,a,b,b,b}
    $\sigma_{p_2}(R)$ = {b,b,b,c}
    $\sigma_{p_1}(R) \cup \sigma_{p_2}(R)$ = {a,a,b,b,b,b,b,b,c}
Which option makes this rule work?
    $\sigma_{p_1 \wedge p_2}(R) = \sigma_{p_1}[\sigma_{p_2}(R)]$
    Example: R = {a,a,b,b,b,c}; p1 satisfied by a, b; p2 satisfied by b, c
    MAX or SUM?
Option 2 (MAX) makes this rule work:
    $\sigma_{p_1 \vee p_2}(R) = \sigma_{p_1}(R) \cup \sigma_{p_2}(R)$
    Example: R = {a,a,b,b,b,c}; p1 satisfied by a, b; p2 satisfied by b, c
    $\sigma_{p_1 \vee p_2}(R)$ = {a,a,b,b,b,c}
    $\sigma_{p_1}(R)$ = {a,a,b,b,b}
    $\sigma_{p_2}(R)$ = {b,b,b,c}
    $\sigma_{p_1}(R) \cup \sigma_{p_2}(R)$ = {a,a,b,b,b,c}
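The example can be replayed in code. This is a minimal sketch, again assuming `collections.Counter` as the bag representation; `select` is an illustrative helper, not part of any library. It also checks the conjunction rule, which involves no union and so does not depend on the MAX/SUM choice.

```python
from collections import Counter

def select(bag, pred):
    """Bag selection: a qualifying tuple keeps its full multiplicity."""
    return Counter({t: n for t, n in bag.items() if pred(t)})

R = Counter("aabbbc")
p1 = lambda t: t in "ab"  # satisfied by a, b
p2 = lambda t: t in "bc"  # satisfied by b, c

# Disjunction rule: selection with (p1 OR p2) versus the union of two selections.
lhs = select(R, lambda t: p1(t) or p2(t))
rhs_max = select(R, p1) | select(R, p2)  # MAX union
rhs_sum = select(R, p1) + select(R, p2)  # SUM union

# Conjunction rule: selection with (p1 AND p2) versus cascaded selections.
conj_lhs = select(R, lambda t: p1(t) and p2(t))
conj_rhs = select(select(R, p2), p1)

print(lhs == rhs_max, lhs == rhs_sum, conj_lhs == conj_rhs)
```

Under SUM the tuple b is counted once per matching predicate, which is why the disjunction rule fails for bags with that option.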
Yet another example!
    Senators(……), Reps(……)
    $T_1 = \pi_{yr,state}(\text{Senators})$; $T_2 = \pi_{yr,state}(\text{Reps})$
    T1: Yr  State      T2: Yr  State
        97  CA             99  CA
        99  CA             99  CA
        98  AZ             98  CA
    Union? The "SUM" option makes more sense!
Decision
    In summary, we tend to use the "SUM" option for bag union.
    Thus great care must be taken, as some rules cannot be used for bags!
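Rules: Project
    Let: X = set of attributes, Y = set of attributes, XY = X ∪ Y
    $\pi_{X}(R) = \pi_{X}[\pi_{XY}(R)]$
    Note that $\pi_{XY}(R) = \pi_{X}[\pi_{Y}(R)]$ does not hold in general: the inner projection discards the attributes of X that are not in Y.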
Rules: $\sigma$ + $\bowtie$ combined
    Let p = predicate with only R attributes
        q = predicate with only S attributes
        m = predicate with both R and S attributes
    $\sigma_{p}(R \bowtie S) = [\sigma_{p}(R)] \bowtie S$
    $\sigma_{q}(R \bowtie S) = R \bowtie [\sigma_{q}(S)]$
Rules: $\sigma$ + $\bowtie$ combined
    $\sigma_{p \wedge q}(R \bowtie S) = {?}$
    Rule can be derived!
Derivation for rule:
    $\sigma_{p \wedge q}(R \bowtie S)$
    $= \sigma_{p}[\sigma_{q}(R \bowtie S)]$
    $= \sigma_{p}[R \bowtie \sigma_{q}(S)]$
    $= [\sigma_{p}(R)] \bowtie [\sigma_{q}(S)]$
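Each step of the derivation can be evaluated on a small example. A minimal sketch, assuming hypothetical relations R(A, C) and S(C, E) and illustrative helpers `natural_join`, `select`, and `canon`:

```python
def natural_join(R, S):
    """Natural join of two lists of dicts on their shared attribute names."""
    return [
        {**r, **s}
        for r in R
        for s in S
        if all(r[k] == s[k] for k in r.keys() & s.keys())
    ]

def select(rel, pred):
    """Keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]

def canon(rel):
    """Order-independent form of a relation, keeping duplicates."""
    return sorted(tuple(sorted(t.items())) for t in rel)

# Hypothetical data.
R = [{"A": "c", "C": 1}, {"A": "a", "C": 1}, {"A": "c", "C": 2}]
S = [{"C": 1, "E": 2}, {"C": 2, "E": 5}, {"C": 1, "E": 7}]

p = lambda t: t["A"] == "c"  # uses only R attributes
q = lambda t: t["E"] == 2    # uses only S attributes

step1 = canon(select(natural_join(R, S), lambda t: p(t) and q(t)))
step2 = canon(select(select(natural_join(R, S), q), p))
step3 = canon(select(natural_join(R, select(S, q)), p))
step4 = canon(natural_join(select(R, p), select(S, q)))

print(step1 == step2 == step3 == step4)
```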
Rules: $\sigma$ + $\bowtie$ combined (continued)
    More rules can be derived:
    $\sigma_{p \wedge q}(R \bowtie S) = {?}$
    $\sigma_{p \wedge q \wedge m}(R \bowtie S) = {?}$
    $\sigma_{p \vee q}(R \bowtie S) = {?}$
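We did one, do the others on your own:
    $\sigma_{p \wedge q}(R \bowtie S) = [\sigma_{p}(R)] \bowtie [\sigma_{q}(S)]$
    $\sigma_{p \wedge q \wedge m}(R \bowtie S) = \sigma_{m}[(\sigma_{p} R) \bowtie (\sigma_{q} S)]$
    $\sigma_{p \vee q}(R \bowtie S) = [(\sigma_{p} R) \bowtie S] \cup [R \bowtie (\sigma_{q} S)]$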
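The last two rules can be checked the same way. This is a minimal sketch under set semantics (duplicates removed before comparing), with hypothetical relations R(A, B, C) and S(C, D, E); the helper names are illustrative.

```python
def natural_join(R, S):
    """Natural join of two lists of dicts on their shared attribute names."""
    return [
        {**r, **s}
        for r in R
        for s in S
        if all(r[k] == s[k] for k in r.keys() & s.keys())
    ]

def select(rel, pred):
    """Keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]

def as_set(rel):
    """Set semantics: drop duplicates and ignore order."""
    return {tuple(sorted(t.items())) for t in rel}

# Hypothetical data.
R = [{"A": 1, "B": 5, "C": 10}, {"A": 2, "B": 9, "C": 10}, {"A": 1, "B": 1, "C": 20}]
S = [{"C": 10, "D": 7, "E": 0}, {"C": 10, "D": 3, "E": 1}, {"C": 20, "D": 8, "E": 1}]

p = lambda t: t["A"] == 1      # only R attributes
q = lambda t: t["E"] == 1      # only S attributes
m = lambda t: t["B"] < t["D"]  # both R and S attributes

# Conjunction with a mixed predicate: push p and q down, apply m after the join.
conj_lhs = as_set(select(natural_join(R, S), lambda t: p(t) and q(t) and m(t)))
conj_rhs = as_set(select(natural_join(select(R, p), select(S, q)), m))

# Disjunction: union of two joins, each with one selection pushed down.
disj_lhs = as_set(select(natural_join(R, S), lambda t: p(t) or q(t)))
disj_rhs = as_set(natural_join(select(R, p), S)) | as_set(natural_join(R, select(S, q)))

print(conj_lhs == conj_rhs, disj_lhs == disj_rhs)
```

The disjunction check uses a set union on purpose: a joined tuple that satisfies both p and q appears in both branches, so a SUM-style bag union would count it twice.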
Rules: $\pi$, $\sigma$ combined
    Let x = subset of R attributes
        z = attributes in predicate p (subset of R attributes)
    $\pi_{x}[\sigma_{p}(R)] = \pi_{x}\{\sigma_{p}[\pi_{xz}(R)]\}$
Rules: $\pi$, $\bowtie$ combined
    Let x = subset of R attributes
        y = subset of S attributes
        z = intersection of R, S attributes
    $\pi_{xy}(R \bowtie S) = \pi_{xy}\{[\pi_{xz}(R)] \bowtie [\pi_{yz}(S)]\}$
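The projection rule for joins can be tried on a small example. A minimal sketch, assuming hypothetical relations R(A, B, C) and S(C, D, E) with x = {A}, y = {D}, z = {C}, and a bag-style projection that keeps duplicates; the helper names are illustrative.

```python
def natural_join(R, S):
    """Natural join of two lists of dicts on their shared attribute names."""
    return [
        {**r, **s}
        for r in R
        for s in S
        if all(r[k] == s[k] for k in r.keys() & s.keys())
    ]

def project(rel, attrs):
    """Projection that keeps duplicates."""
    return [{a: t[a] for a in attrs} for t in rel]

def canon(rel):
    """Order-independent form of a relation, keeping duplicates."""
    return sorted(tuple(sorted(t.items())) for t in rel)

# Hypothetical data.
R = [{"A": 1, "B": 5, "C": 10}, {"A": 2, "B": 9, "C": 10}, {"A": 1, "B": 1, "C": 20}]
S = [{"C": 10, "D": 7, "E": 0}, {"C": 10, "D": 3, "E": 1}, {"C": 20, "D": 8, "E": 1}]

# Project after the join.
late = canon(project(natural_join(R, S), ["A", "D"]))

# Project each input down to the needed attributes plus the join attribute first.
narrow_r = project(R, ["A", "C"])
narrow_s = project(S, ["C", "D"])
early = canon(project(natural_join(narrow_r, narrow_s), ["A", "D"]))

print(late == early)
```

Keeping the join attribute C in both early projections is what makes the rewrite safe; dropping it would turn the join into a cross product.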
In textbook: more transformations
    More rewrite rules
    Other operations, such as duplicate elimination, etc.
    Eliminate common sub-expressions
    Identify contradictions
Rules: $\pi$, $\sigma$, $\bowtie$ combined
    $\pi_{xy}\{\sigma_{p}(R \bowtie S)\} = \pi_{xy}\{\sigma_{p}[\pi_{xz'}(R) \bowtie \pi_{yz'}(S)]\}$
    $z' = z \cup \{\text{attributes used in } p\}$
Rules for $\sigma$, $\pi$ combined with $\times$ are similar...
    e.g., $\sigma_{p}(R \times S) = {?}$
Rules: $\sigma$, $\cup$ combined
    $\sigma_{p}(R \cup S) = \sigma_{p}(R) \cup \sigma_{p}(S)$
    $\sigma_{p}(R - S) = \sigma_{p}(R) - S = \sigma_{p}(R) - \sigma_{p}(S)$
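These rules hold for bags as well, which can be checked with `collections.Counter`. A minimal sketch, assuming SUM union (`+`) and bag difference (`-`, which drops tuples whose count would fall to zero or below); `select` is an illustrative helper.

```python
from collections import Counter

def select(bag, pred):
    """Bag selection: a qualifying tuple keeps its full multiplicity."""
    return Counter({t: n for t, n in bag.items() if pred(t)})

R = Counter("aabbbc")
S = Counter("bbccd")
p = lambda t: t in "ab"

# Selection distributes over (SUM) union.
union_lhs = select(R + S, p)
union_rhs = select(R, p) + select(S, p)

# Selection over difference: three equivalent forms.
diff_a = select(R - S, p)
diff_b = select(R, p) - S
diff_c = select(R, p) - select(S, p)

print(union_lhs == union_rhs, diff_a == diff_b == diff_c)
```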
Which are "good" transformations?
Conventional wisdom: do projects early
    Example: relation R(A,B,C,D,E), predicate p: (A = 3) ∧ (B = "cat")
    $\pi_{E}\{\sigma_{p}(R)\}$ vs. $\pi_{E}\{\sigma_{p}\{\pi_{ABE}(R)\}\}$
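The two expressions in this example can be compared directly. A minimal sketch with hypothetical tuples for R(A, B, C, D, E); `project` is an illustrative helper.

```python
def project(rel, attrs):
    """Projection that keeps duplicates."""
    return [{a: t[a] for a in attrs} for t in rel]

# Hypothetical data for R(A, B, C, D, E).
R = [
    {"A": 3, "B": "cat", "C": 0, "D": 0, "E": 10},
    {"A": 3, "B": "dog", "C": 1, "D": 1, "E": 20},
    {"A": 4, "B": "cat", "C": 2, "D": 2, "E": 30},
    {"A": 3, "B": "cat", "C": 3, "D": 3, "E": 40},
]

p = lambda t: t["A"] == 3 and t["B"] == "cat"

# Select on the full tuples, then project on E.
late = [t["E"] for t in R if p(t)]

# Project down to A, B, E first, so the selection sees narrower tuples.
narrow = project(R, ["A", "B", "E"])
early = [t["E"] for t in narrow if p(t)]

print(late == early)
```

The early projection must keep A and B because the predicate still needs them; only C and D can be dropped before the selection.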
But what if we have A, B indexes?
    Look up B = "cat" in the B index and A = 3 in the A index.
    Intersect the pointers to get pointers to the matching tuples!
    Then it is better to do the projection later!
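The index-intersection idea can be sketched with in-memory dictionaries standing in for the A and B indexes. This is only an illustration under that assumption: row positions play the role of tuple pointers, and `build_index` is a hypothetical helper.

```python
from collections import defaultdict

def build_index(rows, attr):
    """Map each value of attr to the set of row positions holding it."""
    index = defaultdict(set)
    for rid, row in enumerate(rows):
        index[row[attr]].add(rid)
    return index

# Hypothetical data for R(A, B, C, D, E).
R = [
    {"A": 3, "B": "cat", "C": 0, "D": 0, "E": 10},
    {"A": 3, "B": "dog", "C": 1, "D": 1, "E": 20},
    {"A": 4, "B": "cat", "C": 2, "D": 2, "E": 30},
    {"A": 3, "B": "cat", "C": 3, "D": 3, "E": 40},
]

index_a = build_index(R, "A")
index_b = build_index(R, "B")

# Intersect the two pointer sets; only these tuples are fetched.
matching = index_a.get(3, set()) & index_b.get("cat", set())

# Projection on E happens last, on the few tuples that were fetched.
result = [R[rid]["E"] for rid in sorted(matching)]
print(result)
```

An early projection would have to read every tuple of R, which defeats the point of using the indexes.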
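Which are "good" transformations?
    $\sigma_{p_1 \wedge p_2}(R) \rightarrow \sigma_{p_1}[\sigma_{p_2}(R)]$
    $\sigma_{p}(R \bowtie S) \rightarrow [\sigma_{p}(R)] \bowtie S$
    $R \bowtie S \rightarrow S \bowtie R$
    $\pi_{x}[\sigma_{p}(R)] \rightarrow \pi_{x}\{\sigma_{p}[\pi_{xz}(R)]\}$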
Bottom line:
    Some heuristics:
        – Early selection is usually good
    No transformation is always good
    Rule application defines a search space
        – Need cost criteria to make decision
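To illustrate why a cost criterion is needed, the sketch below compares two equivalent plans using a deliberately crude, hypothetical cost measure: the number of tuple pairs the join step has to examine. The generated data and the cost measure are assumptions for illustration; a real optimizer would also account for the cost of the selection scans, I/O, and available indexes.

```python
# Hypothetical generated relations R(A, C) and S(C, E).
R = [{"A": "c" if i % 10 == 0 else "a", "C": i % 5} for i in range(100)]
S = [{"C": i % 5, "E": i % 4} for i in range(80)]

# Plan 1: cross product first, selection afterwards.
pairs_plan1 = len(R) * len(S)
matches_plan1 = sum(
    1
    for r in R
    for s in S
    if r["A"] == "c" and s["E"] == 2 and r["C"] == s["C"]
)

# Plan 2: early selection on both inputs, then the join.
r_sel = [r for r in R if r["A"] == "c"]
s_sel = [s for s in S if s["E"] == 2]
pairs_plan2 = len(r_sel) * len(s_sel)
matches_plan2 = sum(1 for r in r_sel for s in s_sel if r["C"] == s["C"])

print(pairs_plan1, pairs_plan2, matches_plan1 == matches_plan2)
```

On this data early selection shrinks the join input considerably while returning the same number of result tuples, but the gain depends entirely on how selective the predicates are.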