Algebraic Laws
[Figure: query processing pipeline]
SQL query → parse → parse tree → convert → logical query plan (l.q.p.) → apply laws → “improved” l.q.p. → estimate result sizes (using statistics) → l.q.p. + sizes → consider physical plans → {P1, P2, …} → estimate costs → {<P1,C1>, …} → pick best → Pi → execute → answer
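Algebraic laws for improving query plans
Commutative and associative laws:
R × S = S × R; (R × S) × T = R × (S × T)
R ⊳⊲ S = S ⊳⊲ R; (R ⊳⊲ S) ⊳⊲ T = R ⊳⊲ (S ⊳⊲ T)
R ⋃ S = S ⋃ R; (R ⋃ S) ⋃ T = R ⋃ (S ⋃ T)
R ⋂ S = S ⋂ R; (R ⋂ S) ⋂ T = R ⋂ (S ⋂ T)
Proofs of the above identities?
The above laws hold for both sets and bags.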
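Theta-join
Commutative: R ⊳⊲_{C} S = S ⊳⊲_{C} R
Not always associative: on schema R(a,b), S(b,c), T(c,d), the expression
(R ⊳⊲_{R.b > S.b} S) ⊳⊲_{a < d} T
cannot be transformed into
R ⊳⊲_{R.b > S.b} (S ⊳⊲_{a < d} T)
because a is not an attribute of S or T, so the condition a < d cannot be evaluated on S and T alone.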
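Laws Involving Selection (σ)
Splitting laws:
σ_{C1 AND C2}(R) = σ_{C1}(σ_{C2}(R))
σ_{C1 OR C2}(R) = σ_{C1}(R) ⋃_S σ_{C2}(R), where R is a set and ⋃_S is set union
Selections commute: σ_{C1}(σ_{C2}(R)) = σ_{C2}(σ_{C1}(R))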
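Pushing σ through binary operators:
σ_{C}(R ⋃ S) = σ_{C}(R) ⋃ σ_{C}(S) (the selection must be pushed to both arguments)
σ_{C}(R – S) = σ_{C}(R) – S = σ_{C}(R) – σ_{C}(S)
If all attributes in the condition C are in R:
σ_{C}(R × S) = σ_{C}(R) × S
σ_{C}(R ⊳⊲ S) = σ_{C}(R) ⊳⊲ S
σ_{C}(R ⊳⊲_{D} S) = σ_{C}(R) ⊳⊲_{D} S
σ_{C}(R ⋂ S) = σ_{C}(R) ⋂ S
Note that in a union, R and S must have the same attributes, so C can always be applied to both.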
Example: Consider relation schemas R(A,B) and S(B,C) and the RA expression below:
σ_{(A=1 OR A=3) AND B < C}(R ⊳⊲ S)
1. Splitting AND: σ_{A=1 OR A=3}(σ_{B < C}(R ⊳⊲ S))
2. Push σ_{B < C} to S: σ_{A=1 OR A=3}(R ⊳⊲ σ_{B < C}(S))
3. Push σ_{A=1 OR A=3} to R: σ_{A=1 OR A=3}(R) ⊳⊲ σ_{B < C}(S)
4. Splitting OR: (σ_{A=1}(R) ⋃ σ_{A=3}(R)) ⊳⊲ σ_{B < C}(S)
Step 4 above may or may not be useful.
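The rewriting steps in this example can be sanity-checked on a small instance. The following is a minimal sketch, not a proof: the sample tuples and the helper names (`bag`, `select`, `natural_join`, `set_union`) are made up for illustration, relations are modeled as lists of dicts, and R is assumed to contain no duplicates so that the OR-splitting in step 4 is valid.

```python
from collections import Counter

def bag(rows):
    """Canonical multiset form so two bags of rows can be compared."""
    return Counter(tuple(sorted(r.items())) for r in rows)

def select(pred, rows):
    return [r for r in rows if pred(r)]

def natural_join(left, right):
    """Pair up rows that agree on every attribute name they share."""
    return [{**l, **r} for l in left for r in right
            if all(l[k] == r[k] for k in l.keys() & r.keys())]

def set_union(x, y):
    """Union with duplicate elimination (the set version of union)."""
    seen = {}
    for r in x + y:
        seen.setdefault(tuple(sorted(r.items())), r)
    return list(seen.values())

# Hypothetical instances of R(A,B) and S(B,C); R has no duplicates.
R = [{"A": 1, "B": 2}, {"A": 3, "B": 5}, {"A": 4, "B": 2}, {"A": 1, "B": 9}]
S = [{"B": 2, "C": 7}, {"B": 5, "C": 1}, {"B": 9, "C": 10}]

def a_is_1_or_3(r):
    return r["A"] in (1, 3)

def b_lt_c(r):
    return r["B"] < r["C"]

original = select(lambda r: a_is_1_or_3(r) and b_lt_c(r), natural_join(R, S))
step1 = select(a_is_1_or_3, select(b_lt_c, natural_join(R, S)))   # split AND
step2 = select(a_is_1_or_3, natural_join(R, select(b_lt_c, S)))   # push B < C to S
step3 = natural_join(select(a_is_1_or_3, R), select(b_lt_c, S))   # push A-condition to R
step4 = natural_join(                                             # split OR
    set_union(select(lambda r: r["A"] == 1, R),
              select(lambda r: r["A"] == 3, R)),
    select(b_lt_c, S))

all_equal = all(bag(p) == bag(original) for p in (step1, step2, step3, step4))
print(all_equal)
```

Agreement on one instance only illustrates the laws; it does not replace the algebraic argument.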
Some Trivial Laws
Watch for some extreme cases:
-- an empty relation: e.g., R ⋃ S = S, if R = ∅
-- a selection or theta-join whose condition is always satisfied: e.g., σ_{C}(R) = R, if C = true
-- a projection onto all attributes is better not done at all
Pushing selections
Usually selections are pushed down the expression tree. The following example shows that it is sometimes useful to pull a selection up the tree first.
StarsIn(title,year,starName)
Movie(title,year,length,studioName)
CREATE VIEW MoviesOf1996 AS SELECT * FROM Movie WHERE year = 1996;
Query: Which stars worked for which studios in 1996?
SELECT starName, studioName FROM MoviesOf1996 NATURAL JOIN StarsIn;
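Pull the selection up, then push it down:
Initial plan (view expanded): π_{starName,studioName}(σ_{year=1996}(Movie) ⊳⊲ StarsIn)
Pull σ up: π_{starName,studioName}(σ_{year=1996}(Movie ⊳⊲ StarsIn))
Push σ down both branches: π_{starName,studioName}(σ_{year=1996}(Movie) ⊳⊲ σ_{year=1996}(StarsIn))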
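The effect of pulling the selection up and pushing it down both branches can be sketched on toy data. This is an illustrative sketch only: the tuples, titles, and helper names are hypothetical, and relations are modeled as lists of dicts joined on their shared attribute names (title and year).

```python
from collections import Counter

def bag(rows):
    """Canonical multiset form so two bags of rows can be compared."""
    return Counter(tuple(sorted(r.items())) for r in rows)

def select(pred, rows):
    return [r for r in rows if pred(r)]

def project(attrs, rows):
    """Bag projection: keeps duplicates."""
    return [{a: r[a] for a in attrs} for r in rows]

def natural_join(left, right):
    return [{**l, **r} for l in left for r in right
            if all(l[k] == r[k] for k in l.keys() & r.keys())]

# Hypothetical sample data.
Movie = [
    {"title": "M1", "year": 1996, "length": 100, "studioName": "StudioA"},
    {"title": "M2", "year": 1995, "length": 90, "studioName": "StudioB"},
]
StarsIn = [
    {"title": "M1", "year": 1996, "starName": "Star1"},
    {"title": "M1", "year": 1996, "starName": "Star2"},
    {"title": "M2", "year": 1995, "starName": "Star1"},
]

def is_1996(r):
    return r["year"] == 1996

out = ["starName", "studioName"]
initial = project(out, natural_join(select(is_1996, Movie), StarsIn))
pulled_up = project(out, select(is_1996, natural_join(Movie, StarsIn)))
pushed_down = project(out, natural_join(select(is_1996, Movie),
                                        select(is_1996, StarsIn)))

# Number of StarsIn tuples that reach the join in the first and last plan.
stars_in_initial = len(StarsIn)
stars_in_pushed = len(select(is_1996, StarsIn))
print(bag(initial) == bag(pulled_up) == bag(pushed_down))
print(stars_in_initial, stars_in_pushed)
```

All three plans return the same bag, but in the last plan fewer StarsIn tuples take part in the join, which is the point of the transformation.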
Laws for (bag) Projection
A simple law: project out attributes that are not needed later, i.e., keep only the output attributes and any join attributes.
Laws for (bag) Projection …
Bag projections cannot be pushed below set union (⋃_S), and no projection can be pushed below either the set or bag versions of ⋂ and –.
Example: Consider relation schemas R(A,B) and S(A,B). Suppose R has only (1,2) and S has only (1,3). Then
π_{A}(R ⋂ S) = π_{A}(∅) = ∅, but π_{A}(R) ⋂ π_{A}(S) = {(1)}
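The counterexample can be replayed directly. The sketch below models bags with `collections.Counter`, whose `&` operator takes the minimum of the counts (bag intersection); the helper names `project_A` and `set_union` are made up for illustration. It also shows, on the same two relations, why a bag projection cannot be pushed below a set union.

```python
from collections import Counter

# R and S over schema (A, B), as in the example above.
R = Counter([(1, 2)])
S = Counter([(1, 3)])

def project_A(rel):
    """Bag projection onto A: keeps one copy per input tuple."""
    out = Counter()
    for (a, _b), count in rel.items():
        out[(a,)] += count
    return out

def set_union(x, y):
    """Set union: every distinct tuple appears exactly once."""
    return Counter(set(x) | set(y))

# Intersection: projecting first makes unrelated tuples look equal.
lhs_intersect = project_A(R & S)
rhs_intersect = project_A(R) & project_A(S)

# Set union: projecting after the union keeps two copies of (1),
# projecting before the union lets the union collapse them into one.
lhs_union = project_A(set_union(R, S))
rhs_union = set_union(project_A(R), project_A(S))

print(dict(lhs_intersect), dict(rhs_intersect))
print(dict(lhs_union), dict(rhs_union))
```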
Examples for pushing projection Schema R(a,b,c), S(c,d,e)
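As one possible instance on this schema (the projection list below is an assumption chosen for illustration, not taken from the slide): π_{a,e}(R ⊳⊲ S) = π_{a,e}(π_{a,c}(R) ⊳⊲ π_{c,e}(S)), where c must be kept below the join because it is the join attribute. A minimal Python sketch, with hypothetical tuples and helper names, checks that the two sides agree as bags:

```python
from collections import Counter

def bag(rows):
    """Canonical multiset form so two bags of rows can be compared."""
    return Counter(tuple(sorted(r.items())) for r in rows)

def project(attrs, rows):
    """Bag projection: keeps duplicates."""
    return [{a: r[a] for a in attrs} for r in rows]

def natural_join(left, right):
    return [{**l, **r} for l in left for r in right
            if all(l[k] == r[k] for k in l.keys() & r.keys())]

# Hypothetical instances of R(a,b,c) and S(c,d,e).
R = [{"a": 1, "b": 10, "c": 5}, {"a": 1, "b": 11, "c": 5}, {"a": 2, "b": 12, "c": 6}]
S = [{"c": 5, "d": 0, "e": 7}, {"c": 5, "d": 1, "e": 7}, {"c": 8, "d": 2, "e": 9}]

# Project only at the top.
late = project(["a", "e"], natural_join(R, S))

# Project early, keeping the join attribute c on both sides.
early = project(["a", "e"],
                natural_join(project(["a", "c"], R), project(["c", "e"], S)))

print(bag(late) == bag(early))
print(len(late))
```

Because the projections are bag projections, the early plan still carries the duplicate (a,c) and (c,e) tuples, so the multiplicities in the final result are preserved.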
Example: Pushing Projection
Schema: StarsIn(title,year,starName)
Query: SELECT starName FROM StarsIn WHERE year = 1996;
Plan 1: π_{starName}(σ_{year=1996}(StarsIn))
Plan 2: π_{starName}(σ_{year=1996}(π_{starName,year}(StarsIn)))
Should we transform Plan 1 into Plan 2? It depends!
Reasons for not pushing the projection
If there is an index on StarsIn.year, that index is useless on the projected relation π_{starName,year}(StarsIn), whereas it is very useful for the selection σ_{year=1996} applied directly to StarsIn.
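Laws for duplicate elimination (δ), grouping, and aggregation (γ)
Two general rules:
δ(γ_{L}(R)) = γ_{L}(R) (γ absorbs δ)
γ_{L}(R) = γ_{L}(π_{M}(R)), where M contains at least the attributes of R mentioned in L (L ⊆ M)
Note: δ(R) = R, if R has no duplicates (e.g., a base relation with a declared key, or the result of a grouping).
Defn: γ_{L} is duplicate-impervious if its result is not affected by duplicates in its argument, i.e., the only aggregations in L are MIN and/or MAX. In this case,
γ_{L}(R) = γ_{L}(δ(R))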
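A small sketch of what duplicate-impervious means in practice. The data and the helper names `delta` and `gamma` are hypothetical; tuples are (group key, value) pairs. MAX gives the same result with or without a preceding δ, while SUM does not.

```python
def delta(rows):
    """Duplicate elimination, preserving first-seen order."""
    return list(dict.fromkeys(rows))

def gamma(rows, agg):
    """Group (key, value) pairs by key and aggregate each group's values."""
    groups = {}
    for key, value in rows:
        groups.setdefault(key, []).append(value)
    return {key: agg(values) for key, values in groups.items()}

# Hypothetical bag with duplicate tuples.
R = [("a", 1), ("a", 1), ("a", 3), ("b", 2), ("b", 2)]

max_plain = gamma(R, max)
max_dedup = gamma(delta(R), max)   # same result: MAX ignores duplicates

sum_plain = gamma(R, sum)
sum_dedup = gamma(delta(R), sum)   # different result: SUM counts duplicates

print(max_plain == max_dedup)
print(sum_plain == sum_dedup)
```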
Example
Schema:
StarsIn(title,year,starName)
MovieStar(name,address,gender,birthdate)
Query: Find, for each year, the birth date of the youngest star to appear in a movie in that year.
Improving logical query plans
Push σ as far down as possible (sometimes pull it up first).
Split complex conditions in σ in order to push them even further.
Push π as far down as possible, and introduce new π early (but take care with the exceptions).
Remove δ, or move it to a more convenient position.
Combine × with σ to produce θ-joins or equi-joins.
Choose an order for joins.
Example of improvement
SELECT title FROM StarsIn, MovieStar WHERE starName = name AND birthdate LIKE '%1960';
Initial plan: π_{title}(σ_{starName=name AND birthdate LIKE '%1960'}(StarsIn × MovieStar))
Improved plan: π_{title}(StarsIn ⊳⊲_{starName=name} σ_{birthdate LIKE '%1960'}(MovieStar))
And a better plan, introducing a projection to filter out useless attributes:
π_{title}(StarsIn ⊳⊲_{starName=name} π_{name}(σ_{birthdate LIKE '%1960'}(MovieStar)))
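The plans for this query can be compared on toy data. The sketch below is illustrative only: the tuples and helper names are hypothetical, `birthdate LIKE '%1960'` is modeled with `str.endswith`, and the last plan is written as π_{title}(StarsIn ⊳⊲_{starName=name} π_{name}(σ_{birthdate LIKE '%1960'}(MovieStar))).

```python
from collections import Counter

def bag(rows):
    """Canonical multiset form so two bags of rows can be compared."""
    return Counter(tuple(sorted(r.items())) for r in rows)

def select(pred, rows):
    return [r for r in rows if pred(r)]

def project(attrs, rows):
    """Bag projection: keeps duplicates."""
    return [{a: r[a] for a in attrs} for r in rows]

def theta_join(left, right, pred):
    """Pairs of rows whose concatenation satisfies pred (no shared names here)."""
    return [{**l, **r} for l in left for r in right if pred({**l, **r})]

# Hypothetical sample data.
StarsIn = [
    {"title": "T1", "year": 1990, "starName": "N1"},
    {"title": "T2", "year": 1991, "starName": "N1"},
    {"title": "T3", "year": 1992, "starName": "N2"},
]
MovieStar = [
    {"name": "N1", "address": "addr1", "gender": "F", "birthdate": "01/02/1960"},
    {"name": "N2", "address": "addr2", "gender": "M", "birthdate": "03/04/1971"},
]

def same_star(r):
    return r["starName"] == r["name"]

def born_1960(r):
    return r["birthdate"].endswith("1960")

# Selection over the full Cartesian product.
product = theta_join(StarsIn, MovieStar, lambda r: True)
plan0 = project(["title"],
                select(lambda r: same_star(r) and born_1960(r), product))

# Selection pushed to MovieStar, product and selection combined into a join.
plan1 = project(["title"],
                theta_join(StarsIn, select(born_1960, MovieStar), same_star))

# Additionally keep only the name attribute of MovieStar before the join.
plan2 = project(["title"],
                theta_join(StarsIn,
                           project(["name"], select(born_1960, MovieStar)),
                           same_star))

titles = sorted(r["title"] for r in plan2)
print(bag(plan0) == bag(plan1) == bag(plan2), titles)
```

The plans agree on the result; they differ in how many tuples and attributes flow through the join, which is what the cost estimate in the later optimization stages would capture.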