The Query Compiler, Section 16.3. DATABASE SYSTEMS: The Complete Book. Presented by: Deepti Kundu. Under the supervision of: Dr. T.Y. Lin
Topics to be covered: From Parse to Logical Query Plans; Conversion to Relational Algebra; Removing Subqueries From Conditions; Improving the Logical Query Plan; Grouping Associative/Commutative Operators
16.3 From Parse to Logical Query Plans
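Review: Query → Parser → Preprocessor → Logical query plan generator → Query rewriter → Preferred logical query plan. The parser and preprocessor are covered in Section 16.1; the logical query plan generator and query rewriter are covered in Section 16.3.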
Two steps to turn a parse tree into a preferred logical query plan: 1. Replace the nodes and structures of the parse tree, in appropriate groups, by an operator or operators of relational algebra. 2. Take the resulting relational-algebra expression and turn it into an expression that we expect can be converted to the most efficient physical query plan.
Reference relations: StarsIn(movieTitle, movieYear, starName); MovieStar(name, address, gender, birthdate)
Conversion to Relational Algebra: If we have a <Query> with a <Condition> that has no subqueries, then we may replace the entire construct (the select-list, from-list, and condition) by a relational-algebra expression.
The relational-algebra expression consists of the following, from bottom to top: 1. The product of all the relations mentioned in the <FromList>, which is the argument of: 2. A selection σ_C, where C is the <Condition> expression in the construct being replaced, which in turn is the argument of: 3. A projection π_L, where L is the list of attributes in the <SelList>.
Example query: SELECT movieTitle FROM StarsIn, MovieStar WHERE starName = name AND birthdate LIKE '%1960';
Translation to an algebraic expression tree
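The translation rule for a subquery-free query can be sketched in Python as follows. This is only an illustrative sketch: the classes `Relation`, `Product`, `Select`, `Project` and the helpers `to_algebra` and `show` are hypothetical names introduced here, and the condition is kept as an uninterpreted string rather than parsed.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical node types for a relational-algebra expression tree.
@dataclass(frozen=True)
class Relation:
    name: str

@dataclass(frozen=True)
class Product:
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Select:
    condition: str   # kept as plain text in this sketch
    child: "Expr"

@dataclass(frozen=True)
class Project:
    attrs: tuple
    child: "Expr"

Expr = Union[Relation, Product, Select, Project]

def to_algebra(select_list, from_list, condition):
    """Build pi_L(sigma_C(R1 x R2 x ...)) bottom-up from the three query parts."""
    expr = Relation(from_list[0])
    for name in from_list[1:]:          # product of all relations in the from-list
        expr = Product(expr, Relation(name))
    return Project(tuple(select_list), Select(condition, expr))

def show(e):
    """Render the tree as a one-line expression."""
    if isinstance(e, Relation):
        return e.name
    if isinstance(e, Product):
        return f"({show(e.left)} x {show(e.right)})"
    if isinstance(e, Select):
        return f"sigma[{e.condition}]({show(e.child)})"
    if isinstance(e, Project):
        return f"pi[{', '.join(e.attrs)}]({show(e.child)})"
    raise TypeError(f"unknown node: {e!r}")

plan = to_algebra(
    ["movieTitle"],
    ["StarsIn", "MovieStar"],
    "starName = name AND birthdate LIKE '%1960'",
)
print(show(plan))
```

The printed expression has the projection on top, the selection in the middle, and the product of the two relations at the bottom, which is the shape of the expression tree for the example query.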
Removing Subqueries From Conditions: For parse trees with a <Condition> that has a subquery, we introduce an intermediate operator, the two-argument selection. It is intermediate between the syntactic categories of the parse tree and the relational-algebra operators that apply to relations.
Using a two-argument σ (expression tree): π_movieTitle( σ( StarsIn, <Condition>: starName IN π_name(σ_{birthdate LIKE '%1960'}(MovieStar)) ) )
Two-argument selection with a condition involving IN: Suppose we have a two-argument selection whose first argument is some relation R and whose second argument is a <Condition> of the form t IN S, where t is a tuple composed of some attributes of R and S is an uncorrelated subquery. Steps to be followed: 1. Replace the <Condition> by the tree that is the expression for S (if S may have duplicates, δ is applied at its root to remove them). 2. Replace the two-argument selection by a one-argument selection σ_C, where C equates each component of t to the corresponding attribute of S. 3. Give σ_C an argument that is the product of R and S.
Two-argument selection with a condition involving IN (tree rewrite): σ( R, <Condition>: t IN S ) ⇒ σ_C( R × δ(S) )
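The rewrite of t IN S into σ_C(R × δ(S)) can be checked on a toy instance. The sketch below uses made-up sample tuples (including two stars who deliberately share a name, so that S contains a duplicate) and plain Python lists as bags; none of this data comes from the slides.

```python
from itertools import product

# Hypothetical sample data for StarsIn and MovieStar.
stars_in = [
    {"movieTitle": "A", "starName": "X"},
    {"movieTitle": "B", "starName": "Y"},
    {"movieTitle": "C", "starName": "X"},
]
movie_star = [
    {"name": "X", "birthdate": "3/5/1960"},
    {"name": "Y", "birthdate": "1/1/1970"},
    {"name": "X", "birthdate": "7/7/1960"},  # contrived duplicate name
]

# S = pi_name(sigma_{birthdate LIKE '%1960'}(MovieStar)), evaluated as a bag.
S = [s["name"] for s in movie_star if s["birthdate"].endswith("1960")]

# Direct meaning of the two-argument selection: keep r when r.starName IN S.
direct = [r["movieTitle"] for r in stars_in if r["starName"] in S]

# Rewritten form: sigma_{starName = name}(StarsIn x delta(S)), then project.
delta_S = list(dict.fromkeys(S))  # duplicate elimination, order preserved
with_delta = [r["movieTitle"] for r, name in product(stars_in, delta_S)
              if r["starName"] == name]

# Same rewrite but without delta: each match is produced once per copy in S.
without_delta = [r["movieTitle"] for r, name in product(stars_in, S)
                 if r["starName"] == name]

print(direct, with_delta, without_delta)
```

On this data the version with δ agrees with the direct IN semantics, while the version without δ returns each matching movie twice, which is why step 1 places δ above S when S may contain duplicates.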
The effect
Improving the Logical Query Plan Algebraic laws to improve logical query plans: Selections can be pushed down the expression tree as far as they can go. Similarly, projections can be pushed down the tree, or new projections can be added. Duplicate eliminations can sometimes be removed, or moved to a more convenient position in the tree. Certain selections can be combined with a product below to turn the pair of operations into an equijoin.
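As a rough illustration of the first and last laws, the sketch below evaluates the running example two ways on made-up data: once as a selection over the full product, and once with the birthdate selection pushed down to MovieStar and the remaining condition turned into an equijoin. The sample tuples and variable names are assumptions for illustration only.

```python
from itertools import product

# Hypothetical sample data.
stars_in = [
    {"movieTitle": "A", "starName": "X"},
    {"movieTitle": "B", "starName": "Y"},
    {"movieTitle": "C", "starName": "X"},
]
movie_star = [
    {"name": "X", "birthdate": "3/5/1960"},
    {"name": "Y", "birthdate": "1/1/1970"},
    {"name": "Z", "birthdate": "9/9/1960"},
]

def born_1960(star):
    """Stand-in for the condition birthdate LIKE '%1960'."""
    return star["birthdate"].endswith("1960")

# Unimproved plan: pi(sigma_{starName = name AND born_1960}(StarsIn x MovieStar)).
pairs = list(product(stars_in, movie_star))
naive = [r["movieTitle"] for r, s in pairs
         if r["starName"] == s["name"] and born_1960(s)]

# Improved plan: push the birthdate selection down to MovieStar first,
# then combine starName = name with the product into an equijoin.
filtered = [s for s in movie_star if born_1960(s)]
by_name = {}
for s in filtered:
    by_name.setdefault(s["name"], []).append(s)
pushed = [r["movieTitle"] for r in stars_in
          for _ in by_name.get(r["starName"], [])]

print(naive, pushed, len(pairs), len(filtered))
```

Both plans return the same titles, but the first materializes all nine pairs of the product while the second only ever joins against the two MovieStar tuples that survive the selection.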
Grouping Associative/Commutative Operators: An operator that is associative and commutative may be thought of as having any number of operands. We need to be free to reorder these operands so that, when a multiway join is executed as a sequence of binary joins, it takes less time than executing the joins in the order suggested by the parse tree. For each portion of a subtree that consists of nodes with the same associative and commutative operator (natural join, union, and intersection), we group the nodes with these operators into a single node with many children.
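The grouping step can be sketched as a small recursive flattening function. The tuple encoding of the tree, the operator names in `ASSOC_COMM`, and the function `group` are all hypothetical choices made for this sketch, and the sample tree is invented rather than taken from the slides.

```python
# Operators treated as associative and commutative in this sketch.
ASSOC_COMM = {"join", "union", "intersect"}

def group(node):
    """Merge adjacent nodes with the same assoc/comm operator into one n-ary node.

    A node is either a relation name (str) or a tuple (op, child1, child2, ...).
    """
    if isinstance(node, str):
        return node
    op, *children = node
    flat = []
    for child in children:
        g = group(child)
        if isinstance(g, tuple) and g[0] == op and op in ASSOC_COMM:
            flat.extend(g[1:])      # absorb the child's operands into this node
        else:
            flat.append(g)
    return (op, *flat)

# A binary tree: (R join S) join ((T union U) union V)
tree = ("join", ("join", "R", "S"), ("union", ("union", "T", "U"), "V"))
print(group(tree))
```

After grouping, the two nested joins become one join node with three children, and the nested unions become one union node with three children. A later stage is then free to pick an order for the binary operations.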
The effect of query rewriting: π_movieTitle( StarsIn ⋈_{starName = name} σ_{birthdate LIKE '%1960'}(MovieStar) )
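Final step in producing the logical query plan: grouping the associative/commutative operators. [Figure: an expression tree over relations R, S, T, U, V, W in which adjacent nodes with the same operator are merged into single nodes with many children.]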
An example to summarize: "find movies where the average age of the stars was at most 40 when the movie was made". SELECT DISTINCT m1.movieTitle, m1.movieYear FROM StarsIn m1 WHERE m1.movieYear - 40 <= ( SELECT AVG(birthdate) FROM StarsIn m2, MovieStar s WHERE m2.starName = s.name AND m1.movieTitle = m2.movieTitle AND m1.movieYear = m2.movieYear );
Selections combined with a product to turn the pair of operations into an equijoin…
Condition pushed up the expression tree…
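One way to picture a plan in which the correlated condition is applied above the subquery is to compute the average once per movie with a grouping step, join the result back to StarsIn, and only then test movieYear - 40 against the average. The sketch below follows that shape on made-up data; it assumes birthdate holds just a birth year, and the names `abd` and `years` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: StarsIn(movieTitle, movieYear, starName).
stars_in = [
    ("Young Cast", 2000, "A"),
    ("Young Cast", 2000, "B"),
    ("Old Cast", 2000, "C"),
]
# MovieStar reduced to name -> birth year (an assumption of this sketch).
movie_star = {"A": 1970, "B": 1980, "C": 1940}

# Grouping step over (StarsIn m2 join MovieStar s): for each movie,
# collect the birth years of its stars and average them.
years = defaultdict(list)
for title, year, star in stars_in:
    if star in movie_star:                  # m2.starName = s.name
        years[(title, year)].append(movie_star[star])
abd = {movie: sum(v) / len(v) for movie, v in years.items()}

# Join m1 with the grouped result on (movieTitle, movieYear), apply the
# deferred condition movieYear - 40 <= abd, project, and remove duplicates.
result = sorted({
    (title, year)
    for title, year, _ in stars_in
    if (title, year) in abd and year - 40 <= abd[(title, year)]
})
print(result)
```

Here the subquery is evaluated once per group rather than once per tuple of m1, which is the practical benefit of moving the correlated condition up the tree.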
Selections combined…