Download presentation
Presentation is loading. Please wait.
Published byDerek Webb Modified over 8 years ago
1
Query Processing – Query Trees
2
Evaluation of SQL Conceptual order of evaluation – Cartesian product of all tables in from clause – Rows not satisfying where clauses eliminated – Rows grouped by – Groups not satisfying having eliminated – Select clause target list evaluated – If distinct eliminate duplicate rows – Union after each subselect evaluated – Rows sorted in order by
3
Actual Order of Evaluation Order chosen by query optimizer – Determines most efficient way Syntax checking phase – scan - identify tokens of text – parse- check syntax – validate - check attributes and relation names Query optimization phase – create internal representation - query tree – identify execution strategies - plan Maintains statistics for tables and columns, indexes – choose suitable plan in query to optimize query e.g. order of execution of ops, use indexes, etc.
4
Evaluation cont’d Execution phase – query optimizer produces execution plan – code generator generates code – runtime db processor runs query code To minimize run time – chosen strategy NOT optimal, but reasonably efficient For procedural languages limited need for query optimization
5
Heuristics in Query Optimization Apply select and project before join or other binary operations. Why? – select and project reduce size strategy is obvious, but challenge was to show could be done with rules
6
Query Optimization Canonical form (initial query tree - conceptual order of evaluation) – Leaf nodes are tables – Internal nodes are operations – Begin by separating select conditions from joins (end up with X) – Combine all selects then all projects transform to final query tree using general transformation rules for relational algebra
7
Query tree for SQL query Select lname From employee, works_on, project Where pname=‘Aquarius’ and pnumber=pno and essn=ssn and bdate > ‘1957-12-31’
8
General Transformation Rules for Relational Algebra 1.Cascade of 2.Commutativity of 3.Cascade of 4.Commuting with 5.Commutativity of |X| (and X) 6.Commuting with |X| (or X) 7.Commuting with |X| (or X)
9
General Transformation Rules for Relational Algebra 8. Commutativity of set operations U and ∩ 9. Associativity of |X|, X, U and ∩ 10. Commuting with set operations 11. The operation commuted with U 12. Converting a ( , X) sequence into |X|
10
Outline of a Heuristic Algebraic Optimization Algorithm Use Rule 1 break up conjunctive ’s into cascades of ’s Use Rules 2,4,6, 10 for commutativity of to: – move ’s as far down tree as possible Use Rules 5 and 9 for commutativity and associativity of binary operations to: – Place most restrictive (and |X|) so executed first fewest tuples, smallest absolute size or smallest selectivity But make sure no cartesian products result
11
Outline of a Heuristic Algebraic Optimization Algorithm Use Rule 12, combine Cartesian product with to: – create |X| Use Rules 3, 4, 7, 11 concerning cascade of ’s and commuting with other ops to: – move down tree as far as possible Identify subtrees that represent groups of operations than can be executed by a single algorithm
12
Summary of Heuristics Apply first operations that reduce size of intermediate results – Perform ’s and ’s as early as possible (move down tree as far as possible) – Execute most restrictive and |X| first (reorder leaf nodes but avoid cartesian products)
13
Multiple table joins – Query plan identifies most efficient ordering of joins – may not have to materialize actual table resulting from join – instead use pipelining - successive rows output from one step fed into next plan
14
Converting trees into Query plans pipelined evaluation – Forward result from an operation directly to next operation Result from , placed in buffer |X| consumes tuples from buffer Result from |X| pipelined to
15
Query Tree Question Should we do a pname, pnumber then pname = ‘Aquarius’ then pnumber ? No, since the operations are done together –the processor would read a row of project, see if pname = ‘Aquarius’ then use pnumber to perform the join.
16
Algorithms DBMS has general access algorithms to implement select, join, or combinations of ops Don't have separate access routines for each op – Creating temporary files is inefficient – Generate algorithms for combinations of operations – join, select, project - code is created dynamically to implement multiple operations
17
Materialized table Think about what operations require utilizing a materialized table – Input to select? – Input to project? – Input to join? How is this implemented? Next topic
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.