1
ICS 424 - 01 (072) Query Processing and Optimization
Chapter 15: Algorithms for Query Processing and Optimization
ICS 424 Advanced Database Systems
Dr. Muhammad Shafique
2
Outline
  Introduction
  Processing a query
  SQL queries and relational algebra
  Implementing basic query operations
  Heuristics-based query optimization
  Overview of query optimization in Oracle
3
Material Covered from Chapter 15
  Pages 537, 538, 539
  Section 15.1
  Section 15.2
  Section 15.6
  Section 15.7
  Section 15.9
4
Introduction to Query Processing
Query optimization: the process of choosing a suitable execution strategy for processing a query.
Two internal representations of a query:
  Query tree
  Query graph
5
Background Review
  DDL compiler
  DML compiler
  Runtime database processor
  System catalog
6
Processing a Query
Tasks in processing a high-level query:
1. The scanner scans the query and identifies the language tokens
2. The parser checks the syntax of the query
3. The query is validated by checking that all attribute names and relation names are valid
4. An intermediate internal representation of the query is created (query tree or query graph)
5. A query execution strategy is developed
6. The query optimizer produces an execution plan
7. The code generator generates the object code
8. The runtime database processor executes the code
Query processing and query optimization
7
Processing a Query
Typical steps in processing a high-level query:
1. Query in a high-level query language such as SQL
2. Scanning, parsing, and validation
3. Intermediate form of the query, such as a query tree
4. Query optimizer
5. Execution plan
6. Query code generator
7. Object code for the query
8. Runtime database processor
9. Results of the query
9
SQL Queries and Relational Algebra
An SQL query is translated into an equivalent extended relational algebra expression, represented as a query tree.
To transform a given query into a query tree, the query is decomposed into query blocks.
Query block: the basic unit that can be translated into the algebraic operators and optimized. A query block contains a single SELECT-FROM-WHERE expression, as well as GROUP BY and HAVING clauses if these are part of the block.
The query optimizer chooses an execution plan for each block.
10
COMPANY Relational Database Schema (1)
[Figure not included in this transcript]
11
COMPANY Relational Database Schema (2)
[Figure not included in this transcript]
12
SQL Queries and Relational Algebra (1)
Example:
  SELECT Lname, Fname
  FROM EMPLOYEE
  WHERE Salary > (SELECT MAX(Salary)
                  FROM EMPLOYEE
                  WHERE Dno = 5)
Inner block and outer block
13
Translating SQL Queries into Relational Algebra
Query:
  SELECT LNAME, FNAME
  FROM EMPLOYEE
  WHERE SALARY > (SELECT MAX(SALARY)
                  FROM EMPLOYEE
                  WHERE DNO = 5);
Inner block:
  SELECT MAX(SALARY) FROM EMPLOYEE WHERE DNO = 5
  ℱ MAX SALARY (σ DNO=5 (EMPLOYEE))
Outer block (C is the value returned by the inner block):
  SELECT LNAME, FNAME FROM EMPLOYEE WHERE SALARY > C
  π LNAME, FNAME (σ SALARY>C (EMPLOYEE))
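The two-block decomposition can be mimicked in a few lines of Python. This is only a sketch: the sample rows are hypothetical (only the attribute names come from the slide), and lists of dicts stand in for relations. The inner block is evaluated once to a constant, which the outer block then uses in its selection.

```python
# Hypothetical sample rows; only the attribute names come from the slides.
EMPLOYEE = [
    {"LNAME": "Smith", "FNAME": "John", "SALARY": 30000, "DNO": 5},
    {"LNAME": "Wong", "FNAME": "Franklin", "SALARY": 40000, "DNO": 5},
    {"LNAME": "Wallace", "FNAME": "Jennifer", "SALARY": 43000, "DNO": 4},
    {"LNAME": "Borg", "FNAME": "James", "SALARY": 55000, "DNO": 1},
]


def inner_block(employee):
    """Aggregate MAX SALARY over the selection DNO = 5; yields one constant."""
    return max(row["SALARY"] for row in employee if row["DNO"] == 5)


def outer_block(employee, c):
    """Project LNAME, FNAME over the selection SALARY > c."""
    return [(row["LNAME"], row["FNAME"]) for row in employee if row["SALARY"] > c]


c = inner_block(EMPLOYEE)          # evaluated once, before the outer block
result = outer_block(EMPLOYEE, c)  # the outer block treats c as a constant
```

Because the inner block does not reference any attribute of the outer block, it needs to run only once, which is what makes this nested query uncorrelated.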
14
SQL Queries and Relational Algebra (2)
Uncorrelated nested queries vs. correlated nested queries
Example: Retrieve the name of each employee who works on all the projects controlled by department number 5.
  SELECT FNAME, LNAME
  FROM EMPLOYEE
  WHERE ((SELECT PNO
          FROM WORKS_ON
          WHERE SSN = ESSN)
         CONTAINS
         (SELECT PNUMBER
          FROM PROJECT
          WHERE DNUM = 5))
15
SQL Queries and Relational Algebra (3)
Example: For every project located in 'Stafford', retrieve the project number, the controlling department number, and the department manager's last name, address, and birthdate.
SQL query:
  SELECT P.PNUMBER, P.DNUM, E.LNAME, E.ADDRESS, E.BDATE
  FROM PROJECT AS P, DEPARTMENT AS D, EMPLOYEE AS E
  WHERE P.DNUM = D.DNUMBER AND D.MGRSSN = E.SSN AND P.PLOCATION = 'STAFFORD';
Relational algebra:
  π PNUMBER, DNUM, LNAME, ADDRESS, BDATE (((σ PLOCATION='STAFFORD' (PROJECT)) ⋈ DNUM=DNUMBER (DEPARTMENT)) ⋈ MGRSSN=SSN (EMPLOYEE))
16
SQL Queries and Relational Algebra (4)
[Figure not included in this transcript]
17
Implementing Basic Query Operations
An RDBMS must provide implementation(s) for all the required operations, including the relational operators and more.
External sorting
  Sort-merge strategy
  Sorting phase
    Number of file blocks: b
    Number of available buffers: n_B
    Number of runs: ⌈b / n_B⌉
  Merging phase: one or more passes
    Degree of merging: the number of runs that are merged together in each pass
18
Algorithms for External Sorting (1)
External sorting: refers to sorting algorithms that are suitable for large files of records stored on disk that do not fit entirely in main memory, such as most database files.
Sort-merge strategy: starts by sorting small subfiles (runs) of the main file and then merges the sorted runs, creating larger sorted subfiles that are merged in turn.
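A minimal sketch of the sort-merge strategy follows. It is an in-memory illustration only: Python lists stand in for disk-resident runs, and the function name and parameters (`external_sort`, `buffer_capacity`, `merge_degree`) are assumptions made for this example, not part of the course material.

```python
import heapq


def external_sort(records, buffer_capacity, merge_degree):
    """Sort-merge sketch; lists stand in for runs stored on disk (an assumption)."""
    if buffer_capacity < 1 or merge_degree < 2:
        raise ValueError("need buffer_capacity >= 1 and merge_degree >= 2")

    # Sorting phase: read one buffer-load at a time and sort it into a run.
    runs = [
        sorted(records[i:i + buffer_capacity])
        for i in range(0, len(records), buffer_capacity)
    ]

    # Merging phase: each pass merges groups of `merge_degree` runs
    # into longer sorted runs, until a single run remains.
    while len(runs) > 1:
        runs = [
            list(heapq.merge(*runs[i:i + merge_degree]))
            for i in range(0, len(runs), merge_degree)
        ]

    return runs[0] if runs else []


sorted_file = external_sort([9, 3, 7, 1, 8, 2, 6, 5, 4], buffer_capacity=2, merge_degree=3)
```

With nine records and a buffer that holds two, the sorting phase creates five runs; merging three at a time takes two passes to reach one sorted run.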
19
Algorithms for External Sorting (2)
[Figure not included in this transcript]
20
Algorithms for External Sorting (3)
Analysis
  Number of file blocks = b
  Number of initial runs = n_R
  Available buffer space = n_B
  Sorting phase: n_R = ⌈b / n_B⌉
  Degree of merging: d_M = min(n_B - 1, n_R)
  Number of passes: n_P = ⌈log_dM(n_R)⌉
  Number of block accesses: (2 * b) + (2 * b * log_dM(n_R))
Example done in the class
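The analysis can be turned into a small calculator. This is a sketch under two assumptions: the function name `external_sort_cost` is made up for this example, and the number of passes is rounded up to a whole number before it is used in the block-access formula. The input values b = 1024 and n_B = 5 are illustrative and are not the example done in class.

```python
import math


def external_sort_cost(b, n_b):
    """Return (n_R, d_M, n_P, block accesses) for a file of b blocks and n_b buffers."""
    if b < 1 or n_b < 3:
        raise ValueError("need b >= 1 and n_b >= 3 (two input buffers, one output)")

    n_r = math.ceil(b / n_b)   # initial runs produced by the sorting phase
    d_m = min(n_b - 1, n_r)    # degree of merging

    # Count merge passes with integer arithmetic instead of a floating-point
    # logarithm; the count equals the ceiling of log base d_M of n_R.
    n_p = 0
    runs = n_r
    while runs > 1:
        runs = math.ceil(runs / d_m)
        n_p += 1

    # Sorting phase reads and writes every block once (2 * b);
    # every merge pass does the same (2 * b per pass).
    accesses = 2 * b + 2 * b * n_p
    return n_r, d_m, n_p, accesses


n_r, d_m, n_p, accesses = external_sort_cost(b=1024, n_b=5)
```

For these inputs the sketch gives 205 initial runs, a merge degree of 4, 4 merge passes, and 10240 block accesses.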
21
Implementing Basic Query Operations (cont.)
Estimates of selectivity
  Selectivity is the ratio of the number of tuples that satisfy the condition to the total number of tuples in the relation.
SELECT (σ) operator implementation
1. Linear search
2. Binary search
3. Using a primary index (or hash key)
4. Using a primary index to retrieve multiple records
5. Using a clustering index to retrieve multiple records
6. Using a secondary index on an equality comparison
7. Conjunctive selection using an individual index
8. Conjunctive selection using a composite index
9. Conjunctive selection by intersection of record pointers
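To make the selectivity definition and the first two search methods concrete, here is a hedged sketch. The rows are hypothetical, a list of dicts stands in for a file of records, and the helper names are invented for this example. The binary search assumes the file is already ordered on the search attribute.

```python
from bisect import bisect_left

# Hypothetical records, ordered on SSN.
EMPLOYEE = [
    {"SSN": 101, "DNO": 5},
    {"SSN": 102, "DNO": 5},
    {"SSN": 103, "DNO": 4},
    {"SSN": 104, "DNO": 1},
]


def selectivity(rows, predicate):
    """Fraction of tuples that satisfy the condition (0.0 for an empty relation)."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if predicate(row)) / len(rows)


def linear_search(rows, attr, value):
    """Scan every record and keep those that match."""
    return [row for row in rows if row[attr] == value]


def binary_search(sorted_rows, attr, value):
    """Equality search on the ordering attribute; returns one record or None."""
    # The key list is rebuilt here only to keep the sketch short; a real
    # ordered file would be probed block by block without copying keys.
    keys = [row[attr] for row in sorted_rows]
    i = bisect_left(keys, value)
    return sorted_rows[i] if i < len(keys) and keys[i] == value else None


dno5_selectivity = selectivity(EMPLOYEE, lambda row: row["DNO"] == 5)
dno5_rows = linear_search(EMPLOYEE, "DNO", 5)
ssn103_row = binary_search(EMPLOYEE, "SSN", 103)
```

A low selectivity means few tuples qualify, which is when an index-based method tends to pay off compared with a full linear scan.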
22
Implementing Basic Query Operations (cont.)
JOIN operator implementation
1. Nested-loop join
2. Sort-merge join
3. Hash join
   Partition hash join
   Hybrid hash join
PROJECT operator implementation
Set operator implementation
Implementing aggregate operators/functions
Implementing OUTER JOIN
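Two of the join methods listed above can be sketched as follows. This is a simplified in-memory illustration: the sample rows and function signatures are assumptions, and the hash join shown is the basic single-partition form, not the partition or hybrid variants.

```python
from collections import defaultdict

# Hypothetical sample rows; attribute names follow the COMPANY schema.
EMPLOYEE = [
    {"LNAME": "Smith", "DNO": 5},
    {"LNAME": "Wallace", "DNO": 4},
    {"LNAME": "Borg", "DNO": 1},
]
DEPARTMENT = [
    {"DNUMBER": 5, "DNAME": "Research"},
    {"DNUMBER": 4, "DNAME": "Administration"},
]


def nested_loop_join(r, s, r_key, s_key):
    """Compare every record of r with every record of s."""
    return [{**t, **u} for t in r for u in s if t[r_key] == u[s_key]]


def hash_join(r, s, r_key, s_key):
    """Build a hash table on s, then probe it once per record of r."""
    buckets = defaultdict(list)
    for u in s:                      # build phase
        buckets[u[s_key]].append(u)
    return [                         # probe phase
        {**t, **u}
        for t in r
        for u in buckets.get(t[r_key], [])
    ]


nl_result = nested_loop_join(EMPLOYEE, DEPARTMENT, "DNO", "DNUMBER")
hj_result = hash_join(EMPLOYEE, DEPARTMENT, "DNO", "DNUMBER")
```

Both functions return the same joined records; the hash join replaces the inner scan with a lookup, so it only applies to equality join conditions.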
24
Buffer Space and Join Performance
In the nested-loop join, it makes a difference which file is chosen for the outer loop and which for the inner loop. If EMPLOYEE is used for the outer loop, each block of EMPLOYEE is read once, and the entire DEPARTMENT file (each of its blocks) is read once for each time we read in (n_B - 2) blocks of the EMPLOYEE file. We get the following:
  Total number of blocks accessed for outer file = b_E
  Number of times (n_B - 2) blocks of outer file are loaded = ⌈b_E / (n_B - 2)⌉
  Total number of blocks accessed for inner file = b_D * ⌈b_E / (n_B - 2)⌉
Hence, we get the following total number of block accesses:
  b_E + (⌈b_E / (n_B - 2)⌉ * b_D) = 2000 + ((2000/5) * 10) = 6000 blocks
On the other hand, if we use the DEPARTMENT records in the outer loop, by symmetry we get the following total number of block accesses:
  b_D + (⌈b_D / (n_B - 2)⌉ * b_E) = 10 + ((10/5) * 2000) = 4010 blocks
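The block-access arithmetic above can be checked with a short function. This is a sketch: the function name is invented for the example, and n_B = 7 is inferred from the slide's arithmetic, where (n_B - 2) is 5.

```python
import math


def nested_loop_block_accesses(b_outer, b_inner, n_b):
    """Block accesses for a nested-loop join with n_b buffers.

    (n_b - 2) buffers hold outer-file blocks; one buffer is for the inner
    file and one for the result.
    """
    if n_b < 3:
        raise ValueError("need at least 3 buffers")
    outer_loads = math.ceil(b_outer / (n_b - 2))  # times the inner file is scanned
    return b_outer + outer_loads * b_inner


B_E, B_D, N_B = 2000, 10, 7  # N_B = 7 is inferred from (n_B - 2) = 5 on the slide

employee_outer = nested_loop_block_accesses(B_E, B_D, N_B)
department_outer = nested_loop_block_accesses(B_D, B_E, N_B)
```

The results match the slide: 6000 block accesses with EMPLOYEE in the outer loop and 4010 with DEPARTMENT in the outer loop, so the file with fewer blocks is the better choice for the outer loop here.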
25
Implementing Basic Query Operations (cont.)
Combining operations using pipelining
  Temporary-file-based processing
  Pipelining or stream-based processing
Example: consider the execution of the following query
  π list of attributes ((σ c1 (R)) ⋈ (σ c2 (S)))
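Python generators give a compact way to illustrate stream-based processing of a selection, join, and projection chain. The relations R and S, the conditions, and the attribute names below are hypothetical; the point of the sketch is that no operator writes a temporary file, and each tuple flows through the chain as soon as it is produced.

```python
def select(rows, predicate):
    """Selection: pass along only the rows that satisfy the condition."""
    for row in rows:
        if predicate(row):
            yield row


def join(left, right, condition):
    """Nested-loop join; the right input is buffered once so it can be rescanned."""
    right_rows = list(right)
    for l_row in left:
        for r_row in right_rows:
            if condition(l_row, r_row):
                yield {**l_row, **r_row}


def project(rows, attributes):
    """Projection (duplicates are not eliminated in this sketch)."""
    for row in rows:
        yield {a: row[a] for a in attributes}


# Hypothetical relations and conditions.
R = [{"A": 1, "B": 10}, {"A": 2, "B": 20}, {"A": 3, "B": 30}]
S = [{"B": 10, "C": "x"}, {"B": 20, "C": "y"}, {"B": 30, "C": "z"}]

pipeline = project(
    join(
        select(R, lambda r: r["A"] >= 2),    # condition c1
        select(S, lambda s: s["C"] != "z"),  # condition c2
        lambda r, s: r["B"] == s["B"],       # join condition
    ),
    ["A", "C"],
)
result = list(pipeline)  # tuples are only produced when the consumer asks for them
```

Nothing is computed until `list(pipeline)` pulls tuples through the chain, which is the essence of pipelining as opposed to materializing each intermediate result.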
26
General Transformation Rules for Relational Algebra Operations
1. Cascade of σ: A conjunctive selection condition can be broken up into a cascade (that is, a sequence) of individual σ operations:
   σ C1 AND C2 AND ... AND Cn (R) ≡ σ C1 (σ C2 (...(σ Cn (R))...))
2. Commutativity of σ: The σ operation is commutative:
   σ C1 (σ C2 (R)) ≡ σ C2 (σ C1 (R))
3. Cascade of π: In a cascade (sequence) of π operations, all but the last one can be ignored
4. Commuting σ with π: If the selection condition c involves only those attributes A1, ..., An in the projection list, the two operations can be commuted
And more ...
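The four rules can be spot-checked on a small relation. The sketch below is not a proof: it only confirms the equivalences for one hypothetical relation R and two hypothetical conditions, using helper functions `sigma` and `pi` written for this example.

```python
def sigma(predicate, rows):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def pi(attributes, rows):
    """Projection with duplicate elimination, preserving first-seen order."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


# Hypothetical relation and conditions.
R = [
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 2, "C": 1},
    {"A": 2, "B": 2, "C": 3},
    {"A": 2, "B": 2, "C": 4},
]
c1 = lambda row: row["A"] == 2
c2 = lambda row: row["B"] == 2

# Rule 1: a conjunctive selection equals a cascade of selections.
rule1 = sigma(lambda row: c1(row) and c2(row), R) == sigma(c1, sigma(c2, R))
# Rule 2: selections commute.
rule2 = sigma(c1, sigma(c2, R)) == sigma(c2, sigma(c1, R))
# Rule 3: in a cascade of projections, only the last (outermost) one matters.
rule3 = pi(["A"], pi(["A", "B"], R)) == pi(["A"], R)
# Rule 4: selection and projection commute when c1 uses only projected attributes.
rule4 = pi(["A", "B"], sigma(c1, R)) == sigma(c1, pi(["A", "B"], R))
```

All four flags evaluate to True for this data. Rule 4 would not even be well defined if the condition referred to attribute C, since C is dropped by the projection.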
27
Heuristic-Based Query Optimization
Outline of the heuristic algebraic optimization algorithm:
1. Break up SELECT operations with conjunctive conditions into a cascade of SELECT operations
2. Using the commutativity of SELECT with other operations, move each SELECT operation as far down the query tree as is permitted by the attributes involved in the select condition
3. Using commutativity and associativity of binary operations, rearrange the leaf nodes of the tree
4. Combine a CARTESIAN PRODUCT operation with a subsequent SELECT operation in the tree into a JOIN operation, if the condition represents a join condition
5. Using the cascading of PROJECT and the commuting of PROJECT with other operations, break down and move lists of projection attributes down the tree as far as possible by creating new PROJECT operations as needed
6. Identify subtrees that represent groups of operations that can be executed by a single algorithm
28
Heuristic-Based Query Optimization: Example
Query: "Find the last names of employees born after 1957 who work on a project named 'Aquarius'."
SQL:
  SELECT LNAME
  FROM EMPLOYEE, WORKS_ON, PROJECT
  WHERE PNAME = 'Aquarius' AND PNUMBER = PNO AND ESSN = SSN AND BDATE > '1957-12-31';
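The benefit of pushing selections down and replacing Cartesian products with joins can be seen by counting intermediate tuples. The sketch below uses hypothetical sample rows (only the relation and attribute names come from the slide) and compares an unoptimized plan with a heuristically rearranged one. Dates are stored as ISO strings, which compare correctly as text.

```python
from itertools import product

# Hypothetical sample data for the three relations.
EMPLOYEE = [
    {"SSN": 1, "LNAME": "Smith", "BDATE": "1965-01-09"},
    {"SSN": 2, "LNAME": "Wong", "BDATE": "1955-12-08"},
    {"SSN": 3, "LNAME": "English", "BDATE": "1972-07-31"},
]
WORKS_ON = [{"ESSN": 1, "PNO": 10}, {"ESSN": 2, "PNO": 10}, {"ESSN": 3, "PNO": 20}]
PROJECT = [{"PNUMBER": 10, "PNAME": "Aquarius"}, {"PNUMBER": 20, "PNAME": "ProductX"}]


def canonical_plan():
    """Cartesian product of all three relations, then one big selection."""
    cross = list(product(EMPLOYEE, WORKS_ON, PROJECT))
    names = [
        e["LNAME"]
        for e, w, p in cross
        if p["PNAME"] == "Aquarius"
        and p["PNUMBER"] == w["PNO"]
        and w["ESSN"] == e["SSN"]
        and e["BDATE"] > "1957-12-31"
    ]
    return names, len(cross)  # result and size of the largest intermediate


def optimized_plan():
    """Selections first, then joins instead of Cartesian products."""
    projects = [p for p in PROJECT if p["PNAME"] == "Aquarius"]
    employees = [e for e in EMPLOYEE if e["BDATE"] > "1957-12-31"]
    proj_work = [w for p in projects for w in WORKS_ON if p["PNUMBER"] == w["PNO"]]
    matched = [e for w in proj_work for e in employees if w["ESSN"] == e["SSN"]]
    names = [e["LNAME"] for e in matched]
    largest = max(len(projects), len(employees), len(proj_work), len(matched))
    return names, largest


canonical_names, canonical_size = canonical_plan()
optimized_names, optimized_size = optimized_plan()
```

Both plans return the same answer, but on this toy data the canonical plan builds 18 intermediate tuples while the rearranged plan never holds more than 2.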
34
Overview of Query Optimization in Oracle
Rule-based query optimization: the optimizer chooses execution plans based on heuristically ranked operations. May be phased out.
Cost-based query optimization: the optimizer examines alternative access paths and operator algorithms and chooses the execution plan with the lowest estimated cost. The query cost is calculated from the estimated usage of resources such as I/O, CPU, and memory.
Application developers can specify hints to the Oracle query optimizer, since the application developer might know more about the data.
  SELECT /*+ ...hint... */ [rest of query]
  SELECT /*+ index(t1 t1_abc) index(t2 t2_abc) */ COUNT(*)
  FROM t1, t2
  WHERE t1.col1 = t2.col1;
35
Summary
  Background review
  Processing a query
  SQL queries and relational algebra
  Implementing basic query operations
  Heuristics-based query optimization
  Overview of query optimization in Oracle