1
Query Execution: Optimizing Performance
2
Resolving an SQL query Since SQL queries are very high level, the query processor must do a great deal of additional processing to supply all of the missing details. In practice, an SQL query is translated internally into a relational algebra expression. One advantage of using relational algebra is that it makes alternative forms of a query easier to explore. The different algebraic expressions for a query are called logical query plans. We will focus first on the methods for executing the operations of relational algebra, and then on how to transform logical query plans.
3
Preview Parsing: read SQL, output a relational algebra tree. Query rewrite: transform the tree into a form that is more efficient to evaluate. Physical plan generation: select an implementation for each operator in the tree, and for passing results up the tree. In this chapter we focus on the implementation of each operator.
4
Relational Algebra recap RA union, intersection, and difference correspond to UNION, INTERSECT, and EXCEPT in SQL. Selection corresponds to the WHERE clause. Projection corresponds to the SELECT clause. Product corresponds to the FROM clause. Joins correspond to JOIN, NATURAL JOIN, and OUTER JOIN in the SQL2 standard. Duplicate elimination corresponds to DISTINCT in the SELECT clause. Grouping corresponds to GROUP BY. Sorting corresponds to ORDER BY.
5
Expression trees MovieStar(name, addr, gender, birthdate) StarsIn(title, year, starName) SELECT title, birthdate FROM MovieStar, StarsIn WHERE year = 1996 AND gender = 'F' AND starName = name;
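One natural logical query plan for this query, written as a relational algebra expression (a sketch; the tree on the original slide may push the selections further down): π_title, birthdate( σ_year=1996 AND gender='F' AND starName=name( StarsIn × MovieStar ) )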
6
Join method? Can we pipeline the result of one or both selections, and avoid storing the result on disk temporarily? Are there indexes on MovieStar.gender and/or StarsIn.year that will make the σ's efficient? How to generate such alternative expression trees is the subject of Chapter 16.
7
Relational algebra for real SQL Keep in mind the following fact: - A relation in relational algebra is a set, while a relation in SQL is generally a bag. - In short, a bag allows duplicates. - Not surprisingly, this can affect the cost of the related operations.
8
Bag union, intersection, and difference Card(t, R) means the number of occurrences of tuple t in relation R. Card(t, R ∪ S) = Card(t, R) + Card(t, S) Card(t, R ∩ S) = min{Card(t, R), Card(t, S)} Card(t, R − S) = max{Card(t, R) − Card(t, S), 0} Example: R = {A, B, B}, S = {C, A, B, C} R ∪ S = {A, A, B, B, B, C, C} R ∩ S = {A, B} R − S = {B}
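A minimal sketch of these bag operations in Python, using collections.Counter as the multiset (the relation elements here are just the letters from the example above):

```python
from collections import Counter

R = Counter(["A", "B", "B"])
S = Counter(["C", "A", "B", "C"])

bag_union = R + S         # Card(t, R ∪ S) = Card(t, R) + Card(t, S)
bag_intersection = R & S  # min of the two counts
bag_difference = R - S    # max(Card(t, R) - Card(t, S), 0)

print(sorted(bag_union.elements()))         # ['A', 'A', 'B', 'B', 'B', 'C', 'C']
print(sorted(bag_intersection.elements()))  # ['A', 'B']
print(sorted(bag_difference.elements()))    # ['B']
```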
9
Beware: Bag Laws != Set Laws Not all algebraic laws that hold for sets also hold for bags. For one example, the commutative law for union, R ∪ S = S ∪ R, does hold for bags. - Since addition is commutative, adding the number of times that tuple x appears in R and S doesn't depend on the order of R and S. Set union is idempotent, meaning that S ∪ S = S. However, for bags, if x appears n times in S, then it appears 2n times in S ∪ S. Thus S ∪ S != S in general.
10
Physical operators Physical query plans are built from physical operators. - In most cases, the physical operators are direct implementations of the relational algebra operators. However, there are several other physical operators for various supporting tasks. 1. Table-scan (the most basic operation we want to perform in a physical query plan) 2. Index-scan (e.g. if we have an index on some relation R, we can retrieve the blocks of R by using the index) 3. Sort-scan (takes a relation R and a specification of the attributes on which to sort, and produces R in sorted order)
11
Computational Model When comparing algorithms for the same operation, we do not count the cost of writing the output. - The cost of writing the output to disk depends on the size of the result, not on the way the result was computed; in other words, it is the same for every computational alternative. - Also, we can often pipeline the result to other operators while it is still in main memory, so a final output phase may not even be required.
12
Cost parameters In order to evaluate or estimate the cost of resolving a query, we need to identify the relevant parameters. Typical cost parameters include: - R = the relation on disk - M = number of main memory buffers available (1 buffer = 1 block) - B(R) = number of blocks of R - T(R) = number of tuples of R - V(R, a) = number of distinct values in column a of R - V(R, L) = number of distinct tuples of R restricted to the attribute list L Simple cost estimates: - Basic scan: B(R) - Two-phase multiway merge sort (2PMMS): 3B(R) Recall that the final output is not counted.
13
Algorithms for implementing RA operators Classification of algorithms – Sort-based methods – Hash-based methods – Index-based methods Degree of difficulty of the algorithms – One-pass (when one relation fits in main memory) – Two-pass (when no relation fits in main memory, but the relations are not extremely large) – Multi-pass (when the relations are extremely large) Classification of operators – Tuple-at-a-time, unary operations (σ, π) – Full-relation, unary operations (δ, γ) – Full-relation, binary operations (union, join, …)
14
Selection σ_C(R) The condition C might involve: Arithmetic (+, −, ×) or string operators such as LIKE Comparisons between terms, e.g. a < b or a + b = 10 Boolean connectives AND, OR, and NOT Example: R(a, b) = {(0,1), (2,3), (4,5), (2,3)} One selection (e.g. σ_a≥2(R)) yields {(2,3), (4,5), (2,3)}; a more restrictive condition yields {(4,5)}. One pass, tuple-at-a-time
15
Projection π_L(R) The argument L of π is a sequence of elements of the following forms: A single attribute of R, or An expression x → y, where x and y are attribute names (a rename), or An expression E → z, where E is an expression involving attributes of R and z is a new attribute name not in R Example: R(a, b, c) = {(0,1,2), (3,4,5)} π_a, b+c→x(R) = {(0,3), (3,9)} One pass, tuple-at-a-time
16
Selection and projection Cost = B(R), or T(R) if the relation is not clustered Space requirement: M ≥ 1 block Principle: – Read one block (or one tuple, if the relation is not clustered) at a time – Filter the tuples of this block in or out
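A minimal Python sketch of these one-pass, tuple-at-a-time operators. scan_blocks is a hypothetical iterator that yields one block of tuples at a time, so only one input buffer is ever occupied; tuples are dicts keyed by attribute name.

```python
def select(relation, condition):
    """σ_C(R): one pass, one block in memory at a time."""
    for block in scan_blocks(relation):
        for t in block:
            if condition(t):          # filter the tuples of this block
                yield t

def project(relation, attrs):
    """π_L(R) for a plain attribute list L."""
    for block in scan_blocks(relation):
        for t in block:
            yield tuple(t[a] for a in attrs)
```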
17
Duplicate Elimination R1 := δ(R2). R1 consists of one copy of each tuple that appears in R2 one or more times. R = {(1,2), (3,4), (1,2)} δ(R) = {(1,2), (3,4)} One pass, unary full-relation operations
18
Duplicate elimination: for each tuple decide: seen before: ignore new: output Principle: – It is the first time we have seen this tuple, in which case we copy it to the output. – We have seen the tuple before, in which case we must not output it. We need a main-memory hash table to make the check efficient. Requirement: the distinct tuples must fit in main memory, i.e. roughly B(δ(R)) ≤ M.
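A sketch of this one-pass duplicate elimination, again assuming the hypothetical scan_blocks iterator and that the distinct tuples fit in main memory:

```python
def distinct(relation):
    seen = set()                      # main-memory hash table of tuples seen so far
    for block in scan_blocks(relation):
        for t in block:
            if t not in seen:         # new: remember it and copy it to the output
                seen.add(t)
                yield t
            # seen before: ignore
```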
19
One pass, unary full-relation operations
21
Grouping γ: Accumulate information about the groups in main memory. We keep one entry for each value of the grouping attributes, in a main-memory search structure (e.g. a hash table). For each group we keep the aggregated value (or values, if the query asks for more than one aggregation): – For MIN/MAX we keep the minimum or maximum value seen so far for the group. – For COUNT we keep a counter, incremented each time we encounter a tuple belonging to the group. – For SUM, we add the value if the tuple belongs to the group. – For AVG we keep both a sum and a count, and divide only at the end. Main-memory requirement: typically a group entry is smaller than a tuple of the input relation, and typically the number of groups is smaller than the number of tuples in the input. Requirement: the structure holding one entry per group must fit in M buffers. One pass, unary full-relation operations
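A sketch of one-pass grouping for a single AVG aggregate, keeping one (sum, count) entry per group in a main-memory hash table (scan_blocks as before; group and aggregate attributes are assumed to be dict keys):

```python
def group_avg(relation, group_attr, agg_attr):
    groups = {}                                    # group value -> (sum, count)
    for block in scan_blocks(relation):
        for t in block:
            s, c = groups.get(t[group_attr], (0, 0))
            groups[t[group_attr]] = (s + t[agg_attr], c + 1)
    for g, (s, c) in groups.items():               # one output tuple per group
        yield (g, s / c)                           # divide only at the end
```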
22
One pass, binary operators Requirement: min(B(R), B(S)) ≤ M Exception: bag union, which just copies both relations to the output and needs no such bound Cost: B(R) + B(S) Assume R is larger than S. How to perform the operations below: – Set union, set intersection, set difference – Bag intersection, bag difference – Cartesian product, natural join All these operators require reading the smaller relation into main memory and building there a search structure (e.g. a main-memory hash table) for easy search and insertion.
23
Set Union Let R and S be sets. We read S into M−1 buffers of main memory and copy all of its tuples to the output. We then read each block of R into the M-th buffer, one at a time. For each tuple t of R we check whether t is in S; if not, we copy t to the output.
24
Set Intersection Let R and S be sets; the result will also be a set. We read S into M−1 buffers of main memory. We then read each block of R into the M-th buffer, one at a time. For each tuple t of R we check whether t is in S; if so, we copy t to the output.
25
Set Difference Let R and S be sets. Since difference is not commutative, we must distinguish between R−S and S−R. Read S into M−1 buffers of main memory, then read each block of R into the M-th buffer, one at a time. To compute R−S: for each tuple t of R, if t is not in S, copy t to the output. To compute S−R: for each tuple t of R, if t is in S, delete t from S. At the end, output those tuples of S that remain.
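A sketch of the one-pass set operations. S is assumed small enough to load into a main-memory set (the M−1 buffers); R is streamed through the hypothetical scan_blocks iterator.

```python
def set_union(R, S_tuples):
    S = set(S_tuples)
    yield from S                      # copy all of S to the output
    for block in scan_blocks(R):
        for t in block:
            if t not in S:            # R-tuples not already covered by S
                yield t

def set_intersection(R, S_tuples):
    S = set(S_tuples)
    for block in scan_blocks(R):
        for t in block:
            if t in S:
                yield t

def set_difference(R, S_tuples):      # computes R - S
    S = set(S_tuples)
    for block in scan_blocks(R):
        for t in block:
            if t not in S:
                yield t
```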
26
Bag Intersection Let R and S be bags. Read S into M−1 buffers of main memory, and associate with each distinct tuple a count that initially records the number of times the tuple occurs in S. Then read each block of R into the M-th buffer, one at a time. For each tuple t of R, check whether t is in S. If not, ignore it. Otherwise, if its counter is greater than zero, output t and decrement the counter.
27
Bag Difference We read S into M−1 buffers of main memory, and associate with each distinct tuple a count that initially records the number of times the tuple occurs in S. We then read each block of R into the M-th buffer, one at a time. To compute S−R: for each tuple t of R, if t is in S, decrement its counter; at the end, output those tuples of S whose counter remains positive. To compute R−S: think of the counter c for tuple t as c reasons not to output t. When we process a tuple t of R, we check whether t appears in S. If not, we output t. Otherwise, we look at t's counter c: if c is 0 we output t; if not, we do not output t and we decrement c.
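A sketch of the counter-based bag operations just described (a Counter holds the per-tuple counts for S; scan_blocks streams R):

```python
from collections import Counter

def bag_intersection(R, S_tuples):
    counts = Counter(S_tuples)        # occurrences of each tuple in S
    for block in scan_blocks(R):
        for t in block:
            if counts[t] > 0:         # an "unused" copy of t remains in S
                counts[t] -= 1
                yield t

def bag_difference(R, S_tuples):      # computes R - S
    counts = Counter(S_tuples)        # c "reasons" not to output t
    for block in scan_blocks(R):
        for t in block:
            if counts[t] > 0:
                counts[t] -= 1        # consume one reason, suppress this copy
            else:
                yield t
```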
28
Product We read S into M−1 buffers of main memory; no special search structure is needed. We then read each block of R into the M-th buffer, one at a time, and combine each tuple of R with every tuple of S.
30
Natural Join We read S into M−1 buffers of main memory and build a search structure whose search key is the set X of attributes shared by R and S. We then read each block of R into the M-th buffer, one at a time. For each tuple t of R we look up the tuples u of S with u[X] = t[X], and for each one we copy the joined tuple t⋈u to the output.
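A sketch of the one-pass natural join: build a main-memory hash table on the shared attributes of S, then stream R. Tuples are dicts, and shared_attrs names the attributes X common to both relations (scan_blocks as before).

```python
def one_pass_join(R, S_tuples, shared_attrs):
    table = {}                                        # X-value -> list of S-tuples
    for u in S_tuples:
        key = tuple(u[a] for a in shared_attrs)
        table.setdefault(key, []).append(u)
    for block in scan_blocks(R):
        for t in block:
            key = tuple(t[a] for a in shared_attrs)
            for u in table.get(key, []):              # every matching S-tuple
                yield {**t, **u}                      # the joined tuple t.u
```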
31
Nested-Loop joins A "one-and-a-half" pass method, since one relation is read just once. Tuple-Based Nested-Loop Join Algorithm: FOR each tuple s in S DO FOR each tuple r in R DO IF r and s join to make a tuple t THEN output t Improvement, to take advantage of the disk I/O model: instead of reading all of R once for every tuple of S (i.e. T(S) times), load memory with as many tuples of S as fit, and match the tuples of R against all the S-tuples in memory.
32
Block-based nested loops Assume B(S) ≤ B(R) and B(S) > M. Read M−1 blocks of S into main memory and compare to all of R, block by block:
FOR each chunk of M−1 blocks of S DO
  FOR each block b of R DO
    FOR each tuple t of b DO
      find the tuples of S in memory that join with t
      output the join of t with each of these tuples
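A sketch of the block-based nested-loop join. chunks_of_tuples is a hypothetical helper that yields the tuples of M−1 blocks of S at a time; for each chunk we build a hash table on the join attributes, so the innermost "find the tuples of S that join with t" step is a lookup rather than a scan.

```python
def block_nested_loop_join(R, S, shared_attrs, M):
    for chunk in chunks_of_tuples(S, M - 1):          # outer loop: M-1 blocks of S
        table = {}
        for s in chunk:
            key = tuple(s[a] for a in shared_attrs)
            table.setdefault(key, []).append(s)
        for block in scan_blocks(R):                  # inner loop: all of R
            for t in block:
                key = tuple(t[a] for a in shared_attrs)
                for s in table.get(key, []):
                    yield {**t, **s}
```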
33
Example B(R) = 1000, B(S) = 500, M = 101 The outer loop iterates 5 times. At each iteration we read M−1 (i.e. 100) blocks of S and all of R (1000 blocks). Total cost: 5*(100 + 1000) = 5500 I/O's Question: what if we reversed the roles of R and S? We would iterate 10 times, and in each iteration read 100 + 500 blocks, for a total of 10*(100 + 500) = 6000 I/O's. Compare with a one-pass join, if it could be done: we would need only 1500 disk I/O's, but that requires B(S) ≤ M−1.
34
Analysis of block-based nested loops Number of disk I/O's: ⌈B(S)/(M−1)⌉ * ((M−1) + B(R)), which is roughly B(S) + B(S)*B(R)/(M−1), or approximately B(S)*B(R)/M.
35
Summary of one-pass algorithms (summary table omitted; see textbook sections 15.2.1, 15.2.2, 15.2.3, and 15.3.3)
36
Two-pass algorithms based on sorting This special case of multi-pass algorithms suffices for most relation sizes. Main idea for unary operations on R: suppose B(R) > M (the main memory size in blocks). First pass: – Read M blocks of R into main memory – Sort the contents of main memory – Write the sorted result (a sublist, or run) to M blocks on disk – Repeat until all of R has been read Second pass: create the final result
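A sketch of the first pass, producing the sorted sublists (runs). write_run and TUPLES_PER_BLOCK are hypothetical: write_run stores one sorted run on disk and returns a handle to it, and TUPLES_PER_BLOCK converts the M-block memory budget into a tuple count.

```python
def make_sorted_runs(relation, M, sort_key):
    runs, buffer = [], []
    for block in scan_blocks(relation):
        buffer.extend(block)
        if len(buffer) >= M * TUPLES_PER_BLOCK:            # main memory is full
            runs.append(write_run(sorted(buffer, key=sort_key)))
            buffer = []
    if buffer:                                             # last, possibly short run
        runs.append(write_run(sorted(buffer, key=sort_key)))
    return runs
```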
37
Duplicate elimination using sorting In the second phase (merging) we do not sort; we copy each tuple just once. We can do that because identical tuples show up "at the same time": they will all be at the front of the buffers holding the sorted sublists. As usual, if a buffer becomes empty we refill it from its sublist.
38
Duplicate-Elimination using Sorting: Example Assume M = 3, each buffer holds 2 records, and relation R consists of the following 17 tuples: 2, 5, 2, 1, 2, 2, 4, 5, 4, 3, 4, 2, 1, 5, 2, 1, 3 After the first pass the following sorted sublists are created:
1, 2, 2, 2, 2, 5
2, 3, 4, 4, 4, 5
1, 1, 2, 3, 5
In the second pass we dedicate one memory buffer to each sublist.
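A sketch of the second (merge) pass for duplicate elimination, using heapq.merge to interleave the already-sorted sublists; identical tuples arrive consecutively, so each value is emitted only once.

```python
import heapq

def merge_distinct(sorted_runs):
    last = object()                        # sentinel that matches nothing
    for t in heapq.merge(*sorted_runs):    # one input buffer per sorted sublist
        if t != last:                      # first occurrence of this tuple
            yield t
            last = t

# With the three sublists from the example above, this yields 1, 2, 3, 4, 5.
```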
39
Example (Cont’d)
42
Analysis of δ(R) 2B(R) disk I/O's to create the sorted sublists, plus B(R) to read each sublist in phase 2. Total: 3B(R) How large can R be? – There can be no more than M sublists, since we need one buffer for each. So B(R)/M ≤ M (B(R)/M is the number of sublists), i.e. B(R) ≤ M². To compute δ(R) we need at least sqrt(B(R)) blocks of main memory.
43
Sort-based ∪, ∩, − Example: set union. Create sorted sublists of R and S. Use input buffers for the sorted sublists of R and S, one buffer per sublist. Output each tuple once. - We can do that since all the identical tuples appear "at the same time." Analysis: 3(B(R) + B(S)) disk I/O's Condition: B(R) + B(S) ≤ M² Similar algorithms work for sort-based intersection and difference (bag or set versions).
44
Simple sort-based join For R(X,Y) ⋈ S(Y,Z) with M buffers of memory: Completely sort R on Y, and completely sort S on Y. Merge phase: use 2 input buffers, 1 for R and 1 for S. Pick the tuple t with the smallest Y value in the buffer for R. If t does not match the first tuple in the buffer for S, just discard t. Otherwise, read all the tuples of R with the same Y value as t into the M−2 remaining buffers, refilling R's input buffer whenever it is exhausted. Then read the tuples of S that match; for each one, produce its join with all the R-tuples held in the M−2 buffers.
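A sketch of the merge phase, assuming R and S have already been completely sorted on Y and that all R-tuples sharing one Y value fit in the M−2 spare buffers (here the sorted relations are simply in-memory lists of (Y, rest) pairs, so the buffer management is elided):

```python
def merge_join(R_sorted, S_sorted):
    i = j = 0
    while i < len(R_sorted) and j < len(S_sorted):
        y_r, y_s = R_sorted[i][0], S_sorted[j][0]
        if y_r < y_s:
            i += 1                                   # no match: discard the R-tuple
        elif y_r > y_s:
            j += 1                                   # no match: discard the S-tuple
        else:
            k = i                                    # gather all R-tuples with this Y
            while k < len(R_sorted) and R_sorted[k][0] == y_r:
                k += 1
            while j < len(S_sorted) and S_sorted[j][0] == y_r:
                for r in R_sorted[i:k]:              # join with each buffered R-tuple
                    yield (y_r, r[1], S_sorted[j][1])
                j += 1
            i = k
```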
45
Example of sort join B(R) = 1000, B(S) = 500, M = 101 To sort R we need 4*B(R) I/O's, and similarly for S. – Number of I/O's = 4*(B(R) + B(S)) Doing the join in the merge phase: – Number of I/O's = B(R) + B(S) Total disk I/O's = 5*(B(R) + B(S)) = 7500 Memory requirement? To be able to do the sorts, we need B(R) ≤ M² and B(S) ≤ M². Recall: the nested-loop join needed only 5500 disk I/O's here, but its cost grows as the product of the relation sizes, while sort join's cost grows linearly; so nested-loop join is not good for joining relations that are much larger than main memory.
46
Potential problem... R(X, Y) contains the tuples (x1, a), (x2, a), …, (xn, a) and S(Y, Z) contains the tuples (a, z1), (a, z2), …, (a, zm): every tuple shares the same Y value a. What if the n tuples do not fit in M−2 buffers, and neither do the m tuples? If the tuples of R (or S) with the same Y-value y do not fit in M−2 buffers, then we use all M−2 buffers to do a nested-loop join on the tuples with Y-value y from both relations. Observe that we can "smoothly" switch to the nested-loop join when we see that the R-tuples with Y-value y do not fit in M−2 buffers.
47
Do we really need the fully sorted files? Suppose we are not worried about many common Y values. Can we improve on sort join? (Figure: the sorted runs of R and S feeding the join directly.)
48
A more efficient sort-based join Suppose we are not worried about many common Y values. Create Y-sorted sublists of R and S. Bring the first block of each sublist into a buffer (assuming there are at most M sublists). Find the smallest Y-value among the heads of the buffers; join its tuples with the matching tuples at the heads of the other buffers, using further buffers if there are "many" tuples with the same Y value. Disk I/O: 3*(B(R) + B(S)) Requirement: B(R) + B(S) ≤ M²
49
Example B(R) = 1000, B(S) = 500, M = 101 Total of 15 sorted sublists (10 for R, 5 for S). If too many tuples join on some Y value, use the remaining 86 main-memory buffers for a one-pass join on that value. Total cost: 3*(1000 + 500) = 4500 disk I/O's M² = 10201 > B(R) + B(S), so the requirement is satisfied.
50
Summary of sort-based algorithms