CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 242 Database Systems II Query Execution.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 242 Database Systems II Query Execution

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 243 Introduction We have optimized the logical query plan, applying relational algebra equivalences. In order to refine this plan into a physical query plan, we in particular need to choose one of the available algorithms to implement the basic operations (selection, join,... ) of the query plan. For each alternative physical query plan, we estimate its cost. The cost estimates are based on the size estimates that we discussed in the previous chapter.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 244 Introduction Disk I/O (read / write of a disk block) is orders of magnitude more expensive than CPU operations. Therefore, we use the number of disk I/Os to measure the cost of a physical query plan. We ignore CPU costs, timing effects, and double buffering requirements. We assume that the arguments of an operator are found on disk, but the result of the operator is left in main memory.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 245 Introduction We use the following parameters (statistics) to express the cost of an operator: - B(R) = # of blocks containing R tuples, - f(R) = max # of tuples of R per block, - M = # memory blocks available in the buffer, - HT(i) = # levels in index i, - LB(i) = # of leaf blocks in index i. M may comprise the entire main memory, but typically the main memory needs to be shared with other processes, and M is much (!) smaller.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 246 Introduction The performance of relational operators depends on many parameters such as the following ones. Are the tuples of a relation stored physically contiguous ( clustered )? If yes, the number of blocks to be read is greatly reduced compared to non-clustered storage. Is a relation sorted by the relevant (selection, join) attribute? Otherwise, it may need to be sorted on-the-fly. Which indexes exist? Some algorithms require the existence of a corresponding index.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 247 Introduction Each operator (selection, join,...) in a logical query plan can be implemented by one of a fairly large number of alternative algorithms. We distinguish three types of algorithms: - sorting-based algorithms, - hash-based algorithms, - index-based algorithms. Sorting, building of hash table or building of index can either have happened in advance or may happen on the fly.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 248 Introduction We can also categorize algorithms according to the number of passes over the data: - one-pass algorithms read data only once from disk, - two-pass algorithms read data once from disk, write intermediate relation back to disk and then read the intermediate relation once. - multiple-pass algorithms perform more than two passes over the data, not considered in class.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 249 One-Pass Algorithms for Unary Operations Consider the unary, tuple-at-a-time operations, selection and projection on relation R. Read all the blocks of R into the input buffer, one at a time. Perform the operation on each tuple and move the selected / projected tuple to the output buffer. R Input buffer Output buffer Unary operation

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 250 One-Pass Algorithms for Unary Operations Output buffer may be input buffer of other operation and is not counted. Thus, algorithm requires only M = 1 buffer blocks. I/O cost is B(R). If some index is applicable for a selection, have to read only blocks that contain qualifying tuples.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 251 One-Pass Algorithms for Binary Operations Binary operations: union, intersection, difference, Cartesian product, and join. Use subscripts B and S to distinguish between the set- and bag version, e.g. The bag union can be computed using a very simple one-pass algorithm: copy each tuple of R to the output, and copy each tuple of S to the output. I/O cost is B(R) + B(S), M = 1.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 252 One-Pass Algorithms for Binary Operations Other binary operations require the reading of the smaller of the two input relations into main memory. One buffer to read blocks of the larger relation, M-1 buffers for holding the entire smaller table. I/O cost is B(R) + B(S). In main memory, a data structure is built that efficiently supports insertions and searches. Data structure, e.g., hash table or binary balanced tree. Space overhead can be neglected.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 253 One-Pass Algorithms for Binary Operations For set union, read the smaller relation (S) into M-1 buffers, representing it in a data structure whose search key consists of all attributes. All these tuples are also copied to the output. Read all blocks of R into the M-th buffer, one at a time. For each tuple t of R, check whether t is in S. If not, copy t to the output. For set intersection, copy t to output if it also is in S.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 254 Nested-Loop Joins We now consider algorithms for the join operator. The simplest one is the nested-loop join, a one- and-a-half pass algorithm. One table is read once, the other one multiple times. It is not necessary that one relation fits in main memory. Perform the join through two nested loops over the two input relations.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 255 Nested-Loop Joins Tuple-based nested-loop join natural join R S, join attribute C for each r  R do for each s  S do if r.C = s.C then output (r,s) Outer relation R, inner relation S. One buffer for outer relation, one buffer for inner relation. M = 2. I/O cost is T(R) x T(S).

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 256 Nested-Loop Joins Example Relations not clustered T(R1) = 10,000 T(R2) = 5,000 R1 as the outer relation Cost for each R1 tuple t1: read tuple t1 + read relation R2 Total I/O cost is 10,000 (1+5,000)=50,010,000

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 257 Nested-Loop Joins Can do much better by organizing access to both relations by blocks. Use as much buffer space as possible (M-1) to store tuples of the outer relation. Block-based nested-loop join for each chunk of M-1 blocks of R do read these blocks into the buffer; for each block b of S do read b into the buffer; for each tuple t of b do find the tuples of R that join with t and output the join results

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 258 Nested-Loop Joins Example R1 as the outer relation T(R1) = 10,000, T(R2) = 5,000 S(R1) = S(R2) = 1/10 block M = 101, 100 buffers for R1, 1 buffer for R2 10 R1 chunks cost for each R1 chunk: read chunk: 1,000 IOs read R2: 5,000 IOs total I/O cost is 10 x 6,000 = 60,000 IOs

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 259 Nested-Loop Joins Can do even better by reversing the join order. R2 R1 T(R1) = 10,000, T(R2) = 5,000 S(R1) = S(R2) = 1/10 block M = 101, 100 buffers for R2, 1 buffer for R1 5 R2 chunks cost for each R2 chunk: read chunk: 1,000 IOs read R1: 10,000 IOs total I/O cost is 5 x 11,000 = 55,000 IOs

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 260 Nested-Loop Joins Finally, performance is dramatically improved when input relations are clustered. With clustered relations, for each R2 chunk: read chunk: 100 IOs read R1: 1,000 IOs Total I/O is 5 x 1,100 = 5,500 IOs. Note that the IO cost for a one-pass join (which has the minimum IO of any join algorithm) in this example is 1,000 + 500 = 1,500 IOs. For a comparison, the one-pass join requires M=501 buffer blocks, which is optimal.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 261 Two-Pass Algorithms Based on Sorting If the input relations are sorted, the efficiency of duplicate elimination, set-theoretic operations and join can be greatly improved. Reserve one buffer for each of the input relations R and S and another buffer for the output. Scan both relations simultaneously in sort order, merging matching tuples. For example, for set intersection: repeatedly consider the tuple t that is least in the sort order (w.r.t. primary key) among all tuples in the input buffer. If it appears in both R and S, output t.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 262 Two-Pass Algorithms Based on Sorting In the following, we present a simple sort-merge join algorithm. It is called merge-join, if step (1) can be skipped, since the input relations R1 and R2 are already sorted. Sort-merge join (1) if R1 and R2 not sorted, sort them (2) i  1; j  1; while (i  T(R1))  (j  T(R2)) do if R1{ i }.C = R2{ j }.C then outputTuples else if R1{ i }.C > R2{ j }.C then j  j+1 else if R1{ i }.C < R2{ j }.C then i  i+1

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 263 Two-Pass Algorithms Based on Sorting Procedure outputTuples produces all pairs of tuples from R1 and R2 with C = R1{ i }.C = R2{ j }.C. In the worst case, need to match each pairs of tuples from R1 and R2 (nested-loop join). Procedure outputTuples While (R1{ i }.C = R2{ j }.C)  (i  T(R1)) do [jj  j; while (R1{ i }.C = R2{ jj }.C)  (jj  T(R2)) do [output pair R1{ i }, R2{ jj }; jj  jj+1 ] i  i+1 ]

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 264 Two-Pass Algorithms Based on Sorting Example i R1{i}.C R2{j}.Cj 110 51 220 202 320 203 430 304 540 305 506 527

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 265 Two-Pass Algorithms Based on Sorting Example Both R1, R2 ordered by C; relations clustered. Memory R1 R2 ….. R1 R2 Total cost: read R1 cost + read R2 cost = 1,000 + 500 = 1,500 IOs

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 266 Two-Pass Algorithms Based on Sorting What if input relations are not yet in the required sort order? Do Two-Phase, Multiway Merge-Sort (2PMMS). Phase 1: Sort each block of relation R separately in main memory, write sorted sublists back to disk. Phase 2: Merge all the B(R) sorted sublists. pointer to first unchosen tuple output buffer... select smallest unchosen input buffer (sorted)

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 267 Two-Pass Algorithms Based on Sorting Each sorted sublist has a length of M blocks. Number of sublists is B(R)/M. Therefore, This means we require In phase 1, each tuple is read and written once. In phase 2, each tuple is read again. We ignore the cost of writing the results to disk. Thus, the IO cost is 3B(R).

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 268 Two-Pass Algorithms Based on Sorting IO cost is 4B(R), if sorting is used as a first step of sort-join and the results must be written to the disk. If relation R is too big, apply the idea recursively. Divide R into chunks of size M(M-1), use 2PMMS to sort each one, and take resulting sorted lists as input for a third (merge) phase. This leads to Multi-Phase, Multiway Merge Sort.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 269 Two-Pass Algorithms Based on Sorting (i) For each 100 blk chunk of R: - read chunk - sort in memory - write to disk sorted chunks R1 R2... Example M=101 Memory

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 270 Two-Pass Algorithms Based on Sorting (ii) Read all chunks + merge + write out Sorted file Memory Sorted Chunks...

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 271 Two-Pass Algorithms Based on Sorting Sort cost: cach tuple is read, written, read, written Join cost: each tuple is read Sort cost R1: 4 x 1,000 = 4,000 Sort cost R2: 4 x 500 = 2,000 Total cost = sort cost + join cost = 6,000 + 1,500 = 7,500 IOs

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 272 Two-Pass Algorithms Based on Sorting Nested loop join (best version discussed above) needs only 5,500 IOs, i.e. outperforms sort-join. However, the situation changes for the following scenario: R1 = 10,000 blocks clustered R2 = 5,000 blocks not ordered. R1 is 1000 blocks, sorting needs M  31.62. R2 is 500 blocks, sorting needs M  22.36. I.e., need at least M=32 buffers.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 273 Two-Pass Algorithms Based on Sorting Nested-loops join: 5000 x (100+10,000) = 50 x 10,100 100 = 505,000 IOs Sort-join: 5(10,000+5,000) = 75,000 IOs Sort-join clearly outperforms nested-loop join!

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 274 Two-Pass Algorithms Based on Sorting Simple sort-join costs 5(B(R) + B(S)) IOs. It requires It assumes that tuples with the same join attribute value fit in M blocks. If we do not have to worry about large numbers of tuples with the same join attribute value, then we can combine the second phase of the sort with the actual join (merge). We can save the writing to disk in the sort step and the reading in the merge step.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 275 Two-Pass Algorithms Based on Sorting This algorithm is an advanced sort-merge join. Repeatedly find the least C-value c among the tuples in all input buffers. Instead of writing a sorted output buffer to disk, and reading it again later, identify all the tuples of both relations that have C=c. Cost is only 3(B(R) + B(S)) IOs. Since we have to simultaneously sort both input tables and keep them in memory, the memory requirements are getting larger:

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 276 Two-Pass Algorithms Based on Hashing If both input relations are too large to be stored in the buffer, hash all the tuples of both relations applying the same hash function to the join attribute(s). Hash function h has domain of k hash values, i.e. k buckets. Only tuples from R and S that fall into the same bucket i can join. Hash first relation R, then relation S, writing the buckets to disk.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 277 Two-Pass Algorithms Based on Hashing To hash relation R, read it block by block. Allocate one buffer block to each of the k buckets. For each tuple t, move it to the buffer of h (t). If a buffer is full, write it to disk and initialize it. Finally, write to disk all partially full buffer blocks. IO cost is B(R). Memory requirement M = k+1.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 278 Two-Pass Algorithms Based on Hashing For each i, read the i -th bucket of R into completely memory, and read the i -th bucket of S into memory, one block at a time. For each S tuple s in the buffer block, determine matching tuples r in R and output (r,s). We assume that each bucket fits into main memory.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 279 Two-Pass Algorithms Based on Hashing Hash join Hash function h, range 0... k Buckets for R1: G0, G1,... Gk Buckets for R2: H0, H1,... Hk Algorithm (1) Hash R1 tuples into G buckets (2) Hash R2 tuples into H buckets (3) For i = 0 to k do match tuples in buckets Gi, Hi and output results

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 280 Two-Pass Algorithms Based on Hashing Example hash function: even/odd buckets R1R2Buckets 2 5Even 4 3 12Odd: 5 3 813 9 8 11 14 2 4 8 4 12 8 14 3 5 9 5 3 13 11 R1R2

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 281 Two-Pass Algorithms Based on Hashing R, S clustered (un-ordered). Use 100 hash buckets of 10 blocks each. To hash R: read R, hash, and write buckets. Hash S in the same way. R... 10 blocks 100

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 282 Two-Pass Algorithms Based on Hashing Suppose R is the smaller of the input relations. Read one R bucket, build memory hash table (with other hash function). Read corresponding S bucket, one block at a time, and hash probe. Repeat same procedure for all buckets. R S... R memory...

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 283 Two-Pass Algorithms Based on Hashing Cost “Bucketize:” Read R + write Read S + write Join:Read R, S Total cost = 3 x [1,000+500] = 4,500 IO This is an approximation, since buckets will vary in size, and we have to round up to full blocks.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 284 Two-Pass Algorithms Based on Hashing Memory requirements Size of R bucket =(B(R)/M-1) k = M-1 = number of hash buckets This is assuming that all hash buckets of R have the same size. Same calculation for S. The buckets for the smaller input relation must fit into main memory.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 285 Index-Based Algorithms Index-based algorithms are especially useful for the selection operator, but also for the join operator. We distinguish clustering and non-clustering indexes. A clustering index is an index where all tuples with a given search key value appear on (roughly) as few blocks as possible. One relation can have only one clustering index, but multiple non-clustering indexes.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 286 Index-Based Algorithms Index join For each r  R do X  index (S, C, r.C) for each s  X do output (r,s) index(rel, attr, value) returns the set of rel tuples with attr = value

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 287 Index-Based Algorithms Example Assume R.C index exists; 2 levels. Assume S clustered, unordered. Assume R.C index fits in memory. Cost reads of S: 500 IOs for each S tuple: - probe index – no IO - if match, read R tuple: 1 IO.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 288 Index-Based Algorithms What is expected number of matching tuples? (a) say R.C is key, S.C is foreign key then expect 1 match (b) say V(R,C) = 5000, T(R) = 10,000 with uniform distribution assumption expect 10,000/5,000 = 2 matching tuples.

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 289 Index-Based Algorithms (c) Say DOM(R, C)=1,000,000 T(R) = 10,000 with alternate assumption expect = 10,000 = 1 matches 1,000,000 100 What is expected number of matching tuples?

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 290 Index-Based Algorithms Total cost of index join (a) Total cost = 500+5000(1)1 = 5,500 IO (b) Total cost = 500+5000(2)1 = 10,500 IO (c) Total cost = 500+5000(1/100)1=550 IO

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 291 Index-Based Algorithms What if index does not fit in memory? Example: say R1.C index is 201 blocks. Keep root + 99 leaf nodes in memory. Expected cost of each probe is E = (0)99 + (1)101  0.5. 200

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 292 Index-Based Algorithms Total cost (including probes) For case (b) = 500+5000 [probe + get records] = 500+5000 [0.5+2] uniform assumption = 500+12,500 = 13,000 IOs For case (c): = 500+5000[0.5  1 + (1/100)  1] = 500+2500+50 = 3,050 IOs

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 293 Summary Join Algorithms Nested-loop join ok for “small” relations (relative to memory size). Hash-join usually best for equi-join, where relations not sorted and no indexes exist. Sort-merge join good for non-equi-join e.g., R.C > S.C. If relations already sorted, use merge join. If index exists, index-join can be efficient (depends on expected result size).

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 242 Database Systems II Query Execution.

Similar presentations

Presentation on theme: "CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 242 Database Systems II Query Execution."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 242 Database Systems II Query Execution.

Similar presentations

Presentation on theme: "CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 242 Database Systems II Query Execution."— Presentation transcript:

Similar presentations

About project

Feedback