Query and Join Optimization 11/5. Overview Recap of Merge Join Optimization Logical Optimization Histograms (How Estimates Work. Big problem!) Physical.

Query and Join Optimization 11/5

Overview Recap of Merge Join Optimization Logical Optimization Histograms (How Estimates Work. Big problem!) Physical Optimizer (if we have time)

Recap on Merge

Key (Simple) Idea To find an element that is no larger than all elements in two lists, one only needs to compare minimum elements from each list. A 1 <= A 2 <= … <= A N B 1 <= B 2 <= … <= B M Then Min {A 1, B 1 } <= A i for i=1….N and Min {A 1, B 1 } <= B j for j=1….M A 1 <= A 2 <= … <= A N B 1 <= B 2 <= … <= B M Then Min {A 1, B 1 } <= A i for i=1….N and Min {A 1, B 1 } <= B j for j=1….M

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 7,111, 520,31 2, 2225,3023,24 Main Memory Two Sorted Files (disk)

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 7,1120,31 25,3023,24 Main Memory Two Sorted Files (disk) 1, 5 2, 22

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 7,1120,31 25,3023,24 Main Memory Two Sorted Files (disk) 1,5 2,22

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 7,1120,31 25,3023,24 Main Memory Two Sorted Files (disk) 5 22 1,2

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 7,1120,31 25,3023,24 Main Memory Two Sorted Files (disk) 22 1,25 What next?

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 20,31 25,3023,24 Main Memory Two Sorted Files (disk) 22 1,25 7,11

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 20,31 25,3023,24 Main Memory Two Sorted Files (disk) 7,11 22 1,25

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 20,31 25,3023,24 Main Memory Two Sorted Files (disk) 11 22 1,25,7

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 20,31 25,3023,24 Main Memory Two Sorted Files (disk) 11 22 1,2 5,7

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 20,31 25,3023,24 Main Memory Two Sorted Files (disk) 22 1,211 5,7

Merge BIG sorted files to produce BIGGER Sorted Files With SMALL memory 25,3023,24 Main Memory Two Sorted Files (disk) 20,31 22 1,211 5,7

We can merge lists of arbitrary length with only 3 buffer pages. If Lists of size N and M, then Cost: 2(N+M) if lists of size N,M. What if we merge B lists with B+1 buffer pages?

Query Optimization

Optimization Order the operations within a query to reduce the cost. Major component of the database – Most mysterious & important – Heart of Query Processing (QP) “QP is not rocket science. When you flunk out of QP, we make you go build rockets.” – anonymous

Join Optimization

RA Reminder Sailors(sid,sname,rating,age) Reserves(sid,bid,date) Boats(bid,bname,color) Warning: Keys… “Find Names of sailors who’ve reserved a red boat” “Find the names of sailors who reserved both a red boat and a green boat”

Schema for Examples Reserves: – Each tuple is 40 bytes long, 100 tuples per page, 1000 pages. Sailors: – Each tuple is 50 bytes long, 80 tuples per page, 500 pages. Sailors ( sid : integer, sname : string, rating : integer, age : real) Reserves ( sid : integer, bid : integer, day : dates, rname : string)

What is this doing? You too can type EXPLAIN! (you may also want to know ANALYZE) When it’s slow, you’d like to know!

Joins One of the most important for performance Many, many algorithms: All fun. SELECT * FROM Reserves R1, Sailors S1 WHERE R1.sid = S1.sid SELECT * FROM Reserves R1, Sailors S1 WHERE R1.sid = S1.sid What is this in RA?

Some dry notation Given Relation R. Define the following two functions. T(R) = “# of tuples in R” B(R) = “# of pages/blocks in R” NB: I omit ceiling in calculations. A good exercise is to put them in the appropriate places! NB2: We don’t write the output writing to disk cost!

26 Nested loop join

27 Nested Loop Joins Tuple-based nested loop R S for each tuple r in R do for each tuple s in S do if r and s join then output (r,s) B(R) = 500 T(R) = 50,000 B(S) = 1000 T(S) =200,000 then, 5e7 IOs. ~ 140 hours! B(R) = 500 T(R) = 50,000 B(S) = 1000 T(S) =200,000 then, 5e7 IOs. ~ 140 hours! Cost: B(R) + T(R) B(S). Why? What is the cost if we switch the R and S?

28 Block Nested Loop Joins for each (M-1) blocks br of R do for each block bs of S do for each tuple s in bs do for each tuple r in br do if r and s join then output(r,s) Let M be the number of blocks in memory (M=11) B(R) = 500 T(R) = 50,000 B(S) = 1000 T(S) =200,000 NLJ =140 hrs BNLJ=.14 hrs B(R) = 500 T(R) = 50,000 B(S) = 1000 T(S) =200,000 NLJ =140 hrs BNLJ=.14 hrs NLJ = B(R) + T(R)B(S) BNLJ = B(R) + B(R)B(S)/(M-1) NLJ = B(R) + T(R)B(S) BNLJ = B(R) + B(R)B(S)/(M-1)

29 Nested Loop Joins Block-based Nested Loop Join – Still a smart cross product. Nevertheless, useful! NB: it is faster to iterate over the smaller relation first R S: R=outer relation, S=inner relation

Smarter than Cross Products

31 Index Nested Loop Joins Index -based nested loop R S on A for each tuple r in R do for each tuple s find all s.t. r.A = s.A Clustered B+ tree on S.A. All distinct values fit on a page. How much does this join cost? ~ B(R) + T(R)*3 (rule of thumb) Clustered B+ tree on S.A. All distinct values fit on a page. How much does this join cost? ~ B(R) + T(R)*3 (rule of thumb) Does not evaluate the full cross product!

Sort Merge

Join: Sort-Merge (R S) Sort R and S on the join column, then scan them to do a ``merge’’ (on join col.), and output result tuples. R is scanned once; each S group is scanned once per matching R tuple. – Multiple scans of an S group are likely to find needed pages in buffer. If R, S are already sorted on the join key, SMJ is awesome! If R, S are already sorted on the join key, SMJ is awesome!

Example of Sort-Merge Join Cost: 6 M + 6N + (M+N) – The cost of scanning, M+N, could be M*N (very unlikely! When does this happen?) – Here M (resp. N) is the size in Pages of R (resp. N)

Sort Merge v. Nested Loops steel cage match If we have 100 buffer pages, reserves is 1000 pages and Sailors 500 pages then – Sort both in two passes: 2 * 2 * 1000 + 2 * 2 * 500 – Merge phase 1000 + 500 so 7500 IOs What is BNLJ? – 500 + 1000*500/99 = 5550 But, if we have 35 buffer pages? – Sort Merge has same behavior (still 2 pass) – BNLJ? ~ 15k IOs! NB: SMJ both relations sorted in two passes

A simple optimization: Merged! Observe. The last phase of the external sort is a merge, and we can merge the merge phases. – Create sorted runs 2 * (1000 + 500) Each run is of length (B-1) (approximately) There are 1000/(B-1) + 500/(B-1) such runs – If (1000+500)/(2(B-1)) < B-1 then all runs fit in memory, roughly if (M+N) < 2B 2 or max { M, N } < 2B 2 – One can create runs of length 2(B-1) using what’s called a tournament sort used in PostgreSQL So we’ll say max { M, N } < B 2 implies cost 3(M+N)

Hash Join

Hash-Join Partitions of R & S Input buffer for Si Hash table for partition Ri (k < B-1 pages) B main memory buffers Disk Output buffer Disk Join Result hash fn h2 B main memory buffers Disk Original Relation OUTPUT 2 INPUT 1 hash function h B-1 Partitions 1 2 B-1... (2) Read a partition of R, hash it using h2. Scan matching partition of S for matches. (2) Read a partition of R, hash it using h2. Scan matching partition of S for matches. (1) Partition both relations using hash fn h: R tuples in partition i will only match S tuples in partition i. Should h = h2?

How much memory does Hash join need to perform well? Good case=perform the join in 2 passes 1 st Point: How large are the partitions? – R is of size M – S is of size N (wlog M < N) – Partition R into B-1 buffer pages (why B-1?) How many partitions result? How big are they? Roughly, each partition of R is f M/(B-1) where f is some fudge factor. Roughly, each partition of R is f M/(B-1) where f is some fudge factor.

How much memory does Hash join need to perform well? Good case=perform the join in 2 passes 2 nd Question: During the probe phase, how much memory do we need? Key :Only smaller partition needs to fit! Buffer needs to fit 1 partition of R, 1 page of S, & output: B > f M / (B-1) + 1 + 1 i.e., B 2 > fM The little dog!

Sort-Merge v. Hash Join In partitioning phase, read+write both R,S; 2(M+N). In matching phase, read both R,S; M+N I/Os. Given a minimum amount of memory (what is this, for each?) both have a cost of 3(M+N) I/Os. Minimum memory: HJ : B 2 > min {M,N} pages – i.e., the smaller relation SMJ: B 2 > max {M,N} pages – i.e., the larger relation. Minimum memory: HJ : B 2 > min {M,N} pages – i.e., the smaller relation SMJ: B 2 > max {M,N} pages – i.e., the larger relation. Hash Join superior if relation sizes differ greatly. Why?

Further Comparisons of Hash and Sort Joins Hash Joins are highly parallelizable. Sort-Merge less sensitive to data skew and result is sorted

Observations about Hash-Join In-memory hash table speeds up matching tuples, so little more memory is needed (fudge factor). If the hash function does not partition uniformly, one or more R partitions may not fit in memory. – What then? Can apply hash-join technique recursively to do the join of this R-partition with corresponding S- partition. – SKEW!

Recall: Logical Optimization

Single block SQL to RA SELECT DISTINCT S.sid FROM Sailors S, Reserves R WHERE s.sid = r.sid and s.rating > 8 SELECT DISTINCT S.sid FROM Sailors S, Reserves R WHERE s.sid = r.sid and s.rating > 8 SELECT S.sid, COUNT(DISTINCT Bid) FROM Sailors S, Reserves R WHERE s.sid = r.sid and s.rating > 8 GROUP BY S.sid HAVING COUNT(DISTINCT Bid) > 5 SELECT S.sid, COUNT(DISTINCT Bid) FROM Sailors S, Reserves R WHERE s.sid = r.sid and s.rating > 8 GROUP BY S.sid HAVING COUNT(DISTINCT Bid) > 5 Highly rated sailors who reserve many different boats and how many boats they reserve How would you optimize these? Highly rated sailors

Logical Optimization Summary Use query equivalence to compute same output via different plans – Key reason to use an algebra Often logical rewritings applied heuristically: – Always convert selection + cross product to Join Asymptotic reduction – Push down selections and projections Often, but not always a good idea!

Physical Optimization

One concept: Pipelining Intermediate results: could write them to disk or pipeline them to next operator Reserves Sailors sid=sid bid=100 rating > 5 sname RA Tree: We can apply selection & projection “on the fly”. Why?

Overview of Query Optimization A Plan is Tree of R.A. ops with choice of algorithm for each op. – Each operator typically implemented using a `pull’ interface: Two main issues: – For a given query, what plans are considered? Algorithm to search plan space for cheapest (estimated) plan. – How is the cost of a plan estimated? Ideally: Want to find best plan. Practically: Avoid worst plans! We will study the System R approach.

Highlights of System R Optimizer Impact: Most widely used; works well for < 10 joins. Cost estimation: Approximate art at best. – Statistics, maintained in system catalogs, used to estimate cost of operations and result sizes. – Considers combination of CPU and I/O costs. Enumerates an entire plan space: – Too many plans so only left-deep plans considered. Left-deep plans allow output of each operator to be pipelined into the next operator – Cartesian products avoided. There are other styles now… rule based.

An Example

How does it get those costs?

Cost Estimation

For each plan considered, must estimate cost: – Must estimate cost of each operation in plan tree. You know or can guess this: We’ve already discussed how to estimate the cost of operations (sequential scan, index scan, joins, etc.) All estimates depend on input cardinality… – Must also estimate size of result for each operation in tree! For selections and joins, assume independence of predicates. Let’s see how to estimate…

Estimating Results Sizes Estimate the reduction factor Column = value (e.g. Salary = 100k) – If there is an index I then 1/#keys(I) – If no index? 1/10 Column1 = Column2 (e.g. Salary = Age ) – Index I1, I2, then 1/max(#keys(I1), #keys(I2)) – If no Index? 1/10 Column > value – If Index I then (High(I) – value) / (High(I) – Low(I)) – No Index? 1/2 Later: Do better with histograms

Histograms

A histogram idea is to make “buckets” count how many are in each bucket How to choose the buckets? – Equiwidth & Equidepth Turns out high-frequency values are very important

Abstract Example Values Frequency How do we compute how many values between 8 and 10? (Yes, it’s obvious) Problem: Counts take too much space!

The Uniform is Red How much space does this take to store?

Fundamental Tradeoffs Want high resolution (like the full counts) Want low space (like uniform) Histograms are a compromise!

The Uniform is Red A query How do you estimate # of tuples? What about point queries?

Equi width All buckets roughly the same width

Equidepth All buckets contain roughly the same number of items

Range Query: x in [5,8] All buckets roughly the same width

Histograms Simple, intuitive and popular Parameters # of buckets and type Can extend to many attributes (multidimensional)

Maintaining Histograms Histograms require that we update them! – Typically, you must run/schedule a command to update statistics on the database – Out of date histograms can be terrible! There is research work on self-tuning histograms and the use of query feedback – Oracle 11g

Nasty example 1. we insert many tuples with value > 16 2. we do not update the histogram 3. we ask for values > 20? 1. we insert many tuples with value > 16 2. we do not update the histogram 3. we ask for values > 20?

When estimates behave badly If we underestimate the number of tuples, what kinds of plans suffer? If we overestimate the number of tuples, what kinds of plans suffer? Think about using unclustered indexes…. We could have used that index! Or we could have used a hash join in one pass instead of sorting in two!

Compressed Histograms One popular approach: 1.Store the most frequent values and their counts explicitly 2.Keep an equiwidth or equidepth one for the rest of the values People continue to try all manner of fanciness here that people try wavelets, graphical models, entropy models,…

Query and Join Optimization 11/5. Overview Recap of Merge Join Optimization Logical Optimization Histograms (How Estimates Work. Big problem!) Physical.

Similar presentations

Presentation on theme: "Query and Join Optimization 11/5. Overview Recap of Merge Join Optimization Logical Optimization Histograms (How Estimates Work. Big problem!) Physical."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Query and Join Optimization 11/5. Overview Recap of Merge Join Optimization Logical Optimization Histograms (How Estimates Work. Big problem!) Physical.

Similar presentations

Presentation on theme: "Query and Join Optimization 11/5. Overview Recap of Merge Join Optimization Logical Optimization Histograms (How Estimates Work. Big problem!) Physical."— Presentation transcript:

Similar presentations

About project

Feedback