1
Examples of Physical Query Plan Alternatives
Selected Material from Chapters 12, 14 and 15
The slides for this text are organized into chapters. This lecture covers Chapter 12, providing an overview of query optimization and execution. This chapter is the first of a sequence (Chapters 12, 13, 14, 15) on query evaluation that might be covered in full in a course with a systems emphasis. It can also be used stand-alone, as a self-contained overview of these issues, in a course with an application emphasis. It covers the essential concepts in sufficient detail to support a discussion of physical database design and tuning in Chapter 20.
2
Query Optimization
NOTE: SQL provides many ways to express a query. HENCE: The system has many options for evaluating a query. The optimizer is important for query performance: it generates alternative plans and chooses the plan with the least estimated cost. Ideally, it finds the best plan. Realistically, it consistently finds a quite good one.
3
A Query Evaluation Plan
An extended relational algebra tree. Annotations at each node indicate: the access methods to use for each table, and the implementation method used for each relational operator.
4
A Query Evaluation Plan
RA tree: π_sname(σ_bid=100 ∧ rating>5(Reserves ⋈_sid=sid Sailors))
Plan: the same tree, with the join annotated (Simple Nested Loops) and the selection and projection above it annotated (On-the-fly).
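A minimal sketch of how such an annotated tree could be represented in code. The `PlanNode` class, the `render` helper, and the "file scan" access method on the leaves are hypothetical, introduced only for illustration; the slides do not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """One node of a plan: an operator plus its annotation (hypothetical class)."""
    op: str                      # relational operator or table name
    method: str = ""             # access or implementation method annotation
    children: List["PlanNode"] = field(default_factory=list)


def render(node: PlanNode, depth: int = 0) -> List[str]:
    """Return the tree as indented text lines, root first."""
    note = f" ({node.method})" if node.method else ""
    lines = ["  " * depth + node.op + note]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# The plan from the slide; "file scan" on the leaves is an assumption.
plan = PlanNode("project sname", "On-the-fly", [
    PlanNode("select bid=100 AND rating>5", "On-the-fly", [
        PlanNode("join sid=sid", "Simple Nested Loops", [
            PlanNode("Reserves", "file scan"),
            PlanNode("Sailors", "file scan"),
        ]),
    ]),
])

print("\n".join(render(plan)))
```

Each node carries exactly the two pieces of information the slide names: the operator and the method chosen to evaluate it.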
5
Query Optimization Multi-operator Queries: Pipelined Evaluation
On-the-fly: The result of one operator is pipelined to the next operator without creating a temporary table to hold the intermediate result. Materialized: Otherwise, the intermediate result must be materialized (written to a temporary table) before the next operator can access it.
[Figure: example operator tree over relations A, B and C]
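The difference can be sketched with Python generators, which hand rows from one operator to the next one at a time. The operator functions and the sample `reserves` rows below are hypothetical; a real engine works on pages and buffers, not Python lists.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, int]  # hypothetical (sid, bid) row


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Leaf operator: yield rows one at a time."""
    for row in rows:
        yield row


def select_bid(rows: Iterable[Row], bid: int) -> Iterator[Row]:
    """Selection: pass through only rows with the given bid."""
    for row in rows:
        if row[1] == bid:
            yield row


def project_sid(rows: Iterable[Row]) -> Iterator[int]:
    """Projection: keep only the sid field."""
    for row in rows:
        yield row[0]


reserves: List[Row] = [(22, 100), (31, 101), (58, 100)]  # made-up sample data

# Pipelined (on-the-fly): each row flows through all operators;
# no intermediate result is stored.
pipelined = list(project_sid(select_bid(scan(reserves), 100)))

# Materialized: the selection result is fully built (the "temp table")
# before the projection starts reading it.
temp = list(select_bid(scan(reserves), 100))
materialized = list(project_sid(temp))
```

Both variants produce the same rows; they differ only in whether `temp` ever exists, which in a disk-based system means extra write and read I/Os.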
6
Alternative Plans: Schema Examples
Reserves (sid: integer, bid: integer, day: dates, rname: string)
Sailors (sid: integer, sname: string, rating: integer, age: real)
Reserves: Each tuple is 40 bytes long, 100 tuples per page, 1000 pages.
Sailors: Each tuple is 50 bytes long, 80 tuples per page, 500 pages.
7
Alternative Plans: Motivating Example
SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5
RA Tree: π_sname(σ_bid=100 ∧ rating>5(Reserves ⋈_sid=sid Sailors))
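To see what the query returns, it can be run against a tiny in-memory database with Python's standard `sqlite3` module. The column types and the sample rows are assumptions chosen for illustration; only the table names, column names, and the query itself come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors  (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT, rname TEXT);
""")

# Made-up sample data.
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "Dustin", 7, 45.0),
    (31, "Lubber", 8, 55.5),
    (58, "Rusty", 3, 35.0),
])
conn.executemany("INSERT INTO Reserves VALUES (?, ?, ?, ?)", [
    (22, 100, "2024-01-01", "r1"),   # bid 100, rating 7: qualifies
    (31, 101, "2024-01-02", "r2"),   # wrong boat
    (58, 100, "2024-01-03", "r3"),   # rating 3: filtered out
])

rows = conn.execute("""
    SELECT S.sname
    FROM Reserves R, Sailors S
    WHERE R.sid = S.sid AND R.bid = 100 AND S.rating > 5
""").fetchall()
names = sorted(r[0] for r in rows)
conn.close()
```

Whatever plan the engine picks internally, the answer must be the same; the rest of the lecture is about how much the different plans cost.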
8
Alternative Plans: Motivating Example
SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5
RA Tree: π_sname(σ_bid=100 ∧ rating>5(Reserves ⋈_sid=sid Sailors))
Plan: the same tree, with the join annotated (Simple Nested Loops) and the selection and projection annotated (On-the-fly).
Costs:
1. Scan Sailors: for each page of Sailors, scan Reserves. 500+500*1000 I/Os. Or,
2. Scan Reserves: for each page of Reserves, scan Sailors. 1000+1000*500 I/Os.
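The two costs follow the same pattern: read the outer relation once, and scan the whole inner relation once per outer page. A small sketch, where the function name is hypothetical and the page counts are the ones from the schema slide (Sailors 500 pages, Reserves 1000 pages):

```python
def page_nested_loops_cost(outer_pages: int, inner_pages: int) -> int:
    """I/Os for a page-at-a-time nested loops join:
    one scan of the outer plus one inner scan per outer page."""
    return outer_pages + outer_pages * inner_pages


SAILORS_PAGES = 500
RESERVES_PAGES = 1000

sailors_outer = page_nested_loops_cost(SAILORS_PAGES, RESERVES_PAGES)   # 500 + 500*1000
reserves_outer = page_nested_loops_cost(RESERVES_PAGES, SAILORS_PAGES)  # 1000 + 1000*500

print(sailors_outer, reserves_outer)
```

Using the smaller relation as the outer saves only 500 I/Os here; both orders cost roughly half a million I/Os.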
9
Alternative Plans: Motivating Example
SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5
Plan: π_sname(σ_bid=100 ∧ rating>5(Reserves ⋈_sid=sid Sailors)), with the join annotated (Simple Nested Loops) and the selection and projection annotated (On-the-fly).
Cost: 500+500*1000 I/Os. Typically, a bad plan! Reasons: selections could be 'pushed' earlier, and no use is made of indexes. Goal of optimization: find a more efficient plan.
10
Alternative Plans - 2 -- No Indexes
Main idea: push selects down.
RA tree: π_sname(σ_bid=100 ∧ rating>5(Reserves ⋈_sid=sid Sailors))
11
Alternative Plans - 2 -- No Indexes
Main idea: push selects down.
Plan:
π_sname (On-the-fly)
  ⋈ sid=sid (Sort-Merge Join)
    σ bid=100 (Scan; write to temp T1)
      Reserves
    σ rating>5 (Scan; write to temp T2)
      Sailors
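Pushing selections below the join does not change the answer; it only shrinks the inputs to the join. The sketch below checks this on made-up in-memory rows (the data and function names are hypothetical) and counts how many joined pairs each strategy builds.

```python
# Hypothetical sample data: Sailors as (sid, sname, rating), Reserves as (sid, bid).
sailors = [(22, "Dustin", 7), (31, "Lubber", 8), (58, "Rusty", 3)]
reserves = [(22, 100), (31, 101), (58, 100), (31, 100)]


def join_then_select():
    """Original tree: join everything, then apply both selections."""
    joined = [(r, s) for r in reserves for s in sailors if r[0] == s[0]]
    names = sorted(s[1] for r, s in joined if r[1] == 100 and s[2] > 5)
    return names, len(joined)


def select_then_join():
    """Pushed-down tree: filter each input first (T1, T2), then join."""
    t1 = [r for r in reserves if r[1] == 100]   # bid = 100
    t2 = [s for s in sailors if s[2] > 5]       # rating > 5
    joined = [(r, s) for r in t1 for s in t2 if r[0] == s[0]]
    names = sorted(s[1] for r, s in joined)
    return names, len(joined)


names_a, pairs_a = join_then_select()
names_b, pairs_b = select_then_join()
```

On this sample the first strategy builds 4 joined pairs and the second only 2, while both return the same sailor names.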
12
Alternative Plan - 2
Plan:
π_sname (On-the-fly)
  ⋈ sid=sid (Sort-Merge Join)
    σ bid=100 (Scan; write to temp T1)
      Reserves
    σ rating>5 (Scan; write to temp T2)
      Sailors
With 5 buffer pages, cost of plan:
Scan Reserves (1000) + write temp T1 (10 pages, if we have 100 boats and a uniform distribution).
Scan Sailors (500) + write temp T2 (250 pages, if we have 10 ratings).
Sort T1 (2*2*10), sort T2 (2*4*250), merge (10+250).
Total: 1000+10+500+250+40+2000+260 = 4060 page I/Os.
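The individual terms can be reproduced with a short calculation. The pass count below uses the standard external merge sort model (initial runs of B pages, then (B-1)-way merges); that model and the function names are assumptions not spelled out on the slide, though they reproduce its 2*2*10 and 2*4*250 terms.

```python
import math


def sort_passes(pages: int, buffers: int) -> int:
    """Number of passes of external merge sort over `pages` pages with
    `buffers` buffer pages: one run-forming pass, then (buffers-1)-way merges."""
    runs = math.ceil(pages / buffers)
    passes = 1
    while runs > 1:
        runs = math.ceil(runs / (buffers - 1))
        passes += 1
    return passes


def sort_cost(pages: int, buffers: int) -> int:
    """Each pass reads and writes every page once."""
    return 2 * sort_passes(pages, buffers) * pages


BUFFERS = 5
RESERVES_PAGES, SAILORS_PAGES = 1000, 500
T1_PAGES, T2_PAGES = 10, 250   # sizes of the filtered temps, from the slide

scan_and_filter = RESERVES_PAGES + T1_PAGES + SAILORS_PAGES + T2_PAGES
sort_t1 = sort_cost(T1_PAGES, BUFFERS)   # 2 passes  -> 2*2*10
sort_t2 = sort_cost(T2_PAGES, BUFFERS)   # 4 passes  -> 2*4*250
merge = T1_PAGES + T2_PAGES
total = scan_and_filter + sort_t1 + sort_t2 + merge

print(scan_and_filter, sort_t1, sort_t2, merge, total)
```

Most of the cost is the two base-table scans and the sort of T2, which is why the next optimizations target the join method and the size of the temporaries.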
13
Alternative Plans - 2
Plan:
π_sname (On-the-fly)
  ⋈ sid=sid (Sort-Merge Join)
    σ bid=100 (Scan; write to temp T1)
      Reserves
    σ rating>5 (Scan; write to temp T2)
      Sailors
With 5 buffer pages:
Scanning and filtering: 1000+10+500+250 = 1760 I/Os.
Optimization 1: use a block nested loops join instead: join cost = 10+4*250, total cost = 2770.
Optimization 2: 'push' projections: T1 keeps only sid, so it shrinks to ceil(10/4) = ceil(2.5) = 3 pages; T2 keeps only sid and sname, so it shrinks to 250/2 = 125 pages. T1 now fits in the buffer, the cost of BNL drops to about 125 I/Os, and the total cost is < 2000 I/Os.
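A sketch of the block nested loops arithmetic, assuming the usual model in which B-2 buffer pages hold a block of the outer (one page is reserved for the inner and one for output). The function name is hypothetical, and the projected temp sizes (3 and 125 pages) are the ones stated on the slide.

```python
import math


def block_nested_loops_cost(outer_pages: int, inner_pages: int, buffers: int) -> int:
    """Read the outer once; scan the inner once per block of (buffers - 2) outer pages."""
    blocks = math.ceil(outer_pages / (buffers - 2))
    return outer_pages + blocks * inner_pages


BUFFERS = 5
RESERVES_PAGES, SAILORS_PAGES = 1000, 500

# Optimization 1: BNL over the unprojected temps T1 (10 pages) and T2 (250 pages).
scan_and_filter = RESERVES_PAGES + 10 + SAILORS_PAGES + 250
join_bnl = block_nested_loops_cost(10, 250, BUFFERS)      # 10 + 4*250
total_bnl = scan_and_filter + join_bnl

# Optimization 2: projections pushed, so T1 is 3 pages and T2 is 125 pages.
scan_and_filter_proj = RESERVES_PAGES + 3 + SAILORS_PAGES + 125
join_proj = block_nested_loops_cost(3, 125, BUFFERS)      # T1 is a single block
total_proj = scan_and_filter_proj + join_proj

print(join_bnl, total_bnl, join_proj, total_proj)
```

Under these assumptions the projected plan comes in well under the slide's bound of 2000 I/Os, since T1 fits in one block and T2 is scanned only once.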
14
Alternative Plan : Using Indices?
RA tree: π_sname(σ_bid=100(Reserves) ⋈_sid=sid σ_rating>5(Sailors))
Push selections down? Which indices help here? Index on Reserves.bid? Index on Sailors.rating? Index on Sailors.sid? Index on Reserves.sid?
15
Example Plan : With Index
With index on Reserves.bid: Assume 100 bid values, 100,000 tuples, and 100 tuples per disk page. We get 100,000/100 = 1000 tuples on 1000/100 = 10 disk pages. If the index is clustered, cost = 10 I/Os.
Plan: σ_bid=100 on Reserves (Use hash index; do not write result to temp); ⋈_sid=sid with Sailors; σ_rating>5; π_sname (On-the-fly).
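The selection estimate is a two-step division, sketched below with hypothetical variable names. The unclustered figure is not on the slide; it is the usual worst-case assumption of one page I/O per matching tuple, included only for contrast.

```python
import math

RESERVES_TUPLES = 100_000
DISTINCT_BIDS = 100          # assumed uniformly distributed
TUPLES_PER_PAGE = 100

# Tuples expected to satisfy bid = 100 under the uniformity assumption.
matching_tuples = RESERVES_TUPLES // DISTINCT_BIDS

# Clustered index: matching tuples are stored contiguously.
clustered_ios = math.ceil(matching_tuples / TUPLES_PER_PAGE)

# Unclustered index (assumption, not from the slide): in the worst case
# every matching tuple sits on a different page.
unclustered_worst_ios = matching_tuples

print(matching_tuples, clustered_ios, unclustered_worst_ios)
```

The gap between 10 and up to 1000 I/Os is why the slide's cost is conditioned on the index being clustered.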
17
Example Plan : Use Another Index
Index on Sailors? Which? A selection on Sailors may reduce the number of tuples considered in the join, but it then requires us to materialize the Sailors tuples again.
Plan: σ_bid=100 on Reserves (Use hash index; do not write result to temp); ⋈_sid=sid with Sailors (Index Nested Loops Join, with pipelining)?; σ_rating>5; π_sname (On-the-fly).
18
Index Nested Loop with Pipelining:
The outer is not materialized. Projecting out unnecessary fields from the outer doesn't help.
Plan: σ_bid=100 on Reserves (Use hash index; do not write result to temp); ⋈_sid=sid with Sailors (Index Nested Loops, with pipelining); σ_rating>5; π_sname (On-the-fly).
19
Example Plan Continued
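Index on Sailors.sid: sid is a key for Sailors, so there is at most one matching tuple, and an unclustered index on sid is OK. Cost? For each Reserves tuple (1000): get the matching Sailors tuple (1.2 I/O). So the join probes cost 1000*1.2 = 1200 I/Os; adding the 10 I/Os to fetch the Reserves tuples through the clustered index on bid gives 1210 I/Os in total.
Plan: σ_bid=100 on Reserves; ⋈_sid=sid with Sailors (Index Nested Loops, with pipelining); σ_rating>5; π_sname (On-the-fly).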
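The cost of this plan is the cost of producing the outer tuples plus one index probe per outer tuple. A sketch with a hypothetical function name; the 10 I/Os for the outer assume the clustered index on Reserves.bid from the earlier slide, and 1.2 I/Os per probe is the figure given above.

```python
def index_nested_loops_cost(outer_fetch_ios: float,
                            outer_tuples: int,
                            probe_ios: float) -> float:
    """I/Os to fetch the outer tuples plus one index probe of the inner per outer tuple."""
    return outer_fetch_ios + outer_tuples * probe_ios


OUTER_FETCH_IOS = 10      # selection bid=100 via clustered index (assumed)
OUTER_TUPLES = 1000       # Reserves tuples with bid=100
PROBE_IOS = 1.2           # per-lookup cost on the Sailors.sid index

probes = OUTER_TUPLES * PROBE_IOS
total = index_nested_loops_cost(OUTER_FETCH_IOS, OUTER_TUPLES, PROBE_IOS)

print(round(probes), round(total))
```

Because the outer is pipelined, nothing is written to a temporary table, so these two terms are the whole cost under the stated assumptions.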
20
Alternative Plan : With Second Index
Selection push down? Push (rating>5) before the join? Answer: No, because of the availability of the sid index on Sailors. Reason: there is no index on the selection result, so each lookup would then require a scan of Sailors.
Plan: σ_bid=100 on Reserves (Use hash index; do not write result to temp); ⋈_sid=sid with Sailors (Index Nested Loops, with pipelining); σ_rating>5 applied after the join; π_sname (On-the-fly).
21
Summary
A query is evaluated by converting it to a tree of operators and evaluating the operators in the tree. There are alternative evaluation algorithms for each relational operator. Query evaluation must compare alternative plans based on their estimated costs. One must understand query optimization to understand the performance impact of a given database design on a query workload.
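As a closing illustration, the comparison step can be sketched as picking the minimum over the estimated costs worked out in this lecture. The dictionary keys and variable names are hypothetical labels; the numbers are built from the terms on the slides (with the clustered-index and 1.2 I/Os-per-probe assumptions for the indexed plan), and a real optimizer would also have to enumerate the plans and estimate these costs itself.

```python
# Estimated costs in page I/Os, assembled from the terms given on the slides.
plan_costs = {
    "simple nested loops, no pushdown": 500 + 500 * 1000,
    "pushdown + sort-merge join": (1000 + 10 + 500 + 250)
                                  + 2 * 2 * 10 + 2 * 4 * 250
                                  + (10 + 250),
    "pushdown + block nested loops": (1000 + 10 + 500 + 250) + (10 + 4 * 250),
    "index on bid + index nested loops": 10 + 1000 * 1.2,
}

# The optimizer's final step: choose the plan with the least estimated cost.
best_plan = min(plan_costs, key=plan_costs.get)

for name, cost in sorted(plan_costs.items(), key=lambda item: item[1]):
    print(f"{cost:>10.0f}  {name}")
```

The spread, from about half a million I/Os down to roughly a thousand, is the point of the lecture: the same query can differ in cost by orders of magnitude depending on the plan.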