CS4432: Database Systems II Query Processing - Part 1


1 CS4432: Database Systems II Query Processing - Part 1

2 Example
Data: relation R(A, B, C), relation S(C, D, E)
Query: SELECT B, D FROM R, S WHERE R.A = 'c' AND S.E = 2 AND R.C = S.C

3 Relational Algebra – Possible Query Plans
Plan 1: π_{B,D} [ σ_{R.A = 'c' ∧ S.E = 2 ∧ R.C = S.C} (R × S) ]
Query: SELECT B, D FROM R, S WHERE R.A = 'c' AND S.E = 2 AND R.C = S.C
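
Plan 1 can be sketched in a few lines of Python. This is only an illustration: the sample tuples and the function name `plan1` are hypothetical and do not come from the slides. The sketch builds the full cross product R × S first, then applies the selection, then projects onto B and D.

```python
from itertools import product

# Hypothetical sample data (not from the slides): R(A, B, C) and S(C, D, E).
R = [("c", 1, 10), ("a", 2, 20), ("c", 3, 30)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 5)]

def plan1(R, S):
    """Cross product, then selection, then projection onto (B, D)."""
    result = []
    for (a, b, r_c), (s_c, d, e) in product(R, S):   # R x S
        if a == "c" and e == 2 and r_c == s_c:        # selection
            result.append((b, d))                     # projection
    return result

print(plan1(R, S))
```

Every one of the |R| · |S| pairs is materialized before any filtering happens, which is why this plan serves as the naive baseline.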

4 Relational Algebra – Possible Query Plans
Plan 2: natural join of R and S (C is the common column)
Query: SELECT B, D FROM R, S WHERE R.A = 'c' AND S.E = 2 AND R.C = S.C

5 Plan 2
Query: SELECT B, D FROM R, S WHERE R.A = 'c' AND S.E = 2 AND R.C = S.C

6 Relational Algebra – Possible Query Plans
Plan 3: natural join, assuming indexes on R.A and S.C
(1) Use the R.A index to select R tuples with R.A = 'c'
(2) For each R.C value found, use the S.C index to find matching S tuples
(3) Eliminate S tuples with S.E ≠ 2
(4) Join matching R, S tuples, project the B, D attributes, and place in result
Query: SELECT B, D FROM R, S WHERE R.A = 'c' AND S.E = 2 AND R.C = S.C
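
The four steps of Plan 3 can be sketched as follows. This is a minimal illustration, assuming that plain Python dictionaries stand in for the indexes on R.A and S.C; the sample tuples and the helper names `build_index` and `plan3` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data (not from the slides): R(A, B, C) and S(C, D, E).
R = [("c", 1, 10), ("a", 2, 20), ("c", 3, 30)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 5)]

def build_index(relation, position):
    """Map each value at `position` to the list of tuples that carry it."""
    index = defaultdict(list)
    for t in relation:
        index[t[position]].append(t)
    return index

def plan3(R, S):
    r_index_on_a = build_index(R, 0)   # stands in for the index on R.A
    s_index_on_c = build_index(S, 0)   # stands in for the index on S.C
    result = []
    for (_, b, r_c) in r_index_on_a.get("c", []):      # (1) R.A = 'c'
        for (_, d, e) in s_index_on_c.get(r_c, []):    # (2) probe S.C index
            if e != 2:                                 # (3) drop S.E != 2
                continue
            result.append((b, d))                      # (4) join and project
    return result

print(plan3(R, S))
```

In a real system the indexes would already exist on disk; building them inside the sketch only keeps the example self-contained.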

7 Plan 3 (figure labels): R.A = 'c'; check E = 2?; output
Query: SELECT B, D FROM R, S WHERE R.A = 'c' AND S.E = 2 AND R.C = S.C

8 Query Compilation, Optimization and Execution

9 Overview of Query Execution: SQL Query → Compile → Optimize → Execute

10 Example
Query: Find the movies with stars born in 1960
SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthdate LIKE '%1960' );
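
To see what the query returns, it can be run against a small in-memory SQLite database using Python's standard `sqlite3` module. This is a sketch under assumptions: the two tables are reduced to only the columns the query touches, birthdates are stored as text ending in the year, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed minimal schemas: only the columns used by the query.
    CREATE TABLE StarsIn (title TEXT, starName TEXT);
    CREATE TABLE MovieStar (name TEXT, birthdate TEXT);
    -- Hypothetical rows for illustration only.
    INSERT INTO StarsIn VALUES ('Movie A', 'Star One'), ('Movie B', 'Star Two');
    INSERT INTO MovieStar VALUES ('Star One', '03/14/1960'),
                                 ('Star Two', '07/01/1975');
""")

rows = conn.execute("""
    SELECT title FROM StarsIn
    WHERE starName IN (
        SELECT name FROM MovieStar WHERE birthdate LIKE '%1960'
    )
""").fetchall()
conn.close()

print(rows)
```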

11 Step 1: Generate Parse Tree

12 Step 2: Relational Algebra & Logical Plan
SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthdate LIKE '%1960' );
Figures: Expression Tree, Logical Query Plan
An expression tree is a midway point between a parse tree and relational algebra.

13 Step 3: Optimize & Create Several Logical Plans
SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthdate LIKE '%1960' );
Figures: Plan 1, Plan 2
Question: Push the projection down to StarsIn?

14 Overview of Query Execution: SQL Query → Compile → Optimize → Execute

15 Step 4: Estimate the Sizes
This is done for each plan.

16 Step 5: Consider Physical Plans
A physical plan specifies how each operator will execute (which algorithm).
– E.g., a join can be nested-loop, hash-based, merge-based, or sort-based
Each logical plan maps to multiple physical plans.
Figures: Logical Plan, One Physical Plan
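
As a small illustration of one logical operator mapping to several physical algorithms, the sketch below implements the same join twice, once as a nested-loop join and once as a hash-based join. The function names and the sample tuples are hypothetical; both functions operate on in-memory lists and ignore block-level I/O.

```python
from collections import defaultdict

def nested_loop_join(R, S, r_key, s_key):
    """Compare every R tuple with every S tuple."""
    return [r + s for r in R for s in S if r[r_key] == s[s_key]]

def hash_join(R, S, r_key, s_key):
    """Build a hash table on S, then probe it once per R tuple."""
    table = defaultdict(list)
    for s in S:
        table[s[s_key]].append(s)
    return [r + s for r in R for s in table.get(r[r_key], [])]

# Hypothetical sample data: join R.C (position 2) with S.C (position 0).
R = [("c", 1, 10), ("a", 2, 20)]
S = [(10, "x", 2), (20, "y", 2), (10, "w", 7)]

nl_result = nested_loop_join(R, S, 2, 0)
hj_result = hash_join(R, S, 2, 0)
print(nl_result == hj_result, len(nl_result))
```

Both functions return the same tuples; they differ only in how much work they do, which is the difference the optimizer weighs when it compares physical plans.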

17 Overview of Query Execution: SQL Query → Compile → Optimize → Execute

18 Step 6: Estimate the Cost
This is done for each physical plan. Select the cheapest to execute.

19 Overview of Query Execution: SQL Query → Compile → Optimize → Execute

20 Evaluating Relational Operators

21 Common Techniques
Algorithms for evaluating relational operators use some simple ideas extensively:
– Indexing: Can use WHERE conditions to retrieve a small set of tuples (selections, joins).
– Iteration: Sometimes it is faster to scan all tuples even if there is an index. (And sometimes we can scan the data entries in an index instead of the table itself.)
– Partitioning: By using sorting or hashing, we can partition the input tuples and replace an expensive operation with similar operations on smaller inputs.
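
The partitioning idea can be sketched with hashing. In this illustration, both inputs are split into `k` buckets with the same hash function on the join key, so matching tuples always land in buckets with the same number and each pair of small buckets can be joined on its own. The function names, the choice of `k`, and the sample data are assumptions made for the example.

```python
def partition(relation, key_pos, k):
    """Split a relation into k buckets by hashing the key at key_pos."""
    buckets = [[] for _ in range(k)]
    for t in relation:
        buckets[hash(t[key_pos]) % k].append(t)
    return buckets

def partitioned_join(R, S, r_key, s_key, k=4):
    """Join bucket i of R only with bucket i of S."""
    out = []
    r_parts = partition(R, r_key, k)
    s_parts = partition(S, s_key, k)
    for r_part, s_part in zip(r_parts, s_parts):
        out.extend(r + s for r in r_part for s in s_part
                   if r[r_key] == s[s_key])
    return out

# Hypothetical sample data, joined on the first attribute of each relation.
R = [(1, "a"), (2, "b"), (5, "c")]
S = [(2, "x"), (5, "y"), (6, "z")]

joined = sorted(partitioned_join(R, S, 0, 0))
print(joined)
```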

22 Another Categorization
One-Pass Algorithms
– Need one pass over the input relation(s)
– Put limitations on the size of the inputs vs. memory
Two-Pass Algorithms
– Need two passes over the input relation(s)
– Put limitations on the size of the inputs vs. memory
Multi-Pass Algorithms
– Scale to any size and may need several passes over the input relation(s)

23 Common Statistics over Relation R
B(R): # of blocks to hold all R tuples
T(R): # of tuples in R
S(R): # of bytes in each of R’s tuples
V(R, A): # of distinct values in attribute R.A
M: # of memory buffers available
R is “clustered” → R’s tuples are packed into blocks → Accessing R requires B(R) I/Os
R is “not clustered” → R’s tuples are distributed over the blocks → Accessing R requires T(R) I/Os
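
The clustered versus not-clustered rule translates directly into a cost function. The sketch below is illustrative only: the `RelationStats` class, the `scan_cost` function, and the numbers used are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RelationStats:
    B: int            # B(R): number of blocks holding R's tuples
    T: int            # T(R): number of tuples in R
    clustered: bool   # True if R's tuples are packed into blocks

def scan_cost(stats):
    """I/Os needed to read all of R: B(R) if clustered, otherwise T(R)."""
    return stats.B if stats.clustered else stats.T

# Hypothetical numbers: 20,000 tuples stored in 1,000 blocks.
clustered_cost = scan_cost(RelationStats(B=1000, T=20000, clustered=True))
unclustered_cost = scan_cost(RelationStats(B=1000, T=20000, clustered=False))
print(clustered_cost, unclustered_cost)
```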

24 Example: Join (R, S) – One Pass, Iteration
Assume S is smaller than R.
Open(): read S into memory
GetNext():
    for b in blocks of R:
        for t in tuples of b:
            if t matches a tuple s of S (held in memory):
                return join(t, s)
    return NotFound
Close(): clean memory
Key Metrics:
– M >= B(S) + 1
I/O Cost:
– B(S) + B(R)
Notes:
– Can use prefetching for R
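
The Open/GetNext/Close interface above can be sketched as a small Python class. This is one possible reading of the slide, not a definitive implementation: relations are modeled as lists of blocks (each block a list of tuples), S is held in a dictionary keyed on the join attribute (the slide only says S is read into memory), `None` stands in for NotFound, and an `io` counter is added to check the B(S) + B(R) cost. The class name and sample data are hypothetical.

```python
class OnePassJoin:
    """One-pass join iterator: the smaller relation S must fit in memory."""

    def __init__(self, r_blocks, s_blocks, r_key, s_key):
        self.r_blocks, self.s_blocks = r_blocks, s_blocks
        self.r_key, self.s_key = r_key, s_key
        self.io = 0            # number of blocks read so far
        self.s_table = None
        self._matches = None

    def open(self):
        # Read all of S into an in-memory lookup table: B(S) block reads.
        self.s_table = {}
        for block in self.s_blocks:
            self.io += 1
            for s in block:
                self.s_table.setdefault(s[self.s_key], []).append(s)
        self._matches = self._scan_r()

    def _scan_r(self):
        # Stream R one block at a time: B(R) block reads in total.
        for block in self.r_blocks:
            self.io += 1
            for t in block:
                for s in self.s_table.get(t[self.r_key], []):
                    yield t + s

    def get_next(self):
        # Resume the scan where it stopped; None plays the role of NotFound.
        return next(self._matches, None)

    def close(self):
        self.s_table = None
        self._matches = None

# Hypothetical data: R occupies 2 blocks, S occupies 1 block.
r_blocks = [[(1, "a"), (2, "b")], [(3, "c")]]
s_blocks = [[(2, "x"), (3, "y")]]

join = OnePassJoin(r_blocks, s_blocks, r_key=0, s_key=0)
join.open()
results = []
while True:
    row = join.get_next()
    if row is None:
        break
    results.append(row)
io_cost = join.io
join.close()
print(results, io_cost)
```

A generator keeps the scan position between calls, so each call to `get_next()` continues from the last returned match instead of restarting the loop over R.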

