CPSC-608 Database Systems, Fall 2010. Instructor: Jianer Chen. Office: HRBB 315C. Phone: 845-4259. Notes #9.



LQP Optimization with Size

Two techniques:

Estimating sizes of intermediate relations
  For natural join: T(R(X, y) ⋈ S(y, Z)) = T(R)T(S)/max{V(R, y), V(S, y)}

Considering different orders of an operation
  ((R ⋈ S) ⋈ T) ⋈ U = (R ⋈ U) ⋈ (S ⋈ T)

Consider:
  A(a, b): T(A) = 1000, V(A, a) = 100, V(A, b) = 200
  B(b, c): T(B) = 1000, V(B, b) = 100, V(B, c) = 500
  C(c, d): T(C) = 1000, V(C, c) = 20, V(C, d) = 1000
  D(d, a): T(D) = 1000, V(D, d) = 1000, V(D, a) = 50
We want a good LQP for A ⋈ B ⋈ C ⋈ D.
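As a quick check of the size estimate, the following Python sketch applies T(R)T(S)/max{V(R, y), V(S, y)} to the pairs of relations above that share exactly one attribute. The function name `estimate_join_size` and its parameter names are illustrative and do not come from the notes.

```python
def estimate_join_size(t_r, t_s, v_r_y, v_s_y):
    """Estimate T(R join S) for a natural join on one shared attribute y.

    t_r, t_s     : tuple counts T(R) and T(S)
    v_r_y, v_s_y : distinct-value counts V(R, y) and V(S, y)
    """
    return t_r * t_s / max(v_r_y, v_s_y)


# A(a, b) and B(b, c) share b: V(A, b) = 200, V(B, b) = 100.
size_ab = estimate_join_size(1000, 1000, 200, 100)
# B(b, c) and C(c, d) share c: V(B, c) = 500, V(C, c) = 20.
size_bc = estimate_join_size(1000, 1000, 500, 20)
# C(c, d) and D(d, a) share d: V(C, d) = 1000, V(D, d) = 1000.
size_cd = estimate_join_size(1000, 1000, 1000, 1000)

print(size_ab, size_bc, size_cd)
```

The three estimates agree with the sizes 5000, 2000 and 1000 that the notes use for A ⋈ B, B ⋈ C and C ⋈ D.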

Left-deep join tree

[Figure: a left-deep join tree whose four leaves are still to be assigned.]

Left-deep join tree (all 4! = 24 permutations)
  AB C D   BA C D   AB D C   BA D C
  CA B D   CA D B   BC A D   CB A D
  DA B C   DA C B   DB A C   DB C A
  DC A B   DC B A   CB D A   CD A B
  CD B A   BC D A   BD A C   BD C A
  AC B D   AC D B   AD B C   AD C B

Left-deep join tree (all 4!/2 = 12 permutations)
  Because the join is commutative, exchanging the two relations at the bottom of the tree (for example AB C D and BA C D) gives the same plan, so only 4!/2 = 12 of the 24 orders need to be evaluated.

Left-deep join tree: example

Two left-deep trees are compared, with leaves in the orders A, B, C, D and B, D, A, C, using the statistics for A, B, C, D and the estimate T(R(X, y) ⋈ S(y, Z)) = T(R)T(S)/max{V(R, y), V(S, y)}.

Order A, B, C, D:
  T(A ⋈ B) = 1000 · 1000 / max{200, 100} = 5000, with V(*, c) = 500
  T((A ⋈ B) ⋈ C) = 5000 · 1000 / max{500, 20} = 10000
  cost = 5000 + 10000 = 15000 (the sum of the sizes of the intermediate relations)

Order B, D, A, C:
  V(*, a) = 50


Left-deep tree: general algorithm

Input: n relations R1, R2, …, Rn
Output: the best left-deep join of R1, R2, …, Rn

1. Construct a left-deep tree T with n leaves;
2. For each permutation P of the n relations R1, R2, …, Rn do:
     assign the n relations to the leaves of T in the order of P;
     evaluate the cost of the plan;
3. Pick the plan with the permutation that gives the minimum cost.
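The enumeration above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the notes' own implementation: the cost of a plan is taken to be the sum of the estimated sizes of its intermediate relations (the final result is excluded, since it is the same for every plan); a shared attribute y keeps min{V(R, y), V(S, y)} distinct values after the join; and two relations with no shared attribute are joined as a cross product. The names `STATS`, `join`, `left_deep_cost` and `best_left_deep` are hypothetical.

```python
from itertools import permutations

# Statistics of the running example: name -> (T, {attribute: V}).
STATS = {
    "A": (1000, {"a": 100, "b": 200}),
    "B": (1000, {"b": 100, "c": 500}),
    "C": (1000, {"c": 20, "d": 1000}),
    "D": (1000, {"d": 1000, "a": 50}),
}


def join(left, right):
    """Estimate (T, V) for the natural join of two (T, V) descriptions."""
    t_left, v_left = left
    t_right, v_right = right
    size = t_left * t_right
    v_out = {**v_left, **v_right}
    for attr in v_left.keys() & v_right.keys():
        # One division per shared attribute, as in the size estimate.
        size /= max(v_left[attr], v_right[attr])
        # Assumption: the smaller value set survives the join.
        v_out[attr] = min(v_left[attr], v_right[attr])
    return size, v_out


def left_deep_cost(order):
    """Sum of intermediate-result sizes for a left-deep tree with this leaf order."""
    current = STATS[order[0]]
    cost = 0
    for name in order[1:-1]:  # the last join yields the final result, not counted
        current = join(current, STATS[name])
        cost += current[0]
    return cost


def best_left_deep():
    """Steps 2 and 3: try every permutation and keep the cheapest one."""
    return min(permutations(STATS), key=left_deep_cost)


cost_abcd = left_deep_cost(("A", "B", "C", "D"))
best_order = best_left_deep()
best_cost = left_deep_cost(best_order)
print(cost_abcd, best_order, best_cost)
```

Under these assumptions the order A, B, C, D reproduces the cost of 15000 from the example. Trying all n! permutations is only practical for small n, which motivates the dynamic programming approach.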

Dynamic Programming

Consider all tree structures.
Again consider A ⋈ B ⋈ C ⋈ D.

[Figure: the five tree structures (a)-(e) with four leaves.]

Each of (a)-(d) has 12 different assignments, and (e) has 3 different assignments.
So in total there are 4 × 12 + 3 = 51 different ways to join the 4 relations.
The number of plans becomes too large to enumerate when the number of relations is relatively large.

Dynamic Programming

Consider the join trees in which D is joined last:

[Figure: four join trees over A, B, C, D.]

We really only need to find the best way to join A ⋈ B ⋈ C, then join D with this best join.
How do we find the best join of A ⋈ B ⋈ C?
We consider all possible ways: (A ⋈ B) ⋈ C, (A ⋈ C) ⋈ B, (B ⋈ C) ⋈ A.

Dynamic programming: general algorithm

Input: n relations R1, R2, …, Rn
Output: the best join of R1, R2, …, Rn

1. FOR each Ri DO {cost(Ri) = 0; size(Ri) = 0};
2. FOR each pair of Ri and Rj DO {cost(Ri, Rj) = 0; compute size(Ri ⋈ Rj)};
3. FOR k = 3 TO n DO
     FOR any k relations S1, S2, …, Sk of R1, R2, …, Rn DO
       FOR each partition P = {(S_i1, …, S_ij), (S_i(j+1), …, S_ik)} of S1, S2, …, Sk DO
         cost(P) = cost(S_i1, …, S_ij) + size(S_i1 ⋈ … ⋈ S_ij) + cost(S_i(j+1), …, S_ik) + size(S_i(j+1) ⋈ … ⋈ S_ik);
       let cost(S1, S2, …, Sk) be the smallest cost(P) among the above partitions;
       compute size(S1 ⋈ S2 ⋈ … ⋈ Sk) (and remember this partition P);
4. Return cost(R1, R2, …, Rn).
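The algorithm above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the notes' own code: a base relation contributes 0 as an intermediate size (matching size(Ri) = 0 in step 1) while its tuple count T is still used for estimation; a shared attribute keeps min{V(R, y), V(S, y)} distinct values after a join; and relations without a shared attribute are joined as a cross product. The names `STATS`, `join` and `best_join` are hypothetical.

```python
from itertools import combinations

# Statistics of the running example: name -> (T, {attribute: V}).
STATS = {
    "A": (1000, {"a": 100, "b": 200}),
    "B": (1000, {"b": 100, "c": 500}),
    "C": (1000, {"c": 20, "d": 1000}),
    "D": (1000, {"d": 1000, "a": 50}),
}


def join(left, right):
    """Estimate (T, V) for the natural join of two (T, V) descriptions."""
    t_left, v_left = left
    t_right, v_right = right
    size = t_left * t_right
    v_out = {**v_left, **v_right}
    for attr in v_left.keys() & v_right.keys():
        size /= max(v_left[attr], v_right[attr])
        v_out[attr] = min(v_left[attr], v_right[attr])  # assumption
    return size, v_out


def best_join(stats):
    """Fill a table: subset of relations -> cheapest cost, estimate and plan."""
    names = sorted(stats)
    table = {
        frozenset([n]): {"cost": 0, "stats": stats[n], "plan": n} for n in names
    }

    def produced(subset):
        # Base relations are not intermediate results, so they count as 0.
        return 0 if len(subset) == 1 else table[subset]["stats"][0]

    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            subset = frozenset(combo)
            best = None
            # Keep combo[0] on the left so each two-way partition is tried once.
            for r in range(k - 1):
                for extra in combinations(combo[1:], r):
                    left = frozenset((combo[0],) + extra)
                    right = subset - left
                    cost = (table[left]["cost"] + produced(left)
                            + table[right]["cost"] + produced(right))
                    if best is None or cost < best["cost"]:
                        best = {
                            "cost": cost,
                            "stats": join(table[left]["stats"],
                                          table[right]["stats"]),
                            "plan": (table[left]["plan"], table[right]["plan"]),
                        }
            table[subset] = best
    return table


table = best_join(STATS)
full = table[frozenset("ABCD")]
print(full["cost"], full["plan"])
```

Each subset of relations is solved once and reused by every larger subset that contains it, which is what avoids re-evaluating the same subplan in many different trees.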

Dynamic Programming: Example

Using the statistics for A, B, C, D and the estimate T(R(X, y) ⋈ S(y, Z)) = T(R)T(S)/max{V(R, y), V(S, y)}:

Single relations:
  A: cost = 0, size = 0
  B: cost = 0, size = 0
  C: cost = 0, size = 0
  D: cost = 0, size = 0

Pairs:
  A, B: cost = 0, size = 5000
  B, C: cost = 0, size = 2000
  C, D: cost = 0, size = 1000
  A, C; A, D; B, D: cost = 0

Triples:
  A, B, C: cost = 2000
  B, C, D: cost = 1000, size = 2000
  A, C, D: cost = 1000
  A, B, D: cost = 5000
  For B, C, D the candidates are (B ⋈ C) ⋈ D, (B ⋈ D) ⋈ C and (C ⋈ D) ⋈ B. The first costs size(B ⋈ C) = 2000; the minimum, size(C ⋈ D) = 1000, is achieved by (C ⋈ D) ⋈ B.

All four relations:
  Candidate partitions: A | {B,C,D}, B | {A,C,D}, C | {A,B,D}, D | {A,B,C}, {A,B} | {C,D}, {A,C} | {B,D}, {A,D} | {B,C}
  A | {B,C,D} costs cost(B, C, D) + size(B ⋈ C ⋈ D) = 1000 + 2000 = 3000.
  A, B, C, D: cost = 3000
  Best plan: join A with the best join of B, C, D.

LQP Optimization with Size: Summary

Estimating sizes of intermediate relations
Considering different orders of an operation
  left-deep tree
  dynamic programming

Construction of Physical Query Plan

Input: an optimized LQP T, and a main memory constraint M

1. Replace each leaf R of T by “scan(R)”;
2. Combine the scans with other operations;
3. Replace each internal node v of T by a proper algorithm;
4. For each edge e in T, decide if e should be “materialized”;
5. Cut all materialized edges;
6. Each subtree is a call to the subroutine at the root of the subtree. The order of the calls follows the bottom-up order in the structure.

[Figure: an LQP tree with operators ×, ∩, π, σ over the leaves A to G. The leaves become scan(A) to scan(G), one scan is combined into an index-scan, the internal nodes are labelled J2P, J1P, CJ and I1P, and the subtrees left after cutting are numbered 1, 2, 3.]

This produces executable code for the input DB program.

Physical Query Plan: Summary

Replacing internal nodes of an LQP by proper algorithms;
Deciding if a subroutine call should be pipelined or materialized;
Many optimization techniques are involved here;
In practice, heuristic optimization techniques are used to construct good physical query plans;
The resulting physical query plan is executable code.
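The difference between a pipelined and a materialized edge can be illustrated with Python generators. This is only a toy sketch under assumptions: the operator names `scan`, `select`, `project` and `materialize`, the tuple layout, and the sample relation are all hypothetical, and a real DBMS would materialize to disk rather than to an in-memory list.

```python
def scan(relation):
    """Leaf operator: hand out the stored tuples one at a time."""
    for row in relation:
        yield row


def select(predicate, child):
    """Selection: pass on only the tuples that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def project(columns, child):
    """Projection: keep only the listed column positions of each tuple."""
    for row in child:
        yield tuple(row[c] for c in columns)


def materialize(child):
    """Materialized edge: compute and store the whole intermediate result."""
    return list(child)


R = [(1, "x"), (2, "y"), (3, "z")]  # hypothetical relation


def keep(row):
    return row[0] > 1


# Pipelined: every tuple flows through select and project before the
# next tuple is read, so no intermediate relation is ever stored.
pipelined = list(project([1], select(keep, scan(R))))

# Materialized: the selection result is stored first; the edge above it is
# "cut", and the projection runs as a separate call over the stored result.
temp = materialize(select(keep, scan(R)))
materialized = list(project([1], scan(temp)))

print(pipelined, materialized)
```

Both variants return the same tuples; they differ in how much memory the intermediate result occupies and in when the work is done, which is the trade-off decided per edge under the memory constraint M.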

[Figure: DBMS architecture. Labels: database administrator, DDL language, database programmer, DML (query) language, DBMS, DDL compiler, DML compiler, query execution engine, index/file manager, buffer manager, main memory buffers, file manager, transaction manager, concurrency control, lock table, logging & recovery, secondary storage (disks) in tables (relations), graduate database.]