CPSC-608 Database Systems Fall 2018 Instructor: Jianer Chen Office: HRBB 315C Phone: 845-4259 Email: chen@cse.tamu.edu Notes #17
Query Optimization
Input: a database program P.
Prepare a collection C of efficient algorithms for operations in relational algebra; take care of issues in optimization and security.
Pipeline:
  1. parser → parse tree
  2. preprocessing (view processing, semantic checking) → parse tree
  3. parse tree-lqp convertor → logic query plan
  4. apply logic laws (push selections, group joins) → logic query plan
  5. reduce the size of intermediate results → logic query plan
  6. lqp-pqp convertor (choices of algorithms, data structures, and computational modes) → physical query plan
  7. machine executable code
Steps 4 and 5: optimization via logic and size. Step 6: optimization via algorithms and cost.
Improving logic plan via logic laws
Major Steps:
  1. Move selections σ so they can be applied early (σ reduces table size);
  2. Combine cross product × with selections σ to make natural joins and theta joins (⋈ has more efficient algorithms);
  3. Group commutative and associative binary operations (e.g., ∩, ∪, ⋈) for later optimization;
  4. May consider other operations (e.g., π, δ, τ, γ).
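Step 2 can be illustrated with a small sketch. It assumes relations are Python lists of dicts, and it uses a hash join as one example of a "more efficient algorithm" for an equality join; the slides do not name a specific algorithm, and the helper names `cross`, `select`, `hash_equijoin`, and `as_bag` are made up for this illustration.

```python
from collections import Counter

def cross(r_rows, s_rows):
    """Cross product: every row of R paired with every row of S."""
    return [{**r, **s} for r in r_rows for s in s_rows]

def select(pred, rows):
    """Selection: keep the rows that satisfy pred."""
    return [row for row in rows if pred(row)]

def hash_equijoin(r_rows, s_rows, r_key, s_key):
    """Equality join that never materializes the full cross product."""
    index = {}
    for s in s_rows:
        index.setdefault(s[s_key], []).append(s)
    return [{**r, **s} for r in r_rows for s in index.get(r[r_key], [])]

def as_bag(rows):
    """Order-insensitive view of a list of rows, for comparison."""
    return Counter(tuple(sorted(row.items())) for row in rows)

# Hypothetical sample data for R(a, b) and S(c, d).
R = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 10}]
S = [{"c": 10, "d": "x"}, {"c": 30, "d": "y"}, {"c": 10, "d": "z"}]

product_size = len(cross(R, S))                            # intermediate result
slow = select(lambda row: row["b"] == row["c"], cross(R, S))
fast = hash_equijoin(R, S, "b", "c")
same_answer = as_bag(slow) == as_bag(fast)
```

Both plans return the same rows, but the first one builds all |R|·|S| pairs before filtering, which is the intermediate result the rewrite is meant to avoid.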
Improving logic plan via logic laws
  σ_C(R ∪ S) = σ_C(R) ∪ σ_C(S)
  σ_C(R ∩ S) = σ_C(R) ∩ σ_C(S) = σ_C(R) ∩ S = R ∩ σ_C(S)
  σ_C(R − S) = σ_C(R) − σ_C(S) = σ_C(R) − S
  ×, ⋈, ⋈_D: σ_C can only be pushed to the arguments that have all attributes in C.
  σ_{C and D}(R) = σ_C(σ_D(R)) = σ_D(σ_C(R))
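The selection laws for union, intersection, and difference can be checked on a small example. This is only a sketch: it assumes bags are modeled as `collections.Counter` objects mapping tuples to multiplicities, and the sample relations and the condition `C` are invented for the illustration.

```python
from collections import Counter

def select(pred, bag):
    """sigma_C on a bag: keep each qualifying tuple with its multiplicity."""
    return Counter({t: n for t, n in bag.items() if pred(t)})

# Counter already provides the bag operations:
#   r + s  bag union (multiplicities add)
#   r & s  bag intersection (minimum multiplicity)
#   r - s  bag difference (non-positive counts are dropped)

# Hypothetical sample bags and selection condition.
R = Counter({(1, "x"): 2, (5, "y"): 1, (7, "z"): 3})
S = Counter({(5, "y"): 2, (7, "z"): 1, (9, "w"): 1})
C = lambda t: t[0] > 4

union_ok = select(C, R + S) == select(C, R) + select(C, S)
inter_ok = (select(C, R & S)
            == select(C, R) & select(C, S)
            == select(C, R) & S
            == R & select(C, S))
diff_ok = (select(C, R - S)
           == select(C, R) - select(C, S)
           == select(C, R) - S)
```

A single example does not prove the laws, but it shows which argument each pushed selection lands on.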
Question: Do we also push other unary operations such as π, δ, τ, γ?
Answer: Yes/maybe. They are in general less significant, and pushing them should be done carefully.
Improving logic plan via logic laws
Dealing with projections π
  In general reduce the size of tables, but not as significantly as selections.
  Extended projections may even increase table size.
  Apply with care:
    π_{a,d}(R(a,b) ⋈_{b>c} S(c,d))  =??  π_a(R(a,b)) ⋈_{b>c} π_d(S(c,d))
  The right-hand side projects away b and c, which the join condition b>c still needs.
General rule: a projection can be added anywhere as long as the eliminated attributes neither are used by any operation above the projection nor appear (explicitly/implicitly) in the final expression. The projection is "inserted", not "moved".
Example: insert π_L directly above R(A) in a plan whose operators, from the root down to R(A), are ×, ∩, ×, σ_C, ⋈. The eliminated attributes are A−L.
  × (root): final result, all attributes are used
  ∩: all attributes are used
  ×: no restrictions
  σ_C: the attributes in C should not be in A−L
  ⋈: the shared attributes should not be in A−L
  π_{L,L'}: the inserted projection; also include those later-used attributes L'
  R(A)
π_L(σ_C(R)) = π_L(σ_C(π_{L,L(C)}(R))), where L(C) = attributes in C but not in L.
  Plan before (bottom up): R, σ_C, π_L.
  Plan after (bottom up): R, π_{L,L(C)}, σ_C, π_L.
The lower projection is "inserted" instead of "moved".
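The identity for πL over σC can be tried out on a toy relation. The sketch below assumes rows are dicts and projection is a bag projection (duplicates kept); the relation, the attribute names, and the condition are hypothetical.

```python
from collections import Counter

def project(attrs, rows):
    """Bag projection: keep only attrs, keep duplicate rows."""
    return [{a: row[a] for a in attrs} for row in rows]

def select(pred, rows):
    return [row for row in rows if pred(row)]

def as_bag(rows):
    """Order-insensitive view of a list of rows, for comparison."""
    return Counter(tuple(sorted(row.items())) for row in rows)

# Hypothetical relation R(a, b, c).
R = [
    {"a": 1, "b": 10, "c": "p"},
    {"a": 2, "b": 3, "c": "q"},
    {"a": 1, "b": 7, "c": "r"},
    {"a": 4, "b": 1, "c": "s"},
]
L = ["a"]
C = lambda row: row["b"] > 5   # C uses b, which is not in L
L_C = ["b"]                    # attributes in C but not in L

original = project(L, select(C, R))
rewritten = project(L, select(C, project(L + L_C, R)))

# Projecting to L alone before the selection removes b, so C cannot run.
try:
    select(C, project(L, R))
    narrow_projection_works = True
except KeyError:
    narrow_projection_works = False
```

The inserted projection drops only c, the one attribute that neither the selection nor the final result needs.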
Example: π_{A,D}(σ_C(R(A,A1,B) ⋈ S(B,D,D1)))
Plan, from the root down: π_{A,D}, σ_C, ⋈, with leaves R(A,A1,B) and S(B,D,D1).
Try inserting π_A above R and π_D above S, and check every operator above them:
  π_{A,D}: final result is fine
  σ_C: C may use B, A1, D1
  ⋈: B is used
So π_A and π_D alone eliminate attributes that are still used above them.
π_{A,D}(σ_C(R(A,A1,B) ⋈ S(B,D,D1))) = π_{A,D}(σ_C(π_{A,A'}(R(A,A1,B)) ⋈ π_{D,D'}(S(B,D,D1))))
  A' = B plus those in A1 that are used in C
  D' = B plus those in D1 that are used in C
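A minimal sketch of this rewrite follows. It assumes rows are dicts, the join is a nested-loop natural join, and the slide's attribute sets A1 and D1 are played by the hypothetical columns {A1, A2} and {D1, D2}; the condition C (A1 < D1) is also invented for the illustration.

```python
from collections import Counter

def project(attrs, rows):
    """Bag projection: keep only attrs, keep duplicate rows."""
    return [{a: row[a] for a in attrs} for row in rows]

def select(pred, rows):
    return [row for row in rows if pred(row)]

def natural_join(r_rows, s_rows):
    """Pair rows that agree on every attribute the two sides share."""
    out = []
    for r in r_rows:
        for s in s_rows:
            if all(r[k] == s[k] for k in r.keys() & s.keys()):
                out.append({**r, **s})
    return out

def as_bag(rows):
    return Counter(tuple(sorted(row.items())) for row in rows)

R = [
    {"A": 1, "A1": 5, "A2": "u", "B": 100},
    {"A": 2, "A1": 9, "A2": "v", "B": 100},
    {"A": 3, "A1": 1, "A2": "w", "B": 200},
]
S = [
    {"B": 100, "D": "x", "D1": 7, "D2": 0.5},
    {"B": 200, "D": "y", "D1": 0, "D2": 1.5},
    {"B": 100, "D": "z", "D1": 10, "D2": 2.5},
]
C = lambda row: row["A1"] < row["D1"]

original = project(["A", "D"], select(C, natural_join(R, S)))

# A' = B plus A1 (used in C); D' = B plus D1 (used in C).
left = project(["A", "A1", "B"], R)
right = project(["D", "D1", "B"], S)
rewritten = project(["A", "D"], select(C, natural_join(left, right)))
```

Only A2 and D2 are dropped early; B survives for the join, and A1 and D1 survive for the selection.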
Improving logic plan via logic laws
Dealing with duplicate elimination δ
  1. δ can be pushed in join operations:
     δ(R × S) = δ(R) × δ(S), δ(R ⋈ S) = δ(R) ⋈ δ(S), δ(R ⋈_C S) = δ(R) ⋈_C δ(S).
  2. δ can swap with selections: δ(σ_C(R)) = σ_C(δ(R)),
     but not with projections: in general δ(π_L(R)) ≠ π_L(δ(R)).
  3. δ can be pushed in set operations and bag-intersection:
     δ(R ∪_S S) = δ(R) ∪_S δ(S), δ(R ∩_S S) = δ(R) ∩_S δ(S), δ(R −_S S) = δ(R) −_S δ(S), δ(R ∩_B S) = δ(R) ∩_B δ(S).
  4. δ cannot be pushed in other bag operations: in general
     δ(R ∪_B S) ≠ δ(R) ∪_B δ(S), δ(R −_B S) ≠ δ(R) −_B δ(S).
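The δ rules above, including the cases where pushing fails, can be checked with small bags. As a sketch, bags are again modeled as `collections.Counter` objects; the helper names and the sample data are assumptions made for this illustration.

```python
from collections import Counter

def delta(bag):
    """Duplicate elimination: every tuple present gets multiplicity 1."""
    return Counter({t: 1 for t, n in bag.items() if n > 0})

def cross(r, s):
    """Bag cross product: concatenate tuples, multiply multiplicities."""
    return Counter({tr + ts: m * n for tr, m in r.items() for ts, n in s.items()})

def select(pred, bag):
    return Counter({t: n for t, n in bag.items() if pred(t)})

def project_first(bag):
    """Bag projection onto the first attribute (duplicates are kept)."""
    out = Counter()
    for t, n in bag.items():
        out[t[:1]] += n
    return out

R = Counter({(1,): 2, (2,): 1})
S = Counter({(1,): 1, (3,): 2})
T = Counter({(1, "a"): 1, (1, "b"): 1})   # distinct tuples, same first attribute
small = lambda t: t[0] < 2

cross_ok = delta(cross(R, S)) == cross(delta(R), delta(S))
select_ok = delta(select(small, R)) == select(small, delta(R))
bag_intersect_ok = delta(R & S) == delta(R) & delta(S)

# Counterexamples: here pushing delta changes the answer.
bag_union_differs = delta(R + S) != delta(R) + delta(S)
bag_diff_differs = delta(R - S) != delta(R) - delta(S)
projection_differs = delta(project_first(T)) != project_first(delta(T))
```

In the bag-union case the tuple (1,) ends up with multiplicity 2 on the right-hand side, and in the bag-difference case it disappears on the right although it survives on the left.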
Improving logic plan via logic laws
Dealing with grouping and sorting γ, τ
More case dependent. Also need to consider earlier/later operations.
Grouping γ_{L,A}: group the tuples by the values of the attributes in L and, for each group, compute the aggregations in A.
  Grouping reduces table size, but could be expensive if the table is unorganized.
  δ(γ_{L,A}(R)) = γ_{L,A}(R)
  If the attributes used in L and A are entirely contained in M, γ_{L,A}(R) = γ_{L,A}(π_M(R))
Sorting τ:
  Sorting leaves the table size unchanged, and is expensive.
  Can be extremely helpful for later computation.
  Given for free if an index such as a B+ tree is used.
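The two grouping laws can be illustrated as follows. This sketch assumes a single grouping attribute and a SUM aggregate; the function `group_sum` and the sample table are hypothetical. It also shows that the projection π_M has to be a bag projection: removing duplicates before a SUM changes the result.

```python
from collections import defaultdict

def group_sum(rows, group_attr, sum_attr):
    """gamma with one grouping attribute and SUM: one output row per group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_attr]] += row[sum_attr]
    return sorted(totals.items())

def project(attrs, rows):
    """Bag projection: keep only attrs, keep duplicate rows."""
    return [{a: row[a] for a in attrs} for row in rows]

R = [
    {"dept": "db", "pay": 10, "name": "ann"},
    {"dept": "db", "pay": 10, "name": "bob"},
    {"dept": "os", "pay": 7, "name": "cy"},
]
M = ["dept", "pay"]   # contains every attribute that gamma uses

direct = group_sum(R, "dept", "pay")
via_projection = group_sum(project(M, R), "dept", "pay")

# The grouped result has one row per group, so delta would change nothing.
already_duplicate_free = len(direct) == len(set(direct))

# A set-style projection merges the two identical (db, 10) rows and breaks SUM.
deduplicated = [dict(t) for t in {tuple(row.items()) for row in project(M, R)}]
after_dedup = group_sum(deduplicated, "dept", "pay")
```

Here `direct` and `via_projection` agree, while `after_dedup` reports a smaller total for the db group.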
Improving logic plan via logic laws
General Remarks on LQP Optimization
No transformation is always good, except that pushing selections down is usually good.