Lecture 26 Monday, December 3, 2001.

Slides:



Advertisements
Similar presentations
Query Optimization Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) Imperative query execution plan: SELECT S.sname FROM Reserves.
Advertisements

CS4432: Database Systems II
Databases and Information Systems 1 Prof. Dr. Stefan Böttcher Fakultät EIM, Institut für Informatik Universität Paderborn WS 2009 / 2010 Contents: selectivity.
CS CS4432: Database Systems II Logical Plan Rewriting.
Query Optimization May 31st, Today A few last transformations Size estimation Join ordering Summary of optimization.
Query Optimization CS634 Lecture 12, Mar 12, 2014 Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Query Optimization Goal: Declarative SQL query
Query Optimization. Query Optimization Process (simplified a bit) Parse the SQL query into a logical tree: –identify distinct blocks (corresponding to.
Lecture 14: Query Optimization. This Lecture Query rewriting Cost estimation –We have learned how atomic operations are implemented and their cost –We’ll.
Cost-Based Transformations. Why estimate costs? Well, sometimes we don’t need cost estimations to decide applying some heuristic transformation. –E.g.
Query Rewrite: Predicate Pushdown (through grouping) Select bid, Max(age) From Reserves R, Sailors S Where R.sid=S.sid GroupBy bid Having Max(age) > 40.
Query Optimization: Transformations May 29 th, 2002.
THE QUERY COMPILER 16.6 CHOOSING AN ORDER FOR JOINS By: Nitin Mathur Id: 110 CS: 257 Sec-1.
1 Anna Östlin Pagh and Rasmus Pagh IT University of Copenhagen Advanced Database Technology March 25, 2004 QUERY COMPILATION II Lecture based on [GUW,
Cost-Based Transformations. Why estimate costs? Sometimes we don’t need cost estimations to decide applying some heuristic transformation. –E.g. Pushing.
Query Optimization. General Overview Relational model - SQL  Formal & commercial query languages Functional Dependencies Normalization Physical Design.
CS 4432logical query rewriting - lecture 151 CS4432: Database Systems II Lecture #15 Logical Query Rewriting Professor Elke A. Rundensteiner.
16.5 Introduction to Cost- based plan selection Amith KC Student Id: 109.
CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.
CS411 Database Systems Kazuhiro Minami 12: Query Optimization.
Query Optimization Arash Izadpanah. Introduction: What is Query Optimization? Query optimization is the process of selecting the most efficient query-evaluation.
CPS216: Advanced Database Systems Notes 08:Query Optimization (Plan Space, Query Rewrites) Shivnath Babu.
Query Optimization March 6 th, Query Optimization Process (simplified a bit) Parse the SQL query into a logical tree: –identify distinct blocks.
Query Optimization Imperative query execution plan: Declarative SQL query Ideally: Want to find best plan. Practically: Avoid worst plans! Goal: Purchase.
Query Optimization March 10 th, Very Big Picture A query execution plan is a program. There are many of them. The optimizer is trying to chose a.
CPS216: Data-Intensive Computing Systems Introduction to Query Processing Shivnath Babu.
1 Lecture 23: Query Execution Wednesday, March 8, 2006.
1 Lecture 25 Friday, November 30, Outline Query execution –Two pass algorithms based on indexes (6.7) Query optimization –From SQL to logical.
CS411 Database Systems Kazuhiro Minami 12: Query Optimization.
1 Lecture 25: Query Optimization Wednesday, November 26, 2003.
CS 440 Database Management Systems Query Optimization 1.
Query Optimization Problem Pick the best plan from the space of physical plans.
CSE 544: Lecture 14 Wednesday, 5/15/2002 Optimization, Size Estimation.
CS4432: Database Systems II Query Processing- Part 1 1.
Tallahassee, Florida, 2016 COP5725 Advanced Database Systems Query Optimization Spring 2016.
Database Applications (15-415) DBMS Internals- Part IX Lecture 20, March 31, 2016 Mohammad Hammoud.
Query Processing and Query Optimization Database System Implementation CSE 507 Slides adapted from Silberschatz, Korth and Sudarshan Database System Concepts.
DBMS Internals Execution and Optimization May 10th, 2004.
1 Ullman et al. : Database System Principles Notes 6: Query Processing.
Chapter 14: Query Optimization
CS 440 Database Management Systems
Query Optimization Heuristic Optimization
Database System Implementation CSE 507
Chapter 13: Query Optimization
CS222P: Principles of Data Management Lecture #15 Query Optimization (System-R) Instructor: Chen Li.
Lecture 24: Query Execution and Optimization
Data Engineering Query Optimization (Cost-based optimization)
Indexing and Execution
Overview of Query Optimization
Introduction to Database Systems CSE 444 Lecture 22: Query Optimization November 26-30, 2007.
Lecture 26: Query Optimization
Focus: Relational System
Database Applications (15-415) DBMS Internals- Part IX Lecture 21, April 1, 2018 Mohammad Hammoud.
Query Optimization and Perspectives
Algebraic Laws.
Lecture 25: Query Execution
Lecture 24: Query Execution
Lecture 25: Query Optimization
CPSC-608 Database Systems
Monday, 5/13/2002 Hash table indexes, query optimization
Query Optimization March 7th, 2003.
CPS216: Data-Intensive Computing Systems Query Processing (contd.)
Query Optimization May 16th, 2002
CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.
CS222: Principles of Data Management Lecture #15 Query Optimization (System-R) Instructor: Chen Li.
Lecture 23: Monday, November 25, 2002.
CSE 544: Optimizations Wednesday, 5/10/2006.
Lecture 27 Wednesday, December 5, 2001.
Lecture 24: Wednesday, November 27, 2002.
Presentation transcript:

Lecture 26 Monday, December 3, 2001

Outline Query optimization Algebraic laws Heuristic-based optimizations Choosing an order for Joins (7.6)

Algebraic Laws Commutative and Associative Laws Distributive Laws R U S = S U R, R U (S U T) = (R U S) U T R ∩ S = S ∩ R, R ∩ (S ∩ T) = (R ∩ S) ∩ T R S = S R, R (S T) = (R S) T Distributive Laws R (S U T) = (R S) U (R T)

Algebraic Laws Laws involving selection: s C AND C’(R) = s C(s C’(R)) = s C(R) ∩ s C’(R) s C OR C’(R) = s C(R) U s C’(R) s C (R S) = s C (R) S When C involves only attributes of R s C (R – S) = s C (R) – S s C (R U S) = s C (R) U s C (S) s C (R ∩ S) = s C (R) ∩ S

Algebraic Laws Example: R(A, B, C, D), S(E, F, G) s F=3 (R S) = ? s A=5 AND G=9 (R S) = ? D=E D=E

Algebraic Laws Laws involving projections PM(R S) = PN(PP(R) PQ(S)) Where N, P, Q are appropriate subsets of attributes of M PM(PN(R)) = PM,N(R) Example R(A,B,C,D), S(E, F, G) PA,B,G(R S) = P ? (P?(R) P?(S)) D=E D=E

Heuristic Based Optimizations Query rewriting based on algebraic laws Result in better queries most of the time Heuristics number 1: Push selections down Heuristics number 2: Sometimes push selections up, then down

Predicate Pushdown pname pname s price>100 AND city=“Seattle” maker=name maker=name price>100 city=“Seattle” Product Company Product Company The earlier we process selections, less tuples we need to manipulate higher up in the tree (but may cause us to loose an important ordering of the tuples, if we use indexes).

Predicate Pushdown Select y.name, Max(x.price) From product x, company y Where x.maker = y.name GroupBy y.name Having Max(x.price) > 100 Select y.name, Max(x.price) From product x, company y Where x.maker=y.name and x.price > 100 GroupBy y.name Having Max(x.price) > 100 For each company, find the maximal price of its products. Advantage: the size of the join will be smaller. Requires transformation rules specific to the grouping/aggregation operators. Won’t work if we replace Max by Min.

Pushing predicates up Bargain view V1: categories with some price<20, and the cheapest price Select V2.name, V2.price From V1, V2 Where V1.category = V2.category and V1.p = V2.price Create View V1 AS Select x.category, Min(x.price) AS p From product x Where x.price < 20 GroupBy x.category Create View V2 AS Select y.name, x.category, x.price From product x, company y Where x.maker=y.name

Query Rewrite: Pushing predicates up Bargain view V1: categories with some price<20, and the cheapest price Select V2.name, V2.price From V1, V2 Where V1.category = V2.category and V1.p = V2.price AND V1.p < 20 Create View V1 AS Select x.category, Min(x.price) AS p From product x Where x.price < 20 GroupBy x.category Create View V2 AS Select y.name, x.category, x.price From product x, company y Where x.maker=y.name

Query Rewrite: Pushing predicates up Bargain view V1: categories with some price<20, and the cheapest price Select V2.name, V2.price From V1, V2 Where V1.category = V2.category and V1.p = V2.price AND V1.p < 20 Create View V1 AS Select x.category, Min(x.price) AS p From product x Where x.price < 20 GroupBy x.category Create View V2 AS Select y.name, x.category, x.price From product x, company y Where x.maker=y.name AND V1.p < 20

Cost-based Optimizations Unit of Optimization Select-project-join Push selections down, pull projections up Hence: remains to choose the join order This is the main focus of the cost-based optimizer

Join Trees R1 R2 …. Rn Join tree: A join tree represents a plan. An optimizer needs to inspect many (all ?) join trees R3 R1 R2 R4

Types of Join Trees Left deep: R4 R2 R5 R3 R1

Types of Join Trees Bushy: R3 R2 R4 R1 R5

Types of Join Trees Right deep: R3 R1 R5 R2 R4

Problem Given: a query R1 R2 … Rn Assume we have a function cost() that gives us the cost of every join tree Find the best join tree for the query

Dynamic Programming Idea: for each subset of {R1, …, Rn}, compute the best plan for that subset In increasing order of set cardinality: Step 1: for {R1}, {R2}, …, {Rn} Step 2: for {R1,R2}, {R1,R3}, …, {Rn-1, Rn} … Step n: for {R1, …, Rn} A subset of {R1, …, Rn} is also called a subquery

Dynamic Programming For each subquery Q ⊆ {R1, …, Rn} compute the following: Size(Q) A best plan for Q: Plan(Q) The cost of that plan: Cost(Q)

Dynamic Programming Step 1: For each {Ri} do: Size({Ri}) = B(Ri) Plan({Ri}) = Ri Cost({Ri}) = (cost of scanning Ri)

Dynamic Programming Step i: For each Q ⊆ {R1, …, Rn} of cardinality i do: Compute Size(Q) (later…) For every pair of subqueries Q’, Q’’ s.t. Q = Q’ U Q’’ compute cost(Plan(Q’) Plan(Q’’)) Cost(Q) = the smallest such cost Plan(Q) = the corresponding plan

Dynamic Programming Return Plan({R1, …, Rn})

Dynamic Programming To illustrate, we will make the following simplifications: Cost(P1 P2) = Cost(P1) + Cost(P2) + size(intermediate result(s)) Intermediate results: If P1 = a join, then the size of the intermediate result is size(P1), otherwise the size is 0 Similarly for P2 Cost of a scan = 0

Dynamic Programming Example: Cost(R5 R7) = 0 (no intermediate results) Cost((R2 R1) R7) = Cost(R2 R1) + Cost(R7) + size(R2 R1) = size(R2 R1)

Dynamic Programming Relations: R, S, T, U Number of tuples: 2000, 5000, 3000, 1000 Size estimation: T(A B) = 0.01*T(A)*T(B)

Subquery Size Cost Plan RS RT RU ST SU TU RST RSU RTU STU RSTU