CS4432: Database Systems II Query Optimizer – Cost Based Optimization


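Query processing steps (input → step → output):
– SQL query → parse → parse tree
– parse tree → convert → logical query plan (l.q.p.)
– l.q.p. → apply laws → “improved” l.q.p.
– “improved” l.q.p. + statistics → estimate result sizes → l.q.p. + sizes
– l.q.p. + sizes → consider physical plans → {P1, P2, …}
– {P1, P2, …} → estimate costs → {(P1,C1), (P2,C2), …}
– {(P1,C1), (P2,C2), …} → pick best → Pi
– Pi → execute → answer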

A Query (Evaluation) Plan
– An extended relational algebra tree.
– Annotations at each node indicate:
  – the access methods to use for each table;
  – the implementation method used for each relational operator.
[Figure: plan tree over Reserves and Sailors with a join on sid=sid, the selections bid=100 and rating > 5, and a projection on sname; annotated with (Nested Loop Join) and (On-the-fly).]

How to cost a physical plan?
We need:
– Estimated size of intermediate results (Chapter 16.4)
– Cost of each operator/algorithm (Chapter 15)
– Buffer available for the query

Result of cost-based optimization
A good physical plan:
– Consider different join orderings
– Consider different access methods for accessing the relations

How to generate that ‘good’ Physical Plan?
Many alternative search algorithms are possible:
1. Exhaustive listing of all possible plans
2. Dynamic programming
3. Branch and bound
4. Greedy bottom-up plan construction
NOTE: often only left-deep trees are considered, to keep the search space small.

Why left-deep trees?
Fundamental decision in System R (IBM):
– Only left-deep join trees are considered.
– Left-deep trees can generate all fully pipelined plans: intermediate results are not written to temporary files.
– Not all left-deep trees are fully pipelined (e.g., sort-merge (SM) join).
[Figure: three join trees over the relations A, B, C, D.]

Enumeration of Left-Deep Trees
Left-deep trees differ in:
– the order of relations,
– the access method for each relation, and
– the join method for each join.
The number of left-deep plans is still very large: n relations imply n! left-deep join orderings, before any access or join methods are chosen.
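To make the n! growth concrete, here is a minimal Python sketch that lists every left-deep join order for a set of relations. The function names and the `JOIN` string notation are illustrative assumptions, not part of the slides; access and join methods are ignored.

```python
from itertools import permutations
from math import factorial

def left_deep_orders(relations):
    """Return every left-deep join order; the first relation is the initial outer."""
    return list(permutations(relations))

def as_left_deep_tree(order):
    """Nest an order into a left-deep expression string."""
    tree = order[0]
    for rel in order[1:]:
        # The running result is always the left (outer) input of the next join.
        tree = f"({tree} JOIN {rel})"
    return tree

orders = left_deep_orders(["A", "B", "C", "D"])
print(len(orders), factorial(4))     # both are 24
print(as_left_deep_tree(orders[0]))  # (((A JOIN B) JOIN C) JOIN D)
```

Even with only join order varying, 10 relations already give 10! = 3,628,800 candidate orders, which is why exhaustive listing does not scale.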

Enumeration of Left-Deep Trees
Enumerated using N passes (if N relations are joined):
– Pass 1: Find the best 1-relation plan for each relation.
– Pass 2: Find the best way to join the result of each 1-relation plan (as outer) to another relation. (All 2-relation plans.)
– Pass N: Find the best way to join the result of an (N-1)-relation plan (as outer) to the N’th relation. (All N-relation plans.)
For each subset of relations, retain:
– the cheapest plan overall, plus
– the cheapest plan for each interesting order of the tuples.
[Figure: left-deep plans over A, B, C, D built up in Pass 1, Pass 2, Pass 3.]
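The pass structure can be sketched in Python as a dynamic program over subsets of relations. This is a simplified illustration, not System R itself: the cost model (sum of estimated intermediate result sizes), the `card` and `sel` inputs, and the example numbers are all assumptions, and interesting orders and access methods are left out.

```python
from itertools import combinations

def best_left_deep_plan(card, sel):
    """card: {relation: tuple count}; sel: {frozenset({r, s}): join selectivity}.
    Pairs missing from sel are treated as cross products (selectivity 1.0).
    Returns (cost, result_size, join_order) of the cheapest left-deep plan."""
    rels = sorted(card)
    # Pass 1: the best 1-relation plan is just the relation itself (cost 0 here).
    best = {frozenset([r]): (0.0, float(card[r]), (r,)) for r in rels}
    # Pass k: extend the best (k-1)-relation plans (as outer) by one relation.
    for k in range(2, len(rels) + 1):
        for subset in combinations(rels, k):
            s = frozenset(subset)
            for inner in subset:
                outer = s - {inner}
                cost, size, order = best[outer]
                factor = 1.0
                for o in outer:
                    factor *= sel.get(frozenset((o, inner)), 1.0)
                new_size = size * card[inner] * factor
                new_cost = cost + new_size  # assumed cost: tuples produced so far
                if s not in best or new_cost < best[s][0]:
                    best[s] = (new_cost, new_size, order + (inner,))
    return best[frozenset(rels)]

# Hypothetical statistics for three relations.
card = {"A": 1000, "B": 100, "C": 10}
sel = {frozenset(("A", "B")): 0.01, frozenset(("B", "C")): 0.1}
cost, size, order = best_left_deep_plan(card, sel)
print(order, round(cost))
```

Only one plan is kept per subset, so the table holds 2^N - 1 entries instead of N! orderings. A fuller version would keep one extra entry per interesting order, as the slide describes.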

Enumeration Example
Example , Also read Chapter
If too many relations (Chapter ):
– Dynamic programming is expensive if there are too many relations (say, more than 6).
– Use a greedy algorithm instead (faster, but it may yield plans that are not as good as those from dynamic programming).
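One possible greedy heuristic, sketched in Python: start with the pair of relations that has the smallest estimated join result, then repeatedly add the relation that keeps the next intermediate result smallest. The heuristic, the `card` and `sel` inputs, and the numbers are assumptions for illustration; the slides do not specify which greedy rule is meant.

```python
from itertools import combinations

def greedy_join_order(card, sel):
    """card: {relation: tuple count}; sel: {frozenset({r, s}): join selectivity}.
    Requires at least two relations. Returns (join_order, estimated_result_size)."""
    def pair_sel(a, b):
        return sel.get(frozenset((a, b)), 1.0)  # missing pair = cross product

    remaining = set(card)
    # Step 1: pick the pair with the smallest estimated join size.
    a, b = min(combinations(sorted(remaining), 2),
               key=lambda p: card[p[0]] * card[p[1]] * pair_sel(*p))
    order = [a, b]
    size = card[a] * card[b] * pair_sel(a, b)
    remaining -= {a, b}

    # Step 2: repeatedly add the relation giving the smallest next result.
    while remaining:
        def joined_size(r):
            factor = 1.0
            for o in order:
                factor *= pair_sel(o, r)
            return size * card[r] * factor
        nxt = min(sorted(remaining), key=joined_size)
        size = joined_size(nxt)
        order.append(nxt)
        remaining.remove(nxt)
    return order, size

card = {"A": 1000, "B": 100, "C": 10}
sel = {frozenset(("A", "B")): 0.01, frozenset(("B", "C")): 0.1}
print(greedy_join_order(card, sel)[0])  # ['B', 'C', 'A']
```

The greedy version examines far fewer candidates than dynamic programming, but it commits to each choice and never revisits it, so it can miss the cheapest plan.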

Operator Types
Stateful versus stateless operators
– Select is stateless
– Join is stateful
Blocking versus non-blocking operators
– Select is non-blocking
– Aggregate functions are blocking
Pipelined versus non-pipelined operators
– Select is pipelinable
– What about Join? (see next slide)

Join?
Whether a join can be pipelined depends on the implementation strategy chosen for the operator:
– Iteration join: pipelinable
– Merge-sort join: blocking
– Index join: pipelinable
– Hash join: blocking
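The difference between pipelinable and blocking implementations can be illustrated with Python generators standing in for pull-based iterators. This is a sketch under assumptions: the operator functions, the `counted` helper, and the sample tuples are invented for illustration. The iteration join yields its first result after pulling a single outer tuple, while the hash join must consume its entire build input first.

```python
def select(rows, pred):
    """Stateless, non-blocking: each tuple is tested and passed on immediately."""
    for row in rows:
        if pred(row):
            yield row

def nested_loop_join(outer, inner_rows, key):
    """Iteration join: pipelined on the outer input; inner_rows must be re-scannable."""
    for o in outer:
        for i in inner_rows:
            if o[key] == i[key]:
                yield {**o, **i}

def hash_join(build, probe, key):
    """Hash join: the build phase blocks until the whole build input is read."""
    table = {}
    for b in build:
        table.setdefault(b[key], []).append(b)
    for p in probe:
        for b in table.get(p[key], []):
            yield {**b, **p}

def counted(rows, counter, name):
    """Helper (illustrative): count how many tuples a consumer pulls from rows."""
    for row in rows:
        counter[name] = counter.get(name, 0) + 1
        yield row

sailors = [{"sid": 1, "rating": 7}, {"sid": 2, "rating": 3}, {"sid": 3, "rating": 9}]
reserves = [{"sid": 1, "bid": 100}, {"sid": 3, "bid": 100}, {"sid": 3, "bid": 101}]

pulled = {}
first_nlj = next(nested_loop_join(counted(sailors, pulled, "nlj_outer"), reserves, "sid"))
first_hj = next(hash_join(counted(sailors, pulled, "hj_build"),
                          counted(reserves, pulled, "hj_probe"), "sid"))
print(pulled)  # {'nlj_outer': 1, 'hj_build': 3, 'hj_probe': 1}

high_rated = list(select(sailors, lambda s: s["rating"] > 5))
```

The counts show why the classification matters for costing: a blocking phase forces its input to be fully materialized (in memory or on disk) before anything flows to the parent operator.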

Costing of a complete plan
We went over an example query plan.
Important: first we classify operators as pipelined or not pipelined.
If pipelined, then for stateless operators (for example, Select or Project) the I/O cost is zero.

Costing of a Complete Query Plan
What about a Select? How is it implemented?
– If in the middle of the plan, pipeline it (one-tuple-at-a-time iteration).
– If at a leaf of the plan, look for an index that can implement the Select via an index lookup.
– If an index is available, the cost of implementing the Select operator equals the cost of an index lookup.

Costing of a complete plan
Main idea:
– Determine the number of distinct values, V(R,a)
– Determine the physical implementation strategy per operator
– Then compute the I/O costs for each operator
– Then sum up all costs.
– Done.
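These steps can be sketched in Python. The size estimates use the common textbook formulas T(R)/V(R,a) for an equality selection and T(R)·T(S)/max(V(R,a), V(S,a)) for an equijoin, where T(R) is the number of tuples in R. The `Operator` class, the plan, and every number below are hypothetical, chosen only to show how the sum is formed.

```python
from dataclasses import dataclass

def est_select_eq(t_r, v_r_a):
    """Estimated tuples satisfying a = constant: T(R) / V(R,a)."""
    return t_r / v_r_a

def est_join(t_r, t_s, v_r_a, v_s_a):
    """Estimated tuples in an equijoin on a: T(R)*T(S) / max(V(R,a), V(S,a))."""
    return t_r * t_s / max(v_r_a, v_s_a)

@dataclass
class Operator:
    name: str
    io_cost: int                       # I/Os if the operator runs on its own
    pipelined_stateless: bool = False  # e.g. a Select or Project in mid-plan

def plan_cost(operators):
    """Sum the I/O costs; pipelined stateless operators contribute zero."""
    return sum(0 if op.pipelined_stateless else op.io_cost for op in operators)

# Hypothetical statistics: T(Reserves)=100000, V(Reserves,bid)=100,
# T(Sailors)=40000, V(Reserves,sid)=V(Sailors,sid)=40000.
print(est_select_eq(100_000, 100))                # 1000.0
print(est_join(100_000, 40_000, 40_000, 40_000))  # 100000.0

# Hypothetical plan with made-up per-operator I/O costs.
plan = [
    Operator("index lookup on Reserves.bid", io_cost=1000),
    Operator("join with Sailors", io_cost=1500),
    Operator("select rating > 5", io_cost=200, pipelined_stateless=True),
    Operator("project sname", io_cost=100, pipelined_stateless=True),
]
print(plan_cost(plan))                            # 2500
```

The size estimates feed the per-operator I/O costs (larger intermediate results mean more pages to read or write), and the classification from the earlier slides decides which operators add to the total at all.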