Presentation is loading. Please wait.

Presentation is loading. Please wait.

Query Optimization (CB Chapter 23.1-23.3) CPSC 356 Database Ellen Walker Hiram College (Includes figures from Database Systems: An Application Oriented.

Similar presentations


Presentation on theme: "Query Optimization (CB Chapter 23.1-23.3) CPSC 356 Database Ellen Walker Hiram College (Includes figures from Database Systems: An Application Oriented."— Presentation transcript:

1 Query Optimization (CB Chapter 23.1-23.3) CPSC 356 Database Ellen Walker Hiram College (Includes figures from Database Systems: An Application Oriented Approach 2ed by Kifer, Bernstein & Lewis, © Addison Wesley 2005)

2 SQL Widely used (only?) standard query language for relational databases Once SEQUEL (Structured English QUEry Language), now Structured Query Language Objectives –Easy to learn, easy to use –Create and modify the database and query from it DDL defines DML manipulates

3 SQL is Declarative, RA is Procedural SQL Statements describe the desired results, but do not specify a sequence of operations to get those results Relational Algebra expressions describe a specific sequence of operations to perform To evaluate a SQL statement, it needs to be translated into (a computer implementation of) RA first!

4 Query Processing Query Processing is the translation of SQL into RA- like nested function calls One query can have multiple translations SELECT roomNo, HotelName FROM Room, Hotel WHERE HotelName = ‘Savoy’ and Room.hotelNo=Hotel.hotelNo;  roomNo,HotelName (  hotelName=Savoy and Room.hotelNo=Hotel.hotelNo (Room x Hotel) )  roomNo,HotelName (  Room.hotelNo=Hotel.hotelNo ( (  hotelName=Savoy (Hotel)) x Room) )

5 Query Optimization Choose the translation that minimizes resource use (time, space) The second translation below is better. (Why?) SELECT roomNo, HotelName FROM Room, Hotel WHERE HotelName = ‘Savoy’ and Room.hotelNo=Hotel.hotelNo;  roomNo,HotelName (  hotelName=Savoy and Room.hotelNo=Hotel.hotelNo (Room x Hotel) )  roomNo,HotelName ( (  hotelName=Savoy (Hotel))  Room.hotelNo=Hotel.hotelNo Room)

6 Query Processing User SQL Decom- position Relational Algebra Optimi- zation Processing Engine Result (table) Efficient Rel. Algebra

7 More Detail of Query Processing

8 Steps in Query Processing Query Decomposition (create relational algebra expression) Query Optimization (create execution plan) Code Generation Query Execution

9 Parts of Optimization Query Plan Generator –Comes up with viable relational algebra expressions to improve the initial naïve one Cost Estimator –Estimates the cost (time / space) of each plan Optimization –Choosing the plan with the lowest cost, or at least “reasonably cheap”

10 Query Decomposition Check Syntax Build a relational algebra tree HotelRoom  hotelName=Savoy  Room.hotelNo=Hotel.hotelNo X

11 Query Transformation (Selection) Select with multiple AND conditions can be sequence of selects  hotelName=Savoy and Room.hotelNo=Hotel.hotelNo ( …) =  hotelName=Savoy (  Room.hotelNo=Hotel.hotelNo ( …)) Order of Select operations doesn’t matter =  Room.hotelNo=Hotel.hotelNo (  hotelName=Savoy ( …))

12 Query Transformation (Projection) Extra intermediate projections don’t matter  Name (  Name, Status (Student))=  Name (Student) Order of select and project doesn’t matter  Status=‘SR’ (  Status (Student)) =  Status (  Status=‘SR’ (Student))

13 Query Transformation (Join) Push Select through Join –Replace a select on a cross-product with a join –Joins can be implemented at the lowest level more efficiently than “materialized cross-product” Push Select through Product –If the attributes of the condition all belong to one table of the join, put the select on only the one table, so a smaller table is joined –Joins on smaller tables are faster than on larger ones More rules pp. 640-642

14 Pushing Select Example Find all seniors that take CPSC356  stu_id=id & crs=‘CPSC356’ (Student x Transcript) Separate the selects  stu_id=id (  crs=‘CPSC356’ (Student x Transcript)) Push the inner select  stu_id=id (Student x (  crs=‘CPSC356’ (Transcript))) Replace Select/Product by Join Student |x| stu_id=id (  crs=‘CPSC356’ (Transcript)))

15 Query Processing Example

16

17 Execution Plans Add specific algorithms to each relational algebra step Determine whether/how indices will be used Add pipelining (not storing intermediate data) where possible

18 Choosing Transformations Estimate cost of each tree based on –Table sizes –Numbers of distinct attribute values –Average number of tuples for selection condition –Methods used for join (e.g. indexed, hashed) Choose lowest cost tree Because estimates aren’t perfect, the absolute best tree might not be chosen!

19 Heuristics (Rules of Thumb) Perform Selection as early as possible –Unless doing it later lets you use an index Combine X and Selection into join operation Execute most restrictive Selections first Perform Projection as early as possible Compute common expressions once –Creating a view is a way to do this!

20 Consequences for SQL Programmer Using RA operations in SQL (e.g. explicit Join) constrains optimization –Good when “programmer knows best” –Bad when programmer prevents a better optimization Intermediate tables (i.e. views) can constrain optimization –Use views to compute common subexpressions When performance is substandard, tweaking the SQL can help! (Remember the heuristics).


Download ppt "Query Optimization (CB Chapter 23.1-23.3) CPSC 356 Database Ellen Walker Hiram College (Includes figures from Database Systems: An Application Oriented."

Similar presentations


Ads by Google