Published by Shavonne Walsh. Modified over 9 years ago.
1
Academic Year 2014 Spring
2
MODULE CC3005NI: Advanced Database Systems “QUERY OPTIMIZATION”
3
Query Optimization: Query optimization is the process of choosing the most efficient way to execute a SQL statement. When the cost-based optimizer was first offered with Oracle 7, Oracle supported only standard relational data. Query optimization is an important component of a modern relational database system. Relational database systems provide a system-managed optimization facility that makes use of the tools and statistics available to the system.
5
Query Optimization:
Description: A query optimizer is essentially a program for the efficient evaluation of relational queries, making use of relevant statistical information.
Objective: To choose the most efficient strategy for implementing a given relational query, thereby improving the efficiency and performance of a relational database system.
6
Need of Query Optimization: 1. To perform automatic navigation: A relational database system (based on the non-navigational relational model) allows users to simply state what data they require and leaves the system to locate and process that data in the database.
7
Need of Query Optimization: 2. To achieve acceptable performance: There may be several different plans (called query plans) for performing a single user query; the query optimizer aims to select and execute the most efficient one based on the information available to the system.
8
Need of Query Optimization: 3. To minimize existing differences: Because of the difference in speed between the CPU and I/O devices, a query optimizer aims to minimize I/O activity by choosing the 'cheapest' query plan for a given query.
9
Effects of Optimization – Example: Consider the following Student, Lending and Book tables:
Student (student_no, student_name, gender, address)
Lending (lending_no, student_no, book_no)
Book (book_no, title, author, edition)
10
Effects of Optimization – Example: Assume that the database tables contain:
100 students in the Student table
10,000 lendings in the Lending table, of which only 50 are for book 'B1'
5,000 books in the Book table
Further assume that only results (intermediate relations) of up to 50 tuples can be kept in memory during query processing.
11
Effects of Optimization – Example:
Query: Retrieve the names of students who have borrowed book 'B1'
SQL:
SELECT DISTINCT student_name
FROM student, lending
WHERE student.student_no = lending.student_no
AND lending.book_no = 'B1'
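The query above can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the sample rows (student names, addresses, lending numbers) are invented for illustration and are not part of the original example, and the Book table is left out because the query does not touch it.

```python
import sqlite3

# In-memory database with the Student and Lending tables from the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_no TEXT PRIMARY KEY, student_name TEXT,
                          gender TEXT, address TEXT);
    CREATE TABLE lending (lending_no TEXT PRIMARY KEY, student_no TEXT,
                          book_no TEXT);
""")

# Hypothetical sample rows (assumption, not from the slides).
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", [
    ("S1", "Asha", "F", "Kathmandu"),
    ("S2", "Bikram", "M", "Pokhara"),
    ("S3", "Chandra", "M", "Lalitpur"),
])
conn.executemany("INSERT INTO lending VALUES (?, ?, ?)", [
    ("L1", "S1", "B1"),
    ("L2", "S1", "B1"),  # S1 borrows B1 twice; DISTINCT removes the duplicate
    ("L3", "S2", "B2"),
    ("L4", "S3", "B1"),
])

query = """
    SELECT DISTINCT student_name
    FROM student, lending
    WHERE student.student_no = lending.student_no
      AND lending.book_no = ?
"""
names = sorted(row[0] for row in conn.execute(query, ("B1",)))
print(names)  # ['Asha', 'Chandra']
```

The user states only what is wanted; how SQLite joins and filters the tables is decided by its own optimizer.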
12
Query Plan A – No Optimization: Operation sequence: Join – Select – Project
Step 1: Join student and lending over student_no, giving T1
Step 2: Select T1 where book_no = 'B1', giving T2
Step 3: Project T2 over student_name, giving the result
13
Query Plan A – No Optimization: We calculate the number of database accesses (tuple I/O operations) required for each step. The number of tuple I/O operations is the number of tuples (records) read and written during the operation.
14
Query Plan A – Calculation:
Step 1 – Join student and lending over student_no, giving T1
Step 2 – Select T1 where book_no = 'B1', giving T2
Step 3 – Project T2 over student_name, giving the result
IR: Intermediate Relation (number of tuples)

| Step | Read | Write | IR | Subtotal |
|---|---|---|---|---|
| 1 | 100 x 10,000 | 10,000 | 10,000 | 1,010,000 |
| 2 | 10,000 | 0 | 50 | 10,000 |
| 3 | 0 | 0 | <= 50 | 0 |

Total tuple I/O: 1,020,000
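The Plan A arithmetic can be reproduced in a few lines of Python. This is a sketch under the example's stated assumptions (100 students, 10,000 lendings, 50 of them for 'B1', at most 50 tuples held in memory); the function and constant names are illustrative only, and the join is assumed to produce one tuple per lending.

```python
# Figures taken from the example.
STUDENTS = 100
LENDINGS = 10_000
B1_LENDINGS = 50
MEMORY_LIMIT = 50  # largest intermediate relation that fits in memory


def spill_writes(tuples):
    """Tuples written to disk: only relations too large for memory are written."""
    return tuples if tuples > MEMORY_LIMIT else 0


def plan_a_subtotals():
    # Step 1 (join): every student tuple is read once per lending tuple.
    # T1 has one tuple per lending and is too large for memory, so it is written.
    t1 = LENDINGS
    step1 = STUDENTS * LENDINGS + spill_writes(t1)
    # Step 2 (select): T1 is read back from disk; T2 (50 tuples) stays in memory.
    t2 = B1_LENDINGS
    step2 = t1 + spill_writes(t2)
    # Step 3 (project): T2 is already in memory, so no tuple I/O is needed.
    step3 = 0
    return [step1, step2, step3]


subtotals = plan_a_subtotals()
print(subtotals, sum(subtotals))  # [1010000, 10000, 0] 1020000
```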
15
Query Plan B – with Optimization: Operation sequence: Select – Join – Project
Step 1: Select lending where book_no = 'B1', giving T1
Step 2: Join T1 and student over student_no, giving T2
Step 3: Project T2 over student_name, giving the result
16
Query Plan B – with Optimization: We again calculate number of tuple I/O operations required for each step
17
Query Plan B – Calculation:
Step 1 – Select lending where book_no = 'B1', giving T1
Step 2 – Join T1 and student over student_no, giving T2
Step 3 – Project T2 over student_name, giving the result
IR: Intermediate Relation (number of tuples)

| Step | Read | Write | IR | Subtotal |
|---|---|---|---|---|
| 1 | 10,000 | 0 | 50 | 10,000 |
| 2 | 100 | 0 | 50 | 100 |
| 3 | 0 | 0 | <= 50 | 0 |

Total tuple I/O: 10,100
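To see where the Plan B figures come from, the plan can be executed over synthetic data while counting tuple reads. The sketch below is illustrative only: the generated rows, the two-column student tuples (gender and address are omitted) and the function name run_plan_b are assumptions, chosen so that the table sizes match the example.

```python
def run_plan_b(students, lendings, book_no):
    """Select, then join, then project, counting every tuple read from a base table."""
    reads = 0

    # Step 1 (select): scan lending once, keeping the matching tuples in memory.
    t1 = []
    for lending in lendings:
        reads += 1
        if lending[2] == book_no:
            t1.append(lending)

    # Step 2 (join): scan student once, probing the in-memory T1.
    wanted = {student_no for (_, student_no, _) in t1}
    t2 = []
    for student in students:
        reads += 1
        if student[0] in wanted:
            t2.append(student)

    # Step 3 (project): keep student_name only, removing duplicates.
    names = sorted({student_name for (_, student_name) in t2})
    return names, reads, len(t1)


# Synthetic data (assumption): 100 students, 10,000 lendings, 50 of them for 'B1'.
students = [(f"S{i}", f"Student {i}") for i in range(1, 101)]
lendings = [
    (f"L{i + 1}", f"S{i % 100 + 1}", "B1" if i < 50 else f"B{2 + i % 4999}")
    for i in range(10_000)
]

names, reads, t1_size = run_plan_b(students, lendings, "B1")
print(reads, t1_size, len(names))  # 10100 50 50
```

Because T1 never exceeds 50 tuples, nothing has to be written to disk, which is why the Write column is zero in every step.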
18
Comparison Plan A vs. Plan B: Ratio of tuple I/O (Plan A to Plan B): 1,020,000 / 10,100, or roughly 101 to 1. Intermediate relations in Plan B are much smaller than those in Plan A. Tuple I/O can be reduced further by using indexes: if there is an index on book_no in the lending table, only 50 lending tuples need to be read instead of 10,000.
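As a quick check of the comparison, the ratio and the effect of the index can be worked out directly. This sketch uses the totals from the two plan calculations and assumes, for simplicity, that reading the index itself costs nothing, which a real cost model would not ignore.

```python
PLAN_A_TOTAL = 1_020_000  # tuple I/O for Plan A
PLAN_B_TOTAL = 10_100     # tuple I/O for Plan B

ratio = PLAN_A_TOTAL / PLAN_B_TOTAL

# With an index on lending.book_no, step 1 of Plan B reads only the 50
# matching lending tuples; step 2 still reads all 100 student tuples.
# Index page reads are ignored here (simplifying assumption).
plan_b_indexed = 50 + 100

print(round(ratio), plan_b_indexed)  # 101 150
```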
19
Four Stages of Optimization: The query processing activity therefore acts as an interface between the querying individual/process and the database. It relieves the querying individual/process of the burden of deciding the best execution strategy. So while the querying individual/process specifies what, the query processor determines how.
20
Four Stages of Optimization:
Stage 1: Convert the query into some internal form more suitable for machine manipulation, e.g. a query tree or relational algebra
Stage 2: Further convert the internal form into an equivalent and more efficient canonical form, making use of well-defined transformation rules
21
Four Stages of Optimization: Example of Query Tree – Plan A (Join – Select – Project), read from leaves to root: Student, Lending -> Join (over student_no) -> Restrict (where book_no = 'B1') -> Project (over student_name) -> Result
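A query tree and one transformation rule can be sketched with a few small Python classes. All class and function names here (Table, Join, Restrict, Project, push_restrict_down) are hypothetical and chosen for illustration; the rule shown is the standard one that a restriction on columns of a single join input can be applied before the join, which turns the Plan A tree into the Plan B tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    name: str
    columns: tuple


@dataclass(frozen=True)
class Join:
    left: object
    right: object
    over: str


@dataclass(frozen=True)
class Restrict:
    child: object
    column: str
    value: str


@dataclass(frozen=True)
class Project:
    child: object
    column: str


def push_restrict_down(node):
    """Move a Restrict sitting on top of a Join onto the input that owns its column."""
    if isinstance(node, Project):
        return Project(push_restrict_down(node.child), node.column)
    if isinstance(node, Restrict) and isinstance(node.child, Join):
        join = node.child
        if isinstance(join.right, Table) and node.column in join.right.columns:
            restricted = Restrict(join.right, node.column, node.value)
            return Join(join.left, restricted, join.over)
        if isinstance(join.left, Table) and node.column in join.left.columns:
            restricted = Restrict(join.left, node.column, node.value)
            return Join(restricted, join.right, join.over)
    return node  # no rule applies; leave the tree unchanged


student = Table("student", ("student_no", "student_name", "gender", "address"))
lending = Table("lending", ("lending_no", "student_no", "book_no"))

# Plan A: Join, then Restrict, then Project.
plan_a = Project(
    Restrict(Join(student, lending, "student_no"), "book_no", "B1"),
    "student_name",
)
# Plan B: Restrict lending first, then Join, then Project.
plan_b = push_restrict_down(plan_a)
print(plan_b)
```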
22
Four Stages of Optimization:
Stage 3: Choose a set of low-level procedures using statistics about the database
Low-level operations (e.g. join, select, project)
Implementation procedures (one for each low-level operation, based on varying conditions)
Cost formulae (one for each implementation procedure)
23
Four Stages of Optimization:
Stage 4: Generate a set of candidate query plans and choose the best of those plans by evaluating the cost formulae
The process of selecting a query plan is also called 'access path' selection
The 'cheapest' query plan is normally considered to be the one that requires the fewest tuple I/O operations and produces the smallest set of intermediate relations
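Stage 4 can be pictured as evaluating one cost formula per candidate plan and keeping the minimum. The sketch below is a simplification: the two formulae count tuple I/O exactly as in the Plan A and Plan B calculations, and the function names and the stats dictionary are hypothetical stand-ins for what a real optimizer would read from its catalogue.

```python
def cost_plan_a(stats):
    """Join, select, project: nested-loop join reads, T1 written, T1 read back."""
    return stats["student"] * stats["lending"] + 2 * stats["lending"]


def cost_plan_b(stats):
    """Select, join, project: one scan of lending, then one scan of student."""
    return stats["lending"] + stats["student"]


def choose_plan(candidates, stats):
    """Evaluate every cost formula and return the cheapest plan with all costs."""
    costs = {name: formula(stats) for name, formula in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs


# Table cardinalities from the example.
stats = {"student": 100, "lending": 10_000}
candidates = {"Plan A": cost_plan_a, "Plan B": cost_plan_b}

best, costs = choose_plan(candidates, stats)
print(best, costs)  # Plan B {'Plan A': 1020000, 'Plan B': 10100}
```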
24
Database Statistics: The selection of 'optimal' query plans in the optimization process makes use of database statistics stored in the System Catalogue or Data Dictionary of the database system. In other words, without this information (metadata) being available, the query optimizer will not be able to choose the most efficient query plan for implementing a given query.
25
Database Statistics: Typical database statistics include:
For each base table: cardinality; number of pages for the table
For each column of each base table: number of distinct values; maximum, minimum and average value; actual values and their frequencies
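The column-level statistics listed above are simple to compute. The following sketch gathers them for one column held in a Python list; the function name column_statistics and the sample edition values are invented for illustration, and a real system would store such figures in its catalogue instead of recomputing them per query.

```python
from collections import Counter


def column_statistics(values):
    """Collect the statistics an optimizer might keep for one column."""
    frequencies = Counter(values)
    return {
        "cardinality": len(values),           # number of rows in the table
        "distinct_values": len(frequencies),  # number of distinct values
        "minimum": min(values),
        "maximum": max(values),
        "average": sum(values) / len(values),
        "frequencies": dict(frequencies),     # actual values and their counts
    }


# Hypothetical 'edition' column of a small Book table (assumption).
editions = [1, 2, 2, 3, 2, 2]
stats = column_statistics(editions)
print(stats)
```

With the number of distinct values known, an optimizer can estimate how many tuples an equality condition will return, for example cardinality divided by distinct values when values are assumed to be evenly spread.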
26
Database Statistics: Typical database statistics include (continued):
For each index: number of levels; number of leaf pages
27
Thank you!!! Questions are WELCOME