Published by Junior Parks. Modified over 8 years ago.
1
Switch off your mobile phones or change the profile to silent mode
2
Query Optimisation
3
Query optimisation is an important component of a modern relational database system. Relational database systems provide a system-managed optimisation facility that makes use of the wealth of statistical information (metadata) available to the system.
4
Query Optimisation
Description: A query optimiser is essentially a program for the efficient evaluation of relational queries, making use of relevant statistical information.
Objective: To choose the most efficient strategy for implementing a given relational query, thereby improving the efficiency and performance of the relational database system.
5
Need for Query Optimisation
To perform automatic navigation: A relational database system (based on the non-navigational relational model) allows users to simply state what data they require and leaves it to the system to locate and process that data in the database.
6
Need for Query Optimisation
To achieve acceptable performance: There may be several different plans (called query plans) for performing a single user query. The query optimiser aims to select and execute the most efficient query plan based on the information available to the system.
7
Need for Query Optimisation
To minimise the effect of existing discrepancies: Because of the discrepancy in speed between the CPU and I/O devices, the query optimiser aims to minimise I/O activity by choosing the ‘cheapest’ query plan for a given query.
8
Effects of Optimisation – Example
Consider the following Student, Lending and Book tables:
  Student (student-no, student-name, gender, address)
  Lending (lending-no, student-no, book-no)
  Book (book-no, title, author, edition)
9
Effects of Optimisation – Example
Assume that the database tables contain:
  100 students in the Student table
  10000 lendings in the Lending table, of which only 50 are for book ‘B1’
  5000 books in the Book table
Further assume that only results (intermediate relations) of up to 50 tuples can be kept in memory during query processing.
10
Effects of Optimisation – Example
Query: Retrieve the names of students who have borrowed book ‘B1’
SQL:
  SELECT DISTINCT student-name
  FROM student, lending
  WHERE student.student-no = lending.student-no
  AND lending.book-no = 'B1'
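The query above can be tried on a toy database. The following is a minimal sketch using Python's built-in `sqlite3` module; the underscore column names (hyphens are not valid in unquoted SQL identifiers), the sample rows, the `ORDER BY` clause (added only to make the output order predictable) and the helper names `make_demo_db` and `students_who_borrowed` are all assumptions made for illustration, not part of the slides.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE student (student_no TEXT PRIMARY KEY, student_name TEXT,
                              gender TEXT, address TEXT);
        CREATE TABLE lending (lending_no TEXT PRIMARY KEY, student_no TEXT,
                              book_no TEXT);
        INSERT INTO student VALUES ('S1', 'Ann', 'F', 'addr1'),
                                   ('S2', 'Bob', 'M', 'addr2'),
                                   ('S3', 'Cy',  'M', 'addr3');
        INSERT INTO lending VALUES ('L1', 'S1', 'B1'), ('L2', 'S1', 'B1'),
                                   ('L3', 'S2', 'B2'), ('L4', 'S3', 'B1');
        """
    )
    return conn


def students_who_borrowed(conn, book_no):
    """Names of students who have borrowed the given book, without duplicates."""
    rows = conn.execute(
        "SELECT DISTINCT student_name "
        "FROM student, lending "
        "WHERE student.student_no = lending.student_no "
        "AND lending.book_no = ? "
        "ORDER BY student_name",  # assumption: added for a stable output order
        (book_no,),
    ).fetchall()
    return [name for (name,) in rows]


conn = make_demo_db()
print(students_who_borrowed(conn, "B1"))
```

Ann borrowed ‘B1’ twice in the sample data, so the DISTINCT keyword is what keeps her name from appearing twice.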
11
Query Plan A – No Optimisation
Operation sequence: Join – Select – Project
Step 1: Join student and lending over student-no, giving T1
Step 2: Select T1 where book-no = ‘B1’, giving T2
Step 3: Project T2 over student-name, giving the result
12
Query Plan A – No Optimisation
We calculate the number of database accesses (tuple I/O operations) required for each step. The number of tuple I/Os is the number of tuples (records) read and written during the operation.
13
Query Plan A – Calculation
Step 1 – Join student and lending over student-no, giving T1
Step 2 – Select T1 where book-no = ‘B1’, giving T2
Step 3 – Project T2 over student-name, giving the result

Step  Read         Write  IR     Subtotal
1     100 x 10000  10000  10000  1010000
2     10000        0      50     10000
3     0            0      <= 50  0

IR: Intermediate Relation (number of tuples)
Total tuple I/O: 1020000
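The arithmetic behind the Plan A figures can be sketched as a small function. This is only an illustration under stated assumptions: the function name `plan_a_tuple_io`, the nested-loop reading of the join, and the rule that an intermediate relation larger than the memory limit is written out and read back in the next step are our interpretation of the table, not something the slides spell out.

```python
def plan_a_tuple_io(n_students=100, n_lendings=10000, n_matching=50,
                    memory_limit=50):
    """Tuple I/O per step for Plan A (Join - Select - Project)."""
    # Step 1: the join reads every lending tuple once per student tuple.
    step1_read = n_students * n_lendings
    # Assumption: each lending matches exactly one student, so T1 has
    # as many tuples as the lending table.
    t1_size = n_lendings
    t1_spilled = t1_size > memory_limit      # too big to keep in memory
    step1_write = t1_size if t1_spilled else 0

    # Step 2: the select reads T1 back if it was written out.
    step2_read = t1_size if t1_spilled else 0
    t2_size = n_matching
    t2_spilled = t2_size > memory_limit
    step2_write = t2_size if t2_spilled else 0

    # Step 3: the project works on T2, which stays in memory if it fits.
    step3_read = t2_size if t2_spilled else 0

    subtotals = [step1_read + step1_write,
                 step2_read + step2_write,
                 step3_read]
    return subtotals, sum(subtotals)


subtotals, total = plan_a_tuple_io()
print(subtotals, total)
```

With the default figures from the example, the subtotals come out as 1010000, 10000 and 0, matching the table.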
14
Query Plan B – With Optimisation
Operation sequence: Select – Join – Project
Step 1: Select lending where book-no = ‘B1’, giving T1
Step 2: Join T1 and student over student-no, giving T2
Step 3: Project T2 over student-name, giving the result
15
Query Plan B – With Optimisation
We again calculate the number of tuple I/O operations required for each step.
16
Query Plan B – Calculation
Step 1 – Select lending where book-no = ‘B1’, giving T1
Step 2 – Join T1 and student over student-no, giving T2
Step 3 – Project T2 over student-name, giving the result

Step  Read   Write  IR     Subtotal
1     10000  0      50     10000
2     100    0      50     100
3     0      0      <= 50  0

IR: Intermediate Relation (number of tuples)
Total tuple I/O: 10100
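The Plan B figures can also be checked by actually running the three steps over synthetic data and counting the tuples read from the base tables. The sketch below is illustrative only: `make_demo_data` and `run_plan_b` are hypothetical helpers, and the generated rows are invented so that the table sizes match the example (100 students, 10000 lendings, 50 of them for ‘B1’).

```python
def make_demo_data():
    """Synthetic tables sized like the example (contents are made up)."""
    students = [{"student_no": f"S{i}", "student_name": f"Student {i}"}
                for i in range(100)]
    lendings = [{"lending_no": f"L{i}",
                 "student_no": f"S{i % 100}",
                 # The first 50 lendings are for B1; the rest never are.
                 "book_no": "B1" if i < 50 else f"B{2 + i % 4999}"}
                for i in range(10000)]
    return students, lendings


def run_plan_b(students, lendings, book_no):
    """Select - Join - Project, counting tuples read from base tables."""
    reads = 0

    # Step 1: select lending where book-no = book_no, giving T1 (in memory).
    t1 = []
    for lending in lendings:
        reads += 1
        if lending["book_no"] == book_no:
            t1.append(lending)

    # Step 2: join T1 and student over student-no, giving T2 (in memory).
    t2 = []
    for student in students:
        reads += 1
        for lending in t1:  # T1 is already in memory, so no extra I/O
            if lending["student_no"] == student["student_no"]:
                t2.append({**student, **lending})

    # Step 3: project T2 over student-name, removing duplicates.
    names = sorted({row["student_name"] for row in t2})
    return names, reads, len(t1)


names, reads, t1_size = run_plan_b(*make_demo_data(), "B1")
print(len(names), reads, t1_size)
```

Because T1 holds only 50 tuples, it fits within the assumed memory limit and nothing needs to be written out, which is where the saving over Plan A comes from.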
17
Comparison: Plan A vs Plan B
Ratio of tuple I/Os (Plan A to Plan B): 1020000 / 10100, or roughly 101 to 1
The intermediate relations in Plan B are much smaller than those in Plan A
Tuple I/Os can be reduced further by using indexes: if there is an index on book-no in the lending table, only 50 tuples need to be read instead of 10000
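To illustrate the point about indexes, here is a rough sketch that models an index on book-no as a Python dictionary from key value to matching tuples. A real database index is a disk-based structure (for example a B-tree) and looking it up has its own cost, which this sketch ignores; `build_index`, `select_with_index` and the synthetic rows are assumptions for illustration.

```python
from collections import defaultdict


def build_index(rows, column):
    """Map each value of `column` to the list of rows holding that value."""
    index = defaultdict(list)
    for row in rows:
        index[row[column]].append(row)
    return index


def select_with_index(index, value):
    """Return the matching rows and the number of tuples that had to be read."""
    matches = index.get(value, [])
    return matches, len(matches)


# Synthetic lending table: 10000 rows, 50 of them for book B1 (made-up data).
lendings = [{"lending_no": f"L{i}",
             "student_no": f"S{i % 100}",
             "book_no": "B1" if i < 50 else f"B{2 + i % 4999}"}
            for i in range(10000)]

book_index = build_index(lendings, "book_no")
matches, tuples_read = select_with_index(book_index, "B1")
print(tuples_read, "tuples read instead of", len(lendings))
```

The index is built once, ahead of time; only the lookup is counted here, mirroring the slide's claim about the select step.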
18
Four Stages of Optimisation
Stage 1: Convert the query into some internal form more suitable for machine manipulation, e.g. a query tree or relational algebra
Stage 2: Convert the internal form into an equivalent but more efficient canonical form, making use of well-defined transformation rules
19
Four Stages of Optimisation
Example of Query Tree – Plan A (Join – Select – Project)

          Result
            |
         Project   (over student-name)
            |
         Restrict  (where book-no = ‘B1’)
            |
          Join     (over student-no)
          /  \
   Student    Lending
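A query tree like this one can be held as a small nested data structure. The sketch below is one possible representation, not the internal form of any particular system; the `Node` class and `operation_sequence` function are hypothetical names. Walking the tree bottom-up recovers the Join – Select – Project order of Plan A.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    op: str                       # operation name, or table name for a leaf
    detail: str = ""              # e.g. the join column or restrict condition
    children: list = field(default_factory=list)


def operation_sequence(node):
    """Post-order walk: children are evaluated before their parent."""
    ops = []
    for child in node.children:
        ops.extend(operation_sequence(child))
    if node.children:             # leaves are base tables, not operations
        ops.append(node.op)
    return ops


plan_a = Node("Project", "over student-name", [
    Node("Restrict", "where book-no = 'B1'", [
        Node("Join", "over student-no", [Node("Student"), Node("Lending")]),
    ]),
])

print(operation_sequence(plan_a))
```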
20
Four Stages of Optimisation
Some Transformation Rules
Rule 1: (A where Restrict-1) where Restrict-2 = A where (Restrict-1 AND Restrict-2)
Rule 2: (A [Project]) where Restrict = (A where Restrict) [Project]
Rule 3: (A [Project-1]) [Project-2] = A [Project-2]
Rule 4: (A Join B) where Restrict-on-A AND Restrict-on-B = (A where Restrict-on-A) Join (B where Restrict-on-B)
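Rule 4 can be checked on a small example. The sketch below models relations as lists of dictionaries and compares restricting after the join with restricting before it; the helper functions, the sample rows and the two predicates are all invented for illustration, and one passing example is of course not a proof of the rule.

```python
def restrict(relation, predicate):
    """Keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def join(left, right, key):
    """Naive equi-join of two relations over a shared column."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def as_set(relation):
    """Order-independent form of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in relation}


students = [
    {"student_no": "S1", "gender": "F"},
    {"student_no": "S2", "gender": "M"},
]
lendings = [
    {"student_no": "S1", "book_no": "B1"},
    {"student_no": "S2", "book_no": "B1"},
    {"student_no": "S1", "book_no": "B2"},
]


def on_a(row):                    # a restriction that mentions only Student
    return row["gender"] == "F"


def on_b(row):                    # a restriction that mentions only Lending
    return row["book_no"] == "B1"


# Left-hand side of Rule 4: join first, restrict afterwards.
late = restrict(join(students, lendings, "student_no"),
                lambda row: on_a(row) and on_b(row))
# Right-hand side of Rule 4: restrict each input, then join.
early = join(restrict(students, on_a), restrict(lendings, on_b), "student_no")

print(as_set(late) == as_set(early))
```

Both sides give the same single row, but the right-hand side joins a 1-row relation with a 2-row relation instead of 2 rows with 3, which is the effect Plan B exploits at scale.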
21
Four Stages of Optimisation
Rule 5: where p OR (q AND r) = where (p OR q) AND (p OR r)
Rule 6: (A Join B) Join C = A Join (B Join C)
Rule 7: Perform projections as early as possible
Rule 8: Perform restrictions as early as possible
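Rule 5 is the distributive law of Boolean logic, so it can be verified exhaustively over all eight combinations of truth values. A minimal sketch (the function name `rule5_holds` is our own):

```python
from itertools import product


def rule5_holds():
    """Check p OR (q AND r) == (p OR q) AND (p OR r) for every truth assignment."""
    return all(
        (p or (q and r)) == ((p or q) and (p or r))
        for p, q, r in product([False, True], repeat=3)
    )


print(rule5_holds())
```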
22
Four Stages of Optimisation
Stage 3: Choose a set of candidate low-level procedures using statistics about the database
  Low-level operations (e.g. join, select, project)
  Implementation procedures (one for each low-level operation, based on varying conditions)
  Cost formulae (one for each implementation procedure)
23
Four Stages of Optimisation
Stage 4: Generate a set of candidate query plans and choose the best (cheapest) of those plans by evaluating the cost formulae
The process of selecting a query plan is also called ‘access path’ selection
The ‘cheapest’ query plan is normally considered to be the one that requires the fewest tuple I/O operations and produces the smallest intermediate relations
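Choosing the cheapest plan amounts to evaluating each candidate's cost formula against the available statistics and taking the minimum. The sketch below uses the two plans from the earlier example; the function `choose_cheapest`, the shape of the `stats` dictionary and the two closed-form cost formulae are simplifying assumptions that reproduce the totals worked out on the slides rather than a general cost model.

```python
def choose_cheapest(cost_formulae, stats):
    """Evaluate every candidate plan's cost formula and pick the minimum."""
    costs = {name: formula(stats) for name, formula in cost_formulae.items()}
    best = min(costs, key=costs.get)
    return best, costs


stats = {"students": 100, "lendings": 10000}

cost_formulae = {
    # Plan A: the join reads students x lendings tuples, then T1 (one tuple
    # per lending) is written out and read back by the select.
    "Plan A": lambda s: s["students"] * s["lendings"] + 2 * s["lendings"],
    # Plan B: one scan of lending for the select, one scan of student
    # for the join; the intermediate relations stay in memory.
    "Plan B": lambda s: s["lendings"] + s["students"],
}

best, costs = choose_cheapest(cost_formulae, stats)
print(best, costs)
```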
24
Database Statistics
The selection of ‘optimal’ query plans during optimisation makes use of database statistics stored in the System Catalogue or Data Dictionary of the database system. Without this information (metadata), the query optimiser cannot choose the most efficient query plan for implementing a given query.
25
Database Statistics
Typical database statistics include:
For each base table:
  Cardinality
  Number of pages for this table
For each column of each base table:
  Number of distinct values
  Maximum, minimum and average value
  Actual values and their frequencies
For each index:
  Number of levels
  Number of leaf pages
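As an illustration of how such statistics might be used, the sketch below stores a table's cardinality and a column's number of distinct values, then estimates how many rows an equality condition will return. The estimate (cardinality divided by number of distinct values) is a common textbook heuristic that assumes values are evenly distributed; it is not taken from the slides. The class names, the page count of 100 and the figure of 200 distinct book numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    distinct_values: int


@dataclass
class TableStats:
    cardinality: int              # number of tuples in the table
    pages: int                    # number of pages the table occupies
    columns: dict                 # column name -> ColumnStats


def estimate_equality_rows(table, column):
    """Estimated rows matching `column = constant`, assuming uniform values."""
    return table.cardinality / table.columns[column].distinct_values


# Hypothetical catalogue entry for the lending table.
lending_stats = TableStats(
    cardinality=10000,
    pages=100,
    columns={"book_no": ColumnStats(distinct_values=200)},
)

print(estimate_equality_rows(lending_stats, "book_no"))
```

When actual values and their frequencies are also recorded, the optimiser can replace the uniform assumption with the real count for a specific value such as ‘B1’.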
26
Any Questions?