

Query Optimization Cases

D. Christozov, INF 280 DB Systems. Query Optimization: Cases

Execution pipeline

Executable Block 1: algorithm using indices (if available).
Output of Block 1: a temporary data set.
Executable Block 2: algorithm without indices, because indices are not available for temporary data sets.

Case 1

Disk: block size 512 B; size of a physical address 8 B; access time 10 μs (time for one I/O operation).
Tables: A with 1,000,000 records and B with 500,000 records. A.A3 has 50,000 distinct values and B.B3 has 10,000.

CREATE TABLE A (
  A1 CHAR(10) NOT NULL UNIQUE,
  A2 CHAR(20) NOT NULL UNIQUE,
  A3 CHAR(10),
  A4 CHAR(40),
  PRIMARY KEY (A1));

CREATE TABLE B (
  B1 CHAR(10) NOT NULL UNIQUE,
  B2 CHAR(22) NOT NULL UNIQUE,
  B3 CHAR(30),
  B4 CHAR(30),
  PRIMARY KEY (B1));

CREATE INDEX A.A1;
CREATE INDEX A.A2;
CREATE INDEX A.A3;
CREATE INDEX B.B1;
CREATE INDEX B.B2;
CREATE INDEX B.B3;

Query:

SELECT * FROM A WHERE A3 IN (SELECT B1 FROM B WHERE B4 = 'ABCDEFGHIJ');

1. Execution plan

SELECT * FROM A WHERE A3 IN (SELECT B1 FROM B WHERE B4 = 'ABCDEFGHIJ');

The query splits into two blocks:

Inner block: SELECT B1 FROM B WHERE B4 = 'ABCDEFGHIJ'
There is no index for B4, so this is a brute-force search.

Outer block: SELECT * FROM A WHERE A3 IN list
There is a cluster index for A3, so we search for every value in the list.

The two blocks form a natural pipeline.
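The plan above can be sketched in Python on toy in-memory data. This is only an illustration of the two blocks and the pipeline between them, not how a DBMS implements it. The sample rows, the helper names `scan_b` and `probe_a`, and the dictionary standing in for the cluster index on A.A3 are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample rows: (A1, A2, A3, A4) and (B1, B2, B3, B4).
A_ROWS = [
    ("a1", "x1", "k1", "p"),
    ("a2", "x2", "k2", "q"),
    ("a3", "x3", "k1", "r"),
]
B_ROWS = [
    ("k1", "y1", "c1", "ABCDEFGHIJ"),
    ("k2", "y2", "c2", "other"),
    ("k9", "y3", "c3", "ABCDEFGHIJ"),
]

# Stand-in for the cluster index on A.A3: value -> positions of matching rows.
a3_index = defaultdict(list)
for pos, row in enumerate(A_ROWS):
    a3_index[row[2]].append(pos)


def scan_b(b4_value):
    """Block 1: no index on B4, so every row of B is read (brute force)."""
    for row in B_ROWS:
        if row[3] == b4_value:
            yield row[0]  # B1


def probe_a(values):
    """Block 2: one lookup in the A.A3 index per value produced by block 1."""
    for value in values:
        for pos in a3_index.get(value, []):
            yield A_ROWS[pos]


# The generators are chained, so values flow from block 1 to block 2 one by one.
result = list(probe_a(scan_b("ABCDEFGHIJ")))
print(result)
```

Chaining generators mirrors the "natural pipeline": block 2 starts consuming values before block 1 has finished scanning B.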

B+ tree for A.A3

Disk: block size 512 B; size of a physical address 8 B; access time 10 μs (time for one I/O operation).
Tables: A with 1,000,000 records and B with 500,000 records. A.A3 has 50,000 distinct values and B.B3 has 10,000.

1. On average, every category of values of A.A3 has 1,000,000 / 50,000 = 20 records.
2. One block of a cluster can store 512 / 8 = 64 physical addresses: 63 addresses of records plus one link to the other blocks in the same cluster. So, on average, we need 1 block per cluster, or 1 I/O operation to retrieve all addresses of the relevant records.
3. The B+ tree will have 50,000 entries (one for each category). The size of the A.A3 field is 10 bytes. The order of the B+ tree: 8*M + 10*(M-1) <= 512, or M = 29. 50% of 29 is 15 (worst case). Leaf level = 50,000 / 15 = 3,334 blocks.
4. Levels above the leaf level: (1) 223; (2) 15; (3) 1. So a search in the index needs 4 I/O operations.
5. Size: B+ tree: 3,334 + 223 + 15 + 1 = 3,573 blocks; clusters: 50,000 * 1 = 50,000 blocks; in total 53,573 blocks.
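The sizing steps above can be checked with a short Python sketch. It assumes the same worst-case rule as the text (every block is only 50% full, rounded up) and rounds block counts up at each level. The function names `btree_order` and `level_sizes` are illustrative, not part of the original material.

```python
BLOCK_SIZE = 512    # bytes per disk block
POINTER_SIZE = 8    # bytes per physical address


def btree_order(key_size):
    """Largest M with POINTER_SIZE*M + key_size*(M-1) <= BLOCK_SIZE."""
    return (BLOCK_SIZE + key_size) // (POINTER_SIZE + key_size)


def ceil_div(a, b):
    """Integer division rounded up."""
    return (a + b - 1) // b


def level_sizes(entries, fanout):
    """Blocks per level, leaf level first, until a single root block remains."""
    levels = [ceil_div(entries, fanout)]
    while levels[-1] > 1:
        levels.append(ceil_div(levels[-1], fanout))
    return levels


order = btree_order(key_size=10)            # A.A3 is CHAR(10)
worst_case_fanout = ceil_div(order, 2)      # 50% occupancy
levels = level_sizes(50_000, worst_case_fanout)
index_blocks = sum(levels)
cluster_blocks = 50_000 * 1                 # one block per cluster
total_blocks = index_blocks + cluster_blocks

print(order, worst_case_fanout)   # 29 15
print(levels)                     # [3334, 223, 15, 1]
print(index_blocks, total_blocks) # 3573 53573
```

The number of entries in `levels` is the height of the tree, which is also the number of I/O operations for one index search (4 here).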

Time (I/O and seconds) for execution

SELECT B1 FROM B WHERE B4 = 'ABCDEFGHIJ'
1. 500,000 records, 90 bytes each.
2. 5 records in one block.
3. 100,000 blocks → 100,000 I/O operations.

SELECT * FROM A WHERE A3 IN list
Assumption: the result may have approximately 1,000 values of B1.
1. For every value in the list we need on average 25 I/O operations (4 to search the index, 1 to retrieve the cluster and 20 to retrieve the records from the master file).
2. That is approximately 25,000 I/O operations.

In total: 125,000 I/O operations, or 1,250,000 μs = 1.25 s.
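The same estimate, written out as a small Python calculation. The figures are the ones used above; the list size of 1,000 values is the text's own assumption, and the variable names are only illustrative.

```python
BLOCK_SIZE = 512        # bytes per disk block
IO_TIME_US = 10         # microseconds per I/O operation

# Block 1: full scan of B (no index on B4).
records_b = 500_000
record_size_b = 90                                   # bytes, as stated above
records_per_block = BLOCK_SIZE // record_size_b      # 5
scan_io = -(-records_b // records_per_block)         # blocks to read, rounded up

# Block 2: one index probe on A.A3 per value in the list.
list_values = 1_000                                  # assumed result size of block 1
io_per_value = 4 + 1 + 20                            # index + cluster + records
probe_io = list_values * io_per_value

total_io = scan_io + probe_io
total_us = total_io * IO_TIME_US
total_s = total_us / 1_000_000

print(scan_io, probe_io, total_io)   # 100000 25000 125000
print(total_us, total_s)             # 1250000 1.25
```

The scan of B dominates: four fifths of the I/O comes from the block that cannot use an index.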

Case 2

Tables: A with 1,000,000 records and B with 500,000 records. A.A3 has 50,000 distinct values and B.B3 has 10,000.
Disk: block size 512 B; size of a physical address 8 B; access time 10 μs.

CREATE TABLE A (
  A1 CHAR(10) NOT NULL UNIQUE,
  A2 CHAR(20) NOT NULL UNIQUE,
  A3 CHAR(30),
  A4 CHAR(20),
  PRIMARY KEY (A1));

CREATE TABLE B (
  B1 CHAR(8) NOT NULL UNIQUE,
  B2 CHAR(22) NOT NULL UNIQUE,
  B3 CHAR(30),
  B4 CHAR(30),
  PRIMARY KEY (B1));

CREATE INDEX A.A1;
CREATE INDEX A.A2;
CREATE INDEX A.A3;
CREATE INDEX B.B1;
CREATE INDEX B.B2;
CREATE INDEX B.B3;

Query:

SELECT * FROM A INNER JOIN B ON A.A3 = B.B3 WHERE A1 = 'ABCDEFGHIJ';

Tasks:
1. Explain the query execution plan.
2. Evaluate the order and height of the B+ trees for the index files that will be used in query execution.
3. Evaluate the time (in I/O operations and in seconds) for execution of the query.
4. Rewrite the query to guarantee optimal execution.

Case 2: solution

1. Explain the query execution plan.
Step 1: select, using the primary index for A.A1.
Step 2: join, single loop, using the cluster index created for B.B3.

2. Evaluate the order and height of the B+ trees for the index files that will be used in query execution.
B.B3: (m-1)*30 + m*8 <= 512 → m = 14; 50% of m = 7.
Leaf: 10,000 values / 7 = 1428; next = 204; next = 29; next = 4. The B+ tree will have 4 levels.
Cluster: on average 50 records have the same value; 50 * 8 (physical address) = 400 bytes, or one block.
A.A1: (m-1)*10 + m*8 <= 512 → m = 29; 50% of m = 15 (it is OK if you use 14).
Number of records in one block: 512 / 80 = 6. Number of blocks: 1,000,000 / 6 = 166,667 blocks in the master file.
Leaf: 166,667 / 15 = 11,112; next = 740; next = 49; next = 3. The B+ tree will have 4 levels.

3. Evaluate the time (in I/O operations and in seconds) for execution of the query.
Step 1: 4 (B+ tree) + 1 (master file) = 5 I/O operations to select the record from A.
Step 2: to combine with the rows of B: 4 (B+ tree) + 1 (cluster) + 50 (average number of matching rows in B) = 55 I/O operations (could be more if the cluster needs more than one block).
In total: 60 I/O operations = 600 μs.
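The Case 2 estimate can be reproduced with a few lines of Python. This is a sketch under the text's assumptions: the tree heights (4 levels each) are taken from the solution above rather than recomputed, and the names `btree_order`, `HEIGHT_A1` and `HEIGHT_B3` are illustrative.

```python
BLOCK_SIZE = 512    # bytes per disk block
POINTER_SIZE = 8    # bytes per physical address
IO_TIME_US = 10     # microseconds per I/O operation


def btree_order(key_size):
    """Largest m with (m-1)*key_size + m*POINTER_SIZE <= BLOCK_SIZE."""
    return (BLOCK_SIZE + key_size) // (POINTER_SIZE + key_size)


order_b3 = btree_order(30)   # B.B3 is CHAR(30)
order_a1 = btree_order(10)   # A.A1 is CHAR(10)

# Cluster for one B.B3 value: average number of records and blocks needed.
records_per_value = 500_000 // 10_000                  # 50
cluster_bytes = records_per_value * POINTER_SIZE       # 400
cluster_blocks = -(-cluster_bytes // BLOCK_SIZE)       # rounded up

HEIGHT_A1 = 4   # taken from the solution above, not recomputed
HEIGHT_B3 = 4   # taken from the solution above, not recomputed

step1_io = HEIGHT_A1 + 1                               # index + master file
step2_io = HEIGHT_B3 + cluster_blocks + records_per_value
total_io = step1_io + step2_io
total_us = total_io * IO_TIME_US

print(order_b3, order_a1)            # 14 29
print(step1_io, step2_io, total_io)  # 5 55 60
print(total_us)                      # 600
```

Step 2 dominates the cost because every matching row of B is assumed to sit in a different block of the master file.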

Do you trust the optimizer of your DBMS?

YES: keep the query as written.

SELECT * FROM A INNER JOIN B ON A.A3 = B.B3 WHERE A1 = 'ABCDEFGHIJ';

NO: rewrite the query so that the selection on A is executed first.

SELECT * FROM (SELECT * FROM A WHERE A1 = 'ABCDEFGHIJ') A INNER JOIN B ON A.A3 = B.B3;

The derived table is given the alias A so that A.A3 can be referenced in the join condition. This will ensure the optimal order of execution and will guarantee use of the indices in the two steps.
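A quick way to convince yourself that the two forms return the same rows is to run both against a small test database. The sketch below uses Python's built-in `sqlite3` module with a handful of made-up rows. The index names `idx_a_a3` and `idx_b_b3` and the sample data are assumptions for the example, and SQLite's planner is not necessarily representative of the DBMS you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (A1 CHAR(10) PRIMARY KEY, A2 CHAR(20) NOT NULL UNIQUE,
                A3 CHAR(30), A4 CHAR(20));
CREATE TABLE B (B1 CHAR(8) PRIMARY KEY, B2 CHAR(22) NOT NULL UNIQUE,
                B3 CHAR(30), B4 CHAR(30));
-- Hypothetical index names, chosen for this example only.
CREATE INDEX idx_a_a3 ON A (A3);
CREATE INDEX idx_b_b3 ON B (B3);
INSERT INTO A VALUES ('ABCDEFGHIJ', 'a2-1', 'red', 'x'),
                     ('KLMNOPQRST', 'a2-2', 'blue', 'y');
INSERT INTO B VALUES ('b1', 'b2-1', 'red', 'p'),
                     ('b2', 'b2-2', 'red', 'q'),
                     ('b3', 'b2-3', 'blue', 'r');
""")

original = (
    "SELECT * FROM A INNER JOIN B ON A.A3 = B.B3 "
    "WHERE A1 = 'ABCDEFGHIJ'"
)
rewritten = (
    "SELECT * FROM (SELECT * FROM A WHERE A1 = 'ABCDEFGHIJ') A "
    "INNER JOIN B ON A.A3 = B.B3"
)

# Sort the rows so the comparison does not depend on the order of retrieval.
rows_original = sorted(conn.execute(original).fetchall())
rows_rewritten = sorted(conn.execute(rewritten).fetchall())

print(rows_original == rows_rewritten)
print(len(rows_original))
```

To see which plan your own DBMS picks for each form, use its plan-inspection command (for example `EXPLAIN QUERY PLAN` in SQLite); the output format differs between systems and versions.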

Case 3

Tables: A with 1,000,000 records and B with 500,000 records. A.A3 has 50,000 distinct values and B.B3 has 10,000.
Disk: block size 512 B; size of a physical address 8 B; access time 10 μs.

CREATE TABLE A (
  A1 CHAR(10) NOT NULL UNIQUE,
  A2 CHAR(20) NOT NULL UNIQUE,
  A3 CHAR(30),
  A4 CHAR(20),
  PRIMARY KEY (A1));

CREATE TABLE B (
  B1 CHAR(10) NOT NULL UNIQUE,
  B2 CHAR(10) NOT NULL UNIQUE,
  B3 CHAR(40),
  B4 CHAR(30),
  PRIMARY KEY (B1));

CREATE INDEX A.A1;
CREATE INDEX A.A2;
CREATE INDEX A.A3;
CREATE INDEX B.B1;
CREATE INDEX B.B2;
CREATE INDEX B.B3;

Query:

SELECT * FROM B INNER JOIN A ON A.A1 = B.B2 WHERE A1 = 'ABCDEFGHIJ';

Tasks:
1. Explain the query execution plan.
2. Evaluate the order and height of the B+ trees for the index files that will be used in query execution.
3. Evaluate the time (in I/O operations and in seconds) for execution of the query.
4. Rewrite the query to guarantee optimal execution.

Case 3: solution

1. Explain the query execution plan.
Step 1: select, using the primary index for A.A1.
Step 2: join, using the secondary index created for B.B2 (the join condition is A.A1 = B.B2).

2. Evaluate the order and height of the B+ trees for the index files that will be used in query execution.
A.A1: (m-1)*10 + m*8 <= 512 → m = 29; 50% of m = 15 (it is OK if you use 14).
Number of records in one block: 512 / 80 = 6. Number of blocks: 1,000,000 / 6 = 166,667 blocks in the master file.
Leaf: 166,667 / 15 = 11,112; next = 740; next = 49; next = 3. The B+ tree will have 4 levels.
B.B2: (m-1)*10 + m*8 <= 512 → m = 29; 50% of m = 15 (it is OK if you use 14).
Number of records in one block: 512 / 90 = 5. Number of blocks: 500,000 / 5 = 100,000 blocks in the master file.
Leaf: 500,000 / 15 = 33,334; next = 2,224; next = 148; next = 10; next = 1. The B+ tree will have 5 levels.

3. Evaluate the time (in I/O operations and in seconds) for execution of the query.
Step 1: 4 (B+ tree) + 1 (master file) = 5 I/O operations to select the record from A.
Step 2: to combine with the row in B (B.B2 is a secondary key, so the equality condition returns a single row): 5 (B+ tree) + 1 (master file) = 6 I/O operations.
In total: 11 I/O operations = 110 μs.
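The height of the B.B2 index and the total cost can be checked with a short Python sketch. It assumes a dense index (one entry per record of B) with the worst-case fan-out of 15, rounds up at every level, and takes the A.A1 height of 4 from the solution above rather than recomputing it. Because of the rounding, the intermediate block counts can differ by a block from the figures above; the height comes out as 5 either way. The names `level_sizes` and `HEIGHT_A1` are illustrative.

```python
IO_TIME_US = 10   # microseconds per I/O operation


def level_sizes(entries, fanout):
    """Blocks per level, leaf level first, until a single root block remains."""
    levels = [(entries + fanout - 1) // fanout]
    while levels[-1] > 1:
        levels.append((levels[-1] + fanout - 1) // fanout)
    return levels


levels_b2 = level_sizes(500_000, 15)   # dense index on B.B2
height_b2 = len(levels_b2)

HEIGHT_A1 = 4   # taken from the solution above, not recomputed

step1_io = HEIGHT_A1 + 1     # index on A.A1 + master file of A
step2_io = height_b2 + 1     # index on B.B2 + master file of B
total_io = step1_io + step2_io
total_us = total_io * IO_TIME_US

print(levels_b2)             # [33334, 2223, 149, 10, 1]
print(height_b2)             # 5
print(total_io, total_us)    # 11 110
```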

4. Rewrite the query to guarantee optimal execution.

Instead of

SELECT * FROM B INNER JOIN A ON A.A1 = B.B2 WHERE A1 = 'ABCDEFGHIJ';

use

SELECT * FROM B INNER JOIN (SELECT * FROM A WHERE A1 = 'ABCDEFGHIJ') A ON A.A1 = B.B2;

The derived table is given the alias A so that A.A1 can be referenced in the join condition.