Optimization of Nested Queries Sujatha Thanigaimani COSC 6421.


Outline
- Introduction
- Kim's algorithm for efficient processing
- Count bug and its solution
- Inequality bug and its solution
- Alternate algorithm
- Modification of Kim's algorithm

Nested Queries
- Queries containing other queries.
- The inner query can appear in the FROM or WHERE clause of the outer query.
Example:
    SELECT cname FROM borrower                      (outer query)
    WHERE cname IN (SELECT cname FROM depositor)    (inner query)
Think of the inner query as a function that returns its result to the outer query.

Evaluation of Nested Queries
- Naive method: Tuple Iteration Semantics (TIS), which is inefficient.
- Kim's algorithm:
  - Rationale: nesting is an interesting and powerful feature of SQL.
  - Unnesting: the process of transforming nested queries into canonical form.
  - Kim classified nested queries into types for better understanding and processing.

Types
Schema: SUPPLIER(sno, sname, sloc, sbudget), PARTS(pno, pname, qoh, color), PROJECT(jno, jname, pno, jbudget, jloc), SHIPMENT(sno, pno, jno, qty, shipdate)
Type-A nesting: uncorrelated, aggregated subquery
Example:
    SELECT SNO FROM SP
    WHERE PNO = (SELECT MAX(PNO) FROM P)
The inner query block can be evaluated independently of the outer query block, and the result of its evaluation is a single constant.

Type-N nesting: uncorrelated, non-aggregated subquery
Example:
    SELECT SNO FROM SP
    WHERE PNO IS IN (SELECT PNO FROM P WHERE WEIGHT > 50)
Evaluation: the inner query block Q is processed first, producing a list of values X. X is then substituted for the inner query block, so that PNO IS IN Q becomes PNO IS IN X. The resulting query is evaluated by nested iteration.
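The two-step evaluation of a type-N query can be sketched with Python's standard sqlite3 module. This is only an illustration: the column layout of P and SP and all row values are assumptions, and standard IN replaces the older IS IN spelling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE P  (PNO INTEGER, WEIGHT INTEGER);
    CREATE TABLE SP (SNO INTEGER, PNO INTEGER);
    -- Illustrative rows (assumed, not from the slides).
    INSERT INTO P  VALUES (1, 40), (2, 60), (3, 75);
    INSERT INTO SP VALUES (100, 1), (101, 2), (102, 3), (103, 2);
""")

# Step 1: evaluate the inner block Q once; X is a plain list of values.
x = [row[0] for row in conn.execute("SELECT PNO FROM P WHERE WEIGHT > 50")]

# Step 2: substitute X for Q, using one bound parameter per value.
placeholders = ", ".join("?" for _ in x)
two_step = conn.execute(
    f"SELECT SNO FROM SP WHERE PNO IN ({placeholders}) ORDER BY SNO", x
).fetchall()

# Reference: let the engine evaluate the nested form directly.
nested = conn.execute(
    "SELECT SNO FROM SP WHERE PNO IN (SELECT PNO FROM P WHERE WEIGHT > 50) "
    "ORDER BY SNO"
).fetchall()

print(x, two_step, nested)
```

Because the inner block does not reference the outer block, it runs exactly once regardless of the size of SP.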

Type-J nesting: correlated, non-aggregated subquery
Example:
    SELECT SNAME FROM S
    WHERE SNO IS IN (SELECT SNO FROM SP
                     WHERE QTY > 100 AND SP.ORIGIN = S.CITY)
Type-JA nesting: correlated, aggregated subquery
Example:
    SELECT PNAME FROM P
    WHERE PNO = (SELECT MAX(PNO) FROM SP
                 WHERE SP.ORIGIN = P.CITY)
Evaluation: under TIS, the inner query block is processed once for each tuple of the outer relation that satisfies all simple predicates on the outer relation, which is inefficient. Kim developed alternative algorithms for efficient processing of nested queries.
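Tuple iteration semantics for the type-J example can be imitated by hand to make the cost visible. The sketch below assumes a simplified layout for S and SP and invented rows; the counter `inner_runs` is a hypothetical name used only to show how often the inner block runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S  (SNO INTEGER, SNAME TEXT, CITY TEXT);
    CREATE TABLE SP (SNO INTEGER, QTY INTEGER, ORIGIN TEXT);
    -- Illustrative rows (assumed, not from the slides).
    INSERT INTO S  VALUES (1, 'Smith', 'London'), (2, 'Jones', 'Paris'),
                          (3, 'Blake', 'Rome');
    INSERT INTO SP VALUES (1, 200, 'London'), (2, 150, 'London'),
                          (3, 50, 'Rome');
""")

inner_runs = 0
tis_result = []
outer_rows = conn.execute("SELECT SNO, SNAME, CITY FROM S ORDER BY SNO").fetchall()
for sno, sname, city in outer_rows:
    # The inner block is re-evaluated for every outer tuple, with S.CITY bound.
    inner = conn.execute(
        "SELECT SNO FROM SP WHERE QTY > 100 AND ORIGIN = ?", (city,)
    ).fetchall()
    inner_runs += 1
    if (sno,) in inner:
        tis_result.append(sname)

# Reference: the nested query as the engine evaluates it.
nested = [r[0] for r in conn.execute(
    "SELECT SNAME FROM S WHERE SNO IN "
    "(SELECT SNO FROM SP WHERE QTY > 100 AND SP.ORIGIN = S.CITY)"
)]

print(tis_result, inner_runs)
```

The inner block runs once per outer tuple, so the work grows with the product of the two relation sizes unless the query is unnested.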

Algorithm NEST-N-J (for type-N or type-J nesting)
1. Combine the FROM clauses of all query blocks into one FROM clause.
2. AND together the WHERE clauses of all query blocks, replacing IS IN by =.
3. Retain the SELECT clause of the outermost query block.
The result is a canonical query logically equivalent to the original nested query.
Nested form:
    SELECT Ri.Ck
    FROM Ri
    WHERE Ri.Ch IS IN (SELECT Rj.Cm FROM Rj)
Canonical form:
    SELECT Ri.Ck
    FROM Ri, Rj
    WHERE Ri.Ch = Rj.Cm
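The three NEST-N-J steps can be checked on a small type-J query. The tables and rows below are assumptions, chosen so that each outer row has at most one qualifying inner match; with several matches the join form could return repeated rows where the nested form returns one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S  (SNO INTEGER, SNAME TEXT, CITY TEXT);
    CREATE TABLE SP (SNO INTEGER, QTY INTEGER, ORIGIN TEXT);
    -- Illustrative rows (assumed, not from the slides).
    INSERT INTO S  VALUES (1, 'Smith', 'London'), (2, 'Jones', 'Paris'),
                          (3, 'Blake', 'Rome');
    INSERT INTO SP VALUES (1, 200, 'London'), (2, 150, 'London'),
                          (3, 50, 'Rome');
""")

# Original nested (type-J) query.
nested = [r[0] for r in conn.execute("""
    SELECT SNAME FROM S
    WHERE SNO IN (SELECT SNO FROM SP
                  WHERE QTY > 100 AND SP.ORIGIN = S.CITY)
""")]

# Canonical form: FROM clauses merged, WHERE clauses ANDed, IN replaced by =,
# SELECT clause of the outermost block kept.
canonical = [r[0] for r in conn.execute("""
    SELECT SNAME FROM S, SP
    WHERE S.SNO = SP.SNO AND QTY > 100 AND SP.ORIGIN = S.CITY
""")]

print(nested, canonical)
```

Once the query is in canonical form, the optimizer is free to choose a join order and join method instead of being tied to nested iteration.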

Algorithm NEST-JA
1. Generate a temporary relation Rt(C1, ..., Cn, Cn+1) from R2 such that Rt.Cn+1 is the result of applying the aggregate function AGG to the Cn+1 column of the R2 tuples that have matching values in R1 for C1, C2, ..., Cn.
Original query:
    SELECT R1.Cn+2
    FROM R1
    WHERE R1.Cn+1 = (SELECT AGG(R2.Cn+1)
                     FROM R2
                     WHERE R2.C1 = R1.C1 AND R2.C2 = R1.C2 AND ... AND R2.Cn = R1.Cn);
Temporary relation:
    Rt(C1, ..., Cn, Cn+1) = (SELECT C1, ..., Cn, AGG(Cn+1)
                             FROM R2
                             GROUP BY C1, ..., Cn)

2. Transform the inner query block of the initial query by changing all references to R2 columns, in join predicates that also reference R1, to the corresponding Rt columns. The result is a type-J nested query, which can be passed to algorithm NEST-N-J for transformation to its canonical equivalent.
    SELECT R1.Cn+2
    FROM R1
    WHERE R1.Cn+1 = (SELECT Rt.Cn+1
                     FROM Rt
                     WHERE Rt.C1 = R1.C1 AND Rt.C2 = R1.C2 AND ... AND Rt.Cn = R1.Cn);
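A minimal sketch of NEST-JA followed by NEST-N-J, applied to a type-JA query with MAX. The layout of P and SP, the temporary table name RT, and all rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE P  (PNO INTEGER, PNAME TEXT, CITY TEXT);
    CREATE TABLE SP (PNO INTEGER, ORIGIN TEXT);
    -- Illustrative rows (assumed, not from the slides).
    INSERT INTO P  VALUES (1, 'bolt', 'London'), (2, 'nut', 'London'),
                          (3, 'screw', 'Paris');
    INSERT INTO SP VALUES (1, 'London'), (2, 'London'), (2, 'Paris');
""")

# Original type-JA query: correlated subquery with an aggregate.
nested = [r[0] for r in conn.execute("""
    SELECT PNAME FROM P
    WHERE PNO = (SELECT MAX(PNO) FROM SP WHERE SP.ORIGIN = P.CITY)
""")]

# Step 1: temporary relation, grouped on the inner join column.
conn.execute("""
    CREATE TABLE RT AS
    SELECT ORIGIN, MAX(PNO) AS MAXPNO FROM SP GROUP BY ORIGIN
""")

# Step 2 plus NEST-N-J: the inner block now reads RT and is merged
# into the outer block as an ordinary join.
canonical = [r[0] for r in conn.execute("""
    SELECT PNAME FROM P, RT
    WHERE P.PNO = RT.MAXPNO AND RT.ORIGIN = P.CITY
""")]

print(nested, canonical)
```

With MAX the two forms agree on this data. The grouping step is where the algorithm becomes fragile for COUNT and for join predicates other than equality.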

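Count Bug
Relations: PARTS(PNUM, QOH), SUPPLY(PNUM, QUAN, SHIPDATE)
Query:
    SELECT PNUM FROM PARTS
    WHERE QOH = (SELECT COUNT(SHIPDATE) FROM SUPPLY
                 WHERE SUPPLY.PNUM = PARTS.PNUM AND SHIPDATE < 1-1-80)
[Sample tables PARTS(PNUM, QOH) and SUPPLY(PNUM, QUAN, SHIPDATE); row values not preserved in the transcript]
Result by TIS: PNUM = 10, 8
Result by Kim's algorithm: PNUM = 10
Kim's algorithm loses part 8 because the grouped temporary relation has no row for a part without qualifying SUPPLY tuples, whereas under TIS the COUNT for such a part is 0.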
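The count bug can be reproduced with a few rows. The sample data below is an assumption (the slide's own tables did not survive) chosen so that the results match the ones stated above; dates are stored as ISO strings so that string comparison orders them correctly, and KIM_TEMP is a hypothetical table name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PARTS  (PNUM INTEGER, QOH INTEGER);
    CREATE TABLE SUPPLY (PNUM INTEGER, QUAN INTEGER, SHIPDATE TEXT);
    -- Illustrative rows (assumed, not the slide's lost table values).
    INSERT INTO PARTS  VALUES (3, 6), (10, 1), (8, 0);
    INSERT INTO SUPPLY VALUES
        (3, 4, '1979-07-03'), (3, 2, '1978-10-01'),
        (10, 1, '1978-06-08'), (10, 2, '1981-08-10'),
        (8, 5, '1983-05-07');
""")

# Nested query, evaluated by the engine with tuple-iteration semantics.
nested_result = sorted(r[0] for r in conn.execute("""
    SELECT PNUM FROM PARTS
    WHERE QOH = (SELECT COUNT(SHIPDATE) FROM SUPPLY
                 WHERE SUPPLY.PNUM = PARTS.PNUM
                   AND SHIPDATE < '1980-01-01')
"""))

# Kim-style transformation: group first, then join.
conn.execute("""
    CREATE TABLE KIM_TEMP AS
    SELECT PNUM AS SUPPNUM, COUNT(SHIPDATE) AS CT
    FROM SUPPLY WHERE SHIPDATE < '1980-01-01'
    GROUP BY PNUM
""")
kim_result = sorted(r[0] for r in conn.execute("""
    SELECT PARTS.PNUM FROM PARTS, KIM_TEMP
    WHERE PARTS.QOH = KIM_TEMP.CT AND PARTS.PNUM = KIM_TEMP.SUPPNUM
"""))

print(nested_result, kim_result)
```

Part 8 has no shipment before the cutoff, so it never appears in KIM_TEMP and cannot satisfy the join, even though its correct count is 0.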

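Solution Using Outer Join
The outer join R =+ S preserves every tuple of R; a tuple with no match in S is padded with null.
Example:
    R(X): A, B
    S(Y): B, C, E
    R =+ S (X, Y): (A, null), (B, B)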

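Solution with Outer Joins
    TEMP(SUPPNUM, CT) =
        (SELECT PARTS.PNUM, COUNT(SHIPDATE)
         FROM PARTS, SUPPLY
         WHERE SHIPDATE < 1-1-80 AND PARTS.PNUM =+ SUPPLY.PNUM
         GROUP BY PARTS.PNUM)
Intermediate result of PARTS.PNUM =+ SUPPLY.PNUM (for SHIPDATE < 1-1-80), with columns Parts.PNUM, Parts.QOH, Supply.PNUM, Supply.QUAN, Supply.SHIPDATE: a part without a qualifying shipment appears once, with null in the Supply columns.
[Row values not preserved in the transcript]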

TEMP(SUPPNUM, CT): [row values not preserved in the transcript]
    SELECT PNUM FROM PARTS, TEMP
    WHERE PARTS.QOH = TEMP.CT AND PARTS.PNUM = TEMP.SUPPNUM
Final result: PNUM = 10, 8
Drawbacks:
1. If the subquery has COUNT(*), it will always return a result > 0 because of the outer join. The '*' must be changed to a column name from the inner relation.
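The outer-join repair and the COUNT(*) drawback can both be observed with SQLite's LEFT OUTER JOIN, used here in place of the =+ notation. The rows are illustrative assumptions and OJ_TEMP / OJ_STAR are hypothetical table names. The date restriction is placed in the ON clause so that it filters SUPPLY before null padding; placing it in WHERE would discard the padded rows again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PARTS  (PNUM INTEGER, QOH INTEGER);
    CREATE TABLE SUPPLY (PNUM INTEGER, QUAN INTEGER, SHIPDATE TEXT);
    -- Illustrative rows (assumed, not the slide's lost table values).
    INSERT INTO PARTS  VALUES (3, 6), (10, 1), (8, 0);
    INSERT INTO SUPPLY VALUES
        (3, 4, '1979-07-03'), (3, 2, '1978-10-01'),
        (10, 1, '1978-06-08'), (10, 2, '1981-08-10'),
        (8, 5, '1983-05-07');
""")

def build_and_query(temp_name, count_expr):
    """Build the temporary relation with the given COUNT expression,
    then run the final join and return the sorted PNUM values."""
    conn.execute(f"""
        CREATE TABLE {temp_name} AS
        SELECT PARTS.PNUM AS SUPPNUM, {count_expr} AS CT
        FROM PARTS LEFT OUTER JOIN SUPPLY
             ON SUPPLY.PNUM = PARTS.PNUM
            AND SUPPLY.SHIPDATE < '1980-01-01'
        GROUP BY PARTS.PNUM
    """)
    rows = conn.execute(f"""
        SELECT PARTS.PNUM FROM PARTS, {temp_name}
        WHERE PARTS.QOH = {temp_name}.CT
          AND PARTS.PNUM = {temp_name}.SUPPNUM
    """)
    return sorted(r[0] for r in rows)

# COUNT over an inner-relation column ignores the null padding.
outer_join_result = build_and_query("OJ_TEMP", "COUNT(SUPPLY.SHIPDATE)")

# COUNT(*) counts the padded row too, so part 8 gets a count of 1.
count_star_result = build_and_query("OJ_STAR", "COUNT(*)")

print(outer_join_result, count_star_result)
```

Counting a column of the inner relation gives 0 for an unmatched part, which is what restores agreement with TIS on this data.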

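2. Duplicates problem: when the join column of the outer relation contains duplicate values, the result of the transformed query can differ from the result by TIS.
[Example tables PARTS(PNUM, QOH), SUPPLY(PNUM, QUAN, SHIPDATE), and TEMP(SUPPNUM, CT), with "Result by TIS" and "Our Result" columns; row values not preserved in the transcript apart from PNUM 8]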

Solution:
1. Remove duplicates before the join that creates the temporary table is performed.
    TEMP1(PNUM) = (SELECT DISTINCT PNUM FROM PARTS)
2. Use this projection instead of the outer relation in any join required to build the temporary table.
    TEMP2(SUPPNUM, CT) =
        (SELECT TEMP1.PNUM, COUNT(SHIPDATE)
         FROM TEMP1, SUPPLY
         WHERE SUPPLY.SHIPDATE < 1-1-80 AND TEMP1.PNUM =+ SUPPLY.PNUM
         GROUP BY TEMP1.PNUM)
[Resulting TEMP2(SUPPNUM, CT) and final PNUM tables; row values not preserved in the transcript]
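The duplicates problem and its fix can be sketched as follows. The rows are assumptions: PARTS holds two tuples with PNUM 10, which doubles the count when PARTS itself is used in the outer join. DUP_TEMP is a hypothetical name for the unfixed temporary relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PARTS  (PNUM INTEGER, QOH INTEGER);
    CREATE TABLE SUPPLY (PNUM INTEGER, QUAN INTEGER, SHIPDATE TEXT);
    -- Illustrative rows (assumed); PNUM 10 occurs twice in PARTS.
    INSERT INTO PARTS  VALUES (10, 1), (10, 3), (8, 0);
    INSERT INTO SUPPLY VALUES (10, 1, '1978-06-08'), (8, 5, '1983-05-07');
""")

def final_join(temp_name):
    """Join PARTS with a temporary relation (SUPPNUM, CT)."""
    rows = conn.execute(f"""
        SELECT PARTS.PNUM FROM PARTS, {temp_name}
        WHERE PARTS.QOH = {temp_name}.CT
          AND PARTS.PNUM = {temp_name}.SUPPNUM
    """)
    return sorted(r[0] for r in rows)

# Reference: nested query under tuple-iteration semantics.
tis_result = sorted(r[0] for r in conn.execute("""
    SELECT PNUM FROM PARTS
    WHERE QOH = (SELECT COUNT(SHIPDATE) FROM SUPPLY
                 WHERE SUPPLY.PNUM = PARTS.PNUM
                   AND SHIPDATE < '1980-01-01')
"""))

# Outer join directly against PARTS: each duplicate inflates the count.
conn.execute("""
    CREATE TABLE DUP_TEMP AS
    SELECT PARTS.PNUM AS SUPPNUM, COUNT(SUPPLY.SHIPDATE) AS CT
    FROM PARTS LEFT OUTER JOIN SUPPLY
         ON SUPPLY.PNUM = PARTS.PNUM AND SUPPLY.SHIPDATE < '1980-01-01'
    GROUP BY PARTS.PNUM
""")
with_duplicates = final_join("DUP_TEMP")

# Fix: project the join column with DISTINCT, then outer join.
conn.execute("CREATE TABLE TEMP1 AS SELECT DISTINCT PNUM FROM PARTS")
conn.execute("""
    CREATE TABLE TEMP2 AS
    SELECT TEMP1.PNUM AS SUPPNUM, COUNT(SUPPLY.SHIPDATE) AS CT
    FROM TEMP1 LEFT OUTER JOIN SUPPLY
         ON SUPPLY.PNUM = TEMP1.PNUM AND SUPPLY.SHIPDATE < '1980-01-01'
    GROUP BY TEMP1.PNUM
""")
deduplicated = final_join("TEMP2")

print(tis_result, with_duplicates, deduplicated)
```

Without the projection, part 10 is counted as having two shipments, so neither PARTS tuple for it matches.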

Another Bug: Join Predicates Other Than Equality
Query:
    SELECT PNUM FROM PARTS
    WHERE QOH = (SELECT MAX(QUAN) FROM SUPPLY
                 WHERE SUPPLY.PNUM < PARTS.PNUM AND SHIPDATE < 1-1-80)
Transformation by Kim's algorithm:
    TEMP(SUPPNUM, MAXQUAN) =
        SELECT PNUM, MAX(QUAN)
        FROM SUPPLY
        WHERE SHIPDATE < 1-1-80
        GROUP BY PNUM
    SELECT PNUM FROM PARTS, TEMP
    WHERE QOH = TEMP.MAXQUAN AND TEMP.SUPPNUM < PARTS.PNUM
Problem: MAX is computed separately for each SUPPLY.PNUM, but the query requires the MAX over the whole set of SUPPLY tuples whose PNUM is less than the given PARTS.PNUM.

Solution:
1. First join, then aggregate (Kim's algorithm first groups, then joins).
    TEMP(SUPPNUM, MAXQUAN) =
        SELECT PARTS.PNUM, MAX(QUAN)
        FROM PARTS, SUPPLY
        WHERE SHIPDATE < 1-1-80 AND SUPPLY.PNUM < PARTS.PNUM
        GROUP BY PARTS.PNUM
    SELECT PNUM FROM PARTS, TEMP
    WHERE PARTS.QOH = TEMP.MAXQUAN AND PARTS.PNUM = TEMP.SUPPNUM
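The difference between "group, then join" and "join, then aggregate" shows up on a small data set. The rows and the temporary table names GROUP_FIRST and JOIN_FIRST are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PARTS  (PNUM INTEGER, QOH INTEGER);
    CREATE TABLE SUPPLY (PNUM INTEGER, QUAN INTEGER, SHIPDATE TEXT);
    -- Illustrative rows (assumed, not from the slides).
    INSERT INTO PARTS  VALUES (3, 4), (8, 4), (10, 4);
    INSERT INTO SUPPLY VALUES
        (3, 4, '1979-07-03'), (8, 2, '1978-10-01'), (9, 5, '1979-01-01');
""")

def pnums(sql):
    return sorted(r[0] for r in conn.execute(sql))

# Reference: nested query with a < correlation predicate.
tis_result = pnums("""
    SELECT PNUM FROM PARTS
    WHERE QOH = (SELECT MAX(QUAN) FROM SUPPLY
                 WHERE SUPPLY.PNUM < PARTS.PNUM
                   AND SHIPDATE < '1980-01-01')
""")

# Group first, then join: MAX is taken per SUPPLY.PNUM, which is wrong here.
conn.execute("""
    CREATE TABLE GROUP_FIRST AS
    SELECT PNUM AS SUPPNUM, MAX(QUAN) AS MAXQUAN
    FROM SUPPLY WHERE SHIPDATE < '1980-01-01'
    GROUP BY PNUM
""")
group_then_join = pnums("""
    SELECT PARTS.PNUM FROM PARTS, GROUP_FIRST
    WHERE QOH = GROUP_FIRST.MAXQUAN AND GROUP_FIRST.SUPPNUM < PARTS.PNUM
""")

# Join first, then aggregate: MAX covers all SUPPLY rows below each PARTS.PNUM.
conn.execute("""
    CREATE TABLE JOIN_FIRST AS
    SELECT PARTS.PNUM AS SUPPNUM, MAX(QUAN) AS MAXQUAN
    FROM PARTS, SUPPLY
    WHERE SHIPDATE < '1980-01-01' AND SUPPLY.PNUM < PARTS.PNUM
    GROUP BY PARTS.PNUM
""")
join_then_aggregate = pnums("""
    SELECT PARTS.PNUM FROM PARTS, JOIN_FIRST
    WHERE PARTS.QOH = JOIN_FIRST.MAXQUAN AND PARTS.PNUM = JOIN_FIRST.SUPPNUM
""")

print(tis_result, group_then_join, join_then_aggregate)
```

For part 10 the correct maximum over parts 3, 8, and 9 is 5, so it should not qualify; the group-first version wrongly accepts it because the per-part maximum for part 3 happens to equal its QOH.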

Modified Algorithm: NEST-JA2
1. Project the join column of the outer relation, and restrict it with any simple predicates applying to the outer relation.
    TEMP1(PNUM) = (SELECT DISTINCT PNUM FROM PARTS)
2. Create a temporary relation by joining the inner relation with the projection of the outer relation. If the aggregate function is COUNT, the join must be an outer join.
    TEMP2(PNUM, SHIPDATE) =
        (SELECT PNUM, SHIPDATE FROM SUPPLY WHERE SHIPDATE < 1-1-80)
    TEMP3(PNUM, CT) =
        (SELECT TEMP1.PNUM, COUNT(TEMP2.SHIPDATE)
         FROM TEMP1, TEMP2
         WHERE TEMP1.PNUM =+ TEMP2.PNUM
         GROUP BY TEMP1.PNUM)

3. Join the outer relation with the temporary relation, according to the transformed version of the original query.
    SELECT PNUM FROM PARTS, TEMP3
    WHERE PARTS.QOH = TEMP3.CT AND PARTS.PNUM = TEMP3.PNUM

Processing a General Nested Query: Recursive Approach
    procedure nest_g(query_block)
        for each predicate in the WHERE clause of query_block
            if predicate is a nested predicate (i.e. contains an inner query block)
                nest_g(inner_query_block)
                /* Determine the type of nesting and call the appropriate transformation procedure */
                if nesting is type-JA
                    nest-JA2(inner_query_block)
                    nest-N-J(query_block, inner_query_block)
                else if nesting is type-A
                    nest_a(inner_query_block)
                else /* nesting is type-N or type-J */
                    nest-N-J(query_block, inner_query_block)
        return
Advantage: simplicity
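The control flow of nest_g can be sketched in Python on a toy model of query blocks. Everything here is hypothetical scaffolding: `QueryBlock`, its flags, and the logged call names stand in for real query trees and transformations, and the sketch only records which procedure would be called in which order.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryBlock:
    name: str
    aggregated: bool = False   # SELECT clause is an aggregate function
    correlated: bool = False   # WHERE clause references the enclosing block
    nested: List["QueryBlock"] = field(default_factory=list)  # inner blocks


def nesting_type(inner: QueryBlock) -> str:
    """Classify an inner block as type A, N, J, or JA."""
    if inner.correlated:
        return "JA" if inner.aggregated else "J"
    return "A" if inner.aggregated else "N"


def nest_g(block: QueryBlock, log: List[str]) -> List[str]:
    """Unnest the deepest blocks first, logging each transformation call."""
    for inner in block.nested:
        nest_g(inner, log)
        kind = nesting_type(inner)
        if kind == "JA":
            log.append(f"nest_ja2({inner.name})")
            log.append(f"nest_n_j({block.name}, {inner.name})")
        elif kind == "A":
            log.append(f"nest_a({inner.name})")
        else:  # type-N or type-J
            log.append(f"nest_n_j({block.name}, {inner.name})")
    return log


# Q1 contains an uncorrelated block Q2, which contains a type-JA block Q3.
q3 = QueryBlock("Q3", aggregated=True, correlated=True)
q2 = QueryBlock("Q2", nested=[q3])
q1 = QueryBlock("Q1", nested=[q2])
trace = nest_g(q1, [])
print(trace)
```

The recursion reaches the innermost block before any transformation is applied, so each merge step only ever handles one level of nesting.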

Analysis

Modified Kim's Algorithm
- Predicates: R.B OP1 TEMP1.COUNT and R.B OP1 0
- |TEMP1| < |R OJ S|, hence better than the alternate algorithm (OJ denotes outer join).

References:
1. Richard A. Ganski, Harry K. T. Wong. Optimization of Nested SQL Queries Revisited.
2. M. Muralikrishna. Improved Unnesting Algorithms for Join Aggregate SQL Queries.

Thank You