Query Processing and Optimization

One View of Basic Query Processing Steps
© Dr. Philip Cannata Data Management

Relational Algebra
Relational Algebra Operation    Operation Name       Equivalent SQL Operation
π_{A1, A2, …, Ak}(r)            Project              select A1, A2, …, Ak from r
σ_{p}(r)                        Select               … from r where p
r × s                           Cartesian Product    … from r, s
σ_{A=C}(r × s)                  Equijoin             … from r, s where A = C
σ_{A θ C}(r × s)                Theta Join           … from r, s where A θ C
r ⋈ s                           Natural Join         … from r natural join s
ρ_{x}(E)                        Rename (Alias)       … from E x …
ρ_{x(A1, A2, An)}(E)            Rename               select x.x A1, x.y A2, x.z An from E x
table ←                         Assignment           create table tmp as (select … )
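The correspondence between the algebra operators and SQL can be made concrete with a minimal Python sketch that models a relation as a list of dicts. The function names and the sample relations r and s are illustrative assumptions, not part of the slides. Note that project removes duplicates (set semantics), which SQL only does with select distinct.

```python
from itertools import product as cartesian

def project(rel, attrs):
    """π: keep only attrs and eliminate duplicate rows (set semantics)."""
    seen, out = set(), []
    for row in rel:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

def select(rel, pred):
    """σ: keep the rows for which the predicate holds."""
    return [row for row in rel if pred(row)]

def product(r, s):
    """×: every pairing of rows; assumes r and s have disjoint attribute names."""
    return [{**a, **b} for a, b in cartesian(r, s)]

def equijoin(r, s, a, c):
    """σ_{A=C}(r × s), written exactly as the algebra defines it."""
    return select(product(r, s), lambda row: row[a] == row[c])

# Hypothetical sample relations.
r = [{"A": 1, "B": "x"}, {"A": 2, "B": "y"}]
s = [{"C": 1, "D": "p"}, {"C": 3, "D": "q"}]
joined = equijoin(r, s, "A", "C")
print(joined)
```

Building the full product before filtering is the literal reading of the algebra; real systems avoid it, which is what the rewrite rules and join methods on the following slides are about.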

Relational Algebra Rules – Selection and Projection Rules
Break complex selection into simpler ones:
    σ_{Cond1 ∧ Cond2}(R) ≡ σ_{Cond1}(σ_{Cond2}(R))
Break projection into stages:
    π_{attr}(R) ≡ π_{attr}(π_{attr′}(R)), if attr ⊆ attr′
Commute projection and selection:
    π_{attr}(σ_{Cond}(R)) ≡ σ_{Cond}(π_{attr}(R)), if attr ⊇ all attributes in Cond
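The three rules can be checked on a small example. The following is a sketch only: the emp relation, its columns, and the two conditions are made up for illustration, and relations are again modelled as lists of dicts.

```python
def select(rel, pred):
    """σ: keep the rows for which the predicate holds."""
    return [row for row in rel if pred(row)]

def project(rel, attrs):
    """π: keep only attrs, dropping duplicate rows."""
    out = []
    for row in rel:
        slim = {a: row[a] for a in attrs}
        if slim not in out:
            out.append(slim)
    return out

# Hypothetical relation.
emp = [
    {"name": "Ann", "dept": 10, "sal": 900},
    {"name": "Bob", "dept": 10, "sal": 1500},
    {"name": "Cy", "dept": 20, "sal": 1500},
]
cond1 = lambda t: t["dept"] == 10
cond2 = lambda t: t["sal"] > 1000

# Rule 1: a conjunctive selection equals a cascade of simple selections.
cascade_ok = (select(emp, lambda t: cond1(t) and cond2(t))
              == select(select(emp, cond2), cond1))

# Rule 2: projection in stages, valid because {dept} ⊆ {dept, sal}.
staged_ok = (project(emp, ["dept"])
             == project(project(emp, ["dept", "sal"]), ["dept"]))

# Rule 3: projection and selection commute, valid because the projected
# attributes {name, dept} include every attribute used in cond1.
commute_ok = (project(select(emp, cond1), ["name", "dept"])
              == select(project(emp, ["name", "dept"]), cond1))

print(cascade_ok, staged_ok, commute_ok)
```

If the side condition of rule 3 is violated (for example projecting onto name only and then filtering on dept), the right-hand side cannot even be evaluated, which is why the condition is part of the rule.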

Relational Algebra Rules – Commutativity and Associativity of Join
Join commutativity: R ⋈ S ≡ S ⋈ R
    Used to reduce the cost of nested loop evaluation strategies (the smaller relation should be in the outer loop).
Join associativity: R ⋈ (S ⋈ T) ≡ (R ⋈ S) ⋈ T
    Used to reduce the size of intermediate relations in the computation of a multi-relational join: first compute the join that yields the smaller intermediate result.
An N-way join has T(N) × N! different evaluation plans:
    T(N) is the number of parenthesized expressions.
    N! is the number of permutations.
The query optimizer cannot look at all plans (it might take longer to find an optimal plan than to compute the query by brute force). Hence it does not necessarily produce an optimal plan.
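To get a feel for how quickly T(N) × N! grows, the count can be computed directly. This sketch assumes T(N) counts the ways to fully parenthesize N operands in a fixed order, which is the Catalan number C(N−1); the function names are illustrative.

```python
from math import comb, factorial

def parenthesizations(n):
    """T(N): ways to fully parenthesize n operands, the Catalan number C(n-1)."""
    return comb(2 * (n - 1), n - 1) // n

def join_plans(n):
    """T(N) × N!: parenthesizations times orderings of the n relations."""
    return parenthesizations(n) * factorial(n)

for n in (2, 3, 4, 5, 10):
    print(n, join_plans(n))
```

A 3-way join already has 12 plans and a 10-way join has more than 17 billion, before even choosing a join method for each node, which is why optimizers search only part of the plan space.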

Relational Algebra Rules – Pushing Selections and Projections
σ_{Cond}(R × S) ≡ R ⋈_{Cond} S
    Cond relates attributes of both R and S.
    Reduces the size of the intermediate relation, since rows can be discarded sooner.
σ_{Cond}(R × S) ≡ σ_{Cond}(R) × S
    Cond involves only the attributes of R.
    Reduces the size of the intermediate relation, since rows of R are discarded sooner.
π_{attr}(R × S) ≡ π_{attr}(π_{attr′}(R) × S), if attributes(R) ⊇ attr′ ⊇ attr ∩ attributes(R)
    Reduces the size of an operand of the product.
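The effect of pushing a selection below a product can be seen by counting intermediate rows. This is a sketch with made-up relations R and S and a made-up condition that touches only R.

```python
from itertools import product

# Hypothetical relations: 100 rows in R, 50 rows in S.
R = [{"r_id": i, "r_val": i % 5} for i in range(100)]
S = [{"s_id": j, "r_ref": j % 100} for j in range(50)]

# A condition that involves only attributes of R.
cond_r = lambda row: row["r_val"] == 0

# Plan A: σ_Cond(R × S). Build the whole product, then filter.
pairs_a = [{**r, **s} for r, s in product(R, S)]
result_a = [row for row in pairs_a if cond_r(row)]

# Plan B: σ_Cond(R) × S. Filter R first, then build a smaller product.
r_small = [r for r in R if cond_r(r)]
result_b = [{**r, **s} for r, s in product(r_small, S)]

print(len(pairs_a), len(r_small), len(result_b))
```

Both plans return the same 1000 rows, but plan A materializes 5000 intermediate rows while plan B never holds more than its 1000-row result.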

Oracle Join Methods
Before showing query processing examples, we need to discuss some Oracle join methods.
Nested loops join: A nested loops join iterates over all rows of the outer table. If the where clause of the SQL statement contains conditions that apply only to the outer table, each outer row is first checked against them. For every outer row that qualifies, the matching rows in the inner table are looked up, either through an index (if a suitable index exists) or by a full table scan.
Hash join: Hash joins are used when joining large tables. The optimizer uses the smaller of the two tables to build a hash table in memory, then scans the larger table and probes the hash table with the hash value of each of its rows to find the joined rows.
Merge join (also called sort merge join): A sort merge join is used to join two independent data sources. It generally performs better than a nested loops join when the tables hold a large volume of data, but not as well as a hash join. It can perform better than a hash join when the join columns are already sorted or no sorting is required.
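The three join methods can be sketched in a few lines of Python. This is a simplified in-memory illustration of the general algorithms, not Oracle's implementation; the function names and the sample depts and regions rows are assumptions made for the example.

```python
from collections import defaultdict

def nested_loop_join(outer, inner, key_o, key_i):
    """For each outer row, scan the inner input (no index assumed)."""
    out = []
    for o in outer:
        for i in inner:
            if o[key_o] == i[key_i]:
                out.append({**o, **i})
    return out

def hash_join(build, probe, key_b, key_p):
    """Build a hash table on the smaller input, then probe it with the larger."""
    table = defaultdict(list)
    for b in build:
        table[b[key_b]].append(b)
    out = []
    for p in probe:
        for b in table.get(p[key_p], []):
            out.append({**b, **p})
    return out

def sort_merge_join(left, right, key_l, key_r):
    """Sort both inputs on the join key, then merge them in one pass."""
    left = sorted(left, key=lambda t: t[key_l])
    right = sorted(right, key=lambda t: t[key_r])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lv, rv = left[i][key_l], right[j][key_r]
        if lv < rv:
            i += 1
        elif lv > rv:
            j += 1
        else:
            # Find the run of right rows sharing this key value.
            j_end = j
            while j_end < len(right) and right[j_end][key_r] == lv:
                j_end += 1
            # Pair every left row with this key against the whole run.
            while i < len(left) and left[i][key_l] == lv:
                for k in range(j, j_end):
                    out.append({**left[i], **right[k]})
                i += 1
            j = j_end
    return out

# Hypothetical sample data.
regions = [{"id": 1, "rname": "NA"}, {"id": 2, "rname": "EU"}]
depts = [
    {"dname": "Sales", "region_id": 1},
    {"dname": "Ops", "region_id": 2},
    {"dname": "Sales", "region_id": 2},
]

def canon(rows):
    """Order-independent form, so results of different methods can be compared."""
    return sorted(tuple(sorted(r.items())) for r in rows)

nl = nested_loop_join(depts, regions, "region_id", "id")
hj = hash_join(regions, depts, "id", "region_id")
sm = sort_merge_join(depts, regions, "region_id", "id")
print(canon(nl) == canon(hj) == canon(sm), len(nl))
```

All three return the same rows; they differ only in cost. The nested loop compares every pair, the hash join reads each input once, and the merge join pays for sorting unless the inputs already arrive in key order.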

Another View of Basic Query Processing Steps
Example of a bind variable:
    select * from emp where sal = :salary
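A bind variable keeps the statement text constant while the value changes between executions, so the database can reuse the parsed statement. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle, since it accepts the same :name placeholder syntax; the emp rows and the ename column are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer)")
conn.executemany(
    "insert into emp values (?, ?)",
    [("SMITH", 800), ("ALLEN", 1600), ("WARD", 1600)],
)

# One statement text; only the bound value differs between executions.
query = "select ename from emp where sal = :salary order by ename"
rows_1600 = conn.execute(query, {"salary": 1600}).fetchall()
rows_800 = conn.execute(query, {"salary": 800}).fetchall()
print(rows_1600, rows_800)
```

Binding the value instead of concatenating it into the SQL string also avoids quoting problems and SQL injection.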

Query Processing Example – SQL, Relational Algebra, and Query Tree
SQL:
    select * from s_dept d join s_region r on d.region_id = r.id join s_warehouse w on r.id = w.region_id;
Relational algebra:
    π_{*}(σ_{r.id=w.region_id}(σ_{d.region_id=r.id}(ρ_{d}(s_dept) × ρ_{r}(s_region)) × ρ_{w}(s_warehouse)))
Query tree:
    π_{*}
      σ_{r.id=w.region_id}
        ×
          σ_{d.region_id=r.id}
            ×
              ρ_{d}(s_dept)
              ρ_{r}(s_region)
          ρ_{w}(s_warehouse)

Query Processing Example
    select * from s_dept d join s_region r on d.region_id = r.id join s_warehouse w on r.id = w.region_id;
Estimate of the number of rows for each operation

Query Processing Example
So let’s gather statistics for the S_Warehouse table. Do the same for the S_Region and S_Dept tables.

Query Processing Example
    select * from s_dept d join s_region r on d.region_id = r.id join s_warehouse w on r.id = w.region_id;

Query Processing Example
Run these. Then get the Execution Plan for the query.
    create index dept_index on s_dept(region_id);
    create index warehouse_index on s_warehouse(region_id);
    select * from s_dept d join s_region r on d.region_id = r.id join s_warehouse w on r.id = w.region_id;
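The same experiment can be tried without an Oracle instance. The sketch below uses Python's built-in sqlite3 module and SQLite's explain query plan as a rough analogue of an Oracle execution plan. The table definitions are assumptions (the slides only show the columns id, name, city, country and region_id), and the plan text depends on the SQLite version, so no expected output is given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: only the columns that appear in the slides' queries.
conn.executescript("""
create table s_region (id integer primary key, name text);
create table s_dept (id integer primary key, name text, region_id integer);
create table s_warehouse (id integer primary key, city text, country text,
                          region_id integer);
create index dept_index on s_dept(region_id);
create index warehouse_index on s_warehouse(region_id);
""")

query = """
select * from s_dept d
join s_region r on d.region_id = r.id
join s_warehouse w on r.id = w.region_id
"""

# Each plan row describes one step; the last column is the readable detail.
plan = conn.execute("explain query plan " + query).fetchall()
for row in plan:
    print(row[-1])
```

Dropping the two indexes and running the script again shows how the presence of an index changes the access path the optimizer picks for each table.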

SQL, Relational Algebra, Query Tree, and Optimized Query Tree
SQL:
    select * from s_dept d join s_region r on d.region_id = r.id join s_warehouse w on r.id = w.region_id where w.country = 'US'
Relational algebra:
    π_{d.name, r.name, w.city}(σ_{w.country='US'}(σ_{r.id=w.region_id}(σ_{d.region_id=r.id}(ρ_{d}(s_dept) × ρ_{r}(s_region)) × ρ_{w}(s_warehouse))))
Query Tree:
    π_{d.name, r.name, w.city}
      σ_{w.country='US'}
        σ_{r.id=w.region_id}
          ×
            σ_{d.region_id=r.id}
              ×
                ρ_{d}(s_dept)
                ρ_{r}(s_region)
            ρ_{w}(s_warehouse)
Optimized Query Tree:
    π_{d.name, r.name, w.city}
      σ_{r.id=w.region_id}
        ×
          σ_{d.region_id=r.id}
            ×
              ρ_{d}(s_dept)
              ρ_{r}(s_region)
          σ_{w.country='US'}
            ρ_{w}(s_warehouse)
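The two trees differ only in where the country filter is applied, so they must return the same rows. The following sketch checks this with Python's built-in sqlite3 module, writing the optimized tree as a join against a pre-filtered subquery. The schema and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and made-up rows.
conn.executescript("""
create table s_region (id integer primary key, name text);
create table s_dept (id integer primary key, name text, region_id integer);
create table s_warehouse (id integer primary key, city text, country text,
                          region_id integer);
insert into s_region values (1, 'North America'), (2, 'Europe');
insert into s_dept values (10, 'Sales', 1), (20, 'Operations', 2);
insert into s_warehouse values (100, 'Seattle', 'US', 1), (200, 'Prague', 'CZ', 2);
""")

# Query tree: join everything, then filter on country.
late_filter = """
select d.name, r.name, w.city
from s_dept d
join s_region r on d.region_id = r.id
join s_warehouse w on r.id = w.region_id
where w.country = 'US'
order by 1, 2, 3
"""

# Optimized tree: filter s_warehouse first, then join the smaller input.
early_filter = """
select d.name, r.name, w.city
from s_dept d
join s_region r on d.region_id = r.id
join (select * from s_warehouse where country = 'US') w
  on r.id = w.region_id
order by 1, 2, 3
"""

rows_late = conn.execute(late_filter).fetchall()
rows_early = conn.execute(early_filter).fetchall()
print(rows_late == rows_early, rows_late)
```

In practice an optimizer performs this rewrite itself, so both formulations usually end up with the same execution plan; the subquery form just makes the pushed-down selection visible.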

Query Processing Example
    select * from s_dept d join s_region r on d.region_id = r.id join s_warehouse w on r.id = w.region_id where w.country = 'US'

A Really Hairy Query Processing Example – see the next page for the Execution Plan
(this only works on Dr. Cannata's machine)
    select null link, round(log(2, message_size)) label, ((obytes*8)/1000000)/nvl(kstat_dur, 10) "kstat 64 Stream"
    from (SELECT distinct eid, rid, NTH_VALUE(obytes, 2) OVER (PARTITION BY rid ORDER BY snaptime ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) obytes
    from (SELECT eid, rid, snaptime, avg(obytes) OVER (PARTITION BY rid ORDER BY snaptime ROWS BETWEEN 0 PRECEDING AND UNBOUNDED FOLLOWING) obytes
    from obytes - LAG(obytes, 1, 0) OVER (ORDER BY snaptime) AS obytes
    FROM (SELECT to_number(ltrim(eid, ':')) eid, to_number(ltrim(rid, ':')) rid, to_number(ltrim(snaptime, ':')) snaptime, to_number(ltrim(obytes, ':')) obytes
    FROM TABLE(SEM_MATCH( '(?sub :rid ?rid) (?sub :eid ?eid) (?sub :snaptime ?snaptime) (?sub :obytes ?obytes) (?sub :name :2c903000ac562-1_data_stats) (?sub :class :hca)', SEM_Models('OBSERV_RDF_MODEL'), null, SEM_ALIASES(SEM_ALIAS('',':')), null)))
    where eid = :P10_EXP) order by rid, snaptime)) k, ibdatarun d
    where d.eid = :P10_EXP and d.rid = k.rid and streams = 64
    order by label
