Chapter 12 Query Processing (2) Yonsei University 2 nd Semester, 2013 Sanghyun Park.

Slides:



Advertisements
Similar presentations
Copyright © 2011 Ramez Elmasri and Shamkant Navathe Algorithms for SELECT and JOIN Operations (8) Implementing the JOIN Operation: Join (EQUIJOIN, NATURAL.
Advertisements

CSCI 5708: Query Processing II Pusheng Zhang University of Minnesota Feb 5, 2004.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Query Processing.
©Silberschatz, Korth and Sudarshan13.1Database System Concepts Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting.
Query Processing (overview)
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 13: Query Processing.
1 Evaluation of Relational Operations: Other Techniques Chapter 12, Part B.
Query Processing Overview Catalog Information for Cost Estimation
1 40T1 60T2 30T3 10T4 20T5 10T6 60T7 40T8 20T9 R S C C R JOIN S?
©Silberschatz, Korth and Sudarshan13.1Database System Concepts Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting.
Query Evaluation and Optimization Main Steps 1.Translate into RA: select/project/join 2.Greedy optimization of RA: by pushing selection and projection.
Dr. Kalpakis CMSC 461, Database Management Systems Query Processing.
1 Chapter 13 CS 157 B Presentation -- Query Processing (origional from Silberschatz, Korth and Sudarshan) Presented By Laptak Lee.
Query Processing Chapter 12
COMP 5138 Relational Database Management Systems Semester 2, 2007 Lecture 12 Query Processing and Optimization.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts 3 rd Edition Chapter 12: Query Processing  Overview  Catalog Information for Cost Estimation.
©Silberschatz, Korth and Sudarshan13.1Database System Concepts Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting.
©Silberschatz, Korth and Sudarshan7.1 Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join Operation Other Operations.
12.1Database System Concepts - 6 th Edition Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Join Operation Sorting 、 Other.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan Chapter 13: Query Processing.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 13: Query Processing.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 13: Query Processing.
Chapter 13: Query Processing Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join Operation Other Operations.
Computing & Information Sciences Kansas State University Tuesday, 03 Apr 2007CIS 560: Database System Concepts Lecture 29 of 42 Tuesday, 03 April 2007.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 13: Query Processing.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Query Processing.
CPT-S Topics in Computer Science Big Data 1 1 Yinghui Wu EME 49.
Chapter 12 Query Processing. Query Processing n Selection Operation n Sorting n Join Operation n Other Operations n Evaluation of Expressions 2.
Chapter 12 Query Processing (1) Yonsei University 2 nd Semester, 2013 Sanghyun Park.
Chapter 13: Query Processing
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 13: Query Processing.
CSCI 5708: Query Processing II Pusheng Zhang University of Minnesota Feb 5, 2004.
Computing & Information Sciences Kansas State University Monday, 03 Nov 2008CIS 560: Database System Concepts Lecture 27 of 42 Monday, 03 November 2008.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Query Processing.
Query Processing CS 405G Introduction to Database Systems.
Lecture 3 - Query Processing (continued) Advanced Databases Masood Niazi Torshiz Islamic Azad university- Mashhad Branch
Computing & Information Sciences Kansas State University Wednesday, 08 Nov 2006CIS 560: Database System Concepts Lecture 32 of 42 Monday, 06 November 2006.
13.1 Chapter 13: Query Processing n Overview n Measures of Query Cost n Selection Operation n Sorting n Join Operation n Other Operations n Evaluation.
Chapter 13: Query Processing. Overview Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions.
File Processing : Query Processing 2008, Spring Pusan National University Ki-Joune Li.
©Silberschatz, Korth and Sudarshan1 Query Processing Overview Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation.
Query Processing and Query Optimization Database System Implementation CSE 507 Some slides adapted from Silberschatz, Korth and Sudarshan Database System.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Query Processing.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Query Processing.
Database Management System
Chapter 13: Query Processing
Chapter 12: Query Processing
Query Processing.
Chapter 13: Query Processing
Evaluation of Relational Operations: Other Operations
File Processing : Query Processing
File Processing : Query Processing
Dynamic Hashing Good for database that grows and shrinks in size
Yan Huang - CSCI5330 Database Implementation – Access Methods
Chapter 13: Query Processing
Chapter 12: Query Processing
Chapter 13: Query Processing
Module 13: Query Processing
Chapter 12: Query Processing
Lecture 2- Query Processing (continued)
Chapter 13: Query Processing
Chapter 12 Query Processing (1)
Chapter 13: Query Processing
Evaluation of Relational Operations: Other Techniques
Sorting We may build an index on the relation, and then use the index to read the relation in sorted order. May lead to one disk block access for each.
Chapter 13: Query Processing
Chapter 13: Query Processing
Chapter 13: Query Processing
Chapter 12: Query Processing
Evaluation of Relational Operations: Other Techniques
Presentation transcript:

Chapter 12 Query Processing (2) Yonsei University 2 nd Semester, 2013 Sanghyun Park

Outline  Overview(already covered)  Measures of Query Cost(already covered)  Selection Operation(already covered)  Sorting(already covered)  Join Operation  Other Operations  Evaluation of Expressions

Join Operation  Several different algorithms to implement joins  Nested-loop join  Block nested-loop join  Indexed nested-loop join  Merge-join  Hash-join  Choice based on cost estimate  Examples use the following information  Number of records of student5,000takes10,000  Number of blocks of student100takes400

Nested-Loop Join (1/2)  To compute the theta join r  s for each tuple t r in r do begin for each tuple t s in s do begin test pair (t r, t s ) to see if they satisfy the join condition  if they do, add t r t s to the result end end  r is called the outer relation and s the inner relation of the join  Requires no indices and can be used with any kind of join condition  Expensive since it examines every pair of tuples in the two relations

Nested-Loop Join (2/2)  In the worst case, if the buffer can hold only one block of each relation, the estimated cost is n r  b s + b r block accesses  If the smaller relation fits entirely in memory, use that as the inner relation. The cost is reduced to b r + b s block accesses  Assuming worst case memory availability, cost estimate is  5000  = 2,000,100 block accesses with student as outer relation   = 1,000,400 block accesses with takes as outer relation  If smaller relation (student) fits entirely in memory, the cost estimate will be 500 block accesses  Block nested-loop join algorithm (next slide) is preferable

Block Nested-Loop Join  Variant of nested-loop join in which every block of inner relation is paired with every block of outer relation for each block B r of r do begin for each block B s of s do begin for each tuple t r in B r do begin for each tuple t s in B s do begin Check if (t r, t s ) satisfy the join condition if they do, add t r t s to the result end end end end  Worst case estimate: b r  b s + b r block accesses

Indexed Nested-Loop Join  Index lookups can replace file scans if  Join is an equi-join or natural join and  An index is available on the inner relation’s join attribute  For each tuple t r in the outer relation r, use the index to look up tuples in s that satisfy the join condition with tuple t r  Cost of the join: b r + n r  c  Where c is the cost of a single selection using the join condition  If indices are available on join attributes of both r and s, use the relation with fewer tuples as the outer relation

Merge-Join (1/2)  Sort both relations on their join attribute (if not already sorted on the join attributes)  Merge the sorted relations to join them

Merge-Join (2/2)  Can be used only for equi-joins and natural joins  Each block needs to be read only once  Thus number of block accesses for merge-join is b r + b s + the cost of sorting if relations are unsorted

Hash-Join (1/2)  Applicable for equi-joins and natural joins  A hash function h is used to partition tuples of both relations  h maps JoinAttrs values to {0,1,…,n} where JoinAttrs denotes the common attributes of r and s used in the natural join  r 0, r 1,..., r n denote partitions of r tuples  s 0, s 1..., s n denote partitions of s tuples  r tuples in r i need only to be compared with s tuples in s i

Hash-Join (2/2)

Other Operations  Duplicate elimination can be implemented via sorting or hashing  On sorting, duplicates will come adjacent to each other  Hashing is similar – duplicates will come into the same bucket  Projection is implemented by performing projection on each tuple, which gives a relation that could have duplicate records, and then removing duplicate records  Aggregation can be implemented in a manner similar to duplicate elimination  Sorting or hashing can be used to bring tuples in the same group together, and then the aggregate functions can be applied on each group

Evaluation of Expressions  So far, we have studied how individual operations are carried out; now we consider how to evaluate an expression containing multiple operations  The obvious way is simply to evaluate one operation at a time, in an appropriate order; the result of each evaluation is materialized in a temporary relation for subsequent use  A disadvantage to this method is the need to build the temporary relation, which (unless they are small) must be written to disk  An alternative approach is to evaluate several operations simultaneously in a pipeline with the results of one operation passed on to the next, without the need to store a temporary relation

Materialization  Evaluate one operation at a time, starting at the lowest-level; store intermediate results into temporary relations to evaluate next-level operations  Consider the expression: ∏ name (  building = “Watson” (department) instructor)

Pipelining  Evaluate several operations simultaneously, passing the results of one operation to the next  Consider the previous example again:  Don’t store result of  building=“Watson” (department)  Instead, pass tuples directly to the join  Similarly, don’t store result of join, pass tuples directly to the projection  Much cheaper than materialization  Pipelining may not always be possible