Yan Huang - CSCI5330 Database Implementation – Access Methods

Slides:



Advertisements
Similar presentations
Equality Join R X R.A=S.B S : : Relation R M PagesN Pages Relation S Pr records per page Ps records per page.
Advertisements

Copyright © 2011 Ramez Elmasri and Shamkant Navathe Algorithms for SELECT and JOIN Operations (8) Implementing the JOIN Operation: Join (EQUIJOIN, NATURAL.
CS 540 Database Management Systems
Query Execution, Concluded Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems November 18, 2003 Some slide content may.
CSCI 5708: Query Processing II Pusheng Zhang University of Minnesota Feb 5, 2004.
©Silberschatz, Korth and Sudarshan13.1Database System Concepts Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting.
Query Processing and Optimization
1 40T1 60T2 30T3 10T4 20T5 10T6 60T7 40T8 20T9 R S C C R JOIN S?
©Silberschatz, Korth and Sudarshan13.1Database System Concepts Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting.
Sorting and Query Processing Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems November 29, 2005.
Dr. Kalpakis CMSC 461, Database Management Systems Query Processing.
©Silberschatz, Korth and Sudarshan13.1Database System Concepts Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting.
Department of Computer Science and Engineering, HKUST Slide Query Processing and Optimization Query Processing and Optimization.
Query Processing. Steps in Query Processing Validate and translate the query –Good syntax. –All referenced relations exist. –Translate the SQL to relational.
Computing & Information Sciences Kansas State University Tuesday, 03 Apr 2007CIS 560: Database System Concepts Lecture 29 of 42 Tuesday, 03 April 2007.
Chapter 12 Query Processing. Query Processing n Selection Operation n Sorting n Join Operation n Other Operations n Evaluation of Expressions 2.
Chapter 13: Query Processing
CS4432: Database Systems II Query Processing- Part 3 1.
CSCI 5708: Query Processing II Pusheng Zhang University of Minnesota Feb 5, 2004.
Computing & Information Sciences Kansas State University Monday, 03 Nov 2008CIS 560: Database System Concepts Lecture 27 of 42 Monday, 03 November 2008.
CS 440 Database Management Systems Lecture 5: Query Processing 1.
Computing & Information Sciences Kansas State University Wednesday, 08 Nov 2006CIS 560: Database System Concepts Lecture 32 of 42 Monday, 06 November 2006.
13.1 Chapter 13: Query Processing n Overview n Measures of Query Cost n Selection Operation n Sorting n Join Operation n Other Operations n Evaluation.
Chapter 12 Query Processing (2) Yonsei University 2 nd Semester, 2013 Sanghyun Park.
File Processing : Query Processing 2008, Spring Pusan National University Ki-Joune Li.
CS 540 Database Management Systems
Chapter 4: Query Processing
CS 540 Database Management Systems
CS 440 Database Management Systems
Database Management System
Chapter 13: Query Processing
Chapter 12: Query Processing
Query Processing.
Chapter 13: Query Processing
COST ESTIMATION FOR THE RELATIONAL ALGEBRA OPERATIONS MIT 813 GROUP 15 PRESENTATION.
Database Management Systems (CS 564)
Cse 344 April 25th – Disk i/o.
File Processing : Query Processing
File Processing : Query Processing
Dynamic Hashing Good for database that grows and shrinks in size
Query Processing and Optimization
Query Processing.
Chapter 13: Query Processing
CPT-S 415 Big Data Yinghui Wu EME B45 1.
Query processing and optimization
Chapter 13: Query Processing
CS143:Evaluation and Optimization
Chapter 12: Query Processing
Chapter 13: Query Processing
Performance Join Operator Select * from R, S where R.a = S.a;
Module 13: Query Processing
Chapter 13: Query Processing
Chapter 12: Query Processing
Lecture 2- Query Processing (continued)
Query Processing Overview Measures of Query Cost Selection Operation
Chapter 15: Query Processing
Chapter 13: Query Processing
Chapter 13: Query Processing
Query processing and optimization
Overview of Query Evaluation: JOINS
Sorting We may build an index on the relation, and then use the index to read the relation in sorted order. May lead to one disk block access for each.
Chapter 13: Query Processing
Chapter 13: Query Processing
Lecture 11: B+ Trees and Query Execution
Chapter 13: Query Processing
Yan Huang - CSCI5330 Database Implementation – Query Processing
Chapter 13: Query Processing
Chapter 12: Query Processing
Chapter 15: Query Processing
Lecture 20: Query Execution
Presentation transcript:

Yan Huang - CSCI5330 Database Implementation – Access Methods Query Processing II 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Yan Huang - CSCI5330 Database Implementation – Access Methods Join Operation Several different algorithms to implement joins Block nested-loop join Indexed nested-loop join Merge-join Hash-join 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Block Nested-Loop Join for each block Br of r do begin for each block Bs of s do begin for each tuple tr in Br do begin for each tuple ts in Bs do begin Check if (tr,ts) satisfy the join condition if they do, add tr • ts to the result. end end end end r :outer relation s: inner relation cost = br  bs + br, Assuming one buffer for each 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Block Nested-Loop Join r … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Block Nested-Loop Join Requires no indices Can be used with any join condition cost = br  bs + br Which relation shall be out relations? 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Allocate More Buffers to Outer Loop Cost? s r … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Allocate More Buffers to Inner Loop Cost? r s How to allocate M buffers to inner loop/out loop/result? … … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Indexed Nested-Loop Join Index available on the inner relation’s join attribute Cost = br + nr  c Where c is the cost of traversing index and fetching all matching s tuples for one tuple or r Index available on the outer relation’s join attribute We do not use it, can we? r … s … … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Indices available on join attributes of both r and s Which one should be the out loop? r … s … … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Yan Huang - CSCI5330 Database Implementation – Access Methods Exercise Compute depositor customer, with depositor as the outer relation. Customer has a secondary B+-tree index on customer-name Blocking factor 20 keys #customer = 400b/10,000t #depositor =100b/5,000t Block nested loops 400*100 + 100 = 40,100 Indexed nested loops 100 + 5000 * 5 = 25,100 disk accesses. CPU cost likely to be less than that for block nested loops join 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Merge-Join Sort both relations on their join attribute (if not already sorted on the join attributes). Merge the sorted relations to join them Join step is similar to the merge stage of the sort-merge algorithm. Every pair with same value on join attribute must be matched Cost = br + bs + the cost of sorting if relations are unsorted 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

? Merge-Join But there is a problem! s r A, 1 A, 2 A, 3 B, 1 B, 2 C,1 D, 1 A, 5 A, 3 A, 1 A, 7 A, 2 C,1 C, 2 C, 3 ? We assume records with same value are in the same block! 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Yan Huang - CSCI5330 Database Implementation – Access Methods Merge-Join Can be used only for equi-joins and natural joins Why not on inequality, great than, less than joins? 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Yan Huang - CSCI5330 Database Implementation – Access Methods Exercise Compute depositor customer, with depositor as the outer relation. Customer has a secondary B+-tree index on customer-name Blocking factor 20 keys #customer = 400b/10,000t #depositor =100b/5,000t Merge join log(400) + log(100) + 400 + 100 = 516 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Yan Huang - CSCI5330 Database Implementation – Access Methods Hybrid Merge-join If one relation is sorted, and the other has a secondary B+-tree index on the join attribute Merge the sorted relation with the leaf entries of the B+-tree . Sort the result on the addresses of the unsorted relation’s tuples Scan the unsorted relation in physical address order and merge with previous result, to replace addresses by the actual tuples Sequential scan more efficient than random lookup 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods

Yan Huang - CSCI5330 Database Implementation – Access Methods Hybrid Merge-join r … s … … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods