Download presentation
Presentation is loading. Please wait.
Published byΜνήμη Ελευθερόπουλος Modified over 6 years ago
1
Yan Huang - CSCI5330 Database Implementation – Access Methods
Query Processing II 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
2
Yan Huang - CSCI5330 Database Implementation – Access Methods
Join Operation Several different algorithms to implement joins Block nested-loop join Indexed nested-loop join Merge-join Hash-join 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
3
Block Nested-Loop Join
for each block Br of r do begin for each block Bs of s do begin for each tuple tr in Br do begin for each tuple ts in Bs do begin Check if (tr,ts) satisfy the join condition if they do, add tr • ts to the result. end end end end r :outer relation s: inner relation cost = br bs + br, Assuming one buffer for each 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
4
Block Nested-Loop Join
r … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
5
Block Nested-Loop Join
Requires no indices Can be used with any join condition cost = br bs + br Which relation shall be out relations? 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
6
Allocate More Buffers to Outer Loop
Cost? s r … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
7
Allocate More Buffers to Inner Loop
Cost? r s How to allocate M buffers to inner loop/out loop/result? … … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
8
Indexed Nested-Loop Join
Index available on the inner relation’s join attribute Cost = br + nr c Where c is the cost of traversing index and fetching all matching s tuples for one tuple or r Index available on the outer relation’s join attribute We do not use it, can we? r … s … … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
9
Indices available on join attributes of both r and s
Which one should be the out loop? r … s … … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
10
Yan Huang - CSCI5330 Database Implementation – Access Methods
Exercise Compute depositor customer, with depositor as the outer relation. Customer has a secondary B+-tree index on customer-name Blocking factor 20 keys #customer = 400b/10,000t #depositor =100b/5,000t Block nested loops 400* = 40,100 Indexed nested loops * 5 = 25,100 disk accesses. CPU cost likely to be less than that for block nested loops join 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
11
Merge-Join Sort both relations on their join attribute (if not already sorted on the join attributes). Merge the sorted relations to join them Join step is similar to the merge stage of the sort-merge algorithm. Every pair with same value on join attribute must be matched Cost = br + bs + the cost of sorting if relations are unsorted 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
12
? Merge-Join But there is a problem! s r A, 1 A, 2 A, 3 B, 1 B, 2 C,1
D, 1 A, 5 A, 3 A, 1 A, 7 A, 2 C,1 C, 2 C, 3 ? We assume records with same value are in the same block! 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
13
Yan Huang - CSCI5330 Database Implementation – Access Methods
Merge-Join Can be used only for equi-joins and natural joins Why not on inequality, great than, less than joins? 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
14
Yan Huang - CSCI5330 Database Implementation – Access Methods
Exercise Compute depositor customer, with depositor as the outer relation. Customer has a secondary B+-tree index on customer-name Blocking factor 20 keys #customer = 400b/10,000t #depositor =100b/5,000t Merge join log(400) + log(100) = 516 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
15
Yan Huang - CSCI5330 Database Implementation – Access Methods
Hybrid Merge-join If one relation is sorted, and the other has a secondary B+-tree index on the join attribute Merge the sorted relation with the leaf entries of the B+-tree . Sort the result on the addresses of the unsorted relation’s tuples Scan the unsorted relation in physical address order and merge with previous result, to replace addresses by the actual tuples Sequential scan more efficient than random lookup 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
16
Yan Huang - CSCI5330 Database Implementation – Access Methods
Hybrid Merge-join r … s … … 1/14/2005 Yan Huang - CSCI5330 Database Implementation – Access Methods
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.