@ Carnegie Mellon Databases. Inspector Joins. Shimin Chen, Phillip B. Gibbons, Todd C. Mowry, Anastassia Ailamaki. Carnegie Mellon University; Intel Research.

1. Inspector Joins
Shimin Chen, Phillip B. Gibbons, Todd C. Mowry, Anastassia Ailamaki
Carnegie Mellon University; Intel Research Pittsburgh

2. Exploiting Information about Data
- Ability to improve a query depends on information quality
  - General statistics on relations are inadequate
  - May lead to incorrect decisions for specific queries
  - Especially true for join queries
- Previous approaches exploiting dynamic information
  - Collecting information from previous queries
    - Multi-query optimization [Sellis’88]
    - Materialized views [Blakeley et al. 86]
    - Join indices [Valduriez’87]
  - Dynamic re-optimization of query plans [Kabra&DeWitt’98] [Markl et al. 04]
- This study exploits the inner structure of hash joins

3. Exploiting Multi-Pass Structure of Hash Joins
- Idea:
  - Examine the actual data in the I/O partitioning phase
  - Extract useful information to improve the join phase
[Figure: I/O partitioning phase, with inspection, followed by the join phase. Extra information greatly helps phase 2.]

4. Using Extracted Information
- Enable a new join phase algorithm
  - Reduce the primary performance bottleneck in hash joins, i.e. poor CPU cache performance
  - Optimized for multi-processor systems
- Choose the most suitable join phase algorithm for special input cases
[Figure: inspection during I/O partitioning yields extracted information, which is used to decide among join phase algorithms: simple hash join, cache partitioning, cache prefetching, or the new algorithm.]

5. Outline
- Motivation
- Previous hash join algorithms
- Hash join performance on SMP systems
- Inspector join
- Experimental results
- Conclusions

6. GRACE Hash Join
- I/O partitioning phase:
  - Divide input relations into partitions with a hash function
- Join phase (simple hash join):
  - Build hash table, then probe hash table
- Random memory accesses cause poor CPU cache performance
  - Over 70% of execution time is stalled on cache misses!
[Figure: build and probe relations divided into partitions; each build partition is loaded into a hash table and probed.]
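The two phases can be sketched in a few lines of Python. This is only an illustration of the control flow, assuming in-memory lists of `(key, payload)` tuples and Python's built-in `hash`; the function names are hypothetical, and a real GRACE join writes partitions to disk during the first phase.

```python
from collections import defaultdict

def io_partition(relation, num_partitions):
    """Phase 1: divide a relation into partitions by hashing the join key."""
    partitions = [[] for _ in range(num_partitions)]
    for tup in relation:
        partitions[hash(tup[0]) % num_partitions].append(tup)
    return partitions

def simple_hash_join(build_part, probe_part):
    """Phase 2: build a hash table on one partition, then probe it."""
    table = defaultdict(list)
    for b in build_part:
        table[b[0]].append(b)
    out = []
    for p in probe_part:
        # .get() avoids creating empty entries for keys with no match
        for b in table.get(p[0], ()):
            out.append((b, p))
    return out

def grace_hash_join(build, probe, num_partitions=4):
    """Partition both inputs the same way, then join matching partition pairs."""
    build_parts = io_partition(build, num_partitions)
    probe_parts = io_partition(probe, num_partitions)
    result = []
    for bp, pp in zip(build_parts, probe_parts):
        result.extend(simple_hash_join(bp, pp))
    return result
```

The hash-table lookups in `simple_hash_join` are the random memory accesses the slide refers to.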

7. Cache Partitioning [Shatdal et al. 94] [Boncz et al.’99] [Manegold et al.’00]
- Recursively produce cache-sized partitions after I/O partitioning
- Avoid cache misses when joining cache-sized partitions
- Overhead of re-partitioning
[Figure: memory-sized build and probe partitions re-partitioned into cache-sized partitions.]

8. Cache Prefetching [Chen et al. 04]
- Reduce impact of cache misses
  - Exploit available memory bandwidth
  - Overlap cache misses and computations
  - Insert cache prefetch instructions into code
- Still incurs the same number of cache misses
[Figure: build and probe against a hash table.]

9. Outline
- Motivation
- Previous hash join algorithms
- Hash join performance on SMP systems
- Inspector join
- Experimental results
- Conclusions

10. Hash Joins on SMP Systems
- Previous studies mainly focus on uni-processors
- Memory bandwidth is precious
- Each processor joins a pair of partitions in the join phase
[Figure: four CPUs, each with its own cache, connected to main memory by a shared bus; CPU i joins Build i with Probe i, for i = 1..4.]

11. Previous Algorithms on SMP Systems
- Join phase performance when joining a 500MB relation with a 2GB relation (details later in the talk)
- Aggregate performance degrades dramatically over 4 CPUs
- Reduce data movement (memory to memory, memory to cache)
[Figure: two plots, wall-clock time and aggregate time on all CPUs, annotated with re-partition cost and bandwidth sharing.]

12. Inspector Joins
- Extracted information: summary of matching relationships
  - Every K contiguous pages in a build partition form a sub-partition
  - Tells which sub-partition(s) every probe tuple matches
[Figure: a build partition divided into Sub-partitions 0, 1 and 2; the summary of matching relationships, produced during I/O partitioning, links probe partition tuples to sub-partitions for the join phase.]

13. Cache-Stationary Join Phase
- Recall cache partitioning: re-partition cost (copying cost)
- We want to achieve zero copying
[Figure: build partition, probe partition and hash table relative to the CPU cache, between the I/O partitioning and join phases.]

14. Cache-Stationary Join Phase
- Joins a sub-partition and its matching probe tuples
- Sub-partition is small enough to fit in CPU cache
- Cache prefetching for the remaining cache misses
- Zero copying for generating recursive cache-sized partitions
[Figure: the hash table for one sub-partition (0, 1 or 2) of the build partition stays in the CPU cache while its matching probe tuples are joined.]
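A minimal sketch of the join-phase logic, under assumptions not stated in the slides: tuples are `(key, payload)` pairs in Python lists, `sub_size` counts tuples rather than K pages, and `summary[i]` lists the sub-partition ids that probe tuple `i` may match. The function name is hypothetical. Python cannot show the cache behavior; the sketch only shows that each sub-partition is joined in place against its own probe tuples, with probe tuples referenced by index instead of being copied into new partitions.

```python
from collections import defaultdict

def cache_stationary_join(build_partition, probe_partition, summary, sub_size):
    """Join each build sub-partition with the probe tuples the summary assigns to it."""
    # Group probe tuple indices by sub-partition: indices only, no tuple copies.
    probes_for = defaultdict(list)
    for i, subs in enumerate(summary):
        for s in subs:
            probes_for[s].append(i)

    out = []
    n = len(build_partition)
    for s, start in enumerate(range(0, n, sub_size)):
        # Hash table over one sub-partition (assumed small enough for the cache).
        table = defaultdict(list)
        for j in range(start, min(start + sub_size, n)):
            b = build_partition[j]
            table[b[0]].append(b)
        # Probe only with tuples that may match this sub-partition.
        for i in probes_for.get(s, ()):
            p = probe_partition[i]
            for b in table.get(p[0], ()):
                out.append((b, p))
    return out
```

Because the summary may contain false positives, a probe tuple can be listed for a sub-partition it does not match; the hash-table lookup then simply finds nothing.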

15. Filters in I/O Partitioning
- How to extract the summary efficiently?
- Extend filter scheme in commercial hash joins
- Conventional single-filter scheme
  - Represent all build join keys
  - Filter out probe tuples having no matches
[Figure: during I/O partitioning into memory-sized partitions, the filter is constructed from the build relation and tested against the probe relation.]

16. Background: Bloom Filter
- A bit vector
- A key is hashed d (e.g. d=3) times and represented by d bits
- Construct: for every build join key, set its 3 bits in the vector
- Test: given a probe join key, check if all its 3 bits are 1
  - Discard the tuple if some bits are 0
  - May have false positives
[Figure: filter bit vector 00011100011001000001, with Bit0 = H0(key), Bit1 = H1(key), Bit2 = H2(key).]
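The construct and test operations can be sketched as follows. The class name, the one-byte-per-bit storage, and the use of salted BLAKE2b digests as the d hash functions are illustrative assumptions; the slides do not say which hash functions are used.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits, d=3):
        self.num_bits = num_bits
        self.d = d
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def positions(self, key):
        """Bit positions H_0(key) .. H_{d-1}(key), derived from salted digests."""
        out = []
        for i in range(self.d):
            digest = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
            out.append(int.from_bytes(digest, "little") % self.num_bits)
        return out

    def add(self, key):
        """Construct: set the d bits of a build join key."""
        for pos in self.positions(key):
            self.bits[pos] = 1

    def may_contain(self, key):
        """Test: False means definitely absent; True may be a false positive."""
        return all(self.bits[pos] for pos in self.positions(key))
```

A probe tuple whose key fails `may_contain` can be discarded during I/O partitioning, since a Bloom filter never produces false negatives.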

17. Multi-Filter Scheme
- Single filter: a probe tuple → entire build relation
- Our goal: a probe tuple → sub-partitions
- Construct a filter for every sub-partition
- Replace a single large filter with multiple small filters
[Figure: build relation divided into Partitions 0, 1 and 2, each divided into sub-partitions Sub0,0 through Sub2,2; the single filter is replaced by a multi-filter with one filter per sub-partition.]

18. Testing Multi-Filters
- When partitioning the probe relation
  - Test a probe tuple against all the filters of a partition
  - Tells which sub-partition(s) the tuple may have matches in
  - Store summary of matching relationships in partitions
[Figure: during I/O partitioning, probe relation tuples are tested against the multi-filter and written to Partitions 0, 1 and 2.]
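A sketch of building one filter per sub-partition and testing a probe key against all of them. The helper names, the list-of-ints filter layout, and the salted BLAKE2b hashes are assumptions made for the example. All filters in a partition share the same size and hash functions, so the three bit positions are computed once per probe key.

```python
import hashlib

D = 3  # hashes per key, as in the slides' example

def bit_positions(key, num_bits):
    """The D bit positions of a key (illustrative hash choice)."""
    return [
        int.from_bytes(
            hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest(), "little"
        ) % num_bits
        for i in range(D)
    ]

def build_multi_filter(sub_partitions, num_bits):
    """One small filter (a list of 0/1 values) per sub-partition of build keys."""
    filters = []
    for keys in sub_partitions:
        bits = [0] * num_bits
        for k in keys:
            for pos in bit_positions(k, num_bits):
                bits[pos] = 1
        filters.append(bits)
    return filters

def matching_sub_partitions(filters, probe_key):
    """Ids of sub-partitions whose filter reports a possible match."""
    pos = bit_positions(probe_key, len(filters[0]))
    return [s for s, bits in enumerate(filters) if all(bits[p] for p in pos)]
```

The returned list is the per-tuple summary: it always contains every sub-partition that truly holds the key, and may contain extra ones because of false positives.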

19. Minimizing Cache Misses for Testing Filters
- Single-filter scheme:
  - Compute 3 bit positions
  - Test 3 bits
- Multi-filter scheme: if there are S sub-partitions in a partition
  - Compute 3 bit positions
  - Test the same 3 bits for every filter, altogether 3*S bits
  - May cause 3*S cache misses!
[Figure: a probe tuple tested against the S filters of a partition; example bit triples 001, 111, 011.]

20. Vertical Filters for Testing
- Bits at the same position are contiguous in memory
- 3 cache misses instead of 3*S cache misses!
- Horizontal → vertical conversion after partitioning build relation
  - Very small overhead in practice
[Figure: the S filters stored vertically, so that the bits tested for a probe tuple (example columns 001, 111, 011) are contiguous in memory.]
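One way to picture the vertical layout is to store, for each bit position, a single word whose bit s is the bit of filter s at that position. The sketch below uses Python integers as those words; the function names and the layout are illustrative assumptions, not necessarily the representation used in the paper.

```python
def to_vertical(filters):
    """Convert filters[s][pos] (horizontal) into vertical[pos], an int whose bit s is filters[s][pos]."""
    num_bits = len(filters[0])
    vertical = [0] * num_bits
    for s, bits in enumerate(filters):
        for pos, bit in enumerate(bits):
            if bit:
                vertical[pos] |= 1 << s
    return vertical

def probe_vertical(vertical, positions):
    """AND the words at the key's bit positions; bit s of the result is set
    exactly when all tested bits of filter s are 1."""
    mask = vertical[positions[0]]
    for pos in positions[1:]:
        mask &= vertical[pos]
    return mask

def sub_partitions_from_mask(mask):
    """Decode the result mask into a list of sub-partition ids."""
    subs, s = [], 0
    while mask:
        if mask & 1:
            subs.append(s)
        mask >>= 1
        s += 1
    return subs
```

With three bit positions per key, a test reads three words regardless of S, which corresponds to the 3 cache misses on the slide as long as each word fits in a cache line.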

21. More Details in Paper
- Moderate memory space requirement for filters
- Summary information representation in intermediate partitions
- Preprocessing for cache-stationary join phase
- Prefetching for improving efficiency and robustness

22. Outline
- Motivation
- Previous hash join algorithms
- Hash join performance on SMP systems
- Inspector join
- Experimental results
- Conclusions

23. Experimental Setup
- Relation schema: 4-byte join attribute + fixed-length payload
- No selection, no projection
- 50MB memory per CPU available for the join phase
- Same join algorithm run on every CPU, joining different partitions
- Detailed cycle-by-cycle simulations
  - A shared-bus SMP system with 1.5GHz processors
  - Memory hierarchy is based on the Itanium 2 processor

24. Partition Phase Wall-Clock Time
- I/O partitioning can take advantage of multiple CPUs
  - Cut input relations into equal-sized chunks
  - Partition one chunk on every CPU
  - Concatenate outputs from all CPUs
- Enhanced cache partitioning: cache partitioning + advanced prefetching
- Inspection incurs very small overhead
[Figure: partition phase wall-clock time vs. number of CPUs used, for GRACE, cache prefetching, cache partitioning, enhanced cache partitioning, and inspector join. Workload: 500MB joins 2GB; 100B tuples, 4B keys; 50% of probe tuples have no matches; a build tuple matches 2 probe tuples.]
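The chunk, partition and concatenate steps can be sketched with a thread pool standing in for the CPUs. The function names are hypothetical, tuples are assumed to be `(key, payload)` pairs, and Python threads only illustrate the structure; they do not reproduce the parallel speedup measured in the talk.

```python
from concurrent.futures import ThreadPoolExecutor

def partition_chunk(chunk, num_partitions):
    """Partition one chunk by hashing the join key."""
    parts = [[] for _ in range(num_partitions)]
    for tup in chunk:
        parts[hash(tup[0]) % num_partitions].append(tup)
    return parts

def parallel_io_partition(relation, num_partitions, num_cpus):
    # Cut the input into (nearly) equal-sized chunks, one per CPU.
    chunk_size = max(1, -(-len(relation) // num_cpus))  # ceiling division
    chunks = [relation[i:i + chunk_size] for i in range(0, len(relation), chunk_size)]
    # Partition every chunk independently.
    with ThreadPoolExecutor(max_workers=num_cpus) as pool:
        per_chunk = list(pool.map(lambda c: partition_chunk(c, num_partitions), chunks))
    # Concatenate the outputs of all workers, partition by partition.
    merged = [[] for _ in range(num_partitions)]
    for parts in per_chunk:
        for p, tuples in enumerate(parts):
            merged[p].extend(tuples)
    return merged
```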

25. Join Phase Aggregate Time
- Inspector join achieves significantly better performance when 8 or more CPUs are used
  - 1.7-2.1X speedups over cache prefetching
  - 1.6-2.0X speedups over enhanced cache partitioning
[Figure: join phase aggregate time vs. number of CPUs used, for GRACE, cache prefetching, cache partitioning, enhanced cache partitioning, and inspector join. Workload: 500MB joins 2GB; 100B tuples, 4B keys; 50% of probe tuples have no matches; a build tuple matches 2 probe tuples.]

26. Results on Choosing Suitable Join Phase
- Case #1: a large number of duplicate build join keys
  - Choose enhanced cache partitioning
  - When a probe tuple on average matches 4 or more sub-partitions
- Case #2: nearly sorted input relations
  - Surprisingly: cache-stationary join is very good
[Figure: inspection during I/O partitioning yields extracted information, which is used to decide among join phase algorithms: simple hash join, cache partitioning, cache prefetching, or cache-stationary join.]
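The rule for case #1 can be expressed as a small decision function over the summary. This is a sketch: the function name and the `summary` layout (one list of sub-partition ids per probe tuple) are assumptions, and it covers only the two alternatives named in these results, not the other algorithms shown in the diagram.

```python
def choose_join_phase(summary, threshold=4.0):
    """Pick a join phase algorithm from the average number of sub-partitions
    a probe tuple matches (threshold of 4 taken from the slide)."""
    if not summary:
        return "cache-stationary"
    avg_matches = sum(len(subs) for subs in summary) / len(summary)
    if avg_matches >= threshold:
        return "enhanced cache partitioning"
    return "cache-stationary"
```

For example, a summary in which most probe tuples list one or two sub-partitions keeps the cache-stationary join, while heavy key duplication that pushes the average to 4 or more switches to enhanced cache partitioning.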

27. Conclusions
- Exploit multi-pass structure for higher quality information about data
- Achieve significantly better cache performance
  - 1.6X speedups over previous cache-friendly algorithms
  - When 8 or more CPUs are used
- Choose most suitable algorithms for special input cases
- Idea may be applicable to other multi-pass algorithms

28. Thank You!

29. Partition Phase Wall-Clock Time
- I/O partitioning can take advantage of multiple CPUs
  - Cut input relations into equal-sized chunks
  - Partition one chunk on every CPU
  - Concatenate outputs from all CPUs
- Inspection incurs very small overhead
[Figure: partition phase wall-clock time vs. number of CPUs used, for GRACE, cache prefetching, cache partitioning, and inspector join. Workload: 500MB joins 2GB; 100B tuples, 4B keys; 50% of probe tuples have no matches; a build tuple matches 2 probe tuples.]

30. Join Phase Aggregate Time
- Inspector join achieves significantly better performance when 8 or more CPUs are used
  - 1.7-2.1X speedups over cache prefetching
  - 1.6-2.0X speedups over enhanced cache partitioning
[Figure: join phase aggregate time vs. number of CPUs used, for GRACE, cache prefetching, cache partitioning, and inspector join. Workload: 500MB joins 2GB; 100B tuples, 4B keys; 50% of probe tuples have no matches; a build tuple matches 2 probe tuples.]

31. CPU-Cache-Friendly Hash Joins
- Recent studies focus on CPU cache performance
  - I/O partitioning gives good I/O performance
  - Random memory accesses cause poor CPU cache performance
- Cache Partitioning [Shatdal et al. 94] [Boncz et al.’99] [Manegold et al.’00]
  - Recursively produce cache-sized partitions from memory-sized partitions
  - Avoid cache misses during join phase
  - Pay re-partitioning cost
- Cache Prefetching [Chen et al. 04]
  - Exploit memory system parallelism
  - Use prefetches to overlap multiple cache misses and computations
[Figure: build and probe against a hash table.]

32. Example Special Input Cases
- Example case #1: a large number of duplicate build join keys
  - Count the average number of sub-partitions a probe tuple matches
  - Must check the tuple against all possible sub-partitions
  - If too large, cache-stationary join works poorly
- Example case #2: nearly sorted input relations
  - A merge-based join phase might be better?
[Figure: a probe tuple in the probe partition matching Sub-partitions 0, 1 and 2 of the build partition.]

33. Varying Number of Duplicates per Build Join Key
- Join phase aggregate performance
- Choose enhanced cache partitioning
  - When a probe tuple on average matches 4 or more sub-partitions

34. Nearly Sorted Cases
- Sort both input relations, then randomly move 0%-5% of tuples
- Join phase aggregate performance
- Surprisingly: cache-stationary join is very good
  - Even better than merge join when over 1% of tuples are out of order
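The nearly sorted inputs can be generated along these lines. The function name, the fixed seed, and the exact move procedure (remove a random tuple and reinsert it at a random position) are assumptions; the slides only say that 0%-5% of tuples are moved at random after sorting.

```python
import random

def nearly_sorted(keys, fraction, seed=0):
    """Sort the keys, then move roughly `fraction` of them to random positions."""
    rng = random.Random(seed)
    out = sorted(keys)
    n_move = int(len(out) * fraction)
    for _ in range(n_move):
        moved = out.pop(rng.randrange(len(out)))
        out.insert(rng.randrange(len(out) + 1), moved)
    return out
```

Calling `nearly_sorted(keys, 0.01)` would correspond to the 1% out-of-order point mentioned above.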

35. Analyzing Nearly Sorted Case
- Partitions are also nearly sorted
- Probe tuples matching a sub-partition are almost contiguous
- Similar memory behavior to merge join
- No cost for sorting out-of-order tuples
[Figure: nearly sorted build and probe partitions; the probe tuples matching each of Sub-partitions 0, 1 and 2 lie almost contiguously.]

