Join Processing for Flash SSDs: Remembering Past Lessons


1 Join Processing for Flash SSDs: Remembering Past Lessons
Jaeyoung Do, Jignesh M. Patel, Department of Computer Sciences, University of Wisconsin-Madison. All right, welcome to my presentation. My name is Jaeyoung Do. Today I'm going to talk about the performance of join algorithms on flash SSDs.

2 Flash Solid State Drives (SSDs)
▶ Benefits of Flash SSDs: low power consumption, high shock resistance, fast random access ▶ Flash densities: NAND flash capacities have grown from MB-scale to GB-scale, with TB-scale predicted by Samsung ▶ Flash prices ($/MB): decreasing continuously. "Flash is disk, disk is tape, and tape is dead." [Two charts: NAND flash density and flash price ($/MB) over time.] What are flash Solid State Drives? Flash SSDs are data storage devices that use NAND flash memory chips. Today, the most widely available and popular storage device is still the magnetic hard disk drive. However, since flash SSDs have many benefits compared to magnetic hard disk drives, they are expected to gradually replace hard disks as the primary permanent storage medium in large data centers. As flash capacity continues to increase and its price drops, Jim Gray's prediction that "flash is disk, disk is tape, and tape is dead" is coming close to reality in many applications. So, what are the advantages of flash SSDs over magnetic hard disks?

3 Flash SSDs for DBMSs
▶ Many previous works with flash SSDs: Flash-based DBMS [Gray ACM Queue 08, Graefe DaMoN 07], In-Page Logging [Lee VLDB 07], Transaction Processing [Lee VLDB 08], B+-Tree Index [Li ICDE 09], I/O Benchmarks [Bouganim CIDR 09] ▶ We focus on join algorithms: new join algorithms [Shah DaMoN 08, Tsirogiannis SIGMOD 09] How can we use flash SSDs for DBMSs? The use of flash SSDs for DBMSs is being actively studied in various areas, including transaction processing and B+-tree indexing. In this work, we focus on joins, which are common but expensive query processing operations. There are already a few research efforts to develop new join algorithms for flash SSDs. Rather than developing yet another new join algorithm, our aim is to see which join techniques that worked for magnetic hard disk drives continue to work with flash SSDs. In other words, we want to see what happens if we simply replace magnetic HDDs with flash SSDs, and examine the effects of tuning parameters on join performance. For joins, we have plenty of lessons learnt over three decades of designing and tuning join algorithms for systems based on hard disk drives. If joins were run on flash SSDs, would these lessons be meaningless? As we will see later, the answer is no: many lessons are still important with flash SSDs. The focus of this work: what lessons learnt for magnetic HDDs still apply to flash SSDs?

4 Goals
▶ Demonstrate the importance of recalling past lessons about efficient joins on magnetic HDDs ▶ Explore the effects of various parameters for joins on flash SSDs 1. Not inventing new join algorithms 2. Providing better insights into join performance on flash SSDs. The first goal of our research is to recall some of the important lessons about efficient join processing on magnetic hard disks, and to determine whether these lessons also apply to joins using flash SSDs. In addition, we want to see how parameters tuned for the characteristics of magnetic HDDs work for joins on flash SSDs. The result is a better starting point when designing new join algorithms for flash SSDs.

5 Our Approach ▶ Investigate four popular ad hoc join algorithms
Block Nested Loops Join (BNL), Sort-Merge Join (SM), Grace Hash Join (GH), Hybrid Hash Join (HH) ▶ Conduct experiments varying several parameters: memory buffer pool size, page size, I/O unit size. To achieve these goals, we investigated four ad hoc join algorithms that are popular in both the literature and industry, namely BNL, SM, GH, and HH. We then conducted experiments to see the effects of tuning parameters such as the buffer pool size, the page size, and the I/O unit size.

6 Roadmap ▶ Introduction / Goals ▶ Ad hoc join algorithms ▶ Experimental Results ▶ Conclusion / Future Work

7 Assumptions ▶ Blocked I/O is available ▶ We use the buffer allocation strategy tuned for magnetic HDDs [Haas et al. VLDB 97] Before explaining the join algorithms considered in our work, let me state some assumptions. First, rather than fetching one page per I/O operation, we use blocked I/O to sequentially read and write multiple pages per I/O operation. For magnetic hard disk drives, blocked I/O is useful because it amortizes the cost of expensive disk seeks and rotational delays over many pages. Second, the buffer allocation strategy determines how the buffer pool is divided among the inputs and outputs during join processing. A lot is known about how to optimize joins on magnetic HDDs to use the available memory buffer pool effectively; specifically, Haas et al. showed that the right buffer pool allocation strategy can have a huge impact, up to a 400% improvement in some cases. In this work, we use the same buffer allocation method for both flash SSDs and magnetic HDDs. While these allocations may not be optimal for flash SSDs, our goal here is to start with the best allocation strategy for magnetic HDDs, and explore what happens if we simply keep the same settings when replacing a magnetic HDD with a flash SSD.
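To make the blocked I/O assumption concrete, here is a minimal sketch (hypothetical Python, not code from the paper; the page and block sizes are illustrative) contrasting page-at-a-time reads with blocked reads that fetch a run of contiguous pages in a single request:

```python
import os

PAGE_SIZE = 8 * 1024    # 8 KB pages, as in the experiments
BLOCK_PAGES = 32        # pages per blocked I/O request (illustrative)

def read_pages_one_by_one(fd, start_page, num_pages):
    """Non-blocked I/O: one pread() system call per page."""
    return [os.pread(fd, PAGE_SIZE, (start_page + i) * PAGE_SIZE)
            for i in range(num_pages)]

def read_pages_blocked(fd, start_page, num_pages):
    """Blocked I/O: one pread() for a run of contiguous pages,
    amortizing the per-request cost (seek plus rotational delay on an
    HDD, command overhead on an SSD) over many pages."""
    buf = os.pread(fd, num_pages * PAGE_SIZE, start_page * PAGE_SIZE)
    return [buf[i * PAGE_SIZE:(i + 1) * PAGE_SIZE]
            for i in range(num_pages)]
```

With the second form, an HDD pays the seek and rotational delay once per 32 pages instead of once per page; the experiments later in the deck show the same trick also pays off on SSDs.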

8 Block Nested Loops Join
[Diagram: the buffer pool of B pages is split into an input buffer of I_R = B − I_S pages for the outer relation R and an input buffer of I_S pages for the inner relation S; R and S are read from disk, and matching tuples are written to the result.]
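For reference, a minimal in-memory sketch of block nested loops join (hypothetical Python that ignores the I/O layer; the helper names are ours): the outer relation R is consumed one block at a time, and the inner relation S is scanned once per block.

```python
def block_nested_loops_join(R, S, block_size, key_r, key_s):
    """R is the outer relation, S the inner; block_size plays the
    role of the I_R pages of the buffer pool that hold an R block."""
    R, S = list(R), list(S)
    for start in range(0, len(R), block_size):
        block = R[start:start + block_size]
        # Hash the resident R block by join key so each S tuple
        # probes the block instead of comparing with every R tuple.
        table = {}
        for r in block:
            table.setdefault(key_r(r), []).append(r)
        for s in S:                  # one full scan of S per R block
            for r in table.get(key_s(s), []):
                yield r + s
```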

9 Sort-Merge Join
[Diagram: R and S are each read from disk through input buffers, sorted into runs that are written via an output buffer to a working space on disk, and the sorted runs are then merged through input buffers to produce the result.]
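A compact sketch of the idea (hypothetical Python; in-memory sorting stands in for the on-disk run formation and merging the slide depicts):

```python
def sort_merge_join(R, S, key_r, key_s):
    """Minimal merge join over the two sorted inputs."""
    R = sorted(R, key=key_r)   # stands in for the external sort of R
    S = sorted(S, key=key_s)   # stands in for the external sort of S
    i = j = 0
    while i < len(R) and j < len(S):
        kr, ks = key_r(R[i]), key_s(S[j])
        if kr < ks:
            i += 1
        elif kr > ks:
            j += 1
        else:
            # Find the group of S tuples sharing this key once,
            # then pair it with every R tuple carrying the same key.
            j_end = j
            while j_end < len(S) and key_s(S[j_end]) == kr:
                j_end += 1
            while i < len(R) and key_r(R[i]) == kr:
                for jj in range(j, j_end):
                    yield R[i] + S[jj]
                i += 1
            j = j_end
```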

10 Grace Hash Join
[Diagram: in the first phase, R and then S are read through input buffers and hashed by a function h into k buckets that are written back to disk; in the second phase, each pair of corresponding R and S buckets is read back and joined to produce the result.]
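The same two phases in miniature (hypothetical Python; in-memory partition lists stand in for the k on-disk buckets):

```python
def grace_hash_join(R, S, k, key_r, key_s):
    """Two-phase hash join: partition both inputs with the same
    hash function h, then join corresponding partitions."""
    parts_r = [[] for _ in range(k)]
    parts_s = [[] for _ in range(k)]
    for r in R:                      # phase 1: partition R
        parts_r[hash(key_r(r)) % k].append(r)
    for s in S:                      # phase 1: partition S
        parts_s[hash(key_s(s)) % k].append(s)
    for pr, ps in zip(parts_r, parts_s):
        # Phase 2: build a hash table on the R bucket, probe with S.
        table = {}
        for r in pr:
            table.setdefault(key_r(r), []).append(r)
        for s in ps:
            for r in table.get(key_s(s), []):
                yield r + s
```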

11 Hybrid Hash Join
[Diagram: as in Grace hash join, R and S are hashed into buckets, but one in-memory bucket is kept resident in the buffer pool so that tuples hashing to it are joined immediately during partitioning; buckets 1 through k are written to disk and joined in a second phase.]
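A sketch of the hybrid variant (hypothetical Python, same caveats as above): partition 0 never touches disk, so its tuples are joined on the fly while the remaining partitions are spilled and processed as in Grace hash join.

```python
def hybrid_hash_join(R, S, k, key_r, key_s):
    """Like Grace hash join, but partition 0 stays memory-resident."""
    in_mem = {}                       # build side of the resident bucket
    parts_r = [[] for _ in range(k)]  # stand-ins for on-disk R buckets
    for r in R:
        b = hash(key_r(r)) % (k + 1)
        if b == 0:
            in_mem.setdefault(key_r(r), []).append(r)
        else:
            parts_r[b - 1].append(r)
    parts_s = [[] for _ in range(k)]  # stand-ins for on-disk S buckets
    for s in S:
        b = hash(key_s(s)) % (k + 1)
        if b == 0:                    # join immediately, no spill
            for r in in_mem.get(key_s(s), []):
                yield r + s
        else:
            parts_s[b - 1].append(s)
    # Second phase: join the spilled buckets as in Grace hash join.
    for pr, ps in zip(parts_r, parts_s):
        table = {}
        for r in pr:
            table.setdefault(key_r(r), []).append(r)
        for s in ps:
            for r in table.get(key_s(s), []):
                yield r + s
```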

12 Roadmap ▶ Introduction / Goals ▶ Ad hoc join algorithms ▶ Experimental Results ▶ Conclusion / Future Work

13 Experimental Setup ▶ A single-thread and light-weight database engine
▶ Flash SSD and magnetic HDD: OCZ Core Series 2.5” SATA 60 GB; TOSHIBA 5400 RPM 320 GB ▶ Data Set: TPC-H (Customer: 730 MB, Orders: 5 GB) ▶ Platform: Dual Core 3.2 GHz Intel Pentium, Red Hat; max. DB buffer pool size 600 MB

Drive specifications (source: OCZ and TOSHIBA):
                           SSD             HDD
Avg. Seek Time             None            12 ms
Avg. Latency               0.35 ms         5.56 ms
Read Data Transfer Rate    120 MB/sec      34 MB/sec
Write Data Transfer Rate   80 MB/sec       —
Price                      $ (3.85 $/GB)   $ (0.36 $/GB)

14 Effect of Blocked I/O (HDD)
[Chart: join time in seconds, broken into CPU time and I/O time, for BNL, SM, GH, and HH on the HDD with a 500 MB buffer pool and 8 KB pages, comparing non-blocked I/O with blocked I/O; blocked I/O improves the four algorithms by 2.0X to 2.3X.]

15 Effect of Blocked I/O (SSD)
[Chart: join time in seconds, broken into CPU time and I/O time, for BNL, SM, GH, and HH on the SSD with a 500 MB buffer pool and 8 KB pages, comparing non-blocked I/O with blocked I/O; blocked I/O improves the four algorithms by 1.6X to 1.9X.] Using blocked I/O is critical.

16 Joins are I/O Bound? (HDD)
[Chart: join time in seconds, broken into CPU time and I/O time, for BNL, SM, GH, and HH on the HDD with 8 KB pages and blocked I/O, at buffer pool sizes of 200 MB and 500 MB; each bar is annotated with a ratio between 0.31 and 0.69.]

17 Joins are I/O Bound? (SSD)
[Chart: join time in seconds, broken into CPU time and I/O time, for BNL, SM, GH, and HH on the SSD with 8 KB pages and blocked I/O, at buffer pool sizes of 200 MB and 500 MB; each bar is annotated with a ratio between 0.64 and 1.79.] Joins may become CPU-bound sooner.

18 Effect of Varying the Page Size (SSD)
[Chart: join time in seconds, broken into CPU time and I/O time, for BNL, SM, GH, and HH on the SSD with a 500 MB buffer pool and blocked I/O, as the page size (KB) grows.] When using blocked I/O, the page size has a small impact on join performance.

19 Performance Tendency (SSD)
[Chart: join time in seconds, broken into CPU time and I/O time, for BNL, SM, GH, and HH on the SSD with 8 KB pages and blocked I/O, at three buffer pool sizes starting from 200 MB.] 1. Superiority of HH

20 Performance Tendency (SSD)
[Chart: join time in seconds, broken into CPU time and I/O time, for BNL, SM, GH, and HH on the SSD with 8 KB pages and blocked I/O, at three buffer pool sizes starting from 200 MB.] 2. Competitiveness of BNL

21 Performance Tendency (SSD)
More details in our DaMoN '09 paper. [Chart: join time in seconds, broken into CPU time and I/O time, for BNL, SM, GH, and HH on the SSD with 8 KB pages and blocked I/O, at three buffer pool sizes starting from 200 MB; annotated gaps of 1.6X, 2.8X, and 1.7X at the three sizes.] 3. GH is not the winner! [Shah et al. DaMoN 08]

22 Conclusions / Future Work
▶ Traditional join optimizations continue to be important with flash SSDs: blocked I/O dramatically improves join performance; the buffer allocation strategy has an impact on join performance; it is even more critical to consider both CPU and I/O costs ▶ Future Work: expand the range of hardware, and consider other HDD-based configurations; derive detailed cost models for existing join algorithms, and explore the optimal buffer allocations for flash SSDs; provide better insights into join performance on flash SSDs. Before you go off to develop new join algorithms for flash SSDs, make sure that you can get large performance improvements with the basic join algorithms by applying the many traditional join optimizations.

23 Backup Slides This message is for researchers and developers, not the vendors.
We are not trying to say "this is what Oracle should do." Look: we took old, standard algorithms, we modified some parameters, and we got real performance differences. Before you go off to develop new join algorithms for flash, make sure you have really tuned the basic algorithms, with which you can get substantial performance improvements. Don't just throw the algorithms away; see what you can do with these basic algorithms. You can improve by almost a factor of two just by using blocked I/O. If you look at other papers that claim new joins, they don't do this blocking, and they don't use this buffer allocation strategy.

24 Joins are I/O Bound? (HDD) 8 KB Page Size, Blocked I/O
[Chart: join time in seconds, broken into CPU time and I/O time, for BNL, SM, GH, and HH with 8 KB pages and blocked I/O, at a 200 MB and a larger buffer pool size, on a Dell 500 GB 7200 RPM SATA HDD; each bar is annotated with a ratio between 0.34 and 0.80.] Drive specifications: Avg. Seek Time 8.6 ms; Avg. Latency 4.2 ms; Read Data Transfer Rate 45 MB/sec; Write Data Transfer Rate 44 MB/sec.

25 CPU times of block nested loops join
            Magnetic HDD           Flash SSD
Page size   User        Kernel     User        Kernel
2 KB        187.5 sec   108.4 sec  204.4 sec   324.0 sec
4 KB        192.8 sec   49.0 sec   205.5 sec   157.9 sec
8 KB        190.5 sec   27.9 sec   183.6 sec   80.6 sec
16 KB       186.6 sec   15.5 sec   188.8 sec   39.8 sec
32 KB       187.2 sec   8.6 sec    185.9 sec   22.4 sec

26 Buffer Allocations

27 Block Nested Loops Join
▶ BNL shows the biggest performance improvement (improvement factor by buffer pool size)
Algorithm   100 MB   200 MB   300 MB   400 MB   500 MB   600 MB
BNL         1.64X    1.59X    1.72X    1.73X    1.67X    1.65X
SM          1.41X    1.45X    1.44X    1.43X    1.48X
GH          1.34X    1.29X    1.33X    1.39X    1.30X
HH          1.55X    1.35X    1.51X    1.50X

28 SM and GH ▶ Random writes show poor performance with flash SSDs
[Chart: join time in seconds, broken into CPU time and I/O time, for SM and GH on the HDD and the SSD with 8 KB pages and blocked I/O, at buffer pool sizes from 200 MB upward.]

29 An Internal Structure of Flash SSDs
A flash chip is organized into flash blocks, each of which contains flash pages. Operation times of NAND flash (Samsung Electronics, 2005): page read 20 μs; page write 200 μs; block erase 1.5 ms. Example) Case 1) Write 256 KB in one I/O operation = one flash block erase + 64 flash page writes. Case 2) Write 256 KB in 64 I/O operations = (at worst) 64 flash block erases + 64 flash page writes.
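Plugging the slide's timings into the two cases gives a quick back-of-the-envelope comparison (assuming, as the 256 KB example implies, a 64-page flash block):

```python
PAGE_WRITE_US = 200      # μs per flash page write (Samsung, 2005)
BLOCK_ERASE_US = 1500    # μs per flash block erase

# Case 1: 256 KB in one I/O -> 1 block erase + 64 page writes
case1 = 1 * BLOCK_ERASE_US + 64 * PAGE_WRITE_US    # 14,300 μs

# Case 2: 256 KB in 64 I/Os -> worst case 64 erases + 64 page writes
case2 = 64 * BLOCK_ERASE_US + 64 * PAGE_WRITE_US   # 108,800 μs

print(case2 / case1)     # ~7.6X slower in the worst case
```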

30 An Internal Structure of Flash SSDs
The flash translation layer keeps a mapping table from logical block addresses (LBA) to physical block addresses (PBA), e.g. entries 1, 2, 3. Example) Sequential writes → LBAs 1, 2, 3 = one flash block erase + 3 flash page writes. Random writes → LBAs 3, 100, 99, 1, 2 = (at worst) two flash block erases + 3 flash page moves + 3 flash page writes.
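The exact counts depend on the block layout and the FTL policy; the following toy model (hypothetical Python, assuming 64-page blocks and a naive erase-before-write policy, not the paper's) shows the mechanism: scattered writes touch more flash blocks, so they trigger more erases and force live pages to be moved aside.

```python
BLOCK_PAGES = 64   # assumed pages per flash block (4 KB pages, 256 KB block)

def write_cost(write_lbas, live_lbas):
    """Toy erase-before-write model: every flash block touched by the
    incoming writes is erased once, and live pages that happen to sit
    in those blocks must first be moved (copied) out of the way."""
    touched = {lba // BLOCK_PAGES for lba in write_lbas}
    erases = len(touched)
    moves = sum(1 for lba in live_lbas
                if lba // BLOCK_PAGES in touched and lba not in write_lbas)
    return erases, moves, len(write_lbas)   # (erases, moves, page writes)

# Sequential writes stay inside one block: one erase, no moves.
print(write_cost([1, 2, 3], live_lbas=[]))          # (1, 0, 3)
# Scattered writes touch two blocks and displace live pages 4 and 5.
print(write_cost([3, 100, 99], live_lbas=[4, 5]))   # (2, 2, 3)
```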

31 Flash SSDs vs. Magnetic HDDs
              SSDs                                 HDDs
              Intel X25-E           Memoright GT   Seagate ST LC           Seagate Barracuda
Drive Type    Enterprise 2.5” SATA  Consumer       15K RPM 3.5” SCSI 160   7200 RPM SATA
Capacity      64 GB                 32 GB          300 GB                  750 GB
Price ($/GB)  $749 ($12)            $450 ($14)     $440 ($1.47)            $94 ($0.13)
▶ Lower power consumption: SSDs ▶ Higher shock resistance: SSDs. To see the advantages, we compare two flash SSDs and two magnetic hard disks that are widely used in research. As you can see in this table, current offerings of flash SSDs are more expensive than magnetic hard disk drives. But flash SSDs offer really fascinating advantages, such as low power consumption and high shock resistance. And perhaps the most crucial benefit of flash SSDs over magnetic HDDs is that they have no mechanically moving parts.

32 Flash SSDs vs. Magnetic HDDs
                     Intel X25-E   Memoright GT   Seagate ST LC   Seagate Barracuda
Avg. Seek Time       None          None           4.0 ms          8.9 ms
Avg. Latency         0.085 ms      0.1 ms         2 ms            4.17 ms
Read Transfer Rate   222 MB/sec    87 MB/sec      57 MB/sec       47 MB/sec
Write Transfer Rate  178 MB/sec    85 MB/sec      55 MB/sec       44 MB/sec
▶ Fast access time ▶ Fast random reads: SSDs. Therefore, with flash SSDs there are no seek times, and latency is very small compared to magnetic hard disk drives. Consequently, flash SSDs provide larger sequential read and write bandwidths and offer much faster access times. (Some of you might be wondering why…; I will explain this later.)

