
A Semi-Preemptive Garbage Collector for Solid State Drives


1 A Semi-Preemptive Garbage Collector for Solid State Drives
Junghee Lee, Youngjae Kim, Galen M. Shipman, Sarp Oral, Feiyi Wang, and Jongman Kim Presented by Junghee Lee

2 High Performance Storage Systems
Server-centric services: file, web & media servers, transaction processing servers
Enterprise-scale storage systems: information technology focusing on the storage, protection, and retrieval of data in large-scale environments (e.g., Google's massive server farms)
As more server-centric applications emerge, such as cloud computing, media servers, and transaction processing servers, the demand for high performance computing systems keeps increasing. A high performance storage system is critical for such environments, because storage is still the lagging device in a system. Data centers therefore employ separate, centralized storage systems that take charge of the storage, protection, and retrieval of data. Such a system consists of multiple storage units, and each storage unit contains a number of hard disk drives.

3 Spider: A Large-scale Storage System
Jaguar: peta-scale computing machine
- 25,000 nodes with 250,000 cores and over 300 TB of memory
Spider storage system
- The largest center-wide Lustre-based file system
- Over 10.7 PB of RAID 6 formatted capacity
- 13,400 x 1 TB HDDs
- 192 Lustre I/O servers
- Over 3 TB of memory (on the Lustre I/O servers)
As a specific example, Oak Ridge National Laboratory deployed Jaguar, a peta-scale supercomputing machine, with Spider as its storage system. Spider alone has more than ten thousand HDDs and more than 3 TB of memory. The storage system is no longer just a part of a supercomputer: it plays a key role in supporting high performance computing and is a large system in its own right.

4 Emergence of NAND Flash based SSD
NAND flash vs. hard disk drives
Pros:
- Semiconductor technology, no mechanical parts
- Lower access latencies: μs for SSDs vs. ms for HDDs
- Lower power consumption
- Higher robustness to vibration and temperature
Cons:
- Limited lifetime: 10K - 1M erases per block
- High cost: about 8x more expensive than current hard disks
- Performance variability

5 Outline
- Introduction
- Background and Motivation
  - NAND Flash and SSD
  - Garbage Collection
  - Pathological Behavior of SSDs
- Semi-Preemptive Garbage Collection
- Evaluation
- Conclusion

6 NAND Flash based SSD
The write path from application to flash:
- Application: a process calls fwrite (file, data)
- File system (FAT, Ext2, NTFS, ...): issues a block write (LBA, size)
- OS block device driver: forwards the request over a block interface (SATA, SCSI, etc.)
- SSD: the device CPU runs the FTL, which uses on-device memory and issues a page write (bank, block, page) to the flash packages
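As a toy illustration of this layering, the sketch below walks one write through each translation step; all function names and the offset-to-page arithmetic are made up for illustration:

```c
/* Toy sketch of the write path above: each layer translates the
   request into the next layer's unit. All names are illustrative. */
#include <stdio.h>

/* SSD firmware (FTL): logical block -> flash page */
void ftl_page_write(int bank, int block, int page) {
    printf("flash: program page (bank %d, block %d, page %d)\n",
           bank, block, page);
}

/* Block device driver: forwards block writes over SATA/SCSI */
void block_write(long lba, int size) {
    printf("driver: write LBA %ld, %d bytes\n", lba, size);
    ftl_page_write(0, (int)(lba / 64), (int)(lba % 64)); /* toy mapping */
}

/* File system: file offset -> logical block address */
void fs_fwrite(long file_offset, int size) {
    printf("fs: fwrite at offset %ld\n", file_offset);
    block_write(file_offset / 512, size);
}

int main(void) {
    fs_fwrite(4096, 512);
    return 0;
}
```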

7 NAND Flash Organization
A flash package contains dies (Die 0, Die 1); each die contains planes (Plane 0 - Plane 3); each plane holds blocks (Block 0 - Block 2047) and a data register; each block holds pages (Page 0 - Page 63). Operation latencies:
- Read (one page): 0.025 ms
- Write (one page): 0.200 ms
- Erase (one block): 1.500 ms
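For later cost arithmetic, the geometry and latencies above can be captured as constants. This is a minimal sketch in C; the macro and variable names are ours, not the paper's:

```c
/* NAND flash geometry and timing from the slide above.
   Macro names are illustrative, not from the paper's simulator. */
#include <stdio.h>

#define DIES_PER_PACKAGE 2
#define PLANES_PER_DIE   4
#define BLOCKS_PER_PLANE 2048
#define PAGES_PER_BLOCK  64

#define T_READ_MS  0.025  /* read one page into the plane register */
#define T_WRITE_MS 0.200  /* program one page from the register */
#define T_ERASE_MS 1.500  /* erase one block */

int main(void) {
    long blocks = (long)DIES_PER_PACKAGE * PLANES_PER_DIE * BLOCKS_PER_PLANE;
    long pages  = blocks * PAGES_PER_BLOCK;
    printf("blocks per package: %ld, pages per package: %ld\n", blocks, pages);
    printf("an erase is %.0fx slower than a page read\n", T_ERASE_MS / T_READ_MS);
    return 0;
}
```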

8 Out-Of-Place Write
NAND flash pages cannot be overwritten in place, so a write to a logical page number (LPN) goes to a fresh physical page number (PPN), and the logical-to-physical address mapping table is updated. Each physical page is in one of three states: valid (V), invalid (I), or erased (E). For example, a write to LPN2, currently mapped to PPN2:
1. Invalidate PPN2
2. Write the new data to an erased page, PPN3
3. Update the mapping table entry: LPN2 → PPN3
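Below is a minimal sketch of out-of-place writes, assuming a flat page-level mapping table and a bump-pointer free-page allocator (both illustrative simplifications of a real FTL):

```c
/* Minimal sketch of out-of-place writes with an L2P mapping table.
   A flat page-level mapping is assumed for illustration; real FTLs
   are more elaborate. */
#include <stdio.h>

#define NUM_PAGES 8
enum page_state { ERASED, VALID, INVALID };

static enum page_state state[NUM_PAGES];  /* physical page states */
static int l2p[NUM_PAGES];                /* LPN -> PPN, -1 if unmapped */
static int next_free = 0;

/* Write LPN: invalidate the old physical page, program a new one,
   then update the mapping table. */
int write_page(int lpn) {
    if (next_free >= NUM_PAGES) return -1;   /* no free page: GC needed */
    if (l2p[lpn] >= 0) state[l2p[lpn]] = INVALID;
    int ppn = next_free++;
    state[ppn] = VALID;                      /* program the page */
    l2p[lpn] = ppn;                          /* update the table */
    return ppn;
}

int main(void) {
    for (int i = 0; i < NUM_PAGES; i++) l2p[i] = -1;
    write_page(2);                 /* LPN2 -> PPN0 */
    int ppn = write_page(2);       /* rewrite: PPN0 invalidated, LPN2 -> PPN1 */
    printf("LPN2 now at PPN%d, old page state=%d (INVALID)\n", ppn, state[0]);
    return 0;
}
```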

9 Garbage Collection
Garbage collection reclaims invalidated pages in three steps:
1. Select a victim block
2. Move the valid pages of the victim block to an erased block
3. Erase the victim block
For a victim block with two valid pages, the cost is 2 reads + 2 writes + 1 erase = 2 x 0.025 + 2 x 0.200 + 1.500 = 1.950 ms !!
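The sketch below reproduces that arithmetic with the latency constants from slide 7; the victim's valid-page count is the only input, and the per-page copy loop mirrors the move step:

```c
/* Sketch of the cost of one garbage-collection pass: copy each valid
   page out of the victim (read + write), then erase the victim. */
#include <stdio.h>

#define T_READ_MS  0.025
#define T_WRITE_MS 0.200
#define T_ERASE_MS 1.500

/* Reclaim a block holding `valid` still-valid pages; return how long
   the flash is busy with GC. */
double collect_block(int valid) {
    double t = 0.0;
    for (int i = 0; i < valid; i++)
        t += T_READ_MS + T_WRITE_MS;   /* move one valid page */
    t += T_ERASE_MS;                   /* erase the victim block */
    return t;
}

int main(void) {
    /* The slide's example: a victim with 2 valid pages. */
    printf("GC latency, 2 valid pages: %.3f ms\n", collect_block(2));
    /* A nearly full victim block is far worse. */
    printf("GC latency, 63 valid pages: %.3f ms\n", collect_block(63));
    return 0;
}
```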

10 Pathological Behavior of SSDs
Does GC have an impact on foreground operations? If so, we should observe:
- A sudden bandwidth drop
- More drop with more write requests
- More drop with more bursty workloads
Experimental setup
- SSD devices: Intel (SLC) 64 GB SSD, SuperTalent (MLC) 120 GB SSD
- I/O generator: the libaio asynchronous I/O library for block-level testing

11 Bandwidth Drop for Write-Dominant Workloads
Experiments: measured bandwidth for 1 MB requests while varying the read-write ratio, on the Intel SLC SSD and the SuperTalent MLC SSD.
[Figures: bandwidth over time for each device at each read-write ratio]
Performance variability increases as we increase the write percentage of the workload.

12 Performance Variability for Bursty Workloads
Experiments: measured SSD write bandwidth for queue depths (qd) of 8 and 64, with I/O bandwidth normalized with a Z distribution, on the Intel SLC SSD and the SuperTalent MLC SSD.
[Figures: normalized write bandwidth for each device at each queue depth]
Performance variability increases as we increase the arrival rate of requests (bursty workloads).

13 Lessons Learned
From the empirical study, we learned:
- Performance variability increases as the percentage of writes in the workload increases.
- Performance variability increases with the arrival rate of write requests.
This is because GC is not preemptive: any request that arrives during GC must wait until the on-going GC ends.

14 Outline
- Introduction
- Background and Motivation
- Semi-Preemptive Garbage Collection
  - Semi-Preemption
  - Further Optimization
  - Level of Allowed Preemption
- Evaluation
- Conclusion

15 Technique #1: Semi-Preemption
GC on a victim block is a sequence of sub-operations: read a valid page (Rx = read page x + data transfer), write it elsewhere (Wx = write page x + metadata update), repeat for each valid page, then erase the block (E). With non-preemptive GC, an incoming request Wz that arrives during the sequence Rx Wx Ry Wy E is served only after the erase completes. With semi-preemptive GC, Wz is inserted at the boundary between two sub-operations, so it is served promptly without aborting the on-going GC.
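A minimal sketch of semi-preemption, using a pending-request counter as a stand-in for the device queue; the key point is that the preemption points sit between GC sub-operations, never inside one:

```c
/* Sketch of semi-preemptive GC: between sub-operations (page read,
   page write, erase) the collector drains pending foreground requests
   instead of making them wait for the whole GC. The queue and service
   routines are illustrative stand-ins. */
#include <stdio.h>

static int pending = 0;              /* pending foreground requests */

static void service_pending(void) {  /* a preemption point */
    while (pending > 0) {
        printf("  serve foreground request\n");
        pending--;
    }
}

void semi_preemptive_gc(int valid_pages) {
    for (int p = 0; p < valid_pages; p++) {
        printf("GC: read valid page %d\n", p);
        service_pending();           /* may preempt after the read */
        printf("GC: write valid page %d\n", p);
        service_pending();           /* may preempt after the write */
    }
    printf("GC: erase victim block\n"); /* the erase itself is atomic */
}

int main(void) {
    pending = 1;                     /* a Wz arrives during GC */
    semi_preemptive_gc(2);
    return 0;
}
```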

16 Technique #2: Merge
When an incoming request targets the same page that GC is moving (e.g., a read Ry arrives while GC is executing Rx Wx Ry Wy E), the request is merged with the corresponding GC sub-operation instead of being executed as a separate flash operation. (Legend as before: Rx = read page x + data transfer, Wx = write page x + metadata update, E = erase a block.)
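The sketch below shows the merge case with illustrative structures (the request type and helper are not from the paper): a read of the page being moved is answered from the GC buffer, and a write of it supplies fresh data for the GC's copy-out:

```c
/* Sketch of the merge optimization: when an incoming request targets
   the page GC is currently moving, piggyback it on the GC
   sub-operation instead of issuing a separate flash operation. */
#include <stdio.h>
#include <stdbool.h>

struct request { bool is_write; int page; };

/* Move one valid page during GC, merging a matching incoming request.
   Returns true if the request was absorbed by the GC move. */
bool move_page_with_merge(int gc_page, const struct request *incoming) {
    printf("GC: read page %d\n", gc_page);
    if (incoming && incoming->page == gc_page) {
        if (incoming->is_write) {
            /* Write the incoming data instead of the stale copy. */
            printf("GC: write page %d with merged new data\n", gc_page);
        } else {
            /* Serve the read from the data GC just read. */
            printf("GC: write page %d; read served from GC buffer\n", gc_page);
        }
        return true;
    }
    printf("GC: write page %d\n", gc_page);
    return false;
}

int main(void) {
    struct request ry = { .is_write = false, .page = 1 };
    move_page_with_merge(0, &ry);   /* no match: normal move */
    move_page_with_merge(1, &ry);   /* match: Ry merged with GC's move */
    return 0;
}
```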

17 Technique #3: Pipeline
An incoming request (e.g., Rz) can also be pipelined with an on-going GC sub-operation: the data transfer of one operation overlaps with the flash-array time of another, hiding part of the request's latency inside the GC timeline. (Legend as before.)
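A back-of-the-envelope sketch of why overlapping helps, modeling each operation as a channel transfer followed by a flash-array phase; the timing values are illustrative assumptions, not measurements:

```c
/* Two-stage pipeline sketch: each operation occupies the channel
   (data transfer) and then the flash array; pipelining lets the next
   transfer proceed while the array is still busy. */
#include <stdio.h>

#define N_OPS 4

int main(void) {
    double bus_ms[N_OPS]   = {0.05, 0.05, 0.05, 0.05};    /* transfers */
    double array_ms[N_OPS] = {0.025, 0.200, 0.025, 0.200}; /* reads/writes */

    double serial = 0.0, bus_free = 0.0, array_free = 0.0;
    for (int i = 0; i < N_OPS; i++) {
        serial += bus_ms[i] + array_ms[i];       /* no overlap at all */
        double bus_end = bus_free + bus_ms[i];   /* transfer op i */
        bus_free = bus_end;
        /* Array starts when both the data and the array are ready. */
        double start = bus_end > array_free ? bus_end : array_free;
        array_free = start + array_ms[i];
    }
    printf("serial: %.3f ms, pipelined: %.3f ms\n", serial, array_free);
    return 0;
}
```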

18 Level of Allowed Preemption
Drawback of PGC: the completion time of GC is delayed → may incur a lack of free blocks → sometimes preemption must be prohibited.
States of PGC (O = active/allowed, X = not):

  State    Garbage collection  Read requests  Write requests
  State 0  X                   O              O
  State 1  O                   O              O
  State 2  O                   O              X
  State 3  O                   X              X
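The state table maps naturally onto a small state machine keyed by the number of remaining free blocks. This is a sketch with made-up thresholds; the paper's actual policy parameters may differ:

```c
/* Sketch of the PGC state machine from the table above: the level of
   allowed preemption shrinks as free blocks run low. The thresholds
   are illustrative assumptions, not values from the paper. */
#include <stdio.h>

enum pgc_state { STATE0, STATE1, STATE2, STATE3 };

/* Pick a state from GC activity and the free-block count. */
enum pgc_state pick_state(int free_blocks, int gc_running) {
    if (!gc_running)      return STATE0; /* no GC: serve everything   */
    if (free_blocks > 64) return STATE1; /* GC on: reads and writes   */
    if (free_blocks > 16) return STATE2; /* GC on: reads only         */
    return STATE3;                       /* GC on: nothing preempts   */
}

int allow_read(enum pgc_state s)  { return s != STATE3; }
int allow_write(enum pgc_state s) { return s == STATE0 || s == STATE1; }

int main(void) {
    enum pgc_state s = pick_state(8, 1);
    printf("free=8, GC on: reads %s, writes %s\n",
           allow_read(s)  ? "allowed" : "blocked",
           allow_write(s) ? "allowed" : "blocked");
    return 0;
}
```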

19 Outline
- Introduction
- Background and Motivation
- Semi-Preemptive Garbage Collection
- Evaluation
  - Setup
  - Synthetic Workloads
  - Realistic Workloads
- Conclusion

20 Setup
Simulator: MSR's SSD simulator, based on DiskSim
Workloads:
- Synthetic workloads: generated with the synthetic workload generator in DiskSim
- Realistic workloads:

  Type            Workload   Average request size (KB)  Read ratio (%)  Arrival rate (IOP/s)
  Write dominant  Financial  7.09                       18.92           47.19
                  Cello      7.06                       19.63           74.24
  Read dominant   TPC-H      31.62                      91.80           172.73
                  OpenMail   9.49                       63.30           846.62

21 Performance Improvements for Synthetic Workloads
Varied four parameters: request size, inter-arrival time, sequentiality, and read/write ratio, changing one at a time while fixing the others.
[Figure: average response time (ms) and standard deviation for NPGC vs. PGC as request size varies over 8, 16, 32, and 64 KB]

22 Performance Improvement for Synthetic Workloads (cont'd)
[Figures: NPGC vs. PGC as inter-arrival time drops from 10 ms to 1 ms (bursty), as the probability of sequential access drops from 0.8 to 0.2 (random dominant), and as the probability of read access drops from 0.8 to 0.2 (write dominant)]

23 Performance Improvement for Realistic Workloads
Average response time: improved by 6.5% and 66.6% for Financial and Cello.
Variance of response times: improved by 49.8% and 83.3% for Financial and Cello.
[Figures: normalized average response time and normalized standard deviation for Financial, Cello, TPC-H, and OpenMail]

24 Conclusions
Solid state drives:
- Fast access speed
- Performance variation caused by garbage collection
Semi-preemptive garbage collection:
- Services incoming requests during GC
- Reduces average response time and performance variation by up to 66.6% and 83.3%, respectively

25 Questions? Contact info: Junghee Lee, Electrical and Computer Engineering, Georgia Institute of Technology

26 Thank you!

