Understanding Modern Flash Memory Systems Thomas McCormick Chief Engineer/Technologist, Swissbit ESC Minneapolis
Flash Memory Systems: Flash memory systems are everywhere
Motivation: Why do we care? Complicated sub-system; wear item; stores boot code, so a failure means a non-functional system
Floating Gate Transistor: Limited program/erase (P/E) cycles
Flash Memory: NOR & NAND NOR: Byte addressable; reduced scalability; costly; low density. NAND: Block addressable; scalable; cost effective; high density.
Flash Memory: SLC, MLC, & TLC
Flash Memory: Endurance
Flash Memory: Retention
Challenges (NAND): Erase-then-program; coarse-grained structures (block & page); errors; endurance (MLC); retention (MLC). Problem: How do we design for a long lifetime? Solution: A managed flash memory system with a well-designed host.
Interface: Sector (LBA) [figure credit: Wikipedia, Creative Commons]
Flash Translation Layer (FTL) FTL designs: block, hybrid, page (~2012)
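To make the page-mode design concrete, here is a minimal sketch of a page-level mapping table. It is a toy model, not the design from the talk: the class name `PageMappedFTL`, the append-only write pointer, and the `valid` bitmap are all assumptions made for illustration.

```python
class PageMappedFTL:
    """Toy page-level FTL: maps each logical page to one physical page."""

    def __init__(self, num_physical_pages):
        self.l2p = {}                              # logical page -> physical page
        self.valid = [False] * num_physical_pages  # which physical pages hold live data
        self.next_free = 0                         # append-only write pointer (assumption)

    def write(self, lpn):
        """Write logical page `lpn` out of place and return the physical page used."""
        if self.next_free >= len(self.valid):
            raise RuntimeError("no free pages; garbage collection needed")
        old = self.l2p.get(lpn)
        if old is not None:
            self.valid[old] = False                # the previous copy becomes stale
        ppn = self.next_free
        self.next_free += 1
        self.valid[ppn] = True
        self.l2p[lpn] = ppn
        return ppn

    def read(self, lpn):
        """Return the physical page currently holding logical page `lpn`."""
        return self.l2p[lpn]
```

Rewriting the same logical page never overwrites flash in place; it consumes a fresh physical page and leaves a stale one behind, which is why garbage collection is needed later.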
Sequential Writes
Random Writes
Garbage Collection (GC): Page-mode FTL is more efficient (deferral)
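One way to see why deferral helps: the longer collection is put off, the more pages in a block have gone stale, and the fewer valid pages must be copied before the erase. The sketch below uses a greedy victim policy (fewest valid pages first), which is an assumption for illustration; the slides do not name a policy, and the function names are hypothetical.

```python
def pick_gc_victim(valid_pages_per_block):
    """Return the index of the block with the fewest valid pages (greedy policy)."""
    return min(range(len(valid_pages_per_block)),
               key=valid_pages_per_block.__getitem__)


def gc_cost(valid_pages_per_block, pages_per_block):
    """Return (victim block, pages copied, pages reclaimed) for one GC pass."""
    victim = pick_gc_victim(valid_pages_per_block)
    copied = valid_pages_per_block[victim]      # extra flash writes caused by GC
    reclaimed = pages_per_block - copied        # free pages gained after the erase
    return victim, copied, reclaimed
```

The `copied` count is flash traffic the host never requested, so it feeds directly into write amplification.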
Dynamic Wear-Leveling
Static Wear-Leveling: Flash for code & data will wear equally (retention)
Flash Systems: Failure Modes Erase failures (hard, managed); program failures (hard, managed); retention failures (soft, additional work); retention -> UECC (uncorrectable ECC error): AVOID!
Retention Management: Active (1) Active: engages when data is transferred. Error Detection & Correction (EDC); Read-Retry
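The two wear-leveling styles can be sketched as two small decisions. This is a simplified illustration under stated assumptions: the helper names, the per-block `erase_counts` list, and the spread `threshold` are hypothetical, and real controllers use more elaborate policies.

```python
def pick_free_block(free_blocks, erase_counts):
    """Dynamic wear-leveling: direct the next write to the least-erased free block."""
    return min(free_blocks, key=lambda b: erase_counts[b])


def needs_static_leveling(erase_counts, threshold):
    """Static wear-leveling trigger: the gap between the most- and least-worn
    blocks exceeds `threshold`, so rarely rewritten (cold) data should be
    relocated to free up the lightly worn blocks it is sitting on."""
    return max(erase_counts) - min(erase_counts) > threshold
```

Dynamic leveling only balances blocks that are being rewritten; the static check is what brings blocks holding code or other cold data back into rotation.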
Retention Management: Active (2) ECC (Hamming, R-S, BCH, LDPC)
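As a small illustration of the simplest code in that list, here is a Hamming(7,4) encoder and single-error-correcting decoder. It is a textbook sketch only; NAND controllers use far stronger codes (BCH, LDPC) over much larger codewords, and the function names here are chosen for this example.

```python
def hamming74_encode(d):
    """Encode 4 data bits as a 7-bit codeword laid out p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(c):
    """Correct up to one flipped bit; return (data bits, error position or 0)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s4      # syndrome = 1-based position of the bad bit
    if pos:
        c[pos - 1] ^= 1             # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], pos
```

A single bit error anywhere in the codeword is corrected; two errors exceed the code's capability, which is the small-scale analogue of a UECC event.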
Retention Management: Active (3) Read-Retry
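Read-retry can be pictured as a loop that re-reads a page with shifted read-reference voltages until ECC succeeds. The sketch below is hypothetical throughout: `read_page` stands in for a device-specific read that returns `(ok, data)`, and the offset sequence is invented, since real retry tables are vendor-defined.

```python
def read_with_retry(read_page, offsets=(0, -1, 1, -2, 2)):
    """Try each read-threshold offset until ECC reports a correctable read.

    `read_page(offset)` is a hypothetical callable returning (ok, data).
    Returns (data, index of the successful attempt).
    """
    for attempt, offset in enumerate(offsets):
        ok, data = read_page(offset)
        if ok:
            return data, attempt
    raise IOError("uncorrectable (UECC) after all read-retry offsets")
```

Each extra attempt costs read latency, so a rising retry count is also a useful early sign that a block's data should be refreshed.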
Retention Management: Passive Passive: runs in the background. Refresh; Refresh & Mark Bad
Flash System Lifetime: WAF (write amplification factor) = f(workload nature); WAF ~ 1 (sequential); WAF >> 1 (random)
WAF (1)
WAF_Random = f(Over-Provisioning_True, FTL Overhead) [McCormick, "Validating Analytic Write Amplification Models", Flash Memory Summit 2016]
Over-Provisioning_True = (Physical Size - User Data Size) / (User Data Size)
Configuration      IDEMA Capacities
Extended (SLC)     32 GB, 64 GB, 128 GB, …
Standard (MLC)     30 GB, 60 GB, 120 GB, …
Enterprise         25 GB, 50 GB, 100 GB, …
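The over-provisioning formula can be evaluated for the three configurations in the table. Two assumptions are made here that the slides do not state: the raw flash is taken to be 32 GiB (2^30-byte units), and each IDEMA capacity is approximated as decimal gigabytes (10^9 bytes) rather than computed from the exact IDEMA sector count.

```python
GIB = 2**30   # binary gigabyte, assumed unit for raw NAND
GB = 10**9    # decimal gigabyte, approximation of the IDEMA user capacity


def true_over_provisioning(physical_bytes, user_bytes):
    """(Physical Size - User Data Size) / (User Data Size), as a fraction."""
    return (physical_bytes - user_bytes) / user_bytes


physical = 32 * GIB  # assumed raw capacity for the smallest drive in each row
for label, user_gb in [("Extended (SLC)", 32),
                       ("Standard (MLC)", 30),
                       ("Enterprise", 25)]:
    op = true_over_provisioning(physical, user_gb * GB)
    print(f"{label}: {op:.1%}")
```

Under these assumptions the three rows come out at roughly 7%, 15%, and 37% spare area, which shows how much of the gap between configurations is simply reserved capacity.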
WAF (2) [McCormick, FMS2016]
WAF Reduction Options: Page-mode FTL (2x) [McCormick, FMS 2015]; increased over-provisioning (2x to 4x); sequential writes; large, contiguous files (up to 5x); file system: FAT, ext (no journaling). Key point: A managed drive (MLC) can approach an older drive (SLC). Ex: 2x * 2x * 5x = 20x WAF reduction, against a 33.3x gap in rated P/E cycles (100K SLC / 3K MLC).
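The key-point arithmetic can be spelled out step by step. The factors are the ones quoted on the slide (using the lower 2x figure for over-provisioning); the variable names are illustrative only.

```python
page_ftl = 2            # page-mode FTL
over_provisioning = 2   # lower end of the 2x to 4x range
contiguous_files = 5    # large, contiguous files (upper bound)

waf_reduction = page_ftl * over_provisioning * contiguous_files

slc_pe_cycles = 100_000
mlc_pe_cycles = 3_000
endurance_gap = slc_pe_cycles / mlc_pe_cycles

# Fraction of the SLC-to-MLC endurance gap recovered by lowering WAF.
recovered = waf_reduction / endurance_gap
print(waf_reduction, round(endurance_gap, 1), round(recovered, 2))
```

With these figures a 20x WAF reduction recovers about 60% of the 33.3x endurance gap, which is the sense in which a well-managed MLC drive "approaches" an older SLC drive.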
Monitoring Endurance: Host writes and flash writes -> WAF; P/E cycles & rating -> lifetime (%); Standard (SMART interface); Proprietary (vendor)
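Both monitored quantities reduce to simple ratios once the counters are available. The sketch below assumes the counters have already been read from the drive; how they are exposed (SMART attribute numbers, vendor commands) differs between devices and is not modeled, and the function names are hypothetical.

```python
def waf(host_bytes_written, flash_bytes_written):
    """Write amplification factor: flash writes divided by host writes."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return flash_bytes_written / host_bytes_written


def lifetime_used(avg_pe_cycles, rated_pe_cycles):
    """Fraction of rated endurance consumed, from average P/E cycles per block."""
    return avg_pe_cycles / rated_pe_cycles
```

For example, 350 GB of flash writes against 100 GB of host writes gives a WAF of 3.5, and 600 average P/E cycles on a 3K-rated MLC part means 20% of the rated endurance is used.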
Lifetime Estimation Sample & Extrapolate
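A minimal version of sample-and-extrapolate is a straight-line projection between two readings of the lifetime-used percentage. The linear model and the function name are assumptions for illustration; it only holds if the sampled period is representative of the long-term workload.

```python
def estimate_remaining_days(samples):
    """Linearly extrapolate remaining life.

    `samples` is a list of (day, lifetime_used_percent) pairs ordered by day.
    Returns the estimated number of days until 100% is reached.
    """
    (d0, u0), (d1, u1) = samples[0], samples[-1]
    rate = (u1 - u0) / (d1 - d0)        # percent of life consumed per day
    if rate <= 0:
        return float("inf")             # no measurable wear in the sample window
    return (100.0 - u1) / rate
```

A drive that moves from 2.0% to 3.5% used over 30 days is consuming 0.05% per day, leaving roughly 1930 days at that rate.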
Conclusions: Managed flash memory system (NAND); code & data; Retention = f(Endurance_Utilized); Endurance_Utilized = f(Workload, WAF); SLC -> MLC: 33x reduction; minimize WAF: page FTL, OP, sequential; monitor
Speaker/Author Details Tom McCormick – Chief Engineer/Technologist Swissbit tom.mccormick@swissbit.com
Thank You! Questions? @ESC_Conf ESC Minneapolis