Neighbor-Cell Assisted Error Correction for MLC NAND Flash Memories

Presentation transcript:

Neighbor-Cell Assisted Error Correction for MLC NAND Flash Memories SIGMETRICS’14

Summary
- Problem: as the number of P/E cycles increases, the raw BER rises well beyond the fixed correction capability of the ECC.
- Goal: extend flash lifetime by reducing the number of bit errors in a page to a level the ECC can correct.
- How: when reading a page, use multiple sets of read reference voltages instead of the conventional single set.
- Rationale: the authors observe that the threshold voltage distribution of a cell depends on the value stored in its neighboring cell. Reading with reference voltages derived from these neighbor-conditioned distributions yields fewer errors.
- Solution: when a page read fails ECC, the Neighbor-cell Assisted Correction (NAC) mechanism rereads the page several times with different reference voltages, reducing the number of errors in the page to a level the ECC can correct.

Background: Program Interference
- Prior work has characterized and modeled this error type.
- The threshold voltage of a cell (the victim cell) can change when its neighbor cells (the aggressor cells) are programmed.
- Program interference from neighbor cells on the same wordline (WL) is negligible.
- The dominant interference on a victim cell C(n, j) comes from the aggressor cell C(n+1, j) on the WL immediately above the victim WL.

Optimizing Read Reference Voltage
- The threshold voltage distributions of neighboring states (Pi and Pi+1) overlap.
- For a given reference voltage (Vref), the blue region is the error due to "Pi misread as Pi+1" and the red region is the error due to "Pi+1 misread as Pi".
- f(x) and g(x) are the probability density functions (PDFs) of cells programmed into states Pi and Pi+1, respectively.
- The read reference voltage that minimizes the resulting BER lies at the crossing point of the two neighboring distributions.
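As an illustration of the crossing-point claim, the following Python sketch sweeps Vref between two Gaussian PDFs and checks that the BER-minimizing voltage is where f(x) and g(x) meet. The Gaussian shape, the equal state probabilities, the function names, and all numeric values are assumptions made for this example, not measurements from the presentation.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """PDF of a Gaussian threshold voltage distribution."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def q_function(x):
    """Tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def misread_probability(vref, mu1, sigma1, mu2, sigma2):
    """BER for one pair of neighboring states, assuming equal state probabilities."""
    pi_read_as_next = q_function((vref - mu1) / sigma1)   # Pi cell above Vref
    next_read_as_pi = q_function((mu2 - vref) / sigma2)   # Pi+1 cell below Vref
    return 0.5 * (pi_read_as_next + next_read_as_pi)


def best_vref_by_sweep(mu1, sigma1, mu2, sigma2, steps=2000):
    """Brute-force search for the Vref between the two means with the lowest BER."""
    candidates = [mu1 + (mu2 - mu1) * k / steps for k in range(steps + 1)]
    return min(candidates, key=lambda v: misread_probability(v, mu1, sigma1, mu2, sigma2))


# Hypothetical distributions (volts): Pi ~ N(1.0, 0.2), Pi+1 ~ N(2.0, 0.3).
vref = best_vref_by_sweep(1.0, 0.2, 2.0, 0.3)
gap = abs(gaussian_pdf(vref, 1.0, 0.2) - gaussian_pdf(vref, 2.0, 0.3))
print(f"best Vref = {vref:.4f} V, |f - g| at that point = {gap:.5f}")
```

The two PDFs are nearly equal at the swept optimum, which is the crossing point. With equal spreads the same sweep lands on the midpoint of the two means.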

Modeling Raw BER
- Raw BER is derived by algebraic manipulation under a set of assumptions:
  - The threshold voltage distributions f(x) and g(x) are Gaussian.
  - Both Gaussian distributions have equal variance (σ1 = σ2 = σ).
  - Random data are programmed (P0 = P1).
  - The optimum read reference voltage from the previous slide is used: v = (μ1+μ2)/2.
- Under these assumptions, raw BER is given by Q(x) with x = (μ2-μ1)/(2σ).
- As x increases, Q(x), and hence raw BER, decreases monotonically.
- A higher value of (μ2-μ1)/(2σ) is therefore desirable for minimizing raw BER. It can come from:
  - a larger threshold voltage distance (μ2-μ1) between neighboring distributions, or
  - a smaller spread σ, i.e. narrower threshold voltage distributions.
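The model above can be evaluated numerically with a short Python sketch. It assumes the slide's conditions (Gaussian distributions, equal σ, equal state probabilities, Vref at the midpoint), under which the raw BER of one pair of neighboring states reduces to Q((μ2-μ1)/(2σ)). The function names and the voltages are illustrative only.

```python
import math


def q_function(x):
    """Tail probability of the standard normal distribution, Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def raw_ber(mu1, mu2, sigma):
    """Raw BER for two equal-spread Gaussian states read at the midpoint."""
    return q_function((mu2 - mu1) / (2.0 * sigma))


# Hypothetical values in volts.
baseline = raw_ber(1.0, 2.0, 0.25)   # x = 2.0
wider_gap = raw_ber(1.0, 2.2, 0.25)  # larger distance between the means
narrower = raw_ber(1.0, 2.0, 0.20)   # smaller sigma

print(f"baseline  : {baseline:.5f}")
print(f"wider gap : {wider_gap:.5f}")
print(f"narrower  : {narrower:.5f}")
```

Both a larger distance and a smaller σ increase x and therefore lower the computed BER.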

Observations on Voltage Distribution
[Figure: threshold voltage distributions of the victim WL before and after the aggressor WL is programmed]
- Goal: read the victim page (WL) with minimum raw bit error.
- Before the aggressor page (WL) is programmed, the two neighboring distributions of the victim page are easy to distinguish.
- After the aggressor page is programmed, program interference causes the distributions to overlap, increasing raw BER.
- The threshold voltage distribution of all cells (the overall distribution) can be further divided into four conditional distributions, one for each value stored in the aggressor cell.

Overall vs Conditional Distribution
[Figure labels: distance of conditional distribution pairs; variance of overall distribution; variance of conditional distribution]
- The overall distribution is the sum of all four conditional distributions.
- Two properties matter for minimizing raw BER:
  - the threshold voltage distance between neighboring distributions
  - the variance of the threshold voltage distribution
- Distance: overall distribution ≈ conditional distribution
- Variance: overall distribution > conditional distribution
- Reading a page with the conditional distributions instead of the overall distribution can therefore lower raw BER.
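One way to see why the overall variance exceeds the conditional variance is the law of total variance: the variance of a mixture is the average conditional variance plus the variance of the conditional means. The Python sketch below applies it to four hypothetical conditional distributions of a single state. The weights, means, and variances are made-up numbers, and the helper name is an assumption for this example.

```python
def mixture_mean_and_variance(components):
    """Mean and variance of a mixture given (weight, mean, variance) triples."""
    total_weight = sum(w for w, _, _ in components)
    mean = sum(w * m for w, m, _ in components) / total_weight
    # Law of total variance: within-component variance plus spread of the means.
    variance = sum(w * (v + (m - mean) ** 2) for w, m, v in components) / total_weight
    return mean, variance


# Hypothetical conditional distributions of one state, one per aggressor value.
# Each has variance 0.01 V^2; interference shifts the mean by a different amount.
conditional = [
    (0.25, 2.00, 0.01),  # neighbor = 11
    (0.25, 2.05, 0.01),  # neighbor = 10
    (0.25, 2.10, 0.01),  # neighbor = 01
    (0.25, 2.15, 0.01),  # neighbor = 00
]

overall_mean, overall_variance = mixture_mean_and_variance(conditional)
print(f"overall mean     = {overall_mean:.4f} V")
print(f"overall variance = {overall_variance:.6f} V^2 (each conditional: 0.010000)")
```

If the neighboring state is shifted by the same amounts, the distance between the two overall means equals the distance between each conditional pair, which matches the "distance ≈, variance >" observation.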

Multiple Sets of Reference Voltages
- From the previous model, the optimal read reference voltage is (μ1+μ2)/2.
- REFx is the single read reference voltage for the overall distribution.
- REFx11 is the read reference voltage for the conditional distribution whose neighbor (aggressor) cell is programmed with the value "11".
- For 2-bit MLC flash, each REFx gains four additional read reference voltages, one per conditional distribution.
- Because the conditional distributions have smaller variance (they are narrower), using the multiple sets of reference voltages (REFx11, REFx00, REFx10, REFx01) instead of the single set (REFx) can lower raw BER.
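A minimal Python sketch of how the conditional references relate to REFx, assuming each reference is the midpoint of the corresponding pair of means and that all four neighbor values are equally likely. Every voltage below is hypothetical, as are the names `CONDITIONAL_MEANS`, `midpoint_reference`, and `read_state`.

```python
def midpoint_reference(mu_low, mu_high):
    """Optimal reference under the equal-spread Gaussian model: (mu1 + mu2) / 2."""
    return (mu_low + mu_high) / 2.0


# Hypothetical means (volts) of states Pi and Pi+1 for each aggressor value.
CONDITIONAL_MEANS = {
    "11": (1.00, 2.00),
    "10": (1.04, 2.04),
    "01": (1.08, 2.08),
    "00": (1.12, 2.12),
}

# One conditional reference per aggressor value (REFx11, REFx10, REFx01, REFx00).
conditional_refs = {
    value: midpoint_reference(low, high)
    for value, (low, high) in CONDITIONAL_MEANS.items()
}

# Overall means, assuming the four aggressor values are equally likely.
overall_low = sum(low for low, _ in CONDITIONAL_MEANS.values()) / len(CONDITIONAL_MEANS)
overall_high = sum(high for _, high in CONDITIONAL_MEANS.values()) / len(CONDITIONAL_MEANS)
overall_ref = midpoint_reference(overall_low, overall_high)  # plays the role of REFx


def read_state(vth, ref):
    """Classify a cell as Pi or Pi+1 by comparing its threshold voltage to ref."""
    return "Pi+1" if vth >= ref else "Pi"


# A Pi+1 cell at 1.53 V whose neighbor holds 11.
print(read_state(1.53, overall_ref))             # read with REFx
print(read_state(1.53, conditional_refs["11"]))  # read with REFx11
```

In this toy setup the cell falls below REFx but above REFx11, so only the conditional reference classifies it as Pi+1.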

Measurement Analysis
[Figure: conditional distributions]
- SNR (signal-to-noise ratio) is x = (μ2-μ1)/(2σ); a larger x gives a smaller Q(x).
- Because of their smaller variance (narrower distribution), the conditional distributions are likely to produce fewer errors when the threshold voltage is read.

Neighbor-cell Assisted Correction
(1) Read the target page using the overall read reference (REFx).
(2) Check ECC; if it fails, NAC is invoked.
(3) First, read the neighbor (aggressor) pages (MSB/LSB) using REFx.
(4) Then read the target page using one of the conditional read references (REFx00, REFx11, REFx10, REFx01).
(5) With the page partially corrected, run ECC again.
(6) If ECC fails, repeat with another conditional read reference.
(7) If ECC still fails after all conditional references have been tried, return an error.
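The NAC read flow can be sketched in Python as follows. This is a toy model under stated assumptions, not the controller logic from the paper: `nac_read` and its callback parameters are hypothetical names, the page is reduced to one bit per cell, and `toy_ecc` is a stand-in that "corrects" a page whenever it has at most one wrong bit.

```python
def nac_read(read_page, read_neighbor_values, ecc_decode, overall_ref, conditional_refs):
    """Return corrected data, or None if ECC fails for every reference."""
    page = read_page(overall_ref)                    # step 1
    ok, data = ecc_decode(page)                      # step 2
    if ok:
        return data
    neighbors = read_neighbor_values()               # step 3
    for value, ref in conditional_refs.items():      # steps 4 to 6
        local = read_page(ref)
        # Keep the reread bit only for cells whose neighbor holds `value`.
        page = [new if n == value else old
                for old, new, n in zip(page, local, neighbors)]
        ok, data = ecc_decode(page)                  # step 5
        if ok:
            return data
    return None                                      # step 7


# Toy page: hypothetical threshold voltages, neighbor values, and programmed bits.
VTH = [1.53, 1.70, 1.40, 1.57]
NEIGHBORS = ["11", "00", "10", "01"]
TRUTH = [1, 1, 0, 0]                                 # 1 means state Pi+1
REFS = {"11": 1.50, "10": 1.54, "01": 1.58, "00": 1.62}
page_reads = []                                      # log of references used


def toy_read_page(ref):
    page_reads.append(ref)
    return [1 if v >= ref else 0 for v in VTH]


def toy_read_neighbors():
    return list(NEIGHBORS)


def toy_ecc(bits, t=1):
    """Stand-in ECC: succeeds when the page has at most t bit errors."""
    errors = sum(b != x for b, x in zip(bits, TRUTH))
    return (True, list(TRUTH)) if errors <= t else (False, None)


result = nac_read(toy_read_page, toy_read_neighbors, toy_ecc, 1.56, REFS)
print(result, page_reads)
```

In this toy page the first read has two errors and fails ECC. Rereading with the "11" reference fixes one of them, which is enough for the stand-in ECC to succeed without trying the remaining references.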

NAC Implementation
- Page-to-be-Corrected Buffer: stores the final read data.
- Neighbor LSB/MSB Page Buffers: store the aggressor page data.
- Bit1/Bit2: select which conditional reference voltage is used.
- Local-Optimum-Read Buffer: temporarily stores the page read with one of the four conditional references.

Prioritized NAC
- As the P/E cycle count increases, NAC degrades latency because of the additional reads (up to 6 on top of the first read):
  - the first read with the overall read reference (1)
  - the reads of the neighbor MSB/LSB pages (2)
  - the reads with the conditional read references (4)
- Observation: specific errors are dominant.
  - Pi+1(11)->Pi: a cell in state Pi+1 whose neighbor cell is 11 is misread as Pi.
- Therefore, try REFx11 first among the four conditional references.
- If ECC still fails, use REFx10 -> REFx01 -> REFx00, in that order.
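The read counts on this slide can be tallied with a small Python sketch. It assumes both neighbor pages are read from flash (no buffer hit) and that the conditional references are tried in the prioritized order. `PRIORITY_ORDER` and `total_reads` are names invented for this example.

```python
# Prioritized order of the conditional references, as listed on the slide.
PRIORITY_ORDER = ("REFx11", "REFx10", "REFx01", "REFx00")
NEIGHBOR_PAGE_READS = 2  # neighbor MSB page and neighbor LSB page


def total_reads(successful_reference):
    """Flash reads for one request.

    successful_reference is None when the first read passes ECC, otherwise
    the conditional reference after which ECC finally passes.
    """
    if successful_reference is None:
        return 1
    conditional_reads = PRIORITY_ORDER.index(successful_reference) + 1
    return 1 + NEIGHBOR_PAGE_READS + conditional_reads


for ref in (None,) + PRIORITY_ORDER:
    print(ref, total_reads(ref))
```

The worst case is 7 reads, which is the first read plus the 6 additional reads mentioned above. Succeeding on REFx11 costs 4.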

Lifetime Extension
- NAC is evaluated with different strengths (different numbers of conditional references).
- Lifetime increases substantially because NAC lowers raw BER to a level the ECC can correct.

Performance Analysis
- Low P/E cycles: performance improves slightly, because NAC's reads of the neighbor MSB/LSB pages produce hits in the SSD buffer for workloads with good locality.
- 18K~24K P/E cycles: less than 5% degradation while providing a 33% lifetime improvement.
- Over 25K P/E cycles: latency rises sharply, because one of every three reads requires NAC due to ECC failure.
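A back-of-the-envelope Python estimate of why latency climbs once one in three reads needs NAC. It counts flash reads only and ignores SSD buffer hits, ECC decode time, and queuing, so it is a rough illustration and not the evaluation model used in the presentation. The function name and the extra-read counts (3 to 6 per NAC invocation) are assumptions.

```python
def average_reads_per_request(nac_fraction, extra_reads_when_nac):
    """Expected flash reads per request when a fraction of requests invoke NAC."""
    return 1.0 + nac_fraction * extra_reads_when_nac


# One of every three reads fails ECC and invokes NAC.
nac_fraction = 1.0 / 3.0

best_case = average_reads_per_request(nac_fraction, 3)   # 2 neighbor reads + 1 conditional read
worst_case = average_reads_per_request(nac_fraction, 6)  # 2 neighbor reads + 4 conditional reads

print(f"average reads per request: {best_case:.2f} to {worst_case:.2f}")
```

Under these assumptions the average request costs roughly two to three flash reads instead of one.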