Data Retention in MLC NAND FLASH Memory: Characterization, Optimization, and Recovery. 서동화 dhdh0113@gmail.com.

Contents
Introduction
Background
Retention Loss Characterization
Retention Optimized Reading (ROR)
Retention Failure Recovery (RFR)

Introduction
Over the past decade, the capacity of NAND flash memory has been increasing continuously, as a result of aggressive process scaling and the advent of multi-level cell technology.
Retention errors, caused by charge leakage over time after a flash cell is programmed, are the dominant source of flash memory errors.
In this paper, we pursue a better understanding of retention error behavior to improve NAND flash reliability and lifetime, and find better ways to mitigate flash retention errors.
Two techniques are presented: Retention Optimized Reading (ROR) and Retention Failure Recovery (RFR).

Background: Basics of NAND Flash Memory
The minimum voltage that can turn on the channel is called the "threshold voltage".
During a program operation, electrons are injected into the floating gate (FG) from the substrate by applying a high positive voltage to the control gate (CG).
During an erase operation, electrons are ejected from the FG into the substrate by applying a high negative voltage to the CG.
[Figure: flash cell biasing, labeled +10V and -20V]

Background: Retention Loss Mechanisms
Retention loss is the phenomenon in which the threshold voltage changes over time without external stimulation.
It is caused by the unavoidable trapping of charge within the tunnel oxide. The amount of trapped charge increases with the electrical stress induced by repeated program and erase (P/E) operations.
Trap-assisted tunneling (TAT): at high P/E cycles, the amount of trapped charge is large enough to form a percolation path that significantly hampers the insulating properties of the gate dielectric, resulting in retention failure.
Result: decreasing threshold voltage.

Retention Loss Characterization
To characterize the threshold voltage distribution over different P/E cycles, we form multiple groups of flash blocks, and repeatedly erase and program them with random data to different predefined P/E cycle targets.
To characterize the threshold voltage distribution over different retention ages, we first program predefined data to each block. We then read and record the threshold voltage distribution of all flash blocks after a certain retention age at room temperature.
We first use the characterization results from a single representative group at 8K P/E cycles.
Threshold voltage distribution (Fig 2-4)
Optimal read reference voltage (Fig 5)
RBER (Fig 6)
Lifetime (Fig 7)

Retention Loss Characterization Threshold Voltage Distribution under Retention Loss <Figure 2> <Figure 4> <Figure 3>

Retention Loss Characterization: Threshold Voltage Distribution under Retention Loss
Finding 1: the threshold voltage distributions of the P2 and P3 states systematically shift to lower voltages with retention age. In the P2 and P3 states, the intrinsic electric field strength is higher, making TAT the dominant source of retention loss.
Finding 2: the threshold voltage distribution of each state becomes wider with retention age. TAT and charge de-trapping can either increase or decrease the threshold voltage.
Finding 3: the threshold voltage distribution of a higher-voltage state shifts faster than that of a lower-voltage state. A higher-voltage cell experiences a greater amount of stress-induced leakage current (SILC), and hence a faster drop in its threshold voltage.

Retention Loss Characterization: Optimal Read Reference Voltage
There exists an optimal read reference voltage (OPT) that achieves the minimal raw bit error rate (RBER) between every two neighboring states.
As the threshold voltage distributions change over retention age, we expect OPT to experience a similar shift.
Finding 4: both P1-P2 OPT and P2-P3 OPT become smaller over retention age.
Finding 5: P2-P3 OPT changes more significantly over retention age than P1-P2 OPT.
<Figure 5>
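The idea of an optimal read reference voltage can be illustrated with a minimal sketch. It assumes, purely for illustration, that two neighboring states are equally likely and Gaussian; the function names and all (mean, std) values are hypothetical and are not measurements from the slides.

```python
import math

def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def raw_bit_error_rate(v_ref, low, high):
    """Misread fraction at v_ref for two equally likely neighboring states.

    low and high are (mean, std) pairs; both are illustrative assumptions.
    """
    low_misread = 1.0 - gaussian_cdf(v_ref, *low)   # low-state cells read above v_ref
    high_misread = gaussian_cdf(v_ref, *high)       # high-state cells read below v_ref
    return 0.5 * (low_misread + high_misread)

def find_opt(low, high, steps=2000):
    """Grid-search the voltage between the two means that minimizes RBER."""
    lo, hi = low[0], high[0]
    candidates = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda v: raw_bit_error_rate(v, low, high))

# Hypothetical normalized distributions for P2 and P3.
p2 = (2.0, 0.15)
p3_fresh = (3.0, 0.15)
p3_aged = (2.8, 0.20)  # shifted lower and wider, in the spirit of Findings 1 and 2

opt_fresh = find_opt(p2, p3_fresh)
opt_aged = find_opt(p2, p3_aged)
rber_stale = raw_bit_error_rate(opt_fresh, p2, p3_aged)  # reading aged data with the fresh OPT
rber_tuned = raw_bit_error_rate(opt_aged, p2, p3_aged)   # reading aged data with its own OPT
```

In this toy model the aged OPT lies below the fresh OPT, and reading aged data with the fresh OPT gives a higher error rate than reading it with the aged OPT, which is the qualitative behavior the findings describe.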

Retention Loss Characterization: RBER for Suboptimal Read Reference Voltages
Finding 6: the optimal read reference voltage corresponding to one retention age is suboptimal for reading data with a different retention age.
Finding 7: RBER decreases as the retention age for which the applied read reference voltage was optimized approaches the actual retention age of the data.
We conclude that one can reduce RBER by estimating and applying the OPT that corresponds to the actual retention age of the data.
<Figure 6>

Retention Loss Characterization: Lifetime vs RBER
Finding 8: the P/E cycle lifetime of flash memory can be extended if the optimal read reference voltage that corresponds to the retention age of the data is used.
We conclude that if we actually apply the 7-day OPT when reading data with a 7-day retention age, RBER reduces in Stage-0 and flash lifetime improves in Stage-1.
<Figure 7>

Retention Optimized Reading (ROR): Design Rationale
Observation 1: flash read latency can be reduced by minimizing the number of read-retries.
Observation 2: the number of read-retries can be reduced by using a closer-to-optimal starting read reference voltage.
Observation 3: the optimal read reference voltages of pages in the same block are close, while those of pages in different blocks are not.
Observation 4: the optimal read reference voltage of pages in a block is upper-bounded by the optimal read reference voltage of the last-programmed page (read default voltages).
<Read-retry figure> N is the number of read-retries. Starting closer to the optimal read reference voltage reduces N.
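Observations 1 and 2 can be sketched with a toy read-retry loop. This is only an illustration under stated assumptions: voltages are integers in hypothetical millivolt units, each retry steps the read reference voltage downward, and the `ecc_succeeds` callback is a stand-in for a real read plus ECC decode. None of the numbers come from the slides.

```python
def count_read_retries(start_v, step, ecc_succeeds, max_retries=64):
    """Return the number of retries N needed before a read decodes.

    start_v and step are integers in hypothetical millivolt units.
    ecc_succeeds(v) is an assumed callback: True if a read at v is correctable.
    Returns None if no attempt succeeds within max_retries.
    """
    v = start_v
    for n in range(max_retries + 1):
        if ecc_succeeds(v):
            return n
        v -= step  # retry with a lower read reference voltage
    return None

# Toy model: a read is correctable when it lands within 100 mV of the block's OPT.
BLOCK_OPT_MV = 2600
def toy_ecc(v):
    return abs(v - BLOCK_OPT_MV) <= 100

retries_from_default = count_read_retries(3000, 50, toy_ecc)  # far from OPT
retries_from_learned = count_read_retries(2750, 50, toy_ecc)  # close to OPT
```

In this toy setup, starting at 3000 mV needs six retries while starting at 2750 mV needs one, which is the effect a closer-to-optimal starting voltage is meant to have.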

Retention Optimized Reading (ROR): Mechanism
ROR has two components.
An online pre-optimization algorithm, triggered daily and after power-on, that learns the starting read reference voltage for each block:
1) initialization
2) try with a lower read reference voltage
3) try with a higher read reference voltage
4) record the optimal read reference voltage
An improved read-retry technique that uses the starting read reference voltage to approach OPT for each block.
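The four pre-optimization steps can be sketched as a simple local search. This is one possible reading of the steps, not the authors' implementation: `count_errors` is a hypothetical callback that reads a page at a given voltage and returns its error count, voltages are integers in assumed millivolt units, and the per-block table is a plain dictionary.

```python
def learn_starting_voltage(initial_v, step, count_errors, v_min, v_max):
    """Search around initial_v for the read reference voltage with fewest errors."""
    # 1) Initialization: measure errors at the current starting voltage.
    best_v = initial_v
    best_errors = count_errors(best_v)

    # 2) Try lower read reference voltages while the error count keeps improving.
    v = best_v - step
    while v >= v_min:
        errors = count_errors(v)
        if errors >= best_errors:
            break
        best_v, best_errors = v, errors
        v -= step

    # 3) Try higher read reference voltages while the error count keeps improving.
    v = best_v + step
    while v <= v_max:
        errors = count_errors(v)
        if errors >= best_errors:
            break
        best_v, best_errors = v, errors
        v += step

    # 4) The caller records the result as the block's starting voltage.
    return best_v

# Toy error model with a minimum at 2450 mV (illustrative only).
def toy_errors(v):
    return (v - 2450) ** 2 // 100

# Hypothetical per-block table kept by the controller.
starting_voltage_table = {}
starting_voltage_table[0] = learn_starting_voltage(2600, 50, toy_errors, 2000, 3000)
starting_voltage_table[1] = learn_starting_voltage(2300, 50, toy_errors, 2000, 3000)
```

Both toy blocks converge to 2450 mV, one by searching downward and one by searching upward. A real controller would run such a search once per block on the daily or power-on trigger and use the recorded value as the first read-retry voltage.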

Retention Optimized Reading (ROR): Evaluation
We assume that all data is refreshed every 7 days, so retention age never exceeds 7 days.
Configurations compared: Baseline, Naive read-retry, ROR.
Overheads considered: performance overhead and storage overhead.
For our evaluated NAND flash device, the total storage overhead is 768KB. This allows the flash controller to manage the ROR read reference voltage table completely within its DRAM buffer.
Daily search -> optimized read reference voltage -> reduction of number of ECC

Retention Failure Recovery (RFR)
Increased temperature exponentially increases retention loss due to increased leakage, and therefore increases the effective retention age of a flash memory. This phenomenon can be modeled by the Arrhenius law.
Retention failures are unavoidable. When a mobile device is exposed to sunlight, the temperature can rise to as high as 70 ℃.
Today's flash devices have a typical retention age of 1 year at room temperature, and uncorrectable errors may start to accumulate after the flash device experiences 70 ℃.
AF : aging factor
t : effective retention age
T : temperature
Ea : activation energy constant
K : Boltzmann constant
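The slide's equation did not survive extraction, so the following is a sketch of the standard Arrhenius acceleration factor, AF = exp((Ea / K) * (1 / T_room - 1 / T_hot)), with temperatures in kelvin. The activation energy of 1.1 eV is an assumption made here for illustration; the slides do not state a value. Function names are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant K in eV/K
ASSUMED_EA_EV = 1.1                  # ASSUMPTION: activation energy, not given in the slides

def celsius_to_kelvin(t_celsius):
    return t_celsius + 273.15

def aging_factor(t_room_c, t_hot_c, ea_ev=ASSUMED_EA_EV):
    """Arrhenius aging factor AF of t_hot_c relative to t_room_c."""
    t_room = celsius_to_kelvin(t_room_c)
    t_hot = celsius_to_kelvin(t_hot_c)
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_room - 1.0 / t_hot))

def effective_retention_age(days_at_hot, t_room_c, t_hot_c, ea_ev=ASSUMED_EA_EV):
    """Equivalent room-temperature retention age t for time spent at t_hot_c."""
    return days_at_hot * aging_factor(t_room_c, t_hot_c, ea_ev)

af_70c = aging_factor(25.0, 70.0)
one_day_at_70c = effective_retention_age(1.0, 25.0, 70.0)
```

Under these assumed constants, time spent at 70 ℃ ages the data a few hundred times faster than time spent at 25 ℃, so about a day at 70 ℃ can correspond to a large fraction of a year of room-temperature retention.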

Retention Failure Recovery (RFR)
At a high retention age, neighboring threshold voltage distributions (P2 and P3) become flatter and closer to each other, so the retention error count increases and a retention failure can appear.
[Figure: fast- and slow-leaking cells]
By identifying the difference in the rate of threshold voltage reduction between these cells, we can guess the original state of a cell with a high enough success rate to recover the data from a retention failure.

Retention Failure Recovery (RFR): Mechanism
RFR identifies fast- vs slow-leaking cells and uses selective bit flipping to correct retention failures and thus reduce RBER.
Step 1) identify data with a retention failure.
Step 2) identify risky cells using three read operations.
Step 3) identify fast- and slow-leaking cells.
Step 4) selective bit flipping.
The three reads use the read reference voltages Vopt-α, Vopt, and Vopt+α. With Vth the threshold voltage of the cell, a read returns 0 if Vth < read reference voltage and 1 if Vth > read reference voltage.
Possible results: { (0,0,0), (1,0,0), (1,1,0), (1,1,1) }
Risky cells: (1,0,0) and (1,1,0)
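Steps 2 and 4 can be sketched as follows. This is a hedged illustration, not the authors' code: voltages are integers in hypothetical millivolt units, the fast/slow classification of step 3 is taken as a given input, and the flipping rule (a fast-leaking cell read just below Vopt is guessed to belong to the higher state, a slow-leaking cell read just above Vopt to the lower state) is one reading of the rationale on the previous slide.

```python
RISKY_SIGNATURES = {(1, 0, 0), (1, 1, 0)}

def three_read_signature(vth, v_opt, alpha):
    """Read one cell at Vopt-alpha, Vopt, Vopt+alpha (1 if Vth is above the reference)."""
    references = (v_opt - alpha, v_opt, v_opt + alpha)
    return tuple(1 if vth > v_ref else 0 for v_ref in references)

def is_risky(signature):
    """A cell is risky when its Vth lies within alpha of Vopt."""
    return signature in RISKY_SIGNATURES

def recover_read_value(signature, is_fast_leaking):
    """Return the (possibly flipped) result of the Vopt read.

    1 means the higher-voltage state, 0 the lower-voltage state.
    is_fast_leaking is assumed to come from step 3.
    """
    read_value = signature[1]
    if signature == (1, 0, 0) and is_fast_leaking:
        return 1  # fast leaker just below Vopt: guess it started in the higher state
    if signature == (1, 1, 0) and not is_fast_leaking:
        return 0  # slow leaker just above Vopt: guess it started in the lower state
    return read_value  # non-risky cells and the remaining risky cells are left as read

# Example with hypothetical values: Vopt = 2600 mV, alpha = 50 mV.
sig_below = three_read_signature(2580, 2600, 50)
sig_above = three_read_signature(2620, 2600, 50)
sig_safe = three_read_signature(2900, 2600, 50)
```

Only cells whose signature is (1,0,0) or (1,1,0) are candidates for flipping, so the three reads limit correction to the narrow band around Vopt where misreads are most likely.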

Retention Failure Recovery (RFR): Evaluation
We evaluate RFR on data programmed to random values that has a 28-day-equivalent retention age.