Coding and Algorithms for Memories Lecture 2 + 4

236601 - Coding and Algorithms for Memories Lecture 2 + 4

Overview. Lecturer: Eitan Yaakobi (yaakobi@cs.technion.ac.il, Taub 638). Lecture hours: Wed 10:30-12:30 @ Taub 9. Course website: http://webcourse.cs.technion.ac.il/232601. Office hours: Wed 17:30-18:30 and/or other times (please contact by email beforehand). Final grade: class participation (10%), homeworks (50%), take-home exam/final homework + project (40%).

What is this class about? Coding and algorithms for memories. Memories – HDDs, flash memories, and other non-volatile memories. Coding and algorithms – how to manage the memory and handle the interface between the physical level and the operating system. Both from the theoretical and the practical points of view.

Memories Volatile Memories – need power to maintain the information Ex: RAM memories, DRAM, SRAM Non-Volatile Memories – do NOT need power to maintain the information Ex: HDD, optical disc (CD, DVD), flash memories Q: Examples of old non-volatile memories?

Some of the main goals in designing computer storage: price, capacity (size), endurance, speed, power consumption.

Optical Storage First generation – CD (Compact Disc), 700MB Second generation – DVD (Digital Versatile Disc), 4.7GB, 1995 Third generation – BD (Blu-Ray Disc) Blue ray laser (shorter wavelength) A single layer can store 25GB, dual layer – 50GB Supported by Sony, Apple, Dell, Panasonic, LG, Pioneer

The Magnetic Hard Disk Drive [figure: “1” and “0” states]

Flash Memories [figure: introduce errors]

Gartner & Phison [market figure]. What are the coding problems?

SLC, MLC and TLC Flash: SLC Flash – 1 bit per cell, 2 states; MLC Flash – 2 bits per cell, 4 states; TLC Flash – 3 bits per cell, 8 states. [Figure: cell-voltage distributions from low voltage to high voltage, with the bit label of each state.]

Flash Memory Structure. A group of cells constitutes a page; a group of pages constitutes a block. In SLC flash, a typical block layout is: page 0, page 1, page 2, page 3, page 4, page 5, …, page 62, page 63.

Flash Memory Structure – MSB/LSB. In MLC flash the two bits within a cell DO NOT belong to the same page – there is an MSB page and an LSB page. Given a group of cells, all the MSBs constitute one page and all the LSBs constitute another page. A typical MLC block layout (32 rows, 128 pages):

Row  | MSB of first 2^14 cells | LSB of first 2^14 cells | MSB of last 2^14 cells | LSB of last 2^14 cells
0    | page 0   | page 4   | page 1   | page 5
1    | page 2   | page 8   | page 3   | page 9
2    | page 6   | page 12  | page 7   | page 13
3    | page 10  | page 16  | page 11  | page 17
⋮
30   | page 118 | page 124 | page 119 | page 125
31   | page 122 | page 126 | page 123 | page 127

MLC Write Process [figure: cell-voltage distributions with the read threshold Vread and program-verify levels PV1, PV2, PV3, labeled by their MSB/LSB values]

Flash Memory Structure – MSB/CSB/LSB pages. A typical TLC block layout (each row holds a first and a last group of 2^16 cells):

Row  | MSB first | CSB first | LSB first | MSB last | CSB last | LSB last
0    | page 0    | –         | –         | page 1   | –        | –
1    | page 2    | page 6    | page 12   | page 3   | page 7   | page 13
2    | page 4    | page 10   | page 18   | page 5   | page 11  | page 19
3    | page 8    | page 16   | page 24   | page 9   | page 17  | page 25
4    | page 14   | page 22   | page 30   | page 15  | page 23  | page 31
⋮
62   | page 362  | page 370  | page 378  | page 363 | page 371 | page 379
63   | page 368  | page 376  | –         | page 369 | page 377 | –
64   | page 374  | page 382  | –         | page 375 | page 383 | –
65   | page 380  | –         | –         | page 381 | –        | –

(“first”/“last” refer to the first and last 2^16 cells of the row.)

Flash Memories Programming. Array of cells made from floating-gate transistors; a typical size can be 32×2^15 cells. The cells are programmed by pulsing electrons via hot-electron injection. Each cell can have q levels, represented by different amounts of electrons. In order to reduce a cell's level, the cell and its containing block must be reset to level 0 before rewriting – A VERY EXPENSIVE OPERATION.

Programming of Flash Memory Cells. Flash memory cells are programmed in parallel in order to increase the write speed. Cells can only increase their value; in order to decrease a cell level, its entire containing block (~10^6 cells) has to be erased first. Flash memory cells do not behave identically: when charge is injected, only a fraction of it is trapped in the cell. Easy cells – most of the charge is trapped in the cell. Hard cells – only a small fraction of the charge is trapped in the cell. Goals: programming is done cautiously to prevent over-shooting; programming should work for both easy and hard cells; and still… be fast enough.

Incremental Step Pulse Programming (ISPP). Gradually increase the program voltage. First the easy cells reach their level; on subsequent steps, only cells which didn't reach their level are programmed. Enables fast programming of both easy and hard cells.
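To make the ISPP idea concrete, here is a toy Python sketch (the step size, the charge-trapping model and all names are my own illustrative assumptions, not from the slides): the program voltage is raised in small steps, and on each step only cells that have not yet reached their target level receive a pulse, so easy cells finish early while hard cells keep receiving pulses.

```python
import random

def ispp(targets, efficiency, step=0.1, max_pulses=1000):
    """Toy ISPP: pulse only cells below their target level; efficiency[i]
    is the fraction of each pulse's charge actually trapped in cell i."""
    levels = [0.0] * len(targets)
    for pulse in range(max_pulses):
        active = [i for i in range(len(targets)) if levels[i] < targets[i]]
        if not active:                      # every cell reached its level
            return levels, pulse
        for i in active:                    # easy cells drop out early
            levels[i] += step * efficiency[i]
    return levels, max_pulses

random.seed(0)
targets = [1.0] * 8                         # all cells aim for the same level
efficiency = [random.uniform(0.3, 1.0) for _ in targets]  # easy vs. hard cells
levels, pulses = ispp(targets, efficiency)
print(pulses, [round(v, 2) for v in levels])  # hard cells set the pulse count
```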

Rewriting Codes Array of cells, made of floating gate transistors Each cell can store q different levels Today, q typically ranges between 2 and 16 The levels are represented by the number of electrons The cell’s level is increased by pulsing electrons To reduce a cell level, all cells in its containing block must first be reset to level 0 A VERY EXPENSIVE OPERATION

Rewriting Codes. Problem: cannot rewrite the memory without an erasure. However… it is still possible to rewrite if only cells at a low level are programmed.

From Wikipedia: One limitation of flash memory is that, although it can be read or programmed a byte or a word at a time in a random access fashion, it can only be erased a "block" at a time. This generally sets all bits in the block to 1. Starting with a freshly erased block, any location within that block can be programmed. However, once a bit has been set to 0, only by erasing the entire block can it be changed back to 1. In other words, flash memory (specifically NOR flash) offers random-access read and programming operations, but does not offer arbitrary random-access rewrite or erase operations. A location can, however, be rewritten as long as the new value's 0 bits are a superset of the over-written values. For example, a nibble value may be erased to 1111, then written e.g. as 1110. Successive writes to that nibble can change it to 1010, then 0010, and finally 0000. Essentially, erasure sets all bits to 1, and programming can only clear bits to 0. File systems designed for flash devices can make use of this capability, for example to represent sector metadata.
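The quoted rule reduces to a simple bitwise check; below is a minimal sketch of it (the function name is mine): a new value can overwrite an old one in place only if no bit has to go back from 0 to 1.

```python
def writable_in_place(old: int, new: int) -> bool:
    """True iff `new` can overwrite `old` without an erase (bits only 1 -> 0)."""
    return (new & ~old) == 0

# The nibble example from the quote: 1111 -> 1110 -> 1010 -> 0010 -> 0000
vals = [0b1111, 0b1110, 0b1010, 0b0010, 0b0000]
assert all(writable_in_place(a, b) for a, b in zip(vals, vals[1:]))
assert not writable_in_place(0b1010, 0b1110)   # would need a 0 -> 1 flip
```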

Rewriting Codes. Rewrite codes significantly reduce the number of block erasures. Example trade-off: store 3 bits once vs. store 1 bit 8 times; store 4 bits once vs. store 1 bit 16 times.

Rewriting Codes One of the most efficient schemes to decrease the number of block erasures Floating Codes Buffer Codes Trajectory Codes Rank Modulation Codes WOM Codes

Write-Once Memories (WOM) Introduced by Rivest and Shamir, “How to reuse a write-once memory”, 1982 The memory elements represent bits (2 levels) and are irreversibly programmed from ‘0’ to ‘1’

Write-Once Memories (WOM) – Examples (data → memory state):
1st write: 00 → 000, then 2nd write: 11 → 011
1st write: 10 → 010, then 2nd write: 00 → 111
1st write: 11 → 100, then 2nd write: 10 → 101
1st write: 01 → 001
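As a concrete illustration, here is a minimal sketch of the [3,2; 4,4] scheme behind the table above, using its first-write mapping and the convention that a second write stores the bitwise complement of the new message's first-write codeword (all function names are mine):

```python
# First-write mapping taken from the example table above.
E1 = {'00': '000', '01': '001', '10': '010', '11': '100'}
D1 = {v: k for k, v in E1.items()}

def write1(msg):
    return E1[msg]

def write2(msg, state):
    """Second write: cells may only be programmed 0 -> 1."""
    if state == E1[msg]:                    # message unchanged, keep the cells
        return state
    new = ''.join('1' if b == '0' else '0' for b in E1[msg])   # complement
    assert not any(s == '1' and n == '0' for s, n in zip(state, new))
    return new

def read(state):
    if state.count('1') <= 1:               # still a first-write codeword
        return D1[state]
    comp = ''.join('1' if b == '0' else '0' for b in state)
    return D1[comp]

# Second row of the table: write 10, then overwrite it with 00.
c = write1('10')                            # '010'
assert read(c) == '10'
c = write2('00', c)                         # '111'
assert read(c) == '00'
```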

Write-Once Memories (WOM) Introduced by Rivest and Shamir, “How to reuse a write-once memory”, 1982 The memory elements represent bits (2 levels) and are irreversibly programmed from ‘0’ to ‘1’ Q: How many cells are required to write 100 bits twice? P1: Is it possible to do better…? P2: How many cells to write k bits twice? P3: How many cells to write k bits t times? P3’: What is the total number of bits that is possible to write in n cells in t writes?

Binary WOM Codes. k1,…,kt: the number of bits written on each write, with n cells and t writes. The sum-rate of the WOM code is R = (Σi=1..t ki)/n. Rivest-Shamir: R = (2+2)/3 = 4/3.

Definition: WOM Codes. An [n,t;M1,…,Mt] t-write WOM code is a coding scheme which consists of n cells and guarantees any t writes of alphabet sizes M1,…,Mt by programming cells only from zero to one. A WOM code consists of t encoding and decoding maps Ei, Di, 1 ≤ i ≤ t: E1: {1,…,M1} → {0,1}^n; for 2 ≤ i ≤ t, Ei: {1,…,Mi}×{0,1}^n → {0,1}^n such that for all (m,c)∊{1,…,Mi}×{0,1}^n, Ei(m,c) ≥ c; for 1 ≤ i ≤ t, Di: {0,1}^n → {1,…,Mi} such that Di(Ei(m,c)) = m for all (m,c)∊{1,…,Mi}×{0,1}^n. The sum-rate of the WOM code is R = (Σi=1..t log Mi)/n. Rivest-Shamir: [3,2;4,4], R = (log4+log4)/3 = 4/3.

Definition: WOM Codes. There are two cases: the individual rates on each write must all be the same – fixed-rate; or the individual rates are allowed to be different – unrestricted-rate. We assume that the write number on each write is known; this knowledge does not affect the rate: assuming there exists an [n,t;M1,…,Mt] t-write WOM code where the write number is known, it is possible to construct an [Nn+t, t; M1^N,…,Mt^N] t-write WOM code where the write number is not known, so asymptotically the sum-rate is the same.

James Saxe’s WOM Code. An [n, n/2-1; n/2, n/2-1, n/2-2,…,2] WOM code. Partition the memory into two parts of n/2 cells each. First write: for input symbol m∊{1,…,n/2}, program the mth cell of the 1st group. The ith write, i≥2: for input symbol m∊{1,…,n/2-i+1}, copy the first group to the second group, then program the mth available (still-unprogrammed) cell in the 1st group. Decoding: there is always exactly one cell that is programmed in the 1st group and not in the 2nd; its location, among the cells not programmed in the 2nd group, is the message value. Sum-rate: (log(n/2)+log(n/2-1)+…+log2)/n = log((n/2)!)/n ≈ ((n/2)log(n/2))/n ≈ (log n)/2.

James Saxe’s WOM Code. Example: n=8, an [8,3; 4,3,2] code. Partition the memory into two parts of 4 cells each. First write: 3; second write: 2; third write: 1: 0,0,0,0|0,0,0,0 → 0,0,1,0|0,0,0,0 → 0,1,1,0|0,0,1,0 → 1,1,1,0|0,1,1,0. Sum-rate: (log4+log3+log2)/8 = 4.58/8 = 0.57.
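Below is a minimal sketch of Saxe's scheme (1-indexed messages, helper names mine); it reproduces the n = 8 trace above.

```python
def saxe_write(A, B, m, first=False):
    """Write message m into the two halves A and B; cells only go 0 -> 1."""
    if not first:
        B[:] = [a | b for a, b in zip(A, B)]      # copy the 1st group into the 2nd
    avail = [i for i, a in enumerate(A) if a == 0]
    A[avail[m - 1]] = 1                           # program the m-th available cell

def saxe_read(A, B):
    """The unique cell set in A but not in B, ranked among B's unprogrammed cells."""
    j = next(i for i in range(len(A)) if A[i] == 1 and B[i] == 0)
    zeros_of_B = [i for i, b in enumerate(B) if b == 0]
    return zeros_of_B.index(j) + 1

# The n = 8 example above: write 3, then 2, then 1.
A, B = [0] * 4, [0] * 4
saxe_write(A, B, 3, first=True); assert saxe_read(A, B) == 3   # 0010|0000
saxe_write(A, B, 2);             assert saxe_read(A, B) == 2   # 0110|0010
saxe_write(A, B, 1);             assert saxe_read(A, B) == 1   # 1110|0110
```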

WOM Codes Constructions Rivest and Shamir ‘82 [3,2; 4,4] (R=1.33); [7,3; 8,8,8] (R=1.28); [7,5; 4,4,4,4,4] (R=1.42); [7,2; 26,26] (R=1.34) Tabular WOM-codes “Linear” WOM-codes David Klaner: [5,3; 5,5,5] (R=1.39) David Leavitt: [4,4; 7,7,7,7] (R=1.60) James Saxe: [n,n/2-1; n/2,n/2-1,n/2-2,…,2] (R≈0.5*log n), [12,3; 65,81,64] (R=1.53) Merkx ‘84 – WOM codes constructed with Projective Geometries [4,4;7,7,7,7] (R=1.60), [31,10; 31,31,31,31,31,31,31,31,31,31] (R=1.598) [7,4; 8,7,8,8] (R=1.69), [7,4; 8,7,11,8] (R=1.75) [8,4; 8,14,11,8] (R=1.66), [7,8; 16,16,16,16, 16,16,16,16] (R=1.75) Wu and Jiang ‘09 - Position modulation code for WOM codes [172,5; 256, 256,256,256,256] (R=1.63), [196,6; 256,256,256,256,256,256] (R=1.71), [238,8; 256,256,256,256,256,256,256,256] (R=1.88), [258,9; 256,256,256,256,256,256,256,256,256] (R=1.95), [278,10; 256,256,256,256,256,256,256,256,256,256] (R=2.01)

The Coset Coding Scheme Cohen, Godlewski, and Merkx ‘86 – The coset coding scheme Use Error Correcting Codes (ECC) in order to construct WOM-codes Let C[n,n-r] be an ECC with parity check matrix H of size r×n Write r bits: Given a syndrome s of r bits, find a length-n vector e such that H⋅eT = s Use ECC’s that guarantee on successive writes to find vectors that do not overlap with the previously programmed cells The goal is to find a vector e of minimum weight such that only 0s flip to 1s

The Coset Coding Scheme. C[n,n-r] is an ECC with an r×n parity check matrix H. Write r bits: given a syndrome s of r bits, find a length-n vector e such that H⋅e^T = s. Example: H is a parity check matrix of the [7,4] Hamming code: s=100, v1 = 0000100: c = 0000100; s=000, v2 = 1001000: c = 1001100; s=111, v3 = 0100010: c = 1101110; s=010, … → can’t write! This matrix gives a [7,3; 8,8,8] WOM code. The Golay (23,12,7) code gives [23,3; 2^11,2^11,2^11], R = 33/23 = 1.43. The Hamming code: r bits, 2^(r-2)+2 times, 2^r–1 cells: R = r(2^(r-2)+2)/(2^r–1). Improved by Godlewski (1987) to 2^(r-2)+2^(r-4)+2 times: R = r(2^(r-2)+2^(r-4)+2)/(2^r–1).
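Here is a small sketch of the coset write step with the [7,4] Hamming parity-check matrix used in the later example summary (function names are mine); brute force is fine at this size, and it reproduces the sequence above, including the failure on s = 010.

```python
from itertools import combinations

H = [[1, 1, 1, 0, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]

def syndrome(v):
    return tuple(sum(row[i] * v[i] for i in range(7)) % 2 for row in H)

def coset_write(c, s):
    """Find a minimum-weight e flipping only 0s of c so that H*(c+e)^T = s."""
    zeros = [i for i, b in enumerate(c) if b == 0]
    for w in range(len(zeros) + 1):               # try the smallest weight first
        for pos in combinations(zeros, w):
            new = [b | (i in pos) for i, b in enumerate(c)]
            if syndrome(new) == tuple(s):
                return new
    return None                                    # no such e: can't write

# The example above: syndromes 100, 000, 111 succeed, then 010 fails.
c = [0] * 7
for s in [(1, 0, 0), (0, 0, 0), (1, 1, 1)]:
    c = coset_write(c, s)
    assert syndrome(c) == s
print(c)                           # [1, 1, 0, 1, 1, 1, 0], i.e. 1101110 as above
print(coset_write(c, (0, 1, 0)))   # None: the fourth write is impossible
```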

Variation of the Coset Coding Scheme Yunnan Wu (2010) – Two-write WOM-codes Constructions of WOM-codes by a computer search, [7,2; 176,76] (R=1.37) A general construction for the ε-error case, inspired from the memory with defects constructions and the coset coding scheme Let C[n,n-r] be an ECC with parity check matrix H First Write: write n–r bits Second write: write with high probability r bits as in the coset coding scheme

Binary Two-Write WOM-Codes. C[n,n-r] is a linear code with a parity check matrix H of size r×n. For a vector v ∊ {0,1}^n, Hv is the matrix H with 0’s in the columns that correspond to the positions of the 1’s in v. Example: v1 = (0 1 0 1 1 0 0).

Binary Two-Write WOM-Codes. First write: program only vectors v such that rank(Hv) = r, i.e. v ∊ VC = { v ∊ {0,1}^n | rank(Hv) = r}. For H we get |VC| = 92, so we can write 92 messages. Assume we write v1 = (0 1 0 1 1 0 0).

Binary Two-Write WOM-Codes. First write: program only vectors v such that rank(Hv) = r, VC = { v ∊ {0,1}^n | rank(Hv) = r}. Second write encoding: to write a vector s2 of r bits, calculate s1 = H⋅v1, find v2 such that Hv1⋅v2 = s1+s2 (such a v2 exists since rank(Hv1) = r), and write v1+v2 to memory. Second write decoding: multiply the received word by H: H⋅(v1+v2) = H⋅v1 + H⋅v2 = s1 + (s1+s2) = s2. Example: s2 = 001, s1 = H⋅v1 = 010, Hv1⋅v2 = s1+s2 = 011 → v2 = 0 0 0 0 0 1 1, so v1+v2 = 0 1 0 1 1 1 1.

Example Summary. Let H be the parity check matrix of the [7,4] Hamming code:
H = 1 1 1 0 1 0 0
    1 0 1 1 0 1 0
    1 1 0 1 0 0 1
First write: program only vectors v such that rank(Hv) = 3, VC = { v ∊ {0,1}^n | rank(Hv) = 3}; for this H, |VC| = 92, so we can write 92 messages. Assume we write v1 = 0 1 0 1 1 0 0. Write 0’s in the columns of H corresponding to the 1’s in v1:
Hv1 = 1 0 1 0 0 0 0
      1 0 1 0 0 1 0
      1 0 0 0 0 0 1
Second write: write r = 3 bits, for example s2 = 0 0 1. Calculate s1 = H⋅v1 = 0 1 0. Solve: find a vector v2 such that Hv1⋅v2 = s1 + s2 = 0 1 1; choose v2 = 0 0 0 0 0 1 1. Finally, write v1 + v2 = 0 1 0 1 1 1 1. Decoding: H ⋅ [0 1 0 1 1 1 1]^T = [0 0 1] = s2.
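A minimal sketch of this two-write construction with the same parity-check matrix (helper names mine); everything is brute force over GF(2), which is fine for n = 7.

```python
from itertools import product

H = [[1, 1, 1, 0, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]

def mul(M, v):                         # matrix-vector product over GF(2)
    return tuple(sum(row[i] * v[i] for i in range(7)) % 2 for row in M)

def H_v(v):                            # zero out the columns where v has a 1
    return [[row[i] * (1 - v[i]) for i in range(7)] for row in H]

def rank2(M):                          # rank over GF(2) via the row span
    span = {0}
    for row in M:
        r = int(''.join(map(str, row)), 2)
        span |= {x ^ r for x in span}
    return len(span).bit_length() - 1

VC = [v for v in product([0, 1], repeat=7) if rank2(H_v(v)) == 3]
print(len(VC))                         # 92 first-write messages, as stated above

def second_write(v1, s2):
    """Find v2 on the zeros of v1 with H_{v1} * v2 = s1 + s2, return v1 + v2."""
    s1 = mul(H, v1)
    target = tuple(a ^ b for a, b in zip(s1, s2))
    zeros = [i for i, b in enumerate(v1) if b == 0]
    for bits in product([0, 1], repeat=len(zeros)):
        v2 = [0] * 7
        for i, b in zip(zeros, bits):
            v2[i] = b
        if mul(H_v(v1), v2) == target:
            return tuple(a | b for a, b in zip(v1, v2))

v1 = (0, 1, 0, 1, 1, 0, 0)             # first write (rank(H_v1) = 3)
c = second_write(v1, (0, 0, 1))        # second write of s2 = 001
assert c == (0, 1, 0, 1, 1, 1, 1) and mul(H, c) == (0, 0, 1)   # decoding gives s2
```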

Sum-rate Results The construction works for any linear code C For any C[n,n-r] with parity check matrix H, VC = { v ∊ {0,1}n | rank(Hv) = r} The rate of the first write is: R1(C) = (log2|VC|)/n The rate of the second write is: R2(C) = r/n Thus, the sum-rate is: R(C) = (log2|VC| + r)/n In the last example: R1= log(92)/7=6.52/7=0.93, R2=3/7=0.42, R=1.35 Goal: Choose a code C with parity check matrix H that maximizes the sum-rate

Sum-rate Results The (23,11,8) Golay code: (0.9415,0.5217), R = 1.4632 The (16,5,8) Reed-Muller (4,2) code: (0.7691, 0.6875), R = 1.4566 We can limit the number of messages available for the first write so that both writes have the same rate, R1 = R2 = 0.6875, and R = 1.375 By computer search we found more codes Best code we found has rate 1.4928 For fixed rate on both writes, we found 1.4546

Capacity Achieving Results The Capacity region C2-WOM={(R1,R2)|∃p∊[0,0.5],R1≤h(p), R2≤1-p} Theorem: For any (R1, R2)∊C2-WOM and ε>0, there exists a linear code C satisfying R1(C) ≥ R1-ε and R2(C) ≥ R2–ε By computer search Best unrestricted sum-rate 1.4928 (upper bound 1.58) Best fixed sum-rate 1.4546 (upper bound 1.54)

Capacity Region and Achievable Rates of Two-Write WOM Codes [figure]

The Entropy Function. How many length-n vectors are there with at most a single 1? n+1. How many bits is it possible to represent this way? log(n+1), so the rate is log(n+1)/n. How many vectors are there with at most k 1’s? Σi=0..k C(n,i), representing log(Σi=0..k C(n,i)) bits at rate log(Σi=0..k C(n,i))/n. Is it possible to approximate this value? Yes! It is ≈ h(p), where p = k/n and h(p) = -plog(p)-(1-p)log(1-p) is the Binary Entropy Function. h(p) is the information rate that is possible to represent when bits are programmed with probability p.
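A quick numeric sanity check of this approximation (the particular n and k below are arbitrary choices of mine):

```python
from math import comb, log2

def h(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

n, k = 1000, 200                       # p = k/n = 0.2
rate = log2(sum(comb(n, i) for i in range(k + 1))) / n
print(rate, h(k / n))                  # ≈ 0.717 vs. h(0.2) ≈ 0.722
```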

The Binary Symmetric Channel. When transmitting a binary vector, every bit is in error with probability p, so roughly pn bits will be in error. The amount of information which is lost is h(p); therefore, the channel capacity is C(p) = 1-h(p). The channel capacity is an indication of the amount of rate which is lost, or how much it is necessary to “pay” in order to correct the errors in the channel.

The Capacity of WOM Codes. The capacity region for two writes: C2-WOM = {(R1,R2) | ∃p∊[0,0.5], R1 ≤ h(p), R2 ≤ 1-p}, where h(p) is the binary entropy function h(p) = -plog(p)-(1-p)log(1-p). The maximum achievable sum-rate is maxp∊[0,0.5]{h(p)+(1-p)} = log3, achieved for p=1/3: R1 = h(1/3) = log(3)-2/3, R2 = 1-1/3 = 2/3. Capacity region for t writes (Heegard ‘86, Fu and Han Vinck ‘99): Ct-WOM = {(R1,…,Rt) | ∃p1,…,pt-1∊[0,0.5], R1 ≤ h(p1), R2 ≤ (1–p1)h(p2), …, Rt-1 ≤ (1–p1)⋯(1–pt–2)h(pt–1), Rt ≤ (1–p1)⋯(1–pt–1)}. The maximum achievable sum-rate is log(t+1).

The Capacity of WOM Codes. The capacity region for two writes: C2-WOM = {(R1,R2) | ∃p∊[0,0.5], R1 ≤ h(p), R2 ≤ 1-p}, h(p) = -plog(p)-(1-p)log(1-p). The capacity region for t writes: Ct-WOM = {(R1,…,Rt) | ∃p1,p2,…,pt-1∊[0,0.5], R1 ≤ h(p1), R2 ≤ (1–p1)h(p2), …, Rt-1 ≤ (1–p1)⋯(1–pt–2)h(pt–1), Rt ≤ (1–p1)⋯(1–pt–1)}. Here p1 is the probability to program a cell on the 1st write: R1 ≤ h(p1); p2 is the probability to program a cell on the 2nd write (out of the remaining cells): R2 ≤ (1-p1)h(p2); pt-1 is the probability to program a cell on the (t-1)th write (out of the remainder): Rt-1 ≤ (1–p1)⋯(1–pt–2)h(pt–1); and Rt ≤ (1–p1)⋯(1–pt–1) because a (1–p1)⋯(1–pt–1) fraction of the cells still weren’t programmed. The maximum achievable sum-rate is log(t+1).
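A tiny numeric check of the two-write statement above, that h(p)+(1-p) is maximized at p = 1/3 with value log 3 (a coarse grid search, just as a sanity check):

```python
from math import log2

def h(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

best_p = max((i / 10000 for i in range(1, 5000)), key=lambda p: h(p) + 1 - p)
print(best_p, h(best_p) + 1 - best_p, log2(3))   # ≈ 0.3333, 1.585, 1.585
```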

The Capacity for Fixed Rate. The capacity region for two writes is C2-WOM = {(R1,R2) | ∃p∊[0,0.5], R1 ≤ h(p), R2 ≤ 1-p}. When forcing R1 = R2 we get h(p) = 1-p; the (numerical) solution is p = 0.2271, and the sum-rate is 1.54. Multiple writes: a recursive formula calculates the maximum achievable fixed-rate sum-rate: RF(1) = 1, RF(t+1) = (t+1)·root{h(t·z/RF(t)) - z}, where root{f(z)} is the minimum positive value z s.t. f(z) = 0. For example: RF(2) = 2·root{h(z)-z} = 2·0.7729 = 1.5458; RF(3) = 3·root{h(2z/1.5458)-z} = 3·0.6437 = 1.9311.
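A short sketch that evaluates this recursion numerically with bisection (helper names mine); it reproduces the values 1.5458 and 1.9311 quoted above.

```python
from math import log2

def h(p):
    p = min(max(p, 0.0), 1.0)          # clamp so the recursion stays defined
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def root(f, lo=1e-9, hi=1.0):
    """Bisection for the positive zero of f, assuming f(lo) > 0 > f(hi)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

RF = [1.0]                             # RF(1) = 1
for t in range(1, 3):
    r = root(lambda z, t=t: h(t * z / RF[-1]) - z)
    RF.append((t + 1) * r)
print([round(x, 4) for x in RF])       # ≈ [1.0, 1.5458, 1.9311]
```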

More Constructions Shpilka, “New constructions of WOM codes using the Wozencraft ensemble” An efficient capacity-achieving two-write construction 1st write – program any vector of weight at most m (fixed) 2nd write – instead of using one matrix, use a set of matrices such that at least one of them succeeds on the second write Need to index the matrix for the 2nd write – negligible if the number of matrices is small Use the Wozencraft ensemble of linear codes to construct a good set of matrices

Polar WOM Codes A probabilistic approach to construct WOM codes which works with high probability Similar to the one by Wu On each write, encode more bits and write a vector that matches the bits which were already programmed Can combine with ECC so the redundancy is used both for rewriting and error correction Another recent construction using LDPC codes

Typical Use of WOM Codes. The user writes logical data pages; the page size increases with encoding. Invalid pages are ‘reused’ without erasing; a read precedes the second write. Example two-write code (data → 1st write / 2nd write): 00 → 000 / 111; 10 → 100 / 011; 01 → 010 / 101; 11 → 001 / 110. [Figure: WOM encoder mapping a page of “Data Size” to a page of “Encoded Size”, written over invalid pages.]

Why/When to Use WOM Codes? Disadvantage: a large amount of the capacity is sacrificed. Ex: for two-write WOM codes the best sum-rate is log3 ≈ 1.58, so each write can store (at most) only 0.79n bits and there is a loss of (at least) 21% of the capacity. Advantage: WOM codes can increase the lifetime of the memory and reduce the write amplification.

Why/When to Use WOM Codes? Advantage: WOM codes can increase the lifetime of the memory and reduce the write amplification. Example: a user has 3GB of flash with a lifetime of 100 P/E cycles, and each day writes 2GB of new data (no need to store the old data). Without WOM, the memory lasts (3/2)·100 = 150 days. With WOM (the Rivest-Shamir scheme), the memory is erased only once every two days → the memory lasts 2·100 = 200 days.
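The arithmetic in this example as a tiny sketch (variable names mine):

```python
capacity_gb, pe_cycles, daily_gb = 3, 100, 2

days_without_wom = capacity_gb * pe_cycles / daily_gb   # 3 GB * 100 / 2 GB = 150 days
# Rivest-Shamir: the same 3 GB absorbs two 2 GB writes between erasures,
# so the whole memory is erased only once every two days.
days_with_wom = 2 * pe_cycles                           # 200 days
print(days_without_wom, days_with_wom)
```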

Drawbacks of Typical Use. Capacity overhead: 29%-50% additional storage is needed for WOM coding. Performance overheads: I/O operations access 29%-50% more bits, and a read precedes every second write. Compatibility: requires a modification of the physical page size, or accessing 2 physical pages. [Figure: Data Size vs. Encoded Size vs. Overprovisioning.]

Another Approach Capacity region of two-write WOM codes (R1=1, R2=0.5)

Another Approach. Do not touch: the interface, complexity, or logical capacity. The design handles: failures → retry; latency → parallelism; capacity, efficiency, success rate. Speaker notes: our observation is that for a real system there are three things you cannot touch; we will make some compromises in other aspects, but our design will handle them, so we’re leaving that dotted line and moving to a point that’s actually very close to the blue line. [Figure: capacity region with the point (R1=1, R2=0.5), 1st write, 2nd write, and the Rivest & Shamir point.]

Reusable SSD. 1st write: (almost) unmodified → no overhead. 2nd write: one logical page → two physical pages. [Figure sequence: the WOM encoder takes a logical page and writes its encoding across two invalid physical pages.]

Hot/Cold Data. First writes are more space efficient – best for long-term storage. Hot data will be overwritten soon; cold data will remain valid for long. Use second writes for hot pages. Identify hot data according to I/O size – heuristic: small → hot, large → cold; more accurate classifications are available. Speaker notes: at each moment we have a pool of blocks for first writes and a pool of blocks for second writes. Usually, a few hot pages are responsible for a major portion of write requests (pick your favorite long-tail distribution). It is customary to assume that internal GC writes are cold. It has been shown that separating hot and cold data is useful, so there are plenty of classification schemes out there; we just use the simplest one as a proof of concept.

Putting it All Together. The FTL directs user writes (hot/cold classification, load balancing) to 1st writes on clean blocks or 2nd writes on recycled blocks, across the two planes. Block life cycle: clean → used (1st writes) → recycled → reused (2nd writes) → full → garbage collection → erase. Speaker notes: there is a pool in each plane – two planes in each flash chip can be accessed in parallel. If there is a pair of recycled blocks we direct the hot data to them and write concurrently; first writes are performed independently in each plane. Any data can be written anywhere, no need to direct it to partitions in advance. There will be several blocks in each state; GC will choose one of the used/reused blocks, and some valid data may still be there. As long as we can, and no limit has been reached, used blocks will be recycled (nothing happens during a recycle, just a state change). Recalling the analysis, the most benefit is reached if used blocks are always recycled before erasure; however, at any point we can skip recycling, and then Reusable SSD is equivalent to the standard SSD. Open questions for garbage collection: lifetime? #recycled + #reused?

Analysis (best case, without GC). Let N be the number of logical pages written and Z the number of pages per block. Standard SSD: E = N/Z erasures. Reusable SSD: “write once, get 50% free” – each block also absorbs Z/2 second-write logical pages, so E' = N/(Z+Z/2) = (2/3)E → a 33% reduction in erasures. Speaker notes: so where is the capacity overhead? The overprovisioned blocks are simply blocks that hold invalid data and lie idle until they are erased; instead of letting them lie, we use them for second writes. We need two physical blocks for each logical block of data written this way, but the overall amount of logical data stored stays the same. There is an upper limit on the number of blocks that can be reused, but they don’t have to be allocated in advance – the blocks that are recycled are chosen online, based on their amount of invalid data. Based on the workload we can also decide to use first writes only, to ensure that our use of the overprovisioned space does not degrade performance. This is a best-case analysis (no internal writes, WA = 1, etc.) – think of it as the upper limit on the benefit from the design; in practice the benefit turns out to be very close to it.
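The best-case erasure comparison above as a two-function sketch (N, Z and the sample numbers are mine):

```python
def erasures_standard(N, Z):
    return N / Z                        # every block absorbs Z logical pages

def erasures_reusable(N, Z):
    return N / (Z + Z / 2)              # plus Z/2 second-write pages per block

N, Z = 1_000_000, 128                   # logical pages written, pages per block
print(erasures_reusable(N, Z) / erasures_standard(N, Z))   # 2/3, i.e. 33% fewer
```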

Evaluation. Questions: how many erasures are saved? How is performance affected? What is the sensitivity to design parameters? Setup: DiskSim simulator with the available SSD extension and a modified FTL component; the simulator and traces are very widely used. Three representative disk configurations:

Type | Pgs/Blk | R (us) | W (ms) | E (ms)
SLC  | 64      | 30     | 0.3    | 3
MLC  | 128     | 200    | 1.3    | 1.5
MLC  | 256     | 80     | 5      | –

Trace input: Microsoft MSR + Exchange traces, and synthetic Zipf.

Erasures. Expected: a 33% reduction. [Figure: X axis – different traces (ordered by the amount of data written compared to disk size); Y axis – number of cleans compared to a standard SSD (1 means the same); red – enterprise class, blue – consumer class.] Reusable SSD (almost) always reduces erasures, very close to the expected 33%; more than expected when the trace is short, less than expected when there is a lot of cold data.

Response Time. Enterprise: up to 15% reduction; consumer: up to 35% reduction. Fewer erasures mean less GC, so performance improves. The latency of second writes is offset by parallelism. More improvement with low OP, where erasures are more expensive.