What you should know about Flash Storage


Flash storage is a frequent topic on our support channels. Toradex invests a lot of resources into making the storage as reliable as possible. Nevertheless, it is important to understand some basics of the underlying storage device. One of the most important things to know is that flash storage wears out: writing large amounts of data to the built-in storage can eventually destroy it. With this post, we want to give you a basic overview of the potential issues flash storage can have. Let’s start with a short technology overview.

Flash types: Raw Flash vs Managed Flash

Currently, Toradex computer modules use NOR, NAND, and eMMC flash. NOR and NAND are raw storage devices. The main differences between them are that NOR allows random access, does not need error correction, and has a higher cost per bit. NAND, on the other hand, can only be read in pages; some bits in a page may be wrong and must be corrected by an error correction mechanism.

eMMC flash combines NAND memory with a built-in controller that handles most of the nasty things you have to take care of when dealing with NAND flash. eMMC is also called managed NAND. With raw NAND and NOR flash, on the other hand, the OS and device drivers are responsible for handling these issues. We will discuss the different kinds of challenges later in this blog post. Here is a small overview of the flash types used on our computer modules:

Evolution of NAND Flash: From SLC to MLC

The bit density of NAND flash has evolved over time. The first NAND devices were Single Level Cell (SLC) flash, which means every flash cell stores a single bit. Multi Level Cell (MLC) flash stores two or more bits per cell, which increases the bit density. That sounds great, but MLC has downsides as well: it comes with a higher bit error rate and lower endurance. All eMMC devices use MLC NAND. Some eMMC devices allow you to switch part or all of the storage into a pseudo-SLC (pSLC) mode. This reduces the usable capacity but increases the endurance of the device.

Here is a rough comparison of SLC and MLC.

Endurance: Limited amount of erase cycles

As already mentioned, one of the most important things you have to know about any flash technology used on our devices is that you can write and erase flash only a limited number of times.

Writing huge amounts of data to the flash device is not a good idea! As shown in the table above, depending on the type of flash, you have between 10K and 100K erase cycles available before the data potentially gets corrupted or lost.

The term “erase cycles” can be confusing. One limitation of flash storage is that it cannot be rewritten without being erased first. Furthermore, erasing cannot be done at the bit level but only in bigger chunks called blocks. In the worst case, this means that if you only want to write one single byte, you potentially have to erase and rewrite one whole block. The block size can be up to 512 KB. The effect of erasing and writing more than you actually want is called write amplification. The flash file system may need additional write operations on top of that. If you want to estimate the lifetime of the flash storage on your embedded device, you should take this into consideration.

Increase lifetime of flash

The following section shows how the lifetime of NAND or eMMC flash can be improved. Don’t worry, all these things are already handled by Toradex; there is no need for any action on your side.
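To make the lifetime estimate mentioned above more concrete, here is a minimal sketch in Python. It assumes ideal wear leveling (every block is worn equally) and a constant write amplification factor. The function names and all numbers in the usage example are illustrative assumptions, not values from a Toradex datasheet.

```python
def worst_case_write_amplification(payload_bytes, block_bytes):
    """Write amplification if every write forces whole blocks to be
    erased and rewritten (assumes the write starts on a block boundary)."""
    blocks_touched = -(-payload_bytes // block_bytes)  # ceiling division
    return blocks_touched * block_bytes / payload_bytes


def estimate_flash_lifetime_years(capacity_bytes, erase_cycles,
                                  app_bytes_per_day, write_amplification):
    """Rough lifetime in years, assuming ideal wear leveling."""
    # Total bytes the device can absorb before every block is worn out.
    total_writable_bytes = capacity_bytes * erase_cycles
    # Bytes physically written per day, including write amplification.
    physical_bytes_per_day = app_bytes_per_day * write_amplification
    return total_writable_bytes / physical_bytes_per_day / 365.0


# Hypothetical device: 512 MiB flash, 10K erase cycles,
# application writes 1 GiB per day, assumed write amplification of 4.
years = estimate_flash_lifetime_years(
    capacity_bytes=512 * 1024**2,
    erase_cycles=10_000,
    app_bytes_per_day=1024**3,
    write_amplification=4,
)
print(round(years, 2))  # 3.42

# Writing a single byte into a 512 KB block, worst case:
print(worst_case_write_amplification(1, 512 * 1024))  # 524288.0
```

The real write amplification depends on the file system and the access pattern, so a figure measured on the final product is more trustworthy than this kind of back-of-the-envelope calculation.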

Prevent wearing: Wear leveling

Let’s assume you are aware that flash can be erased and written only a limited number of times, and you only update small amounts of data periodically. If this data were always written to the same flash cells, you could write it at most 15K times on MLC flash. Even though you never touched any of the other flash cells, your data could get lost and the flash would be broken, because the cells you have been writing to are worn out. Smart flash drivers therefore use wear leveling. This technique ensures that all flash cells are worn similarly, rather than the same cells being used over and over.

Detect and correct errors: Error Correction Codes

On a NAND flash device, single bits can start flipping and corrupt your data. This can be due to wear or to some other disturbance. Therefore, the data is protected by Error Correction Codes (ECC). ECC makes it possible first to detect corrupted data and then to correct it. Depending on the flash controller and the NAND or eMMC flash itself, more or fewer errors can be detected and corrected.
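The wear leveling idea described above can be illustrated with a small simulation. This is a deliberately simplified sketch, not how UBI or an eMMC controller actually implements it: the class name `WearLeveledStore` and its policy (always write to the least-worn free block) are assumptions made for the example.

```python
class WearLeveledStore:
    """Toy model: each write of a logical item goes to the least-worn
    free block, and the block that held the old copy is erased."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.free_blocks = set(range(num_blocks))
        self.location = {}  # logical key -> physical block

    def write(self, key):
        # Choose the free block with the fewest erases (ties: lowest index).
        target = min(self.free_blocks,
                     key=lambda b: (self.erase_counts[b], b))
        self.free_blocks.remove(target)
        old = self.location.get(key)
        if old is not None:
            # The stale copy must be erased before the block can be reused.
            self.erase_counts[old] += 1
            self.free_blocks.add(old)
        self.location[key] = target
        return target


store = WearLeveledStore(num_blocks=10)
for _ in range(1000):
    store.write("config")  # same small item updated again and again

# Erases are spread over all blocks instead of hitting a single one.
print(max(store.erase_counts) - min(store.erase_counts))  # 1
```

Without wear leveling, the 999 erases in this example would all hit the same block. With the policy above, no block is erased more than 100 times.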

Bad block handling

Because ECC lets us find erroneous blocks, we can stop using these bad blocks. Depending on the ECC and the number of bits that can be corrected, a threshold is set that defines the maximum number of errors accepted before further action is taken. Once this threshold is reached, the data is corrected and moved to a good block on the device. The previous location is marked as bad. Bad blocks are no longer used, as they are potentially broken.

Power fail tolerance

What happens to your device in case of a sudden power loss while writing to the flash? On embedded devices, you expect that the device still boots properly and that your data is not corrupted. To achieve that, all software layers and hardware parts involved have to be capable of handling such a situation. The next section gives more details on how we reach that goal.
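The bad block handling described above boils down to a threshold decision on the number of bit errors the ECC reports for a read. Here is a minimal sketch in Python; the class `BlockManager`, its fields, and the example numbers are hypothetical and do not correspond to any real driver API.

```python
from dataclasses import dataclass, field


@dataclass
class BlockManager:
    correctable_bits: int    # max bit errors the ECC can correct (assumed)
    relocate_threshold: int  # errors accepted before relocating (assumed)
    bad_blocks: set = field(default_factory=set)

    def handle_read(self, block, bit_errors):
        """Decide what to do after ECC reports `bit_errors` for `block`."""
        if block in self.bad_blocks:
            raise ValueError(f"block {block} is marked bad")
        if bit_errors > self.correctable_bits:
            # Beyond what the ECC can repair: the data is lost.
            return "uncorrectable"
        if bit_errors >= self.relocate_threshold:
            # Still correctable: a real driver would now copy the corrected
            # data to a good block. Here we only retire the old block.
            self.bad_blocks.add(block)
            return "relocated"
        return "ok"


mgr = BlockManager(correctable_bits=4, relocate_threshold=3)
print(mgr.handle_read(7, bit_errors=1))  # ok
print(mgr.handle_read(7, bit_errors=3))  # relocated
print(7 in mgr.bad_blocks)               # True
```

Setting the threshold below the ECC limit is the point of the scheme: the data is moved while it can still be corrected reliably.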

Implementation Details on Toradex SoMs

As seen above, having a proper setup for the underlying storage type is crucial. Let’s go into the details of the current setup on the Toradex BSPs.

NAND-based devices

The following figure gives a generic overview of the setup of our WinCE and Linux BSPs on NAND-based devices.

Storage device: On all our devices using NAND, we use SLC NAND.

Hardware Driver: The hardware driver offers a generic interface between the NAND device and the upper layers. This layer is also responsible for detecting and correcting errors. On Linux, all our current images use MTD. On WinCE, we use the Microsoft Flash PDD layer. There are some exceptions, such as the Colibri T20, where we use a device-specific PDD layer on WinCE.

Flash Translation Layer: This layer is responsible for wear leveling and bad block management. On Linux, this is done by the UBI subsystem, while on WinCE it is done by the Microsoft MDD layer. Again, on the Colibri T20 we use a device-specific layer and not the Microsoft Flash MDD.

Filesystem: The file system is the part that manages the partitions and the files stored in them. A user accesses the file system through the file API (on Linux, through the VFS layer). On Linux, we currently use UBIFS, while on WinCE we use Transaction-Safe exFAT (TexFAT). Both are power-cut tolerant. The underlying layers are power-cut tolerant as well, because they support atomic operations.

eMMC-based devices

The following table shows the setup on the Toradex System on Modules that use eMMC flash devices.

Storage device: Compared to raw NAND, most of the magic is done by the eMMC itself. Higher layers do not have to take care of wear leveling, error correction, or bad block management.

Hardware Driver: This is the interface between the MMC controller and the file system.

Filesystem: As with the NAND-based devices, we use TexFAT on WinCE; our Linux images use the ext3 filesystem. Again, both are power-cut tolerant.

Conclusion and Recommendations

Toradex does its best to provide reliable and enduring flash storage. Nevertheless, you should always keep an eye on flash usage during application development.

- Reduce write access to the flash device.
- Know the write behavior of your final product.
- Check whether the required lifetime of your product is feasible with that write behavior.
- Run stress tests and long-term tests.
- Not using the full capacity greatly improves the efficiency of wear leveling algorithms.

If you need any further information, or you think we could improve our default setup, please get in contact with our engineers.

Thank you