Mobile Handset Storage and File System

Outline
- Storage and File System Basics
- Android File System
- iOS File System

Storage Hierarchy
- Almost all computers use a storage hierarchy
- Fast but expensive and small storage is placed close to the CPU
- Slower but larger and cheaper storage is placed farther away from the CPU

Primary and Secondary Storage
- Primary storage (also called main memory or internal memory) is storage directly accessible to the CPU
  - Loses its contents when not powered
  - Examples: registers, cache and main memory
- Secondary storage (also called external memory or auxiliary storage) is not directly accessible by the CPU
  - The computer usually accesses it through input/output channels
  - Keeps its data when the device is powered down
  - Examples: hard disks, CD/DVD, flash memory (e.g. SD card)

Flash Memory
- Flash memory is a computer memory chip that retains stored information without requiring a power source
- It belongs to secondary storage in the computer storage hierarchy
- It can be electrically erased and overwritten
- Mobile devices use flash memory to store data
- There are two major types of flash memory: NAND flash memory and NOR flash memory
  - They are named after the NAND and NOR logic gates

Flash Memory History
- Flash memory was invented by Dr. Fujio Masuoka (Toshiba)
  - First presented at the IEEE International Electron Devices Meeting (IEDM) in 1984
- Toshiba announced NAND flash at IEDM 1987
- Intel introduced the first commercial NOR-type flash in 1988
- The first NAND-based removable media format was SmartMedia

NOR Flash Memory
- Random access to any memory location
- Uses blocks as the storage units (typical block sizes are 64KB, 128KB or 256KB)
- Erasure happens at the block level, one block at a time
- Writes happen at the byte level
- Long write and erasure times
- Developed as a replacement for ROM (read often, rarely updated)

NAND Flash Memory (1)
- Uses a page (a group of memory words) as the basic unit for storing data
  - Typical page sizes are 512, 2048 or 4096 bytes
  - Each page has 12 to 16 associated bytes for a checksum
- Pages are combined into blocks
- Reads and writes happen at the page level
- Erasure can only happen at the block level
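
The page/block asymmetry above can be sketched with a toy model. This is only an illustration: the class name `NandBlock`, its defaults, and the rule that a written page cannot be rewritten before an erase are assumptions made for the example, not a description of any real driver.

```python
class NandBlock:
    """Toy model of one NAND block (hypothetical, for illustration only)."""

    def __init__(self, pages_per_block: int = 32, page_size: int = 512):
        self.page_size = page_size
        # None marks an erased page that can still be written.
        self.pages = [None] * pages_per_block

    def write_page(self, index: int, data: bytes) -> None:
        """Write one page; writes are page-granular."""
        if len(data) > self.page_size:
            raise ValueError("data does not fit in one page")
        if self.pages[index] is not None:
            # Assumption: a written page cannot be rewritten until erased.
            raise RuntimeError("page already written; erase the block first")
        self.pages[index] = data

    def read_page(self, index: int):
        """Read one page; reads are page-granular."""
        return self.pages[index]

    def erase(self) -> None:
        """Erase the whole block; a single page cannot be erased alone."""
        self.pages = [None] * len(self.pages)
```

Updating one page therefore means writing the new data to a different free page, or erasing the entire block first, which is the behaviour flash file systems are built around.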

NAND Flash Memory (2)
- Write and erasure times are reduced (compared with NOR)
- Suitable for replacing disks
- Typical block sizes:
  - 16KB: 32 pages of 512 bytes (plus spare bytes)
  - 128KB: 64 pages of 2048 bytes (plus spare bytes)
  - 256KB: 64 pages of 4096 bytes (plus spare bytes)
  - 512KB: 128 pages of 4096 bytes (plus spare bytes)
- Spare bytes can be used for a checksum
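
The data bytes per page follow from dividing each block size by its page count. A minimal sketch of that arithmetic, using the four configurations listed above (the names `BLOCK_CONFIGS` and `page_size` are made up for the example, and the spare area is deliberately left out):

```python
KIB = 1024

# (block size in bytes, pages per block) for the four typical configurations.
BLOCK_CONFIGS = [
    (16 * KIB, 32),
    (128 * KIB, 64),
    (256 * KIB, 64),
    (512 * KIB, 128),
]

def page_size(block_bytes: int, pages_per_block: int) -> int:
    """Data bytes per page, excluding the spare area."""
    return block_bytes // pages_per_block

page_sizes = [page_size(b, p) for b, p in BLOCK_CONFIGS]
for (b, p), size in zip(BLOCK_CONFIGS, page_sizes):
    print(f"{b // KIB:>3} KB block = {p:>3} pages x {size} bytes")
```

The results are 512, 2048, 4096 and 4096 bytes, which match the typical NAND page sizes.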
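
NOR vs NAND

|                         | NOR                                    | NAND                                        |
|-------------------------|----------------------------------------|---------------------------------------------|
| Performance             | Very slow erase, slow write, fast read | Fast erase, fast write, fast read           |
| Reliability             | Standard reliability                   | Low reliability; needs bad block management |
| Erase cycles            | 10,000 – 100,000                       | 100,000 – 1,000,000                         |
| Life span               | Less than 10% of the life span of NAND | Over 10 times that of NOR                   |
| Access                  | Random                                 | Sequential                                  |
| Hardware implementation | Easy                                   | Complicated                                 |
| Spare bytes             | No                                     | Yes (16 bytes)                              |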

File System
- A file system is a computer program that controls how data is stored and retrieved
- Primary roles:
  - Provides an abstraction for secondary storage
  - Provides a logical organization of files
  - Enables sharing of data between processes, users and machines
  - Protects data from unwanted access

Why a File System Is Needed
- Disks are messy physical devices
  - Errors, bad blocks, missed seeks, etc.
- The job of the OS is to hide this mess from higher-level software
  - It needs to handle low-level device control (start a disk read, etc.)
  - It needs to provide higher-level abstractions (files, databases, etc.)
- The file system handles the mess on behalf of the OS

File Concept
- A file is a logically contiguous address space that stores a collection of data
- It has the following attributes:
  - File name
  - File identifier (a unique number for the file)
  - File type
  - File location (pointer to the file's location on disk)
  - File size
  - File protection (controls who can read, write or execute), etc.
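
Several of these attributes can be read back from a real file system. The sketch below uses Python's `os.stat` on a temporary file; the file name `notes.txt` is hypothetical, and treating the inode number as the "file identifier" is an assumption that holds on Unix-like systems.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"hello")
    info = os.stat(path)
    attributes = {
        "name": os.path.basename(path),
        "identifier": info.st_ino,                      # inode number
        "is_regular_file": stat.S_ISREG(info.st_mode),  # one aspect of file type
        "size": info.st_size,                           # size in bytes
        "protection": stat.filemode(info.st_mode),      # e.g. "-rw-r--r--"
    }

print(attributes)
```

The on-disk location is not exposed here: the file system keeps that mapping internal, which is part of the abstraction it provides.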

File Operations
- Most common operations:
  - Create
  - Write
  - Read
  - Reposition within a file
  - Delete
  - Truncate
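
As a rough illustration, the six operations map onto ordinary Python file calls as follows. The file name `demo.bin` is arbitrary, and the example runs in a temporary directory so nothing is left behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    with open(path, "w+b") as f:   # create
        f.write(b"0123456789")     # write
        f.seek(2)                  # reposition within the file
        chunk = f.read(3)          # read three bytes from offset 2
        f.truncate(4)              # truncate the file to 4 bytes
        f.seek(0)
        remaining = f.read()       # what is left after truncation
    os.remove(path)                # delete
    exists_after_delete = os.path.exists(path)

print(chunk, remaining, exists_after_delete)
```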

File Protection (1)
- A file system must implement some kind of protection to control who can access a file and how they can access it
- Types of users:
  - Owner: the user who created the file
  - Group: the users who are in the same group as the owner
  - Others: any other users in the system
  - Super user: the administrator of the system
- Types of access are read (r), write (w) and execute (x)

File Protection (2)
- Protection in the Unix file system
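
Unix stores this protection as nine permission bits: one r/w/x triple each for owner, group and others. A minimal sketch of how such a mode can be decoded (the helper names `allowed` and `permission_string` are invented for this example):

```python
SHIFT = {"owner": 6, "group": 3, "others": 0}  # bit position of each rwx triple
BIT = {"r": 4, "w": 2, "x": 1}                 # value of each access bit

def allowed(mode: int, who: str, access: str) -> bool:
    """True if the class `who` has `access` under the 9-bit `mode`."""
    return bool((mode >> SHIFT[who]) & BIT[access])

def permission_string(mode: int) -> str:
    """Render the 9 permission bits in the familiar rwxrwxrwx form."""
    return "".join(
        letter if allowed(mode, who, letter) else "-"
        for who in ("owner", "group", "others")
        for letter in "rwx"
    )

print(permission_string(0o754))  # rwxr-xr--
```

With mode `0o754` the owner can read, write and execute, the group can read and execute, and others can only read.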

Directories
- A directory, also known as a folder, is a structure that allows the user to group files into separate collections
- The root directory is the first or top-most directory in a tree-structured directory hierarchy
  - It is the starting point from which all branches originate
  - E.g., the / directory in Unix systems

Tree-Structured Directories

Block (1)
- A block is a sequence of bytes or bits with a maximum length, the block size
  - It is the basic unit used by most file systems to store data
- File systems define a block size (e.g., 4KB)
  - Disk space is allocated at block granularity
- A “Master Block” stores the location of the root directory
  - Always at a well-known disk location
  - Often replicated across the disk for reliability

Block (2)
- A map records which blocks are free and which are allocated
  - Usually a bitmap, with one bit per block on the disk
  - Also stored on disk, and cached in memory for performance
- The remaining disk blocks are used to store files and directories
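
A free-block bitmap of this kind can be sketched in a few lines. The class `BlockBitmap`, its first-fit allocation policy and its bit ordering are assumptions chosen for brevity; real file systems differ in all three.

```python
class BlockBitmap:
    """One bit per disk block: 1 = allocated, 0 = free (illustrative only)."""

    def __init__(self, total_blocks: int):
        self.total_blocks = total_blocks
        self.bits = bytearray((total_blocks + 7) // 8)

    def is_allocated(self, n: int) -> bool:
        return bool(self.bits[n // 8] & (1 << (n % 8)))

    def allocate(self) -> int:
        """Mark the first free block as allocated and return its number."""
        for n in range(self.total_blocks):
            if not self.is_allocated(n):
                self.bits[n // 8] |= 1 << (n % 8)
                return n
        raise RuntimeError("no free blocks")

    def free(self, n: int) -> None:
        """Clear the bit for block n so it can be reused."""
        self.bits[n // 8] &= ~(1 << (n % 8)) & 0xFF
```

A map for 10 blocks fits in 2 bytes, which is why the bitmap is cheap to cache in memory.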

Overview
- Android uses flash memory as its storage medium, so it can use flash file systems such as exFAT, YAFFS2, JFFS2, etc.
- Android is based on the Linux kernel, so it can use a Linux file system such as ext2, ext3, ext4, etc.
- It may also use a proprietary file system developed by the manufacturer, depending on who made the device
- The most commonly used file system on Android is Yet Another Flash File System 2 (YAFFS2)

YAFFS
- YAFFS is a flash file system developed for NAND flash
  - YAFFS1: designed for early generations of NAND flash memory (512-byte pages)
  - YAFFS2: supports newer NAND with 2KB pages and a strictly sequential page writing order
- It manages data in chunks; “chunk” is YAFFS terminology for a page

YAFFS1 Chunks
- File data is stored in fixed-size “chunks”, i.e., NAND pages (512 bytes)
- Two types of chunk:
  - Data chunk: holds regular file contents
  - File header: holds a file’s metadata such as file name, parent directory, etc.
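
YAFFS1 Tags
- Each chunk has tags associated with it. The tags comprise the following fields (8 bytes in total):

| Field         | Bits | Meaning                                                                                                                                               |
|---------------|------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
| File ID       | 18   | Identifies which file the chunk belongs to                                                                                                            |
| Chunk ID      | 20   | Identifies where in the file this chunk belongs. 0 means this chunk contains a file header, 1 means the first data chunk, 2 the next chunk, and so on |
| Serial Number | 2    | Differentiates chunks with the same file ID and chunk ID                                                                                              |
| Byte Count    | 10   | Number of bytes of data if this is a data chunk                                                                                                       |
| Checksum      | 12   | Checksum for the tags                                                                                                                                 |
| Reserved      | 2    | Unused                                                                                                                                                |
| Total         | 64   |                                                                                                                                                       |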
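
To see that the six fields fill exactly 8 bytes, they can be packed into a single 64-bit word. The field widths come from the table above; the packing order (first field in the most significant bits, big-endian bytes) and the function names are assumptions for this sketch and do not claim to match the real on-flash layout.

```python
TAG_FIELDS = [
    ("file_id", 18),
    ("chunk_id", 20),
    ("serial", 2),
    ("byte_count", 10),
    ("checksum", 12),
    ("reserved", 2),
]

def pack_tags(**values: int) -> bytes:
    """Pack the tag fields into 8 bytes (bit order is an assumption)."""
    word = 0
    for name, width in TAG_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(8, "big")

def unpack_tags(raw: bytes) -> dict:
    """Inverse of pack_tags: split 8 bytes back into named fields."""
    word = int.from_bytes(raw, "big")
    fields = {}
    for name, width in reversed(TAG_FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields
```

For example, `pack_tags(file_id=5, chunk_id=1, byte_count=512)` describes the first data chunk of file 5 holding a full 512-byte page. The 10-bit byte count can hold values up to 1023, so 512 fits.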

YAFFS1 Serial Number
- When data is overwritten, the relevant chunks are replaced by writing new pages containing the new data to the flash. The old page is then marked as “discarded”
- If a power loss, crash or other problem happens before the old page is marked as discarded, two pages can end up with the same tags
- Solution: increase the 2-bit serial number by 1 every time a chunk is overwritten, so that the new data can be distinguished from the old data
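
Because the serial number has only 2 bits, it wraps around from 3 to 0, so the comparison has to be modular. A minimal sketch of how the newer of two copies could be identified, assuming at most two copies of a chunk coexist and their serial numbers therefore differ by exactly one (the function name `newer_chunk` is made up):

```python
def newer_chunk(serial_a: int, serial_b: int) -> str:
    """Return 'a' or 'b' for the newer of two chunks with the same IDs."""
    if (serial_a + 1) % 4 == serial_b:
        return "b"  # b was written as the replacement for a
    if (serial_b + 1) % 4 == serial_a:
        return "a"  # a was written as the replacement for b
    raise ValueError("serial numbers are not consecutive")

print(newer_chunk(0, 1))  # b
print(newer_chunk(0, 3))  # a (the serial wrapped around from 3 to 0)
```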

YAFFS1 Garbage Collection
- A block in which all pages are discarded is an obvious candidate for garbage collection
- Otherwise, the valid pages are copied out of a block, and the whole block is then marked as discarded and ready for garbage collection
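
The two cases can be sketched with blocks modelled as lists of pages. Everything here is a simplification for illustration: the `Page` class, the "most discarded pages first" victim policy, and representing an erased block as an empty list are assumptions, not YAFFS internals.

```python
from dataclasses import dataclass

@dataclass
class Page:
    data: bytes
    discarded: bool = False

def discarded_count(block: list) -> int:
    return sum(page.discarded for page in block)

def pick_victim(blocks: list) -> int:
    """Assumed policy: reclaim the block with the most discarded pages."""
    return max(range(len(blocks)), key=lambda i: discarded_count(blocks[i]))

def collect(blocks: list, victim: int, destination: int) -> int:
    """Copy live pages out of the victim, then reclaim the whole block.

    Returns the number of valid pages that had to be copied.
    """
    live = [page for page in blocks[victim] if not page.discarded]
    blocks[destination].extend(Page(page.data) for page in live)
    blocks[victim] = []  # the entire block is now free for reuse
    return len(live)
```

A fully discarded block costs no copies at all, which is why it is the obvious first candidate.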

YAFFS1 Page Layout

| Byte range | Field        | Details                                                 |
|------------|--------------|---------------------------------------------------------|
|            | Data         | Data, either file data or file header depending on tags |
|            | Tags         |                                                         |
| 516        | Data Status  | If more than 4 bits are zero, this page is discarded    |
| 517        | Block Status | Shows whether the block is damaged                      |
|            | Tags         |                                                         |
|            | Checksum     | Checksum for the second 256 bytes of data               |
|            | Tags         |                                                         |
|            | Checksum     | Checksum for the first 256 bytes of data                |

YAFFS2 vs YAFFS1 (1)
- YAFFS2 is very similar in concept to YAFFS1, and they share much of the same source code
- Adds support for newer NAND with 2KB pages
- Marks every newly written block with a sequence number
  - The order of the chunks can be inferred from the block sequence number and the chunk offset within the block
  - When it detects two chunks with the same file ID and chunk ID, it chooses the newer chunk by taking the greater sequence number
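
This ordering rule amounts to comparing (block sequence number, offset within block) pairs. A small sketch, in which each copy is a hypothetical tuple of those two values plus a payload, and `newest` is a name invented for the example:

```python
def newest(copies):
    """Pick the most recent copy of a chunk.

    `copies` is an iterable of (block_sequence, offset_in_block, payload)
    tuples that share the same file ID and chunk ID.
    """
    # A greater block sequence number wins; within one block, pages are
    # written strictly in order, so the later offset is the newer copy.
    return max(copies, key=lambda copy: (copy[0], copy[1]))

print(newest([(7, 3, "old"), (9, 0, "new")])[2])  # new
```

Unlike the 2-bit YAFFS1 serial number, this comparison needs no wrap-around handling in the sketch because the sequence number only grows.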

YAFFS2 vs YAFFS1 (2)
- Introduces the concept of shrink headers for efficiency
  - When a file is resized to a smaller size, YAFFS1 marks all of the affected chunks as discarded
  - YAFFS2 instead writes a “shrink header”, which indicates that a certain number of pages before this header are invalid
- Improves performance relative to YAFFS1:
  - Write: 1.5-5x
  - Delete: 4x
  - Garbage collection: 2x

Overview
- In 1985 Apple developed a new file system called the Hierarchical File System (HFS) for use in Mac OS
- Hierarchical File System Plus (HFS+) was introduced in 1998 for use in Mac OS 8.1
- HFSX was introduced in Mac OS 10.3
  - It is now the file system for iOS

HFS Blocks
- At the physical level, the disk is divided into blocks of 512 bytes
- There are two types of blocks:
  - Logical blocks: numbered from the first to the last on the disk. They are static and the same size as the physical blocks, 512 bytes
  - Allocation blocks: groups of logical blocks used by HFS to track data more efficiently

HFS Structure (1)
- Logical blocks 0 and 1 are the boot blocks, which contain system startup information
- Logical block 2 contains the master directory block (MDB), which defines a wide variety of data such as date and time stamps for when the partition was created, the location of the bitmap, etc.
- Logical block 3 is the starting block of the bitmap, which keeps track of which allocation blocks are in use and which are free
  - Each allocation block is represented by one bit in the map: if the bit is set, the block is in use; otherwise it is free
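
The layout above translates into simple offset arithmetic and a bit test. In this sketch the helper names are hypothetical, and the bit ordering (most significant bit of each byte stands for the lowest-numbered allocation block) is stated as an assumption rather than taken from the slides.

```python
LOGICAL_BLOCK_SIZE = 512  # bytes, as given for HFS logical blocks

def logical_block_offset(block_number: int) -> int:
    """Byte offset of a logical block from the start of the volume."""
    return block_number * LOGICAL_BLOCK_SIZE

MDB_OFFSET = logical_block_offset(2)     # master directory block
BITMAP_OFFSET = logical_block_offset(3)  # start of the volume bitmap

def allocation_block_in_use(bitmap: bytes, block: int) -> bool:
    """Test one allocation block's bit in the volume bitmap."""
    byte_index, bit_index = divmod(block, 8)
    mask = 0x80 >> bit_index  # assumption: MSB-first within each byte
    return bool(bitmap[byte_index] & mask)

sample = bytes([0b10100000, 0b00000001])  # made-up bitmap contents
print([n for n in range(16) if allocation_block_in_use(sample, n)])
```

With the sample bitmap, allocation blocks 0, 2 and 15 are reported as in use.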

HFS Structure (2)
- Extents overflow file
  - Keeps track of which allocation blocks are allocated to which files
- Catalog file
  - Describes the folder and file hierarchy on the disk
  - Contains metadata about all the files and folders on the disk, including their modification, access and creation times

HFS+ vs HFS
- HFS+ has three more parts in its structure:
  - Attributes file: contains attribute information for all files and folders
  - Startup file: designed to assist in booting non-Mac OS systems that don’t have HFS or HFS+ support
  - Reserved block: reserved for use by Apple

HFSX vs HFS+
- All Apple mobile devices use HFSX as the file system
- There is one major difference between HFSX and HFS+: HFSX is case sensitive
  - For example, Case_sensitive.doc and Case_Sensitive.doc are treated as two different files
  - They can both exist on HFSX but not on HFS+
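
The difference can be modelled as a choice of lookup key for directory entries. This is a toy model only: the `Directory` class is hypothetical, and Python's `str.casefold` merely stands in for the file system's own case-folding rules.

```python
class Directory:
    """Toy directory that is either case sensitive or case insensitive."""

    def __init__(self, case_sensitive: bool):
        self.case_sensitive = case_sensitive
        self.entries = {}  # lookup key -> name as originally given

    def _key(self, name: str) -> str:
        # Assumption: casefold() approximates case-insensitive comparison.
        return name if self.case_sensitive else name.casefold()

    def create(self, name: str) -> bool:
        """Create a file; return False if the name collides with an entry."""
        key = self._key(name)
        if key in self.entries:
            return False
        self.entries[key] = name
        return True

hfsx = Directory(case_sensitive=True)
hfs_plus = Directory(case_sensitive=False)
for name in ("Case_sensitive.doc", "Case_Sensitive.doc"):
    print(name, hfsx.create(name), hfs_plus.create(name))
```

Both names are accepted by the case-sensitive directory, while the case-insensitive one rejects the second as a duplicate of the first.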