I/O Stack Optimization for Smartphones

Presentation transcript:

I/O Stack Optimization for Smartphones 2013. 10. 28 Mobile Lab 박세준

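Contents
  Intro
  Past work
  Elimination of JOJ
  External Journaling
  Polling based I/O
  Evaluation
  Related Work & Conclusion
Note: fortunately, this paper is very similar to "Revisiting storage for smartphones", which was presented last year; the referenced paper is also material that was presented last year.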

Intro: Android adopts SQLite
Ref: sqlite.org
Featherweight, file-based DB (more a DB API than a DBMS)
Library size is under 300KB at minimum and under 500KB at most
Provides journaling to support atomic transactions
Widely used
  Web browsers: Firefox, Chrome, Opera
  Web languages: HTML5 (default web storage), Python
  OS: BlackBerry, Windows Phone, iOS, Android, Symbian, WebOS, …
Nevertheless, SQLite has poor performance

Intro: Ext4
Ref: https://ext4.wiki.kernel.org/index.php/Main_Page, https://lkml.org/lkml/2008/8/1/217
A default file system since Linux kernel ver. 2.6.28
Also provides journaling, to improve reliability
Default Android file system since ICS, ver. 4.0.4
  Replaces rootfs for the Linux kernel, YAFFS2 for the previous internal flash, and FAT32 for the external SD
  Uses MTP (Media Transfer Protocol) to interface between the Ext4 external SD and NTFS on Windows
Has drawn criticism
  Theodore Ts'o, a developer of Ext4, stated that Btrfs is better because it has more advanced techniques than Ext4
Btrfs: B-tree filesystem
  Cloning mechanism, suitable for VMs
  Block discard (TRIM) to improve wear leveling on SSDs
  Online defragmentation, volume expansion or shrinkage, load balancing

Intro
[Figure: percentages of the overall total; values shown: 90%, 30%, 70%, 75%, 64%]

Past work: Revisiting Storage for Smartphones
Web cache writes tend to be sequential; SQLite writes do not
-> Cause: the web cache writes large, contiguous page contents that vary from page to page, whereas SQLite keeps rewriting much smaller, discrete chunks at the same addresses inside specific DB files, so it shows no sequential locality
In addition, almost all SQLite write requests are synchronous, and they delay I/O while atomic operations are committed

Past work: Revisiting Storage for Smartphones
Two improvement factors
  fsync: alleviate overly frequent syncs
  DB in RAM: write performance of NAND flash << RAM, so this improves write performance itself
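As a rough illustration of the "DB in RAM" idea, the sketch below uses SQLite's built-in `:memory:` database from Python's standard `sqlite3` module. This is an assumption made for illustration: the slide does not say how the cited work placed the database in RAM, and the table and function names here are hypothetical.

```python
import sqlite3


def in_memory_insert(n_rows):
    """Insert n_rows into a RAM-only SQLite database and return the row count.

    Assumption: ":memory:" stands in for "DB in RAM"; the cited work's
    actual setup is not described on the slide.
    """
    conn = sqlite3.connect(":memory:")  # nothing is written to flash
    try:
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO t (v) VALUES (?)",
                ((f"row {i}",) for i in range(n_rows)),
            )
        return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(in_memory_insert(1000))
```

The trade-off is durability: a RAM-only database disappears when the process exits, so this fits only data that can be rebuilt.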

Past work: Revisiting Storage for Smartphones
Treat random writes as sequential writes
Lazy evaluation

Past work: Revisiting Storage for Smartphones
NILFS: log-structured file system
PCM: Phase-Change Memory, a kind of NVRAM (Non-Volatile RAM)
[Figure panels: Kingston & Webbench, Kingston & Facebook, RiData & Webbench, RiData & Facebook]
The Kingston card is really poor flash!!
The experiment did not use real PCM; it was simulated

Evolved!!
Revisiting storage for smartphones -> I/O stack optimization for smartphones

Elimination of JOJ (Journaling of Journal): Journaling in SQLite
6 journaling modes
  DELETE (default before Android 4.0.4)
  TRUNCATE (default since Android 4.0.4)
  PERSIST
  MEMORY
  WAL (Write-Ahead Logging)
  OFF
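The six modes can be selected per connection with SQLite's `PRAGMA journal_mode`. The following is a minimal sketch using Python's standard `sqlite3` module, not the presenter's test setup; the file names and the table `t` are hypothetical.

```python
import os
import sqlite3
import tempfile

MODES = ["DELETE", "TRUNCATE", "PERSIST", "MEMORY", "WAL", "OFF"]


def try_journal_modes(directory):
    """Set each journal mode on a fresh database; return what SQLite reports."""
    reported = {}
    for mode in MODES:
        # Hypothetical file name, one database per mode.
        path = os.path.join(directory, f"demo_{mode.lower()}.db")
        conn = sqlite3.connect(path)
        try:
            # The pragma returns the mode that is actually in effect.
            (active,) = conn.execute(f"PRAGMA journal_mode={mode}").fetchone()
            conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
            conn.execute("INSERT INTO t (v) VALUES ('hello')")
            conn.commit()  # one transaction, journaled according to the mode
            reported[mode] = active
        finally:
            conn.close()
    return reported


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for mode, active in try_journal_modes(tmp).items():
            print(f"requested {mode:8s} -> active {active}")
```

SQLite reports the active mode in lower case, so comparing the returned value with the requested one is a simple way to check that the mode was accepted.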

Elimination of JOJ: Journaling modes DELETE and TRUNCATE
Journaling mode: DELETE
  Delete the journal once the atomic transaction has been processed successfully (unlink)
Journaling mode: TRUNCATE
  Do not delete the journal when a transaction completes; instead, truncate it to zero length
Both DELETE and TRUNCATE need new allocation for the next journal
Note: DELETE removes both the file contents and the directory entry; TRUNCATE removes only the file contents (the entry is kept)

Elimination of JOJ: Journaling modes PERSIST and MEMORY
Journaling mode: PERSIST
  Overwrite the journal with zeros (zero-fill)
  No reallocation needed: when the journal has to be updated, the zero-filled journal is reused
Journaling mode: MEMORY
  Keep the journal in memory
  If the application crashes entirely, there is no way to roll back
  Fastest
Note: PERSIST fills the file with zeros; use it when reallocation cost >> write cost (embedded or special-purpose platforms)

Elimination of JOJ: Journaling modes WAL and OFF
Journaling mode: WAL
  Creates a separate WAL file (.db-wal)
  The WAL file is checkpointed into the database (.db) each time it reaches a specified threshold
Journaling mode: OFF
  No journal
  No guarantee of atomic operations
Note: in WAL mode the journal records pile up one after another in the file, so the writes tend to be sequential; additional storage space is required
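The checkpoint threshold can be observed with SQLite's `PRAGMA wal_autocheckpoint` and `PRAGMA wal_checkpoint`. The sketch below is an illustration under assumptions, not the configuration used in the presentation: the file name, the table `t`, and the 1000-page threshold are hypothetical choices.

```python
import os
import sqlite3
import tempfile


def wal_checkpoint_demo(directory, autocheckpoint_pages=1000):
    """Write in WAL mode, then checkpoint manually.

    Returns (wal_size_before, wal_size_after, busy_flag, row_count).
    """
    path = os.path.join(directory, "wal_demo.db")  # hypothetical file name
    wal_path = path + "-wal"  # SQLite appends "-wal" to the database name
    conn = sqlite3.connect(path)
    try:
        conn.execute("PRAGMA journal_mode=WAL").fetchone()
        # Automatic checkpoint once the WAL reaches this many pages.
        conn.execute(
            f"PRAGMA wal_autocheckpoint={int(autocheckpoint_pages)}"
        ).fetchone()
        conn.execute("CREATE TABLE t (v TEXT)")
        conn.executemany("INSERT INTO t (v) VALUES (?)", [("row",)] * 100)
        conn.commit()  # the changes now sit in the WAL file
        size_before = os.path.getsize(wal_path)
        # Move WAL content into the .db file and truncate the WAL.
        busy, _log_frames, _checkpointed = conn.execute(
            "PRAGMA wal_checkpoint(TRUNCATE)"
        ).fetchone()
        size_after = os.path.getsize(wal_path)
        rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    finally:
        conn.close()
    return size_before, size_after, busy, rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(wal_checkpoint_demo(tmp))
```

A larger threshold means fewer checkpoints but a bigger WAL file, which is the extra storage cost mentioned in the note above.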

Elimination of JOJ: Journaling in EXT4
On its own, the overhead of EXT4 journaling is negligible
  An EXT4 journal transaction is much bigger than an SQLite one
  The journal transaction interval is much longer (e.g. 5 sec)
But what about under SQLite?
  SQLite issues fsync() to make EXT4 commit the journal
  In the experiment, a single SQLite INSERT issues 2 or more fsync() calls within 2 ms
  Moreover, each fsync() carries only tiny chunks containing very few records
  This makes fsync() very inefficient -> 200% overhead

Elimination of JOJ: What fsync causes
[Figure] Red numbers denote data size (KB); each red X represents a write operation
Speaker note: explain the number and size of the Xs

Elimination of JOJ: Comparison across SQLite journaling modes
Operation size: 24KB
3 fsync() calls
9 write operations

Elimination of JOJ: Comparison across SQLite journaling modes
Operation size: 16KB
No opening or unlinking of .db-journal
2 fsync() calls
8 write operations

Elimination of JOJ: Comparison across SQLite journaling modes
Operation size: 8KB
No opening or unlinking of .db-journal
3 fsync() calls
12 write operations -> caused by zero-filling; not well suited to EXT4

Elimination of JOJ: Comparison across SQLite journaling modes
Operation size: 16KB
1 fsync() call
2 write operations

Elimination of JOJ: Another filesystem for Android (BTRFS)
Uses a B+ tree
COW (Copy on Write) mechanism
Wandering tree problem: updating one node forces other nodes to be updated as well
1 fsync : 6 writes

Elimination of JOJ: Another filesystem for Android (NILFS2)
Log-structured
Merges chunks and combines them into segments
Segment size is 128KB
fsync includes a command to flush the segment
Because of the overly large segment and the flush operation, it shows the worst performance

Elimination of JOJ: Another filesystem for Android (XFS)
Designed to support enterprise-level storage, but the result was the reverse of what that suggests
One fsync triggers only one journal write
Minimum journal write is 1KB (in EXT4: 4KB)

Elimination of JOJ: Another filesystem for Android (F2FS)
Log-structured
Supports not only merging data into segments but also writing small chunks to storage directly, which other LFSs (e.g. NILFS2) do not support
So it does not suffer from tiny-chunk updates

Elimination of JOJ: Sequential and random writes with fsync

Elimination of JOJ: fsync, fdatasync and noatime
Ref: http://www.lug.or.kr/m/bbs/view.php?bo_table=centos_book&wr_id=70&page=8
fsync() flushes metadata on every call
fdatasync() does not flush metadata unless that metadata is required to read the data back; this avoids journaling a considerable amount of metadata
noatime: mount option
  atime: Linux records the time of last access
  relatime: updates the access time only when it is older than the last modify or change time
  noatime: does not record the access time; timestamps change only when the file is written (modified). Default Android mount option
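The difference between the two calls can be sketched with Python's `os.fsync` and `os.fdatasync`, which wrap the system calls of the same names. This is an illustrative helper, not code from the presentation: the function name and file path are hypothetical, and `os.fdatasync` is not available on every platform, so the sketch falls back to `os.fsync`.

```python
import os
import tempfile


def durable_append(path, payload, data_only=True):
    """Append payload to path and force it to storage; return bytes written."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        written = os.write(fd, payload)
        if data_only and hasattr(os, "fdatasync"):
            # Flush the data, plus only the metadata needed to read it back.
            os.fdatasync(fd)
        else:
            # Flush the data and all file metadata (timestamps included).
            os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "log.bin")  # hypothetical file name
        durable_append(log, b"record-1\n", data_only=True)
        durable_append(log, b"record-2\n", data_only=False)
        print(os.path.getsize(log))
```

Both calls block until the device reports completion; the saving from fdatasync() comes from skipping metadata writes that a later read does not depend on.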

Elimination of JOJ: Evaluation of JOJ elimination
F2FS is best, XFS is next
BTRFS and NILFS2 hardly benefit
  Cause: the COW mechanism interferes with the advantage of fdatasync, which is that it does not flush metadata

Elimination of JOJ: Evaluation of JOJ
Across the different SQLite journaling modes and filesystems
For DB updates on the Galaxy S3, no journal file is created (this is the difference between Insert/sec and Update/sec)

External journaling: Exploit locality
Data I/O looks random
Journal I/O looks contiguous
External journaling: put the journal on another block device (= partition)
As a result, external journaling can exploit locality to the maximum

Polling-based IO: Flash vs HDD
In a legacy PC system, the I/O size per interrupt is much bigger than in a smartphone system
So on a smartphone, the frequent context switches caused by interrupt-driven I/O may be harmful
As Table 1 shows, polling I/O does consume more CPU, but this is negligible because power consumption is dominated by the display and telecommunication components
Note: the context-switch axis is log-scale

Polling-based IO: Power consumption in smartphone
Power usage on E-Mail
Power usage on Web browsing
Note: the display was LED rather than TFT LCD, so its power consumption was low; with a TFT the difference would have been quite large

Overall result: Combining all advances
EXT4 -> EXT4 advanced : about 2.4X
EXT4 -> F2FS baseline : about 2.2X
EXT4 -> F2FS advanced : about 4X

Any questions?