FlashFQ: A Fair Queueing I/O Scheduler for Flash-Based SSDs. Kai Shen and Stan Park, University of Rochester. USENIX-ATC 2013.

Presentation transcript:

FlashFQ: A Fair Queueing I/O Scheduler for Flash-Based SSDs
Kai Shen and Stan Park, University of Rochester
USENIX-ATC 2013

Flash I/O Fairness
- NAND Flash storage devices achieve fast I/O without mechanical seek/rotation delay
  - High efficiency when there is no OS scheduling (passing requests to the device without delay or reordering, Linux noop)
- But fairness is important in multi-task systems and clouds
  - Concern: heavy I/O operations can unfairly block light operations (e.g., writes block reads, large I/O blocks small I/O)
- Existing fair I/O schedulers are mostly timeslice-based
  - Linux CFQ, Argon [Wachs et al.’07], FIOS [Park and Shen’12]
  - Timeslice schedulers may exhibit poor responsiveness, particularly when there are a large number of co-running tasks
  - Timeslice schedulers can’t easily exploit device parallelism

Fair Queueing Resource Scheduler
- Originated in network packet scheduling
  - Weighted Fair Queueing [Demers et al.’89], Processor Sharing [Parekh’92] and others
- Virtual time-based fairness
  - Virtual time roughly indicates accumulated resource use for a task
  - Balancing virtual time progression (equal resource usage) by dispatching the request from the task with the slowest virtual time
- Management of under-utilizing tasks
  - Those that do not immediately use their allotted resource
  - Prevent them from building up unused resources for bursty dispatches
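The virtual-time rule above can be illustrated with a minimal sketch. It is not the FlashFQ code: the function names `pick_next` and `run`, the abstract request costs, and the tie-break by task name are all assumptions made for illustration. Each dispatch goes to the backlogged task with the smallest virtual time, and that task's virtual time then advances by the request cost divided by its weight.

```python
def pick_next(virtual_time, pending):
    """Return the backlogged task with the smallest virtual time, or None."""
    backlogged = [t for t in virtual_time if pending.get(t)]
    if not backlogged:
        return None
    # Ties are broken by task name so the result is deterministic (assumption).
    return min(backlogged, key=lambda t: (virtual_time[t], t))


def run(pending, weight, steps):
    """Dispatch up to `steps` requests; return the dispatch order and final virtual times."""
    virtual_time = {task: 0.0 for task in pending}
    order = []
    for _ in range(steps):
        task = pick_next(virtual_time, pending)
        if task is None:
            break
        cost = pending[task].pop(0)
        # Virtual time tracks accumulated resource use, scaled by the task's weight.
        virtual_time[task] += cost / weight[task]
        order.append(task)
    return order, virtual_time


# Task A issues heavy requests (cost 4), task B issues light ones (cost 1).
order, vt = run({"A": [4, 4], "B": [1] * 8}, {"A": 1, "B": 1}, steps=6)
print("".join(order))  # prints ABBBBA
```

With equal weights, B gets four light dispatches for each heavy dispatch of A, so both tasks accumulate the same resource usage over time.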

Timeslicing vs. Fair Queueing
[Figure with three panels:]
- Timeslice scheduling
- Fair queueing (more responsive)
- Fair queueing (allows parallelism on a parallel device)

Concern: Loss of Spatial Locality
- Fair queueing with frequent task switches loses spatial locality
  - Significant problem for mechanical disks
  - Less of a problem for Flash drives
  - Logically random writes become physically sequential writes through block remapping at firmware
[Figure: ratio of random I/O latency over sequential I/O latency]

FlashFQ Design Basis
- Build on SFQ(D) [Jin et al.’04]
  - A request’s start tag is roughly the owner task’s accumulated resource usage before its service (task virtual time)
  - Request dispatches are ordered based on their start tags for fairness
  - Parallel dispatches are allowed up to the depth D
- Prevent under-utilizing tasks from building up unused resources
  - System virtual time: minimum virtual time of all active tasks
  - System virtual time is the lower bound of request start tags
    - Bring forward the virtual time of under-utilizing tasks after inactivity
    - Forfeiture of unused resources
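A simplified sketch of the start-tag bookkeeping described above follows. It is an illustration under stated assumptions, not the SFQ(D) or FlashFQ implementation: the class `SFQD` and its methods are hypothetical names, all tasks have equal weight, costs are abstract units, and the system virtual time is approximated as the minimum start tag among queued and outstanding requests.

```python
import heapq


class SFQD:
    """Toy start-time fair queueing with a dispatch depth (illustrative only)."""

    def __init__(self, depth):
        self.depth = depth
        self.queue = []        # heap of (start_tag, request_id, task)
        self.outstanding = {}  # request_id -> start_tag of dispatched requests
        self.last_finish = {}  # task -> finish tag of its latest request
        self.vtime = 0.0
        self.next_id = 0

    def system_vtime(self):
        # Assumption: minimum start tag over queued and outstanding requests.
        tags = [entry[0] for entry in self.queue] + list(self.outstanding.values())
        if tags:
            self.vtime = min(tags)
        return self.vtime

    def submit(self, task, cost):
        # The system virtual time is a lower bound on the start tag, so a task
        # returning from inactivity cannot claim resources it left unused.
        start = max(self.system_vtime(), self.last_finish.get(task, 0.0))
        self.last_finish[task] = start + cost
        request_id = self.next_id
        self.next_id += 1
        heapq.heappush(self.queue, (start, request_id, task))
        return request_id, start

    def dispatch(self):
        # Issue requests in start-tag order while fewer than `depth` are outstanding.
        issued = []
        while self.queue and len(self.outstanding) < self.depth:
            start, request_id, task = heapq.heappop(self.queue)
            self.outstanding[request_id] = start
            issued.append((task, request_id, start))
        return issued

    def complete(self, request_id):
        del self.outstanding[request_id]


sched = SFQD(depth=2)
a_requests = [sched.submit("A", 1.0) for _ in range(3)]  # start tags 0.0, 1.0, 2.0
first_batch = sched.dispatch()                           # two requests, up to depth D
for _, request_id, _ in first_batch:
    sched.complete(request_id)
second_batch = sched.dispatch()                          # A's third request
b_request = sched.submit("B", 1.0)                       # B was inactive until now
print(b_request[1])  # prints 2.0
```

Task B's first request starts at tag 2.0 rather than 0.0: its virtual time is brought forward to the system virtual time, so the resources it did not use while inactive are forfeited.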

Challenge: Restricted Parallelism on Flash
- Parallel I/O sometimes improves efficiency, but interference exists among concurrently dispatched I/O operations
- Challenge: exploit parallelism but manage interference

Solution: Throttled Dispatch
- Given interference, uncontrolled parallel dispatches lead to unfairness
  - E.g., a writer would utilize much more resource than a reader does
- Our approach:
  - Account for each task’s resource usage
  - Throttled dispatch: block a task if its resource usage is excessively ahead of the slowest task, which will then catch up with less interference
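The throttling test can be sketched as a simple filter. This is a hedged illustration only: the function `eligible_tasks`, the `threshold` parameter, and the usage numbers are assumptions, and the slides do not state how the actual threshold is chosen.

```python
def eligible_tasks(usage, active, backlogged, threshold):
    """Return backlogged tasks that are not excessively ahead of the slowest active task.

    usage      -- task -> accumulated resource usage (virtual time)
    active     -- tasks counted when finding the slowest task
    backlogged -- tasks that currently have a queued request
    threshold  -- how far ahead a task may run before it is blocked (assumed tunable)
    """
    if not active:
        return sorted(backlogged)
    slowest = min(usage[t] for t in active)
    return sorted(t for t in backlogged if usage[t] - slowest <= threshold)


usage = {"reader": 10.0, "writer": 25.0}
both = {"reader", "writer"}

# The writer is 15 units ahead of the reader, so a threshold of 8 blocks it.
print(eligible_tasks(usage, both, both, threshold=8.0))   # prints ['reader']
# A looser threshold lets both tasks dispatch in parallel.
print(eligible_tasks(usage, both, both, threshold=20.0))  # prints ['reader', 'writer']
```

If only the writer has queued requests while the reader's request is still in flight, the first call returns an empty list, which is the case where throttling leaves device parallelism unused so the slow task can catch up.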

Challenge: Deceptive Idleness
- Existing fair queueing schedulers are work-conserving
  - They never idle the device when there is pending work
- Deceptive idleness
  - An active task that issues its next I/O request a short time after receiving the result of the previous one temporarily appears idle
  - Work-conserving schedulers fail to recognize deceptive idleness
  - Known to cause poor performance on disks [Iyer and Druschel’01]
  - Little performance impact on Flash, but causes poor fairness
  - While a task is deceptively idle, the system virtual time may advance, forfeiting the task’s unused resources

Solution: Anticipation for Fairness
- Anticipation: let a task stay “active” continuously when deceptive idleness appears between its consecutive requests
  1. System virtual time considers such an “active” task’s next anticipated request as a hypothetical outstanding request
  2. Throttled dispatch also considers such an “active” task when deciding whether another task is an excessive resource over-user
- Work-conserving or not
  - Anticipation #2 may idle the device while there is pending work, which wastes resources
  - Anticipation #1 maintains the work-conserving property
- Differentiated anticipation timeouts
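Anticipation #1 can be sketched as follows. This is an assumption-laden illustration, not FlashFQ's code: the per-task dictionary fields, the single `timeout_us` value, and the function names are hypothetical, and the differentiated timeouts mentioned above are not modeled. A task with no queued or outstanding request still counts as active until the timeout expires after its last completion, so it keeps holding back the system virtual time.

```python
def is_active(state, now_us, timeout_us):
    """A task is active if it has work in the system or is within its anticipation window."""
    if state["queued"] > 0 or state["outstanding"] > 0:
        return True
    last = state["last_completion_us"]
    # Deceptive idleness: the next request is expected shortly after the last completion.
    return last is not None and now_us - last <= timeout_us


def system_vtime(tasks, now_us, timeout_us):
    """Minimum virtual time over active tasks, or None if no task is active."""
    active = [s["vtime"] for s in tasks.values() if is_active(s, now_us, timeout_us)]
    return min(active) if active else None


tasks = {
    # The reader just received a result and has not yet issued its next request.
    "reader": {"vtime": 5.0, "queued": 0, "outstanding": 0, "last_completion_us": 1000},
    "writer": {"vtime": 9.0, "queued": 2, "outstanding": 1, "last_completion_us": 900},
}

print(system_vtime(tasks, now_us=1500, timeout_us=1000))  # prints 5.0
print(system_vtime(tasks, now_us=5000, timeout_us=1000))  # prints 9.0
```

Within the anticipation window the system virtual time stays at the reader's 5.0, so the reader does not forfeit resources. Once the window has passed, the reader is treated as truly idle and the system virtual time advances to 9.0.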

Discussion: Knowledge of Request Cost
- Need to know a request’s resource use before it completes
  - Finish-time-based fair queueing
  - Start-time-based fair queueing that allows parallelism
- We estimate an I/O operation’s resource use based on its type (read/write) and size
  - For reads and writes respectively, we assume a linear model (non-zero offset) between the I/O size and its resource use
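The linear cost model can be sketched with an ordinary least-squares fit per request type. The calibration numbers below are made up for illustration and are not measurements from the talk; the function names `fit_linear` and `estimate_cost` are likewise hypothetical, and the slides do not say how the model coefficients are actually obtained.

```python
def fit_linear(sizes, costs):
    """Least-squares fit of cost = offset + slope * size; returns (offset, slope)."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(costs) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, costs))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    return offset, slope


def estimate_cost(model, op, size_kb):
    """Estimate a request's resource use from its type and size before it completes."""
    offset, slope = model[op]
    return offset + slope * size_kb


# Hypothetical calibration data: (size in KB, service time in microseconds).
read_sizes, read_costs = [4, 8, 16], [180, 260, 420]
write_sizes, write_costs = [4, 8, 16], [500, 700, 1100]

model = {
    "read": fit_linear(read_sizes, read_costs),
    "write": fit_linear(write_sizes, write_costs),
}

# The estimate is used as the request cost when it is tagged, before service.
print(round(estimate_cost(model, "read", 32)))
print(round(estimate_cost(model, "write", 32)))
```

Separate models for reads and writes keep the non-zero offset (fixed per-request overhead) distinct from the size-dependent part.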

Implementation Issues
- We implement FlashFQ in the OS (Linux) to regulate I/O resource use by concurrent applications
  - Can also be implemented in a virtual machine monitor to manage I/O resources among VMs
- Queue plugging and request merging
  - Critical performance enhancement technique
  - Complication for a fair queueing scheduler (re-computing virtual time tags for requests and tasks)
- I/O context: the Linux resource principal to receive fairness
  - Hard to use (impossible to group multiple threads together)
  - Bug on process grouping
  - The journaling daemon, inappropriately, has a unique I/O context by itself

Evaluation Setup
- Demonstrate the fairness and responsiveness of FlashFQ
- Compare against several alternatives:
  - Raw device I/O (Linux noop)
  - Linux CFQ: timeslice-based, but a timeslice ends if the task appears to be idle (even deceptively idle)
  - Quanta: strict enforcement of timeslices
  - FIOS: our previously developed timeslice Flash scheduler [FAST’12]
  - 4-Tag SFQ(D): no support for throttled dispatches or anticipation for fairness

Evaluation on Fairness
- Only Quanta, FIOS, and FlashFQ achieve fairness

Evaluation on Responsiveness
- Only FlashFQ achieves fairness and responsiveness

Evaluation with Apache and KyotoCabinet
- Apache web server: reading mostly small files
- KyotoCabinet key-value store: replacing large (128KB) records

Conclusions
- Fair queueing is well suited for Flash I/O scheduling
  - Mostly work-conserving (efficient), fair, and highly responsive
  - Supports I/O parallelism on Flash
  - Loss of spatial locality isn’t a big concern on Flash
- FlashFQ
  - Builds on classic fair queueing with parallelism
  - Throttled dispatch to address restricted parallelism on Flash
  - Anticipation for fairness to address deceptive idleness
- Practical lessons
  - Requires knowledge (estimation) of request cost
  - Linux implementation: queue plugging and request merging, proper I/O context maintenance (journaling)