PARALLEL DATA LABORATORY Carnegie Mellon University Advanced disk scheduling “ Freeblock scheduling” Eno Thereska (slide contributions by Chris Lumb and.

Slides:

Advertisements

Similar presentations

Storing Data: Disks and Files: Chapter 9

Advertisements

Operating Systems ECE344 Ashvin Goel ECE University of Toronto Disks and RAID.

CSE506: Operating Systems Disk Scheduling. CSE506: Operating Systems Key to Disk Performance Don’t access the disk – Whenever possible Cache contents.

Local Filesystems (part 2) CPS210 Spring Papers  Towards Higher Disk Head Utilization: Extracting Free Bandwidth From Busy Disk Drives  Christopher.

CS4432: Database Systems II Data Storage - Lecture 2 (Sections 13.1 – 13.3) Elke A. Rundensteiner.

1 Advanced Database Technology February 12, 2004 DATA STORAGE (Lecture based on [GUW ], [Sanders03, ], and [MaheshwariZeh03, ])

Lecture 17 I/O Optimization. Disk Organization Tracks: concentric rings around disk surface Sectors: arc of track, minimum unit of transfer Cylinder:

Disk Drivers May 10, 2000 Instructor: Gary Kimura.

Based on the slides supporting the text

U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.

Slide 5-1 Copyright © 2004 Pearson Education, Inc. Operating Systems: A Modern Perspective, Chapter 5.

SECTIONS 13.1 – 13.3 Sanuja Dabade & Eilbroun Benjamin CS 257 – Dr. TY Lin SECONDARY STORAGE MANAGEMENT.

CS4432: Database Systems II Lecture 2 Timothy Sutherland.

G Robert Grimm New York University Sprite LFS or Let’s Log Everything.

Secondary Storage CSCI 444/544 Operating Systems Fall 2008.

1 CS222: Principles of Database Management Fall 2010 Professor Chen Li Department of Computer Science University of California, Irvine Notes 01.

1 Today I/O Systems Storage. 2 I/O Devices Many different kinds of I/O devices Software that controls them: device drivers.

Secondary Storage Management Hank Levy. 8/7/20152 Secondary Storage Secondary Storage is usually: –anything outside of “primary memory” –storage that.

The Design and Implementation of a Log-Structured File System Presented by Carl Yao.

Introduction to Database Systems 1 The Storage Hierarchy and Magnetic Disks Storage Technology: Topic 1.

12.1 Silberschatz, Galvin and Gagne ©2009 Operating System Concepts with Java – 8 th Edition Chapter 12: Mass-Storage Systems.

Disk and I/O Management

1 File System Implementation Operating Systems Hebrew University Spring 2010.

Lecture 11: DMBS Internals

Silberschatz, Galvin and Gagne ©2013 Operating System Concepts – 9 th Edition Chapter 10: Mass-Storage Systems.

I/O – Chapter 8 Introduction Disk Storage and Dependability – 8.2 Buses and other connectors – 8.4 I/O performance measures – 8.6.

Disk Access. DISK STRUCTURE Sector: Smallest unit of data transfer from/to disk; 512B 2/4/8 adjacent sectors transferred together: Blocks Read/write heads.

Operating Systems CMPSC 473 I/O Management (4) December 09, Lecture 25 Instructor: Bhuvan Urgaonkar.

Copyright ©: Nahrstedt, Angrave, Abdelzaher, Caccamo1 Disk & disk scheduling.

CS 153 Design of Operating Systems Spring 2015 Final Review.

I/O Management and Disk Structure Introduction to Operating Systems: Module 14.

Disks Chapter 5 Thursday, April 5, Today’s Schedule Input/Output – Disks (Chapter 5.4)  Magnetic vs. Optical Disks  RAID levels and functions.

CS 153 Design of Operating Systems Spring 2015 Lecture 21: File Systems.

15-410, S’ Disks Mar. 24, 2004 Dave Eckhardt & Bruce Maggs Brian Railing & Steve Muckle Contributions from Eno Thereska “How Stuff Works”

Lecture 3 Page 1 CS 111 Online Disk Drives An especially important and complex form of I/O device Still the primary method of providing stable storage.

UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department

Csci 136 Computer Architecture II – IO and Storage Systems Xiuzhen Cheng

12/18/20151 Operating Systems Design (CS 423) Elsa L Gunter 2112 SC, UIUC Based on slides by Roy Campbell, Sam.

Disks and Disk Scheduling Steve Muckle Monday, March 31st Spring 2003.

Disk Basics CS Introduction to Operating Systems.

Local Filesystems (part 1) CPS210 Spring Papers  The Design and Implementation of a Log- Structured File System  Mendel Rosenblum  File System.

1.  Disk Structure Disk Structure  Disk Scheduling Disk Scheduling  FCFS FCFS  SSTF SSTF  SCAN SCAN  C-SCAN C-SCAN  C-LOOK C-LOOK  Selecting a.

11.1 Silberschatz, Galvin and Gagne ©2005 Operating System Principles 11.5 Free-Space Management Bit vector (n blocks) … 012n-1 bit[i] =  1  block[i]

© 2004, D. J. Foreman 1 Device Mgmt. © 2004, D. J. Foreman 2 Device Management Organization  Multiple layers ■ Application ■ Operating System ■ Driver.

PARALLEL DATA LABORATORY Carnegie Mellon University A framework for implementing IO-bound maintenance applications Eno Thereska.

COSC 6340: Disks 1 Disks and Files DBMS stores information on (“hard”) disks. This has major implications for DBMS design! » READ: transfer data from disk.

CPS110: I/O and file systems Landon Cox April 3, 2008.

Lecture Topics: 11/22 HW 7 File systems –block allocation Unix and NT –disk scheduling –file caches –RAID.

1 Lecture 16: Data Storage Wednesday, November 6, 2006.

1 Components of the Virtual Memory System  Arrows indicate what happens on a lw virtual address data physical address TLB page table memory cache disk.

Lecture 17 Raid. Device Protocol Variants Status checks: polling vs. interrupts Data: PIO vs. DMA Control: special instructions vs. memory-mapped I/O.

Operating System Concepts with Java – 7 th Edition, Nov 15, 2006 Silberschatz, Galvin and Gagne ©2007 Chapter 11: File System Implementation.

Chapter 10: Mass-Storage Systems

Disks and RAID.

Database Management Systems (CS 564)

Operating Systems Disk Scheduling A. Frank - P. Weisberg.

CS703 - Advanced Operating Systems

CPSC-608 Database Systems

Research on Disks and Disk Scheduling

Lecture 45 Syed Mansoor Sarwar

CSE 153 Design of Operating Systems Winter 2018

Lecture 9: Data Storage and IO Models

Sanuja Dabade & Eilbroun Benjamin CS 257 – Dr. TY Lin

CPSC 457 Operating Systems

Overview Continuation from Monday (File system implementation)

Persistence: hard disk drive

Disks and Disk Scheduling

CSE451 File System Introduction and Disk Drivers Autumn 2002

Device Mgmt © 2004, D. J. Foreman.

Presentation transcript:

PARALLEL DATA LABORATORY Carnegie Mellon University Advanced disk scheduling “ Freeblock scheduling” Eno Thereska (slide contributions by Chris Lumb and Brandon Salmon)

Eno Thereska Lecture April Outline Freeblock scheduling: some theory Freeblock scheduling: applied Some details Q & A

Eno Thereska Lecture April Some theory: preview Next few slides will review & show that: disks are slow mechanical delays (seek + rotational latencies) there is nothing we can do during a seek there is a lot we can do during a rotation rotational latencies are very large while rotation is happening go to nearby tracks and do useful work “freeblock scheduling” = utilization of rotational latency gaps (+ any idle time)

Eno Thereska Lecture April Are disks slow? Are the xfer speeds that slow? no, xfer speeds of 200MB/s are pretty good So what is slow? workload often not sequential disk head has to move from place to place seek (~ 4ms) + rotation (~ 3ms) Effective bandwidth can be very low ~ 10-30MB/s even when SPTF is used

Eno Thereska Lecture April Surface organized into tracks

Eno Thereska Lecture April Tracks broken up into sectors

Eno Thereska Lecture April Disk head position

Eno Thereska Lecture April Rotation is counter-clockwise

Eno Thereska Lecture April About to read blue sector

Eno Thereska Lecture April After reading blue sector After BLUE read

Eno Thereska Lecture April Red request scheduled next After BLUE read

Eno Thereska Lecture April Seek to Red’s track After BLUE readSeek for RED SEEK

Eno Thereska Lecture April Wait for Red sector to reach head After BLUE readSeek for REDRotational latency ROTATESEEK

Eno Thereska Lecture April Read Red sector After BLUE readSeek for REDRotational latencyAfter RED read ROTATESEEK

Eno Thereska Lecture April Scheduling algorithm Impact

Eno Thereska Lecture April Impact of Request Sizes

Eno Thereska Lecture April What can we do? Nothing we can do during a seek disk head has to move to the right track Rotational latency is fully wasted let’s use this latency During a rotational latency go to nearby tracks and do useful work then, just-in-time, seek back to the original request

Eno Thereska Lecture April A quick glance ahead… What kind of “useful work” are we doing? work that belongs to a “background” app things like backup, defrag, virus scanning What do we really gain? background apps don’t interfere with fore. apps background apps still complete What’s in it for me? can run defrag + virus scanner + backup in the background while working on your homework and you won’t notice they are running

Eno Thereska Lecture April Rotational latency gap utilization After BLUE read

Eno Thereska Lecture April Seek to Third track After BLUE readSeek to Third SEEK

Eno Thereska Lecture April Free transfer After BLUE readSeek to ThirdFree transfer SEEKFREE TRANSFER

Eno Thereska Lecture April Seek to Red’s track After BLUE readSeek to ThirdSeek to REDFree transfer SEEK FREE TRANSFER

Eno Thereska Lecture April Read Red sector After BLUE readSeek to ThirdSeek to REDAfter RED readFree transfer SEEK FREE TRANSFER

Eno Thereska Lecture April Final theory details Scheduler also uses disk idle time high end servers have little idle time Idle time + rotational latency usage = “freeblock scheduling” (it means we are getting things for free)

Eno Thereska Lecture April Steady background I/O progress % disk utilization by foreground reads/writes “Free” MB/s from idle time from rotational gaps

Eno Thereska Lecture April Applied freeblocks: preview Next few slides will show that: we can build background apps that do not interfere with foreground apps that complete eventually things like backup, defrag, virus scanners, etc imagine the possibilities…

Eno Thereska Lecture April App 1: Backup Frequent backup improves data reliability and availability companies take very frequent backups a backup every 30 mins is not uncommon Our experiment: disk used is 18GB we want to back up 12GB of data goal: back it up for free

Eno Thereska Lecture April Backup completed for free Idle systemSyntheticTPC-CPostmark Backup time (mins) < 2% impact on foreground workload

Eno Thereska Lecture April App 2: Layout reorganization Layout reorganization improves access latencies defragmentation is a type of reorganization typical example of background activity Our experiment: disk used is 18GB we want to defrag up to 20% of it goal: defrag for free

Eno Thereska Lecture April Disk Layout Reorganized for Free! Random Circular Track Shuffle %10%20%1%10%20% 8MB64MB Reorganizer buffer size (MB) Reorganization time (mins)

Eno Thereska Lecture April Other maintenance applications Virus scanner LFS cleaner Disk scrubber Data mining Data migration

Eno Thereska Lecture April Summary Disks are slow but we can squeeze extra bw out of them Use freeblock scheduling to extract free bandwidth Utilize free bandwidth for background applications they still complete eventually with no impact on foreground workload

Eno Thereska Lecture April Details: preview (extra slides) Next few slides will show that: it’s hard to do fine grained scheduling at the device driver background apps need new interfaces to express their desires to the background scheduler what if background apps want to read/write to files (APIs talk in LBNs, remember)? recommended reading material

Eno Thereska Lecture April Ahh, the details… Hard to do at the device driver need to know the position of the disk head however, we have done it! it’s more efficient inside the disk drive try to convince your disk vendor to put it in Efficient algorithms SPTF for foreground (0.5% of 1GHz PIII) Freeblock scheduling for background (<<8% of 1GHz PIII) Small memory utilization

Eno Thereska Lecture April Application programming interface (API) goals Work exposed but done opportunistically all disk accesses are asynchronous Minimized memory-induced constraints late binding of memory buffers late locking of memory buffers “Block size” can be application-specific Support for speculative tasks Support for rate control

Eno Thereska Lecture April API description: task registration fb_read (addr_range, blksize,…) fb_write (addr_range, blksize,…) Foreground Background foreground scheduler background scheduler application

Eno Thereska Lecture April API description: task completion callback_fn (addr, buffer, flag, …) Foreground Background foreground scheduler background scheduler application

Eno Thereska Lecture April API description: late locking of buffers buffer = getbuffer_fn (addr, …) Foreground Background foreground scheduler background scheduler application

Eno Thereska Lecture April API description: aborting/promoting tasks fb_abort (addr_range, …) fb_promote (addr_range, …) Foreground Background foreground scheduler background scheduler application

Eno Thereska Lecture April Complete API

Eno Thereska Lecture April Designing disk maintenance applications APIs talk in terms of logical blocks (LBNs) Some applications need structured version as presented by file system or database Example consistency issues application wants to read file “foo” registers task for inode’s blocks by time blocks read, file may not exist anymore!

Eno Thereska Lecture April Application does not care about structure scrubbing, data migration, array reconstruction Coordinate with file system/database cache write-backs, LFS cleaner, index generation Utilize snapshots backup, background fsck Designing disk maintenance applications

Eno Thereska Lecture April Q & A See for more details on freeblockshttp:// Recommended reading : Background fsck (describes snapshots) search for “M. K. McKusick. Running ’fsck’ in the background. BSDCon Conference, 2002” IO-Lite (describes a unified buffer system) search for “V. S. Pai, P. Druschel, and W. Zwaenepoel. IO-Lite: a unified I/O buffering and caching system. Symposium on Operating Systems Design and Implementation 1998”