Lecture 17 Raid. Device Protocol Variants Status checks: polling vs. interrupts Data: PIO vs. DMA Control: special instructions vs. memory-mapped I/O.

Slides:



Advertisements
Similar presentations
I/O Management and Disk Scheduling Chapter 11. I/O Driver OS module which controls an I/O device hides the device specifics from the above layers in the.
Advertisements

- Dr. Kalpakis CMSC Dr. Kalpakis 1 Outline In implementing DBMS we need to answer How should the system store and manage very large amounts of data?
RAID Redundant Array of Independent Disks
CS 6560: Operating Systems Design
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Disks and RAID.
Lecture 16 Raid. Device Protocol Variants Status checks: polling vs. interrupts Data: PIO vs. DMA Control: special instructions vs. memory-mapped I/O.
CSCE 212 Chapter 8 Storage, Networks, and Other Peripherals Instructor: Jason D. Bakos.
Lecture 14 I/O Devices & Hard Disk Drives. vgetmem only from private heap.
CPSC-608 Database Systems Fall 2008 Instructor: Jianer Chen Office: HRBB 309B Phone: Notes #6.
Other Disk Details. 2 Disk Formatting After manufacturing disk has no information –Is stack of platters coated with magnetizable metal oxide Before use,
Lecture 17 I/O Optimization. Disk Organization Tracks: concentric rings around disk surface Sectors: arc of track, minimum unit of transfer Cylinder:
Disk Drivers May 10, 2000 Instructor: Gary Kimura.
Based on the slides supporting the text
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.
Computer ArchitectureFall 2008 © November 12, 2007 Nael Abu-Ghazaleh Lecture 24 Disk IO.
Cse Feb-001 CSE 451 Section February 24, 2000 Project 3 – VM.
Disks.
1 File System Implementations. 2 Overview The Low Level –disks –caching –RAM disks –RAIDs –disk head scheduling The higher level –file systems –directories.
Disks CS 416: Operating Systems Design, Spring 2001 Department of Computer Science Rutgers University
Avishai Wool lecture Introduction to Systems Programming Lecture 9 Input-Output Devices.
Secondary Storage CSCI 444/544 Operating Systems Fall 2008.
1 Today I/O Systems Storage. 2 I/O Devices Many different kinds of I/O devices Software that controls them: device drivers.
Introduction to Database Systems 1 The Storage Hierarchy and Magnetic Disks Storage Technology: Topic 1.
Moving-head Disk Mechanism Rotation Speeds: 60 to 200 rotations per second Head Crash: read-write head makes contact with the surface.
12.1 Silberschatz, Galvin and Gagne ©2009 Operating System Concepts with Java – 8 th Edition Chapter 12: Mass-Storage Systems.
Storage System: RAID Questions answered in this lecture: What is RAID? How does one trade-off between: performance, capacity, and reliability? What is.
CS4432: Database Systems II Data Storage (Better Block Organization) 1.
Device Management. So far… We have covered CPU and memory management Computing is not interesting without I/Os Device management: the OS component that.
CS 346 – Chapter 10 Mass storage –Advantages? –Disk features –Disk scheduling –Disk formatting –Managing swap space –RAID.
Storage & Peripherals Disks, Networks, and Other Devices.
CS 352 : Computer Organization and Design University of Wisconsin-Eau Claire Dan Ernst Storage Systems.
1 Recitation 8 Disk & File System. 2 Disk Scheduling Disks are at least four orders of magnitude slower than main memory –The performance of disk I/O.
1 File System Implementation Operating Systems Hebrew University Spring 2010.
Physical Storage Organization. Advanced DatabasesPhysical Storage Organization2 Outline Where and How are data stored? –physical level –logical level.
Disk Access. DISK STRUCTURE Sector: Smallest unit of data transfer from/to disk; 512B 2/4/8 adjacent sectors transferred together: Blocks Read/write heads.
Lecture 9 of Advanced Databases Storage and File Structure (Part II) Instructor: Mr.Ahmed Al Astal.
Topic: Disks – file system devices. Rotational Media Sector Track Cylinder Head Platter Arm Access time = seek time + rotational delay + transfer time.
1 Lecture 8: Secondary-Storage Structure 2 Disk Architecture Cylinder Track SectorDisk head rpm.
Disk Structure Disk drives are addressed as large one- dimensional arrays of logical blocks, where the logical block is the smallest unit of transfer.
CS 6502 Operating Systems Dr. J.. Garrido Device Management (Lecture 7b) CS5002 Operating Systems Dr. Jose M. Garrido.
CE Operating Systems Lecture 20 Disk I/O. Overview of lecture In this lecture we will look at: Disk Structure Disk Scheduling Disk Management Swap-Space.
CS414 Review Session.
Physical Storage Organization. Advanced DatabasesPhysical Storage Organization2 Outline Where and How data are stored? –physical level –logical level.
Chapter 12 – Mass Storage Structures (Pgs )
CS 153 Design of Operating Systems Spring 2015 Lecture 22: File system optimizations.
UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department
Chapter 14: Mass-Storage Systems Disk Structure. Disk Scheduling. RAID.
Device Management Mark Stanovich Operating Systems COP 4610.
CS399 New Beginnings Jonathan Walpole. Disk Technology & Secondary Storage Management.
Part IV I/O System Chapter 12: Mass Storage Structure.
Lecture Topics: 11/22 HW 7 File systems –block allocation Unix and NT –disk scheduling –file caches –RAID.
W4118 Operating Systems Instructor: Junfeng Yang.
LECTURE 13 I/O. I/O CANNOT BE IGNORED Assume a program requires 100 seconds, 90 seconds for main memory, 10 seconds for I/O. Assume main memory access.
1 Components of the Virtual Memory System  Arrows indicate what happens on a lw virtual address data physical address TLB page table memory cache disk.
CS422 Principles of Database Systems Disk Access Chengyu Sun California State University, Los Angeles.
Magnetic Disks Have cylinders, sectors platters, tracks, heads virtual and real disk blocks (x cylinders, y heads, z sectors per track) Relatively slow,
CS Introduction to Operating Systems
Database Applications (15-415) DBMS Internals- Part I Lecture 11, February 16, 2016 Mohammad Hammoud.
Multiple Platters.
Disks and RAID.
Operating System I/O System Monday, August 11, 2008.
Lecture 13 I/O.
Overview Continuation from Monday (File system implementation)
Jonathan Walpole Computer Science Portland State University
Persistence: hard disk drive
Mass-Storage Systems.
Persistence: I/O devices
CS333 Intro to Operating Systems
Disk Scheduling The operating system is responsible for using hardware efficiently — for the disk drives, this means having a fast access time and disk.
Lecture 10: Magnetic Disks
Presentation transcript:

Lecture 17 Raid

Device Protocol Variants Status checks: polling vs. interrupts Data: PIO vs. DMA Control: special instructions vs. memory-mapped I/O

Seek, Rotate, Transfer Must accelerate, coast, decelerate, settle Seeks often take several milliseconds! Settling alone can take ms. Entire seek often takes ms.

Seek, Rotate, Transfer Depends on rotations per minute (RPM) RPM is common, RPM is high end. 1 / 7200 RPM = 1 minute / 7200 rotations = 1 second / 120 rotations = 8.3 ms / rotation so it may take 4.2 ms on avg to rotate to target (0.5 * 8.3 ms)

Seek, Rotate, Transfer Pretty fast — depends on RPM and sector density MB/s is typical. 1s / 100 MB = 10 ms / MB = 4.9 us / sector (assuming 512-byte sector)

Workload So… seeks are slow rotations are slow transfers are fast What kind of workload is fastest for disks? Sequential: access sectors in order (transfer dominated) Random: access sectors arbitrarily (seek+rotation dominated)

Disk Spec Sequential workload: what is throughput for each? Close to max transfer rate if workload large enough Random workload: what is throughput for each? Assume 16-KB reads, 2.5 MB/s and 1.2MB/s

Track Skew

Zones

Other Improvements Track Skew Zones Cache Drives may cache both reads and writes.

Schedulers Given a stream of requests, in what order should they be served? Try to follow SJF (shortest job first) SSTF: Shortest Seek Time First Difficult for OS, starvation Elevator (a.k.a. SCAN or C-SCAN) Still ignore rotation time

SPTF: Shortest Positioning Time First

Other Issues Both OS and disk do some scheduling I/O merging Work Conservation

Only One Disk? Sometimes we want many disks — why? capacity performance reliability Challenge: most file systems work on only one disk.

Solution 1: JBOD JBOD: Just a Bunch Of Disks Application is smart, stores different files on different file systems. FS Application

Solution 2: RAID RAID: Redundant Array of Inexpensive Disks Transparent and deployable Capacity and performance Reliability? FS Fake Disk Application

Why Inexpensive Disks? Economies of scale! Cheap disks are popular. You can often get many commodity H/W components for the same price as a few expensive components. Strategy: write S/W to build high-quality logical devices from many cheap devices. Alternative to RAID: buy an expensive, high-end disk.

General Strategy Build fast, large disk from smaller ones. Add even more disks for reliability. Disk RAID Disk 0 100

Mapping How should we map logical to physical addresses? How is this problem similar to virtual memory? Dynamic mapping: use data structure (hash table, tree) paging Static mapping: use math RAID

Redundancy Redundancy: how many copies? System engineers are always trying to increase or decrease redundancy. Increase: replication (e.g., RAID) Decrease: deduplication (e.g., code sharing) One strategy: reduce redundancy as much is possible. Then add back just the right amount.

Reasoning About RAID Workload: types of reads/writes issued by app Reads and writes One operation and steady I/O Sequential and random RAID: system for mapping logical to physical addrs Which logical blocks map to which physical blocks? How do we use extra physical blocks (if any)? Metrics Capacity: how much space can apps use? Reliability: how many disks can we safely lose? Performance: how long does each workload take?

RAID-0: Striping Optimize for capacity. No redundancy (weird name). Logical Blocks Disk 0 Disk 1

RAID-0 with 4 disks Disk 0 Disk 1 Disk 2 Disk stripe How to map given logical address A: Disk = A % disk_count Offset = A / disk_count

Chunk Size Disk 0 Disk 1 Disk 2 Disk stripe When are small chunks better? When are big ones better?

RAID-0: Analysis What is capacity? N * C How many disks can fail? 0 Throughput? ? Latency for one-op performance T

RAID-0: Analysis Performance: what is steady-state throughput for sequential reads sequential writes random reads random writes N * S N * R

RAID-1: Mirroring Keep two copies of all data. Logical Blocks Disk 0 Disk 1

4 disks Disk 0 Disk 1 Disk 2 Disk How many disks can fail?

Assumptions Assume disks are fail-stop. they work or they don’t we know when they don’t Tougher Errors: latent sector errors silent data corruption

RAID-1: Analysis What is capacity? N / 2 * C How many disks can fail? 1 (or maybe N / 2) Throughput? ??? Latency for one-op performance T for read, T+ for write

RAID-1: Throughput Performance: what is steady-state throughput for sequential reads sequential writes random reads random writes N/2 * S N * R N/2 * R

Crashes Operations: write (A) to 2 What if crash in between the writes to two disks? Logical Blocks Disk 0 Disk 1 Consistent-update problem

Write-ahead log Run recovery procedure upon a system failure H/W Solution: Use non-volatile RAM in RAID controller.

RAID-4 with 5 disks Disk 0 Disk 1 Disk 2 Disk 3 Disk PA PA PA PA3 How to calculate parity XOR is a good one parity

RAID-4: Analysis What is capacity? (N-1) * C How many disks can fail? 1 Throughput? ??? Latency for one-op performance T for read, 2T+ for write

RAID-4: Throughput Performance: what is steady-state throughput for sequential reads sequential writes random reads random writes additive parity subtractive parity (N-1) * S (N-1) * R R/2 How to avoid parity bottleneck?

RAID-5 Disk 0 Disk 1 Disk 2 Disk 3 Disk PA PA PA PA PA

RAID-5: Analysis What is capacity? (N-1) * C How many disks can fail? 1 Throughput? ??? Latency for one-op performance T for read, 2T+ for write

RAID-5: Throughput Performance: what is steady-state throughput for sequential reads sequential writes random reads random writes (N-1) * S N * R N/4 * R

All RAID ReliabilityCapacity RAID-00N RAID-11N / 2 RAID-41N - 1 RAID-51N - 1

All RAID Read LatencyWrite Latency RAID-0TT RAID-1TT RAID-4T2T RAID-5T2T RAID-5 can do more in parallel

All RAID Seq RSeq WRand RRand W RAID-0N*S N*R RAID-1N/2 * S N*RN/2 * R RAID-4(N-1)*S (N-1)*RR/2 RAID-5(N-1)*S N*RN/4 * R RAID-5 is strictly better than RAID-4 RAID-0 is always fastest and has best capacity. RAID-5 better than RAID-1 for sequential. RAID-1 better than RAID-5 for random write.

Next time FS API vsfs