Storage
Hakim Weatherspoon
CS 3410 Computer Science, Cornell University
[Altinbuke, Walsh, Weatherspoon, Bala, Bracy, McKee, and Sirer]

Challenge
How do we store lots of data for a long time?
Disk (hard disk, floppy disk, …)
Solid State Disk (SSD)
Tape (cassettes, backup, VHS, …)
CDs/DVDs
Non-Volatile Persistent Memory (NVM; e.g., 3D XPoint)

I/O System Characteristics
Dependability is important, particularly for storage devices.
Performance measures: latency (response time) and throughput (bandwidth).
Desktops & embedded systems: mainly interested in response time & diversity of devices.
Servers: mainly interested in throughput & expandability of devices.

Memory Hierarchy (2006)
registers/L1: 16 KB, 2 ns, random access
L2: 512 KB, 5 ns, random access
DRAM: 2 GB, 20-80 ns, random access
Disk: 300 GB, 2-8 ms, random access
Tape: 1 TB, 100s of seconds, sequential access

Memory Hierarchy
registers/L1: 128 KB, 2 ns, random access
L2: 4 MB, 5 ns, random access
DRAM: 256 GB, 20-80 ns, random access
SSD: 30 TB, 100 ns-10 µs, random access; millions of IOPS (I/Os per second)
Disk: 6 TB, 2-8 ms, random access

Memory Hierarchy
registers/L1: 128 KB, 2 ns, random access
L2: 4 MB, 5 ns, random access
DRAM: 256 GB, 20-80 ns, random access
Non-volatile memory: 1 TB, 20-100 ns, random access
SSD: 30 TB, 100 ns-10 µs, random access; millions of IOPS (I/Os per second)
Disk: 6 TB, 2-8 ms, random access

Memory Hierarchy
SSD: 30 TB (2^45 B), 100 ns-10 µs random access; millions of IOPS (I/Os per second)
Server (10s of disks): 256 TB (2^48 B)
Rack of servers (10s of servers): 10 PB (2^53 B)
Data center (10-100s of servers): 1 EB (2^60 B)
Cloud (10-100s of data centers): 0.1 YB (2^67 B)

The Rise of Cloud Computing
How big is Big Data in the Cloud? Exabytes: delivery of petabytes of storage daily.
("Tech Titans Building Boom," Randy Katz, 2008)

The Rise of Cloud Computing
How big is Big Data in the Cloud? Most of the world's data (and computation) is hosted by a few companies.

The Rise of Cloud Computing
The promise of the Cloud: "ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction." (NIST cloud definition)

Tapes
Same basic principle for 8-tracks, cassettes, VHS, …
Ferric oxide powder: a ferromagnetic material.
During recording, the audio signal is sent through the coil of wire to create a magnetic field in the core.
During playback, the motion of the tape creates a varying magnetic field in the core and therefore a signal in the coil.

Disks & CDs
Disks use the same magnetic medium as tapes, organized as concentric rings (not a spiral).
CDs & DVDs use optics and a single spiral track.

Disk Physics
Typical parameters:
1 spindle
1 arm assembly
1-4 platters, 1-2 sides/platter
1 head per side (but only 1 active head at a time)
700-20,480 tracks/surface
16-1,600 sectors/track

Disk Accesses
Accessing a disk requires:
specify sector: C (cylinder), H (head), and S (sector)
specify size: number of sectors to read or write
specify memory address
Performance:
seek time: move the arm assembly to the track
rotational delay: wait for the sector to come around
transfer time: get the bits off the disk
controller time: time for setup

Example
Average time to read/write a 512-byte sector:
Disk rotates at 10,000 RPM
Seek time: 6 ms
Transfer rate: 50 MB/sec
Controller overhead: 0.2 ms
Average time = seek time + rotational delay + transfer time + controller overhead
= 6 ms + 0.5 rotation/(10,000 RPM) + 0.5 KB/(50 MB/sec) + 0.2 ms
= 6.0 + 3.0 + 0.01 + 0.2 = 9.2 ms

Disk Access Example
If the actual average seek time is 2 ms (real workloads exhibit locality, so measured seeks are typically shorter than the quoted average):
Average read time = 2.0 + 3.0 + 0.01 + 0.2 ≈ 5.2 ms
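The slide's arithmetic can be packaged as a small helper (a sketch; the function name is ours, and it uses the slide's decimal KB/MB units, where 0.5 KB divided by 50 MB/s conveniently comes out in milliseconds):

```python
def disk_access_time_ms(seek_ms, rpm, transfer_mb_s, controller_ms,
                        sector_kb=0.5):
    """Average access = seek + rotational delay + transfer + controller."""
    rotational_ms = 0.5 * 60_000 / rpm       # wait half a rotation on average
    transfer_ms = sector_kb / transfer_mb_s  # KB / (MB/s) yields ms directly
    return seek_ms + rotational_ms + transfer_ms + controller_ms

quoted = disk_access_time_ms(seek_ms=6.0, rpm=10_000,
                             transfer_mb_s=50, controller_ms=0.2)    # ~9.2 ms
measured = disk_access_time_ms(seek_ms=2.0, rpm=10_000,
                               transfer_mb_s=50, controller_ms=0.2)  # ~5.2 ms
```

Note that seek time dominates until it shrinks, at which point rotational delay (fixed by the RPM) becomes the bottleneck.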

Disk Scheduling
Goal: minimize seek time (secondary goal: minimize rotational latency).
FCFS (first come, first served): service requests in arrival order.
SSTF (shortest seek time first): service the request nearest the current head position.
SCAN / elevator: service all requests in one direction, then reverse and service in the opposite direction.
C-SCAN (circular SCAN): sweep in one direction only; go off the edge, return to the beginning, and start over.

FCFS

SSTF

SCAN

C-SCAN
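The policies above can be compared with a small simulation (a sketch: the request queue and start track below are an illustrative example of our own, and this SCAN is the simplified LOOK variant that reverses at the last pending request rather than at the platter edge):

```python
def fcfs(requests, head):
    return list(requests)                        # serve in arrival order

def sstf(requests, head):
    pending, order = list(requests), []
    while pending:                               # greedily pick nearest track
        nearest = min(pending, key=lambda t: abs(t - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order

def scan(requests, head):
    # Sweep toward higher tracks first, then reverse (LOOK variant).
    higher = sorted(t for t in requests if t >= head)
    lower = sorted((t for t in requests if t < head), reverse=True)
    return higher + lower

def total_seek(order, head):
    moved = 0
    for track in order:
        moved += abs(track - head)               # tracks crossed per request
        head = track
    return moved

queue, start = [98, 183, 37, 122, 14, 124, 65, 67], 53
# FCFS moves the head 640 tracks, SSTF 236, SCAN 299.
```

SSTF wins here but can starve requests far from the head; SCAN bounds the wait at a modest cost in total movement.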

Disk Geometry: LBA
New machines use logical block addressing (LBA) instead of CHS: the drive presents the illusion of a linear array of blocks, numbered 0 to N-1.
Modern disks:
have a varying number of sectors per track (roughly constant data density over the disk, hence varying throughput over the disk)
remap and reorder blocks (to avoid defects), completely obscuring their actual physical geometry
have built-in caches to hide latencies when possible (while being careful about persistence requirements)
run internal software on an embedded CPU
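Under the old fixed-geometry scheme, CHS-to-LBA translation was a simple formula (a sketch; the 16-head, 63-sectors-per-track geometry below is an assumption for illustration, since modern drives no longer expose their real geometry):

```python
def chs_to_lba(cylinder, head, sector, heads_per_cyl, sectors_per_track):
    # CHS numbers sectors from 1; cylinders and heads from 0.
    return (cylinder * heads_per_cyl + head) * sectors_per_track + (sector - 1)

first = chs_to_lba(0, 0, 1, heads_per_cyl=16, sectors_per_track=63)     # block 0
next_head = chs_to_lba(0, 1, 1, heads_per_cyl=16, sectors_per_track=63)
next_cyl = chs_to_lba(1, 0, 1, heads_per_cyl=16, sectors_per_track=63)
```

The formula just flattens the three-dimensional (cylinder, head, sector) coordinate into a single index, which is exactly the linear-array illusion LBA standardizes.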

Flash Storage
Nonvolatile semiconductor storage
100× to 1000× faster than disk; smaller, lower power
But more $/GB (between disk and DRAM)
However, the price is dropping and performance is increasing faster than disk's.

Flash Types
NOR flash: bit cell like a NOR gate
random read/write access
used for instruction memory in embedded systems
NAND flash: bit cell like a NAND gate
denser (bits/area), but block-at-a-time access
cheaper per GB
used for USB keys, media storage, …
Flash bits wear out after 1000s of accesses, so flash is not suitable as a direct RAM or disk replacement.
Flash has an unusual interface: bits can only be "reset" (erased) in large blocks.
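That "reset only in large blocks" constraint can be modeled in a few lines (a toy sketch of our own, not a real flash controller): programming can only clear bits, so restoring any bit to 1 forces an erase of the whole block, and each erase contributes to wear-out.

```python
ERASED = 0xFF   # an erase sets every bit in the block to 1

class FlashBlock:
    def __init__(self, pages=4):
        self.pages = [ERASED] * pages
        self.erase_count = 0          # wear-out budget: ~1000s of erases

    def program(self, page, value):
        # Programming can only clear bits (1 -> 0), never set them.
        self.pages[page] &= value

    def erase(self):
        # The only way to get bits back to 1: wipe the entire block.
        self.pages = [ERASED] * len(self.pages)
        self.erase_count += 1

blk = FlashBlock()
blk.program(0, 0b10100101)
blk.program(0, 0b11110000)   # set bits cannot be restored: result is 0b10100000
```

This is why SSD firmware writes updates to fresh pages and garbage-collects stale ones, rather than overwriting in place.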

I/O vs. CPU Performance (Amdahl's Law)
Don't neglect I/O performance as parallelism increases compute performance.
Example: a benchmark takes 90 s CPU time and 10 s I/O time; the number of CPUs doubles every 2 years, but I/O is unchanged.

Year | CPU time | I/O time | Elapsed time | % I/O time
now  | 90 s     | 10 s     | 100 s        | 10%
+2   | 45 s     | 10 s     | 55 s         | 18%
+4   | 23 s     | 10 s     | 33 s         | 31%
+6   | 11 s     | 10 s     | 21 s         | 47%
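The table can be reproduced directly (a sketch; the slide rounds CPU seconds to whole numbers, which is why +4 years shows 23 s rather than the exact 22.5 s):

```python
def io_share(years, cpu0=90.0, io=10.0, doubling_period=2):
    """CPU time halves every doubling_period years; I/O time stays fixed."""
    cpu = cpu0 / 2 ** (years / doubling_period)
    elapsed = cpu + io
    return cpu, elapsed, io / elapsed

for y in (0, 2, 4, 6):
    cpu, elapsed, frac = io_share(y)
    print(f"+{y}y: CPU {cpu:5.2f}s  elapsed {elapsed:6.2f}s  I/O {frac:.0%}")
```

The fixed 10 s of I/O claims an ever-larger share of elapsed time: the unimproved part of the workload bounds overall speedup, which is Amdahl's law.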

RAID: Redundant Arrays of Inexpensive Disks
Big ideas: parallelism to gain performance; redundancy to gain reliability.

RAID 0: Striping
A non-redundant disk array!

RAID 1: Mirrored Disks
More expensive; on failure, use the extra copy.

RAID 2-3-4-5-6: Bit-Level Striping and Parity Checks
As the level increases: stronger guarantees against failure (more reliability) and better read/write performance.
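Parity-based recovery is just bytewise XOR: the parity block is the XOR of the data blocks, and XORing the survivors with the parity rebuilds any single lost block. A minimal sketch (the two-byte blocks are made-up example data):

```python
from functools import reduce

def parity(blocks):
    """RAID-4/5 style parity: XOR corresponding bytes across the data blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(survivors, parity_block):
    """Rebuild the one lost block from the surviving blocks and the parity."""
    return parity(survivors + [parity_block])

d0, d1, d2 = b"\x0f\xf0", b"\x33\xcc", b"\x55\xaa"
p = parity([d0, d1, d2])
rebuilt = reconstruct([d0, d2], p)   # d1 was "lost"; recover it from the rest
```

Because any block XORed with itself is zero, XORing all blocks plus parity always yields zeros, which is how a single erased disk's contents drop out of the equation.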

Summary
Disks provide nonvolatile storage.
I/O performance measures: throughput and response time; dependability and cost are very important.
RAID: redundancy for fault tolerance and speed.