CS422 Principles of Database Systems Disk Access Chengyu Sun California State University, Los Angeles.


Disk Drive … [Figure: a disk drive with labels for the actuator, the arm with R/W head, a platter, a track, and a sector]

… Disk Drive … Each disk drive contains a number of rotating platters. Each platter has a number of tracks on which data is recorded. Each track is divided into sectors of equal size (in bytes).

… Disk Drive: The tracks with the same track number on different platters form a cylinder. Data is accessed through read/write heads. An actuator moves the read/write heads from one track to another.

Access Data on Disk: 1. Move the read/write head to the requested track. 2. Rotate the platter until the first requested byte is beneath the r/w head. 3. Continue to rotate the platter until all the requested data is transferred.

Disk Access Time: Disk access time = seek time + rotational delay + transfer time. Transfer rate = (number of bytes per track) / (time for one revolution of the platter).

Measures of Disk Drive Performance: capacity, average seek time, rotation speed, transfer rate.

Seagate ST AS: Capacity: 500GB; Bytes per sector: 512; Default sectors per track: 63; Average seek time (read): <8.5ms; Average seek time (write): <9.5ms; Rotation speed: 7200rpm.

Examples: Disk Access Time. Use the specs of the ST AS to calculate the time for the following disk accesses: read 1KB on one track; read 4KB on one track; read 4KB on four tracks.
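One way to work through these examples is sketched below. It is a rough model, not the slides' own solution, and it rests on assumptions the slides do not state: the seek time is taken as the quoted 8.5ms upper bound for reads, the average rotational delay is taken as half a revolution, 1KB is 1024 bytes, and "4KB on four tracks" is read as 1KB from each of four tracks, each needing its own seek and rotational delay.

```python
# Rough disk access time model: seek + rotational delay + transfer.
# All constants come from the spec slide; the modeling choices
# (half-revolution average delay, 8.5 ms seek) are assumptions.

BYTES_PER_SECTOR = 512
SECTORS_PER_TRACK = 63
AVG_SEEK_MS = 8.5          # quoted as "<8.5ms" for reads
RPM = 7200

REVOLUTION_MS = 60_000 / RPM                        # one full revolution
BYTES_PER_TRACK = BYTES_PER_SECTOR * SECTORS_PER_TRACK
AVG_ROTATIONAL_DELAY_MS = REVOLUTION_MS / 2         # assumed average


def transfer_ms(num_bytes):
    """Time to transfer num_bytes stored contiguously on one track."""
    return num_bytes / BYTES_PER_TRACK * REVOLUTION_MS


def access_ms(num_bytes):
    """One random access: seek, wait for rotation, then transfer."""
    return AVG_SEEK_MS + AVG_ROTATIONAL_DELAY_MS + transfer_ms(num_bytes)


one_kb = access_ms(1024)
four_kb_one_track = access_ms(4096)
four_kb_four_tracks = 4 * access_ms(1024)   # 1KB on each of four tracks

print(f"1KB on one track:    {one_kb:.2f} ms")
print(f"4KB on one track:    {four_kb_one_track:.2f} ms")
print(f"4KB on four tracks:  {four_kb_four_tracks:.2f} ms")
```

Under these assumptions the three cases come out at roughly 12.9 ms, 13.7 ms, and 51.7 ms: seek time and rotational delay dominate, so quadrupling the data on one track costs under 1 ms more, while spreading it over four tracks roughly quadruples the total.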

What We Learned from the Examples: Sequential access is much more efficient than random access. Accessing a large chunk of data once is much more efficient than accessing smaller chunks multiple times.

Improve Disk Performance: caching, striping, mirroring, storing parity.

Caching: Read more data than requested (read one sector vs. read one track). Transfer data from the cache: no seek time, no rotational delay, transfer rate 6Gb/s (SATA III).

Striping: Multiple small disks are faster than one large disk, but only when the I/O requests are evenly distributed among the disks. [Figure: the sectors of a Virtual Disk (Sector 0, Sector 1, Sector 2, Sector 3, …) spread across Physical Disk 0 (Sector 0, Sector 1, …) and Physical Disk 1 (Sector 0, Sector 1, …)]

Example: Striping. Suppose we are using N disks for striping, and each disk has k sectors. An access to virtual sector x is mapped to an access to which sector on which disk?
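One possible answer, assuming the round-robin layout that the striping figure suggests (consecutive virtual sectors go to consecutive disks): virtual sector x lands on disk x mod N, at sector x div N of that disk. The function name and the range check are illustrative choices, not part of the slides.

```python
def map_virtual_sector(x, n_disks, sectors_per_disk):
    """Map virtual sector x to (disk index, sector index on that disk).

    Assumes round-robin striping with a stripe unit of one sector.
    """
    if not 0 <= x < n_disks * sectors_per_disk:
        raise ValueError("virtual sector out of range")
    return x % n_disks, x // n_disks


# With N = 2 disks of k = 4 sectors each, the first four virtual
# sectors alternate between the two disks.
layout = [map_virtual_sector(x, 2, 4) for x in range(4)]
print(layout)
```

Other layouts are possible (for example, larger stripe units of several sectors), in which case the mapping divides by the stripe unit first.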

Mirroring: Store the same data on two or more disks. Improves reliability. Does not improve speed: same read speed, slower write (why?).

Storing Parity … Let S be a set of bits. The parity of S is 1 if S contains an odd number of 1’s, and 0 if S contains an even number of 1’s. [Figure: bits stored on Disk 1, Disk 2, and Disk 3, with their parity stored on a Parity Disk]

… Storing Parity: Back up any number of disks with one disk. Can only recover from a single disk failure.
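The parity defined above is the XOR of the bits, which is why one parity disk can rebuild any single lost disk: XOR-ing the parity with the surviving disks yields the missing data. The sketch below illustrates this on byte strings standing in for disk blocks; the function names and sample data are made up for illustration.

```python
from functools import reduce


def parity_block(blocks):
    """Bytewise XOR of equal-length blocks (bit parity, 8 bits at a time)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity."""
    return parity_block(list(surviving_blocks) + [parity])


# Three data "disks" and one parity "disk".
disks = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]
parity = parity_block(disks)

# Simulate losing disk 1 and rebuilding it from the rest.
rebuilt = recover([disks[0], disks[2]], parity)
print(rebuilt == disks[1])
```

If two disks fail at once, the single XOR equation has two unknowns and cannot be solved, which matches the single-failure limit stated above.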

Storing Parity without a Parity Disk: [Figure: data and parity bits distributed across Disk 1, Disk 2, Disk 3, and Disk 4] What’s the benefit of distributing parity to all disks?

RAID … Redundant Array of Inexpensive Disks. RAID 0 – striping; RAID 1 – mirroring; RAID 1+0 – mirroring + striping; RAID 2 – striping (bit); RAID 3 – striping (byte) + parity; RAID 4 – striping (block) + parity.

… RAID: RAID 5 – striping + parity (no separate parity disk); RAID 6 – striping + 2*parity (no separate parity disk).

OS Disk Access API: block-level API, file-level API.

Block and Page … A block is similar to a sector except that its size is determined by the OS, e.g. the NTFS default block size on Vista is 4KB. A file always starts at the beginning of a block. What is the tradeoff between large and small block sizes?

… Block and Page: A page is a block-sized area of main memory. Each block/page is uniquely numbered by the OS.

OS Block-Level API: read_block(n,p) – read block n into page p; write_block(n,p) – write page p to block n; allocate(n,k) – allocate k contiguous blocks, placed as close to block n as possible; deallocate(n,k) – mark k contiguous blocks starting at block n as unused.

OS File-Level API: Similar to the API of RandomAccessFile in Java (ava/io/RandomAccessFile.html). Treats a file as a continuous sequence of bytes. seek(long position). Read and write various data types.
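The same file-level style of access can be illustrated in Python, used here only as an analogue of Java's RandomAccessFile: the file is a sequence of bytes, `seek` positions the file pointer, and typed values are read and written at that position. The helper names and the temporary file are hypothetical; the big-endian 4-byte format is chosen to match what Java's `writeInt` produces.

```python
import os
import struct
import tempfile

INT = struct.Struct(">i")   # 4-byte big-endian signed int


def write_int(f, position, value):
    """Move the file pointer to position and write one int there."""
    f.seek(position)
    f.write(INT.pack(value))


def read_int(f, position):
    """Move the file pointer to position and read one int from there."""
    f.seek(position)
    return INT.unpack(f.read(INT.size))[0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dat")
    with open(path, "w+b") as f:
        write_int(f, 0, 42)
        write_int(f, 4, -7)
        # Read back in a different order than written: access is by
        # byte position, not by sequence.
        second = read_int(f, 4)
        first = read_int(f, 0)

print(first, second)
```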

DBMS Disk Access API … Approach 1: use OS block-level API. Full control of disk access: most efficient; not constrained by file system limitations (e.g. file size). Complex to implement. Disks must be mounted as raw disks. Difficult to administer.

… DBMS Disk Access API … Approach 2: use OS file-level API. Easy to implement. Easy to administer. No block I/O: much less efficient; no paging, which is required by DBMS buffer management.

… DBMS Disk Access API. Approach 3: build a block I/O API on top of the OS’s file I/O API. This is the approach taken by most DBMSs.
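A minimal sketch of Approach 3 follows: block n of a file is simply the byte range starting at offset n times the block size, so read_block and write_block reduce to a seek plus a fixed-size read or write. The class name `BlockFile`, the 4096-byte block size, and the zero-padding of blocks past the end of the file are assumptions made for illustration; this is not SimpleDB's code or any particular DBMS's implementation.

```python
import os
import tempfile

BLOCK_SIZE = 4096   # assumed block size


class BlockFile:
    """Hypothetical block I/O layer built on ordinary file I/O."""

    def __init__(self, path, block_size=BLOCK_SIZE):
        self.block_size = block_size
        mode = "r+b" if os.path.exists(path) else "w+b"
        self.f = open(path, mode)

    def read_block(self, n, page):
        """Read block n into page (a bytearray of block_size bytes)."""
        self.f.seek(n * self.block_size)
        data = self.f.read(self.block_size)
        # Blocks beyond the end of the file read back as zeros.
        page[:] = data.ljust(self.block_size, b"\0")

    def write_block(self, n, page):
        """Write page to block n."""
        if len(page) != self.block_size:
            raise ValueError("page must be exactly one block long")
        self.f.seek(n * self.block_size)
        self.f.write(page)
        self.f.flush()

    def close(self):
        self.f.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    bf = BlockFile(path)

    page = bytearray(BLOCK_SIZE)          # a page: block-sized memory area
    bf.write_block(0, page)               # block 0: all zeros
    page[:5] = b"hello"
    bf.write_block(1, page)               # block 1: starts with "hello"

    readback = bytearray(BLOCK_SIZE)
    bf.read_block(1, readback)
    file_size = os.path.getsize(path)
    bf.close()

print(bytes(readback[:5]), file_size)
```

A real DBMS would add allocation of new blocks and, depending on the platform, stronger durability guarantees than `flush` alone provides; those are left out to keep the sketch short.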

Readings: Database Design and Implementation, Chapter 12; SimpleDB disk access code in the package simpledb.file; The SSD Anthology -