13.3 Accelerating Access to Secondary Storage


13.3 Accelerating Access to Secondary Storage: Section Overview
- 13.3.1: The I/O Model of Computation
- 13.3.2: Organizing Data by Cylinders
- 13.3.3: Using Multiple Disks
- 13.3.4: Mirroring Disks
- 13.3.5: Disk Scheduling and the Elevator Algorithm
- 13.3.6: Prefetching and Large-Scale Buffering

13.3 Introduction
- An average block access takes about 10 ms.
- The disk may be busy when a request arrives, so requests have to queue.
- If requests arrive faster than the disk can serve them, the queue keeps growing and the scheduling latency grows without bound.
- There are various strategies to increase disk throughput.
- The “I/O model” is the appropriate model for estimating the speed of database operations.

13.3 Introduction (Contd.)
Actions that improve database access speed:
- Place blocks that are accessed together close to each other, within the same cylinder
- Increase the number of disks
- Mirror disks
- Use an improved disk-scheduling algorithm
- Use prefetching

13.3.1 The I/O Model of Computation
Consider a computer running a DBMS that:
- Is trying to serve a number of users
- Has 1 processor, 1 disk controller, and 1 disk
- Has each user accessing a different part of the database
It can be assumed that:
- The time required for a disk access is much larger than the time required to access main memory
- As a result, the number of block accesses is a good approximation of the time required by a database algorithm
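A minimal sketch of how the I/O model is used in practice: the cost of an operation is estimated by counting block accesses and ignoring everything else. The function name and the two example workloads are hypothetical; the 10 ms figure is the average block access time quoted in the introduction.

```python
# Average block access time from the introduction (~10 ms).
BLOCK_ACCESS_MS = 10.0


def estimated_time_ms(block_accesses: int,
                      block_access_ms: float = BLOCK_ACCESS_MS) -> float:
    """Estimate running time under the I/O model.

    Only disk block accesses are counted; CPU and main-memory
    time are assumed to be negligible by comparison.
    """
    if block_accesses < 0:
        raise ValueError("block_accesses must be non-negative")
    return block_accesses * block_access_ms


# Hypothetical workloads: a full scan of a 1000-block relation
# versus an index lookup that touches 3 blocks.
scan_ms = estimated_time_ms(1000)
lookup_ms = estimated_time_ms(3)
```

Under this model the scan is estimated at 10 seconds and the lookup at 30 ms, regardless of how much in-memory work each one does.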

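13.3.2 Organizing Data by Cylinders
- It is more efficient to store data that is likely to be accessed together in the same cylinder or in adjacent cylinders.
- Blocks in the same cylinder can be read without moving the heads, so no additional seek is needed between them.
- In a relational database, related data (for example, the blocks of one relation) should therefore be stored in the same cylinder.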

13.3.3 Using Multiple Disks
- If the disk controller supports multiple disks and schedules them efficiently, using multiple disks can improve performance significantly.
- When a relation is striped across multiple disks, its chunks can be retrieved in parallel, improving performance by up to a factor of n, where n is the number of disks the data is striped over.
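The factor-of-n speedup can be illustrated with a small sketch. It assumes a simple round-robin layout (block i goes to disk i mod n), which the slides do not specify; the function names are hypothetical.

```python
def stripe_location(block_no: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes round-robin striping: consecutive logical blocks go to
    consecutive disks.
    """
    return block_no % num_disks, block_no // num_disks


def parallel_read_rounds(num_blocks: int, num_disks: int) -> int:
    """Number of rounds needed to read num_blocks when every disk
    delivers one block per round in parallel (ceiling division)."""
    return -(-num_blocks // num_disks)
```

With 4 disks, reading 8 blocks takes 2 rounds instead of 8 sequential accesses, which is the "up to a factor of n" improvement. The speedup is smaller when the block count is not a multiple of n (9 blocks need 3 rounds).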

13.3.4 Mirroring Disks
- A drawback of striping data across multiple disks is that it increases the chance that some disk fails and data is lost.
- To mitigate this risk, some DBMSs use a disk mirroring configuration.
- Disk mirroring makes each disk a copy of the other disks, so that if any one disk fails, the data is not lost.
- Because every block is available on several disks, reads can be spread over them, and choosing the disk whose head is closest to the requested block can push the read speedup beyond n.
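The closest-head choice can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes the controller knows the current cylinder of each mirror's head.

```python
def pick_mirror(head_positions: list[int], target_cylinder: int) -> int:
    """Return the index of the mirrored disk whose head is nearest
    to the cylinder holding the requested block.

    head_positions[i] is the current cylinder of disk i's head.
    Ties go to the lowest-numbered disk.
    """
    return min(
        range(len(head_positions)),
        key=lambda i: abs(head_positions[i] - target_cylinder),
    )
```

For heads at cylinders 8000, 40000 and 60000 and a request for cylinder 42000, the second disk (index 1) is chosen, since its head only has to travel 2000 cylinders.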

13.3.4 Mirroring Disks (Contd.)

            Advantages                  Disadvantages
Striping    Read/write speedup ~n       Higher risk of failure
            Capacity increased by ~n
Mirroring   Read speedup ~n             High cost per bit
            Reduced failure risk        Slow writes compared to striping
            Fast initial access

13.3.5 Disk Scheduling
- One way to improve disk throughput is to improve disk scheduling, ordering requests so that they can be served more efficiently.
- The elevator algorithm is a simple yet effective disk-scheduling algorithm.
- The algorithm makes the heads of a disk sweep back and forth across the cylinders, similar to how an elevator goes up and down.
- During a sweep, the pending request closest to the heads' current position in the direction of travel is processed next.

13.3.5 Disk Scheduling (Contd.)
- When sweeping outward, the direction of head movement changes only after the request for the largest cylinder has been processed.
- When sweeping inward, the direction of head movement changes only after the request for the smallest cylinder has been processed.

Example:

Cylinder    Time requested (ms)
8000        0
24000       0
56000       0
16000       10
64000       20
40000       30

Cylinder    Time completed (ms)
8000        4.3
24000       13.6
56000       26.9
64000       34.2
40000       45.5
16000       56.8
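The example can be traced with a short simulation. The slides do not give the timing parameters, so the sketch below assumes values chosen to be consistent with the completion times in the example: a seek costs 1 ms plus 1 ms per 4000 cylinders travelled, rotational latency plus transfer costs 4.3 ms per block, the heads start at cylinder 8000 moving toward higher cylinders, and the three requests without a listed time arrive at time 0. The function names are hypothetical, and the scheduler only looks at newly arrived requests after finishing the current one.

```python
ACCESS_MS = 4.3  # assumed rotational latency + transfer time per block


def seek_ms(distance: int) -> float:
    """Assumed seek model: free if the head stays put, otherwise
    1 ms startup plus 1 ms per 4000 cylinders travelled."""
    return 0.0 if distance == 0 else 1.0 + distance / 4000.0


def elevator(requests, start_cylinder):
    """Simulate the elevator algorithm.

    requests: list of (cylinder, arrival_ms) pairs.
    Returns a list of (cylinder, completion_ms) in service order.
    """
    pending = sorted(requests, key=lambda r: r[1])  # ordered by arrival time
    waiting = []          # cylinders of requests that have already arrived
    done = []
    now, head, direction = 0.0, start_cylinder, +1  # +1 = toward higher cylinders
    i = 0
    while i < len(pending) or waiting:
        # Admit every request that has arrived by the current time.
        while i < len(pending) and pending[i][1] <= now:
            waiting.append(pending[i][0])
            i += 1
        if not waiting:
            now = pending[i][1]  # disk is idle until the next arrival
            continue
        # Requests at or beyond the head in the current sweep direction.
        ahead = [c for c in waiting if (c - head) * direction >= 0]
        if not ahead:
            direction = -direction  # nothing left this way: reverse the sweep
            continue
        target = min(ahead, key=lambda c: abs(c - head))
        now += seek_ms(abs(target - head)) + ACCESS_MS
        head = target
        waiting.remove(target)
        done.append((target, round(now, 1)))
    return done


schedule = elevator(
    [(8000, 0), (24000, 0), (56000, 0), (16000, 10), (64000, 20), (40000, 30)],
    start_cylinder=8000,
)
```

Under these assumptions `schedule` lists the cylinders in the order 8000, 24000, 56000, 64000, 40000, 16000 with the same completion times as the example. Note that the request for cylinder 16000 arrives at 10 ms but is served last, because the heads have already passed it and only return on the inward sweep.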

13.3.6 Prefetching and Large-Scale Buffering
- In some cases we can anticipate what data will be needed next.
- We can take advantage of this by prefetching data from the disk into main-memory buffers before the DBMS requests it.
- Since the data is already in memory when it is requested, the DBMS receives it without waiting for a disk access.
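A minimal sketch of the bookkeeping behind prefetching, assuming purely sequential access so that "the data needed next" is simply the following blocks. The class name, the `read_block` callback and the `depth` parameter are all hypothetical. A real system would issue the prefetch reads asynchronously so they overlap with processing; here they are done synchronously to keep the example short.

```python
class PrefetchingReader:
    """Read-ahead buffer for sequential block access (illustrative only)."""

    def __init__(self, read_block, depth=2):
        self.read_block = read_block  # hypothetical function: block number -> data
        self.depth = depth            # how many blocks to fetch ahead
        self.buffer = {}              # block number -> data already in memory
        self.disk_reads = 0
        self.hits = 0                 # requests served straight from the buffer

    def _fetch(self, block_no):
        self.buffer[block_no] = self.read_block(block_no)
        self.disk_reads += 1

    def get(self, block_no):
        if block_no in self.buffer:
            self.hits += 1            # already prefetched: no disk wait
        else:
            self._fetch(block_no)
        data = self.buffer.pop(block_no)
        # Read ahead: make sure the next `depth` blocks are buffered.
        for nxt in range(block_no + 1, block_no + 1 + self.depth):
            if nxt not in self.buffer:
                self._fetch(nxt)
        return data


reader = PrefetchingReader(lambda n: f"block-{n}", depth=2)
blocks = [reader.get(i) for i in range(4)]
```

In this run only the first request misses the buffer; the other three are served from memory. The price is that blocks 4 and 5 are read even though nothing asks for them, which is the usual trade-off when the access pattern is guessed rather than known.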

? Questions ?