SECTIONS 13.1 – 13.3 Sanuja Dabade & Eilbroun Benjamin CS 257 – Dr. TY Lin SECONDARY STORAGE MANAGEMENT.


Presentation Outline
• 13.1 The Memory Hierarchy
  • The Memory Hierarchy
  • Transfer of Data Between Levels
  • Volatile and Nonvolatile Storage
  • Virtual Memory
• 13.2 Disks
  • Mechanics of Disks
  • The Disk Controller
  • Disk Access Characteristics

Presentation Outline (cont’d)
• 13.3 Accelerating Access to Secondary Storage
  • The I/O Model of Computation
  • Organizing Data by Cylinders
  • Using Multiple Disks
  • Mirroring Disks
  • Disk Scheduling and the Elevator Algorithm
  • Prefetching and Large-Scale Buffering

Memory Hierarchy
• A computer system has several components for data storage, each with a different data capacity
• The cost per byte to store data also varies
• Devices with the smallest capacity offer the fastest speed at the highest cost per bit

Memory Hierarchy Diagram
[Figure: the memory hierarchy. Labels: Programs, DBMS; Main Memory; Cache; Disk (used as virtual memory and as the file system); Tertiary Storage.]

Memory Hierarchy
• Cache
  • Lowest level of the hierarchy
  • Data items are copies of certain locations of main memory
  • Sometimes values in the cache are changed and the corresponding changes to main memory are delayed
  • The machine looks in the cache for instructions as well as for the data those instructions use
  • Holds a limited amount of data

Memory Hierarchy (cont’d)
• On a single-processor computer there is no need to update the data in main memory immediately
• On a multiprocessor system, cache updates are written to main memory immediately; this is called write-through
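The difference between delayed updates and write-through can be sketched in a few lines of Python. This is only an illustration: the `Cache` class, its `write_through` flag, and the dictionary standing in for main memory are hypothetical names, not part of the slides.

```python
class Cache:
    """Toy cache in front of a dict that stands in for main memory (hypothetical model)."""

    def __init__(self, memory, write_through):
        self.memory = memory              # backing "main memory"
        self.lines = {}                   # cached copies of memory locations
        self.dirty = set()                # addresses changed in cache but not yet in memory
        self.write_through = write_through

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_through:
            self.memory[addr] = value     # update main memory immediately
        else:
            self.dirty.add(addr)          # delay the update

    def flush(self):
        """Copy all delayed changes back to main memory."""
        for addr in self.dirty:
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()


mem_a, mem_b = {0: 1}, {0: 1}
delayed = Cache(mem_a, write_through=False)
through = Cache(mem_b, write_through=True)
delayed.write(0, 99)
through.write(0, 99)
```

After these writes, `mem_b` already holds the new value while `mem_a` still holds the old one until `delayed.flush()` is called.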

Main Memory
• Everything that happens in the computer, such as instruction execution and data manipulation, works on information that is resident in main memory
• Main memories are random access: one can obtain any byte in the same amount of time

Secondary Storage
• Used to store data and programs when they are not being processed
• More permanent than main memory, as data and programs are retained when the power is turned off
• E.g. magnetic disks (hard disks)

Tertiary Storage
• Holds data volumes in terabytes
• Used for databases much larger than what can be stored on disk

Transfer of Data Between Levels
• Data moves between adjacent levels of the hierarchy
• At the secondary or tertiary levels, accessing the desired data or finding the desired place to store the data takes a lot of time
• Disk is organized into blocks
• Entire blocks are moved to and from a region of main memory called a buffer

Transfer of Data Between Levels (cont’d)
• A key technique for speeding up database operations is to arrange the data so that when one piece of data on a block is needed, it is likely that other data on the same block will be needed at the same time
• The same idea applies to other hierarchy levels

Volatile and Nonvolatile Storage
• A volatile device forgets the data stored on it when the power is turned off
• A nonvolatile device retains its data even when the device is turned off
• All secondary and tertiary devices are nonvolatile; main memory is volatile

Virtual Memory
• Typical software executes in virtual memory
• The address space is typically 32 bits, i.e. 2^32 bytes or 4GB
• Transfer between memory and disk is in terms of blocks

Mechanics of Disks
• Use of secondary storage is one of the important characteristics of a DBMS
• A disk consists of 2 moving pieces:
  1. disk assembly
  2. head assembly
• The disk assembly consists of 1 or more platters
• Platters rotate around a central spindle
• Bits are stored on the upper and lower surfaces of the platters

Mechanics of Disks (cont’d)
• Each surface is organized into tracks
• The tracks that are at a fixed radius from the center form one cylinder
• Tracks are organized into sectors
• Sectors are segments of the circle separated by gaps

Disk Controller
• One or more disks are controlled by a disk controller
• Disk controllers are capable of:
  • Controlling the mechanical actuator that moves the head assembly
  • Selecting the sector from among all those in the cylinder at which the heads are positioned
  • Transferring bits between the desired sector and main memory
  • Possibly buffering an entire track

Disk Access Characteristics
• Accessing (reading/writing) a block requires 3 steps:
  • The disk controller positions the head assembly at the cylinder containing the track on which the block is located. This is the ‘seek time’
  • The disk controller waits while the first sector of the block moves under the head. This is the ‘rotational latency’
  • All the sectors and the gaps between them pass under the head while the disk controller reads or writes data in these sectors. This is the ‘transfer time’
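The three components above add up to the time for one block access. A minimal Python sketch, assuming the average rotational latency is half a rotation; the seek time (6.46ms) and rotation time (8.33ms) are the M747 figures quoted later in these slides, while the 0.13ms transfer time and the function name are assumptions made here for illustration.

```python
def block_access_time_ms(seek_ms, rotation_ms, transfer_ms):
    """Average time to access one block: seek + rotational latency + transfer."""
    # Assumption: on average the head waits half a rotation for the block to arrive.
    rotational_latency_ms = rotation_ms / 2
    return seek_ms + rotational_latency_ms + transfer_ms


# seek and rotation values come from the slides; transfer_ms=0.13 is an assumed value
avg_ms = block_access_time_ms(seek_ms=6.46, rotation_ms=8.33, transfer_ms=0.13)
print(f"average access time: about {avg_ms:.3f} ms")
```

Under these assumptions the result is close to the 10.76ms average latency cited from Ex 13.2.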

13.3 Accelerating Access to Secondary Storage
• Several approaches for accessing data in secondary storage more efficiently:
  • Place blocks that are accessed together on the same cylinder.
  • Divide the data among multiple disks.
  • Mirror disks.
  • Use disk-scheduling algorithms.
  • Prefetch blocks into main memory.
• Scheduling Latency – added delay in accessing data caused by a disk scheduling algorithm.
• Throughput – the number of disk accesses per second that the system can accommodate.

The I/O Model of Computation
• The number of block accesses (disk I/Os) is a good approximation of the running time of an algorithm, so it should be minimized.
• Ex 13.3: You want an index on R that identifies the block on which the desired tuple appears, but not where on the block it resides.
  • For the Megatron 747 (M747) example, it takes 11ms to read a 16KB block.
  • A standard microprocessor can execute millions of instructions in 11ms, making any delay in searching the block for the desired tuple negligible.
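Under the I/O model, an algorithm's running time can be estimated by counting block accesses and ignoring CPU work. A small Python sketch of that estimate, using the 11ms block read time from the M747 example; the function name and the block counts in the comparison are hypothetical.

```python
BLOCK_READ_MS = 11.0  # time to read one 16KB block (M747 example)


def estimated_time_ms(block_accesses, block_read_ms=BLOCK_READ_MS):
    """I/O-model estimate: running time is proportional to the number of block accesses."""
    return block_accesses * block_read_ms


# Hypothetical comparison: scanning a 1000-block relation vs. an index lookup touching 3 blocks
scan_ms = estimated_time_ms(1000)
index_ms = estimated_time_ms(3)
print(scan_ms, index_ms)
```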

Organizing Data by Cylinders
• If we read all blocks on a single track or cylinder consecutively, we can neglect all but the first seek time and the first rotational latency.
• Ex 13.4: We request 1024 blocks of the M747.
  • If the data is randomly distributed, the average latency per block is 10.76ms by Ex 13.2, making the total latency about 11s (1024 * 10.76ms).
  • If all blocks are stored consecutively on 1 cylinder: 6.46ms + 8.33ms * 16 ≈ 139.7ms, i.e. (1 average seek) + (time per rotation) * (# rotations)
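The arithmetic of Ex 13.4 can be checked with a short Python sketch. The constants are the M747 figures quoted in the example; the function names are made up for illustration.

```python
SEEK_MS = 6.46            # average seek time
ROTATION_MS = 8.33        # time for one full rotation
RANDOM_ACCESS_MS = 10.76  # average latency for one randomly placed block (Ex 13.2)


def random_read_ms(num_blocks):
    """Every block pays a full seek, rotational latency, and transfer."""
    return num_blocks * RANDOM_ACCESS_MS


def cylinder_read_ms(rotations):
    """One seek, then the data streams past the heads for the given number of rotations."""
    return SEEK_MS + ROTATION_MS * rotations


scattered_ms = random_read_ms(1024)  # roughly 11 seconds
clustered_ms = cylinder_read_ms(16)  # well under a second
print(f"speedup: about {scattered_ms / clustered_ms:.0f}x")
```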

Using Multiple Disks
• If we have n disks, read/write performance can increase by up to a factor of n.
• Striping – distributing a relation R across multiple disks following this pattern:
  • Data on disk 1: R_1, R_{1+n}, R_{1+2n}, …
  • Data on disk 2: R_2, R_{2+n}, R_{2+2n}, …
  • …
  • Data on disk n: R_n, R_{n+n}, R_{n+2n}, …
• Ex 13.5: We request 1024 blocks with n = 4: 6.46ms + (8.33ms * (16/4)) ≈ 39.8ms, i.e. (1 average seek) + (time per rotation) * (# rotations per disk)
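The striping pattern is a round-robin assignment of blocks to disks. A minimal Python sketch, assuming blocks and disks are numbered from 1; the helper names are hypothetical, and the timing function simply restates the Ex 13.5 arithmetic under the assumption that all disks work in parallel.

```python
SEEK_MS = 6.46      # average seek time (M747)
ROTATION_MS = 8.33  # time for one full rotation (M747)


def disk_for_block(i, n):
    """Disk (1..n) that holds block R_i under round-robin striping."""
    return (i - 1) % n + 1


def blocks_on_disk(d, n, total_blocks):
    """Block numbers stored on disk d: d, d+n, d+2n, ..."""
    return list(range(d, total_blocks + 1, n))


def striped_read_ms(rotations, n):
    """One seek per disk (in parallel), then each disk reads 1/n of the rotations."""
    return SEEK_MS + ROTATION_MS * (rotations / n)


print(blocks_on_disk(2, 4, 12))
print(striped_read_ms(16, 4))
```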

Mirroring Disks
• Mirroring Disks – having 2 or more disks hold identical copies of the data.
• Benefit 1: If n disks are mirrors of each other, the system can survive a crash of n-1 disks.
• Benefit 2: If we have n disks, read performance can increase by up to a factor of n.
• Performance increases further if, for each read, the controller selects the disk whose head is closest to the desired data block.
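Selecting the mirror whose head is closest to the requested data could look like the following Python sketch. It assumes head positions and targets are expressed as cylinder numbers; the function name and the sample positions are hypothetical.

```python
def pick_mirror(head_positions, target_cylinder):
    """Index of the mirror disk whose head is nearest the target cylinder.

    head_positions[d] is the current cylinder of disk d's head (assumed model).
    Ties go to the lowest-numbered disk.
    """
    return min(
        range(len(head_positions)),
        key=lambda d: abs(head_positions[d] - target_cylinder),
    )


heads = [1000, 9000, 40000]       # hypothetical current head positions
chosen = pick_mirror(heads, 8000)
heads[chosen] = 8000              # the chosen head moves to serve the read
```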

Disk Scheduling and the Elevator Problem
• The disk controller runs this algorithm to select which of several requests to process first.
• Pseudo code:
    requests[]  // array of all non-processed data requests
    upon receiving new data request:
        requests[].add(new request)
    while (requests[] is not empty):
        move head to next location
        if (head location is at data in requests[]):
            retrieve data
            remove data from requests[]
        if (head reaches end):
            reverse head direction
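A runnable Python sketch of the elevator idea follows. It simplifies the pseudo code in two ways that are assumptions of this sketch: all requests are known up front, and the head reverses as soon as no requests remain ahead of it rather than travelling to the end of the disk. The cylinder numbers in the usage lines are hypothetical.

```python
def elevator_order(start, requests, direction=1):
    """Order in which an elevator sweep serves the requested cylinders.

    direction > 0 means the head is initially moving toward higher cylinders.
    """
    pending = sorted(requests)
    if direction > 0:
        ahead = [c for c in pending if c >= start]             # served on the way up
        behind = [c for c in pending if c < start][::-1]       # served after reversing
    else:
        ahead = [c for c in pending if c <= start][::-1]       # served on the way down
        behind = [c for c in pending if c > start]             # served after reversing
    return ahead + behind


def total_travel(start, order):
    """Total number of cylinders the head crosses when serving requests in this order."""
    distance, position = 0, start
    for cylinder in order:
        distance += abs(cylinder - position)
        position = cylinder
    return distance


arrivals = [8000, 24000, 56000, 16000, 64000, 40000]  # hypothetical request order
sweep = elevator_order(30000, arrivals)
print(sweep)
print(total_travel(30000, sweep), "vs FIFO", total_travel(30000, arrivals))
```

Comparing `total_travel` for the sweep order and for the arrival (FIFO) order shows why the elevator algorithm tends to reduce head movement.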

Disk Scheduling and the Elevator Problem (cont’d)
[Figure: event timeline for the elevator example. Events: head starting point, data requests (the first at 8000), and data retrievals (the first at 8000). Recorded current times include 10, 13.6, 20, 26.9, 30, 34.2, and 45.5. Table columns: data, time.]

Disk Scheduling and the Elevator Problem (cont’d)
[Figure: two tables with columns data and time, comparing the Elevator Algorithm with the FIFO Algorithm.]

Prefetching and Large-Scale Buffering
• If, at the application level, we can predict the order in which blocks will be requested, we can load them into main memory before they are needed.
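One way to sketch prefetching is a background thread that reads blocks in the predicted order into a small buffer while the application processes earlier blocks. The Python sketch below uses the standard `queue` and `threading` modules; `prefetching_reader`, `read_block`, and `depth` are hypothetical names, and the lambda stands in for a real disk read.

```python
import queue
import threading


def prefetching_reader(read_block, block_ids, depth=2):
    """Yield blocks in the predicted order while a worker thread reads ahead.

    read_block is any callable that fetches one block (assumed to be slow I/O);
    depth is the number of blocks buffered in main memory ahead of the consumer.
    """
    buffer = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the predicted sequence

    def worker():
        for block_id in block_ids:
            buffer.put(read_block(block_id))  # blocks when the buffer is full
        buffer.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            break
        yield item


# Stand-in for a disk read: "block" i is just the string "block-i"
blocks = list(prefetching_reader(lambda i: f"block-{i}", [3, 1, 2]))
```

With `depth=2` this behaves like double buffering: one block can be processed while the next is being read.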

Questions