1
CS522 Advanced Database Systems
4. Disks
Huiping Guo
Department of Computer Science
California State University, Los Angeles
1/26/2018
2
Review
- File organizations
- Index
  - Data entry: format, structure
  - Index entry
  - Classification
4. Disks CS522_S16
3
Disks and files
- A DBMS stores information on hard disks
- This has major implications for DBMS design!
  - READ: transfer data from disk to main memory (RAM)
  - WRITE: transfer data from RAM to disk
  - Both are high-cost operations relative to in-memory operations
4
Disks and files (cont.)
- Why not store everything in main memory?
  - It costs too much
  - Main memory is volatile
- Typical storage hierarchy
  - RAM for currently used data
  - Disk for the main database (secondary storage): random access
  - Tapes for archiving older versions of the data (tertiary storage): sequential access
5
Disks
- Secondary storage
- Advantage over tapes: random access vs. sequential access
- Data is stored and retrieved in units called pages
6
Components of a disk
[Figure: components of a disk, with labels for the spindle, platters, surfaces, tracks, sectors, blocks, cylinder, disk head, arm assembly, and arm movement]
7
Components of a disk
- Data blocks
  - Data is stored on disk in units called disk blocks
  - A data block is a contiguous sequence of bytes and is the unit in which data is written to and read from a disk
- Tracks
  - Blocks are arranged in concentric rings called tracks
- Platters
  - Tracks are on one or more platters
- Surface
  - Tracks can be recorded on one or both surfaces of a platter (single-sided or double-sided platter)
8
Components of a disk
- Cylinder
  - The set of all tracks with the same diameter is shaped like a cylinder
  - A cylinder contains one track per platter surface
- Sectors
  - Each track is divided into arcs, called sectors
  - Sector size is a characteristic of the disk
  - A block contains multiple sectors
- Disk head
  - There is a disk head for each surface
  - An array of disk heads moves as a unit: when one head is positioned over a block, the other heads are in identical positions with respect to their platters
  - To read a block, a disk head must be positioned on top of the block
  - At most one disk head is allowed to read or write at a time
10
Summary of disk components
- Platter, surface, head
- Cylinder, track
- Block, sector
- A block stores a data page
11
Disk access time
- Seek time: moving the arm to position the disk head on the track
- Rotational delay: waiting for the block to rotate under the head
- Transfer time: actually moving data to or from the disk surface
  - Transfer time = block size / average transfer rate
- Access time = seek time + rotational delay + transfer time
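The access-time formula can be written as a small Python function. This is only a sketch: the function name, the parameter names, and the sample numbers are illustrative assumptions, not values from the slides.

```python
def access_time_ms(seek_ms, rotational_delay_ms, block_bytes, transfer_rate_bytes_per_ms):
    """Access time = seek time + rotational delay + transfer time (all in msec)."""
    # Transfer time = block size / average transfer rate
    transfer_ms = block_bytes / transfer_rate_bytes_per_ms
    return seek_ms + rotational_delay_ms + transfer_ms


# Hypothetical disk: 10 msec seek, 5 msec rotational delay,
# and a 4 KB block moved at 4096 bytes per msec.
example_ms = access_time_ms(10, 5, 4096, 4096)
print(example_ms)
```

With these sample numbers the transfer takes 1 msec, so seek and rotation account for almost all of the total.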
12
Reduce I/O cost
- Seek time and rotational delay dominate
  - Seek time varies from about 1 to 20 msec
  - Rotational delay varies from 0 to 10 msec
  - Transfer time is about 1 msec per 4KB page
- Key to lower I/O cost: reduce seek and rotational delays!
13
Arranging pages on disk
- 'Next' block concept: blocks on the same track, followed by blocks on the same cylinder, followed by blocks on an adjacent cylinder
- To minimize seek and rotational delay, blocks in a file should be arranged sequentially on disk (by 'next')
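One way to picture the 'next' ordering is to enumerate block addresses track by track within a cylinder, then cylinder by cylinder. The sketch below assumes a hypothetical (cylinder, surface, block) addressing scheme; the function name and the tuple layout are not from the slides.

```python
def blocks_in_next_order(num_cylinders, surfaces_per_cylinder, blocks_per_track):
    """Yield hypothetical (cylinder, surface, block) addresses in 'next' order."""
    for cylinder in range(num_cylinders):            # adjacent cylinders come last
        for surface in range(surfaces_per_cylinder): # then other tracks of the same cylinder
            for block in range(blocks_per_track):    # blocks on the same track come first
                yield (cylinder, surface, block)


order = list(blocks_in_next_order(2, 2, 2))
print(order)
```

A file laid out in this order needs an arm movement only when it crosses from one cylinder to the next.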
14
Exercise 1
A disk with
- a sector size of 512 bytes
- 2000 tracks per surface
- 50 sectors per track
- five double-sided platters
- an average seek time of 10 msec
Questions
1. What is the capacity of a track? Of the disk?
2. How many cylinders does the disk have?
3. If the disk platters rotate at 5400 revolutions per minute, what is the maximum rotational delay?
4. If one track of data can be transferred per revolution, what is the transfer rate?
15
Answers
1. Capacity of a track = 512 x 50 = 25,600 bytes = 25K
   Capacity of the disk = 2000 x 5 x 2 x 25K = 500,000K
2. 2000 cylinders
3. Maximum rotational delay = (1/5400) x 60 = 0.011 seconds
4. Transfer rate = 25K/0.011 sec = 2,250K/second (using the unrounded delay of 1/90 sec)
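The Exercise 1 arithmetic can be checked with a few lines of Python. The variable names are illustrative, and the sketch assumes K means 1024 bytes, which is what makes 512 x 50 bytes come out as 25K.

```python
SECTOR_BYTES = 512
TRACKS_PER_SURFACE = 2000
SECTORS_PER_TRACK = 50
SURFACES = 5 * 2          # five double-sided platters
RPM = 5400
K = 1024                  # assumption: K = 1024 bytes

track_bytes = SECTOR_BYTES * SECTORS_PER_TRACK               # capacity of one track
disk_bytes = track_bytes * TRACKS_PER_SURFACE * SURFACES     # capacity of the disk
cylinders = TRACKS_PER_SURFACE                               # one cylinder per track position
max_rotational_delay_s = 60 / RPM                            # one full revolution
transfer_rate_bytes_per_s = track_bytes * RPM / 60           # one track per revolution

print(track_bytes / K, "K per track")
print(disk_bytes / K, "K per disk")
print(cylinders, "cylinders")
print(round(max_rotational_delay_s, 4), "sec maximum rotational delay")
print(transfer_rate_bytes_per_s / K, "K per second")
```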
16
Exercise 2
- The same disk specification as in Exercise 1
- Block size is 1024 bytes
- A file contains 100,000 records of 100 bytes each
- No record is allowed to span two blocks
Questions
1. How many records fit onto a block?
2. How long does it take to read the file sequentially?
3. How long does it take to read the file in random order?
17
Answers
1. A block holds 1024/100 = 10 records (rounded down, since no record spans two blocks)
2. A track has 25 blocks
   - The file needs 100,000/(10 x 25) = 400 tracks (40 cylinders, at 10 tracks per cylinder)
   - One track of data can be transferred in 0.011 sec, so it takes 0.011 x 400 = 4.4 sec to transfer 400 tracks of data
   - This access seeks 40 times (once per cylinder), so the seek time is 40 x 0.01 = 0.4 sec
   - Total access time = 4.4 + 0.4 = 4.8 sec
18
Answers (cont.)
3. For any block, access time = seek time + rotational delay + transfer time
   - Seek time = 10 msec
   - Rotational delay = 0.011/2 sec, about 6 msec
   - Transfer time = 1K/(2,250K/sec) = 0.44 msec
   - The access time for a block of data is 10 + 6 + 0.44 = 16.44 msec
   - The file contains 100,000/10 = 10,000 blocks, so the access time is 10,000 x 16.44 msec = 164.4 sec
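A short Python sketch of the Exercise 2 calculation, using unrounded intermediate values. The variable names are illustrative assumptions. Because nothing is rounded along the way, the results differ slightly from the slide figures of 4.8 sec and 164.4 sec, which round the revolution time to 0.011 sec and the average rotational delay to 6 msec.

```python
import math

TRACK_BYTES = 512 * 50          # from Exercise 1
TRACKS_PER_CYLINDER = 5 * 2     # five double-sided platters
REVOLUTION_S = 60 / 5400        # time for one revolution
SEEK_S = 0.010                  # average seek time
BLOCK_BYTES = 1024
RECORD_BYTES = 100
NUM_RECORDS = 100_000

records_per_block = BLOCK_BYTES // RECORD_BYTES      # no record spans two blocks
blocks_per_track = TRACK_BYTES // BLOCK_BYTES
num_blocks = math.ceil(NUM_RECORDS / records_per_block)
num_tracks = math.ceil(num_blocks / blocks_per_track)
num_cylinders = math.ceil(num_tracks / TRACKS_PER_CYLINDER)

# Sequential read: one revolution per track plus one seek per cylinder.
sequential_s = num_tracks * REVOLUTION_S + num_cylinders * SEEK_S

# Random read: every block pays a seek, half a revolution on average, and its transfer.
transfer_s = BLOCK_BYTES / (TRACK_BYTES / REVOLUTION_S)
random_s = num_blocks * (SEEK_S + REVOLUTION_S / 2 + transfer_s)

print(records_per_block, blocks_per_track, num_tracks, num_cylinders)
print(round(sequential_s, 2), "sec sequential")
print(round(random_s, 1), "sec random")
```

Either way, reading in random order is more than thirty times slower than reading sequentially.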
19
RAID
- Disks are potential bottlenecks for system performance and storage system reliability
  - Disk performance has improved slowly compared with processor performance
  - Disks have much higher failure rates than the electronic parts of a computer system
- Disk array: an arrangement of several disks that gives the abstraction of a single, large disk
- Goals: increase performance and reliability
20
Two main techniques
- Improve performance: data striping
  - Data is partitioned; the size of a partition is called the striping unit
  - Partitions are distributed over several disks
- Improve reliability: redundancy
  - More disks => more failures
  - Redundant information allows reconstruction of data if a disk fails
  - Redundancy level
- RAID: Redundant Array of Independent Disks
  - Disk arrays that implement the two techniques
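A minimal sketch of how striping might place partitions. The slide only says that partitions are distributed over several disks; the round-robin rule used here (partition i goes to disk i mod D) is a common choice and is an assumption, as are the function name and the return layout.

```python
def locate_partition(partition_no, num_disks):
    """Return (disk index, position on that disk) for a partition.

    Assumes round-robin placement: partition i is stored on disk i mod D.
    """
    return partition_no % num_disks, partition_no // num_disks


placement = [locate_partition(i, 4) for i in range(6)]
print(placement)
```

Under this placement, consecutive partitions sit on different disks, so a large request can be served by several disks in parallel.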
21
Structure of a DBMS
[Figure: structure of a DBMS. Labels: query, results, query evaluation engine (Parser, Optimizer, Operator Evaluator, Plan Executor), File and Access Methods, Buffer Manager, Disk Space Manager, Recovery Manager, Transaction, Lock, Concurrency control, Database]
22
Disk space management
- Lowest layer of the DBMS
  - Supports the concept of a page as a unit of data
  - Provides commands to allocate or deallocate a page and to read or write a page
- Manages space on disk
  - Keeps track of which disk blocks are in use
  - Keeps track of which pages are on which disk blocks
- How?
  - Maintain a list of free blocks
  - Or maintain a bitmap with one bit for each disk block
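The bitmap approach can be sketched in Python as follows. The class and method names are hypothetical, a list of booleans stands in for the bitmap, and the first-free-block allocation rule is an assumption made for illustration.

```python
class DiskSpaceManager:
    """Toy disk space manager: one in-use flag per disk block."""

    def __init__(self, num_blocks):
        self.in_use = [False] * num_blocks   # the "bitmap": True means the block is in use
        self.page_to_block = {}              # which page lives on which disk block

    def allocate_page(self, page_id):
        try:
            block = self.in_use.index(False)  # first free block (an assumed policy)
        except ValueError:
            raise RuntimeError("no free disk block") from None
        self.in_use[block] = True
        self.page_to_block[page_id] = block
        return block

    def deallocate_page(self, page_id):
        block = self.page_to_block.pop(page_id)
        self.in_use[block] = False


dsm = DiskSpaceManager(3)
dsm.allocate_page("A")
dsm.allocate_page("B")
dsm.deallocate_page("A")
print(dsm.allocate_page("C"), dsm.in_use)
```

In the usage lines, page C reuses the block freed by page A, which is the point of tracking free space.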
23
Buffer manager
- The buffer manager is a software layer responsible for
  - Bringing pages from disk to main memory as needed
  - Managing the available main memory
- Buffer pool
  - The available main memory is partitioned into a collection of pages, called the buffer pool
  - The main memory pages in the buffer pool are called frames
  - One frame holds one data page
24
Buffer manager
[Figure: page requests from higher levels arrive at the buffer pool in main memory, whose frames are either free or hold a disk page read from the DB on disk; the choice of frame is dictated by the replacement policy. P318]
25
Buffer manager (BM) and higher-level code (HLC)
- HLC needs a page
  - Asks BM for the page
  - BM brings the page into a frame if it is not already in the buffer pool
- HLC no longer needs a page
  - Asks BM to release the frame
  - The frame can be reused
  - HLC also needs to tell BM whether the page has been modified
  - BM ensures that any modification is propagated to the copy of the page on disk
26
Information the buffer manager keeps
- A table of <frame#, pageid> pairs
- For each frame, two variables
  - pin_count: the number of current users of the page; initially 0 for every frame
  - dirty: whether the page has been modified since it was brought into the buffer pool
27
When a page is requested ...
- BM checks the buffer pool to see if some frame contains the requested page
- If the page is in the pool
  - Increase its pin_count (pinning the page)
- If the requested page is not in the pool
  - Choose a frame for replacement and increase its pin_count (pinning the page)
  - If the frame is dirty, write it to disk
  - Read the requested page into the chosen frame
- Return the address of the frame containing the requested page to the requestor
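The request and release steps can be sketched in Python. Everything named here is an illustrative assumption: a dict stands in for the disk, the frame is chosen by taking a free frame first and otherwise the first frame with pin_count 0, and the sketch raises an error where a real buffer manager would wait for a page to be released.

```python
class Frame:
    def __init__(self):
        self.page_id = None   # None means the frame is free
        self.data = None
        self.pin_count = 0
        self.dirty = False


class BufferManager:
    def __init__(self, num_frames, disk):
        self.frames = [Frame() for _ in range(num_frames)]
        self.disk = disk          # hypothetical stand-in: dict of page_id -> data
        self.page_table = {}      # page_id -> frame number

    def _choose_frame(self):
        for i, f in enumerate(self.frames):   # prefer a free frame
            if f.page_id is None:
                return i
        for i, f in enumerate(self.frames):   # else any unpinned frame (placeholder policy)
            if f.pin_count == 0:
                return i
        raise RuntimeError("no candidate frame; a real BM would wait for a release")

    def request_page(self, page_id):
        if page_id in self.page_table:        # already in the pool: just pin it
            frame = self.frames[self.page_table[page_id]]
            frame.pin_count += 1
            return frame
        i = self._choose_frame()
        frame = self.frames[i]
        if frame.page_id is not None:
            if frame.dirty:                   # propagate changes before reuse
                self.disk[frame.page_id] = frame.data
            del self.page_table[frame.page_id]
        frame.page_id, frame.data = page_id, self.disk[page_id]
        frame.dirty, frame.pin_count = False, 1
        self.page_table[page_id] = i
        return frame

    def release_page(self, page_id, modified=False):
        frame = self.frames[self.page_table[page_id]]
        frame.pin_count -= 1
        frame.dirty = frame.dirty or modified
```

A caller would pair each request_page with a release_page, passing modified=True when it has changed the page so that the dirty flag is set.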
28
Choose a frame for replacement
- Candidate frames for replacement
  - Free frames
  - Frames with pin_count = 0
- No candidate frames
  - Wait until some page is released
29
Buffer replacement policy: LRU
- Least recently used (LRU)
  - Use a queue of pointers to frames with pin_count = 0
  - A frame is added to the end of the queue when it becomes a candidate for replacement
  - The page chosen for replacement is the one in the frame at the head of the queue
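The LRU queue can be sketched with Python's collections.deque. The class and method names are hypothetical, and frame numbers stand in for pointers to frames. Removing a frame from the queue when it is pinned again is an assumption that keeps the queue limited to frames with pin_count 0.

```python
from collections import deque


class LRUQueue:
    """Queue of frame numbers whose pin_count is 0, oldest candidate at the head."""

    def __init__(self):
        self.queue = deque()

    def on_unpinned(self, frame_no):
        # The frame's pin_count just dropped to 0: it becomes a candidate.
        self.queue.append(frame_no)

    def on_pinned(self, frame_no):
        # The frame is in use again, so it is no longer a candidate.
        if frame_no in self.queue:
            self.queue.remove(frame_no)

    def choose_victim(self):
        # The frame at the head of the queue is replaced; None if no candidate.
        return self.queue.popleft() if self.queue else None


lru = LRUQueue()
for frame_no in (3, 1, 2):
    lru.on_unpinned(frame_no)
lru.on_pinned(1)
print(lru.choose_victim(), lru.choose_victim(), lru.choose_victim())
```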
30
Buffer replacement policy: Clock
- Frames are arranged in a circle, like a clock face
- The "current" variable (1 to N) is like the clock hand moving across the face
- Each frame also has an associated reference bit, which is turned on when the page's pin_count goes to 0
- The current frame is considered for replacement
  - If the frame has pin_count > 0, current++
  - If the frame has pin_count = 0 and its reference bit is turned on, turn the bit off, current++
    - A recently referenced page is less likely to be replaced
  - If the frame has pin_count = 0 and its reference bit is turned off, it is chosen for replacement
- If the frame is not chosen for replacement, current++ and the next frame is considered
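The clock sweep can be sketched in Python as below. The names are hypothetical, frames are numbered 0 to N-1 rather than 1 to N, and limiting the sweep to two full turns before giving up is an assumption (the slides say to wait when there is no candidate).

```python
class ClockReplacer:
    def __init__(self, num_frames):
        self.pin_count = [0] * num_frames
        self.ref_bit = [False] * num_frames
        self.current = 0                      # the clock hand

    def pin(self, frame_no):
        self.pin_count[frame_no] += 1

    def unpin(self, frame_no):
        self.pin_count[frame_no] -= 1
        if self.pin_count[frame_no] == 0:
            self.ref_bit[frame_no] = True     # turned on when pin_count reaches 0

    def choose_victim(self):
        n = len(self.pin_count)
        # Two turns are enough: the first clears reference bits, the second
        # finds an unpinned frame if one exists.
        for _ in range(2 * n):
            i = self.current
            if self.pin_count[i] == 0:
                if not self.ref_bit[i]:
                    return i                  # unpinned and bit off: chosen
                self.ref_bit[i] = False       # recently referenced: second chance
            self.current = (self.current + 1) % n
        return None                           # every frame is pinned


clock = ClockReplacer(3)
for frame_no in range(3):
    clock.pin(frame_no)
clock.unpin(0)
clock.unpin(2)
print(clock.choose_victim())
```

In the usage lines, frames 0 and 2 each get a second chance on the first pass because their reference bits are on, and the hand comes back to frame 0 on the second pass.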
31
Buffer replacement policy
- Other policies
  - Most recently used (MRU)
  - FIFO
  - Random
32
DBMS vs. OS file system
- The OS does disk space and buffer management: why not let the OS manage these tasks?
- Differences in OS support: portability issues
- Some limitations, e.g., files can't span disks
- Buffer management in a DBMS requires the ability to
  - pin a page in the buffer pool and force a page to disk (important for implementing concurrency control and recovery)
  - adjust the replacement policy and pre-fetch pages based on access patterns in typical DB operations