Published by Melinda Sherman. Modified over 8 years ago.
1
1 CSE232A: Database System Principles. Hardware
2
Database System Architecture
[Figure: architecture diagram. Query Processing: SQL query, Parser, relational algebra, Query Rewriter and Optimizer (View definitions; Statistics & Catalogs & System Data), query execution plan, Execution Engine. Transaction Management: Calls from Transactions (read, write), Transaction Manager, Concurrency Controller (Lock Table), Recovery Manager (Log). Shared: Buffer Manager, Data + Indexes.]
This lecture: hardware aspects of storing and retrieving data
3
3 Memory Hierarchy
Cache memory
– On-chip and L2
– Caching is outside the control of the DB system
RAM
– Addressable space includes virtual memory, but DB systems avoid it
Disk
– Access speed & transfer rate
– Winchester, arrays, …
Tertiary storage
– Tapes, jukeboxes, DVDs
The levels differ in cost per byte, capacity, and access speed.
4
4 Storage Cost
[Figure: typical capacity (bytes, 10^3 to 10^15) plotted against access time (sec, 10^-9 to 10^3) for cache, electronic main, electronic secondary, magnetic/optical disks, online tape, nearline tape & optical disks, offline tape]
5
5 Storage Cost
[Figure: dollars/MB (10^-4 to 10^4) plotted against access time (sec, 10^-9 to 10^3) for cache, electronic main, electronic secondary, magnetic/optical disks, online tape, nearline tape & optical disks, offline tape; from Gray & Reuter]
6
6 Volatile vs. Non-Volatile Storage
Persistence is important for transaction atomicity and durability.
Even if the database fits in main memory, changes have to be written to non-volatile storage:
– Hard disk
– RAM disks w/ battery
– Flash memory
7
7 Cost of Disk Access: a non-trivial part of estimating performance on secondary storage
How many blocks were accessed? Were they clustered/consecutive?
Such complexities also apply to flash, and even to main memory.
Learn to analyze them when you design the next generation of secondary-storage data structures.
8
8 Moore’s Law: Different Rates of Improvement Lead to Reconsiderations
Quantities improving at different rates: processor speed, main memory bit/$, disk bit/$, RAM access speed, disk access speed, disk transfer rate.
[Figure: disk transfer rate relative to disk access time]
Clustered/sequential access-based algorithms become relatively better.
9
9 Moore’s Law: Same Phenomenon Applies to RAM
[Figure: RAM transfer rate relative to RAM access time]
Algorithms that access memory sequentially have better constant factors than algorithms that access it randomly.
10
10 Moore’s Law: Different Rates of Improvement
[Figure labels: cache capacity, RAM capacity, disk access time]
The cost of a “miss” increases.
11
11 Focus on: “Typical Disk”
Terms: Platter, Head, Actuator; Cylinder, Track; Sector (physical), Block (logical), Gap
[Figure: disk attached through a disk controller to the bus]
12
12 Top View
[Figure labels: Track, Gap, Sector, Block (typically multiple sectors)]
Tracks often have different numbers of sectors.
13
13 “Typical” Numbers
Diameter: 1 inch to 15 inches
Cylinders: 100 to 20000
Surfaces (tracks/cyl): 1 (CDs), 2 (floppies), 5 (typical hd), up to 30
Sector size: 512 B to 50 KB
Capacity: 360 KB (old floppy) to 200 GB
14
14 “I want block X”: is block X in memory?
Key performance metric: time to fetch a block
Time = Seek Time (locate track) + Rotational Delay (locate sector) + Transfer Time (fetch block) + Other (disk controller, …)
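The four-term cost model above can be written down as a small calculator. This is only a sketch: the class name `DiskModel`, its field names, and the sample parameter values are illustrative assumptions, not something taken from the slides. It assumes the average rotational delay is half a revolution and treats 1 MB as 10^6 bytes.

```python
from dataclasses import dataclass


@dataclass
class DiskModel:
    """Hypothetical disk parameters (names are illustrative only)."""
    rpm: float                 # rotation speed in revolutions per minute
    transfer_rate_mb_s: float  # transfer rate t in MB/s (1 MB = 10**6 bytes here)
    avg_seek_ms: float         # average seek time S in milliseconds
    other_ms: float = 0.0      # controller/bus overhead, often taken as 0

    def rotation_ms(self) -> float:
        # Time for one full revolution.
        return 60_000.0 / self.rpm

    def transfer_ms(self, block_bytes: int) -> float:
        # Transfer time = block size / t.
        return block_bytes / (self.transfer_rate_mb_s * 1e6) * 1000.0

    def random_fetch_ms(self, block_bytes: int) -> float:
        # Seek + average rotational delay (half a revolution) + transfer + other.
        return (self.avg_seek_ms + self.rotation_ms() / 2
                + self.transfer_ms(block_bytes) + self.other_ms)


# Assumed example values, chosen only to exercise the formula.
disk = DiskModel(rpm=7200, transfer_rate_mb_s=3.0, avg_seek_ms=10.0)
fetch_ms = disk.random_fetch_ms(4096)
print(f"random fetch of a 4096-byte block: {fetch_ms:.2f} ms")
```

With these assumed values the seek and rotational terms dominate the transfer term, which is the point the following slides build on.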
15
15 Seek Delay
[Figure: the track where the head is, and the track where the head must go]
16
16 Rotational Delay
[Figure: the head position (“Head Here”) and the position of the wanted block (“Block I Want”) on the same track]
17
17 Seek Time
[Figure: seek time plotted against cylinders traveled (1 to N); time-axis labels x and 3x or 5x; annotation: few ms]
18
18 Average Random Seek Time
$$S = \frac{\sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \mathrm{SEEKTIME}(i \to j)}{N(N-1)}$$
“Typical” S: 10 ms to 40 ms
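The average over all ordered cylinder pairs can be computed without enumerating every pair. The sketch below assumes the seek time depends only on the distance d = |j − i|, in which case there are 2(N − d) ordered pairs at each distance. The function names and the linear seek model are hypothetical, chosen for illustration.

```python
from typing import Callable


def average_seek_time(n_cylinders: int,
                      seek_by_distance: Callable[[int], float]) -> float:
    """Average SEEKTIME(i -> j) over all ordered pairs with i != j.

    Assumption: seek time depends only on the distance d = |j - i|.
    There are 2 * (N - d) ordered pairs at distance d.
    """
    n = n_cylinders
    total = sum(2 * (n - d) * seek_by_distance(d) for d in range(1, n))
    return total / (n * (n - 1))


def average_seek_time_brute_force(n_cylinders: int,
                                  seek_by_distance: Callable[[int], float]) -> float:
    """Direct double sum from the definition; O(N^2), for checking small N only."""
    n = n_cylinders
    total = sum(seek_by_distance(abs(j - i))
                for i in range(1, n + 1)
                for j in range(1, n + 1)
                if i != j)
    return total / (n * (n - 1))


def linear_seek_us(distance: int) -> float:
    # Hypothetical model: fixed startup cost plus a cost per cylinder traveled.
    return 1000.0 + distance


print(average_seek_time(10, linear_seek_us))
print(average_seek_time_brute_force(10, linear_seek_us))
```

For a linear model like this one, the mean distance over all pairs works out to (N + 1)/3 cylinders, so the average seek is the startup cost plus roughly a third of the full stroke.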
19
19 Average Rotational Delay
R = 1/2 revolution
“Typical” R = 4.17 ms (7200 RPM; a full revolution takes 8.33 ms)
Assume we have to start reading from the start of the first sector.
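As a quick check of the arithmetic, the sketch below converts a rotation speed into a full-revolution time and an average rotational delay of half a revolution. The function names and the list of sample speeds are illustrative assumptions.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_delay_ms(rpm: float) -> float:
    """On average the wanted sector is half a revolution away."""
    return rotation_time_ms(rpm) / 2.0


# Sample speeds chosen for illustration only.
for rpm in (3600, 5400, 7200, 15000):
    print(f"{rpm:>5} RPM: full revolution {rotation_time_ms(rpm):.2f} ms, "
          f"average delay {average_rotational_delay_ms(rpm):.2f} ms")
```

At 7200 RPM a full revolution takes about 8.33 ms, so the average delay is about 4.17 ms; an 8.33 ms average corresponds to 3600 RPM.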
20
20 Transfer Rate: t
“Typical” t: 1 to 3 MB/second
Transfer time = block size / t
21
21 Other Delays
CPU time to issue I/O
Contention for controller
Contention for bus, memory
“Typical” value: 0
22
22 Homework Practice Problem
Single surface
Rotation speed: 7200 rpm
16,384 tracks
128 sectors/track
4096 bytes/sector
4 sectors/block (16,384 bytes/block)
SEEKTIME(i → j) = [1000 + (j−i)] μs
Neglect gaps
Calculate the minimum, maximum, and average time to fetch one block.
23
23 Practice Problem: Minimum Time
The head is at the start of the first sector of the block, so just compute the transfer time.
4 sectors cover 4/128 of a track.
1 full rotation takes 60/7200 s = 8.33 ms.
Transfer time is 8.33 × 4/128 = 0.26 ms.
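The minimum-time calculation can be reproduced in a few lines. The constants come from the practice problem statement; the variable names are illustrative. Only the minimum case is computed, with no seek and no rotational delay.

```python
# Parameters from the practice problem.
RPM = 7200
SECTORS_PER_TRACK = 128
SECTORS_PER_BLOCK = 4

# One full rotation, in milliseconds.
rotation_ms = 60_000 / RPM

# Best case: no seek and no rotational delay, so only the transfer time remains.
# The block occupies 4 of the 128 sectors on the track (gaps neglected).
min_fetch_ms = rotation_ms * SECTORS_PER_BLOCK / SECTORS_PER_TRACK

print(f"full rotation: {rotation_ms:.2f} ms")
print(f"minimum time to fetch one block: {min_fetch_ms:.2f} ms")
```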
24
24 Practice Problem: Maximum Time
Assume the read must start at the first sector.
The head is at the innermost track and the required track is the outermost: seek time = …
The head just missed the beginning: rotational delay = …
Transfer time = …
25
25 Practice Problem: Average Time
Solve…
26
26 So far: random block access. What about reading the “next” block?
Time to get block = Block Size / t + Negligible
Negligible:
- skip gap
- switch track
- once in a while, next cylinder
27
27 Rule of Thumb
Random I/O: expensive
Sequential I/O: much less
Ex: 1 KB block
» Random I/O: 20 ms
» Sequential I/O: 1 ms
28
28 Practice Problem cont’d: Sustained Bandwidth over Track
Assume the required blocks are consecutive on a single track.
What is the approximate sustained bandwidth of fetching consecutive blocks?
128 sectors/track × 4 KB/sector per 8.33 ms full rotation = 512 KB / 8.33 ms ≈ 61.4 KB/ms (about 60 MB/s)
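The bandwidth figure follows from reading one full track per rotation. The sketch below redoes the arithmetic without rounding the rotation time; the variable names are illustrative, and 1 KB is taken as 1024 bytes.

```python
# Parameters from the practice problem.
RPM = 7200
SECTORS_PER_TRACK = 128
BYTES_PER_SECTOR = 4096

track_kb = SECTORS_PER_TRACK * BYTES_PER_SECTOR / 1024   # 512 KB per track
rotation_ms = 60_000 / RPM                                # about 8.33 ms

# One full track passes under the head per rotation (gaps neglected).
kb_per_ms = track_kb / rotation_ms
mb_per_s = kb_per_ms * 1000 / 1024

print(f"{kb_per_ms:.2f} KB/ms, i.e. {mb_per_s:.1f} MB/s")
```

Using the unrounded rotation time gives 61.44 KB/ms; dividing by the rounded 8.33 ms gives a slightly higher figure.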
29
29 Suggested Optimization
Cluster data in consecutive blocks.
Give an extra point to algorithms that
– exploit data clustering by avoiding “random” accesses
– read/write consecutive blocks
30
30 An Algorithm with Little Random Access: 2-Phase Merge Sort
[Figure: with a main memory of 4 blocks, the input records (P K A D L E Z W J C R H Y F X I) are READ, SORTed, and WRITTEN out as sorted runs, which are then MERGEd]
Improve by bringing the max number of blocks into memory in Phase 2.
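The two phases can be sketched in memory, with Python lists standing in for disk blocks. This is an illustration under stated assumptions, not the slide's exact procedure: the function name is hypothetical, each block is assumed to hold two records, and the merge phase uses the standard library's `heapq.merge`, which stands in for reading one block per run at a time.

```python
import heapq
from typing import List


def two_phase_merge_sort(blocks: List[List[str]], memory_blocks: int) -> List[str]:
    """Sort a 'file' given as a list of blocks, using at most memory_blocks at once.

    Phase 1 fills memory, sorts, and writes out one sorted run per fill.
    Phase 2 merges all runs in a single sequential pass over each run.
    """
    # Phase 1: produce sorted runs of up to memory_blocks blocks each.
    runs = []
    for start in range(0, len(blocks), memory_blocks):
        in_memory = [rec for block in blocks[start:start + memory_blocks] for rec in block]
        runs.append(sorted(in_memory))

    # Phase 2 needs one input buffer per run plus one output buffer.
    if len(runs) > memory_blocks - 1:
        raise ValueError("too many runs to merge in a single pass")
    return list(heapq.merge(*runs))


# Records from the slide's figure; two records per block is an assumption.
letters = "PKADLEZWJCRHYFXI"
blocks = [list(letters[i:i + 2]) for i in range(0, len(letters), 2)]
result = two_phase_merge_sort(blocks, memory_blocks=4)
print("".join(result))
```

Each run is read and written sequentially, so nearly all of the I/O is on consecutive blocks.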
31
31 Cost for Writing: similar to Reading … unless we want to verify! Then we need to add a (full) rotation + Block size / t
32
32 To Modify a Block?
(a) Read Block
(b) Modify in Memory
(c) Write Block
[(d) Verify?]
33
33 Block Address:
Physical Device
Cylinder #
Surface #
Sector
Once upon a time DBs had access to such addresses; now they are the OS’s domain.
34
34 Optimizations (in controller or O.S.)
Disk scheduling algorithms
– e.g., elevator algorithm
Pre-fetch
Arrays
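As an illustration of the elevator idea, the sketch below orders pending cylinder requests so that the head sweeps in one direction before turning around. It is a simplified variant that reverses at the last pending request (often called LOOK) and ignores requests arriving during the sweep. The function names and the request list are hypothetical.

```python
from typing import List


def elevator_order(head: int, requests: List[int], moving_up: bool = True) -> List[int]:
    """Order in which pending cylinder requests are served by one elevator sweep."""
    if moving_up:
        ahead = sorted(c for c in requests if c >= head)
        behind = sorted((c for c in requests if c < head), reverse=True)
    else:
        ahead = sorted((c for c in requests if c <= head), reverse=True)
        behind = sorted(c for c in requests if c > head)
    # Serve everything ahead of the head first, then reverse direction.
    return ahead + behind


def total_travel(head: int, order: List[int]) -> int:
    """Total number of cylinders the head crosses when serving requests in this order."""
    travelled = 0
    for cylinder in order:
        travelled += abs(cylinder - head)
        head = cylinder
    return travelled


# Made-up request queue for illustration.
pending = [82, 17, 43, 140, 24, 16, 190]
sweep = elevator_order(50, pending)
print("elevator order:", sweep, "travel:", total_travel(50, sweep))
print("arrival order travel:", total_travel(50, pending))
```

For this queue the sweep crosses 314 cylinders, against 518 when requests are served in arrival order.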
35
35 Double Buffering
Problem: Have a File
» Sequence of Blocks B1, B2, ...
Have a Program
» Process B1
» Process B2
» Process B3
...
36
36 Single Buffer Solution
(1) Read B1 → Buffer
(2) Process Data in Buffer
(3) Read B2 → Buffer
(4) Process Data in Buffer
...
37
37 Say
P = time to process/block
R = time to read in 1 block
n = # blocks
Single buffer time = n(P+R)
38
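38 Double Buffering
[Figure: blocks A through G on disk and two buffers in memory; while A is being processed, B is read; when A is done, B is processed while C is read, and so on]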
39
39 Say P ≥ R. What is the processing time?
P = processing time/block
R = I/O time/block
n = # blocks
Double buffering time = R + nP
Single buffering time = n(R+P)
Improvement is much more dramatic if the blocks are consecutive: …
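The two totals can be checked with a small timeline simulation. The sketch assumes two buffers, a single disk that reads one block at a time, and a single processor; a read may start only when the previous read has finished and a buffer is free. Function names and the sample values of n, P, and R are illustrative.

```python
def single_buffer_time(n: int, p: float, r: float) -> float:
    """Reads and processing strictly alternate: n * (P + R)."""
    return n * (p + r)


def double_buffer_time(n: int, p: float, r: float) -> float:
    """Simulate two buffers: block k+1 is read while block k is processed."""
    read_done = [0.0] * n
    proc_done = [0.0] * n
    for k in range(n):
        prev_read = read_done[k - 1] if k >= 1 else 0.0
        # The buffer for block k was last used by block k-2.
        buffer_free = proc_done[k - 2] if k >= 2 else 0.0
        read_done[k] = max(prev_read, buffer_free) + r
        prev_proc = proc_done[k - 1] if k >= 1 else 0.0
        proc_done[k] = max(read_done[k], prev_proc) + p
    return proc_done[-1] if n else 0.0


# Sample values with P >= R, chosen for illustration.
n, P, R = 10, 3.0, 2.0
print("single buffering:", single_buffer_time(n, P, R))   # n(P+R)
print("double buffering:", double_buffer_time(n, P, R))   # R + nP when P >= R
```

When P ≥ R the simulation matches R + nP; when reading is the bottleneck (R > P) the same simulation gives nR + P instead.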
40
40 Block Size Selection?
Big block → amortize I/O cost
Unfortunately...
Big block → read in more useless stuff, and it takes longer to read
41
41 Trend
Memory prices drop and memory capacities increase; transfer rates increase.
Disk access times do not improve that much.
→ Blocks get bigger...
42
42 Summary
Secondary storage, mainly disks
I/O times
I/Os should be avoided, especially random ones