CPS216: Advanced Database Systems Notes 03: Data Access from Disks Shivnath Babu.

Slides:



Advertisements
Similar presentations
CPS216: Data-Intensive Computing Systems Data Access from Disks Shivnath Babu.
Advertisements

Storing Data: Disk Organization and I/O
Disk Storage SystemsCSCE430/830 Disk Storage Systems CSCE430/830 Computer Architecture Lecturer: Prof. Hong Jiang Courtesy of Yifeng Zhu (U. Maine) Fall,
- Dr. Kalpakis CMSC Dr. Kalpakis 1 Outline In implementing DBMS we need to answer How should the system store and manage very large amounts of data?
Storing Data: Disks and Files: Chapter 9
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Disks and RAID.
CS 277 – Spring 2002Notes 21 CS 277: Database System Implementation Notes 02: Hardware Arthur Keller.
CS 245Notes 21 CS 245: Database System Principles Notes 02: Hardware Hector Garcia-Molina.
Data Storage John Ortiz. Lecture 17Data Storage2 Overview  Database stores data on secondary storage  Disk has distinct storage and access characteristics.
The Memory Hierarchy fastest, perhaps 1Mb
Storage. The Memory Hierarchy fastest, but small under a microsecond, random access, perhaps 2Gb Typically magnetic disks, magneto­ optical (erasable),
CS4432: Database Systems II Data Storage - Lecture 2 (Sections 13.1 – 13.3) Elke A. Rundensteiner.
1 Advanced Database Technology February 12, 2004 DATA STORAGE (Lecture based on [GUW ], [Sanders03, ], and [MaheshwariZeh03, ])
Lecture 17 I/O Optimization. Disk Organization Tracks: concentric rings around disk surface Sectors: arc of track, minimum unit of transfer Cylinder:
1 Lecture 26: Storage Systems Topics: Storage Systems (Chapter 6), other innovations Final exam stats:  Highest: 95  Mean: 70, Median: 73  Toughest.
1 Storage Hierarchy Cache Main Memory Virtual Memory File System Tertiary Storage Programs DBMS Capacity & Cost Secondary Storage.
1 CS143: Disks and Files. 2 System Architecture CPU Main Memory Disk Controller... Disk Word (1B – 64B) ~ x GB/sec Block (512B – 50KB) ~ x MB/sec System.
SECTIONS 13.1 – 13.3 Sanuja Dabade & Eilbroun Benjamin CS 257 – Dr. TY Lin SECONDARY STORAGE MANAGEMENT.
CS4432: Database Systems II Lecture 2 Timothy Sutherland.
CPSC-608 Database Systems Fall 2009 Instructor: Jianer Chen Office: HRBB 309B Phone: Notes #5.
CPSC-608 Database Systems Fall 2010 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #5.
1 CS222: Principles of Database Management Fall 2010 Professor Chen Li Department of Computer Science University of California, Irvine Notes 01.
SECTIONS 13.1 – 13.3 Sanuja Dabade & Eilbroun Benjamin CS 257 – Dr. TY Lin SECONDARY STORAGE MANAGEMENT.
CPSC 231 Secondary storage (D.H.)1 Learning Objectives Understanding disk organization. Sectors, clusters and extents. Fragmentation. Disk access time.
Secondary Storage Management Hank Levy. 8/7/20152 Secondary Storage • Secondary Storage is usually: –anything outside of “primary memory” –storage that.
Introduction to Database Systems 1 The Storage Hierarchy and Magnetic Disks Storage Technology: Topic 1.
CS4432: Database Systems II Data Storage (Better Block Organization) 1.
Storage & Peripherals Disks, Networks, and Other Devices.
Chapter 2 Data Storage How does a computer system store and manage very large volumes of data ?
Lecture 11: DMBS Internals
Introduction to Database Systems 1 Storing Data: Disks and Files Chapter 3 “Yea, from the table of my memory I’ll wipe away all trivial fond records.”
1 Secondary Storage Management Submitted by: Sathya Anandan(ID:123)
Chapter 111 Chapter 11: Hardware (Slides by Hector Garcia-Molina,
External Storage Primary Storage : Main Memory (RAM). Secondary Storage: Peripheral Devices –Disk Drives –Tape Drives Secondary storage is CHEAP. Secondary.
1 Data Storage (Chap. 11) Based on Hector Garcia-Molina’s slides.
Chapter 8 External Storage. Primary vs. Secondary Storage Primary storage: Main memory (RAM) Secondary Storage: Peripheral devices  Disk drives  Tape.
CPSC 404, Laks V.S. Lakshmanan1 External Sorting Chapter 13: Ramakrishnan & Gherke and Chapter 2.3: Garcia-Molina et al.
DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.
Lecture 40: Review Session #2 Reminders –Final exam, Thursday 3:10pm Sloan 150 –Course evaluation (Blue Course Evaluation) Access through.
Storing Data Dina Said 1 1.
CPSC-608 Database Systems Fall 2015 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #5.
Disk Basics CS Introduction to Operating Systems.
Section 13.2 – Secondary storage management (Former Student’s Note)
DBMS 2001Notes 2: Hardware1 Principles of Database Management Systems Pekka Kilpeläinen (after Stanford CS245 slide originals by Hector Garcia-Molina,
DMBS Internals I February 24 th, What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the.
DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.
Disk Average Seek Time. Multi-platter Disk platter Disk read/write arm read/write head.
Section 13.2 – Secondary storage management. Index 13.2 Disks Mechanics of Disks The Disk Controller Disk Access Characteristics.
Magnetic Disk Rotational latency Example Find the average rotational latency if the disk rotates at 20,000 rpm.
CPSC 231 Secondary storage (D.H.)1 Learning Objectives Understanding disk organization. Sectors, clusters and extents. Fragmentation. Disk access time.
1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query.
COSC 6340: Disks 1 Disks and Files DBMS stores information on (“hard”) disks. This has major implications for DBMS design! » READ: transfer data from disk.
1 Lecture 16: Data Storage Wednesday, November 6, 2006.
CS 245Notes 21 Database System Principles Notes 02: Hardware.
CS422 Principles of Database Systems Disk Access Chengyu Sun California State University, Los Angeles.
1 Query Processing Part 1: Managing Disks. 2 Main Topics on Query Processing Running-time analysis Indexes (e.g., search trees, hashing) Efficient algorithms.
File organization Secondary Storage Devices Lec#7 Presenter: Dr Emad Nabil.
Storing Data Dina Said.
Query Processing Part 1: Managing Disks 1.
Lecture 16: Data Storage Wednesday, November 6, 2006.
Database Management Systems (CS 564)
CS 554: Advanced Database System Notes 02: Hardware
CPSC-608 Database Systems
Lecture 11: DMBS Internals
Persistence: hard disk drive
Parameters of Disks The most important disk parameter is the time required to locate an arbitrary disk block, given its block address, and then to transfer.
Secondary Storage Management Hank Levy
CPS216: Advanced Database Systems Notes 04: Data Access from Disks
CS 245: Database System Principles Notes 02: Hardware
Introduction to Operating Systems
Presentation transcript:

CPS216: Advanced Database Systems Notes 03: Data Access from Disks Shivnath Babu

Outline Disks Data access from disks Software-based optimizations – Prefetching blocks – Choosing the right block size

Focus on: “Typical Disk” … Top View Head assembly Sector Gap Terms: Platter, Head, Cylinder, Track Sector (physical), Block (logical), Gap

Block Address: Physical Device Cylinder # Surface # Start sector #

Disk Access Time (Latency) block X in memory ? I want block X

Access Time = Seek Time + Rotational Delay + Transfer Time + Other

Seek Time 3 or 5x x 1N Cylinders Traveled Time Average value: 10 ms  40 ms

Rotational Delay Head Here Block I Want

Average Rotational Delay R = 1/2 revolution Example: R = 8.33 ms (3600 RPM)

Transfer Rate: t t: 1  100 MB/second transfer time: block size t

Other Delays CPU time to issue I/O Contention for controller Contention for bus, memory “Typical” Value: 0

So far: Random Block Access What about: Reading “Next” block?

If we do things right … Time to get = Block Size + Negligible next block t - skip gap - switch track - once in a while, next cylinder

Rule ofRandom I/O: Expensive Thumb Sequential I/O: Much less Ex:1 KB Block »Random I/O:  20 ms. »Sequential I/O:  1 ms.

Cost for Writing similar to Reading …. unless we want to verify!

To Modify Block: (a) Read Block (b) Modify in Memory (c) Write Block [(d) Verify?]

3.5 in diameter disk 3600 RPM 1 surface 16 MB usable capacity (16 X 2 20 ) 128 cylinders seek time: average = 25 ms. adjacent cylinders = 5 ms. A Synthetic Example

1 KB blocks = sectors 10% overhead between sectors capacity = 16 MB = (2 20 )16 = 2 24 bytes # cylinders = 128 = 2 7 bytes/cyl = 2 24 /2 7 = 2 17 = 128 KB blocks/cyl = 128 KB / 1 KB = 128

3600 RPM 60 revolutions / sec 1 rev. = msec. One track:... Time over useful data:(16.66)(0.9)=14.99 ms. Time over gaps: (16.66)(0.1) = 1.66 ms. Transfer time 1 block = 14.99/128=0.117 ms. Trans. time 1 block+gap=16.66/128=0.13ms.

Burst Bandwith 1 KB in ms. BB = 1/0.117 = 8.54 KB/ms. or BB =8.54KB/ms x 1000 ms/1sec x 1MB/1024KB = 8540/1024 = 8.33 MB/sec

Sustained bandwith (over track) 128 KB in ms. SB = 128/16.66 = 7.68 KB/ms or SB = 7.68 x 1000/1024 = 7.50 MB/sec.

T 1 = Time to read one random block T 1 = seek + rotational delay + TT = 25 + (16.66/2) = ms.

Suppose DBMS deals with 4 KB blocks T 4 = 25 + (16.66/2) + (.117) x 1 + (.130) X 3 = ms [Compare to T 1 = ms] block

T T = Time to read a full track (start at any block) T T = 25 + (0.130/2) * = ms to get to first block * Actually, a bit less; do not have to read last gap.

Outline Disks Data access from disks Software-based optimizations – Prefetching blocks – Choosing the right block size

Software-based Optimizations (in Disk controller, OS, or DBMS Buffer Manager) Prefetching blocks Choosing the right block size Some others covered in textbook

Prefetching Blocks Exploits locality of access –Ex: relation scan Improves performance by hiding access latency Needs extra buffer space –Double buffering

Block Size Selection? Big Block  Amortize I/O Cost Big Block  Read in more useless stuff! Unfortunately...

Tradeoffs in Choosing Block Size Small relations? Update-heavy workload? Difficult to use blocks larger than track Multiple block sizes

Further Reading if you are Interested (not part of syllabus) Chapter 11 “Data Storage” in textbook Sorting disk-resident records (Will cover later in class) Scheduling disk accesses Disk failures and recovery, RAID