CS222: Principles of Database Management, Fall 2010
Professor Chen Li, Department of Computer Science, University of California, Irvine
Notes 01


Slide 2: Topic 1: Data Storage and Record-Oriented File Systems
- Data storage
  - Storage hierarchy
  - Disks
- Record-oriented file systems

Slide 3: Storage hierarchy
[Figure: CPU with cache, memory, controller, disk/tape]

Slide 4: Storage Media
- Cache: inside/outside CPU
  - CPU: becoming faster and faster (>= 3 GHz now)
- Main memory
  - Costs $100/Mbyte; the price drops every year
  - "Volatile": does not survive system failures
  - Random I/O is very fast
  - Data can be processed by the CPU directly
  - Capacity is orders of magnitude lower than what a database needs

Slide 5: Storage Media: secondary storage
- Disks (floppy disks, hard disks, CDs)
  - Cheap, and the price drops each year
  - Non-volatile (except when the disk crashes)
  - Random I/O is slow
  - Data must be transferred to memory before the CPU can process it
- Tape
  - Cheaper but slower than disks
  - Sequential I/O devices
  - Handy for backups, sometimes for archival

Slide 6: Databases and Storage Devices
- Due to capacity, cost, and volatility factors, DBs are usually stored on disks.
- Data is brought from disk into main memory for processing.
- There are many ways to interface memory with disk-resident data, e.g., virtual memory:
  - VM size is limited to the max address generated by the CPU
  - Existing VM does not support durability
- A file system provides a more powerful mapping between memory and disk storage.
- Several techniques are used to ensure that the high latency of secondary storage does not impact application response time and system throughput:
  - Access disks asynchronously with active applications
  - Prefetch data before the application needs it
  - Intelligent caching techniques

Slide 7: Disk Storage -- Outline
- Disk mechanics
- Access times (random, sequential)
- Examples
- Optimization
- Other topics

Slide 8: Disk mechanics
[Figure: disk assembly]
Terms: spindle, platters, magnetic surfaces, disk head, disk controller, ...

Slide 9: Top Views
[Figure: top view of a platter showing tracks, sectors, gaps, and cylinders]

Slide 10: Characteristics
- Diameter: 1 inch -- (upper bound missing) inches
- Cylinders: (range missing)
- Surfaces: 1 (CDs) -- many
- Tracks/Cyl: 2 (floppies) -- (upper bound missing)
- Sector size: 512 B -- 50 KB
- Capacity: 360 KB (old floppy) -- >= 200 GB

Slide 11: "Block"
- Corresponds to one or multiple sectors
- Its address consists of:
  - Physical device # (in case of multiple disks)
  - Cylinder #
  - Surface #
  - Sector #

Slide 12: Random disk access time
[Figure: a request "I want block X" brings block X from disk into memory]
Time = Seek Time (time 1) + Rotational Delay (time 2) + Transfer Time (time 3) + Other (time 4)

Slide 13: Time 1: seek time
[Figure: seek time versus cylinders traveled (1 to N); the time axis is marked x and 3 or 5x]

Slide 14: Average Random Seek Time

\[ S = \frac{\sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \mathrm{SeekTime}(\text{Track } i \to \text{Track } j)}{N(N-1)} \]

- Assumptions:
  - Each track has the same probability of being accessed.
  - From any track, the head is equally likely to jump to each of the other tracks.
- Typical S value: 10 ms to 50 ms
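The averaging formula can be checked numerically. The sketch below is only an illustration: `linear_seek_ms` is a hypothetical seek model (a fixed startup cost plus a per-track cost) that does not come from the slides, and its constants are made up.

```python
def average_seek_time(n_tracks, seek_time):
    """Average seek_time(i, j) over all ordered track pairs with i != j."""
    total = 0.0
    for i in range(1, n_tracks + 1):
        for j in range(1, n_tracks + 1):
            if j != i:
                total += seek_time(i, j)
    return total / (n_tracks * (n_tracks - 1))


def linear_seek_ms(i, j, startup_ms=5.0, per_track_ms=0.1):
    """Hypothetical model (an assumption): startup cost plus per-track travel cost."""
    return startup_ms + per_track_ms * abs(i - j)


if __name__ == "__main__":
    # With uniform access the mean distance |i - j| is (N + 1) / 3 tracks.
    print(average_seek_time(128, linear_seek_ms))
```

Under uniform access the average travel distance works out to (N + 1) / 3 tracks, so any seek model that grows linearly with distance gives an average seek close to the cost of crossing a third of the disk.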

Slide 15: Time 2: Rotational Delay
[Figure: initial head position and the wanted block on the same track]
- Average delay: R = 1/2 revolution
- If the disk speed is 3600 RPM, then R = 8.33 ms
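As a quick check of the 3600 RPM figure, the average rotational delay is half the time of one revolution. The function name below is chosen for illustration only.

```python
def avg_rotational_delay_ms(rpm):
    """Average rotational delay: half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms per minute
    return ms_per_revolution / 2.0


if __name__ == "__main__":
    print(round(avg_rotational_delay_ms(3600), 2))  # matches the 8.33 ms on the slide
```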

Slide 16: Complication
- May have to wait for the start of the track before we can read the desired block
[Figure: track showing the current head position, the track start, and the block we want]

Slide 17: Time 3: Transfer time
- Transfer time = block size / transfer rate
- Typical transfer rate: 1 to 3 MB/sec

Slide 18: Time 4: Other Delays
- CPU time to issue the I/O
- Contention for the controller
- Contention for the bus, memory, etc.
- Typical value: "0" (negligible)

Slide 19: Sequential disk access
- Reading the "next" block: additional time = block size / transfer rate
- Other time is negligible:
  - Skip gaps
  - Once in a while, move to the next cylinder

Slide 20: Random I/O vs Sequential I/O
- Average sequential I/O time is much smaller than random I/O time
  - Random I/O: about 20 ms (most of the time is the initial delay)
  - Sequential I/O: about 1 ms
- When designing a structure, try to use sequential I/Os.
  - Data layout on disk becomes critical
  - Do not just look at the number of I/Os
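To see why counting I/Os alone is misleading, the sketch below compares reading the same number of blocks randomly and sequentially, using the approximate 20 ms and 1 ms figures from this slide. It assumes, for illustration, that a sequential scan pays one random access to reach its first block.

```python
RANDOM_IO_MS = 20.0      # approximate cost of one random block access (from the slide)
SEQUENTIAL_IO_MS = 1.0   # approximate cost of reading the "next" block (from the slide)


def scan_time_ms(n_blocks, sequential):
    """Estimated time to read n_blocks, either sequentially or at random positions."""
    if n_blocks <= 0:
        return 0.0
    if sequential:
        # Assumption: one random access to reach the first block, then sequential reads.
        return RANDOM_IO_MS + (n_blocks - 1) * SEQUENTIAL_IO_MS
    return n_blocks * RANDOM_IO_MS


if __name__ == "__main__":
    print(scan_time_ms(1000, sequential=True))   # 1019.0 ms
    print(scan_time_ms(1000, sequential=False))  # 20000.0 ms
```

Both scans perform 1000 block reads, yet under these figures the random one takes roughly twenty times longer.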

Slide 21: Modify blocks
- Read block
- Modify in memory
- Write block
- Verify (optional)
  - If verification is done, add to the access time: one full rotation + block size / transfer rate

Slide 22: Example 1
- Disk specs:
  - 3.5 in diameter
  - 3600 RPM
  - 1 surface
  - Usable capacity: 16 MB = 2^24 bytes
  - # of cylinders: 128 = 2^7
  - 1 block = 1 sector = 1 KB
  - 10% overhead between blocks (gaps)
- Seek time:
  - Average = 25 ms
  - Adjacent cylinder = 5 ms

Slide 23: Cylinder
- bytes/cyl = 2^24 / 2^7 = 2^17 = 128 KB
- blocks/cyl = 128 KB / 1 KB = 128

Slide 24: Track
[Figure: one track]
- Speed: 3600 RPM = 60 revolutions/sec = 16.66 ms/rev
- In each revolution:
  - Time over useful data: 16.66 * 0.9 = 14.99 ms
  - Time over gaps: 16.66 * 0.1 = 1.66 ms
  - Transfer time for 1 block = 14.99 / 128 = 0.117 ms
  - Transfer time for 1 block + gap = 16.66 / 128 = 0.13 ms
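The per-track timings follow directly from the Example 1 specs (3600 RPM, 128 blocks per track, 10% gap overhead). The sketch below recomputes them without intermediate rounding, so its results differ slightly in the last digit from the slide, which truncates one revolution to 16.66 ms. The function and key names are chosen for illustration.

```python
def track_timing(rpm=3600, blocks_per_track=128, gap_fraction=0.10):
    """Per-revolution timings (in ms) for a track with a fixed gap overhead."""
    ms_per_rev = 60_000.0 / rpm
    data_ms = ms_per_rev * (1.0 - gap_fraction)  # head is over useful data
    gap_ms = ms_per_rev * gap_fraction           # head is over gaps
    return {
        "ms_per_rev": ms_per_rev,
        "data_ms": data_ms,
        "gap_ms": gap_ms,
        "block_ms": data_ms / blocks_per_track,              # one block, no gap
        "block_plus_gap_ms": ms_per_rev / blocks_per_track,  # one block plus its gap
    }


if __name__ == "__main__":
    for name, value in track_timing().items():
        print(f"{name}: {value:.3f}")
```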

Slide 25: Bandwidths
- Burst bandwidth:
  - No time spent on gaps (10%)
  - 1 KB in 0.117 ms
  - BB = 1 KB / 0.117 ms = 8.54 KB/ms, about 8.33 MB/sec
- Sustained bandwidth:
  - Includes time spent on gaps
  - 128 KB in 16.66 ms
  - SB = 128 KB / 16.66 ms = 7.68 KB/ms = 7.50 MB/sec
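The two bandwidth figures can be reproduced with a small helper. The sketch assumes 1 MB = 1024 KB, which is the conversion that yields the slide's 7.50 MB/sec. Because it starts from the rounded 0.117 ms, the burst result comes out near 8.35 MB/sec rather than exactly 8.33.

```python
def bandwidth_mb_per_s(kilobytes, milliseconds):
    """Convert 'kilobytes transferred in milliseconds' to MB/sec (1 MB = 1024 KB)."""
    kb_per_ms = kilobytes / milliseconds
    return kb_per_ms * 1000.0 / 1024.0


if __name__ == "__main__":
    burst = bandwidth_mb_per_s(1, 0.117)        # one block, gaps excluded
    sustained = bandwidth_mb_per_s(128, 16.66)  # a full track, gaps included
    print(f"burst: {burst:.2f} MB/sec, sustained: {sustained:.2f} MB/sec")
```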

Slide 26: Time of random block access
- T1 = time to read one random block = seek time + rotational delay + transfer time
  - Assume we do not have to wait for the track start
  - Seek time = 25 ms
  - Rotational delay = 16.66 ms / 2 = 8.33 ms
  - Transfer time = 0.117 ms
  - Total = 25 ms + 8.33 ms + 0.117 ms = 33.45 ms
- Most of the time is spent on seek time and rotational delay!

Slide 27: Larger blocks?
- Suppose the OS deals with 4 KB blocks.
- We need to include the time to read 1 block (without gap) and 3 blocks (with gaps):
  T4 = 25 ms + (16.66 ms / 2) + (0.117 ms) * 1 + (0.130 ms) * 3 = 33.84 ms
- Compare to T1 = 33.45 ms: not much difference
  - That's why we want to use sequential I/Os!
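The comparison between T1 and T4 generalizes to any number of consecutive 1 KB blocks. The sketch below uses the Example 1 constants from the slides; the function name is illustrative.

```python
SEEK_MS = 25.0                    # average seek time (Example 1)
ROTATIONAL_DELAY_MS = 16.66 / 2   # half a revolution at 3600 RPM
BLOCK_MS = 0.117                  # transfer time for one block, no gap
BLOCK_PLUS_GAP_MS = 0.130         # transfer time for one block plus its gap


def read_time_ms(n_blocks):
    """Time to read n_blocks consecutive blocks starting at a random position.

    As on the slide, the last block is counted without its gap and the
    other n_blocks - 1 are counted with their gaps.
    """
    return (SEEK_MS + ROTATIONAL_DELAY_MS
            + BLOCK_MS + (n_blocks - 1) * BLOCK_PLUS_GAP_MS)


if __name__ == "__main__":
    print(f"T1 = {read_time_ms(1):.2f} ms")  # 33.45 ms
    print(f"T4 = {read_time_ms(4):.2f} ms")  # 33.84 ms
```

Quadrupling the amount of data read adds under half a millisecond, because the seek and rotational delay are paid only once.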

Slide 28: Reading a track
- TT = time to read a full track (starting at any block)
- TT = 25 ms (seek time) + (0.13 ms / 2) (rotational delay, half of a block) + 16.66 ms (transfer time) = 41.73 ms
- The time could be a bit less, since the last gap does not have to be read.
- Question: what if we need to wait for the start of a track?
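The full-track read time, and one possible answer to the closing question, can be sketched as follows. The `wait_for_track_start` branch is an assumption that is not stated on the slide: it supposes the head waits half a revolution on average to reach the track start before the full-revolution transfer begins.

```python
def full_track_read_ms(seek_ms=25.0, rev_ms=16.66, block_plus_gap_ms=0.130,
                       wait_for_track_start=False):
    """Time to read a whole track using the Example 1 figures."""
    if wait_for_track_start:
        # Assumption: on average half a revolution passes before the track start.
        rotational_ms = rev_ms / 2
    else:
        # Start at any block: wait on average half a block (plus gap).
        rotational_ms = block_plus_gap_ms / 2
    return seek_ms + rotational_ms + rev_ms


if __name__ == "__main__":
    print(f"{full_track_read_ms():.2f} ms")                           # 41.73 ms
    print(f"{full_track_read_ms(wait_for_track_start=True):.2f} ms")  # 49.99 ms
```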