Query Processing Part 1: Managing Disks


1 Query Processing Part 1: Managing Disks

2 Main Topics on Query Processing Running-time analysis Indexes (e.g., search trees, hashing) Efficient algorithms for the relational operators Optimizing the evaluation of a whole query The characteristics of disks are different from those of main memory

3 Disks (HDDs) HDD – Hard Disk Drive Not to be confused with the newer SSD (Solid-State Drive)

4 Typical Computer [Figure: processor (P), main memory (M) and controllers (C) attached to a bus; disks and other secondary storage devices are attached to a controller] The processor (CPU), main memory (RAM) and controllers are connected by a bus

5 Processor Speed: 100 → 500 → 1000 MIPS (MIPS = Million Instructions per Second) Memory Access time: 10⁻⁶ → 10⁻⁹ sec. (1 µs → 1 ns)

6 “Typical Disk” [Figure: a disk drive with the head and the platters labeled] Terms: Platter, Head, Actuator, Cylinder, Track, Sector (physical), Block (logical), Gap

7 Top (& Bottom) View of a Disk Platter Tracks are concentric circles, divided into sectors All sectors have the same number of bytes (typically 512) Gaps between sectors and between tracks

8 More Details Both surfaces of each platter are used –There is a head for each surface All tracks with the same radius form a cylinder –The heads move together and are always over the same cylinder A block consists of N contiguous sectors –N is determined when the OS formats the disk –The DBMS may choose a different value for N

9 Memory vs. Disk Memory is fast and –time to read (or write) a byte is fixed –can read (or write) just what is needed Disk is slow and –must read (or write) at least one block –time to read (or write) a block varies Memory is volatile whereas a disk keeps the data even without electricity [Slide annotation: not exactly true …]

10 Disk Access Time [Figure: user needs block X → ? → block X in memory]

11 Seek Time – the time it takes to move the heads to the cylinder where the block is Rotational Delay – the time it takes until the beginning of the block arrives under the head Transfer Time – the time it takes to actually read the block Time = Seek Time + Rotational Delay + Transfer Time + Other …

12 Seek Time [Figure: seek time plotted against the number of cylinders traveled, from 1 to N; the time axis is marked x and 3x or 5x]

13 Average Seek Time Can be measured empirically Alternatively, it can be proved that the average distance from one random cylinder to another is 1/3 of the maximal distance (i.e., from innermost to outermost track) –Hence, the average seek time is about 1/3 of the maximum Typical average seek time is about 10 msec –For the fastest disks it is about 3 msec
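
The 1/3 figure can be checked with a short exact computation. This is a sketch that is not part of the slides: it assumes the start and target cylinders are independent and uniformly distributed, measures distance in cylinders, and uses an illustrative function name.

```python
from fractions import Fraction


def average_seek_fraction(num_cylinders: int) -> Fraction:
    """Average distance between two random cylinders, as a fraction of the maximal distance."""
    n = num_cylinders
    if n < 2:
        raise ValueError("need at least two cylinders")
    # A distance d (1 <= d <= n - 1) occurs for 2 * (n - d) of the n * n ordered pairs.
    total = sum(2 * (n - d) * d for d in range(1, n))
    average_distance = Fraction(total, n * n)
    return average_distance / (n - 1)  # the maximal distance is n - 1


# The exact value is (n + 1) / (3n), which approaches 1/3 as n grows.
print(float(average_seek_fraction(10_000)))
```

For a realistic number of cylinders the ratio is within a small fraction of a percent of 1/3, which is why the average seek time is estimated as about one third of the maximum.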

14 Rotational Delay (Latency) [Figure: a track showing the current head position and the needed block] The average latency is ½ of the time of one revolution –It is 4.17 msec (7200 rpm) –Only 2 msec for the fastest disks (15,000 rpm)
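
The two latency figures follow directly from the rotation speed. A minimal sketch (the function name is illustrative, not from the slides):

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in msec: half the time of one revolution."""
    ms_per_revolution = 60_000.0 / rpm  # 60,000 msec per minute
    return ms_per_revolution / 2.0


for rpm in (7200, 15_000):
    print(f"{rpm} rpm: {average_rotational_latency_ms(rpm):.2f} msec")
```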

15 Transfer Time The transfer time can be computed from the sustained transfer rate, which is measured in MB/sec A transfer time of 0.1 msec for a 4KB block amounts to a rate of 40 MB/sec –This is a conservative estimate with respect to recent models of disks
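
The relation between transfer rate and transfer time can be written as a one-line conversion. This sketch assumes 1 MB = 1000 KB, so that 1 MB/sec equals 1 KB/msec; the function name is illustrative.

```python
def transfer_time_ms(block_kb: float, rate_mb_per_sec: float) -> float:
    """Time in msec to transfer one block at a sustained rate (1 MB/sec = 1 KB/msec)."""
    return block_kb / rate_mb_per_sec


# A 4KB block at 40 MB/sec, the figures used on the slide.
print(transfer_time_ms(4, 40))
```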

16 Other Delays CPU time to issue I/O Contention for controller Contention for bus, memory We ignore these delays

17 Time to Read The time to read a block of 4KB is avgSeek + avgLatency + transferTime If we read 11 sequential blocks (on the same track), then –Seek & latency are needed just for the first block –So, the time is avgSeek + avgLatency + 11 × transferTime
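
As a worked example, the sketch below plugs in the figures quoted on slides 13 to 15 (10 msec average seek, 4.17 msec average latency at 7200 rpm, 0.1 msec transfer per 4KB block). Treating these as the intended inputs is an assumption, since the numbers on this slide did not survive in the transcript; the constant and function names are illustrative.

```python
AVG_SEEK_MS = 10.0      # typical average seek time (slide 13)
AVG_LATENCY_MS = 4.17   # average rotational latency at 7200 rpm (slide 14)
TRANSFER_MS = 0.1       # transfer time of one 4KB block (slide 15)


def read_time_ms(num_sequential_blocks: int) -> float:
    """Time to read a run of blocks on one track: one seek, one latency, n transfers."""
    return AVG_SEEK_MS + AVG_LATENCY_MS + num_sequential_blocks * TRANSFER_MS


one_block = read_time_ms(1)
eleven_blocks = read_time_ms(11)
print(f"one random block:     {one_block:.2f} msec")
print(f"11 sequential blocks: {eleven_blocks:.2f} msec in total, "
      f"{eleven_blocks / 11:.2f} msec per block")
```

Under these assumptions a random read costs about 14.27 msec, while the 11-block run costs about 15.27 msec in total, or roughly 1.39 msec per block: an order of magnitude less per block.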

18 Summary Random I/O is expensive –Average per 4KB block is ~15 msec Sequential I/O much less –Average per 4KB block is ~1.5 msec (when reading 11 sequential blocks) However, even sequential I/O is slower than memory by at least a factor of 100

19 Writing and Updating Cost of writing is similar to reading Unless we want to verify –If so, add 1 revolution + transfer time To update a block, we must read it into memory, modify it, and then write it back to the disk

20 Typical DB Application while blocks to read do: read next block from disk; process the block; write some result to disk; end The CPU can execute tens of thousands (if not millions) of instructions while the controller reads or writes a single block

21 Running-Time Analysis: I/O Cost We only count the number of blocks that are read from or written to the disk The CPU time is negligible in comparison Furthermore, the controller can read from and write to the disk while the CPU is processing other blocks –So, the CPU time that can actually influence an exact analysis is even more negligible The goal is to minimize the number of blocks that we read and write

22 We Count Blocks, But What is the cost (in time) of each block? –Cannot tell whether a block was read randomly or sequentially (with other blocks) We should organize data on disks and write programs so that the I/O will be sequential as much as possible –The DBMS helps a lot in this task! –It is also capable of minimizing the number of accessed blocks when processing queries –And it tries to keep the controller busy while the CPU processes blocks that are already in memory

23 Best-Case Analysis Read B 1 blocks from the disk Compute the result and write it back to the disk –Suppose that the size of the result is B 2 What is the best possible I/O cost? What is needed to achieve the best I/O cost?

24 Summary The running time of an algorithm is the I/O cost We measure the I/O cost in terms of the number of blocks that are read or written –A block that is read and then written is counted as 2
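
The counting rule can be made concrete with a toy disk that tallies block transfers. This is only a sketch: `CountingDisk` and `update_every_block` are hypothetical helpers, and the "disk" is an in-memory list.

```python
class CountingDisk:
    """In-memory stand-in for a disk that counts block reads and writes."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.reads = 0
        self.writes = 0

    def read_block(self, index):
        self.reads += 1
        return self.blocks[index]

    def write_block(self, index, data):
        self.writes += 1
        self.blocks[index] = data

    @property
    def io_cost(self):
        """I/O cost = number of blocks read + number of blocks written."""
        return self.reads + self.writes


def update_every_block(disk, transform):
    """Read each block into memory, modify it, and write it back."""
    for index in range(len(disk.blocks)):
        block = disk.read_block(index)
        disk.write_block(index, transform(block))


disk = CountingDisk([[1, 2], [3, 4], [5, 6]])
update_every_block(disk, lambda block: [value * 10 for value in block])
print(disk.io_cost)  # each of the 3 blocks is read once and written once
```

Updating all three blocks costs 6, because every block that is read and then written is counted twice.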

25 Arranging Data on Disks

26 The Goal Arrange data on disks so that –Queries and updates can be performed by reading and writing as few blocks as possible, and –Blocks would usually be read sequentially (harder to achieve) Optimal arrangement depends on the typical queries and updates that are going to be executed

27 Addresses of Records on Disks

28 Addresses for Records on Disks We need the ability to refer to a particular record In fact, some records have pointers to other records or to blocks –Pointers are inherent to object-relational database systems –Even in purely relational systems, pointers are needed in indexes The DBMS stores indexes – not just relations! [Figure: a pointer to record Rx]

29 Several Types of Addresses How does one refer to records? [Figure: a pointer to record Rx] Many options, ranging from physical to indirect

30 Purely Physical Record Address = Block ID + Offset in Block, where Block ID = Device ID, Cylinder #, Track #, Block #

31 Fully Indirect (Record IDs) Record ID is a bit string (assigned by the system) that can be translated to a physical address by means of a map table [Figure: map table with the columns Rec ID and Physical addr.; the entry for record R maps its Rec ID to its address A]
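
The two addressing schemes can be sketched side by side. In this illustration the field names of `PhysicalAddress` follow the components listed on slide 30, while `RecordMap` and its methods are hypothetical names for the map table; a real DBMS would keep such a table on disk.

```python
from typing import Dict, NamedTuple


class PhysicalAddress(NamedTuple):
    """Purely physical record address: block ID plus offset in the block."""
    device_id: int
    cylinder: int
    track: int
    block: int
    offset: int  # byte offset of the record inside the block


class RecordMap:
    """Map table translating system-assigned record IDs to physical addresses."""

    def __init__(self) -> None:
        self._table: Dict[int, PhysicalAddress] = {}
        self._next_id = 0

    def register(self, address: PhysicalAddress) -> int:
        record_id = self._next_id
        self._next_id += 1
        self._table[record_id] = address
        return record_id

    def resolve(self, record_id: int) -> PhysicalAddress:
        return self._table[record_id]  # the extra lookup is the cost of indirection

    def move(self, record_id: int, new_address: PhysicalAddress) -> None:
        # Only the table changes; every pointer that stores record_id stays valid.
        self._table[record_id] = new_address


record_map = RecordMap()
rid = record_map.register(PhysicalAddress(0, 12, 3, 7, 128))
record_map.move(rid, PhysicalAddress(0, 40, 1, 2, 0))
print(record_map.resolve(rid))
```

Moving the record touches one table entry, whereas with purely physical addresses every stored pointer to the record would have to be found and rewritten.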

32 Tradeoff Flexibility to move records (for deletions, insertions) vs. cost of indirection Physical addresses limit the ability to move records or use their space when deleting them – why? Logical addresses have the cost of indirection

33 Half & Half Approach Many options in between physical and indirect … One option: physical address of the block + logical address inside the block

34 Illustration: [Figure: a block consisting of a header (fixed part + array A), free space, and the records R7, R5, R8, R6 packed at the end] The address of R6 is the pair (P, 2), where P is the physical address of the block Given (P, 2), we go to the block having the address P and then follow the pointer in A[2]

35 More Details on Half & Half One field of the fixed part (of the header) contains the size of the array A The header is at the beginning of the block Any record R can be moved freely inside the block –Only need to change the pointer to R in A All records are packed at the end of the block Available free space is between the header and the records –Why do we want the free space to be contiguous?

36 Insertions Insert a new record R at the end of the free space, and add to the array A a pointer to R The address of R is determined when space is allocated to R [Figure: a block with a header (fixed part + array A), free space, and the records R7, R5, R8, R6]

37 Deletions To delete a record R, put a null in the entry of A for R – why do we need to do that? Move records toward the end to fill gaps and update their entries in A [Figure: a block with a header (fixed part + array A), free space, and the records R7, R5, R8, R6]
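
The half & half block layout, with its insertions and deletions, can be sketched as follows. This is an illustration under stated assumptions, not the layout of any particular DBMS: the header sizes are made up, each entry of A stores an (offset, length) pair rather than a bare pointer, and `SlottedBlock` is a hypothetical name.

```python
class SlottedBlock:
    FIXED_HEADER = 8  # assumed size in bytes of the fixed part of the header
    SLOT_SIZE = 4     # assumed size in bytes of one entry of the array A

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.slots = []            # array A: (offset, length), or None if deleted
        self.records_start = size  # records are packed in data[records_start:]

    def free_space(self):
        header_end = self.FIXED_HEADER + self.SLOT_SIZE * len(self.slots)
        return self.records_start - header_end

    def insert(self, record: bytes) -> int:
        """Place the record at the end of the free space; return its index in A."""
        if len(record) + self.SLOT_SIZE > self.free_space():
            raise ValueError("not enough free space in the block")
        self.records_start -= len(record)
        self.data[self.records_start:self.records_start + len(record)] = record
        self.slots.append((self.records_start, len(record)))
        return len(self.slots) - 1

    def read(self, slot: int) -> bytes:
        entry = self.slots[slot]
        if entry is None:
            raise KeyError("record was deleted")
        offset, length = entry
        return bytes(self.data[offset:offset + length])

    def delete(self, slot: int) -> None:
        entry = self.slots[slot]
        if entry is None:
            raise KeyError("record was deleted")
        offset, length = entry
        # Null the entry instead of removing it, so other indexes in A stay valid.
        self.slots[slot] = None
        # Move the records stored before the gap toward the end of the block.
        start = self.records_start
        self.data[start + length:offset + length] = self.data[start:offset]
        self.records_start += length
        for i, other in enumerate(self.slots):
            if other is not None and other[0] < offset:
                self.slots[i] = (other[0] + length, other[1])


block = SlottedBlock(size=64)
r0 = block.insert(b"aaaa")
r1 = block.insert(b"bbbbbb")
r2 = block.insert(b"cc")
block.delete(r1)
print(block.read(r0), block.read(r2), block.free_space())
```

After the deletion the external addresses (P, 0) and (P, 2) still resolve to the same records even though the record for index 2 has moved, and the free space remains one contiguous region.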

38 Updates Can be done in-place, except when: –The record grows in size We may have to move the record or parts of it to another block if there is not enough space –We update a field that is used to keep the file in sorted order We may have to move the record to another block, as dictated by the sorted order This case is really like a deletion followed by an insertion

39 Types of Files

40 Arranging a File on Disk Try to allocate a contiguous portion of the disk to the file In a heap, records are packed into blocks in no particular order In a sorted file (also called sequential file), records are inserted in sorted order according to some field(s) [Figure: the blocks for the file, chained together] It is a good idea to chain the file’s blocks in both directions Why the name “sequential file”?

41 Heap Easy to insert – records can be added either at the end or in any block that has available space –I/O cost of insertion is 2 (not 1!) Assume that the file has 1,000,000 blocks, and suppose there are 100 records for “Levy” –What if we want to read all of them? I/O cost is 1,000,000 blocks (must read all blocks) –How much time will it take if we have the IDs of all those records? In the worst case, each record is in a different block, so I/O cost is 100

42 Sorted File Must insert a new record in the location dictated by the order How much time does it take (the file has N blocks)? What if each block has some free space – does it help?

43 Sorted File Must insert a new record in the location dictated by the order How much time does insertion take (the file has N blocks)? –We assume that binary search can be done (what is needed to make it possible?) Need to read logN blocks to find the location –On average we have to read and write half of the file’s blocks to make room for the new record (if existing blocks are full) I/O cost is N, where N is the number of blocks of the file –To avoid this high cost, use overflow blocks
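
The two cost components on this slide can be put into numbers. The sketch below assumes base-2 logarithms for the binary search and counts the shifting step as reading and writing half of the blocks, as the slide describes; the function names are illustrative.

```python
def binary_search_block_reads(num_blocks: int) -> int:
    """Worst-case number of blocks read by binary search over a sorted file."""
    if num_blocks < 1:
        raise ValueError("the file must have at least one block")
    # floor(log2(N)) + 1, computed exactly with integer arithmetic.
    return num_blocks.bit_length()


def sorted_insert_io_cost(num_blocks: int) -> int:
    """Approximate average I/O cost of inserting into a sorted file with full blocks."""
    search = binary_search_block_reads(num_blocks)
    # Read and write half of the blocks: N/2 reads + N/2 writes = N.
    shift = num_blocks
    return search + shift


print(binary_search_block_reads(1_000_000))
print(sorted_insert_io_cost(1_000_000))
```

For a file of 1,000,000 blocks the search needs about 20 block reads, which is negligible next to the roughly 1,000,000 block transfers needed to make room. Compare this with the I/O cost of 2 for inserting into a heap.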

44 Overflow Blocks We need to insert 350, 490 and 600, but the block is full Use an overflow block [Figure: a full block whose header points to an overflow block] What is the problem with overflow?

45 Interesting Problems How much free space to leave in each block, track, cylinder? How often to reorganize file + overflow? [Figure: blocks with free space left in each]

46 Heap vs. Sorted File A file with 100 records for “Levy” (each has a size of 320 bytes) and 1,000,000 blocks (each is 4K bytes long) We have the IDs of all the records for “Levy” and need to read them If the file is organized as heap, then in the worst case the I/O cost is 100 blocks If the file is sorted on Name, then –The records for “Levy” occupy a minimum of 8 blocks and 9 in the worst case, so the I/O cost is 9 –In the best case, the system will read starting with the first “Levy” (in sequential order) and will use read-ahead buffering, so in this case all 9 blocks will be read sequentially

47 Comment The previous slide says: –The records for “Levy” occupy a minimum of 8 blocks and 9 in the worst case, so the I/O cost is 9 Does this statement assume that records can span blocks? If so, what are the numbers for the minimum and worst cases if records cannot span blocks? Unless explicitly stated otherwise, we assume that records do not span blocks
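
One way to explore the question is to count the blocks touched by the 100 consecutive records under both assumptions. This is a sketch of one reading of the exercise, not an answer given in the slides: it takes "4K bytes" as 4096, ignores any space used by block headers, and tries every possible position of the first "Levy" record inside its block.

```python
BLOCK_BYTES = 4096   # "4K bytes", read here as 4096
RECORD_BYTES = 320
NUM_RECORDS = 100


def blocks_if_spanning(first_byte_offset: int) -> int:
    """Blocks touched when records may span blocks and the run starts at this byte offset."""
    last_byte = first_byte_offset + NUM_RECORDS * RECORD_BYTES - 1
    return last_byte // BLOCK_BYTES + 1


def blocks_if_not_spanning(first_slot: int) -> int:
    """Blocks touched when records cannot span blocks and the run starts at this slot."""
    per_block = BLOCK_BYTES // RECORD_BYTES  # whole records that fit in one block
    return (first_slot + NUM_RECORDS + per_block - 1) // per_block


spanning = [blocks_if_spanning(offset) for offset in range(BLOCK_BYTES)]
not_spanning = [blocks_if_not_spanning(slot)
                for slot in range(BLOCK_BYTES // RECORD_BYTES)]
print("spanning:    ", min(spanning), "to", max(spanning), "blocks")
print("not spanning:", min(not_spanning), "to", max(not_spanning), "blocks")
```

Under these assumptions the 8 and 9 quoted on the previous slide match the spanning case, while without spanning only 12 records fit in a block, so the range becomes 9 to 10 blocks.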

48 Variable-Length Records Reasons for variable-length records: –Repeating fields Data about children –Variable format A record of a person with data about medical tests –Fields whose size varies, for example Address of a person BLOB (binary, large object), e.g., video clip Also, long fixed-length records cause a problem if they cannot be spanned across blocks

49 Handling Variable-Length Records Several options for arranging variable-length records in blocks –Read the textbook –Read about how it is done in a specific DBMS you may want to use You need to understand these things to achieve optimal performance

50 Simple Example How to store data about students and the courses they take? –Fixed-length records (S#,C#), or –One variable-length record per student (S#,C#*): could save space (disk & memory), and all the courses for a given student can be found very efficiently Does the system allocate space in each record for the max number of courses? Does the system use truly variable-length records, but with overflow blocks? How efficient is it to search on C#?

51 Addresses of Records on Disks are Different from Addresses in Main Memory So, what does happen when a block of records is read into main memory?

52 Pointer Swizzling [Figure: the disk holds block 1 (with Rec B) and block 2 (with Rec A); memory holds a copy of block 1] Block 1 was read into memory and record B continues to point to record A on the disk

53 Now We Also Read Block 2 [Figure: disk and memory, with Rec B in block 1 pointing to Rec A in block 2] When reading block 2 into memory, we need to change (swizzle) the pointer to A in record B

54 A Table Translates DB Addresses to Memory Addresses [Table columns: DB Addr., Memory Addr.] This table is just for the DB addresses that are currently in memory –One entry per record or per block? This table is different from the one that translates logical addresses to physical ones (Slide 31)

55 Several Approaches to Swizzling Automatic swizzling –When reading a block into memory, the pointers in that block are swizzled if they are in the table –Is this enough? Swizzling on demand (lazy approach) No swizzling (i.e., use the table all the time) [Figure: a pointer consists of an address plus a bit indicating whether this is a DB address or a memory address]

56 Unswizzling At some point, a block B is removed from memory –To make room for another block If B was changed (while in memory), then first it has to be written to disk –Need to unswizzle the pointers in the block Must also update the table, and unswizzle pointers in memory that are pointing to B –Need a list of all the pointers in memory that point to B
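
Swizzling on demand and unswizzling can be sketched with a translation table and a list of referrers. This is a simplified illustration: the table has one entry per record rather than per block, a Python object reference stands in for a memory address, writing changed blocks back to disk is omitted, and the names `Pointer` and `BufferPool` are hypothetical.

```python
class Pointer:
    """A pointer that holds a DB address and, once swizzled, a memory reference."""

    def __init__(self, db_addr):
        self.db_addr = db_addr
        self.target = None  # the in-memory record once the pointer is swizzled

    @property
    def swizzled(self):
        # Plays the role of the bit that tells DB addresses from memory addresses.
        return self.target is not None


class BufferPool:
    def __init__(self):
        self.table = {}      # translation table: DB address -> in-memory record
        self.referrers = {}  # DB address -> swizzled pointers that point to it

    def load(self, db_addr, record):
        """Register a record that has just been read into memory."""
        self.table[db_addr] = record

    def follow(self, ptr):
        """Swizzle on demand: translate through the table on first use only."""
        if not ptr.swizzled:
            if ptr.db_addr not in self.table:
                raise LookupError("target is not in memory; read its block first")
            ptr.target = self.table[ptr.db_addr]
            self.referrers.setdefault(ptr.db_addr, []).append(ptr)
        return ptr.target

    def evict(self, db_addr):
        """Unswizzle every pointer to the record, then drop it from the table."""
        for ptr in self.referrers.pop(db_addr, []):
            ptr.target = None
        del self.table[db_addr]


pool = BufferPool()
record_a = {"name": "A"}
pointer_in_b = Pointer(db_addr=("block 2", 0))  # record B points to record A
pool.load(("block 2", 0), record_a)
print(pool.follow(pointer_in_b) is record_a, pointer_in_b.swizzled)
pool.evict(("block 2", 0))
print(pointer_in_b.swizzled)
```

The `referrers` dictionary is the list of all pointers in memory that point to the evicted data; without it, eviction would leave dangling memory references behind.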