Database Applications (15-415) DBMS Internals: Part II Lecture 13, February 25, 2018 Mohammad Hammoud.

Slides:



Advertisements
Similar presentations
The Bare Basics Storing Data on Disks and Files
Advertisements

Storing Data: Disk Organization and I/O
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Storing Data: Disks and Files Chapter 9 Yea, from the table of my memory Ill wipe away.
1 Storing Data: Disks and Files Chapter 7. 2 Disks and Files v DBMS stores information on (hard) disks. v This has major implications for DBMS design!
Storing Data: Disks and Files
FILES (AND DISKS).
CS4432: Database Systems II Buffer Manager 1. 2 Covered in week 1.
Introduction to Database Systems1 Records and Files Storage Technology: Topic 3.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Storing Data: Disks and Files Chapter 7.
Storing Data: Disks and Files
Buffer management.
1 Storing Data: Disks and Files Yanlei Diao UMass Amherst Feb 15, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke.
The Relational Model (cont’d) Introduction to Disks and Storage CS 186, Spring 2007, Lecture 3 Cow book Section 1.5, Chapter 3 (cont’d) Cow book Chapter.
1.1 CAS CS 460/660 Introduction to Database Systems File Organization Slides from UC Berkeley.
Physical Storage Organization. Advanced DatabasesPhysical Storage Organization2 Outline Where and How data are stored? –physical level –logical level.
1 Database Systems November 12/14, 2007 Lecture #7.
Introduction to Database Systems 1 Storing Data: Disks and Files Chapter 3 “Yea, from the table of my memory I’ll wipe away all trivial fond records.”
Layers of a DBMS Query optimization Execution engine Files and access methods Buffer management Disk space management Query Processor Query execution plan.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Storing Data: Disks and Files Chapter 9.
Storage and File Structure. Architecture of a DBMS.
Database Management 6. course. OS and DBMS DMBS DB OS DBMS DBA USER DDL DML WHISHESWHISHES RULESRULES.
Physical Storage Organization. Advanced DatabasesPhysical Storage Organization2 Outline Where and How are data stored? –physical level –logical level.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Storing Data: Disks and Files Chapter 7.
Physical Storage Susan B. Davidson University of Pennsylvania CIS330 – Database Management Systems November 20, 2007.
Introduction to Database Systems 1 Storing Data: Disks and Files Chapter 3 “Yea, from the table of my memory I’ll wipe away all trivial fond records.”
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Storing Data: Disks and Files Chapter 7 “ Yea, from the table of my memory I ’ ll wipe away.
1 Storing Data: Disks and Files Chapter 9. 2 Disks and Files  DBMS stores information on (“hard”) disks.  This has major implications for DBMS design!
“Yea, from the table of my memory I’ll wipe away all trivial fond records.” -- Shakespeare, Hamlet.
Physical Storage Organization. Advanced DatabasesPhysical Storage Organization2 Outline Where and How data are stored? –physical level –logical level.
Exam I Grades uMax: 96, Min: 37 uMean/Median:66, Std: 18 uDistribution: w>= 90 : 6 w>= 80 : 12 w>= 70 : 9 w>= 60 : 9 w>= 50 : 7 w>= 40 : 11 w>= 30 : 5.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Storing Data: Disks and Files Content based on Chapter 9 Database Management Systems, (3.
Database Management 6. course. OS and DBMS DMBS DB OS DBMS DBA USER DDL DML WHISHESWHISHES RULESRULES.
Storage and File structure COP 4720 Lecture 20 Lecture Notes.
Database Management Systems, R. Ramakrishnan and J. Gehrke 1 Storing Data: Disks and Files Chapter 7 Jianping Fan Dept of Computer Science UNC-Charlotte.
Database Applications (15-415) DBMS Internals- Part III Lecture 13, March 06, 2016 Mohammad Hammoud.
1 Storing Data: Disks and Files Chapter 9. 2 Objectives  Memory hierarchy in computer systems  Characteristics of disks and tapes  RAID storage systems.
Database Applications (15-415) DBMS Internals: Part II Lecture 12, February 21, 2016 Mohammad Hammoud.
Announcements Program 1 on web site: due next Friday Today: buffer replacement, record and block formats Next Time: file organizations, start Chapter 14.
Storing Data: Disks and Files Memory Hierarchy Primary Storage: main memory. fast access, expensive. Secondary storage: hard disk. slower access,
The very Essentials of Disk and Buffer Management.
CS222: Principles of Data Management Lecture #4 Catalogs, Buffer Manager, File Organizations Instructor: Chen Li.
CS522 Advanced database Systems
Database Applications (15-415) DBMS Internals- Part I Lecture 11, February 16, 2016 Mohammad Hammoud.
Module 11: File Structure
CS522 Advanced database Systems
Storing Data: Disks and Files
Storing Data: Disks and Files
Database Applications (15-415) DBMS Internals: Part II Lecture 11, October 2, 2016 Mohammad Hammoud.
CS522 Advanced database Systems
CS222/CS122C: Principles of Data Management Lecture #3 Heap Files, Page Formats, Buffer Manager Instructor: Chen Li.
Database Management Systems (CS 564)
Storing Data: Disks and Files
Lecture 10: Buffer Manager and File Organization
Database Applications (15-415) DBMS Internals- Part III Lecture 15, March 11, 2018 Mohammad Hammoud.
Disk Storage, Basic File Structures, and Buffer Management
CS222P: Principles of Data Management Lecture #2 Heap Files, Page structure, Record formats Instructor: Chen Li.
Database Systems November 2, 2011 Lecture #7.
Distributed Systems CS
Database Applications (15-415) DBMS Internals: Part III Lecture 14, February 27, 2018 Mohammad Hammoud.
Introduction to Database Systems
5. Disk, Pages and Buffers Why Not Store Everything in Main Memory
Storing Data: Disks and Files
CS222/CS122C: Principles of Data Management Lecture #4 Catalogs, File Organizations Instructor: Chen Li.
Basics Storing Data on Disks and Files
CS222p: Principles of Data Management Lecture #4 Catalogs, File Organizations Instructor: Chen Li.
Distributed Systems CS
CS222P: Principles of Data Management Lecture #3 Buffer Manager, PAX
Database Systems (資料庫系統)
Storing Data: Disks and Files
CS222/CS122C: Principles of Data Management UCI, Fall 2018 Notes #03 Row/Column Stores, Heap Files, Buffer Manager, Catalogs Instructor: Chen Li.
Presentation transcript:

Database Applications (15-415) DBMS Internals: Part II Lecture 13, February 25, 2018 Mohammad Hammoud

Today… Last Session: DBMS Internals- Part I Today’s Session: DBMS Internals- Part II A Brief Summary on Disks and the RAID Technology Disk and memory management Announcements: Project 2 is due on March 22 by midnight The Midterm exam is on March 13

DBMS Layers Queries Query Optimization and Execution Relational Operators Transaction Manager Files and Access Methods Recovery Manager Buffer Management Lock Manager Disk Space Management DB

Reliability + Performance Faloutsos CMU - 15-415/615 Multiple Disks Discussions on: Reliability Performance Reliability + Performance

Redundant Arrays of Independent Disks Faloutsos CMU - 15-415/615 Redundant Arrays of Independent Disks A system depending on N disks is much more likely to fail than one depending on one disk If the probability of one disk to fail is f Then, the probability of N disks to fail is (1-(1-f)N) How would we combine reliability with performance? Redundant Arrays of Inexpensive Disks (RAID) combines mirroring and striping Nowadays, Independent!

RAID Level 0 Data Striping RAID level 1: mirroring.

RAID Level 1 Data Mirroring RAID level 1: mirroring.

RAID Level 2 Bit Interleaving; ECC Data Data bits Check bits RAID level 2: bit interleaving with an error-correcting code (ECC). An example of ECC is hamming code, which can detect up to two-bit errors or correct one-bit errors without detection of uncorrected errors. By contrast, the simple parity code cannot correct errors, and can detect only an odd number of bits in error. Check bits

RAID Level 3 Bit Interleaving; Parity Data Data bits Parity bits RAID level 3: bit interleaving with parity bits. Parity bits

RAID Level 4 Block Interleaving; Parity Data Data blocks Parity blocks RAID level 4: block interleaving with parity blocks. Parity blocks

RAID Level 5 Block Interleaving; Parity Data Data and parity blocks RAID level 5: block interleaving with parity blocks. Rather than dedicating one disk to hold all the parity blocks, the parity blocks are distributed among all the disks. For stripe 1, the parity block might be on disk 1; for stripe 2 it would be on disk 2, and so forth. If we have eleven disks, then for stripe 11 the parity block would be back on disk 1.

RAID 4 vs. RAID 5 What if we have a lot of small writes? RAID 5 is better What if we have mostly large writes? Multiples of stripes Either is fine What if we want to expand the number of disks? RAID 5: add a disk, re-compute parity, and shuffle data blocks among all disks to reestablish the check-block pattern (expensive!)

Beyond Disks: Flash Flash memory is a relatively new technology providing the functionality needed to hold file systems and DBMSs It is writable It is readable Writing is slower than reading It is non-volatile Faster than disks, but slower than DRAMs Unlike disks, it allows random accesses Limited lifetime More expensive than disks

Disks: A “Very” Brief Summary Faloutsos CMU - 15-415/615 Disks: A “Very” Brief Summary DBMSs store data in disks Disks provide large, cheap, and non-volatile storage I/O time dominates! The cost depends on the locations of pages on disk (among others) It is important to arrange data sequentially to minimize seek and rotational delays

Disks: A “Very” Brief Summary Faloutsos CMU - 15-415/615 Disks: A “Very” Brief Summary Disks can cause reliability and performance problems To mitigate such problems we can adopt “multiple disks” and accordingly gain: More capacity Redundancy Concurrency To achieve only redundancy we can apply mirroring To achieve only concurrency we can apply striping To achieve redundancy and concurrency we can apply RAID (e.g., levels 2, 3, 4 or 5)

DBMS Layers Queries Query Optimization and Execution Relational Operators Transaction Manager Files and Access Methods Recovery Manager Buffer Management Lock Manager Disk Space Management DB

Disk Space Management DBMSs disk space managers Support the concept of a page as a unit of data Page size is usually chosen to be equal to the block size so that reading or writing a page can be done in 1 disk I/O Allocate/de-allocate pages as a contiguous sequence of blocks on disks Abstracts hardware (and possibly OS) details from higher DBMS levels

What to Keep Track of? The DBMS disk space manager keeps track of: Which pages are on which disk blocks Which disk blocks are in use (i.e., occupied by data blocks) Blocks can be initially allocated contiguously, but allocating and de-allocating blocks usually create “holes” Hence, a mechanism to keep track of free blocks is needed A list of free blocks can be maintained (storage could be an issue) Alternatively, a bitmap with one bit per each disk block can be maintained (more storage efficient and faster in identifying contiguous free areas!)

OS File Systems vs. DBMS Disk Space Managers Operating Systems already employ disk space managers using their “file” abstraction “Read byte i of file f”  “read block m of track t of cylinder c of disk d” DBMSs disk space managers usually pursue their own disk management without relying on OS file systems Enables portability Can address larger amounts of data Allows spanning and mirroring

DBMS Layers Queries Query Optimization and Execution Relational Operators Transaction Manager Files and Access Methods Recovery Manager Buffer Management Lock Manager Disk Space Management DB

Buffer Management What is a DBMS buffer manager? It is a software component responsible for managing (e.g., placing and replacing) data on memory (or buffer pool) It hides the fact that not all data is in the buffer pool Page Requests from Higher Levels Memory page = disk block Buffer pool in the main memory free frame DB choice of frame dictated by a replacement policy DISK

Satisfying Page Requests For each frame in the buffer pool, the DBMS buffer manager maintains: A pin_count variable: # of users of a page A dirty variable: whether a page has been modified or not If a page is requested and missed in the pool, the DBMS buffer manager: Chooses a frame for placement and increments its pin_count (a process known as pinning) If the frame is empty, fetches the page (or block) from disk and places it in the frame If the frame is occupied: If the hosted page is dirty, writes it back to the disk, then fetches and places the requested page in the frame Else, fetches and places the requested page in the frame

Replacement Policies A frame is not used to store a new page until its pin_count becomes 0 I.e., Until all requestors of the old page have unpinned it (a process known as unpinning) When many frames with pin_count = 0 are available, a specific replacement policy is pursued If no frame in the pool has pin_count = 0 and a page which is not in the pool is requested, the buffer manager must wait until some page is released!

Replacement Policies When a new page is to be placed in the pool, a resident page should be evicted first Criterion for an optimum replacement [Belady, 1966]: The page that will be accessed farthest in the future should be the one that is evicted Unfortunately, optimum replacement is not implementable! Hence, most buffer managers capitalize on a different criterion E.g., The page that was accessed the farthest back in the past is the one that is evicted, leading to a policy known as Least Recently Used (LRU) Other policies: MRU, Clock, FIFO, and Random, among others

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 Frame 1 Frame 2 Frame 3 Frame 4 # of Hits: # of Misses: # of Hits: # of Misses: # of Hits: # of Misses:

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 7 7 7 Frame 1 Frame 2 Frame 3 Frame 4 # of Hits: # of Misses: 1 # of Hits: # of Misses: 1 # of Hits: # of Misses: 1

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 0 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 7 7 7 Frame 1 Frame 2 Frame 3 Frame 4 # of Hits: # of Misses: 2 # of Hits: # of Misses: 2 # of Hits: # of Misses: 2

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 1 0 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 7 7 7 Frame 1 1 1 1 Frame 2 Frame 3 Frame 4 # of Hits: # of Misses: 3 # of Hits: # of Misses: 3 # of Hits: # of Misses: 3

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 2 1 0 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 2 7 7 Frame 1 1 1 1 Frame 2 2 2 Frame 3 Frame 4 # of Hits: # of Misses: 4 # of Hits: # of Misses: 4 # of Hits: # of Misses: 4

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 0 2 1 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 2 7 7 Frame 1 1 1 1 Frame 2 2 2 Frame 3 Frame 4 # of Hits: 1 # of Misses: 4 # of Hits: 1 # of Misses: 4 # of Hits: 1 # of Misses: 4

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 3 0 2 1 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 2 3 7 Frame 1 3 1 1 Frame 2 2 2 Frame 3 3 Frame 4 # of Hits: 1 # of Misses: 5 # of Hits: 1 # of Misses: 5 # of Hits: 1 # of Misses: 5

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 0 3 2 1 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 2 3 7 Frame 1 3 1 1 Frame 2 2 2 Frame 3 3 Frame 4 # of Hits: 2 # of Misses: 5 # of Hits: 2 # of Misses: 5 # of Hits: 2 # of Misses: 5

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 4 0 3 2 1 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 4 3 4 Frame 1 3 4 1 Frame 2 2 2 Frame 3 3 Frame 4 # of Hits: 2 # of Misses: 6 # of Hits: 2 # of Misses: 6 # of Hits: 2 # of Misses: 6

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 2 4 0 3 1 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 4 3 4 Frame 1 2 4 1 Frame 2 2 2 Frame 3 3 Frame 4 # of Hits: 2 # of Misses: 7 # of Hits: 3 # of Misses: 6 # of Hits: 3 # of Misses: 6

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 3 2 4 0 1 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 4 3 4 Frame 1 3 2 4 1 Frame 2 2 2 Frame 3 3 Frame 4 # of Hits: 2 # of Misses: 8 # of Hits: 4 # of Misses: 6 # of Hits: 4 # of Misses: 6

Example: LRU 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace LRU Chain: 0 3 2 4 1 7 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 Reference Trace Pool X (size = 3) Pool Y (size = 4) Pool Z (size = 5) Frame 0 3 4 Frame 1 3 2 4 1 Frame 2 2 2 Frame 3 3 Frame 4 # of Hits: 2 # of Misses: 9 # of Hits: 5 # of Misses: 6 # of Hits: 5 # of Misses: 6

Observation: The Stack Property Adding pool space never hurts, but it may or may not help This is referred to as the “Stack Property” LRU has the stack property, but not all replacement policies have it E.g., FIFO does not have it

Working Sets Given a time interval T, WorkingSet(T) is defined as the set of distinct pages accessed during T Its size (or what is referred to as the working set size) is all what matters It captures the adequacy of the cache size with respect to the program behavior What happens if a process performs repetitive accesses to some data in memory, with a working set size that is larger than the buffer pool?

The LRU Policy: Sequential Flooding To answer this question, assume: Three pages, A, B, and C An access pattern: A, B, C, A, B, C, etc. A buffer pool that consists of only two frames Access A: Page Fault Access B: Page Fault Access C: Page Fault Access A: Page Fault Access B: Page Fault Access C: Page Fault Access A: Page Fault A B A C B A C B A C B A C . . . Although the access pattern exhibits temporal locality, no locality was exploited! This phenomenon is known as “sequential flooding” For this access pattern, MRU works better!

Why Not Virtual Memory of OSs? Operating Systems already employ a buffer management technique known as virtual memory Virtual Address Virtual Pages Page # Offset 68K-76k --- 60K-68k --- 52K-60k --- 44K-52k Physical Pages Virtual Address Space --- 32K-40k 16K-24k 16K-24k 8K-16k --- 8K-16k 0K-8k 0K-8k Physical Address Space

Why Not Virtual Memory of OSs? Nonetheless, DBMSs pursue their own buffer management so that they can: Predict page reference patterns more accurately and applying effective strategies (e.g., page prefetching for improving performance) Force pages to disks Needed for the WAL protocol Typically, the OS cannot guarantee this!

DBMS Layers Queries Query Optimization and Execution Relational Operators Transaction Manager Files and Access Methods Recovery Manager Buffer Management Lock Manager Disk Space Management DB

Records, Pages and Files Higher-levels of DBMSs deal with records (not pages!) At lower-levels, records are stored in pages But, a page might not fit all records of a database Hence, multiple pages might be needed A collection of pages is denoted as a file A File A Record … A Page

File Operations and Organizations A file is a collection of pages, each containing a collection of records Files must support operations like: Insert/Delete/Modify records Read a particular record (specified using a record id) Scan all records (possibly with some conditions on the records to be retrieved) There are several organizations of files: Heap Sorted Indexed

Heap Files Records in heap file pages do not follow any particular order As a heap file grows and shrinks, disk pages are allocated and de-allocated To support record level operations, we must: Keep track of the pages in a file Keep track of the records on each page Keep track of the fields on each record

Supporting Record Level Operations Keeping Track of Pages in a File Records in a Page Fields in a Record

Heap Files Using Lists of Pages A heap file can be organized as a doubly linked list of pages The Header Page (i.e., <heap_file_name, page_1_addr> is stored in a known location on disk Each page contains 2 ‘pointers’ plus data Header Page Data Available Pages with Free Space Full Pages

Heap Files Using Lists of Pages It is likely that every page has at least a few free bytes Thus, virtually all pages in a file will be on the free list! To insert a typical record, we must retrieve and examine several pages on the free list before one with enough free space is found This problem can be addressed using an alternative design known as the directory-based heap file organization

Heap Files Using Directory of Pages A directory of pages can be maintained whereby each directory entry identifies a page in the heap file Free space can be managed via maintaining: A bit per entry (indicating whether the corresponding page has any free space) A count per entry (indicating the amount of free space on the page) Data Page 1 Page 2 Page N Header Page DIRECTORY

Supporting Record Level Operations Keeping Track of Pages in a File Records in a Page Fields in a Record

Page Formats A page of a file can be thought of as a collection of slots, each of which contains a record A record can be identified using the pair <page_id, slot_#>, which is typically referred to as record id (rid) Records can be either: Fixed-Length Variable-Length Slot 1 Slot 2 Slot M . . .

Fixed-Length Records When records are of fixed-length, slots become uniform and can be arranged consecutively Records can be located by simple offset calculations Whenever a record is deleted, the last record on the page is moved into the vacated slot This changes its rid <page_id, slot_#> (may not be acceptable!) Slot 1 Slot 2 Slot N . . . N Free Space Number of Records

Fixed-Length Records Alternatively, we can handle deletions by using an array of bits When a record is deleted, its bit is turned off, thus, the rids of currently stored records remain the same! . . . M 1 M ... 3 2 1 Slot 1 Slot 2 Slot M Number of Slots (NOT Records) Free Space

Variable-Length Records If the records are of variable length, we cannot divide the page into a fixed collection of slots When a new record is to be inserted, we have to find an empty slot of “just” the right length Thus, when a record is deleted, we better ensure that all the free space is contiguous The ability of moving records “without changing rids” becomes crucial!

Pages with Directory of Slots A flexible organization for variable-length records is to maintain a directory of slots with a <record_offset, record_length> pair per a page Rid = (i,N) Page i Rid = (i,2) Rid = (i,1) 20 16 24 N Pointer to start of free space N . . . 2 1 # Slots Records can be moved without changing rids! SLOT DIRECTORY

Supporting Record Level Operations Keeping Track of Pages in a File Records in a Page Fields in a Record

Record Formats Fields in a record can be either of: Fixed-Length: each field has a fixed length and the number of fields is also fixed Variable-Length: fields are of variable lengths but the number of fields is fixed Information common to all records (e.g., number of fields and field types) are stored in the system catalog

Fixed-Length Fields Fixed-length fields can be stored consecutively and their addresses can be calculated using information about the lengths of preceding fields F1 F2 F3 F4 L1 L2 L3 L4 Base address (B) Address = B+L1+L2

Variable-Length Fields There are two possible organizations to store variable-length fields Consecutive storage of fields separated by delimiters F1 F2 F3 F4 4 $ Fields Delimited by Special Symbols Field Count This entails a scan of records to locate a desired field!

Variable-Length Fields There are two possible organizations to store variable-length fields Consecutive storage of fields separated by delimiters Storage of fields with an array of integer offsets F1 F2 F3 F4 Array of Field Offsets This offers direct access to a field in a record and stores NULL values efficiently!

Next Class Queries Cont’d Query Optimization and Execution Relational Operators Transaction Manager Files and Access Methods Recovery Manager Buffer Management Lock Manager Disk Space Management DB