Announcements Exam Friday Project: Steps –Due today
Practice Mapping
More Practice
Physical Storage Lecture 10
Storage Media [Figure 13.1: (a) A single-sided disk with read/write hardware. (b) A disk pack with read/write hardware. Labels: track, cylinder of tracks (imaginary), spindle, disk rotation, read/write head, arm, actuator, actuator movement]
Secondary Storage Device –Used because databases are too large to store in main memory –Permanent loss of data arises less frequently than with main memory –Cost of storage per unit of data is much less
Storage Media Disk storage terminology: disk pack, cylinder, track, sector (physical), block or page (logical) (2048 B is a standard block for a UNIX DB; 4096 B is a standard block for an IBM mainframe DB)
Blocking of Records Data arranged in files Transfer data in fixed-size blocks –System reads multiple logical records into a buffer at once (the number of records per block is the blocking factor)
Unblocked records: Hdr1 Rec1 | Hdr2 Rec2 | Hdr3 Rec3 | Hdr4 Rec4
Blocked records (blocking factor = 3): Hdr1 Rec1 Rec2 Rec3 | Hdr2 Rec4 Rec5 Rec6
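Record Format Fixed-length records – assumes all logical records are the same length –Spanned records: a record may cross a block boundary, so retrieving it can require multiple reads –Unspanned records: no record crosses a block boundary, which wastes the leftover space in each block
Spanned: Rec1 Rec2 Rec3-start | Rec3-rest Rec4 Rec5 Rec6-start
Unspanned: Rec1 Rec2 | Rec3 Rec4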
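The trade-off between spanned and unspanned records can be sketched with a small calculation. This is only an illustration: the function names and the sample sizes (100 B blocks, 40 B records, 5 records) are assumptions chosen for easy arithmetic, and block headers and span pointers are ignored.

```python
import math

def unspanned_blocks(num_records: int, record_size: int, block_size: int) -> int:
    """Blocks needed when no record may cross a block boundary."""
    per_block = block_size // record_size      # whole records that fit in one block
    return math.ceil(num_records / per_block)

def spanned_blocks(num_records: int, record_size: int, block_size: int) -> int:
    """Blocks needed when records may continue into the next block."""
    return math.ceil(num_records * record_size / block_size)

# Hypothetical sizes, not taken from the slides.
print(unspanned_blocks(5, 40, 100))  # 3 blocks, 20 B unused in each full block
print(spanned_blocks(5, 40, 100))    # 2 blocks, but Rec3 straddles two blocks
```

The spanned layout saves a block here, at the price of two reads for any record that straddles a boundary.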
Record Format Variable-length records –Impossible to add data to a record without relocating it –When deleting, either all subsequent records are moved up one slot, or the record is marked as deleted and ignored when reading (its space is made available for insertion) –Only records no longer than the deleted one can be stored in the freed space –Prime area (fixed-length records) and overflow area accessed with a pointer
Application A disk block is 2048 B. A record is 450 B. There are 10,000 records. 1. What is the blocking factor? 2. What is the number of blocks needed to store the entire table?
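One way to work the exercise is sketched below. It assumes unspanned fixed-length records and ignores any block header overhead; the function names are made up for this illustration.

```python
import math

def blocking_factor(block_size: int, record_size: int) -> int:
    """Whole records that fit in one block (unspanned organization)."""
    return block_size // record_size

def blocks_needed(num_records: int, bfr: int) -> int:
    """Blocks required to hold every record, rounding up for a partial last block."""
    return math.ceil(num_records / bfr)

bfr = blocking_factor(2048, 450)      # floor(2048 / 450)
blocks = blocks_needed(10_000, bfr)   # ceil(10000 / bfr)
unused = 2048 - bfr * 450             # bytes left over in each block

print(bfr)     # 4
print(blocks)  # 2500
print(unused)  # 248
```

Under these assumptions, each block holds 4 records and leaves 248 B unused, so the table occupies 2500 blocks.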
File organization File organization is described in terms of how the records are arranged. Sequential or ordered –Reading records in order of the key very efficient –Inserts and Deletes are expensive Heap or unsorted –Efficient insertion, but slow search and deletion Hashed –Fast access on certain search conditions –Efficient inserts and deletes
Data structures B+ Trees –An efficient and flexible hierarchical index that provides both sequential and direct access to records –Index has 2 parts: Index set; Sequence set – bottom level of the index (the leaf nodes) –All key values arranged in a sequence with a pointer from each key value
Example B+ Tree
Rules for Constructing a B+ Tree If the root is not a leaf, it must have at least two children. If the tree is order n, each interior node (that is, all nodes except the root and leaf nodes) must have between n/2 and n occupied pointers (and children). If n/2 is not an integer, round up to determine the minimum number of pointers.
Rules for Constructing a B+ Tree The number of key values contained in a non-leaf node is 1 less than the number of pointers. If the tree has order n, the number of occupied key values in a leaf node must be between (n-1)/2 and n-1. If (n-1)/2 is not an integer, round up to determine the minimum number of occupied key values. The tree must be balanced, that is, every path from the root node to a leaf must have the same length.
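The occupancy rules above can be tabulated for a given order. The helper below is a minimal sketch; the name `bplus_limits` and the dictionary keys are assumptions made for this example, not part of the lecture.

```python
import math

def bplus_limits(n: int) -> dict:
    """Occupancy bounds for a B+ tree of order n, following the rules above."""
    if n < 3:
        raise ValueError("order n should be at least 3 for a meaningful tree")
    return {
        "interior_min_pointers": math.ceil(n / 2),   # round up when n/2 is fractional
        "interior_max_pointers": n,
        "leaf_min_keys": math.ceil((n - 1) / 2),     # round up when (n-1)/2 is fractional
        "leaf_max_keys": n - 1,
        "root_min_children_if_not_leaf": 2,
    }

print(bplus_limits(4))
print(bplus_limits(5))
```

For order 4, an interior node holds 2 to 4 pointers and a leaf holds 2 to 3 keys; for order 5, the ranges are 3 to 5 pointers and 2 to 4 keys.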
Storage Capacity Number of records that can be stored in a B+ tree –$n^{d-1}(n-1)$ Each node in a tree is a block –How many records if 20 pointers per node and 3 levels?
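The capacity formula can be evaluated directly. In this sketch n is taken to be the number of pointers per node (the order) and d the number of levels, with every node assumed full; the function name is hypothetical.

```python
def bplus_capacity(n: int, d: int) -> int:
    """Maximum records indexed by a full B+ tree of order n with d levels.

    A full tree has n**(d-1) leaf nodes, and each leaf holds n-1 key values.
    """
    return n ** (d - 1) * (n - 1)

print(bplus_capacity(20, 3))  # 20**2 leaves * 19 keys per leaf = 7600
```

With 20 pointers per node and 3 levels, a full tree has 400 leaf blocks and can index 7600 records.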