CS4433 Database Systems Indexing
Why Do We Learn This? To find the desired information (by value) in the database (very) quickly, e.g., the author catalog in a library. Topics: common properties of indexes, B+ trees, hash tables.
What is Indexing? A “labeled” pointer to an item (or a collection of items) that satisfies some common property. Examples in the real world?
Theoretically, an Index is … An index on a file speeds up selections on the search key attribute(s). Search key = any subset of the attributes of a relation, i.e., the attributes used to look up records in a file. A search key is not the same as a key (a minimal set of attributes that uniquely identifies a tuple (record) in a relation). An index file consists of records (called index entries). Entries in an index have the form (K, R), where K is the search key and R is a pointer to the record, a record id, or a list of record ids. Index files are typically much smaller than the original file.
Types of Indexes. Ordered/Hash: ordered indices store index entries sorted on the search key value (e.g., author catalog in library); hash indices distribute index entries uniformly across “buckets” based on the search key using a “hash function”. Clustered/Unclustered: clustered = records are sorted in the search key order; unclustered = they are not. Dense/Sparse: dense = an index record appears for every search-key value in the file; sparse = only some search-key values have an index record. Primary/Secondary: primary = on the primary key; secondary = on any other key. (Some textbooks interpret these differently.) Implementation: B+ tree / hash table / …
Clustered, Dense Index. Clustered: file is sorted on the index attribute. Dense: sequence of (key, pointer) pairs. [Figure: dense index with entries 10, 20, ..., 80, each pointing to the record with that key in a data file sorted on the same attribute]
Clustered, Dense Index index on ID attribute of instructor relation
Dense Index Files (Cont.) Dense index on dept_name, with instructor file sorted on dept_name
Clustered, Sparse Index. Sparse index: contains index records for only some search-key values, e.g., one key per data block. Applicable when records are sequentially ordered on the search key. Saves space, but sacrifices lookup efficiency. [Figure: sparse index with entries 10, 30, 50, 70, 90, 110, 130, 150, each pointing to a data block; the first blocks hold keys 10 20 | 30 40 | 50 60 | 70 80]
Sparse Index Files. To locate a record with search-key value K we: find the index record with the largest search-key value <= K, then search the file sequentially starting at the record to which the index record points.
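The two lookup steps can be sketched in a few lines of Python. This is only an illustration under assumed data: the `blocks` list, the record labels, and the function name `sparse_lookup` are hypothetical, and the index keeps one entry (the least key) per block.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of (search_key, record) pairs, sorted on the key.
blocks = [
    [(10, "r10"), (20, "r20")],
    [(30, "r30"), (40, "r40")],
    [(50, "r50"), (60, "r60")],
    [(70, "r70"), (80, "r80")],
]

# Sparse index: the least search key of each block; entry i refers to blocks[i].
sparse_index = [block[0][0] for block in blocks]


def sparse_lookup(key):
    """Return the record stored under `key`, or None if the key is absent."""
    # Step 1: index entry with the largest search-key value <= key.
    pos = bisect_right(sparse_index, key) - 1
    if pos < 0:
        return None  # key is smaller than every key in the file
    # Step 2: sequential scan starting at the block that entry points to.
    for block in blocks[pos:]:
        for k, record in block:
            if k == key:
                return record
            if k > key:
                return None  # passed the place where key would have been
    return None
```

Because the file is sorted, the scan can stop at the first key larger than the one searched for.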
Sparse Index Files (Cont.) Compared to dense indices: Less space and less maintenance overhead for insertions and deletions. Generally slower than dense index for locating records. Good tradeoff: sparse index with an index entry for every block in file, corresponding to least search-key value in the block.
Clustered Index with Duplicate Keys. Dense index: each entry points to the first record with that key. [Figure: dense index over a sorted data file that contains duplicate keys]
Unclustered Indexes. Often used for indexing attributes other than the primary key. Always dense (why?) The locality of values has been broken! [Figure: index sorted on keys 10, 20, 30 pointing into a data file in which those values appear in no particular order]
Clustered vs. Unclustered Index. [Figure: two panels, CLUSTERED and UNCLUSTERED; in each, index entries (index file) point to data records (data file)]
Secondary Indices Example. Secondary index on the salary field of instructor. An index record points to a bucket that contains pointers to all the actual records with that particular search-key value. Secondary indices have to be dense.
Primary and Secondary Indices. Indices offer substantial benefits when searching for records. BUT: updating indices imposes overhead on database modification: when a file is modified, every index on the file must be updated. A sequential scan using the primary index is efficient, but a sequential scan using a secondary index is expensive: each record access may fetch a new block from disk. A block fetch requires about 5 to 10 milliseconds, versus about 100 nanoseconds for a memory access.
Multilevel Index If primary index does not fit in memory, access becomes expensive. Solution: treat primary index kept on disk as a sequential file and construct a sparse index on it. outer index – a sparse index of primary index inner index – the primary index file If even outer index is too large to fit in main memory, yet another level of index can be created, and so on. Indices at all levels must be updated on insertion or deletion from the file.
Multilevel Index (Cont.)
Index Update: Deletion If deleted record was the only record in the file with its particular search-key value, the search-key is deleted from the index also. Single-level index entry deletion: Dense indices – deletion of search-key is similar to file record deletion. Sparse indices – if an entry for the search key exists in the index, it is deleted by replacing the entry in the index with the next search-key value in the file (in search-key order). If the next search-key value already has an index entry, the entry is deleted instead of being replaced.
Index Update: Insertion Single-level index insertion: Perform a lookup using the search-key value appearing in the record to be inserted. Dense indices – if the search-key value does not appear in the index, insert it. Sparse indices – if index stores an entry for each block of the file, no change needs to be made to the index unless a new block is created. If a new block is created, the first search-key value appearing in the new block is inserted into the index. Multilevel insertion and deletion: algorithms are simple extensions of the single-level algorithms
Secondary Indices Frequently, one wants to find all the records whose values in a certain field (which is not the search-key of the primary index) satisfy some condition. Example 1: In the instructor relation stored sequentially by ID, we may want to find all instructors in a particular department Example 2: as above, but where we want to find all instructors with a specified salary or with salary in a specified range of values We can have a secondary index with an index record for each search-key value
B+ Trees. What’s wrong with a sequential index? Pros: easy/fast to access. Cons: hard to maintain the sequential property upon updates; performance degrades as the file grows, since many overflow blocks get created; periodic reorganization of the entire file is required. B+ tree intuition: give up sequentiality of the index and try to get “balance” by dynamic reorganization. A B+ tree automatically reorganizes itself with small, local changes in the face of insertions and deletions; reorganization of the entire file is not required to maintain performance. (Minor) disadvantages of B+-trees: extra insertion and deletion overhead, space overhead.
Example of B+-Tree
B+-Tree Index Files (Cont.) A B+-tree is a rooted tree satisfying the following properties: All paths from root to leaf are of the same length. Parameter d = the degree (order). Each node has between d and 2d keys (except the root). Each interior node other than the root has between d+1 and 2d+1 child pointers. A leaf node has between d and 2d record pointers plus one pointer to the next leaf (d+1 to 2d+1 pointers in total). Special cases: if the root is not a leaf, it has at least 2 children; if the root is a leaf (that is, there are no other nodes in the tree), it can have between 0 and 2d values.
B+-Tree Node Structure. Typical node: P1 K1 P2 K2 P3 K3 P4. Ki are the search-key values. Pi are pointers to children (for non-leaf nodes) or pointers to records or buckets of records (for leaf nodes). The search-keys in a node are ordered K1 < K2 < K3 < . . . < Kn–1 (initially assume no duplicate keys; duplicates are addressed later). In a non-leaf node with keys K1 K2 K3, the pointers p1, p2, p3, p4 lead to subtrees covering the key ranges [X, K1), [K1, K2), [K2, K3), [K3, Y).
B+ Trees Basics. Internal node: keys 30, 120, 240, with child pointers for the ranges [30, 120), [120, 240), [240, Y), plus one for keys below 30. Leaf: keys 40, 50, 60, each with a pointer to its record, plus a pointer to the next leaf.
Leaf Nodes in B+-Trees. Properties of a leaf node: for i = 1, 2, . . ., 2d, pointer Pi points to a file record with search-key value Ki; P2d+1 points to the next leaf node in search-key order.
Non-Leaf Nodes in B+-Trees. Non-leaf nodes form a multi-level sparse index on the leaf nodes. For a non-leaf node with 2d+1 pointers (keys K1 .. K2d, pointers P1 .. P2d+1): all the search-keys in the subtree to which P1 points are less than K1; for 2 <= i <= 2d, all the search-keys in the subtree to which Pi points have values greater than or equal to Ki–1 and less than Ki; all the search-keys in the subtree to which P2d+1 points have values greater than or equal to K2d.
Searching a B+ Tree. Point queries with exact key values: start at the root and proceed down to the leaf. Example: Select name From people Where age = 25. Range queries: as above, then a sequential traversal of the leaves. Example: Select name From people Where 20 <= age and age <= 30.
Queries on B+-Trees. Find record with search-key value V.
C = root
While C is not a leaf node {
  Let i be the least value s.t. V <= Ki.
  If no such i exists, set C = last non-null pointer in C
  Else { if (V = Ki) set C = Pi+1 else set C = Pi }
}
Let i be the least value s.t. Ki = V.
If there is such a value i, follow pointer Pi to the desired record.
Else no record with search-key value V exists.
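The descent for point and range queries might look as follows in Python. This is a minimal in-memory sketch, not a disk-based implementation: the `Node` class, the record labels, and the small three-leaf example tree are assumptions made for illustration.

```python
from bisect import bisect_right


class Node:
    """One B+ tree node: a leaf has `records`, an interior node has `children`."""

    def __init__(self, keys, children=None, records=None):
        self.keys = keys
        self.children = children
        self.records = records
        self.next_leaf = None  # used by leaves only

    @property
    def is_leaf(self):
        return self.children is None


def find_leaf(root, v):
    """Walk from the root to the leaf that would contain key v."""
    node = root
    while not node.is_leaf:
        # bisect_right counts the keys <= v, so v == Ki follows the pointer
        # to the right of Ki, and v smaller than every key follows P1.
        node = node.children[bisect_right(node.keys, v)]
    return node


def search(root, v):
    """Point query: the record with key v, or None."""
    leaf = find_leaf(root, v)
    for k, rec in zip(leaf.keys, leaf.records):
        if k == v:
            return rec
    return None


def range_search(root, lo, hi):
    """Range query: all records with lo <= key <= hi, via the leaf chain."""
    leaf, out = find_leaf(root, lo), []
    while leaf is not None:
        for k, rec in zip(leaf.keys, leaf.records):
            if k > hi:
                return out
            if k >= lo:
                out.append(rec)
        leaf = leaf.next_leaf
    return out


def make_leaf(keys):
    return Node(keys, records=[f"rec{k}" for k in keys])


# Hypothetical example tree: one interior root over three chained leaves.
leaves = [make_leaf([10, 15, 18]), make_leaf([20, 30, 40, 50]), make_leaf([60, 65])]
for left, right in zip(leaves, leaves[1:]):
    left.next_leaf = right
root = Node([20, 60], children=leaves)
```

For instance, `range_search(root, 18, 40)` starts in the first leaf and follows one next-leaf pointer before stopping at key 50.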
B+ Tree Example. Root (d=1), other nodes d = 2. Queries: Select name From person Where age = 30; (Where age >= 30); (Where 20 <= age and age <= 30). [Figure: B+ tree with root 80, interior nodes (20, 60) and (100, 120, 140), and leaf keys 10 15 18 20 30 40 50 60 65 80 85 90 pointing to the data records]
B+ Tree Design. How large is d? Each block must have space for 2d search keys and 2d+1 pointers. Pick d as large as possible such that a node still fits into a block. Example 14.10: Key size = 4 bytes, Pointer size = 8 bytes, Block size = 4096 bytes. 2d x 4 + (2d+1) x 8 <= 4096, so d = 170.
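The same calculation can be written as a small Python helper. The function name `max_order` is made up here; it simply solves the inequality 2d x key + (2d+1) x pointer <= block for the largest integer d.

```python
def max_order(block_size, key_size, pointer_size):
    """Largest d with 2d*key_size + (2d+1)*pointer_size <= block_size."""
    # Rearranged: 2d * (key_size + pointer_size) <= block_size - pointer_size
    return (block_size - pointer_size) // (2 * (key_size + pointer_size))


d = max_order(block_size=4096, key_size=4, pointer_size=8)
used = 2 * d * 4 + (2 * d + 1) * 8  # bytes used by a full node of order d
```

With the numbers from the example, a full node occupies 4088 of the 4096 bytes, and one more unit of d would not fit.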
B+ Trees in Practice. Typical order: 100. Typical fill-factor: 67%, giving an average fan-out of 133. Typical capacities: height 4: 133^4 = 312,900,721 records; height 3: 133^3 = 2,352,637 records. Can often hold top levels in buffer pool: Level 1 = 1 page = 8 Kbytes; Level 2 = 133 pages = about 1 Mbyte; Level 3 = 17,689 pages = about 138 Mbytes.
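These figures follow directly from the fan-out, as the short Python check below shows. The fan-out of 133 and the 8 Kbyte page size are taken from the slide; the function names are illustrative.

```python
FAN_OUT = 133  # average fan-out from the slide
PAGE_KB = 8    # page size in Kbytes from the slide


def records_at_height(height):
    """Number of records reachable through a tree of the given height."""
    return FAN_OUT ** height


def pages_at_level(level):
    """Number of pages on a level, counting the root as level 1."""
    return FAN_OUT ** (level - 1)


def level_size_kb(level):
    """Buffer space needed to keep a whole level in memory, in Kbytes."""
    return pages_at_level(level) * PAGE_KB
```

Level 3 needs 17,689 x 8 = 141,512 Kbytes, which is why keeping the top two or three levels in the buffer pool is usually realistic.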
Insertion in a B+ Tree. Insert (K, P): Find the leaf where K belongs and insert. If no overflow (2d keys or fewer), halt. If overflow (2d+1 keys), split the node and insert into the parent: a node with keys K1 K2 K3 K4 K5 and pointers P0 .. P5 becomes a left node K1 K2 (P0 P1 P2) and a right node K4 K5 (P3 P4 P5), and (K3, pointer to the right node) goes to the parent. If the node is a leaf, keep K3 in the right node too. When the root splits, the new root has only 1 key; that’s why the root is special for degree satisfaction.
Insertion in a B+ Tree: Insert K=19. [Figure: B+ tree (d = 2) with root 80, interior nodes (20, 60) and (100, 120, 140), leaf keys 10 15 18 | 20 30 40 50 | 60 65 | 80 85 90]
Insertion in a B+ Tree: After Insertion. [Figure: same tree with 19 added to the first leaf: 10 15 18 19 | 20 30 40 50 | 60 65 | 80 85 90]
Insertion in a B+ Tree: Now Insert K=25. [Figure: same tree, before 25 is inserted]
Insertion in a B+ Tree: After Insertion. [Figure: the second leaf now holds five keys, 20 25 30 40 50, more than 2d = 4]
Insertion in a B+ Tree: Now Split. [Figure: the overflowing leaf 20 25 30 40 50 is about to be split]
Insertion in a B+ Tree: After the Split. [Figure: root 80, interior nodes (20, 30, 60) and (100, 120, 140), leaf keys 10 15 18 19 | 20 25 | 30 40 50 | 60 65 | 80 85 90]
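The split step used in this example can be sketched in Python as follows. Nodes are reduced to plain sorted key lists (pointers are left out), and the function names `insert_into_leaf` and `split_interior` are illustrative assumptions.

```python
from bisect import insort


def insert_into_leaf(keys, new_key, d):
    """Insert new_key into a sorted leaf, splitting it on overflow.

    Returns (left, right, separator); right and separator are None
    when the leaf still holds at most 2d keys.
    """
    keys = list(keys)
    insort(keys, new_key)
    if len(keys) <= 2 * d:
        return keys, None, None
    # Overflow (2d+1 keys): the left leaf keeps the first d keys, the right
    # leaf keeps the rest, and the middle key is copied up as the separator.
    left, right = keys[:d], keys[d:]
    return left, right, right[0]


def split_interior(keys, d):
    """Split an overflowing interior node (2d+1 keys).

    Unlike a leaf split, the middle key moves up and stays in neither half.
    """
    return keys[:d], keys[d], keys[d + 1:]
```

Inserting 25 into the leaf 20 30 40 50 with d = 2 yields the leaves 20 25 and 30 40 50 and sends 30 to the parent, matching the tree after the split.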
Deletion from a B+ Tree: Delete 30. [Figure: root 80, interior nodes (20, 30, 60) and (100, 120, 140), leaf keys 10 15 18 19 | 20 25 | 30 40 50 | 60 65 | 80 85 90]
Deletion from a B+ Tree: After Deleting 30. The separator 30 in the interior node may change to 40, or not. [Figure: leaf keys 10 15 18 19 | 20 25 | 40 50 | 60 65 | 80 85 90]
Deletion from a B+ Tree: Delete 25. [Figure: same tree, before 25 is deleted]
Deletion from a B+ Tree: After deleting 25, need to rebalance: rotate. [Figure: the second leaf holds only 20, fewer than d = 2 keys; its left sibling holds 10 15 18 19]
Deletion from a B+ Tree: Now Delete 40. [Figure: after the rotation, interior node (19, 30, 60), leaf keys 10 15 18 | 19 20 | 40 50 | 60 65 | 80 85 90]
Deletion from a B+ Tree: After deleting 40, rotation is not possible. Need to merge nodes. [Figure: the third leaf holds only 50; its siblings 19 20 and 60 65 have no key to spare]
Deletion from a B+ Tree: Final Tree. [Figure: root 80, interior nodes (19, 60) and (100, 120, 140), leaf keys 10 15 18 | 19 20 50 | 60 65 | 80 85 90]
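The choice between rotating and merging can be illustrated with a simplified Python sketch. It assumes leaves are plain sorted key lists and, to stay short, only ever looks at the left sibling; a fuller implementation would also consider the right sibling and update the parent. The name `delete_key` is hypothetical.

```python
def delete_key(left, leaf, key, d):
    """Delete key from `leaf` and rebalance against its left sibling `left`.

    Returns (new_left, new_leaf, action), where action is "none", "rotate"
    or "merge"; new_leaf is None after a merge.
    Simplifying assumption: only the left sibling is considered.
    """
    left = list(left)
    leaf = [k for k in leaf if k != key]
    if len(leaf) >= d:
        return left, leaf, "none"      # still at least d keys: nothing to do
    if len(left) > d:
        leaf.insert(0, left.pop())     # borrow the largest key of the sibling
        return left, leaf, "rotate"    # parent separator becomes leaf[0]
    return left + leaf, None, "merge"  # parent loses one separator and pointer
```

Replaying the three deletions above (30, then 25, then 40) gives "none", "rotate" and "merge" in that order.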
B Tree. Idea: avoid duplicate keys; have record pointers in non-leaf nodes. [Figure: a non-leaf node K1 P1 K2 P2 K3 P3; each Pi points to the record with key Ki, and the child pointers lead to keys x < K1, K1 < x < K2, K2 < x < K3, and x > K3]
B-Tree Example (d = 2). Sequence pointers are not useful now! [Figure: B-tree with root (65, 125), interior nodes (25, 45), (85, 105), (145, 165), and leaf keys 10 20 | 30 40 | 50 60 | 70 80 | 90 100 | 110 120 | 130 140 | 150 160 | 170 180]
Hash Tables Recall basics There are n buckets A hash function f(k) maps a key k to {0, 1, …, n-1} Store in bucket f(k) a pointer to record with key k Secondary storage: bucket = block, use overflow blocks when needed
Hash Table Example. Assume 1 bucket (block) stores 2 keys + pointers. h(e)=0, h(b)=h(f)=1, h(g)=2, h(a)=h(c)=3. [Figure: buckets 0 to 3 holding e | b f | g | a c]
Searching in a Hash Table. Search for a: compute h(a)=3, read bucket 3: 1 disk access. [Figure: buckets 0 to 3 holding e | b f | g | a c]
Insertion in Hash Table. Place in the right bucket, if there is space. E.g. h(d)=2. [Figure: buckets 0 to 3 holding e | b f | g d | a c]
Insertion in Hash Table. Create an overflow block, if there is no space. E.g. h(k)=1. More overflow blocks may be needed. [Figure: buckets 0 to 3 holding e | b f, with overflow block k | g d | a c]
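A small Python model of this example is sketched below. The class name `StaticHashTable` is hypothetical, and the lookup table `EXAMPLE_HASH` merely stands in for a real hash function by returning the values given on the slides.

```python
BLOCK_CAPACITY = 2  # one block stores 2 keys, as in the example

# Hash values taken from the slides; a lookup table replaces a real hash function.
EXAMPLE_HASH = {"a": 3, "b": 1, "c": 3, "d": 2, "e": 0, "f": 1, "g": 2, "k": 1}


class StaticHashTable:
    def __init__(self, n, h):
        self.h = h
        # Each bucket is a chain of blocks: the primary block, then overflow blocks.
        self.buckets = [[[]] for _ in range(n)]

    def insert(self, key):
        chain = self.buckets[self.h(key)]
        if len(chain[-1]) == BLOCK_CAPACITY:
            chain.append([])  # no space left: create an overflow block
        chain[-1].append(key)

    def search(self, key):
        """Return (found, blocks_read); every block visited is one disk access."""
        blocks_read = 0
        for block in self.buckets[self.h(key)]:
            blocks_read += 1
            if key in block:
                return True, blocks_read
        return False, blocks_read


table = StaticHashTable(4, EXAMPLE_HASH.__getitem__)
for key in "ebfgac":
    table.insert(key)
```

Searching for a reads a single block; after d and k are inserted, finding k costs two reads because it sits in an overflow block of bucket 1.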
Hash Table Performance Excellent, if no overflow blocks Degrades considerably when number of keys exceeds the number of buckets (i.e. many overflow blocks) Other problems: Memory requirement Dynamic maintenance Equality queries only!
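Extensible Hash Table. Allows the hash table to grow, to avoid performance degradation. Assume a hash function h that returns numbers in {0, …, 2^k – 1}. Start with n = 2^i << 2^k (size of the hash table); only look at the first i most significant bits. E.g. i=1, n=2, k=4. [Figure: i=1; entry 0 points to a block holding 0(010), entry 1 points to a block holding 1(011); each block is labeled with the number of bits it uses, 1]
Insertion in Extensible Hash Table. [Figure: i=1; block 0 holds 0(010); block 1 holds 1(011), 1(110)]
Insertion in Extensible Hash Table. Now insert 1010. Need to extend the table and split blocks; i becomes 2, so n=4. [Figure: i=1; block 1 would have to hold 1(011), 1(110), 1(010)]
Insertion in Extensible Hash Table. Now insert 1010: doubling the hash table. [Figure: i=2; entries 00 and 01 both point to the block holding 0(010), which uses 1 bit; entry 10 points to a block holding 10(11), 10(10); entry 11 points to a block holding 11(10); the last two blocks use 2 bits]
Insertion in Extensible Hash Table. Now insert 0000, then 0101. Need to split a block. [Figure: i=2; the block shared by entries 00 and 01 holds 0(010), 0(000) and has no room for 0(101); entries 10 and 11 as before]
Insertion in Extensible Hash Table. After splitting the block. [Figure: i=2; entry 00: 00(10), 00(00); entry 01: 01(01); entry 10: 10(11), 10(10); entry 11: 11(10); every block uses 2 bits]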
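The doubling and splitting steps can be sketched in Python as below. This is an in-memory illustration under several assumptions: keys are already K-bit hash values, all inserted values are distinct, blocks hold two keys, and the class names `Block` and `ExtensibleHash` are made up.

```python
K = 4         # hash values are K-bit integers
CAPACITY = 2  # keys per block


class Block:
    def __init__(self, depth):
        self.depth = depth  # number of leading bits this block uses
        self.keys = []


class ExtensibleHash:
    def __init__(self):
        self.i = 1  # number of leading bits used by the directory
        self.directory = [Block(1), Block(1)]

    def _slot(self, h):
        return h >> (K - self.i)  # the first i most significant bits

    def insert(self, h):
        block = self.directory[self._slot(h)]
        if len(block.keys) < CAPACITY:
            block.keys.append(h)
            return
        if block.depth == self.i:
            # The full block already uses all i bits: double the directory.
            self.directory = [b for b in self.directory for _ in range(2)]
            self.i += 1
        # Split the full block on its next bit and repoint its directory slots.
        depth = block.depth + 1
        halves = (Block(depth), Block(depth))
        for slot, b in enumerate(self.directory):
            if b is block:
                bit = (slot >> (self.i - depth)) & 1
                self.directory[slot] = halves[bit]
        # Redistribute the old keys and retry the new one (may split again).
        for key in block.keys + [h]:
            self.insert(key)
```

Inserting 0010, 1011, 1110, 1010, 0000 and 0101 in that order reproduces the sequence on the slides: one doubling when 1010 arrives, then a split without doubling when 0101 arrives.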
Performance of Extensible Hash Table. No overflow blocks: access is always one read. BUT: extensions can be costly and disruptive; after an extension the table may no longer fit in memory.
Linear Hash Table. Idea: extend only one entry at a time. Problem: n is no longer a power of 2. Let i be the number of bits necessary to address n buckets: 2^(i-1) < n <= 2^i. After computing h(k), use the last i bits: if the last i bits represent a number >= n, change the msb from 1 to 0 (to get a number < n).
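The addressing rule can be written as a short Python function. The name `bucket_for` is illustrative, and hash values are assumed to be plain non-negative integers.

```python
def bucket_for(h, n):
    """Bucket number in [0, n) for hash value h when n buckets exist."""
    # i = number of bits needed to address n buckets: 2**(i-1) < n <= 2**i
    i = max(1, (n - 1).bit_length())
    m = h & ((1 << i) - 1)  # the last i bits of h
    if m >= n:
        m -= 1 << (i - 1)   # that bucket does not exist yet: flip the msb to 0
    return m
```

With n = 3 buckets (i = 2), a hash value ending in 11 addresses bucket 3, which does not exist, so it is redirected to bucket 01.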
Linear Hash Table Example (i=2). Insert (01)11. Bit flip: 11 becomes 01. [Figure: bucket 00 holds (01)00, (11)00; bucket 01 holds (01)11 after the bit flip; bucket 10 holds (10)10]
Linear Hash Table Example. Insert 1000: overflow blocks… [Figure: i=2; bucket 00 holds (01)00, (11)00 and needs an overflow block for (10)00; bucket 01 holds (01)11; bucket 10 holds (10)10]
Linear Hash Tables. Extension: independent of overflow blocks. Extend n := n+1 when the average block occupancy exceeds (say) 80%.
Linear Hash Table Extension. From n=3 to n=4. Only need to touch one block (which one?). [Figure: before, i=2: bucket 00 holds (01)00, (11)00; bucket 01 holds (01)11; bucket 10 holds (10)10. After, i=2: bucket 00 holds (01)00, (11)00; bucket 01 is empty; bucket 10 holds (10)10; bucket 11 holds (01)11]
Linear Hash Table Extension. From n=3 to n=4 finished. Extension from n=4 to n=5 (new bit): need to touch every single block (why?). Need to look at the last 3 bits, which affects all keys. [Figure: i=2; bucket 00 holds (01)00, (11)00; bucket 01 is empty; bucket 10 holds (10)10; bucket 11 holds (01)11]
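The step from n=3 to n=4 can be sketched in Python as follows. Buckets are modeled as plain lists of hash values without overflow blocks, and the function names are assumptions; `bucket_for` applies the last-i-bits rule with the msb flip.

```python
def bucket_for(h, n):
    """Bucket number in [0, n) for hash value h when n buckets exist."""
    i = max(1, (n - 1).bit_length())  # 2**(i-1) < n <= 2**i
    m = h & ((1 << i) - 1)            # last i bits
    return m - (1 << (i - 1)) if m >= n else m


def extend(buckets):
    """Grow the table from n to n+1 buckets; return the bucket that was split."""
    n = len(buckets)
    i = max(1, n.bit_length())  # bits needed to address n+1 buckets
    # Keys that belong in the new bucket n were redirected, by the msb flip,
    # to the bucket whose number is n with its top bit cleared.
    source = n - (1 << (i - 1))
    old_keys = buckets[source]
    buckets[source] = []
    buckets.append([])
    for h in old_keys:
        buckets[bucket_for(h, n + 1)].append(h)
    return source


# State before the extension, taken from the example (i=2, n=3).
buckets = [[0b0100, 0b1100], [0b0111], [0b1010]]
split = extend(buckets)
```

Here the new bucket is 11, so the block that has to be touched is bucket 01, and (01)11 moves from bucket 01 to bucket 11.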