Download presentation
Presentation is loading. Please wait.
1
1 Lecture 20: Indexes Friday, February 25, 2005
2
2 Outline Representing data elements (12) Index structures (13.1, 13.2) B-trees (13.3)
3
3 Representing Data Elements Note: will cover VERY LITTLE in class READING ASSIGNMENT: Chapter 12. Disk Blocks = fixed size Table record = variable size Issue: how to group records into blocks
4
4 Representing Data Elements A record is always smaller than a block –Use BLOBs when needed Pack multiple records into blocks –But initially don’t fill (why ?) Cross block bounderies ? –Advantage: better space utilization –Disadvantage: slower access
5
5 BLOB Binary large objects Supported by modern database systems E.g. images, sounds, etc. Storage: attempt to cluster blocks together CLOB = character large objec Supports only restricted operations
6
Indexes An index on a file speeds up selections on the search key fields for the index. –Any subset of the fields of a relation can be the search key for an index on the relation. –Search key is not the same as key (minimal set of fields that uniquely identify a record in a relation). An index contains a collection of data entries, and supports efficient retrieval of all data entries with a given key value k.
7
7 Index Classification Primary/secondary –Primary = may reorder data according to index –Secondary = cannot reorder data Clustered/unclustered –Clustered = records close in the index are close in the data –Unclustered = records close in the index may be far in the data Dense/sparse –Dense = every key in the data appears in the index –Sparse = the index contains only some keys B+ tree / Hash table / …
8
8 Primary Index File is sorted on the index attribute Dense index: sequence of (key,pointer) pairs 10 20 30 40 50 60 70 80 10 20 30 40 50 60 70 80
9
9 Primary Index Sparse index 10 30 50 70 90 110 130 150 10 20 30 40 50 60 70 80
10
10 Primary Index with Duplicate Keys Dense index: 10 20 30 40 50 60 70 80 10 20 30 40
11
11 Primary Index with Duplicate Keys Sparse index: pointer to lowest search key in each block: Search for 20 10 20 30 10 20 30 40 20 is here......but need to search here too
12
12 Better: pointer to lowest new search key in each block: Search for 20 Search for 15 ? 35 ? Primary Index with Duplicate Keys 10 20 30 40 50 60 70 80 10 20 30 40 50 20 is here......ok to search from here 30
13
13 Secondary Indexes To index other attributes than primary key Always dense (why ?) 10 20 30 20 30 20 10 20 10 30
14
14 Clustered/Unclustered Primary indexes = usually clustered Secondary indexes = usually unclustered
15
Clustered vs. Unclustered Index Data entries ( Index File ) ( Data file ) Data Records Data entries Data Records CLUSTERED UNCLUSTERED
16
16 Secondary Indexes Applications: –index other attributes than primary key –index unsorted files (heap files) –index clustered data
17
17 B+ Trees Search trees Idea in B Trees: –make 1 node = 1 block Idea in B+ Trees: –Make leaves into a linked list (range queries are easier)
18
18 Parameter d = the degree Each node has >= d and <= 2d keys (except root) Each leaf has >=d and <= 2d keys: B+ Trees Basics 30120240 Keys k < 30 Keys 30<=k<120 Keys 120<=k<240Keys 240<=k 405060 40 5060 Next leaf
19
19 B+ Tree Example 80 2060100120140 101518203040506065808590 101518203040506065808590 d = 2 Find the key 40 40 80 20 < 40 60 30 < 40 40
20
20 B+ Tree Design How large d ? Example: –Key size = 4 bytes –Pointer size = 8 bytes –Block size = 4096 byes 2d x 4 + (2d+1) x 8 <= 4096 d = 170
21
21 Searching a B+ Tree Exact key values: –Start at the root –Proceed down, to the leaf Range queries: –As above –Then sequential traversal Select name From people Where age = 25 Select name From people Where age = 25 Select name From people Where 20 <= age and age <= 30 Select name From people Where 20 <= age and age <= 30
22
B+ Trees in Practice Typical order: 100. Typical fill-factor: 67%. –average fanout = 133 Typical capacities: –Height 4: 133 4 = 312,900,700 records –Height 3: 133 3 = 2,352,637 records Can often hold top levels in buffer pool: –Level 1 = 1 page = 8 Kbytes –Level 2 = 133 pages = 1 Mbyte –Level 3 = 17,689 pages = 133 MBytes
23
23 Insertion in a B+ Tree Insert (K, P) Find leaf where K belongs, insert If no overflow (2d keys or less), halt If overflow (2d+1 keys), split node, insert in parent: If leaf, keep K3 too in right node When root splits, new root has 1 key only K1K2K3K4K5 P0P1P2P3P4p5 K1K2 P0P1P2 K4K5 P3P4p5 parent K3 parent
24
24 Insertion in a B+ Tree 80 2060100120140 101518203040506065808590 101518203040506065808590 Insert K=19
25
25 Insertion in a B+ Tree 80 2060100120140 10151819203040506065808590 10151820304050606580859019 After insertion
26
26 Insertion in a B+ Tree 80 2060100120140 10151819203040506065808590 10151820304050606580859019 Now insert 25
27
27 Insertion in a B+ Tree 80 2060100120140 1015181920253040506065808590 10151820253040606580859019 After insertion 50
28
28 Insertion in a B+ Tree 80 2060100120140 1015181920253040506065808590 10151820253040606580859019 But now have to split ! 50
29
29 Insertion in a B+ Tree 80 203060100120140 1015181920256065808590 10151820253040606580859019 After the split 50 304050
30
30 Deletion from a B+ Tree 80 203060100120140 1015181920256065808590 10151820253040606580859019 Delete 30 50 304050
31
31 Deletion from a B+ Tree 80 203060100120140 1015181920256065808590 101518202540606580859019 After deleting 30 50 4050 May change to 40, or not
32
32 Deletion from a B+ Tree 80 203060100120140 1015181920256065808590 101518202540606580859019 Now delete 25 50 4050
33
33 Deletion from a B+ Tree 80 203060100120140 10151819206065808590 1015182040606580859019 After deleting 25 Need to rebalance Rotate 50 4050
34
34 Deletion from a B+ Tree 80 193060100120140 10151819206065808590 1015182040606580859019 Now delete 40 50 4050
35
35 Deletion from a B+ Tree 80 193060100120140 10151819206065808590 10151820606580859019 After deleting 40 Rotation not possible Need to merge nodes 50
36
36 Deletion from a B+ Tree 80 1960100120140 1015181920506065808590 10151820606580859019 Final tree 50
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.