Published by Magdalene Pitts. Modified over 9 years ago.
1 Chapter 12: Indexing and Hashing
Indexing
- Basic Concepts
- Ordered Indices
- B+-Tree Index Files
Hashing
- Static
- Dynamic Hashing
2 Basic Concepts
Search key: the set of attributes used to look up records in a file.
[Figure: a value is looked up to find a record; an index entry pairs a search key with a pointer.]
3 Index Evaluation Metrics
- Access types supported efficiently, e.g.:
  - Point query: find “Tom”
  - Range query: find students whose age is between 20 and 40
- Access time
- Update time
- Space overhead
4 Ordered Indices
In an ordered index, index entries are stored sorted on the search key value. E.g., the author catalog in a library.
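As a rough illustration of why sorted index entries support both point and range queries, here is a minimal Python sketch. The key values, record ids, and function names are hypothetical, and a sorted in-memory list stands in for the index file.

```python
from bisect import bisect_left, bisect_right

# Hypothetical ordered index: search-key values kept sorted,
# each paired with a made-up record id.
index_keys = [18, 20, 25, 31, 40, 52]
index_ptrs = ["r3", "r0", "r5", "r1", "r4", "r2"]

def point_query(key):
    """Return the record id for key, or None if the key is absent."""
    i = bisect_left(index_keys, key)          # binary search on sorted keys
    if i < len(index_keys) and index_keys[i] == key:
        return index_ptrs[i]
    return None

def range_query(lo, hi):
    """Return record ids for all keys with lo <= key <= hi."""
    start = bisect_left(index_keys, lo)       # first entry >= lo
    stop = bisect_right(index_keys, hi)       # one past the last entry <= hi
    return index_ptrs[start:stop]
```

Because the entries are sorted, a range query is one binary search followed by a sequential scan of adjacent entries.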
5 Primary index
Also called a clustering index: the index is in the same order as the file.
The search key of a primary index is usually, but not necessarily, the primary key.
[Figure: sequential file with records 10 to 100 and an index on the search key with entries 10, 30, 50, ..., 230, in the same order as the file.]
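6 Secondary index
A non-clustering index: the index is in a different order from the file.
[Figure: file with records 50, 30, 70, 20, 40, 80, 10, 100, 60, 90 and an index with entries 10, 20, 30, 40, 50, 60, 70, ...]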
7 Dense Index
Contains an index record for every search-key value.
[Figure: sequential file with records 10 to 100 and a dense index with entries 10, 20, 30, ..., 120.]
8 Sparse Index
Contains index records for only some search-key values. Applicable when records are sequentially ordered on the search key.
[Figure: sequential file with records 10 to 100 and a sparse index with entries 10, 30, 50, ..., 230.]
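A sparse-index lookup can be sketched as follows: find the last index entry whose key is not greater than the target, then scan that block of the file. This is a minimal sketch assuming two records per block and one index entry per block, as the figure's values suggest; the function name and return shape are hypothetical.

```python
from bisect import bisect_right

# Sequential file from the slide: records 10..100, assumed two per block.
blocks = [[10, 20], [30, 40], [50, 60], [70, 80], [90, 100]]

# Sparse index: one entry per block, holding the block's first key.
sparse_keys = [block[0] for block in blocks]   # [10, 30, 50, 70, 90]

def sparse_lookup(key):
    """Return (block_no, slot) of key, or None if the key is not in the file."""
    b = bisect_right(sparse_keys, key) - 1     # last entry with key <= target
    if b < 0:
        return None                            # smaller than every indexed key
    for slot, k in enumerate(blocks[b]):       # sequential scan inside the block
        if k == key:
            return (b, slot)
    return None
```

The scan inside the block only works because the file itself is ordered on the search key.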
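9 Secondary indexes
A sparse index on a field other than the sequence field does not make sense: the file is not ordered on that field, so an index entry gives no information about where the key values between entries are stored.
[Figure: file ordered on its sequence field, holding values 50, 30, 70, 20, 40, 80, 10, 100, 60, 90, with a sparse index 30, 20, 80, 100, 90, ...]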
10 Multilevel Index
[Figure: sequential file with records 10 to 100; a sparse first-level index with entries 10, 30, 50, ..., 230; a sparse second-level index over it with entries 10, 90, 170, 250, 330, 410, 490, 570.]
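A two-level lookup over the values in this figure might look like the sketch below. It assumes every second-level entry covers four first-level entries (which matches 10, 90, 170, ...) and that first-level entry i points at data block i; `FANOUT` and `find_block` are hypothetical names.

```python
from bisect import bisect_right

FANOUT = 4                                  # assumed first-level entries per second-level entry

level1 = list(range(10, 231, 20))           # 10, 30, 50, ..., 230 (first key of each data block)
level2 = level1[::FANOUT]                   # 10, 90, 170 (first key of each first-level group)

def find_block(key):
    """Return the number of the data block that may hold key, or None."""
    j = bisect_right(level2, key) - 1       # search the small top level first
    if j < 0:
        return None                         # key precedes everything indexed
    group = level1[j * FANOUT:(j + 1) * FANOUT]
    i = bisect_right(group, key) - 1        # then search one first-level group
    return j * FANOUT + i
```

Only one group of the first-level index is examined per lookup, which is the point of adding the second level.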
11 Secondary indexes: Multilevel Index
- Lowest level is dense
- Other levels are sparse
[Figure: file ordered on its sequence field, holding values 50, 30, 70, 20, 40, 80, 10, 100, 60, 90; dense lowest-level index 10, 20, 30, 40, 50, 60, 70, ...; sparse high-level index 10, 50, 90, ...]
12 Conventional indexes
Advantages:
- Simple
- Index is a sequential file, good for scans
Disadvantage:
- Inserts are expensive
13 Outline
- Conventional indexes
- B+-Tree (NEXT)
14 NEXT: Another type of index
- Give up on sequentiality of index
- Try to get “balance”
15 B+Tree Example (n=4)
[Figure: root with key 100; non-leaf nodes (30) and (120, 150, 180); leaves (3, 5, 11), (30, 35), (100, 101, 110), (120, 130), (150, 156, 179), (180, 200).]
16 Sample non-leaf
Keys: 57, 81, 95
Pointers, left to right:
- to keys k < 57
- to keys 57 ≤ k < 81
- to keys 81 ≤ k < 95
- to keys k ≥ 95
A key is moved (not copied) from a lower-level non-leaf node to an upper-level non-leaf node.
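The choice of pointer in this sample node can be written as a single binary search. This is a minimal sketch; `child_index` is a hypothetical helper name, and pointers are numbered from 0 on the left.

```python
from bisect import bisect_right

node_keys = [57, 81, 95]        # keys of the sample non-leaf node

def child_index(keys, k):
    """Return which pointer (0-based, left to right) to follow for search key k."""
    # bisect_right counts the keys that are <= k, so a key equal to a
    # separator is sent to the pointer on the separator's right.
    return bisect_right(keys, k)
```

For example, `child_index(node_keys, 56)` selects pointer 0 (keys below 57) and `child_index(node_keys, 57)` selects pointer 1.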
17 Sample leaf node
Keys: 57, 81, 95
- Pointed to from a non-leaf node
- Pointers to the record with key 57, the record with key 81, and the record with key 95
- Last pointer: to the next leaf in sequence
A key is copied (not moved) from a leaf node to a non-leaf node.
18 n=4
[Figure: example layouts of a leaf node and a non-leaf node for n=4; surviving key values: 30, 35.]
19 Size of nodes
- n pointers
- n-1 keys
20 Don’t want nodes to be too empty
Use at least:
- Root: 2 pointers
- Non-leaf: ⌈n/2⌉ pointers
- Leaf: ⌈(n-1)/2⌉ keys
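21 Full node vs. min. node (n=4)
[Figure: full non-leaf with keys 120, 150, 180; minimum non-leaf with key 30; full leaf with keys 3, 5, 11; minimum leaf with keys 30, 35. A pointer counts even if null.]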
22 B+tree rules (tree of order n)
(1) All leaves at same lowest level (balanced tree)
(2) Pointers in leaves point to records, except for the “sequence pointer”
23 (3) Number of pointers/keys for B+tree
                      Max ptrs   Max keys   Min ptrs              Min keys
Non-leaf (non-root)   n          n-1        ⌈n/2⌉                 ⌈n/2⌉ - 1
Leaf (non-root)       n          n-1        ⌈(n-1)/2⌉ (to data)   ⌈(n-1)/2⌉
Root                  n          n-1        2                     1
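The occupancy limits can be computed for any order n with a few lines of Python. This sketch assumes the n/2 and (n-1)/2 bounds are rounded up, which agrees with the n=4 full/minimum node example earlier in the slides; the function name and dictionary layout are hypothetical.

```python
import math

def occupancy(n):
    """Return pointer/key limits for a B+tree whose nodes hold at most n pointers."""
    return {
        "non_leaf": {
            "max_ptrs": n,
            "max_keys": n - 1,
            "min_ptrs": math.ceil(n / 2),        # assumed rounding: up
            "min_keys": math.ceil(n / 2) - 1,
        },
        "leaf": {
            "max_ptrs": n,                       # includes the sequence pointer
            "max_keys": n - 1,
            "min_keys": math.ceil((n - 1) / 2),  # assumed rounding: up
        },
        "root": {"max_ptrs": n, "max_keys": n - 1, "min_ptrs": 2, "min_keys": 1},
    }
```

With n=4 this gives a minimum of 2 pointers (1 key) for a non-leaf node and 2 keys for a leaf.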
24 Insert into B+tree
(a) simple case: space available in leaf
(b) leaf overflow
(c) non-leaf overflow
(d) new root
25 (a) Insert key = 32, n=4
[Figure: parent with keys 30 and 100; leaves (3, 5, 11) and (30, 31). Key 32 fits into the leaf (30, 31).]
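26 (b) Insert key = 7, n=4
[Figure: parent with keys 30 and 100; leaves (3, 5, 11) and (30, 31). The full leaf (3, 5, 11) overflows and splits, and key 7 is copied into the parent.]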
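27 (c) Insert key = 160, n=4
[Figure: root with key 100; non-leaf (120, 150, 180); leaves (150, 156, 179) and (180, 200). The leaf overflows and splits, producing a leaf (160, 179); the non-leaf then overflows and splits as well, with keys 160 and 180 shown in the upper levels.]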
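28 (d) New root, insert 45, n=4
[Figure: root (10, 20, 30); leaves (1, 2, 3), (10, 12), (20, 25), (30, 32, 40). Inserting 45 splits the last leaf, key 40 goes up, the old root overflows and splits, and a new root with key 30 is created.]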
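The four insertion cases can be sketched in Python as below. This is an illustrative in-memory version, not the slides' implementation: the `Node` class, the function names, and the choice to give the left half the extra key on a leaf split are assumptions; n is the maximum number of pointers per node.

```python
from bisect import bisect_right, insort

class Node:
    def __init__(self, keys, children=None, next_leaf=None):
        self.keys = keys
        self.children = children      # None marks a leaf
        self.next_leaf = next_leaf    # sequence pointer (leaves only)

def insert(root, key, n):
    """Insert key and return the root, which is new in case (d)."""
    split = _insert(root, key, n)
    if split is None:
        return root
    sep, right = split
    return Node([sep], [root, right])            # (d) new root

def _insert(node, key, n):
    """Return None, or (separator, new right sibling) if node was split."""
    if node.children is None:
        insort(node.keys, key)
        if len(node.keys) <= n - 1:
            return None                          # (a) space available in leaf
        mid = (len(node.keys) + 1) // 2          # (b) leaf overflow: split
        right = Node(node.keys[mid:], next_leaf=node.next_leaf)
        node.keys = node.keys[:mid]
        node.next_leaf = right
        return right.keys[0], right              # separator is COPIED up
    i = bisect_right(node.keys, key)
    split = _insert(node.children[i], key, n)
    if split is None:
        return None
    sep, right = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right)
    if len(node.keys) <= n - 1:
        return None
    mid = len(node.keys) // 2                    # (c) non-leaf overflow: split
    up = node.keys[mid]
    new_right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return up, new_right                         # separator is MOVED up

# Rebuild the tree from case (d) and insert 45 with n=4.
l4 = Node([30, 32, 40])
l3 = Node([20, 25], next_leaf=l4)
l2 = Node([10, 12], next_leaf=l3)
l1 = Node([1, 2, 3], next_leaf=l2)
old_root = Node([10, 20, 30], [l1, l2, l3, l4])
new_root = insert(old_root, 45, 4)
```

After the call, `new_root` holds the single key 30, with non-leaf children (10, 20) and (40), in line with the "new root" figure for case (d).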
29 Deletion from B+tree
(a) Simple case - no example
(b) Coalesce with neighbor (sibling)
(c) Re-distribute keys
(d) Cases (b) or (c) at non-leaf
30 (b) Coalesce with sibling
Delete 50, n=5
[Figure: parent with keys 10, 40, 100; leaves (10, 20, 30) and (40, 50). After 50 is deleted, key 40 moves into the sibling leaf.]
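31 (c) Redistribute keys
Delete 50, n=5
[Figure: parent with keys 10, 40, 100; leaves (10, 20, 30, 35) and (40, 50). After 50 is deleted, key 35 moves over from the sibling leaf and replaces 40 in the parent.]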
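The choice between cases (a), (b), and (c) at the leaf level can be sketched as follows. This minimal version looks only at a leaf and its left sibling, represents each leaf as a sorted list of keys, and assumes a leaf must keep at least ⌈(n-1)/2⌉ keys; the function name and return shape are hypothetical, and the parent update is reduced to reporting the new separator key.

```python
import math

def delete_from_leaf(left, leaf, key, n):
    """Delete key from leaf; left is its left sibling.

    Returns (action, left_keys, leaf_keys, new_separator), where
    new_separator is None unless keys were redistributed.
    """
    min_keys = math.ceil((n - 1) / 2)
    leaf = [k for k in leaf if k != key]
    if len(leaf) >= min_keys:
        return "simple", left, leaf, None            # (a) still full enough
    if len(left) + len(leaf) <= n - 1:
        return "coalesce", left + leaf, [], None     # (b) everything fits in one node
    # (c) borrow the largest key from the sibling; the parent's separator
    # must then become the first key of the right-hand leaf.
    leaf = [left[-1]] + leaf
    left = left[:-1]
    return "redistribute", left, leaf, leaf[0]
```

With n=5, deleting 50 from (40, 50) next to (10, 20, 30) coalesces into (10, 20, 30, 40), while next to (10, 20, 30, 35) it redistributes and reports 35 as the new separator.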
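32 (d) Non-leaf coalesce
Delete 37, n=5
[Figure: root with key 25; non-leaf nodes (10, 20) and (30, 40); leaves (1, 3), (10, 14), (20, 22), (25, 26), (30, 37), (40, 45). After 37 is deleted, the leaf and then the non-leaf level coalesce, and a new root is formed.]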
33 B+tree deletions in practice
Often, coalescing is not implemented: too hard and not worth it!
34 Index Definition in SQL
Create an index:
  create index <index-name> on <relation-name> (<attribute-list>)
  E.g.: create index gindex on country(gdp);
To drop an index:
  drop index <index-name>
  E.g.: drop index gindex;
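The two statements can be tried out with Python's built-in sqlite3 module, as in the sketch below. The `country` table layout and its rows are made up for illustration; only the index name `gindex` and the indexed column `gdp` come from the slide. SQLite is used here purely as a convenient engine, and other systems may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, only so that there is something to index.
conn.execute("create table country (name text, gdp real)")
conn.executemany(
    "insert into country values (?, ?)",
    [("A", 1.0), ("B", 2.5), ("C", 0.7)],
)

def index_names():
    """List the names of all indexes currently defined in the database."""
    rows = conn.execute("select name from sqlite_master where type = 'index'")
    return [row[0] for row in rows]

conn.execute("create index gindex on country(gdp)")   # statement from the slide
indexes_before = index_names()

conn.execute("drop index gindex")                     # statement from the slide
indexes_after = index_names()
```

Here `indexes_before` contains `gindex` and `indexes_after` is empty, which confirms that both statements took effect.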