Download presentation
Presentation is loading. Please wait.
Published byNatalie Farmer Modified over 9 years ago
1
1 CPS216: Data-intensive Computing Systems Operators for Data Access (contd.) Shivnath Babu
2
2 Insertion in a B-Tree 49 n = 2 15 36 Insert: 62
3
3 Insertion in a B-Tree 49 n = 2 15 36 Insert: 62 62
4
4 Insertion in a B-Tree 49 n = 2 15 3662 Insert: 50
5
5 Insertion in a B-Tree 49 n = 2 15 3650 Insert: 50 62
6
6 Insertion in a B-Tree 49 n = 2 15 3650 Insert: 75 62
7
7 Insertion in a B-Tree 49 n = 2 15 3650 Insert: 75 62 75
8
8 Insertion
9
9 Insertion
10
10 Insertion
11
11 Insertion
12
12 Insertion
13
13 Insertion
14
14 Insertion
15
15 Insertion
16
16 Insertion
17
17 Insertion
18
18 Insertion
19
19 Insertion: Primitives Inserting into a leaf node Inserting into a leaf node Splitting a leaf node Splitting a leaf node Splitting an internal node Splitting an internal node Splitting root node Splitting root node
20
20 Inserting into a Leaf Node 54576062 58
21
21 Inserting into a Leaf Node 54576062 58
22
22 Inserting into a Leaf Node 54576062 58
23
23 61 5457606258 5466 Splitting a Leaf Node
24
24 61 5457606258 5466 Splitting a Leaf Node
25
25 61 5457616258 5466 60 Splitting a Leaf Node
26
26 61 5457616258 5466 60 59 Splitting a Leaf Node
27
27 61 5457616258 5466 60 59 Splitting a Leaf Node
28
59 546640 [ 59, 66)[54, 59) 7484 9921 … … [66,74) Splitting an Internal Node
29
59 5466407484 9921 … … [ 59, 66)[54, 59)[66,74) Splitting an Internal Node
30
5954 66 407484 9921 … … [66, 99) [ 59, 66)[54, 59) [21,66) [66,74) Splitting an Internal Node
31
5466407484 59 [ 59, 66)[54, 59)[66,74) Splitting the Root
32
5466407484 59 [ 59, 66)[54, 59)[66,74) Splitting the Root
33
54 66 40748459 [ 59, 66)[54, 59)[66,74) Splitting the Root
34
34 Deletion
35
35 Deletion redistribute
36
36 Deletion
37
37 Deletion - II
38
merge
39
39 Deletion - II
40
40 Deletion - II
41
41 Deletion - II
42
42 Deletion - II merge Not needed
43
43 Deletion - II
44
44 Deletion: Primitives Delete key from a leaf Delete key from a leaf Redistribute keys between sibling leaves Redistribute keys between sibling leaves Merge a leaf into its sibling Merge a leaf into its sibling Redistribute keys between two sibling internal nodes Redistribute keys between two sibling internal nodes Merge an internal node into its sibling Merge an internal node into its sibling
45
45 Merge Leaf into Sibling 545864687275 67 85…72
46
46 Merge Leaf into Sibling 5458646875 67 …7285
47
47 Merge Leaf into Sibling 5458646875 67 …7285
48
48 Merge Leaf into Sibling 5458646875 …72 85
49
49 Merge Internal Node into Sibling 41 4852 6374 59 [52, 59) [59,63) … …
50
50 Merge Internal Node into Sibling 41 485263 59 [52, 59) [59,63) 59 … …
51
51 B-Tree Roadmap B-Tree B-Tree Recap Recap Insertion (recap) Insertion (recap) Deletion Deletion Construction Construction Efficiency Efficiency B-Tree variants B-Tree variants Hash-based Indexes Hash-based Indexes
52
52 Question How does insertion-based construction perform?
53
53 B-Tree Construction 111315213441485762758197 Sort
54
B-Tree Construction 759721415715111348346281 Scan 758197 111315213441 4857 62
55
B-Tree Construction 214875 111315213441 4857 62758197 Scan
56
56 B-Tree Construction Why is sort-based construction better than insertion-based one?
57
57 Cost of B-Tree Operations Height of B-Tree: H Height of B-Tree: H Assume no duplicates Assume no duplicates Question: what is the random I/O cost of: Question: what is the random I/O cost of: Insertion: Insertion: Deletion: Deletion: Equality search: Equality search: Range Search: Range Search:
58
58 Height of B-Tree Number of keys: N Number of keys: N B-Tree parameter: n B-Tree parameter: n Height ≈ log N = n log N log n In practice: 2-3 levels
59
59 Question: How do you pick parameter n? 1. Ignore inserts and deletes 2. Optimize for equality searches 3. Assume no duplicates
60
60 Roadmap B-Tree B-Tree B-Tree variants B-Tree variants Sparse Index Sparse Index Duplicate Keys Duplicate Keys Hash-based Indexes Hash-based Indexes
61
61 Roadmap B-Tree B-Tree B-Tree variants B-Tree variants Hash-based Indexes Hash-based Indexes Static Hash Table Static Hash Table Extensible Hash Table Extensible Hash Table Linear Hash Table Linear Hash Table
62
62 Hash-Based Indexes Adaptations of main memory hash tables Adaptations of main memory hash tables Support equality searches Support equality searches No range searches No range searches
63
Indexing Problem (recap) a 1 2 a i a n a A = val Index Keys record pointers
64
64 Main Memory Hash Table buckets 32 (null) 10 48 2775 21 55 0 3 1 2 4 5 6 7 key h (key) h (key) = key % 8
65
65 Adapting to disk 1 Hash Bucket = 1 Block 1 Hash Bucket = 1 Block All keys that hash to bucket stored in the block All keys that hash to bucket stored in the block Intuition: keys in a bucket usually accessed together Intuition: keys in a bucket usually accessed together No need for linked lists of keys … No need for linked lists of keys …
66
66 Adapting to Disk How do we handle this?
67
67 Adapting to disk 1 Hash Bucket = 1 Block 1 Hash Bucket = 1 Block All keys that hash to bucket stored in the block All keys that hash to bucket stored in the block Intuition: keys in a bucket usually accessed together Intuition: keys in a bucket usually accessed together No need for linked lists of keys … No need for linked lists of keys … … but need linked list of blocks (overflow blocks) … but need linked list of blocks (overflow blocks)
68
68 Adapting to Disk
69
69 Adapting to disk Bucket Id Disk Address mapping Bucket Id Disk Address mapping Contiguous blocks Contiguous blocks Store mapping in main memory Store mapping in main memory Too large? Too large? Dynamic Linear and Extensible hash tables Dynamic Linear and Extensible hash tables
70
70 Beware of claims that assume 1 I/O for hash tables and 3 I/Os for B-Tree!!
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.