Published by Cassandra Peters. Modified over 9 years ago.
1
Data Organization - B-trees
2
11.2 Database System Concepts
A simple index

Data file records:
  Brighton   A-217   700
  Downtown   A-101   500
  Downtown   A-110   600
  Mianus     A-215   700
  Perry      A-102   400
  ...

Index file (index of depositors on acct_no; each index record points to its data record):
  A-101, A-102, A-110, A-215, A-217, ...

To answer a query for "acct_no = A-110" we:
1. Do a binary search on the index file, searching for A-110.
2. "Chase" the pointer of the matching index record.
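The two lookup steps can be sketched in Python. This is a minimal illustration, assuming the index is held as a sorted in-memory list and a "pointer" is simply a position in the data file; the names `data_file`, `index_file` and `lookup` are made up for this sketch.

```python
from bisect import bisect_left

# Data file from the slide, stored in no particular key order.
data_file = [
    ("Brighton", "A-217", 700),
    ("Downtown", "A-101", 500),
    ("Downtown", "A-110", 600),
    ("Mianus", "A-215", 700),
    ("Perry", "A-102", 400),
]

# Index file: (search key, pointer) records sorted by key.
# Assumption: a pointer is just the record's position in data_file.
index_file = sorted((rec[1], pos) for pos, rec in enumerate(data_file))
index_keys = [key for key, _ in index_file]


def lookup(acct_no):
    """Binary-search the index, then chase the pointer to the data record."""
    i = bisect_left(index_keys, acct_no)
    if i < len(index_keys) and index_keys[i] == acct_no:
        _, pointer = index_file[i]
        return data_file[pointer]
    return None  # key is not in the index


print(lookup("A-110"))  # ('Downtown', 'A-110', 600)
```

On disk, each probe of the binary search would cost a block access, which is why the later slides move to multilevel indices.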
3
11.3 Database System Concepts
Index Choices
1. Primary (index search key = physical order search key) vs. Secondary (all other indexes)
   Q: how many primary indices per relation?
2. Dense (an index entry for every search-key value) vs. Sparse (some search-key values are not in the index)
3. Single level vs. Multilevel (an index on the indices)
4
11.4 Database System Concepts
Measuring 'goodness'
On what basis do we compare different indices?
1. Access type: what types of queries can be answered: selection queries (ssn = 123)? range queries (100 <= ssn <= 200)?
2. Access time: the cost of evaluating queries, measured in number of block accesses (BAs)
3. Maintenance overhead: the cost of insertion / deletion (also in BAs)
4. Space overhead: the number of blocks needed to store the index
5
11.5Database System Concepts Indexing Primary (or clustering) index on SSN
6
11.6Database System Concepts Indexing Secondary (or non-clustering) index: duplicates may exist Address-index Can have many secondary indices but only one primary index
7
11.7Database System Concepts Indexing secondary index: typically, with ‘postings lists’ Postings lists
8
11.8Database System Concepts Indexing Primary/sparse index on ssn (primary key) >=123 >=456
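A sparse primary index keeps one entry per block (the first key in that block), so a lookup finds the last entry that is <= the search key and then scans only that block. The sketch below assumes that layout; the records and names in `blocks` are invented for illustration, and only the entry values 123 and 456 come from the slide.

```python
from bisect import bisect_right

# Data file sorted on ssn and grouped into blocks (hypothetical records).
blocks = [
    [(123, "smith"), (234, "jones"), (345, "tomson")],
    [(456, "stevens"), (567, "smith")],
]

# Sparse index: the first ssn of each block (the ">=123", ">=456" entries).
sparse_index = [block[0][0] for block in blocks]


def lookup(ssn):
    """Find the one block that may hold ssn, then scan that block."""
    i = bisect_right(sparse_index, ssn) - 1  # last index entry <= ssn
    if i < 0:
        return None  # smaller than every indexed key
    for record in blocks[i]:
        if record[0] == ssn:
            return record
    return None


print(lookup(234))  # (234, 'jones')
```

This only works because the file is physically sorted on the indexed key, which is why a sparse index has to be the primary (clustering) one.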
9
11.9Database System Concepts Indexing Secondary / dense index Secondary on a candidate key: No duplicates, no need for posting lists
10
11.10 Database System Concepts
Summary

              Dense    Sparse
  Primary     rare     usual
  Secondary   usual

All combinations are possible.
- at most one sparse/clustering index
- as many dense indices as desired
- usually: one primary index (probably sparse) and a few secondary indices (non-clustering)
11
11.11 Database System Concepts
ISAM
What if the index is too large to search in memory?
Build a 2nd-level sparse index on the values of the 1st level.
[Figure: sparse index entries (>=123, >=456), each pointing to a block]
12
11.12Database System Concepts ISAM - observations What about insertions/deletions? >=123 >=456 124; peterson; fifth ave.
13
11.13Database System Concepts ISAM - observations What about insertions/deletions? 124; peterson; fifth ave. overflows Problems?
14
11.14Database System Concepts ISAM - observations What about insertions/deletions? 124; peterson; fifth ave. overflows overflow chains may become very long - what to do?
15
11.15Database System Concepts ISAM - observations What about insertions/deletions? 124; peterson; fifth ave. overflows overflow chains may become very long - thus: shut-down & reorganize start with ~80% utilization
16
11.16 Database System Concepts
ISAM - observations
- if the index is too large, store it on disk and keep an index on the index (in memory)
- usually two levels of indices, one first-level entry per disk block
- typically, blocks are 80% full initially (what are potential problems / inefficiencies?)
17
11.17 Database System Concepts
So far ...
- indices (like ISAM) suffer in the presence of frequent updates
- alternative indexing structure: B-trees
18
11.18Database System Concepts Overview primary / secondary indices multilevel (ISAM) B - trees, B+ - trees hashing static hashing dynamic hashing
19
11.19 Database System Concepts
B-trees
- the most successful family of index schemes (B-trees, B+-trees, B*-trees)
- can be used for primary/secondary, clustering/non-clustering indexes
- balanced "n-way" search trees
20
11.20 Database System Concepts
B-trees
E.g., B-tree of order 3:

              [ 6 | 9 ]
       <6    /    |    \    >9
            /  >6, <9   \
       [1 3]     [7]     [13]
21
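11.21 Database System Concepts
B-tree Nodes
Node layout: p1, v1, p2, v2, ..., v(n-1), pn
- Key values are ordered
- p1 leads to values v < v1; the pointer between v1 and v2 leads to v1 <= v < v2; pn leads to v(n-1) <= v
- MAXIMUM: n pointer values
- MINIMUM: n/2 pointer values, rounded up (exception: root's minimum = 2)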
22
11.22 Database System Concepts
Properties
- "block aware" nodes: each node -> one disk page
- O(log_B(N)) for everything (insert / delete / search), where B is the node fanout and N the number of keys
- typically, if the fanout m = 50 - 100, then 2 - 3 levels
- utilization >= 50%, guaranteed; on average 69%
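The "2 - 3 levels" figure can be checked with a rough calculation. The sketch below assumes the best case, where every node is completely full: a B-tree with h levels and `fanout` pointers per node then holds fanout**h - 1 keys. The function name `levels_needed` is just for this illustration.

```python
def levels_needed(num_keys, fanout):
    """Smallest number of levels whose completely full nodes hold num_keys keys."""
    levels = 1
    # A full tree with `levels` levels has (fanout**levels - 1) / (fanout - 1)
    # nodes of (fanout - 1) keys each, i.e. fanout**levels - 1 keys in total.
    while fanout ** levels - 1 < num_keys:
        levels += 1
    return levels


print(levels_needed(5_000, 100))   # 2
print(levels_needed(100_000, 50))  # 3
```

With nodes only half full, the effective fanout halves and the tree may need one more level, but the height still grows only logarithmically in the number of keys.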
23
11.23Database System Concepts Queries Algorithm for exact match query? (eg., ssn=8?) 1 3 6 7 9 13 <6 >6 <9 >9
27
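11.27 Database System Concepts
Queries
Algorithm for exact match query? (e.g., ssn = 8?)
Start at the root and follow one pointer per level: H steps (= disk accesses), where H is the height of the tree.
[Figure: the order-3 example tree, root (6, 9) with leaves (1 3), (7), (13)]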
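An exact-match search can be sketched as follows. This is a minimal in-memory version; the `Node` class and the `search` function are assumptions of this sketch, and each node visited stands in for one disk access.

```python
class Node:
    """A B-tree node: sorted keys plus, for non-leaf nodes, len(keys) + 1 children."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def search(root, key):
    """Return (found, nodes_visited) for an exact-match query."""
    node, visited = root, 0
    while node is not None:
        visited += 1  # one node read = one disk access
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        # Descend into the i-th subtree; stop if this is a leaf.
        node = node.children[i] if node.children else None
    return False, visited


# The order-3 example tree from the slides.
T0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(search(T0, 8))  # (False, 2): ssn = 8 is not in the tree
print(search(T0, 7))  # (True, 2)
```

A key stored in a non-leaf node (such as 6) is found before reaching the leaf level, so H is an upper bound on the number of accesses.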
29
11.29Database System Concepts Queries what about range queries? (eg., 5<salary<8) Proximity/ nearest neighbor searches? (eg., salary ~ 8 )
30
11.30Database System Concepts Queries what about range queries? (eg., 5<salary<8) Proximity/ nearest neighbor searches? (eg., salary ~ 8 ) 1 3 6 7 9 13 <6 >6 <9 >9
31
11.31Database System Concepts Queries what about range queries? (eg., 5<salary<8) Proximity/ nearest neighbor searches? (eg., salary ~ 8 ) 1 3 6 7 9 13 <6 >6 <9 >9
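One way to answer a range query on a plain B-tree is an in-order traversal that skips subtrees which cannot contain qualifying keys. The sketch below assumes strict bounds, as in 5 < salary < 8; `Node` and `range_query` are names chosen for this illustration.

```python
class Node:
    """A B-tree node: sorted keys plus, for non-leaf nodes, len(keys) + 1 children."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def range_query(node, low, high, out=None):
    """Collect, in sorted order, all keys k with low < k < high."""
    if out is None:
        out = []
    for i, k in enumerate(node.keys):
        # The subtree left of k holds keys smaller than k; visit it only
        # if some of those keys could still be greater than low.
        if node.children and low < k:
            range_query(node.children[i], low, high, out)
        if low < k < high:
            out.append(k)
        if k >= high:
            return out  # everything further right is too large
    if node.children:
        range_query(node.children[-1], low, high, out)
    return out


T0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(range_query(T0, 5, 8))  # [6, 7]
```

Note how the traversal moves up and down between levels to produce the keys in order; this back-and-forth is what the B+-tree slides later set out to avoid.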
32
11.32Database System Concepts B-trees: Insertion Insert in leaf; on overflow, push middle up (recursively) split: preserves B - tree properties
33
11.33Database System Concepts B-trees Easy case: Tree T0; insert ‘8’ 1 3 6 7 9 13 <6 >6 <9 >9
34
11.34Database System Concepts B-trees Tree T0; insert ‘8’ 1 3 6 7 9 13 <6 >6 <9 >9 8
35
11.35Database System Concepts B-trees Hardest case: Tree T0; insert ‘2’ 1 3 6 7 9 13 <6 >6 <9 >9 2
36
11.36Database System Concepts B-trees Hardest case: Tree T0; insert ‘2’ 1 2 6 7 9 13 3 push middle up
37
11.37Database System Concepts B-trees Hardest case: Tree T0; insert ‘2’ 6 7 9 1313 2 2 Ovf; push middle
38
11.38Database System Concepts B-trees Hardest case: Tree T0; insert ‘2’ 7 9 1313 2 6 Final state
39
11.39Database System Concepts B-trees - insertion Q: What if there are two middles? (eg, order 4) A: either one is fine
40
11.40Database System Concepts B-trees: Insertion Insert in leaf; on overflow, push middle up (recursively – ‘propagate split’) split: preserves all B - tree properties (!!) notice how it grows: height increases when root overflows & splits Automatic, incremental re-organization (contrast with ISAM!)
41
11.41 Database System Concepts
Pseudo-code
INSERTION OF KEY 'K'
  find the correct leaf node 'L';
  add the key 'K' in node 'L'; /* maintaining the key order in 'L' */
  if ( 'L' overflows ) {
    split 'L', by pushing the middle key upstairs to parent node 'P';
    if ( 'P' overflows ) {
      repeat the split recursively;
    }
  }
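The insertion pseudo-code can be turned into a small runnable sketch. It assumes an in-memory tree of order 3 (at most 3 pointers, hence 2 keys, per node) and distinct keys; `Node`, `insert`, `_insert` and `make_t0` are names invented for this example.

```python
ORDER = 3  # maximum pointers per node; maximum keys = ORDER - 1


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def _insert(node, key):
    """Insert below node; return (middle_key, new_right_node) if node split."""
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if not node.children:  # leaf: add the key, keeping the order
        node.keys.insert(i, key)
    else:
        split = _insert(node.children[i], key)
        if split is not None:  # child split: absorb its middle key
            mid, right = split
            node.keys.insert(i, mid)
            node.children.insert(i + 1, right)
    if len(node.keys) <= ORDER - 1:
        return None  # no overflow
    # Overflow: split the node and push the middle key upstairs.
    m = len(node.keys) // 2
    mid = node.keys[m]
    right = Node(node.keys[m + 1:], node.children[m + 1:])
    node.keys = node.keys[:m]
    node.children = node.children[:m + 1]
    return mid, right


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    mid, right = split
    return Node([mid], [root, right])  # root split: the tree grows by one level


def make_t0():
    return Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])


root = insert(make_t0(), 2)  # the "hardest case" from the slides
print(root.keys, [c.keys for c in root.children])  # [6] [[2], [9]]
```

The height increases only in `insert`, when the root itself splits, which is the incremental re-organization the slides contrast with ISAM.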
42
11.42Database System Concepts Overview primary / secondary indices multilevel (ISAM) B – trees Dfn, Search, insertion, deletion B+ - trees hashing
43
11.43Database System Concepts Deletion Rough outline of algo: Delete key; on underflow, may need to merge In practice, some implementers just allow underflows to happen…
44
11.44Database System Concepts B-trees – Deletion Easiest case: Tree T0; delete ‘3’ 1 3 6 7 9 13 <6 >6 <9 >9
45
11.45Database System Concepts B-trees – Deletion Easiest case: Tree T0; delete ‘3’ 1 6 7 9 13 <6 >6 <9 >9
46
11.46Database System Concepts B-trees – Deletion Case1: delete a key at a leaf – no underflow Case2: delete non-leaf key – no underflow Case3: delete leaf-key; underflow, and ‘rich sibling’ Case4: delete leaf-key; underflow, and ‘poor sibling’
47
11.47Database System Concepts B-trees – Deletion Case1: delete a key at a leaf – no underflow (delete 3 from T0) 1 3 6 7 9 13 <6 >6 <9 >9
48
11.48Database System Concepts B-trees – Deletion Case2: delete a key at a non-leaf – no underflow (eg., delete 6 from T0) 1 3 6 7 9 13 <6 >6 <9 >9 Delete & promote, ie:
49
11.49Database System Concepts B-trees – Deletion Case2: delete a key at a non-leaf – no underflow (eg., delete 6 from T0) 1 3 7 9 13 <6 >6 <9 >9 Delete & promote, ie:
50
11.50Database System Concepts B-trees – Deletion Case2: delete a key at a non-leaf – no underflow (eg., delete 6 from T0) 17 9 13 <6 >6 <9 >9 Delete & promote, ie: 3
51
11.51Database System Concepts B-trees – Deletion Case2: delete a key at a non-leaf – no underflow (eg., delete 6 from T0) 17 9 13 <3 >3 <9 >9 3 FINAL TREE
52
11.52Database System Concepts B-trees – Deletion Case2: delete a key at a non-leaf – no underflow (eg., delete 6 from T0) Q: How to promote? A: pick the largest key from the left sub-tree (or the smallest from the right sub-tree) Observation: every deletion eventually becomes a deletion of a leaf key
53
11.53Database System Concepts B-trees – Deletion Case1: delete a key at a leaf – no underflow Case2: delete non-leaf key – no underflow Case3: delete leaf-key; underflow, and ‘rich sibling’ Case4: delete leaf-key; underflow, and ‘poor sibling’
54
11.54Database System Concepts B-trees – Deletion Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0) 1 3 6 7 9 13 <6 >6 <9 >9 Delete & borrow, ie:
55
11.55Database System Concepts B-trees – Deletion Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0) 1 3 69 13 <6 >6 <9 >9 Delete & borrow, ie: Rich sibling
56
11.56Database System Concepts B-trees – Deletion Case3: underflow & ‘rich sibling’ ‘rich’ = can give a key, without underflowing ‘borrowing’ a key: THROUGH the PARENT!
57
11.57Database System Concepts B-trees – Deletion Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0) 1 3 69 13 <6 >6 <9 >9 Delete & borrow, ie: Rich sibling NO!!
58
11.58Database System Concepts B-trees – Deletion Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0) 1 3 69 13 <6 >6 <9 >9 Delete & borrow, ie:
59
11.59Database System Concepts B-trees – Deletion Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0) 1 39 13 <6 >6 <9 >9 Delete & borrow, ie: 6
60
11.60Database System Concepts B-trees – Deletion Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0) 1 39 13 <3 >3 <9 >9 Delete & borrow, through the parent 6 FINAL TREE
61
11.61Database System Concepts B-trees – Deletion Case1: delete a key at a leaf – no underflow Case2: delete non-leaf key – no underflow Case3: delete leaf-key; underflow, and ‘rich sibling’ Case4: delete leaf-key; underflow, and ‘poor sibling’
62
11.62Database System Concepts B-trees – Deletion Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0) 1 3 6 7 9 13 <6 >6 <9 >9
63
11.63Database System Concepts B-trees – Deletion Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0) 1 3 6 7 9 <6 >6 <9 >9
64
11.64Database System Concepts B-trees – Deletion Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0) 1 3 6 7 9 <6 >6 <9 >9 A: merge w/ ‘poor’ sibling
65
11.65Database System Concepts B-trees – Deletion Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0) Merge, by pulling a key from the parent exact reversal from insertion: ‘split and push up’, vs. ‘merge and pull down’ Ie.:
66
11.66Database System Concepts B-trees – Deletion Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0) 1 3 6 7 <6 >6 A: merge w/ ‘poor’ sibling 9
67
11.67Database System Concepts B-trees – Deletion Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0) 1 3 6 7 <6 >6 9 FINAL TREE
68
11.68Database System Concepts B-trees – Deletion Case4: underflow & ‘poor sibling’ -> ‘pull key from parent, and merge’ Q: What if the parent underflows? A: repeat recursively
69
11.69 Database System Concepts
B-tree deletion - pseudocode
DELETION OF KEY 'K'
  locate key 'K', in node 'N'
  if ( 'N' is a non-leaf node ) {
    delete 'K' from 'N';
    find the immediately larger key 'K1'; /* which is guaranteed to be on a leaf node 'L' */
    copy 'K1' in the old position of 'K';
    invoke this DELETION routine on 'K1' from the leaf node 'L';
  } else { /* 'N' is a leaf node */
    ... (next slide..)
70
11.70 Database System Concepts
B-tree deletion - pseudocode
    /* 'N' is a leaf node */
    delete 'K' from 'N';
    if ( 'N' underflows ) {
      let 'N1' be the sibling of 'N';
      if ( 'N1' is "rich" ) { /* i.e., N1 can lend us a key */
        borrow a key from 'N1' THROUGH the parent node;
      } else { /* N1 is 1 key away from underflowing */
        MERGE: pull the key from the parent 'P', and merge it with the keys of 'N' and 'N1' into a new node;
        if ( 'P' underflows ) {
          repeat recursively
        }
      }
    }
  } /* end of else: 'N' is a leaf node */
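The leaf-level cases (borrow through the parent, or merge with a poor sibling) can be sketched as below. This is deliberately partial: it assumes a two-level tree whose children are leaves, a minimum of one key per node (order 3), and it does not propagate an underflow of the parent. All names are made up for the sketch.

```python
MIN_KEYS = 1  # order-3 tree: at least 2 pointers, i.e. 1 key, per node


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def fix_underflow(parent, i):
    """Repair leaf child i of parent after it dropped below MIN_KEYS."""
    child = parent.children[i]
    left = parent.children[i - 1] if i > 0 else None
    right = parent.children[i + 1] if i + 1 < len(parent.children) else None
    if left is not None and len(left.keys) > MIN_KEYS:
        # Rich left sibling: the separator comes down, left's largest key goes up.
        child.keys.insert(0, parent.keys[i - 1])
        parent.keys[i - 1] = left.keys.pop()
    elif right is not None and len(right.keys) > MIN_KEYS:
        # Rich right sibling: the separator comes down, right's smallest key goes up.
        child.keys.append(parent.keys[i])
        parent.keys[i] = right.keys.pop(0)
    elif left is not None:
        # Poor left sibling: pull the separator down and merge into left.
        left.keys += [parent.keys.pop(i - 1)] + child.keys
        parent.children.pop(i)
    else:
        # Poor right sibling: pull the separator down and merge into child.
        child.keys += [parent.keys.pop(i)] + right.keys
        parent.children.pop(i + 1)


def delete_from_leaf(parent, i, key):
    """Delete key from leaf child i, then repair an underflow if needed."""
    leaf = parent.children[i]
    leaf.keys.remove(key)
    if len(leaf.keys) < MIN_KEYS:
        fix_underflow(parent, i)


def make_t0():
    return Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])


t = make_t0()
delete_from_leaf(t, 1, 7)   # Case 3: borrow through the parent
print(t.keys, [c.keys for c in t.children])  # [3, 9] [[1], [6], [13]]

t = make_t0()
delete_from_leaf(t, 2, 13)  # Case 4: merge with the poor sibling
print(t.keys, [c.keys for c in t.children])  # [6] [[1, 3], [7, 9]]
```

A complete implementation would also move child pointers when the affected nodes are internal, and would re-run the same repair on the parent if the merge leaves it underfull.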
71
11.71Database System Concepts B-trees in practice In practice: no empty leaves; ptrs to records 1 3 6 7 9 13 <6 >6 <9 >9 theory
72
11.72Database System Concepts B-trees in practice In practice: no empty leaves; ptrs to records 1 3 6 7 9 13 <6 >6 <9 >9 practice
73
11.73Database System Concepts B-trees in practice In practice: 13 6 7 9 13 <6 >6 <9 >9 Ssn…… 3 7 6 9 1
74
11.74 Database System Concepts
B-trees in practice
In practice, the formats are:
- leaf nodes: (v1, rp1, v2, rp2, ..., vn, rpn)
- non-leaf nodes: (p1, v1, rp1, p2, v2, rp2, ...)
[Figure: the order-3 example tree, root (6, 9) with leaves (1 3), (7), (13)]
75
11.75Database System Concepts Overview primary / secondary indices multilevel (ISAM) B – trees B+ - trees hashing
76
11.76Database System Concepts B+ trees - Motivation B-tree – print keys in sorted order: 1 3 6 7 9 13 <6 >6 <9 >9
77
11.77Database System Concepts B+ trees - Motivation B-tree needs back-tracking – how to avoid it? 1 3 6 7 9 13 <6 >6 <9 >9
78
11.78 Database System Concepts
Solution: B+-trees
- facilitate sequential operations
- they string all leaf nodes together
- AND replicate keys from non-leaf nodes, to make sure every key appears at the leaf level
79
11.79 Database System Concepts
B+-trees
E.g., B+-tree of order 3:

  root (internal node):       [ 6 | 9 ]
                        <6   /    |    \   >=9
                            / >=6, <9   \
  leaf nodes:          [3 4] -> [6 7] -> [9 13]

  Data file: (3, Joe, 23) (3, Bob, 23) (4, John, 23) ...
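Because the leaves are chained and hold every key, a range query needs one descent to the starting leaf and then only follows the leaf links. The sketch below hard-codes the example tree (a single root over three leaves); `Leaf`, `root_keys` and `range_scan` are illustrative names, and the bounds are taken as inclusive.

```python
from bisect import bisect_right


class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None  # link to the next leaf in key order


# Leaf level of the example B+-tree, strung together left to right.
leaves = [Leaf([3, 4]), Leaf([6, 7]), Leaf([9, 13])]
for a, b in zip(leaves, leaves[1:]):
    a.next = b

root_keys = [6, 9]  # separators: keys >= 6 go right of 6, keys >= 9 right of 9


def range_scan(low, high):
    """Return all keys k with low <= k <= high by walking the leaf chain."""
    leaf = leaves[bisect_right(root_keys, low)]  # one descent from the root
    result = []
    while leaf is not None:
        for k in leaf.keys:
            if k > high:
                return result
            if k >= low:
                result.append(k)
        leaf = leaf.next  # sequential step, no back-tracking through the root
    return result


print(range_scan(4, 9))  # [4, 6, 7, 9]
```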
80
11.80 Database System Concepts
B+ tree insertion
INSERTION OF KEY 'K'
  find the correct leaf node 'L';
  insert search-key value to 'L' such that the keys are in order;
  if ( 'L' overflows ) {
    split 'L';
    insert (i.e., COPY) smallest search-key value of new node to parent node 'P';
    if ( 'P' overflows ) {
      repeat the B-tree split procedure recursively; /* Notice: the B-TREE split; NOT the B+-tree */
    }
  }
81
11.81 Database System Concepts
B+-tree insertion - cont'd
/* ATTENTION:
   a split at the LEAF level is handled by COPYING the middle key upstairs;
   a split at a higher level is handled by PUSHING the middle key upstairs */
82
11.82Database System Concepts B+ trees - insertion 1 3 6 6 9 9 <6 >=6>=6 <9 >=9>=9 713 Eg., insert ‘8’
83
11.83Database System Concepts B+ trees - insertion 1 3 6 6 9 9 <6 >=6 <9 >=9 713 Eg., insert ‘8’ 8
84
11.84Database System Concepts B+ trees - insertion 1 3 6 6 9 9 <6 >=6 <9 >=9 713 Eg., insert ‘8’ 8 COPY middle upstairs
85
11.85Database System Concepts B+ trees - insertion 1 3 6 6 9 <6 >=6 <9 >=9 9 13 Eg., insert ‘8’ COPY middle upstairs 7 8 7
86
11.86Database System Concepts B+ trees - insertion 1 3 6 6 9 <6 >=6 <9 >=9 9 13 Eg., insert ‘8’ COPY middle upstairs 7 8 7 Non-leaf overflow – just PUSH the middle
87
11.87 Database System Concepts
B+ trees - insertion
E.g., insert '8'
FINAL TREE:

                    [ 7 ]
            <7     /     \     >=7
              [ 6 ]       [ 9 ]
        <6   /   \ >=6  <9 /   \  >=9
       [1 3]    [6]    [7 8]    [9 13]
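The difference between COPYING (leaf split) and PUSHING (non-leaf split) comes down to whether the middle key stays in one of the two halves. The sketch below works on bare key lists only, with child pointers left out for brevity; the function names are invented for this illustration.

```python
def split_leaf(keys):
    """Leaf split: the separator is COPIED up and also stays in the right leaf."""
    m = len(keys) // 2
    left, right = keys[:m], keys[m:]
    return left, right[0], right


def split_internal(keys):
    """Non-leaf split: the middle key is PUSHED up and removed from both halves."""
    m = len(keys) // 2
    return keys[:m], keys[m], keys[m + 1:]


# Insert '8' into the example: leaf (6 7) becomes (6 7 8) and overflows.
print(split_leaf([6, 7, 8]))      # ([6], 7, [7, 8])
# The root (6 9) receives the copied 7, becomes (6 7 9) and overflows.
print(split_internal([6, 7, 9]))  # ([6], 7, [9])
```

After the leaf split, 7 appears both in the parent and at the leaf level; after the non-leaf split, 7 appears only in the new root, matching the final tree on the slide.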