Published by Barnaby McCoy. Modified over 9 years ago.
1
Storage CMSC 461 Michael Wilson
2
Database storage: At some point, database information must be stored in some format. It is often impractical to keep hundreds of thousands or millions of rows entirely in memory. There are numerous ways we could accomplish this, and we have to take a few things into consideration.
3
Storage concerns: Insertion efficiency: with large amounts of data, inserting becomes more and more costly, depending on how you insert. Retrieval efficiency: similarly, a larger body of data takes longer to search. Space: make sure our data structure doesn’t take up an excessive amount of disk space.
4
Storage structures: Arrays? Hash maps?
5
B-tree: A generalization of a binary search tree (BST). Nodes can have more than two children. Non-leaf nodes have several keys, and each key defines the bounds of the node’s children: num keys = num children - 1. Nodes contain keys, and each key is paired with a value. All leaves must be at the same depth.
6
B-tree: The maximum number of children a node can have is the order of the tree (Knuth’s definition). A node can also have a minimum number of keys that it must hold. Typically the maximum number of keys is chosen to be twice the minimum, which helps with balancing. A node with fewer keys than the minimum is said to underflow.
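The limits above can be written down as a few constants and checks. This is a minimal sketch: the names `Node`, `MIN_KEYS`, `MAX_KEYS`, `ORDER`, `is_full` and `is_underflowing` are hypothetical, the value 2 is an arbitrary choice, and exempting the root from the minimum is a common convention assumed here rather than something the slides state.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)      # kept in sorted order
    children: list = field(default_factory=list)  # empty for a leaf


MIN_KEYS = 2             # assumed minimum number of keys per node
MAX_KEYS = 2 * MIN_KEYS  # maximum chosen as twice the minimum
ORDER = MAX_KEYS + 1     # Knuth: maximum number of children per node


def is_full(node):
    """True when the node cannot take another key without splitting."""
    return len(node.keys) >= MAX_KEYS


def is_underflowing(node, is_root=False):
    """True when the node holds fewer keys than the minimum.

    Assumption: the root is allowed to hold fewer keys.
    """
    return not is_root and len(node.keys) < MIN_KEYS
```

With these values a node holds between 2 and 4 keys, so an internal node has between 3 and 5 children.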
7
B-tree: Consider a non-leaf node with 3 children. It has keys k1 and k2 such that k1 < k2. All keys less than k1 are in the child to the left of k1. All keys between k1 and k2 are in the child between k1 and k2. All keys greater than k2 are in the child to the right of k2.
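The key bounds are what make lookup work: at each node, the position of the search key among the node's keys tells us which child to descend into. A minimal sketch, assuming distinct keys and a hypothetical `Node` class with sorted `keys` and a `children` list that is empty for leaves:

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted list of keys
        self.children = children or []   # empty list means this is a leaf


def search(node, key):
    """Return True if key is stored somewhere in the subtree under node."""
    while True:
        # Index of the first key >= the search key; this is also the index
        # of the child whose range contains the search key.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False
        node = node.children[i]


# A node with keys k1 = 10 and k2 = 20 and three children.
root = Node([10, 20], [Node([1, 5]), Node([12, 17]), Node([25, 30])])
```

Here `search(root, 17)` descends into the middle child because 10 < 17 < 20.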
8
B-tree example
9
Insertion: Insert into the most appropriate leaf. If the node isn’t full, no problem: insert in the proper order (keys stay ordered). If the node is full, we need to split.
10
Splitting: A node splits when we try to insert a value into it and it is full. Take the list of values from that node, including the new value, and pick the median of that list. Remove it and store it as x. Make two new nodes from the remaining list. Left node: all values less than x. Right node: all values greater than x. Insert x into the parent of the two new nodes and attach them on either side of x. If the parent is full too, it splits in the same way.
11
Splitting note: When the median is inserted into the parent node, the two new child nodes stay at the same level. A B-tree only grows in height at the root: when the root itself splits, a new root is created above it.
12
Deletion: Deletion is more complicated. There are two cases. Deleting from a leaf node: deleting a value stored in a leaf. Deleting from an internal node: deleting a separator value.
13
Deleting from a leaf node: If the value can be deleted and the node will not underflow, then just delete it. Otherwise the node becomes deficient, and we must do work to rebalance the tree.
14
Rotation (stealing from your siblings!): You may remember this from red-black trees. It is similar, but not quite the same here. If a deficient node has a right sibling with keys to spare, rotate left. If a deficient node has a left sibling with keys to spare, rotate right.
15
Rotating left: Copy the separator between the deficient node and its right sibling to the end of the deficient node. Replace the separator with the lowest value from the right sibling, and remove that value from the right sibling.
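A left rotation is short enough to write out directly. In this sketch the names `Node` and `rotate_left` are hypothetical, and the deficient node is identified by its index `i` in the parent's child list. Moving the sibling's first child across as well is needed when the nodes are internal, a detail assumed here rather than taken from the slide.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty list means this is a leaf


def rotate_left(parent, i):
    """Fix deficient parent.children[i] by borrowing from its right sibling."""
    node, right = parent.children[i], parent.children[i + 1]
    node.keys.append(parent.keys[i])       # separator moves down to the end
    parent.keys[i] = right.keys.pop(0)     # sibling's lowest key moves up
    if right.children:                     # internal nodes: move a child too
        node.children.append(right.children.pop(0))


parent = Node([10], [Node([5]), Node([12, 15, 18])])
rotate_left(parent, 0)
```

After the call the deficient node holds 5 and 10, the separator is 12, and the right sibling keeps 15 and 18, so the ordering is preserved.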
16
Rotating right: Copy the separator between the deficient node and its left sibling to the start of the deficient node. Replace the separator with the highest value from the left sibling, and remove that value from the left sibling.
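A right rotation mirrors the left one: to keep the keys ordered, the separator has to land at the front of the deficient node, and the key that replaces it has to be the left sibling's highest. A minimal sketch with hypothetical names `Node` and `rotate_right`, where `i` is the index of the deficient node in the parent's child list:

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty list means this is a leaf


def rotate_right(parent, i):
    """Fix deficient parent.children[i] by borrowing from its left sibling."""
    node, left = parent.children[i], parent.children[i - 1]
    node.keys.insert(0, parent.keys[i - 1])  # separator moves down to the front
    parent.keys[i - 1] = left.keys.pop()     # sibling's highest key moves up
    if left.children:                        # internal nodes: move a child too
        node.children.insert(0, left.children.pop())


parent = Node([10], [Node([3, 5, 8]), Node([12])])
rotate_right(parent, 1)
```

After the call the left sibling holds 3 and 5, the separator is 8, and the formerly deficient node holds 10 and 12.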
17
Third case: What if neither sibling has keys to spare? Then we merge two siblings together. Pick a sibling (any sibling!); it doesn’t matter which. Refer to the pair as the left node and the right node.
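The choice between the three repairs can be sketched as a small decision function. The names `Node`, `MIN_KEYS` and `choose_fix` are hypothetical, and "keys to spare" is read as holding strictly more than the minimum. Checking the right sibling first is an arbitrary ordering assumed for this sketch.

```python
MIN_KEYS = 2  # assumed minimum number of keys per node


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def choose_fix(parent, i):
    """Name the repair to apply to the deficient node parent.children[i]."""
    siblings = parent.children
    # A right sibling with keys to spare: rotate left.
    if i + 1 < len(siblings) and len(siblings[i + 1].keys) > MIN_KEYS:
        return "rotate_left"
    # A left sibling with keys to spare: rotate right.
    if i > 0 and len(siblings[i - 1].keys) > MIN_KEYS:
        return "rotate_right"
    # Neither sibling can lend a key: merge with one of them.
    return "merge"


lend_right = Node([10, 20], [Node([1, 2]), Node([12]), Node([25, 30, 35])])
lend_left = Node([10, 20], [Node([1, 2, 3]), Node([12]), Node([25, 30])])
nobody = Node([10, 20], [Node([1, 2]), Node([12]), Node([25, 30])])
```

For the three example parents, the deficient middle child is repaired by a left rotation, a right rotation, and a merge respectively.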
18
Merging siblings (stealing from your parents!): Copy the separator between the two nodes from the parent to the end of the left node. Move all elements from the right node to the left node. Remove the separator from the parent and remove the right node. If the parent was the root and it now has no elements, replace the root with the merged node. Otherwise, if the parent is now underflowing, rebalance the parent in the same way.
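The merge steps translate almost line for line into code. In this sketch the names `Node`, `merge` and `shrink_root` are hypothetical, `i` is the index of the left node in the parent's child list, and rebalancing an underflowing non-root parent is left to the caller.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty list means this is a leaf


def merge(parent, i):
    """Merge parent.children[i] and parent.children[i + 1]; return the result."""
    left, right = parent.children[i], parent.children[i + 1]
    left.keys.append(parent.keys.pop(i))   # separator moves down into left
    left.keys.extend(right.keys)           # right node's keys follow it
    left.children.extend(right.children)   # and its children, if internal
    del parent.children[i + 1]             # right node is no longer needed
    return left


def shrink_root(root):
    """If merging emptied the root, its single child becomes the new root."""
    if not root.keys and root.children:
        return root.children[0]
    return root


root = Node([10], [Node([5]), Node([12])])
merged = merge(root, 0)
root = shrink_root(root)
```

In the example the root loses its only separator, so the merged node becomes the new root and the tree gets one level shorter.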
19
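Deleting from an internal node (stealing from children!): This is pretty simple. The value to be deleted is a separator. Pull the highest value from the left child or the lowest value from the right child and use it to replace the separator, deleting it from the child it was taken from. If that child is not a leaf, keep descending to the highest value in the left subtree (or the lowest in the right subtree). If the node the value was taken from now underflows, rebalance it as in the leaf case.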
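A minimal sketch of removing a separator, using the left-hand option (the highest value in the left subtree). The names `Node` and `delete_separator` are hypothetical. The function returns the leaf it took the replacement from so that the caller can check it for underflow.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty list means this is a leaf


def delete_separator(node, i):
    """Delete node.keys[i] by overwriting it with its in-order predecessor."""
    child = node.children[i]          # subtree to the left of the separator
    while child.children:             # walk down to its rightmost leaf
        child = child.children[-1]
    node.keys[i] = child.keys.pop()   # highest key there replaces the separator
    return child                      # caller rebalances this leaf if needed


root = Node([10, 20], [Node([1, 5, 7]), Node([12, 17]), Node([25, 30])])
leaf = delete_separator(root, 0)
```

Deleting the separator 10 pulls 7 up from the left child, leaving that leaf with 1 and 5.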