
1 Storage CMSC 461 Michael Wilson

2 Database storage  At some point, database information must be stored in some format  It’d be impossible to store hundreds of thousands/millions of rows in memory  Numerous ways we could accomplish this  We have to take a few things into consideration

3 Storage concerns  Insertion efficiency  As the amount of data grows, inserts become more and more expensive, depending on how the data is organized  Retrieval efficiency  Similarly, the more data there is to search, the slower lookups become unless the structure supports efficient search  Space  Make sure our data structure doesn’t take up an excessive amount of disk space

4 Storage structures  Arrays?  Hash map?

5 B-tree  Generalization of a binary search tree (BST)  Nodes can have more than two children  Non-leaf nodes have several keys  The keys of a node define the bounds of the key ranges held by its children  For a non-leaf node, num keys = num children – 1  Nodes contain keys, each paired with a value  All leaves must be at the same depth

6 B-tree  The maximum number of children a node can have is the order of the tree (Knuth’s definition)  Can have a minimum number of keys that must be in a node  Typically choose the maximum number of keys to be twice the minimum number  This helps with balancing  A node with fewer keys than the minimum is said to underflow
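The structural rules from the last two slides can be checked mechanically. The following is a minimal Python sketch, not code from the course: `Node`, `leaf_depths`, `is_valid`, and the `min_keys` parameter are names made up here for illustration, and the maximum is assumed to be twice the minimum as suggested above. It also assumes the usual convention that the root may hold fewer than the minimum number of keys. It checks only the shape of the tree, not that child keys fall between the right separators.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)      # sorted keys
    children: list = field(default_factory=list)  # empty list means leaf


def leaf_depths(node, depth=0):
    """Return the set of depths at which leaves occur below `node`."""
    if not node.children:
        return {depth}
    depths = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths


def is_valid(node, min_keys, is_root=True):
    """Check the structural rules for the subtree rooted at `node`."""
    max_keys = 2 * min_keys  # assumption: maximum is twice the minimum
    if node.keys != sorted(node.keys):
        return False
    if len(node.keys) > max_keys:
        return False
    # Assumed convention: only non-root nodes can underflow.
    if not is_root and len(node.keys) < min_keys:
        return False
    # num keys = num children - 1 for every non-leaf node.
    if node.children and len(node.keys) != len(node.children) - 1:
        return False
    # All leaves at the same depth (checked once, from the root).
    if is_root and len(leaf_depths(node)) != 1:
        return False
    return all(is_valid(c, min_keys, is_root=False) for c in node.children)


root = Node(keys=[10, 20], children=[
    Node(keys=[3, 7]), Node(keys=[12, 15]), Node(keys=[25, 30]),
])
print(is_valid(root, min_keys=2))  # True
```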

7 B-tree  Non-leaf node with 3 children  Non-leaf node has keys k1 and k2 such that k1 < k2  All keys less than k1 will be in the child to the left of k1  All keys in between k1 and k2 are in the child between k1 and k2  All keys greater than k2 are in the child to the right of k2
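The ordering rule on this slide is what makes lookup work: at each node, the number of keys smaller than the target says which child to descend into. A small Python sketch of that idea follows. The `Node` class and `search` function are illustrative names, not part of the slides, and the standard-library `bisect_left` is used to count the smaller keys.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)      # sorted keys
    children: list = field(default_factory=list)  # empty list means leaf


def search(node, key):
    """Return True if `key` is stored in the subtree rooted at `node`."""
    while True:
        # i = number of keys in this node that are strictly less than `key`
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False
        # Child i holds exactly the keys between keys[i-1] and keys[i].
        node = node.children[i]


root = Node(keys=[10, 20], children=[
    Node(keys=[3, 7]), Node(keys=[12, 15]), Node(keys=[25, 30]),
])
print(search(root, 15))  # True
print(search(root, 8))   # False
```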

8 B-tree example

9 Insertion  Insert into the most appropriate leaf  If the node isn’t full, no problem – insert in the proper order (ordered keys)  If the node is full, we need to split

10 Splitting  A node splits when we try to insert a value into it and it is full  Take the list of numbers from the appropriate node and pick a median from that list  Remove it and store it in a value x  Make two new leaf nodes from the existing list  Left node – all values less than x  Right node – all values greater than x  Insert x into the parent node of the two new nodes and attach them appropriately

11 Splitting note  When inserting into the parent node, the two new child nodes stay at the same level  If the parent is full as well, it splits in the same way  A B-tree only grows in height from the root
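As a rough illustration of the insert-and-split steps, here is a Python sketch under some simplifying assumptions: it handles only a leaf that has a parent, it adds the new key before picking the median, and it does not go on to split the parent if the parent overflows. `Node`, `insert_into_leaf`, and `max_keys` are hypothetical names chosen for this example.

```python
from bisect import insort
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)      # sorted keys
    children: list = field(default_factory=list)  # empty list means leaf


def insert_into_leaf(parent, index, key, max_keys):
    """Insert `key` into the leaf parent.children[index]; split on overflow."""
    leaf = parent.children[index]
    insort(leaf.keys, key)              # keep the keys ordered
    if len(leaf.keys) <= max_keys:
        return                          # the node was not full: done
    mid = len(leaf.keys) // 2
    x = leaf.keys[mid]                  # the median moves up
    left = Node(keys=leaf.keys[:mid])        # all values less than x
    right = Node(keys=leaf.keys[mid + 1:])   # all values greater than x
    parent.keys.insert(index, x)
    # Both halves take the old leaf's place, at the same level.
    parent.children[index:index + 1] = [left, right]


parent = Node(keys=[10], children=[Node(keys=[1, 2, 3, 4]), Node(keys=[12, 15])])
insert_into_leaf(parent, 0, 5, max_keys=4)
print(parent.keys)                        # [3, 10]
print([c.keys for c in parent.children])  # [[1, 2], [4, 5], [12, 15]]
```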

12 Deletion  Deletion is more complicated  Two cases  Deleting from a leaf node  Deleting values from a leaf  Deleting from an internal node  Deleting a separator value

13 Deleting from a leaf node  If the value can be deleted and the node will not underflow, then delete it  Otherwise, the node is deficient  We must do work to rebalance the tree

14 Rotation (stealing from your siblings!)  You may remember this from red-black trees  Similar, but not quite the same here  If a deficient node has a right sibling and it has keys to spare, rotate left  If a deficient node has a left sibling and it has keys to spare, rotate right

15 Rotating left  Rotate left  Copy the separator between the deficient node and its right sibling to the end of the deficient node  Replace the separator with the lowest value from the right sibling, removing that value from the sibling
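A left rotation can be sketched in a few lines of Python. This is an illustrative sketch, not the course's code: `Node` and `rotate_left` are made-up names, and `i` is assumed to be the position of the deficient node among its parent's children, so that `parent.keys[i]` is the separator between it and its right sibling. The last step, moving a child pointer along with the key, only matters when the nodes are not leaves.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)      # sorted keys
    children: list = field(default_factory=list)  # empty list means leaf


def rotate_left(parent, i):
    """Refill the deficient node parent.children[i] from its right sibling."""
    deficient = parent.children[i]
    right = parent.children[i + 1]
    # The separator moves down to the end of the deficient node.
    deficient.keys.append(parent.keys[i])
    # The right sibling's lowest key moves up to become the new separator.
    parent.keys[i] = right.keys.pop(0)
    # For internal nodes, the sibling's first child moves over as well.
    if right.children:
        deficient.children.append(right.children.pop(0))


parent = Node(keys=[10], children=[Node(keys=[5]), Node(keys=[12, 15, 18])])
rotate_left(parent, 0)
print(parent.keys)                        # [12]
print([c.keys for c in parent.children])  # [[5, 10], [15, 18]]
```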

16 Rotating right  Rotate right  Copy the separator between the deficient node and its left sibling to the start of the deficient node  Replace the separator with the highest value from the left sibling, removing that value from the sibling
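A right rotation mirrors the left one. To keep the keys in order, the separator has to land at the front of the deficient node, and the key that replaces it has to be the left sibling's highest. The Python sketch below uses hypothetical names (`Node`, `rotate_right`) and assumes `i` is the deficient node's position among its parent's children, so `parent.keys[i - 1]` is the separator shared with the left sibling.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)      # sorted keys
    children: list = field(default_factory=list)  # empty list means leaf


def rotate_right(parent, i):
    """Refill the deficient node parent.children[i] from its left sibling."""
    deficient = parent.children[i]
    left = parent.children[i - 1]
    # The separator moves down to the front of the deficient node.
    deficient.keys.insert(0, parent.keys[i - 1])
    # The left sibling's highest key moves up to become the new separator.
    parent.keys[i - 1] = left.keys.pop()
    # For internal nodes, the sibling's last child moves over as well.
    if left.children:
        deficient.children.insert(0, left.children.pop())


parent = Node(keys=[10], children=[Node(keys=[3, 5, 7]), Node(keys=[12])])
rotate_right(parent, 1)
print(parent.keys)                        # [7]
print([c.keys for c in parent.children])  # [[3, 5], [10, 12]]
```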

17 Third case  What if neither sibling has keys to spare?  Third case:  We merge two siblings together  Pick a sibling (any sibling!)  Doesn’t matter which  Refer to them as the left node and right node

18 Merging siblings (stealing from your parents!)  Copy the separator between the two nodes from the parent to the end of the left node  Move all elements from the right node to the left node  Remove the separator from the parent and remove the right node  If the parent was the root and it now has no elements, replace the root with the merged node  If the parent now underflows, rebalance it the same way (rotate if a sibling has keys to spare, otherwise merge again)
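The merge steps translate fairly directly into code. The following Python sketch is illustrative only: `Node`, `merge_children`, and `merge_under_root` are hypothetical names, `i` is assumed to be the index of the left node of the pair, and rebalancing an underflowing non-root parent is left to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)      # sorted keys
    children: list = field(default_factory=list)  # empty list means leaf


def merge_children(parent, i):
    """Merge parent.children[i + 1] into parent.children[i]; return the result."""
    left = parent.children[i]
    right = parent.children[i + 1]
    # The separator comes down from the parent into the left node.
    left.keys.append(parent.keys.pop(i))
    # Everything from the right node moves to the left node.
    left.keys.extend(right.keys)
    left.children.extend(right.children)
    # The right node is no longer needed.
    del parent.children[i + 1]
    return left


def merge_under_root(root, i):
    """Merge two children of the root and return the (possibly new) root."""
    merged = merge_children(root, i)
    # If the root has no keys left, the merged node becomes the root.
    return root if root.keys else merged


root = Node(keys=[10], children=[Node(keys=[5]), Node(keys=[12, 15])])
root = merge_under_root(root, 0)
print(root.keys)  # [5, 10, 12, 15]
```

Because the root loses its last key here, the tree becomes one level shorter, which is the reverse of how it grows during a split.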

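19 Deleting from an internal node (stealing from children!)  This is pretty simple  The value to be deleted is a separator  Pull the highest value from the left child or the lowest value from the right child and replace the separator, deleting it from the child it was taken from  If that child is itself an internal node, keep descending and take the value from the leaf at the far right of the left subtree (or the far left of the right subtree)  If the leaf it was taken from now underflows, rebalance it as in the leaf case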
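A sketch of the separator replacement in Python, using the "highest value from the left side" option. The names `Node` and `delete_separator` are made up for this example. The function walks down to the rightmost leaf under the left child so that it also works when that child is not a leaf, and it returns that leaf so a caller could check it for underflow; the rebalancing itself is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)      # sorted keys
    children: list = field(default_factory=list)  # empty list means leaf


def delete_separator(node, i):
    """Delete separator node.keys[i] by overwriting it with its predecessor.

    Returns the leaf the replacement was taken from.
    """
    leaf = node.children[i]          # child to the left of the separator
    while leaf.children:             # descend to the rightmost leaf
        leaf = leaf.children[-1]
    # The highest value on the left replaces the separator
    # and is removed from the leaf it came from.
    node.keys[i] = leaf.keys.pop()
    return leaf


root = Node(keys=[10, 20], children=[
    Node(keys=[3, 7, 8]), Node(keys=[12, 15]), Node(keys=[25, 30]),
])
donor = delete_separator(root, 0)
print(root.keys)   # [8, 20]
print(donor.keys)  # [3, 7]
```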

