B+-Trees

Presentation transcript:

B+-Trees. Same structure as B-trees, except that dictionary pairs are in leaves only and the leaves form a doubly-linked list. The remaining (non-leaf) nodes have the following structure: j a0 k1 a1 k2 a2 … kj aj, where j = number of keys in the node, ai is a pointer to a subtree, and ki <= smallest key in subtree ai and > largest key in subtree ai-1. Non-leaf nodes can now be made smaller than leaf nodes; alternatively, the capacity of non-leaf nodes can be made larger. The doubly-linked list is useful for serial access in ascending order of key, and may be dispensed with. Instead of the smallest key in the right subtree, ki may be the largest key in the left subtree or some key in between. Item keys must still be distinct, i.e., no duplicates.

Example B+-tree. [Figure: root index node with key 9; its left child is an index node with key 5 over leaves (1 3) and (5 6); its right child is an index node with keys 16 30 over leaves (9), (16 17) and (30 40).] Yellow nodes (leaf/data nodes) have elements; green nodes (index nodes) have keys and pointers. Leaf capacity may be different from index-node capacity.

B+-tree: Search. [Figure: the example tree.] Exact-match search: key = 5. Range search: 6 <= key <= 20.
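The two searches can be sketched in Python as follows. This is only an illustration, not the slides' code: the `Leaf` and `Index` classes, the function names, and the shape of the example tree (read off the garbled figure) are assumptions. Index keys follow the rule that ki is <= the smallest key in subtree ai, so the descent picks child `bisect_right(keys, key)`.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, pairs):
        self.pairs = sorted(pairs)   # (key, value) dictionary pairs
        self.next = None             # right neighbour in the leaf list

class Index:
    def __init__(self, keys, children):
        self.keys = keys             # k1..kj
        self.children = children     # a0..aj

def find_leaf(node, key):
    # Descend through index nodes; ki <= smallest key in subtree ai.
    while isinstance(node, Index):
        node = node.children[bisect_right(node.keys, key)]
    return node

def search(root, key):
    for k, v in find_leaf(root, key).pairs:
        if k == key:
            return v
    return None

def range_search(root, lo, hi):
    # Find the leaf for lo, then walk the leaf list to the right.
    leaf, found = find_leaf(root, lo), []
    while leaf is not None:
        for k, _ in leaf.pairs:
            if k > hi:
                return found
            if k >= lo:
                found.append(k)
        leaf = leaf.next
    return found

def build_example():
    # Assumed reconstruction of the example tree from the slides.
    groups = ([1, 3], [5, 6], [9], [16, 17], [30, 40])
    leaves = [Leaf([(k, str(k)) for k in g]) for g in groups]
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return Index([9], [Index([5], leaves[0:2]), Index([16, 30], leaves[2:5])])

root = build_example()
print(search(root, 5))            # value stored under key 5
print(range_search(root, 6, 20))  # keys in [6, 20]
```

The range search uses only the forward links of the leaf list; the backward links of a doubly-linked list would serve a descending scan.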

B+-tree: Insert. Insert 10. [Figure: the example tree; the new pair goes into the leaf that holds 9.] Note that an insert that does not cause an overflow cannot change any of the index node entries.

Insert. Insert a pair with key = 2. [Figure: the example tree.] The new pair goes into a 3-node, the leaf (1 3).

Insert Into A 3-node. Insert the new pair so that the keys are in ascending order: 1 2 3. Split into two nodes: (1) and (2 3). Insert the smallest key in the new node, 2, and a pointer to this new node into the parent. The code will do all 3 steps as one.
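A minimal sketch of the three steps for a leaf, assuming a leaf capacity of 2 pairs (a 3-node) and representing a leaf by its list of keys only. The function name `insert_into_leaf` and its return convention are hypothetical, chosen for illustration.

```python
def insert_into_leaf(keys, new_key, capacity=2):
    """Insert new_key into a leaf's key list.

    Returns (left, right, separator). When the leaf does not overflow,
    right and separator are None and left is the updated leaf.
    """
    keys = sorted(keys + [new_key])        # step 1: ascending order
    if len(keys) <= capacity:
        return keys, None, None
    mid = len(keys) // 2                   # step 2: split into two nodes
    left, right = keys[:mid], keys[mid:]
    # Step 3: the smallest key of the new (right) node is copied up;
    # it stays in the leaf because only leaves hold dictionary pairs.
    return left, right, right[0]

print(insert_into_leaf([1, 3], 2))   # overflow: split and report separator
print(insert_into_leaf([9], 10))     # no overflow: index entries unchanged
```

Note that the separator is copied, not moved: key 2 remains in the new leaf and also appears in the parent index node.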

Insert. Insert an index entry 2 plus a pointer into parent. [Figure: the parent index node 5 becomes 2 5, over leaves (1), (2 3) and (5 6).]

Insert. [Figure: root 9; index node 2 5 over leaves (1), (2 3), (5 6); index node 16 30 over leaves (9), (16 17), (30 40).] Now, insert a pair with key = 18.

Insert. Now, insert a pair with key = 18. [Figure: leaf (16 17) splits into (16) and (17 18).] Insert an index entry 17 plus a pointer into parent. Insertion of index entries works as for B-trees.

Insert. [Figure: index node 16 30 overflows on receiving entry 17 and splits into index nodes 16 and 30; key 17 moves up into the root, which becomes 9 17.] Insertion of index entries works as for B-trees.

Insert. [Figure: root 9 17; index nodes 2 5, 16 and 30; leaves (1), (2 3), (5 6), (9), (16), (17 18), (30 40).] Now, insert a pair with key = 7. The yellow leaf (5 6) splits into (5) and (6 7). Index entry 6 is inserted into the parent. The parent splits as in a 2-3 tree, the grandparent splits, the root splits, and the height increases by 1.
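Splitting an index node differs from splitting a leaf: the middle key moves up to the parent instead of being copied. The sketch below assumes an index-node capacity of 2 keys and represents children by placeholder strings; `insert_into_index` and its return convention are hypothetical names for illustration.

```python
from bisect import bisect_right

def insert_into_index(keys, children, new_key, new_child, capacity=2):
    """Insert (new_key, new_child) into an index node.

    new_child is the subtree to the right of new_key. Returns
    (left, up_key, right), where left and right are (keys, children)
    tuples; up_key and right are None when there is no overflow.
    """
    pos = bisect_right(keys, new_key)
    keys = keys[:pos] + [new_key] + keys[pos:]
    children = children[:pos + 1] + [new_child] + children[pos + 1:]
    if len(keys) <= capacity:
        return (keys, children), None, None
    mid = len(keys) // 2
    up_key = keys[mid]                      # moves up, kept in neither half
    left = (keys[:mid], children[:mid + 1])
    right = (keys[mid + 1:], children[mid + 1:])
    return left, up_key, right

# Index node 16 30 receives entry 17 after leaf (16 17) splits on insert 18.
left, up_key, right = insert_into_index(
    [16, 30], ["leaf 9", "leaf 16", "leaf 30 40"], 17, "leaf 17 18")
print(left, up_key, right)
```

Repeating this step at each overflowing ancestor gives the cascade described on the slide; when the root itself splits, a new root holding only the key that moved up is created and the height grows by 1.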

Delete. Delete pair with key = 16. [Figure: root 9; index node 2 5 over leaves (1), (2 3), (5 6); index node 16 30 over leaves (9), (16 17), (30 40).] Note that all deletions are necessarily from a leaf, as only leaves have data.

Delete. Delete pair with key = 16. [Figure: leaf (16 17) becomes (17); the index nodes are unchanged.] Note: the pair to delete is always in a leaf.

Delete. Delete pair with key = 1. [Figure: root 9; index node 2 5 over leaves (1), (2 3), (5 6); index node 16 30 over leaves (9), (17), (30 40).] Get >= 1 pair from an adjacent sibling and update the parent key. Note: we may borrow more than 1 pair from the sibling; this could balance the sizes of the deficient node and its sibling.

Delete. Delete pair with key = 1. [Figure: the index node 2 5 becomes 3 5, over leaves (2), (3), (5 6).] Get >= 1 pair from sibling and update parent key.
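Borrowing from a sibling leaf can be sketched as below. Leaves are plain key lists and the parent is its list of keys; the function names and the `sep_index` parameter (the position of the parent key between the two leaves) are assumptions for illustration. Only one pair is moved here, although the slides note that more may be borrowed.

```python
def borrow_from_right(leaf, right, parent_keys, sep_index):
    # Move the smallest key of the right sibling into the deficient leaf,
    # then reset the separator to the sibling's new smallest key.
    leaf.append(right.pop(0))
    parent_keys[sep_index] = right[0]

def borrow_from_left(leaf, left, parent_keys, sep_index):
    # Move the largest key of the left sibling into the deficient leaf;
    # that key becomes the new separator.
    leaf.insert(0, left.pop())
    parent_keys[sep_index] = leaf[0]

# Delete 1 from leaf (1): the leaf is empty, so borrow 2 from (2 3).
leaf, sibling, parent = [], [2, 3], [2, 5]
borrow_from_right(leaf, sibling, parent, 0)
print(leaf, sibling, parent)
```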

Delete. Delete pair with key = 2. [Figure: index node 3 5 over leaves (2), (3), (5 6).] Merge with sibling, delete in-between key in parent.

Delete. Delete pair with key = 3. [Figure: index node 5 over leaves (3) and (5 6).] Get >= 1 pair from sibling and update parent key.

Delete. Delete pair with key = 9. [Figure: root 9; index node 6 over leaves (5), (6); index node 16 30 over leaves (9), (17), (30 40).] Merge with sibling, delete in-between key in parent.

Delete. [Figure: root 9; index node 6 over leaves (5), (6); index node 30 over leaves (17), (30 40).]

Delete. Delete pair with key = 6. [Figure: root 9; index node 6 over leaves (5), (6); index node 16 30 over leaves (9), (17), (30 40).] Merge with sibling, delete in-between key in parent.

Delete. [Figure: root 9; left index node with no keys over leaf (5); right index node 16 30 over leaves (9), (17), (30 40).] Index node becomes deficient. Get >= 1 from sibling, move last one to parent, get parent key.

Delete. Delete 9. [Figure: root 16; index node 9 over leaves (5), (9); index node 30 over leaves (17), (30 40).] Merge with sibling, delete in-between key in parent.

Delete. [Figure: root 16; left index node with no keys over leaf (5); right index node 30 over leaves (17), (30 40).] Index node becomes deficient. Merge with sibling and in-between key in parent. Note that a merge of index nodes is different from a merge of data nodes.
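The difference between the two kinds of merge can be sketched as follows: when leaves merge, the in-between parent key is simply deleted, whereas when index nodes merge, it is pulled down between the two key lists. The `Node` class and the `merge` function are hypothetical, written only to illustrate this distinction.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children   # None for a leaf (data node)

def merge(parent, i):
    """Merge parent.children[i + 1] into parent.children[i]."""
    left, right = parent.children[i], parent.children[i + 1]
    sep = parent.keys.pop(i)               # in-between key leaves the parent
    if left.children is None:
        left.keys += right.keys            # leaf merge: separator is dropped
    else:
        left.keys += [sep] + right.keys    # index merge: separator comes down
        left.children += right.children
    del parent.children[i + 1]
    return left

# Leaf merge: delete 2 from leaf (2) under index node 3 5.
idx = Node([3, 5], [Node([]), Node([3]), Node([5, 6])])
merge(idx, 0)
print(idx.keys, [c.keys for c in idx.children])

# Index merge: deficient index node and its sibling 30 under root 16.
root = Node([16], [Node([], [Node([5])]),
                   Node([30], [Node([17]), Node([30, 40])])])
merged = merge(root, 0)
print(merged.keys, [c.keys for c in merged.children], root.keys)
```

After the index merge in this example the root has no keys left, which is the case where the root is discarded and the merged node becomes the new root.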

Delete. [Figure: a root with no keys over the merged index node 16 30, which has leaves (5), (17), (30 40).] Index node becomes deficient. It's the root; discard.

B*-Trees. Root has between 2 and 2 * floor((2m – 2)/3) + 1 children. Remaining nodes have between ceil((2m – 1)/3) and m children. All external/failure nodes are on the same level. For m = 3 the lower bound is the same as for B-trees. At m = 4, a B-tree allows nodes with degree 2 but a B*-tree needs degree 3. Assume m > 3 in the following slides (needed for delete). The fat root is needed to support insert and delete. B*-trees improve worst-case search time at the expense of inserts/deletes, which is useful when search is the dominant operation.
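The degree bounds can be tabulated with a few lines of Python. The function names are assumptions; the B-tree lower bound ceil(m/2) for non-root nodes is the standard one and is included only for comparison.

```python
def ceil_div(a, b):
    return -(-a // b)

def btree_node_degrees(m):
    # Standard B-tree of order m: non-root nodes have ceil(m/2)..m children.
    return ceil_div(m, 2), m

def bstar_node_degrees(m):
    # B*-tree: non-root nodes have ceil((2m - 1)/3)..m children.
    return ceil_div(2 * m - 1, 3), m

def bstar_root_degrees(m):
    # B*-tree root: 2 .. 2 * floor((2m - 2)/3) + 1 children.
    return 2, 2 * ((2 * m - 2) // 3) + 1

for m in (3, 4, 7, 100):
    print(m, btree_node_degrees(m), bstar_node_degrees(m), bstar_root_degrees(m))
```

For m = 3 the two lower bounds coincide at 2, and for m = 4 they are 2 and 3, matching the remarks on the slide.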

Insert. When the insert node is overfull, check an adjacent sibling. If the adjacent sibling is not full, move a dictionary pair from the overfull node, via the parent, to the nonfull adjacent sibling. If the adjacent sibling is full, split the overfull node, the adjacent full node, and the in-between pair from the parent to get three nodes with floor((2m – 2)/3), floor((2m – 1)/3), and floor(2m/3) pairs, plus two additional pairs for insertion into the parent. Note that the sum of the 3 floors + 2 always equals 2m, which is the number of pairs in the two nodes being split (m and m – 1) plus the in-between pair.
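The counting argument for the two-to-three split can be checked numerically. The sketch below only verifies the arithmetic; the function name is hypothetical.

```python
def three_way_split_sizes(m):
    # Pairs in the three nodes produced by a B*-tree two-to-three split.
    return (2 * m - 2) // 3, (2 * m - 1) // 3, (2 * m) // 3

for m in range(4, 200):
    sizes = three_way_split_sizes(m)
    # Overfull node (m pairs) + full sibling (m - 1) + in-between pair.
    pairs_in = m + (m - 1) + 1
    # Three new nodes plus the two pairs that go into the parent.
    assert sum(sizes) + 2 == pairs_in == 2 * m
    # Every new node respects the minimum and maximum pair counts.
    assert all((2 * m - 2) // 3 <= s <= m - 1 for s in sizes)

print(three_way_split_sizes(4), three_way_split_sizes(5))
```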

Delete. When combining, must combine 3 adjacent nodes and 2 in-between pairs from parent. Total # pairs involved = 2 * floor((2m – 2)/3) + [floor((2m – 2)/3) – 1] + 2, which equals 3 * floor((2m – 2)/3) + 1. Combining yields 2 nodes and a pair that is to be inserted into the parent. m mod 3 = 1 => nodes have m – 1 pairs each. m mod 3 = 0 => one node has m – 1 pairs and the other has m – 2. m mod 3 = 2 => nodes have m – 2 pairs each. If there aren't 3 adjacent nodes, the nodes must be children of the root (assume m > 3); the two nodes and the in-between item can be combined to form a new root. Search time is improved; insert and delete time is worse. A B*-tree needs about a 40% height reduction vs a B-tree for the worst-case number of disk accesses in insert/delete to be the same as for the B-tree. Each split is now 4 accesses (1 read of neighbor and 3 writes), for a worst case of h + 4s + 1, or 5h + 1. Each combine is 2 reads plus 2 writes vs 1 read plus 1 write.
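The three-to-two combine arithmetic can be checked the same way. This sketch assumes the two surviving nodes share the remaining pairs as evenly as possible; `combine_sizes` is a hypothetical name.

```python
def combine_sizes(m):
    f = (2 * m - 2) // 3             # minimum pairs in a non-root node
    total = 2 * f + (f - 1) + 2      # two minimal nodes, one deficient, two parent pairs
    assert total == 3 * f + 1
    remaining = total - 1            # one pair goes back into the parent
    return (remaining + 1) // 2, remaining // 2

for m in range(4, 200):
    big, small = combine_sizes(m)
    if m % 3 == 1:
        assert (big, small) == (m - 1, m - 1)
    elif m % 3 == 0:
        assert (big, small) == (m - 1, m - 2)
    else:
        assert (big, small) == (m - 2, m - 2)

print(combine_sizes(5), combine_sizes(6), combine_sizes(7))
```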