B+ Trees What are B+ Trees used for What is a B Tree What is a B+ Tree

Slides:



Advertisements
Similar presentations
 Definition of B+ tree  How to create B+ tree  How to search for record  How to delete and insert a data.
Advertisements

B+-Trees (PART 1) What is a B+ tree? Why B+ trees? Searching a B+ tree
CS 206 Introduction to Computer Science II 12 / 01 / 2008 Instructor: Michael Eckmann.
CSE 326: Data Structures B-Trees Ben Lerner Summer 2007.
Preliminaries Multiway trees have nodes with greater than two children. Multiway trees of order k have nodes with most k children Trees –For all.
Tirgul 6 B-Trees – Another kind of balanced trees.
CS4432: Database Systems II
1 B-Trees Section AVL (Adelson-Velskii and Landis) Trees AVL tree is binary search tree with balance condition –To ensure depth of the tree is.
IntroductionIntroduction  Definition of B-trees  Properties  Specialization  Examples  2-3 trees  Insertion of B-tree  Remove items from B-tree.
B+ Tree What is a B+ Tree Searching Insertion Deletion.
B-Tree. B-Trees a specialized multi-way tree designed especially for use on disk In a B-tree each node may contain a large number of keys. The number.
 B+ Tree Definition  B+ Tree Properties  B+ Tree Searching  B+ Tree Insertion  B+ Tree Deletion.
B-trees (Balanced Trees) A B-tree is a special kind of tree, similar to a binary tree. However, It is not a binary search tree. It is not a binary tree.
1 B Trees - Motivation Recall our discussion on AVL-trees –The maximum height of an AVL-tree with n-nodes is log 2 (n) since the branching factor (degree,
COSC 2007 Data Structures II Chapter 15 External Methods.
B + -Trees. Motivation An AVL tree with N nodes is an excellent data structure for searching, indexing, etc. The Big-Oh analysis shows that most operations.
Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy.
B+ Trees  What if you have A LOT of data that needs to be stored and accessed quickly  Won’t all fit in memory.  Means we have to access your hard.
 B-tree is a specialized multiway tree designed especially for use on disk  B-Tree consists of a root node, branch nodes and leaf nodes containing the.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 B+-Tree Index Chapter 10 Modified by Donghui Zhang Nov 9, 2005.
COMP261 Lecture 23 B Trees.
TCSS 342, Winter 2006 Lecture Notes
Multilevel Indexing and B+ Trees
Multilevel Indexing and B+ Trees
Multiway Search Trees Data may not fit into main memory
Tree-Structured Indexes: Introduction
B-Trees .
Indexing ? Why ? Need to locate the actual records on disk without having to read the entire table into memory.
CSE 332 Data Abstractions B-Trees
Extra: B+ Trees CS1: Java Programming Colorado State University
B+-Trees.
B+ Trees What are B+ Trees used for What is a B Tree What is a B+ Tree
Btrees Insertion.
B+-Trees.
B+-Trees.
Lecture 22 Binary Search Trees Chapter 10 of textbook
B+ Tree.
B-Trees Disk Storage What is a multiway tree? What is a B-tree?
Trees 4 The B-Tree Section 4.7
B Tree Adhiraj Goel 1RV07IS004.
(2,4) Trees (2,4) Trees 1 (2,4) Trees (2,4) Trees
Wednesday, April 18, 2018 Announcements… For Today…
Lecture 26 Multiway Search Trees Chapter 11 of textbook
BTrees.
Search Sorted Array: Binary Search Linked List: Linear Search
(2,4) Trees /26/2018 3:48 PM (2,4) Trees (2,4) Trees
CSE 326: Data Structures Lecture #9 AVL II
B+-Trees and Static Hashing
CS222/CS122C: Principles of Data Management Notes #07 B+ Trees
B-Tree.
B+Trees The slides for this text are organized into chapters. This lecture covers Chapter 9. Chapter 1: Introduction to Database Systems Chapter 2: The.
(2,4) Trees (2,4) Trees (2,4) Trees.
A Robust Data Structure
Multiway Trees Searching and B-Trees Advanced Tree Structures
B-Trees Disk Storage What is a multiway tree? What is a B-tree?
2-3-4 Trees Red-Black Trees
Adapted from Mike Franklin
B-Trees Disk Storage What is a multiway tree? What is a B-tree?
(2,4) Trees 2/15/2019 (2,4) Trees (2,4) Trees.
(2,4) Trees /24/2019 7:30 PM (2,4) Trees (2,4) Trees
(2,4) Trees (2,4) Trees (2,4) Trees.
CSE 373: Data Structures and Algorithms
CSE 373 Data Structures and Algorithms
CSE 373: Data Structures and Algorithms
B-Trees.
Tree-Structured Indexes
Indexing, Access and Database System Architecture
CSE 326: Data Structures Lecture #10 B-Trees
CS222/CS122C: Principles of Data Management UCI, Fall 2018 Notes #06 B+ trees Instructor: Chen Li.
CS222P: Principles of Data Management UCI, Fall Notes #06 B+ trees
Presentation transcript:

B+ Trees What are B+ Trees used for What is a B Tree What is a B+ Tree Searching Insertion Deletion

B-trees A self-balancing tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in log n. Generalization of a binary search tree in that a node can have more than two children. For systems that read and write large blocks of data. B-trees are a good example of a data structure for external memory. Commonly used in databases and filesystems.

What are B+ Trees Used For? When we store data in a table in a DBMS we want Fast lookup by primary key Ability to add/remove records on the fly Some kind of dynamic tree on disk Sequential access to records (physically sorted by primary key on disk) Tree structured keys (hierarchical index for searching) Records all at leaves in sorted order

What is an M-ary Search Tree? Maximum branching factor of M Complete tree has depth = logMN Each internal node in a complete tree has M - 1 keys Binary search tree is a B Tree where M is 2

B Trees B-Trees are specialized M-ary search trees Each node has many keys subtree between two keys x and y contains values v such that x  v < y binary search within a node to find correct subtree Each node takes one full {page, block, line} of memory (disk) 3 7 12 21 x<3 3<x<7 7<x<12 12<x<21 21<x

B-Tree Properties Properties Result maximum branching factor of M the root has between 2 and M children or at most M-1 keys All other nodes have between M/2 and M records Keys+data Result tree is O(log M) deep all operations run in O(log M) time operations pull in about M items at a time All the other internal nodes (non-leaves) will have between M/2 and M children. The funky symbol is ceiling, the next higher integer above the value. The result of this is that the tree is “pretty” full. Not every node has M children but they’ve all at least got M/2 (a good number). Internal nodes contain only search keys. A search key is a value which is solely for comparison; there’s no data attached to it. The node will have one fewer search key than it has children (subtrees) so that we can search down to each child. The smallest datam between two search keys is equal to the lesser search key. This is how we find the search keys to use.

What is a B+ Tree? A variation of B trees in which internal nodes contain only search keys (no data) leaf nodes contain pointers to data records data records are in sorted order by the search key all leaves are at the same depth The properties of B-Trees (and the trees themselves) are a bit more complex than previous structures we’ve looked at. Here’s a big, gnarly list; we’ll go one step at a time. The maximum branching factor, as we said, is M (tunable for a given tree). The root has between 2 and M children or at most L keys. (L is another parameter) These restrictions will be different for the root than for other nodes.

… … B+ Tree Nodes Internal node Leaf Pointer (Key, NodePointer)*M-1 in each node First i keys are currently in use k0 k1 … ki-1 __ … __ 1 i-1 M - 2 Leaf (Key, DataPointer)* L in each node first j Keys are currently in use FIX M-I to M-I-1!! Alright, before we look at any examples, let’s look at what the node structure looks like. Internal nodes are arrays of pointers to children interspersed with search keys. Why must they be arrays rather than linked lists? Because we want contiguous memory! If the node has just I+1 children, it has I search keys, and M-I empty entries. A leaf looks similar (I’ll use green for leaves), and has similar properties. Why are these different? Because internal nodes need subtrees-1 keys. k0 k1 … kj-1 __ … __ 1 L-1 Data for kj-1 Data for k0 Data for k2

Example B+ Tree with M = 4 Often, leaf nodes linked together 10 40 3 15 20 30 50 This is just an example B-tree. Notice that it has 24 entries with a depth of only 2. A BST would be 4 deep. Notice also that the leaves are at the same level in the tree. I’ll use integers as both key and data, but we all know that that could as well be different data at the bottom, right? 1 2 10 11 12 20 25 26 40 42 3 5 6 9 15 17 30 32 33 36 50 60 70

Advantages of B+ tree usage for databases keeps keys in sorted order for sequential traversing uses a hierarchical index to minimize the number of disk reads uses partially full blocks to speed insertions and deletions keeps the index balanced with a recursive algorithm In addition, a B+ tree minimizes waste by making sure the interior nodes are at least half full. a B+ tree can handle an arbitrary number of insertions and deletions.

Searching Just compare the key value with the data in the tree, then return the result. For example: find the value 45, and 15 in below tree.

Searching Result: 1. For the value of 45, not found. 2. For the value of 15, return the position where the pointer located.

Insertion inserting a value into a B+ tree may unbalance the tree, so rearrange the tree if needed. Example #1: insert 28 into the below tree. 25 28 30 Fits inside the leaf

Insertion Result:

Insertion Example #2: insert 70 into below tree

Does not fit inside the leaf Insertion Process: split the leaf and propagate middle key up the tree 50 55 60 65 70 Does not fit inside the leaf 50 55 60 65 70

Insertion Result: chose the middle key 60, and place it in the index page between 50 and 75.

Insertion The insert algorithm for B+ Tree Leaf Node Full Index Node Full Action NO Place the record in sorted position in the appropriate leaf page YES Split the leaf node Place Middle Key in the index node in sorted order. Left leaf node contains records with keys below the middle key. Right leaf node contains records with keys equal to or greater than the middle key. Split the leaf node. Records with keys < middle key go to the left leaf node. Records with keys >= middle key go to the right leaf node. Split the index node. Keys < middle key go to the left index node. Keys > middle key go to the right index node. The middle key goes to the next (higher level) index node. IF the next level index node is full, continue splitting the index nodes.

Leaf node full, split the leaf. Insertion Exercise: add a key value 95 to the below tree. Add 95 to 75, 80, 85, 90; node full. Split into: 75, 80 and 85, 90 95. Propagate middle value (85) to parent. Parent is full with 25, 50, 60, 75. Split into 25, 50 and 60, 75, 85. Propagate middle value (60) up the tree. 75 80 85 90 95 Leaf node full, split the leaf. 75 80 85 90 95 25 50 60 75 85

Insertion Result: again put the middle key 60 to the index page and rearrange the tree. B+ Trees grow UP from the bottom

Deletion Same as insertion, the tree has to be rebuild if the deletion result violate the rule of B+ tree. Example #1: delete 70 from the tree OK. Node >=50% full 60 65

HERE…..

Deletion Result: Locate it in the leaf node. Delete it. Node is still >= 50% full, all done.

Deletion Example #2: delete 25 from below tree, but 25 appears in the index page. But… Locate 25 in leaf. Delete it. Node still >= 50% full so okay. BUT, 25 was used in index node…. Propagate smallest value in leaf node (28) up the tree to replace it. 28 30 This is OK.

Deletion Result: replace 28 in the index page. Add 28

Deletion Example #3: delete 60 from the below tree 65 Locate 60 in leaf node Delete it. Leaf now < 50% full. Neighbor at most 50% full, so merge with neighbor to the right. Guaranteed to fit. 65 Less than 50% full 50 55 65

Deletion Result: delete 60 from the index page and combine the rest of index pages.

Deletion Delete algorithm for B+ trees Data Page Below Fill Factor Index Page Below Fill Factor Action NO Delete the record from the leaf page. Arrange keys in ascending order to fill void. If the key of the deleted record appears in the index page, use the next key to replace it. YES Combine the leaf page and its sibling. Change the index page to reflect the change. Combine the leaf page and its sibling. Adjust the index page to reflect the change. Combine the index page with its sibling. Continue combining index pages until you reach a page with the correct fill factor or you reach the root page.

Conclusion For a B+ Tree: It is “easy” to maintain its balance Insert/Deletion complexity O(logM) The searching time is shorter than most of other types of trees because branching factor is high