

B-trees: The rest of the story

Review of B-tree rules
All nodes except the root must have at least MINIMUM data entries
No node may exceed MAXIMUM data entries (MAXIMUM is MINIMUM * 2)
Entries in individual nodes are sorted
The number of subtrees below any non-leaf node is one more than the number of entries in that node

Review of B-tree rules
In any non-leaf node, for any index n:
–the entry at n is greater than all entries in subtree[n]
–the entry at n is less than all entries in subtree[n+1]
Every leaf node has the same depth
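The rules reviewed above can be turned into a small checker. This is only a sketch under stated assumptions: the `Node` type, `check` and `is_valid_btree` are hypothetical names that do not appear in the slides, entries are plain `int`s, children are held by value in a `std::vector` rather than through the pointer array the Set class uses, and MINIMUM is fixed at 1.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t MINIMUM = 1;            // assumed value for this sketch
constexpr std::size_t MAXIMUM = 2 * MINIMUM;

// Hypothetical node type: children are stored by value, not by pointer.
struct Node {
    std::vector<int> data;     // entries, kept sorted
    std::vector<Node> subset;  // children; empty for a leaf
    bool is_leaf() const { return subset.empty(); }
};

// Returns the common leaf depth below n, or -1 if any rule is broken.
// lo and hi, when non-null, are exclusive bounds inherited from ancestors.
int check(const Node& n, bool is_root, const int* lo, const int* hi) {
    const std::size_t count = n.data.size();
    if (count > MAXIMUM) return -1;                   // too many entries
    if (!is_root && count < MINIMUM) return -1;       // too few entries
    for (std::size_t i = 0; i < count; ++i) {
        if (i > 0 && !(n.data[i - 1] < n.data[i])) return -1;  // not sorted
        if (lo && !(*lo < n.data[i])) return -1;      // not above the entry to the left
        if (hi && !(n.data[i] < *hi)) return -1;      // not below the entry to the right
    }
    if (n.is_leaf()) return 0;
    // A non-leaf node needs at least one entry and exactly count + 1 subtrees.
    if (count == 0 || n.subset.size() != count + 1) return -1;
    int depth = -1;
    for (std::size_t i = 0; i <= count; ++i) {
        const int* child_lo = (i == 0) ? lo : &n.data[i - 1];
        const int* child_hi = (i == count) ? hi : &n.data[i];
        const int d = check(n.subset[i], false, child_lo, child_hi);
        if (d < 0) return -1;
        if (depth >= 0 && d != depth) return -1;      // leaves at unequal depth
        depth = d;
    }
    return depth + 1;
}

bool is_valid_btree(const Node& root) {
    return check(root, true, nullptr, nullptr) >= 0;
}
```

A checker like this is handy as a debugging aid: calling it after every insert or remove catches a broken rule at the operation that caused it.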

Private member variables describe root node
data: an array containing from 0 to MAXIMUM + 1 data entries (tree valid when there are 1 to MAXIMUM entries)
count: a tally of the number of entries in the data array

Private member variables describe root node
subset: an array of pointers to between 0 and MAXIMUM + 2 subtrees (tree valid when there are MAXIMUM + 1 or fewer subtrees and the number of subtrees is one greater than the number of data entries)
children: a tally of the number of entries in the subset array

Insertion
The implementation of the insertion function involves a temporary relaxation of the rules, allowing the root node to end up with MAXIMUM + 1 data entries
If such a condition occurs, the node is split into three nodes: a new root node with a single entry (the middle one), and two subtrees each containing half the remaining data (and half the subtrees) of the original root node

Insertion example
MINIMUM = 1, MAXIMUM = 2
Data entered in this order: 0,1,2,3,4,5,6,7,8
After 0, 1 and 2 are entered, the root holds 0 1 2: entries exceed MAXIMUM; split node and grow tree upward
Child node has too many entries; split node in two, sending middle entry up to root
Continue adding entries, splitting nodes and growing tree upward when necessary
Regardless of data entry order, tree will remain balanced
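The example sequence can be replayed with a compact stand-in for the Set class. This is a sketch, not the class from these slides: `Node`, the free functions and `leaf_depth` are hypothetical, entries are `int`s, and children are stored by value in vectors so that no manual memory management is needed. The function names mirror the slide functions only to make the correspondence easy to follow.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

constexpr std::size_t MINIMUM = 1;
constexpr std::size_t MAXIMUM = 2 * MINIMUM;

struct Node {                  // hypothetical value-based node
    std::vector<int> data;     // entries, kept sorted
    std::vector<Node> subset;  // children; empty for a leaf
    bool is_leaf() const { return subset.empty(); }
};

// Split the overfull child parent.subset[x] and lift its middle entry.
void fix_excess(Node& parent, std::size_t x) {
    Node child = std::move(parent.subset[x]);
    Node left, right;
    left.data.assign(child.data.begin(), child.data.begin() + MINIMUM);
    right.data.assign(child.data.begin() + MINIMUM + 1, child.data.end());
    if (!child.is_leaf()) {    // hand half of the subtrees to each new node
        const std::size_t half = child.subset.size() / 2;
        std::move(child.subset.begin(), child.subset.begin() + half,
                  std::back_inserter(left.subset));
        std::move(child.subset.begin() + half, child.subset.end(),
                  std::back_inserter(right.subset));
    }
    parent.data.insert(parent.data.begin() + x, child.data[MINIMUM]);
    parent.subset[x] = std::move(left);
    parent.subset.insert(parent.subset.begin() + x + 1, std::move(right));
}

// Adds entry somewhere below n; n itself may end up with MAXIMUM + 1 entries.
bool loose_insert(Node& n, int entry) {
    const std::size_t t = static_cast<std::size_t>(
        std::lower_bound(n.data.begin(), n.data.end(), entry) - n.data.begin());
    if (t < n.data.size() && n.data[t] == entry) return false;  // duplicate
    if (n.is_leaf()) {
        n.data.insert(n.data.begin() + t, entry);
        return true;
    }
    const bool added = loose_insert(n.subset[t], entry);
    if (n.subset[t].data.size() > MAXIMUM) fix_excess(n, t);
    return added;
}

// Public insert: grow the tree upward when the root overflows.
bool insert(Node& root, int entry) {
    if (!loose_insert(root, entry)) return false;
    if (root.data.size() > MAXIMUM) {
        Node old_root = std::move(root);
        root = Node{};
        root.subset.push_back(std::move(old_root));
        fix_excess(root, 0);
    }
    return true;
}

// Depth of the leftmost leaf; in a valid B-tree every leaf shares it.
int leaf_depth(const Node& n) {
    return n.is_leaf() ? 0 : 1 + leaf_depth(n.subset[0]);
}
```

Inserting 0 through 8 into an empty `Node` with this sketch leaves the root holding 3, its children holding 1 and 5 7, and the five leaves 0, 2, 4, 6, 8 all at depth 2.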

Code for insertion
template <class item>
bool Set<item>::insert(const item& entry)
{
    // do loose_insert; if entry added, check for excess
    if (loose_insert(entry)) // returns false if entry already in tree
    {
        if (count > MAXIMUM)
        {
            // copy info from root
            Set *child = new Set;
            for (int x=0; x<count; x++)
                child->data[x] = data[x];
            for (int y=0; y<children; y++)
                child->subset[y] = subset[y];

Code for insertion
            child->children = children;
            child->count = count;
            // clear root node
            count = 0;
            children = 1;
            // former root becomes child of new root
            subset[0] = child;
            fix_excess(0); // split node to restore B-tree
        } // ends inner if
        return true; // insertion succeeded
    } // ends outer if
    return false; // if loose_insert failed, so did insert
}

Helper functions: loose_insert and fix_excess
loose_insert actually adds a data entry to the B-tree; may result in root having too many entries
fix_excess takes care of a problem node by splitting it into two subtrees and sending the middle data item up to the root above it

Code for loose_insert
template <class item>
bool Set<item>::loose_insert(const item& entry)
{
    int t;
    // find first item in data >= entry; save the index
    for (t=0; (t<count && data[t]<entry); t++)
        ; // empty body: the loop only advances t
    if (t<count && data[t]==entry)
    {
        // entry already in set -- not inserted
        cout << data[t] << " already in set" << endl;
        return false;
    }

Code for loose_insert
    if (is_leaf()) // entry not found, root has no subtrees
    {
        // add new entry at root --
        // shift data right to make room for new entry
        for (int x=count; x>t; x--)
            data[x] = data[x-1];
        count++;
        data[t] = entry;
        return true; // entry was inserted
    }

Code for loose_insert
    else // entry wasn't found and node has children
    {
        // do loose_insert on appropriate subtree
        bool added = subset[t]->loose_insert(entry);
        // if loose_insert results in excess entries
        // in subtree, split node in two and add
        // middle data entry to subtree's root
        if (subset[t]->count > MAXIMUM)
            fix_excess(t);
        return added;
    } // end else
} // end loose_insert function
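The linear scan at the top of loose_insert finds the first index whose entry is not less than the new one. Because the entries in a node are sorted, the same index can be found with `std::lower_bound`. The sketch below works on a plain `std::vector<int>` instead of the Set's data array; `first_ge` and `found_at` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the first entry that is >= target, or data.size() if there is none.
// Precondition: data is sorted in ascending order.
std::size_t first_ge(const std::vector<int>& data, int target) {
    assert(std::is_sorted(data.begin(), data.end()));
    return static_cast<std::size_t>(
        std::lower_bound(data.begin(), data.end(), target) - data.begin());
}

// True when the entry at index t is the target itself (duplicate case).
bool found_at(const std::vector<int>& data, std::size_t t, int target) {
    return t < data.size() && data[t] == target;
}
```

For the small nodes used in these slides the linear scan is just as good; binary search starts to matter when MAXIMUM is large, as it typically is for disk-based B-trees.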

Code for fix_excess
template <class item>
void Set<item>::fix_excess(int x) // x is the index of the overfull child
{
    int ct;
    // copy middle entry of child to root,
    // first making room in data array
    for (ct=count; ct>x; ct--)
        data[ct] = data[ct-1];
    data[x] = subset[x]->data[MINIMUM];
    count++;

Code for fix_excess
    // split node in 2:
    Set *left, *right; // will hold child's old entries
    left = new Set;    // allocate memory for
    right = new Set;   // new sets
    left->count = MINIMUM;
    right->count = MINIMUM;
    for (ct=0; ct<MINIMUM; ct++) // copy data to new nodes
    {
        left->data[ct] = subset[x]->data[ct];
        right->data[ct] = subset[x]->data[ct+MINIMUM+1];
    }

Code for fix_excess
    if (!(subset[x]->is_leaf())) // copy subsets if any exist
    {
        int chct = (subset[x]->children)/2;
        for (ct=0; ct<chct; ct++)
        {
            left->subset[ct] = subset[x]->subset[ct];
            right->subset[ct] = subset[x]->subset[ct+chct];
        }
        left->children = MINIMUM+1;
        right->children = MINIMUM+1;
    }

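Code for fix_excess
    // make room for new subset in root's array of subsets
    // (no allocation is needed here: the shift fills subset[children])
    Set *old_child = subset[x]; // the node that was just split
    for (ct=children; ct>x; ct--)
        subset[ct] = subset[ct-1];
    children++;
    // attach new subtrees to root node
    subset[x] = left;
    subset[x+1] = right;
    // release the split node; its entries and subtrees now belong to
    // left and right, so clear its tallies first (this assumes the
    // destructor would otherwise free the node's subtrees)
    old_child->count = 0;
    old_child->children = 0;
    delete old_child;
} // ends fix_excess function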

Removing a B-tree entry
Four functions involved; three are analogous to insertion functions:
–remove: public function -- performs "loose" remove, then other functions as necessary to restore B-tree
–loose_remove: performs actual removal of data entry; may leave B-tree invalid, with root node having 0 entries or a subtree's root having MINIMUM-1 entries

Removing a B-tree entry
Additional removal functions:
–fix_shortage: deals with the problem of a subtree's root having MINIMUM-1 entries
–remove_largest: helper function called by loose_remove to ensure that root node contains children-1 data entries; works by copying largest data value from a subtree into root

Pseudocode for public remove function
template <class item>
bool Set<item>::remove(const item& target)
{
    if (!loose_remove(target))
        return false; // target not found
    if (count == 0 && children == 1)
    {
        // root was emptied by loose_remove: shrink the tree by:
        // - setting temporary pointer to the only child, subset[0]
        // - copying all member variables from temp to root
        // - deleting original child node
    }
    return true; // target was removed
}
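The shrink step in the pseudocode above might look like this with a value-based node. It is a sketch with hypothetical names (`Node`, `shrink_root`); the Set class in these slides uses pointer arrays instead, so its version has to copy the child's members and then delete the child explicitly.

```cpp
#include <utility>
#include <vector>

struct Node {                  // hypothetical value-based node
    std::vector<int> data;     // entries, kept sorted
    std::vector<Node> subset;  // children; empty for a leaf
    bool is_leaf() const { return subset.empty(); }
};

// If a loose removal left the root with no entries and a single child,
// promote that child so the tree loses one level.
void shrink_root(Node& root) {
    if (root.data.empty() && root.subset.size() == 1) {
        // Move the child out first: assigning root.subset[0] to root
        // directly would move an object into its own owner.
        Node only_child = std::move(root.subset[0]);
        root = std::move(only_child);
    }
}
```

This is the only place where a B-tree loses height, just as the root split during insertion is the only place where it gains height, which is why every leaf stays at the same depth.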

Pseudocode for loose_remove
template <class item>
bool Set<item>::loose_remove(const item& target)
{
    find first index such that data[index] >= target;
        if no such index found, index = count
    if (target not found and is_leaf())
        return false;
    if (target found and is_leaf())
    {
        remove target from data array;
        shift contents to the left and decrement count
        return true;
    }

Pseudocode for loose_remove
    if (target not found and root has children)
    {
        bool removed = subset[index]->loose_remove(target);
        if (subset[index]->count < MINIMUM)
            fix_shortage(index);
        return removed; // false if target was not in the subtree either
    }

Pseudocode for loose_remove
    if (target found and root has children)
    {
        // replace data[index] with the largest entry of subset[index]
        subset[index]->remove_largest(data[index]);
        if (subset[index]->count < MINIMUM)
            fix_shortage(index);
        return true;
    }
} // end loose_remove

Action of fix_shortage function
In order to remedy a shortage of entries in subset[n], do one of the following:
–borrow an entry from the node's left neighbor (subset[n-1]) or right neighbor (subset[n+1]) if either of these two has more than MINIMUM entries
–combine subset[n] with either of its neighbors if they don't have excess entries to give

Pseudocode for fix_shortage
template <class item>
void Set<item>::fix_shortage(int x)
{
    if (x > 0 && subset[x-1]->count > MINIMUM) // a left neighbor exists and can spare an entry
    {
        shift existing entries in subset[x] over one, copy data[x-1] to subset[x]->data[0] and increment subset[x]->count
        data[x-1] = last item in subset[x-1]->data and decrement subset[x-1]->count
        if (!(subset[x-1]->is_leaf()))
            transfer last child of subset[x-1] to front of subset[x], incrementing subset[x]->children and decrementing subset[x-1]->children
    }
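The first case, borrowing from the left neighbor, can be sketched on a value-based node as follows. `Node` and `borrow_from_left` are hypothetical names, entries are `int`s, and MINIMUM is set to 2 to match the fix_shortage examples.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::size_t MINIMUM = 2;

struct Node {                  // hypothetical value-based node
    std::vector<int> data;     // entries, kept sorted
    std::vector<Node> subset;  // children; empty for a leaf
    bool is_leaf() const { return subset.empty(); }
};

// Rotate one entry from parent.subset[x-1] through the parent into
// parent.subset[x]. Precondition: the left neighbor exists and has
// more than MINIMUM entries.
void borrow_from_left(Node& parent, std::size_t x) {
    assert(x > 0 && x < parent.subset.size());
    Node& left = parent.subset[x - 1];
    Node& short_node = parent.subset[x];
    assert(left.data.size() > MINIMUM);
    // The separating entry moves down to the front of the short node.
    short_node.data.insert(short_node.data.begin(), parent.data[x - 1]);
    // The left neighbor's largest entry moves up to replace it.
    parent.data[x - 1] = left.data.back();
    left.data.pop_back();
    if (!left.is_leaf()) {
        // The left neighbor's last subtree follows its entry across.
        short_node.subset.insert(short_node.subset.begin(),
                                 std::move(left.subset.back()));
        left.subset.pop_back();
    }
}
```

With a parent holding 10 20 over children 1 2 3, 15 and 30 40, calling `borrow_from_left(parent, 1)` leaves the parent holding 3 20 over children 1 2, 10 15 and 30 40.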

[Figure: Example 1 for fix_shortage, MINIMUM = 2, x = 1]

Pseudocode for fix_shortage
    else if (x+1 < children && subset[x+1]->count > MINIMUM) // a right neighbor exists and can spare an entry
    {
        increment subset[x]->count and copy data[x] to subset[x]->data[subset[x]->count-1]
        data[x] = subset[x+1]->data[0] and shift entries in subset[x+1]->data to the left and decrement subset[x+1]->count
        if (!(subset[x+1]->is_leaf()))
            transfer first child of subset[x+1] to the end of subset[x], incrementing subset[x]->children and decrementing subset[x+1]->children
    }
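The mirror-image case, borrowing from the right neighbor, under the same assumptions as before: a hypothetical value-based `Node`, `int` entries, MINIMUM of 2, and a helper name (`borrow_from_right`) that is not part of the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::size_t MINIMUM = 2;

struct Node {                  // hypothetical value-based node
    std::vector<int> data;     // entries, kept sorted
    std::vector<Node> subset;  // children; empty for a leaf
    bool is_leaf() const { return subset.empty(); }
};

// Rotate one entry from parent.subset[x+1] through the parent into
// parent.subset[x]. Precondition: the right neighbor exists and has
// more than MINIMUM entries.
void borrow_from_right(Node& parent, std::size_t x) {
    assert(x + 1 < parent.subset.size());
    Node& short_node = parent.subset[x];
    Node& right = parent.subset[x + 1];
    assert(right.data.size() > MINIMUM);
    // The separating entry moves down to the end of the short node.
    short_node.data.push_back(parent.data[x]);
    // The right neighbor's smallest entry moves up to replace it.
    parent.data[x] = right.data.front();
    right.data.erase(right.data.begin());
    if (!right.is_leaf()) {
        // The right neighbor's first subtree follows its entry across.
        short_node.subset.push_back(std::move(right.subset.front()));
        right.subset.erase(right.subset.begin());
    }
}
```

With a parent holding 10 20 over children 1 2, 15 and 30 40 50, calling `borrow_from_right(parent, 1)` leaves the parent holding 10 30 over children 1 2, 15 20 and 40 50.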

[Figure: Example 2 for fix_shortage, MINIMUM = 2, x = 1]

Pseudocode for fix_shortage
    else if (x > 0 && subset[x-1]->count == MINIMUM) // a left neighbor exists but has nothing to spare: combine with it
    {
        add data[x-1] to the end of subset[x-1]->data
        shift root's data array leftward over the vacated slot, decrementing count and incrementing subset[x-1]->count
        transfer all data items and children from subset[x] to end of subset[x-1]; update values of subset[x-1]->count and subset[x-1]->children, and set subset[x]->count and subset[x]->children to 0
        delete subset[x] and shift subset array to the left and decrement children
    }
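Combining the short node with its left neighbor can be sketched as below, again with a hypothetical value-based `Node`, `int` entries and an invented helper name (`merge_with_left`). Because children are held by value, erasing the emptied node from the parent's vector takes the place of the explicit delete in the pseudocode.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

struct Node {                  // hypothetical value-based node
    std::vector<int> data;     // entries, kept sorted
    std::vector<Node> subset;  // children; empty for a leaf
    bool is_leaf() const { return subset.empty(); }
};

// Fold parent.subset[x] and the separating entry parent.data[x-1]
// into the left neighbor parent.subset[x-1].
void merge_with_left(Node& parent, std::size_t x) {
    assert(x > 0 && x < parent.subset.size());
    Node short_node = std::move(parent.subset[x]);  // take the short node out
    Node& left = parent.subset[x - 1];
    left.data.push_back(parent.data[x - 1]);        // separator comes down
    left.data.insert(left.data.end(),
                     short_node.data.begin(), short_node.data.end());
    std::move(short_node.subset.begin(), short_node.subset.end(),
              std::back_inserter(left.subset));     // its subtrees come along
    // Close the gaps the separator and the short node left in the parent.
    parent.data.erase(parent.data.begin() + static_cast<std::ptrdiff_t>(x - 1));
    parent.subset.erase(parent.subset.begin() + static_cast<std::ptrdiff_t>(x));
}
```

The merged node ends up with MINIMUM + 1 + (MINIMUM - 1) = MAXIMUM entries, so it never overflows. The parent loses an entry, which is why a caller may in turn have to repair a shortage one level up.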

[Figure: Example 3 for fix_shortage, MINIMUM = 2, x = 1]

Pseudocode for fix_shortage
    else // no left neighbor to use: combine subset[x] with subset[x+1]
    {
        // work is similar to previous combination operation:
        borrow an entry (data[x]) from root and add it to the end of subset[x]->data
        transfer all private members from subset[x+1] to subset[x], and zero out subset[x+1]'s children and count variables
        delete subset[x+1] and update root's data and subset information
    }
} // end fix_shortage
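The last case, combining the short node with its right neighbor, under the same assumptions (hypothetical value-based `Node`, `int` entries, invented helper name `merge_with_right`). Note that it is the right neighbor, subset[x+1], that disappears.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

struct Node {                  // hypothetical value-based node
    std::vector<int> data;     // entries, kept sorted
    std::vector<Node> subset;  // children; empty for a leaf
    bool is_leaf() const { return subset.empty(); }
};

// Fold the separating entry parent.data[x] and the right neighbor
// parent.subset[x+1] into the short node parent.subset[x].
void merge_with_right(Node& parent, std::size_t x) {
    assert(x + 1 < parent.subset.size());
    Node right = std::move(parent.subset[x + 1]);   // take the neighbor out
    Node& short_node = parent.subset[x];
    short_node.data.push_back(parent.data[x]);      // separator comes down
    short_node.data.insert(short_node.data.end(),
                           right.data.begin(), right.data.end());
    std::move(right.subset.begin(), right.subset.end(),
              std::back_inserter(short_node.subset));
    // Close the gaps the separator and the neighbor left in the parent.
    parent.data.erase(parent.data.begin() + static_cast<std::ptrdiff_t>(x));
    parent.subset.erase(parent.subset.begin() + static_cast<std::ptrdiff_t>(x + 1));
}
```

With x = 0 and a parent holding 10 20 over children 5, 12 15 and 30 40, the call leaves the parent holding 20 over children 5 10 12 15 and 30 40.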

[Figure: Example 4 for fix_shortage, MINIMUM = 2, x = 0]