1
Preliminaries
Multiway trees have nodes that may have more than two children.
A multiway tree of order k has nodes with at most k children.
2-3-4 Trees
–For all non-leaf nodes:
  A node with one data item has two pointers
  A node with two data items has three pointers
  A node with three data items has four pointers
–The child reached through pointer p holds keys less than data item p.
–The child reached through the last pointer holds keys greater than the last data item.
B-Trees (Balanced, Boeing, broad, bushy, or Bayer (for Rudolf Bayer)??)
–Each node contains links to as many children as can fit in a disk block.
2
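Node Structures
2-3-4 tree
    typedef struct Nodelink {
        int numElems;               /* number of data items in use (1 to 3) */
        Item *items[3];             /* data items in ascending key order */
        struct Nodelink *links[4];  /* child pointers; all NULL in a leaf */
    } Node;
B-Tree (K is a compile-time constant chosen so that one node fills a disk block)
    typedef struct Nodelink {
        Item items[K];                  /* up to K data items */
        struct Nodelink *nodes[K + 1];  /* up to K + 1 child pointers */
    } Node;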
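The ordering rule for child pointers can be made concrete with a search routine. The following is a minimal sketch, assuming plain int keys in place of the slides' Item pointers. The function name search is illustrative and does not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: keys are plain ints rather than Item pointers. */
typedef struct Nodelink {
    int numElems;               /* number of keys in use: 1 to 3 */
    int items[3];               /* keys in ascending order */
    struct Nodelink *links[4];  /* links[i] holds keys below items[i]; all NULL in a leaf */
} Node;

/* Return the node that contains key, or NULL if the key is absent. */
const Node *search(const Node *node, int key)
{
    while (node != NULL) {
        assert(node->numElems >= 1 && node->numElems <= 3);
        int i = 0;
        /* Skip every key smaller than the one we want. */
        while (i < node->numElems && key > node->items[i])
            i++;
        if (i < node->numElems && key == node->items[i])
            return node;
        /* Pointer i covers keys below items[i]; the last pointer covers the rest. */
        node = node->links[i];
    }
    return NULL;
}
```

Each step examines at most three keys before following one pointer, so the number of nodes visited equals the height of the tree.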
3
2-3-4 Insertion Algorithm
Insert( node, key )
  If node is full Then Call SplitNode( node ) and continue in the half that covers key
  If key is found in node Then Return “DuplicatesNotAllowed”
  If node is a leaf Then insert the data item and Return
  Call Insert( appropriateChildPointer, key )
SplitNode( node )
  Allocate a newNode and move the right data item of node, with its two rightmost children, to it
  If parent exists Then
    Insert the middle data item of node into the parent; the pointer to its right points to newNode
  Else
    Allocate a new root containing the middle data item of node
    root’s firstChildPointer points to node
    root’s secondChildPointer points to newNode
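The insertion steps can be sketched in C as follows. This is an illustrative version under several assumptions: keys are plain ints, the root is passed by address so it can be replaced, and a full child is split just before descending into it. That last choice keeps the parent non-full without needing parent pointers. The helper names newNode, insertAt and splitChild are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Nodelink {
    int numElems;               /* keys in use: 0 to 3 */
    int items[3];               /* keys in ascending order */
    struct Nodelink *links[4];  /* child pointers; all NULL in a leaf */
} Node;

static Node *newNode(void)
{
    Node *n = calloc(1, sizeof *n);   /* zeroed: no keys, NULL links */
    if (n == NULL) abort();
    return n;
}

/* Place key at position pos of a non-full node, with right as the child after it. */
static void insertAt(Node *n, int pos, int key, Node *right)
{
    assert(n->numElems < 3);
    for (int i = n->numElems; i > pos; i--) {
        n->items[i] = n->items[i - 1];
        n->links[i + 1] = n->links[i];
    }
    n->items[pos] = key;
    n->links[pos + 1] = right;
    n->numElems++;
}

/* Split the full child parent->links[idx]; the middle key moves up into parent. */
static void splitChild(Node *parent, int idx)
{
    Node *node = parent->links[idx];
    Node *right = newNode();
    right->numElems = 1;
    right->items[0] = node->items[2];       /* right key moves to the new node */
    right->links[0] = node->links[2];       /* together with its two children */
    right->links[1] = node->links[3];
    node->links[2] = node->links[3] = NULL;
    node->numElems = 1;                     /* left key stays in node */
    insertAt(parent, idx, node->items[1], right);
}

/* Returns 1 if key was inserted, 0 if it was a duplicate. */
int insert(Node **root, int key)
{
    if (*root == NULL)
        *root = newNode();
    if ((*root)->numElems == 3) {           /* full root: grow the tree by one level */
        Node *r = newNode();
        r->links[0] = *root;
        splitChild(r, 0);
        *root = r;
    }
    Node *n = *root;
    for (;;) {
        int i = 0;
        while (i < n->numElems && key > n->items[i])
            i++;
        if (i < n->numElems && key == n->items[i])
            return 0;                       /* duplicates not allowed */
        if (n->links[0] == NULL) {          /* leaf: insert here */
            insertAt(n, i, key, NULL);
            return 1;
        }
        if (n->links[i]->numElems == 3) {
            splitChild(n, i);
            continue;                       /* re-scan n: a key was promoted into it */
        }
        n = n->links[i];
    }
}
```

Inserting 10, 20, ..., 70 in that order with this sketch yields a root holding 20 and 40 with three leaves beneath it.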
4
2-3-4 Deletion Algorithm
Find the item to delete. If it is not in a leaf node, replace it with its successor, and then remove the successor from its leaf.
Cases to consider when deleting an item from a 2-3-4 node:
1. If the leaf node that contains the item holds more than one item, simply remove the item.
2. If the item to delete is the only one in the node:
  a. If there is a sibling with more than one entry, promote an item from the sibling into the parent and demote the parent's item into the node (possibly cascading) until the node has a spare entry. Then delete the item in question.
  b. If all sibling nodes have only one entry, demote the parent's item, merge it with the sibling, and then delete the current node. If the parent node is now empty, recursively traverse up the tree, applying the above steps as needed.
3. If the root node becomes empty, simply remove it from the tree; its remaining child becomes the new root.
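Cases 2a and 2b can be sketched as a single helper that gives a one-item child a spare entry before anything is removed from it. This is an illustrative fragment, not a complete delete routine. It assumes int keys, heap-allocated nodes, and a parent with at least one item. The names ensureSpare, borrowFromLeft, borrowFromRight and mergeWithRight are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Nodelink {
    int numElems;
    int items[3];
    struct Nodelink *links[4];
} Node;

/* Case 2a: the left sibling's last key goes up, the parent's key comes down. */
static void borrowFromLeft(Node *parent, int idx)
{
    Node *node = parent->links[idx], *left = parent->links[idx - 1];
    for (int i = node->numElems; i > 0; i--)
        node->items[i] = node->items[i - 1];
    for (int i = node->numElems + 1; i > 0; i--)
        node->links[i] = node->links[i - 1];
    node->items[0] = parent->items[idx - 1];                   /* demote parent key */
    node->links[0] = left->links[left->numElems];              /* adopt sibling's last child */
    parent->items[idx - 1] = left->items[left->numElems - 1];  /* promote sibling key */
    left->links[left->numElems] = NULL;
    left->numElems--;
    node->numElems++;
}

/* Case 2a, mirrored: the right sibling's first key goes up. */
static void borrowFromRight(Node *parent, int idx)
{
    Node *node = parent->links[idx], *right = parent->links[idx + 1];
    node->items[node->numElems] = parent->items[idx];
    node->links[node->numElems + 1] = right->links[0];
    node->numElems++;
    parent->items[idx] = right->items[0];
    for (int i = 0; i < right->numElems - 1; i++)
        right->items[i] = right->items[i + 1];
    for (int i = 0; i < right->numElems; i++)
        right->links[i] = right->links[i + 1];
    right->links[right->numElems] = NULL;
    right->numElems--;
}

/* Case 2b: pull parent->items[idx] down and merge children idx and idx + 1. */
static void mergeWithRight(Node *parent, int idx)
{
    Node *node = parent->links[idx], *right = parent->links[idx + 1];
    assert(node->numElems == 1 && right->numElems == 1);
    node->items[1] = parent->items[idx];
    node->items[2] = right->items[0];
    node->links[2] = right->links[0];
    node->links[3] = right->links[1];
    node->numElems = 3;
    free(right);
    for (int i = idx; i < parent->numElems - 1; i++) {
        parent->items[i] = parent->items[i + 1];
        parent->links[i + 1] = parent->links[i + 2];
    }
    parent->links[parent->numElems] = NULL;
    parent->numElems--;   /* may reach 0 at the root: that is case 3 */
}

/* Make sure child idx of parent has a spare key; returns the child's new index. */
int ensureSpare(Node *parent, int idx)
{
    assert(parent->numElems >= 1);
    if (parent->links[idx]->numElems > 1)
        return idx;
    if (idx > 0 && parent->links[idx - 1]->numElems > 1) {
        borrowFromLeft(parent, idx);
        return idx;
    }
    if (idx < parent->numElems && parent->links[idx + 1]->numElems > 1) {
        borrowFromRight(parent, idx);
        return idx;
    }
    if (idx < parent->numElems) {
        mergeWithRight(parent, idx);
        return idx;
    }
    mergeWithRight(parent, idx - 1);
    return idx - 1;
}
```

If the merge leaves the root with no items, the caller would replace the root with its first child, which corresponds to case 3.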
5
Visual Illustration of the 2-3-4-Delete
[Figure: before and after trees for Case 1 (leaf 11, 22, 33 becomes 11, 33), Case 2 (keys 08, 09, 11, 12, 22, 33), and Case 3 (keys 08, 11, 12; final node 08, 11).]
The algorithm recursively works its way up the tree
6
Characteristics of External Storage
External storage is at least three orders of magnitude slower than memory.
The extra overhead of searching through multiway tree nodes is more than compensated for, because less tree depth means fewer disk accesses.
It is desirable to design the record sizes with disk block sizes in mind.
Each disk read/write is in multiples of the block size.
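The point about matching record sizes to block sizes can be illustrated with a little arithmetic. The sketch below counts how many whole blocks must be transferred to reach one record. The function name blocksTouched and the sizes in the usage note are assumptions chosen for illustration.

```c
#include <assert.h>

/* Number of whole blocks that must be read to access `length` bytes
   starting at byte `offset` of a file. */
unsigned long blocksTouched(unsigned long offset, unsigned long length,
                            unsigned long blockSize)
{
    assert(length > 0 && blockSize > 0);
    unsigned long first = offset / blockSize;                /* block holding the first byte */
    unsigned long last = (offset + length - 1) / blockSize;  /* block holding the last byte */
    return last - first + 1;
}
```

With a 4096-byte block, a 512-byte record at offset 3584 fits in one block, whereas a 600-byte record at offset 3600 straddles a boundary and costs two block reads.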
7
B-Tree Insertion Algorithm
Differences from the 2-3-4 algorithm
–Node splitting is from the bottom up rather than the top down.
  Advantage: The tree is kept more full.
  Disadvantage: A pass down the tree may be followed by a pass back up if multiple splits are necessary.
–Half of the items go to the new node, half remain in the old node.
–The middle key is promoted to the next level up.
–Contraction occurs when a node and a sibling together have less than a full block of data items.
Note: Standard B-tree implementations require nodes to be at least half full.
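The split step (half the items stay, half move, the middle key is promoted) can be sketched for a single leaf as follows. This is an illustration only: it assumes int keys, a small capacity K, and one spare slot so the node can briefly hold the overflowing key. Child pointers are left out, and the names Leaf and splitLeaf are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define K 4   /* assumption: a real K would be derived from the disk block size */

typedef struct {
    int numElems;
    int items[K + 1];   /* one spare slot holds the overflow before the split */
} Leaf;

/* Split an overfull leaf holding K + 1 sorted keys. The lower half stays in
   node, the upper half moves to fresh, and the middle key is returned so the
   caller can insert it into the parent. */
int splitLeaf(Leaf *node, Leaf *fresh)
{
    assert(node->numElems == K + 1);
    int mid = node->numElems / 2;
    int promoted = node->items[mid];
    fresh->numElems = node->numElems - mid - 1;
    memcpy(fresh->items, &node->items[mid + 1],
           (size_t)fresh->numElems * sizeof node->items[0]);
    node->numElems = mid;
    return promoted;
}
```

If the parent is itself full after receiving the promoted key, the same step repeats one level up, which is the upward pass mentioned above.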
8
External Storage Optimizations
It is more efficient to keep the index and data separate
–Separate indices allow for multi-keyed files
Refinements exist to guarantee that no node is less than 2/3 full. Nodes are balanced over three siblings.
Some implementations only have data pointers at the last level.
A linked list of free disk blocks is often used to reclaim storage space after deletions.
Efficiency: Assume a block contains 8096 bytes, each key is 24 bytes, the blocks are half full, and the pointers require 4 bytes. How many levels deep is the tree?
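One way to approach the efficiency question is to compute the fan-out of a half-full block and count the levels needed to reach every record. The slide does not say how many records are indexed, so the record count in the usage note is an assumption. The function names are hypothetical, and the fan-out is approximated by the number of key/pointer pairs per block.

```c
#include <assert.h>

/* Key/pointer pairs held by a half-full block. */
unsigned long long halfFullFanout(unsigned blockBytes, unsigned keyBytes,
                                  unsigned ptrBytes)
{
    assert(keyBytes + ptrBytes > 0);
    return (blockBytes / (keyBytes + ptrBytes)) / 2;
}

/* Smallest number of levels h such that fanout^h >= records.
   Intended for modest sizes; the product is not guarded against overflow. */
int levelsNeeded(unsigned long long records, unsigned long long fanout)
{
    assert(fanout >= 2);
    int levels = 1;
    unsigned long long reachable = fanout;
    while (reachable < records) {
        reachable *= fanout;
        levels++;
    }
    return levels;
}
```

With the slide's figures, each entry takes 24 + 4 = 28 bytes, a full block holds 8096 / 28 = 289 entries, and a half-full block holds about 144. For an assumed one million records, 144^2 = 20,736 is too few and 144^3 = 2,985,984 is enough, so the estimate is three levels.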
9
Other External Storage Algorithms
Create a binary tree in memory for the index
Sort external data with a type of merge sort
–On each pass
  Read a large block from each piece of the file
  Perform the merge
  Write the result back to a second file
  Keep reading blocks from each half until they run out.
–There will be log_k N merges, where k is the number of data elements that can fit in the memory blocks.
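The merge step at the heart of each pass can be sketched in memory as follows. In an external sort the two runs would be blocks read from the two halves of the file, and the output would be written to the second file. The sketch uses plain arrays, and the name mergeRuns is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Merge the sorted runs a[0..na) and b[0..nb) into out, which must have
   room for na + nb elements. */
void mergeRuns(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    assert(out != NULL || na + nb == 0);
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb)
        out[k++] = (b[j] < a[i]) ? b[j++] : a[i++];  /* ties take from a first */
    while (i < na)
        out[k++] = a[i++];   /* b ran out: copy the rest of a */
    while (j < nb)
        out[k++] = b[j++];   /* a ran out: copy the rest of b */
}
```

Only the current element of each run has to be in memory at any time, which is why the technique suits data that does not fit in memory.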