Multiway Trees Searching and B-Trees Advanced Tree Structures


CIS265: Multiway Trees, Searching and B-Trees (Advanced Tree Structures)

Multiway Trees. We now look at trees that are not restricted to one entry per node. They are used for internal and external indices and search trees, among other things.

Multiway search trees: Basic Theory. “A multi-way search tree of order n is a general tree in which each node refers to n or fewer sub-trees and contains one fewer key than it has sub-trees.” A node with three keys is laid out as: pointer to keys < Key1, Key1, pointer to keys between Key1 and Key2, Key2, pointer to keys between Key2 and Key3, Key3, pointer to keys > Key3.

Multiway search trees: Basic Theory. [Figure: a node in a search tree with pointers to the sub-trees below it.]

Say That Again, Please… In a multiway search tree of order n: Each node has at most n children and one fewer key than it has children (n-1 keys when full). The keys in each node are in ascending order. The keys in the first i children are smaller than the ith key. The keys in the last n-i children are larger than the ith key.

How about an example…? This is a 4-way tree (n=4). The root has 4 children and 3 keys. Each node has room for n children and n-1 keys; after that, it depends on your data. Of course, you also have to follow all the other rules. [Figure: 4-way tree with root (50 60 80); the other nodes shown are (30 35), (58 59), (63 70 73), (100), (52 54), (61 62), (57) and (55 56).]

The keys in each node are in ascending order. This holds true for each node. [Figure: the 4-way tree with root (50 60 80).]

The keys in the first i children are smaller than the ith key. i=1: the first child has keys all smaller than this key. i=2: the first and second children all have keys smaller than this one. i=3: the first three children all have keys smaller than this key. [Figure: the 4-way tree with root (50 60 80).]

The keys in the last n-i children are larger than the ith key. i=1: the last three children have keys all larger than this key. i=2: the third and fourth children all have keys larger than this one. i=3: the last child has keys larger than this key. [Figure: the 4-way tree with root (50 60 80).]
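The ordering rules above translate fairly directly into a search routine. The following is a minimal Python sketch, not something taken from the slides: the `Node` class, its `keys` and `children` fields, and the `search` function are illustrative names, and the sample tree uses only the root and first-level nodes of the 4-way example.

```python
from bisect import bisect_left

class Node:
    """Hypothetical multiway node: sorted keys, and either no children (leaf)
    or exactly len(keys) + 1 children."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)      # first index with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True
        # Child i holds the keys that fall between keys[i-1] and keys[i].
        node = node.children[i] if node.children else None
    return False

root = Node([50, 60, 80],
            [Node([30, 35]), Node([58, 59]), Node([63, 70, 73]), Node([100])])
print(search(root, 70), search(root, 65))   # True False
```

Using a binary search inside each node is a design choice; with only a handful of keys per node a linear scan would do just as well.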

Implementing a Multiway Tree. When nodes are not full, multiway search trees waste storage. They are used primarily to store data on external direct-access devices such as disks, and also as indices for non-ordered files stored on direct-access devices.

Implementing a Multiway Tree. Simple rule to improve operations: try to keep as many keys as possible in each node (faster to search).

B-Trees. A B-tree reduces the time required for accessing secondary storage. The nodes can become very large, typically the size of a block.

B-Trees. A B-tree of order n is a multi-way search tree with the following properties: The root has at least one key. Each non-root node holds k-1 keys and k pointers to sub-trees, where $\lceil n/2 \rceil \le k \le n$. All leaves (terminal nodes) are at the same level.
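As a quick check of the occupancy rule, the bounds on k can be turned into key counts. This is a small illustrative sketch; the function name `btree_node_bounds` is an assumption made for the example.

```python
from math import ceil

def btree_node_bounds(n):
    """Return (min_keys, max_keys) for a non-root node of a B-tree of order n.

    A non-root node has k pointers with ceil(n/2) <= k <= n, and k - 1 keys.
    """
    min_pointers = ceil(n / 2)
    return min_pointers - 1, n - 1

for order in (3, 4, 5):
    print(order, btree_node_bounds(order))
# 3 (1, 2)
# 4 (1, 3)
# 5 (2, 4)
```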

B-Trees. According to these conditions, a B-tree is always at least half full, has relatively few levels, and is ‘perfectly’ balanced.

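B-Trees. Non-root nodes at full and lowest capacity. In a B-tree of order 3, a full node is (ptr1 key1 ptr2 key2 ptr3) and a node at lowest capacity is (ptr1 key1 ptr2). In a B-tree of order 5, a full node is (ptr1 key1 ptr2 key2 ptr3 key3 ptr4 key4 ptr5) and a node at lowest capacity is (ptr1 key1 ptr2 key2 ptr3).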


A B-Tree of Order 5. [Figure: the root holds 42; one child, (16 21), has the minimum number of entries and the other, (58 76 81 93), has the maximum number of entries. The 5 subtrees of the full node are not shown.]

What’s inside the node? A very simplistic drawing: a node stores the number of entries followed by key/data (K, D) pairs.

entry
    key       <key type>
    data      <data type>
    rightPtr  <node pointer>
end entry

node
    firstPtr    <pointer to node>
    numEntries  <integer>
    entries     <array [1 .. M-1] of entry>
end node
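One way to express this node layout in a programming language is sketched below in Python. The class and field names mirror the pseudocode (`Entry`, `BTreeNode`, `first_ptr`, `right_ptr`) but are otherwise assumptions; here `numEntries` is derived from the list length rather than stored.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

@dataclass
class Entry:
    key: Any
    data: Any
    # Subtree holding the keys greater than this entry's key.
    right_ptr: Optional["BTreeNode"] = None

@dataclass
class BTreeNode:
    # Subtree holding the keys smaller than entries[0].key.
    first_ptr: Optional["BTreeNode"] = None
    # At most M-1 entries for a tree of order M (not enforced in this sketch).
    entries: List[Entry] = field(default_factory=list)

    @property
    def num_entries(self) -> int:
        return len(self.entries)

root = BTreeNode(entries=[Entry(42, "record for 42")])
print(root.num_entries)   # 1
```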

Inserting a key in a B-Tree. Remember that all leaves in a B-Tree have to be at the same level. This can present several challenges. There are three cases we need to consider.

Case 1: B-tree Insertion. A key is placed in a leaf which still has some room. Before: parent (12) with leaves (5 8) and (13 15). To insert 7 we need to preserve order, so move 8 down by 1 and insert the 7. After: parent (12) with leaves (5 7 8) and (13 15).
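Case 1 amounts to a sorted insertion into a list that still has room. A minimal Python sketch follows; the function name `insert_into_leaf` and the `max_keys` parameter are assumptions made for illustration, and the leaf is modelled as a plain list of keys.

```python
from bisect import insort

def insert_into_leaf(leaf_keys, key, max_keys):
    """Insert key into a sorted leaf if there is room; return True on success."""
    if len(leaf_keys) >= max_keys:
        return False          # leaf is full: the split case applies instead
    insort(leaf_keys, key)    # shifts the larger keys right to preserve order
    return True

leaf = [5, 8]
insert_into_leaf(leaf, 7, max_keys=4)
print(leaf)   # [5, 7, 8]
```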

Case 2: B-tree Insertion. The leaf in which a key should be placed is full. In this case, the leaf is split, creating a new leaf, and half the keys are moved from the full leaf to the new leaf. Then, the last key of the old (left) leaf is moved to the parent, and pointers to the left and right leaves are placed in the parent as well.

Before: root (12) with leaves (2 5 7 8) and (13 15). We want to insert a 6 in the first leaf, which would then hold (2 5 6 7 8). Split the leaf and add the new value, giving (2 5 6) and (7 8). The last key of the old leaf, 6, is moved to the parent and a pointer to the new leaf is added to the parent as well. After: root (6 12) with leaves (2 5), (7 8) and (13 15).
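The split can be sketched in Python as follows. This is only an illustration under simplifying assumptions: nodes are plain sorted lists of keys, child pointers are left out, and `split_leaf` is a hypothetical helper name. Taking the middle key of the overfull leaf gives the same result as moving the last key of the old leaf up.

```python
from bisect import insort

def split_leaf(leaf_keys, key):
    """Insert key into a full leaf and split it.

    Returns (left_keys, separator, right_keys); the separator moves up
    to the parent and no longer appears in either leaf.
    """
    keys = list(leaf_keys)
    insort(keys, key)
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]

left, separator, right = split_leaf([2, 5, 7, 8], 6)
parent = [12]
insort(parent, separator)
print(left, right, parent)   # [2, 5] [7, 8] [6, 12]
```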

Case 3: B-tree Insertion. A special case arises if the root of the B-tree is full. In this case, a new root and a new sibling of the existing overflowing root are created. This is the only case in which a B-tree increases in height.

Before: root (6 12 20 30) with leaves (2 3 4 5), (7 8 10 11), (14 15 18 19), (21 23 25 28) and (31 33 34 35). We want to insert a 13. Split the leaf as in case #2, giving (13 14 15) and (18 19); the last key of the old leaf, 15, is moved to the parent and a pointer to the new leaf is added to the parent as well.

The root would now hold (6 12 15 20 30), so split the root, as there will be more keys than slots. Add a new root (15) with children (6 12) and (20 30), increasing the tree’s depth. The leaves are now (2 3 4 5), (7 8 10 11), (13 14), (18 19), (21 23 25 28) and (31 33 34 35).

B-Tree Insert. A B-Tree grows from the bottom up.

Deleting a Key from a B-Tree. Deleting a key is similar to inserting a key, but there are more special cases, and nodes are merged rather than split.

Deleting a node - Simple Case. If, after deleting a key K, the leaf is still at least half full, only the keys greater than K are moved to the left to fill the hole. This is the opposite of insertion.

Deleting a node - Underflow Case. If, after deleting K, the number of keys in the leaf is less than $\lceil m/2 \rceil - 1$ (where m is the order of the tree), the leaf underflows. If there is a left or right sibling with more keys than the minimum ($\lceil m/2 \rceil - 1$), then all keys from this leaf and this sibling are redistributed between them by moving the separator key from the parent to the leaf and moving one key from the sibling to the parent.

Deleting a node - Underflow Case (Cont.). If the leaf underflows and the number of keys in its siblings is $\lceil m/2 \rceil - 1$, then the leaf and a sibling are merged: the keys from the leaf, from its sibling, and the separating key from the parent are all put in the leaf, and the sibling node is discarded. If we merge siblings and the parent is the root with only one key, all the keys are combined into a new root, and the old root and old nodes are discarded.

Delete 6. Before: root (16); internal nodes (3 8) and (22 25); leaves (1 2), (5 6 7), (13 14 15), (18 20), (23 24) and (27 37). Remove the 6 and move the 7 over. After: the second leaf is (5 7); the rest of the tree is unchanged.

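Delete 7. Before: root (16); internal nodes (3 8) and (22 25); leaves (1 2), (5 7), (13 14 15), (18 20), (23 24) and (27 37). After: internal node (3 13) with leaves (1 2), (5 8) and (14 15); the rest of the tree is unchanged. Note the re-arrangement! Nodes cannot be less than 50% full.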
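The re-arrangement after deleting 7 is a rotation through the parent: the separator 8 moves down into the underflowing leaf and the sibling's smallest key, 13, moves up. A minimal Python sketch, assuming nodes are plain key lists and using the hypothetical helper name `borrow_from_right`:

```python
def borrow_from_right(parent_keys, sep_index, leaf, right_sibling):
    """Fix an underflowing leaf by rotating one key through the parent.

    parent_keys[sep_index] is the key separating leaf from right_sibling.
    """
    leaf.append(parent_keys[sep_index])             # separator moves down
    parent_keys[sep_index] = right_sibling.pop(0)   # sibling's smallest key moves up

parent = [3, 8]
leaf, sibling = [5], [13, 14, 15]     # state just after 7 was removed
borrow_from_right(parent, 1, leaf, sibling)
print(parent, leaf, sibling)   # [3, 13] [5, 8] [14, 15]
```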

Delete 8. Before: root (16); internal nodes (3 13) and (22 25); leaves (1 2), (5 8), (14 15), (18 20), (23 24) and (27 37). Part 1: we merge 2 cells, deleting the empty one. After: internal node (3) with leaves (1 2) and (5 13 14 15); the rest of the tree is unchanged.
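The merge in part 1 pulls the separator key down from the parent and joins it with the two leaves. A hedged Python sketch, again with plain key lists and a hypothetical helper name `merge_with_right`:

```python
def merge_with_right(parent_keys, sep_index, leaf, right_sibling):
    """Merge leaf, the separator key, and right_sibling into one node.

    The parent loses the separator (and, in a full implementation,
    the pointer to the discarded sibling).
    """
    separator = parent_keys.pop(sep_index)
    return leaf + [separator] + right_sibling

parent = [3, 13]
merged = merge_with_right(parent, 1, [5], [14, 15])   # state just after 8 was removed
print(parent, merged)   # [3] [5, 13, 14, 15]
```

The parent is left with a single key, which is why the example continues with a second merge one level up.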

Continuing... Part 2: we merge 3 cells, deleting the empty ones. Before: root (16); internal nodes (3) and (22 25). After: root (3 16 22 25) with leaves (1 2), (5 13 14 15), (18 20), (23 24) and (27 37).

Delete 16. Before: root (3 16 22 25) with leaves (1 2), (5 13 14 15), (18 20), (23 24) and (27 37). We adjust keys so that the root remains full. After: root (3 15 22 25) with leaves (1 2), (5 13 14), (18 20), (23 24) and (27 37).

Another look at Deleting Nodes from a B-Tree. We will look at 5 more examples to make sure this is all clear.

Delete 11. Before: root (21 78) with leaves (11 14), (42 45 63 74) and (85 97). We need to borrow from the right, and move the 21 down and the 42 up. This ensures all the B-Tree rules are still being followed. After: root (42 78) with leaves (14 21), (45 63 74) and (85 97).

Delete 97. Before: root (42 78) with leaves (14 21), (45 63 74) and (85 97). We need to borrow from the left, and move the 78 down and the 74 up. This ensures all the B-Tree rules are still being followed. After: root (42 74) with leaves (14 21), (45 63) and (78 85).

Delete 45. Before: root (42 74) with leaves (14 21), (45 63) and (78 85). We need to “combine” 63, 74, 78, 85. This ensures all the B-Tree rules are still being followed. After: root (42) with leaves (14 21) and (63 74 78 85).

Delete 63 & 78. Before: root (42) with leaves (14 21) and (63 74 78 85). Shift 74 and 85 down to positions 1 & 2 in the node. This ensures all the B-Tree rules are still being followed. After: root (42) with leaves (14 21) and (74 85).

Finally, we delete 42. Before: root (42) with leaves (14 21) and (74 85). We need to remove the original root and combine all the remaining entries into one node. This ensures all the B-Tree rules are still being followed. Step 1: Copy 21 to the parent and delete it from the leaf, giving root (21) with leaves (14) and (74 85).

Step 2: Combine 14, 21, 74, 85 and delete the original root.

B* Trees. A variant of the B-Tree, introduced by Donald Knuth and named by Douglas Comer. All nodes are required to be at least 2/3 full. Two nodes are split into three, rather than one into two.

B+ Trees. An enhancement of B-Trees that allows all data to be accessed sequentially. References to data are made only from leaves. Internal nodes are indexed for faster access. Leaves are indexed sequentially.

B+ Trees. Internal nodes store keys, pointers and a key count. Leaves store keys, references to records in a data file associated with the keys, and a pointer to the next leaf.

[Figure: a B+ tree of order 4. Index set: root (CD244) with internal nodes (BF90 BQ322) and (CF04 DR300). Leaves, in sequence: (AB203 AS09 BC26), (BF90 BF130), (BQ322 CD123), (CD244 CF03), (CF04 CF05 DP102), (DR300 DR305). Each node also stores its key count.]

Inserting into a B+ tree. Case 1: There is still room in the leaf. The keys must still be in order, and no changes are made to the index set. Case 2: The leaf is full. The leaf is split, the new leaf node is included in the sequence set, the keys are distributed evenly, and the first key from the new node is copied (not moved) to the parent.

Part of a B+ tree. Before: index keys 29 and, one level below, (11 19), with leaves (1 2 8 10), (11 13 15) and (19 26). After inserting a 6: index node (6 11 19) with leaves (1 2), (6 8 10), (11 13 15) and (19 26). Note the copy of the new key in the index set.
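The difference from a plain B-tree split is that the separator is copied, so it stays in the new leaf as well as appearing in the index set. A minimal Python sketch, assuming leaves are plain sorted key lists (no record references or next-leaf pointers) and using the hypothetical helper name `split_bplus_leaf`:

```python
from bisect import insort

def split_bplus_leaf(leaf_keys, key):
    """Insert key into a full B+ tree leaf and split it.

    Returns (left_keys, right_keys, separator). The separator is the first
    key of the new right leaf and is copied, not moved, to the parent.
    """
    keys = list(leaf_keys)
    insort(keys, key)
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]

left, right, separator = split_bplus_leaf([1, 2, 8, 10], 6)
index_node = [11, 19]
insort(index_node, separator)
print(left, right, index_node)   # [1, 2] [6, 8, 10] [6, 11, 19]
```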

Deleting from a B+ Tree. Case 1: No underflow after removal. Ensure keys are still in order; no changes are made to the index set. Case 2: Removal causes underflow. Either keys from the leaf and keys from a sibling are redistributed, or the leaf is deleted and its keys are added to a sibling. The parent needs to be updated, which may require changes to the index set.

Delete the 6 - note there is no change made to the index set! Before: index node (6 11 19) with leaves (1 2), (6 8 10), (11 13 15) and (19 26). After: index node (6 11 19) with leaves (1 2), (8 10), (11 13 15) and (19 26).

Delete the 2 - note we need to merge 2 leaves, and update the index set. Before: index node (6 11 19) with leaves (1 2), (8 10), (11 13 15) and (19 26). After: index node (11 19) with leaves (1 8 10), (11 13 15) and (19 26).


Prefix B+ Trees. A prefix B+ tree is a B+ tree in which the chosen separators are the shortest prefixes that allow us to distinguish two neighboring index keys. Operations are generally the same as for “standard” B+ trees. There is not much difference in execution times between B+ and prefix B+ trees, so they are largely of theoretical interest.

[Figure: a B+ tree and the corresponding prefix B+ tree. B+ tree index set: root (CD244) with internal nodes (BF90 BQ322) and (CF04 DR300). Prefix B+ tree index set: root (CD2) with internal nodes (BF BQ) and (CF04 DR). Both have the same leaves: (AB203 AS09 BC26), (BF90 BF130), (BQ322 CD123), (CD244 CF03), (CF04 CF05 DP102), (DR300 DR305).]
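The separators in the prefix B+ tree can be derived from neighboring leaves. The sketch below is one plausible way to compute them, not a method stated in the slides: it assumes keys are compared as ordinary strings and picks the shortest prefix of the right leaf's first key that is still greater than the left leaf's last key. The function name `shortest_separator` is hypothetical.

```python
def shortest_separator(left_key, right_key):
    """Shortest prefix p of right_key such that left_key < p <= right_key.

    left_key is the last key of the left leaf, right_key the first key
    of the right leaf; assumes left_key < right_key.
    """
    for length in range(1, len(right_key) + 1):
        prefix = right_key[:length]
        if prefix > left_key:
            return prefix
    return right_key

print(shortest_separator("CD123", "CD244"))   # CD2
print(shortest_separator("BC26", "BF90"))     # BF
print(shortest_separator("CF03", "CF04"))     # CF04
```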

Bit Trees. Bit Trees take Prefix B+ Trees to the extreme: rather than using bytes of data as separators, we use bits of data.

R-Trees. R-Trees are used to store spatial data; the “R” is for rectangle. They are based on B-trees. A leaf in an R-Tree contains entries of the form (rect, id), where rect = $([x_0, y_0], \ldots, [x_{n-1}, y_{n-1}])$ is an n-dimensional rectangle and id is a pointer to the data file. A non-leaf contains entries of the form (rect, child), where rect is the smallest rectangle encompassing all the rectangles found in child.
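The rect stored in a non-leaf entry can be computed from the rectangles in its child. The following Python sketch assumes that each pair in rect is a (low, high) interval along one dimension; that reading of the notation, and the function name `bounding_rect`, are assumptions made for the example.

```python
def bounding_rect(rects):
    """Smallest rectangle enclosing all rects.

    Each rect is a list with one (low, high) interval per dimension.
    """
    dims = len(rects[0])
    return [
        (min(r[d][0] for r in rects), max(r[d][1] for r in rects))
        for d in range(dims)
    ]

child_rects = [[(0, 2), (1, 3)], [(1, 5), (0, 2)]]
print(bounding_rect(child_rects))   # [(0, 5), (0, 3)]
```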

2-4 Trees. A special case of B-Trees. Rather than storing an entire block of data, each node stores 1, 2, or at most 3 elements. Stored this way they waste space, so they are transformed to binary trees that use two types of links: links between nodes representing keys that belong to the same node, and links representing regular parent-child links.

Digital Search Trees. A digital search tree is a general tree based on the symbols of which the keys are composed. Example: if the keys are integers, each digit position determines one of ten possible sons of a given node; if the keys are alphabetic, each letter of the alphabet determines a branch in the tree. Every leaf contains the special symbol “eok”, signifying end of key.

Digital Search Trees. Keys in this tree: 180, 185, 186, 195. [Figure: a digital search tree in which each root-to-leaf path spells one key digit by digit and every leaf holds eok.]
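A digital search tree for these four keys can be sketched with nested dictionaries, one level per digit. This is an illustrative Python sketch rather than the slides' implementation; the names `EOK`, `insert` and `contains` are assumptions, and the end-of-key marker is stored as a dictionary entry instead of a separate leaf node.

```python
EOK = "eok"   # end-of-key marker, named after the symbol used on the slide

def insert(tree, key):
    """Add key to the tree, creating one branch per symbol."""
    node = tree
    for symbol in key:
        node = node.setdefault(symbol, {})
    node[EOK] = True

def contains(tree, key):
    """Return True only if key was inserted as a complete key."""
    node = tree
    for symbol in key:
        if symbol not in node:
            return False
        node = node[symbol]
    return EOK in node

tree = {}
for k in ("180", "185", "186", "195"):
    insert(tree, k)
print(contains(tree, "186"), contains(tree, "18"))   # True False
```

The keys 180, 185 and 186 share the path 1, 8 and branch only at the last digit, which is where the storage saving of this structure comes from.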