Multi-way Trees

M-way trees. So far we have discussed only binary trees. In this lecture, we go over another type of tree, called an m-way tree or a tree of order m. In a binary tree, each node has only one key and up to two children. In an m-way tree, each node holds at least 1 and at most m-1 keys and has at most m children.
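The node layout described above can be sketched in a few lines of Python. This is only an illustration: the names `MWayNode` and `search` are hypothetical, and the sketch assumes the keys inside a node are kept sorted, so that child i holds the keys that fall between key i-1 and key i.

```python
from bisect import bisect_left

class MWayNode:
    """Node of an m-way search tree: up to m-1 sorted keys, up to m children."""
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted list of 1..m-1 keys
        self.children = children or []   # empty for a leaf, else len(keys) + 1 subtrees

def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)  # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:            # reached a leaf without finding the key
            return False
        node = node.children[i]          # descend between keys[i-1] and keys[i]
    return False

# A small 4-way tree (assumed example data): every node has at most 3 keys.
root = MWayNode([10, 20], [MWayNode([2, 5]),
                           MWayNode([12, 17]),
                           MWayNode([25, 30, 40])])
print(search(root, 17), search(root, 18))
```

Each step down the tree inspects one node, which is why a wide node (large m) keeps the number of node visits small.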

B-Tree. A B-tree is one example of an m-way tree. A B-tree of order m is a multiway search tree with the following properties: The root has at least two subtrees unless it is a leaf. Every node other than the root and the leaves has at most m children and at least ⌈m/2⌉ children, which means it holds at most m-1 keys and at least ⌈m/2⌉-1 keys. For example, each node in a tree of order m = 5 holds at least 2 and at most 4 keys and has at most 5 pointers. Similarly, each node in a tree of order m = 8 holds at least 3 and at most 7 keys and has at most 8 pointers. Every leaf node holds at most m-1 keys and at least ⌈m/2⌉-1 keys. All leaves are on the same level.
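The bounds above are easy to tabulate. The helper below is a small sketch (the name `btree_bounds` is hypothetical) that assumes m/2 is rounded up, which is what reproduces the figures quoted for m = 5 and m = 8.

```python
def btree_bounds(m):
    """Return (min_keys, max_keys, min_children, max_children) for a
    non-root node of a B-tree of order m, rounding m/2 up."""
    min_children = (m + 1) // 2          # integer form of ceil(m / 2)
    return (min_children - 1, m - 1, min_children, m)

for m in (4, 5, 8):
    min_k, max_k, min_c, max_c = btree_bounds(m)
    print(f"order {m}: {min_k}..{max_k} keys, {min_c}..{max_c} children")
```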

Inserting a key into a B-tree. B-trees differ from binary trees in two ways that matter for insertion: All leaves are on the last level of the tree, which is not necessarily the case for a binary tree. The tree is built bottom-up rather than top-down, as binary trees are. In general, there are three cases to consider when we insert an element into a B-tree of order m.

Insertion into a B-tree, case 1: the key is placed in a leaf that still has some room. For example, in a B-tree of order 5, a new key, 7, is placed in a leaf while preserving the order of the keys in that leaf, so key 8 must be shifted one position to the right.

Insertion into a B-tree, case 2: the leaf in which the key should be placed is full. In this case the leaf is split: a new leaf is created and half of the keys are moved from the full leaf to the new leaf. The last key of the old leaf is then moved to the parent, and a pointer to the new leaf is placed in the parent as well. [Figure: inserting 6 into a full leaf; the leaf is split and a key moves up to the parent.]

Insertion into a B-tree, case 3: the key must go into a full node whose parent is also full. Because the node is full, it is split and the middle key is moved to the parent node, as explained in case 2. What if the parent has no more room? Then we split the parent as well, creating two nodes, and move its middle key up into the parent of the parent. If the parent of the parent does not exist, we create one. If it has room, we insert the middle key there. If it is also full, we repeat the same process.


Algorithm for inserting into a B-tree

BTreeInsert(K)
    node = the leaf in which K should be inserted
    while (true) {
        find the proper position in node for K
        if there is space in node
            insert K in its proper position
            return
        else
            split node into node1 and node2
            distribute the keys and pointers (including K) evenly between node1 and node2
            K = the last key of node1
            if node was the root
                create a new root as the parent of node1 and node2
                put K and pointers to node1 and node2 in the root
                return
            else
                node = its parent    // now process the parent, which may itself be full
    }
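A runnable sketch of the insertion algorithm, written in Python, may make the three cases easier to follow. It is an illustration under stated assumptions rather than the lecture's own code: the class and function names (`Node`, `BTree`, `inorder`) are hypothetical, keys are assumed to be distinct, and the sketch first places the key in the node and then splits any node that holds more than m-1 keys, moving the middle key up.

```python
from bisect import bisect_left, insort

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []    # an empty list marks a leaf

class BTree:
    def __init__(self, m):
        self.m = m                        # order: at most m children, m-1 keys
        self.root = Node()

    def insert(self, key):
        # Descend to the target leaf, remembering the path for the way back up.
        path, node = [], self.root
        while node.children:
            path.append(node)
            node = node.children[bisect_left(node.keys, key)]
        insort(node.keys, key)            # case 1: the leaf has room
        # Cases 2 and 3: split overflowing nodes bottom-up.
        while len(node.keys) > self.m - 1:
            mid = len(node.keys) // 2
            up = node.keys[mid]           # middle key moves to the parent
            right = Node(node.keys[mid + 1:], node.children[mid + 1:])
            node.keys = node.keys[:mid]
            node.children = node.children[:mid + 1]
            if not path:                  # the root was split: grow a new root
                self.root = Node([up], [node, right])
                return
            parent = path.pop()
            i = bisect_left(parent.keys, up)
            parent.keys.insert(i, up)
            parent.children.insert(i + 1, right)
            node = parent                 # the parent may now overflow too

def inorder(node):
    """Return all keys of the subtree in ascending order."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out

tree = BTree(5)
for k in [8, 14, 2, 15, 3, 1, 16, 6, 5, 27, 37, 18, 25, 7, 13, 20, 22, 23, 24]:
    tree.insert(k)
print(inorder(tree.root))
```

The tree grows in height only in the branch that creates a new root, which is why all leaves stay on the same level.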

Step-by-step insertion into a B-tree. Insert 8, 14, 2, 15, 3, 1, 16, 6, 5, 27, 37, 18, 25, 7, 13, 20, 22, 23, 24 into a tree of order m = 5. [Figure: the tree after inserting 8, 14, 2, and 15.]

[Figures: the tree after each further group of insertions.]

Another example. This time we insert a set of numbers into a tree of order m = 4. For order 4, each node holds at least 1 and at most 3 keys. Insert 8, 14, 2, 15, 3, 1, 16, 6, 5, 27, 37, 18, 25, 7, 13, 20, 22, 23, 24. [Figure: the tree after the first few insertions.]

[Figures: the tree after each further group of insertions.]

Another example. This time we present the B-tree in more detail, showing how the index keys are connected to specific records on disk. Suppose we want to build an index, using a B-tree of order m = 3, for the following employee records on disk. Assuming that EmpId is unique in the employee table, EmpId is the best choice of index key. This example shows, step by step, how the B-tree of order m = 3 is created and how the pointers are linked to the records on disk.

EmpId  Name     Salary
2      Jack     30,000
80     Steve    32,000
8      John     50,000
71     Nancy    55,000
15     Rose     90,000
63     Abdul    35,000
90     Pat      42,000
55     Kathy    45,000
35     Melissa  38,000
51     Joe      39,000

[Figures: the B-tree before and after inserting the index entry for each record, one record at a time, in the order EmpId 2, 80, 8, 71, 15, 63, 90, 55, 35, 51. Each index key carries a pointer to its record on disk.]

Deleting from a B-tree. For the delete operation, there are two general cases: deleting a key from a leaf and deleting a key from a non-leaf. Case 1: deleting from a leaf node. If, after deleting a key K, the leaf is still at least half full, simply delete the key.

If, after deleting, the number of keys in the leaf is less than ⌈m/2⌉-1, the leaf underflows. If there is a left or right sibling with more than the minimum of ⌈m/2⌉-1 keys, then all keys from this leaf and this sibling are redistributed between the two nodes, and the separator key in the parent is updated to match. [Figure: the tree before and after deleting 7.]

If, after deleting, the number of keys in the leaf is less than ⌈m/2⌉-1, the leaf underflows. If neither the left nor the right sibling has more than the minimum of ⌈m/2⌉-1 keys, merge the node with one of its siblings and adjust the index keys in the parent node accordingly. [Figure: the tree before and after a deletion that forces a merge.]

A particular case arises when a leaf or non-leaf is merged with its sibling and its parent is the root with only one key. In this case, the keys from the node and its sibling, along with the only key of the root, are put in one node, which becomes the new root, and both the sibling and the old root are discarded. This is the only case in which two nodes disappear at the same time, and the height of the tree decreases by one. See the next example.

[Figure: the tree before and after deleting 8. The key is deleted, but the process continues with a merge.]

Case 2: deleting from a non-leaf node. Deleting directly from a non-leaf node can lead to problems with reorganization, so the deletion is reduced to deleting a key from a leaf. The key to be deleted is replaced by its immediate predecessor (the successor could also be used), which can only be found in a leaf. This predecessor key is then deleted from the leaf using the algorithm discussed in case 1.

[Figure: the tree before and after deleting 16 from a non-leaf node. Keys 16 and 15 are swapped, and 16 is then deleted from the leaf.]

B-tree delete algorithm

    node = the node that contains the key K to be deleted
    if node is not a leaf
        find the leaf with the closest successor/predecessor S of K
        copy S over K in node
        node = the leaf containing S
        delete S from node
    else
        delete K from node
    while (true) {
        if node does not underflow
            return
        else if there is a sibling of node with enough keys (i.e. more than ⌈m/2⌉-1)
            redistribute the keys between node and its sibling
            return
        else if node's parent is the root
            if the parent has only one key
                merge node, its sibling, and the parent to form a new root
            else
                merge node and its sibling
            return
        else
            merge node and its sibling
            node = its parent
    }
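The underflow handling inside the loop can be sketched as a single Python helper. This covers only the repair step, not the whole delete operation, and it rests on assumptions: the names `Node`, `min_keys`, and `fix_underflow` are hypothetical, the minimum is taken as ceil(m/2) - 1 keys, and the separator key in the parent takes part in both redistribution and merging, as is usual for B-trees.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []    # an empty list marks a leaf

def min_keys(m):
    return (m + 1) // 2 - 1               # ceil(m / 2) - 1

def fix_underflow(parent, i, m):
    """Repair parent.children[i] after it dropped below min_keys(m).

    Returns "redistributed" or "merged". After a merge the caller still has
    to check whether the parent itself now underflows (or, if the parent is
    a root left with no keys, make the merged node the new root).
    """
    # Prefer a sibling that has more than the minimum number of keys.
    for j in (i - 1, i + 1):
        if 0 <= j < len(parent.children) and len(parent.children[j].keys) > min_keys(m):
            lo = min(i, j)
            left, right = parent.children[lo], parent.children[lo + 1]
            keys = left.keys + [parent.keys[lo]] + right.keys
            kids = left.children + right.children
            mid = len(keys) // 2          # the middle key becomes the new separator
            left.keys, parent.keys[lo], right.keys = keys[:mid], keys[mid], keys[mid + 1:]
            if kids:                      # internal nodes also share out their children
                left.children, right.children = kids[:mid + 1], kids[mid + 1:]
            return "redistributed"
    # No sibling can spare a key: merge with one, pulling the separator down.
    lo = i - 1 if i > 0 else i
    left, right = parent.children[lo], parent.children[lo + 1]
    left.keys += [parent.keys.pop(lo)] + right.keys
    left.children += right.children
    del parent.children[lo + 1]
    return "merged"

# Order 5 (minimum 2 keys): the left leaf has just lost a key.
p = Node([8], [Node([2]), Node([14, 15, 16])])
print(fix_underflow(p, 0, 5), p.keys, [c.keys for c in p.children])
```

A merge removes one key from the parent, which is exactly why the algorithm above continues with node = its parent after merging.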

B* Tree. A B*-tree is a variant of the B-tree in which all nodes except the root are required to be at least two-thirds full. More precisely, the number of keys k in every node except the root of a B*-tree of order m satisfies ⌊(2m-1)/3⌋ <= k <= m-1. In this type of tree, node splitting is made less frequent by delaying a split; when the time comes, we split two nodes into three (not one node into two, as in a B-tree). Let's see some examples of inserting into a B*-tree.

[Figure: the tree before and after inserting 6.]

B* Tree, continued. As shown in the previous slide, key 6 is to be inserted into the left node, which is already full. Instead of splitting the node, all keys from this node and its sibling are evenly divided and the median key, 10, is put into the parent. Notice that this not only divides the keys evenly but also frees some space in the nodes for more keys. If the sibling is also full, a split occurs: one new node is created, the keys from the node and its sibling (along with the separating key from the parent) are evenly divided among the three nodes, and two separating keys are put into the parent. See the next slide for an example.
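The two-into-three split can be illustrated on plain key lists. The Python sketch below is an assumption-laden simplification: it handles leaf keys only (no child pointers), the function names are hypothetical, and the example uses order m = 4 with made-up keys rather than the keys from the slide.

```python
def bstar_min_keys(m):
    """Minimum number of keys in a non-root node, taken as floor((2m - 1) / 3)."""
    return (2 * m - 1) // 3

def split_two_into_three(left_keys, separator, right_keys, new_key):
    """Divide two full sibling nodes, their separator, and a new key
    among three nodes, returning the two new separators as well."""
    keys = sorted(left_keys + [separator] + right_keys + [new_key])
    n = len(keys) - 2                 # two keys go up to the parent
    a = (n + 2) // 3                  # size of the first node
    b = (n + 1) // 3                  # size of the second node
    first = keys[:a]
    sep1 = keys[a]
    second = keys[a + 1:a + 1 + b]
    sep2 = keys[a + 1 + b]
    third = keys[a + 2 + b:]          # the remaining n // 3 keys
    return first, sep1, second, sep2, third

# Order 4: both siblings are full with 3 keys each, and 8 must be inserted.
print(split_two_into_three([1, 2, 3], 4, [5, 6, 7], 8))
```

In this example each of the three resulting nodes holds 2 keys, which matches `bstar_min_keys(4)`.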

[Figure: the tree before and after an insertion that splits two full nodes into three.]

B+ Tree. A node in a B-tree structure basically represents one page of secondary storage, or a disk block. Passing from one node to another can therefore be time-consuming when we need something like an in-order traversal or a printout of the B-tree. A B+ tree is an enhanced form of the B-tree that allows us to access data sequentially faster than an in-order traversal would. In a B-tree, references to data are made from any node of the tree, but in a B+ tree these references are made only from the leaves. The internal nodes of a B+ tree are indexes to the leaves for fast access to the data.

B+ Tree, continued. In a B+ tree, the leaves have a different structure from the other nodes and are usually linked sequentially to form a sequence set, so that scanning this list of leaves yields the data in ascending order. It is called a B+ tree because the internal nodes (not the leaves) all have the same structure as in a B-tree, plus the leaves form a linked list of the keys. Thus a B+ tree is a combination of an index plus a linked list of keys. An internal node of a B+ tree stores keys and pointers to the next-level nodes. The leaves store keys, references to the records in a file, and a pointer to the next leaf.
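The benefit of the linked leaves shows up in a range scan, sketched here in Python. The `Leaf` class and `scan` function are hypothetical names, the record references are stood in for by employee names, and the way the eight keys are grouped into leaves is an assumption made for the example.

```python
class Leaf:
    """B+ tree leaf: sorted keys, one record reference per key, link to the next leaf."""
    def __init__(self, keys, records, next_leaf=None):
        self.keys = keys
        self.records = records
        self.next = next_leaf

def scan(first_leaf, low, high):
    """Yield (key, record) pairs with low <= key <= high in ascending order."""
    leaf = first_leaf
    while leaf is not None:
        for key, record in zip(leaf.keys, leaf.records):
            if key > high:
                return                 # keys only grow from here on
            if key >= low:
                yield key, record
        leaf = leaf.next               # follow the sequence set, no tree traversal

# Leaves built back to front so that each can point at its successor.
l4 = Leaf([9, 12], ["Pat", "Abdul"])
l3 = Leaf([7, 8], ["Nancy", "Jack"], l4)
l2 = Leaf([5, 6], ["Steve", "Kathy"], l3)
l1 = Leaf([1, 3], ["John", "Rose"], l2)

print(list(scan(l1, 5, 8)))
```

In a real index the scan would start at the leaf found by searching the internal nodes for `low` rather than at the first leaf.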

Algorithm for inserting into a B+ tree. During an insert, when a leaf node is full and a new entry is inserted there, the node overflows and must split. Given that the order of the B+ tree is p, the split of a leaf node causes the first p/2 entries (index keys) to remain in the original node and the rest to move to the new node. A copy of the middle key is placed in the parent node. If the parent (a non-leaf node) is full and we try to insert a new key there, the parent splits: half of the keys stay in the original node, the other half move to the new node, and the middle key is moved (not copied) to the parent node, just as in a B-tree. This process can propagate all the way up to the root. In the next example, we go step by step (with pointer details) through inserting into a B+ tree.
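The difference between copying and moving the middle key can be shown on plain key lists. This Python sketch makes two assumptions that the text does not settle: the split point is the integer half of the overflowing node, and the key copied up from a leaf is the first key of the new right leaf (some textbooks copy the last key of the left leaf instead). The function names are hypothetical.

```python
def split_leaf(keys):
    """Split an overflowing leaf. The separator is COPIED up,
    so it is still present in the right leaf afterwards."""
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right[0], right

def split_internal(keys):
    """Split an overflowing internal node. The middle key is MOVED up,
    so it appears in neither half afterwards (as in a B-tree)."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]

# Order 3: a node holds at most 2 keys, so 3 keys means overflow.
print(split_leaf([1, 5, 8]))        # the key 5 stays in a leaf
print(split_internal([3, 5, 7]))    # the key 5 leaves this level
```

Copying keeps every key reachable through the leaf list, which is what allows data references to live only in the leaves.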

Example of a B+ tree of order 3. We insert the index entries for the following records into a B+ tree.

EmpId  Name   Salary
8      Jack   30,000
5      Steve  32,000
1      John   50,000
7      Nancy  55,000
3      Rose   90,000
12     Abdul  35,000
9      Pat    42,000
6      Kathy  45,000

[Figures: the B+ tree before and after inserting the index entry for each record, one record at a time, in the order EmpId 8, 5, 1, 7, 3, 12, 9, 6. Each leaf key carries a pointer to its record on disk.]

Deleting from a B+ tree. If deleting a key does not cause an underflow, we only have to make sure that the remaining keys stay properly sorted. Even if the key to be deleted also appears in an internal node, it can stay there, because in the internal node it is just a separator. [Figure: the tree before and after deleting 3.]

When deleting a key from a leaf causes an underflow, either the keys of this leaf and of a sibling are redistributed between the two, or the leaf is deleted and its remaining keys are moved into the sibling. [Figure: the tree before and after deleting 12.]

Trie. In the previous examples we used the entire key (not just part of it) to search for an index entry or an element. A tree that uses parts of the key to navigate the search is called a trie (pronounced "try"). Each key is a sequence of characters, and a trie is organized around these characters rather than around entire keys. Suppose that all keys are made up of the five letters A, E, I, P, and R. The next slide shows an example of such a trie. To search for the word "ERIE", we first check the first level of the trie, looking at the pointer that corresponds to the first letter of the word, "E". Since this pointer is not null, the second level is checked. There the pointer for the letter "R" is not null either, so we follow it. The remaining levels are checked in the same way until we either find the word or reach a null pointer.
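The letter-by-letter search described above can be sketched in Python. The names `TrieNode`, `insert`, and `contains` are hypothetical, a dictionary stands in for the fixed array of letter pointers, and an `is_word` flag is assumed as the way to mark where a stored key ends.

```python
class TrieNode:
    def __init__(self):
        self.children = {}       # maps one letter to the next-level node
        self.is_word = False     # True if the path from the root spells a stored key

def insert(root, word):
    node = root
    for letter in word:
        # Create the next-level node on first use, then step into it.
        node = node.children.setdefault(letter, TrieNode())
    node.is_word = True

def contains(root, word):
    node = root
    for letter in word:
        node = node.children.get(letter)
        if node is None:         # null pointer: the word is not in the trie
            return False
    return node.is_word          # a bare prefix of a stored word does not count

root = TrieNode()
for w in ["A", "ARA", "ARE", "AREA", "EIRE", "ERA", "ERE", "ERIE",
          "IPA", "IRE", "PEAR", "PEER", "PER", "PIER", "REAR", "REP"]:
    insert(root, w)

print(contains(root, "ERIE"), contains(root, "ERI"))
```

The search cost depends on the length of the word, not on how many keys the trie holds.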

[Figure: a trie in which every node has the slots #, A, E, I, P, R. It stores the words A, ARA, ARE, AREA, EIRE, ERA, ERE, ERIE, IPA, IRE, PEAR, PEER, PER, PIER, REAR, and REP.]