CS235102 Data Structures Chapter 10 Search Structures.


CS Data Structures Chapter 10 Search Structures

Search Structures: Outline
- Optimal Binary Search Trees
- AVL Trees
- 2-3 Trees
- 2-3-4 Trees
- Red-Black Trees
- B-Trees

Optimal binary search trees (1/14)
- In this section we look at the construction of binary search trees for a static set of identifiers:
  - We make no additions to or deletions from the set.
  - We only perform searches.
- We examine the correspondence between a binary search tree and the binary search function.

Optimal binary search trees (2/14)  Examine: A binary search on the list (do, if, while) is equivalent to using the function (search2) on the binary search tree
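The correspondence can be sketched as follows. This is a minimal illustration, not the slide's own code: `binary_search` and `bst_search` are hypothetical names (`bst_search` only stands in for the `search2` function, which is not shown in this transcript), and the tree shape with `if` at the root is an assumption. Both routines record the identifiers they compare against, and the two probe sequences coincide.

```python
def binary_search(items, key):
    """Return the identifiers compared while binary-searching sorted `items`."""
    probes = []
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        probes.append(items[mid])
        if key == items[mid]:
            break
        if key < items[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return probes


# Assumed tree shape: (key, left, right) tuples with "if" at the root.
TREE = ("if", ("do", None, None), ("while", None, None))


def bst_search(node, key):
    """Return the identifiers compared while searching the binary search tree."""
    probes = []
    while node is not None:
        k, left, right = node
        probes.append(k)
        if key == k:
            break
        node = left if key < k else right
    return probes


if __name__ == "__main__":
    for key in ("do", "if", "while", "for"):
        print(key, binary_search(["do", "if", "while"], key), bst_search(TREE, key))
```

The same holds for unsuccessful searches: the binary search runs out of range exactly where the tree search falls off a null link.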

Optimal binary search trees (3/14)
- To find an optimal binary search tree for a given static list, we must first decide on a cost measure for search trees.
- Assume that we wish to search for an identifier at level k of a binary search tree.
  - Generally, the number of iterations of binary search equals the level number of the identifier we seek.
  - It is therefore reasonable to use the level number of a node as its cost.

Optimal binary search trees (4/14)
- A full binary tree may not be an optimal binary search tree if the identifiers are searched for with different frequencies.
- Consider these two search trees. If we search for each identifier with equal probability:
  - In the first tree, the average number of comparisons for a successful search is (1 + 2 + 2 + 3 + 4)/5 = 2.4.
  - In the second tree, it is (1 + 2 + 2 + 3 + 3)/5 = 2.2.
- The second tree has
  - a better worst-case search time than the first tree, and
  - better average behavior.

Optimal binary search trees (5/14)
- In evaluating binary search trees, it is useful to add a special square node at every place there is a null link.
  - We call these nodes external nodes.
  - We also refer to the external nodes as failure nodes.
  - The remaining nodes are internal nodes.
- A binary tree with external nodes added is an extended binary tree.

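Optimal binary search trees (6/14)
- External / internal path length: the sum, over all external / internal nodes, of the lengths of the paths from the root to those nodes (the root is at distance 0).
- For example:
  - Internal path length: I = 0 + 1 + 1 + 2 + 3 = 7
  - External path length: E = 2 + 2 + 4 + 4 + 3 + 2 = 17
- The internal and external path lengths of a binary tree with n internal nodes are related by the formula E = I + 2n.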
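A small sketch of how the two path lengths can be computed. The tree shape `SHAPE` below is an assumption, since the figure is not reproduced in this transcript; it is one five-node shape that yields I = 7 and E = 17. The name `path_lengths` is hypothetical.

```python
def path_lengths(tree, depth=0):
    """Return (I, E, n) for a tree of nested (left, right) tuples.

    None stands for an external (failure) node; depth is the distance
    from the root, so the root contributes 0 to I.
    """
    if tree is None:
        return 0, depth, 0
    left, right = tree
    li, le, ln = path_lengths(left, depth + 1)
    ri, re_, rn = path_lengths(right, depth + 1)
    return depth + li + ri, le + re_, 1 + ln + rn


LEAF = (None, None)
# Assumed shape: the root has a leaf on the right and a left-leaning
# chain of three more internal nodes on the left.
SHAPE = (((LEAF, None), None), LEAF)

if __name__ == "__main__":
    i, e, n = path_lengths(SHAPE)
    print(i, e, n, e == i + 2 * n)
```

The final comparison checks the relation E = I + 2n for this shape.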

Optimal binary search trees (7/14)
- The maximum and minimum possible values for I with n internal nodes:
- Maximum:
  - The worst case occurs when the tree is skewed, that is, when the tree has a depth of n. Then $I = \sum_{i=0}^{n-1} i = n(n-1)/2$.
- Minimum:
  - We must have as many internal nodes as close to the root as possible in order to obtain trees with minimal I.
  - One tree with minimal internal path length is the complete binary tree, in which the distance of node $i$ from the root is $\lfloor \log_2 i \rfloor$, so $I = \sum_{i=1}^{n} \lfloor \log_2 i \rfloor$.
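The two extremes can be evaluated directly. This is a sketch with hypothetical function names; it assumes the skewed tree places node i at distance i - 1 from the root and the complete tree places node i at distance floor(log2 i).

```python
def max_internal_path_length(n):
    """Skewed tree: distances 0, 1, ..., n-1 sum to n(n-1)/2."""
    return n * (n - 1) // 2


def min_internal_path_length(n):
    """Complete binary tree: node i lies at distance floor(log2 i)."""
    # For a positive integer i, i.bit_length() - 1 equals floor(log2 i).
    return sum(i.bit_length() - 1 for i in range(1, n + 1))


if __name__ == "__main__":
    for n in (5, 7, 12):
        print(n, min_internal_path_length(n), max_internal_path_length(n))
```

Using `bit_length` instead of a floating-point logarithm avoids rounding problems near exact powers of two.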

Optimal binary search trees (8/14)
- In the binary search tree:
  - The identifiers are $a_1, a_2, \ldots, a_n$ with $a_1 < a_2 < \cdots < a_n$.
  - The probability of searching for each $a_i$ is $p_i$.
  - The total cost (when only successful searches are made) is $\sum_{i=1}^{n} p_i \cdot \mathrm{level}(a_i)$.
- If we replace each null subtree by a failure node, we may partition the identifiers that are not in the binary search tree into $n+1$ classes $E_i$, $0 \le i \le n$:
  - $E_i$ contains all identifiers $x$ such that $a_i < x < a_{i+1}$ (with $a_0 = -\infty$ and $a_{n+1} = +\infty$).
  - For all identifiers in a particular class $E_i$, the search terminates at the same failure node.

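Optimal binary search trees (9/14)
- We number the failure nodes from 0 to $n$, with $i$ being the failure node for class $E_i$, $0 \le i \le n$.
- If $q_i$ is the probability that the identifier we are searching for is in $E_i$, then the cost of failure node $i$ is $q_i \cdot (\mathrm{level}(\text{failure node } i) - 1)$.
- Therefore, the total cost of a binary search tree is
  $$\sum_{i=1}^{n} p_i \cdot \mathrm{level}(a_i) + \sum_{i=0}^{n} q_i \cdot (\mathrm{level}(\text{failure node } i) - 1) \qquad (10.1)$$
- An optimal binary search tree for the identifier set $a_1, \ldots, a_n$ is one that minimizes Eq. (10.1).
- Since all searches must terminate either successfully or unsuccessfully, we have $\sum_{i=1}^{n} p_i + \sum_{i=0}^{n} q_i = 1$.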

Optimal binary search trees (10/14)
- The possible binary search trees for the identifier set $(a_1, a_2, a_3)$ = (do, if, while)
- With equal probabilities, $p_i = q_j = 1/7$ for all $i$, $j$:
  - cost(tree a) = 15/7; cost(tree b) = 13/7 (optimal); cost(tree c) = 15/7; cost(tree d) = 15/7; cost(tree e) = 15/7
- With $p_1 = 0.5$, $p_2 = 0.1$, $p_3 = 0.05$, $q_0 = 0.15$, $q_1 = 0.1$, $q_2 = 0.05$, $q_3 = 0.05$:
  - cost(tree a) = 2.65; cost(tree b) = 1.9; cost(tree c) = 1.5 (optimal); cost(tree d) = 2.05; cost(tree e) = 1.6
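The total cost of a given tree can be evaluated mechanically. The sketch below charges each identifier its level and each failure node its level minus one. The function name `cost` and the two tree shapes are assumptions, because the figure with trees a to e is not reproduced here: `BALANCED` has `if` at the root, and `RIGHT_CHAIN` has `do` at the root with `if` and `while` hanging to the right. Their costs come out equal to the values listed for trees b and c.

```python
from fractions import Fraction


def cost(tree, p, q):
    """Total cost of a tree of (index, left, right) tuples; None is a failure node.

    p[i] is the weight of identifier a_i (p[0] is unused) and q[i] the
    weight of failure class E_i.
    """
    total = 0
    next_failure = 0  # an in-order walk meets failure nodes as E_0, E_1, ..., E_n

    def visit(node, level):
        nonlocal total, next_failure
        if node is None:
            total += q[next_failure] * (level - 1)
            next_failure += 1
            return
        i, left, right = node
        visit(left, level + 1)
        total += p[i] * level
        visit(right, level + 1)

    visit(tree, 1)
    return total


BALANCED = (2, (1, None, None), (3, None, None))
RIGHT_CHAIN = (1, None, (2, None, (3, None, None)))

if __name__ == "__main__":
    seventh = Fraction(1, 7)
    p_eq, q_eq = [None] + [seventh] * 3, [seventh] * 4
    print(cost(BALANCED, p_eq, q_eq), cost(RIGHT_CHAIN, p_eq, q_eq))

    p = [None, Fraction("0.5"), Fraction("0.1"), Fraction("0.05")]
    q = [Fraction("0.15"), Fraction("0.1"), Fraction("0.05"), Fraction("0.05")]
    print(cost(BALANCED, p, q), cost(RIGHT_CHAIN, p, q))
```

Exact fractions are used so that results such as 13/7 and 19/10 can be compared without floating-point noise.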

Optimal binary search trees (11/14)
- How do we determine the optimal binary search tree for a given set of identifiers?
- We can make some observations about the properties of optimal binary search trees:
  - $T_{ij}$: an optimal binary search tree for $a_{i+1}, \ldots, a_j$, $i < j$.
    - $T_{ii}$ is an empty tree for $0 \le i \le n$, and $T_{ij}$ is not defined for $i > j$.
  - $c_{ij}$: the cost of the search tree $T_{ij}$.
    - By definition, $c_{ii}$ is 0.
  - $r_{ij}$: the root of $T_{ij}$.
  - $w_{ij}$: the weight of $T_{ij}$, $w_{ij} = q_i + \sum_{k=i+1}^{j} (q_k + p_k)$.
    - By definition, $r_{ii} = 0$ and $w_{ii} = q_i$, $0 \le i \le n$.
- $T_{0n}$ is an optimal binary search tree for $a_1, \ldots, a_n$. Its cost is $c_{0n}$, its weight is $w_{0n}$, and its root is $r_{0n}$.

Optimal binary search trees (12/14)
- If $T_{ij}$ is an optimal binary search tree for $a_{i+1}, \ldots, a_j$ and $r_{ij} = k$, then $k$ satisfies the inequality $i < k \le j$.
- $T_{ij}$ has root $a_k$ and two subtrees, $L$ and $R$:
  - $L$ is the left subtree and contains the identifiers $a_{i+1}, \ldots, a_{k-1}$.
  - $R$ is the right subtree and contains the identifiers $a_{k+1}, \ldots, a_j$.
- The cost $c_{ij}$ of $T_{ij}$ is (using $w_{ij} = p_k + w_{i,k-1} + w_{kj}$)
  $$c_{ij} = p_k + \mathrm{cost}(L) + \mathrm{cost}(R) + \mathrm{weight}(L) + \mathrm{weight}(R)$$
  $$= p_k + c_{i,k-1} + c_{kj} + w_{i,k-1} + w_{kj}$$
  $$= w_{ij} + c_{i,k-1} + c_{kj}$$
  $$= w_{ij} + \min_{i < l \le j} \{ c_{i,l-1} + c_{lj} \}$$
- This shows us how to obtain $T_{0n}$ and $c_{0n}$, starting from the knowledge that $T_{ii} = \emptyset$ and $c_{ii} = 0$.

Optimal binary search trees (13/14)
- Example: let $n = 4$ and $(a_1, a_2, a_3, a_4)$ = (do, for, void, while). Let $(p_1, p_2, p_3, p_4) = (3, 3, 1, 1)$ and $(q_0, q_1, q_2, q_3, q_4) = (2, 3, 1, 1, 1)$.
- Initially $w_{ii} = q_i$, $c_{ii} = 0$, and $r_{ii} = 0$ for $0 \le i \le 4$. Then:
  - $w_{01} = p_1 + w_{00} + w_{11} = p_1 + q_1 + w_{00} = 8$; $c_{01} = w_{01} + \min\{c_{00} + c_{11}\} = 8$; $r_{01} = 1$
  - $w_{12} = p_2 + w_{11} + w_{22} = p_2 + q_2 + w_{11} = 7$; $c_{12} = w_{12} + \min\{c_{11} + c_{22}\} = 7$; $r_{12} = 2$
  - $w_{23} = p_3 + w_{22} + w_{33} = p_3 + q_3 + w_{22} = 3$; $c_{23} = w_{23} + \min\{c_{22} + c_{33}\} = 3$; $r_{23} = 3$
  - $w_{34} = p_4 + w_{33} + w_{44} = p_4 + q_4 + w_{33} = 3$; $c_{34} = w_{34} + \min\{c_{33} + c_{44}\} = 3$; $r_{34} = 4$

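Optimal binary search trees (14/14)
- Recurrences:
  - $w_{ii} = q_i$
  - $w_{ij} = p_k + w_{i,k-1} + w_{kj}$
  - $c_{ii} = 0$
  - $c_{ij} = w_{ij} + \min_{i < l \le j} \{ c_{i,l-1} + c_{lj} \}$
  - $r_{ii} = 0$
  - $r_{ij} = l$, the value of $l$ that attains the minimum
- Computation is carried out row-wise from row 0 to row 4.
- The result is the optimal search tree for $(a_1, a_2, a_3, a_4)$ = (do, for, void, while), $(p_1, p_2, p_3, p_4) = (3, 3, 1, 1)$, and $(q_0, q_1, q_2, q_3, q_4) = (2, 3, 1, 1, 1)$.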
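The table computation can be sketched as a short dynamic program. This is an illustrative version, not the textbook's program: the function name `optimal_bst` and the table layout are assumptions. It fills the tables by increasing j - i, which corresponds to the row-wise order, and on ties keeps the smallest root index.

```python
def optimal_bst(p, q):
    """Fill the w, c, r tables for weights p[1..n] (p[0] unused) and q[0..n]."""
    n = len(q) - 1
    w = [[0] * (n + 1) for _ in range(n + 1)]
    c = [[0] * (n + 1) for _ in range(n + 1)]
    r = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        w[i][i] = q[i]  # c[i][i] and r[i][i] stay 0
    for size in range(1, n + 1):  # row number j - i
        for i in range(n - size + 1):
            j = i + size
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            # Choose the root l, i < l <= j, minimizing c[i][l-1] + c[l][j].
            best = min(range(i + 1, j + 1), key=lambda l: c[i][l - 1] + c[l][j])
            c[i][j] = w[i][j] + c[i][best - 1] + c[best][j]
            r[i][j] = best
    return w, c, r


if __name__ == "__main__":
    p = [None, 3, 3, 1, 1]
    q = [2, 3, 1, 1, 1]
    w, c, r = optimal_bst(p, q)
    print(w[0][4], c[0][4], r[0][4])
```

For the example weights this gives a total weight of 16, a cost of 32, and root index 2, so `for` is the root of the optimal tree. The three nested ranges make this version run in O(n^3) time.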

AVL Trees (1/17)
- We may also maintain dynamic tables as binary search trees.
- Figure 10.8 shows the binary search tree obtained by entering the months January to December, in that order, into an initially empty binary search tree.
- The maximum number of comparisons needed to search for any identifier in the tree of Figure 10.8 is six (for November).
- The average number of comparisons is 42/12 = 3.5.

AVL Trees (2/17)
- Suppose that we now enter the months into an initially empty tree in alphabetical order.
- The tree degenerates into a chain.
- Number of comparisons: maximum 12, average 6.5.
- In the worst case, binary search trees correspond to sequential searching in an ordered list.

AVL Trees (3/17)
- Another insert sequence: Jul, Feb, May, Aug, Jan, Mar, Oct, Apr, Dec, Jun, Nov, and Sep, giving the tree of Figure 10.9.
  - The tree is well balanced and does not have any paths to leaf nodes that are much longer than others.
  - Number of comparisons: maximum 4, average 37/12 ≈ 3.1.
  - All intermediate trees created during the construction of Figure 10.9 are also well balanced.
- If all permutations are equally probable, then we can prove that the average search and insertion time is O(log n) for an n-node binary search tree.
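The comparison counts for the three insertion orders can be reproduced with a plain, unbalanced binary search tree. In this sketch the function names are hypothetical, and the months are compared as three-letter abbreviations, which sort in the same order as the full names.

```python
def insert(root, key):
    """Insert key into an unbalanced BST of [key, left, right] lists; return the root."""
    if root is None:
        return [key, None, None]
    node = root
    while True:
        side = 1 if key < node[0] else 2
        if node[side] is None:
            node[side] = [key, None, None]
            return root
        node = node[side]


def levels(node, level=1):
    """List the level of every node; searching for a key costs its level in comparisons."""
    if node is None:
        return []
    return [level] + levels(node[1], level + 1) + levels(node[2], level + 1)


def comparison_stats(keys):
    """Return (maximum comparisons, total comparisons over all keys)."""
    root = None
    for key in keys:
        root = insert(root, key)
    found = levels(root)
    return max(found), sum(found)


CALENDAR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
            "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
BALANCED_ORDER = ["Jul", "Feb", "May", "Aug", "Jan", "Mar",
                  "Oct", "Apr", "Dec", "Jun", "Nov", "Sep"]

if __name__ == "__main__":
    for name, order in (("calendar", CALENDAR),
                        ("alphabetical", sorted(CALENDAR)),
                        ("balanced", BALANCED_ORDER)):
        worst, total = comparison_stats(order)
        print(name, worst, total / len(order))
```

The totals 42, 78, and 37 divided by 12 give the averages 3.5, 6.5, and about 3.1.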

AVL Trees (4/17)
- In a dynamic environment, it is hard to add new elements and still maintain a complete binary tree without a significant increase in time.
- Adelson-Velskii and Landis introduced a binary tree structure (AVL trees) that is balanced with respect to the heights of its subtrees:
  - We can perform dynamic retrievals in O(log n) time for a tree with n nodes.
  - We can enter an element into the tree, or delete an element from it, in O(log n) time. The resulting tree remains height balanced.
- As with binary trees, we may define AVL trees recursively.

AVL Trees (5/17)
- Definition:
  - An empty binary tree is height balanced. If $T$ is a nonempty binary tree with $T_L$ and $T_R$ as its left and right subtrees, then $T$ is height balanced iff
    - $T_L$ and $T_R$ are height balanced, and
    - $|h_L - h_R| \le 1$, where $h_L$ and $h_R$ are the heights of $T_L$ and $T_R$, respectively.
- The definition of a height balanced binary tree requires that every subtree also be height balanced.

AVL Trees (6/17)
- This time we will insert the months into the tree in the order Mar, May, Nov, Aug, Apr, Jan, Dec, Jul, Feb, Jun, Oct, Sep.
- The following slides show the tree as it grows, and the restructuring involved in keeping it balanced.
- The numbers by each node represent the difference in heights between the left and right subtrees of that node.
  - We refer to this as the balance factor of the node.
- Definition:
  - The balance factor, BF(T), of a node, T, in a binary tree is defined as $h_L - h_R$, where $h_L$ and $h_R$ are the heights of the left and right subtrees of T. For any node T in an AVL tree, BF(T) = -1, 0, or 1.
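The recursive definition and the balance factor translate almost directly into code. This is a minimal sketch with hypothetical names, assuming the empty tree has height 0 and trees are stored as (key, left, right) tuples.

```python
def height(node):
    """Height of a tree of (key, left, right) tuples; the empty tree has height 0."""
    if node is None:
        return 0
    return 1 + max(height(node[1]), height(node[2]))


def balance_factor(node):
    """BF(T) = h_L - h_R for a nonempty tree T."""
    return height(node[1]) - height(node[2])


def is_height_balanced(node):
    """Apply the recursive definition: both subtrees balanced and |h_L - h_R| <= 1."""
    if node is None:
        return True
    return (is_height_balanced(node[1])
            and is_height_balanced(node[2])
            and abs(balance_factor(node)) <= 1)


# Mar, May, Nov inserted without rebalancing form a right-leaning chain.
CHAIN = ("Mar", None, ("May", None, ("Nov", None, None)))
# The same three keys with May at the root.
REBALANCED = ("May", ("Mar", None, None), ("Nov", None, None))

if __name__ == "__main__":
    print(balance_factor(CHAIN), is_height_balanced(CHAIN))
    print(balance_factor(REBALANCED), is_height_balanced(REBALANCED))
```

Recomputing heights on every call keeps the sketch short but is slow on large trees; storing a height or balance factor in each node avoids the repeated traversals.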

AVL Trees (7/17)  Insertion into an AVL tree

AVL Trees (8/17)  Insertion into an AVL tree (cont’d)

AVL Trees (11/17)
- We carried out the rebalancing using four different kinds of rotations: LL, RR, LR, and RL.
  - LL and RR are symmetric, as are LR and RL.
- These rotations are characterized by the nearest ancestor, A, of the inserted node, Y, whose balance factor becomes ±2:
  - LL: Y is inserted in the left subtree of the left subtree of A.
  - LR: Y is inserted in the right subtree of the left subtree of A.
  - RR: Y is inserted in the right subtree of the right subtree of A.
  - RL: Y is inserted in the left subtree of the right subtree of A.
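The four cases can be sketched as a recursive insertion that rebalances on the way back up. This is an illustrative implementation, not the textbook's program: the class and function names are hypothetical, each node stores its height rather than a balance factor, and the `log` list exists only to record which rotation was applied.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1


def h(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(h(node.left), h(node.right))


def bf(node):
    return h(node.left) - h(node.right)


def rotate_right(a):
    """Single rotation that repairs an LL imbalance at a."""
    b = a.left
    a.left, b.right = b.right, a
    update(a)
    update(b)
    return b


def rotate_left(a):
    """Single rotation that repairs an RR imbalance at a."""
    b = a.right
    a.right, b.left = b.left, a
    update(a)
    update(b)
    return b


def insert(node, key, log):
    """Insert key below node, rebalance, and return the new subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key, log)
    else:
        node.right = insert(node.right, key, log)
    update(node)
    if bf(node) == 2:                   # left subtree too tall
        if bf(node.left) < 0:           # LR: rotate the left child first
            node.left = rotate_left(node.left)
            log.append("LR")
        else:
            log.append("LL")
        return rotate_right(node)
    if bf(node) == -2:                  # right subtree too tall
        if bf(node.right) > 0:          # RL: rotate the right child first
            node.right = rotate_right(node.right)
            log.append("RL")
        else:
            log.append("RR")
        return rotate_left(node)
    return node


def build(keys):
    root, log = None, []
    for key in keys:
        root = insert(root, key, log)
    return root, log


MONTHS = ["Mar", "May", "Nov", "Aug", "Apr", "Jan",
          "Dec", "Jul", "Feb", "Jun", "Oct", "Sep"]

if __name__ == "__main__":
    root, log = build(MONTHS)
    print(root.key, log)
```

Here the LR and RL cases are carried out as two single rotations, which produces the same tree as a dedicated double rotation.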

AVL Trees (12/17)  Rebalancing rotations

AVL Trees (13/17)  Rebalancing rotations

AVL Trees (14/17)  Rebalancing rotations (cont’d)

AVL Trees (15/17)  Rebalancing rotations (cont’d)

AVL Trees (16/17)  Rebalancing rotations (cont’d)

AVL Trees (17/17)
- Complexity:
  - In the case of binary search trees, if there are n nodes in the tree, then h (the height of the tree) can be n, and the worst-case insertion time is O(n).
  - In the case of AVL trees, since h is at most O(log n), the worst-case insertion time is O(log n).
- The accompanying figure compares the worst-case times of certain operations.
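The logarithmic height bound can be checked numerically. This sketch relies on a standard result that is not stated on the slide: the fewest nodes in a height balanced tree of height h satisfy N_h = N_{h-1} + N_{h-2} + 1, with N_0 = 0 and N_1 = 1. The function names are hypothetical.

```python
def min_nodes(height):
    """Fewest nodes in a height balanced tree of the given height (empty tree: 0)."""
    a, b = 0, 1  # N_0, N_1
    for _ in range(height):
        a, b = b, a + b + 1
    return a


def max_height(n):
    """Largest height a height balanced tree with n nodes can have."""
    height = 0
    while min_nodes(height + 1) <= n:
        height += 1
    return height


if __name__ == "__main__":
    for n in (12, 1000, 1_000_000):
        print(n, max_height(n))
```

A skewed binary search tree with n nodes has height n, whereas the minimum node counts above grow like Fibonacci numbers, so the maximum AVL height grows only logarithmically in n.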

2-3 Trees

2-3-4 Trees

Red-black Trees

B-Trees

B-Trees

B-Trees

B-Trees

B-Trees

B-Trees

Splay Trees

Digital Trees

Tries

Tries

Tries

Tries

Tries

Tries