Trees (Chapter 4) Binary Search Trees - Review Definition

Presentation transcript:

Trees (Chapter 4) Binary Search Trees - Review. Definition: Every key in the left subtree is less than (or equal to) the key in the root. Every key in the right subtree is greater than the key in the root. The left and right subtrees are themselves binary search trees.

Where would you add X, B, and E?

Consider the following code:

    // Declared first so that BinarySearchTree can refer to it.
    template <class Etype>
    class TreeNode {
    public:
        Etype element;
        TreeNode<Etype> *left;
        TreeNode<Etype> *right;
        TreeNode( Etype e = 0, TreeNode<Etype> *l = NULL, TreeNode<Etype> *r = NULL )
            : element( e ), left( l ), right( r ) { }
    };

    template <class Etype>
    class BinarySearchTree {
    protected:
        TreeNode<Etype> *root;
        // Recursive helpers that work on a subtree.
        void makeEmpty( TreeNode<Etype> * & t );
        bool insert( const Etype & x, TreeNode<Etype> * & t );
        Etype* find( const Etype & x, TreeNode<Etype> * t );
        void printTree( TreeNode<Etype> * t, string indent ) const;
    public:
        BinarySearchTree( ) : root( NULL ) { }
        // Public entry points that start the recursion at the root.
        void makeEmpty( ) { makeEmpty( root ); }
        void printTree( string msg ) const { cout << msg << endl; printTree( root, "" ); }
        virtual Etype* find( const Etype & x ) { return find( x, root ); }
        virtual int insert( Etype & x ) { return insert( x, root ); }
    };

Why do we pass value by reference? Why a const?

Further questions about the same code: What does this const mean? Why pointer reference? Why this initialization? Why two declarations?

Write the insert routine.
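One way the insert routine might look is sketched below. This is not the course's official solution: it is written as a free function over a stripped-down `TreeNode`, and it assumes that duplicate keys are rejected (the function returns false for them). Both choices are assumptions made for illustration.

```cpp
// Minimal node, mirroring the TreeNode template from the slides.
template <class Etype>
struct TreeNode {
    Etype element;
    TreeNode *left;
    TreeNode *right;
    explicit TreeNode(const Etype &e) : element(e), left(nullptr), right(nullptr) {}
};

// Recursive insert. t is a reference to the pointer being followed, so
// assigning to t rewires the parent's link (or the root pointer itself).
// Assumption: duplicates are not stored; false is returned for them.
template <class Etype>
bool insert(const Etype &x, TreeNode<Etype> *&t) {
    if (t == nullptr) {               // found the empty spot: attach a new leaf
        t = new TreeNode<Etype>(x);
        return true;
    }
    if (x < t->element) return insert(x, t->left);
    if (t->element < x) return insert(x, t->right);
    return false;                     // key already present
}
```

Because `t` is passed as `TreeNode<Etype> *&`, the assignment in the base case changes the caller's pointer, which is exactly why the slides ask about the pointer reference.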

Binary Search Tree - Best Time: All BST operations are O(d), where d is the tree depth. The minimum d is ⌊log2 N⌋ for a binary tree with N nodes. What is the best case tree? What is the worst case tree? So the best case running time of BST operations is O(log N). AVL Trees -

Binary Search Tree - Worst Time: Worst case running time is O(N). What happens when you insert elements in ascending order? Insert 2, 4, 6, 8, 10, 12 into an empty BST. Problem: lack of “balance” (compare the depths of the left and right subtrees). The result is an unbalanced, degenerate tree.
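The ascending-order experiment above can be checked with a small sketch. It uses a plain (unbalanced) BST of ints; the names `Node`, `insert`, `height`, and `heightAfterInserting` are illustrative and do not come from the slides. An empty tree is given height -1, matching the convention used later for AVL trees.

```cpp
#include <algorithm>
#include <vector>

struct Node {
    int key;
    Node *left;
    Node *right;
    explicit Node(int k) : key(k), left(nullptr), right(nullptr) {}
};

// Ordinary BST insert with no rebalancing; duplicates are ignored.
void insert(int key, Node *&t) {
    if (t == nullptr) t = new Node(key);
    else if (key < t->key) insert(key, t->left);
    else if (key > t->key) insert(key, t->right);
}

// Height in edges; an empty tree has height -1.
int height(const Node *t) {
    return t == nullptr ? -1 : 1 + std::max(height(t->left), height(t->right));
}

// Builds a BST from the keys in the given order and reports its height.
// Nodes are not freed, to keep the sketch short.
int heightAfterInserting(const std::vector<int> &keys) {
    Node *root = nullptr;
    for (int k : keys) insert(k, root);
    return height(root);
}
```

With the keys 2, 4, 6, 8, 10, 12 in ascending order every node only has a right child, so the height is N-1 = 5. Inserting the same keys in the order 8, 4, 10, 2, 6, 12 gives height 2.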

Balanced and unbalanced BST: Is this “balanced”? [Figure: example binary search trees with differing degrees of balance.]

AVL Trees

Approaches to balancing trees: Don't balance (may end up with some nodes very deep). Strict balance (the tree must always be balanced perfectly). Pretty good balance (only allow a little out of balance; may be easier to maintain). Adjust on access (self-adjusting).

Balancing Binary Search Trees: Many algorithms exist for keeping binary search trees balanced: Adelson-Velskii and Landis (AVL) trees (height-balanced trees), splay trees and other self-adjusting trees, B-trees and other multiway search trees.

Perfect Balance: Want a complete tree after every operation (the tree is full except possibly in the lower right). This is expensive. For example, insert 2 in the tree on the left and then rebuild as a complete tree. [Figure: a tree before the insertion, and the complete tree rebuilt after inserting 2.]

AVL - Good but not Perfect Balance: AVL trees are height-balanced binary search trees. The balance factor of a node is height(left subtree) - height(right subtree). An AVL tree has the balance factor calculated at every node. For every node, the heights of the left and right subtrees can differ by no more than 1. Each node could store its current height.
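The definitions above translate almost directly into code. The following is a minimal sketch with illustrative names (`Node`, `height`, `balanceFactor`, `isHeightBalanced`); it recomputes heights on every call for clarity, whereas a real AVL tree would store the height or balance factor in each node, as the slide suggests.

```cpp
#include <algorithm>
#include <cstdlib>

struct Node {
    int key;
    Node *left;
    Node *right;
};

// Height in edges; an empty tree has height -1.
int height(const Node *t) {
    return t == nullptr ? -1 : 1 + std::max(height(t->left), height(t->right));
}

// balance factor = height(left subtree) - height(right subtree)
int balanceFactor(const Node *t) {
    return height(t->left) - height(t->right);
}

// True if every node's subtrees differ in height by at most 1.
// (Only the height condition is checked, not the key ordering.)
bool isHeightBalanced(const Node *t) {
    if (t == nullptr) return true;
    return std::abs(balanceFactor(t)) <= 1
        && isHeightBalanced(t->left)
        && isHeightBalanced(t->right);
}
```

For a tree with root 6, children 4 and 9, where 4 has children 1 and 5 and 9 has left child 8, every balance factor is within range. Hanging 7 below 8 gives node 9 a balance factor of 1 - (-1) = 2, so the check fails.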

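Node Heights: height of node = h; balance factor = hleft - hright; height of an empty tree = -1. [Figure: Tree A (AVL) has root 6 with children 4 and 9, and 4 has children 1 and 5. Tree B (AVL) is the same tree with 8 added as the left child of 9. Each node is labeled with its height; the root of Tree A has height 2 and BF = 1 - 0 = 1.]

Node Heights after Insert 7: Tree A is still AVL: 7 becomes the left child of 9, which is now heavy to the left. Tree B is not AVL: 7 becomes the left child of 8, so node 9 has balance factor 1 - (-1) = 2 and is TOO heavy to the left. Height of node = h; balance factor = hleft - hright; height of an empty tree = -1.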

Insert and Rotation in AVL Trees: An insert operation may cause the balance factor to become 2 or -2 for some node(s). Only nodes on the path from the insertion point to the root have possibly changed in height. So after the insert, go back up to the root node by node, updating heights. If a new balance factor (the difference hleft - hright) is 2 or -2, adjust the tree by rotation around the node. It is important that you rotate about the first unhappy node you meet on the way up.

Single Rotation in an AVL Tree (right rotation): [Figure: after inserting 7, node 9 (with left child 8 and grandchild 7) is rotated right, so that 8 takes its place with children 7 and 9.]

Insertions in AVL Trees: Consider the node that needs rebalancing. There are 4 cases. Single rotation: (1) the node is too heavy to the left, and its left child is heavy to the left; (2) the node is too heavy to the right, and its right child is heavy to the right. Double rotation: (1) the node is too heavy to the left, and its left child is heavy to the right; (2) the node is too heavy to the right, and its right child is heavy to the left. The rebalancing is performed through four separate rotation algorithms.
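The four cases can be told apart from balance factors alone. The sketch below is illustrative: the `Fix` enum and `chooseFix` function are made-up names, and balance factors follow the slides' convention hleft - hright. Treating a child balance factor of 0 as a single-rotation case is an assumption; it cannot occur right after an insertion, but it makes the function total.

```cpp
// Which repair the unbalanced node needs.
enum class Fix { None, SingleRight, SingleLeft, DoubleLeftRight, DoubleRightLeft };

// bf      : balance factor of the node being checked
// leftBf  : balance factor of its left child (ignored unless bf > 1)
// rightBf : balance factor of its right child (ignored unless bf < -1)
Fix chooseFix(int bf, int leftBf, int rightBf) {
    if (bf > 1) {
        // Too heavy to the left: outside case if the left child leans left.
        return leftBf >= 0 ? Fix::SingleRight : Fix::DoubleLeftRight;
    }
    if (bf < -1) {
        // Too heavy to the right: outside case if the right child leans right.
        return rightBf <= 0 ? Fix::SingleLeft : Fix::DoubleRightLeft;
    }
    return Fix::None;  // balance factor is -1, 0, or 1
}
```

For example, a node with balance factor 2 whose left child has balance factor -1 is the inside case and needs the left-right double rotation.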

AVL Insertion: Consider a valid AVL subtree. Node j has left child k and right subtree Z; k has left subtree X and right subtree Y. X, Y, and Z all have height h.

AVL Insertion: Single Rotation. Inserting into X destroys the AVL property at node j: X now has height h+1, while Y and Z still have height h.

AVL Insertion: Single Rotation. Do a “right rotation”.

Single right rotation: k moves up to become the root of the subtree, j becomes the right child of k, and Y becomes the left subtree of j.

Rotation completed: k is the root, with left subtree X (height h+1) and right child j; j has subtrees Y and Z (height h each). AVL property has been restored! “Right rotation” done! (“Left rotation” is mirror symmetric.)

AVL Insertion: Inside Case. Consider a valid AVL subtree. Node j has left child k and right subtree Z; k has left subtree X and right subtree Y. X, Y, and Z all have height h.

AVL Insertion: Double Rotation. Inserting into Y destroys the AVL property at node j: Y now has height h+1, while X and Z still have height h. Does “right rotation” restore balance?

AVL Insertion: Inside Case. “Right rotation” does not restore balance: k becomes the root with left subtree X (height h) and right child j, whose subtrees are Y (height h+1) and Z (height h). Now k is out of balance.

AVL Insertion: Inside Case. Consider the structure of subtree Y...

AVL Insertion: Inside Case. Y = node i and subtrees V and W, each of height h or h-1.

AVL Insertion: Inside Case. We will do a left-right “double rotation”...

Double rotation, first rotation: left rotation complete. i is now the left child of j; k is the left child of i, with subtrees X and V; W is the right subtree of i; Z is still the right subtree of j.

Double rotation, second rotation: now do a right rotation at j.

Double rotation, second rotation: right rotation complete. i is the root, with left child k (subtrees X and V) and right child j (subtrees W and Z). X and Z have height h; V and W have height h or h-1. Balance has been restored.

Though it is really two rotations, visually you can think of pushing the i up through the nodes.
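The left-right double rotation described above can be written as two single rotations. The following is a sketch under assumptions: each node stores its height (rather than a balance factor), and the names `updateHeight`, `rotateLeft`, and `doubleRotateLeftRight` are illustrative. `rotateRight` follows the routine shown in the slides, and `rotateLeft` is its mirror image.

```cpp
#include <algorithm>

struct AVLNode {
    int key;
    int height;
    AVLNode *left;
    AVLNode *right;
};

int height(const AVLNode *t) { return t ? t->height : -1; }

void updateHeight(AVLNode *t) {
    t->height = std::max(height(t->left), height(t->right)) + 1;
}

// Left child moves up; curr becomes its right child.
void rotateRight(AVLNode *&curr) {
    AVLNode *kid = curr->left;
    curr->left = kid->right;
    kid->right = curr;
    updateHeight(curr);   // curr is now lower, so fix it first
    updateHeight(kid);
    curr = kid;           // rewires the caller's link
}

// Mirror image: right child moves up; curr becomes its left child.
void rotateLeft(AVLNode *&curr) {
    AVLNode *kid = curr->right;
    curr->right = kid->left;
    kid->left = curr;
    updateHeight(curr);
    updateHeight(kid);
    curr = kid;
}

// Inside case at j: rotate left at k (j's left child), then right at j.
void doubleRotateLeftRight(AVLNode *&j) {
    rotateLeft(j->left);
    rotateRight(j);
}
```

On the smallest instance (j with left child k, and k with right child i), the call leaves i at the top with k on its left and j on its right, matching the "push i up" picture.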

Implementation: each node holds balance (1, 0, -1), key, left, and right. There is no need to keep the height; the difference in height, i.e. the balance factor, is enough. It has to be updated along the path of insertion even if you don't perform rotations. Once you have performed a rotation (single or double) after an insertion, you won't need to go back up the tree.

Insertion in AVL Trees: Insert at the leaf (as for all BSTs). Only nodes on the path from the insertion point to the root have possibly changed in height. So after the insert, go back up to the root node by node, updating heights. If a new balance factor (the difference hleft - hright) is 2 or -2, adjust the tree by rotation around the node.

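Example of Insertions in an AVL Tree: Start with root 20, whose children are 10 and 30; 30 has children 25 and 35. Insert 5, 40.

Example of Insertions in an AVL Tree: What is the balance factor? height(left subtree) - height(right subtree). After the inserts, 5 is the left child of 10 and 40 is the right child of 35; the root now has height 3 and the tree is still AVL. Now insert 45.

Single rotation: 45 becomes the right child of 40, which creates an imbalance at 35 (35, 40, 45 form a chain to the right). A single left rotation makes 40 the parent of 35 and 45. Now insert 34.

Double rotation (inside case): 34 becomes the left child of 35, which creates an imbalance at 30 (its left subtree, 25, has height 0; its right subtree, rooted at 40, has height 2). After the double rotation, 35 is the right child of 20, with left child 30 (children 25 and 34) and right child 40 (right child 45).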
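The worked example above can be replayed with a compact AVL insert. This is a sketch, not the textbook's code: it stores heights in the nodes, ignores duplicate keys, and never frees memory; `buildExample` and the other helper names are illustrative.

```cpp
#include <algorithm>

struct AVLNode {
    int key;
    int height;
    AVLNode *left;
    AVLNode *right;
    explicit AVLNode(int k) : key(k), height(0), left(nullptr), right(nullptr) {}
};

int height(const AVLNode *t) { return t ? t->height : -1; }

void updateHeight(AVLNode *t) {
    t->height = std::max(height(t->left), height(t->right)) + 1;
}

void rotateRight(AVLNode *&curr) {
    AVLNode *kid = curr->left;
    curr->left = kid->right;
    kid->right = curr;
    updateHeight(curr);
    updateHeight(kid);
    curr = kid;
}

void rotateLeft(AVLNode *&curr) {
    AVLNode *kid = curr->right;
    curr->right = kid->left;
    kid->left = curr;
    updateHeight(curr);
    updateHeight(kid);
    curr = kid;
}

// Insert as in a BST, then fix heights and rebalance on the way back up.
void insert(int key, AVLNode *&t) {
    if (t == nullptr) { t = new AVLNode(key); return; }
    if (key < t->key) insert(key, t->left);
    else if (key > t->key) insert(key, t->right);
    else return;                                   // duplicate: ignored

    updateHeight(t);
    int bf = height(t->left) - height(t->right);
    if (bf == 2) {                                 // too heavy to the left
        if (height(t->left->left) < height(t->left->right))
            rotateLeft(t->left);                   // inside case: double rotation
        rotateRight(t);
    } else if (bf == -2) {                         // too heavy to the right
        if (height(t->right->right) < height(t->right->left))
            rotateRight(t->right);                 // inside case: double rotation
        rotateLeft(t);
    }
}

// Builds the starting tree from the slides, then inserts 5, 40, 45, 34.
AVLNode *buildExample() {
    AVLNode *root = nullptr;
    const int keys[] = {20, 10, 30, 25, 35, 5, 40, 45, 34};
    for (int k : keys) insert(k, root);
    return root;
}
```

Inserting 45 triggers the single rotation at 35, and inserting 34 triggers the double rotation at 30, leaving 35 as the right child of 20.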

What about the code?

Single right rotation (compare to RotateWithLeftChild, page 156 of the text):

    void rotateRight(AVLNode *& curr)
    {
        assert(curr != NULL);
        AVLNode * kid = curr->left;      // the left child moves up
        assert(kid != NULL);
        curr->left = kid->right;         // kid's right subtree changes parent
        kid->right = curr;               // curr becomes kid's right child
        // curr is now below kid, so recompute its height first.
        curr->height = max(height(curr->left), height(curr->right)) + 1;
        kid->height = max(height(kid->left), height(kid->right)) + 1;
        curr = kid;                      // update the caller's pointer
    }

AVL Tree Deletion: Similar to insertion but more complex. Rotations and double rotations are needed to rebalance. Imbalance may propagate upward, so many rotations may be needed. Idea: replace the deleted key with its inorder successor, then delete the cell that contained the inorder successor.

Why did we need to pass the root by *& ? If you want the value of an int to change, you pass it by reference. If you want the value of a POINTER to change, you pass the pointer by reference. You can pass the address of an int to a routine and change the value of the int. But passing an address DOES NOT allow you to change the address itself, unless you pass it by reference.

Pass by value

    void doit(int x) { x = 10; cout << "doit x " << x; }
    int main() { int x; x = 5; doit(x); cout << "main x " << x; }

doit's x (at 0x200) is a copy that changes from 5 to 10; main's x (at 0x100) stays 5.

Pass by address

    void doit(int *x) { *x = 10; cout << "doit x " << *x; }
    int main() { int x; x = 5; doit(&x); cout << "main x " << x; }

doit's x (at 0x200) holds the address 0x100; main's x (at 0x100) changes from 5 to 10.

Pass by reference

    void doit(int &x) { x = 10; cout << "doit x " << x; }
    int main() { int x; x = 5; doit(x); cout << "main x " << x; }

doit's x refers to 0x100; main's x (at 0x100) changes from 5 to 10. Similar to pass by address, but the compiler does all the work for you.

Pass by value with pointers

    void doit(int *x) { x = new int(); *x = 27; cout << "doit x " << x << *x; }
    int main() {
        int * x = NULL;
        doit(x);
        cout << "main x " << x;   // still NULL, so *x must not be dereferenced here
    }

doit's x (at 0x200) changes from NULL to 0x300, where 27 is stored; main's x (at 0x100) is still NULL.

Pass by address with pointers

    void doit(int **x) { *x = new int(); **x = 27; cout << "doit x " << *x << **x; }
    int main() { int * x = NULL; doit(&x); cout << "main x " << x << *x; }

doit's x (at 0x200) holds 0x100; main's x (at 0x100) changes from NULL to 0x300, where 27 is stored.

Pass by pointer reference

    void doit(int * &x) { x = new int(); *x = 27; cout << "doit x " << x << *x; }
    int main() { int * x = NULL; doit(x); cout << "main x " << x << *x; }

doit's x refers to 0x100; main's x (at 0x100) changes from NULL to 0x300, where 27 is stored.
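The three pointer-passing variants can be compared side by side. In this sketch the function names `byValue`, `byAddress`, and `byReference` are made up for illustration; each one tries to make the caller's pointer point at a new int holding 27.

```cpp
// Receives a COPY of the caller's pointer: the caller never sees the new address.
void byValue(int *x) {
    x = new int(27);
    delete x;          // nobody else can reach this int, so free it here
}

// Receives the ADDRESS of the caller's pointer: *x is the caller's pointer.
void byAddress(int **x) {
    *x = new int(27);
}

// Receives a REFERENCE to the caller's pointer: x is the caller's pointer.
void byReference(int *&x) {
    x = new int(27);
}
```

After `byValue(p)` a null `p` is still null, while `byAddress(&p)` and `byReference(p)` both leave `p` pointing at 27. The last form is the one `rotateRight` relies on.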

Notation trick: I think of the & as being TWO different symbols, depending on where it is. As a right-hand value (rvalue), &x means take the address of x. As a parameter, &x means x is a reference.

Notation trick: int **x is read backwards: x is a pointer to a pointer to an int. I can regroup (mentally): int * (*x) then reads, *x is a pointer to an int; int (**x) then reads, **x is an int.

Assigning pointers simply copies contents (addresses).

    int * t = new int();
    *t = 15;
    int * s = t;
    int ** addr = &t;
    *addr = new int();

In the diagram, t lives at 0x200 and first holds 0x400, where the value 15 is stored; s (at 0x300) also holds 0x400; addr (at 0x700) holds 0x200, the address of t. After *addr = new int(), t holds 0x500 and s still holds 0x400.

Assigning pointers simply copies contents (addresses).

    int * t = new int();
    *t = 15;
    int * s = t;
    *t = 12;             // What happens?
    int ** addr = &t;
    *addr = new int();
    *t = 25;             // What happens?
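The two "What happens?" questions can be answered by running the statements and recording what s and t see. The `Trace` struct and `traceAssignments` function below are illustrative helpers, not part of the slides.

```cpp
// Values observed while replaying the slide's statements.
struct Trace {
    int sAfterFirstWrite;   // *s right after *t = 12
    int sAtEnd;             // *s after t has been redirected
    int tAtEnd;             // *t after *t = 25
    bool sameTarget;        // do s and t still hold the same address?
};

Trace traceAssignments() {
    int *t = new int();
    *t = 15;
    int *s = t;             // s and t now hold the same address
    *t = 12;                // one int, two names: s sees the change too
    int first = *s;

    int **addr = &t;        // addr holds the address of the variable t
    *addr = new int();      // t now points at a fresh int; s is untouched
    *t = 25;                // writes the fresh int only

    Trace result = { first, *s, *t, s == t };
    delete s;
    delete t;
    return result;
}
```

The first write is visible through both pointers because they share one int. After `*addr = new int()`, t and s point to different ints, so `*t = 25` leaves `*s` at 12.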

So what about rotateRight? In the example, the root (25) has left child 15 at address 0x200 and right child 50; 15 has left child 10 at address 0x320; 10 has left child 5 at address 0x400. The root's left pointer is itself stored at address 0x100 and holds 0x200. When the calling routine passes in root->left, the compiler records the ADDRESS of the root’s left pointer (which is 0x100 in our case), so curr is another name for that pointer, and kid is set to 0x320.

So what about rotateRight? After the rotation, node 10's right pointer holds 0x200 (node 15), and the final assignment curr = kid stores 0x320 at address 0x100. Because curr names the root’s left pointer, the root now points at node 10.

It is “almost” like magic, but there are a few gotchas that you need to understand or you will forever be complaining about the broken magic wand. WARNING: If the calling routine passes in a COPY of the address (say temp, which points to 0x200), only the COPY is changed. In our case, only the temporary pointer to node 15 is changed; the root's left pointer is not. Similarly, if we change the “kid” pointer, we only change it (not t->left).
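The warning above can be reproduced on the tree from the rotateRight slides (25 with children 15 and 50, 15 with left child 10, 10 with left child 5). This is a sketch: the constructor, `buildTree`, and the two `leftKeyAfter...` functions are illustrative names, and nodes are never freed.

```cpp
#include <algorithm>

struct AVLNode {
    int key;
    int height;
    AVLNode *left;
    AVLNode *right;
    AVLNode(int k, int h, AVLNode *l = nullptr, AVLNode *r = nullptr)
        : key(k), height(h), left(l), right(r) {}
};

int height(const AVLNode *t) { return t ? t->height : -1; }

void rotateRight(AVLNode *&curr) {
    AVLNode *kid = curr->left;
    curr->left = kid->right;
    kid->right = curr;
    curr->height = std::max(height(curr->left), height(curr->right)) + 1;
    kid->height = std::max(height(kid->left), height(kid->right)) + 1;
    curr = kid;
}

// 25 -> (15, 50), 15 -> left 10, 10 -> left 5
AVLNode *buildTree() {
    AVLNode *n5 = new AVLNode(5, 0);
    AVLNode *n10 = new AVLNode(10, 1, n5);
    AVLNode *n15 = new AVLNode(15, 2, n10);
    AVLNode *n50 = new AVLNode(50, 0);
    return new AVLNode(25, 3, n15, n50);
}

// Rotating through a COPY of the link: the root never learns about it.
int leftKeyAfterRotatingCopy() {
    AVLNode *root = buildTree();
    AVLNode *temp = root->left;   // copy of the address of node 15
    rotateRight(temp);            // temp now points at 10 ...
    return root->left->key;       // ... but root->left still points at 15
}

// Rotating through the link itself: root->left is rewired to node 10.
int leftKeyAfterRotatingLink() {
    AVLNode *root = buildTree();
    rotateRight(root->left);
    return root->left->key;
}
```

In the first function the subtree is rearranged but the root still points at node 15, so nodes 10 and 5 are no longer reachable from the root. Passing `root->left` itself avoids the problem.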

Pros and Cons of AVL Trees. Arguments for AVL trees: Search is O(log n) since AVL trees are always balanced. Insertions and deletions are also O(log n). The height balancing adds no more than a constant factor to the speed of insertion. Arguments against using AVL trees: They are difficult to program and debug, and need more space for the balance factor. They are asymptotically faster, but rebalancing costs time. Most large searches are done in database systems on disk and use other structures (e.g. B+-trees). It may be OK to have O(n) for a single operation if the total run time for many consecutive operations is fast (e.g. splay trees).