ADSA: Balanced Trees/12
Advanced Data Structures and Algorithms, Semester 2
Balanced Search Trees

Objectives
- discuss various kinds of balanced search trees: AVL trees, 2-3-4 trees, Red-Black trees

Contents
1. What is a Balanced Binary Search Tree?
2. AVL Trees
3. 2-3-4 Trees
4. Red-Black Trees

1. What is a Balanced Binary Search Tree?
A balanced search tree is one where all the branches from the root have almost the same height.
[Figure: a balanced tree beside an unbalanced tree]

As a tree becomes more unbalanced, search running time degrades from O(log n) to O(n)
- because the tree shape turns into a list
We want to keep the binary search tree balanced as nodes are added/removed, so searching/insertion remain fast.
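
The degradation can be seen with a plain, unbalanced BST. The sketch below is illustrative only: the class BstHeightDemo and its methods are hypothetical names, not part of the course library. It inserts the same seven values in two different orders and compares the resulting heights.

```java
// A minimal sketch of an ordinary (unbalanced) BST, used only to compare heights.
class BstHeightDemo {
    static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Standard BST insertion with no rebalancing; duplicates are ignored.
    static Node insert(Node t, int value) {
        if (t == null) return new Node(value);
        if (value < t.value) t.left = insert(t.left, value);
        else if (value > t.value) t.right = insert(t.right, value);
        return t;
    }

    // Height counted in edges; an empty tree has height -1.
    static int height(Node t) {
        return (t == null) ? -1 : 1 + Math.max(height(t.left), height(t.right));
    }

    static Node build(int[] values) {
        Node root = null;
        for (int v : values) root = insert(root, v);
        return root;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7};
        int[] mixed  = {4, 2, 6, 1, 3, 5, 7};
        System.out.println(height(build(sorted)));  // prints 6: the tree is really a list
        System.out.println(height(build(mixed)));   // prints 2: the tree is balanced
    }
}
```

Sorted input is the worst case here: every new value goes to the right of the previous one, so a search may have to visit all n nodes.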

Balanced BSTs: AVL Trees
An AVL tree maintains height balance
- for each node, the difference in height of its two subtrees is in the range -1 to 1

2-3-4 Trees
- A multiway tree where each node has at most 4 children, and a node can hold up to 3 values.
- A 2-3-4 tree can be perfectly balanced
  - no difference in height between branches
  - requires complex nodes and links

1.3. Red-Black Trees
- A red-black tree is a binary version of a 2-3-4 tree
  - the nodes have a 'color' attribute: BLACK or RED (drawn in Ford and Topp, and here, in white and gray!!)
  - the tree maintains a balance measure called the BLACK height

B-Trees
A multiway tree where each node has at most m children, and a node can hold up to m-1 values
- a more general version of a 2-3-4 tree
B-Trees are most commonly used in databases and filesystems
- most nodes are stored in secondary storage such as hard drives

2. AVL Trees
For each AVL tree node, the difference between the heights of its left and right subtrees is either -1, 0 or +1
- this is called the balance factor of a node
    balanceFactor = height(left subtree) - height(right subtree)     (L - R)
- if balanceFactor > 1 or < -1 then the tree is too unbalanced, and needs 'rearranging' to make it more balanced

Heaviness
- if the balanceFactor is positive, then the node is "heavy on the left"
  - the height of the left subtree is greater than the height of the right subtree
- a negative balanceFactor means the node is "heavy on the right"

[Figure: three example trees]
- L - R = 0-1: root is heavy on the right, but still balanced
- L - R = 2-1: root is heavy on the left, but still balanced
- L - R = 1-2: root is heavy on the right, but still balanced
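
A small sketch of the balance factor calculation. BalanceFactorDemo and describe() are hypothetical names chosen for illustration, and heights are recomputed recursively rather than stored in the node.

```java
// Illustrative only: computes L - R for a node and names its "heaviness".
class BalanceFactorDemo {
    static class Node {
        int value;
        Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value; this.left = left; this.right = right;
        }
    }

    // Height in edges: an empty subtree is -1, a single node is 0.
    static int height(Node t) {
        return (t == null) ? -1 : 1 + Math.max(height(t.left), height(t.right));
    }

    static int balanceFactor(Node t) {
        return height(t.left) - height(t.right);
    }

    static String describe(Node t) {
        int bf = balanceFactor(t);
        if (bf > 1 || bf < -1) return "too unbalanced";
        if (bf > 0) return "heavy on the left";
        if (bf < 0) return "heavy on the right";
        return "evenly balanced";
    }

    public static void main(String[] args) {
        Node leftHeavy = new Node(20, new Node(10, null, null), null);
        Node chain = new Node(30, new Node(20, new Node(10, null, null), null), null);
        System.out.println(balanceFactor(leftHeavy) + " " + describe(leftHeavy)); // 1 heavy on the left
        System.out.println(balanceFactor(chain) + " " + describe(chain));         // 2 too unbalanced
    }
}
```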

The AVLTree Class

Using AVLTree
    String[] stateList = {"NV", "NY", "MA", "CA", "GA"};
    AVLTree<String> avltreeA = new AVLTree<String>();
    for (int i = 0; i < stateList.length; i++)
        avltreeA.add(stateList[i]);
    System.out.println("States: " + avltreeA);

    int[] arr = {50, 95, 60, 90, 70, 80, 75, 78};
    AVLTree<Integer> avltreeB = new AVLTree<Integer>();
    for (int i = 0; i < arr.length; i++)
        avltreeB.add(arr[i]);
    // display the tree
    System.out.println(avltreeB.displayTree(2));
    avltreeB.drawTree(2);

Execution
    States: [CA, GA, MA, NV, NY]
    70(-1)
    60(1) 90(1)
    50(0) 78(0) 95(0)
    75(0) 80(0)
[Figure: the drawn tree; the root has balance factor -1, so it is heavy on the right, but still balanced]

The AVLTree Node
An AVLNode contains the node's value, references to the node's two subtrees, and the node height.
    height(node) = max( height(node.left), height(node.right) ) + 1;
[Figure: an AVLNode object with fields nodeValue, height, left, right]

    private static class AVLNode<T>
    {
        public T nodeValue;               // node data
        public int height;                // height of the subtree rooted here
        public AVLNode<T> left, right;    // child links

        public AVLNode(T item)
        { nodeValue = item; height = 0; left = null; right = null; }
    }
Use of public is bad; the coding style is due to Ford & Topp

2.2. Adding a Node to the Tree
The addition of a node may cause the tree to go out of balance.
- addNode() adds a node and may reorder nodes as it returns from the adding back to the root
  - reordering is the new idea in AVL trees
- the reordering is done using single and double rotations

Too Heavy on the Left (2)
[Figure: node P with L - R = 3-1. Two cases: the new node is under the outside grandchild or under the inside grandchild; whether it is on the grandchild's left or right branch doesn't matter]

Inserting a node in the left subtree of P (e.g. adding 11 or 22) may cause P to become "too heavy on the left"
- balance factor == 2
The new node can either be in the outside or inside grandchild subtree:
- outside grandchild = left-left
- inside grandchild = left-right

Too Heavy on the Right (-2)
[Figure: node P with L - R = 1-3. Two cases: the new node is under the outside grandchild or under the inside grandchild; the branch doesn't matter]

Inserting a node in the right subtree of P (e.g. adding 29 or 40) may cause P to become "too heavy on the right"
- balance factor == -2
The new node can either be in the outside or inside grandchild subtree:
- outside grandchild = right-right
- inside grandchild = right-left

Single Rotations
When a new item is added to the subtree for an outside grandchild, the imbalance is fixed with a single right or left rotation
Two cases:
- left outside grandchild (left-left) --> single right rotation
- right outside grandchild (right-right) --> single left rotation

Single Right Rotation
A single right rotation occurs when a new element is added to the subtree of the left outside grandchild (left-left)
[Slide 26 figure: left outside grandchild (left-left); the links that are added and cut are numbered 1 to 4]

A single right rotation rotates the left child (LC) to replace the parent (P)
- the parent becomes the new right child
The right subtree of LC (RGC) is attached as a left child of P
- ok since the nodes in RGC are greater than LC but less than P

singleRotateRight()
    // single right rotation on p
    private static <T> AVLNode<T> singleRotateRight(AVLNode<T> p)
    {
        AVLNode<T> lc = p.left;
        p.left = lc.right;     // 1 & 4 on slide 26
        lc.right = p;          // 2 & 3
        // p is now below lc, so p's height must be recalculated first
        p.height = max(height(p.left), height(p.right)) + 1;
        lc.height = max(height(lc.left), height(lc.right)) + 1;
        return lc;
    }

    // an empty subtree has height -1
    private static <T> int height(AVLNode<T> t)
    {
        if (t == null)
            return -1;
        else
            return t.height;
    }

Single Left Rotation
A single left rotation occurs when a new element is added to the subtree of the right outside grandchild (right-right).
The rotation exchanges the parent (P) and right child (RC) nodes, and attaches the subtree LGC as the right subtree of P.
[Slide 31 figure: right outside grandchild (right-right); the links that are added and cut are numbered 1 to 4]

singleRotateLeft()
    // single left rotation on p
    private static <T> AVLNode<T> singleRotateLeft(AVLNode<T> p)
    {
        AVLNode<T> rc = p.right;
        p.right = rc.left;     // 1 & 4 on slide 31
        rc.left = p;           // 2 & 3
        p.height = max(height(p.left), height(p.right)) + 1;
        rc.height = max(height(rc.left), height(rc.right)) + 1;
        return rc;
    }

Double Rotations
When a new item is added to the subtree for an inside grandchild, the imbalance is fixed with a double right or left rotation
- a double rotation is two single rotations
Two cases:
- left inside grandchild (left-right) --> double right rotation
- right inside grandchild (right-left) --> double left rotation

A Double Right Rotation
[Figure: left inside grandchild (left-right)]
- single left rotation about LC
- single right rotation about P
- the tree is now balanced
Watch RGC rise to the top

doubleRotateRight()
    // double right rotation on p is left rotation, then right rotation
    private static <T> AVLNode<T> doubleRotateRight(AVLNode<T> p)
    {
        p.left = singleRotateLeft(p.left);
        return singleRotateRight(p);
    }

A Double Left Rotation
[Figure: right inside grandchild (right-left), showing nodes P, LC, RC and subtrees LGC, RGC, A, B]
- single right rotation about RC
- single left rotation about P
- the tree is now balanced
Watch LGC rise to the top

doubleRotateLeft()
    // double left rotation on p is right rotation, then left rotation
    private static <T> AVLNode<T> doubleRotateLeft(AVLNode<T> p)
    {
        p.right = singleRotateRight(p.right);
        return singleRotateLeft(p);
    }
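
Why the inside grandchild case needs two rotations can be checked on the smallest left-right example: 30 with left child 10, which has right child 20. The sketch below uses hypothetical names (DoubleRotationDemo, leftRightChain) and recomputes heights recursively instead of storing them, so it is a simplification of the slide code rather than a copy of it.

```java
// Illustrative only: compares a single rotation with a double rotation
// on a left-right (inside grandchild) imbalance.
class DoubleRotationDemo {
    static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Height in edges; an empty subtree is -1.
    static int height(Node t) {
        return (t == null) ? -1 : 1 + Math.max(height(t.left), height(t.right));
    }

    static Node rotateRight(Node p) {
        Node lc = p.left;
        p.left = lc.right;
        lc.right = p;
        return lc;
    }

    static Node rotateLeft(Node p) {
        Node rc = p.right;
        p.right = rc.left;
        rc.left = p;
        return rc;
    }

    // Builds 30 -> left 10 -> right 20, a left-right imbalance at 30.
    static Node leftRightChain() {
        Node p = new Node(30);
        p.left = new Node(10);
        p.left.right = new Node(20);
        return p;
    }

    public static void main(String[] args) {
        // A single right rotation only moves the problem to the other side.
        Node single = rotateRight(leftRightChain());
        System.out.println(single.value + " height " + height(single)); // 10 height 2

        // Left rotation about the left child, then right rotation about the parent.
        Node p = leftRightChain();
        p.left = rotateLeft(p.left);
        Node dbl = rotateRight(p);
        System.out.println(dbl.value + " height " + height(dbl));       // 20 height 1
    }
}
```

After the double rotation the inside grandchild (20) is the new root of the subtree, with 10 and 30 as its children.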

addNode()
addNode() recurses down to the insertion point and inserts the node.
As it returns, it visits the nodes in reverse order, fixing any imbalances using rotations.
It must handle four cases:
- balance factor == 2: left-left, left-right
- balance factor == -2: right-left, right-right

Basic addNode()
No AVL rotation code added yet; the parameter t was P in earlier slides.
    private Node<T> addNode(Node<T> t, T item)
    {
        if (t == null)     // found insertion point
            t = new Node<T>(item);
        else if (((Comparable<T>)item).compareTo(t.nodeValue) < 0)
            t.left = addNode(t.left, item);      // visit left subtree
        else if (((Comparable<T>)item).compareTo(t.nodeValue) > 0)
            t.right = addNode(t.right, item);    // visit right subtree
        else
            throw new IllegalStateException();   // duplicate error
        return t;
    }  // end of addNode()

addNode() with AVL rotation code added
    private AVLNode<T> addNode(AVLNode<T> t, T item)
    {
        if (t == null)     // found insertion point
            t = new AVLNode<T>(item);
        else if (((Comparable<T>)item).compareTo(t.nodeValue) < 0) {
            // visit left subtree: add node then maybe rotate
            t.left = addNode(t.left, item);      // add node, then...
            if (height(t.left) - height(t.right) == 2) {    // too heavy on left
                if (((Comparable<T>)item).compareTo(t.left.nodeValue) < 0)
                    t = singleRotateRight(t);    // problem on left-left
                else
                    t = doubleRotateRight(t);    // problem on left-right: left then right rotation
            }
        }
        else if (((Comparable<T>)item).compareTo(t.nodeValue) > 0) {
            // visit right subtree: add node then maybe rotate
            t.right = addNode(t.right, item);    // add node, then...
            if (height(t.left) - height(t.right) == -2) {   // too heavy on right
                if (((Comparable<T>)item).compareTo(t.right.nodeValue) > 0)
                    t = singleRotateLeft(t);     // problem on right-right
                else
                    t = doubleRotateLeft(t);     // problem on right-left: right then left rotation
            }
        }
        else   // duplicate; throw IllegalStateException
            throw new IllegalStateException();

        // calculate new height of t
        t.height = max(height(t.left), height(t.right)) + 1;
        return t;
    }  // end of addNode()

add()
The public interface for inserting an item.
    public boolean add(T item)
    {
        try {
            root = addNode(root, item);   // start from root
        }
        catch (IllegalStateException e)
        { return false; }                 // item is a duplicate

        // increment the tree size and modCount
        treeSize++;
        modCount++;
        return true;                      // node was added ok
    }
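
The insertion logic can be tried out without the Ford and Topp library. The sketch below is a simplified stand-in, not the AVLTree class itself: AvlSketch, fixHeight() and levelOrder() are hypothetical names, values are plain ints rather than a generic T, and the double rotations are written inline as two single rotations. It inserts the integer sequence from the "Using AVLTree" slide and prints each node with its balance factor, level by level.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative only: a compact AVL insertion for int values.
class AvlSketch {
    static class Node {
        int value, height;              // a new leaf has height 0
        Node left, right;
        Node(int value) { this.value = value; }
    }

    static int height(Node t) { return (t == null) ? -1 : t.height; }

    static void fixHeight(Node t) {
        t.height = Math.max(height(t.left), height(t.right)) + 1;
    }

    static Node rotateRight(Node p) {   // repairs a left-left imbalance
        Node lc = p.left;
        p.left = lc.right;
        lc.right = p;
        fixHeight(p);                   // p is now the lower node, so fix it first
        fixHeight(lc);
        return lc;
    }

    static Node rotateLeft(Node p) {    // repairs a right-right imbalance
        Node rc = p.right;
        p.right = rc.left;
        rc.left = p;
        fixHeight(p);
        fixHeight(rc);
        return rc;
    }

    static Node add(Node t, int item) {
        if (t == null) return new Node(item);
        if (item < t.value) {
            t.left = add(t.left, item);
            if (height(t.left) - height(t.right) == 2) {
                if (item > t.left.value)            // left-right: rotate the child first
                    t.left = rotateLeft(t.left);
                t = rotateRight(t);
            }
        } else if (item > t.value) {
            t.right = add(t.right, item);
            if (height(t.left) - height(t.right) == -2) {
                if (item < t.right.value)           // right-left: rotate the child first
                    t.right = rotateRight(t.right);
                t = rotateLeft(t);
            }
        } else {
            throw new IllegalStateException("duplicate: " + item);
        }
        fixHeight(t);
        return t;
    }

    static Node build(int[] values) {
        Node root = null;
        for (int v : values) root = add(root, v);
        return root;
    }

    // Breadth-first listing of value(balanceFactor) pairs.
    static String levelOrder(Node root) {
        StringBuilder sb = new StringBuilder();
        Queue<Node> queue = new ArrayDeque<>();
        if (root != null) queue.add(root);
        while (!queue.isEmpty()) {
            Node n = queue.remove();
            if (sb.length() > 0) sb.append(' ');
            sb.append(n.value).append('(')
              .append(height(n.left) - height(n.right)).append(')');
            if (n.left != null) queue.add(n.left);
            if (n.right != null) queue.add(n.right);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] arr = {50, 95, 60, 90, 70, 80, 75, 78};
        System.out.println(levelOrder(build(arr)));
        // prints: 70(-1) 60(1) 90(1) 50(0) 78(0) 95(0) 75(0) 80(0)
    }
}
```

The printed nodes and balance factors agree with the Execution slide for the same input.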

Building an AVL Tree
[Figures: an AVL tree built step by step; in each step the gray node is too heavy]
- left outside grandchild (left-left)
- right outside grandchild (right-right)
- right inside grandchild (right-left): double rotate left (right then left rotation)
- left inside grandchild (left-right): double rotate right (left then right rotation)

Efficiency of AVL Tree Insertion
Detailed analysis shows:
    int(log2 n) <= height < 1.4405 log2(n+2) - 1.3277
So the worst case running time for insertion is O(log2 n).
The worst case for deletion is also O(log2 n).
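
One way to see why the height stays logarithmic: the sparsest AVL tree of height h consists of a root, a sparsest subtree of height h-1, and a sparsest subtree of height h-2. The sketch below (AvlMinNodes, minNodes and maxHeight are hypothetical names) evaluates that recurrence, with heights counted in edges so that a single node has height 0.

```java
// Illustrative only: the fewest nodes an AVL tree of a given height can have,
// and therefore the tallest AVL tree that n nodes can form.
class AvlMinNodes {
    // minNodes(h) = minNodes(h-1) + minNodes(h-2) + 1,
    // with minNodes(-1) = 0 (empty tree) and minNodes(0) = 1 (single node).
    static long minNodes(int height) {
        if (height < 0) return 0;
        long a = 0;   // minNodes(-1)
        long b = 1;   // minNodes(0)
        for (int h = 1; h <= height; h++) {
            long c = a + b + 1;
            a = b;
            b = c;
        }
        return b;
    }

    // Largest height whose sparsest tree still fits into n nodes.
    static int maxHeight(long n) {
        int h = -1;
        while (minNodes(h + 1) <= n) h++;
        return h;
    }

    public static void main(String[] args) {
        System.out.println(minNodes(4));      // prints 12
        System.out.println(maxHeight(1000));  // prints 13
        for (int h = 0; h <= 20; h += 5) {
            long n = minNodes(h);
            double log2n = Math.log(n) / Math.log(2);
            System.out.println("height " + h + ": at least " + n
                    + " nodes, log2(n) = " + String.format("%.2f", log2n));
        }
    }
}
```

Because the minimum node count grows exponentially with the height, the height can only grow logarithmically with the number of nodes.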

Deletion in an AVL Tree
Deletion can easily cause an imbalance
- e.g. delete 32
[Figure: the tree before and after the deletion of 32]

3. 2-3-4 Trees
In a 2-3-4 tree:
- a 2-node has 1 value and a max of 2 children (same as a binary tree node)
- a 3-node has 2 values and a max of 3 children
- a 4-node has 3 values and a max of 4 children
The numbers refer to the maximum number of branches that can leave the node.

Searching a 2-3-4 Tree
To find an item:
- start at the root and compare the item with all the values in the node;
- if there's no match, move down to the appropriate subtree;
- repeat until you find a match or reach an empty subtree

Search Example
[Figure: a 2-3-4 tree] Try finding 9 and 30
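
The search steps can be sketched as follows. The node layout (a sorted array of values plus an array of children) and the sample tree are assumptions made for this example; they are not the tree from the slide figure, and Tree234Search is a hypothetical class name.

```java
// Illustrative only: searching a hand-built 2-3-4 tree.
class Tree234Search {
    static class Node {
        int[] values;      // 1 to 3 values in ascending order
        Node[] children;   // null for a leaf, otherwise values.length + 1 subtrees
        Node(int[] values, Node[] children) {
            this.values = values;
            this.children = children;
        }
    }

    static boolean contains(Node t, int item) {
        while (t != null) {
            // Find the first value that is not smaller than the item.
            int i = 0;
            while (i < t.values.length && item > t.values[i]) i++;
            if (i < t.values.length && item == t.values[i]) return true;  // match
            // No match: child i holds the values between values[i-1] and values[i].
            t = (t.children == null) ? null : t.children[i];
        }
        return false;   // reached an empty subtree
    }

    // A small sample tree: root 3-node over a 3-node, a 2-node and a 4-node.
    static Node sampleTree() {
        Node left = new Node(new int[]{3, 7}, null);
        Node middle = new Node(new int[]{15}, null);
        Node right = new Node(new int[]{25, 30, 35}, null);
        return new Node(new int[]{10, 20}, new Node[]{left, middle, right});
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(contains(root, 30));  // prints true
        System.out.println(contains(root, 9));   // prints false
    }
}
```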

Inserting into a 2-3-4 Tree
Search to the bottom for an insertion node
- 2-node at bottom: convert to 3-node
- 3-node at bottom: convert to 4-node
- 4-node at bottom: ??

Splitting 4-nodes
Transform tree on the way down:
- ensures last node is not a 4-node
- local transformation to split a 4-node
Insertion at the bottom is now easy since it's not a 4-node

Example
To split a 4-node, move the middle value up.
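
A sketch of top-down insertion with 4-node splitting. Tree234, splitChild() and show() are hypothetical names, the node layout (two ArrayLists) is an assumption chosen for brevity, and the inserted values are arbitrary. Any 4-node met on the way down is split before the search moves into it, so the leaf that finally receives the item always has room.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a 2-3-4 tree of ints with top-down splitting.
class Tree234 {
    static class Node {
        List<Integer> values = new ArrayList<>();   // 1 to 3 sorted values
        List<Node> children = new ArrayList<>();    // empty for a leaf
        boolean isLeaf() { return children.isEmpty(); }
        boolean isFull() { return values.size() == 3; }   // a 4-node
    }

    Node root = new Node();

    // Splits the 4-node parent.children[i]: the middle value moves up into
    // the parent, the largest value moves into a new right sibling.
    static void splitChild(Node parent, int i) {
        Node full = parent.children.get(i);
        Node right = new Node();
        right.values.add(full.values.get(2));
        if (!full.isLeaf()) {
            right.children.add(full.children.get(2));
            right.children.add(full.children.get(3));
            full.children.subList(2, 4).clear();
        }
        int middle = full.values.get(1);
        full.values.subList(1, 3).clear();          // keep only the smallest value
        parent.values.add(i, middle);
        parent.children.add(i + 1, right);
    }

    void add(int item) {
        if (root.isFull()) {                        // split a 4-node root: tree grows upward
            Node newRoot = new Node();
            newRoot.children.add(root);
            splitChild(newRoot, 0);
            root = newRoot;
        }
        Node t = root;
        while (true) {
            int i = 0;
            while (i < t.values.size() && item > t.values.get(i)) i++;
            if (i < t.values.size() && item == t.values.get(i)) return;  // duplicate ignored
            if (t.isLeaf()) { t.values.add(i, item); return; }
            if (t.children.get(i).isFull()) {
                splitChild(t, i);                   // the middle value is now t.values[i]
                if (item == t.values.get(i)) return;
                if (item > t.values.get(i)) i++;
            }
            t = t.children.get(i);
        }
    }

    // Depth of the leaves, or -1 if the leaves are not all on the same level.
    static int leafDepth(Node t) {
        if (t.isLeaf()) return 0;
        int depth = leafDepth(t.children.get(0));
        for (Node c : t.children)
            if (depth == -1 || leafDepth(c) != depth) return -1;
        return depth + 1;
    }

    static String show(Node t) {
        StringBuilder sb = new StringBuilder(t.values.toString());
        if (!t.isLeaf()) {
            sb.append('(');
            for (int i = 0; i < t.children.size(); i++) {
                if (i > 0) sb.append(' ');
                sb.append(show(t.children.get(i)));
            }
            sb.append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Tree234 tree = new Tree234();
        for (int v : new int[]{2, 15, 12, 4, 8, 10}) tree.add(v);
        System.out.println(show(tree.root));   // prints [4, 12]([2] [8, 10] [15])
    }
}
```

Splitting never changes the depth of any leaf except when the root itself is split, which adds one level to every path at once. That is why the tree stays perfectly balanced.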

Building a 2-3-4 Tree
[Figure: insert 4] This 4-node will be split during the next insertion.

[Figure: insert 10] Insertions happen at the bottom. This 4-node will be split during the next insertion.

[Figure: insert 55] The insertion point is at level 1, so the new 4-node at level 0 is not split during this insertion.

[Figure: insert 11] This 4-node will be split during the next insertion.

Another Example
[Figure: an insertion] The search missed the 4-nodes on the left, so they are not changed.

Efficiency of 2-3-4 Trees
Searching for an item in a 2-3-4 tree with n elements:
- the max number of nodes visited during the search is int(log2 n) + 1
Inserting an element into a 2-3-4 tree:
- requires splitting no more than int(log2 n) nodes
- normally requires far fewer splits
So both operations are fast.

Drawbacks of 2-3-4 Trees
Since any node may become a 4-node, all nodes must have space for 3 values and 4 links
- but most nodes are not 4-nodes
- lots of wasted memory, unless the implementation is fancier
Complex nodes and links
- slower to process than binary search trees

4. Red-Black Trees
A red-black tree is a binary search tree where each node has a 'color'
- BLACK or RED
A red-black tree is a binary version of a 2-3-4 tree, using different color combinations to represent 3-nodes and 4-nodes.
- a 2-node is already a binary node

BLACK and RED are drawn in Ford and Topp (and here) in white and gray!!

From 2-3-4 Tree Nodes to Red-Black Nodes

2-node Conversion
A 2-node is already a binary node so doesn't need to change its shape.
The color of a 2-node is always BLACK (drawn as white in these slides).

4-node Conversion
A 4-node has its middle value become a BLACK (white) parent and the other values become RED (gray) children.

3-node Conversion
Represent a 3-node as:
- a BLACK parent and a smaller RED left child
or
- a BLACK parent and a larger RED right child
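
The three conversions can be sketched as a single function. NodeConversion, RBNode and convert() are hypothetical names; the function converts only the values held in one 2-3-4 node and ignores its subtrees, and for a 3-node it picks the first of the two allowed shapes (larger value as BLACK parent, smaller value as RED left child).

```java
// Illustrative only: turns the values of one 2-3-4 node into red-black nodes.
class NodeConversion {
    enum Color { RED, BLACK }

    static class RBNode {
        int value;
        Color color;
        RBNode left, right;
        RBNode(int value, Color color) { this.value = value; this.color = color; }
    }

    // values: the sorted values of a single 2-node, 3-node or 4-node.
    static RBNode convert(int[] values) {
        switch (values.length) {
            case 1: {   // 2-node: one BLACK node
                return new RBNode(values[0], Color.BLACK);
            }
            case 2: {   // 3-node: BLACK parent with a smaller RED left child
                RBNode parent = new RBNode(values[1], Color.BLACK);
                parent.left = new RBNode(values[0], Color.RED);
                return parent;
            }
            case 3: {   // 4-node: middle value is the BLACK parent of two RED children
                RBNode parent = new RBNode(values[1], Color.BLACK);
                parent.left = new RBNode(values[0], Color.RED);
                parent.right = new RBNode(values[2], Color.RED);
                return parent;
            }
            default:
                throw new IllegalArgumentException("a 2-3-4 node holds 1 to 3 values");
        }
    }

    // Text form such as 20B(10R, 30R); "-" marks a missing child.
    static String show(RBNode t) {
        if (t == null) return "-";
        String s = t.value + (t.color == Color.BLACK ? "B" : "R");
        if (t.left == null && t.right == null) return s;
        return s + "(" + show(t.left) + ", " + show(t.right) + ")";
    }

    public static void main(String[] args) {
        System.out.println(show(convert(new int[]{10})));          // prints 10B
        System.out.println(show(convert(new int[]{10, 20})));      // prints 20B(10R, -)
        System.out.println(show(convert(new int[]{10, 20, 30})));  // prints 20B(10R, 30R)
    }
}
```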

Changing a 2-3-4 Tree into a Red-Black Tree
[Figures: a 2-3-4 tree converted in three steps; each step marks the node or nodes to change next]

Three Properties of a Red-Black Tree
These must always be true for the tree to be red-black.
1. The root must always be BLACK (white in our pictures)
2. A RED parent never has a RED child
   - in other words: there are never two successive RED nodes in a path
3. Every path from the root to an empty subtree contains the same number of BLACK nodes
   - called the black height
We can use black height to measure the balance of a red-black tree.

Check the Example Properties
[Figure: the example red-black tree]
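
The three properties can be checked mechanically. The sketch below uses hypothetical names (RedBlackCheck, blackHeight, isRedBlack) and a hand-built example tree; it tests only the color properties and does not verify the binary search tree ordering.

```java
// Illustrative only: checks the three red-black properties of a hand-built tree.
class RedBlackCheck {
    static final boolean RED = true, BLACK = false;

    static class Node {
        int value;
        boolean color;
        Node left, right;
        Node(int value, boolean color, Node left, Node right) {
            this.value = value; this.color = color; this.left = left; this.right = right;
        }
    }

    static boolean isRed(Node t) { return t != null && t.color == RED; }

    // Number of BLACK nodes on every path from t down to an empty subtree,
    // or -1 if property 2 (no RED child of a RED parent) or
    // property 3 (equal BLACK counts) is broken somewhere below t.
    static int blackHeight(Node t) {
        if (t == null) return 0;
        if (isRed(t) && (isRed(t.left) || isRed(t.right))) return -1;
        int l = blackHeight(t.left);
        int r = blackHeight(t.right);
        if (l == -1 || r == -1 || l != r) return -1;
        return l + (isRed(t) ? 0 : 1);
    }

    static boolean isRedBlack(Node root) {
        return !isRed(root)                  // property 1
            && blackHeight(root) != -1;      // properties 2 and 3
    }

    static Node sampleTree() {
        Node five = new Node(5, BLACK, null, null);
        Node fifteen = new Node(15, BLACK, null, null);
        Node ten = new Node(10, RED, five, fifteen);
        Node thirty = new Node(30, BLACK, null, null);
        return new Node(20, BLACK, ten, thirty);
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(isRedBlack(root) + " " + blackHeight(root));  // prints true 2

        root.left.left.color = RED;             // RED 10 now has a RED child 5
        System.out.println(isRedBlack(root));   // prints false
    }
}
```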

4.4. Inserting a Node
Three things to do.
1. Search down the tree to find the insertion point, splitting any 4-nodes (a BLACK parent with two RED children) by coloring the children BLACK and the parent RED
   - called a color flip
   Splitting a 4-node may involve additional rotations and color changes
   - there are 4 cases to consider (section 4.4.1)
2. Once the insertion point is found, add the new item as a RED leaf node (section 4.4.2)
   - this may create two successive RED nodes
   - again use rotation and recoloring to reorder/rebalance the tree
3. Keep the root as a BLACK node.
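
The color flip in step 1 can be sketched on its own. ColorFlipDemo, isFourNode() and colorFlip() are hypothetical names, and the rotations that may be needed afterwards are left out. The point of the example is that a flip leaves the number of BLACK nodes on every path unchanged.

```java
// Illustrative only: a color flip on a red-black "4-node"
// (a BLACK parent with two RED children).
class ColorFlipDemo {
    static final boolean RED = true, BLACK = false;

    static class Node {
        int value;
        boolean color;
        Node left, right;
        Node(int value, boolean color, Node left, Node right) {
            this.value = value; this.color = color; this.left = left; this.right = right;
        }
    }

    static boolean isRed(Node t) { return t != null && t.color == RED; }

    static boolean isFourNode(Node p) {
        return p != null && !isRed(p) && isRed(p.left) && isRed(p.right);
    }

    // Children become BLACK, the parent becomes RED.
    static void colorFlip(Node p) {
        p.color = RED;
        p.left.color = BLACK;
        p.right.color = BLACK;
    }

    // BLACK nodes on every path below t, or -1 if the paths disagree.
    static int blackHeight(Node t) {
        if (t == null) return 0;
        int l = blackHeight(t.left);
        int r = blackHeight(t.right);
        if (l == -1 || r == -1 || l != r) return -1;
        return l + (isRed(t) ? 0 : 1);
    }

    static Node fourNode() {
        return new Node(20, BLACK,
                new Node(10, RED, null, null),
                new Node(30, RED, null, null));
    }

    public static void main(String[] args) {
        Node p = fourNode();
        System.out.println(isFourNode(p) + " " + blackHeight(p));  // prints true 1
        colorFlip(p);
        System.out.println(isFourNode(p) + " " + blackHeight(p));  // prints false 1
    }
}
```

After the flip the parent is RED. If it is the root it is recolored BLACK (step 3); if its own parent is RED there are two successive RED nodes, and a rotation with recoloring is needed.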

4.4.1. Four Cases for Splitting a 4-Node
1. LL / black parent (also mirror case, RR)
2. LR / black parent (also mirror case, RL)
3. LL / red parent (also mirror case, RR)
4. LR / red parent (also mirror case, RL)
[Figure: the four cases, drawn relative to the grandparent G]

Case 1 (LL / black P): An Example
If the parent is BLACK, only a color flip is needed.

Case 2 (LR / black P): Insert 55
Only a color flip is required, for the same reason as case 1.

Case 3 (LL / red parent)
The color flip creates two successive red nodes -- this breaks property 2, so must be fixed.

To fix the red color conflict, carry out a single left or right rotation of node P:
- LL --> right rotation of P
- RR (mirror case) --> left rotation of P
Also change the colors of nodes P and G.

[Figure: LL of G --> right rotation of P; RR of G --> left rotation of P; and P and G color changes]

Case 3 (LL / red P) as a 2-3-4 Tree
[Figure: nodes X, P, G with subtrees A, B, C, D]

Case 4 (LR / red parent)
The color flip creates two successive red nodes -- this breaks property 2, so must be fixed.

To fix the red color conflict, carry out a double left or right rotation of node X:
- LR --> double right rotation of X
  - left then right rotations
- RL (mirror case) --> double left rotation of X
  - right then left rotations
Also change the colors of nodes X and G.

LR Example
[Figure: left rotation of X, then right rotation of X; X and G recoloured]

Same Example, with 2-3-4 Views
[Figure: the LR example drawn alongside its 2-3-4 tree views]

4.4.2. Inserting a New Item
Always add a new item to a tree as a RED leaf node
- this may create two successive RED nodes, which breaks property 2
Fix using single / double rotation and color flip as used for splitting 4-nodes:
- LL/RR --> single right/left rotation of parent (P)
- LR/RL --> double right/left rotation of new node (X)

Example: Insert 14
[Figure: RR case; single left rotation of 12 and color flip]

Insert 10 instead of 14
[Figure: RL case; single right rotation of 10, then single left rotation and color flip]
RL = double left rotation of node 10 (right then left)

Building a Red-Black Tree
[Figure: LL case; single right rotation of 20 and color flip]

[Figure: Insert 35, then Insert 25; the steps are labelled 1, 2, 3, and these numbers are from section 4.4]

[Figure: Insert 30, LR case]
LR = double right rotation of node 30 (left then right)

Search Running Time
The worst-case running time to search a red-black tree or insert an item is O(log2 n)
- the maximum length of a path in a red-black tree with black height B is 2*B-1
  - but this cannot happen if the insertion rules are followed

Deleting a Node
Deletion is more difficult than insertion!
- must usually replace the deleted node
- but no further action is necessary when the replacement node is RED
[Figure: the replacement is the next biggest value]

But deletion requires recoloring and rotations when the replacement node is BLACK.
[Figure: a BLACK replacement node]

The RBTree Class
RBTree implements the Collection interface and uses RBNode objects to create a red-black tree.
[Figure: an RBNode object with fields nodeValue, color, left, right, parent]

Ford and Topp's tutorial, "RBTree Class.pdf", provides more explanation of the RBTree class
- local copy at ADSA/Ford%20and%20Topp/
Includes:
- a discussion of the private data
- explanation of the algorithms for splitting a 4-node and performing a top-down insertion

Using the RBTree Class
    import ds.util.RBTree;

    public class UseRBTree
    {
        public static void main(String[] args)
        {
            // 10 values for the red-black tree
            int[] intArr = {10, 25, 40, 15, 50, 45, 30, 65, 70, 55};
            RBTree<Integer> rbtree = new RBTree<Integer>();

            // load the tree with values
            for (int i = 0; i < intArr.length; i++) {
                rbtree.add(intArr[i]);
                rbtree.drawTrees(4);    // display on-going tree in a JFrame
            }

            // display final tree to stdout
            System.out.println(rbtree.displayTree(2));

            /*
            // remove red-node 25
            rbtree.remove(25);
            rbtree.drawTrees(4);    // tree shown in JFrame

            // remove black-node root 45
            rbtree.remove(45);
            rbtree.drawTree(3);     // tree shown in JFrame
            */
        }  // end of main()
    }  // end of UseRBTree class

Execution
Tree changes are shown graphically by drawTrees() and drawTree()

JFrame Output: Add 10 Values
[Figures a and b, with BLACK and RED nodes] Annotations: split 25, color change 25; single left rot of 25; a node that started red, then flipped to black.

[Figures c and d] Annotations: double right rotation of 45; split 45.

[Figures e and f] Annotations: single left rotation of 65; split 65, left rotation of 45.