CMPSCI Introduction to Programming with Data Structures
Lecture 18: Binary Search Trees and Priority Queues
CMPSCI Binary Search Trees
- A binary search tree is a special binary tree with the following properties that hold for every node N:
  - Every data element in N’s left subtree is less than (or less than or equal to) the element in N.
  - Every data element in N’s right subtree is greater than the element in node N.
- This implies an ordering on the data elements of the tree.
- Note that if the nodes of a BST are printed using an inorder traversal, the values will be in sorted order (smallest to largest).
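The sorted-order claim can be checked with a small sketch. The `Node` class and the sample keys below are assumptions made only for this illustration; they are not the course's BinaryTree classes.

```java
import java.util.ArrayList;
import java.util.List;

class InorderDemo {
    // Hypothetical minimal node type, used only for this illustration.
    static class Node {
        int key;
        Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    // Inorder: visit the left subtree, then the node itself, then the right subtree.
    static void inorder(Node n, List<Integer> out) {
        if (n == null) return;
        inorder(n.left, out);
        out.add(n.key);
        inorder(n.right, out);
    }

    // Builds a small BST (57 at the root; 18 and 184 below it; 5 and 40 under 18)
    // and returns its keys in inorder.
    static List<Integer> sample() {
        Node root = new Node(57,
                new Node(18, new Node(5, null, null), new Node(40, null, null)),
                new Node(184, null, null));
        List<Integer> out = new ArrayList<>();
        inorder(root, out);
        return out;
    }
}
```

Because every left subtree is emitted before its node and every right subtree after it, the keys come out smallest to largest.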
CMPSCI Binary Search Trees: Keys and Data
- We usually think of a data item stored in a node with an associated search key.
- The search keys are the elements compared.
- When looking for an item, we are given a search key and we search the tree for a matching key. The entire data item is returned.
- When inserting an item in a tree, we use the key to determine where the item belongs and then insert both the key and the data item at this location.
- The node for a binary tree is something like [key, data, left subtree, right subtree].
Binary Tree Node: [ Key | Data | Left | Right ]
CMPSCI Comparing Keys
- If keys are integers, strings, etc., they can be compared using Java built-in methods (we’ll do this for our first examples).
  - For example, use the compareTo(String str) method for strings.
- If keys are general objects, how can we compare them?
- Use the Comparable interface.
    interface Comparable {
        // if k1 and k2 are Comparable keys, then
        // k1.compareTo(k2) returns
        //    0                          if k1 equals k2
        //    a positive value (e.g. +1) if k1 > k2
        //    a negative value (e.g. -1) if k1 < k2
        int compareTo(Comparable key);
    }
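As an illustration of comparing general objects, here is a sketch of a key class that implements Java's generic `Comparable<T>`. The `StudentKey` class and its fields are hypothetical, invented for this example. Note that Java's contract only promises a negative, zero, or positive result, not exactly -1, 0, or +1.

```java
// Hypothetical key type: orders students by last name, then by id.
class StudentKey implements Comparable<StudentKey> {
    private final String lastName;
    private final int id;

    StudentKey(String lastName, int id) {
        this.lastName = lastName;
        this.id = id;
    }

    @Override
    public int compareTo(StudentKey other) {
        // Compare the primary field first; String.compareTo may return
        // any negative or positive int, not just -1 or +1.
        int byName = lastName.compareTo(other.lastName);
        if (byName != 0) return byName;
        // Break ties on the id.
        return Integer.compare(id, other.id);
    }
}
```

Code that tests the result should therefore use `< 0`, `== 0`, and `> 0` rather than comparing against -1 or +1.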
CMPSCI Overview of Binary Search Tree
Binary search tree definition: T is a binary search tree if either of these is true:
- T is empty; or
- The root has two subtrees:
  - Each is a binary search tree
  - Value in root > all values in the left subtree
  - Value in root < all values in the right subtree
CMPSCI Binary Search Tree Example
CMPSCI Searching a Binary Tree
CMPSCI Searching a Binary Search Tree: Algorithm
1. if root is null
2.     item not in tree: return null
3. compare target and root.data
4. if they are equal
5.     target is found, return root.data
6. else if target < root.data
7.     return search(left subtree)
8. else
9.     return search(right subtree)
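The search algorithm translates almost line for line into Java. This is a minimal sketch: the `Node` class with an int key and a String data item is an assumption for illustration, not the course's node class.

```java
class BstSearch {
    // Hypothetical node: an int search key plus the associated data item.
    static class Node {
        int key;
        String data;
        Node left, right;
        Node(int key, String data) {
            this.key = key;
            this.data = data;
        }
    }

    // Returns the data stored under target, or null if target is not in the tree.
    static String search(Node root, int target) {
        if (root == null) return null;              // ran off the tree: not found
        if (target == root.key) return root.data;   // found it
        if (target < root.key) return search(root.left, target);
        return search(root.right, target);
    }
}
```

Each recursive call descends one level, so the number of comparisons is at most the height of the tree plus one.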
CMPSCI Operations on Binary Search Trees
These operations allow us to construct and manipulate binary search trees:
- traverse a tree in pre-, post-, and inorder patterns
- find smallest value in the tree
- find largest value in the tree
- find a specific key (and retrieve item)
- insert an item into the tree
- delete an item from the tree
- We can:
  - Create a new binary search tree class
  - Use the adaptor pattern to implement a binary search tree on top of our binary tree
  - Add binary search tree methods to our existing tree class
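Finding the smallest and largest values follows directly from the ordering property: keep going left for the minimum, keep going right for the maximum. A short sketch, assuming a hypothetical `Node` class with int keys:

```java
import java.util.NoSuchElementException;

class BstMinMax {
    // Hypothetical minimal node type for illustration.
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // The smallest key is in the leftmost node.
    static int findMin(Node root) {
        if (root == null) throw new NoSuchElementException("empty tree");
        while (root.left != null) root = root.left;
        return root.key;
    }

    // The largest key is in the rightmost node.
    static int findMax(Node root) {
        if (root == null) throw new NoSuchElementException("empty tree");
        while (root.right != null) root = root.right;
        return root.key;
    }
}
```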
CMPSCI Binary Search Tree Interface
    public interface SearchTree {
        // add/remove say whether tree changed
        public boolean add (Object e);
        public boolean remove (Object e);
        // contains tests whether e is in tree
        public boolean contains (Object e);
        // find and delete return e if present,
        // or null if it is not present
        public Object find (Object e);
        public Object delete (Object e);
    }
CMPSCI TreeInterface (same as before)
    public interface TreeInterface {
        public Object getRootData();
        public int getHeight();
        public int getNumberOfNodes();
        public boolean isEmpty();
        public void clear();
    }
CMPSCI A Generic Binary Tree
- Do we allow duplicate elements in the tree? No (the book doesn’t do it).
CMPSCI BinarySearchTree: Implementation
    public class BinarySearchTree extends BinaryTree implements SearchTree {
        // Data Fields
        /** Return value from the public add method. */
        protected boolean addReturn;
        /** Return value from the public delete method. */
        protected Object deleteReturn;

        // Constructors
        public BinarySearchTree()          // creates a null tree
        { super(); }
        public BinarySearchTree(Object e)  // creates a tree with one node
        { super(e); }                      // containing the data element e
CMPSCI Methods to Implement
    boolean add (Object e);
    boolean remove (Object e);
    boolean contains (Object e);
    Object find (Object e);
    Object delete (Object e);
CMPSCI Finding a specific value in the tree
- Suppose I gave you a key, like 28.
- Return the data item in the node containing 28, or null if there is no 28 in the tree.
- How would you do it?
CMPSCI Searching for a node with a value
- Search key = 28
- Compare against root (compare and choose):
  - If <, choose left
  - If ==, return root
  - If >, choose right
- Recurse on left or right subtree until either
  - find element
  - run off a leaf node (failure)
CMPSCI The method findKey
    public BinaryTree findKey(int key) throws BinaryTreeException
    {
        if (isEmpty()) return null; // we have an empty tree
        Integer dataElement = (Integer) getRootElement();
        if (key == dataElement.intValue())
            return new BinaryTree(getRootElement());
        else if (key < dataElement.intValue())
            // look for it in the left subtree
            return (getLeftTree().findKey(key));
        else
            // look for it in the right subtree
            return (getRightTree().findKey(key));
    }
CMPSCI Inserting a Key into a Binary Search Tree
- Suppose I want to insert the key 60 into this tree.
- How would I do it and maintain the structure of the binary search tree?
CMPSCI Inserting a Key
- Search for the key, as we did earlier (compare and choose).
- If we don’t find it, the place where we ‘fall off the tree’ is where a new node containing the key should be inserted.
- If we do find it, we have to have a convention about where to put a duplicate item (arbitrarily choose the left subtree).
Figure annotation: Element should be here if it’s in the tree! In this case, we need to create a new node as the left subtree of the node containing 63.
CMPSCI The method insertKey
    public void insertKey(int key) throws BinaryTreeException
    {
        // Store the result: if the tree was empty, the new node becomes the root.
        root = insertKeyNode(root, key);
    }

    private BinaryNode insertKeyNode(BinaryNode t, int key) throws BinaryTreeException
    {
        if (t == null)
            // we have a null tree, so create a node at the root and return it
            t = new BinaryNode(Integer.valueOf(key));
        else {
            Integer dataElement = (Integer) t.element();
            if (key <= dataElement.intValue()) {
                // duplicates go into the left subtree
                t.setLeft(insertKeyNode(t.getLeft(), key));
            } else {
                t.setRight(insertKeyNode(t.getRight(), key));
            }
        }
        return t;
    }
CMPSCI Removing a Key from a Binary Search Tree
- Suppose I wanted to remove a key (and associated data item) from a tree.
- How would I do it?
- Several cases to consider:
CMPSCI Removing a Key
- Removing a key R from the binary search tree first requires that we find R.
  - (we know how to do this)
- If R is (in) a leaf node, then R’s parent has its corresponding child set to null.
- If R is in a node that has one child, then R’s parent has its corresponding pointer set to R’s child.
- If R is in a node that has two children, the solution is more difficult.
CMPSCI Removing a node with two children
- Don’t remove the NODE containing the key 18.
- Instead, consider replacing its key using a key from the left or right subtree.
- Then remove THAT node from the tree.
- We need to maintain the binary search tree properties.
- So what values could we use?
- What about 14 or 28?
- Max value from left subtree or minimum value from right subtree!
- Why are these the ones to use?
CMPSCI Removing a node with two children
Notice that in both cases, the binary search tree property is preserved.
CMPSCI Removing a node with two children
- When no duplicates are allowed, it doesn’t matter which node we choose.
- If duplicates are allowed, and if they are stored in the left subtree, then we must choose the replacement from the left subtree.
- Why?
Figure panels: Original Tree; Remove Key 31: Selecting from the Right Subtree; Selecting from the Left Subtree.
Which of these two alternatives preserves the binary search tree property?
CMPSCI The method removeKey
    public BinaryNode removeKey(int key)
    {
        // Store the result so that removing the root node itself takes effect.
        root = removeKeyNode(root, key);
        return root;
    }

    private BinaryNode removeKeyNode(BinaryNode root, int key)
    {
        if (root == null) return null; // empty subtree: the key is not in the tree
        Integer dataElement = (Integer) root.element();
        if (key < dataElement.intValue())
            // remove the node from the left subtree
            root.setLeft(removeKeyNode(root.getLeft(), key));
        else if (key > dataElement.intValue())
            // remove the node from the right subtree
            root.setRight(removeKeyNode(root.getRight(), key));
        else // found the node with the key
        {
            if (root.getRight() == null)
                root = root.getLeft();       // leaf, or left child only
            else if (root.getLeft() == null)
                root = root.getRight();      // right child only
            else // two children: replace with the largest key in the left subtree
            {
                Integer temp = (Integer) getRightMostData(root.getLeft());
                root.setElement(temp);
                root.setLeft(removeRightMost(root.getLeft()));
            }
        }
        return root;
    }
getRightMostData and removeRightMost are assumed methods in the BinaryTree class; you write them.
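One possible shape for the two assumed helpers is sketched below. This is only an illustration: `SketchNode` is a hypothetical stand-in for the course's BinaryNode class, and the helpers are written as static methods rather than BinaryTree members so the example is self-contained.

```java
class RightMostHelpers {
    // Hypothetical stand-in for BinaryNode, used only in this sketch.
    static class SketchNode {
        Object element;
        SketchNode left, right;
        SketchNode(Object element) { this.element = element; }
    }

    // Returns the data in the rightmost node of the subtree rooted at t,
    // which holds the largest key in that subtree. Assumes t is not null.
    static Object getRightMostData(SketchNode t) {
        while (t.right != null) t = t.right;
        return t.element;
    }

    // Removes the rightmost node of the subtree rooted at t and returns the
    // root of the resulting subtree. Assumes t is not null.
    static SketchNode removeRightMost(SketchNode t) {
        if (t.right == null) return t.left;   // t is the rightmost node; its left child takes its place
        t.right = removeRightMost(t.right);
        return t;
    }
}
```

The rightmost node never has a right child, so removing it falls into the leaf or one-child case and needs no further recursion into the two-children case.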
CMPSCI BinarySearchTree Test Program
- We’ll build the tree from before,
- and then play with the entries.
CMPSCI BinarySearchTree Test Program: Construct tree
    public static void main(String[] args) {
        BinarySearchTree myTree = new BinarySearchTree(Integer.valueOf(57));
        System.out.println("Element at root of tree = " + myTree.getRootData());
        System.out.println("Initial tree is: ");
        myTree.inorderTraverse();
        System.out.println();
Output:
    Element at root of tree = 57
    Initial tree is:
    ( 57 )
CMPSCI BinarySearchTree Test Program: Add remaining nodes
    int[] nodes = {18, 184, 5, 40, -8, 14, 28, 56, 63, 229, 387};
    for (int i = 0; i < nodes.length; i++) {
        System.out.println("Inserting: " + nodes[i]);
        myTree.add(Integer.valueOf(nodes[i]));
        System.out.println("Tree is now: ");
        myTree.inorderTraverse();
        System.out.println();
    }
Output:
    Inserting: 18
    Tree is now: (( 18 ) 57 )
    Inserting: 184
    Tree is now: (( 18 ) 57 ( 184 ))
    Inserting: 5
    Tree is now: ((( 5 ) 18 ) 57 ( 184 ))
    ……………..
    Inserting: 63
    Tree is now: (((( -8 ) 5 ( 14 )) 18 (( 28 ) 40 ( 56 ))) 57 (( 63 ) 184 ))
    Inserting: 229
    Tree is now: (((( -8 ) 5 ( 14 )) 18 (( 28 ) 40 ( 56 ))) 57 (( 63 ) 184 ( 229 )))
    Inserting: 387
    Tree is now: (((( -8 ) 5 ( 14 )) 18 (( 28 ) 40 ( 56 ))) 57 (( 63 ) 184 ( 229 ( 387 ))))
CMPSCI BinarySearchTree Test Program: Look for keys and remove them if found
- Look for the keys -8, 22, and 40. First, -8:
    System.out.println("=================================");
    Comparable entry = Integer.valueOf(-8);
    if (myTree.contains(entry)) {
        System.out.println("Entry found; now remove it.");
        System.out.println("Removing element: " + entry);
        myTree.remove(entry);
        System.out.println("Tree is now: ");
        myTree.inorderTraverse();
        System.out.println();
    }
    else
        System.out.println("Entry " + entry + " not found");
Output:
    ====================================================================
    Entry found; now remove it.
    Removing element: -8
    Tree is now: ((( 5 ( 14 )) 18 (( 28 ) 40 ( 56 ))) 57 (( 63 ) 184 ( 229 ( 387 ))))
    ====================================================================
CMPSCI BinarySearchTree Test Program: Look for keys and remove them if found
- Look for the keys -8, 22, and 40. Next, 22:
    System.out.println("=================================");
    Comparable entry = Integer.valueOf(22);
    if (myTree.contains(entry)) {
        System.out.println("Entry found; now remove it.");
        System.out.println("Removing element: " + entry);
        myTree.remove(entry);
        System.out.println("Tree is now: ");
        myTree.inorderTraverse();
        System.out.println();
    }
    else
        System.out.println("Entry " + entry + " not found");
Output:
    ============================================================
    Entry 22 not found
    ============================================================
CMPSCI BinarySearchTree Test Program: Look for keys and remove them if found
- Look for the keys -8, 22, and 40. Finally, 40:
    System.out.println("=================================");
    Comparable entry = Integer.valueOf(40);
    if (myTree.contains(entry)) {
        System.out.println("Entry found; now remove it.");
        System.out.println("Removing element: " + entry);
        myTree.remove(entry);
        System.out.println("Tree is now: ");
        myTree.inorderTraverse();
        System.out.println();
    }
    else
        System.out.println("Entry " + entry + " not found");
Output:
    ==============================================================
    Entry found; now remove it.
    Removing element: 40
    Tree is now: ((( 5 ( 14 )) 18 ( 28 ( 56 ))) 57 (( 63 ) 184 ( 229 ( 387 ))))
    ==============================================================
CMPSCI BinarySearchTree Test Program: Restore tree and remove node 18
    System.out.println("===================================");
    System.out.println("Now put the two elements back in the tree.");
    myTree.add(Integer.valueOf(-8));
    myTree.add(Integer.valueOf(40));
    System.out.println("Tree is now: ");
    myTree.inorderTraverse();
    System.out.println();
Output:
    ==========================================================================
    Now put the two elements back in the tree.
    Tree is now: (((( -8 ) 5 ( 14 )) 18 ( 28 (( 40 ) 56 ))) 57 (( 63 ) 184 ( 229 ( 387 ))))
    ==========================================================================
CMPSCI BinarySearchTree Test Program: Restore tree and remove node 18
    System.out.println("==========================");
    entry = Integer.valueOf(18);
    if (myTree.contains(entry)) {
        System.out.println("Removing element: " + entry);
        myTree.remove(entry);
        System.out.println("Tree is now: ");
        myTree.inorderTraverse();
        System.out.println();
    }
    else
        System.out.println("Entry " + entry + " not found");
    System.out.println("================================");
Output:
    ====================================================================
    Removing element: 18
    Tree is now: (((( -8 ) 5 ) 14 ( 28 (( 40 ) 56 ))) 57 (( 63 ) 184 ( 229 ( 387 ))))
    ====================================================================
CMPSCI A Test Program, cont’d
    System.out.println("The tree has " + t.treeSize() + " nodes.");
    System.out.println("Print tree using inorder traversal:");
    t.inOrderPrint();
    System.out.println();
    System.out.println("Print tree using preorder traversal:");
    t.preOrderPrint();
    System.out.println();
    System.out.println("Print tree using postorder traversal:");
    t.postOrderPrint();
Output (traversal values not shown):
    The tree has 8 nodes.
    Print tree using inorder traversal:
    Print tree using preorder traversal:
    Print tree using postorder traversal:
CMPSCI A Test Program, cont’d
    int key = 24;
    BinaryTree temp = t.findKey(key);
    System.out.println();
    System.out.println("Search key = " + key + "; Value found = " + temp.getRootElement());

    key = 13;
    temp = t.findKey(key);
    System.out.println();
    if (temp == null)
        System.out.println("Search key = " + key + "; Key was not found");
    else
        System.out.println("Search key = " + key + "; Value found = " + temp.getRootElement());

    key = 15;
    temp = t.findKey(key);
    System.out.println();
    if (temp == null)
        System.out.println("Search key = " + key + "; Key was not found");
    else
        System.out.println("Search key = " + key + "; Value found = " + temp.getRootElement());
Output:
    Search key = 24; Value found = 24
    Search key = 13; Key was not found
    Search key = 15; Value found = 15
CMPSCI A Test Program, cont’d
    key = 29;
    System.out.println("Inserting key = " + key + " into tree.");
    t.insertKey(key);
    System.out.println("Print tree using inorder traversal:");
    t.inOrderPrint();
    System.out.println();
    key = 40;
    System.out.println("Inserting key = " + key + " into tree.");
    t.insertKey(key);
    ………….
    key = 1;
    System.out.println("Inserting key = " + key + " into tree.");
    t.insertKey(key);
    …………..
    key = 57;
    System.out.println("Inserting key = " + key + " into tree.");
    t.insertKey(key);
    …………..
    key = 29;
    System.out.println("Removing node with key = " + key);
    t.removeKey(key);
    ………….
    }
Output (traversal values not shown):
    Inserting key = 29 into tree.
    In insertKeyNode, going right, value at node = 15
    In insertKeyNode, going left, value at node = 31
    In insertKeyNode, going right, value at node = 24
    In insertKeyNode, going right, value at node = 28
    Print tree using inorder traversal:
    Inserting key = 40 into tree.
    In insertKeyNode, going right, value at node = 15
    In insertKeyNode, going right, value at node = 31
    Print tree using inorder traversal:
    Inserting key = 1 into tree.
    In insertKeyNode, going left, value at node = 15
    In insertKeyNode, going left, value at node = 8
    In insertKeyNode, going left, value at node = 2
    Print tree using inorder traversal:
    Inserting key = 57 into tree.
    In insertKeyNode, going right, value at node = 15
    In insertKeyNode, going right, value at node = 31
    In insertKeyNode, going right, value at node = 40
    Print tree using inorder traversal:
    Removing node with key = 29
    Print tree using inorder traversal:
CMPSCI Priority Queues: Stock Trading
- Consider trading of a single security, say Akamai Technologies, founded in 1998 by CS professors and students at MIT.
- Investors place orders consisting of three items (action, price, size), where
  - action is either buy or sell,
  - price is the worst price you are willing to pay for the purchase or get from your sale, and
  - size is the number of shares.
- At equilibrium, all the buy orders (bids) have prices lower than all the sell orders (asks).
CMPSCI Stock Trading, cont'd.
- A level 1 quote gives the highest bid and lowest ask (as provided by popular financial sites, and e-brokers for the naive public).
- A level 2 quote gives all the bids and asks for several price steps (Island ECN on the Web and quote subscriptions for professional traders).
- A trade occurs whenever a new order can be matched with one or more existing orders, which results in a series of removal transactions.
- Orders may be canceled at any time.
CMPSCI A Data Structure for Trading
- For each stock, keep two structures, one for the buy orders (bids), and the other for the sell orders (asks).
- Operations that need to be supported:
    Action              Ask Structure          Bid Structure
    place an order      insert(price, size)    insert(price, size)
    get level 1 quote   min()                  max()
    trade               removeMin()            removeMax()
    cancel              remove(order)          remove(order)
- These data structures are called priority queues.
- The NASDAQ priority queues support an average daily trading volume of 1B shares ($50B).
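The two-structure idea can be sketched with `java.util.PriorityQueue`, using a min-ordered queue for asks and a max-ordered queue for bids. The `OrderBookSketch` and `Order` classes, and the choice to store prices as integer cents, are assumptions for this illustration and say nothing about how a real exchange is built.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class OrderBookSketch {
    // Hypothetical order record: price in cents and number of shares.
    static class Order {
        final int priceCents;
        final int size;
        Order(int priceCents, int size) {
            this.priceCents = priceCents;
            this.size = size;
        }
    }

    // Asks: the lowest price has the highest priority (min queue).
    private final PriorityQueue<Order> asks =
            new PriorityQueue<>(Comparator.comparingInt((Order o) -> o.priceCents));
    // Bids: the highest price has the highest priority (max queue).
    private final PriorityQueue<Order> bids =
            new PriorityQueue<>(Comparator.comparingInt((Order o) -> o.priceCents).reversed());

    void placeAsk(Order o) { asks.add(o); }
    void placeBid(Order o) { bids.add(o); }

    // Level 1 quote: best ask and best bid (null if that side is empty).
    Order lowestAsk() { return asks.peek(); }
    Order highestBid() { return bids.peek(); }

    // Trade: remove the best order from one side (null if that side is empty).
    Order removeLowestAsk() { return asks.poll(); }
    Order removeHighestBid() { return bids.poll(); }

    // Cancel: remove a specific order; returns false if it was not present.
    boolean cancelAsk(Order o) { return asks.remove(o); }
    boolean cancelBid(Order o) { return bids.remove(o); }
}
```

Note that `PriorityQueue.remove(Object)` searches the whole queue, so cancellation in this sketch takes linear time.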
CMPSCI Keys and Total Order Relations
- A Priority Queue ranks its elements by key with a total order relation.
- Keys:
  - Every element has its own key
  - Keys are not necessarily unique
- Total Order Relation
  - Denoted by $\le$
  - Reflexive: $k \le k$
  - Antisymmetric: if $k_1 \le k_2$ and $k_2 \le k_1$, then $k_1 = k_2$
  - Transitive: if $k_1 \le k_2$ and $k_2 \le k_3$, then $k_1 \le k_3$
CMPSCI Priority Queue Operations
- A Priority Queue supports these fundamental methods on key-element pairs:
  - min()
  - insertItem(k, e)
  - removeMin()
- where k is a key and e is the element associated with the key.
CMPSCI The Priority Queue ADT
- A priority queue P supports the following methods:
  - size(): Return the number of elements in P.
  - isEmpty(): Test whether P is empty.
  - insertItem(k,e): Insert a new element e with key k into P.
  - minElement(): Return (but don’t remove) an element of P with smallest key; an error occurs if P is empty.
  - minKey(): Return the smallest key in P; an error occurs if P is empty.
  - removeMin(): Remove from P and return an element with the smallest key; an error occurs if P is empty.
CMPSCI Sorting with a Priority Queue
- A Priority Queue P can be used for sorting a sequence S by:
  - inserting the elements of S into P with a series of insertItem(e, e) operations
  - removing the elements from P in increasing order and putting them back into S with a series of removeMin() operations
- Such a sort is a template for a family of sorting algorithms which differ only in how the priority queue is implemented.
  - To be discussed LATER: selection sort, insertion sort, heap sort. (But I left the notes pages in anyway.)
CMPSCI Pseudo-Code for Sorting Template
Algorithm PriorityQueueSort(S, P):
    Input: A sequence S storing n elements, on which a total order relation is defined, and a Priority Queue P that compares keys with the same relation
    Output: The sequence S sorted by the total order relation
    // Phase 1: insert all the elements into the queue
    while !S.isEmpty() do
        e ← S.removeFirst()
        P.insertItem(e, e)
    // Phase 2: remove items and return them to the sequence
    while P is not empty do
        e ← P.removeMin()
        S.insertLast(e)
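The template can be written in Java roughly as follows. This sketch substitutes `java.util.PriorityQueue` for the abstract queue P and `java.util.LinkedList` for the sequence S; because the library queue is heap-based, this particular instance behaves like heap sort rather than selection or insertion sort.

```java
import java.util.LinkedList;
import java.util.PriorityQueue;

class PriorityQueueSortSketch {
    // Sorts s in place into increasing order using a priority queue.
    static <E extends Comparable<E>> void priorityQueueSort(LinkedList<E> s) {
        PriorityQueue<E> p = new PriorityQueue<>();
        // Phase 1: move every element of the sequence into the queue.
        while (!s.isEmpty()) {
            p.add(s.removeFirst());
        }
        // Phase 2: repeatedly remove the minimum and append it to the sequence.
        while (!p.isEmpty()) {
            s.addLast(p.poll());
        }
    }
}
```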
CMPSCI Implementation with an Unsorted Sequence
- Try to implement a priority queue with an unsorted sequence S.
- Each item of S is a pair made of k, the key, and e, the element.
- We can implement insertItem() by using insertLast() on the sequence. This takes O(1) time.
- However, because we always insert at the end, irrespective of the key value, our sequence is not ordered.
CMPSCI Implementation with an Unsorted Sequence, continued
- Thus, for methods such as minElement(), minKey(), and removeMin(), we need to look at all the elements of S.
- The worst case time complexity for these methods is O(n).
- Performance summary:
    insertItem            O(1)
    minKey, minElement    O(n)
    removeMin             O(n)
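A minimal sketch of the unsorted-sequence implementation is shown below. The class name, the use of `ArrayList` as the sequence, and the restriction to int keys with String elements are assumptions chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

class UnsortedSequencePQ {
    // A key-element pair stored in the sequence.
    private static class Item {
        final int key;
        final String element;
        Item(int key, String element) {
            this.key = key;
            this.element = element;
        }
    }

    private final List<Item> items = new ArrayList<>();

    int size() { return items.size(); }
    boolean isEmpty() { return items.isEmpty(); }

    // Append at the end regardless of key: O(1) (amortized for ArrayList).
    void insertItem(int k, String e) { items.add(new Item(k, e)); }

    // Scan every item to locate the smallest key: O(n).
    private int indexOfMin() {
        if (items.isEmpty()) throw new IllegalStateException("priority queue is empty");
        int best = 0;
        for (int i = 1; i < items.size(); i++) {
            if (items.get(i).key < items.get(best).key) best = i;
        }
        return best;
    }

    int minKey() { return items.get(indexOfMin()).key; }
    String minElement() { return items.get(indexOfMin()).element; }

    // Find the minimum, then remove it from the sequence: O(n).
    String removeMin() { return items.remove(indexOfMin()).element; }
}
```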
CMPSCI Selection Sort
- Implement P with an unsorted sequence
  - Phase 1: $O(n)$
  - Phase 2: $O(n + (n-1) + \dots + 1) = O\left(\sum_{i=1}^{n} i\right) = O\left(\frac{n(n+1)}{2}\right) = O(n^2)$
- Overall complexity of $O(n^2)$
- The sort amounts to selecting the smallest element from the queue and placing it into the sorted sequence.
- Called selection sort.
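For comparison, here is the familiar in-place array form of selection sort. It is a sketch that folds the unsorted queue and the sorted sequence into one array, so it is a variant of the priority-queue formulation rather than a literal transcription of it.

```java
class SelectionSortSketch {
    // Sorts a into increasing order in place.
    static void sort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            // a[0..i-1] is sorted; a[i..] plays the role of the unsorted queue.
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[min]) min = j;   // select the smallest remaining key
            }
            // Move the selected element to the end of the sorted part.
            int tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }
}
```

The inner scan examines n-1, n-2, …, 1 elements, which is the same arithmetic series that gives the quadratic bound.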
CMPSCI Implementation with a Sorted Sequence
- Another implementation uses a sequence S, sorted by increasing keys.
- minElement(), minKey(), and removeMin() take O(1) time.
- However, to implement insertItem(), we must now scan through the entire sequence in the worst case.
- Thus, insertItem() runs in O(n) time.
- Performance summary:
    insertItem            O(n)
    minKey, minElement    O(1)
    removeMin             O(1)
CMPSCI Insertion Sort
- Implement P with a sorted sequence
  - Phase 1: $O(n + (n-1) + \dots + 1) = O\left(\sum_{i=1}^{n} i\right) = O\left(\frac{n(n+1)}{2}\right) = O(n^2)$
  - Phase 2: $O(n)$
- Overall complexity of $O(n^2)$
- The sort amounts to inserting the next element in the sequence into a sorted queue and then copying the queue back to the sequence.
- Called insertion sort.
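The in-place array form of insertion sort is sketched below. As with the selection sort case, it merges the sorted queue and the input sequence into a single array, so it is a variant of the priority-queue formulation rather than a direct copy of it.

```java
class InsertionSortSketch {
    // Sorts a into increasing order in place.
    static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            // a[0..i-1] plays the role of the sorted queue; insert a[i] into it.
            int next = a[i];
            int j = i - 1;
            // Shift larger elements one slot right to open a gap for next.
            while (j >= 0 && a[j] > next) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = next;
        }
    }
}
```

In the worst case (input in decreasing order) the i-th insertion shifts i elements, which gives the quadratic bound for Phase 1.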