CS121 Data Structures CS121 © JAS

Trees

Each entry in a List (Stack, Queue) has at most one predecessor and one successor.
In a Tree each entry has at most one predecessor, but can have more than one successor.

In general trees can have any number of successors, but usually we restrict ourselves to a maximum of two successors: BINARY TREES.

A binary tree is either
- empty, or
- consists of a value and two sub-trees (left and right).

For each non-empty tree there is a unique node with no predecessor: the root node.
Nodes which have no successors are termed leaf nodes.
All other nodes have exactly one predecessor and either one or two successors: internal nodes.

Various subsets of trees can be identified:
- Binary Trees (maximum of two successors)
- Strictly Binary Trees (nodes have either 0 or 2 successors)
- Full Binary Trees (all leaves at the same depth, and every internal node has 2 successors)
- Complete Binary Trees (all levels except the last one are full, and the last level is filled from the left)

Various characteristics of trees can be measured:
- Number of nodes: Nodes
- Number of leaves: Leaves
- Depth (longest path from root to leaf): Depth

Nodes(EmptyTree) = 0
Nodes(Tree) = 1 + Nodes(Tree.Left) + Nodes(Tree.Right)

Leaves(EmptyTree) = 0
Leaves(RootOnly) = 1
Leaves(Tree) = Leaves(Tree.Left) + Leaves(Tree.Right)

Depth(EmptyTree) = -1
Depth(Tree) = 1 + maximum(Depth(Tree.Left), Depth(Tree.Right))
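
The recursive definitions of Nodes, Leaves and Depth translate almost directly into code. The following is a minimal sketch, assuming a hypothetical node type `MeasureNode` (not part of the course code) in which `null` stands for the empty tree.

```java
// Hypothetical minimal node type for this sketch; null represents the empty tree.
class MeasureNode {
    Object data;
    MeasureNode left, right;

    MeasureNode(Object data, MeasureNode left, MeasureNode right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

class TreeMeasures {
    // Nodes(EmptyTree) = 0; Nodes(Tree) = 1 + Nodes(Left) + Nodes(Right)
    static int nodes(MeasureNode t) {
        if (t == null) return 0;
        return 1 + nodes(t.left) + nodes(t.right);
    }

    // Leaves(EmptyTree) = 0; Leaves(RootOnly) = 1; otherwise sum over both sub-trees
    static int leaves(MeasureNode t) {
        if (t == null) return 0;
        if (t.left == null && t.right == null) return 1;
        return leaves(t.left) + leaves(t.right);
    }

    // Depth(EmptyTree) = -1, so a tree consisting only of a root has depth 0
    static int depth(MeasureNode t) {
        if (t == null) return -1;
        return 1 + Math.max(depth(t.left), depth(t.right));
    }
}
```

Defining the depth of the empty tree as -1 is what makes the single recursive rule for Depth work without a special case for leaves.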

Example Application - Representing an Arithmetic Expression

Internal nodes represent operators; leaves represent operands.

[Figure: expression tree for (b^2 - 4*a*c) / (2*a). Level by level: / ; - * ; ^ * 2 a ; b 2 * c ; 4 a]

Traversing a tree (visit all nodes in a tree)
- Left, Root, Right: Inorder or Symmetric
- Left, Right, Root: Postorder
- Root, Left, Right: Preorder or Depth-First
- Level by Level: Breadth-First
Each of these traversals can be reversed.
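
The four traversal orders can be sketched as follows. The class names `TravNode` and `Traversals` are hypothetical, and the breadth-first version uses a queue from `java.util`, which is one common way to visit a tree level by level rather than something the slides prescribe.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical node type for this sketch.
class TravNode {
    String data;
    TravNode left, right;

    TravNode(String data, TravNode left, TravNode right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

class Traversals {
    // Left, Root, Right
    static void inorder(TravNode t, List<String> out) {
        if (t == null) return;
        inorder(t.left, out);
        out.add(t.data);
        inorder(t.right, out);
    }

    // Root, Left, Right
    static void preorder(TravNode t, List<String> out) {
        if (t == null) return;
        out.add(t.data);
        preorder(t.left, out);
        preorder(t.right, out);
    }

    // Left, Right, Root
    static void postorder(TravNode t, List<String> out) {
        if (t == null) return;
        postorder(t.left, out);
        postorder(t.right, out);
        out.add(t.data);
    }

    // Level by level: nodes wait in a queue until it is their turn to be visited
    static List<String> breadthFirst(TravNode root) {
        List<String> out = new ArrayList<>();
        Queue<TravNode> queue = new ArrayDeque<>();
        if (root != null) queue.add(root);
        while (!queue.isEmpty()) {
            TravNode n = queue.remove();
            out.add(n.data);
            if (n.left != null) queue.add(n.left);
            if (n.right != null) queue.add(n.right);
        }
        return out;
    }
}
```

For the expression tree of a + b * c (root +, left leaf a, right sub-tree * with leaves b and c), inorder gives a + b * c, preorder gives + a * b c, and postorder gives a b c * +.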

Defining a tree by operations/methods
- Create a tree: either an empty tree or a tree consisting solely of a root
- IsEmpty: determine if a tree is empty

Accessing/Moving round a tree
- GetData: view data in a node
- GetLeft: move to/access left sub-tree
- GetRight: move to/access right sub-tree

Adding to a tree can be application dependent.
The simplest methods add a sub-tree to an empty left (or right) sub-tree.
Adding elsewhere involves reshaping the tree.

Defining a Tree Class

    public class BTNode {
        private Object data;
        private BTNode left, right;

        public BTNode(Object initialData, BTNode initialLeft, BTNode initialRight) {
            data = initialData;
            left = initialLeft;
            right = initialRight;
        }

        public Object getData() {
            return data;
        }

        public BTNode getLeft() {
            return left;
        }

        public BTNode getRight() {
            return right;
        }

        // Follow left links until a node with no left child is reached.
        public Object getLeftmostData() {
            if (left == null)
                return data;
            else
                return left.getLeftmostData();
        }

        public Object getRightmostData() {
            ………
        }

        // Prints the data in Left, Root, Right order.
        public void inorderPrint() {
            if (left != null)
                left.inorderPrint();
            System.out.println(data);
            if (right != null)
                right.inorderPrint();
        }

        // Removes the leftmost node and returns the root of the tree that remains.
        public BTNode removeLeftmost() {
            if (left == null)
                return right;   // this node is the leftmost, so its right sub-tree takes its place
            else {
                left = left.removeLeftmost();
                return this;
            }
        }

        public void setData(Object newData) {
            data = newData;
        }

        public void setLeft(BTNode newLeft) {
            left = newLeft;
        }

        // Returns a copy of the tree rooted at src (the data objects themselves are shared, not copied).
        public static BTNode treeCopy(BTNode src) {
            BTNode leftCopy, rightCopy;
            if (src == null)
                return null;
            else {
                leftCopy = treeCopy(src.left);
                rightCopy = treeCopy(src.right);
                return new BTNode(src.data, leftCopy, rightCopy);
            }
        }

Binary Search Trees

In a binary search tree, a node N with a key K is inserted so that the keys in the left subtree of N are less than K, and the keys in the right subtree of N are greater than K.

To summarise, for each node N in a binary search tree:
Keys in left subtree of N < Key K in node N < Keys in right subtree of N.
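
        // Inserts newData into the binary search tree rooted at this node.
        // In Java 'this' is never null, so the empty-subtree test is made on the child before recursing.
        // Assumes the stored data implements Comparable.
        public BTNode addEntry(Object newData) {
            if (((Comparable) newData).compareTo(data) < 0) {
                if (left == null) left = new BTNode(newData, null, null);
                else left.addEntry(newData);
            } else {
                if (right == null) right = new BTNode(newData, null, null);
                else right.addEntry(newData);
            }
            return this;
        }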

        // Searches the binary search tree rooted at this node; assumes the data implements Comparable.
        public boolean findEntry(Object wanted) {
            int c = ((Comparable) wanted).compareTo(data);
            if (c < 0)
                return left != null && left.findEntry(wanted);
            else if (c > 0)
                return right != null && right.findEntry(wanted);
            else   // equal to data
                return true;
        }
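
As an alternative illustration of insertion and search, here is a self-contained sketch that passes the subtree as a parameter, so the empty tree can be represented by `null`. The names `BstNode` and `Bst` are hypothetical, the use of generics with `Comparable` is an assumption rather than the course's own design, and duplicate keys are simply ignored.

```java
// Hypothetical generic node for this sketch; the slides' BTNode stores Object instead.
class BstNode<K extends Comparable<K>> {
    K key;
    BstNode<K> left, right;

    BstNode(K key) {
        this.key = key;
    }
}

class Bst {
    // Returns the root of the subtree after inserting key.
    // Assumption: a key that is already present is not inserted again.
    static <K extends Comparable<K>> BstNode<K> insert(BstNode<K> t, K key) {
        if (t == null) return new BstNode<>(key);
        int c = key.compareTo(t.key);
        if (c < 0) t.left = insert(t.left, key);
        else if (c > 0) t.right = insert(t.right, key);
        return t;
    }

    // Each comparison discards one subtree, following the ordering rule for every node.
    static <K extends Comparable<K>> boolean contains(BstNode<K> t, K key) {
        if (t == null) return false;
        int c = key.compareTo(t.key);
        if (c < 0) return contains(t.left, key);
        if (c > 0) return contains(t.right, key);
        return true;
    }
}
```

A tree is built by starting from `null` and reassigning the root on each call, for example `root = Bst.insert(root, "h")`.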

Deleting From a Binary Search Tree

- If the node to be deleted is a leaf: no problem, just delete it
- If the node has only one subtree: replace the node with its child
- If the node has two subtrees:
  - Determine the larger sub-tree (left/right)
  - Select the greatest node of the left sub-tree, or the least node of the right sub-tree, whichever sub-tree is larger
  - Move that node up to replace the deleted node
  - The node moved up will always be a leaf or a node with only one sub-tree, so the process ends
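
The three deletion cases can be sketched as follows. This is a simplified illustration, not the course's implementation: it uses hypothetical names (`DelNode`, `BstDelete`), integer keys, and it always takes the least node of the right sub-tree in the two-subtree case instead of first working out which sub-tree is larger.

```java
// Hypothetical node type with int keys for this sketch.
class DelNode {
    int key;
    DelNode left, right;

    DelNode(int key) {
        this.key = key;
    }
}

class BstDelete {
    static DelNode insert(DelNode t, int key) {
        if (t == null) return new DelNode(key);
        if (key < t.key) t.left = insert(t.left, key);
        else if (key > t.key) t.right = insert(t.right, key);
        return t;
    }

    // Returns the root of the subtree after removing key (unchanged if key is absent).
    static DelNode delete(DelNode t, int key) {
        if (t == null) return null;
        if (key < t.key) {
            t.left = delete(t.left, key);
        } else if (key > t.key) {
            t.right = delete(t.right, key);
        } else {
            if (t.left == null) return t.right;   // leaf, or only a right child
            if (t.right == null) return t.left;   // only a left child
            // Two subtrees: copy up the least key of the right sub-tree,
            // then delete that key from the right sub-tree (it has no left child).
            DelNode min = t.right;
            while (min.left != null) min = min.left;
            t.key = min.key;
            t.right = delete(t.right, min.key);
        }
        return t;
    }

    // Appends the keys in ascending order, each followed by a space.
    static void inorder(DelNode t, StringBuilder out) {
        if (t == null) return;
        inorder(t.left, out);
        out.append(t.key).append(' ');
        inorder(t.right, out);
    }
}
```

Because the least node of a sub-tree never has a left child, the recursive call that removes it always ends in one of the two simple cases.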
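
[Figure: binary search tree used to illustrate deletion. Node labels as extracted, level by level: h ; d l ; b j n ; a c e g i k m o ; with f and e shown separately]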

Binary Tree Application - Huffman Codes

Aim is to encode more frequent letters with shorter bit sequences.

Algorithm
1. Count the frequencies of the letters
2. Arrange them into ascending order (linked list or binary search tree?)
3. Using the two least frequent letters, construct a binary tree such that
   - the left leaf is the least frequent letter; label the left branch 1
   - the right leaf is the next least frequent letter; label the right branch 0
   - the root is given the value = sum of the two leaves
4. Replace the first two entries in the frequency list with an entry representing the sum (ensure the list remains sorted)
5. Repeat from step 3 until a single tree remains

Encoding a string
Replace each letter with the bit code derived from the path from the root of the tree to the leaf containing that letter.

Decoding a string
Read the binary code such that
- 1: descend left
- 0: descend right
- at a leaf: output the letter and return to the root

Example: This is a simple message

Letter frequencies:
a 2, e 3, g 1, h 1, i 3, l 1, m 2, p 1, s 5, t 1

[Figure: Huffman tree built from these frequencies. Subtrees and their weights: gh 2, lp 2, at 3, mgh 4, lpat 5, ei 6, smgh 9]
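
The Huffman algorithm, together with encoding and decoding, can be sketched as below. This is an illustrative version under stated assumptions: the names `HuffNode` and `Huffman` are hypothetical, a `java.util.PriorityQueue` stands in for the sorted frequency list, the input is assumed to contain at least two distinct letters, and ties between equal frequencies are broken arbitrarily, so the individual codes may differ from those in the figure even though the total encoded length is the same.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical tree node: leaves hold a letter, internal nodes hold only a weight.
class HuffNode {
    final char letter;
    final int weight;
    final HuffNode left, right;

    HuffNode(char letter, int weight, HuffNode left, HuffNode right) {
        this.letter = letter;
        this.weight = weight;
        this.left = left;
        this.right = right;
    }

    boolean isLeaf() {
        return left == null && right == null;
    }
}

class Huffman {
    // Repeatedly joins the two least frequent entries until one tree remains.
    static HuffNode build(String text) {
        Map<Character, Integer> freq = new HashMap<>();
        for (char ch : text.toCharArray()) freq.merge(ch, 1, Integer::sum);

        PriorityQueue<HuffNode> queue =
                new PriorityQueue<>((x, y) -> Integer.compare(x.weight, y.weight));
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            queue.add(new HuffNode(e.getKey(), e.getValue(), null, null));
        }
        while (queue.size() > 1) {
            HuffNode least = queue.remove();   // becomes the left child (branch 1)
            HuffNode next = queue.remove();    // becomes the right child (branch 0)
            queue.add(new HuffNode('\0', least.weight + next.weight, least, next));
        }
        return queue.remove();
    }

    // Records the root-to-leaf path of every letter: 1 for left, 0 for right.
    static void codes(HuffNode n, String path, Map<Character, String> table) {
        if (n.isLeaf()) {
            table.put(n.letter, path);
            return;
        }
        codes(n.left, path + "1", table);
        codes(n.right, path + "0", table);
    }

    static String encode(String text, Map<Character, String> table) {
        StringBuilder bits = new StringBuilder();
        for (char ch : text.toCharArray()) bits.append(table.get(ch));
        return bits.toString();
    }

    // Walks down from the root bit by bit; at a leaf, outputs the letter and restarts.
    static String decode(String bits, HuffNode root) {
        StringBuilder out = new StringBuilder();
        HuffNode n = root;
        for (char b : bits.toCharArray()) {
            n = (b == '1') ? n.left : n.right;
            if (n.isLeaf()) {
                out.append(n.letter);
                n = root;
            }
        }
        return out.toString();
    }
}
```

For the example message with spaces removed and letters in lower case (20 letters), the encoded string is 62 bits long, which is the sum of the weights of all the internal nodes of the tree.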

Binary Tree Application - Priority Queues

A heap is a complete binary tree such that no node has a value bigger than its parent.
This can provide a representation of a priority queue.
The highest priority entry will always be at the root.
Removal of the root does, however, require restructuring the tree:
- To maintain the complete tree property, move the last item in the tree (rightmost leaf on the bottom level) to the root.
- To re-establish the priority property, swap the new root with its highest priority child, and repeat until it has moved to the correct position.

Implementation of Heaps

Heaps can be implemented using references, but as (almost) all the operations are based on swapping values and the trees are complete, it is also feasible to use arrays to implement a heap.

Trees implemented using an array

[Figure: a complete binary tree and the array that stores it]
Array: a b c d e f g h i j k l m n o
Tree, level by level:
    a
    b c
    d e f g
    h i j k l m n o

If a Tree is represented by array ATree:
    Tree.root == ATree[0]
If Tree == ATree[i] then
    Tree.left  == ATree[2i+1]
    Tree.right == ATree[2i+2]
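
Putting the index rules and the removal procedure together gives an array-based heap. The sketch below is an assumption-laden illustration: the class name `ArrayMaxHeap` is hypothetical, larger integers are treated as higher priority, and the `add` method (append at the end, then swap upwards) is the usual counterpart to removal rather than something spelled out on the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical max-heap of ints stored level by level in a list.
class ArrayMaxHeap {
    private final List<Integer> items = new ArrayList<>();

    boolean isEmpty() {
        return items.isEmpty();
    }

    // Assumption: new entries are appended as the last leaf and swapped up
    // while they are bigger than their parent at index (i - 1) / 2.
    void add(int value) {
        items.add(value);
        int i = items.size() - 1;
        while (i > 0 && items.get(i) > items.get((i - 1) / 2)) {
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }

    // Precondition: the heap is not empty.
    int removeMax() {
        int top = items.get(0);
        int last = items.remove(items.size() - 1);   // rightmost leaf on the bottom level
        if (!items.isEmpty()) {
            items.set(0, last);                      // move it to the root
            int i = 0;
            while (true) {
                int left = 2 * i + 1, right = 2 * i + 2, largest = i;
                if (left < items.size() && items.get(left) > items.get(largest)) largest = left;
                if (right < items.size() && items.get(right) > items.get(largest)) largest = right;
                if (largest == i) break;             // priority property restored
                swap(i, largest);                    // swap with the highest priority child
                i = largest;
            }
        }
        return top;
    }

    private void swap(int a, int b) {
        int tmp = items.get(a);
        items.set(a, items.get(b));
        items.set(b, tmp);
    }
}
```

Because the tree is complete, the array never has gaps, so the last leaf is always the final array element.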

Analysis of Binary Tree

For a balanced binary search tree (for example, a complete one) every comparison halves the size of the tree to be searched.
Searching is thus said to be O(log n), where
- n is the number of items to be searched
- log is the logarithm to base 2

If we inserted sorted data into a binary search tree we would end up with a linear list.
For a linear list the search time is O(n).

Balanced binary search trees have the best search times: O(log n).
In the worst case, unbalanced trees can have O(n) search times.
Keeping a tree perfectly balanced after every insertion can cost O(n) in the worst case.
AVL trees: as each new key is inserted, the tree remains (almost) balanced, which guarantees O(log n) search times.

AVL (Adelson-Velskii and Landis) trees are almost balanced binary trees.
They have both O(log n) insertion (for one key) and search times.

Height of a binary tree = length of the longest path from the root to some leaf.
(The empty binary tree is considered to have a height of -1.)

The AVL property for a node N in a binary tree is that the heights of the left and right subtrees of N are either equal or differ by 1.
An AVL tree is a binary tree in which every node has the AVL property.
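
The height definition and the AVL property can be checked directly. The following sketch uses hypothetical names (`AvlCheckNode`, `AvlCheck`) and recomputes heights on every call for clarity; a practical AVL implementation would normally store the height in each node instead.

```java
// Hypothetical node type for this sketch.
class AvlCheckNode {
    int key;
    AvlCheckNode left, right;

    AvlCheckNode(int key, AvlCheckNode left, AvlCheckNode right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }
}

class AvlCheck {
    // The empty tree has height -1, so a single node has height 0.
    static int height(AvlCheckNode t) {
        if (t == null) return -1;
        return 1 + Math.max(height(t.left), height(t.right));
    }

    // True if every node's left and right subtree heights are equal or differ by 1.
    static boolean isAvl(AvlCheckNode t) {
        if (t == null) return true;
        int diff = Math.abs(height(t.left) - height(t.right));
        return diff <= 1 && isAvl(t.left) && isAvl(t.right);
    }
}
```

A chain of three nodes fails the test at its root (subtree heights -1 and 1), while the same three keys arranged with the middle key at the root pass it.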

If inserting into an AVL tree causes the AVL property to be lost at a node, we can apply some shape-changing tree transformations, called rotations, to restore the AVL property.

There are four different rotations:
- Single left
- Single right
- Double left = single right then single left
- Double right = single left then single right

    // Single right rotation: the left child becomes the new root of this subtree.
    private AVLNode RotateRight(AVLNode tree) {
        AVLNode newTree = tree.Left;
        tree.Left = newTree.Right;
        newTree.Right = tree;
        return newTree;
    }

[Figure: Right Rotation on a tree with keys BRU, ORY and JFK; the labels "tree" and "newTree" mark the old and new subtree roots]

[Figure: Double Left rotation (Right then Left) on a tree with keys MEX, ORD, GLA, DUS, ARN, GCM, NRT, ZRH, BRU, ORY, JFK]
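
The four rotations can be sketched together as follows. The right rotation mirrors the `RotateRight` method from the slides; the other three are written by symmetry and composition. The names `RotNode` and `Rotations` are hypothetical, and the sketch changes only the shape of the tree: it does not update any stored heights.

```java
// Hypothetical node type for this sketch.
class RotNode {
    String key;
    RotNode left, right;

    RotNode(String key) {
        this.key = key;
    }
}

class Rotations {
    // Single right: the left child becomes the new subtree root.
    static RotNode rotateRight(RotNode tree) {
        RotNode newTree = tree.left;
        tree.left = newTree.right;
        newTree.right = tree;
        return newTree;
    }

    // Single left: the mirror image; the right child becomes the new subtree root.
    static RotNode rotateLeft(RotNode tree) {
        RotNode newTree = tree.right;
        tree.right = newTree.left;
        newTree.left = tree;
        return newTree;
    }

    // Double left = single right on the right child, then single left on the node.
    static RotNode doubleLeft(RotNode tree) {
        tree.right = rotateRight(tree.right);
        return rotateLeft(tree);
    }

    // Double right = single left on the left child, then single right on the node.
    static RotNode doubleRight(RotNode tree) {
        tree.left = rotateLeft(tree.left);
        return rotateRight(tree);
    }
}
```

For example, if BRU has right child ORY, and ORY has left child JFK, a double left rotation at BRU makes JFK the subtree root with BRU on its left and ORY on its right, which preserves the ordering BRU < JFK < ORY.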

AVL Tree Summary

- The difference in height between the left and right subtrees of every node is 0 or 1.
- If this property is lost by inserting a new key, rotations can be performed on the subtree to regain the property.
- An extra field will be required in the node record to retain the height of the subtrees.
- It can be shown that in the worst case insertions, searches and deletions take at most O(log n) time.
- Remember that ordinary binary search trees have a worst case of O(n) when a chain occurs.
- In fact it can be proven using Fibonacci trees that searches in AVL trees are at worst about 44% less efficient than in complete trees. The proof is not required for the course, but can be found in text books.

Two-Three Trees (2-3 Trees)

Another option is to arrange for all subtrees to be perfectly balanced with respect to their heights, but to permit the number of search keys stored in nodes to vary.
Each node in a 2-3 tree is permitted to contain either one or two search keys, and to have either two or three descendants.
All leaves of the tree are empty trees that lie on exactly one bottom level.
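
Searching a 2-3 tree follows the same idea as searching a binary search tree, with up to two comparisons per node. The sketch below is a minimal illustration using hypothetical names (`TwoThreeNode`, `TwoThreeSearch`); it assumes string keys, uses `null` for the empty trees on the bottom level, and covers searching only, not insertion or splitting.

```java
// Hypothetical 2-3 node: one key and two children, or two keys and three children.
class TwoThreeNode {
    String key1;                        // always present
    String key2;                        // null when the node holds only one key
    TwoThreeNode left, middle, right;   // 'right' is used only when key2 != null

    TwoThreeNode(String key1, TwoThreeNode left, TwoThreeNode middle) {
        this.key1 = key1;
        this.left = left;
        this.middle = middle;
    }

    TwoThreeNode(String key1, String key2,
                 TwoThreeNode left, TwoThreeNode middle, TwoThreeNode right) {
        this(key1, left, middle);
        this.key2 = key2;
        this.right = right;
    }
}

class TwoThreeSearch {
    static boolean contains(TwoThreeNode t, String wanted) {
        if (t == null) return false;          // reached an empty tree on the bottom level
        int c1 = wanted.compareTo(t.key1);
        if (c1 == 0) return true;
        if (c1 < 0) return contains(t.left, wanted);
        if (t.key2 == null) return contains(t.middle, wanted);   // one-key node
        int c2 = wanted.compareTo(t.key2);
        if (c2 == 0) return true;
        if (c2 < 0) return contains(t.middle, wanted);           // key1 < wanted < key2
        return contains(t.right, wanted);                        // wanted > key2
    }
}
```

Since every path from the root to the bottom level has the same length, the number of nodes visited by a search is bounded by the height of the tree.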
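
[Figure: searching a 2-3 tree for L. The root is H; its children are D and J N; D's children are A and E F; J N's children are I, K L and O P. Comparisons: H < L, then J < L < N, leading to the node K L.]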

[Figure: inserting B into the same 2-3 tree. Comparisons: B < H, then B < D, so B joins the node A, giving A B.]
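
[Figure: inserting M into the same 2-3 tree. Comparisons: H < M, then J < M < N, so M joins the node K L. That node now holds three keys and is split into K and M, with L passed up; J L N is then split into J and N, with L passed up again, so the root becomes H L.]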

The worst case 2-3 tree is one in which every node has 1 key and two children: a complete binary tree.
For a 2-3 tree with k levels and n elements we have $n = 2^{k+1} - 1$.
Hence $k + 1 = \log_2(n + 1)$.
As splits are passed up the tree to parent nodes, there can only ever be O(log n) splits during an insertion.
Searches and deletions likewise take O(log n) time in the worst case.

B-Trees

If we increase the number of children of each non-root node of a 2-3 tree to be, say, from 100 to 200, then we have a B-tree of order 200.
Such a tree can store 8 million records in just 3 levels: $200^3 = 8{,}000{,}000$, or $\log_{200} 8{,}000{,}000 = 3$.
Therefore when storing records on disc we would need only 3 disc accesses to find any record in our tree of 8 million records.
- A binary tree would require around 23 disc accesses ($\log_2 8{,}000{,}000 \approx 23$).
- A B-tree of order 200 would require 3 or, in the worst case, 5 accesses.
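
The level counts quoted above can be reproduced with a short calculation. The sketch below uses a deliberately crude model, which is an assumption of this example rather than part of the slides: it counts how many levels are needed before the number of positions on the bottom level reaches the number of records, given the number of children of the root and of every other node. The class name `BTreeLevels` is hypothetical.

```java
class BTreeLevels {
    // Smallest number of levels such that rootFanout * fanout^(levels - 1) >= records.
    static int levelsNeeded(long records, long rootFanout, long fanout) {
        int levels = 1;
        long reachable = rootFanout;    // positions reachable after one level
        while (reachable < records) {
            reachable *= fanout;        // each extra level multiplies by the fanout
            levels++;
        }
        return levels;
    }
}
```

Under this model, 8,000,000 records need 3 levels when every node has 200 children and 23 levels when every node has 2 children. Taking the sparsest B-tree of order 200, with 2 children at the root and 100 everywhere else, gives 5 levels, which is one way to arrive at the worst-case figure quoted above.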

Binary trees may be fast when stored in fast primary memory, but trees with higher branching factors are far more efficient when stored on slow external memory devices.

For efficiency, the whole ordered key sequence of a B-tree node is read into internal memory at once, and a fast binary search is used to find a key in this ordered sequence.
This either gives the key itself, or identifies the node which must be read from disc next.

In general, in a B-tree of order m:
- each internal node except the root must have between ceiling(m/2) and m children
- the root can either be a leaf or have from 2 to m children
- all leaves lie on the same bottom-most level and are empty
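
The search inside a single B-tree node can be sketched with the standard library's `java.util.Arrays.binarySearch`, which returns the index of the key if it is present and otherwise `-(insertion point) - 1`. The insertion point is exactly the index of the child to read next. The class and method names below (`BTreeNodeSearch`, `searchNode`, `childToFollow`) are hypothetical, and the node is reduced to a sorted `int` array for illustration.

```java
import java.util.Arrays;

class BTreeNodeSearch {
    // Returns i >= 0 if sortedKeys[i] == wanted; otherwise a negative value
    // that encodes which child subtree must be read from disc next.
    static int searchNode(int[] sortedKeys, int wanted) {
        return Arrays.binarySearch(sortedKeys, wanted);
    }

    // Decodes a negative result: child c holds the keys that fall between
    // sortedKeys[c - 1] and sortedKeys[c].
    static int childToFollow(int negativeResult) {
        return -(negativeResult + 1);
    }
}
```

With keys 10, 20, 30 in a node, searching for 20 finds it at index 1, while searching for 25 yields a negative result that decodes to child 2, the subtree between 20 and 30.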

Summary

- Concepts and terminology of trees: branches, nodes, children, parents, ancestors, descendants, internal nodes, leaves, levels, paths
- Binary trees: definition and recursive definition, complete binary tree definition, representing expressions using binary trees, traversing binary trees
- Implementing trees in Java
- Binary search trees
- Huffman coding tree
- Deleting from a binary tree
- Analysis of the binary tree: big-O notation
- Priority queues: represented by heaps; reheapifying
- AVL trees: definition, how to calculate the heights and balance factors, how to identify AVL and non-AVL trees, how to insert, how to perform rotations, benefits
- 2-3 trees: how to insert into, benefits
- B-trees: benefits, general idea of how they work