Trees. Based on slides by Harry Zhou. Read Sections 12, 13, 6, 18, and 16.3.
Trees

Lists – one parent and one child (at most).
Trees – one parent and one or more children.
Graphs – one or more parents and one or more children.

Tree (mathematical definition): a connected acyclic graph.
Tree (inductive definition, more useful in computer science):
- A single node is a tree (that node is called the root of the tree).
- A node (called the root) plus links to a finite number of trees is a tree.
- Every node except the root has exactly one parent; a node may have any number of children (the roots of the subtrees attached to it).

(Figure: example tree with nodes a through j.)
Binary Trees: each node has at most 2 children

public class treenode {
    Object data;
    treenode left;
    treenode right;
    public treenode() { }
    public treenode(Object j) { data = j; left = null; right = null; }
}

Tree operations: insert(t,e); delete(t,e); find(t,e)
Tree traversals: inorder, postorder, preorder, breadth-first
Tree traversals

(Figure: example tree – root a, with children b and w; c and d are b's children; s is d's left child; u is w's right child.)

Inorder   – left, root, right
Preorder  – root, left, right
Postorder – left, right, root

public void Inorder(treenode t) {
    if (t != null) {
        Inorder(t.left);
        System.out.println(" " + t.data);
        Inorder(t.right);
    }
}

public void Postorder(treenode t) {
    if (t != null) {
        Postorder(t.left);
        Postorder(t.right);
        System.out.println(" " + t.data);
    }
}

Inorder output:   c b s d a w u
Postorder output: c s d b u w a
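The traversals above can be checked with a small self-contained sketch. The class and method names below are mine, not from the slides; the tree built in example() is my reading of the slide's figure, reconstructed from the two stated outputs.

```java
import java.util.ArrayList;
import java.util.List;

class TreeNode {
    Object data;
    TreeNode left, right;
    TreeNode(Object d) { data = d; }
}

public class TraversalDemo {
    // The slide's example tree: root a with children b and w; c and d are
    // b's children; s is d's left child; u is w's right child.
    static TreeNode example() {
        TreeNode d = new TreeNode("d"); d.left = new TreeNode("s");
        TreeNode b = new TreeNode("b"); b.left = new TreeNode("c"); b.right = d;
        TreeNode w = new TreeNode("w"); w.right = new TreeNode("u");
        TreeNode a = new TreeNode("a"); a.left = b; a.right = w;
        return a;
    }

    static String inorder(TreeNode t) {
        List<String> out = new ArrayList<>();
        collect(t, out, true);
        return String.join(" ", out);
    }

    static String postorder(TreeNode t) {
        List<String> out = new ArrayList<>();
        collect(t, out, false);
        return String.join(" ", out);
    }

    // mid == true: visit between subtrees (inorder); else after both (postorder)
    private static void collect(TreeNode t, List<String> out, boolean mid) {
        if (t == null) return;
        collect(t.left, out, mid);
        if (mid) out.add(t.data.toString());
        collect(t.right, out, mid);
        if (!mid) out.add(t.data.toString());
    }
}
```

Running both traversals on example() reproduces the two output lines from the slide.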
Convert a recursive algorithm to an iterative version

public void Inorder(treenode t) {     // recursive version
    if (t != null) {
        Inorder(t.left);
        System.out.println(" " + t.data);
        Inorder(t.right);
    }
}

For the iterative version, use an explicit stack; each stack frame has two fields: (node, action).

public void Inorder(treenode root) {  // iterative version (pseudocode)
    stack.push(root, 'inorder');
    while (!stack.isEmpty()) {
        (c, action) = stack.pop();
        if (action == 'visit')
            visit(c);
        else {
            if (c.right != null) stack.push(c.right, 'inorder');
            stack.push(c, 'visit');
            if (c.left != null) stack.push(c.left, 'inorder');
        }
    }
}
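The (node, action) pseudocode can be made into real Java; this is a sketch with names of my choosing, using an ArrayDeque as the stack and Object[] pairs as frames.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TreeNode {
    Object data;
    TreeNode left, right;
    TreeNode(Object d) { data = d; }
}

public class IterativeInorder {
    private static final String INORDER = "inorder", VISIT = "visit";

    static String inorder(TreeNode root) {
        StringBuilder out = new StringBuilder();
        Deque<Object[]> stack = new ArrayDeque<>();   // frame: {node, action}
        if (root != null) stack.push(new Object[]{root, INORDER});
        while (!stack.isEmpty()) {
            Object[] frame = stack.pop();
            TreeNode c = (TreeNode) frame[0];
            if (frame[1] == VISIT) {
                if (out.length() > 0) out.append(' ');
                out.append(c.data);
            } else {
                // Pushed in reverse order, so the left subtree is handled first.
                if (c.right != null) stack.push(new Object[]{c.right, INORDER});
                stack.push(new Object[]{c, VISIT});
                if (c.left != null) stack.push(new Object[]{c.left, INORDER});
            }
        }
        return out.toString();
    }
}
```

Because the stack replaces the call stack, the frames come off in exactly the order the recursive version would visit them.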
Level order (or breadth-first order) traversal

Traverse the tree level by level, each level from left to right.

Question: what kind of data structure is needed here, a queue or a stack?

Algorithm (pseudocode):
    current = root
    while (current != null) {
        System.out.print(current.data)
        if (current.left != null)  Enq(queue, current.left)
        if (current.right != null) Enq(queue, current.right)
        current = Deq(queue)   // returns null when the queue is empty
    }

Output for the example tree: a b w c d u s
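A Java sketch of the algorithm (the queue answers the question above); the class and method names are mine, with ArrayDeque playing the role of Enq/Deq.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TreeNode {
    Object data;
    TreeNode left, right;
    TreeNode(Object d) { data = d; }
}

public class LevelOrder {
    static String levelOrder(TreeNode root) {
        if (root == null) return "";
        List<String> out = new ArrayList<>();
        Deque<TreeNode> queue = new ArrayDeque<>();
        queue.add(root);                    // enqueue at the tail ...
        while (!queue.isEmpty()) {
            TreeNode cur = queue.remove();  // ... dequeue from the head (FIFO)
            out.add(cur.data.toString());
            if (cur.left != null) queue.add(cur.left);
            if (cur.right != null) queue.add(cur.right);
        }
        return String.join(" ", out);
    }
}
```

A FIFO queue yields level-by-level order; swapping in a stack (LIFO) would give a depth-first order instead.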
Binary Search Trees

1) A node's key is larger than or equal to every key in its left subtree, and smaller than every key in its right subtree.

(Figure: example binary search trees.)

2) Operations: Search, Insert, Delete, Min, Max; also Successor, Predecessor.
Algorithm Find

public Object find(int x, treenode T) {
    if (T == null) return "NOT FOUND";
    if (T.data == x) return x;
    if (x > T.data) return find(x, T.right);
    else return find(x, T.left);
}

(Figure, reconstructed from the trace: BST with root 17; 8 and 26 are its children; 4 and 11 under 8; 31 under 26; 27 and 35 under 31.)

Trace of find(27, t):
1) Since 27 > 17, search(27, node 26)
2) Since 27 > 26, search(27, node 31)
3) Since 27 < 31, search(27, node 27)
4) Since 27 = 27, return 27
Insertion in Java

public treenode insert(treenode t, int k) {
    if (t == null) {
        t = new treenode();
        t.data = k;
        t.left = null;
        t.right = null;
        return t;
    } else if (k > t.data) {
        t.right = insert(t.right, k);
        return t;
    } else {
        t.left = insert(t.left, k);
        return t;
    }
}
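Since treenode stores its key as an Object, the comparison above is really pseudocode; a minimal runnable sketch with int keys (class and method names are mine) combines insert with the find routine from the previous slide.

```java
class Node {
    int key;
    Node left, right;
    Node(int k) { key = k; }
}

public class Bst {
    // Equal keys go left, matching the BST convention stated above.
    static Node insert(Node t, int k) {
        if (t == null) return new Node(k);
        if (k > t.key) t.right = insert(t.right, k);
        else t.left = insert(t.left, k);
        return t;
    }

    static boolean find(Node t, int k) {
        if (t == null) return false;
        if (t.key == k) return true;
        return (k > t.key) ? find(t.right, k) : find(t.left, k);
    }
}
```

Inserting the keys from the find example (17, 8, 26, 4, 11, 31, 27, 35) in that order reproduces the tree used in the trace.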
Finding the smallest key in a BST

public treenode min(treenode t) {
    if (t == null) return null;
    else if (t.left == null) return t;
    else return min(t.left);
}
Tree deletion operations: three cases
(1) The node is a leaf node.
(2) The node has one child.
(3) The node has two children.
Delete operation (cont.)

(1) The node is a leaf node: just delete it.
(2) The node has one child: connect its child to its parent.

(Figures: before/after trees for both cases, with keys 3, 10, 13, 20, 30, 35.)
(3) The node has two children

Algorithm:
- Replace the node's value with the smallest value in its right subtree.
- Delete the node holding that smallest value from the right subtree (it has at most one child, so case (1) or (2) applies).

(Figure: first replace the value with the smallest of the right subtree, then connect that node's child with its parent; keys 3, 10, 16, 20, 35, 40, 46, 49, 56, 58.)
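All three deletion cases fit in one recursive routine; this is a sketch (names are mine) that reuses the int-keyed insert, plus an inorder printer to check results.

```java
class Node {
    int key;
    Node left, right;
    Node(int k) { key = k; }
}

public class BstDelete {
    static Node insert(Node t, int k) {
        if (t == null) return new Node(k);
        if (k > t.key) t.right = insert(t.right, k);
        else t.left = insert(t.left, k);
        return t;
    }

    static Node delete(Node t, int k) {
        if (t == null) return null;                       // key not present
        if (k > t.key) { t.right = delete(t.right, k); return t; }
        if (k < t.key) { t.left  = delete(t.left, k);  return t; }
        // Found the node to delete.
        if (t.left == null) return t.right;               // leaf or one child
        if (t.right == null) return t.left;               // one child
        // Two children: copy the smallest key of the right subtree here,
        // then delete that key from the right subtree (at most one child).
        Node m = t.right;
        while (m.left != null) m = m.left;
        t.key = m.key;
        t.right = delete(t.right, m.key);
        return t;
    }

    static String inorder(Node t) {                       // sorted key list
        if (t == null) return "";
        String l = inorder(t.left), r = inorder(t.right);
        return (l.isEmpty() ? "" : l + " ") + t.key + (r.isEmpty() ? "" : " " + r);
    }
}
```

Returning the (possibly new) subtree root from delete lets the parent relink its child pointer, which handles the "connect its child to its parent" step of cases (1) and (2).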
Complexity of BST operations

- Height of a tree: the number of edges on a longest root-to-leaf path.
- Find, Insert, Delete run in O(h) time, where h is the height of the tree.
- h can vary a lot (as a function of n = number of nodes), depending on how well balanced the BST is:
  h can be O(log n) – we say the tree is balanced;
  h can be O(n) in the worst case.

What is the relation between h and n?  h + 1 ≤ n ≤ 2^(h+1) − 1.
Proof by induction on n.
We will see later how to keep the tree balanced while performing the tree operations.
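The two bounds also follow from a direct counting argument; a sketch using only the definition of height above:

```latex
% n = number of nodes, h = height (edges on a longest root-to-leaf path).
% Lower bound: a longest root-to-leaf path alone contains h+1 nodes.
% Upper bound: level i (for i = 0, ..., h) holds at most 2^i nodes.
\[
  h + 1 \;\le\; n \;\le\; \sum_{i=0}^{h} 2^{i} \;=\; 2^{h+1} - 1,
  \qquad\text{hence}\qquad h \;\ge\; \log_2(n+1) - 1 .
\]
```

The lower bound on h is what makes O(log n) the best height one can hope for, and the balanced-tree variants mentioned later achieve it up to a constant factor.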
Search problems: one of the most common applications of computers.

If the information is static, use binary search on a sorted array – O(log n) time per search; but insert and delete would be slow (O(n)).
If the information is dynamic, we want to implement search, insert, and delete, ideally all in O(log n) time.

Approach: use binary search trees. We have seen that search, insert, and delete run in O(h) time, where h is the tree's height; but the tree can grow unbalanced, and h can be Omega(n).

There are variants of binary search trees that do not become unbalanced:
- AVL trees (each node stores the height difference between its right and left subtrees);
- red-black trees (each node stores one extra bit: the color red or black);
- splay trees (no extra storage needed, but the O(log n) bound on operations is amortized, not worst-case).
Red-Black Trees (Chapter 13 in the textbook)
Done on board.
B-trees (Chapter 18 in the textbook)

Binary search trees are not appropriate for data stored in external memory: each vertical move in the tree can require a mechanical movement of the disk head.

Memory: a 500 MIPS machine executes 500 million instructions per second.
Disk: 3600 rotations/min, so one rotation takes 1/60 s ≈ 16.7 ms; on average we wait half a rotation, ≈ 8.3 ms, so roughly 120 accesses per second.
So the time for one disk access ≈ the time for 4 × 10^6 memory operations.

Idea: use "multiway trees" instead of binary trees: at each node we choose how to continue among more than two children. Nodes are fatter, but trees are shallower. There is more work per node (done in internal memory), but fewer nodes are visited (each newly visited node typically implies a separate disk access). This idea is implemented via B-trees. There are several variants of B-trees; we discuss one of them, the B+-tree.
Definition of a B-tree of order m:
- The root is either a leaf or has between 2 and m children.
- Any non-leaf node (except the root) has between ⌈m/2⌉ and m children; if it has j children, it contains j − 1 key values (to guide the search).
- All leaves are at the same level; each leaf holds between ⌈m/2⌉ and m actual values, stored in a sorted array of size m (some slots in the array may be empty).
A B-tree of order m with n values has height h ≤ log(n/2) / log(⌈m/2⌉).

Search: obvious (just follow the guiding keys).
Insert and Delete: there are many cases; I'll just illustrate them with examples.
All the operations work in O(h) time.
Insert and Delete in a B-tree

(Figures: worked examples of inserting into and deleting from a B-tree.)
With 99 deleted, the number of values in its leaf drops below the minimum, so two leaves are merged; now the parent's number of children is below the minimum, so a child is adopted from the left sibling.
Heaps (Chapter 6 in the textbook)

Main application: priority queues.
MIN-HEAP order property: the value of each node is less than or equal to the values of its children.
Shape property: a heap is a complete binary tree – every level is full except possibly the bottom one, which is filled from the left and has no "holes".
The height of the tree is O(log n), where n is the number of values (proof on board).

(Figure: one heap and two trees that are not heaps.)
Heap Implementation

We can use an array, due to the regular structure of the complete binary tree: store the nodes level by level, left to right, starting at index 1 (e.g. A B C D E F G H I).
- Left child of i:  2i
- Right child of i: 2i + 1
- Parent of i:      ⌊i/2⌋
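The index arithmetic is all there is to the layout; a minimal sketch (class and method names are mine):

```java
public class HeapIndex {
    // 1-indexed layout (slot 0 unused) keeps the arithmetic clean.
    static int left(int i)   { return 2 * i; }
    static int right(int i)  { return 2 * i + 1; }
    static int parent(int i) { return i / 2; }   // integer division = floor
}
```

In the array [_, A, B, C, D, E, F, G, H, I], node D sits at index 4; its children H and I are at indices 8 and 9, and its parent B is at index 2.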
Heap operations: 1) Insert  2) DeleteMin  3) BuildHeap (organize an arbitrary array as a heap)

Insert: put the new element NewElem in the left-most open position at the bottom level, then
    while (NewElem < its parent) exchange them.

Question: is it possible that some element at the same level but on a different branch is smaller than the newly moved-up value? (Yes – the heap order only constrains ancestors and descendants.)
DeleteMin

- Remove the root (the minimum) and replace it by the right-most node at the bottom level.
- While (the new root's value > one of its children's values), exchange it with the smaller of its two children.

The complexity analysis: deletion and insertion are O(log n).
Reasons: 1) the number of operations is O(height of the heap); 2) the height of the heap is O(log n).
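Both operations can be sketched on a 1-indexed array-backed min-heap (the class and method names are mine, not from the slides):

```java
import java.util.ArrayList;

public class MinHeap {
    private final ArrayList<Integer> a = new ArrayList<>();

    public MinHeap() { a.add(0); }   // slot 0 unused: children of i are 2i, 2i+1

    void insert(int x) {             // "sift up": O(log n)
        a.add(x);                    // left-most open position at the bottom
        int i = a.size() - 1;
        while (i > 1 && a.get(i) < a.get(i / 2)) { swap(i, i / 2); i /= 2; }
    }

    int deleteMin() {                // "sift down": O(log n)
        int min = a.get(1);
        a.set(1, a.get(a.size() - 1));       // right-most bottom node to root
        a.remove(a.size() - 1);
        int i = 1, n = a.size() - 1;
        while (2 * i <= n) {
            int c = 2 * i;
            if (c + 1 <= n && a.get(c + 1) < a.get(c)) c++;  // smaller child
            if (a.get(i) <= a.get(c)) break;
            swap(i, c);
            i = c;
        }
        return min;
    }

    private void swap(int i, int j) {
        int t = a.get(i); a.set(i, a.get(j)); a.set(j, t);
    }
}
```

Appending to the ArrayList is exactly "the left-most open position at the bottom level", since the array stores the tree level by level.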
BuildHeap operation: build a heap from an arbitrary array

Bottom-up procedure: start with the subtrees having roots at level h − 1 and make them heaps; then move to the subtrees with roots at level h − 2 and make them heaps, and so on, until level 0 (the root) is reached.
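The bottom-up procedure amounts to sifting down every internal node, from the last one back to the root; a sketch on a 1-indexed array (names are mine):

```java
public class BuildHeap {
    // a[1..n] holds the values (1-indexed, slot 0 unused).
    static void buildHeap(int[] a, int n) {
        // Leaves are already heaps; the last internal node is at index n/2.
        for (int i = n / 2; i >= 1; i--) siftDown(a, i, n);
    }

    static void siftDown(int[] a, int i, int n) {
        while (2 * i <= n) {
            int c = 2 * i;
            if (c + 1 <= n && a[c + 1] < a[c]) c++;   // pick the smaller child
            if (a[i] <= a[c]) break;
            int t = a[i]; a[i] = a[c]; a[c] = t;
            i = c;
        }
    }
}
```

Iterating i downward visits level h − 1 first, then h − 2, and so on, exactly as described above.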
Time complexity: 1·2^(h−1) + 2·2^(h−2) + 3·2^(h−3) + … + h·2^0 < 2·2^h.
Since h = ⌊log n⌋, the time complexity is O(n).
Heap questions
(1) Do you need to specify which element of a heap is to be deleted?
(2) What is the difference between a binary search tree and a heap?
Heap applications
(1) Priority queues.
(2) Heap Sort:
    (a) BuildHeap from the initial array – O(n)
    (b) Delete-Min n times – n × O(log n) = O(n log n)
Time complexity: O(n log n).
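Steps (a) and (b) combine into the following sketch (names are mine; an in-place max-heap formulation is more common, but this version mirrors the min-heap used above):

```java
public class HeapSort {
    // (a) BuildHeap in O(n); (b) DeleteMin n times in O(n log n).
    static int[] heapSort(int[] values) {
        int n = values.length;
        int[] a = new int[n + 1];                  // 1-indexed heap array
        System.arraycopy(values, 0, a, 1, n);
        for (int i = n / 2; i >= 1; i--) siftDown(a, i, n);   // BuildHeap
        int[] out = new int[n];
        for (int k = 0; k < n; k++) {
            out[k] = a[1];                         // current minimum
            a[1] = a[n - k];                       // last node to the root
            siftDown(a, 1, n - k - 1);             // heap size shrinks by 1
        }
        return out;
    }

    static void siftDown(int[] a, int i, int n) {
        while (2 * i <= n) {
            int c = 2 * i;
            if (c + 1 <= n && a[c + 1] < a[c]) c++;
            if (a[i] <= a[c]) break;
            int t = a[i]; a[i] = a[c]; a[c] = t;
            i = c;
        }
    }
}
```

Each delete-min writes the next-smallest value to the output, so the result comes out in ascending order.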
Compression – Huffman coding – Section 16.3

Data compression consists of two phases: (1) encoding (compression) and (2) decoding (decompression).

Method 1: convert each character into a binary codeword of equal (fixed) length. (Table: fixed-length codes for a, b, c, d, e.)
Problem: not efficient, since letter frequencies can be vastly different.
Potential problems

Method 2: use a variable-length code. (Table: letter, frequency, and code for a, b, c, d, e.)
Problem: a received bit string may be convertible to letters in more than one way.
Solution: we want the code to be prefix-free (no codeword is a prefix of another codeword).
Huffman Code

General strategy:
- Allow the code length to vary.
- Guarantee unambiguous decoding by making the code prefix-free.
Note: a prefix-free binary code corresponds to a binary tree whose leaves are the characters.

Huffman algorithm:
Initially, each character and its frequency are represented by a tree with a single node.
Repeat until only one tree is left:
(1) Find the 2 trees with the smallest weights and merge them, as left and right children of a new root.
(2) The root's weight is the sum of its two children's weights.
Finally, assign a 0 to each left edge and a 1 to each right edge. The path from the root to the leaf node representing a character is the codeword for that character.
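A sketch of the algorithm in Java, using a min-priority queue for the repeated "two smallest weights" step (all names are mine; the frequencies in the test below are assumed, chosen to reproduce the merge weights 0.24, 0.44, 0.56 that appear in the trace on the next slide, since the slide's own frequency table was lost):

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class Huffman {
    static class Node {
        final double weight;
        final Character symbol;       // null for internal (merged) nodes
        final Node left, right;
        Node(char s, double w) { symbol = s; weight = w; left = null; right = null; }
        Node(Node l, Node r) { symbol = null; weight = l.weight + r.weight; left = l; right = r; }
    }

    static Map<Character, String> codes(Map<Character, Double> freq) {
        PriorityQueue<Node> pq =
            new PriorityQueue<>(Comparator.comparingDouble((Node n) -> n.weight));
        for (Map.Entry<Character, Double> e : freq.entrySet())
            pq.add(new Node(e.getKey(), e.getValue()));
        while (pq.size() > 1)
            pq.add(new Node(pq.poll(), pq.poll()));   // merge the two lightest
        Map<Character, String> out = new HashMap<>();
        assign(pq.poll(), "", out);
        return out;
    }

    // 0 on each left edge, 1 on each right edge; the root-to-leaf path
    // spells out the codeword.
    private static void assign(Node t, String prefix, Map<Character, String> out) {
        if (t.symbol != null) { out.put(t.symbol, prefix.isEmpty() ? "0" : prefix); return; }
        assign(t.left, prefix + "0", out);
        assign(t.right, prefix + "1", out);
    }
}
```

With n symbols there are n − 1 merges, each costing O(log n) queue work, so building the code takes O(n log n).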
Trace of the Huffman algorithm

(The frequency table for a, b, c, d, e was shown in the slide figure.)
(1) Merge d and e into T1 (weight 0.24).
(2) Merge T1 and c into T2 (weight 0.44).
(3) Merge a and b into T3 (weight 0.56).
Trace (cont.)

(4) Merge T2 and T3 into T4 (weight 1.0), then read off the codewords for a, b, c, d, e.

Decode: there is only one way to convert any bit string to letters (the slide's example string decodes as a d b e).
Why? Because no codeword is a prefix of the codeword of another letter.
How to implement the algorithm?
40
such that the average length
Compression problem Input: symbols a1 , a2 , … , an , with probabilities p1 , p2 , … , pn GOAL : Find prefix-free set of codewords C = {c1 , c2 , … , cn } having lengths l1 , l2 , … , ln such that the average length L(C) = p1 l1 + p2 l … + pn ln is small. Theorem. Huffman algorithm finds an optimal code.