Based on slides by Harry Zhou. Read Sections 12, 13, 6, 18, 16.3.


Trees

2 Trees
Lists: each node has at most one parent and at most one child.
Trees: each node has one parent and any number of children.
Graphs: a node may have any number of parents and any number of children.
Tree (math definition): a connected acyclic graph.
Trees (inductive definition, more useful in computer science):
- A single node is a tree (the node is called the root of the tree).
- A node (called the root) plus links to a finite number of trees is a tree.
Every node except the root has exactly one parent. A node can have any number of children (the roots of the subtrees attached to it).
[Figure: example tree with nodes a, b, c, d, e, f, g, h, i, j]

3 Binary Trees: each node has at most 2 children
    public class treenode {
        Object data;
        treenode left;
        treenode right;
        public treenode() { }
        public treenode(Object j) { data = j; left = null; right = null; }
    }
Tree operations: insert(t, e); delete(t, e); find(t, e)
Tree traversals: inorder, postorder, preorder, breadth-first

4 Tree traversals
[Figure: example binary tree with nodes a, b, w, c, d, u, s]
Inorder: left, root, right
Preorder: root, left, right
Postorder: left, right, root
    public void Inorder(treenode t) {
        if (t != null) {
            Inorder(t.left);
            System.out.println(" " + t.data);
            Inorder(t.right);
        }
    }
    public void Postorder(treenode t) {
        if (t != null) {   // without this check the recursion fails on an empty subtree
            Postorder(t.left);
            Postorder(t.right);
            System.out.println(" " + t.data);
        }
    }
Inorder output: c b s d a w u
Postorder output: c s d b u w a

5 Convert a recursive algorithm to an iterative version
    public void Inorder(treenode t) { // recursive version
        if (t != null) {
            Inorder(t.left);
            System.out.println(" " + t.data);
            Inorder(t.right);
        }
    }
For the iterative version, use a stack; each stack frame has two fields: (node, action).
    public void Inorder(treenode root) { // iterative version (pseudocode)
        if (root != null) stack.push(root, 'inorder');
        while (!stack.isEmpty()) {
            (c, action) = stack.pop();
            if (action == 'visit') visit(c);
            else {
                // pushed in reverse order, so the left subtree is processed first
                if (c.right != null) stack.push(c.right, 'inorder');
                stack.push(c, 'visit');
                if (c.left != null) stack.push(c.left, 'inorder');
            }
        }
    }
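The (node, action) pseudocode above can be turned into compilable Java roughly as follows. This is only a sketch: the `Node` and `Frame` classes, the boolean `visit` flag standing in for the action field, and the `sampleTree` helper are names assumed for illustration, and the tree shape is inferred from the traversal outputs on the earlier slide.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class InorderIterative {
    // Hypothetical node type for this sketch (the slides call it "treenode").
    static class Node {
        char data; Node left, right;
        Node(char data, Node left, Node right) { this.data = data; this.left = left; this.right = right; }
    }

    // One stack frame: a node plus the pending action (visit it, or expand it).
    static class Frame {
        final Node node; final boolean visit;
        Frame(Node node, boolean visit) { this.node = node; this.visit = visit; }
    }

    static List<Character> inorder(Node root) {
        List<Character> out = new ArrayList<>();
        Deque<Frame> stack = new ArrayDeque<>();
        if (root != null) stack.push(new Frame(root, false));
        while (!stack.isEmpty()) {
            Frame f = stack.pop();
            if (f.visit) {
                out.add(f.node.data);
            } else {
                // Push right, node, left so that they pop as left, node, right.
                if (f.node.right != null) stack.push(new Frame(f.node.right, false));
                stack.push(new Frame(f.node, true));
                if (f.node.left != null) stack.push(new Frame(f.node.left, false));
            }
        }
        return out;
    }

    // Tree shape inferred from the inorder and postorder outputs on the slides.
    static Node sampleTree() {
        Node b = new Node('b', new Node('c', null, null),
                          new Node('d', new Node('s', null, null), null));
        Node w = new Node('w', null, new Node('u', null, null));
        return new Node('a', b, w);
    }
}
```

Calling `InorderIterative.inorder(InorderIterative.sampleTree())` yields the same sequence as the recursive version: c b s d a w u.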

6 Level order (or breadth-first order) traversal
Traverse the tree level by level, each level from left to right.
[Figure: example binary tree with nodes a, b, w, c, d, u, s]
Question: what kind of data structure is needed here, a queue or a stack?
Algorithm:
    current = root
    while (current)
        System.out.print(current.data)
        if (current.left) Enq(current.left)
        if (current.right) Enq(current.right)
        current = Deq(queue)   // Deq must return null once the queue is empty
Output: a b w c d u s
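A queue is what makes the level-by-level order work: nodes leave in the order they were discovered. The sketch below shows one way to write the algorithm in Java with `java.util.ArrayDeque` used as the queue. The `Node` class and `sampleTree` helper are assumptions made for this example, and the tree shape is inferred from the traversal outputs shown on the slides.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class LevelOrder {
    // Hypothetical node type for this sketch.
    static class Node {
        char data; Node left, right;
        Node(char data, Node left, Node right) { this.data = data; this.left = left; this.right = right; }
    }

    static String levelOrder(Node root) {
        StringBuilder out = new StringBuilder();
        Queue<Node> queue = new ArrayDeque<>();
        if (root != null) queue.add(root);
        while (!queue.isEmpty()) {
            Node current = queue.remove();          // oldest discovered node first
            out.append(current.data).append(' ');
            if (current.left != null) queue.add(current.left);
            if (current.right != null) queue.add(current.right);
        }
        return out.toString().trim();
    }

    // Tree shape inferred from the traversal outputs on the slides.
    static Node sampleTree() {
        Node b = new Node('b', new Node('c', null, null),
                          new Node('d', new Node('s', null, null), null));
        Node w = new Node('w', null, new Node('u', null, null));
        return new Node('a', b, w);
    }
}
```

Replacing the queue with a stack would give a depth-first order instead, which is why a queue is the right structure here.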

7 Binary Search Trees
1) A node's key is greater than or equal to the key of every node in its left subtree, and smaller than the key of every node in its right subtree.
[Figure: two example trees with integer keys]
2) Operations: Search, Insert, Delete, Min, Max. Also: Successor, Predecessor

8 Algorithm Find
    public treenode find(int x, treenode t) {   // assumes data holds an int key
        if (t == null) return null;             // not found
        if (t.data == x) return t;
        if (x > t.data) return find(x, t.right);
        else return find(x, t.left);
    }
[Figure: binary search tree with keys 17, 8, 4, 26, 31, 27, 11, 35]
Example: find(27, t)
1) Since 27 > 17, search (27, ->26)
2) Since 27 > 26, search (27, ->31)
3) Since 27 < 31, search (27, ->27)
4) Since x = 27, return the node holding 27

9 Insertions in Java
    public treenode insert(treenode t, int k) {   // assumes data holds an int key
        if (t == null) {
            t = new treenode(); t.data = k; t.left = null; t.right = null;
            return t;
        } else if (k > t.data) {
            t.right = insert(t.right, k);
            return t;
        } else {
            t.left = insert(t.left, k);
            return t;   // the original omitted this return
        }
    }

10 Look for the smallest number in the BST
    public treenode min(treenode t) {
        if (t == null) return null;
        else if (t.left == null) return t;
        else return min(t.left);
    }

Tree deletion operations: Three cases: (1) The node is a leaf node (2) The node has one child (3) The node has two children

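12 Delete operation (cont.)
(1) The node is a leaf node: just delete it.
[Figure: tree with keys 20, 10, 30, 3, 13, 35 before and after deleting the leaf 35]
(2) The node has one child: connect its child to its parent.
[Figure: tree with keys 20, 10, 30, 3, 13, 35 before and after deleting 30, whose child 35 takes its place]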

13 The node has two children
Algorithm:
- Replace the value of the node with the smallest value in its right subtree.
- Delete the node that held the smallest value in the right subtree (it has at most one child).
[Figure: three stages of a deletion in a tree with keys 20, 10, 40, 3, 16, 35, 46, 49, 56, 58; captions: "replace its value with the smallest", "connect its child with its parent"]
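The three deletion cases can be combined into one recursive routine. The following is a minimal sketch, assuming distinct integer keys and a hypothetical `Node` class with a `key` field. The `insert`, `min`, and `inorder` helpers mirror the earlier slides and are included only to make the example self-contained.

```java
class BstDelete {
    // Hypothetical node type with an int key (distinct keys assumed).
    static class Node {
        int key; Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node t, int k) {
        if (t == null) return new Node(k);
        if (k > t.key) t.right = insert(t.right, k);
        else t.left = insert(t.left, k);
        return t;
    }

    // Smallest node of a non-empty subtree: keep going left.
    static Node min(Node t) {
        while (t.left != null) t = t.left;
        return t;
    }

    // Returns the root of the subtree after removing the node with key k.
    static Node delete(Node t, int k) {
        if (t == null) return null;                  // key not present
        if (k > t.key) {
            t.right = delete(t.right, k);
        } else if (k < t.key) {
            t.left = delete(t.left, k);
        } else if (t.left == null) {
            return t.right;                          // cases (1) and (2): leaf or right child only
        } else if (t.right == null) {
            return t.left;                           // case (2): left child only
        } else {
            t.key = min(t.right).key;                // case (3): copy the successor's key
            t.right = delete(t.right, t.key);        // then remove the successor node
        }
        return t;
    }

    static String inorder(Node t) {
        return t == null ? "" : inorder(t.left) + t.key + " " + inorder(t.right);
    }
}
```

Because the successor is the leftmost node of the right subtree, it has no left child, so the second deletion always falls into case (1) or (2).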

Complexity for BST
- Height of a tree: the number of edges on a longest path from the root to a leaf.
- Find, Insert, Delete: O(h), where h is the height of the tree.
- h can vary a lot (as a function of n = number of nodes), depending on how well balanced the BST is.
- h can be O(log n): we say that the tree is balanced.
- h can be O(n).
What is the relation between h and n? $h + 1 \le n \le 2^{h+1} - 1$. Proof by induction on n.
We will see later how to keep the tree balanced while doing the tree operations.
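The slide proves the relation between h and n by induction. As an alternative way to see where the two bounds come from, here is a direct counting argument (a sketch, not the proof from the slides):

```latex
\begin{align*}
n &\ge h + 1
  && \text{(a longest root-to-leaf path has } h \text{ edges, hence } h + 1 \text{ nodes)} \\
n &\le \sum_{i=0}^{h} 2^{i} = 2^{h+1} - 1
  && \text{(level } i \text{ of a binary tree holds at most } 2^{i} \text{ nodes)} \\
\log_2(n + 1) - 1 &\le h \le n - 1
  && \text{(solving both bounds for } h \text{)}
\end{align*}
```

The left end of the last line is the balanced case, h = O(log n); the right end is the degenerate chain, h = O(n).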

Search problems: one of the most common applications of computers.
- If the information is static, use binary search, which works in O(log n) time. Note that insert and delete would be slow (O(n)).
- If the information is dynamic, we want to implement search, insert, and delete, ideally all in O(log n) time.
Approach: use binary search trees. We have seen that search, insert, and delete are done in O(h) time, where h is the tree's height, but the tree can grow imbalanced and h can be Omega(n).
There are variants of binary search trees that do not become unbalanced:
- AVL trees (store in every node the height difference between the right subtree and the left subtree),
- red-black trees (store in every node the color red or black),
- splay trees (no extra storage needed, but operations take O(log n) time in the amortized sense, not in the worst-case sense).

16 Red-Black Trees (Chapter 13 in textbook) Done on board.

17 B trees (Chapter 18 in the textbook)
Binary search trees are not appropriate for data stored in external memory: each vertical move in the tree involves a mechanical movement of the disk head.
Memory operations: a 500 MIPS machine executes 500 million instructions per second.
Disk: 3600 rotations/min, so 1 rotation takes 1/60 s ≈ 16.7 ms. On average an access waits half a rotation, about 8.3 ms, so the disk performs about 120 operations per second.
So the time for one disk access ≈ the time for $4 \times 10^6$ memory operations.
Idea: use "multiway trees" instead of binary trees; at each node we choose among more than two children when continuing the search. Nodes are fatter, but trees are shallower. There is more work at a node (this happens in internal memory), but fewer nodes are visited (each newly visited node typically implies a separate disk access).
This idea is implemented via B-trees. There are several variants of B-trees. We discuss one of them: the B+-tree.

Definition of B trees of order m:
- The root is either a leaf or has between 2 and m children.
- Any non-leaf node (except the root) has between m/2 and m children.
- A node with j children contains j-1 key values (to guide the search).
- All leaves are at the same level; each leaf contains between m/2 and m actual values, stored in a sorted array of size m (some slots in the array may be empty).

A B-tree of order m with n values has height $h \le \log(n/2) / \log(m/2)$.
Search: obvious (just follow the guiding keys).
Insert and Delete: there are many cases. I'll just illustrate by examples.
All the operations work in O(h).

Insert and Delete in a B-tree

With 99 deleted, the number of values in the leaf drops below the minimum, so two leaves are merged. Now the parent has fewer children than the minimum, so a child is adopted from the left sibling.

24 Heaps (Chapter 6 in the textbook)
Main application: priority queues
The value of each node is less than that of its children (MIN-HEAP).
A heap is a complete binary tree, except possibly at the bottom level, which is filled from the left and has no "holes".
The height of the tree is $\lfloor \log_2 n \rfloor$, where n is the number of values (proof on board).
[Figure: three example trees, labeled "heap", "not a heap", "not a heap"]

25 Heap Implementation
We can use an array (due to the regular structure of the binary tree):
    Index: 1 2 3 4 5 6 7 8 9
    Value: A B C D E F G H I
[Figure: the corresponding binary tree with nodes A through I]
Left child of i: 2 * i
Right child of i: 2 * i + 1
Parent of i: $\lfloor i/2 \rfloor$

26 Heap operations
1) Insertion
2) DeleteMin
3) BuildHeap (organize an arbitrary array as a heap)
Insertion: place the new element NewElem in the left-most open position on the bottom level, then
    while (NewElem < its parent) exchange them.
Example (inserting 4, nodes listed level by level): 5 10 9 20 40 4 -> 5 10 4 20 40 9 -> 4 10 5 20 40 9
Question: is it possible that some element at the same level but on a different branch is smaller than the newly moved-up value?
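Insertion can be sketched on the array layout from the implementation slide, where the parent of position i is at i/2. The class below is an illustration only: `MinHeapInsert`, `contents`, and the initial capacity are assumed names and values, and slot 0 is left unused so that the slides' 1-based index formulas apply directly.

```java
import java.util.Arrays;

class MinHeapInsert {
    int[] a = new int[16];   // a[1..size] holds the heap; a[0] is unused
    int size = 0;

    void insert(int x) {
        if (size + 1 == a.length) a = Arrays.copyOf(a, 2 * a.length);  // grow when full
        int i = ++size;                      // left-most open position on the bottom level
        a[i] = x;
        while (i > 1 && a[i] < a[i / 2]) {   // smaller than its parent: exchange them
            int tmp = a[i]; a[i] = a[i / 2]; a[i / 2] = tmp;
            i /= 2;
        }
    }

    // The heap contents in level order, without the unused slot 0.
    int[] contents() {
        return Arrays.copyOfRange(a, 1, size + 1);
    }
}
```

Reading the slide's example trees level by level (an interpretation of the figure), inserting 4 into the heap 5 10 9 20 40 moves it up twice and ends with 4 10 5 20 40 9; this sketch goes through the same two exchanges.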

27 Delete Min
- Replace the root by the right-most node at the bottom level.
- While (the moved element > one of its children), exchange it with the smaller of its children.
Example (nodes listed level by level): 5 10 13 18 40 20 21 -> 21 10 13 18 40 20 -> 10 21 13 18 40 20 -> 10 18 13 21 40 20
Complexity analysis: deletion and insertion take O(log n). Reasons:
1) number of operations = O(height of the heap)
2) height of the heap: O(log n)
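A possible Java rendering of Delete Min on the 1-based array layout is sketched below. The names `MinHeapDelete`, `deleteMin`, and `siftDown` are chosen for this example, and the caller is assumed to track the heap size separately and to pass a non-empty heap.

```java
class MinHeapDelete {
    // a[1..size] holds the heap (size >= 1); a[0] is unused.
    // Returns the minimum; afterwards the heap occupies a[1..size-1].
    static int deleteMin(int[] a, int size) {
        int min = a[1];
        a[1] = a[size];              // move the right-most bottom element to the root
        siftDown(a, 1, size - 1);
        return min;
    }

    // Push a[i] down until neither child is smaller.
    static void siftDown(int[] a, int i, int size) {
        while (2 * i <= size) {
            int child = 2 * i;
            if (child + 1 <= size && a[child + 1] < a[child]) child++;  // pick the smaller child
            if (a[i] <= a[child]) break;
            int tmp = a[i]; a[i] = a[child]; a[child] = tmp;
            i = child;
        }
    }
}
```

Reading the slide's example level by level (an interpretation of the figure), the heap 5 10 13 18 40 20 21 becomes 10 18 13 21 40 20 after the minimum 5 is removed. The loop runs at most once per level, which gives the O(log n) bound stated on the slide.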

BuildHeap operation
Build a heap from an arbitrary array.
Bottom-up procedure: start with the subtrees having roots at level h-1 and make them heaps; then move to the subtrees with roots at level h-2 and make them heaps, and so on until level 0 is reached.

Time complexity: $1 \cdot 2^{h-1} + 2 \cdot 2^{h-2} + 3 \cdot 2^{h-3} + \dots + h \cdot 2^{0} < 2 \cdot 2^{h}$
Since $h = \lfloor \log_2 n \rfloor$, we have $2^{h} \le n$, so the time complexity is O(n).
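The bottom-up procedure can be sketched in Java on the 1-based array layout. Positions size/2 down to 1 are exactly the non-leaf nodes, and visiting them in decreasing order processes deeper levels before shallower ones. `BuildHeap`, `siftDown`, and `isMinHeap` are names assumed for this example.

```java
class BuildHeap {
    // Rearranges a[1..size] into a min-heap; a[0] is unused.
    static void buildHeap(int[] a, int size) {
        for (int i = size / 2; i >= 1; i--) {   // last non-leaf node first, root last
            siftDown(a, i, size);
        }
    }

    // Push a[i] down until neither child is smaller.
    static void siftDown(int[] a, int i, int size) {
        while (2 * i <= size) {
            int child = 2 * i;
            if (child + 1 <= size && a[child + 1] < a[child]) child++;
            if (a[i] <= a[child]) break;
            int tmp = a[i]; a[i] = a[child]; a[child] = tmp;
            i = child;
        }
    }

    // Checks the heap property: every node is at least as large as its parent.
    static boolean isMinHeap(int[] a, int size) {
        for (int i = 2; i <= size; i++) {
            if (a[i / 2] > a[i]) return false;
        }
        return true;
    }
}
```

Each term of the sum above counts the nodes at one level times the maximum distance they can sink, which is why the total is linear rather than O(n log n).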

33 Heap Questions
(1) Do you need to specify which element of a heap is to be deleted?
(2) What is the difference between a binary search tree and a heap?

Heap Applications (1) priority queues (2) Heap Sort: (a) BuildHeap from the initial array– O(n) (b) Delete-Min n times – n x O(log n) = O(n log n) Time complexity : O(n log n)
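The two Heap Sort phases can be sketched as follows. For simplicity this version copies the input into a 1-based work array and writes the minima into a separate output array; an in-place variant exists but is not shown. All names are assumptions of the example.

```java
class HeapSortSketch {
    static int[] heapSort(int[] input) {
        int n = input.length;
        int[] a = new int[n + 1];                 // a[1..n] is the heap; a[0] is unused
        System.arraycopy(input, 0, a, 1, n);
        for (int i = n / 2; i >= 1; i--) {        // (a) BuildHeap: O(n)
            siftDown(a, i, n);
        }
        int[] sorted = new int[n];
        for (int k = 0, size = n; k < n; k++, size--) {   // (b) Delete-Min n times
            sorted[k] = a[1];
            a[1] = a[size];
            siftDown(a, 1, size - 1);
        }
        return sorted;
    }

    // Push a[i] down until neither child is smaller.
    static void siftDown(int[] a, int i, int size) {
        while (2 * i <= size) {
            int child = 2 * i;
            if (child + 1 <= size && a[child + 1] < a[child]) child++;
            if (a[i] <= a[child]) break;
            int tmp = a[i]; a[i] = a[child]; a[child] = tmp;
            i = child;
        }
    }
}
```

For example, `HeapSortSketch.heapSort(new int[]{5, 3, 9, 1, 4, 1})` returns the array 1 1 3 4 5 9.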

35 Compression: Huffman coding (Section 16.3)
Data compression consists of two phases:
(1) Encoding (compression)
(2) Decoding (decompression)
Example: convert a sequence of characters into binary sequences of equal length
    a --- 000
    b --- 001
    c --- 010
    d --- 011
    e --- 100
Problem: not efficient, since letter frequencies can be vastly different

36 Potential problems
Method 2: use a variable-length code
    letter  frequency  code
    a       0.30       0
    b       0.26       1
    c       0.20       00
    d       0.14       01
    e       0.10       10
Problem: how to convert 00110110 or 1010010010 to letters?
Solution: we want the code to be prefix-free (no codeword is a prefix of another codeword).

37 Huffman Code
General strategy:
- allow the code length to vary
- guarantee that decoding is unambiguous, by making the code prefix-free
Note: a prefix-free binary code corresponds to a binary tree.
Huffman Algorithm:
Initially, each character and its frequency are represented by a tree with a single node.
(1) Find the 2 trees with the smallest weights and merge them as the left and right children of a new root.
(2) The weight of the new root is the sum of the weights of its two children.
Repeat until there is only one tree left.
Assign a 0 to each left edge and a 1 to each right edge. The path from the root to the leaf node representing a character is the codeword for the character.
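One common way to implement the merging loop is with a priority queue keyed on tree weight, so that the two lightest trees can be removed cheaply. The sketch below uses `java.util.PriorityQueue`; the class and method names (`HuffmanSketch`, `buildTree`, `codes`) are assumptions of the example. The slides do not fix which merged tree becomes the left child, so the exact bit patterns may differ from a hand trace while the codeword lengths stay the same.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

class HuffmanSketch {
    // Hypothetical tree node: leaves carry a symbol, internal nodes only a weight.
    static class Node {
        final double weight; final char symbol; final Node left, right;
        Node(double weight, char symbol, Node left, Node right) {
            this.weight = weight; this.symbol = symbol; this.left = left; this.right = right;
        }
        boolean isLeaf() { return left == null && right == null; }
    }

    static Node buildTree(Map<Character, Double> freq) {
        PriorityQueue<Node> pq =
            new PriorityQueue<Node>((x, y) -> Double.compare(x.weight, y.weight));
        for (Map.Entry<Character, Double> e : freq.entrySet()) {
            pq.add(new Node(e.getValue(), e.getKey(), null, null));   // single-node trees
        }
        while (pq.size() > 1) {
            Node x = pq.poll();                                       // two smallest weights
            Node y = pq.poll();
            pq.add(new Node(x.weight + y.weight, '\0', x, y));        // root weight = sum
        }
        return pq.poll();   // null if freq is empty
    }

    // 0 on each left edge, 1 on each right edge; a leaf's path is its codeword.
    static void assignCodes(Node t, String prefix, Map<Character, String> codes) {
        if (t == null) return;
        if (t.isLeaf()) { codes.put(t.symbol, prefix); return; }
        assignCodes(t.left, prefix + "0", codes);
        assignCodes(t.right, prefix + "1", codes);
    }

    static Map<Character, String> codes(Map<Character, Double> freq) {
        Map<Character, String> codes = new HashMap<>();
        assignCodes(buildTree(freq), "", codes);
        return codes;
    }
}
```

With the frequencies a 0.30, b 0.26, c 0.20, d 0.14, e 0.10, this produces 2-bit codewords for a, b, and c and 3-bit codewords for d and e. A one-symbol alphabet gets the empty codeword in this sketch, a corner case that a real encoder would need to handle separately.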

38 Trace Huffman algorithm
Frequency: a - 0.30, b - 0.26, c - 0.20, d - 0.14, e - 0.10
(1) Merge d and e into T1 (weight 0.24)
(2) Merge T1 and c into T2 (weight 0.44)
(3) Merge a and b into T3 (weight 0.56)

39 Trace (cont.)
(4) Merge T2 and T3 into T4 (weight 1)
Encode:
    a --- 00
    b --- 01
    c --- 10
    d --- 110
    e --- 111
Decode: there is only one way to convert any string to letters.
Example: 0011001111 -> a d b e
Why? No codeword is a prefix of the codeword of another letter.
How to implement the algorithm?
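Decoding a prefix-free code needs no lookahead: read bits until the bits read so far form a codeword, emit its letter, and start over. The sketch below does this with a lookup table from codeword to letter. `PrefixDecode`, `decode`, and `slideCode` are names assumed for the example, and the table holds the code from the trace above.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixDecode {
    static String decode(String bits, Map<String, Character> table) {
        StringBuilder out = new StringBuilder();
        StringBuilder current = new StringBuilder();
        for (char bit : bits.toCharArray()) {
            current.append(bit);
            Character symbol = table.get(current.toString());
            if (symbol != null) {            // a complete codeword has been read
                out.append(symbol);
                current.setLength(0);        // start the next codeword
            }
        }
        if (current.length() != 0) {
            throw new IllegalArgumentException("trailing bits do not form a codeword");
        }
        return out.toString();
    }

    // The code from the trace: a=00, b=01, c=10, d=110, e=111.
    static Map<String, Character> slideCode() {
        Map<String, Character> t = new HashMap<>();
        t.put("00", 'a');
        t.put("01", 'b');
        t.put("10", 'c');
        t.put("110", 'd');
        t.put("111", 'e');
        return t;
    }
}
```

Calling `PrefixDecode.decode("0011001111", PrefixDecode.slideCode())` returns "adbe". Emitting a letter as soon as a codeword matches is safe only because the code is prefix-free; with the ambiguous code from the "Potential problems" slide, the same greedy loop would never emit c, d, or e.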

Compression problem
Input: symbols $a_1, a_2, \dots, a_n$ with probabilities $p_1, p_2, \dots, p_n$.
GOAL: find a prefix-free set of codewords $C = \{c_1, c_2, \dots, c_n\}$ having lengths $l_1, l_2, \dots, l_n$ such that the average length $L(C) = p_1 l_1 + p_2 l_2 + \dots + p_n l_n$ is as small as possible.
Theorem. The Huffman algorithm finds an optimal code.
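As a worked instance of the average-length formula, take the five-letter example from the trace, with codeword lengths 2, 2, 2, 3, 3 for a, b, c, d, e:

```latex
\begin{align*}
L(C) &= 0.30 \cdot 2 + 0.26 \cdot 2 + 0.20 \cdot 2 + 0.14 \cdot 3 + 0.10 \cdot 3 \\
     &= 0.60 + 0.52 + 0.40 + 0.42 + 0.30 \\
     &= 2.24 \text{ bits per letter}
\end{align*}
```

The equal-length code from the first compression slide uses 3 bits for every letter, so the Huffman code saves 0.76 bits per letter on average for these frequencies.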