Chapter 8 – Binary Search Tree


Chapter 8 – Binary Search Tree, 8.6 Huffman Trees

Attendance Quiz #29: Trees

Tip #31: size() vs empty()
Should size() or empty() be used to test whether a container is empty?

bool empty() const { return (size() == 0); }  // BAD??

Some (pre-C++11) implementations of std::list::size take O(n) rather than O(1). Using empty() instead:
- is unlikely to incur the overhead of an extra function call, since the function is probably inlined;
- makes the code more readable and means you don't have to worry about implementation details;
- lets your code be adapted more easily to other types of containers with other characteristics.
Bottom line: the standard guarantees that empty() is a constant-time operation for all standard containers.
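A minimal illustration of the tip (the container and values here are made up for this example):

#include <iostream>
#include <list>

int main() {
    std::list<int> values{1, 2, 3};

    // Preferred: empty() is guaranteed to be constant time for every standard container.
    if (!values.empty()) {
        std::cout << "the list has " << values.size() << " elements\n";
    }

    // Also works, but size() was allowed to be O(n) for std::list before C++11.
    if (values.size() == 0) {
        std::cout << "the list is empty\n";
    }
    return 0;
}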

Case Study: Building a Custom Huffman Tree (8.6, pgs. 496-505)

Huffman Trees
Huffman coding is a lossless data compression algorithm. It assigns variable-length codes to input characters based on the frequencies of the corresponding characters: the most frequent character gets the shortest code and the least frequent character gets the longest code.
A straight binary encoding of an alphabet assigns a unique 8-bit binary number to each symbol (8 bits allow 256 possible characters), so the length of a message is 8 x n bits, where n is the total number of characters. The message "go eagles" requires 9 x 8 = 72 bits in ASCII; the same string encoded using a good Huffman encoding scheme requires only 38 bits.
A Huffman tree can be implemented using a binary tree and a priority_queue.
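As a small illustration (not from the textbook), the sketch below counts character frequencies, the first step toward building a Huffman tree, and reports the 8-bits-per-character cost of the message:

#include <iostream>
#include <map>
#include <string>

int main() {
    const std::string message = "go eagles";

    // Count how often each character occurs; these frequencies drive the Huffman tree.
    std::map<char, int> frequency;
    for (char ch : message) {
        ++frequency[ch];
    }
    for (const auto& entry : frequency) {
        std::cout << "'" << entry.first << "' occurs " << entry.second << " time(s)\n";
    }

    // Fixed-length encoding: 8 bits for each of the n characters (9 x 8 = 72 bits here).
    std::cout << "fixed-length cost: " << 8 * message.size() << " bits\n";
    return 0;
}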

Huffman Trees
In 1973, Donald Knuth published the relative frequencies of the letters in English text. The letter e occurs an average of 103 times in every 1000 letters (that is, 10.3% of the letters are e's). In the Huffman code derived from these frequencies (shown on the slide), e = 010.

A
1. What are the Huffman codes for the characters found in the Huffman binary tree to the right?
[Figure: Huffman tree with leaves p (1), n (1), l (2), e (4), and s (5); the root has weight 13 and one internal node has weight 8.]

char  frequency  encoding
s     5          0
e     4          10
l     2          110
n     1          1110
p     1          1111

2. Using the above Huffman codes, what word is represented by the following 27-bit stream?
011010101111110100011101000
Answer: sleeplessness

3. What is the resulting compression ratio of Huffman encoding versus 8-bit ASCII encoding?
Answer: (13 x 8 = 104 bits) : 27 bits, or roughly 4 : 1
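One possible decoding sketch for the table above (the decode function and its name are illustrative, not from the textbook). Because Huffman codes are prefix-free, it is enough to accumulate bits until they match a complete code word:

#include <iostream>
#include <map>
#include <string>

std::string decode(const std::string& bits, const std::map<std::string, char>& codes) {
    std::string result, current;
    for (char bit : bits) {
        current += bit;                     // grow the current candidate code word
        auto it = codes.find(current);
        if (it != codes.end()) {            // matched a complete code word
            result += it->second;
            current.clear();
        }
    }
    return result;
}

int main() {
    const std::map<std::string, char> codes{
        {"0", 's'}, {"10", 'e'}, {"110", 'l'}, {"1110", 'n'}, {"1111", 'p'}};
    std::cout << decode("011010101111110100011101000", codes) << "\n";  // sleeplessness
    return 0;
}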

Steps to Build a Huffman Tree
Input: an array of unique characters with their frequencies of occurrence.
1. Create a leaf node for each unique character and build a min-heap of all leaf nodes. The min-heap serves as a priority queue: the frequency field is used to compare two nodes, so the least frequent character sits at the root of the heap.
2. Extract the two nodes with minimum frequencies from the min-heap and create a new internal node whose frequency is the sum of the two extracted nodes' frequencies. Make the first extracted node the left child and the second the right child.
3. Add the new internal node (with its two children) back to the min-heap.
4. Repeat steps 2 and 3 until the heap contains only one node. The tree is then complete, and the remaining node is the root of the Huffman binary tree.
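A C++ sketch of these steps, using std::priority_queue as the min-heap (HuffNode and build_huffman are illustrative names, and memory cleanup is omitted for brevity):

#include <queue>
#include <utility>
#include <vector>

struct HuffNode {
    char symbol;                 // meaningful only for leaf nodes
    int frequency;
    HuffNode* left = nullptr;
    HuffNode* right = nullptr;
};

struct CompareFrequency {
    bool operator()(const HuffNode* a, const HuffNode* b) const {
        return a->frequency > b->frequency;        // smallest frequency on top (min-heap)
    }
};

HuffNode* build_huffman(const std::vector<std::pair<char, int>>& frequencies) {
    std::priority_queue<HuffNode*, std::vector<HuffNode*>, CompareFrequency> heap;
    for (const auto& entry : frequencies) {        // step 1: one leaf per unique character
        heap.push(new HuffNode{entry.first, entry.second});
    }
    while (heap.size() > 1) {                      // step 4: repeat until one node remains
        HuffNode* left = heap.top();  heap.pop();  // step 2: extract the two minimum nodes
        HuffNode* right = heap.top(); heap.pop();
        HuffNode* parent = new HuffNode{'\0', left->frequency + right->frequency, left, right};
        heap.push(parent);                         // step 3: add the internal node back
    }
    return heap.empty() ? nullptr : heap.top();    // root of the Huffman binary tree
}

Each pass of the loop removes the two smallest trees and replaces them with their combined parent, so the heap shrinks by one node per iteration until only the root remains.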

An assassin sins
1. Create a leaf node for each unique character and build a min-heap of all leaf nodes.

char  frequency  encoding
a     3          00
n     3          01
␣     2          100
i     2          101
s     6          11

2. Extract two nodes, create a new node whose frequency equals the sum of the two node frequencies, and add it back to the min-heap. Repeat until one node remains.
[Figure: merge sequence — (␣:2 + i:2) → 4, (a:3 + n:3) → 6, (4 + s:6) → 10, (6 + 10) → 16, which becomes the root.]

3. What is the bit stream for "An assassin sins"?
Answer: 000110000111100111110101100111010111
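A small encoding sketch using the table above (the encode function is illustrative; the input is written in lower case to match the quiz's treatment of 'A'):

#include <iostream>
#include <map>
#include <string>

std::string encode(const std::string& text, const std::map<char, std::string>& codes) {
    std::string bits;
    for (char ch : text) {
        bits += codes.at(ch);   // each character contributes its variable-length code
    }
    return bits;
}

int main() {
    const std::map<char, std::string> codes{
        {'a', "00"}, {'n', "01"}, {' ', "100"}, {'i', "101"}, {'s', "11"}};
    std::cout << encode("an assassin sins", codes) << "\n";
    // 000110000111100111110101100111010111 -- 36 bits versus 16 x 8 = 128 bits in ASCII
    return 0;
}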

B
Character stream: aaabbc

char  frequency  encoding
a     3          0
b     2          11
c     1          10

1. Using the above character stream, create a leaf node for each unique character and build a min-heap of all leaf nodes.
2. Extract two nodes, create a new node whose frequency equals the sum of the two node frequencies, and add it back to the min-heap. Repeat until one node remains.
[Figure: merge sequence — (c:1 + b:2) → 3, (a:3 + 3) → 6, which becomes the root.]
3. What is the bit stream for "baa"?
Answer: 1100
4. What is the resulting compression ratio of Huffman to ASCII?
Answer: 4 bits : (3 x 8 = 24 bits), or 1 : 6

11.1 Self-Balancing Trees, pgs. 496-505

Balanced Binary Search Trees
The performance of a binary search tree is proportional to the height of the tree, i.e., the maximum number of nodes along a path from the root to a leaf. A full binary tree of height h (where an empty tree has height 0) can hold 2^h - 1 items. If a binary search tree is full and contains n items, the expected performance is O(log n). However, if a binary tree is not full, the actual performance is worse than expected, possibly as bad as O(n).
Self-balancing trees require the heights of the right and left subtrees to be equal or nearly equal. We'll examine these trees later, as well as non-binary search trees: the B-tree and its specializations, the 2-3 and 2-3-4 trees, and the B+ tree.

Balanced Binary Search Trees
A balanced tree is a tree whose height is on the order of log(number of elements in the tree). Balancing applies recursively to every subtree; that is, a tree is balanced if and only if:
- the left and right subtrees' heights differ by at most one, AND
- the left subtree is balanced, AND
- the right subtree is balanced.
[Figure: a balanced BST containing 8, 4, 12, 2, 10, 9, 13 and an unbalanced BST containing 8, 4, 12, 2, 10, 9.]
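A recursive sketch of this definition (the node type mirrors the BTNode used in the rotation slides below; the helper names are illustrative):

#include <algorithm>
#include <cstdlib>

struct BTNode {
    int data;
    BTNode* left = nullptr;
    BTNode* right = nullptr;
};

int height(const BTNode* node) {
    if (node == nullptr) return 0;                 // an empty tree has height 0
    return 1 + std::max(height(node->left), height(node->right));
}

bool is_balanced(const BTNode* node) {
    if (node == nullptr) return true;
    return std::abs(height(node->left) - height(node->right)) <= 1   // heights differ by at most one
        && is_balanced(node->left)                                    // left subtree is balanced
        && is_balanced(node->right);                                  // right subtree is balanced
}

This direct translation recomputes subtree heights repeatedly; a production check would compute height and balance together in a single pass.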

Why Balance Is Important
The binary search tree to the right, built by inserting the keys 1 through 10 in order, degenerates into a chain and has search performance of O(n), not O(log n). (Balancing applies recursively to every subtree, as defined on the previous slide.)
[Figure: a degenerate, chain-shaped BST of the keys 1-10, alongside the balanced/unbalanced examples from the previous slide.]

Rotation
We need an operation on a binary tree that changes the relative heights of the left and right subtrees but preserves the binary search tree property. Tree rotation is an operation on a binary tree that changes the structure without interfering with the order of the elements: a rotation moves one node up in the tree and one node down. Rotation is used to change the shape of the tree, and in particular to decrease its height by moving smaller subtrees down and larger subtrees up, resulting in improved performance for many tree operations.


Algorithm for Rotation
The slides step through a right rotation. Initially root points to the node containing 20; its left subtree is rooted at 10 (with children 5 and 15, and 7 a child of 5) and its right child is 40.
1. Remember the value of root->left (temp = root->left).
2. Set root->left to the value of temp->right.
3. Set temp->right to root.
4. Set root to temp.
[Figure: after the rotation, 10 is the new root, with left child 5 (and its child 7) and right child 20, whose children are 15 and 40.]

Implementing Rotation
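A sketch of the right-rotation steps from the previous slides (the BTNode fields match the diagrams; rotate_right is an illustrative name, not necessarily the textbook's):

struct BTNode {
    int data;
    BTNode* left = nullptr;
    BTNode* right = nullptr;
};

// Rotate the subtree pointed to by root to the right. root is passed by reference
// so that step 4 updates the caller's pointer to refer to the new subtree root.
void rotate_right(BTNode*& root) {
    BTNode* temp = root->left;   // 1. remember root->left
    root->left = temp->right;    // 2. root->left adopts temp's right subtree
    temp->right = root;          // 3. the old root becomes temp's right child
    root = temp;                 // 4. temp is the new root of this subtree
}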
