Lecture Objectives  To learn how to use a Huffman tree to encode characters using fewer bytes than ASCII or Unicode, resulting in smaller files and reduced.

Slides:



Advertisements
Similar presentations
Lecture 4 (week 2) Source Coding and Compression
Advertisements

Michael Alves, Patrick Dugan, Robert Daniels, Carlos Vicuna
Binary Trees CSC 220. Your Observations (so far data structures) Array –Unordered Add, delete, search –Ordered Linked List –??
Huffman Encoding Dr. Bernard Chen Ph.D. University of Central Arkansas.
Data Compressor---Huffman Encoding and Decoding. Huffman Encoding Compression Typically, in files and messages, Each character requires 1 byte or 8 bits.
Trees Chapter 8.
Compression & Huffman Codes
Huffman Coding: An Application of Binary Trees and Priority Queues
Fall 2007CS 2251 Trees Chapter 8. Fall 2007CS 2252 Chapter Objectives To learn how to use a tree to represent a hierarchical organization of information.
Trees Chapter 8. Chapter 8: Trees2 Chapter Objectives To learn how to use a tree to represent a hierarchical organization of information To learn how.
Trees Chapter 8. Chapter 8: Trees2 Chapter Objectives To learn how to use a tree to represent a hierarchical organization of information To learn how.
A Data Compression Algorithm: Huffman Compression
CS 206 Introduction to Computer Science II 04 / 29 / 2009 Instructor: Michael Eckmann.
Chapter 9: Huffman Codes
CS 206 Introduction to Computer Science II 12 / 10 / 2008 Instructor: Michael Eckmann.
CSE 143 Lecture 18 Huffman slides created by Ethan Apter
Lossless Data Compression Using run-length and Huffman Compression pages
Data Compression and Huffman Trees (HW 4) Data Structures Fall 2008 Modified by Eugene Weinstein.
Huffman code uses a different number of bits used to encode characters: it uses fewer bits to represent common characters and more bits to represent rare.
Huffman Codes Message consisting of five characters: a, b, c, d,e
CSE Lectures 22 – Huffman codes
Data Structures and Algorithms Huffman compression: An Application of Binary Trees and Priority Queues.
Data Structures Arrays both single and multiple dimensions Stacks Queues Trees Linked Lists.
Trees. Tree Terminology Chapter 8: Trees 2 A tree consists of a collection of elements or nodes, with each node linked to its successors The node at the.
MA/CSSE 473 Day 31 Student questions Data Compression Minimal Spanning Tree Intro.
Huffman Codes. Encoding messages  Encode a message composed of a string of characters  Codes used by computer systems  ASCII uses 8 bits per character.
Huffman Encoding Veronica Morales.
1 Analysis of Algorithms Chapter - 08 Data Compression.
CS-2852 Data Structures LECTURE 13B Andrew J. Wozniewicz Image copyright © 2010 andyjphoto.com.
Data Structures Week 6: Assignment #2 Problem
Trees Chapter 8. Chapter 8: Trees2 Chapter Objectives To learn how to use a tree to represent a hierarchical organization of information To learn how.
Spring 2010CS 2251 Trees Chapter 6. Spring 2010CS 2252 Chapter Objectives Learn to use a tree to represent a hierarchical organization of information.
Huffman Coding. Huffman codes can be used to compress information –Like WinZip – although WinZip doesn’t use the Huffman algorithm –JPEGs do use Huffman.
Homework #5 New York University Computer Science Department Data Structures Fall 2008 Eugene Weinstein.
Huffman coding Content 1 Encoding and decoding messages Fixed-length coding Variable-length coding 2 Huffman coding.
Huffman Code and Data Decomposition Pranav Shah CS157B.
Priority Queues, Trees, and Huffman Encoding CS 244 This presentation requires Audio Enabled Brent M. Dingle, Ph.D. Game Design and Development Program.
Huffman Codes Juan A. Rodriguez CS 326 5/13/2003.
CPS 100, Spring Huffman Coding l D.A Huffman in early 1950’s l Before compressing data, analyze the input stream l Represent data using variable.
Huffman’s Algorithm 11/02/ Weighted 2-tree A weighted 2-tree T is an extended binary tree with n external nodes and each of the external nodes is.
Foundation of Computing Systems
1 Algorithms CSCI 235, Fall 2015 Lecture 30 More Greedy Algorithms.
Lossless Decomposition and Huffman Codes Sophia Soohoo CS 157B.
1 Huffman Codes. 2 ASCII use same size encoding for all characters. Variable length codes can produce shorter messages than fixed length codes Huffman.
CSE 143 Lecture 22 Huffman slides created by Ethan Apter
CS 101 – Sept. 11 Review linear vs. non-linear representations. Text representation Compression techniques Image representation –grayscale –File size issues.
Compression and Huffman Coding. Compression Reducing the memory required to store some information. Lossless compression vs lossy compression Lossless.
Lecture on Data Structures(Trees). Prepared by, Jesmin Akhter, Lecturer, IIT,JU 2 Properties of Heaps ◈ Heaps are binary trees that are ordered.
Design & Analysis of Algorithm Huffman Coding
Huffman Codes ASCII is a fixed length 7 bit code that uses the same number of bits to define each character regardless of how frequently it occurs. Huffman.
Expression Tree The inner nodes contain operators while leaf nodes contain operands. a c + b g * d e f Start of lecture 25.
HUFFMAN CODES.
COMP261 Lecture 22 Data Compression 2.
Assignment 6: Huffman Code Generation
Huffman Coding Based on slides by Ethan Apter & Marty Stepp
Heaps, Priority Queues, Compression
Data Compression If you’ve ever sent a large file to a friend, you may have compressed it into a zip archive like the one on this slide before doing so.
Chapter 8 – Binary Search Tree
The Huffman Algorithm We use Huffman algorithm to encode a long message as a long bit string - by assigning a bit string code to each symbol of the alphabet.
Huffman Coding.
Chapter 11 Data Compression
Huffman Coding CSE 373 Data Structures.
Huffman Encoding Huffman code is method for the compression for standard text documents. It makes use of a binary tree to develop codes of varying lengths.
Trees Addenda.
Data Structure and Algorithms
Heaps and Priority Queues
Podcast Ch23d Title: Huffman Compression
Algorithms CSCI 235, Spring 2019 Lecture 30 More Greedy Algorithms
Huffman Coding Greedy Algorithm
Algorithms CSCI 235, Spring 2019 Lecture 31 Huffman Codes
Presentation transcript:

Lecture Objectives  To learn how to use a Huffman tree to encode characters using fewer bytes than ASCII or Unicode, resulting in smaller files and reduced storage requirements CS340 1

Huffman Trees CS340

Huffman Tree  A Huffman tree represents Huffman codes for characters that might appear in a text file  As opposed to ASCII or Unicode, Huffman code uses different numbers of bits to encode letters  More common characters use fewer bits  Many programs that compress files use Huffman codes CS340 3

Huffman Tree (cont.) To form a code, traverse the tree from the root to the chosen character, appending 0 if you turn left, and 1 if you turn right. 4 CS340

Huffman Tree (cont.) Examples: d : e : 010 Examples: d : e : 010 CS340 5

Huffman Trees  Implemented using a binary tree and a PriorityQueue  Unique binary number to each symbol in the alphabet  Unicode is an example of such a coding  The message “go eagles” requires 144 bits in Unicode but only 38 bits using Huffman coding CS340 6

Huffman Trees (cont.) CS340 7

Huffman Trees (cont.) CS340 8

Building a Custom Huffman Tree CS340  Input: an array of objects such that each object contains  a reference to a symbol occurring in that file  the frequency of occurrence (weight) for the symbol in that file 9

Building a Custom Huffman Tree (cont.) CS340  Analysis:  Each node will have storage for two data items: the weight of the node and the symbol associated with the node  All symbols will be stored in leaf nodes  For nodes that are not leaf nodes, the symbol part has no meaning  Which data structure is best to use? 10

Building a Custom Huffman Tree (cont.) CS340 11

Building a Custom Huffman Tree (cont.) CS340 12

Design CS340 Algorithm for Building a Huffman Tree 1. Construct a set of trees with root nodes that contain each of the individual symbols and their weights. 2. Place the set of trees into a priority queue. 3. while the priority queue has more than one item 4. Remove the two trees with the smallest weights. 5. Combine them into a new binary tree in which the weight of the tree root is the sum of the weights of its children. 6. Insert the newly created tree back into the priority queue. 13

Design (cont.) CS340 14

Implementation CS340  Listing 6.9 (Class HuffmanTree ; page 349)  Listing 6.10 (The buildTree Method ( HuffmanTree.java ); pages )  Listing 6.11 (The decode Method ( HuffmanTree.java ); page 352) 15