Huffman Coding

Huffman codes can be used to compress information
– Like WinZip – although ZIP's DEFLATE compression doesn't use Huffman coding by itself: it combines it with LZ77 dictionary compression
– JPEGs do use Huffman coding as part of their compression process
The basic idea is that instead of storing each character in a file as an 8-bit ASCII value, we store the more frequently occurring characters using fewer bits and the less frequently occurring characters using more bits
– On average this should decrease the file size (often by roughly half)

Huffman Coding
As an example, let's take the string: "duke blue devils"
We first do a frequency count of the characters: e:3, d:2, u:2, l:2, space:2, k:1, b:1, v:1, i:1, s:1
Next we use a greedy algorithm to build up a Huffman tree
– We start with a single node for each character: (e,3) (d,2) (u,2) (l,2) (sp,2) (k,1) (b,1) (v,1) (i,1) (s,1)

Huffman Coding
We then pick the two nodes with the smallest frequency and combine them together to form a new node
– The selection of these nodes is the greedy part
The two selected nodes are removed from the set, but replaced by the combined node
This continues until we have only 1 node left in the set
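As a rough sketch of this greedy loop (not part of the original slides), the whole build fits in a few lines of Python using a priority queue. Here a leaf is just a character, an internal node is a (left, right) pair, and an insertion counter breaks ties between equal weights:

import heapq
from collections import Counter

def build_huffman_tree(text):
    # One heap entry per distinct character: (weight, tie-breaker, tree),
    # where a tree is either a character (leaf) or a (left, right) pair.
    freqs = Counter(text)
    heap = [(weight, i, ch) for i, (ch, weight) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Greedy step: remove the two lowest-weight nodes...
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # ...and replace them with a combined node whose weight is the sum.
        heapq.heappush(heap, (w1 + w2, next_id, (left, right)))
        next_id += 1
    total_weight, _, tree = heap[0]
    return total_weight, tree

total, tree = build_huffman_tree("duke blue devils")
print(total)   # 16 -- the root's weight equals the length of the string

Ties between equal frequencies can be broken either way, so the resulting tree (and codes) may differ slightly from the slides while still being optimal.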

Huffman Coding
[The next several slides are figures showing the forest of nodes after each greedy merge, ending in a single tree of total weight 16. Read together with the final code table, the merges are: (i,1)+(s,1) → 2; (b,1)+(v,1) → 2; (k,1)+(b,v) → 3; (d,2)+(u,2) → 4; (l,2)+(sp,2) → 4; (i,s)+(k,b,v) → 5; (e,3)+(d,u) → 7; (l,sp)+(i,s,k,b,v) → 9; 7+9 → 16 at the root.]

Huffman Coding
Now we assign codes to the tree by placing a 0 on every left branch and a 1 on every right branch
A traversal of the tree from root to leaf gives the Huffman code for that particular leaf character
Note that no code is the prefix of another code
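A minimal sketch of this traversal (again not from the slides), using the same nested-tuple representation as the build sketch above; the tiny demo tree at the bottom is purely illustrative:

def assign_codes(tree, prefix="", table=None):
    # Depth-first traversal: append 0 for a left branch, 1 for a right branch.
    if table is None:
        table = {}
    if isinstance(tree, str):          # leaf: the accumulated path is its code
        table[tree] = prefix
    else:
        left, right = tree
        assign_codes(left, prefix + "0", table)
        assign_codes(right, prefix + "1", table)
    return table

# Illustrative 3-leaf tree: 'a' and 'b' sit one level below the root,
# 'c' hangs directly off the root's right branch.
print(assign_codes((("a", "b"), "c")))   # {'a': '00', 'b': '01', 'c': '1'}

Running assign_codes on the tree built for "duke blue devils" produces a table like the one on the next slide, up to how ties were broken.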

Huffman Coding
[Figure: the finished tree, with the resulting codes]
e: 00, d: 010, u: 011, l: 100, sp: 101, i: 1100, s: 1101, k: 1110, b: 11110, v: 11111
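The "no code is a prefix of another" property can be checked mechanically against this table. In the sketch below, the CODES dict simply transcribes the slide, with a space character standing in for "sp":

# Code table transcribed from the slide ("sp" is the space character).
CODES = {
    "e": "00", "d": "010", "u": "011", "l": "100", " ": "101",
    "i": "1100", "s": "1101", "k": "1110", "b": "11110", "v": "11111",
}

def is_prefix_free(codes):
    # No codeword may be a prefix of a different codeword.
    words = list(codes.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)

print(is_prefix_free(CODES))   # True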

Huffman Coding
These codes are then used to encode the string
Thus, "duke blue devils" turns into: 010 011 1110 00 101 11110 100 011 00 101 010 00 11111 1100 100 1101
When grouped into 8-bit bytes: 01001111 10001011 11101000 11001010 10001111 11100100 1101xxxx
So it takes 7 bytes of space, compared to 16 characters * 1 byte/char = 16 bytes uncompressed
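A sketch of the encoding and byte-grouping steps, reusing the same transcribed CODES table; the 52 encoded bits pad out to exactly 7 bytes:

# Same code table as in the previous sketch ("sp" is the space character).
CODES = {
    "e": "00", "d": "010", "u": "011", "l": "100", " ": "101",
    "i": "1100", "s": "1101", "k": "1110", "b": "11110", "v": "11111",
}

def encode(text, codes):
    # Concatenate the codeword for each character of the input.
    return "".join(codes[ch] for ch in text)

bits = encode("duke blue devils", CODES)
print(len(bits))                          # 52

# Group into 8-bit bytes, padding the final byte with zeros.
padded = bits + "0" * (-len(bits) % 8)
groups = [padded[i:i + 8] for i in range(0, len(padded), 8)]
print(groups)   # 7 bytes; the last 4 bits of the final byte are padding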

Huffman Coding
Uncompressing works by reading in the file bit by bit
– Start at the root of the tree
– If a 0 is read, head left
– If a 1 is read, head right
– When a leaf is reached, decode that character and start over again at the root of the tree
Thus, we need to save the Huffman table information as a header in the compressed file
– This doesn't add a significant amount of size to the file for large files (which are the ones you want to compress anyway)
– Or we could use a fixed, universal set of codes/frequencies
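A sketch of the bit-by-bit decoder just described: rebuild a decoding tree from the (transcribed) code table, then walk left on 0 and right on 1, emitting a character and restarting at the root whenever a leaf is reached:

# Same transcribed code table as before ("sp" is the space character).
CODES = {
    "e": "00", "d": "010", "u": "011", "l": "100", " ": "101",
    "i": "1100", "s": "1101", "k": "1110", "b": "11110", "v": "11111",
}

def build_decoding_tree(codes):
    # Each internal node is a dict with keys "0"/"1"; each leaf is a character.
    root = {}
    for ch, code in codes.items():
        node = root
        for bit in code[:-1]:
            node = node.setdefault(bit, {})
        node[code[-1]] = ch
    return root

def decode(bits, codes):
    root = build_decoding_tree(codes)
    node, out = root, []
    for bit in bits:
        node = node[bit]              # 0 -> go left, 1 -> go right
        if isinstance(node, str):     # reached a leaf: output it, restart
            out.append(node)
            node = root
    return "".join(out)

bits = "".join(CODES[ch] for ch in "duke blue devils")
print(decode(bits, CODES))            # duke blue devils

In a real file the trailing padding bits would also have to be ignored, for example by storing the exact bit length or character count in the header alongside the table.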