Huffman Compression.

Slides:



Advertisements
Similar presentations
Introduction to Computer Science 2 Lecture 7: Extended binary trees
Advertisements

Lecture 4 (week 2) Source Coding and Compression
Michael Alves, Patrick Dugan, Robert Daniels, Carlos Vicuna
Binary Trees CSC 220. Your Observations (so far data structures) Array –Unordered Add, delete, search –Ordered Linked List –??
Special Purpose Trees: Tries and Height Balanced Trees CS 400/600 – Data Structures.
IKI 10100: Data Structures & Algorithms Ruli Manurung (acknowledgments to Denny & Ade Azurat) 1 Fasilkom UI Ruli Manurung (Fasilkom UI)IKI10100: Lecture3.
Greedy Algorithms (Huffman Coding)
Data Compressor---Huffman Encoding and Decoding. Huffman Encoding Compression Typically, in files and messages, Each character requires 1 byte or 8 bits.
Huffman Coding: An Application of Binary Trees and Priority Queues
Trees Chapter 8. Chapter 8: Trees2 Chapter Objectives To learn how to use a tree to represent a hierarchical organization of information To learn how.
A Data Compression Algorithm: Huffman Compression
Chapter 08 Binary Trees and Binary Search Trees © John Urrutia 2013, All Rights Reserved.
Huffman code uses a different number of bits used to encode characters: it uses fewer bits to represent common characters and more bits to represent rare.
x x x 1 =613 Base 10 digits {0...9} Base 10 digits {0...9}
Chapter 18 - basic definitions - binary trees - tree traversals Intro. to Trees 1CSCI 3333 Data Structures.
CSE Lectures 22 – Huffman codes
Data Structures and Algorithms Huffman compression: An Application of Binary Trees and Priority Queues.
CS 46B: Introduction to Data Structures July 30 Class Meeting Department of Computer Science San Jose State University Summer 2015 Instructor: Ron Mak.
Data Structures Arrays both single and multiple dimensions Stacks Queues Trees Linked Lists.
Binary Trees A binary tree is made up of a finite set of nodes that is either empty or consists of a node called the root together with two binary trees.
Algorithm Design & Analysis – CS632 Group Project Group Members Bijay Nepal James Hansen-Quartey Winter
MA/CSSE 473 Day 31 Student questions Data Compression Minimal Spanning Tree Intro.
Huffman Codes. Encoding messages  Encode a message composed of a string of characters  Codes used by computer systems  ASCII uses 8 bits per character.
Data Compression1 File Compression Huffman Tries ABRACADABRA
Huffman Encoding Veronica Morales.
Lecture Objectives  To learn how to use a Huffman tree to encode characters using fewer bytes than ASCII or Unicode, resulting in smaller files and reduced.
Data Structures Week 6: Assignment #2 Problem
1 Huffman Codes Drozdek Chapter Objectives You will be able to Construct an optimal variable bit length code for an alphabet with known probability.
Spring 2010CS 2251 Trees Chapter 6. Spring 2010CS 2252 Chapter Objectives Learn to use a tree to represent a hierarchical organization of information.
data ordered along paths from root to leaf
Huffman Coding. Huffman codes can be used to compress information –Like WinZip – although WinZip doesn’t use the Huffman algorithm –JPEGs do use Huffman.
Trees (Ch. 9.2) Longin Jan Latecki Temple University based on slides by Simon Langley and Shang-Hua Teng.
Huffman coding Content 1 Encoding and decoding messages Fixed-length coding Variable-length coding 2 Huffman coding.
Priority Queues, Trees, and Huffman Encoding CS 244 This presentation requires Audio Enabled Brent M. Dingle, Ph.D. Game Design and Development Program.
Huffman Coding Yancy Vance Paredes. Outline Background Motivation Huffman Algorithm Sample Implementation Running Time Analysis Proof of Correctness Application.
Huffman Codes Juan A. Rodriguez CS 326 5/13/2003.
CPS 100, Spring Huffman Coding l D.A Huffman in early 1950’s l Before compressing data, analyze the input stream l Represent data using variable.
Trees (Ch. 9.2) Longin Jan Latecki Temple University based on slides by Simon Langley and Shang-Hua Teng.
Trees Ellen Walker CPSC 201 Data Structures Hiram College.
Week 14 - Wednesday.  What did we talk about last time?  Heapsort  Timsort  Counting sort  Radix sort.
1 Huffman Codes. 2 ASCII use same size encoding for all characters. Variable length codes can produce shorter messages than fixed length codes Huffman.
Compression and Huffman Coding. Compression Reducing the memory required to store some information. Lossless compression vs lossy compression Lossless.
Podcast Ch23e Title: Implementing Huffman Compression
Design & Analysis of Algorithm Huffman Coding
Huffman Codes ASCII is a fixed length 7 bit code that uses the same number of bits to define each character regardless of how frequently it occurs. Huffman.
HUFFMAN CODES.
COMP261 Lecture 22 Data Compression 2.
B/B+ Trees 4.7.
Review GitHub Project Guide Huffman Coding
Tries 07/28/16 11:04 Text Compression
Assignment 6: Huffman Code Generation
Madivalappagouda Patil
Tries 5/27/2018 3:08 AM Tries Tries.
Data Compression If you’ve ever sent a large file to a friend, you may have compressed it into a zip archive like the one on this slide before doing so.
Chapter 8 – Binary Search Tree
Chapter 9: Huffman Codes
Huffman Coding.
Huffman Encoding Visualization
Math 221 Huffman Codes.
Binary Trees.
Greedy Algorithms Many optimization problems can be solved more quickly using a greedy approach The basic principle is that local optimal decisions may.
Chapter 11 Data Compression
Huffman Coding CSE 373 Data Structures.
Huffman Encoding Huffman code is method for the compression for standard text documents. It makes use of a binary tree to develop codes of varying lengths.
Data Structure and Algorithms
Tries 2/23/2019 8:29 AM Tries 2/23/2019 8:29 AM Tries.
Topic 20: Huffman Coding The author should gaze at Noah, and ... learn, as they did in the Ark, to crowd a great deal of matter into a very small compass.
File Compression Even though disks have gotten bigger, we are still running short on disk space A common technique is to compress files so that they take.
Podcast Ch23d Title: Huffman Compression
Algorithms CSCI 235, Spring 2019 Lecture 31 Huffman Codes
Presentation transcript:

Huffman Compression

Compress + Expand

Compress Expand A Tabulate frequency counts Build trie Build codebook (write trie) Encode input Build codebook (read trie) Decode compressed message Compress: compute freq, buildTrie() A Trie Compress: writeTrie() Expand: readTrie() Compress: write length Expand: read length Compress: encode Expand: decode Compressed file format: Codebook Input length Compressed message

Compute frequency counts Eerie eyes seen near lake. Contain letters: ‘E’ ‘e’ ‘r’ ‘i’ ‘y’ ‘s’ ‘n’ ‘a’ ‘r’ ‘l’ ‘k’ ‘.’ space Generate frequency counts: E: 1, e: 8, r: 2, i: 1, y: 1, s: 2, n: 2, a: 2, l: 1, k: 1, .: 1, space: 4

Build trie (buildTrie()) Merge nodes based on frequency until there is only one left Special case: only one node Priority queue delMin: Removes and returns a smallest key on this priority queue. insert: adds a new key to the priority queue. Node{ char ch int freq Node left, right }

Build trie private static Node buildTrie(int[] freq) { MinPQ<Node> pq = new MinPQ<Node>(); for (char ch = 0; ch < R; ++ch if (freq[ch] > 0) pq.insert(new Node(c, freq[c], null, null)); while (pq.size() > 1) { Node x=pq.delMin(); Node y=pq.delMin(); Node parent = new Node(‘\0’, x.freq + y.freq, x, y); pq.insert(parent); } return pq.delMin();

E 1 i y l k . r 2 s n a sp 4 e 8 private static Node buildTrie(int[] freq) { MinPQ<Node> pq = new MinPQ<Node>(); for (char ch = 0; ch < R; ++ch if (freq[ch] > 0) pq.insert(new Node(c, freq[c], null, null)); while (pq.size() > 1) { Node x=pq.delMin(); Node y=pq.delMin(); Node parent = new Node(‘\0’, x.freq + y.freq, x, y); pq.insert(parent); } return pq.delMin();

y 1 l 1 k 1 . 1 r 2 s 2 n 2 a 2 sp 4 e 8 private static Node buildTrie(int[] freq) { MinPQ<Node> pq = new MinPQ<Node>(); for (char ch = 0; ch < R; ++ch if (freq[ch] > 0) pq.insert(new Node(c, freq[c], null, null)); while (pq.size() > 1) { Node x=pq.delMin(); Node y=pq.delMin(); Node parent = new Node(‘\0’, x.freq + y.freq, x, y); pq.insert(parent); } return pq.delMin(); 2 E 1 i 1

An Introduction to Huffman Coding March 21, 2000 y 1 l 1 k 1 . 1 r 2 s 2 n 2 a 2 2 sp 4 e 8 E 1 i 1 private static Node buildTrie(int[] freq) { MinPQ<Node> pq = new MinPQ<Node>(); for (char ch = 0; ch < R; ++ch if (freq[ch] > 0) pq.insert(new Node(c, freq[c], null, null)); while (pq.size() > 1) { Node x=pq.delMin(); Node y=pq.delMin(); Node parent = new Node(‘\0’, x.freq + y.freq, x, y); pq.insert(parent); } return pq.delMin(); Mike Scott

An Introduction to Huffman Coding March 21, 2000 k 1 . 1 r 2 s 2 n 2 a 2 2 sp 4 e 8 private static Node buildTrie(int[] freq) { MinPQ<Node> pq = new MinPQ<Node>(); for (char ch = 0; ch < R; ++ch if (freq[ch] > 0) pq.insert(new Node(c, freq[c], null, null)); while (pq.size() > 1) { Node x=pq.delMin(); Node y=pq.delMin(); Node parent = new Node(‘\0’, x.freq + y.freq, x, y); pq.insert(parent); } return pq.delMin(); E 1 i 1 2 y 1 l 1 Mike Scott

An Introduction to Huffman Coding March 21, 2000 2 k 1 . 1 r 2 s 2 n 2 a 2 2 sp 4 e 8 y 1 l 1 E 1 i 1 private static Node buildTrie(int[] freq) { MinPQ<Node> pq = new MinPQ<Node>(); for (char ch = 0; ch < R; ++ch if (freq[ch] > 0) pq.insert(new Node(c, freq[c], null, null)); while (pq.size() > 1) { Node x=pq.delMin(); Node y=pq.delMin(); Node parent = new Node(‘\0’, x.freq + y.freq, x, y); pq.insert(parent); } return pq.delMin(); Mike Scott

An Introduction to Huffman Coding March 21, 2000 r 2 s 2 n 2 a 2 2 2 sp 4 e 8 y 1 l 1 E 1 i 1 private static Node buildTrie(int[] freq) { MinPQ<Node> pq = new MinPQ<Node>(); for (char ch = 0; ch < R; ++ch if (freq[ch] > 0) pq.insert(new Node(c, freq[c], null, null)); while (pq.size() > 1) { Node x=pq.delMin(); Node y=pq.delMin(); Node parent = new Node(‘\0’, x.freq + y.freq, x, y); pq.insert(parent); } return pq.delMin(); 2 k 1 . 1 Mike Scott

An Introduction to Huffman Coding March 21, 2000 r 2 s 2 n 2 a 2 2 sp 4 e 8 2 2 k 1 . 1 E 1 i 1 y 1 l 1 private static Node buildTrie(int[] freq) { MinPQ<Node> pq = new MinPQ<Node>(); for (char ch = 0; ch < R; ++ch if (freq[ch] > 0) pq.insert(new Node(c, freq[c], null, null)); while (pq.size() > 1) { Node x=pq.delMin(); Node y=pq.delMin(); Node parent = new Node(‘\0’, x.freq + y.freq, x, y); pq.insert(parent); } return pq.delMin(); Mike Scott

An Introduction to Huffman Coding March 21, 2000 n 2 a 2 2 sp 4 e 8 2 2 E 1 i 1 y 1 l 1 k 1 . 1 4 r 2 s 2 Mike Scott

An Introduction to Huffman Coding March 21, 2000 4 4 4 2 sp 4 e 8 2 2 r 2 s 2 n 2 a 2 k 1 . 1 E 1 i 1 y 1 l 1 Mike Scott

An Introduction to Huffman Coding March 21, 2000 4 4 4 e 8 2 2 r 2 s 2 n 2 a 2 E 1 i 1 y 1 l 1 6 2 sp 4 k 1 . 1 Mike Scott

An Introduction to Huffman Coding March 21, 2000 4 4 6 4 e 8 2 sp 4 2 2 r 2 s 2 n 2 a 2 k 1 . 1 E 1 i 1 y 1 l 1 Mike Scott

An Introduction to Huffman Coding March 21, 2000 4 6 e 8 2 2 2 sp 4 k 1 . 1 E 1 i 1 y 1 l 1 8 4 4 r 2 s 2 n 2 a 2 Mike Scott

An Introduction to Huffman Coding March 21, 2000 4 6 e 8 8 2 2 2 sp 4 4 4 k 1 . 1 E 1 i 1 y 1 l 1 r 2 s 2 n 2 a 2 Mike Scott

An Introduction to Huffman Coding March 21, 2000 8 e 8 4 4 10 r 2 s 2 n 2 a 2 4 6 2 2 2 sp 4 E 1 i 1 y 1 l 1 k 1 . 1 Mike Scott

An Introduction to Huffman Coding March 21, 2000 8 e 8 10 4 4 4 6 2 2 r 2 s 2 n 2 a 2 2 sp 4 E 1 i 1 y 1 l 1 k 1 . 1 Mike Scott

An Introduction to Huffman Coding March 21, 2000 10 16 4 6 2 2 e 8 8 2 sp 4 E 1 i 1 y 1 l 1 k 1 . 1 4 4 r 2 s 2 n 2 a 2 Mike Scott

An Introduction to Huffman Coding March 21, 2000 10 16 4 6 e 8 8 2 2 2 sp 4 4 4 E 1 i 1 y 1 l 1 k 1 . 1 r 2 s 2 n 2 a 2 Mike Scott

An Introduction to Huffman Coding March 21, 2000 26 16 10 4 e 8 8 6 2 2 2 sp 4 4 4 E 1 i 1 y 1 l 1 k 1 . 1 r 2 s 2 n 2 a 2 Mike Scott

An Introduction to Huffman Coding March 21, 2000 26 16 10 4 e 8 8 6 2 2 2 sp 4 4 4 E 1 i 1 y 1 l 1 k 1 . 1 r 2 s 2 n 2 a 2 Mike Scott

Generate codebook (buildCode()) Perform a traversal of the tree to obtain new code words Going left is a 0 going right is a 1 code word is only completed when a leaf node is reached private static void buildCode(String[] st, Node x, String s) { if (!x.isLeaf()) { buildCode(st, x.left, s + '0'); buildCode(st, x.right, s + '1'); } else { st[x.ch] = s; st is the code book, in which st[ch] stores the code for character ch, for example: st[‘E’]=“0000“ st[‘s’]=“1101” String s records the path to the current node x, for example, “left, left, left, right” is encoded as “0001”

An Introduction to Huffman Coding March 21, 2000 Generate codebook Char Code E 0000 i 0001 y 0010 l 0011 k 0100 . 0101 space 011 e 10 r 1100 s 1101 n 1110 a 1111 E 1 i sp 4 e 8 2 y l k . r s n a 6 10 16 26 Mike Scott

Encode message Eerie eyes seen near lake. Char Code E 0000 i 0001 y 0010 l 0011 k 0100 . 0101 space 011 e 10 r 1100 s 1101 n 1110 a 1111 Eerie eyes seen near lake. 000010110000011001110001010 110110100111110101111110001 1001111110100100101 Codebook Input length Compressed message

Expand Compress A Tabulate frequency counts Build trie Build codebook (write trie) Encode input Build codebook (read trie) Decode compressed message Compress: compute freq, buildTrie() A Trie Compress: writeTrie() Expand: readTrie() Compress: write length Expand: read length Compress: encode Expand: decode Compressed file format: Codebook Input length Compressed message

Build codebook (readTrie()) readTrie / writeTrie How to deserialize (read) trie ? Boolean flag + data If flag == 1 (leaf), then the next 8bits represent a character If flag == 0 (internal node), then recursively desterilize left and right children Codebook Input length Compressed message

Build codebook readTrie / writeTrie sp e - y l k . r s n a readTrie / writeTrie 00001E(8bits)1i(8bits)01y(8bits) 1l(8bits)... // write bitstring-encoded trie to standard output private static void writeTrie(Node x) { if (x.isLeaf()) { BinaryStdOut.write(true); BinaryStdOut.write(x.ch, 8); return; } BinaryStdOut.write(false); writeTrie(x.left); writeTrie(x.right);

Build codebook readTrie / writeTrie sp e - y l k . r s n a readTrie / writeTrie 00001E(8bits)1i(8bits)01y(8bits) 1l(8bits)... private static Node readTrie() { boolean isLeaf = BinaryStdIn.readBoolean(); if (isLeaf) { return new Node(BinaryStdIn.readChar(), -1, null, null); } else { return new Node('\0', -1, readTrie(), readTrie());

Decode message E i sp e - y l k . r s n a Scan the bit string while traversal the tree: 0 go left, 1 go right end at leaf node 000010110000011001110001010 110110100111110101111110001 1001111110100100101 Eerie eyes seen near lake.