Dynamic Huffman Coding: Computer Networks Assignment

Presentation transcript:

Dynamic Huffman Coding Computer Networks Assignment

2 T
Stage 1 (First occurrence of t)

      r
     / \
    0   t(1)

Order: 0, t(1)

* r represents the root
* 0 represents the null node
* t(1) denotes the occurrence of t with a frequency of 1

3 TE
Stage 2 (First occurrence of e)

      r
     / \
    1   t(1)
   / \
  0   e(1)

Order: 0, e(1), 1, t(1)

4 TEN
Stage 3 (First occurrence of n)

        r
       / \
      2   t(1)
     / \
    1   e(1)
   / \
  0   n(1)

Order: 0, n(1), 1, e(1), 2, t(1) : misfit
(The weights along the order list must be non-decreasing, but t(1) now follows the weight-2 node.)

5 Reorder: TEN

        r
       / \
    t(1)  2
         / \
        1   e(1)
       / \
      0   n(1)

Order: 0, n(1), 1, e(1), t(1), 2
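The reordering rule can be sketched on the flat "Order" list alone. Below is a minimal Python illustration (the function name slide_and_increment and the list representation are mine, not from the slides): before a node's count is incremented, it is swapped with the highest-ordered node that still has the same weight, so the list stays sorted by weight. The full algorithm also re-increments the internal nodes along the path to the root; that bookkeeping is omitted here.

    def slide_and_increment(order, i):
        # order: [label, weight] pairs from lowest to highest order,
        # mirroring the "Order:" lines on these slides.
        # Swap node i with the last node of equal weight, then increment it,
        # so the weights along the list remain non-decreasing.
        j = i
        while j + 1 < len(order) and order[j + 1][1] == order[i][1]:
            j += 1
        order[i], order[j] = order[j], order[i]
        order[j][1] += 1
        return j

    # Stage 4: n occurs again. Starting from the order on this slide,
    # n(1) is swapped with t(1), the highest-ordered weight-1 node,
    # before its count rises to 2; the result matches the reordered
    # list two slides below.
    order = [['0', 0], ['n', 1], ['1', 1], ['e', 1], ['t', 1], ['2', 2]]
    slide_and_increment(order, 1)
    print(order)  # [['0', 0], ['t', 1], ['1', 1], ['e', 1], ['n', 2], ['2', 2]]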

6 TENN
Stage 4 (Repetition of n)

        r
       / \
    t(1)  3
         / \
        2   e(1)
       / \
      0   n(2)

Order: 0, n(2), 2, e(1), t(1), 3 : misfit

7 Reorder: TENN

        r
       / \
    n(2)  2
         / \
        1   e(1)
       / \
      0   t(1)

Order: 0, t(1), 1, e(1), n(2), 2
t(1) and n(2) are swapped

8 TENNE
Stage 5 (Repetition of e)

        r
       / \
    n(2)  3
         / \
        1   e(2)
       / \
      0   t(1)

Order: 0, t(1), 1, e(2), n(2), 3

9 TENNES
Stage 6 (First occurrence of s)

        r
       / \
    n(2)  4
         / \
        2   e(2)
       / \
      1   t(1)
     / \
    0   s(1)

Order: 0, s(1), 1, t(1), 2, e(2), n(2), 4

10 TENNESS
Stage 7 (Repetition of s)

        r
       / \
    n(2)  5
         / \
        3   e(2)
       / \
      2   t(1)
     / \
    0   s(2)

Order: 0, s(2), 2, t(1), 3, e(2), n(2), 5 : misfit

11 Reorder: TENNESS

        r
       / \
    n(2)  5
         / \
        3   e(2)
       / \
      1   s(2)
     / \
    0   t(1)

Order: 0, t(1), 1, s(2), 3, e(2), n(2), 5
s(2) and t(1) are swapped

12 TENNESSE
Stage 8 (Second repetition of e)

        r
       / \
    n(2)  6
         / \
        3   e(3)
       / \
      1   s(2)
     / \
    0   t(1)

Order: 0, t(1), 1, s(2), 3, e(3), n(2), 6 : misfit

13 Reorder: TENNESSE

        r
       / \
    e(3)  5
         / \
        3   n(2)
       / \
      1   s(2)
     / \
    0   t(1)

Order: 0, t(1), 1, s(2), 3, n(2), e(3), 5
n(2) and e(3) are swapped

14 TENNESSEE
Stage 9 (Third repetition of e)

          r
       0 / \ 1
     e(4)   5
         0 / \ 1
         3    n(2)
      0 / \ 1
      1    s(2)
   0 / \ 1
   0    t(1)

Order: 0, t(1), 1, s(2), 3, n(2), e(4), 5

15 ENCODING
The letters can be encoded as follows:

e : 0
n : 11
s : 101
t : 1001
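As a check, these codewords can be read off the final tree by collecting a 0 for every left branch and a 1 for every right branch on the path from the root. A small Python sketch; the nested-pair representation of the stage 9 tree is my own, not from the slides:

    def codes(node, prefix=''):
        # Leaves are one-character strings; internal nodes are (left, right)
        # pairs; None marks the null node, which has no codeword.
        if node is None:
            return {}
        if isinstance(node, str):
            return {node: prefix}
        left, right = node
        table = codes(left, prefix + '0')
        table.update(codes(right, prefix + '1'))
        return table

    # Final tree from stage 9: root -> (e(4), (((null, t), s), n))
    tree = ('e', (((None, 't'), 's'), 'n'))
    print(codes(tree))  # {'e': '0', 't': '1001', 's': '101', 'n': '11'}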

16 Average Code Length

Average code length = Σ (length × frequency) / Σ frequency
                    = { 1(4) + 2(2) + 3(2) + 4(1) } / (4 + 2 + 2 + 1)
                    = 18 / 9 = 2 bits per symbol
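A quick numerical check of this arithmetic (frequencies from TENNESSEE, code lengths from the table above), sketched in Python:

    freq    = {'e': 4, 'n': 2, 's': 2, 't': 1}   # letter counts in TENNESSEE
    lengths = {'e': 1, 'n': 2, 's': 3, 't': 4}   # codeword lengths from slide 15
    avg = sum(lengths[c] * freq[c] for c in freq) / sum(freq.values())
    print(avg)  # 2.0 bits per symbol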

17 ENTROPY

Entropy = -Σ p_i log2(p_i)
        = -( (4/9) log2(4/9) + (2/9) log2(2/9) + (2/9) log2(2/9) + (1/9) log2(1/9) )
        ≈ 1.84 bits per symbol

where p(e) = 4/9 ≈ 0.44, p(n) = p(s) = 2/9 ≈ 0.22 and p(t) = 1/9 ≈ 0.11.
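The same value can be reproduced numerically from the exact probabilities; a short Python sketch:

    import math

    freq = {'e': 4, 'n': 2, 's': 2, 't': 1}
    total = sum(freq.values())                   # 9 symbols in TENNESSEE
    entropy = -sum(f / total * math.log2(f / total) for f in freq.values())
    print(round(entropy, 2))  # 1.84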

18 Ordinary Huffman Coding
TENNESSEE

           9
        0 / \ 1
        5    e(4)
     0 / \ 1
   s(2)   3
       0 / \ 1
     t(1)   n(2)

ENCODING
e : 1
s : 00
t : 010
n : 011

Average code length = (1*4 + 2*2 + 3*2 + 3*1) / 9 = 17/9 ≈ 1.89
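For comparison, the static tree can be built with a priority queue, repeatedly merging the two lightest subtrees. A minimal Python sketch; tie-breaking may produce different bit patterns than the slide above, but the code lengths, and therefore the average, agree:

    import heapq
    import itertools

    def huffman_codes(freq):
        # Each heap entry is (weight, tiebreak, {symbol: code-so-far});
        # the counter breaks weight ties so dicts are never compared.
        counter = itertools.count()
        heap = [(w, next(counter), {sym: ''}) for sym, w in freq.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            w1, _, t1 = heapq.heappop(heap)   # lightest subtree
            w2, _, t2 = heapq.heappop(heap)   # second lightest
            merged = {s: '0' + c for s, c in t1.items()}
            merged.update({s: '1' + c for s, c in t2.items()})
            heapq.heappush(heap, (w1 + w2, next(counter), merged))
        return heap[0][2]

    print(huffman_codes({'e': 4, 'n': 2, 's': 2, 't': 1}))
    # {'e': '0', 's': '10', 't': '110', 'n': '111'} -> same lengths as slide 18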

19 SUMMARY
In this exercise the average code length of ordinary Huffman coding (1.89) appears better than that of the dynamic version (2). In practice, however, dynamic coding performs better. The problem with static coding is that the tree has to be constructed at the transmitter and then sent to the receiver. The tree may also vary, because the frequency distribution of English letters differs between plain text, a technical paper, a piece of code, and so on. In dynamic coding the receiver constructs the same tree on its own, so the tree need not be sent at all. Taking this overhead into account, dynamic coding is better. Moreover, its average code length improves as the transmitted text grows longer.