Source Coding and Compression, Lecture 5 (slide 1). Dr.-Ing. Khaled Shawky Hassan, Room: C3-222, ext: 1204.

Static vs. Adaptive Coding: Static (Two-Pass Model) (slide 2)
Encoder:
1. Initialize the data model based on a first pass over the data (i.e., perform the probability analysis).
2. Transmit the data model to the decoder.
3. While there is more data to send: encode the next symbol using the existing data model and send it.
Decoder:
1. Receive the data model.
2. While there is more data to receive: decode the next symbol using the data model and output it.
Summary of the two-pass procedure:
1. Collect statistics and generate codewords (1st pass).
2. Perform the actual encoding/compression (2nd pass).
3. Not practical in many situations (e.g., compressing network transmissions).
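The two-pass procedure can be sketched in a few lines of Python. This is only a minimal illustration of the idea (first pass builds the model, second pass encodes); the function names and the tie-breaking used here are our own choices, not the lecture's reference implementation.

```python
import heapq
from collections import Counter

def build_huffman_code(data):
    """First pass: count symbol frequencies and build a Huffman code table."""
    freq = Counter(data)
    # Heap entries: (weight, tie_breaker, {symbol: partial_code})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate single-symbol source
        return {s: "0" for s in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)      # two least probable subtrees
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def encode(data, code):
    """Second pass: encode the data with the table built in the first pass."""
    return "".join(code[s] for s in data)

message = "aardvark"
table = build_huffman_code(message)   # this model must also be sent to the decoder
print(table, encode(message, table))
```

Note that the code table (the model) has to be transmitted alongside the bitstream; this is exactly the overhead the adaptive scheme on the next slide avoids.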

Static vs. Adaptive Coding: Adaptive (One-Pass Model) (slide 3)
Encoder:
1. Initialize the data model with fixed probabilities and fixed code lengths.
2. Send data immediately; while there is more data to send:
   a. Encode the next symbol using the current data model (if one exists) and send it.
   b. Update the data model based on the last symbol.
Decoder:
1. Initialize the data model as agreed with the encoder.
2. While there is more data to receive:
   a. Decode the next symbol using the current data model and output it.
   b. Update the data model based on the decoded symbol.
What do we notice? There is no encoder map (model) to send!
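The essential point is the symmetry: encoder and decoder start from the same agreed-upon model and apply the identical update after every symbol, so the model never has to be transmitted. A minimal skeleton, in which `code_of`, `symbol_of`, and `update_model` are hypothetical placeholders for the adaptive Huffman machinery developed below:

```python
def adaptive_encode(symbols, model, code_of, update_model):
    """One pass: emit the code for each symbol, then update the shared model."""
    out = []
    for s in symbols:
        out.append(code_of(model, s))   # encode with the model as it is *now*
        update_model(model, s)          # the decoder will perform the same update
    return "".join(out)

def adaptive_decode(bits, model, symbol_of, update_model):
    """Mirror image of the encoder: decode one symbol, then apply the same update."""
    out, pos = [], 0
    while pos < len(bits):
        s, pos = symbol_of(model, bits, pos)  # read one codeword from the stream
        out.append(s)
        update_model(model, s)                # keeps both models in lockstep
    return out
```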

Huffman Coding (e.g., Lossless JPEG) (slide 4)
Properties:
I. Huffman codes are built from the bottom up, starting with the leaves of the tree and working progressively closer to the root.
II. Huffman coding is always at least as efficient as Shannon-Fano coding, which is why it has become the predominant entropy coding method.
III. It has been shown that Huffman coding cannot be improved upon by any other code that assigns an integral number of bits to each symbol.
Sibling Property, defined by Gallager [Gallager 1978]: "A binary code tree has the sibling property if each node (except the root) has a sibling and if the nodes can be listed in order of nonincreasing (decreasing) weight with each node adjacent to its sibling."
Thus:
1. If A is the parent of B, and C is a child of B, then W(A) ≥ W(B) ≥ W(C).
2. If A is the parent of B (left child) and C (right child), then W(B) ≤ W(C) in the node ordering used on the following slides.

Huffman Coding Properties (slide 5)
A binary tree is a Huffman tree if and only if it obeys the sibling property, i.e.,
W(#1) ≤ W(#2) ≤ W(#3) ≤ … ≤ W(#7) ≤ W(#8) ≤ W(#9)
(node weights in non-decreasing order, from node #1 up to the root #9)
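As a small sanity check of this "if and only if" statement, one can list the node weights in node-number order (siblings adjacent, root last) and test whether the sequence is non-decreasing. The weights below are purely illustrative (leaves 1, 1, 2, 4 merged into internal nodes 2, 4 and root 8), not taken from the lecture's figures:

```python
def satisfies_sibling_property(weights):
    """weights[i] is W(#i+1); nodes are numbered so siblings are adjacent and the root is last."""
    return all(weights[i] <= weights[i + 1] for i in range(len(weights) - 1))

print(satisfies_sibling_property([1, 1, 2, 2, 4, 4, 8]))  # True: non-decreasing up to the root
print(satisfies_sibling_property([1, 1, 2, 4, 2, 4, 8]))  # False: W(#4) > W(#5), the situation on the later slides
```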

Huffman Tree Sibling Property (slide 6)
✔ The adaptive Huffman tree is obtained by adjusting the Huffman tree on the fly, based only on the data seen so far, with no knowledge of future statistics.
✔ Maintaining the sibling property during each update ensures that we keep a Huffman tree with correct weights:
W(#1) ≤ W(#2) ≤ W(#3) ≤ … ≤ W(#7) ≤ W(#8) ≤ W(#9)

Huffman Tree Sibling Property (slide 7)
✔ Checking the sibling property during the update shows that this is no longer a valid Huffman tree:
W(#1) ≤ W(#2) ≤ W(#3) ≤ … ≤ W(#7) ≤ W(#8) ≤ W(#9) no longer holds (the order is violated).

Huffman Tree Sibling Property (slide 8)
Maintaining the sibling property during the update ensures that we have a Huffman tree with correct weights.

Huffman Tree Sibling Property (slide 9)
✔ The adaptive Huffman tree is obtained by adjusting the Huffman tree on the fly, based only on the data seen so far, with no knowledge of future statistics.
✔ Checking the sibling property during the update:
W(#1) ≤ … ≤ W(#3) ≤ W(#4) > W(#5) ≤ W(#6) ≤ W(#7) > W(#8) ≤ W(#9)
Now W(#4) > W(#5) and W(#7) > W(#8), i.e., the sibling property is violated.

Adaptive Huffman Coding Algorithm (slide 10)
Given: alphabet S = {s1, …, sn} (no probabilities!).
o Pick a fixed default binary code for all symbols (a fixed-length block code).
o Start with an empty "Huffman" tree (truly empty: it contains only the NYT node).
o Read a symbol s from the source:
   If s is NYT (Not Yet Transmitted): send the NYT code followed by default(s) (except for the first symbol), then update the tree (and keep it a Huffman tree).
   Else: send the codeword for s, then update the tree.
o Repeat until done with all symbols in the source.
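The control flow of this encoder can be sketched as follows. The tree update itself (the step that restores the sibling property after each symbol) is hidden behind a hypothetical `update_tree`; `fixed_code`, `nyt_code`, and `code_in_tree` are likewise placeholder helpers, so this sketch only illustrates how the NYT escape is used, not the full update algorithm:

```python
def adaptive_huffman_encode(symbols, tree, fixed_code, nyt_code, code_in_tree, update_tree):
    """Encoder control flow: NYT escape + default code for unseen symbols, tree code otherwise."""
    seen = set()
    out = []
    for i, s in enumerate(symbols):
        if s not in seen:
            if i > 0:                       # the very first symbol needs no NYT escape
                out.append(nyt_code(tree))  # current codeword of the NYT leaf
            out.append(fixed_code(s))       # agreed-upon fixed-length default code
            seen.add(s)
        else:
            out.append(code_in_tree(tree, s))
        update_tree(tree, s)                # restore the sibling property (keep it a Huffman tree)
    return "".join(out)
```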

Example (Adaptive Huffman) (slide 11)
Let's see an example (we first start with a fixed code!).
Assume we are encoding the message [a a r d v a r k].
The total number of nodes in this tree will be (at most) 2n − 1 + 2 = 2·26 − 1 + 2 = 53, where n is the number of usable alphabet symbols and the +2 accounts for the NYT leaf and its parent node.
In this example we will consider only 51 nodes and leaves (instead of 53!!); the correct number, however, is 53.
The first letter to be transmitted is "a". As "a" does not yet exist in the tree, we send a (fixed) binary code for "a" and then add "a" to the tree: the NYT node gives birth to a new NYT node and a terminal node corresponding to "a".
The weight of the terminal node will be higher than that of the NYT node, so we assign the number 49 to the NYT node and 50 to the terminal node "a".
The next symbol is again "a", and the transmitted code is now 1 (since "a" is in the tree with code 1 only now!).
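A one-line check of the node-count arithmetic above (n = 26 alphabet symbols, plus the NYT leaf and its parent):

```python
n = 26
print(2 * n - 1 + 2)   # 53 nodes at most: 2n - 1 for the symbol tree, +2 for the NYT leaf and its parent
```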

Example: Adaptive Huffman Coding (slide 12). Input: aardvark. Code table: NYT = 0; a, r, d, v, k: not yet assigned. Output: (To keep the rest of the slides as they are, we start as the book does with 51; the correct number, however, is 53.)

Example: Adaptive Huffman Coding (slide 13). Input: aardvark. Code table: NYT = 0; a, r, d, v, k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 14). Input: aardvark. Code table: NYT = 0, a = 1; r, d, v, k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 15). Input: aardvark. Code table: NYT = 0, a = 1; r, d, v, k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 16). Input: aardvark. Code table: NYT = 0, a = 1, r = 01; d, v, k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 17). Input: aardvark. Code table: NYT = 000, a = 1, r = 01, d = 001; v, k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 18). Input: aardvark. Code table: NYT = 0000, a = 1, r = 01, d = 001, v = 0001(??); k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 19). Input: aardvark. Code table: NYT = 0000, a = 1, r = 01, d = 001, v = 0001(??); k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 20). Input: aardvark. Code table: NYT = 0000, a = 1, r = 01, d = 001, v = 0101(??); k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 21). Input: aardvark. Code table: NYT = 000, a = 1, r = 01, d = 001, v = (??); k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 22). Input: aardvark. Code table: NYT = 1100, a = 0, r = 10, d = 111, v = 1101; k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 23). Input: aardvark. Code table: NYT = 1100, a = 0, r = 10, d = 111, v = 1101; k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 24). Input: aardvark. Code table: NYT = 1100, a = 0, r = 10, d = 111, v = 1101; k: not yet assigned. Output:

Example: Adaptive Huffman Coding (slide 25). Input: aardvark. Code table: NYT = 11000(?), a = 0, r = 10, d = 111, v = 1101, k = 11001(??). Output:

Example: Adaptive Huffman Coding (slide 26). Input: aardvark. Code table: NYT = 11000(?), a = 0, r = 10, d = 111, v = 1101, k = 11001(??). Output:

Example: Adaptive Huffman Coding (slide 27). Input: aardvark. Code table: NYT = 11100, a = 0, r = 10, d = 110, v = 1111, k = 11101. Output:

Adaptive Huffman Decoding (slide 28). Input: a. Code table: NYT, a, r, d, v, k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 29). Input: aa. Code table: NYT = 0, a = 1; r, d, v, k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 30). Input: aar. Code table: NYT = 0, a = 1, r = 01; d, v, k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 31). Input: aard. Code table: NYT = 000, a = 1, r = 01, d = 001; v, k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 32). Input: aardv. Code table: NYT = 0000(?), a = 1, r = 01, d = 001, v = 0001(??); k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 33). Input: aardv. Code table: NYT = 0000, a = 1, r = 01, d = 001, v = 0001(??); k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 34). Input: aardv. Code table: NYT = 0000, a = 1, r = 01, d = 001, v = 0101(??); k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 35). Input: aardv. Code table: NYT = 000, a = 1, r = 01, d = 001, v = (??); k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 36). Input: aardv. Code table: NYT = 1100, a = 0, r = 10, d = 111, v = 1101; k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 37). Input: aardva. Code table: NYT = 1100, a = 0, r = 10, d = 111, v = 1101; k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 38). Input: aardvar. Code table: NYT = 1100, a = 0, r = 10, d = 111, v = 1101; k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 39). Input: aardvar. Code table: NYT = 1100, a = 0, r = 10, d = 111, v = 1101; k: not yet assigned. Output:

Adaptive Huffman Decoding (slide 40). Input: aardvark. Code table: NYT = 11000, a = 0, r = 10, d = 111, v = 1101, k = (??). Output:

Adaptive Huffman Decoding (slide 41). Input: aardvark. Code table: NYT = 11100, a = 0, r = 10, d = 110, v = 1111, k = 11101. Output: (?)

Adaptive Huffman Exercise (slide 42): Try to solve the following!
Find the adaptive Huffman encoding (compression) of the following text: raaaabcbaacvkl, assuming a 26-letter alphabet.

Adaptive Huffman Notes (slide 43), following the textbook example:
If the source has an alphabet {a1, a2, …, am} of size m, then pick e and r such that m = 2^e + r and 0 ≤ r < 2^e. The letter ak is encoded as the (e+1)-bit binary representation of k − 1 if 1 ≤ k ≤ 2r; otherwise, ak is encoded as the (only) e-bit binary representation of k − r − 1.
Example: suppose m = 26; then e = 4 and r = 10. The symbol a1 is encoded as 00000 ("a" in English), the symbol a2 is encoded as 00001 ("b" in English), and the symbol a22 is encoded as 1011 ("v" in English).
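This fixed initial code is easy to compute. A small sketch (the function name is ours); for m = 26 it reproduces the slide's examples, a1 → 00000, a2 → 00001, a22 ("v") → 1011:

```python
def initial_fixed_code(k, m):
    """Fixed default code for symbol a_k (1-based) from an alphabet of size m = 2**e + r, 0 <= r < 2**e."""
    e = m.bit_length() - 1                            # largest e with 2**e <= m
    r = m - (1 << e)
    if 1 <= k <= 2 * r:
        return format(k - 1, "0{}b".format(e + 1))    # (e+1)-bit representation of k-1
    return format(k - r - 1, "0{}b".format(e))        # e-bit representation of k-r-1

print(initial_fixed_code(1, 26))    # 00000  ('a')
print(initial_fixed_code(2, 26))    # 00001  ('b')
print(initial_fixed_code(22, 26))   # 1011   ('v')
```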

Adaptive Huffman Applications: Lossless Image Compression (slide 44)
Steps for lossless image compression:
1. Generate a Huffman code for each uncompressed image (which may already have been quantized and compressed with lossy methods).
2. Encode the image using that Huffman code.
3. Save it in a file again.
The original (uncompressed) image representation uses 8 bits/pixel. The image consists of 256 rows of 256 pixels, so the uncompressed representation uses 65,536 bytes.
Compression ratio = number of bytes (uncompressed) : number of bytes (compressed).
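As a quick numeric check of the definition above (the compressed size used here is made up purely for illustration):

```python
rows, cols, bits_per_pixel = 256, 256, 8
uncompressed_bytes = rows * cols * bits_per_pixel // 8      # 256 * 256 * 1 byte = 65,536 bytes
compressed_bytes = 40_000                                   # hypothetical compressed size
compression_ratio = uncompressed_bytes / compressed_bytes   # bytes uncompressed : bytes compressed
print(uncompressed_bytes, round(compression_ratio, 2))      # 65536 1.64
```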

Adaptive Huffman Applications Lossless Image Compression 45

Adaptive Huffman Applications: Lossless Image Compression (slide 46)
Huffman (lossless JPEG) compression based on pixel values (columns: Image Name, Bits/Pixel, Total Size (B), Compression Ratio):
Sena: 7.01, 57,…, …
Sensin: 7.49, 61,…, …
Earth: 4.94, 40,…, …
Omaha: 7.12, 58,…, …

Adaptive Huffman Applications: Lossless Image Compression (slide 47)
Huffman compression based on pixel difference values and the two-pass model (columns: Image Name, Bits/Pixel, Total Size (B), Compression Ratio):
Sena: 4.02, 32,…, …
Sensin: 4.70, 38,…, …
Earth: 4.13, 33,…, …
Omaha: 6.42, 52,…, …

Adaptive Huffman Applications: Lossless Image Compression (slide 48)
Huffman compression based on pixel difference values and the one-pass adaptive model (columns: Image Name, Bits/Pixel, Total Size (B), Compression Ratio):
Sena: 3.93, 32,…, …
Sensin: 4.63, 37,…, …
Earth: 4.82, 39,…, …
Omaha: 6.39, 52,…, …


Optimality of Huffman Codes (slide 50)
The necessary conditions for an optimal variable-length binary code:
Condition 1: Given any two letters aj and ak, if P(aj) ≥ P(ak), then lj ≤ lk, where lj is the number of bits in the codeword for aj.
Condition 2: The two least probable letters have codewords with the same maximum length lm.
Condition 3: In the tree corresponding to the optimum code, there must be two branches stemming from each intermediate node.
Condition 4: Suppose we change an intermediate node into a leaf node by combining all the leaves descending from it into a composite word of a reduced alphabet. Then, if the original tree was optimal for the original alphabet, the reduced tree is optimal for the reduced alphabet.
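Condition 3 implies the code is "complete": with two branches at every intermediate node, the Kraft sum of the codeword lengths equals exactly 1. A quick check using the final code table from the aardvark example (slide 27/41):

```python
from fractions import Fraction

codes = {"a": "0", "r": "10", "d": "110", "v": "1111", "k": "11101", "NYT": "11100"}
kraft_sum = sum(Fraction(1, 2 ** len(c)) for c in codes.values())
print(kraft_sum)   # prints 1: every intermediate node has two branches, so the code is complete
```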

Minimum Variance Huffman Codes (slide 51): By performing the sorting procedure in a slightly different manner, we could have found a different Huffman code with the same average length but a smaller variance of the codeword lengths.

Huffman Coding: Self Study! (slide 52)
o Minimum Variance Huffman Codes (pp. 46–47; redo the examples)
o Length of Huffman Codes (pp. 49–51 and Example 3.2.2)
o Huffman Codes: optimality conditions!

Huffman Tree Sibling Property (slide 53)
W(#1) ≤ … ≤ W(#3) ≤ W(#4) > W(#5) ≤ W(#6) ≤ W(#7) > W(#8) ≤ W(#9)
➔ Now, swap nodes #4 and #5.

Huffman Tree Sibling Property (slide 54)
W(#1) ≤ … ≤ W(#3) ≤ W(#4) ≤ W(#5) ≤ W(#6) > …, W(#8) ≤ W(#9)
➔ But W(#6) > W(#8)! Then swap nodes #6 and #8.
Tree nodes (from the figure): #1 D(2), #2 B(2), #3 (4), #4 C(2), #5 (15), #6 A(13), #7 (19), #8 E(10), #9 (29). Looks better?!

Huffman Tree Sibling Property (slide 55)
W(#1) ≤ W(#2) ≤ W(#3) ≤ … ≤ W(#7) ≤ W(#8) ≤ W(#9)
Tree nodes (from the figure): #1 D(2), #2 B(2), #3 (4), #4 C(2), #5 (15), #6 E(10), #7 (19), #8 A(13), #9 (29). NOW it is ok!