Lossless/Near-lossless Compression of Still and Moving Images, Part 2: Entropy Coding. Xiaolin Wu, Polytechnic University, Brooklyn, NY.


Variable length codes (VLC)
- Map more frequently occurring symbols to shorter codewords. Example: "abracadabra" over the alphabet {a, b, c, d, r}. A fixed-length code spends the same number of bits on every symbol, while a variable-length code assigns a - 0, b - 10, and longer codewords to the rarer symbols c, d and r.
- For instantaneous and unique decodability we need the prefix condition, i.e. no codeword is a prefix of another; this is what separates prefix codes from non-prefix codes.
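As a quick illustration, the sketch below compares the two codes on "abracadabra". Only a - 0 and b - 10 are given on the slide, so the codewords chosen here for c, d and r are assumptions made to complete the example.

    # Hypothetical completion of the slide's variable-length code:
    # only "a -> 0" and "b -> 10" are from the slide; codewords for r, c, d are assumed.
    fixed = {s: format(i, "03b") for i, s in enumerate("abcdr")}       # 3 bits per symbol
    vlc = {"a": "0", "b": "10", "r": "110", "c": "1110", "d": "1111"}  # a prefix code

    msg = "abracadabra"
    print(sum(len(fixed[s]) for s in msg))  # 33 bits with the fixed-length code
    print(sum(len(vlc[s]) for s in msg))    # 23 bits: 'a' occurs 5 times and costs only 1 bit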

Optimality of prefix codes
- The optimal data compression achievable by any VLC can always be achieved by a prefix code.
- A prefix code can be represented by a labeled binary tree, with each codeword given by the path from the root to a leaf.
- An optimal prefix code is always represented by a full binary tree. Example prefix code: {0, 100, 101, 110, 111}.
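The prefix condition is easy to check directly by comparing every pair of codewords; a minimal sketch (the second, non-prefix code is an invented counterexample, not from the slide):

    def is_prefix_free(codewords):
        """Return True if no codeword is a prefix of another."""
        for c1 in codewords:
            for c2 in codewords:
                if c1 != c2 and c2.startswith(c1):
                    return False
        return True

    print(is_prefix_free(["0", "100", "101", "110", "111"]))  # True: the slide's prefix code
    print(is_prefix_free(["0", "01", "11"]))                  # False: "0" is a prefix of "01"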

Huffman codes
- Developed in 1952 by D. A. Huffman.
- Let the source alphabet be s_1, s_2, ..., s_N with probabilities of occurrence p_1, p_2, ..., p_N.
- Step 1: Sort the symbols in decreasing order of probability.
- Step 2: Merge the two symbols with the lowest probabilities, say s_{N-1} and s_N. Replace the pair (s_{N-1}, s_N) by H_{N-1}, whose probability is p_{N-1} + p_N. The new symbol set has N-1 members: s_1, s_2, ..., s_{N-2}, H_{N-1}.
- Step 3: Repeat Step 2 until all symbols have been merged.
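A minimal sketch of this construction (a priority queue of partial trees, merging the two least-probable trees each round); the alphabet is assumed to be given as a symbol-to-probability dictionary:

    import heapq
    from itertools import count

    def huffman_code(probabilities):
        """Build a Huffman code for a dict {symbol: probability}.

        Repeatedly merges the two lowest-probability trees, then reads the
        codewords off the final binary tree (0 = left branch, 1 = right branch).
        """
        tiebreak = count()  # keeps the heap from ever comparing trees directly
        heap = [(p, next(tiebreak), sym) for sym, p in probabilities.items()]
        heapq.heapify(heap)

        while len(heap) > 1:
            p1, _, t1 = heapq.heappop(heap)   # two lowest-probability trees
            p2, _, t2 = heapq.heappop(heap)
            heapq.heappush(heap, (p1 + p2, next(tiebreak), [t1, t2]))

        code = {}
        def walk(tree, prefix):
            if isinstance(tree, list):        # internal node: recurse into both children
                walk(tree[0], prefix + "0")
                walk(tree[1], prefix + "1")
            else:                             # leaf: record the codeword for this symbol
                code[tree] = prefix or "0"    # a single-symbol alphabet still gets 1 bit
        walk(heap[0][2], "")
        return code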

Huffman codes (contd.)
- The process can be viewed as the construction of a binary tree. On completion, every symbol s_i is a leaf node; the codeword for s_i is obtained by traversing the tree from the root to the leaf corresponding to s_i.
- In the slide's example the average code length is 2.2 bits/symbol.
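As an illustration, the probabilities below are assumed (they are not taken from the slide) and happen to yield an average code length of 2.2 bits/symbol when fed to the huffman_code() sketch above:

    # Assumed five-symbol source, chosen so the average length comes out to 2.2 bits.
    probs = {"s1": 0.4, "s2": 0.2, "s3": 0.2, "s4": 0.1, "s5": 0.1}
    code = huffman_code(probs)
    avg_len = sum(p * len(code[s]) for s, p in probs.items())
    print(code)      # one valid assignment; exact codewords depend on tie-breaking
    print(avg_len)   # ~2.2 bits/symbol (the source entropy is about 2.12 bits/symbol)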

Properties of Huffman codes
- Building the optimum code for a given data set requires two passes (one to gather the statistics, one to encode).
- Code construction complexity is O(N log N).
- Fast lookup-table based implementations are possible.
- Requires at least one bit per symbol.
- The average codeword length is within one bit of the zero-order entropy (tighter bounds are known).
- Susceptible to bit errors.

Huffman codes - blocking symbols to improve efficiency
- p(w) = 0.8, p(b) = 0.2: entropy = 0.72 bits/symbol, bit-rate = 1.0, efficiency = 72%.
- Blocking pairs: p(ww) = 0.64, p(wb) = p(bw) = 0.16, p(bb) = 0.04: bit-rate = 0.80, efficiency = 90%.
- Blocking three symbols gives an alphabet of size 8, an average bit-rate of 0.75 and an efficiency of 95%.
- Problem: the alphabet size, and consequently the Huffman table size, grows exponentially with the number of symbols blocked.

Run-length codes
- Encode runs of symbols rather than the symbols themselves: "bbbaaadddddcfffffffaaaaaddddd" is encoded as 3b3a5d1c7f5a5d.
- Especially suitable for a binary alphabet, where only the run lengths need to be sent because the two symbols alternate, e.g. a sequence of runs 2, 7, 7, 2, 1, 6, 5.
- The run lengths can themselves be further encoded using a VLC.
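A compact sketch of run-length encoding using Python's groupby:

    from itertools import groupby

    def run_length_encode(s):
        """Collapse runs of identical symbols into (count, symbol) pairs."""
        return [(len(list(g)), sym) for sym, g in groupby(s)]

    print(run_length_encode("bbbaaadddddcfffffffaaaaaddddd"))
    # [(3, 'b'), (3, 'a'), (5, 'd'), (1, 'c'), (7, 'f'), (5, 'a'), (5, 'd')]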

Arithmetic coding
- We have seen that alphabet extension, i.e. blocking symbols prior to coding, can improve coding efficiency.
- How about treating the entire sequence as one symbol? This is not practical with Huffman coding, but arithmetic coding allows you to do precisely that.
- Basic idea: map data sequences to sub-intervals of (0,1) whose lengths equal the probabilities of the corresponding sequences.
- To encode a given sequence, transmit any number within its sub-interval.

Arithmetic coding - mapping sequences to sub-intervals
(Figure: the unit interval is partitioned among a, b, c; the sub-interval for a is partitioned again into aa, ab, ac, and the sub-interval for aa into aaa, aab, aac, so every sequence corresponds to a nested, progressively narrower sub-interval.)
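A minimal sketch of this interval mapping; the three-symbol model and its probabilities (p(a) = 0.5, p(b) = 0.3, p(c) = 0.2) are assumptions, since the slide's figure gives no numeric values:

    # Assumed model: each symbol owns a slice of (0,1), here as cumulative intervals.
    MODEL = {"a": (0.0, 0.5), "b": (0.5, 0.8), "c": (0.8, 1.0)}

    def sequence_interval(seq, model=MODEL):
        """Return the sub-interval [lo, hi) of (0,1) assigned to a sequence."""
        lo, hi = 0.0, 1.0
        for sym in seq:
            c_lo, c_hi = model[sym]
            rng = hi - lo
            lo, hi = lo + rng * c_lo, lo + rng * c_hi   # refine within the current interval
        return lo, hi

    print(sequence_interval("a"))    # (0.0, 0.5)
    print(sequence_interval("aa"))   # (0.0, 0.25)
    print(sequence_interval("aab"))  # (0.125, 0.2)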

Arithmetic coding - encoding example
- Message is "lluure?" (we use "?" as the message terminator). Start from the initial partition of the (0,1) interval given by the symbol probabilities.
- Each symbol narrows the current interval to its assigned sub-interval. Transmit any number within the final range; here about 16 bits suffice (a Huffman coder needs 18 bits; a fixed-length coder 21 bits).

Arithmetic coding - decoding example
Using the symbol probabilities and the modified decoder table of cumulative probabilities from the slide. Initialize lo = 0, hi = 1, range = 1.
1. Find i = 6 such that cumprob_6 <= (value - lo)/range < cumprob_5; thus the first decoded symbol is l.
2. Update: hi = 0.25, lo = 0.05, range = 0.20.
3. To decode the next symbol, find i = 6 such that cumprob_6 <= (value - 0.05)/0.2 < cumprob_5; thus the next decoded symbol is l.
4. Update: hi = 0.10, lo = 0.06, range = 0.04.
5. Repeat the above steps until the decoded symbol is "?", then terminate decoding.
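For completeness, a toy decoder matching the sequence_interval() sketch above (same assumed three-symbol model; the sequence length is passed explicitly instead of using a terminator symbol):

    def decode(value, length, model=MODEL):
        """Recover a sequence of `length` symbols from a number inside its interval."""
        lo, hi, out = 0.0, 1.0, []
        for _ in range(length):
            rng = hi - lo
            x = (value - lo) / rng                      # position within the current interval
            for sym, (c_lo, c_hi) in model.items():
                if c_lo <= x < c_hi:                    # pick the symbol whose slot contains x
                    out.append(sym)
                    lo, hi = lo + rng * c_lo, lo + rng * c_hi
                    break
        return "".join(out)

    lo, hi = sequence_interval("aab")
    print(decode((lo + hi) / 2, 3))   # 'aab'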

Arithmetic coding - implementation issues
- Incremental output at encoder and decoder: from the example discussed earlier, note that after encoding u the sub-interval is [0.07, 0.074), so the digits 07 can already be output. After encoding the next symbol, the range is [0.071, ...), so the digit 1 can be output as well.
- Precision: intervals can get arbitrarily small, so the interval must be rescaled. In principle you rescale every time you transmit; in practice the interval is rescaled every time it shrinks below half its original size (this gives rise to some subtle problems, which can be taken care of).

Golomb-Rice codes
- The Golomb code with parameter m for a positive integer n codes n div m (the quotient) in unary and n mod m (the remainder) in binary.
- When m is a power of 2 there is a particularly simple realization, also known as the Rice code.
- Example: n = 22, k = 2 (m = 4). n = 22 = '10110'. Shift n right by k (= 2) bits to get '101' = 5. Output 5 '0's followed by a '1', then also output the last k bits of n ('10'). So the Golomb-Rice code for 22 is '00000110'.
- Decoding is simple: count the zeros up to the first 1, which gives the quotient 5; then read the next k (= 2) bits, '10' = 2, and n = m x 5 + 2 = 22.
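A small sketch of the Rice special case (m = 2^k), following the encode/decode procedure described on the slide:

    def rice_encode(n, k):
        """Rice code (Golomb code with m = 2**k): quotient in unary, remainder in k bits."""
        q, r = n >> k, n & ((1 << k) - 1)
        return "0" * q + "1" + format(r, "0{}b".format(k))

    def rice_decode(bits, k):
        q = bits.index("1")                  # count the zeros before the first 1
        r = int(bits[q + 1:q + 1 + k], 2)    # the next k bits are the remainder
        return (q << k) + r

    print(rice_encode(22, 2))            # '00000110'
    print(rice_decode("00000110", 2))    # 22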

Comparison
- In practice, for images, arithmetic coding gives a 10-20% improvement in compression ratio over a simple Huffman coder; its complexity is, however, considerably higher.
- Golomb-Rice codes, if used efficiently, have been demonstrated to give performance within 5 to 10% of arithmetic coding, and they are potentially simpler than Huffman codes.
- Multiplication-free binary arithmetic coders (the Q and QM coders) give performance within 2 to 4% of M-ary arithmetic coders.

Further reading for entropy coding
- Text Compression - T. Bell, J. Cleary and I. Witten, Prentice Hall. Good coverage of arithmetic coding.
- The Data Compression Book - M. Nelson and J.-L. Gailly, M&T Books. Includes source code.
- Image and Video Compression Standards - V. Bhaskaran and K. Konstantinides, Kluwer International. Hardware implementations.