Why Do We Need Image Compression? (EE565 Advanced Image Processing, copyright Xin Li 2009-2012)
Example: a digital camera (4 Mpixel). Raw data at 24 bits/pixel over 5.3 M pixels is about 16 MB per picture, so a 4 GB memory card ($10-30) holds only about 250 pictures. Feeding the 16 MB raw image through a JPEG encoder produces a compressed JPEG file of about 1 MB (compression ratio = 16), so the same card holds about 4000 pictures.
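A quick back-of-the-envelope check of these figures (a sketch in Python; the 4 GB capacity, 24 bits/pixel and compression ratio are the slide's round numbers):

    pixels = 5.3e6
    bits_per_pixel = 24
    raw_bytes = pixels * bits_per_pixel / 8      # ~16 MB per raw picture
    jpeg_bytes = raw_bytes / 16                  # compression ratio = 16  ->  ~1 MB
    card_bytes = 4e9                             # 4 GB memory card
    print(card_bytes / raw_bytes)                # ~250 raw pictures
    print(card_bytes / jpeg_bytes)               # ~4000 JPEG pictures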
Roadmap to Image Coding
Introduction to data compression
– A modeling perspective
– Shannon's entropy and rate-distortion theory* (skipped)
– Arithmetic coding and context modeling
Lossless image compression (covered in EE465)
– Spatially adaptive prediction algorithms
Lossy image compression
– Before the EZW era: first-generation wavelet coders
– After the EZW era: second-generation wavelet coders
– A quick tour of JPEG2000
New directions in image coding
Modeler's View on Image Coding
The slide's diagram organizes coding approaches by spatial-domain vs. transform-domain models and by stationary vs. non-stationary process assumptions:
– Spatial-domain, stationary: conventional predictors (MED, GAP)
– Spatial-domain, non-stationary: least-square-based edge-directed prediction; nonparametric (patch-based) models, as in intra coding in H.264
– Transform-domain, stationary: stationary GGD models (first-generation wavelet coders)
– Transform-domain, non-stationary: non-stationary GGD models (second-generation wavelet coders); patch-based transform models (next-generation coders)
Two Regimes
Lossless coding
– No distortion is tolerable
– The decoded signal is mathematically identical to the encoded one
Lossy coding
– Distortion is allowed for the purpose of achieving a higher compression ratio
– The decoded signal should be perceptually similar to the encoded one
Data Compression Basics
Shannon's source entropy formula. For a discrete source, X is a discrete random variable taking values x_1, ..., x_N with probabilities p_i = P(X = x_i), and its entropy is
H(X) = - sum_{i=1..N} p_i log2(p_i)   (bits/sample, or bps),
i.e., the average of the per-symbol information -log2(p_i) with the probabilities p_i as weighting coefficients.
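A minimal sketch of this formula in Python (the probability list below is the one used later in the Huffman toy example):

    import math

    def entropy(probs):
        """Shannon entropy H(X) = -sum(p * log2(p)), in bits/sample."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.4, 0.2, 0.2, 0.1, 0.1]))   # ~2.12 bits/sample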
Code Redundancy
Average code length (practical performance): l_avg = sum_i p_i * l_i, where l_i is the length of the codeword assigned to the i-th symbol.
Theoretical bound: H(X). Code redundancy: r = l_avg - H(X) >= 0.
Note: if we represent each symbol by q bits (fixed-length codes), then the redundancy is simply q - H(X) bps.
How to Achieve the Source Entropy?
Block diagram: discrete source X with known P(X) -> entropy coding -> binary bit stream.
Note: the above entropy coding problem is based on the simplified assumptions that the discrete source X is memoryless and that P(X) is completely known. These assumptions often do not hold for real-world data such as images; we will discuss them later.
Two Goals of VLC Design
– Achieve the optimal code length (i.e., minimal redundancy): for an event x with probability p(x), the optimal code length is ceil(-log2 p(x)) bits, where ceil(y) denotes the smallest integer not less than y (e.g., ceil(3.4) = 4).
– Satisfy the prefix condition.
Code redundancy r = l_avg - H(X): unless the probabilities of the events are all powers of 2, we often have r > 0.
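A small illustrative sketch of these code lengths and the resulting redundancy (the helper name and probability lists are illustrative, not prescribed data):

    import math

    def shannon_lengths_and_redundancy(probs):
        """Per-symbol lengths ceil(-log2 p) and redundancy r = average length - H(X)."""
        lengths = [math.ceil(-math.log2(p)) for p in probs]
        avg_len = sum(p * l for p, l in zip(probs, lengths))
        entropy = -sum(p * math.log2(p) for p in probs)
        return lengths, avg_len - entropy

    print(shannon_lengths_and_redundancy([0.5, 0.25, 0.25]))          # dyadic probabilities -> r = 0
    print(shannon_lengths_and_redundancy([0.4, 0.2, 0.2, 0.1, 0.1]))  # r > 0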
Prefix Condition
No codeword is allowed to be the prefix of any other codeword.
(Figure: example codewords such as 10, 1011… and 111110 illustrate the condition; a codeword like 10 may not appear as the beginning of another codeword such as 1011….)
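A direct check of this condition (a small sketch; the codeword sets are illustrative):

    def is_prefix_free(codewords):
        """Return True if no codeword is a prefix of another codeword."""
        for a in codewords:
            for b in codewords:
                if a != b and b.startswith(a):
                    return False
        return True

    print(is_prefix_free(["0", "100", "101", "110", "111"]))   # True
    print(is_prefix_free(["10", "1011"]))                      # False: "10" is a prefix of "1011"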
Huffman Codes (Huffman, 1952)
Coding procedure for an N-symbol source:
– Source reduction
  List all probabilities in descending order.
  Merge the two symbols with the smallest probabilities into a new compound symbol.
  Repeat the above two steps for N-2 steps.
– Codeword assignment
  Start from the smallest (fully reduced) source and work back to the original source.
  Each merging point corresponds to a node in the binary codeword tree.
A Toy Example
Symbols and probabilities: e (0.4), a (0.2), i (0.2), o (0.1), u (0.1).
Source reduction (compound symbols): merge o and u into (ou) with probability 0.2; merge i and (ou) into (iou) with probability 0.4; merge a and (iou) into (aiou) with probability 0.6; the fully reduced source is {e (0.4), (aiou) (0.6)}.
Codeword assignment (working back from the reduced source): e = 1, (aiou) = 0; a = 01, (iou) = 00; i = 000, (ou) = 001; o = 0010, u = 0011.
Average code length = 0.4(1) + 0.2(2) + 0.2(3) + 0.1(4) + 0.1(4) = 2.2 bits/symbol, compared with H(X) ≈ 2.12 bits/symbol.
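A compact heap-based Huffman construction sketch in Python (ties may be broken differently than in the slide's table, so individual codewords can differ while the average length remains optimal):

    import heapq

    def huffman_code(probs):
        """probs: dict symbol -> probability. Returns dict symbol -> codeword."""
        heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            p1, _, group1 = heapq.heappop(heap)          # smallest probability
            p2, _, group2 = heapq.heappop(heap)          # second smallest
            merged = {s: "0" + w for s, w in group1.items()}
            merged.update({s: "1" + w for s, w in group2.items()})
            heapq.heappush(heap, (p1 + p2, counter, merged))
            counter += 1                                 # tie-breaker so dicts are never compared
        return heap[0][2]

    codes = huffman_code({"e": 0.4, "a": 0.2, "i": 0.2, "o": 0.1, "u": 0.1})
    print(codes)   # one optimal prefix code; average length = 2.2 bits/symbol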
Arithmetic Coding
– One of the major milestones in data compression (much like the Lempel-Ziv coding used in WinZIP).
– A building block of almost all existing compression algorithms for text, audio, image and video.
– A remarkably simple idea that is easy to implement (and especially computationally efficient in the special case of binary arithmetic coding).
Basic Idea
The input sequence is mapped to a unique interval in [0,1] (identified by any real number inside it).
– The more symbols are coded, the smaller this interval becomes (and therefore the more bits it takes to represent it).
– The size of the interval is proportional to the probability of the whole sequence.
Note that we still assume the source X is memoryless; sources with memory will be handled by the context modeling techniques discussed next.
Example
Input sequence: SQUEEZ…  Alphabet: {E, Q, S, U, Z} with P(E) = 0.429, P(Q) = 0.142, P(S) = 0.143, P(U) = 0.143, P(Z) = 0.143.
Since the source is memoryless, P(SQUEEZ) = P(S)·P(Q)·P(U)·P(E)²·P(Z) ≈ 7.6 × 10⁻⁵, which is approximately the width of the interval on the next slide.
Example (Cont'd)
Coding the symbol sequence SQUEEZ… narrows the interval down to [0.64769, 0.64777].
Notes:
– Any number between 0.64769 and 0.64777 decodes to a sequence starting with SQUEEZ.
– How do we know when the sequence stops? I.e., how can the encoder distinguish between SQUEEZ and SQUEEZE?
– The mapping of a real number to a binary bit stream is easy: successively halve [0,1], each halving determining the first bit, the second bit, and so on.
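A bare-bones sketch of the interval-narrowing step (floating point, with no renormalization or termination handling; practical coders such as the Witten-Neal-Cleary implementation cited below use integer arithmetic, and the exact endpoints depend on how the alphabet is ordered inside [0,1), so they need not match the slide's numbers):

    def narrow_interval(sequence, probs):
        """Map a symbol sequence to its arithmetic-coding interval [low, high)."""
        symbols = sorted(probs)                      # fix an ordering of the alphabet
        cum, c = {}, 0.0
        for s in symbols:                            # cumulative probabilities: each symbol
            cum[s] = c                               # owns a slice of [0,1)
            c += probs[s]
        low, high = 0.0, 1.0
        for s in sequence:
            width = high - low
            high = low + width * (cum[s] + probs[s])
            low = low + width * cum[s]
        return low, high

    p = {"E": 0.429, "Q": 0.142, "S": 0.143, "U": 0.143, "Z": 0.143}
    print(narrow_interval("SQUEEZ", p))   # an interval of width ~7.6e-5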
Another Example
Solution to the termination question above: use a special symbol to denote the end of block (EOB). For example, if we use "!" as the EOB symbol, "eaii" becomes "eaii!". In other words, we assign a nonzero probability to the EOB symbol.
Implementation Issues
See: Witten, I. H., Neal, R. M., and Cleary, J. G., "Arithmetic coding for data compression," Commun. ACM 30(6), June 1987, pp. 520-540.
Arithmetic Coding Summary
Based on the given probability model P(X), arithmetic coding (AC) maps the symbol sequence to a unique number between 0 and 1, which can then be conveniently represented by binary bits.
You will compare Huffman coding and arithmetic coding in your homework and learn how to use AC in the computer assignment.
Context Modeling
– Arithmetic coding (entropy coding) solves the problem under the assumption that P(X) is known.
– In practice, we do not know P(X) and have to estimate it from the data.
– More importantly, the memoryless assumption on the source X does not hold for real-world data.
Probability Estimation Problem
Given a sequence of symbols, how do we estimate the probability of each individual symbol?
– Forward solution: the encoder counts the frequency of each symbol over the whole sequence and transmits the frequency table to the decoder as overhead.
– Backward solution (more popular in practice): both encoder and decoder count the frequency of each symbol on-the-fly from the causal past only (so no overhead is needed).
Examples
For simplicity, we consider a binary symbol sequence (the M-ary case is conceptually similar).
S = {0,0,0,0,0,0,1,0,0,0,0,0,0,1,1,1}
Forward approach: count 4 "1"s and 12 "0"s over the whole sequence, so P(0) = 3/4, P(1) = 1/4.
Backward approach (counts N(0), N(1) initialized to 1 and updated after each symbol):
  input   P(0)   P(1)   N(0)   N(1)
  start   1/2    1/2     1      1
  0       2/3    1/3     2      1
  0       3/4    1/4     3      1
  0       4/5    1/5     4      1
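A minimal sketch of this backward (adaptive) counting rule, with the same add-one initialization as the table:

    def backward_estimates(bits):
        """Yield (P(0), P(1)) before coding each bit, updating counts causally."""
        n0, n1 = 1, 1                        # counts initialized to 1, as in the table
        for b in bits:
            total = n0 + n1
            yield n0 / total, n1 / total     # probabilities available to code this bit
            if b == 0:
                n0 += 1
            else:
                n1 += 1

    S = [0,0,0,0,0,0,1,0,0,0,0,0,0,1,1,1]
    for p0, p1 in backward_estimates(S[:4]):
        print(round(p0, 3), round(p1, 3))    # 1/2,1/2  2/3,1/3  3/4,1/4  4/5,1/5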
Backward Adaptive Estimation
The probability estimation is based on the causal past within a specified window of length T (i.e., the source is assumed to be Markovian).
Example: for the sequence 0,0,0,0,0,0,1,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,1,1,0,0,0,0, counting symbols inside the slide's window T (which covers a recent stretch containing 60% zeros and 40% ones) gives P(0) = 0.6 and P(1) = 0.4.
Such adaptive estimation is particularly effective for handling sequences with dynamically varying statistics.
Now Comes Context
Importance of context
– Context is a fundamental concept to help us resolve ambiguity. The best-known example: by quoting Darwin "out of context," creationists attempt to convince their followers that Darwin didn't believe the eye could evolve by natural selection.
Why do we need context?
– To handle the memory in the source.
– Context-based modeling often leads to better estimation of probability models.
Order of Context
First-order context: in "q u o t e", P(u) << P(u|q).
Second-order context: e.g., "s h o c k w a v e", where each symbol is conditioned on its two predecessors.
Note (context dilution problem): if the source X has N different symbols, K-th order context modeling defines N^K different contexts (e.g., consider N = 256 for images).
Context-Adaptive Probability Estimation
Rule of thumb: in estimating probabilities, only those symbols with the same context are used when counting frequencies.
1D example: for the sequence 0,1,0,1,0,1,0,1, 0,1,0,1,0,1,0,1, 0,1,0,1,0,1,0,1,
– zero-order (no) context: P(0) = P(1) = 1/2
– first-order context: P(1|0) = P(0|1) = 1, P(0|0) = P(1|1) = 0
2D Example (Binary Image)
(Figure: a small binary image, 36 pixels in total with 20 zeros and 16 ones, the 1s forming a solid block.)
Zero-order context: P(0) = 5/9, P(1) = 4/9.
First-order context W-X, where W is the west neighbor: P(X|W) = ?
– W = 0 (total 20): P(0|0) = 4/5, P(1|0) = 1/5
– W = 1 (total 16): P(0|1) = 1/4, P(1|1) = 3/4
Fourth-order context (NW, N, NE, W): e.g., P(1|1111) = 1.
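A small sketch of context-conditional counting on a binary image (the example image and the decision to skip border pixels are illustrative assumptions, so the numbers need not match the slide's exactly):

    from collections import Counter

    def west_context_probs(img):
        """Estimate P(X | W) on a 2D binary image, W = west neighbor (first column skipped)."""
        counts = Counter()
        for row in img:
            for w, x in zip(row[:-1], row[1:]):        # (west neighbor, current pixel) pairs
                counts[(w, x)] += 1
        probs = {}
        for w in (0, 1):
            total = counts[(w, 0)] + counts[(w, 1)]
            if total:
                probs[w] = (counts[(w, 0)] / total, counts[(w, 1)] / total)
        return probs                                   # {W: (P(0|W), P(1|W))}

    img = [[0, 0, 0, 0, 0, 0],                         # illustrative 6x6 image:
           [0, 1, 1, 1, 1, 0],                         # a 4x4 block of 1s surrounded by 0s
           [0, 1, 1, 1, 1, 0],
           [0, 1, 1, 1, 1, 0],
           [0, 1, 1, 1, 1, 0],
           [0, 0, 0, 0, 0, 0]]
    print(west_context_probs(img))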
Data Compression Summary
– Entropy coding is solved by arithmetic coding techniques.
– Context plays an important role in the statistical modeling of sources with memory (there is a context dilution problem, which can be handled by quantizing the context information).
– Quantization of a memoryless source is solved by the Lloyd-Max algorithm.
Quantization Theory (Rate-Distortion Theory)
Block diagram: x -> Q -> x̂.  Quantization noise: e = x - x̂.
For a continuous random variable, distortion is defined through the probability density function: D = E[(x - x̂)²] = ∫ (x - x̂)² f(x) dx.
For a discrete random variable, distortion is defined through the probabilities: D = Σ_i p_i (x_i - x̂_i)².
Recall: Quantization Noise of UQ
The quantization noise of a uniform quantizer (UQ) with step size Δ applied to a uniform distribution is itself uniformly distributed: f(e) = 1/Δ for e in [-Δ/2, Δ/2].
Recall that the variance of U[-Δ/2, Δ/2] is Δ²/12.
(Figure: the input pdf on [-A, A] and the noise pdf f(e) on [-Δ/2, Δ/2].)
6dB/bit Rule of UQ
Signal: X ~ U[-A, A], so σ_X² = A²/3.
Choose N = 2^n (n-bit) codewords for X, giving quantization step size Δ = 2A/N = 2A/2^n.
Noise: e ~ U[-Δ/2, Δ/2], so σ_e² = Δ²/12 = A²/(3·2^{2n}).
Hence SNR = 10·log10(σ_X²/σ_e²) = 10·log10(2^{2n}) ≈ 6.02·n dB: each additional bit buys about 6 dB.
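A quick empirical check of the rule (a sketch simulating an n-bit uniform quantizer on U[-A, A] samples):

    import math
    import random

    def uq_snr_db(n_bits, A=1.0, num_samples=200_000):
        """Empirical SNR (dB) of an n-bit uniform quantizer on X ~ U[-A, A]."""
        delta = 2 * A / (2 ** n_bits)
        sig_pow = noise_pow = 0.0
        for _ in range(num_samples):
            x = random.uniform(-A, A)
            q = (math.floor(x / delta) + 0.5) * delta   # center of the cell containing x
            sig_pow += x * x
            noise_pow += (x - q) ** 2
        return 10 * math.log10(sig_pow / noise_pow)

    for n in (2, 4, 6, 8):
        print(n, round(uq_snr_db(n), 1))   # roughly 12, 24, 36, 48 dB, i.e. ~6 dB per bit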
Shannon's R-D Function
The R-D function of a source X determines the optimal tradeoff between rate and distortion (the R-D curve plots R against D):
R(D) = min R  subject to  E[(X - X̂)²] ≤ D.
A Few Cautious Notes Regarding Distortion
– Unlike rate (how many bits are used?), the definition of distortion is not trivial at all.
– Mean squared error (MSE) is widely used and will be our focus in this class.
– However, for image signals, MSE has little correlation with subjective quality (the design of perceptual image coders is a very interesting research problem that is still largely open).
Gaussian Random Variable
X: a given random variable with Gaussian distribution N(0, σ²). Its rate-distortion function is known in closed form:
R(D) = (1/2)·log2(σ²/D) for 0 < D ≤ σ², and R(D) = 0 for D > σ².
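A direct transcription of this function (the σ² and D values below are illustrative):

    import math

    def gaussian_rd(D, sigma2):
        """R(D) of a Gaussian source N(0, sigma2) under MSE distortion."""
        return 0.5 * math.log2(sigma2 / D) if D < sigma2 else 0.0

    print(gaussian_rd(0.25, 1.0))     # 1.0 bit/sample
    print(gaussian_rd(0.0625, 1.0))   # 2.0 bits/sample: every factor-of-4 drop in D costs 1 bit
    print(gaussian_rd(2.0, 1.0))      # 0.0 bits: just send the mean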
Quantizer Design Problem
For a memoryless source X with pdf p(x), how do we design a quantizer (i.e., where to put the L = 2^K codewords) so as to minimize the distortion?
Solution: the Lloyd-Max algorithm, which minimizes the MSE (we will study it in detail on the blackboard).
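A minimal sample-based sketch of the Lloyd iteration (the pdf-based form alternates the same two conditions, nearest-codeword partition and centroid update; the Gaussian training data here is illustrative):

    import random

    def lloyd_max(samples, L, iters=50):
        """Design L scalar codewords minimizing MSE on the given training samples."""
        codewords = sorted(random.sample(samples, L))            # initial codebook
        for _ in range(iters):
            cells = [[] for _ in range(L)]
            for x in samples:                                    # nearest-codeword partition
                i = min(range(L), key=lambda j: (x - codewords[j]) ** 2)
                cells[i].append(x)
            codewords = [sum(c) / len(c) if c else codewords[i]  # centroid (conditional mean)
                         for i, c in enumerate(cells)]
        return sorted(codewords)

    data = [random.gauss(0, 1) for _ in range(20_000)]
    print([round(c, 2) for c in lloyd_max(data, 4)])   # close to ±0.45, ±1.51 for N(0,1)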
Rate Allocation Problem*
Given a quota of bits R, how should we allocate them among the subbands (LL, LH, HL, HH in the slide's figure) to minimize the overall MSE distortion?
Solution: the Lagrangian multiplier technique (we will study it in detail on the blackboard).
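One standard outcome of that Lagrangian optimization, under a high-rate Gaussian approximation with equally sized subbands, is the closed-form allocation R_i = R_avg + (1/2)·log2(σ_i²/GM), where GM is the geometric mean of the subband variances. A sketch (the variances are made-up numbers, and negative allocations would have to be clipped and re-optimized in practice):

    import math

    def allocate_bits(variances, avg_rate):
        """High-rate optimal bit allocation across equal-size subbands (may go negative)."""
        gm = math.exp(sum(math.log(v) for v in variances) / len(variances))   # geometric mean
        return [avg_rate + 0.5 * math.log2(v / gm) for v in variances]

    subband_var = {"LL": 100.0, "HL": 9.0, "LH": 9.0, "HH": 1.0}   # illustrative variances
    rates = allocate_bits(list(subband_var.values()), avg_rate=2.0)
    print({b: round(r, 2) for b, r in zip(subband_var, rates)})    # more bits to high-variance bands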
Gap Between Theory and Practice
Information-theoretic results offer little help in the practice of data compression.
– What is the entropy of English text, audio, speech or images? (Curse of dimensionality.)
– Without exact knowledge of the subband statistics, how can we solve the rate allocation problem? (Image subbands are nonstationary and non-Gaussian.)
– What is the class of image data we want to model in the first place? (Importance of understanding the physical origin of the data and its implications for compression.)