Image Compression - JPEG
Video Compression
MPEG
  – Audio compression
      Lossy / perceptually lossless / lossless
      3 layers
      Models based on speech generation (throat) or ear characteristics
  – Image compression: JPEG based
  – Images take much more memory than voice
      An image is worth a thousand words
      Which thousand words?
  – Video: next week, can we extrapolate?
Image Compression Basics
Model driven
Reduce data redundancy
  – Neighboring values on a line scan in an image
  – DPCM, predictive coding
Human perception properties
  – Human visual system {eye/brain} is more sensitive to some information than to other information {low frequencies vs high frequencies}; be careful: edges are often critical
  – Enhancement approaches
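As a rough illustration of DPCM on one scan line, the sketch below replaces each sample with its difference from the previous sample. The function names, the sample values, and the initial prediction of 0 are assumptions made for this example, not part of the slides.

```python
def dpcm_encode(line):
    """Replace each sample with its difference from the previous sample."""
    residuals = []
    previous = 0  # assumed initial prediction for the first sample
    for sample in line:
        residuals.append(sample - previous)
        previous = sample
    return residuals


def dpcm_decode(residuals):
    """Rebuild the scan line by accumulating the differences."""
    line = []
    previous = 0
    for residual in residuals:
        previous += residual
        line.append(previous)
    return line


# Hypothetical gray levels along one scan line.
scan_line = [100, 102, 103, 103, 105, 110, 110, 109]
residuals = dpcm_encode(scan_line)
print(residuals)  # [100, 2, 1, 0, 2, 5, 0, -1]
```

After the first sample the residuals are small because neighboring values are similar, and small values can be coded with fewer bits.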
Entropy
Entropy is a measure of the uncertainty of the input: the higher the uncertainty, the higher the entropy.
  – Which has higher entropy: noise or a 300 Hz sine wave?
  – Computation is histogram based: p(i) = probability of occurrence of gray level i in the image
  – $E = -\sum_i p(i) \log_2 p(i)$
  – Gives the minimum average number of bits per pixel needed to represent the image when gray levels are coded independently
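A minimal sketch of the histogram-based entropy computation follows. The function name `entropy_bits` and the three test images are assumptions for illustration; each image is given as a flat list of gray levels.

```python
import math
from collections import Counter


def entropy_bits(pixels):
    """Entropy in bits per pixel, estimated from the gray-level histogram."""
    counts = Counter(pixels)
    total = len(pixels)
    # p(i) = count / total, and -p * log2(p) is written as p * log2(1 / p).
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat = [128] * 16           # one gray level only: no uncertainty
two_level = [0, 255] * 8    # two equally likely gray levels
uniform16 = list(range(16))  # sixteen equally likely gray levels

print(entropy_bits(flat))       # 0.0
print(entropy_bits(two_level))  # 1.0
print(entropy_bits(uniform16))  # 4.0
```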
Compression Issues
Progressive display
  – Display partially decompressed images
  – User begins to see parts of the image and does not have to wait for complete decompression
Hierarchical encoding
  – Encode images at multiple resolution levels
  – Display images at a lower resolution level and then incrementally improve the quality
Asymmetry
  – Time for encoding
  – Time for decoding
Types of compression
Lossless
  – Huffman, LZW, Run length, DPCM?
  – Typical compression: 3:1
Lossy
  – Predictive
  – Frequency based: transform, subbands
  – Spatial based: filtering, non-linear quantization, vector quantization
Hybrid
JPEG is based on
Huffman coding
  – Optimal entropy encoding (among codes that assign a whole number of bits to each symbol)
Run length encoding
  – Used in G3 fax
Discrete Cosine Transform
  – Frequency based
  – Apply perception rules in the frequency domain
The fidelity and level of compression can be controlled: 15:1 or even better
Huffman encoding
Assign fewer bits to symbols {pixel values} that occur more frequently
Number of bits per symbol is non-uniform
The code book has to be made available to the decoder, i.e. storing it with the data increases the file size
Results in optimal encoding
  – Average number of bits per symbol is close to the entropy (within one bit)
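The sketch below builds Huffman code lengths with a priority queue and compares the average code length with the entropy. The function names and the sample pixel values are assumptions for this example; only code lengths are computed, not the actual bit patterns.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, tie_breaker, {symbol: depth in subtree}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def average_bits(symbols, lengths):
    """Average number of bits per symbol under the given code lengths."""
    return sum(lengths[s] for s in symbols) / len(symbols)


# Hypothetical pixel values with probabilities 1/2, 1/4, 1/8, 1/8.
pixels = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
lengths = huffman_code_lengths(pixels)
print(lengths)                        # {0: 1, 1: 2, 2: 3, 3: 3}
print(average_bits(pixels, lengths))  # 1.75
```

For this distribution the entropy is also 1.75 bits per symbol, because every probability is a power of 1/2. In general the Huffman average is at or slightly above the entropy.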
Run length Encoding
Run length, size, amplitude
  – RL: 4 bits
  – Size: 4 bits
  – Amplitude: 10 bits
Maximum compression if the run lengths are long
Run length encoding is also used in G3 fax
Usually use Huffman to encode the parameters
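A minimal sketch of the (run length, size, amplitude) representation follows, applied to the AC coefficients of one block. The function name and the sample coefficients are assumptions. The two special symbols used here, (15, 0) for sixteen zeros and (0, 0) for end of block, follow baseline JPEG conventions that the slide does not mention.

```python
def run_length_encode(ac_coefficients):
    """Turn a coefficient list into (run, size, amplitude) triples.

    run  = number of zeros preceding a non-zero coefficient (0..15, 4 bits)
    size = number of bits needed to represent the amplitude
    """
    symbols = []
    run = 0
    for value in ac_coefficients:
        if value == 0:
            run += 1
            continue
        while run > 15:
            symbols.append((15, 0, 0))  # stands for sixteen zeros in a row
            run -= 16
        size = abs(value).bit_length()
        symbols.append((run, size, value))
        run = 0
    if run > 0:
        symbols.append((0, 0, 0))  # end of block: only zeros remain
    return symbols


# Hypothetical AC coefficients of one 8 x 8 block (63 values).
ac = [5, 0, 0, -3, 0, 0, 0, 1] + [0] * 55
encoded = run_length_encode(ac)
print(encoded)  # [(0, 3, 5), (2, 2, -3), (3, 1, 1), (0, 0, 0)]
```

Sixty-three coefficients collapse into four symbols here because the block ends in one long run of zeros.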
Discrete Cosine Transform
Real-valued cousin of the Fourier transform
Complexity
  – N*N operations for a direct N-point transform
  – Fast DCT, similar to FFT
  – To reduce cost
      Divide image into 8 x 8 blocks
      Compute DCT of blocks
      Reduces the size of the object to be compressed
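The sketch below computes the 2-D DCT of one 8 x 8 block directly from the definition, without any fast algorithm. The function name and the flat test block are assumptions; the scaling used is the orthonormal one, which for N = 8 matches the usual JPEG form.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Direct 2-D DCT-II of an N x N block (slow, for illustration only)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


# A flat block has no variation, so all energy lands in the DC term.
flat_block = [[10] * N for _ in range(N)]
coeffs = dct_2d(flat_block)
print(round(coeffs[0][0], 6))  # 80.0
```

All other coefficients of the flat block come out as zero up to floating-point error. This is the behavior the later quantization step relies on: smooth blocks need very few non-zero coefficients.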
Quantization
The eye is more sensitive to the lower frequencies
Divide each frequency component by a constant
Divide higher frequency components by a larger value
Round the result to an integer; this reduces the number of non-zero values
Up to four quantization matrices can be used in a JPEG image
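A rough sketch of the quantization step is shown below. The table produced by `make_table` is a made-up example whose divisors grow with frequency; it is not one of the tables from the JPEG standard. The coefficient values are also synthetic.

```python
def make_table(base=8, step=4):
    """Hypothetical quantization table: larger divisors at higher frequencies."""
    return [[base + step * (u + v) for v in range(8)] for u in range(8)]


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer.

    Python's round() sends exact .5 ties to the nearest even integer.
    """
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)]
            for u in range(8)]


def dequantize(levels, table):
    """Approximate reconstruction performed by the decoder."""
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]


# Synthetic coefficients that decay with frequency, as in smooth image blocks.
coeffs = [[400 / (1 + u + v) ** 2 for v in range(8)] for u in range(8)]
table = make_table()
levels = quantize(coeffs, table)
nonzero = sum(1 for row in levels for value in row if value != 0)
print(levels[0][:4])  # [50, 8, 3, 1]
print(nonzero)        # 15
```

Only 15 of the 64 values remain non-zero, and they cluster at the low-frequency corner of the block.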
Color
RGB planes
Transform RGB into YUV
  – Y: luminance
  – U, V: chrominance
  – The eye resolves U and V at lower spatial resolution
Down sample U and V to take advantage of this lower resolution
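The color transform and chrominance down-sampling can be sketched as follows. The slide says YUV; this example assumes the YCbCr variant with the coefficients used by JFIF files, and it assumes 2 x 2 averaging as the down-sampling filter. Function names are illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to Y, Cb, Cr (JFIF coefficients, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def downsample_2x2(plane):
    """Halve both dimensions by averaging each 2 x 2 group.

    Assumes the plane has even width and height.
    """
    height, width = len(plane), len(plane[0])
    return [[(plane[i][j] + plane[i][j + 1]
              + plane[i + 1][j] + plane[i + 1][j + 1]) / 4
             for j in range(0, width, 2)]
            for i in range(0, height, 2)]


# A mid-gray pixel has no color, so both chrominance values sit near 128.
y, cb, cr = rgb_to_ycbcr(128, 128, 128)

# Hypothetical 2 x 2 chrominance plane reduced to a single sample.
small = downsample_2x2([[1, 3], [5, 7]])
print(small)  # [[4.0]]
```

With 2 x 2 down-sampling of both chrominance planes, the three planes together hold half as many samples as the original RGB data.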
Overview of JPEG
RGB → YUV
Down sample UV
Original data is 8 bits per pixel, all non-negative [0, 255]. Shift to [-128, 127].
Divide image into 8x8 blocks
DCT on each block
Use quantization table to quantize values in each block {reducing high freq content}
Use zig-zag scanning to order values in each block
Organize data into bands {DC, low f, mid f, high f}
Run length encoding
Huffman encoding
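Two of the smaller steps in this overview, the level shift and the zig-zag scan, can be sketched as below. The function names and the test block are assumptions for illustration; positions are written as (row, column).

```python
def level_shift(block):
    """Shift 8-bit samples from [0, 255] to [-128, 127]."""
    return [[sample - 128 for sample in row] for row in block]


def zigzag_order(n=8):
    """Positions of an n x n block in zig-zag order.

    Positions are visited one anti-diagonal at a time; the direction
    of travel alternates from one diagonal to the next.
    """
    return sorted(
        ((u, v) for u in range(n) for v in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def zigzag_scan(block):
    """Flatten a block into a list, low frequencies first."""
    return [block[u][v] for u, v in zigzag_order(len(block))]


# Label each position with its row-major index to make the order visible.
labels = [[u * 8 + v for v in range(8)] for u in range(8)]
scan = zigzag_scan(labels)
print(scan[:6])  # [0, 1, 8, 16, 9, 2]
```

After quantization the non-zero values sit near the start of this ordering, so the tail of the scan is a long run of zeros that run length encoding handles well.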
Reference G. K. Wallace, “The JPEG Still Picture Compression Standard”, Communications of the ACM, April 1991, vol 34, No. 4, pp