EE591U Wavelets and Filter Banks
Copyright Xin Li

Roadmap to Lossy Image Compression
- JPEG standard: DCT-based image coding
- First-generation wavelet coding
  - FBI WSQ standard
- Second-generation schemes
  - Embedded Zerotree Wavelet (EZW)
  - A unified where-and-what perspective
  - A classification-based interpretation
- Beyond wavelet coding

A Tour of the JPEG Coding Algorithm
[Figure: flow chart of the DCT-based coding algorithm specified by the Joint Photographic Experts Group (JPEG), with blocks labeled T, Q, and C]

Transform Coding of Images
- Why not transform the whole image together?
  - It requires a large memory to store the transform matrix
  - It is not a good idea for compression, because image statistics vary spatially within an image
- Idea of partitioning an image into blocks
  - Each block is viewed as a smaller image and processed independently
  - Partitioning is not magic, but a compromise

8-by-8 DCT Basis Images
[Figure: the 64 basis images of the 8-by-8 DCT]
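
The basis images on this slide can be generated numerically. The following is a minimal sketch, assuming the orthonormal DCT-II; the function names (`dct_basis_1d`, `dct_basis_image`, `inner`) are hypothetical helpers, not part of the course material.

```python
import math

N = 8  # block size used by JPEG


def dct_basis_1d(k, n, size=N):
    """Sample n of the k-th orthonormal DCT-II basis vector."""
    scale = math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * size))


def dct_basis_image(u, v, size=N):
    """Basis image (u, v): outer product of two 1-D basis vectors.

    u is the vertical frequency index, v the horizontal one.
    """
    return [[dct_basis_1d(u, y, size) * dct_basis_1d(v, x, size)
             for x in range(size)]
            for y in range(size)]


def inner(a, b):
    """Inner product of two images stored as lists of rows."""
    return sum(p * q for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b))


# The (0, 0) basis image is constant; higher indices oscillate faster.
dc_image = dct_basis_image(0, 0)
edge_image = dct_basis_image(0, 1)
```

Because the 64 basis images are orthonormal, any 8-by-8 block can be written as a weighted sum of them, and the weights are the DCT coefficients.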

Block Processing under MATLAB
- Type "help blkproc" to learn the usage of this function
- B = BLKPROC(A,[M N],FUN) processes the image A by applying the function FUN to each distinct M-by-N block of A, padding A with zeros if necessary.
- Example:
    I = imread('cameraman.tif');
    fun = @dct2;                % block function: 2-D DCT of each block
    J = blkproc(I,[8 8],fun);   % apply fun to every distinct 8-by-8 block
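
For readers without MATLAB, the behavior described above can be sketched in plain Python. This is only an illustration of distinct-block processing with zero padding; `block_process` is a hypothetical name, and the sketch assumes that the block function returns a block of the same size as its input.

```python
def block_process(image, block_shape, fun):
    """Apply fun to each distinct m-by-n block of image (a list of rows).

    The image is zero-padded on the bottom and right so that its size
    becomes a multiple of the block size, as the blkproc help text says.
    """
    m, n = block_shape
    rows, cols = len(image), len(image[0])
    padded_rows = -(-rows // m) * m  # round up to a multiple of m
    padded_cols = -(-cols // n) * n  # round up to a multiple of n
    padded = [[image[r][c] if r < rows and c < cols else 0
               for c in range(padded_cols)]
              for r in range(padded_rows)]
    out = [[0] * padded_cols for _ in range(padded_rows)]
    for r0 in range(0, padded_rows, m):
        for c0 in range(0, padded_cols, n):
            block = [row[c0:c0 + n] for row in padded[r0:r0 + m]]
            result = fun(block)  # assumed to return an m-by-n block
            for i in range(m):
                out[r0 + i][c0:c0 + n] = result[i]
    return out


def negate(block):
    """Toy block function used only to exercise block_process."""
    return [[-value for value in row] for row in block]
```

Passing a function that computes the 8-by-8 DCT of a block in place of `negate` would mirror the MATLAB example.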

Block-based DCT Example
[Figure: input image I and its block-based DCT J]
Note: white lines are artificially added along the border of each 8-by-8 block to show that each block is processed independently.

Boundary Padding
[Figure: example image with its padded regions marked]
When the width or height of an image is not a multiple of 8, the boundary is artificially padded with repeated columns/rows so that both dimensions become multiples of 8.
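
The padding rule can be sketched as follows, assuming the repeated columns/rows are copies of the last column and last row; `pad_to_multiple` is a hypothetical helper name.

```python
def pad_to_multiple(image, block=8):
    """Pad an image (list of rows) by repeating its last column and row
    until both dimensions are multiples of the block size."""
    rows, cols = len(image), len(image[0])
    extra_rows = (-rows) % block
    extra_cols = (-cols) % block
    # Repeat the last pixel of every row to extend the width.
    padded = [list(row) + [row[-1]] * extra_cols for row in image]
    # Repeat the last (already widened) row to extend the height.
    padded += [list(padded[-1]) for _ in range(extra_rows)]
    return padded


small = [[1, 2, 3],
         [4, 5, 6]]
padded_small = pad_to_multiple(small, block=4)
```

Repeating boundary pixels avoids the artificial edge that zero padding would create at the image border.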

Work with a Toy Example
Any 8-by-8 block in an image is processed in a similar fashion.

Encoding Stage I: Transform
Step 1: DC level shifting
[Figure: 128 (the DC level) is subtracted from every pixel of the 8-by-8 block]

Encoding Stage I: Transform (Cont'd)
Step 2: 8-by-8 DCT
[Figure: the level-shifted block is mapped by the 8×8 DCT to a block of transform coefficients]
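
The two transform steps (level shift, then 8-by-8 DCT) can be sketched as below. This is a direct matrix implementation, assuming the orthonormal DCT-II written as Y = C X C^T; the helper names are hypothetical, and a production codec would use a fast algorithm instead.

```python
import math


def dct_matrix(size=8):
    """Orthonormal DCT-II matrix C; row k holds the k-th basis vector."""
    return [[(math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size))
             * math.cos((2 * n + 1) * k * math.pi / (2 * size))
             for n in range(size)]
            for k in range(size)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def forward_transform(block):
    """Step 1: subtract the DC level 128. Step 2: 2-D DCT, Y = C X C^T."""
    shifted = [[pixel - 128 for pixel in row] for row in block]
    c = dct_matrix(len(block))
    return matmul(matmul(c, shifted), transpose(c))


# A flat block has energy only in the DC coefficient.
flat_coeffs = forward_transform([[130] * 8 for _ in range(8)])
```

For a flat block of value 130, the shifted pixels all equal 2 and the DC coefficient is their sum divided by 8, which is 16.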

Encoding Stage II: Quantization
Q-table: specifies the quantization stepsize (see slide #28)
Notes:
- The Q-table can be specified by the user
- The Q-table is scaled up/down by a chosen quality factor
- The quantization stepsize $Q_{ij}$ depends on the coordinates $(i,j)$ within the 8-by-8 block
- The quantization stepsize $Q_{ij}$ increases from top-left to bottom-right

Encoding Stage II: Quantization (Cont'd)
Example
[Figure: the quantizer $f$ maps each DCT coefficient $x_{ij}$ to a quantization index $s_{ij}$; in JPEG, $s_{ij} = \mathrm{round}(x_{ij} / Q_{ij})$]
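
A sketch of the quantization stage follows. The base table is the example luminance table from Annex K of the JPEG standard. The slides do not say how the quality factor scales the table, so the scaling rule below is an assumption borrowed from the widely used IJG libjpeg convention; all function names are hypothetical.

```python
import math

# Example luminance quantization table (JPEG standard, Annex K).
BASE_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scale_table(quality):
    """Scale BASE_Q by a quality factor in 1..100.

    ASSUMPTION: IJG libjpeg scaling rule; the slides do not specify one.
    """
    quality = min(100, max(1, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [[min(255, max(1, (q * scale + 50) // 100)) for q in row]
            for row in BASE_Q]


def quantize(coeffs, qtable):
    """s_ij = round(x_ij / Q_ij), rounding halves up."""
    return [[int(math.floor(x / q + 0.5)) for x, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, qtable)]


def dequantize(indices, qtable):
    """Inverse quantization: reconstructed value is s_ij * Q_ij."""
    return [[s * q for s, q in zip(s_row, q_row)]
            for s_row, q_row in zip(indices, qtable)]
```

Lower quality factors give larger stepsizes, so more coefficients round to zero and the bit stream becomes shorter.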

Encoding Stage III: Entropy Coding
Zigzag Scan
[Figure: the 8-by-8 block of quantization indices is read out in zigzag order]
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
EOB (End Of Block): all following coefficients are zero
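
The zigzag scan can be sketched by walking the anti-diagonals of the block in alternating directions. The function names are hypothetical, and for simplicity this sketch always appends an EOB marker after dropping the trailing zeros.

```python
def zigzag_order(size=8):
    """Return the (row, col) positions of a size-by-size block in zigzag order."""
    order = []
    for s in range(2 * size - 1):  # s = row + col indexes one anti-diagonal
        diagonal = [(i, s - i) for i in range(size) if 0 <= s - i < size]
        if s % 2 == 0:
            diagonal.reverse()  # even diagonals run from bottom-left to top-right
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Read a block in zigzag order, drop trailing zeros, and append EOB."""
    sequence = [block[r][c] for r, c in zigzag_order(len(block))]
    while sequence and sequence[-1] == 0:
        sequence.pop()
    return sequence + ["EOB"]


toy_block = [[0] * 8 for _ in range(8)]
toy_block[0][0], toy_block[0][1], toy_block[1][0] = 20, 5, -3
toy_sequence = zigzag_scan(toy_block)
```

Since the stepsizes grow toward the bottom-right, the zigzag order tends to group the zero indices at the end of the sequence, where a single EOB replaces them.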

Run-length Coding
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
- DC coefficient (the first entry, 20): DPCM coding
- AC coefficients (the remaining entries): run-length coding as (run, level) pairs
AC coefficients:
(5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
(run, level) pairs:
(0,5),(0,-3),(0,-1),(0,-2),(0,-3),(0,1),(0,1),(0,-1),(0,-1),(2,0),(0,1),
(0,2),(0,3),(0,-2),(0,1),(0,1),(6,0),(0,1),(0,1),(1,0),(0,1),EOB
The pairs are then Huffman coded to produce the encoded bit stream.
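
The pair list on this slide can be reproduced with a few lines of code. The sketch below follows the slide's notation, in which a nonzero level v becomes (0, v) and a run of n zeros becomes (n, 0); baseline JPEG instead pairs each zero run with the nonzero level that follows it. The DPCM helper assumes that the predictor for the first block is 0. Both function names are hypothetical.

```python
def run_length_encode(ac):
    """Encode AC coefficients in the slide's (run, level) notation."""
    pairs = []
    run = 0
    for value in ac:
        if value == "EOB":
            break
        if value == 0:
            run += 1  # extend the current run of zeros
        else:
            if run:
                pairs.append((run, 0))  # flush the pending zero run
                run = 0
            pairs.append((0, value))
    return pairs + ["EOB"]  # zeros pending at the end are implied by EOB


def dpcm_encode(dc_values):
    """Code each DC coefficient as its difference from the previous block's DC.

    ASSUMPTION: the predictor for the first block is 0.
    """
    previous = 0
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


slide_ac = [5, -3, -1, -2, -3, 1, 1, -1, -1, 0, 0, 1, 2, 3, -2, 1, 1,
            0, 0, 0, 0, 0, 0, 1, 1, 0, 1, "EOB"]
slide_pairs = run_length_encode(slide_ac)
```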

JPEG Decoding Stage I: Entropy Decoding
encoded bit stream → Huffman decoding:
- AC coefficients: run-length pairs
  (0,5),(0,-3),(0,-1),(0,-2),(0,-3),(0,1),(0,1),(0,-1),(0,-1),(2,0),(0,1),
  (0,2),(0,3),(0,-2),(0,1),(0,1),(6,0),(0,1),(0,1),(1,0),(0,1),EOB
- DC coefficient: DPCM decoding
Decoded sequence:
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
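
The inverse of the run-length and DPCM steps can be sketched as follows, using the same (run, level) notation as the slides. The function names are hypothetical, and the DPCM decoder assumes a predictor of 0 for the first block.

```python
def run_length_decode(pairs, length=63):
    """Expand the slide's (run, level) pairs into the AC coefficients of a block."""
    ac = []
    for item in pairs:
        if item == "EOB":
            break
        run, level = item
        if level == 0:
            ac.extend([0] * run)  # (n, 0) stands for a run of n zeros
        else:
            ac.append(level)      # (0, v) stands for the nonzero level v
    ac.extend([0] * (length - len(ac)))  # EOB: all remaining entries are zero
    return ac


def dpcm_decode(diffs):
    """Recover the DC coefficients by accumulating the coded differences.

    ASSUMPTION: the predictor for the first block is 0.
    """
    previous = 0
    dc_values = []
    for diff in diffs:
        previous += diff
        dc_values.append(previous)
    return dc_values


decoded_ac = run_length_decode([(0, 5), (0, -3), (2, 0), (0, 1), "EOB"])
```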

JPEG Decoding Stage II: Inverse Quantization
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
[Figure: the sequence is placed back into an 8-by-8 block by the inverse zigzag scan, then the inverse quantizer $f^{-1}$ is applied]

JPEG Decoding Stage III: Inverse Transform
[Figure: the dequantized block passes through the 8×8 IDCT, then 128 (the DC level) is added to every pixel]
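
The last two decoding stages can be sketched together: multiply each index by its stepsize, apply the inverse DCT, add back 128, and clip to the 8-bit range. The sketch assumes the orthonormal DCT-II, so the inverse is X = C^T Y C; the clipping and rounding details and all helper names are assumptions for illustration.

```python
import math


def dct_matrix(size=8):
    """Orthonormal DCT-II matrix C; row k holds the k-th basis vector."""
    return [[(math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size))
             * math.cos((2 * n + 1) * k * math.pi / (2 * size))
             for n in range(size)]
            for k in range(size)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def decode_block(indices, qtable):
    """Inverse quantization, inverse DCT, and DC level shift for one block."""
    size = len(indices)
    coeffs = [[indices[i][j] * qtable[i][j] for j in range(size)]
              for i in range(size)]
    c = dct_matrix(size)
    pixels = matmul(matmul(transpose(c), coeffs), c)  # X = C^T Y C
    # Add the DC level back, round, and clip to the range 0..255.
    return [[min(255, max(0, int(math.floor(p + 128 + 0.5)))) for p in row]
            for row in pixels]


# A block whose only nonzero index is the DC term decodes to a flat block.
dc_only = [[0] * 8 for _ in range(8)]
dc_only[0][0] = 2
flat_pixels = decode_block(dc_only, [[16] * 8 for _ in range(8)])
```

With a DC index of 2 and a stepsize of 16, the dequantized DC value is 32, which corresponds to a flat offset of 32 / 8 = 4 above the DC level.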

Quantization Noise
[Figure: original image $X$ and decoded image $\hat{X}$]
Distortion calculation: $\mathrm{MSE} = \|X - \hat{X}\|^2$
Rate calculation: Rate = length of encoded bit stream / number of pixels (bps)
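
Both quantities are simple to compute. In the sketch below the squared error is divided by the number of pixels, which is the usual convention for MSE, whereas the slide's formula shows only the squared norm. The PSNR helper assumes 8-bit images with a peak value of 255. The function names are hypothetical.

```python
import math


def mse(original, decoded):
    """Mean squared error per pixel between two images (lists of rows)."""
    total = sum((a - b) ** 2
                for row_a, row_b in zip(original, decoded)
                for a, b in zip(row_a, row_b))
    return total / (len(original) * len(original[0]))


def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB; assumes the images differ."""
    return 10.0 * math.log10(peak ** 2 / mse(original, decoded))


def bits_per_pixel(num_bytes, rows, cols):
    """Rate = length of the encoded bit stream in bits / number of pixels."""
    return 8.0 * num_bytes / (rows * cols)
```

For example, a 256-by-256 image coded into 8192 bytes has a rate of 1 bit per pixel.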

JPEG Examples
[Figure: the same image JPEG-coded at three settings]
- (58k bytes): best quality, lowest compression
- 50 (21k bytes)
- 10 (8k bytes): worst quality, highest compression

Roadmap to Lossy Image Compression
- Lifting scheme: unifying prediction and transform
- First-generation schemes
  - FBI WSQ standard
- Second-generation schemes
  - Probabilistic modeling of wavelet coefficients
  - Embedded Zerotree Wavelet (EZW)
  - SPIHT coder
  - A unified where-and-what perspective
- JPEG2000

Early Attempts
- Each band is modeled by a Gaussian random variable with zero mean and unknown variance (e.g., WSQ)
- Only a modest gain over JPEG (DCT-based) is achieved
- Question: is this an accurate model, and how can we test it?

FBI Wavelet Scalar Quantization (WSQ)
- $k$: band index
- $m_k = \frac{\text{image size}}{\text{subband size}}$
- Each band is approximately modeled by a Gaussian r.v.
- Given $R$, minimize [Equation: distortion objective]

Rate Allocation Problem*
[Figure: one-level subband decomposition with bands LL, LH, HL, HH]
- Given a quota of bits R, how should we allocate them to each band to minimize the overall MSE distortion?
- Solution: Lagrangian multiplier technique (turn a constrained optimization problem into an unconstrained one)
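
To make the Lagrangian solution concrete, the sketch below works under assumptions that are not stated on the slides: each band k has variance σ_k², its distortion follows the high-resolution model D_k = σ_k² 2^(-2 r_k), the overall distortion is Σ D_k / m_k, and the rate constraint is Σ r_k / m_k = R. Setting the derivative of the Lagrangian to zero then gives the same distortion θ in every band and r_k = ½ log2(σ_k² / θ). The function names are hypothetical, and the sketch does not handle the case where a band would receive a negative rate.

```python
import math


def allocate_rates(variances, m, total_rate):
    """Closed-form Lagrangian bit allocation under the assumptions above.

    variances[k] is the variance of band k, m[k] its downsampling ratio
    (image size / subband size), total_rate the budget R in bits per pixel.
    """
    weights = [1.0 / mk for mk in m]
    # Solve sum_k w_k * 0.5 * (log2(var_k) - log2(theta)) = R for log2(theta).
    log_theta = (sum(w * math.log2(v) for w, v in zip(weights, variances))
                 - 2.0 * total_rate) / sum(weights)
    return [0.5 * (math.log2(v) - log_theta) for v in variances]


def band_distortions(variances, rates):
    """High-resolution distortion model D_k = var_k * 2^(-2 r_k)."""
    return [v * 2.0 ** (-2.0 * r) for v, r in zip(variances, rates)]


# One-level decomposition: four bands (LL, LH, HL, HH), each with m_k = 4.
example_variances = [400.0, 25.0, 25.0, 4.0]
example_rates = allocate_rates(example_variances, [4, 4, 4, 4], 2.0)
```

Bands with larger variance receive more bits, and at the optimum every band ends up with the same distortion.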

Proof by Contradiction (I)
- Assumption: our modeling target Ω is the collection of natural images
- Suppose each coefficient $X$ in a high band does follow a Gaussian distribution, i.e., $X \sim N(0, \sigma^2)$. Then flipping the sign of $X$ (i.e., replacing $X$ with $-X$) should not matter, and should generate another element of Ω (i.e., a different but meaningful image)
- Let's test it!

Proof by Contradiction (II)
[Figure: image → DWT → sign flip → IWT → reconstructed image]
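
A one-dimensional toy version of this experiment can be run with a one-level Haar transform. This is only a stand-in for the 2-D DWT on the slide, chosen because it fits in a few lines; the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length signal."""
    low = [(signal[i] + signal[i + 1]) / SQRT2
           for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / SQRT2
            for i in range(0, len(signal), 2)]
    return low, high


def haar_inverse(low, high):
    """Inverse of haar_forward."""
    out = []
    for a, d in zip(low, high):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


def sign_flip_experiment(signal):
    """DWT, flip the sign of every high-band coefficient, then IWT."""
    low, high = haar_forward(signal)
    flipped = [-d for d in high]
    return haar_inverse(low, flipped)


step_edge = [10, 10, 10, 50, 50, 50]
flipped_edge = [round(v) for v in sign_flip_experiment(step_edge)]
```

The clean step edge in `step_edge` turns into an oscillation around the edge after the sign flip, even though a zero-mean Gaussian model assigns the same probability to both sets of coefficients.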

What is wrong with that?
- Think of two coefficients, one in a smooth region and the other around an edge: do they follow the same probability distribution?
- Think of all coefficients around the same edge: do they follow the same probability distribution?
- The Gaussian band model ignores topology and geometry

The Importance of Modeling Singularity Location Uncertainty
- Singularities carry critical visual information: edges, lines, corners, ...
- The location of singularities is important
  - Recall the locality of wavelets in the spatial-frequency domain
- Singularities in the spatial domain → significant coefficients in the wavelet domain

Where-and-What Coding
- Communication context
  [Figure: Alice sends a picture to Bob over a communication channel]
- Where: the location of significant coefficients
- What: the sign and magnitude of significant coefficients

Roadmap to Lossy Image Compression
- Lifting scheme: unifying prediction and transform
- First-generation schemes
  - FBI WSQ standard
- Second-generation schemes
  - Embedded Zerotree Wavelet (EZW)
  - A unified where-and-what perspective
  - A classification-based interpretation
- Scalable and ROI coding in JPEG2000

- Embedded Zerotree Wavelet (EZW), 1993
- Set Partitioning in Hierarchical Trees (SPIHT), 1995
- Space-Frequency Quantization (SFQ), 1996
- Estimation Quantization (EQ), 1997
- Embedded Block Coding with Optimal Truncation (EBCOT), 2000
- Least-Square Estimation Quantization (LSEQ), 2003

A Simpler Two-Stage Coding
- Position coding stage (where)
  - Generate a binary map indicating the location of significant coefficients (|X| > T)
  - Use context-based adaptive binary arithmetic coding (e.g., JBIG) to code the binary map
- Intensity coding stage (what)
  - Code the sign and magnitude of significant coefficients
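
The split into a "where" part and a "what" part can be sketched as below. The sketch only separates the two kinds of information and puts them back together; the arithmetic coding of the binary map and the entropy coding of the signs and magnitudes are left out. The function names and the raster-scan order are assumptions for illustration.

```python
def where_and_what(coeffs, threshold):
    """Split coefficients into a significance map and (sign, magnitude) pairs.

    A coefficient X is significant when |X| > threshold. Significant
    coefficients are listed in raster order; sign is 1 for negative values.
    """
    sig_map = [[1 if abs(x) > threshold else 0 for x in row] for row in coeffs]
    what = [(1 if x < 0 else 0, abs(x))
            for row in coeffs for x in row if abs(x) > threshold]
    return sig_map, what


def reconstruct(sig_map, what):
    """Rebuild the coefficients; insignificant positions are set to zero."""
    values = iter(what)
    out = []
    for row in sig_map:
        out_row = []
        for bit in row:
            if bit:
                sign, magnitude = next(values)
                out_row.append(-magnitude if sign else magnitude)
            else:
                out_row.append(0)
        out.append(out_row)
    return out


toy_coeffs = [[34, -2, 0],
              [-20, 3, 1]]
toy_map, toy_what = where_and_what(toy_coeffs, threshold=8)
```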

Classification-based Modeling
[Figure: distributions of the insignificant class, the significant class, and their mixture]

Classification Gain
- Without classification: [Equation]
- With classification: [Equation]
- Classification gain: [Equation]
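
One way to quantify the benefit of classification, offered here as an assumption because the slide's own formulas are not available, is to compare Gaussian high-rate coding costs. If a fraction p of the coefficients is significant with variance σ_s² and the rest is insignificant with variance σ_i², then coding them as one zero-mean source sees the mixture variance p σ_s² + (1 − p) σ_i², while coding each class separately sees its own variance. The sketch ignores the cost of sending the class labels (the "where" information), and the function name is hypothetical.

```python
import math


def classification_gain_bits(p, var_significant, var_insignificant):
    """Rate saving in bits per coefficient at equal distortion.

    ASSUMPTIONS: zero-mean Gaussian classes, high-rate model
    R(D) = 0.5 * log2(variance / D), and no side information cost.
    """
    mixture_var = p * var_significant + (1.0 - p) * var_insignificant
    rate_without = 0.5 * math.log2(mixture_var)
    rate_with = (p * 0.5 * math.log2(var_significant)
                 + (1.0 - p) * 0.5 * math.log2(var_insignificant))
    return rate_without - rate_with


# 10% significant coefficients whose variance is 100 times larger.
example_gain = classification_gain_bits(0.1, 100.0, 1.0)
```

By Jensen's inequality this gain is never negative under the stated model, and it vanishes when the two classes have the same variance.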

Example

Advanced Wavelet Coding
- SPIHT: a simpler yet more efficient implementation of the EZW coder
- SFQ: rate-distortion optimized zerotree coder
- EQ: rate-distortion optimization via backward adaptive classification
- EBCOT (adopted by JPEG2000): a versatile embedded coder

Beyond SPIHT
[Figure: decoded images before and after enhancement]
- JPEG-decoded at a rate of 0.32 bpp (PSNR = 32.07 dB)
- SFG-enhanced at a rate of 0.32 bpp (PSNR = 33.22 dB)
- SPIHT-decoded at a rate of 0.20 bpp (PSNR = 26.18 dB)
- SFG-enhanced at a rate of 0.20 bpp (PSNR = 27.33 dB)
- Maximum-Likelihood (ML) decoding
- Maximum a Posteriori (MAP) decoding

Open Problems Related to Image Coding
- Coding of specific classes of images (e.g., satellite, microarray, fingerprint)
- Coding of color-filter-array (CFA) images
- Error-resilient coding of images
- Perceptual image coding
- Image coding for pattern recognition

Coding of Specific Classes of Images
How can we design specific coding algorithms for each class?

CFA Image Coding
- Approach I: Bayer pattern → CFA interpolation (demosaicing) → color image compression
- Approach II: Bayer pattern → CFA data compression → CFA interpolation (demosaicing)
Which one is better and why?

Error Resilient Image Coding
[Figure: source → source encoder → channel encoder → channel → channel decoder → source decoder → destination; the channel encoder, channel, and channel decoder together form the super-channel]
How can we optimize the end-to-end performance in the presence of channel errors?

Perceptual Image Coding
- Characterizing image distortion is difficult!
- How do we objectively define image quality when it is ultimately subject to individual opinion?

Image Coding for PR
[Figure: image sensor → communication channel → pattern recognition]
- How does coding distortion affect the recognition performance?
- We need to develop a new image representation that can simultaneously support low-level (e.g., compression, denoising) and high-level (e.g., recognition and retrieval) vision tasks