Published by Mary Charles. Modified over 8 years ago.
1
M. Wu: ENEE631 Digital Image Processing (Spring'09) Transform Coding and JPEG Spring ’09 Instructor: Min Wu Electrical and Computer Engineering Department, University of Maryland, College Park bb.eng.umd.edu (select ENEE631 S’09) minwu@eng.umd.edu ENEE631 Spring’09 Lecture 11 (3/2/2009)
2
M. Wu: ENEE631 Digital Image Processing (Spring'09) Lec11 – Transf. Coding & JPEG [2] Overview and Logistics Last Time: –Unitary transform and properties –DCT transform Today: –Wrap up 2-D unitary transform and basis images –Transform coding –Putting together: JPEG compression standard Logistics –Assignment #3 posted last Friday. Due Friday noon March 13. –Mid-term: Wednesday March 25 in class. UMCP ENEE631 Slides (created by M.Wu © 2004)
3
Clarifications: “Dimension” in different contexts
Dimension of a signal ~ # of index variables
– Audio and speech are 1-D signals: over time or a sampled time index
– An image is 2-D: over two spatial indices (horizontal and vertical)
– Video is 3-D: over two spatial indices and one time index
Dimension of an image ~ size of the digital image
– How many pixels along each row and column, e.g. the 512x512 Lena image
– Also referred to as the “resolution” of an image
Dimension of a vector space ~ # of basis vectors in it
– For $[x(1), \ldots, x(N)]^T$ ~ # of elements in the vector
4
Summary and Review
1-D transform of a vector
– Represent an N-sample sequence as a vector in an N-dimensional vector space
– Transform
  • A different representation of this vector in the space via a different basis
  • e.g., 1-D DFT from time domain to frequency domain
– Forward transform
  • In the form of an inner product
  • Project a vector onto a new set of basis vectors to obtain N “coefficients”
– Inverse transform
  • Use a linear combination of basis vectors, weighted by the transform coefficients, to represent the original signal
2-D transform of a matrix
– Generally can rewrite the matrix into a long vector and apply a 1-D transform
– A separable transform allows applying the transform to rows, then columns
UMCP ENEE631 Slides (created by M.Wu © 2001)
5
Summary and Review (cont’d)
Vector/matrix representation of 1-D & 2-D sampled signals
– Representing an image as a matrix or sometimes as a long vector
Basis functions/vectors and orthonormal basis
– Used for representing the space via their linear combinations
– Many possible sets of bases and orthonormal bases
Unitary transform on input x ~ $A^{-1} = A^{*T}$
– $y = A x$, $x = A^{-1} y = A^{*T} y = \sum_i a_i^{*T} y(i)$ ~ represented by basis vectors $\{a_i^{*T}\}$
– Rows (and columns) of a unitary matrix form an orthonormal basis
General 2-D transform and separable unitary 2-D transform
– 2-D transform involves $O(N^4)$ computation
– Separable: $Y = A X A^T = (A X) A^T$ ~ $O(N^3)$ computation
  • Apply 1-D transform to all columns, then apply 1-D transform to rows
– For a non-square image of size $M \times N$: $Y = A_{M \times M} X A^T_{N \times N}$; basis images are $M \times N$
UMCP ENEE631 Slides (created by M.Wu © 2001)
6
Review: 1-D Unitary Transf.
Consider a linear invertible transform
– 1-D sequence { x(0), x(1), …, x(N-1) } as a vector
– $y = A x$ and A is invertible
Unitary matrix: A is unitary if $A^{-1} = A^{*T} = A^H$
– Denote $A^{*T}$ as $A^H$ ~ “Hermitian”
– $x = A^{-1} y = A^{*T} y = \sum_i a_i^{*T} y(i)$
– Hermitians of the row vectors of a unitary matrix A form a set of orthonormal basis vectors $\{a_i^{*T}\}$
Think: how about column vectors of A?
Orthogonal matrix ~ $A^{-1} = A^T$
– A real-valued unitary matrix is also an orthogonal matrix
– Row vectors of a real orthogonal matrix A form orthonormal basis vectors
UMCP ENEE631 Slides (created by M.Wu © 2001)
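The unitary condition can be checked numerically by forming $A A^H$ and comparing it with the identity. The sketch below is only an illustration in plain Python; the helper names and the three test matrices are assumptions made for this example (they are not the matrices from the lecture's exercise figure).

```python
import math

def conj_transpose(A):
    """Return A^H (conjugate transpose) of a matrix stored as a list of rows."""
    rows, cols = len(A), len(A[0])
    return [[complex(A[r][c]).conjugate() for r in range(rows)] for c in range(cols)]

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_unitary(A, tol=1e-9):
    """True if A A^H equals the identity to within tol (A must be square)."""
    n = len(A)
    P = matmul(A, conj_transpose(A))
    return all(abs(P[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
A1 = [[s, s], [s, -s]]            # hypothetical: real orthogonal, hence unitary
A2 = [[s, 1j * s], [1j * s, s]]   # hypothetical: complex unitary
A3 = [[2, 3], [1, 2]]             # hypothetical: invertible but not unitary

print(is_unitary(A1), is_unitary(A2), is_unitary(A3))
```

For a real matrix the same test reduces to $A A^T = I$, which is the orthogonality check.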
7
Exercise: Is each matrix here unitary or orthogonal?
– If yes, what are the basis vectors?
[Matrices not reproduced.] Answer notes: 1: no; 2: yes. inv(A3) = [2, –3; –1, 2]; check whether A A’ = I.
UMCP ENEE631 Slides (created by M.Wu © 2001/2004)
8
Review: 1-D DCT
Linear transform matrix C is real and orthogonal
$c(k,n) = \alpha(0)$ for $k = 0$; $c(k,n) = \alpha(k) \cos\left[\frac{\pi (2n+1) k}{2N}\right]$ for $k > 0$, where $\alpha(0) = \sqrt{1/N}$ and $\alpha(k) = \sqrt{2/N}$ for $k > 0$
– Rows of C form an orthonormal basis
– Relation and comparison with DFT: C is not symmetric
  • DCT is related to the DFT of a symmetrically extended signal, which gives less discontinuity at boundaries and better energy compaction
UMCP ENEE631 Slides (created by M.Wu © 2001/2004)
Figure is from slides at Gonzalez/ Woods DIP book website (Chapter 8)
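As a quick sanity check of the definition, the DCT matrix can be built entry by entry and tested for orthonormal rows. This is a minimal sketch in plain Python assuming the standard orthonormal DCT-II normalization ($\alpha(0) = \sqrt{1/N}$, $\alpha(k) = \sqrt{2/N}$); the function names are illustrative only.

```python
import math

def dct_matrix(N):
    """Rows are the orthonormal DCT-II basis vectors c(k, n)."""
    C = []
    for k in range(N):
        alpha = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        C.append([alpha * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                  for n in range(N)])
    return C

def gram(C):
    """Return C C^T, which is the identity when the rows are orthonormal."""
    N = len(C)
    return [[sum(C[i][n] * C[j][n] for n in range(N)) for j in range(N)]
            for i in range(N)]

C8 = dct_matrix(8)
G = gram(C8)

# Largest deviation of C C^T from the identity (should be at rounding level).
max_dev = max(abs(G[i][j] - (1.0 if i == j else 0.0))
              for i in range(8) for j in range(8))

# C is orthogonal but not symmetric, unlike the unitary DFT matrix.
is_symmetric = all(abs(C8[i][j] - C8[j][i]) < 1e-12
                   for i in range(8) for j in range(8))

print(max_dev, is_symmetric)
```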
9
Relation between DCT and DFT – see Assignment #3
10
Revisit: Example of 1-D DCT with N=8
[Figure: original signal z(n); transform coefficients Z(k); the eight basis vectors; reconstructions using coefficients u=0, u=0 to 1, …, u=0 to 7.]
From Ken Lam’s DCT talk 2001 (HK Polytech)
UMCP ENEE631 Slides (created by M.Wu © 2001)
11
2-D Transform: General Case
A general 2-D linear transform with kernel $\{a_{k,l}(m,n)\}$: $y(k,l) = \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} x(m,n)\, a_{k,l}(m,n)$
– y(k,l) is a transform coefficient for image {x(m,n)}
– {y(k,l)} is the “transformed image”
– Equivalent to rewriting everything from 2-D to 1-D and applying a 1-D transform
Computational complexity
– $N^2$ values to compute
– $N^2$ terms in the summation per output coefficient
– $O(N^4)$ for transforming an NxN image!
UMCP ENEE631 Slides (created by M.Wu © 2001/2004)
12
2-D Linear Transforms
Image transform kernel $\{a_{k,l}(m,n)\}$
– y(k,l) is a transform coefficient for image {x(m,n)}
– {y(k,l)} is the “transformed image”
– Equivalent to rewriting everything from 2-D to 1-D and applying a 1-D transform
May generalize the transform (series expansion) from NxN to NxM
[Figure: basis vectors a1, a2, a3; orthogonal projection gives the minimum distance.]
Orthonormality condition
– $\sum_{m} \sum_{n} a_{k,l}(m,n)\, a^*_{k',l'}(m,n) = \delta(k-k', l-l')$
– Assures that any truncated expansion of the form $x_{P,Q}(m,n) = \sum_{k=0}^{P-1} \sum_{l=0}^{Q-1} y(k,l)\, a^*_{k,l}(m,n)$ minimizes the sum of squared errors when y(k,l) take the values above
Completeness condition
– $\sum_{k} \sum_{l} a_{k,l}(m,n)\, a^*_{k,l}(m',n') = \delta(m-m', n-n')$
– Assures zero error when taking the full basis
UMCP ENEE631 Slides (created by M.Wu © 2001)
13
2-D Separable Unitary Transf.
Focus our attention on separable transforms
– $a_{k,l}(m,n) = a_k(m)\, b_l(n)$, denoted as $a(k,m)\, b(l,n)$
– Can apply the transform as a succession of 1-D transforms to rows, then columns
Use a 1-D unitary transform as the building block
– $\{a_k(m)\}_k$ and $\{b_l(n)\}_l$ each form a set of orthonormal basis vectors
  • Use them as row vectors to obtain unitary matrices $A = \{a(k,m)\}$ and $B = \{b(l,n)\}$
– Apply to columns and rows: $Y = A X B^T$
  • Often choose the same unitary matrix for A and B (e.g., 2-D DFT)
For a square NxN image X: $Y = A X A^T$, $X = A^H Y A^*$
– For a rectangular MxN image X: $Y = A_M X A_N^T$, $X = A_M^H Y A_N^*$
Complexity ~ $O(N^3)$
– May further reduce complexity if the unitary transform has a fast implementation
UMCP ENEE631 Slides (created by M.Wu © 2001)
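A small numeric instance of $Y = A X A^T$ and $X = A^H Y A^*$ may help. The sketch below assumes a real 2x2 Hadamard-type matrix A and the image X = [1, 2; 3, 4]; both are chosen for illustration (they reproduce the coefficient matrix [5, –1; –2, 0] quoted in the later basis-image exercise), and the helper names are hypothetical.

```python
import math

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def forward(A, X):
    """Y = A X A^T: 1-D transform of the columns (A X), then of the rows."""
    return matmul(matmul(A, X), transpose(A))

def inverse(A, Y):
    """X = A^T Y A, the real-valued case of X = A^H Y A^*."""
    return matmul(matmul(transpose(A), Y), A)

s = 1 / math.sqrt(2)
A = [[s, s], [s, -s]]      # assumed example transform (real, orthogonal)
X = [[1.0, 2.0], [3.0, 4.0]]  # assumed example image

Y = forward(A, X)          # approximately [[5, -1], [-2, 0]]
X_back = inverse(A, Y)     # recovers X up to rounding

print(Y)
print(X_back)
```

Each of the two matrix products costs $O(N^3)$, which is where the separable complexity figure comes from.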
14
Basis Images for Separable Transform
$X = A^H Y A^*$ => $x(m,n) = \sum_k \sum_l a^*(k,m)\, a^*(l,n)\, y(k,l)$
– Represent X with NxN basis images weighted by the coefficients in Y
– Obtain a basis image by setting $Y = \{\delta(k-k_0, l-l_0)\}$ and getting $X = \{a^*(k_0,m)\, a^*(l_0,n)\}_{m,n}$
Basis image in matrix form: $A^*_{k,l} = a^*_k\, a_l^{*T}$ ~ $a^*_k$ is the k-th column vector of $A^H$
The transform coefficient y(k,l) is the inner product of the basis image $A^*_{k,l}$ with the image X
UMCP ENEE631 Slides (created by M.Wu © 2001)
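The outer-product construction can be tried on a 2x2 case. The following sketch assumes a real Hadamard-type A (so the columns of $A^H$ are simply the rows of A) and the coefficient matrix Y = [5, –1; –2, 0]; these values and the variable names are assumptions for illustration.

```python
import math

def outer(u, v):
    """Outer product u v^T of two vectors given as lists."""
    return [[a * b for b in v] for a in u]

s = 1 / math.sqrt(2)
A = [[s, s], [s, -s]]   # assumed real orthogonal transform
N = len(A)

# For real A, column k of A^H equals row k of A.
cols = [[A[k][m] for m in range(N)] for k in range(N)]

# Basis image (k, l) is the outer product of columns k and l of A^H.
basis = {(k, l): outer(cols[k], cols[l]) for k in range(N) for l in range(N)}

Y = [[5.0, -1.0], [-2.0, 0.0]]   # assumed transform coefficients

# X is the sum of the basis images weighted by the coefficients y(k, l).
X = [[sum(Y[k][l] * basis[(k, l)][m][n] for k in range(N) for l in range(N))
      for n in range(N)] for m in range(N)]

# Each coefficient is the inner product of X with its (real) basis image.
y_check = {(k, l): sum(X[m][n] * basis[(k, l)][m][n]
                       for m in range(N) for n in range(N))
           for k in range(N) for l in range(N)}

print(basis[(0, 0)])
print(X)
```

Here `basis[(0, 0)]` is the constant image with all entries 1/2, and the weighted sum returns X = [1, 2; 3, 4] up to rounding.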
15
Exercise on Basis Images
For a 2-D separable unitary transform: $Y = A X A^T$ => $X = A^H Y A^*$
– Represent X with NxN basis images weighted by the coefficients in Y
– Obtain a basis image by setting $Y = \{\delta(k-k_0, l-l_0)\}$ and getting X
In matrix form: $A^*_{k,l} = a^*_k\, a_l^{*T}$ ~ $a^*_k$ is the k-th column vector of $A^H$
Exercise:
– Is A a unitary transform or not?
– If so, find the basis images
– Represent an image X with the basis images
(Jain’s e.g. 5.1, p. 137: A’ [5, –1; –2, 0] A; outer product of columns of $A^H$: [1,1]’[1 1]/2, …)
UMCP ENEE631 Slides (created by M.Wu © 2001)
16
Review and Exercise on Basis Images
Exercise:
– Is A a unitary transform or not?
– Find the basis images
– Represent an image X with the basis images
(Jain’s e.g. 5.1, p. 137: A’ [5, –1; –2, 0] A; outer product of columns of $A^H$: [1,1]’[1 1]/2, …)
UMCP ENEE631 Slides (created by M.Wu © 2001)
17
2-D DCT
Separable orthogonal transform
– Apply the 1-D DCT to each row, then to each column: $Y = C X C^T$, $X = C^T Y C = \sum_m \sum_n y(m,n)\, B_{m,n}$
– Set y(m,n)=1 and the rest to zero to obtain basis image $B_{m,n}$ ~ outer product of C’s m-th and n-th rows
DCT basis images:
– Equivalent to representing an NxN image with a set of orthonormal NxN “basis images”
– Each DCT coefficient indicates the contribution from (or similarity to) the corresponding basis image
UMCP ENEE631 Slides (created by M.Wu © 2001)
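The relations $Y = C X C^T$ and $X = C^T Y C$ can be exercised on an 8x8 block. This is a rough sketch in plain Python (no fast algorithm), assuming the orthonormal DCT-II matrix; the flat and ramp test blocks are made up for the example.

```python
import math

def dct_matrix(N):
    """Orthonormal DCT-II matrix; row k is the k-th basis vector."""
    return [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
             * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
             for n in range(N)] for k in range(N)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dct2(X):
    """Y = C X C^T."""
    C = dct_matrix(len(X))
    return matmul(matmul(C, X), transpose(C))

def idct2(Y):
    """X = C^T Y C."""
    C = dct_matrix(len(Y))
    return matmul(matmul(transpose(C), Y), C)

def energy(M):
    return sum(v * v for row in M for v in row)

flat = [[10.0] * 8 for _ in range(8)]                           # hypothetical flat block
ramp = [[float(8 * r + c) for c in range(8)] for r in range(8)]  # hypothetical ramp block

Y_flat = dct2(flat)   # only the DC coefficient is non-zero: 8 * 10 = 80
Y_ramp = dct2(ramp)
ramp_back = idct2(Y_ramp)

print(round(Y_flat[0][0], 6), energy(ramp), round(energy(Y_ramp), 6))
```

Because C is orthogonal, the transform preserves total energy; compression gains come from how that energy concentrates in a few coefficients.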
18
Revisit: 2-D DFT
Recall: the 2-D DFT is separable: $Y = F X F$, $X = F^* Y F^*$
– Basis images $B_{k,l} = a_k\, a_l^T$ ~ outer product of two vectors, where $a_k = [\,1,\ W_N^{-k},\ W_N^{-2k},\ \ldots,\ W_N^{-(N-1)k}\,]^T / \sqrt{N}$
UMCP ENEE631 Slides (created by M.Wu © 2001/2004)
19
Visualizing Fourier Basis Images
Fourier basis uses complex exponentials
– Their real and imaginary parts give smoothly varying sinusoidal patterns at different frequencies and orientations
$\exp[\,j 2\pi (ux + vy)\,] = \cos[2\pi (ux + vy)] + j \sin[2\pi (ux + vy)]$
[Figure: real (cos) and imaginary (sin) parts of the basis for (u, v) = (1, 0), (0, 5), (1, 1).]
Figures from Mani Thomas U.Del CISC489/689 2D FT http://vims.cis.udel.edu/~mani/TACourses/Spring06/cv/index.html
20
8x8 DFT Basis Images
Figures from John Woods’ book.
21
Transform Coding
UMCP ENEE631 Slides (created by M.Wu © 2004)
22
Transform Coding
Use a transform to pack energy into only a few coefficients
How many bits should be allocated to each coefficient?
– More bits for coefficients with high variance $\sigma_k^2$ to keep the total MSE small
– Also determined by perceptual importance
From Jain’s Fig.11.15
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
23
Zonal Coding and Threshold Coding
Zonal coding
– Only transmit a small predetermined zone of transformed coefficients
Threshold coding
– Transmit coefficients that are above certain thresholds
Compare
– Threshold coding is inherently adaptive
  • introduces smaller distortion for the same # of coded coefficients
– Threshold coding has overhead in specifying the indices of the coded coefficients
  • run-length coding helps to reduce the overhead
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
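The comparison can be made concrete on a toy coefficient block. In the sketch below the zone is assumed to be the low-frequency triangle k + l < zone, threshold coding is modeled as keeping the same number of largest-magnitude coefficients, and the 4x4 block is hypothetical; none of these specifics come from the slides.

```python
def zonal_mask(N, zone):
    """Keep the fixed low-frequency triangle k + l < zone (an assumed zone shape)."""
    return [[(k + l) < zone for l in range(N)] for k in range(N)]

def threshold_mask(Y, count):
    """Keep the `count` largest-magnitude coefficients (data-dependent selection)."""
    N = len(Y)
    ranked = sorted(((abs(Y[k][l]), k, l) for k in range(N) for l in range(N)),
                    reverse=True)
    keep = {(k, l) for _, k, l in ranked[:count]}
    return [[(k, l) in keep for l in range(N)] for k in range(N)]

def retained_energy(Y, mask):
    """Sum of squared coefficients that survive the mask."""
    return sum(Y[k][l] ** 2 for k in range(len(Y)) for l in range(len(Y))
               if mask[k][l])

# Hypothetical coefficient block with one strong high-frequency term at (2, 3).
Y = [[120.0, 30.0, 4.0, 1.0],
     [25.0, 6.0, 2.0, 0.0],
     [3.0, 1.0, 0.0, 40.0],
     [1.0, 0.0, 0.0, 0.0]]

zmask = zonal_mask(4, 2)                           # keeps (0,0), (0,1), (1,0)
count = sum(flag for row in zmask for flag in row)  # same budget for both schemes
tmask = threshold_mask(Y, count)                   # keeps 120, 40, 30

e_zonal = retained_energy(Y, zmask)
e_thresh = retained_energy(Y, tmask)
print(count, e_zonal, e_thresh)
```

With the same number of coefficients, the threshold selection never retains less energy than the zone, but the decoder must be told which positions were kept.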
24
Determining Block Size
Why block based?
– Coding more samples together can help exploit their correlations (via transform)
– Higher computational complexity and lower parallelism for larger blocks
  • $O(m^2 \log m)$ per block transform for $MN/m^2$ blocks
  • complexity in bit allocation
– Block transform captures local info. and allows for better bit allocation than a global transform
Rate and complexity vs. block size
– Commonly used block size ~ 8x8
From Jain’s Fig.11.16
UMCP ENEE631 Slides (created by M.Wu © 2001)
25
Block-based Transform Coding
Encoder
– Step-1: Divide an image into m x m blocks and perform the transform
– Step-2: Determine bit allocation for coefficients
– Step-3: Design quantizer and quantize coefficients (lossy!)
– Step-4: Encode quantized coefficients
Decoder
From Jain’s Fig.11.17
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
26
Review: List of Common Compression Tools
Lossless encoding tools
– Entropy coding: Huffman, Lempel-Ziv, and others (arithmetic coding)
– Run-length coding
Lossy tools for reducing redundancy
– Quantization: scalar quantizer vs. vector quantizer
– Truncations: discard unimportant parts of data
Facilitating compression via prediction
– Encode prediction parameters and residues with fewer bits
Facilitating compression via transforms
– Transform into a domain with improved energy compaction
UMCP ENEE631 Slides (created by M.Wu © 2004)
27
How to Encode Quantized Coeff. in Each Block
Basic tools
– Entropy coding (Huffman, etc.) and run-length coding
– Predictive coding ~ esp. for DC
Ordering
– Zig-zag scan for block-DCT to better achieve run-length coding gain: low-frequency coefficients first, then high-frequency coefficients
[Figure: 8x8 coefficient block with horizontal and vertical frequency axes; DC at the top-left corner, AC01 … AC07 along the first row, AC70 … AC77 along the last row.]
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
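The zig-zag order can be generated by walking the anti-diagonals of the block and alternating direction. The sketch below is one compact way to do that, assuming (row, column) indexing with DC at (0, 0); the function names are illustrative.

```python
def zigzag_order(N=8):
    """Return the (row, col) positions of an N x N block in zig-zag scan order."""
    def key(pos):
        i, j = pos
        d = i + j                      # index of the anti-diagonal
        # Odd diagonals run top-right to bottom-left (row increasing);
        # even diagonals run bottom-left to top-right (row decreasing).
        return (d, i if d % 2 else -i)
    return sorted(((i, j) for i in range(N) for j in range(N)), key=key)

def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[i][j] for i, j in zigzag_order(len(block))]

order = zigzag_order(8)
print(order[:6])
```

After quantization, most non-zero values sit near the start of this sequence, so the tail tends to be a long run of zeros that run-length coding represents cheaply.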
28
Put Basic Tools Together: JPEG Image Compression Standard
UMCP ENEE631 Slides (created by M.Wu © 2004)
29
JPEG Compression Standard (early 1990s)
JPEG - Joint Photographic Experts Group
– Compression standard for generic continuous-tone still images
– Became an international standard in 1992
Allows for lossy and lossless encoding of still images
– Part-1: DCT-based lossy compression
  • average compression ratio 15:1
– Part-2: Predictive-based lossless compression
Sequential, Progressive, Hierarchical modes
– Sequential ~ encoded in a single left-to-right, top-to-bottom scan
– Progressive ~ encoded in multiple scans to first produce a quick, rough decoded image when the transmission time is long
– Hierarchical ~ encoded at multiple resolutions to allow accessing a low-resolution version without full decompression
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
30
Baseline JPEG Algorithm
“Baseline”
– Simple, lossy compression
  • Subset of other DCT-based modes of the JPEG standard
A few basics
– 8x8 block-DCT based coding
– Shift to zero-mean by subtracting 128, giving the range [-128, 127]
  • Allows using signed integers to represent both DC and AC coefficients
– Color ( YCbCr / YUV ) and downsampling
  • Color components can have lower spatial resolution than luminance
– Interleaving color components
=> Flash demo on Baseline JPEG algorithm by Dr. Ken Lam (HK PolyTech Univ.)
(Based on Wang’s video book Chapt.1)
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
31
Block Diagram of JPEG Baseline
From Wallace’s JPEG tutorial (1993)
32
More on Color Interleaving
– Different components are ordered into Minimum Coding Units (MCUs)
– MCUs define repeating interleaving patterns
MCU1 = {Y00, Y01, Y10, Y11, U00, V00}
MCU2 = {Y02, Y03, Y12, Y13, U01, V01}
[Figure: block layout of the Y, U, and V components.]
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
33
[Figure: luminance image.] 475 x 330 pixels at 8 bits per pixel ≈ 157 KB per component (x 3 components for the color image)
From Liu’s EE330 (Princeton)
34
RGB Components
From Liu’s EE330 (Princeton)
35
Y U V (Y Cb Cr) Components
Assign more bits to Y, fewer bits to Cb and Cr
From Liu’s EE330 (Princeton)
36
JPEG Compression (Q=75%)
45 KB, compression ratio ~ 4:1
From Liu’s EE330 (Princeton)
37
Lossless Coding Part in JPEG
Differentially encode DC
– (The lossy step comes first: DC coefficients are quantized, and the differences between the quantized DC values of successive blocks are then coded losslessly.)
AC coefficients in one block
– Zig-zag scan after quantization for better run-length
  • saves bits in coding consecutive zeros
– Represent each AC run-length using entropy coding
  • use shorter codes for more likely AC run-length symbols
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
38
Lossy Part in JPEG
Important tradeoff between bit rate and visual quality
Quantization (adaptive bit allocation)
– Different quantization step size for different coeff. bands
– Use the same quantization matrix for all blocks in one image
– Choose the quantization matrix to best suit the image
– Different quantization matrices for luminance and color components
Default quantization table
– “Generic” over a variety of images
Quality factor “Q”
– Scales the quantization table
– Medium quality Q = 50% ~ no scaling
– High quality Q = 100% ~ quantization step is 1
– Poor quality ~ small Q, larger quantization step
  • visible artifacts like ringing and blockiness
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
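The slides do not give the scaling formula behind the quality factor. The sketch below assumes the convention used by the widely deployed IJG libjpeg software (a percentage scale of 5000/Q below 50 and 200 − 2Q otherwise), which matches the two anchor points stated above: no scaling at Q = 50 and a step of 1 at Q = 100. The function names are illustrative.

```python
import math

# Recommended luminance quantization table from the JPEG standard.
LUMA_Q50 = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def scale_table(base, quality):
    """Scale a base table by a quality factor (assumed IJG-style convention)."""
    quality = min(max(int(quality), 1), 100)
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Integer rounding, then clamp each step to the 8-bit range [1, 255].
    return [[min(max((q * scale + 50) // 100, 1), 255) for q in row]
            for row in base]

def quantize(coef, step):
    """Uniform quantizer index: round coef / step to the nearest integer."""
    return int(math.copysign(math.floor(abs(coef) / step + 0.5), coef))

def dequantize(index, step):
    return index * step

q50 = scale_table(LUMA_Q50, 50)
q100 = scale_table(LUMA_Q50, 100)
q25 = scale_table(LUMA_Q50, 25)

idx = quantize(415.0, q50[0][0])    # 415 / 16 rounds to 26
rec = dequantize(idx, q50[0][0])    # 26 * 16 = 416
print(idx, rec, q25[0][0])
```

Halving Q from 50 to 25 doubles every step size in this convention, which is why low quality settings discard most high-frequency coefficients.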
39
Quantization Table Recommended in JPEG
8x8 Quantization Table for Luminance
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99
8x8 Quantization Table for Chrominance
17  18  24  47  99  99  99  99
18  21  26  66  99  99  99  99
24  26  56  99  99  99  99  99
47  66  99  99  99  99  99  99
99  99  99  99  99  99  99  99
(the remaining three rows are also all 99)
40
[Figure: the same image uncompressed (100 KB) and JPEG-compressed at 75% (18 KB), 50% (12 KB), 30% (9 KB), and 10% (5 KB).]
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
41
JPEG Compression (Q=75% & 30%)
Q=75%: 45 KB; Q=30%: 22 KB
From Liu’s EE330 (Princeton)
42
Y Cb Cr After JPEG (Q=30%)
[Figure: Y, Cb, and Cr components after JPEG compression.]
From Liu’s EE330 (Princeton)
43
Lossless Coding Part in Baseline JPEG: Details
Differentially encode DC
– ( SIZE, AMPLITUDE ), with amplitude range in [-2048, 2047]
AC coefficients in one block
– Zig-zag scan for better run-length
– Represent each AC with a pair of symbols
  • Symbol-1: ( RUNLENGTH, SIZE ), Huffman coded
  • Symbol-2: AMPLITUDE, variable-length coded
RUNLENGTH in [0, 15]: # of consecutive zero-valued AC coefficients preceding the nonzero AC coefficient
SIZE in [0, 10] (in bits): # of bits used to encode AMPLITUDE
AMPLITUDE in the range [-1023, 1023]
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
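The (RUNLENGTH, SIZE) / AMPLITUDE pairing can be sketched as follows. The code stops at the symbol level and does no Huffman coding. It also uses two special symbols from baseline JPEG that the slide does not list: (15, 0) for a run of sixteen zeros and (0, 0) for end-of-block. Function names and the sample coefficients are assumptions for illustration.

```python
def size_category(value):
    """Number of bits needed for |value|; 0 only for value == 0."""
    return abs(value).bit_length()

def ac_symbols(ac):
    """Turn zig-zag ordered quantized AC coefficients into symbol pairs.

    Each entry is ((RUNLENGTH, SIZE), AMPLITUDE). The special symbols
    (15, 0) and (0, 0) carry no amplitude, so None is stored instead.
    """
    symbols = []
    run = 0
    last_nonzero = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    for v in ac[:last_nonzero + 1]:
        if v == 0:
            run += 1
            if run == 16:                     # RUNLENGTH only spans 0..15
                symbols.append(((15, 0), None))  # stands for sixteen zeros
                run = 0
        else:
            symbols.append(((run, size_category(v)), v))
            run = 0
    if last_nonzero < len(ac) - 1:            # trailing zeros: end-of-block
        symbols.append(((0, 0), None))
    return symbols

# Hypothetical block: 63 AC coefficients, mostly zero after quantization.
ac = [5, 0, 0, -2, 0, 0, 0, 1] + [0] * 55
symbols = ac_symbols(ac)
print(symbols)

long_run = ac_symbols([0] * 16 + [1] + [0] * 46)
```

For the sample block the output is four symbols in place of 63 coefficients, which is where the zig-zag ordering pays off.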
44
Table is from slides at Gonzalez/ Woods DIP book website (Chapter 8); Baseline JPEG uses a smaller part of the coefficient range.
UMCP ENEE631 Slides (created by M.Wu © 2004)
45
Summary of Today’s Lecture
Transform coding
JPEG compression standard
– Baseline block-DCT based algorithm
  • lossy part: quantization with a different step size for each coeff. band
  • lossless part: differential coding, run-length coding, Huffman
Next lecture:
– Wrap up JPEG compression
– “Optimal” transform
– Subband coding and wavelet-based compression
Readings
– Gonzalez’s 3/e book 8.1, 8.2.8
– G.K. Wallace: “The JPEG still picture compression standard,” IEEE Trans. on Consumer Electronics, vol.38, no.1, pp.18-34, Feb. 1992.
UMCP ENEE631 Slides (created by M.Wu © 2004)
46
Recap: JPEG Still Image Coding
Lossy, block based, transform coding
From B. Liu PU EE488 F’06
47
Figure/Table are from Gonzalez/Woods DIP book 3/e website (Chapter 8). Table 8.3 is shown on the next page.
48
Table is from slides at Gonzalez/ Woods DIP book 3/e website (Chapter 8)
49
Reference
– Compandor design of Waveform coding book
– Transform coding article in Sig. Proc. Magazine
– Bovik’s book Chapt.5.1 (Lossless coding): Huffman, Lempel-Ziv, etc.
– Wallace’s JPEG paper
Figure is from slides at Gonzalez/ Woods DIP book website (Chapter 3)