Image Processing Architecture, © Oleh Tretiak, Midterm 2 review
ECEC 453 Image Processing Architecture
Midterm Review
February 24, 2003
Announcements
  Midterm exam on Thursday, 26 February
    Last names A through H will take the exam in our regular room
    Last names K through Z in Commonwealth 403
  Covers lectures 5 through 9 (approximately)
  HW 3 through 7
  Practice exam on web (approximate coverage)
Exam Topics
  Quantizers, bit allocation, rate-distortion tradeoff
  Decorrelation ideas: prediction, DCT, wavelets
  DCT technology: 1D, 2D, separable, fast
  Motion estimation: MAE, full search, log search, etc.
  JPEG: DCT block, quantizer coefficient, DC coding, zig-zag scan, AC run-length coding
  JPEG: data interleaving and MCU, bit stream structure, JFIF and colors
  MPEG-1: macroblocks and blocks, I-P-B pictures, bitstream layers, motion estimation and compensation
  MPEG-1 contrasted with teleconferencing
Quantizer: practical lossy encoding
  Quantizer symbols: x = input to quantizer, q = output of quantizer, S = quantizer step
  Quantizer: $q = \mathrm{round}(x/S)$
  Dequantizer characteristic: $x^* = Sq$
  Typical noise power added by the quantizer-dequantizer combination: $D = S^2/12$
    Noise standard deviation $= \sqrt{D} \approx 0.289\,S$
  Example: $S = 8$, $D = 8^2/12 \approx 5.3$, rms quantization noise $= \sqrt{D} \approx 2.3$
    If the input is 8 bits, the maximum input is 255, so there are $255/8 \approx 32$ quantizer output values
    $\mathrm{PSNR} = 20 \log_{10}(255/2.3) \approx 40.9$ dB
  [Figure: quantizer characteristic and dequantizer characteristic, each with step S]
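
The quantizer numbers on this slide can be checked with a short Python sketch. The function names are illustrative, and rounding halves upward is an assumption (the slide only says round). The measured noise power comes from sweeping x over a fine grid, which treats the input as roughly uniform within each step.

```python
import math

def quantize(x, step):
    """Uniform quantizer q = round(x / S); halves round upward (an assumption)."""
    return int(math.floor(x / step + 0.5))

def dequantize(q, step):
    """Dequantizer x* = S * q."""
    return step * q

def model_noise_power(step):
    """Noise power predicted by the uniform-error model, D = S^2 / 12."""
    return step ** 2 / 12.0

def psnr_db(step, peak=255.0):
    """PSNR in dB for a peak input value and rms noise sqrt(D)."""
    return 20.0 * math.log10(peak / math.sqrt(model_noise_power(step)))

def measured_noise_power(step, count=25600, spacing=0.01):
    """Mean squared quantization error over a fine sweep of input values."""
    total = 0.0
    for i in range(count):
        x = i * spacing
        err = x - dequantize(quantize(x, step), step)
        total += err * err
    return total / count

if __name__ == "__main__":
    s = 8
    print("model D    :", model_noise_power(s))
    print("measured D :", measured_noise_power(s))
    print("rms noise  :", math.sqrt(model_noise_power(s)))
    print("PSNR (dB)  :", psnr_db(s))
```

With S = 8 the model and the sweep should agree closely, which is the point of the S^2/12 rule of thumb.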
Rate-distortion theory: non-uniform random variables
  Given $(x_1, x_2, \ldots, x_n)$, use an orthogonal transform to obtain $(y_1, y_2, \ldots, y_n)$.
  Sequence of independent Gaussian variables $(y_1, y_2, \ldots, y_n)$, $\mathrm{Var}[y_i] = Q_i$.
  Distortion allocation: allocate distortion $D_i$ to the variable with variance $Q_i$
  Rate (bits) for the i-th variable: $R_i = \max[0.5 \log_2(Q_i/D_i),\ 0]$
  Total distortion: $D = \sum_{i=1}^{n} D_i$
  Total rate (bits): $R = \sum_{i=1}^{n} R_i$
  We specify R. What are the values of $D_i$ that give the minimum total distortion D?
Bit allocation solution
  Implicit solution (water-filling construction)
  Choose a parameter $\theta$
  $D_i = \min(Q_i, \theta)$
    If $Q_i > \theta$ then $D_i = \theta$, else $D_i = Q_i$
  $R_i = \max[0.5 \log_2(Q_i/D_i),\ 0]$
    If $Q_i > \theta$ then $R_i = 0.5 \log_2(Q_i/\theta)$, else $R_i = 0$
  Find the value of $\theta$ that gives the specified R
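
The water-filling construction can be sketched in Python as a search for the water level. Finding the level by bisection is a choice made for this sketch, not something the slide prescribes. The name theta for the water-level parameter and the function names are illustrative, and all variances are assumed to be positive.

```python
import math

def allocate(variances, theta):
    """Distortions D_i = min(Q_i, theta) and rates R_i = max(0.5*log2(Q_i/D_i), 0)."""
    distortions = [min(q, theta) for q in variances]
    rates = [max(0.5 * math.log2(q / d), 0.0)
             for q, d in zip(variances, distortions)]
    return distortions, rates

def water_fill(variances, target_rate, iterations=100):
    """Bisect on theta until the total rate matches target_rate (in bits)."""
    low, high = 0.0, max(variances)
    for _ in range(iterations):
        theta = 0.5 * (low + high)
        _, rates = allocate(variances, theta)
        if sum(rates) > target_rate:
            low = theta    # spending too many bits: raise the water level
        else:
            high = theta   # spending too few bits: lower the water level
    theta = 0.5 * (low + high)
    distortions, rates = allocate(variances, theta)
    return theta, distortions, rates

if __name__ == "__main__":
    theta, d, r = water_fill([16.0, 4.0, 1.0], target_rate=3.0)
    print("theta =", theta)
    print("D_i   =", d)
    print("R_i   =", r)
```

For variances 16, 4, 1 and R = 3 bits, the water level settles at 1, so the third variable gets no bits.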
Coding Correlated Samples
  How to code correlated samples
    Decorrelate
    Code
  Methods for decorrelation
    Prediction
    Transformation
      Block transform
      Wavelet transform
Prediction Rules
  Simplest: previous value
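
A minimal Python sketch of the previous-value rule: each sample is predicted by the one before it and only the prediction error is kept. Starting the predictor at 0 is an assumption of this sketch, and the function names are illustrative.

```python
def predict_encode(samples, start=0):
    """Replace each sample by its difference from the previous sample."""
    residuals = []
    previous = start
    for s in samples:
        residuals.append(s - previous)
        previous = s
    return residuals

def predict_decode(residuals, start=0):
    """Rebuild the samples by accumulating the prediction errors."""
    samples = []
    previous = start
    for e in residuals:
        previous += e
        samples.append(previous)
    return samples

if __name__ == "__main__":
    line = [100, 102, 101, 105, 105, 104]
    errors = predict_encode(line)
    print(errors)
    print(predict_decode(errors) == line)
```

For slowly varying (correlated) samples the errors after the first are small, which is what makes them cheaper to code.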
Block-Based Coding
  The Discrete Cosine Transform (DCT) is used instead of the K-L transform
  Full-image DCT: one set of decorrelated coefficients for the whole image
  Block-based coding:
    Image is divided into ‘small’ blocks
    Each block is decorrelated separately
  Block decorrelation performs almost as well as (better than?) full-image decorrelation
  Current standards (JPEG, MPEG) use 8x8 DCT blocks
Predicting sequential images
  [Figure: previous image f(t-1), current image f(t), and the difference image f(t) - f(t-1)]
What is the DCT?
  One-dimensional 8-point DCT
    Input $x_0, \ldots, x_7$, output $y_0, \ldots, y_7$
  One-dimensional inverse DCT
    Input $y_0, \ldots, y_7$, output $x_0, \ldots, x_7$
  Matrix form of the equations: x, y are one-column matrices
  Note: in these equations, p stands for $\pi$
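
For reference, one widely used form of the 8-point DCT pair is written out below. The orthonormal scaling shown here (the one used for JPEG-style 8-point transforms) is an assumption; the scaling used in the lecture may differ by constant factors. T denotes the 8x8 matrix with entries $T_{ki}$.

```latex
y_k = \frac{c_k}{2} \sum_{i=0}^{7} x_i \cos\frac{(2i+1)k\pi}{16}, \qquad k = 0, \ldots, 7

x_i = \sum_{k=0}^{7} \frac{c_k}{2}\, y_k \cos\frac{(2i+1)k\pi}{16}, \qquad i = 0, \ldots, 7

c_0 = \frac{1}{\sqrt{2}}, \qquad c_k = 1 \ \text{for } k > 0

y = T x, \qquad x = T^{T} y, \qquad T_{ki} = \frac{c_k}{2} \cos\frac{(2i+1)k\pi}{16}
```

With this scaling T is orthogonal, so the inverse transform is just the transpose.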
Two-Dimensional DCT
  Forward 2D DCT
    Input $x_{ij}$, $i = 0, \ldots, 7$, $j = 0, \ldots, 7$
    Output $y_{kl}$, $k = 0, \ldots, 7$, $l = 0, \ldots, 7$
  Matrix form: X, Y are 8x8 matrices with coefficients $x_{ij}$, $y_{kl}$
  The 2D DCT is separable!
  Note: in these equations, p stands for $\pi$
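
Separability can be illustrated with a small pure-Python sketch that applies a 1D DCT matrix T to the columns and then to the rows, Y = T X T^T. The orthonormal scaling of T is an assumption (the lecture's scaling may differ by constants), and the helper names are illustrative.

```python
import math

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix T, with T[k][i] = c(k) * cos((2i+1) k pi / (2n))."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(block):
    """Separable forward 2D DCT: Z = T X (columns), then Y = Z T^T (rows)."""
    t = dct_matrix(len(block))
    return matmul(matmul(t, block), transpose(t))

def idct2(coeffs):
    """Inverse 2D DCT: X = T^T Y T, valid because T is orthogonal."""
    t = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(t), coeffs), t)

if __name__ == "__main__":
    flat = [[10.0] * 8 for _ in range(8)]
    y = dct2(flat)
    print("DC coefficient:", y[0][0])
    print("largest AC magnitude:", max(abs(y[k][l]) for k in range(8)
                                       for l in range(8) if (k, l) != (0, 0)))
```

A constant block puts all of its energy into the DC coefficient, which is a quick sanity check on the transform.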
General DCT
  One dimension
  Two dimensions
Computational Complexity
  1D DCT: N input and output samples, about $N^2 = 64$ operations (additions + multiplications)
  2D DCT, direct implementation: $M = N^2$ input values, M output values -> $M^2 = N^4$ operations
  2D DCT, separable implementation: $Y = T X T^T = Z T^T$, where $Z = TX$; all matrices are NxN -> $2N^3$ operations
  For N = 8
    2D DCT direct: 4096 operations, 64 operations per pixel
    2D DCT separable: 1024 operations, 16 operations per pixel
  Big savings due to the separable transform
  Inverse DCT: same story.
Optimized (fast) DCT
  [Figure: 1-D Chen DCT diagram. Dashed lines indicate subtraction; other symbols in the diagram mark multiplication by a constant and multiplication by 0.5 (shift).]
  Characteristics of optimized DCT algorithms
DCT Complexity
  Direct DCT computation: 64 DCT values, each requiring 64 multiplications and additions -> 4096 multiply-accumulate (MA) operations per block
  Separable algorithm (operate on rows, then on columns) -> 16 one-dimensional 8-point DCT operations -> 1024 MA operations
  Fast implementation: about $N \log_2 N$ operations per one-dimensional DCT -> 16 x 24 = 384 MA operations
  Special methods: many operations involve multiplication by 1 or -1; take advantage of this!
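
The three operation counts on this slide follow from simple formulas, sketched here in Python. The function name is illustrative, and the fast count uses the slide's rough N log2 N estimate per one-dimensional transform rather than the exact cost of any particular fast algorithm.

```python
import math

def dct_operation_counts(n=8):
    """Approximate multiply-accumulate counts for one n x n block."""
    direct = n ** 4                       # n^2 outputs, each a sum over n^2 inputs
    separable = 2 * n * n ** 2            # 2n one-dimensional DCTs at n^2 each
    fast = 2 * n * round(n * math.log2(n))  # 2n one-dimensional DCTs at n*log2(n) each
    return direct, separable, fast

if __name__ == "__main__":
    direct, separable, fast = dct_operation_counts(8)
    print("direct   :", direct, "ops,", direct // 64, "per pixel")
    print("separable:", separable, "ops,", separable // 64, "per pixel")
    print("fast     :", fast, "ops,", fast // 64, "per pixel")
```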
Motion Estimation Terminology
  Issues:
    Size of macroblock
    Size of search region
  In video coding standards, M = N = 16
Full-Search Method
  Compute the MAE for $(2p+1)^2$ values of (i, j)
  Each location requires 3MN operations
  Picture dimensions I x J, F pictures per second
    $3IJF(2p+1)^2$ operations per second
    I = 720, J = 480, F = 30, p = 15 -> 30 GOPS
  Guaranteed to find the best (MAE) displacement
  How to do it?
    Special computers
    Smaller p
    Faster (suboptimal) algorithm
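
A minimal Python sketch of full-search block matching with the MAE criterion, plus the operation-rate formula from this slide. Images are plain lists of rows, and the function names, the skipping of candidates that fall outside the reference picture, and the synthetic test picture are all assumptions of the sketch.

```python
def mae(cur, ref, top, left, di, dj, m, n):
    """Mean absolute error between a current block and a displaced reference block."""
    total = 0
    for r in range(m):
        for c in range(n):
            total += abs(cur[top + r][left + c] - ref[top + r + di][left + c + dj])
    return total / (m * n)

def full_search(cur, ref, top, left, m=16, n=16, p=7):
    """Try every displacement in [-p, p] x [-p, p]; return (best MAE, di, dj)."""
    rows, cols = len(ref), len(ref[0])
    best = None
    for di in range(-p, p + 1):
        for dj in range(-p, p + 1):
            # Skip candidates whose block would fall outside the reference picture.
            if top + di < 0 or top + di + m > rows:
                continue
            if left + dj < 0 or left + dj + n > cols:
                continue
            err = mae(cur, ref, top, left, di, dj, m, n)
            if best is None or err < best[0]:
                best = (err, di, dj)
    return best

def full_search_ops_per_second(i, j, f, p):
    """3 * I * J * F * (2p + 1)^2, as on the slide."""
    return 3 * i * j * f * (2 * p + 1) ** 2

if __name__ == "__main__":
    # Synthetic reference picture and a current picture displaced by (2, -3).
    def pattern(r, c):
        return (37 * r + 91 * c + 13 * r * c) % 251
    ref = [[pattern(r, c) for c in range(24)] for r in range(24)]
    cur = [[pattern(r + 2, c - 3) for c in range(24)] for r in range(24)]
    print(full_search(cur, ref, top=8, left=8, m=4, n=4, p=4))
    print(full_search_ops_per_second(720, 480, 30, 15) / 1e9, "GOPS")
```

The exhaustive loop is what guarantees the best MAE match, and it is also why the cost grows with (2p + 1)^2.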
Hierarchical Search
  Prepare downsampled versions of the current and reference images
    Full: macroblock 16x16
    Down 2: macroblock 8x8
    Down 4: macroblock 4x4
  Full search in the Down 4 reference image
    16x speedup from the smaller macroblock
    16x speedup from fewer displacement vectors: p = ±16 becomes p’ = ±4
  Around the point of best match, do a local search in the Down 2 reference image (3x3 search zone)
  Repeat for the Full reference image (3x3 search zone)
  [Figure: Full, Down 2, and Down 4 images]
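
A rough per-macroblock operation count for the three-level search, sketched in Python. It reuses the 3MN operations-per-candidate figure from the full-search method, ignores the cost of building the downsampled pictures, and uses illustrative function names, so the numbers are estimates only.

```python
def search_ops(block_size, p):
    """Operations for a full search over (2p + 1)^2 candidates of one square block."""
    return 3 * block_size * block_size * (2 * p + 1) ** 2

def hierarchical_ops():
    """Down 4 full search (p' = 4), then 3x3 refinements at Down 2 and Full."""
    down4 = search_ops(4, 4)
    down2 = search_ops(8, 1)
    full = search_ops(16, 1)
    return down4 + down2 + full

if __name__ == "__main__":
    exhaustive = search_ops(16, 16)
    staged = hierarchical_ops()
    print("full search :", exhaustive)
    print("hierarchical:", staged)
    print("speedup     :", exhaustive / staged)
```

Most of the saving comes from running the wide search only on the 4x4 blocks of the Down 4 picture.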
Comparison
JPEG in practice
  Lossy DCT coding used most often
    Some use of sequential DCT (Web images)
    Lossless? Hierarchical?
  JPEG DCT allows many varieties of parameters
    Choice of quantizer tables
    Choice of Huffman codes
    Arithmetic coding (ever?)
    Different encoding of color and luminance
  There are baseline (= default) combinations of quantizers, Huffman codes, and luminance/chrominance samples
Features of JPEG coder
  Standard specifies compressed bitstream structure only
    File format is not part of the JPEG standard
  Many features of JPEG compression (e.g. coding tables) are optional
  Standard specifies input-output characteristics, not details of algorithms
  Either Huffman or arithmetic coding may be used
    Huffman coding is almost universal; many implementations do not include arithmetic coding
  Standard specifies a baseline implementation. This is optional, but often followed.
Data Interleaving with Subsampling
  Example: a color image with Y (intensity) and Cb, Cr (color) components is subsampled so that one color block corresponds to four Y blocks
  MCU 1 = $Y_{00}\ Y_{01}\ Y_{10}\ Y_{11}\ Cr_{00}\ Cb_{00}$
  MCU 2 = $Y_{02}\ Y_{03}\ Y_{12}\ Y_{13}\ Cr_{01}\ Cb_{01}$
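
The interleaving pattern in this example can be generated with a short Python sketch. The function name is illustrative, the component order inside each MCU simply follows the slide, and the two-digit labels are only unambiguous while block indices stay below 10.

```python
def mcu_sequence(mcu_rows, mcu_cols):
    """List the blocks of each MCU when one chroma block covers a 2x2 group of Y blocks."""
    mcus = []
    for r in range(mcu_rows):
        for c in range(mcu_cols):
            # Four Y blocks of the 2x2 group, row by row.
            blocks = ["Y%d%d" % (2 * r + dr, 2 * c + dc)
                      for dr in (0, 1) for dc in (0, 1)]
            # One block from each color component, in the order shown on the slide.
            blocks += ["Cr%d%d" % (r, c), "Cb%d%d" % (r, c)]
            mcus.append(blocks)
    return mcus

if __name__ == "__main__":
    for number, mcu in enumerate(mcu_sequence(1, 2), start=1):
        print("MCU %d = %s" % (number, " ".join(mcu)))
```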
Huffman Coding - Block Diagram
Coding AC Coefficients
  AC coefficients are coded in zig-zag order (called ZZ in the standard) to maximize possible runs of zeros.
  A code unit consists of the run length followed by the coefficient size.
  Baseline coding of the size category is the same as for DC differences (Table 2.9)
  Example: run of 6 zeros, coefficient value = -18. In the table, -18 is in category 5. Code is (6/5, 01101).
  If the Huffman code for 6/5 is 1101, codeword = 110101101
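
The size category and the extra bits in this example can be reproduced with a small Python sketch of the baseline JPEG rule: the category is the number of bits in the magnitude, and a negative value is sent as value + 2^size - 1. The function names are illustrative, and the Huffman prefix 1101 is the one assumed on the slide, not a real table entry.

```python
def size_category(value):
    """Number of bits needed for |value| (category 0 means the value is zero)."""
    return abs(value).bit_length()

def amplitude_bits(value):
    """Extra bits that follow the Huffman code for a coefficient or DC difference."""
    size = size_category(value)
    if size == 0:
        return ""
    if value < 0:
        value += (1 << size) - 1   # negative values map to the low half of the range
    return format(value, "0%db" % size)

if __name__ == "__main__":
    value = -18
    print("category :", size_category(value))
    print("extra bits:", amplitude_bits(value))
    print("codeword :", "1101" + amplitude_bits(value))
```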
JPEG Coding Example
  AC coding
    Zig-zag scan
    Run-length codes (run length/value): 0/16, 0/3, 1/-2, 0/-21, 0/10, 5/2, 0/2, 0/-15, 1/-3, 0/-1, ...
  DC coding
    Assume DC(i-1) = 39
    DC(i) - DC(i-1) = 42 - 39 = 3
    Size = 2, value = 3
    Assume the Huffman code for size 2 is 011
    DC code is 01111
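
The zig-zag scan and the run/value pairs can be sketched in Python as follows. The coefficient sequence in the usage part is hypothetical, chosen only so that it reproduces the pairs listed in this example. The sketch also leaves out two details of the real coder: the end-of-block code for trailing zeros and the special code for runs longer than 15.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()   # even diagonals are walked upward and to the right
        order.extend(diagonal)
    return order

def run_length_pairs(ac_coefficients):
    """(run of zeros, nonzero value) pairs for AC coefficients in zig-zag order."""
    pairs = []
    run = 0
    for value in ac_coefficients:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs

if __name__ == "__main__":
    print(zigzag_order()[:6])
    # Hypothetical start of a zig-zag ordered AC sequence.
    ac = [16, 3, 0, -2, -21, 10, 0, 0, 0, 0, 0, 2, 2, -15, 0, -3, -1]
    print(run_length_pairs(ac))
```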
About JFIF
  APP0 = FFE0 marker used to specify extensions
  Specifies Y, Cb, Cr for components
  Includes ‘thumbnail’ picture for previewing
    Thumbnail can be RGB, index color, or compressed
  Image orientation: first scan line is on top
  Spatial relation of components is specified
  Pixel aspect ratio (horizontal and vertical density) is specified
JFIF Sample
  File written by PhotoShop
  [Figure: annotated output of the command "od -c -x Mallorca | more". Recoverable annotations: SOI (start of image) marker ffd8; APP0 marker ffe0 followed by the JFIF identifier and version; the string http://www.photodisc.com/, marked "URL of image copyright owner?"; the string "File written by Adobe Photoshop".]
Color Conversion (from JFIF)
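
For reference, the RGB to YCbCr conversion published in the JFIF specification can be sketched in Python as below. The coefficients are the rounded values given by JFIF for 8-bit samples, so a round trip is only approximately exact. The function names are illustrative, and clamping to the 0..255 range is left out.

```python
def rgb_to_ycbcr(r, g, b):
    """JFIF conversion for 8-bit samples; chroma is offset by 128."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.4187 * g - 0.0813 * b + 128
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse JFIF conversion."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.34414 * (cb - 128) - 0.71414 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b

if __name__ == "__main__":
    print(rgb_to_ycbcr(128, 128, 128))   # gray: chroma stays at the 128 offset
    print(ycbcr_to_rgb(*rgb_to_ycbcr(200, 50, 100)))
```

Gray inputs map to Cb = Cr = 128, which is why the color components compress so well for low-saturation images.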
Resolution Reduction Trials
  [Figure: three images, labeled "Full", "Y down by 64", and "Cb, Cr down by 64"]
RGB reduction
  [Figure: two images, labeled "Full" and "R, B reduced by 64"]
MPEG-1: ‘1.5’ Mbps
  Sample rate reduction in spatial and temporal domains
  Spatial
    Block-based DCT
    Huffman coding (no arithmetic coding) of motion vectors and quantized DCT coefficients
  Temporal
    Block-based motion compensation
    Interframe coding (two kinds)
  352 x 240 pixels, 12 bits per pixel, picture rate 30 pictures per second -> 30.4 Mbps
  Coded bit stream 1.15 Mbps (must leave bandwidth for audio)
  Compression 26:1
  Quality better than VHS!
Frames and Fields
  MPEG-1 works with pictures (~ frames)
Input to MPEG-1
  Standard allows many formats (up to 4095x4095 pixels)
  Standard is optimized for CCIR 601 video formats: two source input formats (SIFs) are specified (NTSC and PAL)
  Coded color video has three components: Y, Cb, Cr
  An MPEG-1 macroblock has 16x16 Y pixels and 8x8 Cb and Cr pixels
Picture Types
  MPEG-1 is designed to support random access and editing
  I: intraframe coding only
  P: predictive coding
  B: bi-directional coding
Block Diagram of MPEG Decoder
  [Figure: decoder block diagram, with paths labeled I frame, P frame, and B frame]
Typical MPEG coding parameters
  Typical sequence: IPBBPBBPBBPBBPBB (16 frames)
  Average compression: 26.3
Coding constraints (minimum)
  Constrained parameter bit stream
  Every MPEG-1 decoder should support these parameters
Macroblock Coding: I & P
  I pictures
    Divided into slices and macroblocks
    No motion compensation
    Each macroblock can have different quantization
    DC and AC coded differently, as in JPEG
    Different coding tables from JPEG
  P pictures
    Divided into slices and macroblocks
    Option: no motion compensation
    Option: can code block as inter or intra (like I picture)
    Can skip macroblock (replace with previous). Great compression
Coding Image Blocks
  B pictures
    Inter or intra?
    Forward, backward, interpolational?
    Code block or skip?
    Quantization step?
H.261 Features
  Common Interchange Format
    Interoperability between 25 fps and 30 fps countries
    352 pix/line, 288 lines, 30 fps noninterlaced
    Terminal equipment converts frame and line numbers
    Y Cb Cr components, color subsampled by a factor of 2 in both directions
  Coding
    DCT, 8x8, 4 Y and 2 chrominance blocks per macroblock
    I and P frames only; P blocks can be skipped
    Motion compensation optional, only integer compensation
    (Optional) forward error correction coding