Published by Milo Rodgers. Modified over 9 years ago.
1
Image Processing Architecture, © 2001-2004 Oleh Tretiak. Page 1. Midterm 2 review
ECEC 453 Image Processing Architecture
Midterm Review
February 24, 2003
2
Announcements
Midterm exam on Thursday, 26 February
Last names A through H will take the exam in our regular room; last names K through Z in Commonwealth 403
Covers lectures 5 through 9 (approximately)
  HW 3 through 7
Practice exam on web (approximate coverage)
3
Exam Topics
Quantizers, bit allocation, rate-distortion tradeoff
Decorrelation ideas: prediction, DCT, wavelets
DCT technology: 1D, 2D, separable, fast
Motion estimation: MAE, full search, log search, etc.
JPEG: DCT block, quantizer coefficient, DC coding, zig-zag scan, AC run-length coding
JPEG: data interleaving and MCU, bit stream structure, JFIF and colors
MPEG-1: macroblocks and blocks, I-P-B pictures, bitstream layers, motion estimation and compensation
MPEG-1 contrasted with teleconferencing
4
Quantizer: practical lossy encoding
Quantizer symbols: x = input to quantizer, q = output of quantizer, S = quantizer step
Quantizer: $q = \mathrm{round}(x/S)$
Dequantizer characteristic: $x^* = Sq$
Typical noise power added by the quantizer-dequantizer combination: $D = S^2/12$; noise standard deviation $= \sqrt{D} = 0.289\,S$
Example: S = 8, $D = 8^2/12 = 5.3$, rms quantization noise $= \sqrt{D} = 2.3$
If the input is 8 bits, the maximum input is 255, so there are 255/8 ~ 32 quantizer output values
PSNR $= 20 \log_{10}(255/2.3) \approx 40.9$ dB
[Figure: quantizer characteristic and dequantizer characteristic, each with step S]
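The quantizer arithmetic on this slide can be checked with a short sketch. The function names and the round-half-up tie rule are assumptions made for illustration; the slide only specifies q = round(x/S), x* = Sq and D = S^2/12.

```python
import math


def quantize(x, step):
    """Uniform quantizer q = round(x / S); ties round up (an assumed convention)."""
    return math.floor(x / step + 0.5)


def dequantize(q, step):
    """Dequantizer characteristic x* = S * q."""
    return step * q


def quantizer_noise(step, peak=255):
    """Return (noise power D, rms noise, PSNR in dB) for a uniform quantizer."""
    noise_power = step ** 2 / 12          # D = S^2 / 12
    rms = math.sqrt(noise_power)          # about 0.289 * S
    psnr_db = 20 * math.log10(peak / rms)
    return noise_power, rms, psnr_db


if __name__ == "__main__":
    d, rms, psnr = quantizer_noise(8)
    print(f"D = {d:.2f}, rms = {rms:.2f}, PSNR = {psnr:.1f} dB")
    x = 100
    print(x, "->", quantize(x, 8), "->", dequantize(quantize(x, 8), 8))
```

The reconstruction error of any single sample is at most S/2, which is what the uniform-error model behind D = S^2/12 assumes.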
5
Rate-distortion theory: non-uniform random variables
Given $(x_1, x_2, \ldots, x_n)$, use an orthogonal transform to obtain $(y_1, y_2, \ldots, y_n)$.
Model: a sequence of independent Gaussian variables $(y_1, y_2, \ldots, y_n)$ with $\mathrm{Var}[y_i] = Q_i$.
Distortion allocation: allocate distortion $D_i$ to the variable with variance $Q_i$.
Rate (bits) for the i-th variable: $R_i = \max[0.5 \log_2(Q_i/D_i),\ 0]$
Total distortion: $D = \sum_{i=1}^{n} D_i$
Total rate (bits): $R = \sum_{i=1}^{n} R_i$
We specify $R$. What values of $D_i$ give the minimum total distortion $D$?
6
Bit allocation solution
Implicit solution (water-filling construction)
Choose a parameter $\theta$ and set $D_i = \min(Q_i, \theta)$
  If $Q_i > \theta$ then $D_i = \theta$, else $D_i = Q_i$
$R_i = \max[0.5 \log_2(Q_i/D_i),\ 0]$
  If $Q_i > \theta$ then $R_i = 0.5 \log_2(Q_i/\theta)$, else $R_i = 0$
Find the value of $\theta$ that gives the specified $R$
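The slide says to find the water level that yields the specified total rate but does not say how. One possible approach, sketched here as an assumption, is bisection on the level (called theta in the code), using the fact that the total rate falls as the level rises. The function names are hypothetical.

```python
import math


def allocate(variances, theta):
    """Water-filling for a given level theta: D_i = min(Q_i, theta)."""
    distortions = [min(q, theta) for q in variances]
    rates = [0.5 * math.log2(q / theta) if q > theta else 0.0 for q in variances]
    return distortions, rates


def water_fill(variances, target_rate, iterations=100):
    """Find theta whose total rate matches target_rate (bisection, an assumed method)."""
    lo, hi = 1e-12, max(variances)        # the rate at hi is zero
    for _ in range(iterations):
        theta = math.sqrt(lo * hi)        # bisect on a logarithmic scale
        _, rates = allocate(variances, theta)
        if sum(rates) > target_rate:
            lo = theta                    # too many bits: raise the level
        else:
            hi = theta
    distortions, rates = allocate(variances, hi)
    return hi, distortions, rates


if __name__ == "__main__":
    theta, dist, rates = water_fill([16.0, 4.0, 1.0], target_rate=2.0)
    print(theta, dist, rates, sum(dist))
```

With variances 16, 4, 1 and R = 2 bits, the level settles at 2: the two large components each receive distortion 2, while the component with variance 1 gets no bits and contributes its full variance.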
7
Coding Correlated Samples
How to code correlated samples
  Decorrelate
  Code
Methods for decorrelation
  Prediction
  Transformation
    Block transform
    Wavelet transform
8
Prediction Rules
Simplest: previous value
9
Block-Based Coding
Discrete Cosine Transform (DCT) is used instead of the K-L transform
Full image DCT: one set of decorrelated coefficients for the whole image
Block-based coding:
  Image divided into ‘small’ blocks
  Each block is decorrelated separately
Block decorrelation performs almost as well as (better than?) full image decorrelation
Current standards (JPEG, MPEG) use 8x8 DCT blocks
10
Predicting sequential images
[Figure: previous image f(t-1), current image f(t), and the difference f(t) - f(t-1)]
11
What is the DCT?
One-dimensional 8-point DCT: input $x_0, \ldots, x_7$, output $y_0, \ldots, y_7$
One-dimensional inverse DCT: input $y_0, \ldots, y_7$, output $x_0, \ldots, x_7$
Matrix form of equations: x, y are one-column matrices
Note: in these equations, p stands for $\pi$
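The slide's own equations did not survive in this transcript. As a stand-in, the sketch below uses the common orthonormal DCT-II definition, $y_k = s_k \sum_i x_i \cos((2i+1)k\pi/(2N))$ with $s_0 = \sqrt{1/N}$ and $s_k = \sqrt{2/N}$ otherwise. This normalization is an assumption and may differ from the slide's by a constant factor. With it, the inverse is simply the transpose of the forward matrix.

```python
import math


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix T, so that y = T x (assumed normalization)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def dct_1d(x):
    """Forward DCT: y_k = sum_i T[k][i] * x_i."""
    t = dct_matrix(len(x))
    return [sum(t[k][i] * x[i] for i in range(len(x))) for k in range(len(x))]


def idct_1d(y):
    """Inverse DCT: x_i = sum_k T[k][i] * y_k (T is orthogonal, so T^-1 = T^T)."""
    t = dct_matrix(len(y))
    return [sum(t[k][i] * y[k] for k in range(len(y))) for i in range(len(y))]


if __name__ == "__main__":
    samples = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
    coeffs = dct_1d(samples)
    print([round(c, 2) for c in coeffs])
    print([round(v, 6) for v in idct_1d(coeffs)])
```

A constant input puts all of its energy into the first coefficient, which is why the transform decorrelates smooth image data well.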
12
Two-Dimensional DCT
Forward 2D DCT: input $x_{ij}$, $i = 0, \ldots, 7$, $j = 0, \ldots, 7$; output $y_{kl}$, $k = 0, \ldots, 7$, $l = 0, \ldots, 7$
Matrix form: X, Y are 8x8 matrices with coefficients $x_{ij}$, $y_{kl}$
The 2D DCT is separable!
Note: in these equations, p stands for $\pi$
13
General DCT
One dimension
Two dimensions
14
Computational Complexity
1D DCT: N input and output samples, ~ $N^2 = 64$ operations (additions + multiplications)
2D DCT, direct implementation: $M = N^2$ input values, M output values -> $M^2 = N^4$ operations
2D DCT, separable implementation: $Y = T X T^T = Z T^T$, where $Z = TX$ and all matrices are NxN -> $2N^3$ operations
For N = 8:
  2D DCT direct: 4096 operations, 64 operations per pixel
  2D DCT separable: 1024 operations, 16 operations per pixel
Big savings due to the separable transform
Inverse DCT: same story.
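The operation counts above can be reproduced by counting multiply-accumulate steps in both implementations. This is a sketch: the orthonormal matrix T is an assumed normalization, and the direct version counts one step per (input, output) pair, as the slide does.

```python
import math


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix T (assumed normalization)."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos((2 * i + 1) * k * math.pi / (2 * n))
             for i in range(n)] for k in range(n)]


def dct2_direct(x, t):
    """Direct 2D DCT: every output sums over all N*N inputs (N^4 steps)."""
    n, ops = len(x), 0
    y = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            acc = 0.0
            for i in range(n):
                for j in range(n):
                    acc += t[k][i] * t[l][j] * x[i][j]
                    ops += 1
            y[k][l] = acc
    return y, ops


def matmul(a, b):
    """Square matrix product with a multiply-accumulate counter (N^3 steps)."""
    n, ops = len(a), 0
    c = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for col in range(n):
            acc = 0.0
            for m in range(n):
                acc += a[r][m] * b[m][col]
                ops += 1
            c[r][col] = acc
    return c, ops


def dct2_separable(x, t):
    """Separable 2D DCT: Z = T X, then Y = Z T^T (2 N^3 steps)."""
    t_transposed = [list(row) for row in zip(*t)]
    z, ops_rows = matmul(t, x)
    y, ops_cols = matmul(z, t_transposed)
    return y, ops_rows + ops_cols


if __name__ == "__main__":
    t = dct_matrix(8)
    block = [[float((3 * i + 5 * j) % 17) for j in range(8)] for i in range(8)]
    _, direct_ops = dct2_direct(block, t)
    _, separable_ops = dct2_separable(block, t)
    print(direct_ops, separable_ops)
```

Both routines return the same coefficients; only the amount of work differs.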
15
Optimized (fast) DCT
1-D Chen DCT diagram. Dashed lines indicate subtraction; the other two diagram symbols (not recoverable from this transcript) mark multiplication by a constant and multiplication by 0.5 (a shift).
Characteristics of optimized DCT algorithms
16
DCT Complexity
Direct DCT computation: 64 DCT values, each requiring 64 multiplications and additions -> 4096 multiply-accumulate (MA) operations per block
Separable algorithm (operate on rows, then on columns) -> 16 one-dimensional 8-point DCT operations -> 1024 MA operations
Fast implementation: ~ $N \log_2 N = 24$ operations per 8-point DCT -> 16 x 24 = 384 MA operations
Special methods: many operations involve multiplication by 1 or -1; take advantage of this!
17
Motion Estimation Terminology
Issues:
  Size of macroblock
  Size of search region
In video coding standards, M = N = 16
18
Full-Search Method
Compute the matching error (MAE) for $(2p+1)^2$ values of (i, j)
Each location requires 3MN operations
Picture dimensions I x J, F pictures per second: $3IJF(2p+1)^2$ operations per second
I = 720, J = 480, F = 30, p = 15 -> 30 GOPS
Guaranteed to find the best (MAE) displacement
How to do it?
  Special computers
  Smaller p
  Faster (suboptimal) algorithm
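A minimal sketch of full-search block matching with the mean absolute error (MAE) criterion, plus the operation-rate formula from the slide. Images are plain lists of rows, and the function names, argument order and skipping of out-of-picture candidates are illustrative assumptions rather than details from the lecture.

```python
def mae(cur, ref, bx, by, dx, dy, m, n):
    """Mean absolute error between an m x n block of cur at (bx, by)
    and the block of ref displaced by (dx, dy)."""
    total = 0
    for r in range(n):
        for c in range(m):
            total += abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
    return total / (m * n)


def full_search(cur, ref, bx, by, m, n, p):
    """Try all (2p+1)^2 displacements; return (best_mae, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + m <= width)
            if not inside:
                continue                  # candidate falls outside the picture
            err = mae(cur, ref, bx, by, dx, dy, m, n)
            if best is None or err < best[0]:
                best = (err, dx, dy)
    return best


def full_search_ops_per_second(i, j, f, p):
    """Slide formula: 3 * I * J * F * (2p + 1)^2 operations per second."""
    return 3 * i * j * f * (2 * p + 1) ** 2


if __name__ == "__main__":
    print(full_search_ops_per_second(720, 480, 30, 15) / 1e9, "GOPS")
```

For I = 720, J = 480, F = 30, p = 15 the formula gives about 29.9 x 10^9 operations per second, which the slide rounds to 30 GOPS.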
19
Hierarchical Search
Prepare downsampled versions of the current and reference images
  Full: macroblock 16x16
  Down 2: macroblock 8x8
  Down 4: macroblock 4x4
Full search in the Down 4 reference image
  16x speedup from the smaller macroblock
  16x speedup from fewer displacement vectors: p = ±16 becomes p’ = ±4
Around the point of best match, do a local search in the Down 2 reference image (3x3 search zone)
Repeat for the Full reference image (3x3 search zone)
[Figure: Full, Down 2, and Down 4 images]
20
Comparison
21
JPEG in practice
Lossy DCT coding used most often
Some use of sequential DCT (Web images)
Lossless? Hierarchical?
JPEG DCT allows many varieties of parameters
  Choice of quantizer tables
  Choice of Huffman codes
  Arithmetic coding (ever?)
  Different encoding of color and luminance
There are baseline (= default) combinations of quantizers, Huffman codes, and luminance/chrominance samples
22
Features of JPEG coder
Standard specifies compressed bitstream structure only
  File format not part of JPEG standard
  Many features of JPEG compression (e.g. coding tables) are optional
Standard specifies input-output characteristics, not details of algorithms
Either Huffman or arithmetic coding may be used
  Huffman coding almost universal: many implementations do not include arithmetic coding
Standard specifies baseline implementation. This is optional, but often followed.
23
Data Interleaving with Subsampling
Example: a color image with Y (intensity) and Cb, Cr (color) components is subsampled so that one color block corresponds to four Y blocks
MCU 1 = $Y_{00}\ Y_{01}\ Y_{10}\ Y_{11}\ Cr_{00}\ Cb_{00}$
MCU 2 = $Y_{02}\ Y_{03}\ Y_{12}\ Y_{13}\ Cr_{01}\ Cb_{01}$
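The interleaving pattern in this example can be generated mechanically. The sketch below lists block labels per MCU for the case where each chroma block covers a 2x2 group of Y blocks. The function name and label format are made up for illustration, and the Cr-before-Cb order simply follows the slide's example.

```python
def mcu_sequence(y_block_rows, y_block_cols):
    """Return MCUs as lists of block labels: four Y blocks, then Cr and Cb.

    Assumes both counts are even, so each chroma block matches a 2x2 group
    of luminance blocks.
    """
    mcus = []
    for r in range(0, y_block_rows, 2):
        for c in range(0, y_block_cols, 2):
            chroma = f"{r // 2}{c // 2}"
            mcus.append([
                f"Y{r}{c}", f"Y{r}{c + 1}",          # top pair of Y blocks
                f"Y{r + 1}{c}", f"Y{r + 1}{c + 1}",  # bottom pair of Y blocks
                f"Cr{chroma}", f"Cb{chroma}",
            ])
    return mcus


if __name__ == "__main__":
    for index, mcu in enumerate(mcu_sequence(2, 4), start=1):
        print(f"MCU {index} =", " ".join(mcu))
```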
24
Huffman Coding - Block Diagram
25
Coding AC Coefficients
AC coefficients are coded in zig-zag (called ZZ in the standard) order to maximize possible runs of zeros.
A code unit consists of a run length followed by a coefficient size.
Baseline coding of the size category is the same as for DC differences (Table 2.9)
Example: run of 6 zeros, coefficient value = -18. In the table, -18 is in category 5. Code is (6/5, 01101).
If the Huffman code for 6/5 is 1101, codeword = 110101101
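The size category and the appended amplitude bits in this example can be computed as follows. The Huffman table here is a hypothetical one-entry dictionary holding only the code assumed on the slide (6/5 -> 1101); real baseline tables are much larger.

```python
def size_category(value):
    """JPEG size category: number of bits needed for |value| (0 for value 0)."""
    return abs(value).bit_length()


def amplitude_bits(value):
    """Extra bits that follow the Huffman code.

    Positive values are written in plain binary; negative values are written
    as value + 2^size - 1, i.e. the bitwise complement of |value|.
    """
    size = size_category(value)
    if size == 0:
        return ""
    if value < 0:
        value += (1 << size) - 1
    return format(value, f"0{size}b")


def encode_ac(run, value, huffman_table):
    """Codeword = Huffman code for (run, size) followed by the amplitude bits."""
    return huffman_table[(run, size_category(value))] + amplitude_bits(value)


if __name__ == "__main__":
    slide_table = {(6, 5): "1101"}        # hypothetical table with the slide's code
    print(size_category(-18), amplitude_bits(-18), encode_ac(6, -18, slide_table))
```

Category 5 covers magnitudes 16 through 31, so -18 needs five amplitude bits.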
26
JPEG Coding Example
AC coding
  Zig-zag scan: 16 3 0 -2 -21 10 0 0 0 0 0 2 2 -15 0 -3 -1 0 0 ...
  Run-length codes (run length/value): 0/16, 0/3, 1/-2, 0/-21, 0/10, 5/2, 0/2, 0/-15, 1/-3, 0/-1, ...
DC coding
  Assume DC(i) = 42 and DC(i-1) = 39, so DC(i) - DC(i-1) = 42 - 39 = 3
  Size = 2, value = 3
  Assume the Huffman code for size 2 is 011
  DC code is 01111
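The run-length step of this example, together with a generator for the zig-zag visiting order, can be sketched as below. The function names are illustrative, and the sketch stops at (run, value) pairs: it does not emit end-of-block codes or split runs longer than 15, which a real JPEG encoder must do.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):                    # s = row + col, one diagonal each
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                 # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def run_length_pairs(coefficients):
    """Convert a scanned coefficient list into (zero run, nonzero value) pairs.

    Trailing zeros produce no pair.
    """
    pairs, run = [], 0
    for value in coefficients:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


if __name__ == "__main__":
    scan = [16, 3, 0, -2, -21, 10, 0, 0, 0, 0, 0, 2, 2, -15, 0, -3, -1, 0, 0]
    print(run_length_pairs(scan))
    print(zigzag_order()[:6])
```

Applied to the scan on this slide, the pairs match the listed run-length codes.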
27
About JFIF
APP0 = FFE0 marker used to specify extensions
Specifies Y, Cb, Cr for components
Includes a ‘thumbnail’ picture for previewing
  Thumbnail can be RGB, index color, or compressed
Image orientation: first scan line is on top
Spatial relation of components is specified
Pixel aspect ratio (horizontal and vertical density) is specified
28
JFIF Sample
File written by PhotoShop
[cbis 210] % od -c -x Mallorca | more
0000000 377 330 377 340 \0 020 J F I F \0 001 002 001 \0 d
        ffd8 ffe0 0010 4a46 4946 0001 0201 0064
0000440 \0 \0 \0 031 h t t p : / / w w w . p
        0000 0019 6874 7470 3a2f 2f77 7777 2e70
0000460 h o t o d i s c . c o m / \0 8 B
0001200 F i l e w r i t t e n b y
        4669 6c65 2077 7269 7474 656e 2062 7920
0001220 A d o b e P h o t o s h o p 250
        4164 6f62 6520 5068 6f74 6f73 686f 70a8
Callouts on the first dump line: SOI (start of image), APP0 marker, version
Callout on the lines at 0000440: URL of image copyright owner?
29
Color Conversion (from JFIF)
30
Resolution Reduction Trials
[Figure panels: Full; Y down by 64; Cb, Cr down by 64]
31
RGB reduction
[Figure panels: Full; R, B reduced by 64]
32
MPEG-1: ‘1.5’ Mbps
Sample rate reduction in spatial and temporal domains
Spatial
  Block-based DCT
  Huffman coding (no arithmetic coding) of motion vectors and quantized DCT coefficients
Temporal
  Block-based motion compensation
  Interframe coding (two kinds)
352 x 240 pixels, 12 bits per pixel, picture rate 30 pictures per second -> 30.4 Mbps
Coded bit stream 1.15 Mbps (must leave bandwidth for audio)
Compression 26:1
Quality better than VHS!
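The source bit rate and compression ratio quoted here follow from simple arithmetic. The check below assumes a 352 x 240 picture (the NTSC SIF size), which is the size that reproduces the 30.4 Mbps figure; the function name is illustrative.

```python
def raw_bit_rate(width, height, bits_per_pixel, pictures_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * pictures_per_second


if __name__ == "__main__":
    source_rate = raw_bit_rate(352, 240, 12, 30)   # 12 bits/pixel: Y plus subsampled Cb, Cr
    coded_rate = 1.15e6                            # coded video rate from the slide
    print(f"source: {source_rate / 1e6:.1f} Mbps")
    print(f"compression: {source_rate / coded_rate:.1f}:1")
```

The 12 bits per pixel come from 8 bits of Y plus two chrominance components, each subsampled by two in both directions (8 + 2 + 2).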
33
Frames and Fields
MPEG-1 works with pictures (~ frames)
34
Input to MPEG-1
Standard allows many formats (up to 4095x4095 pixels)
Standard optimized for CCIR 601 video formats: two source input formats (SIFs) are specified (NTSC and PAL)
Coded color video has three components: Y, Cb, Cr
An MPEG-1 macroblock has 16x16 Y and 8x8 Cb, Cr pixels
35
Picture Types
MPEG-1 is designed to support random access and editing
I: intraframe coding only
P: predictive coding
B: bi-directional coding
36
Block Diagram of MPEG Decoder
[Figure: decoder block diagram with I frame, P frame, and B frame paths]
37
Typical MPEG coding parameters
Typical sequence: IPBBPBBPBBPBBPBB (16 frames)
Average compression: 26.3
38
Coding constraints (minimum)
Constrained parameter bit stream: every MPEG-1 decoder should support these parameters
39
Macroblock Coding: I & P
I pictures
  Divided into slices and macroblocks
  No motion compensation
  Each macroblock can have different quantization
  DC and AC coded differently, as in JPEG
  Different coding tables from JPEG
P pictures
  Divided into slices and macroblocks
  Option: no motion compensation
  Option: can code block as inter or intra (like I picture)
  Can skip macroblock (replace with previous). Great compression
40
Coding Image Blocks
B pictures
  Inter or intra?
  Forward, backward, interpolational?
  Code block or skip?
  Quantization step?
41
H.261 Features
Common Interchange Format
  Interoperability between 25 fps and 30 fps countries
  352 pixels/line, 288 lines, 30 fps noninterlaced
  Terminal equipment converts frame and line numbers
  Y, Cb, Cr components, color subsampled by a factor of 2 in both directions
Coding
  DCT, 8x8, 4 Y and 2 chrominance blocks per macroblock
  I and P frames only, P blocks can be skipped
  Motion compensation optional, only integer compensation
  (Optional) forward error correction coding