ENEE408G Capstone -- Multimedia Signal Processing (F'05) Video Compression Fall’05 Instructor: Carol Espy-Wilson Electrical & Computer Engineering University of Maryland, College Park  http://www.ece.umd.edu/class/enee408g/  espy@umd.edu ENEE408G Fall 2005 Lecture-8

ENEE408G Capstone -- Multimedia Signal Processing (F'05) Lec8 – Video Compression 3/31/04
[2] Last Lecture
Basic tools for compression
  – PCM coding, entropy coding, run-length coding
  – Quantization and truncation
  – Predictive coding
  – Transform coding: DCT-based
JPEG image compression
  – 8x8 block-DCT based transform coding
  – Uses predictive coding, quantization, run-length coding, and entropy coding
Today: digital video and video compression
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)

[3] Recall: Storing An Encyclopedia
  – 500,000 pages of text (2 kB/page) ~ 1 GB => 2:1 compression
  – 3,000 color pictures (640 × 480 × 24 bits) ~ 3 GB => 15:1
  – 500 maps (640 × 480 × 16 bits = 0.6 MB/map) ~ 0.3 GB => 10:1
  – 60 minutes of stereo sound (176 kB/s) ~ 0.6 GB => 6:1
  – 30 animations, 2 minutes long on average (640 × 320 × 16 bits × 16 frames/s = 6.5 MB/s) ~ 23.4 GB => 50:1
  – 50 digitized movies, 1 minute long on average (640 × 480 × 24 bits × 30 frames/s = 27.6 MB/s) ~ 82.8 GB => 50:1
=> Requires a total of 111.1 GB of storage without compression
=> Reduces to 2.96 GB with compression
(From Ken Lam’s DCT talk 2001, HK Polytech)
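The totals on this slide can be checked with a few lines of arithmetic. The sketch below uses the rounded per-item sizes and the compression ratios exactly as quoted; the dictionary layout and variable names are illustrative only. With these rounded inputs the compressed total comes out near 2.95 GB, which agrees with the quoted 2.96 GB up to rounding.

```python
# Raw sizes in GB and compression ratios, as quoted on the slide.
items = {
    "text":       (1.0, 2),
    "pictures":   (3.0, 15),
    "maps":       (0.3, 10),
    "sound":      (0.6, 6),
    "animations": (23.4, 50),
    "movies":     (82.8, 50),
}

# Total storage without compression.
raw_total = sum(size for size, _ in items.values())

# Total storage when each item is compressed by its quoted ratio.
compressed_total = sum(size / ratio for size, ratio in items.values())

print(f"raw: {raw_total:.1f} GB, compressed: {compressed_total:.2f} GB")
```

Video (animations plus movies) accounts for almost all of the raw total, which is why the rest of the lecture concentrates on video compression.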

[4] Recall: List of Compression Tools
Lossless encoding tools
  – Entropy coding: Huffman, Lempel-Ziv, and others (arithmetic coding)
  – Run-length coding
Lossy tools for reducing redundancy
  – Quantization and truncation
Predictive coding
  – Encode prediction parameters and residues with fewer bits
Transform coding
  – Transform into a domain with improved energy compaction
(Vector quantization)

[5] Bring in Motion => Video
Capturing video
  – Frame by frame => image sequence
  – Image sequence: a 3-D signal
      • 2 spatial dimensions & time dimension
      • continuous I(x, y, t) => discrete I(m, n, t_k)
Encode digital video
  – Simplest way ~ compress each frame image individually
      • e.g., “motion-JPEG”
      • only spatial redundancy is exploited and reduced
  – How about temporal redundancy?
      • differential coding may not be a good idea because the pixel-by-pixel difference could still be large due to motion
      • need better prediction

[6] Motion estimation example: “Horse ride”
[Figure panels: pixel-wise difference w/o motion compensation; motion estimation; residue after motion compensation]
(From Princeton EE330 S’01 by B.Liu)

[7] Motion Estimation
Helps understand the content of an image sequence
  – For surveillance
Helps reduce temporal redundancy of video
  – For compression
Stabilizes video by detecting and removing small, noisy global motions
  – For building a stabilizer in a camcorder

[8] Block-Matching by Exhaustive Search
Assume a block-based translational motion model
Search every possibility over a specified range for the best matching block
  – MAD (mean absolute difference) is often used for simplicity
Demo (by Dr. Ken Lam @ Hong Kong PolyTech Univ.)
(From Wang’s Preprint Fig.6.6)
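A minimal sketch of exhaustive block matching with the MAD criterion may make the procedure concrete. Everything here is an illustrative assumption rather than code from the course: frames are plain lists of rows, the function names `mad` and `full_search` are made up, the motion vector is taken as the displacement from the current block to its match in the reference frame, and candidates that fall outside the frame are simply skipped.

```python
def mad(cur, ref, bx, by, dx, dy, n):
    """Mean absolute difference between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)


def full_search(cur, ref, bx, by, n, r):
    """Try every integer displacement in [-r, r] x [-r, r].
    Returns (best_motion_vector, best_mad)."""
    h, w = len(ref), len(ref[0])
    best_mv, best_err = (0, 0), float("inf")
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= w
                    and 0 <= by + dy and by + dy + n <= h):
                continue
            err = mad(cur, ref, bx, by, dx, dy, n)
            if err < best_err:
                best_mv, best_err = (dx, dy), err
    return best_mv, best_err


# Synthetic frames: a deterministic pseudo-texture, and a copy of it
# shifted right by 2 pixels and down by 1 pixel (with wrap-around).
ref = [[(7 * x + 13 * y + x * y) % 251 for x in range(16)] for y in range(16)]
cur = [[ref[(y - 1) % 16][(x - 2) % 16] for x in range(16)] for y in range(16)]

mv, err = full_search(cur, ref, bx=6, by=6, n=4, r=3)
print(mv, err)
```

Because the current frame is the reference moved right by 2 and down by 1, the matching block lies 2 pixels to the left and 1 pixel up in the reference, so the search reports the displacement (-2, -1) with zero error.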

[9] Complexity of Exhaustive Block-Matching
Assumptions
  – Block size N x N and image size S = M1 x M2
  – Search step size is 1 pixel ~ “integer-pel accuracy”
  – Search range +/–R pixels both horizontally and vertically
Computation complexity
  – # Candidate matching blocks = (2R+1)^2
  – # Operations for computing MAD for one block ~ O(N^2)
  – # Operations for MV estimation per block ~ O((2R+1)^2 N^2)
  – # Blocks = S / N^2
  – Total # operations for entire frame ~ O((2R+1)^2 S)
      • i.e., overall computation load is independent of block size!
E.g., M1 = M2 = 512, N = 16, R = 16, 30 fps
=> On the order of 8.55 x 10^9 operations per second!
  – Difficult for real-time estimation, but possible with parallel hardware
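The operation count above is easy to reproduce. The sketch below counts one operation per pixel comparison, which is an assumption about what the slide means by an operation; the function name is hypothetical. It also confirms that the total does not depend on the block size N.

```python
def full_search_ops(m1, m2, n, r, fps=1):
    """Approximate operation count for exhaustive block matching."""
    candidates = (2 * r + 1) ** 2        # displacements tried per block
    ops_per_block = candidates * n * n   # N^2 operations per candidate
    blocks = (m1 * m2) // (n * n)        # S / N^2 blocks per frame
    return ops_per_block * blocks * fps


ops16 = full_search_ops(512, 512, n=16, r=16, fps=30)
ops8 = full_search_ops(512, 512, n=8, r=16, fps=30)
print(f"{ops16:.3e} operations per second; same for N=8: {ops16 == ops8}")
```

With the slide's parameters this gives roughly 8.56 x 10^9 operations per second, in line with the quoted order of magnitude.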

[10] Exhaustive Search: Pros and Cons
Pros
  – Guaranteed optimal within search range
Cons
  – Can only search among finitely many candidates
      • What if the motion is “fractional”?
  – High computation complexity
      • On the order of [search-range-size * image-size] for 1-pixel step size
=> How to improve accuracy?
  – Include blocks at fractional translations as candidates ~ requires interpolation
=> How to improve speed?
  – Try to exclude unlikely candidates

[11] Fractional Accuracy Search for Block Matching
For motion accuracy of 1/K pixel
  – Upsample (interpolate) the reference frame by a factor of K
  – Search for the best matching block in the upsampled reference frame
Half-pel accuracy ~ K = 2
  – Significant accuracy improvement over integer-pel (esp. for low resolution)
  – Complexity increases
(From Wang’s Preprint Fig.6.7)
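For K = 2, the upsampling step can be sketched with bilinear interpolation. This is only one possible interpolation filter, chosen here for brevity; the slide does not say which filter is used, and the function name is hypothetical. Original samples land on even positions of the output grid, and half-pel positions are averages of their neighbors.

```python
def upsample2_bilinear(frame):
    """Return a (2H-1) x (2W-1) frame: original samples at even positions,
    bilinear averages at the half-pel positions in between."""
    h, w = len(frame), len(frame[0])
    up = [[0.0] * (2 * w - 1) for _ in range(2 * h - 1)]
    for y in range(2 * h - 1):
        for x in range(2 * w - 1):
            y0, x0 = y // 2, x // 2
            # Odd coordinates sit halfway between two original samples.
            y1 = min(y0 + y % 2, h - 1)
            x1 = min(x0 + x % 2, w - 1)
            up[y][x] = (frame[y0][x0] + frame[y0][x1]
                        + frame[y1][x0] + frame[y1][x1]) / 4
    return up


frame = [[0, 10],
         [20, 30]]
up = upsample2_bilinear(frame)
for row in up:
    print(row)
```

An integer-pel search run on the upsampled reference then yields displacements in units of half a pixel, at the cost of roughly four times as many candidates for the same physical search range.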

[12] Fast Algorithms for Block Matching
Basic ideas
  – Matching errors near the best match are generally smaller than those far away
  – Skip candidates that are unlikely to give a good match

[13] Fast Algorithm: 3-Step Search
Search candidates at 8 neighbor positions
Step size cut down by 2 after each iteration
  – Start with step size approx. half of max. search range
[Figure: search grid with axes dx, dy; resulting motion vector {dx, dy} = {1, 6}]
Total number of computations:
  – 9 + 8 × 2 = 25 (3-step)
  – (2R+1)^2 = 169 (full search)
(From Ken Lam – HK Poly Univ. short course in summer’2001)
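The three-step search can be sketched as follows. Several things here are assumptions for illustration: the step sizes 4, 2, 1 (the common choice for a +/-7 range, whereas the slide's figure corresponds to R = 6), the function names, and the synthetic ramp image, which was picked so that the matching error is zero only at the true displacement.

```python
def mad(cur, ref, bx, by, dx, dy, n):
    """Mean absolute difference for the block at (bx, by) displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)


def three_step_search(cur, ref, bx, by, n, step=4):
    """Check the 8 neighbors of the current best displacement at the given
    step size, move to the best one, halve the step, and repeat until the
    step drops below 1. Returns (motion_vector, mad, candidates_evaluated)."""
    h, w = len(ref), len(ref[0])
    cx, cy = 0, 0
    best = mad(cur, ref, bx, by, 0, 0, n)
    evaluated = 1
    while step >= 1:
        best_here = (cx, cy)
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue  # center was already evaluated
                dx, dy = cx + ox, cy + oy
                if not (0 <= bx + dx and bx + dx + n <= w
                        and 0 <= by + dy and by + dy + n <= h):
                    continue
                evaluated += 1
                err = mad(cur, ref, bx, by, dx, dy, n)
                if err < best:
                    best, best_here = err, (dx, dy)
        cx, cy = best_here
        step //= 2
    return (cx, cy), best, evaluated


# Synthetic ramp image; the current frame shows the content displaced by (1, 6).
def ramp(x, y):
    return x + 100 * y

ref = [[ramp(x, y) for x in range(32)] for y in range(32)]
cur = [[ramp(x + 1, y + 6) for x in range(32)] for y in range(32)]

mv, err, evaluated = three_step_search(cur, ref, bx=12, by=12, n=4)
print(mv, err, evaluated)
```

On this input the search visits 9 + 8 + 8 = 25 candidates and ends at (1, 6). Unlike full search, the result is not guaranteed to be optimal: on images whose error surface has several local minima, the greedy steps can settle on the wrong one.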

[14] Hierarchical Block Matching
Problem with fast search at full resolution
  – Small misalignment may give high displacement error (E_DFD)
      • esp. for texture and edge blocks
Hierarchical (multi-resolution) block matching
  – Match at coarse resolution to narrow down the search range
  – Match at high resolution to refine the motion estimate
[Figure levels: lowest resolution; lower resolution; original resolution]
(From Wang’s Preprint Fig.6.19)
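A two-level version of the idea can be sketched as below. The choices are assumptions made for brevity: only two resolution levels, 2 x 2 averaging for downsampling, a +/-1 pixel refinement at full resolution, and hypothetical function names. A real pyramid would typically use more levels and a better low-pass filter.

```python
def mad(cur, ref, bx, by, dx, dy, n):
    """Mean absolute difference for the block at (bx, by) displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)


def search(cur, ref, bx, by, n, center, r):
    """Full search over displacements within +/-r of `center`."""
    h, w = len(ref), len(ref[0])
    best_mv, best_err = None, float("inf")
    for dy in range(center[1] - r, center[1] + r + 1):
        for dx in range(center[0] - r, center[0] + r + 1):
            if not (0 <= bx + dx and bx + dx + n <= w
                    and 0 <= by + dy and by + dy + n <= h):
                continue
            err = mad(cur, ref, bx, by, dx, dy, n)
            if err < best_err:
                best_mv, best_err = (dx, dy), err
    return best_mv, best_err


def downsample2(frame):
    """Halve the resolution by averaging each 2 x 2 neighborhood."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
              + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)] for y in range(h)]


def hierarchical_search(cur, ref, bx, by, n, r):
    # Coarse level: half-size block, half the search range.
    coarse_mv, _ = search(downsample2(cur), downsample2(ref),
                          bx // 2, by // 2, n // 2, (0, 0), r // 2)
    # Fine level: scale the coarse vector up and refine within +/-1 pixel.
    start = (2 * coarse_mv[0], 2 * coarse_mv[1])
    return search(cur, ref, bx, by, n, start, 1)


# Synthetic ramp image; the current frame shows the content displaced by (2, 4).
ref = [[x + 100 * y for x in range(32)] for y in range(32)]
cur = [[(x + 2) + 100 * (y + 4) for x in range(32)] for y in range(32)]

mv, err = hierarchical_search(cur, ref, bx=12, by=12, n=4, r=6)
print(mv, err)
```

Here the coarse level examines 7 x 7 = 49 candidates and the fine level 9 more, compared with 13 x 13 = 169 for a full search over the same +/-6 range.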

[16] Motion Compensation
  – Helps reduce temporal redundancy of video
[Figure panels: previous frame; current frame; predicted frame; prediction error frame]

[17] DCT-M.E. Hybrid Video Coding
“Hybrid” ~ combined transform coding & predictive coding
Spatial redundancy removal
  – Use DCT-based transform coding for the reference frame
Temporal redundancy removal
  – Use motion-based predictive coding for the next frames
      • estimate motion and use the reference frame to predict
      • only encode MV & prediction residue (“motion compensation residue”)
(From Princeton EE330 S’01 by B.Liu)
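The predictive part of the hybrid scheme can be sketched in a few lines: build a predicted frame from the reference frame and per-block motion vectors, then keep only the residue. This is a simplified illustration with hypothetical function names; it omits the DCT, quantization, and entropy coding that a real codec applies to the residue, and it clamps out-of-frame references to the border, which is an assumption.

```python
def motion_compensate(ref, mvs, n):
    """Build the predicted frame: each n x n block is copied from `ref`
    at the position displaced by that block's motion vector."""
    h, w = len(ref), len(ref[0])
    pred = [[0] * w for _ in range(h)]
    for by in range(0, h, n):
        for bx in range(0, w, n):
            dx, dy = mvs[(bx, by)]
            for y in range(n):
                for x in range(n):
                    # Clamp so displaced blocks stay inside the frame.
                    sy = min(max(by + y + dy, 0), h - 1)
                    sx = min(max(bx + x + dx, 0), w - 1)
                    pred[by + y][bx + x] = ref[sy][sx]
    return pred


def residual(cur, pred):
    """Encoder side: prediction error frame."""
    return [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(cur, pred)]


def reconstruct(pred, res):
    """Decoder side: prediction plus residue."""
    return [[p + r for p, r in zip(pr, rr)] for pr, rr in zip(pred, res)]


# The current frame is the reference with its content moved left by one pixel.
ref = [[x + 10 * y for x in range(8)] for y in range(8)]
cur = [[ref[y][min(x + 1, 7)] for x in range(8)] for y in range(8)]

mvs = {(bx, by): (1, 0) for by in range(0, 8, 4) for bx in range(0, 8, 4)}
pred = motion_compensate(ref, mvs, 4)
res = residual(cur, pred)
plain_diff = residual(cur, ref)

print(sum(abs(v) for row in plain_diff for v in row))  # energy without motion compensation
print(sum(abs(v) for row in res for v in row))         # energy with motion compensation
```

With the correct motion vectors the residue is all zeros, while the plain frame difference is not; the decoder recovers the current frame from the reference, the motion vectors, and the residue alone.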

[18] Hybrid MC-DCT Video Encoder
Intra-frame: encoded without prediction
Inter-frame: predictively encoded
(From R.Liu’s Handbook Fig.2.18)

[19] Hybrid MC-DCT Video Decoder
(From R.Liu’s Handbook Fig.2.18)

[20] Hybrid Video Coding: Problems to Be Solved
Not all regions are easily inferable from the previous frame
  – Occlusion ~ solvable by backward prediction using future frames as reference
  – Adaptively decide whether to use prediction or not
Drifting and error propagation
  – Solution: encode reference regions or frames from time to time
Random access
  – Solution: encode a frame without prediction from time to time
How to allocate bits?
  – According to statistics
  – Consider constant or variable bit-rate requirement
      • Constant bit rate (CBR) vs. variable bit rate (VBR)
=> Wrap up all solutions ~ MPEG-like codec

[21] About MPEG
MPEG – Moving Picture Experts Group
  – Coding of moving pictures and associated audio
Basic compression idea on the picture part
  – Can achieve a compression ratio of about 50:1 by storing only the differences between successive frames
  – Some claim a higher compression ratio
      • Depends on how we calculate
      • Note that color is often downsampled and odd/even fields are interlaced
Audio part
  – Compression of audio data at ratios ranging from 5:1 to 10:1
  – MP3 ~ “MPEG-1 audio Layer-3”

[22] Compression Ratio
Raw video
  – 24 bits/pixel x (720 x 480 pixels) x 30 fps = 249 Mbps
“Cheating” points ~ contributing about 4:1 inflation
  – Color components are actually downsampled
  – 30 fps may refer to the field rate in MPEG-2 ~ equiv. to 15 fps
  – (8 x 720 x 480 + 16 x 720 x 480 / 4) x 15 fps = 62 Mbps
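The two bit rates can be reproduced directly. The reading of the second formula used here (8 bits of luma per pixel plus two 8-bit chroma samples at one quarter of the resolution) is an interpretation of the slide's expression, and the variable names are illustrative.

```python
width, height = 720, 480

# Nominal raw rate: 24 bits per pixel at 30 frames per second.
raw_bps = 24 * width * height * 30

# Effective rate: 8-bit luma at full resolution, two 8-bit chroma
# components at quarter resolution, and 15 full frames per second.
effective_bps = (8 * width * height + 16 * width * height // 4) * 15

print(f"raw: {raw_bps / 1e6:.0f} Mbps")
print(f"effective: {effective_bps / 1e6:.0f} Mbps")
print(f"inflation: {raw_bps / effective_bps:.1f}:1")
```

Chroma downsampling and the halved frame rate each contribute a factor of two, which together give the 4:1 inflation mentioned above.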

[23] MPEG Generations
MPEG-1 ~ 1-1.5 Mbps (early 90s)
  – For compression of 320x240 full-motion video at rates around 1.15 Mb/s
  – Applications: video storage (VCD)
MPEG-2 ~ 2-80 Mbps (mid 90s)
  – For higher resolutions
  – Supports interlaced video formats and a number of features for HDTV
  – Addresses scalable video coding
  – Also used in DVD
MPEG-4 ~ 9-40 kbps (later 90s)
  – For very low bit rate video and audio coding
  – Applications: interactive multimedia and video telephony

[24] MPEG Generations (cont’d)
(From Ken Lam – HK Poly Univ. short course in summer’2001)

[25] MPEG-1 Video Coding Standard
Standard only specifies decoders’ capabilities
  – Prefers simple decoding and does not limit encoder complexity
  – Leaves flexibility and competition in implementing the encoder
Block-based hybrid coding (DCT + M.C.)
  – 8x8 block size as basic coding unit
  – 16x16 “macroblock” size for motion estimation/compensation
3-type frame structure: I/P/B
Group-of-Pictures (GOP)
  – Frame order: I1 BBB P1 BBB P2 BBB I2 ...
  – Coding order: I1 P1 BBB P2 BBB I2 BBB ...
(From R.Liu Handbook Fig.3.13)
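The reordering from frame (display) order to coding order follows from the fact that a B-frame needs both its previous and its next I/P frame before it can be coded. A minimal sketch, with a hypothetical function name and frame labels (the B-frames are numbered here only to make the reordering visible):

```python
def coding_order(display_order):
    """Reorder frames so every B-frame follows the I/P frame that comes
    after it in display order (its backward reference)."""
    out, pending_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)   # hold until the next anchor is sent
        else:
            out.append(frame)         # I or P frame: send the anchor first
            out.extend(pending_b)     # then the B-frames that depend on it
            pending_b = []
    out.extend(pending_b)             # trailing B-frames with no later anchor
    return out


display = ["I1", "B1", "B2", "B3", "P1", "B4", "B5", "B6",
           "P2", "B7", "B8", "B9", "I2"]
coded = coding_order(display)
print(" ".join(coded))
```

The output matches the coding order on the slide: I1 P1 BBB P2 BBB I2 BBB.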

[26] “Adaptive” Predictive Coding in MPEG-1
Half-pel M.V. search within +/-64 pel range
  – Use spatial differential coding on M.V. to remove M.V. spatial redundancy
Coding each block in a P-frame
  – Predictive block using the previous I/P frame as reference
  – Intra-block ~ encode without prediction
      • use this if prediction costs more bits than non-prediction
      • good for occluded areas
      • can also avoid error propagation
Coding each block in a B-frame
  – Intra-block ~ encode without prediction
  – Predictive block
      • use the previous I/P frame as reference (forward prediction),
      • or use the future I/P frame as reference (backward prediction),
      • or use both for prediction and take the average

[27] Coding of B-frame (cont’d)
[Figure: block A in the previous frame, block B in the current frame, block C in the future frame]
  – B = A: forward prediction (one motion vector)
  – B = C: backward prediction (one motion vector)
  – or B = (A+C)/2: interpolation (two motion vectors)
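The choice among the three B-frame prediction modes can be sketched as picking the candidate with the smallest matching error. Using the sum of absolute differences as the criterion is an assumption for illustration (an encoder may instead compare the bits each mode would cost), blocks are flattened to plain lists, and the names are hypothetical. The sketch also assumes the motion-compensated blocks A and C have already been found.

```python
def sad(a, b):
    """Sum of absolute differences between two flattened blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_b_prediction(block, fwd, bwd):
    """Pick forward (A), backward (C), or interpolated ((A+C)/2) prediction,
    whichever is closest to the current block B."""
    candidates = {
        "forward": fwd,
        "backward": bwd,
        "interpolated": [(f + b) / 2 for f, b in zip(fwd, bwd)],
    }
    mode = min(candidates, key=lambda m: sad(block, candidates[m]))
    return mode, candidates[mode]


# Current block B and its motion-compensated matches A (past) and C (future).
b_block = [10, 20, 30, 40]
a_block = [8, 18, 28, 38]
c_block = [12, 22, 32, 42]

mode, prediction = best_b_prediction(b_block, a_block, c_block)
print(mode, prediction)
```

In this example the brightness changes steadily from A to C, so the average of the two references predicts B exactly and the interpolated mode wins.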

[28] Quantization for I-frame (I-block) & M.C. Residues
Quantizer for I-frame (I-block)
  – Different step size for different freq. bands (similar to JPEG)
  – Default quantization table
  – Scale the table for different compression-quality tradeoffs
Quantizer for residues in predictive blocks
  – Noise-like residue
  – Similar variance in different freq. bands
  – Assign the same quantization step size to each freq. band
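The difference between the two quantizers can be illustrated on a handful of coefficients. The step sizes below are made-up values for four coefficients ordered from low to high frequency; they are not the MPEG-1 default table, and the function names are hypothetical.

```python
def quantize(coeffs, steps):
    """Uniform quantization: divide by the step size and round."""
    return [round(c / s) for c, s in zip(coeffs, steps)]


def dequantize(levels, steps):
    """Decoder side: multiply the levels back by the step sizes."""
    return [l * s for l, s in zip(levels, steps)]


coeffs = [98, 52, 30, 10]  # illustrative DCT coefficients, low to high frequency

# Intra blocks: coarser steps at higher frequencies (hypothetical table).
intra_steps = [8, 16, 24, 32]
# Scaling the whole table trades quality for compression.
scaled_steps = [2 * s for s in intra_steps]
# Prediction residues: the same step for every frequency band.
flat_steps = [16, 16, 16, 16]

intra_levels = quantize(coeffs, intra_steps)
scaled_levels = quantize(coeffs, scaled_steps)
flat_levels = quantize(coeffs, flat_steps)

print(intra_levels, dequantize(intra_levels, intra_steps))
print(scaled_levels, dequantize(scaled_levels, scaled_steps))
print(flat_levels, dequantize(flat_levels, flat_steps))
```

The frequency-dependent table zeroes out the highest-frequency coefficient while keeping the low frequencies precise, whereas the flat table treats every band alike, which suits a noise-like residue with similar variance in all bands.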

[31] Video Coding Summary: Performance Tradeoff
(From R.Liu’s Handbook Fig.1.2: “mos” ~ 5-pt mean opinion scale of bad, poor, fair, good, excellent)

[37] Summary
Motion estimation
Video compression through hybrid coding
  – Exploit spatial redundancy via transform coding
  – Exploit temporal redundancy via predictive coding
      • motion estimation and compensation
MPEG-1 video compression
  – Hybrid DCT-ME coding standard
Friday lab session: video compression
Next Lecture: video processing and analysis

[38] Assignment
Readings: (see E-handout on the course webpage)
  – “Data Compression” from Lecture 3
      • Section 7.7 MPEG (for 7.7.5 MPEG-4, only need to read till 7.7.5.3)
  – For more details on video coding => see instructor for copies
      • Liu’s Chapter 2 “Motion-Compensated DCT Video Coding”
      • Liu’s Chapter 3 “Video Coding Standards”

[39]
  – AVI overview: http://www.rahul.net/jfm/avi.html; http://www.jmcgowan.com/avi.html (AVI conversion to other formats)
  – Apple QuickTime: http://www.apple.com/quicktime/tools_tips/tutorials/
  – Choosing video format: http://hotwired.lycos.com/webmonkey/html/96/44/index2a.html
  – Different video formats: http://internet-tips.net/multimedia/avi.htm
