Published by Malcolm Horn. Modified over 9 years ago.

Image Processing Architecture, © 2001-2004 Oleh Tretiak
ECEC 453 Image Processing Architecture
Lecture 10, 2/17/2004
MPEG-2, Industrial Strength Video Compression and Friends
Oleh Tretiak, Drexel University

Lecture Outline
- Basic Video Coding
- Features of MPEG-1
- Features of H.261
- MPEG-2
- Introduction to MPEG-4

Basic Video Coding
- Layered digital video stream
- Picture types
- Coding parameters and compression
- Decoder: general diagram
- Options for block coding

[Figure: Picture of Layers]

Video Compression: Picture Types
- Group of Pictures: three types
  - I: intraframe coding only
  - P: predictive coding
  - B: bi-directional coding

Typical MPEG coding parameters
- Typical sequence: IPBBPBBPBBPBBPBB (16 frames)
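
As a quick check on the typical sequence quoted above, the following minimal sketch counts how many pictures of each type it contains. The function name `picture_type_counts` is made up for this illustration and is not part of any MPEG tool.

```python
from collections import Counter


def picture_type_counts(gop: str) -> dict:
    """Count the I, P and B pictures in a group-of-pictures pattern string."""
    return dict(Counter(gop))


# Pattern quoted on the slide; one letter per picture.
gop = "IPBBPBBPBBPBBPBB"
counts = picture_type_counts(gop)

print("pictures in sequence:", len(gop))
print("counts by type:", counts)
```

Only one picture in the sequence is intra coded, which is why the choice of sequence length trades compression against random access.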

[Figure: Block Diagram of MPEG Decoder. Labels: I frame, P frame, B frame]

Macroblock Coding: I & P
- I pictures (almost like JPEG)
  - Divided into slices and macroblocks
  - No motion compensation
  - Each macroblock can have different quantization
  - DC and AC coefficients coded differently, as in JPEG
  - Different coding tables from JPEG
- P pictures
  - Divided into slices and macroblocks
  - Option: no motion compensation
  - Option: can code a macroblock as inter or intra (like an I picture)
  - Can skip a macroblock (replace it with the corresponding macroblock of the previous picture). Great compression

Coding Image Blocks
- B pictures
  - Inter or intra?
  - Forward, backward, or interpolated prediction?
  - Code block or skip?
  - Quantization step?
- Statistics for an image sequence

Old standards: MPEG-1 and Videoconferencing

MPEG-1: ‘1.5’ Mbps
- Sample rate reduction in spatial and temporal domains
- Spatial
  - Block-based DCT
  - Huffman coding (no arithmetic coding) of motion vectors and quantized DCT coefficients
  - 352 x 240 pixels, 12 bits per pixel, picture rate 30 pictures per second: 30.4 Mbps uncompressed
  - Coded bit stream 1.15 Mbps (must leave bandwidth for audio)
  - Compression 26:1
  - Quality better than VHS!
- Temporal
  - Block-based motion compensation
  - Interframe coding (two kinds)
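
The rate arithmetic on this slide can be reproduced with a few lines. This sketch assumes a 352 x 240 picture, since that is the height that yields the quoted 30.4 Mbps; the function name `raw_bit_rate` is illustrative only.

```python
def raw_bit_rate(width: int, height: int, bits_per_pixel: int, pictures_per_second: int) -> int:
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * pictures_per_second


# Assumed source format: 352 x 240, 12 bits per pixel, 30 pictures per second.
raw = raw_bit_rate(352, 240, 12, 30)

# Coded video rate quoted on the slide (audio takes the rest of the channel).
coded = 1.15e6

ratio = raw / coded
print(f"raw rate: {raw / 1e6:.1f} Mbps, compression about {ratio:.0f}:1")
```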

Video Teleconferencing
- Comprehensive standard: H.320
- Components of H.320
  - H.261: video coding, 64 to 1920 kbits/sec
  - G.722, G.726, G.728: audio coding from 16 kbits/sec to 64 kbits/sec
  - H.221: multiplexing of audio and video (frame based rather than packet based)
  - H.230 and H.242: handshaking and control
  - H.233: encryption

[Figure: Generic Video Telephone System]

H.261 Features
- Common Interchange Format (CIF)
  - Interoperability between 25 fps and 30 fps countries
  - 352 pixels/line, 288 lines, 30 fps, non-interlaced
  - Terminal equipment converts frame and line numbers
  - Y Cb Cr components, color sub-sampled by a factor of 2 in both directions
- Coding
  - DCT, 8x8; 4 Y and 2 chrominance blocks per macroblock
  - I and P frames only; P blocks can be skipped
  - Motion compensation optional, integer-pel compensation only
  - (Optional) forward error correction coding
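
To make the block structure concrete, here is a small sketch that counts macroblocks and 8x8 blocks in one CIF picture. It assumes a 352 x 288 luminance raster and a 16 x 16 luminance area per macroblock (four 8x8 Y blocks plus one Cb and one Cr block); the helper name `macroblock_grid` is hypothetical.

```python
def macroblock_grid(width: int, height: int, mb_size: int = 16) -> tuple:
    """Return (columns, rows) of macroblocks covering a picture exactly."""
    if width % mb_size or height % mb_size:
        raise ValueError("picture size must be a multiple of the macroblock size")
    return width // mb_size, height // mb_size


# Assumed CIF luminance raster.
cols, rows = macroblock_grid(352, 288)
macroblocks = cols * rows

# Each macroblock carries 4 Y blocks + 1 Cb block + 1 Cr block.
blocks_per_macroblock = 4 + 2
blocks = macroblocks * blocks_per_macroblock

print(cols, rows, macroblocks, blocks)
```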

H.261 vs MPEG-1
- Similarities
  - CIF, SIF, non-interlaced
  - DCT technology
- Differences
  - H.261 uses mostly P frames, no B frames
  - H.261 typical bit rates much lower (down to 64 kbits/sec)
  - Low bit rates achieved by reducing frame rate
  - Simpler motion compensation
  - End-to-end coding delay must be low
- Conclusion: same technology, different design to meet different needs

MPEG 2^i, i = 0, 1
- History & Goals
- Expanding universe of video coding
- What are MPEG-2 profiles?
- Features of MPEG-2

MPEG Home
- Official web site: http://mpeg.telecomitalialab.com/ (http://www.cselt.it/mpeg/ still works)
- Information site: http://www.mpeg.org/MPEG/ (unchanged)
- History
  - MPEG-1, the standard for storage and retrieval of moving pictures and audio on storage media (approved Nov. 92)
  - MPEG-2, the standard for digital television (approved Nov. 94)
  - MPEG-4 version 1, the standard for multimedia applications (approved Oct. 98); version 2 (approved Dec. 99)
- Under development
  - MPEG-4 versions 3 & 4
  - MPEG-7, the content representation standard for multimedia information search, filtering, management and processing
- Started: MPEG-21, the multimedia framework

MPEG Example
- Film on DVD: 8 Gbytes
- Playing time: 2 hours
- Bit rate: 8e9 bytes x 8 bits/byte / 7200 seconds ~ 9 Mbits/sec
- Information? on the web: http://www.microsoft.com/windowsxp/moviemaker/expert/digitalvideo.asp
  - ‘Bit Rate Explained Bit rate describes how much information there is per second in a stream of data. You might have seen audio files described as “128–Kbps MP3” or “64–Kbps WMA.” Kbps stands for “kilobytes per second,”....’
  - Site claims that 64 Kbps WMA is as good as 128 Kbps MP3
  - Kbps means kilobits per second, not kilobytes; ignorance about bits and bytes does not encourage credibility
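
The DVD estimate, and the size of the bits-versus-bytes mistake criticized on this slide, can be worked through as follows. This is a minimal sketch; `average_bit_rate` is an illustrative name, and 1 Gbyte is taken as 1e9 bytes as on the slide.

```python
def average_bit_rate(total_bytes: float, seconds: float) -> float:
    """Average bit rate in bits per second for a stream of the given size and duration."""
    return total_bytes * 8 / seconds


# Film on DVD: 8 Gbytes played over 2 hours.
dvd_rate = average_bit_rate(8e9, 2 * 3600)
print(f"DVD average rate: {dvd_rate / 1e6:.1f} Mbits/sec")

# "128 Kbps" read correctly (kilobits) versus misread as kilobytes.
kilobits_reading = 128 * 1000        # bits per second
kilobytes_reading = 128 * 1000 * 8   # bits per second if "K" meant kilobytes
error_factor = kilobytes_reading / kilobits_reading
print(f"misreading Kbps as kilobytes inflates the rate by a factor of {error_factor:.0f}")
```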

MPEG-2 Goals
- Compatibility with MPEG-1
- Good picture quality
- Flexibility in input format
- Random access capability (I pictures)
- Capability for fast forward, fast reverse play, stop frame
- Bit stream scalability
- Low delay for 2-way communications (videoconferencing)
- Resilience to bit errors

MPEG-2 Implications
- No reason to restrict to CCIR 601
  - High resolution can be included (HDTV)
- No single standard can satisfy all requirements
  - Family of standards
- Most applications use a small set of the features
  - Toolkit approach

MPEG-2 profiles
- A profile is a subset of the entire MPEG-2 bit-stream syntax
  - Simple
  - Main
  - 4:2:2
  - SNR
  - Spatial
  - High
  - Multiview
- Each profile has several levels (resolution, quality)
  - Low: MPEG-1
  - Main: CCIR 601
  - High-1440 (video editing)
  - High (HDTV)

Features of MPEG-2
- Support of both non-interlaced and interlaced pictures
- Color handling
  - Y Cb Cr color space
  - Several subsampling schemes are used: 4:2:0, 4:2:2, 4:4:4
- An MPEG-2 sequence can consist of either frames or fields
- Both frame prediction and field prediction are supported
  - There can be motion between the two fields of a frame, which makes frame prediction trickier
  - In frame prediction, both fields constitute one picture
  - In field prediction, either field of the previous frame, or the previous field of the current frame, can be used as the reference
- More robust coding of motion vectors to protect against bit errors
- Special prediction modes: 16x8, dual-prime

MPEG-2: DCT and Quantization
- Two quantizers: one for intra blocks and one for non-intra blocks
- Support for different quantization blocks for luminance and chrominance
- Scalable bit streams: data partitioning, SNR scalability, temporal scalability, spatial scalability
  - Data partitioning: headers and motion vectors in two bit streams
  - SNR scalability: lower layer provides basic video, other layers provide enhancements. Basic layer sent with robust modulation
  - Spatial scalability: lower layer provides basic resolution (e.g., MPEG-1), upper layer provides detail
  - Temporal scalability: lower layer provides basic (low) frame rate
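
The idea behind SNR scalability can be illustrated with a two-layer quantizer: a coarse base layer, plus an enhancement layer that codes what the base layer missed. This is only a conceptual sketch under made-up assumptions (the coefficient values and the step sizes 16 and 4 are invented, and real MPEG-2 quantization uses weighting matrices and specific rounding rules not shown here).

```python
def quantize(values, step):
    """Map each value to an integer level using a uniform step size."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Reconstruct approximate values from integer levels."""
    return [level * step for level in levels]


# Invented DCT coefficients and step sizes, for illustration only.
coeffs = [183.0, -47.0, 22.0, -9.0, 5.0, 0.0]
BASE_STEP, ENH_STEP = 16, 4

# Base layer: coarse quantization gives the basic video.
base_levels = quantize(coeffs, BASE_STEP)
base_recon = dequantize(base_levels, BASE_STEP)

# Enhancement layer: finer quantization of the base layer's error.
residual = [c - r for c, r in zip(coeffs, base_recon)]
enh_levels = quantize(residual, ENH_STEP)
full_recon = [b + e for b, e in zip(base_recon, dequantize(enh_levels, ENH_STEP))]

base_err = max(abs(c - r) for c, r in zip(coeffs, base_recon))
full_err = max(abs(c - r) for c, r in zip(coeffs, full_recon))
print("max error, base layer only:", base_err)
print("max error, base + enhancement:", full_err)
```

A decoder that receives only the base layer still gets a usable picture; one that receives both layers gets a smaller reconstruction error.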

MPEG-2: Profiles
- 4:2:2 profile at Main level
  - Two Y blocks for each pair of Cb, Cr blocks
  - Distribution format for video production
  - Robust for several compressions and decompressions
  - 720x608, 30 fps
  - 50 Mbit/sec
  - Luminance at full raster, chrominance at full line rate
  - DC precision of intra blocks can be up to 11 bits
- Main (4:2:0) profile at Main level
  - Four Y blocks for each pair of Cb, Cr blocks
  - Intended for broadcast quality (actually, is better)
  - 15 Mbit/sec
- Main profile at Low level
  - Like MPEG-1

MPEG2 features
- Schemes for ‘frame’ and field coding
  - There are two fields in a frame, T (top) and B (bottom)
  - Either can be first
- Frame prediction for frame pictures
  - What’s there to say?
- Field prediction for field pictures
  - Target macroblock is in one field
  - Prediction pixels come from one field
  - Can be of the same or different parity as the target field
- Field prediction for frame pictures
- Dual prime for P-pictures
- 16x8 macroblock for field pictures
- Motion vectors coded at half-pel resolution
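
The relation between a frame and its two fields can be sketched as below. The sketch assumes that the top field holds the even-numbered lines counting from 0 (so the first line of the frame belongs to the top field); the function names are illustrative.

```python
def split_fields(frame):
    """Split a frame (a list of lines) into its top and bottom fields."""
    top = frame[0::2]      # lines 0, 2, 4, ...
    bottom = frame[1::2]   # lines 1, 3, 5, ...
    return top, bottom


def merge_fields(top, bottom):
    """Interleave two fields of equal height back into a frame."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame


# Tiny 6-line, 4-pixel-wide test frame; every pixel holds its line number.
frame = [[line] * 4 for line in range(6)]
top, bottom = split_fields(frame)
print("top field lines:", [line[0] for line in top])
print("bottom field lines:", [line[0] for line in bottom])
```

In field prediction each field is treated as its own picture, so a reference field may have the same parity as the target (top from top) or the opposite parity (top from bottom).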

MPEG2 - Alternate Scan
[Figure: Zig-zag scan and Alternate scan]
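
Since the figure is not reproduced here, the conventional zig-zag order can be regenerated with a short sketch. It walks the anti-diagonals of the block and alternates direction, which gives the familiar JPEG-style zig-zag; the alternate scan is a fixed table defined by the standard and is deliberately not reconstructed here. The function name `zigzag_order` is an assumption of this example.

```python
def zigzag_order(n: int = 8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal row + col == s, listed top to bottom.
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are traversed bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


order = zigzag_order(8)
print(order[:6])
```

Scanning in this order places the low-frequency coefficients first, so the trailing run of zeros after quantization can be coded compactly.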

MPEG2: Subsampling
- Suppose the picture is 720x480
  - 4:4:4: luminance and chrominance at 720x480
  - 4:2:2: luminance at 720x480, chrominance at 360x480
  - 4:2:0: luminance at 720x480, chrominance at 360x240
- Weird terminology
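
The plane sizes for each subsampling format follow from two divisors, one horizontal and one vertical. In this sketch the luminance plane is always full size (720x480 in the example), 8-bit samples are assumed, and the names `plane_sizes` and `bits_per_pixel` are made up for the illustration.

```python
# Horizontal and vertical chrominance subsampling factors for each format.
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}


def plane_sizes(width: int, height: int, fmt: str):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for one picture."""
    h_factor, v_factor = SUBSAMPLING[fmt]
    return (width, height), (width // h_factor, height // v_factor)


def bits_per_pixel(fmt: str, sample_bits: int = 8) -> float:
    """Average bits per pixel: one Y sample plus the shared Cb and Cr samples."""
    h_factor, v_factor = SUBSAMPLING[fmt]
    return sample_bits * (1 + 2 / (h_factor * v_factor))


for fmt in SUBSAMPLING:
    luma, chroma = plane_sizes(720, 480, fmt)
    print(fmt, "Y", luma, "Cb/Cr", chroma, "bits/pixel", bits_per_pixel(fmt))
```

The 12 bits per pixel obtained for 4:2:0 is the same figure used in raw bit rate estimates for MPEG-1 video.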

Low
- Y ~ 352x240
- Cb, Cr ~ 176x120
- 30 pictures per second
- +/- 64 pixel displacement, half pixel resolution

Main (4:2:0)
- Y ~ 720x480
- Cb, Cr ~ 360x240
- 30 frames per second
- 4:3, 16:9 aspect ratio
- Bitrate 15 Mbps (some applications as low as 5 Mbps)
- Digital television

High
- Y 1920x1152
- Cb, Cr 960x576
- 60 frames per second
- 80 Mbps
- HDTV
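
To compare the three levels, the sketch below estimates the uncompressed rate for each and, where a coded rate is quoted, the implied compression ratio. It assumes 12 bits per pixel (8-bit samples with 4:2:0 subsampling) for every level, which the slides state only for MPEG-1; the table `LEVELS` simply restates the slide figures.

```python
BITS_PER_PIXEL = 12  # assumption: 8-bit samples, 4:2:0 subsampling

# name: (luma width, luma height, pictures per second, quoted coded rate in bits/sec or None)
LEVELS = {
    "Low": (352, 240, 30, None),
    "Main": (720, 480, 30, 15e6),
    "High": (1920, 1152, 60, 80e6),
}

raw_rates = {}
ratios = {}
for name, (width, height, rate, coded) in LEVELS.items():
    raw_rates[name] = width * height * BITS_PER_PIXEL * rate
    if coded is not None:
        ratios[name] = raw_rates[name] / coded

for name in LEVELS:
    line = f"{name}: raw {raw_rates[name] / 1e6:.1f} Mbps"
    if name in ratios:
        line += f", compression about {ratios[name]:.1f}:1"
    print(line)
```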

Low rate
- Where is it needed?
- How is it done?

MPEG-4 Multimedia Standard
Thumbnail Description

What Is Left for MPEG-4?
- Initial goals
  - Coding standards for lower-than-MPEG-1 rates
  - Hidden agenda: incorporate new coding methods (wavelet, fractal)
- Revised agenda: object-based coding
- MPEG-4 architecture
  - Input to the coder consists of audio, video, and stored objects
  - Decoder combines encoded objects with local objects
  - Example: send text by sending character codes; the receiver uses a character generator

[Figure: Schematic Overview of MPEG-4]

MPEG-4 Ideas
- Video Object Plane (VOP)
  - A VOP can be a natural image from a video camera or from a graphics database
  - A VOP can consist of several visual objects. Visual objects do not have to have a rectangular outline (arbitrary shape)
- A scene consists of several VOs and VOPs with appropriate compositing
- Different VOPs can have their own motion
- In principle, a visual scene can be decomposed into video objects by segmentation
- Color and texture can be attributes of visual objects
- A viewer can manipulate VOs