H.261 Video Compression Overview


1 H.261 Video Compression Overview
Developed by CCITT (Consultative Committee for International Telephone and Telegraph). Designed for videoconferencing and videotelephone applications over ISDN telephone lines. Bit-rate is p x 64 Kb/sec, where p ranges from 1 to 30. Frame types are CCIR 601 CIF (352 x 288) and QCIF (176 x 144) images with 4:2:0 subsampling.

2 Generic Encoder Architecture
Generic Encoder Architecture (block diagram; components and their roles):
Subtractor (Image +, prediction -): takes the difference between the current image and the prediction of the current image (from the previous image).
Discrete Cosine Transform: converts the image from spatial data to spatial frequency data.
Quantizer: quantises the high spatial frequencies more coarsely than the low spatial frequencies.
Zig-Zag Scan & RLC: zig-zag scans and run-length codes the spatial frequencies.
Huffman Encoder (prediction errors): Huffman codes the quantised spatial frequencies.
Inverse Quantiser, Inverse DCT and adder: regenerate the next previous image from the prediction error image and the prediction image of the current frame.
Image Store of Previous Frame: holds the regenerated image.
Motion Estimator: estimates motion between the current and previous images for macroblocks as a motion vector.
Motion Compensator: generates the prediction image from the motion compensated previous image (obtained using the motion vectors and the previous image).
Huffman Encoder (motion vectors): Huffman codes the motion vectors.
Multiplexor: combines the Huffman codes of the motion vectors and prediction errors into one encoded image stream.
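
The zig-zag scan and run-length coding stage can be sketched as follows. This is a minimal illustration, not the exact H.261 entropy coder: the function names, the (run, level) pair layout and the "EOB" end-of-block marker are assumptions made for the example.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    # Cells on the same anti-diagonal share row + col; the traversal
    # direction alternates from one diagonal to the next.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def run_length_encode(coeffs):
    """Turn a scanned coefficient list into (zero_run, level) pairs."""
    pairs, run = [], 0
    for value in coeffs:
        if value == 0:
            run += 1          # count zeros preceding the next non-zero level
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")       # hypothetical end-of-block marker
    return pairs


# Quantised 4 x 4 block (small size chosen only to keep the example short).
block = [
    [12, 3, 0, 0],
    [-2, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
scanned = [block[r][c] for r, c in zigzag_order(4)]
encoded = run_length_encode(scanned)
```

Because the scan visits low frequencies first, the trailing zeros of a typical quantised block collapse into the single end-of-block marker.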

3 Generic Decoder Architecture
Generic Decoder Architecture (block diagram; components and their roles):
De-Multiplexor: splits up the incoming encoded image stream into the Huffman codes of the motion vectors and prediction errors.
Huffman Decoder (prediction errors): decodes the Huffman encoded prediction errors.
Inverse Zig-Zag Scan & RLC: inverse zig-zag scans and run-length decodes the spatial frequencies.
Inverse Quantiser: de-quantises the spatial frequencies.
Inverse DCT: converts the image from spatial frequency data to spatial data.
Adder: regenerates the next previous image (the output image) from the prediction error image and the prediction image of the current frame.
Huffman Decoder (motion vectors): decodes the Huffman encoded motion vectors.
Motion Compensator: generates the prediction image from the motion compensated previous image (obtained using the motion vectors and the previous image).
Image Store of Previous Frame: holds the regenerated image.

4 H.261 Video Compression: Limitations of Motion Estimation Algorithm
Object or camera translation: the motion estimation algorithm only detects translational motion of either the object or the camera in the x and y directions between successive images in the video. This is a gross simplification of the processes that can cause change between successive images.
Object rotation: objects in the scene may rotate, so the camera sees a different aspect of them.
Object change of shape: objects in a scene may change shape, for example clouds or a walking human.
Camera rotation and tilt: the camera may rotate or tilt. This cannot be modelled as translational motion and so incurs prediction error from a predictor that only uses translation.
Object occlusion: when one object moves in front of a second object, part of the second object is occluded. When an object rotates, part of it goes out of view or comes into view due to occlusion.
Camera zoom in and out: zooming scales the image up or down and brings other parts of the scene into or out of view.
Ambient lighting conditions: when the ambient lighting changes, due to a light being switched on or off or the sun going behind clouds, the luminance of the image changes and cannot be modelled with translation alone.
Scene cuts: at a scene cut the image changes completely, with no relation between successive images.
Sub-pixel motion: cameras and objects do not move by whole pixels; they may move by a fraction of a pixel.

5 H.261 Video Compression Encoder System

6 H.261 Video Compression Decoder System

7 H.261 Video Compression I Frames & P Frames
Frame Sequence: there are two frame types, Intra-frames (I-frames) and Inter-frames (P-frames). An I-frame provides an access point and is coded essentially as in JPEG. P-frames use "pseudo-differences" from the previous frame ("predicted"), so frames depend on each other.

8 H.261 Video Compression Hierarchical block structure
H.261 divides images into a hierarchical block structure of Groups of Blocks (GOB), Macroblocks (MB) and Blocks. [Figure: block hierarchy for a CIF image]

9 H.261 Video Compression Intra-frame Coding
Macroblocks are 16 x 16 pixel areas on Y plane of original image. A macroblock usually consists of 4 Y blocks, 1 Cr block, and 1 Cb block. Quantization is by constant value for all DCT coefficients (i.e., no quantization table as in JPEG).
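
Quantization with one constant step for every coefficient can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, and truncation toward zero with reconstruction as level times step is a simplification, not the exact H.261 reconstruction rule.

```python
def quantize(block, step):
    """Quantize every DCT coefficient with the same step (no per-frequency table)."""
    # int() truncates toward zero, so small coefficients fall into a dead zone.
    return [[int(c / step) for c in row] for row in block]


def dequantize(levels, step):
    """Approximate reconstruction: scale each level back by the step."""
    return [[level * step for level in row] for row in levels]


coeffs = [[100, -37], [7, 0]]
levels = quantize(coeffs, 8)
restored = dequantize(levels, 8)
```

A larger step produces more zero levels and fewer bits, at the cost of a larger reconstruction error.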

10 H.261 Video Compression: Inter-frame (P-frame) Coding
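The previous image is called the reference image; the image to encode is the target image. The difference image (not the target image itself) is encoded. The decoded image, not the original, must be used as the reference image, because only the decoded image is available at the decoder. The best matching block is chosen using the "Mean Absolute Error" (MAE) = sum(|E|)/N or the "Mean Squared Error" (MSE) = sum(E*E)/N, where E is the per-pixel difference and N is the number of pixels in the block.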
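
The two matching criteria can be written out directly. This is a small sketch with hypothetical function names; blocks are plain lists of rows.

```python
def mae(target, candidate):
    """Mean Absolute Error: sum(|E|) / N over all pixels of the block."""
    n = len(target) * len(target[0])
    return sum(
        abs(t - c)
        for t_row, c_row in zip(target, candidate)
        for t, c in zip(t_row, c_row)
    ) / n


def mse(target, candidate):
    """Mean Squared Error: sum(E * E) / N over all pixels of the block."""
    n = len(target) * len(target[0])
    return sum(
        (t - c) ** 2
        for t_row, c_row in zip(target, candidate)
        for t, c in zip(t_row, c_row)
    ) / n


target_block = [[10, 20], [30, 40]]
candidate_block = [[12, 20], [27, 40]]
```

MSE penalises large individual errors more heavily than MAE, while MAE avoids the multiplications.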

11 H.261 Video Compression: Motion Vector Coding
Motion estimation is done on a macroblock basis. Motion Vector Differences (MVD) are transmitted using Huffman codes. The MVD is the difference between the Motion Vector (MV) of the current macroblock and the MV of the preceding macroblock. An MV consists of a pair of horizontal (x) and vertical (y) components in the two-dimensional spatial domain. Example: MVDx(n) = MVx(n) - MVx(n-1); with MVx(n) = 15 and MVx(n-1) = -10, MVDx(n) = 15 - (-10) = 25.
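
The differencing step and its inverse at the decoder can be sketched as follows. The function names are hypothetical, and starting the predictor at (0, 0) for the first macroblock is an assumption of this example.

```python
def mv_differences(motion_vectors):
    """Encoder side: MVD(n) = MV(n) - MV(n-1), component by component."""
    previous = (0, 0)  # assumed predictor for the first macroblock
    diffs = []
    for mv in motion_vectors:
        diffs.append((mv[0] - previous[0], mv[1] - previous[1]))
        previous = mv
    return diffs


def mv_reconstruct(diffs):
    """Decoder side: accumulate the differences to recover the vectors."""
    previous = (0, 0)
    vectors = []
    for d in diffs:
        previous = (previous[0] + d[0], previous[1] + d[1])
        vectors.append(previous)
    return vectors


vectors = [(-10, 2), (15, 2), (15, -1)]
diffs = mv_differences(vectors)
```

Neighbouring macroblocks often move together, so the differences cluster around zero and get short Huffman codes.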

12 H.261 Video Compression: Inter-frame (P-frame) Coding
"Control" -- controlling the bit-rate. If the transmission buffer is too full, then bit-rate will be reduced by changing the quantization factors. "memory" -- used to store the reconstructed image (blocks) for the purpose of motion vector search for the next P-frame.

13 H.261 Video Compression: Methods for Motion Vector Searches
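C(x + k, y + l): pixels in the macroblock with upper left corner (x, y) in the Target frame. R(x + i + k, y + j + l): pixels in the macroblock with upper left corner (x + i, y + j) in the Reference frame. The cost function is:

$$MAE(i, j) = \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} \left| C(x + k, y + l) - R(x + i + k, y + j + l) \right|$$

where MAE stands for Mean Absolute Error and N is the macroblock side length. The goal is to find the displacement (i, j) = (u, v) for which MAE(i, j) is minimum.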

14 H.261 Video Compression: Methods for Motion Vector Searches
Full Search Method: sequentially search the whole [-p, p] region in both directions, evaluating all (2p + 1)^2 candidate positions. This is very slow.
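
A full search over the [-p, p] window might look like the sketch below. Frames are lists of rows indexed as frame[y][x]; the function names and the rule of skipping candidates that fall outside the reference frame are assumptions of this example.

```python
def block_mae(target, reference, x, y, i, j, n):
    """MAE between the n x n target block at (x, y) and the reference block at (x + i, y + j)."""
    total = 0
    for l in range(n):
        for k in range(n):
            total += abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
    return total / (n * n)


def full_search(target, reference, x, y, n, p):
    """Try every displacement (i, j) in [-p, p] x [-p, p] and keep the cheapest."""
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = None, float("inf")
    for j in range(-p, p + 1):
        for i in range(-p, p + 1):
            # Skip candidates whose block would leave the reference frame.
            if not (0 <= x + i <= width - n and 0 <= y + j <= height - n):
                continue
            cost = block_mae(target, reference, x, y, i, j, n)
            if cost < best_cost:
                best_mv, best_cost = (i, j), cost
    return best_mv, best_cost


# Synthetic test frames: every reference pixel is unique, and the target is
# the reference content displaced by (2, 1).
reference = [[16 * r + c for c in range(16)] for r in range(16)]
target = [
    [reference[r + 1][c + 2] if r + 1 < 16 and c + 2 < 16 else 0 for c in range(16)]
    for r in range(16)
]
result = full_search(target, reference, 6, 6, 4, 3)
```

With p = 3 this evaluates up to 49 candidate positions for a single macroblock, which is why faster search strategies are used in practice.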

15 H.261 Video Compression: Methods for Motion Vector Searches
Two-Dimensional Logarithmic Search: similar to binary search. The MAE function is initially computed within a window of [-p/2, p/2] at nine locations, as shown in the figure. Repeat until the size of the search region is one pixel wide:
1. Find the one of the nine locations that yields the minimum MAE.
2. Form a new search region with half of the previous size, centered at the location found in step 1.
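
The two steps above can be sketched as a loop over a shrinking nine-point pattern. To keep the example short, the search takes a hypothetical cost(i, j) callable (for instance the MAE of a candidate displacement); the starting step of ceil(p / 2) and the halving rule are one reasonable reading of the description, not the only one.

```python
def log_search(cost, p):
    """2D logarithmic search over displacements in [-p, p] x [-p, p]."""
    ci, cj = 0, 0                      # current centre of the search region
    step = max(1, (p + 1) // 2)        # start at roughly half the search range
    while True:
        # Nine locations: the centre and its eight neighbours at distance `step`.
        candidates = [
            (ci + di * step, cj + dj * step)
            for dj in (-1, 0, 1)
            for di in (-1, 0, 1)
            if abs(ci + di * step) <= p and abs(cj + dj * step) <= p
        ]
        ci, cj = min(candidates, key=lambda mv: cost(*mv))
        if step == 1:                  # region is one pixel wide: done
            return (ci, cj)
        step = (step + 1) // 2         # halve the region around the best point


# Synthetic cost with a single minimum at displacement (6, -7).
calls = []

def toy_cost(i, j):
    calls.append((i, j))
    return abs(i - 6) + abs(j + 7)

best = log_search(toy_cost, 7)
```

For p = 7 this evaluates 27 positions instead of the 225 of a full search. The method can settle on a local minimum when the cost surface is not smooth, so it trades accuracy for speed.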

16 H.261 Video Compression: Hierarchical Motion Estimation
Form several low-resolution versions of the target and reference pictures. Find the best-match motion vector in the lowest-resolution version. Refine the motion vector level by level when going up to the higher resolutions.
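
A two-level version of this idea can be sketched as follows. Everything here is an assumption made for illustration: the 2 x 2 averaging used to build the low-resolution pictures, the search radius at the coarsest level, the +/-1 refinement at finer levels, and all function names.

```python
def downsample(frame):
    """Halve the resolution by averaging each 2 x 2 group of pixels."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [
        [(frame[2 * r][2 * c] + frame[2 * r][2 * c + 1]
          + frame[2 * r + 1][2 * c] + frame[2 * r + 1][2 * c + 1]) / 4
         for c in range(w)]
        for r in range(h)
    ]


def block_mae(target, reference, x, y, i, j, n):
    total = 0
    for l in range(n):
        for k in range(n):
            total += abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
    return total / (n * n)


def search_around(target, reference, x, y, n, centre, radius):
    """Exhaustive search in a small window around a starting vector."""
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = centre, float("inf")
    for j in range(centre[1] - radius, centre[1] + radius + 1):
        for i in range(centre[0] - radius, centre[0] + radius + 1):
            if not (0 <= x + i <= width - n and 0 <= y + j <= height - n):
                continue
            cost = block_mae(target, reference, x, y, i, j, n)
            if cost < best_cost:
                best_mv, best_cost = (i, j), cost
    return best_mv


def hierarchical_search(target, reference, x, y, n, levels=2, radius=2):
    """Coarse-to-fine motion estimation over an image pyramid."""
    pyramid = [(target, reference)]
    for _ in range(levels - 1):
        t, r = pyramid[-1]
        pyramid.append((downsample(t), downsample(r)))
    mv = (0, 0)
    for level in range(levels - 1, -1, -1):
        t, r = pyramid[level]
        scale = 2 ** level
        window = radius if level == levels - 1 else 1  # refine by +/-1 above the coarsest level
        mv = search_around(t, r, x // scale, y // scale, n // scale, mv, window)
        if level > 0:
            mv = (mv[0] * 2, mv[1] * 2)  # project the vector to the next finer level
    return mv


# Synthetic frames: the target is the reference content displaced by (4, 2).
reference = [[64 * r + c for c in range(32)] for r in range(32)]
target = [
    [reference[r + 2][c + 4] if r + 2 < 32 and c + 4 < 32 else 0 for c in range(32)]
    for r in range(32)
]
mv = hierarchical_search(target, reference, 8, 8, 8)
```

The coarse level covers a displacement range of +/-4 full-resolution pixels while comparing blocks with a quarter of the pixels; the fine level then only checks nine positions.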

17 H.261 Video Compression: Syntax of Hierarchical Motion Estimation
Many macroblocks will be exact matches (or close enough), so send the address of each block in the image --> Addr
Sometimes no good match can be found, so send an INTRA block --> Type
We will want to vary the quantization to fine-tune compression, so send the quantization value --> Quant
Motion vector --> vector
Some blocks in a macroblock will match well, others poorly, so send a bitmask indicating which blocks are present (Coded Block Pattern, or CBP).
Send the blocks (4 Y, 1 Cr, 1 Cb) as in JPEG.

18 H.261 Video Compression: Hierarchical Motion Estimation
Need to delineate boundaries between pictures, so send Picture Start Code --> PSC
Need a timestamp for each picture (used later for audio synchronization), so send Temporal Reference --> TR
Is this a P-frame or an I-frame? Send Picture Type --> PType
The picture is divided into regions of 11 x 3 macroblocks called Groups of Blocks --> GOB
Might want to skip whole groups, so send Group Number (Grp #)
Might want to use one quantization value for a whole group, so send Group Quantization Value --> GQuant
Overall, the bitstream is designed so that data can be skipped whenever possible while remaining unambiguous.
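
The sizes quoted in these slides (16 x 16 macroblocks, GOBs of 11 x 3 macroblocks, six blocks per macroblock) fix how many units each picture format contains. The short calculation below uses a hypothetical helper name.

```python
def h261_layout(width, height):
    """Count macroblocks, GOBs and 8 x 8 blocks for a picture of the given size."""
    mb_cols, mb_rows = width // 16, height // 16   # 16 x 16 luminance pixels per macroblock
    macroblocks = mb_cols * mb_rows
    gobs = macroblocks // (11 * 3)                  # one GOB holds 11 x 3 macroblocks
    blocks = macroblocks * 6                        # 4 Y + 1 Cb + 1 Cr per macroblock
    return {"macroblocks": macroblocks, "gobs": gobs, "blocks": blocks}


cif = h261_layout(352, 288)
qcif = h261_layout(176, 144)
```

A CIF picture therefore carries 12 GOBs and a QCIF picture 3.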

19 H.263 Video Compression Overview
H.263 is a new, improved standard for low bit-rate video, adopted in March. Like H.261, it uses transform coding for intra-frames and predictive coding for inter-frames. Bit-rate is 8 Kb/sec and above. Designed for low bit-rate video transmission in mobile networks. Frame types are CCIR 601 16CIF (1408 x 1152), 4CIF (704 x 576), CIF (352 x 288), QCIF (176 x 144) and SQCIF (128 x 96) images with 4:2:0 subsampling.
Advanced Options:
Half-pixel precision in motion compensation
Unrestricted motion vectors
Syntax-based arithmetic coding
Advanced prediction and PB-frames
Some Important Issues:
Avoiding propagation of errors: send an I-frame every once in a while, and make sure you use the decoded frame for comparison.
Bit-rate control: a simple feedback loop based on "buffer fullness". If the buffer is too full, increase the quantization scale factor to reduce the data.
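
The "buffer fullness" feedback loop can be sketched in a few lines. The thresholds, the step of one per update and the function name are hypothetical choices for this example; the 1 to 31 range corresponds to a 5-bit quantizer parameter.

```python
def next_quant(quant, buffer_bits, buffer_size, low=0.3, high=0.7, qmin=1, qmax=31):
    """Adjust the quantization scale factor from the current buffer fullness."""
    fullness = buffer_bits / buffer_size
    if fullness > high:
        quant += 1      # buffer too full: quantize more coarsely, produce fewer bits
    elif fullness < low:
        quant -= 1      # buffer draining: spend more bits on quality
    return max(qmin, min(qmax, quant))
```

Called once per macroblock or per group of blocks, this keeps the output close to the channel rate without looking ahead.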

20 H.263 Video Compression Encoder System

21 H.263 Video Compression Decoder System

22 H.263 Video Compression Hierarchical block structure
H.263 divides images into a hierarchical block structure of Groups of Blocks (GOB), Macroblocks (MB) and Blocks. [Figure: block hierarchy for a CIF image]

23 H.263 Video Compression Advanced Modes
Normal: motion vector differences and DCT coefficients are Huffman coded, one motion vector is coded per macroblock, motion vectors are restricted to point within the image, and images are coded using Intra/Inter modes only.
Syntax-based Arithmetic: motion vector differences and DCT coefficients are encoded using arithmetic codes.
Advanced Prediction: four 8x8 motion vectors per macroblock are coded instead of one 16x16 motion vector per macroblock.
Unrestricted Motion Vectors: motion vectors are allowed to point outside the image.
Prediction-Bidirectional (PB-frames): two images are coded as one. Picture N+2 is forward predicted from picture N, and picture N+1 is bidirectionally coded from pictures N and N+2.

24 H.263 Video Compression Macroblock Based Motion Vector Prediction
Motion vectors MVx and MVy are computed with one vector per Macroblock. The predictor vector is the median predictor of the three candidate Motion Vectors (MV1, MV2, MV3). MVD, the difference between MV and the predictor, is Huffman encoded. After motion compensation, blocks that have prediction errors above a threshold have their prediction errors intra coded using DCT.
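
The median predictor is taken separately for the horizontal and vertical components. A minimal sketch with hypothetical function names:

```python
def median3(a, b, c):
    """Middle value of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(mv1, mv2, mv3):
    """Component-wise median of the three candidate motion vectors."""
    return (median3(mv1[0], mv2[0], mv3[0]), median3(mv1[1], mv2[1], mv3[1]))


def mv_difference(mv, mv1, mv2, mv3):
    """MVD = MV - predictor; this is the value that gets Huffman coded."""
    px, py = predict_mv(mv1, mv2, mv3)
    return (mv[0] - px, mv[1] - py)


predictor = predict_mv((2, -1), (5, 3), (-4, 0))
mvd = mv_difference((3, 1), (2, -1), (5, 3), (-4, 0))
```

Note that the predicted vector need not equal any single candidate: here x comes from MV1 and y from MV3.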

25 H.263 Video Compression Block Based Motion Vector Prediction
Four motion vectors are computed per Macroblock. Motion vectors MVx and MVy are computed per block. The predictor vector is the weighted sum of the three prediction Motion Vectors (MV1, MV2, MV3). MVD, the difference between MV and the predictor, is Huffman encoded.

26 H.263 Video Compression Half Pixel Prediction by Bilinear Interpolation
Motion estimation is carried out on an integer pixel resolution. Once this has been carried out, motion estimation can be extended to 1/2 pixel resolution using bilinear interpolation to interpolate the fractional pixels: a = A, b = (A + B)/2, c = (A + C)/2, d = (A + B + C + D)/4.
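
The four interpolation formulas can be wrapped in one sampling function. The pixel layout is an assumption, since the slide's figure is not reproduced: A is the integer pixel, B its right neighbour, C the pixel below and D the diagonal neighbour. Coordinates are given in half-pixel units, and the function name is hypothetical. Integer rounding, which a real codec would specify, is left out.

```python
def half_pel_sample(frame, x2, y2):
    """Sample frame at (x2 / 2, y2 / 2) using bilinear interpolation."""
    x, y = x2 // 2, y2 // 2          # integer pixel position of A
    frac_x, frac_y = x2 % 2, y2 % 2  # 1 when the coordinate lies on a half position
    A = frame[y][x]
    if not frac_x and not frac_y:
        return A                                        # a = A
    if frac_x and not frac_y:
        return (A + frame[y][x + 1]) / 2                # b = (A + B) / 2
    if frac_y and not frac_x:
        return (A + frame[y + 1][x]) / 2                # c = (A + C) / 2
    B, C, D = frame[y][x + 1], frame[y + 1][x], frame[y + 1][x + 1]
    return (A + B + C + D) / 4                          # d = (A + B + C + D) / 4


frame = [[10, 20], [30, 60]]
```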

27 MPEG-1 Video Compression Overview
The "Moving Picture Coding Experts Group" (MPEG) was established in 1988 to create a standard for delivery of video and audio. It became a two-channel coding standard (1992). MPEG-1 target: VHS quality on a CD-ROM (352 x ... with CD audio @ 1.5 Mbits/sec). The standard had three parts: Video, Audio, and System (which controls interleaving of streams).

28 MPEG-1 Video Compression Video Encoder
MPEG-1 specifies four picture types:
I-pictures: blocks are coded using DCT only (based on JPEG & H.261).
P-pictures: blocks are forward motion compensated.
B-pictures: blocks are forward and reverse motion compensated.
D-pictures: only the DC coefficients of the DCT are coded (for fast forward mode).

29 MPEG-1 Video Compression Video Decoder

30 MPEG-1 Video Compression Motion Estimation Search
Larger gaps between I and P frames allowed, so expand motion vector search range. To get better encoding, allow motion vectors to be specified to fraction of a pixel (1/2 pixels).

31 MPEG-1 Video Compression Need for B-frame
In predictive coding, many macroblocks need information that is not in the reference frame. The solution is to code bidirectionally, e.g. with B-frames.

32 MPEG-1 Video Compression B-frame
B-frames search for macroblock in past and future frames. Typical pattern is IBBPBBPBB IBBPBBPBB IBBPBBPBB Actual pattern is up to encoder, and need not be regular. Bidirectional Coding provides greater compression
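
Because a B-frame refers to a future frame, that future I- or P-frame has to be coded before the B-frames that depend on it. The reordering from display order to coding order can be sketched as below; the function name is hypothetical, and trailing B-frames, which would need the next group's I-frame, are simply appended at the end in this example.

```python
def coding_order(display_types):
    """Reorder frames so every B-frame follows both of its reference frames."""
    ordered, pending_b = [], []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append((index, frame_type))   # wait for the next I or P frame
        else:
            ordered.append((index, frame_type))     # reference frame goes first
            ordered.extend(pending_b)               # then the B-frames it unlocks
            pending_b = []
    ordered.extend(pending_b)
    return ordered


order = coding_order("IBBPBBP")
```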

33 MPEG-1 Video Compression Coding B-frame
B frame macroblocks can specify two motion vectors (one to past and one to future), indicating result is to be averaged.
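
Averaging the two motion-compensated blocks is a per-pixel operation. The sketch below assumes integer pixel values and rounds halves upward; the rounding rule and the function name are assumptions of this example.

```python
def bidirectional_prediction(past_block, future_block):
    """Average the block fetched from the past frame with the one from the future frame."""
    return [
        [(p + f + 1) // 2 for p, f in zip(past_row, future_row)]
        for past_row, future_row in zip(past_block, future_block)
    ]


past = [[10, 20], [30, 41]]
future = [[20, 20], [31, 40]]
prediction = bidirectional_prediction(past, future)
```

Averaging two predictions also smooths out noise in either reference, which is one reason bidirectional coding compresses well.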

34 MPEG-1 Video Compression Slices
Added the notion of a slice for resynchronization after lost or corrupt data. [Figure: example picture with 7 slices]

35 MPEG-1 Video Compression Syntax
Bitstream syntax must allow random access, forward/backward play, etc.
Sequence information:
1. Video Params include width, height, aspect ratio of pixels, picture rate.
2. Bitstream Params are bit rate, buffer size, and constrained parameters flag (means the bitstream can be decoded by most hardware).
3. Two types of quantization tables (QTs): one for intra-coded blocks (I-frames) and one for inter-coded blocks (P-frames).
Group of Pictures (GOP) information:
1. Time code: bit field with SMPTE time code (hours, minutes, seconds, frame).
2. GOP Params are bits describing the structure of the GOP. Is the GOP closed? Does it have a broken (dangling) link?
Picture information:
1. Type: I, P, or B-frame?
2. Buffer Params indicate how full the decoder's buffer should be before starting to decode.
3. Encode Params indicate whether half-pixel motion vectors are used.
Slice information:
1. Vert Pos: what line does this slice start on?
2. QScale: how is the quantization table scaled in this slice?
Macroblock information:
1. Addr Incr: number of MBs to skip.
2. Type: does this MB use a motion vector? What type?
3. QScale: how is the quantization table scaled in this MB?
4. Coded Block Pattern (CBP): bitmap indicating which blocks are coded.

36 MPEG-2 Video Compression Overview
It is a Multi-Channel Coding Standard (1994, 1997).
It defines the multi-channel extension to MPEG-1 audio (MPEG-2 BC).
It defines an audio coding standard at lower sampling frequencies (16 kHz, 22.05 kHz, 24 kHz) than MPEG-1 (32 kHz, 44.1 kHz, 48 kHz).
It defines a higher quality multi-channel standard than achievable with MPEG-1 extensions (MPEG-2 Advanced Audio Coding, AAC).
Frame sizes as large as 16383 x 16383.
MPEG-3: originally for HDTV (1920 x 1080), got folded into MPEG-2.

37 MPEG-2 Video Compression Overview
Unlike MPEG-1, which is a standard for storing and playing video on a single computer at low bit-rates, MPEG-2 is a standard for digital TV. It meets the requirements for HDTV, DVD (Digital Video/Versatile Disc) and existing TV (PAL, SECAM and NTSC).

38 MPEG-2 Video Compression: Field and Frame Motion Prediction
Supports both field prediction and frame prediction. There are four modes:
Field Prediction: predictions are made independently for each field. Predictions can be made between even fields, between odd fields, or between even and odd fields.
Frame Prediction: predictions are made from one or more previously decoded frames.
(16 x 8) Motion Compensation: two motion vectors are used for each MB. The first is applied to the upper 16x8 region, the second to the lower 16x8 region.
Dual-Prime Prediction: predictions are made from two reference fields (previous odd and even) and are averaged to form the final field prediction (next odd or even).

39 MPEG-2 Video Compression: DCT Coding
After motion compensation, the prediction errors are coded using the DCT.
For frame prediction, the even (black) and odd (white) fields are structured for DCT coding: [Figure: field arrangement for frame prediction]
For field prediction, (16x8) and Dual-Prime prediction, the even (black) and odd (white) fields are structured for DCT coding: [Figure: field arrangement for field, (16x8) and Dual-Prime prediction]

40 MPEG-2 Video Compression: Subsampling and Macroblock Organisation
Typically MPEG-2 uses 4:2:0 subsampling. It also allows 4:2:2 and 4:4:4 chroma subsampling.
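
The chroma format determines how many 8 x 8 blocks make up one 16 x 16 macroblock. The calculation below derives the counts from the horizontal and vertical subsampling factors; the table and function names are hypothetical.

```python
# (horizontal, vertical) chroma subsampling factor for each format
SUBSAMPLING = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}


def blocks_per_macroblock(chroma_format):
    """Number of 8 x 8 blocks (Y + Cb + Cr) in one 16 x 16 macroblock."""
    h, v = SUBSAMPLING[chroma_format]
    luma_blocks = (16 * 16) // 64                       # always 4 Y blocks
    chroma_blocks = ((16 // h) * (16 // v)) // 64       # per chroma component
    return luma_blocks + 2 * chroma_blocks
```

So a macroblock has 6 blocks in 4:2:0, 8 in 4:2:2 and 12 in 4:4:4.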

41 MPEG-2 Video Compression: Scalability
Scalable Coding Extensions (so the same set of signals works for both HDTV and standard TV):
SNR (quality) Scalability: similar to JPEG DCT-based Progressive mode, adjusting the quantization steps of the DCT coefficients.
Spatial Scalability: similar to hierarchical JPEG, multiple spatial resolutions.
Temporal Scalability: different frame rates.

42 MPEG-2 Video Compression: Macroblock Quantisation
Non-linear macroblock quantization factor of AC coefficients.
If type 0, there is a linear relation between quantiser code and step size. Suited for lower resolution video.
If type 1, there is a non-linear relation between quantiser code and step size. It is finer for small data values and coarser for larger data values. Suited for higher resolution video.
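
The two mappings from quantiser code to scale value can be sketched as a lookup. The non-linear values below are reproduced from memory of the quantiser scale table in the MPEG-2 video specification and should be checked against the standard before being relied on; the function and table names are hypothetical.

```python
# Non-linear scale values for quantiser codes 1..31 (assumed; verify against the standard).
NONLINEAR_SCALE = [
    1, 2, 3, 4, 5, 6, 7, 8,
    10, 12, 14, 16, 18, 20, 22, 24,
    28, 32, 36, 40, 44, 48, 52, 56,
    64, 72, 80, 88, 96, 104, 112,
]


def quantiser_scale(code, scale_type):
    """Map a 5-bit quantiser code (1..31) to a quantiser scale value."""
    if not 1 <= code <= 31:
        raise ValueError("quantiser code must be in 1..31")
    if scale_type == 0:
        return 2 * code                 # type 0: linear relation
    return NONLINEAR_SCALE[code - 1]    # type 1: fine steps first, coarse steps later
```

The non-linear table offers steps of 1 at the low end and steps of 8 at the high end, so it reaches both finer and coarser quantization than the linear mapping.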

