Overview of the H.264/AVC Video Coding Standard

Presentation transcript:

Overview of the H.264/AVC Video Coding Standard Ahmed Hamza School of Computing Science Simon Fraser University February, 2009

Outline Overview Network Abstraction Layer (NAL) Video Coding Layer (VCL) Profiles and Applications Performance Comparison Conclusions

History of H.264/AVC: Initiated by the Video Coding Experts Group (VCEG) in early 1998 under the previous name H.26L, with the target of doubling the coding efficiency. The first draft was adopted in Oct. 1999. In Dec. 2001, VCEG and the Moving Picture Experts Group (MPEG) formed a Joint Video Team (JVT). Approved by the ITU-T as H.264 and by ISO/IEC as International Standard 14496-10 (MPEG-4 Part 10) Advanced Video Coding (AVC) in Mar. 2003. MPEG: a group that develops standards for the International Organization for Standardization (ISO). VCEG: a group of the International Telecommunication Union (ITU).

Timeline of Video Development: Video telephony, Video-CD, video conferencing, digital TV/DVD. MPEG-2: an enabling technology for digital television systems, initially designed to extend MPEG-1 and support interlaced video coding; used for transmission of SD/HD TV signals and storage of SD video on DVDs. H.263/+/++: conversational/communication applications; supports diverse networks and adapts to their characteristics (e.g. PSTN, mobile, LAN/Internet), taking loss and error robustness requirements into account. MPEG-4 Visual: object-based video coding; provides shape coding.

Motivation Create a standard capable of providing good video quality at substantially lower bit rates than previous standards (e.g. half or less), without increasing the complexity of design so much that it would be impractical or excessively expensive to implement. Provide enough flexibility to allow the standard to be applied to a wide variety of applications on a wide variety of networks and systems including low and high bit rates, low and high resolution video, broadcast, DVD storage, RTP/IP packet networks, and ITU-T multimedia telephony systems.

H.264/AVC Coding Standard: Various Applications. Broadcast: cable, satellite, terrestrial, and DSL. Storage: DVDs (HD DVD and Blu-ray). Video conferencing: over different networks. Multimedia streaming: live and on-demand. Multimedia Messaging Services (MMS). Challenge: how to handle all these applications and networks, which calls for flexibility and customizability. H.264/AVC considers a variety of applications over heterogeneous networks, such as broadcast over unidirectional communication channels, optical/magnetic media that support sequential playout as well as random access (jumping to arbitrary points), low bit rate video conferencing, possibly high bit rate video streaming, and digital video mail. These applications often impose different requirements (e.g., high or low bit rates), while the networks have different characteristics, e.g., low or high latency (satellite) and low or high loss rates (wired versus wireless networks).

Structure of H.264/AVC Codec: Layered design. The Network Abstraction Layer (NAL) formats video and metadata for a variety of networks. The Video Coding Layer (VCL) represents video in an efficient way. Scope of H.264 standard. To address the heterogeneity of applications and networks, H.264/AVC adopts a layered design, where …

Outline Overview Network Abstraction Layer (NAL) Video Coding Layer (VCL) Profiles and Applications Performance Comparison Conclusions

Network Abstraction Layer: The purpose of separately specifying the VCL and NAL is to distinguish between coding-specific features (at the VCL) and transport-specific features (at the NAL). The NAL provides network friendliness for sending video data over various network transports, such as RTP/IP for Internet applications, MPEG-2 streams for broadcast services, and ISO file formats for storage applications. The main objective of the NAL is to enable simple and effective customization of sending video data over different network systems.

NAL Units: NAL units are packets that carry video data. Each consists of a short one-byte packet header followed by the payload, so most of its capacity is used for carrying payload. Two types of transport are supported. Stream-oriented: the transport provides no unit boundaries for free, so each NAL unit is preceded by a 3-byte start code prefix. Packet-oriented: the packets already delimit the units, so a start code prefix would be a waste. NAL units can be classified into VCL units (data for video pictures) and non-VCL units (metadata and additional info). … We next present non-VCL units.
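As an illustration of the one-byte header and the start code prefix, here is a minimal Python sketch. The field names (`forbidden_zero_bit`, `nal_ref_idc`, `nal_unit_type`) come from the H.264 specification rather than from the slides, and the helper names are hypothetical. The sketch ignores emulation prevention bytes, so it is not a complete byte-stream parser.

```python
from typing import Dict, List

START_CODE_PREFIX = b"\x00\x00\x01"  # 3-byte start code prefix


def parse_nal_header(first_byte: int) -> Dict[str, int]:
    """Split the one-byte NAL unit header into its three bit fields."""
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,  # 1 bit
        "nal_ref_idc": (first_byte >> 5) & 0x03,         # 2 bits
        "nal_unit_type": first_byte & 0x1F,              # 5 bits
    }


def split_byte_stream(stream: bytes) -> List[bytes]:
    """Split a stream-oriented byte sequence into NAL units.

    Simplification: trailing zero bytes are stripped from each unit, since
    they may belong to the padding in front of the next start code.
    """
    units = []
    pos = stream.find(START_CODE_PREFIX)
    while pos != -1:
        start = pos + len(START_CODE_PREFIX)
        nxt = stream.find(START_CODE_PREFIX, start)
        end = len(stream) if nxt == -1 else nxt
        unit = stream[start:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units


# Made-up payload bytes; only the header bytes 0x67, 0x68, 0x65 matter here.
demo = b"\x00\x00\x00\x01\x67\xAA\x00\x00\x01\x68\xBB\x00\x00\x01\x65\xCC"
demo_units = split_byte_stream(demo)
demo_types = [parse_nal_header(u[0])["nal_unit_type"] for u in demo_units]
```

In the specification, types 7 and 8 are parameter sets (non-VCL units) and type 5 is a slice of an IDR picture (a VCL unit), which matches the VCL/non-VCL split above.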

Sequence of NAL units (Access Unit): Each NAL unit contains a Raw Byte Sequence Payload (RBSP), a set of data corresponding to coded video data or header information.

Access Units: Now we know the building blocks, but how are NAL units put together to represent one picture? An access unit is a set of NAL units; decoding an access unit results in one picture. Structure: (1) Delimiter: for seeking in a byte-stream format. (2) SEI (Supplemental Enhancement Information): timing information and other supplemental data that may enhance application usability. (3) Primary coded picture: a set of VCL NAL units that together compose the picture. (4) Redundant coded picture: several redundant slices for error recovery. Redundant coded pictures are usually not decoded; a decoder tries them only when it fails to decode the primary coded picture.

RBSP Types

Video Sequences and IDR Frames: Intuitively, aggregating several access units (pictures) gives us a sequence, but a video sequence carries additional requirements in H.264/AVC. Sequence: an independently decodable NAL unit stream (it does not need NAL units from other sequences) with one sequence parameter set; it starts with an instantaneous decoding refresh (IDR) access unit. IDR frames: random access points; intra-coded frames; no picture following an IDR frame requires reference to pictures prior to the IDR frame; decoders mark buffered reference pictures as unusable once they see an IDR frame.

Outline Overview Network Abstraction Layer (NAL) Video Coding Layer (VCL) Profiles and Applications Feature Highlights Conclusions

Macroblocks and Slices: Fixed-size MBs: 16x16 for luma and 8x8 for chroma (4:2:0 sampling). Slice: a set of MBs that can be decoded without use of other slices; the only reason adjacent slices may be required is the deblocking filter. I slice: intra-prediction (I-MBs). P slice: at most one inter-prediction signal (I- and P-MBs). B slice: up to two inter-prediction signals (I- and B-MBs). SP slice: efficient switching among streams. SI slice: used in conjunction with SP slices. The main feature of SP-frames is that identical SP-frames can be reconstructed even when different reference frames are used for their prediction. This property makes them useful for applications such as random access, adaptation to network conditions, and error recovery/resilience. We next talk about the ordering of MBs.

H.264 Slice Modes
Slice type | Description | Profile(s)
I (Intra) | Contains only I macroblocks (each block or MB is predicted from previously coded data within the same slice). | All
P (Predicted) | Contains P macroblocks (each MB or MB partition is predicted from one list 0 reference picture) and/or I macroblocks. | All
B (Bi-predictive) | Contains B macroblocks (each MB or MB partition is predicted from a list 0 and/or a list 1 reference picture) and/or I macroblocks. | Extended and Main
SP (Switching P) | Facilitates switching between coded streams; contains P and/or I macroblocks. | Extended
SI (Switching I) | Facilitates switching between coded streams; contains SI macroblocks (a special type of intra coded MB). | Extended
An H.264 encoder may optionally insert a picture delimiter RBSP unit at the boundary between coded pictures. This indicates the start of a new coded picture and indicates which slice types are allowed in the following coded picture. If the picture delimiter is not used, the decoder is expected to detect the occurrence of a new picture based on the header of the first slice in the new picture.

Slice Groups: Flexible MB Order (FMO): multiple slice groups make it possible to map the sequence of coded MBs to the decoded picture in a number of flexible ways. Arbitrary Slice Order (ASO): sending and receiving the slices of a picture in any order relative to each other can improve end-to-end delay in real-time applications, particularly when used on networks with out-of-order delivery behavior.

MB to Slice Group Mappings: foreground and background, interleaved, dispersed.
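To make two of these mappings concrete, the following Python sketch builds an MB-to-slice-group map in raster order. The function names and parameters are hypothetical. The dispersed formula follows my reading of the standard's dispersed map type and should be treated as an assumption; with two groups it reduces to a checkerboard.

```python
from typing import List


def interleaved_map(num_mbs: int, run_lengths: List[int]) -> List[int]:
    """Assign MBs (raster order) to slice groups in repeating runs.

    run_lengths[g] is the number of consecutive MBs given to group g before
    moving on to the next group; all run lengths must be positive.
    """
    if any(r <= 0 for r in run_lengths):
        raise ValueError("run lengths must be positive")
    mapping: List[int] = []
    group = 0
    while len(mapping) < num_mbs:
        run = min(run_lengths[group], num_mbs - len(mapping))
        mapping.extend([group] * run)
        group = (group + 1) % len(run_lengths)
    return mapping


def dispersed_map(width_mbs: int, height_mbs: int, num_groups: int) -> List[int]:
    """Scatter neighbouring MBs across different slice groups."""
    return [
        ((i % width_mbs) + ((i // width_mbs) * num_groups) // 2) % num_groups
        for i in range(width_mbs * height_mbs)
    ]


demo_interleaved = interleaved_map(8, [3, 1])
demo_dispersed = dispersed_map(4, 2, 2)
```

With a dispersed map, losing one slice group still leaves every missing MB surrounded by received neighbours, which is what makes concealment easier.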

Inter Prediction: Unlike earlier standards, H.264 supports a range of block sizes (from 16×16 down to 4×4) and fine subsample motion vectors (quarter-sample resolution in the luma component). Partitioning MBs into motion-compensated sub-blocks of varying size is known as tree-structured motion compensation.

Inter-Prediction in P Slices: Two-level segmentation of MBs. Each luma MB is divided into at most 4 partitions (as small as 8x8), and each 8x8 partition is divided into at most 4 sub-partitions (as small as 4x4), giving a maximum of 16 motion vectors for each MB. Each chroma block is partitioned in the same way as the luma component, except that the partition sizes have exactly half the horizontal and vertical resolution.

Tree-Structured MC: Why do we need different partition sizes? Motion vectors are expensive to code. Smooth areas suit large partitions; detailed areas suit small partitions. Note: motion vectors are coded using predictive coding.

Inter-Prediction Accuracy: Sub-pel motion estimation is used, with a motion vector granularity of ¼ pixel for luma and 1/8 pixel for chroma. Sample values at fractional positions are generated with a finite impulse response (FIR) filter. Position j can be computed in two ways… Actual computations are done with additions, bit shifts, and integer arithmetic.

Interpolation of luma half-pel positions: Uses a 6-tap Finite Impulse Response (FIR) filter with weights (1/32, −5/32, 5/8, 5/8, −5/32, 1/32), for example b = round((E − 5F + 20G + 20H − 5I + J) / 32). Note that un-rounded versions of h and m are used to generate j.

Interpolation of luma quarter-pel positions: Uses the average of the neighboring samples, for example a = round((G + b) / 2). Chroma predictions use bilinear interpolation.
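The two formulas for b and a can be sketched with integer additions and shifts only. This is a minimal Python illustration assuming 8-bit samples; the function names are hypothetical. The rounding is implemented by adding half of the divisor before shifting, and the result is clipped to the 8-bit range.

```python
def clip_pixel(value: int) -> int:
    """Clip to the 8-bit sample range (assumption: 8-bit video)."""
    return max(0, min(255, value))


def half_pel(e: int, f: int, g: int, h: int, i: int, j: int) -> int:
    """6-tap filter (1, -5, 20, 20, -5, 1) / 32 across six integer samples."""
    acc = e - 5 * f + 20 * g + 20 * h - 5 * i + j
    return clip_pixel((acc + 16) >> 5)  # +16 then >>5 is a rounded divide by 32


def quarter_pel(x: int, y: int) -> int:
    """Rounded average of two neighbouring integer or half-pel samples."""
    return (x + y + 1) >> 1


# Samples E..J along one row; b lies halfway between G and H.
demo_b = half_pel(10, 20, 30, 40, 50, 60)
# a lies halfway between G and b.
demo_a = quarter_pel(30, demo_b)
```

On a linear ramp such as this one, the half-pel value lands midway between G = 30 and H = 40, as expected from an interpolating filter.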

H.264 Encoder: In the figures, the reference picture is shown as the previous encoded picture F'n−1, but the prediction reference for each macroblock partition (in inter mode) may be chosen from a selection of past or future pictures (in display order) that have already been encoded, reconstructed and filtered. A filter is applied to reduce the effects of blocking distortion, and the reconstructed reference picture is created from a series of blocks F'n.

H.264 Decoder

Intra Prediction: Unlike previous standards (e.g. H.263 and MPEG-4 Visual), intra prediction in H.264 is conducted in the spatial (not the transform) domain. To prevent spatio-temporal error propagation, prediction can be restricted to intra-coded neighboring MBs.

Intra 4x4 Prediction Modes: Samples in a 4x4 block are predicted using 13 neighboring samples. There are 9 prediction modes: 1 DC and 8 directional. Sample D is used if E-H are not available. Used for luma prediction. Well suited for coding parts of a picture with significant detail.

Intra 4x4 Prediction Modes: DC: one value is used to predict the whole block. Directional example: in mode 0 (vertical), the samples above the block are copied down into the 4x4 block. E-H may be unavailable because of (1) decoding order or (2) lying outside the current slice. For the first macroblock coded in a picture, DC prediction is used.
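Three of the nine modes are simple enough to sketch in a few lines of Python. This is an illustration only: it assumes all neighbors are available, covers only vertical (mode 0), horizontal (mode 1) and DC (mode 2), and uses hypothetical function and parameter names. `above` holds the four samples over the block and `left` the four samples to its left.

```python
from typing import List

Block = List[List[int]]


def predict_4x4(mode: int, above: List[int], left: List[int]) -> Block:
    """Build a 4x4 prediction block from neighbouring samples."""
    if len(above) != 4 or len(left) != 4:
        raise ValueError("expected four samples above and four to the left")
    if mode == 0:  # vertical: copy the row above into every row
        return [list(above) for _ in range(4)]
    if mode == 1:  # horizontal: copy each left sample across its row
        return [[left[row]] * 4 for row in range(4)]
    if mode == 2:  # DC: one rounded mean value for the whole block
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0, 1 and 2 are sketched here")


demo_above = [10, 20, 30, 40]
demo_left = [10, 10, 10, 10]
demo_vertical = predict_4x4(0, demo_above, demo_left)
demo_horizontal = predict_4x4(1, demo_above, demo_left)
demo_dc = predict_4x4(2, demo_above, demo_left)
```

The remaining six directional modes extrapolate along diagonal directions and also use the above-right samples E-H, which is why their availability matters.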

Intra 16x16 Prediction Modes Used for luma prediction More suited for coding very smooth areas of a picture.

Deblocking Filter: Block-based coding produces annoying visible block artifacts. H.264 defines an adaptive in-loop deblocking filter that reduces blockiness. The filter smooths block edges, improving the appearance of decoded frames. The filter is applied after the inverse transform in the encoder (before reconstructing and storing the MB for future predictions) and in the decoder (before reconstructing and displaying the MB). The filtered image is used for motion-compensated prediction of future frames, and this can improve compression performance because the filtered image is often a more faithful reproduction of the original frame than a blocky, unfiltered image.

Deblocking Filter: Block edges are typically reconstructed with less accuracy than interior pixels. Basic idea: a relatively large absolute difference between samples near a block edge is probably a blocking artifact, whereas a very large difference more likely reflects the actual behaviour of the source picture. Whether to filter or not is determined using thresholds that depend on the quantization parameter (QP).
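The basic idea can be sketched as a per-edge decision in Python. Here p1, p0 are the two samples on one side of the block edge and q0, q1 the two on the other side. The function name is hypothetical, and `alpha` and `beta` stand in for the QP-dependent thresholds: the values used below are made up for illustration and are not taken from the standard's tables. Boundary-strength rules are left out.

```python
def should_filter_edge(p1: int, p0: int, q0: int, q1: int,
                       alpha: int, beta: int) -> bool:
    """Decide whether the samples across a block edge get smoothed.

    The edge is filtered only if the step across the edge is below alpha
    (so it is probably an artifact, not a real edge) and the signal on each
    side of the edge is smooth (differences below beta).
    """
    return (
        abs(p0 - q0) < alpha
        and abs(p1 - p0) < beta
        and abs(q1 - q0) < beta
    )


# Hypothetical thresholds for some mid-range QP.
ALPHA, BETA = 10, 3

# Small step across the edge: likely a blocking artifact, so filter.
demo_artifact = should_filter_edge(60, 62, 70, 71, ALPHA, BETA)
# Large step across the edge: likely real picture content, so leave it.
demo_real_edge = should_filter_edge(60, 62, 120, 121, ALPHA, BETA)
```

Because the thresholds grow with QP, coarser quantization (which produces stronger blocking) leads to more edges being filtered.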

Deblocking Filter: [Figure panels: original frame; reconstructed, QP=36 (no filter); reconstructed, QP=36 (with filter)]

Scanning Order of Residual Blocks within an MB If the MB is coded in 16×16 Intra mode, then the block labelled ‘−1’, containing the transformed DC coefficient of each 4 × 4 luma block, is transmitted first. Next, the luma residual blocks 0–15 are transmitted in the order shown (the DC coefficients in a macroblock coded in 16×16 Intra mode are not sent). Blocks 16 and 17 are sent, containing a 2×2 array of DC coefficients from the Cb and Cr chroma components respectively and finally, chroma residual blocks 18–25 (without DC coefficients) are sent.

Transform: H.264 uses three transforms depending on the type of residual data that is to be coded: a Hadamard transform for the 4×4 array of luma DC coefficients in Intra MBs predicted in 16×16 mode; a Hadamard transform for the 2×2 array of chroma DC coefficients (in any macroblock); and a DCT-based transform for all other 4×4 blocks in the residual data.

4x4 Integer Transform: Why a smaller transform? There is not too much residual to code. There is less noise around edges (ringing or mosquito noise); ringing is still there, but it is smaller and harder to see. It needs less computation and a shorter data type (16-bit). The transform is an approximation to the 4x4 DCT that uses only additions and shifts, so an exact inverse transform is possible and there is no decoding mismatch. It is desirable that only 16-bit multiplications are required, since many processors cannot perform 32-bit multiplications with a single instruction.

4x4 Integer Transform: Factorizing the matrix multiplication: the symbol ⊗ indicates that each element of (CXCᵀ) is multiplied by the scaling factor in the same position in matrix E (scalar multiplication rather than matrix multiplication). The constants a and b are as before and d is c/b (approximately 0.414).

4x4 Integer Transform: To simplify the implementation of the transform, d is approximated by 0.5. The 2nd and 4th rows of C and the 2nd and 4th columns of Cᵀ are scaled by a factor of 2, and E is scaled down to compensate. Thus, we avoid multiplications by one half in the 'core' transform CXCᵀ, which could result in loss of accuracy using integer arithmetic.
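After this scaling, the core matrix contains only the values ±1 and ±2. The Python sketch below applies the core transform to a 4x4 residual block with plain integer arithmetic. The matrix is the commonly published H.264 forward core matrix, which the slide text itself does not spell out, and the helper names are hypothetical. The scaling by E is left out because it is folded into quantization.

```python
from typing import List

Matrix = List[List[int]]

# Forward core transform matrix after approximating d by 0.5 and scaling
# rows 2 and 4 by a factor of 2 (assumed from the published standard).
CF: Matrix = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain integer matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def core_transform(x: Matrix) -> Matrix:
    """Compute the unscaled coefficients W = Cf * X * Cf^T."""
    return matmul(matmul(CF, x), transpose(CF))


# A flat residual block has energy only in the DC coefficient.
demo_flat = [[1] * 4 for _ in range(4)]
demo_coeffs = core_transform(demo_flat)
```

A real implementation would replace the multiplications by 2 with shifts and use a butterfly structure; the matrix form is kept here for readability.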

2nd Transform and Quantization Parameter: 2nd transform: Intra_16x16 and chroma modes are for smooth areas, so the DC coefficients are transformed again to cover the whole MB. The 2nd transform is needed to take advantage of spatial redundancy in flat areas. The quantization step (QS) is an exponential function of the quantization parameter (QP), which covers a wider range of QS: when QP increases by 6, QS doubles; when QP increases by 1, QS increases by about 12% and the bit rate decreases by roughly 12%. In addition to covering a wider range, the exponential quantization step also simplifies the problem of rate control.

Quantization The mechanisms of the forward and inverse quantisers are complicated by the requirements to avoid division and/or floating point arithmetic, and incorporate the post- and pre-scaling matrices Ef and Ei A total of 52 values of Qstep are supported by the standard, indexed by a Quantisation Parameter, QP. Qstep doubles in size for every increment of 6 in QP. The wide range of quantiser step sizes makes it possible for an encoder to control the tradeoff accurately and flexibly between bit rate and quality. The values of QP can be different for luma and chroma.
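The relation between QP and Qstep can be sketched in Python as a lookup of six base values followed by a doubling for every 6 steps of QP. The six base values are the ones commonly tabulated for H.264 (for example in Richardson's textbook); they are not given in the slide text, so treat them as an assumption. The function name is hypothetical.

```python
# Qstep for QP = 0..5 (assumed from commonly published H.264 tables).
BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp: int) -> float:
    """Quantiser step size for a QP in the supported range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in 0..51")
    # Every increment of 6 in QP doubles the step size.
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))


demo_steps = [qstep(qp) for qp in (0, 6, 12, 28, 51)]
```

The ratio between consecutive step sizes averages 2^(1/6), about 1.12, which is where the "QP increases by 1, step size increases by about 12%" rule of thumb comes from.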

Entropy Coding: Non-transform coefficients: an infinite-extent codeword table (Exp-Golomb). Transform coefficients: either Context-Adaptive Variable Length Coding (CAVLC), in which several VLC tables are switched depending on previously transmitted data, which is better than a single VLC table; or Context-Adaptive Binary Arithmetic Coding (CABAC), which offers more flexible symbol probability modeling than CAVLC, gives a 5–15% rate reduction, and is efficient because it is multiplication-free. Previous standards usually have separate VLC tables for different elements, because they tend to have different probability characteristics. In H.264/AVC, a simple and efficient infinite-extent codeword table is shared by all elements; for each element, only a mapping to this single codeword table is required. Arithmetic coding can assign a non-integer number of bits to each symbol.

Exp-Golomb Codes: For pre-defined code tables (e.g. pre-calculated Huffman-based coding), the encoder and decoder must store the table in some form. Exponential Golomb (Exp-Golomb) codes can be generated automatically on the fly if the input symbol is known. Exp-Golomb codes are VLCs with a regular construction.
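The regular construction is easy to show in Python: the codeword for an unsigned index k is the binary form of k + 1, preceded by as many zeros as there are bits after its leading 1. The function names below are hypothetical, and bit strings are used instead of a real bit writer to keep the sketch short. The signed mapping follows the usual H.264 convention (positive values map to odd indices).

```python
def ue_encode(code_num: int) -> str:
    """Exp-Golomb codeword for an unsigned index, as a string of bits."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    info = bin(code_num + 1)[2:]          # binary of k + 1, leading bit is 1
    return "0" * (len(info) - 1) + info   # zero prefix of the same length - 1


def ue_decode(bits: str) -> int:
    """Decode one codeword from the start of a bit string."""
    zeros = 0
    while bits[zeros] == "0":             # count the zero prefix
        zeros += 1
    return int(bits[zeros:2 * zeros + 1], 2) - 1


def signed_to_code_num(value: int) -> int:
    """Map a signed value to an unsigned index: 0, 1, -1, 2, -2, ..."""
    return 2 * value - 1 if value > 0 else -2 * value


demo_codewords = [ue_encode(k) for k in range(5)]
```

No table is stored anywhere: both directions are computed from the symbol itself, which is the point made on the slide.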

The Complete Picture

Outline Overview Network Abstraction Layer (NAL) Video Coding Layer (VCL) Profiles and Applications Performance Comparison Conclusions

H.264 Profiles: The Baseline Profile supports intra and inter-coding (using I-slices and P-slices) and entropy coding with context-adaptive variable-length codes (CAVLC). The Main Profile supports interlaced video, inter-coding using B-slices, inter-coding using weighted prediction, and entropy coding using context-based arithmetic coding (CABAC). The Extended Profile does not support interlaced video or CABAC but adds modes to enable efficient switching between coded bitstreams (SP- and SI-slices) and improved error resilience (Data Partitioning).

H.264 Profiles

Multi-frame Inter-Prediction in B Slices: Weighted average of 2 predictions. B-slices can be used as reference. Two reference picture lists are used. One out of four prediction methods is used for each partition: list 0, list 1, bi-predictive, or direct prediction (inferred from prior MBs). The MB can be coded in B_skip mode (similar to P_skip).

Partition Prediction in B Slices Each MB partition in an inter coded MB in a B slice may be predicted from one or two reference pictures, before or after the current picture in temporal order.
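For the bi-predictive case, the simplest combination of the two predictions is an equal-weight rounded average. The Python sketch below shows only that default case, under the assumption that both prediction blocks have already been fetched from the list 0 and list 1 reference pictures; explicit weights and offsets are not modeled, and the function name is hypothetical.

```python
from typing import List

Block = List[List[int]]


def bi_predict(pred_list0: Block, pred_list1: Block) -> Block:
    """Equal-weight bi-prediction: rounded average of two prediction blocks."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(row0, row1)]
        for row0, row1 in zip(pred_list0, pred_list1)
    ]


# Two tiny 2x2 prediction blocks, one per reference picture list.
demo_p0 = [[10, 20], [30, 40]]
demo_p1 = [[20, 21], [30, 45]]
demo_bi = bi_predict(demo_p0, demo_p1)
```

Averaging two references tends to cancel noise in either one, which is part of why B slices compress well.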

Potential Applications: Baseline (low latency): H.320 conversational video services; 3GPP conversational H.324/M services; H.323 with IP/RTP; 3GPP using IP/RTP and SIP; 3GPP streaming using IP/RTP and RTSP. Main (moderate latency): modified H.222.0/MPEG-2; broadcast via satellite, cable, terrestrial or DSL; DVD and VOD. Extended: streaming over wired Internet. Any (no requirement on latency): 3GPP MMS; video mail.

Outline Overview Network Abstraction Layer (NAL) Video Coding Layer (VCL) Profiles and Applications Performance Comparison Conclusions

Performance Comparison: [Figure: rate-distortion curves for the Tempete sequence, CIF, 30 Hz. Horizontal axis: bit rate (kbit/s); vertical axis: quality, Y-PSNR (dB). Curves: MPEG-2, H.263, MPEG-4 Visual, JVT/H.264/AVC.]

Outline Overview Network Abstraction Layer (NAL) Video Coding Layer (VCL) Profiles and Applications Performance Comparison Conclusions

Conclusions H.264 provides mechanisms for coding video that are optimised for compression efficiency and aim to meet the needs of practical multimedia communication applications. The success of a practical implementation of H.264 (or MPEG-4 Visual) depends on careful design of the CODEC and effective choices of coding parameters.

References:
Wiegand, T.; Sullivan, G.J.; Bjontegaard, G.; Luthra, A., "Overview of the H.264/AVC video coding standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560-576, July 2003.
Sullivan, G.J.; Wiegand, T., "Video Compression - From Concepts to the H.264/AVC Standard," Proceedings of the IEEE, vol. 93, no. 1, pp. 18-31, Jan. 2005.
Richardson, I., "H.264 and MPEG-4 Video Compression: Video Coding for Next Generation Multimedia," Wiley, 2003.
Richardson, I., "Video Codec Design: Developing Image and Video Compression Systems," Wiley, 2002.

Thank You...

Example (SAE = Sum of Absolute Errors)

Example (SAE = Sum of Absolute Errors): Uses 33 neighboring samples. Good for smooth/flat areas.
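SAE is simply the sum of absolute sample differences between the original block and a candidate prediction. The Python sketch below computes it and picks the candidate with the lowest value. Using SAE as the only selection criterion is an assumption for illustration (encoders are free to choose modes however they like), and the candidate blocks and names are made up.

```python
from typing import Dict, List

Block = List[List[int]]


def sae(block: Block, prediction: Block) -> int:
    """Sum of absolute errors between a block and its prediction."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block, prediction)
        for a, b in zip(row_a, row_b)
    )


def pick_best(block: Block, candidates: Dict[str, Block]) -> str:
    """Return the name of the candidate prediction with the lowest SAE."""
    return min(candidates, key=lambda name: sae(block, candidates[name]))


# A 2x2 toy block with vertical structure and two made-up predictions.
demo_block = [[10, 40], [12, 41]]
demo_candidates = {
    "vertical": [[10, 40], [10, 40]],
    "dc": [[25, 25], [25, 25]],
}
demo_best = pick_best(demo_block, demo_candidates)
```

A lower SAE means a smaller residual is left to transform and code, which is why it is a convenient measure for comparing prediction modes.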

H.264 Residual Transform Example: [Figure panels: (DCT), (Approximate DCT), (Difference)]

Entropy Coding

Macroblock Syntax Elements: Each MB contains a series of header elements (see table) and coded residual data.
Element | Description
mb_type | Whether the macroblock is coded in intra or inter (P or B) mode; determines macroblock partition size.
mb_pred | Determines intra prediction modes (intra MBs); determines list 0 and/or list 1 references and differentially coded motion vectors for each macroblock partition (inter MBs, except for inter MBs with 8×8 MB partition size).
sub_mb_pred | (Inter MBs with 8×8 MB partition size only) Determines sub-MB partition size for each sub-MB; list 0 and/or list 1 references for each MB partition.
coded_block_pattern | Which 8×8 blocks (luma and chroma) contain coded transform coefficients.
mb_qp_delta | Changes the quantiser parameter.
residual | Coded transform coefficients corresponding to the residual image samples after prediction.

Design Features Highlights: Features for enhancement of prediction: directional spatial prediction for intra coding; variable block-size motion compensation with small block size; quarter-sample-accurate motion compensation; motion vectors over picture boundaries; multiple reference picture motion compensation; decoupling of referencing order from display order; decoupling of picture representation methods from picture referencing capability; weighted prediction; improved "skipped" and "direct" motion inference; in-the-loop deblocking filtering.

Design Features Highlights: Features for improved coding efficiency: small block-size transform; exact-match inverse transform; short word-length transform; hierarchical block transform; arithmetic entropy coding; context-adaptive entropy coding.

Design Features Highlights: Features for robustness to data errors/losses: parameter set structure; NAL unit syntax structure; flexible slice size; flexible macroblock ordering (FMO); arbitrary slice ordering (ASO); redundant pictures; data partitioning; SP/SI synchronization/switching pictures.