Concepts of Multimedia Processing and Transmission

Concepts of Multimedia Processing and Transmission GMU - IT 481, Spring 2006 1/23/2006 Concepts of Multimedia Processing and Transmission IT 481, Lecture #7 Dennis McCaughey, Ph.D. 23 October, 2006 (C) Hung Nguyen, 2006

Broadcast Environment IT 481, Fall 2006 08/28/2006

H.264 Overview
- 1997: The ITU-T Video Coding Experts Group (VCEG) began work
- 2001: ISO/MPEG joined the ITU-T to form the Joint Video Team (JVT), which took over the H.264 project
- The JVT's objective was a single video coding standard that would become both a new part of the MPEG-4 family of standards and a new ITU-T Recommendation (H.264)

History

H.264 Advantages
- Up to 50% savings in bit rate compared to H.263+ or MPEG-4 Simple Profile
- High quality video: H.264 offers consistently good video quality at high and low bit rates
- Error resilience: H.264 provides the tools necessary to deal with packet loss in packet networks and bit errors in error-prone wireless networks
- Network friendliness: through the Network Abstraction Layer (NAL), H.264 bit streams can be easily transported over different networks

Relationship to Other Standards
- Identical specifications have been approved in both ITU-T / VCEG and ISO/IEC / MPEG
- In ITU-T / VCEG this is a new and separate standard
  - ITU-T Recommendation H.264
  - ITU-T Systems (H.32x) will be modified to support it
- In ISO/IEC / MPEG this is a new "part" in the MPEG-4 suite
  - Separate codec design from prior MPEG-4 Visual
  - New Part 10 called "Advanced Video Coding" (AVC; similar to the position of "AAC" in MPEG-2 as a separate codec)
  - MPEG-4 Systems / File Format has been modified to support it
- H.222.0 | MPEG-2 Systems also modified to support it
- IETF finalizing RTP payload packetization

Applications
- Entertainment Video (1-8+ Mbps, higher latency)
  - Broadcast / Satellite / Cable / DVD / VoD / FS-VDSL / …
  - DVB/ATSC/SCTE, DVD Forum, DSL Forum
- Conversational Services (usu. <1 Mbps, low latency)
  - Circuit-switched
    - H.320 Conversational
    - H.324/M
    - 3GPP Conversational H.324/M
  - Packet-switched
    - H.323 Conversational Internet/best effort IP/RTP
    - 3GPP Conversational IP/RTP/SIP
- Streaming Services (usu. lower bit rate, higher latency)
  - 3GPP Streaming IP/RTP/RTSP
  - Streaming IP/RTP/RTSP (without TCP fallback)
- Other Services
  - 3GPP Multimedia Messaging Services

Profiles and Levels
- Many standards contain different configurations of capabilities, often based on "profiles" and "levels"
  - A profile is usually a set of algorithmic features
  - A level is usually a degree of capability (e.g., resolution or speed of decoding)
- H.264/AVC has three profiles
  - Baseline (lower capability plus error resilience, e.g., videoconferencing, mobile video)
  - Main (high compression quality, e.g., broadcast)
  - Extended (added features for efficient streaming)

Grouping of Capabilities into Profiles
- Three profiles now: Baseline, Main, and Extended
- Baseline (e.g., videoconferencing and wireless)
  - I and P picture types (not B)
  - In-loop deblocking filter
  - 1/4-sample motion compensation
  - Tree-structured motion segmentation down to 4x4 block size
  - VLC-based entropy coding (CAVLC)
  - Some enhanced error resilience features
    - Flexible macroblock ordering / arbitrary slice ordering
    - Redundant slices
  - Note: no support for interlaced video in Baseline

Non-Baseline Profiles
- Main Profile (esp. broadcast/entertainment)
  - All Baseline features except the enhanced error resilience features
  - B pictures
  - Adaptive weighting for B and P picture prediction
  - Picture and MB-level frame/field switching
  - CABAC
  - Note: Main is not exactly a superset of Baseline
- Extended Profile (esp. streaming/Internet)
  - All Baseline features
  - More error resilience: data partitioning
  - SP/SI switching pictures
  - Note: Extended is a superset of Baseline (but not of Main)

H.264 Encoder

H.264 Main Stages
- Dividing each video frame into blocks of pixels so that processing of the video frame can be conducted at the block level.
- Exploiting the spatial redundancies that exist within the video frame by coding some of the original blocks through transform, quantization and entropy coding.
- Exploiting the temporal dependencies that exist between blocks in successive frames so that only changes between successive frames need to be encoded. For any given block, a search is performed in one or more previously coded frames to determine motion vectors, which the encoder and decoder then use to predict the block.
- Exploiting any remaining spatial redundancies within the video frame by coding residual blocks (the difference between the original blocks and the corresponding predicted blocks), again through transform, quantization and entropy coding.
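
The motion search in the third stage can be sketched as an exhaustive block-matching search. This is only an illustration, not the search an H.264 encoder is required to use (the standard does not mandate a search strategy). The function names, the sum-of-absolute-differences (SAD) cost, the 4x4 block size and the +/-2 pixel search window are all assumptions chosen for brevity.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=4, search=2):
    """Try every integer displacement within +/-search that keeps the candidate
    block inside the reference frame; return (motion_vector, cost)."""
    h, w = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + n <= w and
                      0 <= by + dy and by + dy + n <= h)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def residual(cur, ref, bx, by, mv, n=4):
    """Difference between the original block and its motion-compensated prediction."""
    dx, dy = mv
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(n)] for y in range(n)]


# Toy frames: the current frame is the reference shifted right by one pixel.
ref = [[(3 * x + 7 * y) % 50 for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]
mv, cost = full_search(cur, ref, bx=2, by=2)
res = residual(cur, ref, 2, 2, mv)
```

When the prediction is good, the residual block is close to zero, which is why coding the residual takes fewer bits than coding the original block.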

H.264 Features
- Enhanced motion compensation
  - Multiple block sizes and shapes (MPEG-2 motion compensation uses only 16x16 macroblocks, or 16x8 for field prediction)
  - Higher resolution: 1/4-pixel motion estimation
  - Multiple reference frame selection and bi-directional mode selection
- Employs an integer-based DCT that does not have the mismatch problem in the inverse transform
- Improved entropy coding

Intra Prediction & Coding
- Intra coding refers to the case where only spatial redundancies within a video picture are exploited.
  - The resulting frame is referred to as an I-picture. In earlier standards, I-pictures are typically encoded by directly applying the transform to the different macroblocks in the frame.
  - Encoded I-pictures are large, since a large amount of information is usually present in the frame and no temporal information is used in the encoding process.
- To increase the efficiency of intra coding, H.264 exploits the spatial correlation between adjacent macroblocks in a given frame, based on the observation that adjacent macroblocks tend to have similar properties.
- The first step in encoding a given macroblock is to predict it from the surrounding macroblocks (typically the ones located above and to the left, since those macroblocks have already been encoded).
- The difference between the actual macroblock and its prediction is then coded, which requires fewer bits than coding the macroblock directly.

H.264 4x4 Intra Prediction Modes
- H.264 offers 9 modes for prediction of 4x4 luminance blocks: DC prediction (Mode 2) and 8 directional modes. The modes are labeled 0 through 8 in Figure 4.
- In the figure, pixels A to M from neighboring blocks have already been encoded and may be used for prediction.

Examples
- Mode 0 (Vertical Prediction)
  - a, e, i and m are equal to A,
  - b, f, j and n are equal to B,
  - c, g, k and o are equal to C, and
  - d, h, l and p are equal to D.
- Mode 3 (Diagonal-Down-Left)
  - a is equal to (A+2B+C+2)/4,
  - b, e are equal to (B+2C+D+2)/4,
  - c, f, i are equal to (C+2D+E+2)/4,
  - d, g, j, m are equal to (D+2E+F+2)/4,
  - h, k, n are equal to (E+2F+G+2)/4,
  - l, o are equal to (F+2G+H+2)/4, and
  - p is equal to (G+3H+2)/4.
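
The two example modes can be written out as a short sketch. It assumes the usual raster labeling of the 4x4 block (row 0 is a b c d, row 3 is m n o p), that `above` holds the eight reconstructed pixels A..H above and above-right of the block, and that the division by 4 is integer division with the "+2" acting as a rounding offset. The function names are illustrative only.

```python
def predict_vertical(above):
    """Mode 0: every row repeats the four pixels A..D directly above the block."""
    a_to_d = list(above[:4])
    return [list(a_to_d) for _ in range(4)]


def predict_diagonal_down_left(above):
    """Mode 3: each pixel is a (1, 2, 1)/4 weighted average of three
    consecutive pixels from A..H, selected by the anti-diagonal x + y."""
    p = list(above[:8])  # A..H
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            i = x + y
            if i == 6:
                # Bottom-right pixel p: no pixel beyond H, so H is weighted 3.
                pred[y][x] = (p[6] + 3 * p[7] + 2) // 4
            else:
                pred[y][x] = (p[i] + 2 * p[i + 1] + p[i + 2] + 2) // 4
    return pred


above = [10, 20, 30, 40, 50, 60, 70, 80]   # A..H, made-up sample values
vertical = predict_vertical(above)
diagonal = predict_diagonal_down_left(above)
```

All pixels on the same anti-diagonal (for example b and e) receive the same value, which matches the groupings listed in the slide.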

Other Intra Prediction Modes
- 8x8 Intra Prediction Modes
  - Luminance blocks: use all nine prediction modes
  - Chrominance blocks: use four prediction modes (DC, Vertical, Horizontal and Planar)
- 16x16 Intra Prediction Modes
  - For regions with less spatial detail (i.e., flat regions)
  - One of four prediction modes (DC, Vertical, Horizontal and Planar) is chosen for the prediction of the entire luminance component of the macroblock
- The prediction mode must be encoded for each block.
  - The mode for each block is efficiently coded by assigning shorter symbols to more likely modes
  - The probability of each mode is determined based on the modes used for coding the surrounding blocks
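
One way to make "shorter symbols for more likely modes" concrete is the most-probable-mode scheme for 4x4 luminance blocks. The sketch below assumes the rule used in the final H.264 standard: the predicted mode is the smaller of the left and above neighbors' mode numbers, with DC (Mode 2) as the fallback when a neighbor is unavailable, and the choice is signaled with a 1-bit flag or with the flag plus a 3-bit remainder. The slide itself does not spell this out, so treat the details and the function names as assumptions.

```python
DC_MODE = 2


def most_probable_mode(left_mode, above_mode):
    """Predicted mode for the current 4x4 block from its two neighbors.
    None means the neighbor is unavailable (e.g. outside the picture)."""
    if left_mode is None or above_mode is None:
        return DC_MODE
    return min(left_mode, above_mode)


def signal_mode(mode, left_mode, above_mode):
    """Return (flag, remainder). flag == 1 means 'use the predicted mode';
    otherwise remainder in 0..7 selects one of the other eight modes."""
    mpm = most_probable_mode(left_mode, above_mode)
    if mode == mpm:
        return 1, None
    return 0, (mode if mode < mpm else mode - 1)


def recover_mode(flag, remainder, left_mode, above_mode):
    """Decoder side: invert signal_mode."""
    mpm = most_probable_mode(left_mode, above_mode)
    if flag == 1:
        return mpm
    return remainder if remainder < mpm else remainder + 1


def mode_bits(mode, left_mode, above_mode):
    """1 bit when the prediction is right, 1 + 3 bits otherwise."""
    flag, _ = signal_mode(mode, left_mode, above_mode)
    return 1 if flag == 1 else 4
```

Because neighboring blocks tend to share a direction, the 1-bit case is the common one.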

Inter Prediction and Coding
- Inter prediction and coding is based on using motion estimation and compensation to take advantage of the temporal redundancies that exist between successive frames.
- Motion estimation in H.264 supports most of the key features adopted in earlier video standards, but with improved efficiency.
- H.264 supports P-pictures (with single and multiple reference frames) and B-pictures, and a new inter-stream transitional picture called an SP-picture.
  - The inclusion of SP-pictures in a bit stream enables efficient switching between bit streams with similar content encoded at different bit rates, as well as random access and fast playback modes.
- Four main motion estimation features are used in H.264:
  - (1) the use of various block sizes and shapes,
  - (2) the use of high-precision sub-pixel motion vectors,
  - (3) the use of multiple reference frames, and
  - (4) the use of de-blocking filters in the prediction loop

Block Sizes
- The availability of smaller motion compensation blocks improves prediction.
- The small blocks improve the ability of the model to handle fine motion detail, and they result in better subjective viewing quality by avoiding large blocking artifacts.

Tree Structure
- H.264 allows a combination of 4x8, 8x4, or 4x4 sub-blocks within an 8x8 sub-block, as shown for a 16x16 macroblock.
- Zig-zag scan pattern

Motion Estimation Accuracy
- The prediction capability of the motion compensation algorithm in H.264 is further improved by allowing motion vectors to be determined with higher levels of spatial accuracy than in existing standards.
- Quarter-pixel accurate motion compensation is the lowest-accuracy form of motion compensation in H.264, in contrast with prior standards based primarily on half-pixel accuracy, where quarter-pixel accuracy is available only in the newest version of MPEG-4.
- 1/4-pixel spatial accuracy can yield as much as 20% in bit rate savings compared to integer-pixel spatial accuracy.

Multiple Reference Picture Selection
- The H.264 standard offers the option of using multiple reference frames in inter picture coding.
  - This results in better subjective video quality and more efficient coding of the video frame under consideration.
  - Multiple reference frames may also help make the H.264 bit stream more error resilient.
- The cost is additional processing delay and higher memory requirements at both the encoder and decoder.
- Using 5 reference frames for prediction can yield 5-10% in bit rate savings compared to using only one reference frame.

De-Blocking (Loop) Filter
- H.264 specifies the use of an adaptive de-blocking filter that operates on the horizontal and vertical block edges within the prediction loop.
  - It removes artifacts caused by block prediction errors.
  - The filtering is generally based on 4x4 block boundaries, in which two pixels on either side of the boundary may be updated using a different filter.
- The rules for applying the de-blocking filter are intricate.
  - Its use is optional for each slice (loosely defined as an integer number of macroblocks).
- The de-blocking filter yields a substantial improvement in subjective quality, which often more than justifies the increase in complexity.

Integer Transform
- Prediction error blocks resulting from either intra prediction or inter prediction are then transformed using a new integer DCT.
- H.264 is unique in that it employs a purely integer spatial transform.
  - It is an approximation of the DCT, as opposed to the usual floating-point DCT specified with rounding-error tolerances, as used in earlier standards.
- H.264 allows the use of both 4x4 and 8x8 transform block sizes.
  - The small block shape helps reduce blocking and ringing artifacts.
  - The precise integer specification eliminates any mismatch between the encoder and decoder in the inverse transform.
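
A minimal sketch of the 4x4 forward transform shows why no rounding mismatch can arise: every operation is an integer add, subtract or multiply by a small constant. The matrix `CF` below is the 4x4 core transform matrix commonly given for H.264; it does not appear in the slides, and the scaling that the standard folds into quantization is deliberately left out here. The helper names are illustrative.

```python
CF = [[1,  1,  1,  1],
      [2,  1, -1, -2],
      [1, -1, -1,  1],
      [1, -2,  2, -1]]


def matmul(a, b):
    """Plain 4x4 integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Y = CF * X * CF^T, computed entirely in integer arithmetic."""
    return matmul(matmul(CF, block), transpose(CF))


flat_block = [[5] * 4 for _ in range(4)]        # a residual block with no detail
coeffs = forward_core_transform(flat_block)     # only the DC coefficient is non-zero
row_norms = matmul(CF, transpose(CF))           # diagonal: rows are orthogonal
```

The rows of `CF` are orthogonal but not of equal norm (the diagonal of `row_norms` is 4, 10, 4, 10), which is why a per-coefficient scaling step is still needed somewhere in the pipeline.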

Quantization & Transform Coefficient Scanning
- The quantization step provides a significant portion of the data compression.
- In H.264, the transform coefficients are quantized using scalar quantization.
- Fifty-two different quantization step sizes can be chosen on a macroblock basis.
  - This is more than in prior standards (H.263 supports thirty-one, for example).
- In H.264 the step sizes increase at a compounding rate of approximately 12.5%, rather than by a constant increment.
- The fidelity of chrominance components is improved by using finer quantization step sizes than those used for the luminance.
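
The compounding step sizes can be sketched with a geometric formula. This assumes the step size doubles for every increase of 6 in the quantization parameter (QP), so each step is larger than the previous one by a factor of 2^(1/6), roughly 12%, which is close to the figure quoted above. The base step of 0.625 at QP 0 is also an assumption, and the standard's actual table uses rounded values near this curve.

```python
def qstep(qp, base=0.625):
    """Approximate quantization step size for QP in 0..51 (52 values)."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return base * 2 ** (qp / 6)


def quantize(coeff, qp):
    """Scalar quantization: divide by the step size and round to nearest."""
    level = int(abs(coeff) / qstep(qp) + 0.5)
    return level if coeff >= 0 else -level


def dequantize(level, qp):
    """Reconstruction: multiply the level by the step size."""
    return level * qstep(qp)


steps = [qstep(qp) for qp in range(52)]
growth = steps[1] / steps[0]          # per-step growth factor, 2 ** (1 / 6)
level = quantize(100, 24)             # step size at QP 24 is 0.625 * 16 = 10
```

A geometric scale gives fine control at low QP (high quality) and coarse control at high QP, where a fixed increment would barely change the bit rate.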

Entropy Coding
- Entropy coding is based on assigning shorter codewords to symbols with higher probabilities of occurrence, and longer codewords to symbols that occur less frequently.
- Parameters to be entropy coded include:
  - Transform coefficients for the residual data,
  - Motion vectors, and
  - Other encoder information.
- Two types of entropy coding have been adopted:
  - Variable-Length Coding (VLC)
  - Context-Based Adaptive Binary Arithmetic Coding (CABAC)

Variable Length Encoding
- In some video coding standards, symbols and the associated codewords are organized in look-up tables, referred to as VLC tables, which are stored at both the encoder and decoder.
- In H.263, a number of VLC tables are used, depending on the type of data under consideration (e.g., transform coefficients, motion vectors).
- H.264 offers a single Universal VLC (UVLC) table that is used for entropy coding of all symbols in the encoder except the transform coefficients.
  - This is simple.
  - The disadvantage is that a single table is usually derived using a static probability distribution model, which ignores the correlations between the encoder symbols.
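
A single "universal" table does not have to be stored explicitly if its codewords follow a regular pattern. The sketch below assumes the UVLC codewords have the Exp-Golomb structure used in the final H.264 standard: a run of leading zeros, then the binary value of the symbol index plus one. The function names and the use of bit strings instead of packed bits are simplifications for illustration.

```python
def ue_encode(value):
    """Unsigned Exp-Golomb codeword for a non-negative integer, as a bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    bits = bin(value + 1)[2:]             # binary of value + 1, no '0b' prefix
    return "0" * (len(bits) - 1) + bits   # prefix zeros announce the length


def ue_decode(bitstring, pos=0):
    """Decode one codeword starting at pos; return (value, next_pos)."""
    zeros = 0
    while bitstring[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bitstring[pos + zeros:end], 2) - 1, end


def signed_to_index(v):
    """Map signed values (e.g. motion vector differences) to code indices:
    0, 1, -1, 2, -2, ... -> 0, 1, 2, 3, 4, ..."""
    return 2 * v - 1 if v > 0 else -2 * v


# Encode a short symbol sequence back to back, then parse it again.
symbols = [0, 3, 1, 7]
stream = "".join(ue_encode(s) for s in symbols)
decoded, pos = [], 0
while pos < len(stream):
    value, pos = ue_decode(stream, pos)
    decoded.append(value)
```

Small indices get short codewords regardless of what kind of symbol is being coded, which is exactly the static-model limitation noted above.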

VLC-Transform Coefficients
- In H.264, the transform coefficients are coded using Context Adaptive Variable Length Coding (CAVLC).
- CAVLC is designed to take advantage of several characteristics of quantized 4x4 blocks.
  - First, non-zero coefficients at the end of the zigzag scan are often equal to +/-1. CAVLC encodes the number of these coefficients ("trailing 1s") in a compact way.
  - Second, CAVLC employs run-level coding to efficiently represent the strings of zeros in a quantized 4x4 block.
  - Moreover, the numbers of non-zero coefficients in neighboring blocks are usually correlated. The number of non-zero coefficients is encoded using a look-up table that depends on the numbers of non-zero coefficients in neighboring blocks.
  - Finally, the magnitude (level) of non-zero coefficients tends to be larger near the DC coefficient and smaller toward the high-frequency coefficients. CAVLC takes advantage of this by making the choice of the VLC look-up table for the level adaptive, so that the choice depends on the recently coded levels.
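
The block statistics that CAVLC builds on can be extracted with a few lines. This sketch only computes the quantities (total non-zero coefficients, trailing +/-1s, zeros before the last non-zero coefficient); it does not produce CAVLC codewords. The 4x4 zigzag order and the cap of three trailing ones are taken from the H.264 standard as commonly described, not from the slides, and the sample block is made up.

```python
# Raster indices of a 4x4 block visited in zigzag (frame scan) order.
ZIGZAG_4X4 = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]


def zigzag_scan(block):
    """Flatten a 4x4 block into a 16-element list in zigzag order."""
    flat = [v for row in block for v in row]
    return [flat[i] for i in ZIGZAG_4X4]


def cavlc_statistics(block):
    """Return the block-level quantities that CAVLC encodes."""
    scanned = zigzag_scan(block)
    nonzero_pos = [i for i, v in enumerate(scanned) if v != 0]
    trailing_ones = 0
    # Walk back from the highest-frequency non-zero coefficient.
    for i in reversed(nonzero_pos):
        if abs(scanned[i]) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break
    last = nonzero_pos[-1] if nonzero_pos else -1
    return {
        "scanned": scanned,
        "total_coeffs": len(nonzero_pos),
        "trailing_ones": trailing_ones,
        "total_zeros": (last + 1) - len(nonzero_pos),
    }


block = [[0,  3, -1, 0],
         [0, -1,  1, 0],
         [1,  0,  0, 0],
         [0,  0,  0, 0]]
stats = cavlc_statistics(block)
```

For this block the scan ends in a long run of zeros and three +/-1 values, which is the typical shape CAVLC is tuned for.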

Context-Based Adaptive Binary Arithmetic Coding (CABAC)
- Arithmetic coding makes use of a probability model at both the encoder and decoder for all the syntax elements
  - Transform coefficients
  - Motion vectors
- A process called context modeling increases the coding efficiency of arithmetic coding: the underlying probability model is adapted to the changing statistics within a video frame.
  - Context modeling provides estimates of conditional probabilities of the coding symbols.
- Suitable context models allow inter-symbol redundancy to be exploited
  - by switching between different probability models according to already coded symbols in the neighborhood of the symbol currently being encoded.

Context Models
- Different models are often maintained for each syntax element (e.g., motion vectors and transform coefficients have different models).
- If a given symbol is non-binary valued, it is mapped onto a sequence of binary decisions, so-called bins.
- The actual binarization is done according to a given binary tree; the UVLC binary tree is often used.
- Each binary decision is then encoded with the arithmetic encoder using the new probability estimates, which have been updated during the previous context modeling stage.
- After encoding each bin, the probability estimate for the binary symbol that was just encoded is adjusted upward. Hence, the model keeps track of the actual statistics.
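
The "adjust upward after each bin" idea can be illustrated with a simple counting model. This is a stand-in, not CABAC's estimator: the real coder uses a table-driven finite-state probability estimator and an arithmetic coding engine, neither of which is reproduced here. The class name, the initial counts of one per symbol, and the use of the ideal code length (minus log2 of the probability) in place of an actual arithmetic coder are all assumptions.

```python
import math


class AdaptiveBinaryModel:
    """Frequency-count probability estimate for one context."""

    def __init__(self):
        self.counts = [1, 1]          # one pseudo-count per bin value

    def probability(self, bit):
        return self.counts[bit] / sum(self.counts)

    def cost_bits(self, bit):
        """Ideal arithmetic-coding cost of this bin under the current estimate."""
        return -math.log2(self.probability(bit))

    def update(self, bit):
        """Raise the estimate for the bin value that was just coded."""
        self.counts[bit] += 1


def ideal_code_length(bins, model=None):
    """Total ideal cost in bits of coding `bins` while adapting the model."""
    if model is None:
        model = AdaptiveBinaryModel()
    total = 0.0
    for b in bins:
        total += model.cost_bits(b)
        model.update(b)
    return total


skewed = [0] * 15 + [1]               # mostly zeros, as in a well-predicted context
skewed_cost = ideal_code_length(skewed)
balanced_cost = ideal_code_length([0, 1] * 8)
```

A skewed context costs far fewer bits than one bit per bin, while a balanced one costs about one bit per bin. Keeping a separate model per syntax element, as described above, stops the skewed statistics of one element from being diluted by another.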

Example

Reference
- "Emerging H.264 Standard: Overview and TMS320C64x Digital Media Platform Implementation," UB Video Inc. White Paper, 2002