Overview of the High Efficiency Video Coding (HEVC) Standard

G.J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand
IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, Dec. 2012
Presented by Gaewon Kim (Ph.D. course) and Prof. Changhoon Yim
Department of Internet and Multimedia Engineering, Konkuk University

Typical HEVC video encoder

HEVC Video Coding Layer
Coding tree unit (CTU) and coding tree block (CTB)
- A CTU consists of one luma CTB and two chroma CTBs.
- For an L×L luma CTB, L can be 16, 32, or 64.
Coding unit (CU) and coding block (CB)
- The root of the quadtree is the CTU, which is partitioned into CUs recursively.
- A CU consists of one luma CB and two chroma CBs.
- Each CU has an associated partitioning into prediction units (PUs) and a tree of transform units (TUs).

HEVC Video Coding Layer
Prediction unit (PU) and prediction block (PB)
- A PU partitioning structure has its root at the CU level.
- PB sizes range from 64×64 down to 4×4.
Transform unit (TU) and transform block (TB)
- A TU tree structure has its root at the CU level.
- A luma CB may be identical to the luma TB or may be split into smaller luma TBs.
- TB sizes can be 4×4, 8×8, 16×16, and 32×32.

HEVC Video Coding Layer
Motion compensation
- Quarter-sample precision is used for the MVs.
- 7-tap or 8-tap filters are used for interpolation of fractional-sample positions.
Intrapicture prediction
- 33 directional modes, planar (surface fitting), and DC (flat).
- Modes are encoded by deriving most probable modes (MPMs) from those of previously decoded neighboring PBs.

HEVC Video Coding Layer
Quantization control
- Uniform reconstruction quantization (URQ).
Entropy coding
- Context-adaptive binary arithmetic coding (CABAC).
In-loop deblocking filtering
- Similar to the one in H.264/AVC, but more friendly to parallel processing.
Sample adaptive offset (SAO)
- Nonlinear amplitude mapping for better reconstruction of amplitudes, based on histogram analysis.

HEVC Video Coding Techniques
HEVC is a block-based hybrid video coding design:
- Interpicture prediction exploits temporal statistical dependences.
- Intrapicture prediction exploits spatial statistical dependences.
- Transform coding is applied to the prediction residual.

Sampled Representation of Pictures
HEVC uses the YCbCr color space with 4:2:0 subsampling.
- Y component: luminance (luma), representing brightness (gray level).
- Cb and Cr components: chrominance (chroma), the color differences from gray toward blue and red.
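To make the 4:2:0 sampling concrete, the following sketch (a plain 2×2 averaging downsample, used here purely for illustration and not the filter mandated by any standard) halves the chroma planes in both dimensions:

    import numpy as np

    def to_420(y, cb, cr):
        """Downsample full-resolution Cb/Cr planes by averaging each 2x2 block,
        giving the half-resolution chroma planes used in 4:2:0 sampling."""
        def half(c):
            c = c.astype(np.int32)
            h, w = c.shape
            return (c[0:h:2, 0:w:2] + c[1:h:2, 0:w:2] +
                    c[0:h:2, 1:w:2] + c[1:h:2, 1:w:2]) // 4
        return y, half(cb), half(cr)

For a 64×64 luma block, the corresponding 4:2:0 chroma blocks are therefore 32×32, which matches the CTB and CB size relationship discussed below.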

Coding Tree Unit (CTU)
A picture is partitioned into CTUs; the CTU is the basic processing unit.
- A CTU contains one luma CTB and two chroma CTBs.
- A luma CTB covers L × L samples; the two chroma CTBs each cover L/2 × L/2 samples.
HEVC supports variable-size CTBs.
- The value of L may be 16, 32, or 64, selected according to the needs of the encoder in terms of memory and computational requirements.
- A large CTB is beneficial when encoding high-resolution video content.

Division of the CTB into CBs
- A CTB can be used directly as a CB or can be partitioned into multiple CBs using a quadtree structure.
- The quadtree splitting process can be iterated until the luma CB reaches the minimum allowed luma CB size (8 × 8 or larger).
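The quadtree decision can be pictured as a recursive cost comparison. The sketch below is a minimal illustration assuming a hypothetical rd_cost(x, y, size) callable that returns the cost of coding the block without further splitting; real encoders use much more elaborate rate-distortion decisions.

    def partition_ctb(x, y, size, min_cb, rd_cost):
        """Decide recursively whether to keep an (x, y, size) block as one CB
        or split it into four quadrants; returns (cost, list of leaf CBs)."""
        whole = rd_cost(x, y, size)
        if size == min_cb:                          # minimum allowed luma CB size
            return whole, [(x, y, size)]
        half = size // 2
        split_cost, leaves = 0, []
        for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
            c, sub = partition_ctb(x + dx, y + dy, half, min_cb, rd_cost)
            split_cost += c
            leaves += sub
        if split_cost < whole:
            return split_cost, leaves               # splitting pays off
        return whole, [(x, y, size)]                # keep the block as a single CB

    # Example: partition a 64x64 CTB with a toy cost function.
    cost, cbs = partition_ctb(0, 0, 64, 8, lambda x, y, s: s * s)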

PBs and PUs
The prediction mode for the CU is signaled as being intra or inter.
When it is signaled as intra:
- The PB (prediction block) size is the same as the CB size for all blocks, except that the CB may be split into four PB quadrants when the CB size is equal to the smallest allowed CB size.
- This allows intrapicture mode selection for blocks as small as 4 × 4.

PBs and PUs
When the prediction mode is signaled as inter:
- It is specified whether the CB is split into one, two, or four PBs.
- The splitting into four PBs is allowed only when the CB size is equal to the smallest allowed CB size.
- Each interpicture-predicted PB is assigned one or two motion vectors and reference picture indices.

PBs and PUs
Figure: PB partition shapes for intrapicture and interpicture prediction, including asymmetric motion partitioning (AMP).

Tree-Structured Partitioning into Transform Blocks and Units
- For residual coding, a CB can be recursively partitioned into transform blocks (TBs).
- The partitioning is signaled by a residual quadtree.

Tree-Structured Partitioning into Transform Blocks and Units
Figure: subdivision of a CTB into CBs and TBs. Solid lines: CB boundaries; dotted lines: TB boundaries.

Slices and Tiles
- A slice is a sequence of CTUs that are processed in raster-scan order.
- The main purpose of slices is resynchronization after data losses.

Slices and Tiles
Slices are self-contained:
- A slice can be correctly decoded without the use of any data from other slices in the same picture.
- This means that prediction within the picture is not performed across slice boundaries, except for in-loop filtering.

Slices and Tiles
Each slice can be coded using a different coding type:
- I slice: all CUs are coded using only intrapicture prediction.
- P slice: some CUs may also be coded using interpicture prediction with uniprediction.
- B slice: some CUs may also be coded using interpicture prediction with biprediction.

Slices and Tiles
- Tiles are self-contained and independently decodable.
- The main purpose of tiles is to enable the use of parallel processing architectures for encoding and decoding.

Slices and Tiles
Wavefront parallel processing (WPP)
- A slice is divided into rows of CTUs.
- This supports parallel processing of CTU rows by using several processing threads in the encoder or decoder.

Intrapicture Prediction
- Planar prediction (Intra_Planar): an amplitude surface with horizontal and vertical slopes derived from the boundaries.
- DC prediction (Intra_DC): a flat surface with a value matching the mean value of the boundary samples (a minimal sketch follows below).
- Directional prediction (Intra_Angular): 33 different prediction directions, defined for square TB sizes from 4×4 up to 32×32.
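To make the DC mode concrete, here is a minimal sketch; the extra boundary smoothing that HEVC applies to small luma blocks in DC mode is omitted.

    import numpy as np

    def intra_dc(left, above, n):
        """Intra_DC prediction: fill an n x n block with the (rounded) mean of
        the n reconstructed samples to the left and the n samples above."""
        dc = (int(np.sum(left[:n])) + int(np.sum(above[:n])) + n) // (2 * n)
        return np.full((n, n), dc, dtype=np.int32)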

Intrapicture Prediction
Fig. 6. Modes and directional orientations for intrapicture prediction.

Intrapicture Prediction: PB Partitioning
- When the CB size is larger than the minimum CB size, the PB size is equal to the CB size.
- When the CB size is equal to the minimum CB size, an intrapicture-predicted CB may have two types of PB partitions:
  PART_2N×2N: no split.
  PART_N×N: split into four equal-sized PBs.

Intrapicture Prediction: Intra_Angular Prediction
- 33 prediction directions, Intra_Angular[k] for k = 2 to 34.
- Each TB is predicted directionally from spatially neighboring samples that have already been reconstructed.
- For a TB of size N×N, a total of 4N+1 spatially neighboring samples may be used for prediction: left, above, above-right, and below-left.
- To improve the intrapicture prediction accuracy, the projected reference sample is computed with 1/32-sample accuracy.
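The 1/32-sample accuracy amounts to a two-tap linear interpolation between neighboring reference samples. A minimal sketch, where frac is the 5-bit fractional position (0..31) obtained by projecting the prediction direction onto the reference row or column:

    def angular_sample(ref, i, frac):
        """Predict one sample by linear interpolation between ref[i] and
        ref[i + 1] at 1/32-sample accuracy (frac in 0..31)."""
        return ((32 - frac) * ref[i] + frac * ref[i + 1] + 16) >> 5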

Intrapicture Prediction: Reference Sample Smoothing
Reference samples used for intrapicture prediction are sometimes filtered by a [1 2 1]/4 smoothing filter:
- 4×4 blocks: the smoothing filter is not applied.
- 8×8 blocks: only for the diagonal directions, k = 2, 18, and 34.
- 16×16 blocks: most directions, except those near horizontal and vertical.
- 32×32 blocks: most directions, except exactly horizontal and vertical.
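A sketch of the [1 2 1]/4 smoothing applied to the one-dimensional reference sample array; the treatment of the end samples here is a simplification.

    def smooth_references(ref):
        """Apply the [1 2 1]/4 filter to the intra reference samples; the first
        and last samples are copied unfiltered."""
        out = list(ref)
        for i in range(1, len(ref) - 1):
            out[i] = (ref[i - 1] + 2 * ref[i] + ref[i + 1] + 2) >> 2
        return out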

Intrapicture Prediction: Mode Coding
- HEVC considers three most probable modes (MPMs) when coding the luma intrapicture prediction mode predictively.
- The first two MPMs are initialized by the prediction modes of the above and left PBs; any unavailable prediction mode is considered to be Intra_DC.
- When the first two MPMs are not equal, the third MPM is set to Intra_Planar, Intra_DC, or Intra_Angular[26] (vertical).
- If the current luma prediction mode is one of the three MPMs, only the MPM index is transmitted; otherwise, the index of the current luma prediction mode is transmitted using a 5-bit fixed-length code.
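These rules translate into a short derivation routine. The sketch below follows the slide for the case where the two neighboring modes differ; the branch for two equal angular neighbors (adding the two adjacent angular modes) reflects my reading of the standard and should be treated as an assumption.

    PLANAR, DC, ANGULAR_26 = 0, 1, 26      # HEVC luma intra mode numbers

    def derive_mpms(left_mode, above_mode):
        """Return the three most probable modes; unavailable neighbors are
        passed in as DC, as stated above."""
        if left_mode != above_mode:
            mpm = [left_mode, above_mode]
            for cand in (PLANAR, DC, ANGULAR_26):   # third MPM: first unused of these
                if cand not in mpm:
                    mpm.append(cand)
                    break
        elif left_mode <= DC:                       # both neighbors Planar or DC
            mpm = [PLANAR, DC, ANGULAR_26]
        else:                                       # both the same angular mode:
            mpm = [left_mode,                       # add its two angular neighbors
                   2 + ((left_mode + 29) % 32),
                   2 + ((left_mode - 1) % 32)]
        return mpm

If the current mode is one of the three returned modes, only its index (0..2) is coded; otherwise the mode number is sent with the 5-bit fixed-length code.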

Interpicture Prediction: Partitioning Modes
- PART_2N×2N: the CB is not split.
- PART_2N×N: the CB is split into two equal-size PBs horizontally.
- PART_N×2N: the CB is split into two equal-size PBs vertically.
- PART_N×N: the CB is split into four equal-size PBs.
- PART_2N×nU, PART_2N×nD, PART_nL×2N, and PART_nR×2N: these types are known as asymmetric motion partitions (AMP).

Interpicture Prediction
HEVC supports motion vectors with units of one quarter of the distance between luma samples.
Fractional sample interpolation
- Used to generate the prediction samples for noninteger sampling positions.

Interpicture Prediction: Fractional Sample Interpolation
- HEVC uses an eight-tap filter for the half-sample positions and a seven-tap filter for the quarter-sample positions.
- HEVC uses a single interpolation process without intermediate rounding, which improves precision and simplifies the architecture.
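For reference, a sketch of the 1-D luma interpolation with the filter coefficients as I recall them from the standard (verify against the specification before relying on them); normalization and clipping to the sample bit depth are omitted.

    # Luma interpolation filters (coefficients reproduced from memory).
    HALF_PEL    = [-1, 4, -11, 40, 40, -11, 4, -1]   # 8-tap, half-sample position
    QUARTER_PEL = [-1, 4, -10, 58, 17, -5, 1]        # 7-tap, quarter-sample position

    def interpolate(samples, pos, taps):
        """1-D fractional-sample interpolation around integer position pos.
        Both filters start three samples to the left of the integer position."""
        return sum(c * samples[pos - 3 + i] for i, c in enumerate(taps))

A two-dimensional fractional position is obtained by filtering horizontally first and then filtering the intermediate values vertically.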

Interpicture Prediction

Transform, Scaling, and Quantization
- HEVC uses transform coding of the prediction error residual.
- The residual block is partitioned into multiple square TBs.
- The supported transform block sizes are 4×4, 8×8, 16×16, and 32×32.

Transform, Scaling, and Quantization: Core Transform
- Two-dimensional transforms are computed by applying 1-D transforms in the horizontal and vertical directions.
- The elements of the core transform matrices were derived by approximating scaled DCT basis functions.
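As an illustration, the 4×4 case can be written as a separable matrix product. The matrix entries below are the integer approximations of the scaled DCT basis functions as I recall them, and the intermediate right-shifts HEVC uses to control dynamic range are omitted.

    import numpy as np

    # 4x4 core transform matrix (integer approximation of a scaled DCT).
    H4 = np.array([[64,  64,  64,  64],
                   [83,  36, -36, -83],
                   [64, -64, -64,  64],
                   [36, -83,  83, -36]], dtype=np.int64)

    def core_transform_4x4(residual):
        """Separable 2-D forward transform: columns first, then rows."""
        return H4 @ np.asarray(residual, dtype=np.int64) @ H4.T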

Transform, Scaling, and Quantization: Alternative Integer Transform
- Derived from a DST (discrete sine transform).
- Applied only to 4×4 luma residual blocks resulting from intrapicture prediction modes.
- It is not much more computationally demanding than the 4×4 DCT-style transform, and it provides approximately 1% bit-rate reduction.
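The alternative 4×4 transform has the same separable structure, only with a DST-style matrix (entries again reproduced from memory, so treat this as a sketch).

    import numpy as np

    # 4x4 DST-style matrix for intrapicture-predicted 4x4 luma residuals.
    S4 = np.array([[29,  55,  74,  84],
                   [74,  74,   0, -74],
                   [84, -29, -74,  55],
                   [55, -84,  74, -29]], dtype=np.int64)

    def alternative_transform_4x4(residual):
        """Forward DST-style transform with the same separable structure."""
        return S4 @ np.asarray(residual, dtype=np.int64) @ S4.T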

Transform, Scaling, and Quantization
- HEVC uses a uniform reconstruction quantization (URQ) scheme controlled by a quantization parameter (QP).
- The range of the QP values is defined from 0 to 51.
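The QP controls the quantization step size logarithmically: the step roughly doubles for every increase of 6 in QP. A floating-point sketch (HEVC itself uses integer scaling tables and rounding offsets):

    def quant_step(qp):
        """Approximate quantization step size for a given QP (0..51);
        QP 4 corresponds to a step of 1, and +6 in QP doubles the step."""
        return 2.0 ** ((qp - 4) / 6.0)

    def quantize(coeff, qp):
        """Plain uniform quantization of one transform coefficient."""
        return int(round(coeff / quant_step(qp)))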

Entropy Coding
HEVC uses only CABAC for entropy coding.
Context modeling
- The number of contexts used in HEVC is substantially smaller than in H.264/MPEG-4 AVC, yet the entropy coding design actually provides better compression.
Adaptive coefficient scanning
- Coefficient scanning is performed in 4×4 subblocks for all TB sizes.
- The selection of the coefficient scanning order depends on the directionality of the intrapicture prediction.

Entropy Coding: Adaptive Coefficient Scanning
- The horizontal scan is used when the prediction direction is close to vertical.
- The vertical scan is used when the prediction direction is close to horizontal.
- For other prediction directions, the diagonal up-right scan is used.
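A sketch of this selection rule; the mode ranges used for "close to horizontal" and "close to vertical" are my assumption of the thresholds, and in HEVC the mode-dependent selection applies only to small (4×4 and 8×8) intrapicture-predicted TBs.

    def select_scan(intra_mode):
        """Choose the coefficient scan order from the intra prediction mode."""
        if 6 <= intra_mode <= 14:       # prediction direction close to horizontal
            return "vertical"
        if 22 <= intra_mode <= 30:      # prediction direction close to vertical
            return "horizontal"
        return "diagonal_up_right"      # all other modes (and inter-coded blocks)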

Entropy Coding: Coefficient Coding
HEVC transmits the position of the last nonzero transform coefficient, a significance map, and the sign bits and levels of the transform coefficients.

In-Loop Filters
- Two processing steps, a deblocking filter (DBF) followed by a sample adaptive offset (SAO) filter, are applied to the reconstructed samples.
- The DBF is intended to reduce the blocking artifacts caused by block-based coding; it is applied only to samples located at block boundaries.
- The SAO filter is applied adaptively to all samples satisfying certain conditions, e.g., based on gradient.

In-Loop Filters: Deblocking Filter
- Applied to all samples adjacent to a PU or TU boundary, except when the boundary is also a picture boundary or when deblocking is disabled across slice or tile boundaries.
- HEVC applies the deblocking filter only to edges that are aligned on an 8×8 sample grid. This restriction reduces the worst-case computational complexity without noticeable degradation of visual quality, and it also improves parallel-processing operation.
- The processing order of the deblocking filter is horizontal filtering of vertical edges for the entire picture first, followed by vertical filtering of horizontal edges.

In-Loop Filters: Deblocking Filter
Only three filter strengths are used. Given two adjacent blocks P and Q with a common boundary on the 8×8 grid:
- A filter strength of 2 is assigned when one of the blocks is intrapicture predicted.
- A filter strength of 1 is assigned if any of the following conditions is satisfied:
  P or Q has at least one nonzero transform coefficient.
  The reference indices of P and Q are not equal.
  The number of motion vectors of P and Q is not equal.
  The difference between a motion vector component of P and Q is greater than or equal to one integer sample.
- A filter strength of 0 means that the deblocking process is not applied.
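The strength decision maps directly onto a small decision function. The sketch below assumes hypothetical block descriptors p and q with the fields shown; motion vectors are in quarter-sample units, so a component difference of 4 corresponds to one integer sample.

    def deblock_strength(p, q):
        """Boundary strength (0, 1, or 2) for two blocks adjacent across an
        8x8-grid boundary, following the conditions listed above."""
        if p.intra or q.intra:
            return 2
        if p.has_nonzero_coeff or q.has_nonzero_coeff:
            return 1
        if p.ref_indices != q.ref_indices or p.num_mv != q.num_mv:
            return 1
        if any(abs(a - b) >= 4 for a, b in zip(p.mv_components, q.mv_components)):
            return 1                    # MV difference of at least one integer sample
        return 0                        # deblocking not applied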

In-Loop Filters: SAO (Sample Adaptive Offset)
- SAO modifies the decoded samples by conditionally adding an offset value to each sample after the application of the deblocking filter, based on values in lookup tables transmitted by the encoder.
- It is performed on a region basis, with the filtering type selected per CTB:
  sao_type_idx = 0: SAO is not applied to the CTB.
  sao_type_idx = 1: band offset filtering.
  sao_type_idx = 2: edge offset filtering.

In-Loop Filters: SAO Band Offset Mode
- The selected offset value depends directly on the sample amplitude.
- The full sample amplitude range is uniformly split into 32 segments called bands.
- Sample values belonging to four of these bands (consecutive within the 32) are modified by adding the transmitted offset values.
- The main reason for using four consecutive bands is that artifacts can appear in smooth areas.
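A sketch of the band-offset operation for one sample, assuming 8-bit video so that each of the 32 bands is 8 sample values wide; band_pos and offsets stand for the values signalled by the encoder.

    def sao_band_offset(sample, band_pos, offsets, bit_depth=8):
        """Add the signalled offset if the sample falls into one of the four
        consecutive bands starting at band_pos, then clip to the valid range."""
        band = sample >> (bit_depth - 5)                 # 32 bands over the amplitude range
        if band_pos <= band < band_pos + 4:
            sample += offsets[band - band_pos]
        return max(0, min((1 << bit_depth) - 1, sample))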

In-Loop Filters: SAO Edge Offset Mode
- A horizontal, vertical, or one of two diagonal gradient directions is used for the edge offset classification in the CTB.
- Each sample in the CTB is classified into one of five EdgeIdx categories by comparing it with its two neighbors along the selected direction.

In-Loop Filters: SAO Edge Offset Mode
- Depending on the EdgeIdx category, an offset value is added to the sample value.
- The edge offset mode generally has a smoothing effect.
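The EdgeIdx classification compares each sample with its two neighbors along the selected gradient direction. A sketch of the five categories as I understand the design (category 0 receives no offset):

    def edge_idx(a, c, b):
        """Classify sample c against its two neighbors a and b."""
        if c < a and c < b:
            return 1                                 # local minimum
        if (c < a and c == b) or (c == a and c < b):
            return 2                                 # concave corner
        if (c > a and c == b) or (c == a and c > b):
            return 3                                 # convex corner
        if c > a and c > b:
            return 4                                 # local maximum
        return 0                                     # flat or monotonic: no offset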

Special Coding Modes: I_PCM Mode
- Prediction, transform, quantization, and entropy coding are bypassed; the samples are directly represented by a predefined number of bits.
- Its main purpose is to avoid excessive consumption of bits when the signal characteristics are extremely unusual and cannot be properly handled by hybrid coding.

Special Coding Modes: Lossless Mode
- The transform, quantization, and other processing that affects the decoded picture are bypassed; the residual signal from inter- or intrapicture prediction is fed directly into the entropy coder.
- This allows mathematically lossless reconstruction.
- SAO and deblocking filtering are not applied to these regions.

Special Coding Modes: Transform Skipping Mode
- Only the transform is bypassed.
- It improves compression for certain types of video content, such as computer-generated images or graphics mixed with camera-view content.
- It can be applied only to 4×4 TBs.