HEVC High Level Syntax and Around


Prepared by Shevach Riabtsev. Please address all questions and suggestions to riabtsev@yahoo.com.

HEVC Encoder

[Figure: HEVC encoder block diagram. Blocks: Motion Est., Inter Motion Comp., Intra Est., Intra Pred., Intra/Inter Decision, T & Q, Q-1 & T-1, CABAC, Deblk. Filter, Filter Control, SAO, SAO Params Est., DPB. Signals: Input Video, Residual, Quantized residuals, MVs/Intra modes, Reference samples, Reconstructed, SAO params, Bit-Stream.]

Notes:
- Compared with AVC/H.264, the SAO filter and SAO parameter estimation blocks are added.
- The "SAO Params Est." block can be executed right after deblocking or, with negligible penalty, right after reconstruction as shown in the figure.
- The similarity of HEVC to AVC/H.264 allows existing AVC/H.264 solutions to be upgraded quickly to HEVC.

Bitstream Structure

[Figure: VPS | SPS | PPS | Slice Header | Slice Data | ... (Picture #1) ... Slice Header | Slice Data | ... (Picture #k)]

As in H.264/AVC, HEVC also specifies a byte stream format in which each NAL unit is delimited by a start code (0x000001). Note that every stream must begin with at least the 4-byte start code (0x00000001). The 4-byte start code at the very beginning of a stream lets a decoder find the byte boundary and avoid skipping the first NAL unit when it enters the stream at a bit-aligned rather than a byte-aligned position.
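As an illustration of the byte stream format described above, here is a minimal sketch that splits a buffer on 0x000001 start codes and reads the NAL unit type. The function names are hypothetical. The sketch assumes the two-byte HEVC NAL unit header (type in bits 1..6 of the first byte; 32 = VPS, 33 = SPS, 34 = PPS) and deliberately ignores emulation prevention bytes.

```python
START_CODE = b"\x00\x00\x01"

def split_nal_units(stream):
    """Return the NAL units found between 0x000001 start codes (simplified)."""
    nal_units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        start = pos + len(START_CODE)
        nxt = stream.find(START_CODE, start)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes just before the next start code belong to that start code
        # (e.g. the extra leading zero of a 4-byte start code), not to this NAL unit.
        nal_units.append(stream[start:end].rstrip(b"\x00"))
        pos = nxt
    return nal_units

def nal_unit_type(nal):
    """Extract nal_unit_type from the first header byte (assumed HEVC layout)."""
    return (nal[0] >> 1) & 0x3F

if __name__ == "__main__":
    demo = (b"\x00\x00\x00\x01\x40\x01\xAA"    # 4-byte start code, then a VPS-like unit
            b"\x00\x00\x01\x42\x01\xBB"        # 3-byte start code, then an SPS-like unit
            b"\x00\x00\x00\x01\x44\x01\xCC")   # 4-byte start code, then a PPS-like unit
    for nal in split_nal_units(demo):
        print(nal_unit_type(nal), nal.hex())
```

A real parser would also remove emulation prevention bytes (0x03) before interpreting the payload.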

High-Level Syntax (VPS/SPS)

VPS: conveys information that is common to multiple layers, i.e. every layer refers to the same VPS.

SPS: contains information that applies to all pictures of a video sequence and is fixed within that sequence:
- Profile, level, picture size, number of sub-layers
- Enabling flags
- Restrictions:
  - log2_min_luma_coding_block_size_minus3: minimal CU size
  - log2_diff_max_min_luma_coding_block_size: together with the minimal CU size, specifies the maximal CU size
  - log2_min_luma_transform_block_size_minus2: minimal transform block (TB) size
  - log2_diff_max_min_luma_transform_block_size: together with the minimal TB size, specifies the maximal TB size
- Temporal scalability control
- Video usability information (VUI)

Notes: Some information is duplicated between the SPS and the VPS (e.g. profile_idc).
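The way the four size-related SPS fields combine can be sketched as follows. The function and argument names are shortened, hypothetical stand-ins for the SPS syntax elements. The sketch assumes the usual HEVC coding in which the minimum CB size is stored as log2 minus 3 and the minimum TB size as log2 minus 2.

```python
def derive_block_sizes(log2_min_cb_minus3, log2_diff_max_min_cb,
                       log2_min_tb_minus2, log2_diff_max_min_tb):
    """Derive CU/TB size limits (in luma samples) from SPS-style fields."""
    min_cb_log2 = log2_min_cb_minus3 + 3             # minimal CU size, log2
    ctb_log2 = min_cb_log2 + log2_diff_max_min_cb    # maximal CU size = CTB size
    min_tb_log2 = log2_min_tb_minus2 + 2             # minimal TB size, log2
    max_tb_log2 = min_tb_log2 + log2_diff_max_min_tb
    return {
        "min_cb": 1 << min_cb_log2,
        "ctb": 1 << ctb_log2,
        "min_tb": 1 << min_tb_log2,
        "max_tb": 1 << max_tb_log2,
    }

if __name__ == "__main__":
    # 8x8 minimal CU, 64x64 CTU, 4x4 to 32x32 transforms
    print(derive_block_sizes(0, 3, 0, 3))
    # 16x16 minimal CU, 64x64 CTU, 8x8 to 16x16 transforms
    print(derive_block_sizes(1, 2, 1, 1))
```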

Potential Usage of Some SPS Parameters

- log2_min_luma_coding_block_size_minus3: specifies the minimal CU size. Potential usage: if it is known a priori that a video sequence is "flat" or "smooth", it is worth considering a minimal CU size of 16x16 (log2 size 4, i.e. log2_min_luma_coding_block_size_minus3 = 1). Otherwise split flags at the 16x16 depth are signaled redundantly.
- log2_diff_max_min_luma_coding_block_size: together with the minimal CU size, specifies the CTU size. There is no reason (except perhaps H.264/AVC legacy) to set the CTU size smaller than 64x64. Moreover, according to [8], a 64x64 CTU brings nearly 12% bitrate reduction on average compared with a 16x16 CTU.
- log2_min_luma_transform_block_size_minus2: specifies the minimal transform block size. Potential usage: for a "flat" video sequence it is worth considering a minimal transform block size of 8x8.
- log2_diff_max_min_luma_transform_block_size: together with the minimal TB size, specifies the maximal TB size. Large transform sizes can cause performance peaks, so it is worth considering avoiding 32x32 transforms by setting the maximal transform size to 16x16.

High-Level Syntax (PPS/Slice Header)

PPS: conveys information that can change from picture to picture:
- Reference list size
- Initial QP. Do not confuse QP with the quantizer step size: QP is a control parameter that determines the step size.
- Enabling flags
- Tiles/Wavefronts

Slice Header: conveys information that can change from slice to slice:
- POC, slice type
- Prediction weights
- Deblocking parameters
- Tiles entry points
- Reference picture lists: the list of reference pictures in the DPB is explicitly signaled in the slice header (unlike AVC/H.264, where MMCO or sliding window mode is used). Pictures not mentioned in the list are marked as unused for reference and should be removed from the DPB accordingly.

Explicit signaling of the reference pictures also enhances error resilience: if a decoder detects that one of the listed pictures does not exist in the DPB, it concludes that this picture was lost.

The maximal number of reference indexes is 15 (unlike 16 in AVC/H.264).
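The DPB bookkeeping and loss detection described above can be sketched with plain set operations. This is a simplified illustration, not the normative process: the function name is hypothetical, pictures are identified only by POC, and pictures still waiting for output are ignored.

```python
def apply_reference_picture_list(dpb_pocs, signaled_pocs):
    """Compare the DPB content with the reference pictures signaled in a slice header.

    Returns (kept, removed, missing):
      kept    - pictures that stay marked as used for reference
      removed - pictures not mentioned, to be marked unused and dropped
      missing - pictures mentioned but absent from the DPB (presumably lost)
    """
    dpb = set(dpb_pocs)
    signaled = set(signaled_pocs)
    kept = sorted(dpb & signaled)
    removed = sorted(dpb - signaled)
    missing = sorted(signaled - dpb)
    return kept, removed, missing

if __name__ == "__main__":
    kept, removed, missing = apply_reference_picture_list([0, 4, 8], [4, 8, 12])
    print("kept:", kept)
    print("removed:", removed)
    print("missing (lost?):", missing)
```

A non-empty "missing" list is the condition under which a decoder could trigger error concealment or request a refresh.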

Selected Picture Types (IDR, CRA)

IDR: pictures following the IDR in decoding order cannot use pictures decoded prior to the IDR as reference.

CRA: pictures following the CRA in both decoding and presentation order cannot use pictures decoded prior to the CRA as reference.

[Figure: pictures 1 to 4 follow the CRA in decoding order (CRA 1 2 3 4) and appear as 2 3 1 4 in presentation order; the leading pictures are marked.]

Selected Picture Types (RADL, RASL)

Leading pictures follow the CRA in decoding order but precede it in presentation order. They are divided into two types:
- RADL (random access decodable leading): can be decoded correctly if decoding starts at the current CRA.
- RASL (random access skipped leading): cannot be decoded correctly if decoding starts at the current CRA, and therefore should be skipped.

[Figure: decoding order CRA 1 2 3 4 versus presentation order 2 3 1 4, with the leading pictures labeled RASL and RADL.]
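The distinction between leading and trailing pictures, and between RADL and RASL, can be sketched as below. In a real stream the type is signaled in the NAL unit header; this hypothetical helper only illustrates the reasoning, assuming each picture is described by its POC and the POCs it references.

```python
def classify_after_cra(cra_poc, pocs_before_cra, pictures):
    """Label pictures that follow a CRA in decoding order.

    pictures: list of (poc, reference_pocs) tuples in decoding order.
    A leading picture (POC below the CRA's) is RASL when it depends, directly
    or through another RASL picture, on something decoded before the CRA.
    """
    undecodable = set(pocs_before_cra)   # unavailable when decoding starts at the CRA
    labels = []
    for poc, refs in pictures:
        if poc > cra_poc:
            labels.append((poc, "trailing"))
        elif undecodable.intersection(refs):
            undecodable.add(poc)          # anything referencing it is undecodable too
            labels.append((poc, "RASL"))
        else:
            labels.append((poc, "RADL"))
    return labels

if __name__ == "__main__":
    pictures = [(6, [4, 8]), (7, [8]), (9, [8]), (10, [8, 9])]
    for poc, label in classify_after_cra(8, {0, 4}, pictures):
        print(poc, label)
```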

Picture Syntax (2)

- Coding Tree Block (CTB): the picture is partitioned into square coding tree blocks (CTBs). The size N of the CTBs is chosen by the encoder (16x16, 32x32, 64x64). A luma CTB covers a square picture area of N×N samples, and each of the corresponding chroma CTBs covers (N/2)×(N/2) samples (in 4:2:0 format).
- Coding Tree Unit (CTU): the luma CTB and the two chroma CTBs, together with the associated syntax, form a coding tree unit (CTU). The CTU is the basic processing unit, similar to the MB in prior standards.
- Coding Block (CB): each CTB can be further partitioned into multiple coding blocks (CBs). The size of a CB can range from the size of the CTB down to a minimum size (8×8).
- Coding Unit (CU): the luma CB and the chroma CBs, together with the associated syntax, form a coding unit (CU). Each CU can be either intra or inter predicted. In practice the CU is the basic unit of compression.

CTU Syntax

[Figure: 64x64 CTU]

CTU Syntax (2)

All CUs in a CTU are encoded (traversed) in Z-scan (depth-first) order. This order makes the top and left samples available (causal) in most cases.

[Figure: Z-scan order inside a 64x64 CTU]

Figure taken from: Benjamin Bross, “Relax it's only HEVC”, WBU-ISOG Forum, European Broadcasting Union, Geneva, Switzerland, November 28, 2012.
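A short sketch of the Z-scan traversal may help. The generator below is illustrative only (the name z_scan is an assumption) and visits equally sized blocks of a square area in recursive top-left, top-right, bottom-left, bottom-right order.

```python
def z_scan(x, y, size, min_size):
    """Yield the top-left corners of min_size blocks in Z-scan (depth-first) order."""
    if size == min_size:
        yield (x, y)
        return
    half = size // 2
    for dy in (0, half):          # upper half first, then lower half
        for dx in (0, half):      # left quadrant first, then right quadrant
            yield from z_scan(x + dx, y + dy, half, min_size)

if __name__ == "__main__":
    order = list(z_scan(0, 0, 16, 4))
    for index, (bx, by) in enumerate(order):
        print(index, (bx, by))
```

In this regular grid, the left and top neighbours of every block appear earlier in the order than the block itself, which is what makes them usable for prediction.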

CTU Syntax (3)

[Figure: CTU layout: CU Hdr | CU Data | CU Hdr | CU Data | ...]

Formally, a CTU specifies a quadtree traversed in depth-first order.

Note: unlike prior standards, where an MB consists of an MB header followed by MB data, in HEVC the headers are dispersed, i.e. interleaved with the data. This complicates the CTU pipeline, because CTU header parsing cannot be separated from CTU data decoding.

[Figure: CTU layout: CTU Header | CU Hdr | CU Data | CU Hdr | CU Data | ...]
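The depth-first quadtree can be illustrated with a toy parser that consumes one split flag per node, in bitstream order. This is a sketch under simplifying assumptions: the flags come from a plain Python iterator rather than a CABAC decoder, the function name is hypothetical, and no flag is read once the minimal CU size is reached.

```python
def parse_coding_quadtree(flags, x, y, size, min_size, cus):
    """Recursively read split flags and collect the resulting CUs as (x, y, size)."""
    if size > min_size and next(flags):
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                parse_coding_quadtree(flags, x + dx, y + dy, half, min_size, cus)
    else:
        # Leaf: in a real stream the CU header and CU data would be parsed here,
        # before the next split flag, which is why headers and data interleave.
        cus.append((x, y, size))

if __name__ == "__main__":
    split_flags = iter([1, 0, 1, 0, 0, 0, 0, 0, 0])
    cus = []
    parse_coding_quadtree(split_flags, 0, 0, 64, 8, cus)
    for cu in cus:
        print(cu)
```

Raising min_size shortens the flag sequence, since nodes at the minimal size carry no split flag at all.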

CU Syntax (1)

- Prediction Block (PB): each CB is partitioned into 1, 2 or 4 prediction blocks (PBs).
- Prediction Unit (PU): the luma PB and the chroma PBs, together with the associated syntax, form a prediction unit (PU).

Intra partitions: 2Nx2N, NxN (only if the CB size is the smallest CB size)
Inter partitions: 2Nx2N, NxN, 2NxN, Nx2N

CU Syntax (2)

Inter asymmetric partitions (conditioned by amp_enabled_flag in the SPS and disabled for the minimal CB size): nLx2N, nRx2N, 2NxnU, 2NxnD

[Figure: examples where asymmetric partitions are beneficial: 2NxnU, 2NxnD, nLx2N, nRx2N]

Notice that if the CU size is 8x8, asymmetric partitions are disabled (in order to reduce complexity). I think that asymmetric partitions could be disabled for 16x16 CUs too.

CU Syntax (3)

Notes:
- The smallest luma PB size is 4×8 or 8×4 samples (4x8 and 8x4 are permitted only for uni-directional prediction; no bi-prediction below 8x8 is allowed).
- Chroma PBs mimic the corresponding luma partition with a scaling factor of 1/2 for 4:2:0. Asymmetric splitting is also applied to chroma CBs.
- Preprocessing (based on texture and/or a block complexity metric) can be used to speed up the PB size decision. If the complexity of the CTU is high (detailed, textured region), large PUs are filtered out; if the complexity is low (flat region), small PUs are filtered out. An example method for fast PU size decision (for the intra case) is described in the paper “Content Adaptive Prediction Unit Size Decision Algorithm for HEVC Intra Coding”, 2012 Picture Coding Symposium.

CU Syntax (4)

Transform Block (TB): each luma CB can be quadtree-partitioned into one, four, or more TBs. The number of transform levels is controlled by max_transform_hierarchy_depth_inter and max_transform_hierarchy_depth_intra.

Example: a CB divided into two TB levels (block #1 is split into four blocks).

[Figure: CB with TBs 1,0 1,1 1,2 1,3 (the four sub-blocks of block #1), 2 and 3, shown next to the corresponding quadtree.]

CU Syntax (5)

[Shevach] Computational complexity of finding the best TU partition:
- For transform block sizes from 8x8 to 32x32, the RD cost is evaluated 21 times: 1 {32x32} + 4 {16x16} + 16 {8x8} = 21
- For transform block sizes from 4x4 to 32x32 (intra CU), the RD cost is evaluated 85 times: 1 {32x32} + 4 {16x16} + 16 {8x8} + 64 {4x4} = 85
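The count can be reproduced with a few lines of arithmetic. The sketch below assumes an exhaustive search that evaluates every block at every level of the transform quadtree, with each level holding four times as many blocks as the level above. The function name is hypothetical.

```python
def rd_evaluations(max_tb_size, min_tb_size):
    """Count the blocks in a full transform quadtree from max_tb_size down to min_tb_size."""
    total = 0
    blocks_at_level = 1
    size = max_tb_size
    while size >= min_tb_size:
        total += blocks_at_level   # one RD cost evaluation per block at this level
        blocks_at_level *= 4       # each block splits into four
        size //= 2
    return total

if __name__ == "__main__":
    print(rd_evaluations(32, 8))   # 1 + 4 + 16
    print(rd_evaluations(32, 4))   # adds the 4x4 level
```

An encoder that prunes the tree early would evaluate fewer candidates than this upper bound.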

CU Syntax (6)

Notes:
- Unlike H.264/AVC, where TB ≤ PB, in HEVC prediction and transform partitioning are almost independent (i.e. a TB can contain several PBs and vice versa). However, TB > PB is allowed only for inter and not for intra (i.e. intra TB ≤ PB).
- Some experts report that prediction discontinuities on PB boundaries inside a TB are smoothed by the transform and quantization, whereas the discontinuities are more pronounced when PB and TB boundaries coincide.
- 2x2 TBs are disabled (the minimal TB size is 4x4). How are chroma blocks handled in 4:2:0 format when the luma TB is 4x4?

[Figure: an 8x8 luma block split into four 4x4 luma TBs (0 to 3), accompanied by one 4x4 Cb TB and one 4x4 Cr TB.]

Restrictions/Constraints

- HEVC disallows 16x16 CTBs for level 5 and above (4K TV). Motivation: 16x16 CTBs add overhead for decoders targeting 4K TV:
  - up to 10% increase in worst-case decode time
  - additional storage for SAO parameters
- The number of coded bits of a CTU shall be less than or equal to 5*RawCtuBits/3. The variable RawCtuBits is derived as

  RawCtuBits = CtbSizeY * CtbSizeY * BitDepthY + 2 * (CtbWidthC * CtbHeightC) * BitDepthC

Numeric example: take CtbSizeY = 16 (as in AVC/H.264) with 8-bit samples in 4:2:0 format. Then RawCtuBits = 16*16*8 + 2*8*8*8 = 3072, and the maximal CTU bit-size is 5*3072/3 = 5120 bits (much more than the corresponding 3200-bit threshold in AVC/H.264).
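The numeric example can be checked, and repeated for other CTB sizes, with a small helper. The function names and default arguments are assumptions for illustration; the defaults correspond to 8-bit 4:2:0 video, where each chroma CTB is half the luma CTB in both dimensions.

```python
def raw_ctu_bits(ctb_size_y, bit_depth_y=8, bit_depth_c=8,
                 sub_width_c=2, sub_height_c=2):
    """RawCtuBits for one CTU; sub_width_c/sub_height_c = 2 models 4:2:0 subsampling."""
    ctb_width_c = ctb_size_y // sub_width_c
    ctb_height_c = ctb_size_y // sub_height_c
    return (ctb_size_y * ctb_size_y * bit_depth_y
            + 2 * (ctb_width_c * ctb_height_c) * bit_depth_c)

def max_ctu_bits(ctb_size_y, **kwargs):
    """Upper bound on the coded size of a CTU: 5 * RawCtuBits / 3."""
    return 5 * raw_ctu_bits(ctb_size_y, **kwargs) // 3

if __name__ == "__main__":
    for size in (16, 32, 64):
        print(size, raw_ctu_bits(size), max_ctu_bits(size))
```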

Note on maximal CTU bit-size and worst-case CABAC performance

CABAC decoding (as well as encoding) contains a renormalization stage (due to finite-precision arithmetic). The renormalization procedure is time-consuming, since it contains a while-loop with several if-else statements inside the loop. The number of renormalization iterations for a CTU is at most the CTU bit-size, because each iteration reads one bit from the bitstream. Therefore, if the worst-case CTU bit-size is 5120 bits, the decoder has to perform up to 5120 renormalization iterations. From the point of view of CABAC HW design, executing the renormalization 5120 times is a serious performance bottleneck.
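For reference, the decoder-side renormalization loop can be sketched as below. It follows the general shape of the arithmetic decoder's renormalization (double the range and shift one bit into the offset until the range is at least 256). The BitReader class and the function name are hypothetical helpers, not part of any real decoder API.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object (illustrative helper)."""
    def __init__(self, data):
        self.data = data
        self.pos = 0              # number of bits consumed so far

    def read_bit(self):
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

def renormalize(rng, offset, reader):
    """Renormalize the decoder state: one bit is consumed per loop iteration."""
    while rng < 256:
        rng <<= 1
        offset = (offset << 1) | reader.read_bit()
    return rng, offset

if __name__ == "__main__":
    reader = BitReader(b"\xA0")                 # bits: 1 0 1 0 0 0 0 0
    print(renormalize(64, 5, reader), "bits consumed:", reader.pos)
```

Since every pass through the loop consumes exactly one bit, the total number of iterations for a CTU cannot exceed the number of bits in that CTU.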

Note on Interlace Coding

Unlike H.264/AVC, HEVC has no dedicated support for interlace coding:
- No mixed frame-field interaction (like PAFF in H.264/AVC)
- No interlace scanning of transform coefficients
- No correction of MVX[1] (the y-component of the MV) when the current and reference pictures have different polarity (top-bottom or bottom-top)

Field pictures are signaled by an SEI message (pic_timing) for every picture in the sequence.

If progressive and interlaced streams are spliced together, a new sequence start must be inserted to switch from progressive to interlaced coding (or vice versa). In addition, the flag general_interlaced_source_flag is signaled in the VPS/SPS (within the profile_tier_level section).

In H.264/AVC, PAFF mode can be used to reduce I-frame bitrate peaks: the I-frame is divided into two field pictures, where the top field is coded as an I-picture and the bottom field as a P-picture. The total number of bits produced by the two I-P field pictures is then expected to be smaller than the bits generated by a single I-frame. Because H.265/HEVC does not support PAFF, this trick cannot be applied to cope with I-frame bitrate peaks.

Note on Picture Boundaries

As per the standard, the picture dimensions are defined in units of the minimum luma CB size (MinCbSizeY):
- pic_width_in_luma_samples shall not be equal to 0 and shall be an integer multiple of MinCbSizeY.
- pic_height_in_luma_samples shall not be equal to 0 and shall be an integer multiple of MinCbSizeY.

As a result, CTBs at the right and bottom edges of the picture may extend beyond the picture boundaries. Data outside the picture is not coded, so the quadtrees at the right and bottom edges are pruned accordingly. Please see the following slide (courtesy of John Funnel from Parabola) for illustration:
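Separately from that slide, the boundary arithmetic can be sketched in a few lines. The helper below is an illustration with hypothetical names: it checks the MinCbSizeY constraint, counts CTBs with a ceiling division, and reports how much of the last CTB column and row actually lies inside the picture.

```python
def picture_ctb_grid(width, height, ctb_size, min_cb_size):
    """Return (ctb_cols, ctb_rows, last_col_width, last_row_height) in luma samples."""
    if width <= 0 or height <= 0:
        raise ValueError("picture dimensions must be non-zero")
    if width % min_cb_size or height % min_cb_size:
        raise ValueError("picture dimensions must be multiples of MinCbSizeY")
    ctb_cols = -(-width // ctb_size)     # ceiling division
    ctb_rows = -(-height // ctb_size)
    last_col_width = width - (ctb_cols - 1) * ctb_size
    last_row_height = height - (ctb_rows - 1) * ctb_size
    return ctb_cols, ctb_rows, last_col_width, last_row_height

if __name__ == "__main__":
    # 1920x1080 with 64x64 CTBs and an 8x8 minimal CB
    print(picture_ctb_grid(1920, 1080, 64, 8))
```

When the last row height is smaller than the CTB size, the CTBs in that row are only partially covered by the picture, which is the case where the quadtree is pruned.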