EE5359:MULTIMEDIA PROCESSING


1 EE5359:MULTIMEDIA PROCESSING
Project Proposal on: COMPARISON AND ANALYSIS OF INTRA-PREDICTION EFFICIENCY IN HEVC, H.264, VP9, AVS PART 2 AND DIRAC
UNDER THE GUIDANCE OF DR. K.R. RAO, ELECTRICAL ENGINEERING DEPARTMENT, THE UNIVERSITY OF TEXAS AT ARLINGTON
SWETHAA ALLIYALAMANGALAM JAYARAMAN, THE UNIVERSITY OF TEXAS AT ARLINGTON

2 Acronyms
AVC: Advanced Video Coding
AVS: Audio Video Standard
ADST: Asymmetric Discrete Sine Transform
AU: Access Unit
BBC: British Broadcasting Corporation
BD-BR: Bjøntegaard-Delta Bit-Rate
BD-PSNR: Bjøntegaard-Delta Peak Signal-to-Noise Ratio
CABAC: Context-Adaptive Binary Arithmetic Coding
CTU: Coding Tree Unit
CU: Coding Unit
DBF: De-Blocking Filter
DC: Direct Current
DCT: Discrete Cosine Transform
DFT: Discrete Fourier Transform
DST: Discrete Sine Transform
HD: High Definition
HDTV: High Definition Television
HEVC: High Efficiency Video Coding
ISO: International Organization for Standardization
ITU-T: International Telecommunication Union (Telecommunication Standardization Sector)
JPEG: Joint Photographic Experts Group
JVT: Joint Video Team
MB: Macroblock
MPEG: Moving Picture Experts Group
MSE: Mean Square Error
NAL: Network Abstraction Layer
NGOV: Next Generation Open Video
OBMC: Overlapped Block-based Motion Compensation
PSNR: Peak Signal-to-Noise Ratio
PU: Prediction Unit
QCIF: Quarter Common Intermediate Format
RD: Rate Distortion
RDO: Rate Distortion Optimization
SAO: Sample Adaptive Offset
SDTI: Serial Data Transport Interface
SMPTE: Society of Motion Picture and Television Engineers
SSIM: Structural Similarity Index
TM: True Motion
TU: Transform Unit
UVLC: Universal Variable Length Code
VC: Video Coding
VLC: Variable Length Coding

3 Objective
This project aims to compare the video coding standards HEVC (High Efficiency Video Coding), H.264/AVC, VP9, Dirac Pro and AVS (Audio Video Standard) China Part 2 on the basis of intra-prediction efficiency. The comparison will be carried out using the performance metrics PSNR [4], SSIM [9], MSE [4], BD-PSNR [3], BD-BR [3] and computational complexity. The tests will be carried out using the HM Test Model 16.3 [6], JM Software 18.6 [7], the WebM Project's encoder [5], Dirac Pro software [8] and AVS China reference software [34] for HEVC, H.264/AVC, VP9, Dirac Pro and AVS Part 2 respectively.
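As a rough illustration of two of the metrics named above, the following sketch computes MSE and PSNR for a pair of small sample arrays. The function names, the 8-bit peak value of 255 and the sample values are assumptions made for the example; they are not taken from any of the reference encoders.

```python
import math

def mse(ref, test):
    """Mean squared error between two equally sized 2-D lists of samples."""
    total = 0
    count = 0
    for ref_row, test_row in zip(ref, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count

def psnr(ref, test, peak=255):
    """PSNR in dB for samples with the given peak value (255 assumes 8-bit video)."""
    err = mse(ref, test)
    if err == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / err)

# Hypothetical 3x3 luma patches: original and decoded.
original = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
decoded = [[54, 55, 60], [59, 77, 61], [75, 62, 60]]
print(mse(original, decoded), psnr(original, decoded))
```

A higher PSNR at the same bit rate indicates better reconstruction; BD-PSNR and BD-BR summarize that relationship over several rate points.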

4 Basics (1)
Pixel: An abbreviation for picture element.
Image: An array, or matrix, of square pixels arranged in columns and rows [29].
Video: A sequence of images processed electronically into an analog or digital format and displayed on a screen with sufficient rapidity to create the illusion of motion and continuity.
Multimedia Signal: The integration of several media sources, such as video, audio, graphics, animation and text, in a meaningful way to convey some information [30].
Video Compression: The process of reducing the amount of data needed to represent a video by removing redundant data.
Video Decompression: The inverse process of video compression.
Figure 1: Representation of an Image

5 Basics (2)
Inter-pixel Redundancy: The correlation between the pixels of an image frame, or between the pixels of a group of successive image or video frames [31].
Spatial Redundancy (Intra-Frame Redundancy): The statistical correlation between pixels within an image frame [31].
Temporal Redundancy (Inter-Frame Redundancy): The statistical correlation between pixels from successive frames in a temporal image or video sequence [31].
Figure 2: Classification of Types of Redundancies
Coding Tools: The tools that help in video compression by removing these redundancies.
Coding Efficiency: The ability to minimize the bit rate necessary to represent video content at a given level of video quality or, alternatively formulated, to maximize the video quality achievable within a given available bit rate.

6 Basics (3)
Macroblocks: The input video frame is initially partitioned into blocks of the same size called macroblocks [22]. The compression and decoding process works within each macroblock. A macroblock is sub-partitioned into smaller blocks to perform prediction. The aim of the prediction process is to reduce data redundancy and therefore avoid storing excess information in the coded bit stream [22]. There are two basic types of prediction: intra and inter. Intra-prediction works within the current video frame and is based upon the compressed and decoded data available for the block being predicted. Inter-prediction is used for motion compensation: a similar region in previously coded frames, close to the current block, is used for prediction.
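The partitioning step described above can be sketched as follows. This is a minimal illustration that assumes 16x16 macroblocks and a frame whose dimensions are multiples of 16; the function name is hypothetical.

```python
def macroblock_origins(width, height, mb_size=16):
    """Return the top-left (x, y) coordinate of every macroblock in raster order."""
    origins = []
    for y in range(0, height, mb_size):
        for x in range(0, width, mb_size):
            origins.append((x, y))
    return origins

# A QCIF frame (176x144) splits into 11 columns and 9 rows of 16x16 macroblocks.
qcif_blocks = macroblock_origins(176, 144)
print(len(qcif_blocks), qcif_blocks[:3])
```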

7 What is Intra-Prediction?
Intra-prediction is carried out in the current video frame and makes a prediction for the current block based upon the available encoded and decoded data [14]. Intra-prediction plays a key role in determining the compression efficiency of the whole codec [11]. It was initially proposed in and then it saw its application in transform domain such as H.261 and H.263 [12].

Table 1: Intra-Prediction among various video coding standards at a glance
VIDEO CODING STANDARD    BLOCK SIZE                        NUMBER OF PREDICTION MODES
AVS PART 2               8x8 block                         5 (0-4)
DIRAC PRO                4x4                               Spatial (Forward and Backward): 2
VP9                      Super Blocks up to 64x64          10
H.264/AVC                4x4 and 16x16                     9 or 4
HEVC                     16x16 or 32x32 or 64x64 CTU       35 (0-34)

8 AVS (Audio Video Standard) PART 2:
The AVS video coding standard was developed by the China Audio Video Coding Standard (AVS) working group. AVS Part 2 focuses on high-definition digital video broadcasting and high-density storage media. It is also known as AVS1-P2 [18] and as the AVS Jizhun Profile. Spatial intra prediction is based on 8x8 blocks. It uses decoded information in the current frame as the reference for prediction, exploiting statistical spatial dependencies between pixels within a picture. There are 5 luminance intra-prediction modes and 4 chrominance intra-prediction modes.
Mode 0: Horizontal
Mode 1: Vertical
Mode 2: DC: each pixel of the current block is predicted by an average of the vertically and horizontally corresponding reference pixels
Mode 3: Diagonal down left
Mode 4: Diagonal down right
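To make the horizontal, vertical and DC descriptions above concrete, here is a simplified sketch. It assumes the reference pixels are given as a row above the block (`top`) and a column to its left (`left`), and it ignores any filtering of the reference samples that a real AVS encoder may apply; the function names are illustrative only.

```python
def predict_vertical(top, left):
    """Copy the reference row above the block down every row."""
    n = len(top)
    return [list(top) for _ in range(n)]

def predict_horizontal(top, left):
    """Copy the reference column left of the block across every column."""
    n = len(left)
    return [[left[y]] * n for y in range(n)]

def predict_dc(top, left):
    """Per-pixel average of the vertically and horizontally corresponding references."""
    n = len(top)
    return [[(top[x] + left[y] + 1) >> 1 for x in range(n)] for y in range(n)]

# Hypothetical 8x8 reference samples.
top = [100] * 8
left = [50] * 8
print(predict_dc(top, left)[0])
```

Note that this DC variant, as described on the slide, differs from the single block-wide average used in H.264/AVC.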

9 AVS China Part 2: Encoder and Decoder
Figure 3: AVS China PART 2: Encoder [34] Figure 4: AVS China PART 2: Decoder [34]

10 Intra frame prediction in AVS China Part 2
The most probable mode is predicted from the intra-prediction modes of neighboring blocks. This helps to reduce the average number of bits needed to describe the intra-prediction mode in the video bit stream. The reconstructed pixels of neighboring blocks, before deblocking filtering, are used as reference pixels for the current block, as shown in Figure 4. Figure 4: AVS China Part 2: Macroblock partitioning
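The bit saving from most-probable-mode prediction can be illustrated with a small sketch. It borrows the H.264-style rule (take the smaller of the left and upper neighbors' mode numbers, falling back to DC when a neighbor is unavailable) and a hypothetical signalling cost of one flag bit plus two bits for the remaining four modes; both are assumptions for illustration, not quotations from the AVS specification.

```python
DC_MODE = 2  # mode number of DC prediction in the list above

def most_probable_mode(left_mode, up_mode):
    """Assumed rule: min of the neighbors' modes, or DC if either is unavailable (None)."""
    if left_mode is None or up_mode is None:
        return DC_MODE
    return min(left_mode, up_mode)

def mode_bits(actual_mode, predicted_mode):
    """Hypothetical cost: 1 flag bit on a hit, flag plus 2 bits otherwise."""
    return 1 if actual_mode == predicted_mode else 1 + 2

predicted = most_probable_mode(left_mode=4, up_mode=1)
print(predicted, mode_bits(1, predicted), mode_bits(3, predicted))
```

When neighboring blocks tend to share a direction, most blocks hit the one-bit case, which is where the average saving comes from.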

11 Figure 5: AVS China Part 2: Neighbour pixels in luminance prediction [31]
Figure 6: AVS China Part 2: Five Luminance intra-prediction modes [20]

12 DIRAC:
Dirac is a hybrid motion-compensated video codec that uses modern techniques such as wavelet transforms and arithmetic coding [23]. It is an open-source video coding technology developed by the BBC, usable without any license fees, and named in honor of the British scientist Paul Dirac. Dirac Pro is a member of the Dirac family of compression tools optimized mainly for video production and archiving applications, with a focus on high quality and low latency. It is intended for high-quality applications with lower compression ratios [32].

13 DIRAC: Encoder and Decoder
Figure 7: DIRAC: Encoder [35] Figure 8: DIRAC: Decoder [35]

14 Motion Compensation in DIRAC:
Motion compensation is used to predict the present frame. Dirac uses overlapped block-based motion compensation (OBMC) to achieve good compression and to avoid block-edge artifacts, which would be expensive to code using wavelets. OBMC allows interaction between neighboring blocks. It is performed with basic blocks arranged into macroblocks consisting of a 4x4 array of blocks [28]. The OBMC overlapping function used is an integer approximation to the raised-cosine function [28]. Figure 9: DIRAC: Various modes of splitting macroblocks into sub-blocks [28]
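The idea behind the raised-cosine overlap can be sketched in one dimension. The code below uses a floating-point raised-cosine ramp rather than the integer approximation mentioned above, and the overlap width and sample values are made up for the example. The point it illustrates is that the weights of the two overlapping blocks sum to one at every position, so the transition between block predictions is smooth.

```python
import math

def raised_cosine_ramp(overlap):
    """Rising weights across an overlap region of the given width (floating point)."""
    return [0.5 * (1 - math.cos(math.pi * (i + 0.5) / overlap)) for i in range(overlap)]

def blend_overlap(pred_a, pred_b):
    """Cross-fade from block A's prediction to block B's across their overlap."""
    rising = raised_cosine_ramp(len(pred_a))
    return [(1 - w) * a + w * b for w, a, b in zip(rising, pred_a, pred_b)]

# Hypothetical predictions from two neighboring blocks over a 4-sample overlap.
print(blend_overlap([0, 0, 0, 0], [100, 100, 100, 100]))
```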

15 VP9:
Like Dirac, VP9 is an open-source, license-free video compression format; it was developed by Google [26]. It aims at a 50% reduction in bit rate compared to its predecessor at the same video quality [27]. VP9 introduces super-blocks (SB) of size up to 64x64 and allows recursive decomposition all the way down to 4x4. Unlike in H.265, the blocks do not need to be square, so VP9 can use 64x32 or 4x8 blocks for greater efficiency. Figure 10: Partitioning of a Super Block in VP9 [22]
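The block sizes implied by this recursive decomposition can be enumerated with a short sketch. It assumes that each square level from 64x64 down to 8x8 may also be split into two horizontal or two vertical halves, which yields rectangular sizes such as 64x32 and 4x8; the function name is illustrative.

```python
def vp9_block_sizes(max_size=64, min_size=4):
    """List (width, height) pairs: each square level plus its two half-splits."""
    sizes = []
    n = max_size
    while n > min_size:
        sizes.append((n, n))       # square block
        sizes.append((n, n // 2))  # horizontal split
        sizes.append((n // 2, n))  # vertical split
        n //= 2
    sizes.append((min_size, min_size))
    return sizes

sizes = vp9_block_sizes()
print(len(sizes), sizes)
```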

16 VP9: Encoder and Decoder
Figure 11: VP9: Encoder [22] Figure 12: VP9: Decoder [22]

17 Partitioning of a Super Block and Intra Prediction Modes in VP9
A large part of the coding efficiency improvement achieved in VP9 can be attributed to the incorporation of larger prediction block sizes. It has 10 prediction modes to rebuild them.
For blocks of 4x4: DC, Vertical, Horizontal, TM (True Motion), Horizontal Up, Left Diagonal, Vertical Right, Vertical Left, Right Diagonal, and Horizontal Down.
For blocks from 8x8 to 64x64:
DC_PRED (DC prediction)
TM_PRED (True-motion prediction)
H_PRED (Horizontal prediction)
V_PRED (Vertical prediction)
D27 (angle 27 degrees)
D45 (angle 45 degrees)
D63 (angle 63 degrees)
D117 (angle 117 degrees)
D135 (angle 135 degrees)
D153 (angle 153 degrees)
Figure 13: Angular Intra-Prediction Modes for VP9 [14]
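Of the modes listed above, TM_PRED is the least self-explanatory, so a small sketch may help. Each predicted sample is the left neighbor plus the above neighbor minus the top-left corner sample, clipped to the valid range. The function name, the 8-bit default and the sample values are assumptions for the example.

```python
def tm_predict(above, left, top_left, bit_depth=8):
    """True-motion prediction: left[r] + above[c] - top_left, clipped to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(left[r] + above[c] - top_left, 0), max_val) for c in range(len(above))]
        for r in range(len(left))
    ]

# Hypothetical 4x4 neighborhood with a horizontal gradient above and a vertical one at left.
block = tm_predict(above=[10, 20, 30, 40], left=[10, 12, 14, 16], top_left=10)
for row in block:
    print(row)
```

The mode effectively propagates both the horizontal and the vertical gradient of the neighborhood into the block.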

18 H.264/AVC:
H.264/AVC is a video coding standard developed by the Joint Video Team (JVT) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC MPEG Video Group [21]. Each PU is predicted from neighboring image data in the same picture, using DC prediction (an average value for the PU), planar prediction (fitting a plane surface to the PU) or directional prediction (extrapolating from neighboring data) [40].
Figure 14: Intra_4x4 Prediction in H.264/AVC [40]
Mode 0: Vertical Prediction
Mode 1: Horizontal Prediction
Mode 2: DC Prediction
Mode 3: Diagonal Down-Left Prediction
Mode 4: Diagonal Down-Right Prediction
Mode 5: Vertical Right Prediction
Mode 6: Horizontal Down Prediction
Mode 7: Vertical Left Prediction
Mode 8: Horizontal Up Prediction
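As a rough sketch of how an encoder might choose among these modes, the code below implements only modes 0 to 2 for a 4x4 block and picks the one with the lowest sum of absolute differences (SAD). Using SAD as the selection criterion is an assumption made for brevity; reference encoders typically use a rate-distortion cost, and the six directional modes are omitted here.

```python
def predict_4x4(mode, top, left):
    """Build a 4x4 prediction for mode 0 (vertical), 1 (horizontal) or 2 (DC)."""
    if mode == 0:
        return [list(top) for _ in range(4)]
    if mode == 1:
        return [[left[y]] * 4 for y in range(4)]
    if mode == 2:
        dc = (sum(top) + sum(left) + 4) >> 3  # rounded mean of the 8 neighbors
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0-2 are sketched here")

def sad(block, prediction):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p) for brow, prow in zip(block, prediction) for b, p in zip(brow, prow))

def best_mode(block, top, left):
    """Return the mode number (0, 1 or 2) with the smallest SAD."""
    costs = {m: sad(block, predict_4x4(m, top, left)) for m in (0, 1, 2)}
    return min(costs, key=costs.get)

# Hypothetical block whose columns match the row above it, so vertical prediction fits best.
block = [[10, 20, 30, 40] for _ in range(4)]
print(best_mode(block, top=[10, 20, 30, 40], left=[90, 90, 90, 90]))
```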

19 H.264/AVC: Encoder and Decoder
Figure 15: H.264/AVC: Encoder [36] Figure 16: H.264/AVC: Decoder [36]

20 HEVC(HIGH EFFICIENCY VIDEO CODING)
High Efficiency Video Coding (HEVC) is the latest video coding format [10]. It challenges the state-of-the-art H.264/AVC [17] video coding standard, which is in current use in the industry, by reducing the bit rate by 50% while retaining the same video quality. The HEVC standard, also called H.265, was approved by ITU-T on 13 April 2013 [17]. The Joint Collaborative Team on Video Coding (JCT-VC) is a group of video coding experts from ITU-T Study Group 16 (VCEG) and ISO/IEC JTC 1/SC 29/WG 11 (MPEG).

21 HEVC: Encoder and Decoder
Figure 16: HEVC: Encoder [37] Figure 17: HEVC: Decoder [38]

22 PREDICTION BLOCK SIZES AND MACROBLOCK CONCEPT in HEVC[14]:
The macroblock concept in HEVC [9] is represented by the Coding Tree Unit (CTU). The CTU size can be 16x16, 32x32 or 64x64. A larger CTU size aims to improve the efficiency of block partitioning on high-resolution video sequences. Larger blocks motivate the introduction of quad-tree partitioning of a CTU into smaller coding units (CUs). A coding unit is a bottom-level quad-tree syntax element of CTU splitting. The CU contains a prediction unit (PU) and a transform unit (TU). The TU is a syntax element responsible for storing transform data. Allowed TU sizes are 32x32, 16x16, 8x8 and 4x4. The PU is a syntax element that stores prediction data such as the intra-prediction angle or the inter-prediction motion vector. Figure 18: Coding Tree Unit splitting example with solid lines for CU split: a) with PU splitting depicted as dotted lines, b) with TU splitting depicted as dotted lines [14]
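The quad-tree splitting of a CTU into CUs can be sketched recursively. In the code below the split decision is supplied by the caller as a hypothetical `needs_split` callback (a real encoder would base it on rate-distortion cost), and the minimum CU size of 8 is an assumed parameter.

```python
def split_ctu(x, y, size, needs_split, min_size=8):
    """Recursively split a square region into four quadrants; return leaf CUs as (x, y, size)."""
    if size > min_size and needs_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(split_ctu(x + dx, y + dy, half, needs_split, min_size))
        return leaves
    return [(x, y, size)]

# Toy decision rule: keep splitting whichever region touches the top-left corner.
cus = split_ctu(0, 0, 64, needs_split=lambda x, y, size: x == 0 and y == 0)
print(len(cus), cus)
```

Whatever the decision rule, the leaf CUs always tile the CTU exactly, which is what allows large CUs in flat regions and small ones in detailed regions.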

23 The CU can contain up to four prediction units.
CU splitting into PUs can be 2Nx2N, 2NxN, Nx2N, NxN, 2NxnU, 2NxnD, nLx2N and nRx2N (Figure 19), where 2N is the size of the CU being split. In intra-prediction mode only 2Nx2N PU splitting is allowed. An NxN PU split is also possible for a bottom-level CU that cannot be further split into sub-CUs. Figure 19: Prediction Unit Splitting in HEVC [14] Figure 20: Prediction Modes in HEVC [14] Figure 21: Luma Intra Prediction Modes in HEVC [14]
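The eight PU partition shapes named above can be written out as width and height pairs. This sketch assumes the usual reading of the asymmetric modes, in which the CU is divided at one quarter of its size (for example, 2NxnU gives a short PU on top and a tall PU below); the function name is illustrative.

```python
def pu_partitions(mode, cu_size):
    """Return the (width, height) of each PU for a square CU of side cu_size (= 2N)."""
    n = cu_size // 2  # N
    q = cu_size // 4  # quarter of the CU side, used by the asymmetric modes
    table = {
        "2Nx2N": [(cu_size, cu_size)],
        "2NxN": [(cu_size, n)] * 2,
        "Nx2N": [(n, cu_size)] * 2,
        "NxN": [(n, n)] * 4,
        "2NxnU": [(cu_size, q), (cu_size, cu_size - q)],
        "2NxnD": [(cu_size, cu_size - q), (cu_size, q)],
        "nLx2N": [(q, cu_size), (cu_size - q, cu_size)],
        "nRx2N": [(cu_size - q, cu_size), (q, cu_size)],
    }
    return table[mode]

for mode in ("2Nx2N", "NxN", "2NxnU", "nLx2N"):
    print(mode, pu_partitions(mode, 32))
```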

24 Project plan
Intra-prediction in all the standards has been studied.
Finalization of video test sequences.
Installation and testing of the HM Test Model 16.3 [6], JM Software 18.6 [7], the WebM Project's encoder [5], Dirac Pro software [8] and AVS China reference software [34] as the test software for the respective standards.
Comparison based on the metrics PSNR [4], SSIM [9], MSE [4], BD-PSNR [3], BD-BR [3] and computational complexity; RD plots will also be plotted.

25 References
[1] I. Richardson, “The H.264 Advanced Video Compression Standards”, Wiley, 2010.
[2] K.R. Rao, D.N. Kim and J.J. Hwang, “Video Coding Standards: AVS China, H.264/MPEG-4 Part10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.
[3] BD-BR and BD-PSNR: G. Bjøntegaard, “Calculation of average PSNR differences between RD-curves”, ITU-T Q.6/SG16 VCEG 13th Meeting, Document VCEG-M33, Austin, USA, Apr
[4] PSNR and MSE:
[5] The WebM Project’s VP9 Encoder:
[6] The HM Test Model 16.3:
[7] JM Software 18.6:
[8] Dirac Pro Software:
[9] Z. Wang, et al., “Image quality assessment: From error visibility to structural similarity”, IEEE Trans. Image Processing, vol. 13, pp. 600–612, Apr. 2004.
[10] G.J. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp , Dec. 2013.
[11] Access the website Project on: “Intra Prediction Efficiency and Performance Comparison of HEVC and VP9”, S. Sukumaran, 2014.
[12] ITU-T Recommendation H.263, “Video coding for low bit-rate communication”, Feb. 1998.
[13] J. Ostermann et al, “Video Coding with H.264/AVC: Tools, performance and complexity”, IEEE Circuits and Systems Magazine, vol. 4, pp. 7-28, First Quarter 2004.
[14] M.P. Sharabayko et al, “Intra Compression Efficiency in VP9 and HEVC”, Applied Mathematical Sciences, vol. 7, no. 137, pp. 6803–6824, Hikari Ltd, 2013.
[15] G. Bjontegaard, “Coding improvement by using 4x4 blocks for motion vectors and transform”, Nov. 1997.

26
[16] Z. Nan et al, “Spatial Prediction Based Intra-Coding [video-coding]”, IEEE International Conference on Multimedia and Expo (ICME), vol. 1, pp , June 2004.
[17] T. Wiegand et al, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp , Jul
[18] W. Gao et al, “AVS Video Coding Standard”, Intelligent Studies in Computational Intelligence Volume 280, pp ,
[19] W. Gao et al, “AVS- the Chinese Next Generation Video Coding Standard”,
[20] L. Yu et al, “An Overview of AVS-Video: tools, performance and complexity”, Visual Communications and Image Processing 2005, Proc. of SPIE, vol. 5960, pp , July 31, 2006.
[21] ISO/IEC JTC1/SC29/WG11 (MPEG), “Coding of audio-visual objects - Part 10: Advanced Video Coding”, International Standard , ISO/IEC, 2004.
[22] Access the website Project on: “Comparative study of Intra Frame Coding efficiency in HEVC and VP9”, S. Kodpadi, 2014.
[23] T. Borer and T. Davies, “Dirac video compression using open technology”, BBC EBU Technical Review, July 2005.
[24] “Dirac Pro to bolster BBC HD links”: bbc-hd-links/ article
[25] “And now, Dirac from the Olympics, a new free codec!”
[26] Video Test Sequences:
[27] “VP-Next Overview and Progress Update” (PDF). WebM Project (Google). Retrieved Available on:
[28] T. Davies, “A modified rate-distortion optimization strategy for hybrid wavelet video coding”, IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006 Proceedings, vol. 2, pp. II, May 2006.
[29] Basics of Image Processing:
[30] Multimedia Processing: contents/IIT%20Kharagpur/Multimedia%20Processing/pdf/ssg_m1l1.pdf

27
[31] L. Yu, S. Chen and J. Wang, “Overview of AVS-video coding standards”, Signal Processing: Image Communication, vol. 24, issue 4, Special Issue on AVS and its Application, pp , April 2009.
[32] “Dirac Pro web page” at
[33] M. Wien, “High Efficiency Video Coding: Coding Tools and Specification”, Springer, 2015.
[34] Access the Project on: “Video compression standard for high definition video: A comparative study of H.264, Dirac Pro and AVS part 2”, S. Gangavati, 2012.
[35] Access the website Project on: “Performance Analysis of Dirac Pro with H.264 Intra frame coding”, P. Kharwandikar, 2010.
[36] X. Zhou, E. Q. Li, and Y.-K. Chen, “Implementation of H.264 decoder on general purpose processors with media instructions”, SPIE Conference on Image and Video Communications and Processing, vol. 5022, pp , May 2003.
[37] D. Marpe et al, “The H.264/MPEG4 advanced video coding standard and its applications”, IEEE Communications Magazine, vol. 44, pp , Aug
[38] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T / ISO-IEC Document: JCTVC J0292r1, July 2012.
[39] HM Software manual:
[40] I.E. Richardson, “Coding Video: A Practical guide to HEVC and beyond”, Wiley, 11 May 2015.

