PERFORMANCE COMPARISON OF HEVC intra, JPEG, JPEG 2000, JPEG XR, JPEG LS and VP9 intra
A PROJECT UNDER THE GUIDANCE OF DR. K. R. RAO
COURSE: EE5359 - MULTIMEDIA PROCESSING, SPRING 2016
By Swaroop Krishna Rao

Contents
- Objective
- Need for Compression
- Lossless and Lossy Compression
- HEVC
- JPEG
- JPEG 2000
- JPEG XR
- JPEG LS
- VP9
- Comparison metrics
- Profiles used for comparison
- Test Sequences
- Configurations of tools
- Test Platform
- Results
- Conclusions

List of Acronyms and Abbreviations
ADST: Asymmetric Discrete Sine Transform.
AVC: Advanced Video Coding.
CABAC: Context Adaptive Binary Arithmetic Coding.
CAVLC: Context Adaptive Variable Length Coding.
CSVT: Circuits and Systems for Video Technology.
CTB: Coding Tree Block.
CTU: Coding Tree Unit.
CU: Coding Unit.
DBF: De-blocking Filter.
DCT: Discrete Cosine Transform.
DPB: Decoded Picture Buffer.
DST: Discrete Sine Transform.
EBCOT: Embedded Block Coding with Optimized Truncation.
EZW: Embedded Zero-tree Wavelet.
HD: High Definition.
HEVC: High Efficiency Video Coding.
HM: HEVC Test Model.
HP: High Profile.
IEC: International Electrotechnical Commission.
ISO: International Organization for Standardization.
ITU-T: International Telecommunication Union - Telecommunication Standardization Sector.
JCT-VC: Joint Collaborative Team on Video Coding.
JM: H.264 Test Model.
JPEG: Joint Photographic Experts Group.
JPEG-LS: JPEG Lossless.
JPEG-XR: JPEG Extended Range.
KTA: Key Technical Areas.
LOCO: Low Complexity Lossless Compression.
MC: Motion Compensation.
ME: Motion Estimation.
MJPEG: Motion JPEG.
MPEG: Moving Picture Experts Group.
MSE: Mean Square Error.
NGOV: Next Generation Open Video.
PB: Prediction Block.
PCS: Picture Coding Symposium.
PSNR: Peak Signal to Noise Ratio.
PU: Prediction Unit.
QP: Quantization Parameter.
RD: Rate Distortion.
SAO: Sample Adaptive Offset.
SPIE: Society of Photo-Optical Instrumentation Engineers.
SSIM: Structural Similarity Index.
TB: Transform Block.
TU: Transform Unit.
VCEG: Video Coding Experts Group.
VLC: Variable Length Coding.

Objective
The objective of this project is to study the coding standards HEVC intra [1] [26] [27] [28], JPEG [32], JPEG 2000 [34], JPEG XR [35], JPEG LS [36] and VP9 intra [3] [4], and to understand the techniques used in image coding, such as prediction, transform, quantization and entropy coding. The codecs are compared on two metrics, PSNR [19] and SSIM [5] [14] [24].

Need for Compression
Compression reduces the redundancy in image or video data so that the data can be stored or transmitted efficiently. Compressed video reduces the bandwidth required to transmit video via terrestrial broadcast, cable TV or satellite TV services.

Lossless and Lossy Compression
Lossless compression: There is no information loss, and the image can be reconstructed exactly as the original.
Applications: medical imagery, archiving.
Lossy compression: Some information loss is tolerable, so the reconstructed image loses data and quality relative to the original.
Applications: commercial distribution (DVD) and rate-constrained environments where lossless methods cannot provide a high enough compression ratio.

HEVC/H.265 [1]
High Efficiency Video Coding (HEVC) [1] is an international standard for video compression developed by a working group of ISO/IEC MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group) [1]. It is the latest video coding format. It challenges the state-of-the-art H.264/AVC [2] video coding standard currently used in industry by reducing the bit rate by 50% while retaining the same video quality [1].

Encoder in HEVC [11]
Figure 1: Block Diagram of HEVC Encoder [11]

Decoder in HEVC [15]
Figure 2: Block Diagram of HEVC Decoder [15]

Features of HEVC Partitioning [13]:
Figure 3: Picture, Slice, Coding Tree Unit (CTU), Coding Unit (CU) [13]

Features of HEVC (contd…)
Prediction [1]:
- Intra prediction: Each CU is predicted from neighboring image data in the same picture, using DC prediction, planar prediction or directional prediction.
- Inter prediction: Each PU is predicted from image data in one or two reference pictures, using motion-compensated prediction.
Transform and Quantization [13]:
- Any residual data remaining after prediction is transformed using a block transform based on the Discrete Cosine Transform (DCT) [9]. The transformed data is quantized.
- One or more block transforms of size 32x32, 16x16, 8x8 and 4x4 are applied to the residual data in each CU.
Entropy Coding:
- HEVC uses Context Adaptive Binary Arithmetic Coding (CABAC) [10] for entropy coding.
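The transform and quantization step described above can be illustrated with a small sketch. It is a simplification under stated assumptions: it uses a floating-point orthonormal DCT-II rather than the integer transforms of an actual HEVC encoder, and the step size `STEP` and all function names are hypothetical, chosen only for illustration.

```python
import math


def dct_1d(samples):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(samples[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n))
        coeffs.append(scale * total)
    return coeffs


def idct_1d(coeffs):
    """Inverse of dct_1d (DCT-III with the matching scaling)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        samples.append(total)
    return samples


def _apply_2d(block, transform_1d):
    """Apply a 1-D transform to every row, then to every column."""
    rows = [transform_1d(list(row)) for row in block]
    cols = [transform_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def dct_2d(block):
    return _apply_2d(block, dct_1d)


def idct_2d(coeffs):
    return _apply_2d(coeffs, idct_1d)


def quantize(coeffs, step):
    """Uniform scalar quantization: this is where information is lost."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def dequantize(levels, step):
    return [[level * step for level in row] for row in levels]


# Flat 4x4 residual block: all energy ends up in the DC coefficient.
residual = [[4, 4, 4, 4] for _ in range(4)]
STEP = 8  # hypothetical quantization step, not derived from a QP value
levels = quantize(dct_2d(residual), STEP)
reconstructed = idct_2d(dequantize(levels, STEP))
```

For the flat block, only the top-left (DC) level is non-zero, which is why smooth regions cost few bits after entropy coding.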

JPEG [32]
Compression ratio 2:30
Main Compression Technologies:
- DCT
- Perceptual quantization
- Zig zag scanning
- Huffman coding
- Arithmetic coding
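As an illustration of the zig-zag scanning step listed above, the following sketch generates the zig-zag visiting order for an 8x8 block and reads a block in that order. The function names are hypothetical; only the scan pattern itself follows JPEG.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in zig-zag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal row + col == s, top to bottom.
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked from bottom-left to top-right.
        if s % 2 == 0:
            diagonal.reverse()
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Flatten a square block into a list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def trim_trailing_zeros(values):
    """Drop the run of zeros at the end of a scanned coefficient list."""
    end = len(values)
    while end > 0 and values[end - 1] == 0:
        end -= 1
    return values[:end]


# Label each position with its raster index to make the order visible.
labelled = [[r * 8 + c for c in range(8)] for r in range(8)]
scanned = zigzag_scan(labelled)
```

The scan places low-frequency coefficients first, so the high-frequency coefficients that quantization has set to zero collect in one long run at the end.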

Figure 4: Block diagram of JPEG encoder [32]

Figure 5: Block diagram of JPEG decoder [32]

JPEG 2000 [34]
Compression ratio 2:50
Main Compression Technology:
- Wavelets
- EBCOT
Figure 6(a): Block diagram of JPEG 2000 encoder [34]
Figure 6(b): Block diagram of JPEG 2000 decoder [34]
The first module is component and tile separation, which cuts the image into manageable chunks and decorrelates the color components. Each tile of each component is then processed separately. The data are first transformed into the wavelet domain, then quantized and entropy coded to form the output.
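To make the wavelet-domain step more concrete, here is a minimal sketch of one level of a 1-D integer 5/3 lifting transform, the kind of reversible wavelet filter JPEG 2000 uses for lossless coding. It is not taken from the slides or from JasPer: it handles only even-length inputs, uses mirrored boundary samples, and the function names are hypothetical.

```python
def forward_53(x):
    """One level of the integer 5/3 lifting transform on an even-length list.

    Returns (low, high): the low-pass and high-pass subbands.
    """
    if len(x) < 2 or len(x) % 2:
        raise ValueError("this sketch needs an even-length input")
    even, odd = x[0::2], x[1::2]
    half = len(even)

    # Predict step: each odd sample minus the average of its even neighbours.
    high = []
    for n in range(half):
        right = even[n + 1] if n + 1 < half else even[n]  # mirror at the edge
        high.append(odd[n] - (even[n] + right) // 2)

    # Update step: smooth the even samples with the new detail values.
    low = []
    for n in range(half):
        left = high[n - 1] if n > 0 else high[0]  # mirror at the edge
        low.append(even[n] + (left + high[n] + 2) // 4)
    return low, high


def inverse_53(low, high):
    """Exactly undo forward_53 by running the lifting steps in reverse."""
    half = len(low)
    even = []
    for n in range(half):
        left = high[n - 1] if n > 0 else high[0]
        even.append(low[n] - (left + high[n] + 2) // 4)

    x = []
    for n in range(half):
        right = even[n + 1] if n + 1 < half else even[n]
        x.append(even[n])
        x.append(high[n] + (even[n] + right) // 2)
    return x


signal = [3, 7, 1, 8, 2, 9, 4, 6]
low_band, high_band = forward_53(signal)
restored = inverse_53(low_band, high_band)
```

Because every lifting step is undone with the same integer arithmetic, the round trip is exact, which is what makes this filter usable for lossless coding.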

JPEG XR [35]
Higher compression ratio than JPEG
Main Compression Technology: based on HD Photo of Microsoft (Windows Media Photo)
Figure 7: Block diagram of JPEG XR encoder [38]
The first module is component and tile separation, which cuts the image into manageable chunks and decorrelates the color components. The transform used in JPEG XR is the two-stage photo core transform (PCT) [7]. The transform is block based, using 4x4 blocks. To prevent blocking effects, overlap filtering can be applied before each transform stage. The quantization coefficient can differ between frequency bands (DC, LP and HP) and between macroblocks. The direction of prediction can differ between macroblocks: macroblocks with high horizontal correlation are predicted from the left, and macroblocks with high vertical correlation are predicted from the top. DC coefficients can also be predicted from both the top and left macroblocks. There are three different scanning schemes in JPEG XR: two are used for scanning HP blocks and the third for scanning LP coefficients. Adaptive entropy coding is the final step of the algorithm, and its output is the compressed bit stream.

Figure 8: Block diagram of JPEG XR decoder [38]

JPEG LS [36]
Compression Ratio 2:1
Main Compression Technologies:
- Context Modeling
- Prediction
- Golomb Codes
- Arithmetic coding
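The prediction and Golomb coding steps listed above can be sketched as follows. The predictor is the median edge detector described for LOCO-I [36]; the Golomb-Rice coder is a textbook version with a hand-picked parameter `k`, not the bit-exact, context-adaptive code of the JPEG-LS standard. All function names are hypothetical.

```python
def med_predict(left, above, above_left):
    """Median edge detector: predict a pixel from three causal neighbours."""
    if above_left >= max(left, above):
        return min(left, above)
    if above_left <= min(left, above):
        return max(left, above)
    return left + above - above_left


def map_error(error):
    """Fold a signed prediction error into a non-negative integer.

    0, -1, 1, -2, 2, ... map to 0, 1, 2, 3, 4, ...
    """
    return 2 * error if error >= 0 else -2 * error - 1


def golomb_rice_encode(value, k):
    """Encode a non-negative integer as a unary quotient plus k remainder bits."""
    quotient = value >> k
    bits = "1" * quotient + "0"
    if k:
        bits += format(value & ((1 << k) - 1), "0{}b".format(k))
    return bits


def golomb_rice_decode(bits, k):
    """Decode a single codeword produced by golomb_rice_encode."""
    quotient = bits.index("0")
    remainder = int(bits[quotient + 1:quotient + 1 + k], 2) if k else 0
    return (quotient << k) | remainder


# Code one pixel: predict it, then encode the folded prediction error.
pixel, left, above, above_left = 18, 10, 20, 15
prediction = med_predict(left, above, above_left)
codeword = golomb_rice_encode(map_error(pixel - prediction), k=2)
```

Small prediction errors get short codewords, which is why an accurate predictor translates directly into a lower bit rate.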

Figure 9: Block diagram of JPEG-LS [37]

VP9 [3][4]
VP9 is an open and royalty-free video compression standard developed by Google as the successor to VP8 [3]. One of the goals of VP9 is to reduce the bit rate by 50% compared to VP8 at the same video quality [16]. VP9 has many design improvements over VP8, including support for superblocks of 64x64 pixels.

Encoder in VP9 [18]
Figure 10: Encoder block diagram for VP9 [18]

Decoder in VP9 [18]
Figure 11: Decoder block diagram for VP9 [18]

Features of VP9
Partitioning [4] [17]:
Figure 12: Example partitioning of a 64x64 Super-block [4] [17]

Features of VP9 (contd…)
Prediction [4]:
- Intra prediction: VP9 supports a set of 10 intra prediction modes for block sizes ranging from 4x4 up to 32x32.
- Inter prediction: VP9 supports a set of 4 inter prediction modes for block sizes ranging from 4x4 up to 64x64 pixels.
Transform and Quantization [4]:
- The residuals after subtraction of the predicted pixel values are subjected to transformation and quantization.
- Transform blocks can be 32x32, 16x16, 8x8 or 4x4 pixels.
Entropy coding [4]:
- VP9 uses the 8-bit arithmetic coding engine from VP8, known as the bool-coder.

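Comparison metrics
Peak Signal to Noise Ratio (PSNR) [19]
The PSNR in dB is defined as:
$$\mathrm{PSNR} = 20 \log_{10}\left(\frac{MAX_f}{\sqrt{MSE}}\right)$$
where MAX_f is the maximum signal value that exists in the original image and MSE is the mean square error between the original and the reconstructed image.
A combined Y'CbCr PSNR is calculated for the 4:2:0 format.
Structural Similarity Index (SSIM) [5][14][24]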
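Both metrics can be computed with a few lines of code. The sketch below is illustrative only: PSNR follows the usual definition in terms of the mean square error; the 6:1:1 weighting of the Y', Cb and Cr planes is a convention often used for 4:2:0 material and is an assumption here, not confirmed by the slides; and `global_ssim` evaluates the SSIM formula of [5] over a single window covering the whole image, whereas practical tools average it over local windows. All function names are hypothetical, and images are passed as flat lists of samples.

```python
import math


def mse(reference, test):
    """Mean square error between two equally sized sample lists."""
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr(reference, test, max_value=255):
    """PSNR in dB; infinite when the two images are identical."""
    error = mse(reference, test)
    if error == 0:
        return math.inf
    return 20 * math.log10(max_value / math.sqrt(error))


def combined_psnr_420(psnr_y, psnr_cb, psnr_cr):
    """Weighted Y'CbCr PSNR (assumed 6:1:1 weighting for 4:2:0 video)."""
    return (6 * psnr_y + psnr_cb + psnr_cr) / 8


def global_ssim(reference, test, max_value=255):
    """SSIM computed over one window spanning the whole image."""
    n = len(reference)
    mu_x = sum(reference) / n
    mu_y = sum(test) / n
    var_x = sum((x - mu_x) ** 2 for x in reference) / n
    var_y = sum((y - mu_y) ** 2 for y in test) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(reference, test)) / n
    c1 = (0.01 * max_value) ** 2  # stabilizing constants from the SSIM paper
    c2 = (0.03 * max_value) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


original = [52, 55, 61, 59, 79, 61, 76, 61]
decoded = [54, 55, 60, 60, 77, 62, 75, 63]
quality_db = psnr(original, decoded)
similarity = global_ssim(original, decoded)
```

PSNR depends only on the per-sample error, while SSIM also compares means, variances and covariance, which is why the two metrics can rank codecs differently.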

Profiles used for comparison
The HM 16.9 [20] [25], JPEG baseline reference software [39], JasPer (version ) [40], HD Photo Device Porting Kit (version 1.0) [41], HP LOCO-I [42] and the VPX encoder from the WebM Project [21] are used as the test models for HEVC intra, JPEG, JPEG 2000, JPEG XR, JPEG LS and VP9 respectively.
Software and tools:
- Visual Studio 2015
- MSU Video Quality Measurement Tool Demo 64-bit

Test Sequences
Figure 13: Johnny_1280x720_60.yuv [22]

Test Sequences (contd…)
Figure 14: Jockey_1920x1080.yuv [22]

Test Sequences (contd…)
Figure 15: PeopleOnStreet_2560_1600_30_crop.yuv [22]

Test Sequences (contd…)
Figure 16: Bosphorus_3840x2160.yuv [22]

Configuration of HM 16.9 [20]
Main all-intra profile settings:
IntraPeriod : 1 # Period of I-Frame ( -1 = only first)
GOPSize : 1 # GOP Size (number of B slice = GOPSize-1)
QP : 32 # Quantization parameter
Command line parameters for using HM 16.9 encoder:
TAppEncoder [-h] [-c config.cfg] [--parameter=value]
where
-h Prints parameter usage
-c Defines configuration file to use. Multiple configuration files may be used with repeated -c options.
--parameter=value Assigns value to a given parameter.
Sample command line parameters for HM 16.9 encoder:
C:\Users\Swaroop\OneDrive\Documents\Second Semester\MMP\HM 16.9\bin\vc2015\Win32\Release>TAppEncoder.exe -c encoder_intra_main.cfg -wdt hgt fr 30 -f 1 -i Jockey_1920x1080.yuv

Configuration of JPEG [39]
The command line arguments for the JPEG baseline software [39] are as follows. The input image should be in pnm or ppm format.
Encoder: cjpeg -quality N inputfile.pnm outputfile.jpg
where the quality factor N scales the quantization tables to adjust image quality. The quality factor varies from 0 (worst) to 100 (best); the default is 75.
Decoder: djpeg -outfile outputfilename.pnm -fileformat inputfile.jpg

Configuration of JPEG 2000 [40]
The command line arguments for the JPEG 2000 software [40] are as follows:
Encoder: jasper --input inputfilename.bmp --output outputfilename.jp2 --output-format jp2 -O rate=0.01
where rate specifies the target rate as a positive real number. A rate of 1 corresponds to no compression.
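Since a rate of 1 is described as no compression, the rate can be read as the target size expressed as a fraction of the uncompressed size. The arithmetic below is a sketch under that reading; the function name and the 24-bit, 1920x1080 example image are hypothetical.

```python
def target_size_bytes(width, height, bytes_per_pixel, rate):
    """Return (raw_size, target_size) in bytes for a given target rate.

    rate is interpreted as the fraction of the uncompressed size,
    so rate = 1 leaves the size unchanged.
    """
    if not 0 < rate <= 1:
        raise ValueError("rate must be in the interval (0, 1]")
    raw_size = width * height * bytes_per_pixel
    return raw_size, round(raw_size * rate)


# Hypothetical example: one 1920x1080 frame stored at 3 bytes per pixel.
raw, target = target_size_bytes(1920, 1080, 3, rate=0.01)
compression_ratio = raw / target
```

With rate=0.01, the 6,220,800-byte image is targeted at about 62,208 bytes, a compression ratio of roughly 100:1.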

Configuration of JPEG XR [41]
For Microsoft HD Photo [41], all options are set to their default values, with the only control coming from the quality factor setting:
- No tiling
- One level of overlap in the transformation stage
- No color space sub-sampling
- Spatial bit-stream order
- All sub-bands are included without any skipping

The command line arguments for the JPEG XR software [41] are as follows:
Encoder: wmpencapp -i input.bmp -o output.wdp -q []
where a nonzero quality factor q lowers the PSNR, resulting in lossy compression; q = 0 is the case of lossless compression. The wmpencapp command line converts certain uncompressed file formats into equivalent HD Photo files.

Decoder: wmpdecapp -i input.wdp -o output.bmp -c []
where the wmpdecapp command line converts HD Photo files to different uncompressed file formats and c denotes the output format: c = 0 for 24bppBGR and c = 2 for 8bppGray.

Configuration of JPEG LS [42]
The settings for the JPEG-LS software [42] are as follows at the encoder. Decoder settings need not be changed from their defaults, as they follow the encoder settings.
- Images should be in ppm or pgm format.
- Line interleaved mode is considered in the project.
- The error value is varied from 1 to 60. An error value of zero corresponds to lossless compression.
- T1, T2 and T3 are thresholds. The settings must satisfy the condition Error value + 1 < T1 < T2 < T3.
- The default RESET value of 64 is considered in the project.
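The threshold condition can be checked before launching a batch of encodes. The helper below is a hypothetical sketch, not part of the HP LOCO-I software, and the threshold values in the example are chosen only for illustration.

```python
def thresholds_are_valid(error_value, t1, t2, t3):
    """Check the condition error_value + 1 < T1 < T2 < T3."""
    if error_value < 0:
        raise ValueError("error value must be non-negative")
    return error_value + 1 < t1 < t2 < t3


# Illustrative thresholds: find which error values from the sweep 1..60
# they can be used with.
T1, T2, T3 = 12, 24, 48
usable = [e for e in range(1, 61) if thresholds_are_valid(e, T1, T2, T3)]
```

With these illustrative thresholds only error values 1 to 10 pass, so larger error values in the sweep would need larger thresholds.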

Encoder: locoe [flags] [-i infile] [-o outfile]
Files:
infile: Input file, must be in PGM or PPM format.
outfile: Output file in JPEG-LS format.
Decoder: locod [flags] [-i infile] [-o outfile]

Configuration of VP9 [18]
This section describes the command line parameters required for encoding a video test sequence in VP9.
Sample command line parameters for VP9:
vpxenc PeopleOnStreet_2560x1600_30_crop.yuv -o pos.22.webm \
--codec=vp9 --i420 --width= height= passes=50 -t 0 \
--good --cpu-used=0 --end-usage=q \
--limit=1 --fps=30000/ verbose --psnr \
--lag-in-frames=25 --kf-max-dist=1 \
--min-q=32 --max-q=32

Test Platform
Processor: Intel(R) Core(TM) i3-5005U CPU @ 2.00GHz
Installed Memory (RAM): 6.00 GB
System Type: 64-bit operating system, x64-based processor

Results
Figure 17: PSNR Vs Frame Index plot for Johnny_1280x720_60.yuv

Figure 18: SSIM Vs Frame Index plot for Johnny_1280x720_60.yuv

Figure 19: PSNR Vs Frame Index plot for Jockey_1920x1080.yuv

Figure 20: SSIM Vs Frame Index plot for Jockey_1920x1080.yuv

Figure 21: PSNR Vs Frame Index plot for PeopleOnStreet_2560_1600_30_crop.yuv

Figure 22: SSIM Vs Frame Index plot for PeopleOnStreet_2560_1600_30_crop.yuv

Figure 23: PSNR Vs Frame Index plot for Bosphorus_3840x2160.yuv

Figure 24: SSIM Vs Frame Index plot for Bosphorus_3840x2160.yuv

Figure 25 : Rank of the codecs

Figure 26: Comparison of encoding time (secs) for Johnny_1280x720_60.yuv

Figure 27: Comparison of encoding time (secs) for Jockey_1920x1080.yuv

Figure 28: Comparison of encoding time (secs) for PeopleOnStreet_2560_1600_30_crop.yuv

Figure 29: Comparison of encoding time (secs) for Bosphorus_3840x2160.yuv

Conclusions
- In terms of PSNR, HEVC outperforms VP9 by ∼0.95–1.05 dB.
- Compared with JPEG, HEVC shows a PSNR gain of ∼4.05–4.45 dB.
- The JPEG 2000 scheme shows a peak in PSNR loss of ∼ dB when compared with HEVC.
- In terms of SSIM, HEVC outperforms the other codecs. The second best performing scheme is VP9.
- JPEG produced the worst quality images.
- The HEVC encoder is more complex than the others.

ACKNOWLEDGEMENT I would sincerely like to thank Dr. K. R. Rao for his constant support and guidance throughout the duration of my project. I would also like to thank Tuan Ho and MPL members for assisting me in my project.

References
[1] G. J. Sullivan et al, “Overview of the high efficiency video coding (HEVC) standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp – 1668, Dec 2012.
[2] JVT Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264-ISO/IEC AVC), March 2003, JVT-G050 available on
[3] D. Grois et al, “Performance Comparison of H.265/MPEG-HEVC, VP9, and H.264/MPEG-AVC Encoders”, IEEE PCS 2013, pp , San José, CA, USA, Dec 8-11, 2013.
[4] D. Mukherjee et al, “The latest open-source video codec VP9–An overview and preliminary results”, Google Inc., United States.
[5] Z. Wang et al, “Image quality assessment: From error visibility to structural similarity”, IEEE Transactions on Image Processing, vol. 13, no. 4, pp , Apr. 2004.
[6] G. Bjøntegaard, “Calculation of average PSNR differences between RD-curves”, ITU-T Q.6/SG16 VCEG 13th Meeting, Document VCEG-M33, Austin, USA, Apr. 2001.

[7] X. Li et al, “Rate-complexity-distortion evaluation for hybrid video coding”, IEEE International Conference on Multimedia and Expo (ICME), pp , July 2010.
[8] I. E. G. Richardson, “Video Codec Design: Developing Image and Video Compression Systems”, Wiley, 2002.
[9] N. Ahmed, T. Natarajan and K. R. Rao, “Discrete Cosine Transform”, IEEE Transactions on Computers, vol. C-23, pp , Jan
[10] D. Marpe, H. Schwarz and T. Wiegand, “Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, pp. 620–636, July 2003.
[11] G. J. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp , Dec
[12] HEVC white paper -
[13] HEVC tutorial by I. E. G. Richardson:
[14] W. Malpica and A. Bovik, “Range image quality assessment by structural similarity”, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr

[15] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T / ISO-IEC Document: JCTVC J0292r1, July 2012.
[16] “VP-Next Overview and Progress Update” (PDF). WebM Project (Google). Retrieved Available on:
[17] M. P. Sharabayko et al, “Intra Compression Efficiency in VP9 and HEVC”, Applied Mathematical Sciences, vol. 7, no. 137, pp – 6824, Hikari Ltd, 2013.
[18] J. Padia, “Complexity reduction for VP6 to H.264 transcoder using motion vector reuse”, M.S. Thesis, EE Dept., UTA, Arlington, TX, Available on:
[19] White paper on PSNR - National Instruments -
[20] Access to HM Reference Software:
[21] Chromium® open-source browser project, VP9 source code, Online:
[22] Video test sequences -

[23] Cisco Visual Networking Index -
[24] J. Wang et al, “Fractal image coding using SSIM”, IEEE 18th International Conference on Image Processing, pp , Brussels, Sept
[25] HEVC Software Reference Manual:
[26] K. R. Rao, D. N. Kim and J. J. Hwang, “Video Coding Standards: AVS China, H.264/MPEG-4 Part10, HEVC, VP6, DIRAC and VC-1”, Springer,
[27] V. Sze, M. Budagavi and G. J. Sullivan, “High Efficiency Video Coding (HEVC): Algorithms and Architectures”, Springer,
[28] M. Wien, “High Efficiency Video Coding: Coding Tools and Specification”, Springer,
[29] I. E. Richardson, “Coding Video: A practical guide to HEVC and beyond”, Wiley,
[30] D. K. Kwon and M. Budagavi, “Combined scalable and multiview extension of High Efficiency Video Coding (HEVC)”, IEEE Picture Coding Symposium, pp , Dec

[31] K. Sayood, “Introduction to Data Compression”, Third Edition, Morgan Kaufmann,
[32] G. K. Wallace, “The JPEG still picture compression standard”, Communications of the ACM, vol. 34, no. 4, pp , April
[33] D. Marpe, V. George and T. Wiegand, “Performance comparison of intra-only H.264/AVC HP and JPEG 2000 for a set of monochrome ISO/IEC test images”, JVT-M014, pp. 18-22, Oct
[34] P. Topiwala, T. Tran and W. Dai, “Performance comparison of JPEG2000 and H.264/AVC high profile intra-frame coding on HD video sequences”, Proc. SPIE International Symposium, Applications of Digital Image Processing XXX, vol. 6696, 66960B, San Diego, CA, Aug
[35] T. Tran, L. Liu and P. Topiwala, “Performance comparison of leading image codecs: H.264/AVC intra, JPEG 2000, and Microsoft HD photo”, Proc. SPIE International Symposium, Digital Image Processing, vol. 6696, San Diego, CA, Sept
[36] M. J. Weinberger, G. Seroussi and G. Sapiro, “The LOCO-I lossless image compression algorithm: principles and standardization into JPEG-LS”, IEEE Transactions on Image Processing, vol. 9, pp , Aug
[37] JPEG-LS encoder and decoder block diagrams:

[38] JPEG-XR encoder and decoder block diagrams:
[39] JPEG reference software: ftp://ftp.simtel.net/pub/simtelnet/msdos/graphics/jpegsr6.zip
[40] JPEG2000 latest reference software:
[41] JPEG-LS reference software:
[42] Microsoft HD photo specification:
[43] Test sequences:
[44] JCT-VC documents can be accessed. [online]. Available:
[45] VCEG & JCT documents available from in the video-site and jvt-site folders.
[46] T. Nguyen and D. Marpe, “Objective Performance Evaluation of the HEVC Main Still Picture Profile”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 5, pp , May 2015.

[47] Tutorials: D. Grois, B. Bross and D. Marpe, “HEVC/H.265 Video Coding Standard (Version 2) including the Range Extensions, Scalable Extensions, and Multiview Extensions”, (Tutorial) Monday 29 June am-3:00pm, IEEE ICME 2015, Torino, Italy, 29 June – 3 July, 2015.
[48] Tutorials: D. Grois, B. Bross and D. Marpe, “HEVC/H.265 Video Coding Standard including the Range Extensions, Scalable Extensions, and Multiview Extensions”, (Tutorial), IEEE ICCE, Berlin, Germany, 6 – 9 Sept
[49] Tutorials: D. Grois, B. Bross and D. Marpe, “HEVC/H.265 Video Coding Standard (Version 2) including the Range Extensions, Scalable Extensions, and Multiview Extensions”, (Tutorial) Sunday 27 Sept 2015, 9:00 am to 12:30 pm, IEEE ICIP, Quebec City, Canada, 27 – 30 Sept

Thank you!

Questions?!
