Published by Hortense Douglas. Modified over 8 years ago.
1
Parallelization of HEVC Deblocking filters using CUDA GPU A PROJECT PROPOSAL UNDER THE GUIDANCE OF DR. K. R. RAO COURSE: EE5359 - MULTIMEDIA PROCESSING, SPRING 2016 SUBMISSION DATE: 7th MARCH 2016 SUBMITTED BY ARPITA YAGNIK UT ARLINGTON ID: 1000810583 EMAIL ID: arpita.yagnik@mavs.uta.edu
2
TABLE OF CONTENTS 1. Acronyms 2. Objective and Action Plan 3. Why Parallel processing 4. Advantages of parallelization of Deblocking filter 5. HEVC Decoder block diagram 6. HEVC Deblocking filter 7. What is GPU accelerated computing 8. Concept of GPU in video processing 9. GPU vs CPU 10. References
3
Acronyms AVC: Advanced Video Coding; BS: Boundary Strength; CODEC: COder/DECoder; Chroma: Chrominance; CTU: Coding Tree Unit; CU: Coding Unit; CUDA: Compute Unified Device Architecture; DBF: Deblocking Filter; DCT: Discrete Cosine Transform; DFT: Discrete Fourier Transform; GPU: Graphics Processing Unit; HEVC: High Efficiency Video Coding; ITU-T: International Telecommunication Union (Telecommunication Standardization Sector); IEC: International Electrotechnical Commission; ISO: International Organization for Standardization; JBIG: Joint Bi-level Image Experts Group; JPEG: Joint Photographic Experts Group; JCT-VC: Joint Collaborative Team on Video Coding; LOT: Lapped Orthogonal Transform; Luma: Luminance; MB: Macroblock; MPEG: Moving Picture Experts Group; OBMC: Overlapped Block Motion Compensation; PU: Prediction Unit; QP: Quantization Parameter; SAO: Sample Adaptive Offset; TU: Transform Unit
4
Objective and Action Plan Objective: To implement the parallelization of the deblocking filter for the HEVC codec using a CUDA GPU. 1. Implement an algorithm for parallel processing of the deblocking filter operation. 2. Determine a way to program the CUDA GPU. 3. Find the appropriate location in the HM code at which to implement the algorithm. 4. Execute the algorithm in both CPU-only and CPU+GPU modes to compare the processing time and quality parameters. 5. If time permits, devise a novel algorithm for the same.
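Step 4 of the action plan compares processing time and quality between the two modes. The sketch below shows one way such a comparison could be scripted. It assumes PSNR as the quality parameter (the proposal does not name one), and the helper names psnr, timed and speedup are hypothetical and not part of the HM code.

```python
import math
import time


def psnr(reference, test, peak=255):
    """PSNR in dB between two equally sized sequences of samples.

    Assumption: 8-bit samples (peak = 255) and PSNR as the quality metric.
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical outputs
    return 10.0 * math.log10(peak * peak / mse)


def timed(func, *args):
    """Run func(*args) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def speedup(cpu_seconds, gpu_seconds):
    """Ratio of CPU-only time to CPU+GPU time (greater than 1 means faster)."""
    return cpu_seconds / gpu_seconds
```

If the CPU+GPU path reproduces the CPU-only deblocking exactly, psnr between the two outputs would be infinite, so any finite value would point to a mismatch between the two implementations.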
5
Why Parallel Processing It has been shown that the HEVC deblocking filter is responsible for 14% of the time consumption in the random access configuration, for Full HD video sequences. Compared to the DBF of AVC, DBF of HEVC is computationally less complex and offers more parallelization possibilities.
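As a rough bound on what the 14% figure implies, Amdahl's law can be applied. This is a sketch under the assumptions that the 14% share applies to the total processing time, that only the deblocking filter is accelerated, and that host-to-device transfer costs are ignored. The symbols p (fraction of time spent in the DBF) and s (speedup of the DBF alone) are introduced here for illustration.

```latex
S(s) = \frac{1}{(1 - p) + \dfrac{p}{s}}, \qquad p = 0.14

% Example: DBF made 10 times faster
S(10) = \frac{1}{0.86 + 0.014} = \frac{1}{0.874} \approx 1.144

% Upper bound as s grows without limit
\lim_{s \to \infty} S(s) = \frac{1}{0.86} \approx 1.163
```

Under these assumptions, parallelizing the deblocking filter alone would shorten the overall run time by at most about 14%, so the measured gain in the CPU+GPU mode should be read against this ceiling.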
6
Advantages of Parallelization of Deblocking Filter Reduced hardware complexity as the order of filtering the block boundaries does not change with different orders of CTU decoding. Useful for parallel processing on multi core processors. Improves throughput and greatly reduces the bandwidth requirement for multicore based HEVC implementation. Highly parallelized HEVC deblocking filter provides enough cycle margins to enable a combination of deblocking filter and SAO in the same building block.
7
HEVC Decoder HEVC Decoder Block Diagram[28]
8
Contd… The three stages of HEVC video decoding: Entropy Decoding and Picture Reconstruction; DBF; SAO Filtering
9
HEVC Deblocking filter The deblocking filter in HEVC has been designed to improve subjective quality while reducing complexity. The HEVC deblocking filter is well suited to parallel processing: it has been designed to prevent spatial dependencies across the picture, which, together with other design features, enables easy parallelization.
10
Contd… Deblocking in HEVC has been designed to prevent spatial dependencies of the deblocking process across the picture. There is no overlap between the filtering operations for one block edge, which can modify up to 3 samples on each side of the edge, and the filtering decisions for the neighboring parallel block edge, which involve at most 4 samples on each side of that edge. Hence any vertical block edge in the picture can be deblocked in parallel with any other vertical edge. The same holds for horizontal edges. The picture is divided into non-overlapping 8x8 blocks of samples. Each of those 8x8 blocks contains all the data needed for its deblocking. Consequently, deblocking can be performed independently for each of the 8x8 blocks. Moreover, the order of vertical and horizontal filtering for each block is exactly the same irrespective of the block position.
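The independence argument above can be checked with a few lines of arithmetic. The sketch below is not the HEVC filter itself. It only enumerates which sample columns each vertical edge may modify (3 on each side) and which it may read for its decisions (4 on each side), and verifies that no edge modifies a column that another edge reads. The function names and the grid parameter are hypothetical, introduced for illustration.

```python
MAX_MODIFIED = 3  # samples an edge filter may change on each side of the edge
MAX_READ = 4      # samples the filtering decisions may read on each side


def vertical_edges(width, grid=8):
    """x positions of vertical edges on the deblocking grid (picture border excluded)."""
    return list(range(grid, width, grid))


def modified_columns(edge_x):
    """Columns that filtering the edge at edge_x may modify."""
    return set(range(edge_x - MAX_MODIFIED, edge_x + MAX_MODIFIED))


def read_columns(edge_x):
    """Columns that the filtering decisions for the edge at edge_x may read."""
    return set(range(edge_x - MAX_READ, edge_x + MAX_READ))


def edges_are_independent(width, grid=8):
    """True if no edge modifies a column that a different edge reads."""
    edges = vertical_edges(width, grid)
    return all(
        not (modified_columns(a) & read_columns(b))
        for a in edges
        for b in edges
        if a != b
    )
```

With the 8-sample grid, the edge at x = 8 may modify columns 5 to 10 while the edge at x = 16 reads columns 12 to 19, so edges_are_independent(64) is true. Calling it with a hypothetical 4-sample grid, edges_are_independent(64, grid=4), returns false because the ranges of neighboring edges overlap.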
11
Contd… [1] 8x8 grid for deblock filtering
12
What is GPU accelerated computing GPU-accelerated computing is the use of a graphics processing unit (GPU) together with a CPU to accelerate scientific, analytics, engineering, consumer, and enterprise applications. Pioneered in 2007 by NVIDIA, GPU accelerators now power energy-efficient data centers in government labs, universities, enterprises, and small-and-medium businesses around the world. GPUs are accelerating applications in platforms ranging from cars, to mobile phones and tablets, to drones and robots.[51]
13
Concept of GPU accelerated computing Concept of GPU Parallel Processing [51]
14
GPU vs CPU A simple way to understand the difference between a CPU and GPU is to compare how they process tasks. A CPU consists of a few cores optimized for sequential serial processing while a GPU has a massively parallel architecture consisting of thousands of smaller, more efficient cores designed for handling multiple tasks simultaneously. [51] GPU vs CPU Video
15
Contd… Parallel processing through GPU [51]
16
Introduction to CUDA GPU CUDA® (Compute Unified Device Architecture) is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU). [51] CUDA is widely used in general-purpose computation, such as astronomical calculation, computational fluid dynamics simulation, image processing and video coding.
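To connect the CUDA programming model with the 8x8 deblocking grid, the sketch below emulates in Python how a launch could be sized and how a thread could locate its block. It assumes one thread per 8x8 block and 256 threads per thread block. Both are illustrative choices, not taken from the proposal, and the function names are hypothetical. The index formula mirrors the usual CUDA expression blockIdx.x * blockDim.x + threadIdx.x.

```python
BLOCK = 8  # side of one deblocking grid block, in samples


def launch_config(width, height, threads_per_block=256):
    """Return (blocks_x, blocks_y, thread_blocks) for one thread per 8x8 block."""
    blocks_x = (width + BLOCK - 1) // BLOCK   # ceiling division
    blocks_y = (height + BLOCK - 1) // BLOCK
    total = blocks_x * blocks_y
    thread_blocks = (total + threads_per_block - 1) // threads_per_block
    return blocks_x, blocks_y, thread_blocks


def thread_to_block(block_idx, thread_idx, block_dim, blocks_x, blocks_y):
    """Map a (thread block, thread) pair to the (column, row) of its 8x8 block.

    Returns None for surplus threads in the last thread block, which would
    simply exit in a real kernel.
    """
    gid = block_idx * block_dim + thread_idx  # global thread index
    if gid >= blocks_x * blocks_y:
        return None
    return gid % blocks_x, gid // blocks_x
```

For a 1920x1080 picture this gives 240 x 135 = 32400 blocks of 8x8 samples, covered by 127 thread blocks of 256 threads, with the last 112 threads left idle.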
17
References
1. Andrey Norkin et al., “HEVC Deblocking Filter”, IEEE Transactions on CSVT, vol. 22, no. 12, pp. 1746-1754, Dec. 2012.
2. Wei-Yi Wei, “Deblocking Algorithms in Video and Image Compression Coding”, National Taiwan University, Taipei, Taiwan, ROC.
3. B. Bross, et al., High Efficiency Video Coding (HEVC) Text Specification Draft 8, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-J1003, Joint Collaborative Team on Video Coding (JCTVC), Stockholm, Sweden, Jul. 2012.
4. ITU-T and ISO/IEC JTC 1, Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264 and ISO/IEC 14496-10 (AVC), May 2003 (and subsequent editions).
5. T. Wedi and H. G. Musmann, “Motion and aliasing compensated prediction for hybrid video coding,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 577-586, Jul. 2003.
6. P. List, et al., “Adaptive deblocking filter,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 614-619, Jul. 2003.
7. K. Ugur, K. R. Andersson, and A. Fuldseth, Video Coding Technology Proposal by Tandberg, Nokia, and Ericsson, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-A119, Joint Collaborative Team on Video Coding (JCTVC), Dresden, Germany, Apr. 2010.
8. A. Norkin, et al., CE12: Ericsson’s and MediaTek’s Deblocking Filter, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-F118, Joint Collaborative Team on Video Coding (JCTVC), Turin, Italy, Jul. 2011.
9. M. Ikeda and T. Suzuki, Non-CE10: Introduction of Strong Filter Clipping in Deblocking Filter, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-H0275, Joint Collaborative Team on Video Coding (JCTVC), San Jose, CA, Feb. 2012.
10. M. Ikeda, J. Tanaka, and T. Suzuki, CE12 Subset2: Parallel Deblocking Filter, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-E181, Joint Collaborative Team on Video Coding (JCTVC), Geneva, Switzerland, Mar. 2011.
18
Contd…
11. M. Narroschke, S. Esenlik, and T. Wedi, CE12 Subtest 1: Results for Modified Decisions for Deblocking, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G590, Joint Collaborative Team on Video Coding (JCTVC), Geneva, Switzerland, Nov. 2011.
12. A. Norkin, CE10.3: Deblocking Filter Simplifications: Bs Computation and Strong Filtering Decision, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-H0473, Joint Collaborative Team on Video Coding (JCTVC), San Jose, CA, Feb. 2012.
13. A. Fuldseth, et al., Tiles, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-F335, Joint Collaborative Team on Video Coding (JCTVC), Turin, Italy, Jul. 2011.
14. T. Yamakage, et al., CE12: Deblocking Filter Parameter Adjustment in Slice Level, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G174, Joint Collaborative Team on Video Coding (JCTVC), Geneva, Switzerland, Nov. 2011.
15. G. Van der Auwera, et al. (Panasonic), Support of Varying QP in Deblocking, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G1031, Joint Collaborative Team on Video Coding (JCTVC), Geneva, Switzerland, Nov. 2011.
16. M. Zhou, O. Sezer, and V. Sze, CE12 Subset 2: Test Results and Architectural Study on De-Blocking Filter Without Parallel on/off Filter Decision, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G088, Joint Collaborative Team on Video Coding (JCTVC), Geneva, Switzerland, Nov. 2011.
17. G. Bjontegaard, Calculation of Average PSNR Differences Between RD-Curves, ITU-T SG16 document VCEG-M33, Joint Collaborative Team on Video Coding (JCTVC), 2001.
18. F. Bossen, Common Test Conditions, JCTVC-H1100, Joint Collaborative Team on Video Coding (JCTVC), San Jose, CA, 2012.
19. Po-Kai Hsu and Chung-An Shen, The VLSI Architecture of a Highly Efficient Deblocking Filter for HEVC Systems, DOI 10.1109/TCSVT.2016.2515306, IEEE Transactions on Circuits and Systems for Video Technology.
20. HEVC presentation: http://www.hardware.fr/news/12901/hevc-passe-ratifie.html
21. Overview of H.264/AVC: http://www.csee.wvu.edu/~xinl/courses/ee569/H264_tutorial.pdf
22. Detailed overview of HEVC/H.265: https://app.box.com/s/rxxxzr1a1lnh7709yvih
19
Contd…
23. I.E.G. Richardson, “Video Codec Design: Developing Image and Video Compression Systems”, Wiley, 2002.
24. I.E.G. Richardson, “The H.264 advanced video compression standard”, 2nd Edition, Hoboken, NJ, Wiley, 2010.
25. K. Sayood, “Introduction to Data compression”, Third Edition, Morgan Kaufmann Series in Multimedia Information and Systems, San Francisco, CA, 2005.
26. V. Sze and M. Budagavi, “Design and Implementation of Next Generation Video Coding Systems (H.265/HEVC Tutorial)”, IEEE International Symposium on Circuits and Systems (ISCAS), Melbourne, Australia, June 2014.
27. V. Sze, M. Budagavi and G.J. Sullivan (Editors), “High Efficiency Video Coding (HEVC): Algorithms and Architectures”, Springer, 2014.
28. G. J. Sullivan et al., “Overview of the High Efficiency Video Coding (HEVC) Standard”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
29. G. J. Sullivan et al., “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, pp. 1001-1016, Dec. 2013.
30. K.R. Rao, D.N. Kim and J.J. Hwang, “Video Coding Standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.
31. D. Grois, B. Bross and D. Marpe, “HEVC/H.265 Video Coding Standard (Version 2) including the Range Extensions, Scalable Extensions, and Multiview Extensions” (Tutorial, Sunday 27 Sept 2015, 9:00 am to 12:30 pm), IEEE ICIP, Quebec City, Canada, 27-30 Sept. 2015.
32. Generic quadtree based approach for block partitioning: http://www.hhi.fraunhofer.de/fields-of-competence/image-processing/research-groups/image-video-coding/hevchigh-efficiency-video-coding/generic-quadtree-based-approach-for-block-partitioning.html
33. The tutorial below is for personal use only [Password: a2FazmgNK]: https://datacloud.hhi.fraunhofer.de/owncloud/public.php?service=files&t=8edc97d26d46d4458a9c1a17964bf881
34. Links to YouTube videos on the tutorial “HEVC/H.265 Video Coding Standard including the Range Extensions, Scalable Extensions and Multiview Extensions”: https://www.youtube.com/watch?v=TLNkK5C1KN8
34. HEVC tutorial by I.E.G. Richardson: http://www.vcodex.com/h265.html
35. “Special issue on HEVC extensions and efficient HEVC implementations”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 26, pp. 1-249, Jan. 2016.
36. K.R. Rao and J.J. Hwang, “Techniques and standards for image/video/audio coding”, Prentice Hall, 1996.
20
Contd…
37. Video lectures from IITs and IISC: http://nptel.iitm.ac.in/
38. Image and video processing courses at UT Arlington (EE 5351, EE 5355, EE 5356 and EE 5359): http://www.uta.edu/faculty/krrao/dip/
39. HEVC chapter 1: http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/HEVCCH1a_updated.doc
40. Online course on fundamentals of digital image and video processing from Coursera: https://www.coursera.org/course/digital
41. Access to HM 16.0 Software Manual: http://iphome.hhi.de/marpe/download/Performance_HEVC_VP9_X264_PCS_2013_preprint.pdf
42. Test Sequences: ftp://ftp.kw.bbc.co.uk/hevc/hm-11.0-anchors/bitstreams/
43. HEVC white paper, Ittiam Systems: http://www.ittiam.com/Downloads/en/documentation.aspx
44. HEVC white paper, Elemental Technologies: http://www.elementaltechnologies.com/lp/hevc-h265-demystified-white-paper
45. Access to HM 16.0 Reference Software: http://hevc.hhi.fraunhofer.de/
46. W.-J. Han, et al., “Improved video compression efficiency through flexible unit representation and corresponding extension of coding tools”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 20, no. 12, pp. 1709-1720, Dec. 2010.
47. A. Norkin, Non-CE1: non-normative improvement to deblocking filtering, Joint Collaborative Team on Video Coding (JCT-VC), Document JCTVC-K0289, Shanghai, Oct. 2012.
48. A. Norkin, K. Andersson, A. Fuldseth, and G. Bjøntegaard, “HEVC deblocking filtering and decisions”, in Proc. SPIE 8499, Applications of Digital Image Processing XXXV, no. 849912, Oct. 2012.
49. A. Norkin, K. Andersson, and V. Kulyk, “Two HEVC encoder methods for block artifact reduction”, in Proceedings of the IEEE International Conference on Visual Communications and Image Processing (VCIP) 2013, Kuching, Sarawak, pp. 1-6, Nov. 2013.
50. A. Norkin, K. Andersson, and R. Sjöberg, AHG6: on deblocking filter and parameters signaling, Joint Collaborative Team on Video Coding (JCT-VC), Document JCTVC-L0232, Geneva, Jan. 2013.
51. Information on GPU accelerated computing: http://www.nvidia.com/object/what-is-gpu-computing.html
52. Xiaoou Sun et al., “Accelerating IEEE 1857 Deblocking Filter on GPU using CUDA”, IEEE International Conference on Multimedia Big Data, pp. 415-419, Apr. 2015.
21
Thank you