prediction techniques for palette coding in screen content Advisor: Dr. K. R. Rao Department of Electrical Engineering University of Texas, Arlington EE 5359 Multimedia Processing Spring 2016 Presented By: Rakhee Barkur (1001096946) rakhee.barkur@mavs.uta.edu
OUTLINE Scope of this project Screen content coding Palette coding Prediction techniques for palette coding Implementation and expected results Conclusion and future scope
ACRONYMS AVC – Advanced Video Coding CB – Coding Block CU – Coding Unit HEVC – High Efficiency Video Coding ITU-T – International Telecommunication Union (Telecommunication Standardization Sector) IEC – International Electrotechnical Commission ISO – International Organization for Standardization JBIG – Joint Bi-level Image Experts Group JCT-VC – Joint Collaborative Team on Video Coding MPEG – Moving Picture Experts Group MV – Motion Vector
ACRONYMS [CONTD.] NCC – Natural Captured Content QP – Quantization Parameter RD – Rate Distortion SCC – Screen Content Coding VCEG – Video Coding Experts Group
SCOPE OF THIS PROJECT This project presents the concept of ‘Palette Coding for screen content’ [13]. Palette coding exploits the fact that screen content video blocks contain few unique colors, and sends palettes of these unique colors. However, the size of these palettes can grow, especially in high resolution video. Therefore, to reduce the number of bits spent on palette transmission, the project aims to introduce palette prediction techniques. Before proposing the prediction techniques, an initial analysis of palette characteristics in screen content will be carried out. Following this, the project aims to conduct experiments showing that efficient palette prediction schemes, used in conjunction with palette coding, can provide significant compression gains for screen content coding.
SCREEN CONTENT CODING Screen content is computer-generated content [11] [12]. It contains text, graphics and animations, unlike camera-captured images. It is used in applications such as desktop sharing, video conferencing, social networks and remote education. Coding of screen content differs from that of photographic content. Figure 1. Images of screen content: (a) Slide editing (b) Video with text overlay (c) Mobile display [12]
NATURAL VIDEOS VS. SCREEN CONTENT VIDEOS [12] Natural videos: A wide range of colors is used to represent the video content. More noise due to smooth edges. Pattern repetition is less likely to occur. Screen content videos: The number of colors tends to be limited. Less noise, while the edges tend to be much sharper. Repetitive patterns can be observed, for example English characters, logos etc. Figure 2. Image captured by camera [12] Figure 3. Histogram of the camera captured image in RGB color format [12] Figure 4. Image with screen content (web browsing) [12] Figure 5. Histogram of a screen content image in RGB color format [12]
NEED FOR SCREEN CONTENT CODING With the rapid development of communication, screen content takes up a large portion of the network bandwidth, in addition to natural camera-captured data [11]. Screen content has additional attributes such as text, shapes and graphics, consisting of uniformly flat regions, repeated patterns, high contrast and sharp edges. Coding techniques proposed for natural videos cannot provide the best coding efficiency for screen content. The properties of screen content therefore call for coding tools different from those used for natural videos. Significantly higher compression is achievable if compression tools are designed specifically for screen content. Various technologies such as Intra Block Copy [22], Edge Mode [23], Transform Skipping [24] and Palette coding [25] [26] exist to achieve better compression efficiency in screen content coding.
PALETTE CODING Traditional intra and inter prediction remove redundancy between different coding units [10]. Palette coding targets the redundancy of repetitive pixel values/patterns within the coding unit. The most frequent pixel values are first selected as major colors, which form the palette. All the pixels in the current CB are quantized to these palette colors. The palette colors, quantization indices and quantization errors are then encoded. A high compression ratio can be achieved by carefully encoding the quantization indices. Figure 6. Dividing an input block into major colors and index (structure) map [12].
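The steps on this slide (pick major colors, quantize every pixel to a palette color, keep indices and errors) can be sketched in a few lines of Python. This is only an illustrative sketch, not the HM/SCC implementation: the function names, the `max_colors` parameter, and the squared-distance rule for choosing the nearest palette color are assumptions made for the example.

```python
from collections import Counter

def palette_encode(block, max_colors=4):
    """Split a block (rows of (Y, U, V) tuples) into palette, index map, errors.

    Hypothetical helper: max_colors and the nearest-color rule are assumptions.
    """
    pixels = [p for row in block for p in row]
    # The most frequent pixel values become the major (palette) colors.
    palette = [color for color, _ in Counter(pixels).most_common(max_colors)]

    def nearest(pixel):
        # Index of the palette color with the smallest squared distance.
        return min(range(len(palette)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(pixel, palette[i])))

    index_map = [[nearest(p) for p in row] for row in block]
    # Quantization error = original pixel minus its palette color.
    errors = [[tuple(a - b for a, b in zip(p, palette[i]))
               for p, i in zip(row, idx_row)]
              for row, idx_row in zip(block, index_map)]
    return palette, index_map, errors

def palette_decode(palette, index_map, errors):
    """Rebuild the block from palette colors, indices and errors."""
    return [[tuple(c + e for c, e in zip(palette[i], err))
             for i, err in zip(idx_row, err_row)]
            for idx_row, err_row in zip(index_map, errors)]
```

For a block dominated by two colors plus one near-white outlier, `palette_encode(block, max_colors=2)` keeps the two dominant colors in the palette and represents the outlier as an index plus a small error.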
PALETTE PREDICTION Palette coding exploits the fact that screen content video blocks contain few unique colors, and sends palettes of these unique colors [11]. In high resolution videos, the size of the palettes can grow significantly, and encoding the actual palette then consumes a large number of bits. Prediction techniques exploit the correlation between the palettes of neighboring blocks to reduce the overall bits spent on palette transmission. Figure 7. Palette prediction example: Red blocks are palette coded blocks without any palette prediction. Green blocks represent possible blocks where palette prediction can be used [11]
NEED FOR PALETTE PREDICTION In the 16x16 blocks of Figure 8, the SC block has only 9 unique colors (unique vectors in the [Y, U, V] color space), whereas the NCC block has 256 unique colors [11]. Typically, there are multiple unique colors in a palette, and a large number of bits is spent encoding them. For example, in an 8-bit YUV video sequence, each color is represented by 24 bits. Assuming there are 10 different colors in an 8x8 block, 240 additional bits must be transmitted for each 8x8 block encoded in palette mode. Clearly, this is expensive if the palette of each block in a video is coded separately. To reduce the overall number of bits spent on encoding palettes separately, [25] and [26] proposed palette prediction techniques. At a high level, in [25] and [26], the palette can be predicted from the CU above or to the left, if that CU was also coded in palette mode. Figure 8. A zoomed version of a 16x16 (a) screen content block (b) natural captured content block [11]
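The 240-bit figure follows directly from the numbers on this slide, and the short Python sketch below just reproduces that arithmetic. The function name and its default arguments (8-bit samples, three components per color) are assumptions chosen to match the example in the text.

```python
def raw_palette_bits(num_colors, bit_depth=8, components=3):
    """Bits needed to send a palette with no prediction or entropy coding."""
    return num_colors * bit_depth * components

bits = raw_palette_bits(10)   # 10 colors x 24 bits per color
per_pixel = bits / (8 * 8)    # overhead spread over an 8x8 block
print(bits, per_pixel)        # 240 3.75
```

Spread over the 64 pixels of an 8x8 block, the unpredicted palette alone costs 3.75 bits per pixel, which is why predicting it from an already coded palette is attractive.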
PALETTE PREDICTION FRAMEWORK The palette used for encoding the CU can either be computed from the current CU itself, or predicted from already palette-coded blocks [11]. A Rate-Distortion optimization (RDO) is performed at the encoder to select the best mode (using the current palette, a previous palette, or other modes) for a coding block. The RDO at the encoder selects the mode based on the bits used (R) and the distortion (D) as J = R + λD, where the value of λ depends on the Quantization Parameter (QP) of the codec [27]. The mode with the smallest cost J is used, and a flag signals the chosen mode to the decoder. Figure 9. Palette prediction framework [11]
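The mode decision described above can be sketched as a minimum search over the cost J = R + λD, using the form of the cost exactly as written on this slide. The mode names, the rate and distortion numbers, and the λ values below are made up for illustration; in a real encoder λ would be derived from the QP.

```python
def rd_cost(rate_bits, distortion, lam):
    """Cost J = R + lambda * D, in the form given on the slide."""
    return rate_bits + lam * distortion

def select_mode(candidates, lam):
    """Return the mode name with the smallest cost J.

    candidates maps a mode name to a (rate_bits, distortion) pair.
    """
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))

# Hypothetical measurements for one coding block.
candidates = {
    "current_palette": (300, 10.0),
    "predicted_palette": (80, 12.0),
    "other_mode": (200, 40.0),
}
print(select_mode(candidates, lam=5))    # predicted_palette
print(select_mode(candidates, lam=200))  # current_palette
```

With a small λ the saved palette bits dominate and the predicted palette wins; with a large λ the slightly lower distortion of the block's own palette wins instead.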
SCHEME A FOR PALETTE PREDICTION: NO MOTION VECTOR Predict the palette from the last palette-coded CU on the left, which may not necessarily be adjacent to the current CU [11]. The current CU x(n) uses a palette-coded CU Y on its left as the predictor for its palette. The RD process picks one of two options: a) use the palette from the current block x(n) itself; or b) use the predicted palette from the last palette-coded CU Y on the left. Advantage: no motion vector needs to be sent for palette prediction. At the decoder, if the received flag is 11, the decoder finds the predictor palette in the same way as the encoder. Figure 10. Palette prediction without MV [11] Table 1. Palette prediction mode signaling [11]
SCHEME B FOR PALETTE PREDICTION: WITH MOTION VECTOR Here the current CB is marked as x(n) and all previously palette-coded blocks are marked as Y [11]. The encoder searches through all the available palettes for the coding block and uses the one that results in the lowest Rate-Distortion cost. Given the current coding block index n and the best predicted palette index m, the motion vector is k = n − m − 1. Since n is always greater than m, k is never negative. The motion vector k is then encoded. Figure 11. Palette prediction with motion vector [11]
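The index arithmetic of Scheme B can be sketched as follows. This is a hedged illustration, not code from [11]: the function names are invented, and the `costs` dictionary stands in for whatever RD costs the encoder has measured for each earlier palette-coded block.

```python
def palette_motion_vector(n, m):
    """Motion vector k = n - m - 1 for current block n and predictor block m."""
    if not 0 <= m < n:
        raise ValueError("predictor index m must satisfy 0 <= m < n")
    return n - m - 1

def predictor_index(n, k):
    """Decoder side: recover the predictor index m from n and the received k."""
    return n - k - 1

def best_predictor(n, costs):
    """Pick the earlier block with the lowest RD cost.

    costs maps an earlier palette-coded block index m to its RD cost
    (hypothetical values supplied by the caller).
    """
    m = min(costs, key=costs.get)
    return m, palette_motion_vector(n, m)

m, k = best_predictor(7, {3: 120.0, 5: 95.0, 6: 110.0})
print(m, k)                   # 5 1
print(predictor_index(7, k))  # 5
```

The immediately preceding block (m = n − 1) maps to k = 0, so the nearest predictors get the smallest values to encode.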
IMPLEMENTATION AND EXPECTED RESULT Implementation procedure: Encode full length sequences at various resolutions with different QPs for both lossy scenarios. The HM16.7+SCC 6.0 HEVC Range Extensions reference software is used. The test sequences [29] are in the YUV color space and 4:4:4 sampled. Expected results: Both schemes are planned for implementation. Scheme A simply uses the palette of the last palette-coded CU on the left as the predictor, and is expected to attain good bitrate savings over palette coding without prediction [11]. Scheme B restricts the search to the current largest coding unit for a hardware-friendly implementation. This scheme achieves significant compression gain [11].
SCHEME A: PRINCIPLE OF OPERATION Scheme A predicts the palette from the previously palette-coded CU on the left [11]. This past CU may not necessarily be adjacent to the current CU. The Rate-Distortion process picks the better of two options: use the palette from the current block x(n) itself, or use the predicted palette from the last palette-coded CU Y on the left. The major advantage of this scheme is that no motion vector needs to be sent for palette prediction. At the decoder, if the received flag is 11, the decoder finds the predictor palette in the same way as the encoder. At the picture boundary, if there is no palette-coded CU on the left, palette prediction mode is not used.
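A minimal sketch of the Scheme A decision is given below, under several assumptions that are not in the slides: the CUs to the left are modeled as earlier entries of a list (`None` for CUs that were not palette coded), and the RD costs are passed in by the caller, with `predicted_cost` being a hypothetical callable that evaluates the cost of reusing a given palette.

```python
def last_palette_coded_left(row_palettes, n):
    """Palette of the nearest palette-coded CU left of CU n, or None.

    row_palettes[i] is the palette of CU i, or None if CU i was not
    palette coded (assumed data layout for this sketch).
    """
    for i in range(n - 1, -1, -1):
        if row_palettes[i] is not None:
            return row_palettes[i]
    return None

def scheme_a(row_palettes, n, own_cost, predicted_cost):
    """Choose between the CU's own palette and the predicted palette."""
    predictor = last_palette_coded_left(row_palettes, n)
    if predictor is None:
        # Picture boundary or no palette-coded CU on the left:
        # prediction mode is not used.
        return "own", None
    if predicted_cost(predictor) < own_cost:
        return "predicted", predictor
    return "own", None
```

Because both encoder and decoder can run `last_palette_coded_left` on already decoded data, only the mode flag has to be transmitted, which is the advantage the slide points out.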
CONCLUSIONS AND FUTURE SCOPE The project aims to show that efficient palette prediction schemes, used in conjunction with palette coding, can provide significant compression gains and bit-rate savings for screen content coding [10] [11]. Future work includes finding the optimal trade-off between compression and complexity for these prediction schemes.
REFERENCES [1] G.J. Sullivan et al, "Overview of the High Efficiency Video Coding (HEVC) Standard," IEEE Trans. on Circuits and Systems for Video Technology, vol.22, no.12, pp.1649-1668, Dec. 2012. [2] Generic Coding of Moving Pictures and Associated Audio Information—Part 2: Video, ITU-T Rec. H.262 and ISO/IEC 13818-2 (MPEG 2 Video), ITU-T and ISO/IEC JTC 1, Nov. 1994. [3] Advanced Video Coding for Generic Audio-Visual Services, ITU-T Rec. H.264 and ISO/IEC 14496-10 (AVC), ITU-T and ISO/IEC JTC 1, May 2003. [4] B. Bross, et al, High Efficiency Video Coding (HEVC) Text Specification Draft 9, document JCTVC-K1003, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Oct. 2012. [5] Multimedia processing website: http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html. [6] Video Codec for Audiovisual Services at px64 kbit/s, ITU-T Rec. H.261, version 1: Nov. 1990, version 2: Mar. 1993. [7] Video Coding for Low Bit Rate Communication, ITU-T Rec. H.263, Nov. 1995 (and subsequent editions). [8] Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to About 1.5 Mbit/s—Part 2: Video, ISO/IEC 11172-2 (MPEG-1), ISO/IEC JTC 1, 1993.
REFERENCES [CONT.] [9] Coding of Audio-Visual Objects—Part 2: Visual, ISO/IEC 14496-2 (MPEG-4 Visual version 1), ISO/IEC JTC 1, Apr. 1999 (and subsequent editions). [10] L. Guo et al, “Color Palette for Screen Content Coding”, IEEE International Conference on Image Processing, pp.5556-5560, Oct. 2014. [11] G. Jin et al, “On prediction Techniques for Palette Coding”, IEEE International Conference on Image Processing, pp.5566-5570, Oct. 2014. [12] N. N. Mundgemane, “Multi-stage prediction scheme for Screen Content based on HEVC”, M.S. Thesis, EE Dept, UTA, Dec. 2015. See [5] [13] J. Xu et al, “Overview of the Emerging HEVC Screen Content Coding Extension,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 26, pp.50-62, Jan. 2016. [14] Access to JCT-VC documents: http://phenix.int-evry.fr/jct/ [15] D. Flynn, J. Sole and T. Suzuki, “High efficiency video coding (HEVC) range extensions text specification”, Draft 4, JCT-VC. Retrieved 2013-08-07. [16] W. Zhu et al, “Screen content coding based on HEVC framework”, IEEE Trans. Multimedia, vol.16, pp.1316-1326, Aug. 2014 (several papers related to MRC). MRC: mixed raster coding. [17] IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), Special Issue on Screen Content Video Coding and Applications: Final papers are due by July 2016.
REFERENCES [CONT.] [18] HEVC SCC Extension Reference software https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-12.1+RExt-4.0 [19] HEVC SCC Software reference manual https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM-SCC-extensions/doc/software-manual.pdf [20] Z. Ma et al, “Advanced screen content coding using color table and index map,” IEEE Trans. on Image Processing, vol. 23, no. 10 , pp. 4399 – 4412, Oct. 2014. [21] G.J. Sullivan et al, “Standardized extensions of high efficiency video coding (HEVC)”, IEEE Journal of selected topics in signal processing, vol. 7, pp. 1001-1016, Dec. 2013. [22] J. Sole et al, “Summary report on HEVC Range Extensions Core Experiment 3 (RCE3) on Intra block copy refinement,” JCTVC-P0034, San Jose, USA, Jan. 2014. [23] S. Hu et al, “Screen content coding for HEVC using edge modes,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1714–1718, 2013. [24] M. Mrak et al, “Improving screen content coding in HEVC by transform skipping,” in Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pp.1209–1213, 2012 [25] L. Guo et al, “Evaluation of palette mode coding on HM-12.0+RExt-4.1,” JCTVC-O0218, Geneva, Switzerland, Oct. 2013. [26] X. Guo et al, “AHG8: Majorcolor-based screen content coding,” JCTVC-O0182, Geneva, Switzerland, October 2013.
REFERENCES [CONT.] [27] L. Guo et al, “RCE4: Results of test 2 on palette mode for screen content coding,” JCTVC-P0198, San Jose, USA, January 2014. [28] I. E. Richardson, The H.264 Advanced Video Compression Standard, Wiley, 2010. [29] ITU-T and ISO-IEC: JCTVC, “HEVC range extensions test model (HM)”, http://hevc.kw.bbc.co.uk/git/w/jctvchm.git/commit/0edaa66758a088f53a2b8f5dd93bd0c34fcd7c75 [30] Tutorials Links : Tut1. N. Ling, “High efficiency video coding and its 3D extension: A research perspective,” Keynote Speech, ICIEA, Singapore, July 2012. Tut2. X. Wang et al, “Paralleling variable block size motion estimation of HEVC on CPU plus GPU platform”, IEEE ICME workshop, 2013. Tut3. H.R. Tohidpour, M.T. Pourazad and P. Nasiopoulos, “Content adaptive complexity reduction scheme for quality/fidelity scalable HEVC”, IEEE ICASSP 2013, pp. 1744-1748, June 2013. Tut4. M. Wien, “HEVC – coding tools and specifications”, Tutorial, IEEE ICME, San Jose, CA, July 2013. Tut5. D. Grois, B. Bross and D. Marpe, “HEVC/H.265 Video Coding Standard (Version 2) including the Range Extensions, Scalable Extensions, and Multiview Extensions,” (Tutorial) Monday 29 June 2015, IEEE ICME 2015, Torino, Italy, 29 June – 3 July, 2015.
REFERENCES [CONT.] Tut6. D. Grois, B. Bross and D. Marpe, “HEVC/H.265 Video Coding Standard including the Range Extensions, Scalable Extensions, and Multiview Extensions,” (Tutorial), IEEE ICCE , Berlin, Germany, 6 – 9 Sept. 2015. The tutorial below is for personal use only. Password: a2FazmgNK Tut7. D. Grois, B. Bross and D. Marpe, “HEVC/H.265 Video Coding Standard (Version 2) including the Range Extensions, Scalable Extensions, and Multiview Extensions,” (Tutorial) Sunday 27 Sept 2015, IEEE ICIP, Quebec City, Canada, 27 – 30 Sept. 2015. https://datacloud.hhi.fraunhofer.de/owncloud/public.php?service=files&t=8edc97d26d46d4458a9c1a17964bf881 Tut8. Please find the links to YouTube videos on the tutorial - HEVC/H.265 Video Coding Standard including the Range Extensions Scalable Extensions and Multiview Extensions below: https://www.youtube.com/watch?v=TLNkK5C1KN8 Tut9. HEVC tutorial by I.E.G. Richardson, www.vcodex.com/h265.html
THANK YOU