WYNER-ZIV VIDEO CODING WITH CLASSIED CORRELATION NOISE ESTIMATION AND KEY FRAME CODING MODE SELECTION Present by fakewen.

Slides:



Advertisements
Similar presentations
Low-Complexity Transform and Quantization in H.264/AVC
Advertisements

Packet Video Error Concealment With Auto Regressive Model Yongbing Zhang, Xinguang Xiang, Debin Zhao, Siwe Ma, Student Member, IEEE, and Wen Gao, Fellow,
Introduction to H.264 / AVC Video Coding Standard Multimedia Systems Sharif University of Technology November 2008.
MPEG4 Natural Video Coding Functionalities: –Coding of arbitrary shaped objects –Efficient compression of video and images over wide range of bit rates.
2004 NTU CSIE 1 Ch.6 H.264/AVC Part2 (pp.200~222) Chun-Wei Hsieh.
Ai-Mei Huang And Truong Nguyen Image processing, 2006 IEEE international conference on Motion vector processing based on residual energy information for.
{ Fast Disparity Estimation Using Spatio- temporal Correlation of Disparity Field for Multiview Video Coding Wei Zhu, Xiang Tian, Fan Zhou and Yaowu Chen.
Limin Liu, Member, IEEE Zhen Li, Member, IEEE Edward J. Delp, Fellow, IEEE CSVT 2009.
Ai-Mei Huang and Truong Nguyen Image Processing (ICIP), th IEEE International Conference on 1.
Ai-Mei Huang and Truong Nguyen Video Processing LabECE Dept, UCSD, La Jolla, CA This paper appears in: Image Processing, ICIP IEEE International.
Recursive End-to-end Distortion Estimation with Model-based Cross-correlation Approximation Hua Yang, Kenneth Rose Signal Compression Lab University of.
Reinventing Compression: The New Paradigm of Distributed Video Coding Bernd Girod Information Systems Laboratory Stanford University.
Probabilistic video stabilization using Kalman filtering and mosaicking.
Rate-Distortion Optimized Layered Coding with Unequal Error Protection for Robust Internet Video Michael Gallant, Member, IEEE, and Faouzi Kossentini,
Spatial and Temporal Data Mining
An Efficient Low Bit-Rate Video-coding Algorithm Focusing on Moving Regions Kwok-Wai Wong, Kin-Man Lam, Wan-Chi Siu IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS.
Distributed Video Coding Bernd Girod, Anne Margot Aagon and Shantanu Rane, Proceedings of IEEE, Jan, 2005 Presented by Peter.
Wyner-Ziv Coding of Motion Video
Analysis, Fast Algorithm, and VLSI Architecture Design for H
H.264 / MPEG-4 Part 10 Nimrod Peleg March 2003.
Losslessy Compression of Multimedia Data Hao Jiang Computer Science Department Sept. 25, 2007.
Decision Trees for Error Concealment in Video Decoding Song Cen and Pamela C. Cosman, Senior Member, IEEE IEEE TRANSACTION ON MULTIMEDIA, VOL. 5, NO. 1,
Scalable Wavelet Video Coding Using Aliasing- Reduced Hierarchical Motion Compensation Xuguang Yang, Member, IEEE, and Kannan Ramchandran, Member, IEEE.
CS :: Fall 2003 MPEG-1 Video (Part 1) Ketan Mayer-Patel.
Transform Domain Distributed Video Coding. Outline  Another Approach  Side Information  Motion Compensation.
Wyner-Ziv Residual Coding of Video Anne Aaron, David Varodayan and Bernd Girod Information Systems Laboratory Stanford University.
Source-Channel Prediction in Error Resilient Video Coding Hua Yang and Kenneth Rose Signal Compression Laboratory ECE Department University of California,
1 Department of Electrical Engineering Stanford University Anne Aaron, Shantanu Rane and Bernd Girod Wyner-Ziv Video Coding with Hash-Based Motion Compensation.
1 Department of Electrical Engineering, Stanford University Anne Aaron, Shantanu Rane, Eric Setton and Bernd Girod Transform-domain Wyner-Ziv Codec for.
Distributed Video Coding VLBV, Sardinia, September 16, 2005 Bernd Girod Information Systems Laboratory Stanford University.
Video Compression Concepts Nimrod Peleg Update: Dec
1. 1. Problem Statement 2. Overview of H.264/AVC Scalable Extension I. Temporal Scalability II. Spatial Scalability III. Complexity Reduction 3. Previous.
Image Compression - JPEG. Video Compression MPEG –Audio compression Lossy / perceptually lossless / lossless 3 layers Models based on speech generation.
Video Coding. Introduction Video Coding The objective of video coding is to compress moving images. The MPEG (Moving Picture Experts Group) and H.26X.
Abhik Majumdar, Rohit Puri, Kannan Ramchandran, and Jim Chou /24 1 Distributed Video Coding and Its Application Presented by Lei Sun.
Image Processing and Computer Vision: 91. Image and Video Coding Compressing data to a smaller volume without losing (too much) information.
June, 1999 An Introduction to MPEG School of Computer Science, University of Central Florida, VLSI and M-5 Research Group Tao.
Low-Power H.264 Video Compression Architecture for Mobile Communication Student: Tai-Jung Huang Advisor: Jar-Ferr Yang Teacher: Jenn-Jier Lien.
Sub pixel motion estimation for Wyner-Ziv side information generation Subrahmanya M V (Under the guidance of Dr. Rao and Dr.Jin-soo Kim)
VIDEO COMPRESSION USING NESTED QUADTREE STRUCTURES, LEAF MERGING, AND IMPROVED TECHNIQUES FOR MOTION REPRESENTATION AND ENTROPY CODING Present by fakewen.
Compression There is need for compression: bandwidth constraints of multimedia applications exceed the capability of communication channels Ex. QCIF bit.
Compression of Real-Time Cardiac MRI Video Sequences EE 368B Final Project December 8, 2000 Neal K. Bangerter and Julie C. Sabataitis.
Guillaume Laroche, Joel Jung, Beatrice Pesquet-Popescu CSVT
Advances in digital image compression techniques Guojun Lu, Computer Communications, Vol. 16, No. 4, Apr, 1993, pp
Ai-Mei Huang, Student Member, IEEE, and Truong Nguyen, Fellow, IEEE.
Fast motion estimation and mode decision for H.264 video coding in packet loss environment Li Liu, Xinhua Zhuang Computer Science Department, University.
IEEE Transactions on Consumer Electronics, Vol. 58, No. 2, May 2012 Kyungmin Lim, Seongwan Kim, Jaeho Lee, Daehyun Pak and Sangyoun Lee, Member, IEEE 報告者:劉冠宇.
Rate-distortion Optimized Mode Selection Based on Multi-path Channel Simulation Markus Gärtner Davide Bertozzi Project Proposal Classroom Presentation.
Wyner-Ziv Coding of Motion Video Presented by fakewen.
ELE 488 F06 ELE 488 Fall 2006 Image Processing and Transmission ( ) Image Compression Quantization independent samples uniform and optimum correlated.
Block-based coding Multimedia Systems and Standards S2 IF Telkom University.
STATISTIC & INFORMATION THEORY (CSNB134) MODULE 11 COMPRESSION.
C.K. Kim, D.Y. Suh, J. Park, B. Jeon ha 強壯 !. DVC bitstream reorganiser.
Chapter 8 Lossy Compression Algorithms. Fundamentals of Multimedia, Chapter Introduction Lossless compression algorithms do not deliver compression.
Li-Wei Kang and Chun-Shien Lu Institute of Information Science, Academia Sinica Taipei, Taiwan, ROC {lwkang, April IEEE.
Motion Estimation Multimedia Systems and Standards S2 IF Telkom University.
Ai-Mei Huang And Truong Nguyen Image processing, 2006 IEEE international conference on Motion vector processing based on residual energy information for.
Multi-Frame Motion Estimation and Mode Decision in H.264 Codec Shauli Rozen Amit Yedidia Supervised by Dr. Shlomo Greenberg Communication Systems Engineering.
Video Compression Video : Sequence of frames Each Frame : 2-D Array of Pixels Video: 3-D data – 2-D Spatial, 1-D Temporal Video has both : – Spatial Redundancy.
Complexity varying intra prediction in H.264 Supervisors: Dr. Ofer Hadar, Mr. Evgeny Kaminsky Students: Amit David, Yoav Galon.
Introduction to H.264 / AVC Video Coding Standard Multimedia Systems Sharif University of Technology November 2008.
JPEG Compression What is JPEG? Motivation
Distributed Compression For Still Images
BITS Pilani Pilani Campus EEE G612 Coding Theory and Practice SONU BALIYAN 2017H P.
JPEG.
Context-based Data Compression
Standards Presentation ECE 8873 – Data Compression and Modeling
MPEG4 Natural Video Coding
強壯的進度 2011/12/28 我是強壯XD.
Presentation transcript:

WYNER-ZIV VIDEO CODING WITH CLASSIED CORRELATION NOISE ESTIMATION AND KEY FRAME CODING MODE SELECTION Present by fakewen

This paper appears in: Image Processing, IEEE Transactions onImage Processing, IEEE Transactions on Esmaili, G.; Cosman, P.; Esmaili, G.Cosman, P.

overview TRANSFORM DOMAIN WYNER-ZIV CODING CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS KEY FRAME ENCODING BASED ON FREQUENCY BAND CLASSIFICATION AND SIDE INFORMATION REFINEMENT HIERARCHICAL CODING SIMULATION RESULTS

Abstract We improve the overall rate-distortion performance of distributed video coding by efcient techniques of correlation noise estimation and key frame encoding. We propose a method to estimate the correlation noise by differentiating blocks within a frame based on the accuracy of the side information.

we propose a frequency band coding mode selection for key frames to exploit similarities between adjacent key frames at the decoder.

overview TRANSFORM DOMAIN WYNER-ZIV CODING CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS KEY FRAME ENCODING BASED ON FREQUENCY BAND CLASSIFICATION AND SIDE INFORMATION REFINEMENT HIERARCHICAL CODING SIMULATION RESULTS

TRANSFORM DOMAIN WYNER-ZIV CODING

blockwise 4 × 4 discrete cosine transform (DCT) is applied on Wyner-Ziv frames If there are N blocks in the image, X k (for k = 1 to 16) is a vector of length N obtained by grouping together the k th DCT coefcients from all blocks. N=4 4 X8X8

TRANSFORM DOMAIN WYNER-ZIV CODING Quantize function The coefcients of X k are quantized to form a vector of quantized symbols, q k. XkXk qkqk Length=N=4(ks range) Length=N=4 i= Bit plane M k,i Slepian-Wolf (Turbo or LDPCA) encoder parity bits (syndrome)

TRANSFORM DOMAIN WYNER-ZIV CODING I k is the maximum number of bit planes for frequency band k. where |v k | max is the highest absolute value within frequency band k.

Side information At the decoder, ˆW is the estimate of W (Wyner- Ziv frame) which is generated by applying extrapolation or interpolation techniques on decoded key frames.

Side information The Slepian-Wolf decoder and reconstruction block assume a Laplacian distribution to model the statistical dependency between X k and ˆX k. Although more accurate models such as generalized Gaussian can be applied, Laplacian is selected for good balancing of accuracy and complexity. where d denotes the difference between corresponding elements of X k and ˆX k.

Side information Alpha value for 16DCT for frequency band k the differences between corresponding elements in X k and ˆX k of several sequences are grouped to form a set F k. The α parameter is calculated by where σ k is the square root of the variance of the F k (X k - ˆX k )values.

overview TRANSFORM DOMAIN WYNER-ZIV CODING CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS KEY FRAME ENCODING BASED ON FREQUENCY BAND CLASSIFICATION AND SIDE INFORMATION REFINEMENT HIERARCHICAL CODING SIMULATION RESULTS

CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS The main usage of correlation noise estimation is in the calculation of the conditional probability of the Slepian-Wolf decoder which in our case is the regular degree 3 LDPC accumulate (LDPCA) codes proposed in [26]. More accurate estimation of the dependency between source and side information means that fewer accumulated syndrome bits need to be sent, resulting in improved rate-distortion performance.

CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS Traditional estimation of Laplacian distribution parameters treats all frames and blocks within a frame uniformly, even though the quality of the side information varies spatially and temporally.

MCFI(Motion Compensated Frame Interpolation) General MCFI methods are based on the assumption that the motion is translational and linear over time among temporally adjacent frames. This assumption often holds for relatively small motion, but tends to give a poor estimation for high motion regions.

MCFI The general approach to estimate a given block B in the interpolated frame F t is to nd the motion vector of the co-located block in F t+1 with reference to frame F t 1 where t, t 1 and t + 1 are time indexes. F t+1 F t F t-1 Vt-1Vt+1Vt

MCFI A spatial smoothing algorithm is then used to improve the accuracy of the motion eld.

MCFI F t+1 F t F t-1

CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS The residual energy between FMCFI and BMCFI the residual between forward and backward interpolations was applied to estimate the correlation noise

CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS is the square root of the variance of the elements of |T | By this method, we give different levels of condence to different blocks based on how well interpolated they are.

CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS In our method, we divide our sample of data into several classes of residual energy. The residual energy between Forward and Backward interpolation of every block within a frame for all Wyner-Ziv frames of several sequences is calculated to form a set R.

CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS We classify elements of this set into m different classes using m 1 thresholds T i where i {1,..., m 1}. Class i is chosen when T i < r < T i+1 where r R. In our simulation, the number of classes is set to 8 Threshold values are calculated ofine for each quantization parameter, separately.

CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS Table I shows the computed lookup table for quantization parameter equal to 0.4. Each row represents the α parameter of different DCT bands of a given class.

CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS

overview TRANSFORM DOMAIN WYNER-ZIV CODING CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS KEY FRAME ENCODING BASED ON FREQUENCY BAND CLASSIFICATION AND SIDE INFORMATION REFINEMENT HIERARCHICAL CODING SIMULATION RESULTS

KEY FRAME ENCODING BASED ON FREQUENCY BAND CLASSIFICATION AND SIDE INFORMATION REFINEMENT Intra coding Coding mode selection and side information renement

Intra coding Huffman and run length coding tables are borrowed from the JPEG standard.

Coding mode selection and side information renement

Coding mode selection and side information renement(cont.) In conventional transform domain Wyner-Ziv coding, the temporal correlation between adjacent key frames is not exploited To extend the Wyner-Ziv coding idea to key frames to exploit similarities between them, previously decoded key frames can be used as the side information.

Coding mode selection and side information renement(cont.) The previously decoded key frame is used to generate the side information for low frequency bands. To provide the corresponding side information for each frequency band, a DCT is applied on the previously reconstructed key frame

block matching Once the decoder receives and decodes all low bands, a block matching algorithm is used for motion estimation of each block with reference to the previously decoded key frame. Since here only reconstructed low bands of the new key frame are available at the decoder, the best match is found using the mean squared error (MSE) of low frequency components.

block matching(cont.) K low is the total number of low bands and U and V are the DCT transform of A and B(n*n block) The motion compensated frame is the new side information for the remaining frequency bands.

Coding mode selection and side information renement To select the proper coding method for high frequency bands we need to estimate the accuracy of the side information.

Coding mode selection and side information renement(cont.) Since the side information is a noisy version of the source, measuring the distortion between decoded low bands of the current key frame and those of the motion compensated one at the decoder can help to give an estimation of the distortion for high bands.

Coding mode selection and side information renement(cont.) where ´X k (WZ)denotes the reconstructed X k at the decoder and ˜X k (intra)denotes a vector formed by grouping the kth DCT coefcient from all blocks of the motion compensated frame at the decoder. L is the number of elements in each frequency band which is the number of DCT blocks in a frame.

Coding mode selection and side information renement(cont.) If D is less than a threshold T D, the side information is likely accurate enough that Wyner-Ziv coding can outperform Intra coding for high frequency bands. The whole process of Wyner-Ziv coding of low bands, side information renement, and nding the proper coding method for high bands is called adaptive coding for the rest of this paper. Otherwise, Intra coding is applied for them.

Coding mode selection and side information renement(cont.)

overview TRANSFORM DOMAIN WYNER-ZIV CODING CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS KEY FRAME ENCODING BASED ON FREQUENCY BAND CLASSIFICATION AND SIDE INFORMATION REFINEMENT HIERARCHICAL CODING SIMULATION RESULTS

HIERARCHICAL CODING Key frames WZ-4 frames WZ-2 frames

HIERARCHICAL CODING

Key frames key frames occur every 4 frames and they are used to generate side information corresponding to WZ-4 frames The rst key frame is intra coded Applying the proposed adaptive coding method in Section IV will be very helpful to exploit temporal correlation of key frames in high frame rate videos or low motion sequences.

WZ-4 frames 2 frame distance from key frames and 4 frame distance from each other. apply the proposed correlation noise classication method Once low bands are reconstructed at the decoder, they are used to rene the side information, and the rest of the frequency bands are Wyner-Ziv encoded with the rened side information.

WZ-2 frames MCFI method proposed is applied on their key frame and WZ-4 frame immediate neighbors Since here the frame distance is only 1 frame from each side, the obtained side information is more accurate than for WZ-4 Empirically, for WZ-2 frames, having low bands is not very helpful to provide more accurate side information than the one attained by MCFI. So, the renement step is not applied for them.

HIERARCHICAL CODING

overview TRANSFORM DOMAIN WYNER-ZIV CODING CORRELATION NOISE CLASSIFICATION BASED ON MATCHING SUCCESS KEY FRAME ENCODING BASED ON FREQUENCY BAND CLASSIFICATION AND SIDE INFORMATION REFINEMENT HIERARCHICAL CODING SIMULATION RESULTS

Fig. 7-9 show the rate-distortion performance for the test sequences Claire, Mother-daughter,Foreman, and Carphone QCIF (176 × 144) sequences at 30 fps, 15fps and 10fps. Fig. 8 (e) shows the rate-distortion performance for the Soccer QCIF sequence at 15 fps.

SIMULATION RESULTS In all ofine processes such as setting threshold values and correlation noise classication lookup tables, training video sequences are different from test video sequences. Our training sequences are Container, Salesman, Coastguard and Akiyo. In our simulation QP ={0.4, 0.85, 1.5, 2, 3, 3.5, 4} for each one of these quantization parameters, a threshold value is set.

SIMULATION RESULTS We tried different values between 50 and 1800 with step sizes 20 to 100 for several video sequences at different quantization parameters. The value of the step size depends on the quantization parameter, with larger step sizes for larger quantization parameters.

SIMULATION RESULTS Threshold values [T 1, T 2,..., T 7 ] =[200, 290, 740, 860, 1200, 1500, 1800] corresponding to quantization parameters QP = [0.4, 0.85, 1.5, 2, 3, 3.5, 4], were chosen as they work well for the training sequences with different characteristics. For correlation noise classication, for each type of Wyner-Ziv frame and each quantization step, a different lookup table is calculated.

SIMULATION RESULTS Table II shows the average number of times that key frame high bands are Wyner-Ziv coded in the Adaptive coding method.

SIMULATION RESULTS Wyner-Ziv coding equipped with correlation noise classication is applied for Wyner-Ziv frames of the conventional method :Conventional W Z+ adaptive method+correlation noise classication:AdaptiveW Z+ Hierarchical key intra Hierarchical key adaptive using intra method as low complexity as JPEG: H.264 intra and H.264 I-B-I

SIMULATION RESULTS applying the correlation noise classication proposed in Section III results in up to 2 dB improvement over Conventional 1 dB improvement over the best proposed method (coefcient level) in [21] (Claire 10 fps at 240 Kbps).

SIMULATION RESULTS The gain is more for low motion and higher frame rate sequences where the inter-correlation is high.

SIMULATION RESULTS Fig. 8 and 9 (c) for Foreman, as a high motion sequence at 15 fps and 10 fps, the performance of Adaptive-WZ+ is very close to that of Conventional-WZ+ but with a slight degradation. For very high motion sequences like Soccer at 15 fps where the MCFI method gives a poor side information,

SIMULATION RESULTS Hierarchical key adaptive is capable of beating all methods for most cases and results in up to 1 dB additional improvement. The exceptions are Foreman at 15 fps, 10 fps and Soccer at 15 fps. For these high motion and low frame rate cases, since in the hierarchical structure, key frames are 4 frames apart, the temporal correlation between key frames is very low.

SIMULATION RESULTS As shown in Fig. 8-9, Hierarchical key intra can beat Hierarchical key adaptive for these cases. Although even Hierarchical key intra results in degradation for Soccer as the whole idea of Wyner-Ziv coding fails for this sequence.

Complexity Since in this work all methods are using an intra method as low complexity as JPEG, to have a fair comparison, intra predictions, Hadamard transform and context adaptive binary arithmetic coding (CABAC) are turned off for I frames of H.264 intra and H.264 I-B-I.

Complexity

In the context of Wyner-Ziv video coding, the main goal is providing a low cost and low complexity encoder. Although most of the H.264 encoder complexity is due to motion estimation, the computational requirements of CABAC and intra prediction modes may be still too high for some applications

CONCLUSION We proposed three new techniques to improve the overall rate distortion performance of Wyner-Ziv video coding (1) a new method of correlation noise estimation based on block matching classication at the decoder, (2) an advanced mode selection scheme for frequency bands of key frames followed by side information renement, (3) a hierarchical Wyner-Ziv coding approach including the other two schemes.

The end Thank you~