Published by Cathleen Norris. Modified over 9 years ago.
1
New Direction in Wyner-Ziv Video Coding: On the Importance of Modeling Virtual Correlation Channel (VCC)
Xin Li, LDCSEE, WVU
Email: xin.li@ieee.org
“If you can’t solve a problem, then there is an easier problem you can solve: find it.” - George Pólya
2
Formulation of a Simpler Problem
[Figure: frames $x_{2t-2}$, $x_{2t-1}$, $x_{2t}$ shown twice: as I frames and B frames in conventional video coding (source coding), and as key frames and WZ frames in Wyner-Ziv video coding (joint source-channel coding)]
Assuming I frames and key frames are coded by the same intra-frame encoder, can we achieve coding efficiency on WZ frames comparable to that of H.264 (the state-of-the-art technique for coding B frames)?
3
Outline of Our Attack
Motivating observations
–Characterizing the nonstationary virtual correlation channel (VCC) by a mixture model
Theoretical derivation
–Classification gain (dual to that in conventional source coding)
Classification-based DVC algorithm
–Approximate solution to the simplified problem
Experimental results
–Comparable R-D performance to H.264 JM11.0 (for certain types of video sequences: slow motion)
Discussions and perspectives
–Dualities between conventional and distributed video coding
–DVC = video modeling + DSC (Rate) + Estimation (Distortion)
4
Motivations
Learn from the conventional wisdom: what is the major factor contributing to the success of existing image/video coding standards such as JPEG2000 and H.264?
–It is the source classification principle and its subtle implications, rooted in earlier pioneering works such as EZW/SPIHT and multi-hypothesis MCP
Therefore, by following the duality, it is natural to consider classifying the virtual correlation channel in distributed source coding
–Unlike conventional video coding, motion estimation (ME) is done at the decoder instead of the encoder in WZ video coding (we have addressed this issue separately in a different context¹)
¹ X. Li, “Video processing via implicit and mixture motion model,” IEEE Trans. on Cir. Sys. for Video Tech., vol. 17, no. 8, pp. 953-963, Aug. 2007.
5
Modeling Non-stationary VCC
Why is the virtual correlation channel non-stationary?
–Misaligned edges, deformable motion, and illumination variations are all spatio-temporally varying phenomena
Mixture modeling of the virtual correlation channel
[Figure: channel diagram relating the WT of interpolated WZ frames (side information) and the WT of original WZ frames (e.g., significant vs. insignificant wavelet coefficients) through additive errors (e.g., significant vs. insignificant temporal interpolation errors)]
6
Summary of Theoretical Results
[Equations: rate-distortion optimization problem formulation, with constraint (s.t.)]
[Table: conventional source coding vs. distributed source coding, compared on R-D function, rate allocation, and classification gain]
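The slide's own equations did not survive extraction. As a point of reference only, the standard high-resolution result for a source split into classes can be written as below. This is a textbook sketch under stated assumptions, not a reconstruction of the slide: the class weights $p_k$, class variances $\sigma_k^2$, per-class noise variances $\sigma_{z,k}^2$, and the gain symbol $G$ are notation introduced here, and every class is assumed Gaussian with $R_k \ge 0$.

```latex
% Conventional source coding: K classes, weights p_k (summing to 1), variances \sigma_k^2
\min_{\{R_k\}} \sum_{k=1}^{K} p_k D_k(R_k)
\quad \text{s.t.} \quad \sum_{k=1}^{K} p_k R_k = R,
\qquad D_k(R_k) = \sigma_k^2 \, 2^{-2R_k}

% Optimal rate allocation (equal distortion across classes) and resulting R-D function
R_k = R + \frac{1}{2} \log_2 \frac{\sigma_k^2}{\prod_{j=1}^{K} \sigma_j^{2 p_j}},
\qquad
D(R) = \Big( \prod_{j=1}^{K} \sigma_j^{2 p_j} \Big) 2^{-2R}

% Classification gain relative to a coder that uses only the pooled variance
G = \frac{\sum_{k=1}^{K} p_k \sigma_k^2}{\prod_{k=1}^{K} \sigma_k^{2 p_k}} \ge 1

% Distributed (Wyner-Ziv) counterpart, assuming x = y + z with
% z \sim N(0, \sigma_{z,k}^2) independent of the side information y:
D_k(R_k) = \sigma_{z,k}^2 \, 2^{-2R_k}
```

Under these assumptions the same allocation and gain expressions carry over to the distributed case with $\sigma_k^2$ replaced by $\sigma_{z,k}^2$, which is one way to read the duality the slide refers to.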
7
Implications for WZ Video Coding
In conventional source coding, classification gain implies that a subsource of larger variance should be assigned a higher priority in rate allocation
In distributed source coding, a similar conclusion can be made, except that the variance of the “subsource” is now determined by the virtual correlation channel
8
Conclusion: the class of significant coefficients that are poorly motion compensated has the largest R-D slope (they should be coded first: where are they, and what are they?)
9
Rate Control Dilemma
How can we estimate the second-order statistics of the VCC, $\sigma_z^2$ (the accuracy of the side information $y_t$ generated by temporal interpolation)?
–At the encoder, we have access to $x_t$ (original WZ frames) but not $y_t$ (side information)¹
–At the decoder, we have access to $y_t$ (side information) but not $x_t$ (original WZ frames)²
–We have adopted a decoder-based approach built on a feedback channel and a scale-invariance assumption about $z_t$ (an approximate but tractable solution)
¹ Berkeley’s PRISM scheme allows simple temporal dependency estimation at the encoder.
² Stanford’s researchers suggested the use of a feedback channel for rate control.
10
Feedback via Scale Invariance of Interpolation Errors
[Figure: key frames $x_{2t-2}$, $x_{2t}$, $x_{2t+2}$ and WZ frame $x_{2t-1}$]
[Figure: block-based significance map of $z_t$, oracle vs. actual, for the hall and foreman sequences. Fine resolution: oracle S.I.; coarse resolution: interpolated key frame]
11
Classification-based WZ Video Coding System
[Figure: block diagram. Encoder: WZ frames, wavelet transform (WT), block-based classification, SW lossless coding of the significance map, SW lossless coding of significant coefficients. Decoder: decoded I frames, advanced temporal interpolation, WT, side information (SI), coded information (CI), joint exploitation, decoded WZ frames. A feedback channel links decoder and encoder]
In a nutshell, we only allocate bits to the class of poorly motion-compensated significant coefficients: both $\sigma_x^2$ and $\sigma_z^2$ are large
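The decision rule stated on this slide can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names, the use of mean squared coefficient value as the significance measure, and the toy data are all assumptions. In the actual system the source coefficients and the interpolation errors are not available on the same side, so `z_blocks` here stands in for whatever decoder-side estimate reaches the encoder through the feedback channel.

```python
def block_energy(block):
    """Mean squared value of the coefficients in one block (assumed significance measure)."""
    return sum(c * c for c in block) / len(block)


def classify_blocks(x_blocks, z_blocks, th_x, th_z):
    """Mark a block for coding only if BOTH energies exceed their thresholds.

    x_blocks: per-block wavelet coefficients of the WZ frame
    z_blocks: per-block interpolation errors (hypothetical estimate of z)
    th_x, th_z: significance thresholds for x and z
    """
    return [
        block_energy(xb) > th_x and block_energy(zb) > th_z
        for xb, zb in zip(x_blocks, z_blocks)
    ]


def coded_fraction(mask):
    """Fraction of blocks that receive any bits."""
    return sum(mask) / len(mask)


if __name__ == "__main__":
    # Toy data: four blocks of four coefficients each.
    x_blocks = [[10, -12, 9, 11], [0.5, -0.2, 0.1, 0.3],
                [8, 9, -10, 7], [0.1, 0.2, -0.1, 0.0]]
    z_blocks = [[5, -6, 4, 5], [4, 5, -6, 5],
                [0.1, 0.0, -0.2, 0.1], [0.2, -0.1, 0.0, 0.1]]
    mask = classify_blocks(x_blocks, z_blocks, th_x=4.0, th_z=4.0)
    print(mask, coded_fraction(mask))
```

Only the first toy block is both significant and poorly predicted, so it is the only one selected; raising or lowering the two thresholds is the rate-control knob in this sketch.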
12
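Joint Exploitation of Side and Coded Information at the Decoder
[Figure: $x$ is observed at the decoder as side information (SI) $y$, through the virtual channel with noise $z \sim N(0, \sigma_z^2)$, and as coded information CI $= Q(x)$]
Target of estimation: $E[x \mid y, Q(x)]$
Latent variable: $z$ (we don’t know $\sigma_z^2$)
Iteration: initial guess → update estimate of $x$ → update estimate of $\sigma_z^2$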
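One plausible reading of the alternating loop on this slide is an EM-style iteration, sketched below. The slide does not give the update rules, so everything here is an assumption: the additive model $x = y + z$ with $z$ independent of $y$, a uniform quantizer whose bin $[lo, hi)$ is known at the decoder, and all function names. The estimate of $x$ is the mean of a Gaussian centred at $y$ truncated to the quantization bin, and $\sigma_z^2$ is re-estimated from the expected squared error under that same truncated Gaussian.

```python
import math
import random


def _pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def _cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def quantize(x, delta=8.0):
    """Uniform quantizer: return the bin [lo, hi) of width delta containing x."""
    lo = math.floor(x / delta) * delta
    return lo, lo + delta


def posterior_moments(y, lo, hi, sigma2):
    """Return (E[x], E[(x - y)^2]) given y and lo <= x < hi, for x = y + z, z ~ N(0, sigma2)."""
    s = math.sqrt(sigma2)
    a, b = (lo - y) / s, (hi - y) / s
    mass = _cdf(b) - _cdf(a)
    if mass < 1e-12:
        # y lies far outside the bin: fall back to the nearest bin edge.
        x_hat = min(max(y, lo), hi)
        return x_hat, (x_hat - y) ** 2
    x_hat = y + s * (_pdf(a) - _pdf(b)) / mass
    x_hat = min(max(x_hat, lo), hi)  # guard against rounding error
    second = sigma2 * (1.0 + (a * _pdf(a) - b * _pdf(b)) / mass)
    return x_hat, second


def joint_estimate(ys, bins, sigma2_init=64.0, iters=50):
    """Alternate between estimating x and re-estimating sigma_z^2."""
    sigma2 = sigma2_init  # initial guess
    for _ in range(iters):
        moments = [posterior_moments(y, lo, hi, sigma2)
                   for y, (lo, hi) in zip(ys, bins)]
        # Update sigma_z^2 from the expected squared channel error.
        sigma2 = max(sum(m[1] for m in moments) / len(moments), 1e-6)
    x_hats = [posterior_moments(y, lo, hi, sigma2)[0]
              for y, (lo, hi) in zip(ys, bins)]
    return x_hats, sigma2


if __name__ == "__main__":
    random.seed(0)
    ys = [random.uniform(-100, 100) for _ in range(2000)]
    xs = [y + random.gauss(0, 3) for y in ys]  # synthetic data, true sigma_z = 3
    bins = [quantize(x) for x in xs]
    x_hats, sigma2 = joint_estimate(ys, bins)
    mse_si = sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    mse_joint = sum((x - h) ** 2 for x, h in zip(xs, x_hats)) / len(xs)
    print(f"sigma_z^2 estimate: {sigma2:.2f}")
    print(f"MSE with SI alone: {mse_si:.2f}, with SI+CI: {mse_joint:.2f}")
```

The demo uses synthetic scalar data rather than wavelet coefficients; it is only meant to show how combining the bin constraint with the side information can reduce distortion relative to the side information alone.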
13
Justification of Distortion Reduction
foreman-qcif, block size 16×16, 18.3% of blocks are coded
[Figure: SI alone vs. SI+CI]
14
Coding Experiments Setup
Parameter setting
–Block size: 16×16, WT: Daubechies’ 9-7, Slepian-Wolf lossless encoder: LDPC-based¹, uniform quantizer (∆=8)
–Rate control: $th_x$, $th_z$ - significance thresholds for $x$ and $z$ respectively
–SI generation: Implicit MC vs. Explicit MC
Benchmark: H.264 JM11.0 implementation (QP of I frames is small and fixed)
¹ Liveris, A.D.; Zixiang Xiong; Georghiades, C.N., “Compression of binary sources with side information at the decoder using LDPC codes,” IEEE Communications Letters, vol. 6, no. 10, pp. 440-442, Oct. 2002.
15
Comparison of Temporal Interpolation
Foreman-qcif, ad-hoc fusion by simple averaging
[Figure: Implicit MC¹ vs. Explicit MC²]
¹ X. Li, “Video processing via implicit and mixture motion model,” IEEE Trans. on Cir. Sys. for Video Tech., vol. 17, no. 8, pp. 953-963, Aug. 2007.
² Tourapis, A.M.; Hye-Yeon Cheong; Liou, M.L.; Au, O.C., “Temporal interpolation of video sequences using zonal based algorithms,” Proc. of ICIP, vol. 3, pp. 895-898, 2001.
16
R-D Performance Comparison (I)
[Figure: Foreman-qcif, 30 frames; Hall-qcif, 30 frames]
17
R-D Performance Comparison (II)
[Figure: Container-qcif, 30 frames; Football-qcif, 30 frames]
18
Dualities between Conventional and WZ Video Coding
Exploitation of motion-related temporal dependency
–In traditional video coding, prediction is based on original frames (overhead is involved)
–In WZ video coding, interpolation is based on reconstructed key frames (no overhead)
–Importance of SI generation¹
R-D optimization shifted from encoder to decoder
–In traditional video coding, the decoder is often fixed but the encoder enjoys considerable flexibility
–In WZ video coding, rate control through the feedback channel offers great flexibility to the decoder without touching the encoder²
–Importance of matching the SW lossless encoder with the statistics of the virtual correlation channel (UEP is desirable)
¹ L. Lu, D. He, A. Jagmohan, “Robust Multi-Frame Side Information Generation For Distributed Video Coding,” Proc. of ICIP 2007.
² Girod, B.; Aaron, A.M.; Rane, S.; Rebollo-Monedero, D., “Distributed Video Coding,” Proceedings of the IEEE, vol. 93, no. 1, pp. 71-83, Jan. 2005.
19
Acknowledgement
Ligang Lu and Dake He for inviting me to participate in this special session
Zixiang Xiong for sharing with me his students’ implementation of the LDPC-based Slepian-Wolf coding algorithm
E. Simoncelli for stimulating discussions on distributed motion representations