Published by Priscilla Bradley. Modified over 9 years ago.
1
Arko Barman
Computer Vision & Artificial Intelligence Lab
Department of Electrical Engineering
Indian Institute of Science, Bangalore
2
New paradigm for video compression
Based on information-theoretic results proposed in the 1970s
A radical departure from traditional video compression techniques
Well suited to applications that require many encoders and a single decoder
3
Low-complexity, low-power encoder
Possibly higher-complexity decoder
Should achieve coding efficiency similar to that of conventional video compression techniques
Should try to achieve the rate-distortion performance of conventional schemes
4
Wireless low-power surveillance networks
Number of encoders is usually much higher than number of decoders (usually one)
Multiple video sequences cover partially overlapping areas
Decoder should exploit the correlation between the multiple encoded video sequences
Low-complexity encoders required
Decoder may be of higher complexity
5
Wireless Mobile Video
6
Wireless Mobile Video
Both encoder and decoder must be of low cost and low complexity
Encoder must be Wyner-Ziv for low complexity
Decoder must be MPEG-x or H.26x for low complexity
A base station receives the Wyner-Ziv encoded bitstream from the transmitter, decodes it, re-encodes it as MPEG-x or H.26x, and transmits it to the receiver
7
Multi-view Acquisition
8
Multi-view Acquisition
Neighbouring cameras of a large camera array capture overlapping, and hence correlated, video sequences
Videos are encoded independently in the individual cameras
Joint decoding at a central station must exploit the correlation between different views
Used for image-based rendering (3D reconstruction with texture mapping)
9
Video-based Sensor Networks
Applications such as tracking a person throughout an environment, monitoring activities, tracking events and raising alarms
Multiple sensors with video-acquisition capabilities, which must be low-cost, low-power and low-complexity
Central decoding device with high computational capability and storage
10
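Information Theory Fundamentals
Entropy: $H(X) = -\sum_{x} p(x) \log_2 p(x)$, where $p(x) = \Pr[X = x]$
Joint Entropy: $H(X,Y) = -\sum_{x} \sum_{y} p(x,y) \log_2 p(x,y)$
Conditional Entropy: $H(X|Y) = -\sum_{x} \sum_{y} p(x,y) \log_2 p(x|y)$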
11
Information Theory Fundamentals
Lower bound on the bitrate of signals coded on their own: $R_X \geq H(X)$, $R_Y \geq H(Y)$
Lower bound on total bitrate: $R_X + R_Y \geq H(X,Y)$
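As a worked illustration of the entropy quantities on these slides, the following minimal sketch computes them for a toy joint distribution. The distribution `JOINT` (two binary sources that agree 90% of the time) and the function names are assumptions chosen for the example, not part of the presentation.

```python
from math import log2

def entropy(pmf):
    """Shannon entropy in bits of a probability mass function (iterable of probabilities)."""
    return -sum(p * log2(p) for p in pmf if p > 0)

def entropies(joint):
    """joint[x][y] = P(X=x, Y=y). Returns the marginal, joint and conditional entropies."""
    px = [sum(row) for row in joint]          # marginal of X
    py = [sum(col) for col in zip(*joint)]    # marginal of Y
    h_x, h_y = entropy(px), entropy(py)
    h_xy = entropy(p for row in joint for p in row)
    # Chain rule: H(X|Y) = H(X,Y) - H(Y), H(Y|X) = H(X,Y) - H(X)
    return {"H(X)": h_x, "H(Y)": h_y, "H(X,Y)": h_xy,
            "H(X|Y)": h_xy - h_y, "H(Y|X)": h_xy - h_x}

# Hypothetical toy distribution: X and Y are binary and agree 90% of the time.
JOINT = [[0.45, 0.05],
         [0.05, 0.45]]
info = entropies(JOINT)
```

For this toy distribution each source alone needs 1 bit/sample, but the joint entropy is well below 2 bits, which is the redundancy that joint decoding can exploit.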
12
Slepian-Wolf Theorem
13
Slepian-Wolf Theorem
Consider two statistically dependent sequences X and Y that are separately encoded but jointly decoded
Is it possible to recover these dependent sequences with arbitrarily low reconstruction error probability?
In 1973, Slepian and Wolf determined the possible rate combinations of $R_X$ and $R_Y$ for reconstruction of X and Y with an arbitrarily small error probability
These bounds are given by the conditional entropies of the signals X and Y, and their joint entropy
14
Slepian-Wolf Theorem
The bounds on the rates are determined to be
$R_X \geq H(X|Y)$
$R_Y \geq H(Y|X)$
$R_X + R_Y \geq H(X,Y)$
15
Slepian-Wolf Theorem
Even when correlated sources are encoded independently, a total bitrate equal to the joint entropy is enough
Theoretically, separate encoding in distributed video coding schemes need not incur any loss in compression efficiency compared to conventional video coding techniques
Defines an achievable rate region for reconstruction of dependent sequences with arbitrarily small probability of error
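The achievable rate region described on the Slepian-Wolf slides can be checked numerically. The sketch below is illustrative only: the function name and the example entropy values (for two binary sources that agree 90% of the time) are assumptions made for the example.

```python
def in_slepian_wolf_region(r_x, r_y, h_x_given_y, h_y_given_x, h_xy, eps=1e-12):
    """True if the rate pair (r_x, r_y) satisfies all three Slepian-Wolf bounds."""
    return (r_x >= h_x_given_y - eps
            and r_y >= h_y_given_x - eps
            and r_x + r_y >= h_xy - eps)

# Hypothetical entropies in bits/sample: H(X|Y) = H(Y|X) = 0.469, H(X,Y) = 1.469.
H_X_GIVEN_Y, H_Y_GIVEN_X, H_XY = 0.469, 0.469, 1.469

corner = in_slepian_wolf_region(0.469, 1.0, H_X_GIVEN_Y, H_Y_GIVEN_X, H_XY)    # corner point
too_low = in_slepian_wolf_region(0.5, 0.5, H_X_GIVEN_Y, H_Y_GIVEN_X, H_XY)     # sum rate too small
separate = in_slepian_wolf_region(1.0, 1.0, H_X_GIVEN_Y, H_Y_GIVEN_X, H_XY)    # fully separate coding
```

The corner point spends only H(X|Y) bits on X while Y is sent at its own entropy, which is the operating point used when Y serves as decoder side information.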
17
Wyner-Ziv Theorem
18
Wyner-Ziv Theorem
In 1976, Wyner and Ziv studied a special case of Slepian-Wolf coding corresponding to the rate point $(R_X, R_Y) = (H(X|Y), H(Y))$
Deals with source coding of a sequence X when the sequence Y (known as side information) is available at the decoder
Known as lossy compression with decoder side information
19
Wyner-Ziv Theorem
Source values X are encoded without access to the side information Y
Decoder has access to Y and obtains a reconstruction of the source values
Some distortion D is acceptable
The Wyner-Ziv rate-distortion function is the achievable lower bound on the bitrate for a distortion D
20
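Wyner-Ziv Theorem
Mathematically, $R^{WZ}_{X|Y}(D) \geq R_{X|Y}(D)$
where $R^{WZ}_{X|Y}(D)$ is the Wyner-Ziv rate (Y available only at the decoder) and $R_{X|Y}(D)$ is the minimum rate necessary to encode X when Y is also available at the encoder, i.e. when the statistical dependency between X and Y is utilized while encoding X, for an average distortion D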
21
Wyner-Ziv Theorem
Note that for no distortion, i.e. D = 0, we get the same result as the Slepian-Wolf Theorem, i.e. $R^{WZ}_{X|Y}(0) = R_{X|Y}(0) = H(X|Y)$
The inequality of the Wyner-Ziv Theorem reduces to equality for Gaussian memoryless sources and a mean-squared-error distortion function, i.e. $R^{WZ}_{X|Y}(D) = R_{X|Y}(D)$
22
Wyner-Ziv Theorem
In 1996, Zamir proved that for general statistics and a mean-squared-error distortion function, the rate loss is less than 0.5 bits/sample, i.e. $R^{WZ}_{X|Y}(D) - R_{X|Y}(D) \leq 0.5$
Combining with the Wyner-Ziv Theorem, we have $0 \leq R^{WZ}_{X|Y}(D) - R_{X|Y}(D) \leq 0.5$
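For the Gaussian case mentioned in the Wyner-Ziv slides, the rate can be evaluated in closed form. The sketch below assumes jointly Gaussian X and Y with correlation coefficient `rho` and uses the standard Gaussian rate-distortion formula R(D) = max(0, 0.5 log2(variance / D)); the function names and numeric values are illustrative assumptions.

```python
from math import log2

def conditional_variance(var_x, rho):
    """Variance of X given Y for jointly Gaussian X, Y with correlation coefficient rho."""
    return var_x * (1.0 - rho ** 2)

def gaussian_rate(variance, distortion):
    """Rate-distortion function (bits/sample) of a Gaussian source under mean squared error."""
    if distortion >= variance:
        return 0.0
    return 0.5 * log2(variance / distortion)

def wyner_ziv_rate_gaussian(var_x, rho, distortion):
    """Gaussian case: the Wyner-Ziv rate equals the conditional rate-distortion function."""
    return gaussian_rate(conditional_variance(var_x, rho), distortion)

# Hypothetical numbers: unit-variance source, correlation 0.9, target MSE 0.01.
rate_without_side_info = gaussian_rate(1.0, 0.01)
rate_with_side_info = wyner_ziv_rate_gaussian(1.0, 0.9, 0.01)
```

With these assumed numbers, side information at the decoder alone saves a little over one bit per sample relative to coding X without any side information.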
23
The term ‘distributed’ refers to the encoding operation mode and not to location
Two or more dependent sources are coded independently, i.e. a separate, independent encoder is associated with each source
An independent bitstream is sent from each encoder: signals are encoded without exploiting the correlation between them
A single decoder performs joint decoding of all received bitstreams using the statistical dependencies between them
24
Pixel-domain Codec
25
Quantization & Dequantization
Quantizer divides the signal space into cells
A cell may consist of non-contiguous sub-cells mapped to the same quantizer index Q
Practical implementations of the Lloyd algorithm for optimal vector quantizers either lack performance or are prohibitively complex
Unfortunately, code cell contiguity precludes optimality of quantizers in general
26
Quantization & Dequantization
Introducing a rate measure that depends on both the quantization index and the side information divorces the dimensionality of the quantizer from the block length of the Slepian-Wolf coder, a fundamental requirement for practical system design
At high rates and under certain other conditions, lattice quantizers are optimal for Wyner-Ziv coding
Disconnected quantization cells need not be mapped into the same index
Asymptotically, there is no performance loss from not having access to the side information at the encoder
27
Slepian-Wolf Encoder & Decoder
Unconventional video coding system
Encodes individual frames independently, but decodes them conditionally
Only intra-frame processing required at the encoder
Inter-frame processing only at the decoder
28
Slepian-Wolf Encoder & Decoder
Previously decoded frames are used as side information for decoding a Wyner-Ziv coded frame
Performance is closer to conventional inter-frame coding (MPEG) than to conventional intra-frame coding (Motion-JPEG)
Encoding may be in the pixel domain or the transform domain
29
Slepian-Wolf Encoder & Decoder
Slepian-Wolf codec can be implemented using any of the following:
DISCUS (DIstributed Source Coding Using Syndromes)
Turbo codes, like RCPT (Rate-Compatible Punctured Turbo code)
LDPC (Low-Density Parity-Check) codes
IRA (Irregular-Repeat-Accumulate) codes
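To give a feel for syndrome-based Slepian-Wolf coding, here is a textbook-style toy example, not the DISCUS construction itself. It assumes 3-bit words X and Y that differ in at most one bit position; the encoder sends only the 2-bit syndrome of X with respect to the repetition code {000, 111}, and the decoder picks the word in that coset closest to Y. All names are hypothetical.

```python
from itertools import product

def syndrome(bits):
    """2-bit syndrome of a 3-bit word with respect to the repetition code {000, 111}."""
    b0, b1, b2 = bits
    return (b0 ^ b1, b0 ^ b2)

def decode(s, y):
    """Return the word with syndrome s that is closest to side information y (Hamming distance)."""
    coset = [w for w in product((0, 1), repeat=3) if syndrome(w) == s]
    return min(coset, key=lambda w: sum(a != b for a, b in zip(w, y)))

x = (1, 0, 1)            # source word, seen only by the encoder
y = (1, 1, 1)            # side information at the decoder, differs from x in one bit
x_hat = decode(syndrome(x), y)
```

Each coset contains two words at Hamming distance 3, so side information within distance 1 of X always identifies X, and 2 bits are transmitted instead of 3.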
30
Pixel-domain Codec using RCPT
31
Pixel-domain Codec
A subset of frames, regularly spaced in the video sequence, is selected as keyframes, K
Keyframes are encoded and decoded using a conventional intraframe 8x8 Discrete Cosine Transform (DCT) codec
Frames between keyframes are called “Wyner-Ziv frames”
Wyner-Ziv frames are intraframe-encoded but interframe-decoded
32
Pixel-domain Codec
For each Wyner-Ziv frame, S, each pixel value is uniformly quantized with $2^M$ intervals
Subtractive dithering is applied to avoid contouring and improve the subjective quality of the reconstructed image
A sufficiently large block of quantizer indices q is provided to the Slepian-Wolf encoder
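The uniform quantization with subtractive dithering described on this slide can be sketched as follows. This is a minimal illustration, assuming 8-bit pixels, a quantizer with 2**m_bits intervals (the parameter name `m_bits` is chosen here), and a dither value shared by encoder and decoder (for example from a common seeded generator); it is not taken from the presentation.

```python
import random

def quantize(pixel, dither, m_bits, max_val=256):
    """Encoder side: add the dither, then map to one of 2**m_bits uniform intervals."""
    step = max_val / (1 << m_bits)
    q = int((pixel + dither) // step)
    return min(max(q, 0), (1 << m_bits) - 1)   # clip to the valid index range

def dequantize(q, dither, m_bits, max_val=256):
    """Decoder side: take the interval midpoint, then subtract the same dither."""
    step = max_val / (1 << m_bits)
    return (q + 0.5) * step - dither

# Encoder and decoder are assumed to share the dither sequence via a common seed.
rng = random.Random(0)
step = 256 / (1 << 4)
dither = rng.uniform(-step / 2, step / 2)
index = quantize(100, dither, 4)
recon = dequantize(index, dither, 4)
```

Away from the clipped ends of the pixel range, the reconstruction error stays within half a quantization step, and the dither decorrelates that error from the image content.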
33
Pixel-domain Codec using RCPT
RCPT provides rate flexibility
Rate adapts to changing statistics between the side information and the frame to be encoded
In this system, the rate of the RCPT is chosen by the decoder and relayed to the encoder through feedback
For each Wyner-Ziv frame, the decoder generates side information, $\hat{S}$, by using previously decoded keyframes, and possibly previously decoded Wyner-Ziv frames
34
Pixel-domain Codec using RCPT
To exploit side information, the decoder assumes a statistical model of the ‘correlation channel’
A Laplacian distribution of the difference between individual pixel values S and $\hat{S}$ is assumed
The decoder estimates the parameter of the Laplacian distribution by observing the statistics of previously decoded frames
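One simple way to estimate the Laplacian parameter from observed residuals is the maximum-likelihood estimate. The sketch below assumes the density f(d) = (alpha / 2) * exp(-alpha * |d|) for the difference d between a pixel and its side information; the estimator choice and function names are assumptions for illustration, since the slides do not state how the parameter is estimated.

```python
from math import exp

def estimate_laplacian_alpha(residuals):
    """Maximum-likelihood estimate of alpha: the reciprocal of the mean absolute residual."""
    mean_abs = sum(abs(d) for d in residuals) / len(residuals)
    return 1.0 / mean_abs if mean_abs > 0 else float("inf")

def laplacian_pdf(d, alpha):
    """Zero-mean Laplacian density with parameter alpha."""
    return 0.5 * alpha * exp(-alpha * abs(d))

# Hypothetical residuals between decoded pixels and their side information.
residuals = [1, -1, 2, -2]
alpha = estimate_laplacian_alpha(residuals)
```

A larger alpha means the side information is a tighter predictor of the frame, so the decoder can expect to need fewer parity bits.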
35
Pixel-domain Codec using RCPT
Turbo decoder combines the side information and the received parity bits to recover the symbol stream
If the decoder cannot reliably decode the original symbols, it requests additional parity bits from the encoder buffer through feedback
This “request-and-decode” process is repeated until an acceptable probability of symbol reconstruction error is achieved
36
Pixel-domain Codec using RCPT
Using side information, the decoder predicts the quantization bin q
For this, the decoder needs to request $k \leq M$ bits to establish which of the $2^M$ bins a pixel belongs to
With the calculated values of $q'$ and $\hat{S}$, the decoder calculates the MMSE reconstruction of the original frame, $S' = E(S \mid q', \hat{S})$
37
Pixel-domain Codec using RCPT
If the side information $\hat{S}$ is within the reconstructed bin, then the reconstructed pixel takes a value close to the side information value
Otherwise, $\hat{S}$ is outside the bin and the reconstruction function forces the reconstructed pixel to lie within the bin
Magnitude of the reconstruction error is limited to a maximum value determined by quantizer coarseness, a perceptually desirable property since it eliminates large errors, which might be annoying to the viewer
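The bin-bounded behaviour of the reconstruction function can be illustrated by a simple clamp. This is a deliberate simplification of the MMSE reconstruction (it keeps the side information when it falls inside the decoded bin and otherwise moves it to the nearest bin edge), assuming 8-bit pixels, 2**m_bits uniform bins and no dither; the names are hypothetical.

```python
def reconstruct(side_info, q, m_bits, max_val=256):
    """Clamp the side-information value into the decoded quantization bin q."""
    step = max_val / (1 << m_bits)
    lo, hi = q * step, (q + 1) * step      # edges of bin q
    return min(max(side_info, lo), hi)

# With 16 bins of width 16, bin 6 covers the range 96 to 112.
inside = reconstruct(100, 6, 4)    # side information already in the bin
above = reconstruct(130, 6, 4)     # side information above the bin
below = reconstruct(50, 6, 4)      # side information below the bin
```

Whatever the side information is, the output stays inside the decoded bin, so the error relative to the true pixel is bounded by the bin width.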
38
Pixel-domain Codec using RCPT
Compared to conventional motion-compensated coding, pixel-domain WZ coding is much less complex
Motion estimation, prediction and DCT are not required for encoding WZ frames
Slepian-Wolf encoder requires only two feedback shift registers and an interleaver
39
Transform-domain Codec
40
Transform-domain Encoding using RCPT
Block-wise DCT is applied to the WZ frame W in the encoder to generate the transformed signal X
Transform coefficients are grouped together to form coefficient bands $X_k$, where k denotes the coefficient number
Each transform coefficient band is then encoded independently
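The grouping of transform coefficients into bands can be sketched as below. The example assumes each block is already transformed and stored as an N x N list of lists, and that the coefficient number k follows raster order within a block; both the layout and the ordering are assumptions made for illustration.

```python
def group_into_bands(blocks):
    """Return bands where bands[k] holds the k-th coefficient of every block, in block order."""
    n = len(blocks[0])
    return [[blk[k // n][k % n] for blk in blocks] for k in range(n * n)]

# Two hypothetical 2x2 coefficient blocks.
blocks = [[[10, 1], [2, 0]],
          [[12, 3], [1, 1]]]
bands = group_into_bands(blocks)
```

Band 0 then collects the DC coefficients of all blocks, and each band can be quantized and coded with its own parameters.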
41
Transform-domain Encoding using RCPT
For each band $X_k$, the coefficients are quantized using a uniform scalar quantizer with $2^{M_k}$ levels
Quantized symbols, $q_k$, are converted to fixed-length binary codewords
Corresponding bit-planes are blocked together, forming bit-plane vectors
Each bit-plane vector is coded by the Slepian-Wolf encoder
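The conversion of quantized symbols into bit-plane vectors, and the regrouping done at the decoder, can be sketched as follows. The function names and the sample symbols are illustrative; `m_bits` stands for the number of bits per quantized symbol in a band.

```python
def to_bitplanes(symbols, m_bits):
    """Split fixed-length codewords into bit-plane vectors; planes[0] is the most significant."""
    return [[(s >> b) & 1 for s in symbols] for b in range(m_bits - 1, -1, -1)]

def from_bitplanes(planes):
    """Regroup bit-plane vectors (most significant first) into the quantized symbols."""
    symbols = [0] * len(planes[0])
    for plane in planes:
        symbols = [(s << 1) | bit for s, bit in zip(symbols, plane)]
    return symbols

# Hypothetical quantized symbols of one coefficient band, 3 bits each.
symbols = [5, 0, 3, 7]
planes = to_bitplanes(symbols, 3)
restored = from_bitplanes(planes)
```

Each inner list is one bit-plane vector that would be handed to the Slepian-Wolf encoder, starting with the most significant plane.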
42
Transform-domain Encoding using RCPT
Slepian-Wolf coder is implemented using RCPT
RCPT, combined with feedback, provides the rate flexibility that is essential for adapting to changing statistics between the side information and the frame to be encoded
Parity bits produced by the turbo encoder are stored in a buffer
The buffer transmits a subset of these parity bits to the decoder on request
43
Transform-domain Decoding using RCPT
Decoder takes previously reconstructed frames to form side information $\hat{W}$, an estimate of W
Block-wise DCT of $\hat{W}$ is taken to generate $\hat{X}$
Transform coefficients from $\hat{X}$ are grouped together to form coefficient bands $\hat{X}_k$ (side information corresponding to $X_k$)
To be able to use $\hat{X}_k$ at the turbo decoder and the reconstruction block, a statistical dependency model is assumed between $X_k$ and $\hat{X}_k$
44
Transform-domain Decoding using RCPT
Given a coefficient band, the turbo decoder successively decodes bit-planes, starting from the most significant bit-plane
Decoder uses the received subset of parity bits corresponding to that bit-plane, together with the side information, to decode the current bit-plane
If the decoder cannot reliably decode the bits, it requests additional parity bits from the encoder buffer through feedback
45
Transform-domain Decoding using RCPT
This “request-and-decode” process continues until an acceptable probability of reconstruction error is achieved
Probabilities generated for the current bit-plane are used for decoding lower-significance bit-planes
By using side information and successively decoding bit-planes, the decoder needs to request $R_k \leq M_k$ bits to decode which of the $2^{M_k}$ bins a transform coefficient belongs to
46
Transform-domain Decoding using RCPT
When all bit-planes are decoded, the bits are regrouped and the quantized symbol stream is reconstructed as $q'_k$
Reconstructed coefficient band is calculated as $X'_k = E(X_k \mid q'_k, \hat{X}_k)$
Assuming $q'_k$ is error-free, this reconstruction function bounds the magnitude of the reconstruction distortion to a maximum value that depends on quantizer coarseness
47
Transform-domain Decoding using RCPT
This property is desirable since it eliminates large positive or negative errors for a given transform coefficient
Fewer errors are perceptible to the viewer, and the subjective quality of the reconstructed video is improved
Finally, the reconstructed WZ frame is generated by taking the IDCT of the reconstructed coefficient bands
48
Motion-Compensated Side Information
Motion-compensated side information is generated at the decoder
As a result, decoders are more complex than encoders
Here we consider every odd frame to be a keyframe and every even frame to be a WZ frame
Frames may or may not be decoded in their actual sequence (similar to conventional video coding techniques)
49
Motion-compensated Interpolation (MC-I)
Side information for a WZ frame at time index t is generated by motion-compensated interpolation using the decoded keyframes at times $t-1$ and $t+1$
Involves symmetrical bi-directional block matching, smoothness constraints on the estimated motion, and overlapped block motion compensation
Since the next keyframe is needed for interpolation, frames are decoded out of order (similar to B frames in predictive video coding)
50
Motion-compensated Extrapolation (MC-E)
To generate side information for the WZ frame at time index t, we estimate the motion between the previously decoded WZ frame at time $t-2$ and the previously decoded keyframe at time $t-1$, using block matching and a smoothness constraint
Estimated motion is extrapolated to time t, and side information is generated by performing overlapped motion compensation using pixel values from the previous keyframe
51
Motion-compensated Extrapolation (MC-E)
Since a previously decoded WZ frame is used for motion estimation, reconstruction errors from all the previously decoded WZ frames can accumulate and degrade the reliability of motion compensation
Unlike MC-I, all frames can be decoded sequentially
52
Low-complexity side information
Simplified interpolation or extrapolation schemes reduce decoder complexity at the expense of compression efficiency
1. Average Interpolation (Ave-I): Side information for the WZ frame at time t is generated by averaging pixel values from the keyframes at times $t-1$ and $t+1$
2. Previous Frame Extrapolation (Prev-E): The previous keyframe is used directly as side information
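The two low-complexity schemes are simple enough to sketch directly. The example below assumes frames are stored as lists of rows of integer pixel values and rounds the average to the nearest integer; the representation, rounding rule and function names are assumptions made for illustration.

```python
def average_interpolation(prev_key, next_key):
    """Ave-I: pixel-wise average of the two neighbouring keyframes, rounded to nearest."""
    return [[(a + b + 1) // 2 for a, b in zip(row_prev, row_next)]
            for row_prev, row_next in zip(prev_key, next_key)]

def previous_frame_extrapolation(prev_key):
    """Prev-E: the previous keyframe is copied and used as side information."""
    return [row[:] for row in prev_key]

# Two hypothetical 2x2 keyframes surrounding a WZ frame.
prev_key = [[10, 20], [30, 40]]
next_key = [[20, 20], [31, 44]]
side_info_ave = average_interpolation(prev_key, next_key)
side_info_prev = previous_frame_extrapolation(prev_key)
```

Neither scheme performs motion estimation, so both are cheap at the decoder but yield less accurate side information when there is significant motion.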
54
B. Girod, A. Aaron, S. Rane, D. Rebollo-Monedero, “Distributed Video Coding” (Invited Paper), Proc. IEEE, Special Issue on Advances in Video Coding and Delivery, 2005
Catarina Isabel Carvalheiro Brites, “Advances on Distributed Video Coding”, MSc. Thesis, Technical University of Lisbon, Institute of Superior Technology
A. Aaron, R. Zhang, B. Girod, “Wyner-Ziv Coding of Motion Video”, Asilomar Conference on Signals and Systems, Pacific Grove, CA, Nov. 2002
A. Aaron, E. Setton, B. Girod, “Towards practical Wyner-Ziv Coding of Video”, Proc. IEEE International Conference on Image Processing, Barcelona, Spain, Sept. 2003
A. Aaron, S. Rane, E. Setton, B. Girod, “Transform-domain Wyner-Ziv Codec for Video”, Proc. SPIE Visual Communications and Image Processing, San Jose, CA, Jan. 2004
55
Thank You