Compression Efficiency and Delay Tradeoffs for Hierarchical B-Pictures and Pulsed-Quality Frames Athanasios Leontaris, Pamela C. Cosman Univ. of California.


1 Compression Efficiency and Delay Tradeoffs for Hierarchical B-Pictures and Pulsed-Quality Frames Athanasios Leontaris, Pamela C. Cosman Univ. of California at San Diego IEEE Transactions on Image Processing, July 2007

2 Outline Introduction End-to-End Delay Effect of branch removal from HIER coders Delay due to encoder output buffer Proposed framework for rate allocation Motivation Theoretical background Proposed estimate Rate allocation algorithm Rate control Results Conclusion

3 Introduction Constraining delay is critical for real-time communication and live event broadcast Compression efficiency can be improved by Increasing the buffering delay (so that the bit rate allocated to each frame can vary) Using more flexible motion-compensated prediction (MCP) structures When temporal correlation among several neighboring frames is better exploited, additional delay is incurred Example: the tradeoff between delay and compression in MCTF, where delay was reduced by selectively removing the update step Delay is an issue for hierarchical bi-predictive structures as well The delay in the hierarchical case depends on the GOP size Delay can’t be reduced by removing update steps while keeping the GOP size intact But in this work, it can be reduced by removing the MCP branches

4 Introduction One can also have increased delay when using single-direction (forward) prediction One such codec employs two reference frames, short-term (ST) and long-term (LT) At constant transmission bit rate, the LT frames will take longer to transmit, introducing delay Compression efficiency can be improved for certain sequences, but how about delay? A key element of a delay-constrained video encoder is the rate control scheme Rate control algorithms: Test Model 5 (TM5) for MPEG-2, a TM5 variant that replaces the block variance with the block SAD, and the quadratic rate-distortion model adopted in H.264

5 Introduction When the bit rate is distributed unevenly among the frames, extra buffering delay at the encoder output and decoder input is incurred So, given constraints on bit rate and buffering delay, an R-D model can yield an efficient rate allocation To obtain a model for hierarchical prediction, we need to account for the temporal prediction distance In [16], the rate and distortion were calculated as functions of the power spectral density of the prediction error This model introduced the concept of MC accuracy In this work, we use this accuracy to model the temporal prediction distance [16] B. Girod, “The efficiency of motion-compensating prediction for hybrid coding of video sequences”

6 End-to-End Delay The encoder is free to allocate the rate among the frames of the time unit so as to optimize some quality criterion, while making sure that the unit as a whole adheres to the CBR target rate The encoder output buffer determines how tightly the rate allocation and rate control must operate Allowing the encoder output buffer to be larger leads to higher video quality

7 End-to-End Delay Bits buffered in decoder input buffer is the same as the encoder output buffer The encoder buffer fullness and the decoder buffer fullness are always complementary to each other and have a constant sum equal to the max size of each buffer The source coding end-to-end delay D e2e is : Assuming the size of the encoder output buffer and the decoder input buffer is B, we can obtain

8 Five types of encoders (1) (2)

9 Five types of encoders (3)

10 Five types of encoders (4) (5) IBBBP coder, where all B-coded pictures are disposable and use only I- and P-coded pictures as references

11 End-to-End Delay For all codecs apart from IBBBP For IBBBP Source coding end-to-end delay For IPPPP and PULSE codecs (N_GOP = 1), the end-to-end delay is

12 Effect of branch removal from HIER coders

13 Truncated branch brings down the structural delay by one half The structure is similar to a GOP size 2 structure Differences: It still has 3 hierarchical levels and allows more granular temporal scalability or adaptation to network conditions Frame 4 is predicted from frame 0, instead of being predicted from frame 2 as for GOP size 2. This means the compression performance will be worse than for a GOP size 2 structure For hierarchical B-pictures, the tradeoff of compression efficiency for delay becomes a tradeoff of compression efficiency for increased temporal scalability and bit-stream resilience and decreased delay

14 Delay due to encoder output buffer To avoid a buffer overflow during encoding, the necessary condition is The encoder can estimate the encoder output buffer length from the bit rate allocation A useful and intuitive lower bound is It translates the delay constraint into a rate allocation constraint
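
The overflow condition and lower bound on this slide were lost in extraction, so the following is only a minimal sketch of the idea that a bit rate allocation implies an encoder output buffer length, and therefore a delay. It assumes a simple model that is not taken from the paper: each coded frame enters the buffer instantaneously and the CBR channel drains a fixed number of bits per frame interval. The function names and the example numbers are hypothetical.

```python
def peak_buffer_occupancy(frame_bits, channel_bits_per_frame):
    """Peak encoder output buffer occupancy (bits) for a per-frame allocation.

    Assumed model: a frame's bits arrive all at once, then the CBR channel
    removes channel_bits_per_frame bits before the next frame arrives.
    """
    occupancy = 0.0
    peak = 0.0
    for bits in frame_bits:
        occupancy += bits                      # coded frame enters the buffer
        peak = max(peak, occupancy)            # buffer must hold at least this much
        occupancy = max(0.0, occupancy - channel_bits_per_frame)  # CBR drain
    return peak


def buffering_delay_seconds(buffer_bits, channel_rate_bps):
    """Time needed to transmit a full buffer over a constant-rate channel."""
    return buffer_bits / channel_rate_bps


# Hypothetical example: 300 kbit/s channel at 30 fps, i.e. 10,000 bits per frame slot.
channel_rate_bps = 300_000
bits_per_slot = channel_rate_bps / 30

flat = [10_000] * 8                 # even allocation
pulsed = [45_000] + [5_000] * 7     # one high-quality frame, same total bits

flat_peak = peak_buffer_occupancy(flat, bits_per_slot)
pulsed_peak = peak_buffer_occupancy(pulsed, bits_per_slot)
pulsed_delay = buffering_delay_seconds(pulsed_peak, channel_rate_bps)
```

Under this model the uneven allocation needs a larger buffer than the even one for the same total bit budget, which is the sense in which a delay constraint becomes a rate allocation constraint.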

15 Proposed Framework for Rate Allocation

16 Motivation The JM reference software uses a single QP for the entire frame However, rate allocation under tight delay constraints can’t use the same QP for the entire frame Extend the rate control algorithm to offer per-block decisions of QP, and seek to avoid buffer overflow and underflow and satisfy the target rate Goal: establish the bit rate allocation for different hierarchical levels with B-coded pictures We found through experiments that the efficiency of the bi-directional prediction of a frame depends on the distance from its references

17 Motivation Assumption a) Frames within a temporal decomposition level have similar entropy → can be afforded the same number of bits We seek a solution that doesn’t depend on video content: fixed proportion of bits for each temporal decomposition level The requirement for fixed ratios is a result of computational and delay constraints
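
As a rough illustration of what a fixed, content-independent proportion of bits per temporal level could look like in practice, the sketch below splits a GOP bit budget so that every frame in a level receives the same number of bits. The function name, the 4:2:1 weights and the budget are hypothetical placeholders, not the ratios derived in the paper.

```python
def allocate_gop_bits(gop_bits, frames_per_level, level_weights):
    """Return the per-frame bit budget for each temporal level.

    frames_per_level[l] is the number of frames at level l in one GOP;
    level_weights[l] is the relative per-frame weight of level l.
    """
    total_weight = sum(n * w for n, w in zip(frames_per_level, level_weights))
    return [gop_bits * w / total_weight for w in level_weights]


# Hierarchical GOP of size 4: one level-0 frame, one level-1 frame, two level-2 frames.
frames_per_level = [1, 1, 2]
level_weights = [4, 2, 1]          # hypothetical fixed ratios
per_frame_bits = allocate_gop_bits(80_000, frames_per_level, level_weights)

# Total bits spent over the GOP, to check the budget is met.
spent = sum(n * b for n, b in zip(frames_per_level, per_frame_bits))
```

Because the weights are fixed in advance, the allocation needs no look-ahead or content analysis, which matches the computational and delay constraints named on the slide.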

18 Motivation Assumption b) Closed-loop coding Refers to using as references the previously reconstructed versions of the frames Approaches for rate allocation in open-loop MCTF are based on temporal propagation of the error; they don’t take into account the temporal distance between the frames and lack any delay constraints Not appropriate for this work since we can’t afford the delay and the computational complexity

19 Motivation Assumption c) High-rate operation Closed-loop prediction at high rates doesn’t alter the signal significantly Effect of quantization error on prediction efficiency can be neglected for fine quantization Then, a closed-loop video coder using the optimal open-loop rate allocation performs close to one using the optimal closed-loop rate allocation

20 Theoretical background Rate-distortion (R-D) modeling scheme from [16] and [24] [16] B. Girod, “The efficiency of motion-compensating prediction for hybrid coding of video sequences” [24] B. Girod, “Efficiency analysis of multihypothesis motion-compensated prediction for video coding”

21 Theoretical background The prediction error:

22 Theoretical background Assume the p.d.f. of the displacement and is a function of the temporal prediction distance If the power spectral density of the prediction error is known, then the error variance is given as The well-known rate distortion function for memoryless coding is
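
The formula on this slide did not survive extraction. For orientation only, the standard rate-distortion function of a memoryless Gaussian source under mean squared error is shown below; whether the slide used exactly this form is an assumption, and the symbol \sigma_e^2 (prediction error variance) is chosen here for illustration.

```latex
R(D) = \max\left\{ 0,\; \frac{1}{2} \log_2 \frac{\sigma_e^2}{D} \right\}
\quad \text{bits per sample}
```

In this form, lowering the error variance \sigma_e^2 at a fixed distortion D lowers the rate, which is why the dependence of the prediction error on temporal distance matters for rate allocation.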

23 Theoretical background Power spectrum is calculated for N-hypothesis prediction in [24] N=1 N=2

24 Theoretical background The power spectrum of the signal s is found in (19) in [17] as The noise power spectrum is The displacement error p.d.f Fourier transforms of the above p.d.f

25 Theoretical background and represent the Fourier transform of the spatial filter for single and double hypothesis For N=1 hypotheses, For N=2 hypotheses, Continue the derivation of the power spectral densities

26 Proposed estimate seems to be a logarithmic function of the temporal distance

27 Proposed estimate Replace with the expression derived for the error variance Term produces approximately logarithmically spaced rate-distortion functions

28 Proposed estimate For fixed the standard deviation of the motion compensation displacement error varies approximately linearly with the temporal prediction distance The final rate-distortion model is written as The motivation behind adding the term to the denominator of R_l is that hybrid video coding is closed-loop and thus a case of dependent video coding

29 Rate allocation algorithm

30 An alternative approach to calculate the rate ratios was proposed by a reviewer for the HIER structure with N_GOP = 4 Constraining D_t, solve equation (21) to find D_r. After D_r has been calculated, the calculation of R_0, R_1, and R_2 is straightforward

31 Rate control For the IPPPP codec, the rate control algorithm included in the JM 10.1 reference software is used directly To ensure accurate rate control under tight delay constraints, we adopt the rate control approach of the PULSE codec, with multiple rate-control paths, each of which maintains its own quadratic model For a hierarchical stream, the number of rate control bins is equal to the number of temporal decomposition levels
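
To make the idea of one quadratic model per rate-control bin concrete, here is a minimal sketch. It assumes the commonly cited quadratic form T = x1·MAD/Qstep + x2·MAD/Qstep² and solves it for the quantizer step; the class name, coefficient values and MAD values are hypothetical, and the per-frame coefficient update used by real rate controllers is omitted.

```python
import math


class QuadraticRateModel:
    """Assumed model: bits = x1 * mad / qstep + x2 * mad / qstep**2."""

    def __init__(self, x1, x2):
        self.x1 = x1
        self.x2 = x2

    def predict_bits(self, mad, qstep):
        return self.x1 * mad / qstep + self.x2 * mad / qstep ** 2

    def qstep_for_target(self, mad, target_bits):
        # Rearranged: target * q^2 - x1 * mad * q - x2 * mad = 0; take the positive root.
        a = target_bits
        b = -self.x1 * mad
        c = -self.x2 * mad
        return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)


# One independent model (bin) per temporal decomposition level; values are placeholders.
bins = {
    0: QuadraticRateModel(x1=1000.0, x2=5000.0),
    1: QuadraticRateModel(x1=800.0, x2=3000.0),
    2: QuadraticRateModel(x1=600.0, x2=2000.0),
}

level, mad, target_bits = 1, 4.0, 2000.0
qstep = bins[level].qstep_for_target(mad, target_bits)
check_bits = bins[level].predict_bits(mad, qstep)
```

Keeping the bins separate means that statistics from frames at one temporal level do not distort the quantizer choice for frames at another level, whose prediction distances and bit budgets differ.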

32 Results Video sequences: Akiyo : very static image sequence Carphone : includes localized motion of various kinds. The majority of activity is due to the instability of the camera inside the car. There is repetitive translational global motion Flower : high freq. content, and the motion is global and follows mainly the affine model Football : extremely active with local object motion Mobile : substantial high freq. content and the motion is mostly global due to the horizontal camera pan Stefan : sports clip featuring a tennis court with very high motion

33 Results PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Akiyo” CIF 352x288 at 15 fps. Initial QP 39. (b) “Akiyo” CIF at 30 fps. Initial QP 30.

34 Results PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Carphone” QCIF 176x144 at 15 fps. Initial QP 31. (b) “Carphone” QCIF at 30 fps. Initial QP 29

35 Results PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Flower-Garden” SIF 352x240 at 30 fps. Initial QP 29. (b) “Football” SIF at 30 fps. Initial QP 33

36 Results PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Mobile-Calendar” QCIF 176x144 at 30 fps. Initial QP 29. (b) “Stefan” CIF 352x288 at 30 fps. Initial QP 31.

37 Results


39 Conclusion The study of the delay tradeoffs yielded the following conclusions: IPPPP performs well for low-delay applications and for sequences with high motion PULSE is advantageous for relatively static sequences with repetitive content N_GOP > 1 structures benefit from static sequences and from sequences with global motion As N_GOP increases, the gain is nontrivial only if the sequence is either static or has translational global motion For the sequences we evaluated, the delay thresholds are as follows: between 40 and 80 ms, IPPPP is the best choice; between 80 and 125 ms, PULSE performs well; the large range between 125 and 270 ms is dominated by N_GOP = 2; and for delays larger than 270 ms, N_GOP = 4 is the best choice. However, delays larger than 270 ms are useful only for live event broadcast or streaming of stored content. They are prohibitive for real-time interactive communication The truncated N_GOP = 4 codec underperforms the N_GOP = 2 codec but has similar delay, with the added advantage of increased temporal scalability
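
The delay thresholds above can be read as a simple lookup. The sketch below encodes them as stated for the evaluated sequences; the function name is hypothetical, the handling of the exact boundary values (80, 125, 270 ms) is an assumption, and budgets below 40 ms return None because the slide gives no recommendation for them.

```python
def suggest_codec(delay_budget_ms):
    """Map an end-to-end delay budget (ms) to the structure the slide favors."""
    if delay_budget_ms < 40:
        return None                 # no recommendation stated below 40 ms
    if delay_budget_ms <= 80:
        return "IPPPP"
    if delay_budget_ms <= 125:
        return "PULSE"
    if delay_budget_ms <= 270:
        return "HIER N_GOP=2"
    return "HIER N_GOP=4"           # only sensible for broadcast or stored content


choices = [suggest_codec(d) for d in (30, 60, 100, 200, 400)]
```

These thresholds come from the six test sequences only, so the mapping is a rule of thumb rather than a guarantee for other content.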

