A Comparative Study of Depth Map Coding Schemes for 3D Video

Harsh Nayyar, Nirabh Regmi, Audrey Wei
March 10th, 2011
EE 398A: Image and Video Compression
Professor Girod

Overview
Background & Motivation
Research Methodology
Results & Performance Comparisons
  Block Transforms (DCT, KLT)
  Block Truncation Coding (BTC)
Conclusion
Questions

Background & Motivation
3D Compression
  Issue: Bit rate scales linearly with number of views
  Proposed solution: Code 2-3 views along with depth maps to synthesize intermediate views [Wiegand et al.]
  Requires good depth maps
Depth Maps
  Desirable to preserve edges
  Not typical images

Research Methodology
Block Transform Coding
  DCT and KLT
Block Truncation Coding
  Constant and adaptive block sizes
Distortion calculated based on synthesized view from uncompressed depth maps
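
One way to read the distortion measure above is as a PSNR between the view synthesized from compressed depth maps and the view synthesized from uncompressed depth maps. The sketch below assumes 8-bit samples (peak value 255) and images stored as lists of rows; the function names `mse` and `psnr` and the sample values are illustrative, and the view synthesis step itself is not shown.

```python
import math

def mse(reference, test):
    """Mean squared error between two equally sized images (lists of rows)."""
    total, count = 0.0, 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count

def psnr(reference, test, peak=255.0):
    """PSNR in dB; returns infinity when the two images are identical."""
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / err)

# Hypothetical 2x2 views: 'ref_view' stands for the view synthesized from
# uncompressed depth maps, 'test_view' for the one from compressed depth maps.
ref_view = [[100, 102], [98, 101]]
test_view = [[101, 102], [98, 99]]
print("PSNR (dB):", psnr(ref_view, test_view))
```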

System Overview
[Block diagram labels: Left Image; (Compressed); Left Depth Map; View Synthesis; Intermediate Image; Right Image; (Compressed); Right Depth Map]

Evaluation Methodology
Test Sequences: Balloons & Kendo
Depth Maps: Cameras 1 & 3
Synthesized Views: Camera 2
Acknowledgement: Tanimoto Lab, Nagoya University

Discrete Cosine Transform (DCT)
Block Matrix Sizes: M = 8, 16
Uniform Quantizer Step Sizes:
Entropy Coding
Type used: DCT-II
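
The block transform step on this slide can be sketched as a separable 2-D DCT-II on an M x M block followed by a uniform quantizer. This is a minimal illustration, not the authors' code: the orthonormal scaling, the helper names, and the sample block are assumptions, and the entropy coding stage is omitted.

```python
import math

def dct_matrix(m):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(m):
        scale = math.sqrt(1.0 / m) if k == 0 else math.sqrt(2.0 / m)
        rows.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * m))
                     for n in range(m)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(block):
    """Forward 2-D DCT-II: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    """Inverse 2-D transform: C^T * Y * C (C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def quantize(coeffs, step):
    """Uniform quantizer: map each coefficient to the nearest multiple index."""
    return [[round(v / step) for v in row] for row in coeffs]

def dequantize(indices, step):
    return [[q * step for q in row] for row in indices]

# Hypothetical 8x8 depth block with a vertical edge (left 50, right 200).
block = [[50] * 4 + [200] * 4 for _ in range(8)]
step = 128
indices = quantize(dct2(block), step)
reconstruction = idct2(dequantize(indices, step))
print("DC index:", indices[0][0])
print("first reconstructed row:", [round(v) for v in reconstruction[0]])
```

A coarse step size zeroes most AC coefficients, which is where the smoothing of depth edges comes from in a transform coder of this kind.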

Discrete Cosine Transform (cont.)
[Plot annotations: Quantizer step size = $2^1$; Quantizer step size = $2^8$]

Discrete Cosine Transform (cont.)
balloons error, M = 8, Q = 128

Karhunen-Loeve Transform (KLT)
Block Matrix Sizes: M = 8, 16
Uniform Quantizer Step Sizes =
Entropy Coding
Training Set: composed from both views
[Diagram labels: M m x n x p x M M]
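
A KLT basis is obtained from the eigenvectors of the covariance matrix of a training set of blocks. The sketch below is an illustration under several assumptions: blocks are flattened into vectors, the training set is a handful of made-up 2 x 2 blocks rather than blocks drawn from both views, and a small Jacobi eigenvalue routine stands in for whatever eigensolver the authors used. All function names are hypothetical.

```python
import math

def covariance(vectors):
    """Return (covariance matrix, mean vector) of equal-length vectors."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    cov = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / n
            for j in range(dim)] for i in range(dim)]
    return cov, mean

def jacobi_eigen(sym, sweeps=50, tol=1e-18):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.
    Returns (eigenvalues, V); column k of V is the k-th eigenvector."""
    n = len(sym)
    a = [list(map(float, row)) for row in sym]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    scale = sum(x * x for row in a for x in row) or 1.0
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off <= tol * scale:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4, a[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J (accumulate eigenvectors)
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def klt_basis(vectors):
    """Rows of the returned basis are eigenvectors, largest variance first."""
    cov, mean = covariance(vectors)
    vals, vecs = jacobi_eigen(cov)
    order = sorted(range(len(vals)), key=lambda i: -vals[i])
    basis = [[vecs[k][i] for k in range(len(vals))] for i in order]
    return basis, mean, [vals[i] for i in order]

def klt_forward(vec, basis, mean):
    return [sum(b * (x - m) for b, x, m in zip(row, vec, mean))
            for row in basis]

def klt_inverse(coeffs, basis, mean):
    return [mean[i] + sum(c * row[i] for c, row in zip(coeffs, basis))
            for i in range(len(mean))]

# Hypothetical training set: flattened 2x2 depth blocks.
training = [[50, 50, 50, 50], [50, 200, 50, 200], [200, 200, 200, 200],
            [50, 52, 49, 51], [198, 201, 200, 199], [50, 50, 200, 200]]
basis, mean, variances = klt_basis(training)
print("coefficient variances:", [round(x, 2) for x in variances])
```

The transform coefficients produced by `klt_forward` would then go through the same uniform quantizer and entropy coder as in the DCT case.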

Karhunen-Loeve Transform (cont.)
[Plot annotations: Quantizer step size = $2^1$; Quantizer step size = $2^8$]

Karhunen-Loeve Transform (cont.)
balloons error, M = 8, Q = 128

Block Truncation Coding (BTC)
Good at preserving edges
Quantized values per block: a & b
Block Matrix Sizes: M = 2, 4, 8, 16, 32, 64
Entropy Coding
if $X_i \le \bar{X}$, output = a
if $X_i > \bar{X}$, output = b
where q = # of $X_i$'s > $\bar{X}$, for i = 1, 2, …, $M^2$
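
The two-level quantization on this slide can be sketched with the moment-preserving formulas from Delp and Mitchell's BTC paper, which keep the block mean and variance unchanged. The slide's own expressions for a and b did not survive in the transcript, so the formulas below, the use of the block mean as threshold, and the function names are assumptions; entropy coding of a, b, and the bitmap is not shown.

```python
import math

def btc_encode(block):
    """Return (a, b, bitmap) for one block given as a list of rows."""
    pixels = [x for row in block for x in row]
    m = len(pixels)                      # m = M * M samples in the block
    mean = sum(pixels) / m
    sigma = math.sqrt(sum((x - mean) ** 2 for x in pixels) / m)
    bitmap = [[1 if x > mean else 0 for x in row] for row in block]
    q = sum(sum(row) for row in bitmap)  # number of samples above the mean
    if q == 0:                           # flat block: a single level suffices
        return mean, mean, bitmap
    a = mean - sigma * math.sqrt(q / (m - q))
    b = mean + sigma * math.sqrt((m - q) / q)
    return a, b, bitmap

def btc_decode(a, b, bitmap):
    """Rebuild the block: b where the bitmap is 1, a elsewhere."""
    return [[b if bit else a for bit in row] for row in bitmap]

# Hypothetical 4x4 depth block with a sharp vertical edge.
block = [[50, 50, 200, 200] for _ in range(4)]
a, b, bitmap = btc_encode(block)
print("a =", a, "b =", b)
print("decoded first row:", btc_decode(a, b, bitmap)[0])
```

A block containing only two depth values, as in this example, is reproduced exactly, which is one way to see why BTC suits the sharp edges of depth maps.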

Block Truncation Coding (cont.)
[Plot annotations: M = 8; M = 4; ~1.1 dB]

balloons error, M = 64

Adaptive BTC
Spend bits where necessary
  Large blocks handle background (low rate)
  Small blocks handle edges (high rate)
Make block size selection based on Lagrangian cost function

Adaptive BTC (cont.)
Lagrangian cost function, $J = D + \lambda R$
  Joint cost of both depth maps
  Distortion (D) processed from synthesized view
  $\lambda = 2^0$ – $2^8$
Bit rate (R) calculation
  6 Block sizes (M = 2-64): 3 bits
  Quantized values, a & b: Entropy coding
  Positions of a & b in the block: Run Length Coding & Entropy coding
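
The block-size decision can be sketched as a comparison of Lagrangian costs: keep one large block if its cost J = D + λR is no larger than the summed cost of its four sub-blocks, otherwise split. The distortion and rate figures below are made-up numbers; in the scheme described on the slide, D would come from the synthesized view and R from the entropy-coded a, b, bitmap, and block-size bits. The function names are hypothetical.

```python
import math

# Six block sizes (M = 2, 4, 8, 16, 32, 64) need ceil(log2(6)) = 3 bits.
BLOCK_SIZE_BITS = math.ceil(math.log2(6))

def lagrangian_cost(distortion, rate_bits, lam):
    """J = D + lambda * R."""
    return distortion + lam * rate_bits

def choose_block_size(parent, children, lam):
    """parent: (D, R) for one large block.
    children: list of (D, R) pairs for its four sub-blocks.
    Returns ('merge' or 'split', cost of the chosen option)."""
    d, r = parent
    j_parent = lagrangian_cost(d, r, lam)
    j_children = sum(lagrangian_cost(cd, cr, lam) for cd, cr in children)
    if j_parent <= j_children:
        return "merge", j_parent
    return "split", j_children

# Hypothetical numbers: one block straddling an edge vs. its four quadrants.
parent = (400.0, 20)           # high distortion, few bits
children = [(10.0, 18)] * 4    # low distortion, more bits in total
for exponent in range(0, 9):   # lambda = 2^0 ... 2^8, the range on the slide
    lam = 2 ** exponent
    decision, cost = choose_block_size(parent, children, lam)
    print(f"lambda = {lam:3d}: {decision} (J = {cost:.0f})")
```

Small values of λ favor splitting (distortion dominates) and large values favor merging (rate dominates), which is the trade-off the adaptive scheme exploits.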

Adaptive BTC (cont.)
[Plot annotation: as Mmax increases]

Final Results

Final Results (cont.)
Balloons error (frame 1)
Scheme: DCT (M = 8, Q = 64) PSNR = dB Rate = bpp

Scheme: Fixed BTC (M=32) PSNR = dB Rate = bpp

Scheme: A-BTC (Mmax=64,Q=32) PSNR = dB Rate = bpp

Final Results (cont.)

Conclusion
Depth Maps
  Not ordinary images
  Important to preserve edges
Adaptive BTC technique can optimally trade off rate and synthesized distortion
Fixed BTC outperforms DCT, KLT without side information about synthesized distortion
Adaptive BTC outperforms DCT, KLT, Fixed BTC

Future Work
Adaptive BTC
  Joint Lagrangian cost based on all possible ways of breaking down blocks in pair of views
    Our implementation is sub-optimal
  Investigate heuristics to perform block sub-division top-down rather than bottom-up
Preserve higher moments in BTC
  Only preserved 2nd moment
Larger block sizes
  Only used up to Mmax = 64

References
N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” IEEE Trans. Comput., vol. C-23, pp , 1974.
Balloons & Kendo Sequences, Nagoya University Tanimoto Laboratory.
E. Delp and O. Mitchell, “Image Compression Using Block Truncation Coding,” IEEE Trans. Commun., vol. 27, no. 9, pp , Sep
Z. Li and M. Drew, “Karhunen-Loeve Transform,” in Fundamentals of Multimedia. Upper Saddle River: Pearson Education, 2004, ch. 8, sec pp
P. Merkle, Y. Morvan, A. Smolic, D. Farin, K. Müller, P. H. N. de With, and T. Wiegand, “The effects of multiview depth video compression on multiview rendering,” Signal Process.: Image Commun., vol. 24, no. 1+2, pp. 73-88, Jan
K. Müller, P. Merkle, and T. Wiegand, “3-D video representation using depth maps,” Proceedings of the IEEE, vol. PP, no. 99, pp. 1-14, 2010.
