
A Comparative Study of Depth Map Coding Schemes for 3D Video

A Comparative Study of Depth Map Coding Schemes for 3D Video. Harsh Nayyar, Nirabh Regmi, Audrey Wei. March 10th, 2011. EE 398A: Image and Video Compression, Professor Girod. Overview: Background & Motivation; Research Methodology; Results & Performance Comparisons.

Presentation Transcript


  1. A Comparative Study of Depth Map Coding Schemes for 3D Video Harsh Nayyar, Nirabh Regmi, Audrey Wei March 10th, 2011 EE 398A: Image and Video Compression Professor Girod

  2. Overview • Background & Motivation • Research Methodology • Results & Performance Comparisons • Block Transforms (DCT, KLT) • Block Truncation Coding (BTC) • Conclusion • Questions

  3. Background & Motivation • 3D Compression • Issue: Bit rate scales linearly with number of views • Proposed solution: Code 2-3 views along with depth maps to synthesize intermediate views [Wiegand et al.] • Requires good depth maps • Depth Maps • Desirable to preserve edges • Not typical images

  4. Research Methodology • Block Transform Coding • DCT and KLT • Block Truncation Coding • Constant and adaptive block sizes • Distortion calculated based on synthesized view from uncompressed depth maps
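
The distortion measure on slide 4 can be illustrated with a short sketch. It assumes PSNR with a peak value of 255 (8-bit samples) as the metric, which matches the dB figures reported later in the deck, and it takes the two synthesized views as already computed. The function name `psnr` and the tiny 2x2 "views" are hypothetical, chosen only for illustration.

```python
import math

def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    total = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)

# Hypothetical data: view synthesized from uncompressed depth maps (reference)
# versus view synthesized from compressed depth maps (test).
ref_view = [[100, 100], [200, 200]]
test_view = [[100, 102], [198, 200]]
print(round(psnr(ref_view, test_view), 2))
```

The point of measuring against the synthesized reference rather than the depth map itself is that depth errors only matter to the extent that they shift pixels in the rendered view.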

  5. System Overview Left Image (Compressed) Left Depth Map View Synthesis Intermediate Image Right Image (Compressed) Right Depth Map

  6. Evaluation Methodology • Test Sequences: Balloons & Kendo • Depth Maps: Cameras 1 & 3 • Synthesized Views: Camera 2 Acknowledgement: Tanimoto Lab, Nagoya University

  7. Discrete Cosine Transform (DCT) • Block Matrix Sizes: M = 8, 16 • Uniform Quantizer • Step Sizes: $2^1$ - $2^8$ • Entropy Coding • Type used: DCT-II
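
As a rough illustration of the pipeline on slide 7, the sketch below applies an orthonormal 2D DCT-II to one M x M block, quantizes the coefficients with a uniform quantizer, and reconstructs the block. It is a minimal pure-Python sketch, not the authors' implementation: the helper names (`dct_matrix`, `code_block`) are hypothetical, the quantizer is assumed to be mid-tread, and entropy coding of the indices is omitted.

```python
import math

def dct_matrix(m):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(m):
        scale = math.sqrt(1.0 / m) if k == 0 else math.sqrt(2.0 / m)
        rows.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * m))
                     for n in range(m)])
    return rows

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def code_block(block, step):
    """Return (quantizer indices, reconstructed block) for one square block."""
    c = dct_matrix(len(block))
    ct = transpose(c)
    coeffs = matmul(matmul(c, block), ct)          # forward 2D DCT: C X C^T
    # Uniform quantizer (assumed mid-tread); round() resolves exact ties to even.
    indices = [[round(v / step) for v in row] for row in coeffs]
    dequant = [[q * step for q in row] for row in indices]
    recon = matmul(matmul(ct, dequant), c)         # inverse 2D DCT: C^T Y C
    return indices, recon

# Hypothetical 8x8 depth block with a vertical edge, coded at step size 2^4.
edge_block = [[50] * 4 + [200] * 4 for _ in range(8)]
indices, recon = code_block(edge_block, 2 ** 4)
nonzero = sum(1 for row in indices for q in row if q != 0)
print("nonzero coefficients:", nonzero)
```

A sharp edge spreads energy across several horizontal frequencies, so coarse step sizes tend to blur or ring around exactly the structure that depth maps need to keep.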

  8. Discrete Cosine Transform (cont.) [Figures: quantizer step size = $2^1$; quantizer step size = $2^8$]

  9. Discrete Cosine Transform (cont.) balloons error, M = 8, Q = 128

  10. Karhunen-Loeve Transform (KLT) • Block Matrix Sizes: M = 8, 16 • Uniform Quantizer • Step Sizes: $2^1$ - $2^8$ • Entropy Coding • Training Set: composed from both views [Diagram: dimension labels m × n × p and M × M]

  11. Karhunen-Loeve Transform (cont.) [Figures: quantizer step size = $2^1$; quantizer step size = $2^8$]

  12. Karhunen-Loeve Transform (cont.) balloons error, M = 8, Q = 128

  13. Block Truncation Coding (BTC) • Good at preserving edges • Quantized values per block: a & b • Block Matrix Sizes: M = 2, 4, 8, 16, 32, 64 • Entropy Coding • Quantization rule: if $X_i \le \bar{X}$, output = a; if $X_i > \bar{X}$, output = b, for $i = 1, 2, \ldots, M^2$, where $\bar{X}$ is the block mean and q = # of $X_i$'s $> \bar{X}$
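
The two-level quantizer on slide 13 can be sketched as follows. This follows the classic moment-preserving formulation of Delp and Mitchell (listed in the references), with the block mean assumed as the threshold; whether the authors used exactly this variant is not stated in the transcript. The function names are hypothetical, and the rounding and entropy coding of a and b are left out.

```python
import math

def btc_encode(block):
    """Encode one block as (a, b, bitmap) using moment-preserving BTC."""
    pixels = [p for row in block for p in row]
    m = len(pixels)
    mean = sum(pixels) / m
    sigma = math.sqrt(sum((p - mean) ** 2 for p in pixels) / m)
    # Bitmap marks pixels above the mean; q counts them.
    bitmap = [[1 if p > mean else 0 for p in row] for row in block]
    q = sum(sum(row) for row in bitmap)
    if q == 0 or q == m:
        return mean, mean, bitmap  # flat block: a single level suffices
    a = mean - sigma * math.sqrt(q / (m - q))        # low reconstruction level
    b = mean + sigma * math.sqrt((m - q) / q)        # high reconstruction level
    return a, b, bitmap

def btc_decode(a, b, bitmap):
    return [[b if bit else a for bit in row] for row in bitmap]

# Hypothetical 4x4 depth block containing a clean vertical edge.
edge_block = [[50, 50, 200, 200] for _ in range(4)]
a, b, bitmap = btc_encode(edge_block)
decoded = btc_decode(a, b, bitmap)
print(a, b)
```

For a block that contains only two depth levels, the reconstruction is exact up to floating-point rounding, which is the intuition behind BTC being good at preserving edges.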

  14. Block Truncation Coding (cont.) [Plot annotations: M = 8; M = 4; gap of ~1.1 dB]

  15. Block Truncation Coding (cont.) balloons error, M = 64

  16. Block Truncation Coding (cont.) balloons error, M = 16

  17. Block Truncation Coding (cont.) balloons error, M = 2

  18. Adaptive BTC • Spend bits where necessary • Large blocks handle background (low rate) • Small blocks handle edges (high rate) • Make block size selection based on Lagrangian cost function

  19. Adaptive BTC (cont.) • Lagrangian cost function, $J = D + \lambda R$ • Joint cost of both depth maps • Distortion (D) processed from synthesized view • $\lambda = 2^0$ - $2^8$ • Bit rate (R) calculation • 6 Block sizes (M = 2 - 64): 3 bits • Quantized values, a & b: Entropy coding • Positions of a & b in the block: Run Length Coding & Entropy coding
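
The block-size decision described on slides 18 and 19 can be sketched as a minimization of a Lagrangian cost of the usual form J = D + λR over the candidate block sizes. The sketch below is deliberately simplified: it scores one region in isolation, whereas the slides describe a joint cost over both depth maps with distortion measured on the synthesized view. The function name and the (block size, distortion, rate) numbers are hypothetical.

```python
def select_block_size(candidates, lam):
    """Pick the candidate minimizing J = D + lam * R.

    candidates: list of (block_size, distortion, rate_bits) tuples, where
    rate_bits is assumed to already include the block-size signalling bits.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])

# Made-up numbers for one region: larger blocks cost fewer bits but distort more.
candidates = [
    (64, 9000.0, 20),
    (16, 3000.0, 120),
    (4, 500.0, 700),
]

for lam in (2 ** 0, 2 ** 3, 2 ** 8):
    size, dist, rate = select_block_size(candidates, lam)
    print("lambda =", lam, "-> block size", size)
```

With these numbers a small λ favors the small, accurate blocks, and a large λ pushes the choice toward large, cheap blocks, which is the rate-distortion trade-off the adaptive scheme exploits.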

  20. Adaptive BTC (cont.) as Mmax increases

  21. Final Results

  22. Final Results (cont.) Balloons error (frame 1) Scheme: DCT (M = 8, Q = 64) PSNR = 37.65 dB Rate = 0.07465 bpp

  23. Final Results (cont.) Balloons error (frame 1) Scheme: Fixed BTC (M=32) PSNR = 38.6070 dB Rate = 0.0703 bpp

  24. Final Results (cont.) Balloons error (frame 1) Scheme: A-BTC (Mmax=64,Q=32) PSNR = 41.4849 dB Rate = 0.0622 bpp

  25. Final Results (cont.)

  26. Conclusion • Depth Maps • Not ordinary images • Important to preserve edges • Adaptive BTC technique can optimally trade off rate and synthesized distortion • Fixed BTC outperforms DCT, KLT without side information about synthesized distortion • Adaptive BTC outperforms DCT, KLT, Fixed BTC

  27. Future Work • Adaptive BTC • Joint Lagrangian cost based on all possible ways of breaking down blocks in pair of views • Our implementation is sub-optimal • Investigate heuristics to perform block sub-division top-down rather than bottom-up • Preserve higher moments in BTC • Only preserved 2nd moment • Larger block sizes • Only used up to Mmax = 64

  28. References • N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” IEEE Trans. Comput., vol. C-23, pp. 90-93, 1974. • Balloons & Kendo Sequences, Nagoya University Tanimoto Laboratory, http://www.tanimoto.nuee.nagoya-u.ac.jp/. • E. Delp and O. Mitchell, “Image compression using block truncation coding,” IEEE Trans. Commun., vol. 27, no. 9, pp. 1335-1342, Sep. 1979. • Z. Li and M. Drew, “Karhunen-Loeve Transform,” in Fundamentals of Multimedia. Upper Saddle River: Pearson Education, 2004, ch. 8, sec. 5.2, pp. 220-222. • P. Merkle, Y. Morvan, A. Smolic, D. Farin, K. Müller, P. H. N. de With, and T. Wiegand, “The effects of multiview depth video compression on multiview rendering,” Signal Process.: Image Commun., vol. 24, no. 1-2, pp. 73-88, Jan. 2009. • K. Müller, P. Merkle, and T. Wiegand, “3-D video representation using depth maps,” Proceedings of the IEEE, vol. PP, no. 99, pp. 1-14, 2010.
