COMPARISON OF 8 × 8 INTEGER DCTs USED IN H.264, AVS-CHINA AND VC-1 VIDEO CODECS

Submitted by Ashwini Urs and Sharath Patil, under the guidance of Dr. K. R. Rao

Introduction

Integer DCT
• The Karhunen-Loève transform (KLT) is the statistically optimal transform.
• The performance of the DCT is close to that of the KLT [1].
• The DCT is a well-known transform and is used by the majority of coding standards.
• Although the integer DCT contains only integer entries, its energy-packing ability is similar to that of the DCT [1].

Integer DCT (Continued)
• The integer cosine transform involves no floating-point computation and is therefore used in video coding standards such as H.264 [2], VC-1 [3] and AVS [4].
• The integer cosine transform has been implemented with transform sizes of 4, 8 and 16 [1].
• Even larger transforms (up to size 64) have been used for high-resolution video to achieve higher coding gain [1].

Integer DCTs compared

Integer DCT matrix for AVS-China, H.264 and VC-1
[Figure: 8 × 8 integer DCT matrices of AVS-China [4], H.264 [2] and VC-1 [3]]

Integer DCT matrix for AVS-China, H.264 and VC-1
• The orthogonality of the three matrices was checked by evaluating $[\mathrm{INTDCT}_i]\,[\mathrm{INTDCT}_i]^{T}$ for each matrix.
• Each product is a diagonal matrix, which confirms that the rows are mutually orthogonal, although they do not have unit norm:
• AVS-China: diag(512, 442, 464, 442, 512, 442, 464, 442)
• H.264: diag(512, 578, 320, 578, 512, 578, 320, 578)
• VC-1: diag(1152, 1156, 1168, 1156, 1152, 1156, 1168, 1156)
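The orthogonality check can be sketched in a few lines of Python. The three matrices below are an assumption: they are the commonly cited 8 × 8 integer kernels for each standard, written out here because the matrix images did not survive in the slides. The helper names `gram`, `is_diagonal` and `diagonal` are illustrative only.

```python
# Assumption: commonly cited 8x8 integer DCT kernels, not recovered from the slides.
AVS = [
    [8, 8, 8, 8, 8, 8, 8, 8],
    [10, 9, 6, 2, -2, -6, -9, -10],
    [10, 4, -4, -10, -10, -4, 4, 10],
    [9, -2, -10, -6, 6, 10, 2, -9],
    [8, -8, -8, 8, 8, -8, -8, 8],
    [6, -10, 2, 9, -9, -2, 10, -6],
    [4, -10, 10, -4, -4, 10, -10, 4],
    [2, -6, 9, -10, 10, -9, 6, -2],
]
H264 = [
    [8, 8, 8, 8, 8, 8, 8, 8],
    [12, 10, 6, 3, -3, -6, -10, -12],
    [8, 4, -4, -8, -8, -4, 4, 8],
    [10, -3, -12, -6, 6, 12, 3, -10],
    [8, -8, -8, 8, 8, -8, -8, 8],
    [6, -12, 3, 10, -10, -3, 12, -6],
    [4, -8, 8, -4, -4, 8, -8, 4],
    [3, -6, 10, -12, 12, -10, 6, -3],
]
VC1 = [
    [12, 12, 12, 12, 12, 12, 12, 12],
    [16, 15, 9, 4, -4, -9, -15, -16],
    [16, 6, -6, -16, -16, -6, 6, 16],
    [15, -4, -16, -9, 9, 16, 4, -15],
    [12, -12, -12, 12, 12, -12, -12, 12],
    [9, -16, 4, 15, -15, -4, 16, -9],
    [6, -16, 16, -6, -6, 16, -16, 6],
    [4, -9, 15, -16, 16, -15, 9, -4],
]


def gram(a):
    """Return the product A * A^T as a list of lists."""
    return [[sum(x * y for x, y in zip(r, s)) for s in a] for r in a]


def is_diagonal(m):
    """True if every off-diagonal entry of the square matrix m is zero."""
    n = len(m)
    return all(m[i][j] == 0 for i in range(n) for j in range(n) if i != j)


def diagonal(m):
    """Return the main diagonal of the square matrix m."""
    return [m[i][i] for i in range(len(m))]


for name, mat in (("AVS-China", AVS), ("H.264", H264), ("VC-1", VC1)):
    g = gram(mat)
    print(name, "orthogonal rows:", is_diagonal(g), "diag:", diagonal(g))
```

With these kernels, the diagonals agree with the values listed on the slide, which is some evidence that the assumed matrices are the ones the authors used.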

Order-16 Integer DCT matrix used in AVS-China [26]

Comparison of the properties of integer DCTs

Comparison of integer DCT matrices
• The properties of the three integer DCT matrices were compared using the covariance matrix $R$ of a Markov-I process with $\rho = 0.95$ and $N = 8$.
• $R_{jk} = \rho^{|j-k|}$ for $j, k = 0, 1, \ldots, N-1$, where $\rho$ is the adjacent correlation coefficient.
• The covariance matrix in the transform domain is given by $[\tilde{\Sigma}] = [\mathrm{DOT}]\,[\Sigma]\,[\mathrm{DOT}]^{T}$, where DOT is the discrete orthogonal transform and $[\Sigma]$ is the covariance matrix in the spatial domain (here, $R$).
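A minimal sketch of this setup follows. It builds the Markov-I covariance matrix, scales the rows of an integer kernel to unit norm so that the transform is orthonormal, and forms the transform-domain covariance. The AVS-China kernel is an assumption (the commonly cited matrix, not recovered from the slides), and all function names are illustrative.

```python
import math

# Assumption: commonly cited AVS-China 8x8 integer DCT kernel.
AVS = [
    [8, 8, 8, 8, 8, 8, 8, 8],
    [10, 9, 6, 2, -2, -6, -9, -10],
    [10, 4, -4, -10, -10, -4, 4, 10],
    [9, -2, -10, -6, 6, 10, 2, -9],
    [8, -8, -8, 8, 8, -8, -8, 8],
    [6, -10, 2, 9, -9, -2, 10, -6],
    [4, -10, 10, -4, -4, 10, -10, 4],
    [2, -6, 9, -10, 10, -9, 6, -2],
]


def markov_covariance(rho, n):
    """Covariance of a Markov-I process: R[j][k] = rho ** |j - k|."""
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]


def normalize_rows(mat):
    """Scale every row to unit Euclidean norm."""
    out = []
    for row in mat:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row])
    return out


def transform_covariance(t, sigma):
    """Return T * Sigma * T^T for square matrices given as lists of lists."""
    n = len(t)
    ts = [[sum(t[i][k] * sigma[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(ts[i][k] * t[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


R = markov_covariance(0.95, 8)
T = normalize_rows(AVS)
cov = transform_covariance(T, R)
variances = [cov[i][i] for i in range(8)]
print("transform-domain variances:", [round(v, 4) for v in variances])
```

Because the normalized transform is orthonormal, the variances sum to the trace of $R$, which is 8 for $N = 8$.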

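Properties used for comparison of integer DCTs
• Variance distribution: The diagonal elements of $[\tilde{\Sigma}]$ are the variances $\tilde{\sigma}_{kk}^2$ of the coefficients in the transform domain [7].
• Rate versus distortion: $R_D$ is the minimum average rate (bits/sample) for coding a signal at a specified distortion $D$ [7]. For a fixed average distortion $D$, the rate distortion function $R_D$ is computed as
$$D = \frac{1}{N}\sum_{k=0}^{N-1}\min\left(\theta,\ \tilde{\sigma}_{kk}^2\right), \qquad R_D = \frac{1}{N}\sum_{k=0}^{N-1}\max\left(0,\ \frac{1}{2}\log_2\frac{\tilde{\sigma}_{kk}^2}{\theta}\right)$$
Values of $\theta$ between 0.1 and 1 are chosen, and $D$ and $R_D$ are calculated for each value of $\theta$ [7].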
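The rate versus distortion computation can be sketched as below, assuming the standard reverse water-filling form for a Gaussian source, in which each coefficient contributes distortion $\min(\theta, \tilde{\sigma}_{kk}^2)$ and rate $\max(0, \tfrac{1}{2}\log_2(\tilde{\sigma}_{kk}^2/\theta))$. The function name and the variance list are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def rate_distortion(variances, theta):
    """Return (rate, distortion) per sample for positive variances and theta > 0."""
    n = len(variances)
    distortion = sum(min(theta, v) for v in variances) / n
    rate = sum(max(0.0, 0.5 * math.log2(v / theta)) for v in variances) / n
    return rate, distortion


# Hypothetical transform-domain variances, for illustration only.
example_variances = [4.0, 1.0, 0.25]

for theta in (0.25, 0.5, 1.0):
    rate, dist = rate_distortion(example_variances, theta)
    print(f"theta={theta:.2f}  D={dist:.4f}  R_D={rate:.4f} bits/sample")
```

Sweeping $\theta$ traces out the curve: a larger $\theta$ raises the distortion and lowers the rate, and coefficients whose variance falls below $\theta$ receive no bits.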

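Properties used for comparison of integer DCTs
• Normalized basis restriction error, $J_m$: The compaction of energy into a few transform coefficients can be represented by the normalized basis restriction error, defined as [7]:
$$J_m = \frac{\sum_{k=m}^{N-1}\tilde{\sigma}_{kk}^2}{\sum_{k=0}^{N-1}\tilde{\sigma}_{kk}^2}, \qquad m = 0, 1, \ldots, N-1$$
where the transform-domain variances $\tilde{\sigma}_{kk}^2$ are arranged in decreasing order [7].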
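As a small illustration, $J_m$ can be computed by sorting the variances in decreasing order and taking the fraction of total energy left in the discarded tail. This sketch assumes $m$ counts the retained coefficients; the function name and the sample variances are hypothetical.

```python
def basis_restriction_error(variances):
    """Return [J_0, ..., J_{N-1}], where J_m is the energy fraction outside the m largest variances."""
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    return [sum(ordered[m:]) / total for m in range(len(ordered))]


# Hypothetical variances, for illustration only.
errors = basis_restriction_error([1.0, 4.0, 2.0, 1.0])
for m, j_m in enumerate(errors):
    print(f"m={m}  J_m={j_m:.3f}")
```

A transform with better energy compaction drives $J_m$ towards zero for smaller $m$.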

Properties used for comparison of integer DCTs
• Residual correlation: The extent of decorrelation in the transform domain can be gauged by the correlation left undone by the discrete transform. It is measured by the absolute sum of the cross-covariances (off-diagonal elements) in the transform domain, i.e.,
$$\sum_{i=0}^{N-1}\ \sum_{\substack{j=0 \\ j \neq i}}^{N-1}\left|\tilde{\Sigma}_{ij}\right|$$
for $N = 8$, as a function of $\rho$ [7].
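The residual correlation can be sketched as a sweep over $\rho$. The H.264 kernel below is an assumption (the commonly cited 8 × 8 matrix, not recovered from the slides), its rows are scaled to unit norm before use, and the function names are illustrative.

```python
import math

# Assumption: commonly cited H.264 8x8 integer transform kernel.
H264 = [
    [8, 8, 8, 8, 8, 8, 8, 8],
    [12, 10, 6, 3, -3, -6, -10, -12],
    [8, 4, -4, -8, -8, -4, 4, 8],
    [10, -3, -12, -6, 6, 12, 3, -10],
    [8, -8, -8, 8, 8, -8, -8, 8],
    [6, -12, 3, 10, -10, -3, 12, -6],
    [4, -8, 8, -4, -4, 8, -8, 4],
    [3, -6, 10, -12, 12, -10, 6, -3],
]


def normalize_rows(mat):
    """Scale every row to unit Euclidean norm."""
    return [[x / math.sqrt(sum(y * y for y in row)) for x in row] for row in mat]


def markov_covariance(rho, n):
    """Covariance of a Markov-I process: R[j][k] = rho ** |j - k|."""
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]


def transform_covariance(t, sigma):
    """Return T * Sigma * T^T."""
    n = len(t)
    ts = [[sum(t[i][k] * sigma[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(ts[i][k] * t[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def residual_correlation(t, rho):
    """Sum of absolute off-diagonal entries of the transform-domain covariance."""
    n = len(t)
    cov = transform_covariance(t, markov_covariance(rho, n))
    return sum(abs(cov[i][j]) for i in range(n) for j in range(n) if i != j)


T = normalize_rows(H264)
for rho in (0.1, 0.3, 0.5, 0.7, 0.9, 0.95):
    print(f"rho={rho:.2f}  residual correlation={residual_correlation(T, rho):.4f}")
```

At $\rho = 0$ the covariance is the identity, so an orthonormal transform leaves no residual correlation; any nonzero value at other $\rho$ reflects how far the transform is from the KLT of the process.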

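Properties used for comparison of integer DCTs
• Transform coding gain $G_{TC}$: The transform coding gain is defined as the ratio of the arithmetic mean to the geometric mean of the variances,
$$G_{TC} = \frac{\frac{1}{N}\sum_{i=0}^{N-1}\tilde{\sigma}_{ii}^2}{\left(\prod_{i=0}^{N-1}\tilde{\sigma}_{ii}^2\right)^{1/N}}$$
where $\tilde{\sigma}_{ii}^2$ is the variance of the $i$th coefficient in the transform domain.
• Because the sum of all the variances is invariant under an orthogonal transformation, $G_{TC}$ is maximized by minimizing the geometric mean [7].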
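A short sketch of the coding gain computation follows. The geometric mean is taken through logarithms to avoid overflow or underflow in the product. The function name and the sample variances are hypothetical, and the sketch makes no attempt to reproduce the values reported in these slides.

```python
import math


def transform_coding_gain(variances):
    """Ratio of the arithmetic mean to the geometric mean of positive variances."""
    n = len(variances)
    arithmetic_mean = sum(variances) / n
    geometric_mean = math.exp(sum(math.log(v) for v in variances) / n)
    return arithmetic_mean / geometric_mean


# Hypothetical variances, for illustration only.
print(transform_coding_gain([4.0, 1.0]))       # unequal variances: gain above 1
print(transform_coding_gain([1.0, 1.0, 1.0]))  # equal variances: no gain
```

The gain is 1 when all variances are equal and grows as the transform concentrates the energy in fewer coefficients.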

Results and Conclusion

Variance distribution versus N

Rate versus distortion

Normalized basis restriction error versus samples retained m

Residual correlation versus correlation co-efficient

Conclusion
• The variance distribution, normalized basis restriction error and transform coding gain of the integer DCTs of the three codecs are closely comparable.
• The transform coding gains $G_{TC}$ for AVS, H.264 and VC-1 are 8.2916, 8.0155 and 7.5477 respectively, so AVS achieves the highest $G_{TC}$.
• For a fixed average distortion $D$, the rate distortion characteristics of H.264 and AVS are indistinguishable.
• For $\rho > 0.5$, the residual correlation of the three codecs is indistinguishable.

References
[1] C. Fong and W. Cham, “Simple order-16 integer transform for video coding”, The Chinese University of Hong Kong, Shatin, Hong Kong.
[2] S. K. Kwon, A. Tamhankar and K. R. Rao, “Overview of H.264 / MPEG-4 Part 10”, J. Visual Communication and Image Representation, vol. 17, pp. 186-216, April 2006.
[3] S. Srinivasan et al., “Windows Media Video 9: overview and applications”, Signal Processing: Image Communication, vol. 19, issue 9, pp. 851-875, Oct. 2004.
[4] W. Gao et al., “AVS – The Chinese next-generation video coding standard”, National Association of Broadcasters, Las Vegas, 2004.
[5] R. Joshi, Y. Reznik and M. Karczewicz, “Efficient large size transforms for high-performance video coding”, Qualcomm Inc., San Diego, CA, USA.
[6] “Integer DCT for AVS China”, INTDCT6, http://www-ee.uta.edu/dip/Courses/EE5355/ee5355.htm.

References (Continued)
[7] “Comparison of discrete transforms”, http://www-ee.uta.edu/dip/Courses/EE5355/ee5355.htm.
[8] N. Ahmed, T. Natarajan and K. R. Rao, “Discrete cosine transform”, IEEE Trans. Computers, vol. C-23, pp. 90-93, 1974.
[9] A. K. Jain, “Fundamentals of digital image processing”, Prentice Hall, 1989.
[10] A. T. Hinds, “Design of high-performance fixed-point transforms using the common factor method”, Ricoh | InfoPrint Solutions Company, Boulder, CO, USA.
[11] T. Wiegand et al., “Overview of the H.264/AVC video coding standard”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, pp. 560-576, July 2003.
[12] T. Wiegand and G. J. Sullivan, “The H.264 video coding standard”, IEEE Signal Processing Magazine, vol. 24, pp. 148-153, March 2007.

References (Continued)
[13] D. Marpe, T. Wiegand and G. J. Sullivan, “The H.264/MPEG-4 AVC standard and its applications”, IEEE Communications Magazine, vol. 44, pp. 134-143, Aug. 2006.
[14] A. Puri, X. Chen and A. Luthra, “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.
[15] M. Fiedler, “Implementation of basic H.264/AVC decoder”, seminar paper at Chemnitz University of Technology, June 2004.
[16] R. Schäfer, T. Wiegand and H. Schwarz, “The emerging H.264/AVC standard”, EBU Technical Review, Jan. 2003.
[17] D. Marpe, T. Wiegand and S. Gordon, “H.264/MPEG4-AVC fidelity range extensions: tools, profiles, performance, and application areas”, IEEE International Conference on Image Processing, vol. 1, pp. I-593-6, 2005.
[18] S. Saponara et al., “The JVT advanced video coding standard: complexity and performance analysis on a tool-by-tool basis”, Packet Video Workshop, Nantes, France, April 2003.
[19] VC-1 technical overview, http://www.microsoft.com/windows/windowsmedia/howto/articles/vc1techoverview.aspx

References (Continued)
[20] S. Srinivasan and S. L. Regunathan, “An overview of VC-1”, SPIE / VCIP, vol. 5960, pp. 720-728, July 2005.
[21] AVS Video Expert Group, “Information technology – Advanced coding of audio and video – Part 2: Video (AVS1-P2 JQP FCD 1.0)”, Audio Video Coding Standard Group of China (AVS), Doc. AVS-N1538, Sept. 2008.
[22] AVS Video Expert Group, “Information technology – Advanced coding of audio and video – Part 3: Audio”, Audio Video Coding Standard Group of China (AVS), Doc. AVS-N1551, Sept. 2008.
[23] L. Yu et al., “Overview of AVS-Video: Tools, performance and complexity”, SPIE VCIP, vol. 5960, pp. 596021-1 to 596021-12, Beijing, China, July 2005.
[24] L. Fan, S. Ma and F. Wu, “Overview of AVS video standard”, IEEE Int’l Conf. on Multimedia and Expo, ICME '04, vol. 1, pp. 423-426, Taipei, Taiwan, June 2004.
[25] Special issue on “AVS and its Applications”, Signal Processing: Image Communication, vol. 24, pp. 245-344, April 2009.
[26] C. K. Fong and W. K. Cham, “Simple order-16 integer transform for video coding”, http://www-ee.uta.edu/Dip/Courses/EE5355/INTDCT5.pdf
