FAST MULTI-REFERENCE MOTION ESTIMATION VIA STATISTICAL LEARNING FOR H.264/AVC

Chen-Kuo Chiang and Shang-Hong Lai • Department of Computer Science, National Tsing Hua University, Taiwan • Presenter: Yeh, Ta-Li

Introduction_1 • H.264/AVC • Video coding standard of Joint Video Team (JVT) • Efficiency and quality  • Variable block size motion compensation • Multiple reference frames • Directional spatial intra prediction • In-loop deblocking filtering • Computational complexity 

Introduction_2 • Multiple reference frames • High video coding quality • Not every reference frame is useful • Two approaches to the problem • Rule-based approach • Criteria to eliminate unnecessary reference frames • Is it necessary to search more frames? • Inter-SATD, intra-SATD and motion vector compactness are examined (Y.-W. Huang, 2003) • Check temporal and spatial content information at the macroblock (MB) level • Speed up the search process (Q. Sun, 2007; T.-Y. Kuo, 2008) • Semi-statistical learning approach • Appropriate number of reference frames (statistical) (P. Wu, 2003)

Purpose • Statistical learning approach • Decide the best number of reference frames • Choose representative features • Train an SVM (support vector machine) for classification • Off-line pre-classification approach • Complete machine learning approach

Analysis of multi-reference motion estimation (MRME) • MRME gives higher coding efficiency, but not for all sequences • Searching more reference frames is helpful when: • A smaller block partition is chosen (variable-block-size motion estimation) • Occlusion or uncovering occurs • The macroblock lies across object boundaries • The macroblock contains complicated texture (Y.-W. Huang, 2003; Y. P. Su, 2006)

Feature selection_1 • Representative features for each macroblock: • Block partition • Best inter-SAD • A lower value indicates a higher probability of using only one reference frame • Motion vector difference (MVD) magnitude (MVM) • MVD: • Smoothness • Small MVD = similar motion and unlikely to cross object boundaries • MVM: • Small MVM = unlikely to cross object boundaries
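The motion-vector features above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that the MVD is the difference between a macroblock's motion vector and the component-wise median of three neighbouring vectors (the usual H.264 predictor), that MVM is the magnitude of the motion vector itself, and that magnitudes are L1 norms. The slides fix none of these details, and all function names are hypothetical.

```python
def median_predictor(mv_a, mv_b, mv_c):
    """Component-wise median of three neighbouring motion vectors."""
    return tuple(sorted(components)[1] for components in zip(mv_a, mv_b, mv_c))


def l1_magnitude(vector):
    """|x| + |y|; the choice of the L1 norm is an assumption."""
    return abs(vector[0]) + abs(vector[1])


def mvd_and_mvm(mv, neighbours):
    """Return (MVD magnitude, MV magnitude) for one macroblock.

    mv         -- (x, y) motion vector found on the first reference frame
    neighbours -- three (x, y) vectors of already coded neighbouring blocks
    """
    predicted = median_predictor(*neighbours)
    mvd = (mv[0] - predicted[0], mv[1] - predicted[1])
    return l1_magnitude(mvd), l1_magnitude(mv)


if __name__ == "__main__":
    # A block moving almost like its neighbours gives a small MVD magnitude.
    print(mvd_and_mvm((4, -2), [(3, -2), (5, 0), (4, -1)]))
```

A small first value suggests motion consistent with the neighbourhood, which the slide associates with a macroblock that is unlikely to cross an object boundary.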

Feature selection_2 • Best intra-SAD and gradient magnitude • Intra-SAD: • Minimum SAD value after intra prediction of a macroblock • Large SAD = complicated texture • Gradient magnitude: • Summation of the gradient magnitudes of all pixels inside the macroblock • Reflects whether the texture is strong
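The two texture-related quantities above are simple to compute. The sketch below shows SAD between a block and its prediction, and a summed gradient magnitude over a block. The gradient operator is an assumption (forward differences with |gx| + |gy|, zero at the right and bottom borders), since the slides do not say which operator is used. Blocks are plain lists of rows and the function names are hypothetical.

```python
def sad(block, prediction):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(p - q)
        for row, pred_row in zip(block, prediction)
        for p, q in zip(row, pred_row)
    )


def gradient_magnitude_sum(block):
    """Sum of |gx| + |gy| over all pixels, using forward differences.

    Differences that would reach outside the block are taken as 0
    (an assumption made only to keep the sketch self-contained).
    """
    height, width = len(block), len(block[0])
    total = 0
    for y in range(height):
        for x in range(width):
            gx = block[y][x + 1] - block[y][x] if x + 1 < width else 0
            gy = block[y + 1][x] - block[y][x] if y + 1 < height else 0
            total += abs(gx) + abs(gy)
    return total


if __name__ == "__main__":
    flat = [[10] * 4 for _ in range(4)]              # no texture
    ramp = [[x for x in range(4)] for _ in range(4)]  # horizontal ramp
    print(gradient_magnitude_sum(flat), gradient_magnitude_sum(ramp))
    print(sad(flat, ramp))
```

The best intra-SAD would be the minimum of `sad` over the candidate intra predictions of the macroblock, and the best inter-SAD the minimum over the motion search on the first reference frame.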

Fig. 1. The probability of reference frames 1~5 with respect to (a) best inter-SAD and (b) gradient magnitude. Panels (a) and (b) show the probability of reference frame 1 dropping once the feature value rises to a certain level.

Fig. 1. The probability of reference frames 1~5 with respect to (c) MVD and (d) MVM. Panels (c) and (d) show the probability of reference frame 1 decreasing and the probability of reference frames 4 and 5 increasing as the feature value increases.

Fig. 1. The probability of reference frames 1~5 with respect to (e) best intra-SAD and (f) block partition. In panels (e) and (f), the probability of reference frame 1 is rather high in all conditions, so the best intra-SAD and block partition features may NOT be so effective.

Fast multi-reference motion estimation vs. statistical learning • Problem of multi-reference motion estimation (ME) • Classification problem • Solution: • ME on the first reference frame • Predict the number of necessary frames based on the 6 features

Formulation of reference frame selection • In ME • The number of reference frames is set to 5 • The standard allows up to 16 • Five classes are defined for each MB (use 1-5 ref. frames) • Reference frames 2, 3, 4 and 5 have probability distributions similar to one another and different from that of frame 1 (Fig. 1) • Two binary classifiers
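One way to read "two binary classifiers" is as a two-stage cascade, sketched below. The slide does not say what each classifier decides, so this arrangement is an assumption: the first stage separates "one frame is enough" from "more frames needed", and the second separates an intermediate frame count from the full five. The intermediate count of 3, the feature keys and the threshold classifiers in the usage example are all hypothetical stand-ins for trained SVMs.

```python
def predict_num_ref_frames(features, needs_more_than_one, needs_all,
                           mid_frames=3, max_frames=5):
    """Map one macroblock's features to a reference-frame count.

    needs_more_than_one, needs_all -- callables returning True or False;
    in the slides these would be trained SVMs, here they are placeholders.
    mid_frames is a hypothetical intermediate count, not from the slides.
    """
    if not needs_more_than_one(features):
        return 1
    if not needs_all(features):
        return mid_frames
    return max_frames


if __name__ == "__main__":
    # Stand-in classifiers with made-up thresholds, for illustration only.
    stage1 = lambda f: f["inter_sad"] > 1000
    stage2 = lambda f: f["mvd"] > 8
    print(predict_num_ref_frames({"inter_sad": 400, "mvd": 1}, stage1, stage2))
    print(predict_num_ref_frames({"inter_sad": 2500, "mvd": 12}, stage1, stage2))
```

Because the second stage runs only when the first says more than one frame is needed, most macroblocks pay for a single classification.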

Training and pre-classification • Support vector machine (SVM) • Suited to problems with limited training samples • Training data • Obtained by applying the H.264 reference software JM 11.0 • To 3 video sequences • News, Container and Coastguard • Pre-classification • Run-time classification takes too much time • Generate all possible combinations of feature values • Classify each combination off-line and store the result • At run time, search the look-up table for the corresponding result
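The pre-classification idea above can be sketched as a table built once off-line and indexed at run time. This is a minimal illustration under several assumptions: features are non-negative and quantized into uniform bins, and the bin counts, step sizes and stand-in classifier are hypothetical. The slides do not describe how feature values are discretized, and in practice the classifier would be the trained SVM.

```python
from itertools import product


def quantize(value, step, levels):
    """Map a non-negative feature value to a bin index in [0, levels - 1]."""
    return min(int(value // step), levels - 1)


def build_lookup_table(classifier, levels_per_feature):
    """Off-line step: run the classifier on every combination of bins."""
    all_bins = product(*(range(n) for n in levels_per_feature))
    return {bins: classifier(bins) for bins in all_bins}


def classify_with_table(table, raw_features, steps, levels_per_feature):
    """Run-time step: quantize the features and read the stored result."""
    bins = tuple(
        quantize(value, step, levels)
        for value, step, levels in zip(raw_features, steps, levels_per_feature)
    )
    return table[bins]


if __name__ == "__main__":
    levels = (4, 4, 2)      # hypothetical bin counts for three features
    steps = (100, 50, 1)    # hypothetical bin widths
    # Stand-in for a trained classifier: returns a reference-frame count.
    stand_in = lambda bins: 1 if sum(bins) < 3 else 5
    table = build_lookup_table(stand_in, levels)
    print(len(table))
    print(classify_with_table(table, (150, 20, 0.5), steps, levels))
```

The table size is the product of the bin counts, so the number of bins per feature trades memory against how closely the table follows the classifier's decision boundary.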

Flow chart of the proposed algorithm Fig. 2. Flow chart of the proposed reference frame prediction algorithm for motion estimation.

Experimental results_1

Experimental results_2

Conclusion • Presented a multi-reference ME algorithm based on statistical learning • To decide the best number of reference frames • The feature analysis shows which features provide good discrimination • Execution time is 3 times faster than the existing fast ME method • Future work: • Investigate more reliable features to improve the classification rate • Videos with a variety of motion patterns (fast, medium and slow) can be included in the training data

References
[1] Y.-W. Huang, et al., “Analysis and reduction of reference frames for motion estimation in MPEG-4 AVC/JVT/H.264,” in Proc. IEEE ICASSP, Apr. 2003.
[2] Q. Sun, X. H. Chen, X. Wu, and L. Yu, “A content-adaptive fast multiple reference frames motion estimation in H.264,” in Proc. IEEE ISCAS, pp. 3651-3654, May 2007.
[3] T.-Y. Kuo and H.-J. Lu, “Efficient reference frame selector for H.264,” IEEE Trans. on Circuits & Systems for Video Technology, vol. 18, no. 3, Mar. 2008.
[4] P. Wu and C.-B. Xiao, “An adaptive fast multiple reference frames selection algorithm for H.264/AVC,” in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Apr. 2008.
[5] Y. P. Su and M.-T. Sun, “Fast multiple reference frame motion estimation for H.264/AVC,” IEEE Trans. on Circuits & Systems for Video Technology, vol. 16, pp. 447–452, Mar. 2006.
[6] C. Cortes and V. Vapnik, “Support-Vector Networks,” Machine Learning, pp. 273-297, 1995.
[7] T. M. Cover, Information Theory, Wiley-Interscience, 1991.

Thanks for your attention!!
