Complexity reduction in VP6 to H.264 transcoder using motion vector (MV) reuse

Jay R Padia, Electrical Engineering Graduate Student, The University of Texas at Arlington • Supervising Professor: Dr. K. R. Rao

Contents • Introduction • VP6 • H.264 • VP6 & H.264 comparison • Cascaded architecture • Proposed technique • Conclusions • Future work

Transcoding • Definition: conversion of video from one format to another • Bitrate conversion • Spatial resolution change • Temporal conversion • Format change

Why Transcoding? • Multimedia applications on different devices and platforms • Different bitrates, frame rates, spatial resolutions & complexity • Different video standards; communication & interoperability

On2 TrueMotion VP6 • Developed by On2 Technologies • Licensed by Adobe for Flash video in 2005 • Fundamentals: • YUV 4:2:0 input (?) • MB (16x16) based coding • 8x8 DCT (adaptive integer DCT) • Uniform quantization • ¼ pixel MV resolution • MV search range: max 16 pixels • Reference frames: previous frame and golden frame • No bidirectional prediction • Entropy coding: Huffman and arithmetic coding (BoolCoder)

Flash video & adoption of H.264 • VP6 is significant because of the reach of Flash Player • Flash Player is installed on more than 90% of computers • Major websites: YouTube, Facebook, Google Video, Yahoo! Video, Metacafe, Reuters.com, etc. • A lot of streaming video content on the internet is in VP6 • Adobe adopted H.264 for Flash video in 2007 • Described as one of the biggest things to happen to web video

VP6: Block diagram (figure: encoder and decoder)

VP6: Golden frames • Special frame buffer • Holds last I-frame by default • Any part of the frame can be updated later • Figure legend: I = intra frame, P = predicted frame

VP6: Golden frame • Static backgrounds: update the golden frame with the non-moving background blocks, so the background can be reproduced from the golden frame reference • A frame that references only the golden frame helps recovery in case of data loss

VP6: Prediction loop filter • No H.264-like loop filter in the reconstruction buffer • Filters pixels adjacent to 8x8 block boundaries when the prediction block straddles an 8x8 block boundary • 2 filter options: • Deblocking filter: (1, -3, 3, 1) • Deringing filter: deringing and deblocking characteristics

VP6: DCT • Modified non-standard fixed-point integer DCT • DCT complexity adjusted as a function of target quantization • Faster performance for coarser quantization • To simplify the inverse DCT, the zero coefficients can be grouped together • This is made possible by scan ordering at the encoder and reordering at the decoder

VP6: Scan ordering • Process of providing a customized scanning order • 8x8 block: 64 coefficients, numbered 0 to 63 • New ordering specified by a 64-element array • Default scan order: zig-zag scan order (figure) • Custom scan order

VP6: Custom scan ordering • 8x8 coefficient block (figure) • Zig-zag scan order: (0, 1, 2, 3, 4, 5, 6) • Custom scan order: (0, 1, 2, 3, 4, 6, 5)
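
The scan-ordering idea on the two slides above can be illustrated with a small Python sketch. It is not taken from the VP6 bitstream specification: the function names are hypothetical, and it assumes the custom order is simply a 64-element permutation of raster indices, here built by swapping scan positions 5 and 6 of the default zig-zag order as in the slide's example.

```python
def zigzag_order(n=8):
    """Return the raster indices of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):            # s = row + col: one anti-diagonal per step
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:                    # even diagonals are walked bottom-left to top-right
            rows = reversed(rows)
        for r in rows:
            order.append(r * n + (s - r))
    return order


def scan(block, order):
    """Encoder side: read the 64 raster-ordered coefficients in scan order."""
    return [block[i] for i in order]


def unscan(coeffs, order):
    """Decoder side: put scanned coefficients back into raster positions."""
    block = [0] * len(order)
    for k, i in enumerate(order):
        block[i] = coeffs[k]
    return block


# Hypothetical custom order: swap scan positions 5 and 6 of the zig-zag order.
custom_order = zigzag_order()
custom_order[5], custom_order[6] = custom_order[6], custom_order[5]
```

Because the decoder applies the same 64-element array in reverse, any permutation round-trips exactly; the encoder is free to pick an order that pushes likely-zero coefficients to the end of the scan.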

MB modes in VP6

Nearest & Near blocks • Nearest and Near blocks: the first 2 non-(0,0) MVs encountered in the order shown in the figure • The first is Nearest • The second is Near • Undefined if no such non-(0,0) MVs can be found in the first 12 blocks shown • Intra: fixed DC prediction • CODE_INTER_FOURMV: all 4 luma blocks have different MVs
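
A minimal Python sketch of the Nearest/Near selection rule described above. The neighbour scan order comes from a figure that is not reproduced here, so the function simply takes the candidate MVs as a list that is assumed to be already in that order. The function name and the `require_distinct` flag are hypothetical; the slide does not say whether Near must differ from Nearest, so the default follows the slide's wording literally.

```python
def nearest_and_near(candidates, max_blocks=12, require_distinct=False):
    """Pick the Nearest and Near MVs from neighbouring blocks.

    candidates: list of (x, y) MVs, or None for blocks without an MV,
                assumed to be listed in the scan order of the slide's figure.
    Returns (nearest, near); either may be None, meaning "undefined".
    """
    nearest = near = None
    for mv in candidates[:max_blocks]:    # only the first 12 blocks are examined
        if mv is None or mv == (0, 0):    # skip missing and zero vectors
            continue
        if nearest is None:
            nearest = mv                  # first non-(0,0) MV found
        elif not (require_distinct and mv == nearest):
            near = mv                     # second non-(0,0) MV found
            break
    return nearest, near
```

For example, with candidates `[(0, 0), (2, -1), None, (4, 0)]` the sketch returns `(2, -1)` as Nearest and `(4, 0)` as Near.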

H.264: Overview • Open, licensed standard; latest block-oriented, motion-compensation-based codec • Good video quality at substantially lower bit rates than previous standards • Better rate-distortion performance and compression efficiency • Wide variety of applications such as video broadcasting, video streaming, video conferencing, D-Cinema, HDTV • Adopted by Adobe for Flash video in 2007

H.264: Fundamentals • Uses hybrid block-based video compression techniques • Includes the following features: • Intra-picture prediction • 4x4 and 8x8 integer transform • Multiple reference pictures • Variable block sizes for ME / MC • Quarter-pel precision for motion compensation • In-loop deblocking filter • Improved entropy coding

H.264: Encoder block diagram (figure)

VP6 & H.264 comparison

Complexity reduction in VP6 • No B-frames; display order same as coding order; no re-ordering delay (?) • Single reference frame, hence no weighted prediction • No bidirectional prediction • No weighted prediction reduces the ME complexity by ½ in VP6 • H.264: 9 intra-prediction modes reduce spatial redundancy; VP6: only low-cost DC prediction for intra-prediction • The H.264 intra-prediction process at the encoder can be 2 to 16 times more complex, depending on the prediction modes

Complexity reduction in VP6 • H.264: 5-tap in-loop deblocking filter for all 4x4 blocks; VP6: 4-tap filter on ME blocks that straddle 8x8 boundaries • Applying the deblocking filter to all blocks in H.264 is 4 times more complex than applying the filter to all 8x8 blocks in VP6 • H.264 interpolation filter for quarter-pixel prediction: 6 tap; VP6 interpolation filter: 2 tap / 4 tap • Fewer filter taps cut VP6 interpolation filtering complexity to about ½ that of H.264 • BoolCoder: context probabilities adjusted at frame level; CABAC: context probabilities adjusted for each symbol • Entropy coding in H.264 is 1.25 to 1.5 times more complex than in VP6

Motion estimation comparison

Motion estimation complexity • Number of reference frames: large in H.264; only previous frame or golden frame reference in VP6 • Search time very high in H.264 due to multiple reference frames • Smaller search range in VP6 for the matching block • Interpolation filter for sub-pixel ME simpler in VP6 • Fewer block sizes in VP6 than in H.264; larger block sizes reduce search time • Motion estimation takes up to 70% of the encoder complexity, so these differences give a significant complexity reduction in VP6

Performance comparison • H.264 decoding process is also comparatively complex (Mac Pro, 4 cores) • High-resolution video playback is smooth with the VP6-S codec • On lower-end machines where VP6 plays smoothly, H.264 playback stalls

Output quality comparison • Akiyo sequence (QCIF): PSNR in dB (figure)

Akiyo sequence (QCIF): PSNR in dB (figure)

Stefan sequence (CIF): PSNR in dB (figure)

Transcoding • Step 1: Cascaded decoder and encoder architecture • Simplest implementation • Used as the basis of comparison • Comparison parameters: re-encoding time & output quality for a given bitrate • Step 2: Reuse of motion information from VP6 • Aim: output quality comparable to that of the cascaded architecture, with a significant reduction in re-encoding time

Cascaded architecture • Simplest architecture • Decode a frame completely and re-encode it • Includes complete motion estimation again; very high complexity • Devoid of drift errors • The only errors come from lossy encoding of an already reconstructed frame that carries errors from the previous encoding

Akiyo sequence (QCIF): PSNR in dB (figure)

Motion estimation reuse • Most of the encoding complexity comes from motion estimation • VP6 motion estimation information can be reused • Avoids complete motion estimation in the H.264 re-encoding process • VP6 motion vectors have ¼ pixel resolution, like H.264 • Smaller search range: 16 pels compared to 32 pels in H.264 • Fewer block sizes compared to H.264 • So VP6 motion information is a subset of H.264's, EXCEPT for golden frame motion vectors

Proposed MB mode reuse (VP6 input mode → H.264 mode) • Intra → Intra • Inter (previous frame) → MV reused (16x16 block) • Inter (8x8 MV) → MV reused (8x8 blocks) • Golden frame → MV recalculated (block sizes: 16x16 or 8x8) • Golden frame prediction is used only 11% of the time
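
The mode mapping above can be sketched in Python as a lookup from the decoded VP6 macroblock to the decision handed to the H.264 encoder. This is an illustrative sketch, not the code used in the project: the class names, field names and mode strings are hypothetical, and the MVs are passed through unscaled on the assumption (stated on the previous slide) that both codecs use ¼ pixel MV resolution.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MV = Tuple[int, int]  # (x, y) in quarter-pel units


@dataclass
class Vp6Macroblock:
    mode: str        # hypothetical labels: "intra", "inter", "inter_4mv", "golden"
    mvs: List[MV]    # one MV for "inter", four for "inter_4mv", unused otherwise


@dataclass
class H264Decision:
    mb_type: str                  # "intra" or "inter"
    partition: Optional[str]      # "16x16", "8x8", or None if still to be decided
    mvs: Optional[List[MV]]       # reused MVs, or None if they must be searched
    needs_motion_search: bool     # True only when the MV is recalculated


def map_mode(mb: Vp6Macroblock) -> H264Decision:
    """Map a decoded VP6 macroblock to an H.264 coding decision."""
    if mb.mode == "intra":
        return H264Decision("intra", None, None, False)
    if mb.mode == "inter":        # previous-frame reference, one MV per macroblock
        return H264Decision("inter", "16x16", [mb.mvs[0]], False)
    if mb.mode == "inter_4mv":    # one MV per 8x8 luma block
        return H264Decision("inter", "8x8", list(mb.mvs), False)
    if mb.mode == "golden":       # golden-frame MVs cannot be reused directly
        return H264Decision("inter", None, None, True)
    raise ValueError(f"unknown VP6 mode: {mb.mode}")
```

Only the golden-frame branch triggers a motion search, which is where the time saving relative to the cascaded architecture would come from.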

Proposed technique (figure)

Foreman sequence (QCIF): PSNR in dB (figure)

Foreman sequence (QCIF): SSIM (figure)

Stefan sequence (CIF): PSNR in dB (figure)

Stefan sequence (CIF): SSIM (figure)

Conclusions • Comparison: • H.264 has better quality at a given bitrate • H.264 complexity is higher • Transcoding: • The motion vectors and MB mode information available from the encoded VP6 bitstream can be used in encoding the MB information of the H.264 transcoded bitstream • The proposed technique of reusing motion vectors results in a minute loss of quality with a significant reduction in the time complexity of the encoding process

Future work • The proposed technique does not consider motion vector refinement • Motion vector refinement on the re-encoding side can improve the accuracy • Research in [12] and [15] gives an overview of different techniques that can be used to refine approximate motion vector values

Motion vector refinement • Refinement of this MV in a small search window gives better results
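
As a rough illustration of refinement in a small search window, the Python sketch below re-evaluates a reused MV against its immediate neighbours and keeps the candidate with the lowest sum of absolute differences (SAD). The slide does not specify the window size, the cost measure or the pel precision, so everything here is an assumption: integer-pel vectors, SAD as the cost, a ±1 window by default, and hypothetical function names. Frames are plain lists of rows of luma samples.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a displaced block in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def refine_mv(cur, ref, bx, by, start_mv, size=8, window=1):
    """Search a (2*window+1)^2 neighbourhood around start_mv (integer-pel units).

    Returns (best_mv, best_cost); best_mv is None if no candidate lies inside ref.
    """
    height, width = len(ref), len(ref[0])
    sx, sy = start_mv
    best_mv, best_cost = None, None
    for dy in range(sy - window, sy + window + 1):
        for dx in range(sx - window, sx + window + 1):
            # Skip candidates whose reference block would fall outside the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

With `window=1` this evaluates at most 9 candidates per block, compared with the full search range used by a complete motion estimation.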

Software used • On2 VP6 Software Development Kit (SDK), available from On2 Technologies (free license for educational and research purposes) • JM reference software for H.264 • JM is open-source H.264 reference software • The version used for the project is JM version 17.0

References • ITU-T Recommendation H.264, "Advanced Video Coding for Generic Audio-Visual Services." • A. Tamhankar and K. R. Rao, “An overview of H.264 / MPEG-4 part 10,” Proc. 4th EURASIP Conference focused on Video / Image Processing and Multimedia Communications, Zagreb, Croatia, pp. 1-51, July 2003. • “Adobe Extends Web Video Leadership with H.264 Support,” Adobe press release, August 21, 2007. • [27] “VP6 bitstream and decoder specification,” On2 Technologies Inc., Aug. 2006. • [28] M. Vetterli and A. Ligtenberg, “A Discrete Fourier-Cosine Transform Chip,” IEEE Journal on Selected Areas in Communications, vol. SAC-4, no. 1, pp. 49-61, Jan. 1986.

• [29] I. Ahmad, et al., “Video Transcoding: An Overview of Various Techniques and Research Issues,” IEEE Transactions on Multimedia, vol. 7, pp. 793-804, October 2005. • [30] J. Xin, C. Lin and M. Sun, “Digital Video Transcoding,” Proceedings of the IEEE, vol. 93, pp. 84-96, January 2005. • [34] On2 Technologies Inc., “VP6 bit-stream overview – presentation.” • [35] On2 Technologies Inc., “On2 VP6 and H.264 for Adobe Flash Player,” http://support.on2.com/files/h264_and_flash_faq.pdf, August 2007. • G. Sullivan, “Overview of international video coding standards (preceding H.264/AVC),” ITU-T VICA workshop, Geneva, July 2005. • T. Shanableh and M. Ghanbari, “Heterogeneous video transcoding to low spatial temporal resolutions and different encoding formats,” IEEE Transactions on Multimedia, vol. 2, no. 2, pp. 101-110, Jun. 2000. • J.-N. Hwang and T.-D. Wu, “Motion vector re-estimation and dynamic frame-skipping for video transcoding,” Conf. Rec. 32nd Asilomar Conf. Signals, Systems and Computers, vol. 2, pp. 1606-1610, 1998.
