MPEG-2 to H.264/AVC Transcoding Techniques

Presentation Transcript


  1. MPEG-2 to H.264/AVC Transcoding Techniques
     Jun Xin, Xilient Inc., Cupertino, CA

  2. Digital Video Transcoder
     [Diagram: coded digital video bit-stream “A” → Transcoder → coded digital video bit-stream “B”]
     • “A” and “B” may differ in many aspects:
       • coding formats: e.g. MPEG-2 to H.264/AVC
       • bit-rate, frame rate, resolution …
       • features: error resilience features
       • contents: e.g. logo insertion
     Digital Video Transcoding

  3. Applications
     • Media Storage
       • Transcode broadcast MPEG-2 video to H.264/AVC format to enable longer recording times
       • Effective for multi-channel recording
     • Home Gateway
       • Provide connection to IPTV set-top box
       • Box only supports H.264/AVC
       • Over wireless network with bandwidth limitation
     • Other potential uses:
       • Export to mobile
       • Internet streaming
       • …

  4. Goals and Challenges
     • H.264/AVC: latest video compression standard
       • Promises same quality as MPEG-2 at half the bit-rate
       • Is being widely adopted
         • HD consumer storage, e.g., HD-DVD and Blu-ray
         • Mobile devices, e.g., Apple iPod, iPhone, Sony PSP
     • Convert MPEG-2 video to H.264/AVC format
       • More efficient storage, export to mobile devices, etc.
     • Challenges
       • Yield quality similar to full re-encoding, but at much lower cost
       • Key to lower cost and high quality: how to intelligently reuse available information from the incoming bitstream
         • May be loosely considered a “two-pass coder”
         • Could achieve better quality than full re-encoding at the same complexity

  5. Outline
     • Intra-only transcoding techniques
       • Efficient compressed domain processing
     • Inter transcoding techniques
       • Motion mapping / motion reuse

  6. Intra Transcoding Techniques

  7. Intra Transcoder – Pixel Domain
     [Block diagram. Blocks: Input MPEG-2 bitstream, VLD/IQ, IDCT, HT, Q, H.264 entropy coding, inverse Q, inverse HT, pixel buffer, intra prediction (pixel domain), mode decision]
     Legend: VLD: variable length decoding; (I)Q: (inverse) quantization; IDCT: inverse discrete cosine transform; HT: H.264/AVC 4x4 transform

  8. Compressed Domain Processing?
     [Block diagram. Blocks: Input MPEG-2 bitstream, VLD/IQ, Q, H.264 entropy coding, inverse Q, coefficient buffer, intra prediction (compressed domain), mode decision]
     Legend: VLD: variable length decoding; (I)Q: (inverse) quantization; IDCT: inverse discrete cosine transform; HT: H.264/AVC 4x4 transform

  9. AVC 4x4 Transform
     • Motivation:
       • DCT requires real-number operations, which may cause inaccuracies in the inverse transform
       • Better prediction leaves less spatial correlation, so there is no strong need for real-number operations
     • H.264 uses a simple integer 4x4 transform
       • Approximation to 4x4 DCT
       • Transform and inverse transform [matrices shown on slide]
       • Note: ½ in the inverse transform represents a right shift, so it is non-linear
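
The transform matrices themselves did not survive in this transcript. As a minimal sketch, assuming the standard H.264/AVC 4x4 core transform matrix and the standard 1-D inverse butterfly (the function names are illustrative, not from the slides), the forward transform and the right-shift non-linearity can be shown as:

```python
# Standard H.264/AVC 4x4 forward core transform matrix (scaling is folded
# into quantization and is not modeled here).
H = [[1, 1, 1, 1],
     [2, 1, -1, -2],
     [1, -1, -1, 1],
     [1, -2, 2, -1]]

def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_ht(block):
    """2-D forward transform: Y = H * X * H^T, integer arithmetic only."""
    return matmul(matmul(H, block), transpose(H))

def inverse_ht_1d(d):
    """1-D inverse butterfly; the factor 1/2 is implemented as a right shift."""
    e0 = d[0] + d[2]
    e1 = d[0] - d[2]
    e2 = (d[1] >> 1) - d[3]
    e3 = d[1] + (d[3] >> 1)
    return [e0 + e3, e1 + e2, e1 - e2, e0 - e3]

# A flat block has only a DC coefficient.
flat = forward_ht([[1] * 4 for _ in range(4)])

# Non-linearity: inverting [0, 2, 0, 0] is not twice inverting [0, 1, 0, 0],
# because the shift truncates the odd input.
single = inverse_ht_1d([0, 1, 0, 0])
double = inverse_ht_1d([0, 2, 0, 0])
```

Here `single` is `[1, 0, 0, -1]` while `double` is `[2, 1, -1, -2]`, which is why the inverse transform cannot be treated as a matrix product when processing in the transform domain.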

  10. Intra Prediction in H.264/AVC
      • Motivation: intra-frames are natural images, so they exhibit strong spatial correlation
      • Pixels in intra-coded frames are predicted from previously coded ones
      • Prediction can be based on 4x4 blocks or 16x16 macroblocks (or 8x8 blocks for high profile)
      • An encoded mode specifies which neighboring pixels should be used for prediction, and how

  11. 4x4 Intra Prediction Example
      • Current block: [figure]
      • Prediction blocks: Vertical, Horizontal, Diagonal_Down_Right [figures]
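
Since the example figures are missing from the transcript, the following is a small sketch of how three of the simpler 4x4 prediction modes build a prediction block from neighboring pixels. It assumes both the top and left neighbors are available, and the function names are illustrative:

```python
def predict_vertical(top):
    """Each column repeats the reconstructed pixel directly above the block."""
    return [list(top) for _ in range(4)]

def predict_horizontal(left):
    """Each row repeats the reconstructed pixel directly to the left."""
    return [[left[i]] * 4 for i in range(4)]

def predict_dc(top, left):
    """All 16 pixels take the rounded mean of the 8 neighbors."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]

top = [10, 20, 30, 40]   # pixels above the current block
left = [1, 2, 3, 4]      # pixels to the left of the current block
pv = predict_vertical(top)
ph = predict_horizontal(left)
pdc = predict_dc(top, left)
```

The encoder subtracts the chosen prediction block from the current block and codes only the residual, together with the mode index.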

  12. Compressed Domain Processing?
      • Challenges
        • Different transforms
          • MPEG-2 uses DCT, floating point
          • H.264/AVC uses an integer transform
        • New prediction modes in H.264/AVC
          • Can prediction be performed in compressed domain?
      • Goals
        • Simpler computation and architecture

  13. Compressed Domain Processing?
      [Block diagram. Blocks: Input MPEG-2 bitstream, VLD/IQ, Q, H.264 entropy coding, inverse Q, coefficient buffer, intra prediction (compressed domain), mode decision]
      Legend: VLD: variable length decoding; (I)Q: (inverse) quantization; IDCT: inverse discrete cosine transform; HT: H.264/AVC 4x4 transform

  14. Intra Transcoder – Proposed
      [Block diagram. Blocks: Input MPEG-2 bitstream, VLD/IQ, DCT-to-HT conversion (S-Transform), Q, entropy coding, inverse Q, inverse HT, pixel buffer, intra prediction (HT-domain), mode decision (HT-domain)]
      Legend: VLD: variable length decoding; (I)Q: (inverse) quantization; IDCT: inverse discrete cosine transform; HT: H.264/AVC 4x4 transform

  15. Techniques
      • DCT-to-HT conversion
      • Compressed (HT) domain prediction
        • Very simple for some prediction modes
      • Compressed domain distortion calculation in mode decision
      • Advantages
        • Lower computational complexity
        • No quality loss

  16. DCT-to-HT Conversion

  17. DCT-to-HT Conversion: Transform Kernel Matrix
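
The kernel matrix on this slide is not captured in the transcript. One way to see why a single kernel can exist is the sketch below, which assumes an orthonormal 8x8 DCT `T8` and defines a hypothetical kernel `S = diag(H, H) * T8^T`; it checks numerically that `S * X * S^T` equals "inverse DCT, then 4x4 HT on each quadrant". The slide's actual matrix and its fast factorization may differ in scaling and form.

```python
import math

H = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix, so the inverse DCT is its transpose."""
    rows = []
    for k in range(n):
        c = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([c * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def block_diag_h():
    """8x8 block-diagonal matrix with H on both diagonal blocks."""
    hd = [[0] * 8 for _ in range(8)]
    for base in (0, 4):
        for i in range(4):
            for j in range(4):
                hd[base + i][base + j] = H[i][j]
    return hd

T8 = dct_matrix(8)
HD = block_diag_h()
S = matmul(HD, transpose(T8))  # assumed kernel: DCT coefficients -> HT coefficients

def dct_to_ht(dct_block):
    """One-step conversion of an 8x8 DCT block into four 4x4 HT blocks."""
    return matmul(matmul(S, dct_block), transpose(S))

def pixel_domain_reference(dct_block):
    """Two-step reference: inverse DCT to pixels, then HT on each 4x4 quadrant."""
    pixels = matmul(matmul(transpose(T8), dct_block), T8)
    return matmul(matmul(HD, pixels), transpose(HD))

pixels = [[(i * 8 + j * 3) % 17 for j in range(8)] for i in range(8)]
dct_block = matmul(matmul(T8, pixels), transpose(T8))
direct = dct_to_ht(dct_block)
reference = pixel_domain_reference(dct_block)
max_diff = max(abs(direct[i][j] - reference[i][j])
               for i in range(8) for j in range(8))
```

Because the conversion is a single linear map, it avoids the intermediate rounding to integer pixels that a cascaded IDCT and HT would introduce.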

  18. Fast Algorithm (1D)

  19. Complexity Analysis
      • Transform-domain DCT-to-HT (S-Transform): 704 operations
        • 352 multiplications
        • 352 additions
      • Pixel-domain mapping (IDCT* followed by HT): 992 operations
        • 256 multiplications
        • 64 shifts
        • 672 additions
      • Advantage
        • 29% saving in total operations
        • Two-stage vs. six-stage implementation
        • Better performance: no intermediate rounding
      * W.H. Chen, C.H. Smith, and S.C. Fralick, “A Fast Computational Algorithm for the Discrete Cosine Transform,” IEEE Trans. on Communications, Vol. COM-25, pp. 1004-1009, 1977
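
The 29% figure follows directly from the operation counts on the slide; a quick check (variable names are illustrative):

```python
# Operation counts as listed on the slide.
s_transform_ops = 352 + 352          # multiplications + additions
pixel_domain_ops = 256 + 64 + 672    # multiplications + shifts + additions

saving = (pixel_domain_ops - s_transform_ops) / pixel_domain_ops
print(f"{s_transform_ops} vs {pixel_domain_ops} operations, saving {saving:.1%}")
```

Note that the S-Transform uses more multiplications (352 vs. 256); the saving is in the total count, so the benefit on a given platform depends on the relative cost of multiplications and additions.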

  20. Intra Transcoder – Proposed
      [Block diagram. Blocks: Input MPEG-2 bitstream, VLD/IQ, DCT-to-HT conversion (S-Transform), Q, entropy coding, inverse Q, inverse HT, pixel buffer, intra prediction (HT-domain), mode decision (HT-domain)]
      Legend: VLD: variable length decoding; (I)Q: (inverse) quantization; IDCT: inverse discrete cosine transform; HT: H.264/AVC 4x4 transform

  21. Conventional Mode Decisions
      • Given all possible prediction modes, the encoder needs to decide which one to use
      • Low-complexity mode decision rule (RDO_Off): SATD cost [formulas shown on slide]
      • High-complexity mode decision rule with rate distortion optimization (RDO_On): RD cost [formula shown on slide]
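
The cost formulas are not captured in the transcript. As a rough sketch of the distortion part of an SATD-style cost, assuming the common definition (sum of absolute values of the 4x4 Hadamard-transformed prediction residual) and omitting any mode-bit penalty or normalization the slide's formula may include:

```python
HADAMARD = [[1, 1, 1, 1],
            [1, 1, -1, -1],
            [1, -1, -1, 1],
            [1, -1, 1, -1]]

def satd_4x4(current, prediction):
    """Sum of absolute Hadamard-transformed differences for one 4x4 block."""
    diff = [[current[i][j] - prediction[i][j] for j in range(4)]
            for i in range(4)]
    # tmp = HADAMARD * diff
    tmp = [[sum(HADAMARD[i][k] * diff[k][j] for k in range(4))
            for j in range(4)] for i in range(4)]
    # coeffs = tmp * HADAMARD^T
    coeffs = [[sum(tmp[i][k] * HADAMARD[j][k] for k in range(4))
               for j in range(4)] for i in range(4)]
    return sum(abs(c) for row in coeffs for c in row)

current = [[5] * 4 for _ in range(4)]
prediction = [[4] * 4 for _ in range(4)]
cost = satd_4x4(current, prediction)
```

Unlike the RD cost, this needs no quantization, entropy coding, or reconstruction, which is why it serves as the low-complexity rule.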

  22. Conventional RD Cost Computation
      • The entire encoding and decoding process needs to be performed for every mode

  23. Motivation & Previous Approaches
      • RD_Cost-based mode decision gives the best performance, but is very expensive to compute
      • Previous efforts in fast intra mode decision
        • Directional field
        • Edge histogram
        • Other pixel-domain approaches
        • They all lead to lower coding performance
      • Our approach is based on transform domain processing: no loss in coding performance

  24. Transform Domain RD Cost Computation
      • No inverse transform
      • Transformations of some prediction signals are easy to compute
      • Distortion calculated in transform domain

  25. HT of DC Prediction
      • No HT needs to be performed
      • Pdc has only one non-zero element

  26. HT of Horizontal Prediction
      • Only one 1-D HT is needed
      • Ph has only four non-zero elements (the first column)

  27. HT of Vertical Prediction
      • Only one 1-D HT is needed
      • Pv has only four non-zero elements (the first row)
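
The sparsity claims on slides 25 to 27 can be checked with a short sketch. It assumes the standard 4x4 core transform matrix `H` and compares a hypothetical shortcut for each mode against the full 2-D transform of the prediction block:

```python
H = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def forward_ht(block):
    return matmul(matmul(H, block), [list(r) for r in zip(*H)])

def ht_1d(v):
    """1-D transform of a 4-vector."""
    return [sum(H[i][k] * v[k] for k in range(4)) for i in range(4)]

def ht_of_dc(dc):
    """DC prediction: a single coefficient, no transform needed."""
    out = [[0] * 4 for _ in range(4)]
    out[0][0] = 16 * dc
    return out

def ht_of_horizontal(left):
    """Horizontal prediction: one 1-D HT of the left pixels, first column only."""
    col = ht_1d(left)
    return [[4 * col[i]] + [0, 0, 0] for i in range(4)]

def ht_of_vertical(top):
    """Vertical prediction: one 1-D HT of the top pixels, first row only."""
    row = ht_1d(top)
    return [[4 * v for v in row]] + [[0] * 4 for _ in range(3)]

top, left, dc = [10, 20, 30, 40], [1, 2, 3, 4], 14
full_dc = forward_ht([[dc] * 4 for _ in range(4)])
full_h = forward_ht([[left[i]] * 4 for i in range(4)])
full_v = forward_ht([list(top) for _ in range(4)])
```

The shortcuts work because a constant row or column only excites the first basis function of `H` in that direction, whose entries sum to 4 while all other rows of `H` sum to 0.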

  28. Calculate Distortion in Transform Domain
      • Distortion in pixel domain: [formula shown on slide]
      • Distortion in transform domain: [formula shown on slide]
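
The two distortion formulas are missing from the transcript. The sketch below illustrates one identity that makes transform-domain distortion possible, under the assumption that distortion is a sum of squared differences: the rows of the standard matrix `H` are orthogonal with squared norms (4, 10, 4, 10), so the pixel-domain SSD equals a weighted SSD of the HT coefficients. The slide's own formulation may be written differently.

```python
H = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]
ROW_NORMS = [4, 10, 4, 10]  # squared Euclidean norms of the rows of H

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def forward_ht(block):
    return matmul(matmul(H, block), [list(r) for r in zip(*H)])

def ssd_pixel(residual):
    """Sum of squared differences computed on pixel-domain residuals."""
    return sum(v * v for row in residual for v in row)

def ssd_transform(coeffs):
    """Same quantity computed from HT coefficients with per-position weights."""
    return sum(coeffs[i][j] ** 2 / (ROW_NORMS[i] * ROW_NORMS[j])
               for i in range(4) for j in range(4))

residual = [[1, -2, 3, 0], [4, 0, -1, 2], [0, 5, -3, 1], [2, 2, -4, 6]]
d_pixel = ssd_pixel(residual)
d_transform = ssd_transform(forward_ht(residual))
```

Since the forward transform is linear, the same weights apply to the difference between original and reconstructed coefficients, so no inverse transform is needed to evaluate the distortion.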

  29. Ranking-based Fast Mode Decision
      • Two cost functions: SATD_Cost & RD_Cost
      • Observation: the best mode according to RD_Cost usually also has a small SATD_Cost
      • Proposed algorithm (mode reduction): rank the modes by SATD_Cost, then calculate RD_Cost only for the top few modes
      • Algorithm can be conducted in transform domain
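
The mode-reduction idea can be sketched as follows. The cost functions are passed in as hypothetical callables, the mode names and cost values are made up for illustration, and `k=3` follows the value used in the verification experiment:

```python
def ranked_mode_decision(modes, satd_cost, rd_cost, k=3):
    """Rank modes by the cheap SATD cost, then run the expensive RD cost
    only on the k best-ranked candidates and return the winner."""
    shortlist = sorted(modes, key=satd_cost)[:k]
    return min(shortlist, key=rd_cost)

# Illustrative costs only; not measured data.
satd = {"V": 10, "H": 12, "DC": 11, "DDR": 30}
rd = {"V": 100, "H": 90, "DC": 95, "DDR": 50}

best = ranked_mode_decision(list(satd), satd.get, rd.get, k=3)
```

In this toy example `DDR` has the lowest RD cost but is pruned by its poor SATD rank, so the result is `H`; the method relies on such cases being rare, which is what the verification experiment measures.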

  30. Verification Experiment
      • Count the percentage of times the best mode according to RD_Cost is within the best k modes ranked by SATD_Cost
      • k is fixed at 3 in all simulations

  31. Simulation Conditions
      • Three transcoders
        • PDT: reference pixel domain transcoder, with fast IDCT implemented
        • TDT: transform domain transcoder
        • TDT-R: transform domain transcoder with ranking-based mode decision
      • Test sequences
        • 100 frames, CIF size, 30 fps
        • Input: MPEG-2 all-I at 6 Mbps

  32. Simulation – “Mobile”

  33. Simulation – “Stefan”

  34. Complexity: Run-time Results

  35. Summary of Intra Transcoding
      • Efficient transcoder architecture
      • Efficient mode decision
        • Transform domain distortion calculation
        • Ranking-based mode decision
      • Achieved virtually the same quality as the reference transcoder with significantly lower complexity

  36. Inter Transcoding Techniques

  37. Transcoder Architecture
      [Block diagram. Blocks: MPEG-2 decoder, decoded picture and macroblock data, motion and modes, motion/mode mapping, prediction, HT/Q, entropy coding, inverse Q/inverse HT, deblocking filter, pixel buffers]

  38. Assumptions
      • Input
        • MPEG-2 frame pictures
      • Output
        • H.264/AVC baseline profile (no B slices) and main profile
        • Frame pictures, MBAFF not considered
        • Block partition sizes considered for motion compensation: 16x16, 16x8, 8x16 and 8x8

  39. Motion Mapping: Problems

  40. Motion Mapping Algorithm
      • Field-to-frame mapping: convert MPEG-2 field motion vectors (if any) to frame vectors
      • Reference picture mapping: for B to P frame type conversion
      • Block size mapping: map the MPEG-2 motion vectors to target H.264/AVC motion vectors of different block sizes
        • Algorithm: distance weighted average (DWA)
      • Motion refinement: (1+1/2+1/4) around the estimated motion vectors for all block partitions
      • Note: for B slice output, the above mapping is performed for motion vectors of both directions
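
The slides name the distance weighted average but the transcript does not give its weights. The sketch below is therefore only one plausible reading: it assumes inverse-distance weights between the center of the target partition and the centers of the candidate MPEG-2 macroblocks, and all names are hypothetical.

```python
import math

def dwa_motion_vector(target_center, candidates):
    """Estimate a motion vector for a target partition.

    candidates: list of ((cx, cy), (mvx, mvy)) pairs, where (cx, cy) is the
    center of an MPEG-2 macroblock and (mvx, mvy) its motion vector.
    Weights are 1/distance (an assumption, not taken from the slides).
    """
    weights = []
    for (cx, cy), mv in candidates:
        d = math.hypot(cx - target_center[0], cy - target_center[1])
        if d == 0:
            return mv  # a coincident candidate dominates completely
        weights.append(1.0 / d)
    total = sum(weights)
    mvx = sum(w * mv[0] for w, (_, mv) in zip(weights, candidates)) / total
    mvy = sum(w * mv[1] for w, (_, mv) in zip(weights, candidates)) / total
    return (mvx, mvy)

# Two equidistant candidates reduce to a plain average.
estimate = dwa_motion_vector((0, 0), [((-1, 0), (2, 0)), ((1, 0), (4, 2))])
```

The estimate then serves as the starting point for the motion refinement step listed on the slide.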

  41. Field-to-frame Conversion

  42. Reference Picture Mapping
      [Figure: input pictures I B B P mapped to output pictures I P P P; input temporal distance ti = 3, output temporal distance to = 1; motion vectors labeled MVcol, MVi,forw, MVi,back (input) and MVo (output)]
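
The figure labels suggest that an input vector spanning a temporal distance ti is mapped to an output vector spanning to. The exact rule is not in the transcript; a common approach, assumed here, is linear scaling by the ratio of temporal distances (the function name is hypothetical):

```python
def scale_motion_vector(mv, t_in, t_out):
    """Scale a motion vector from temporal distance t_in to t_out,
    assuming roughly constant motion over the interval."""
    return (mv[0] * t_out / t_in, mv[1] * t_out / t_in)

# With the distances shown in the figure (ti = 3, to = 1):
mv_out = scale_motion_vector((9, -6), t_in=3, t_out=1)
```

Under this assumption, a vector that pointed backward in time would additionally need its sign flipped before it can reference a past picture, and the scaled result would still be refined by the motion refinement step.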

  43. Block Size Mapping: 16x8, 8x16

  44. Block Size Mapping: 8x8

  45. Simulation Conditions
      • Test sequences:
        • 1920x1080i, 30 fps, 450 frames
      • MPEG-2 input:
        • 30 Mbps, (30,3)
      • H.264/AVC output:
        • UVLC, output bit-rate of interest ~10 Mbps
        • Baseline profile (needs to convert B pictures to P slices) & Main profile
      • Comparison points
        • Mapping algorithm
        • B slices
        • RD optimization

  46. Baseline output: no B slices

  47. Baseline output: no B slices

  48. Main Output: with B slices

  49. Main Output: with B slices

  50. Complexity: Run-time Results
