
Multiplexing H.264/AVC Video with MPEG-AAC Audio


Multiplexing H.264/AVC Video with MPEG-AAC Audio Harishankar Murugan University of Texas at Arlington Outline : Multiplexing: Areas of applications Why H.264 and AAC? Multiplexing De-multiplexing Synchronization and Playback Results Conclusions Future work References


Multiplexing H.264/AVC Video with MPEG-AAC Audio

Harishankar Murugan

University of Texas at Arlington

Outline :

  • Multiplexing: Areas of applications

  • Why H.264 and AAC?

  • Multiplexing

  • De-multiplexing

  • Synchronization and Playback

  • Results

  • Conclusions

  • Future work

  • References

Multiplexing : Areas of applications

  • DVB : DVB-C, DVB-T

  • ATSC

  • IPTV

Why H.264 Video?

  • Up to 50% bit rate savings compared to H.263v2 (H.263+) or MPEG-2 Simple Profile.

  • High quality video: H.264 offers consistently good video quality at high and low bit rates.

  • Error resilience: H.264 provides the tools necessary to deal with packet loss in packet networks and bit errors in error-prone wireless networks.

  • Wide areas of application: streaming, mobile TV, HDTV, and storage options for the home user.

Important features of H.264

  • IDR (Instantaneous decoder refresh) picture:

    Anchor picture with only I-slices.

  • Sequence parameter set:

    • profile and level indicator.

    • decoding or playback order.

    • number of reference frames.

    • aspect ratio or color space details.

  • Picture parameter set:

    • entropy coding mode used.

    • slice data partitioning and macroblock reordering.

    • Flags indicating the usage of weighted (bi) prediction.

    • Quantization parameter details.

    AAC Audio

    • Advanced Audio Coding is a standardized, lossy compression scheme for audio.

    [Figure: encoder block diagram of AAC]

    • Profiles :

      • Low Complexity (LC) - the simplest and most widely used;

      • Main Profile (MAIN) - LC profile with backwards prediction;

      • Scalable Sampling Rate (SSR) – LC profile with gain control tool;

  • Bit stream Formats:

    • ADIF - Audio Data Interchange Format:

      Only one header in the beginning of the file followed by raw data blocks

    • ADTS - Audio Data Transport Stream

      Separate header for each frame enabling decoding from any frame
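As a rough illustration of why a per-frame ADTS header allows decoding to start at any frame, the sketch below reads a few fields of the standard 7-byte ADTS header (no CRC) and walks from one frame to the next using the frame length. The function names and the returned dictionary are hypothetical and are not taken from the presentation.

```python
# Sampling rates indexed by the 4-bit ADTS sampling_frequency_index.
ADTS_SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                     24000, 22050, 16000, 12000, 11025, 8000, 7350]


def parse_adts_header(data: bytes) -> dict:
    """Parse selected fields of one ADTS header at the start of `data`."""
    if len(data) < 7:
        raise ValueError("need at least 7 bytes for an ADTS header")
    # The 12-bit syncword is all ones.
    if data[0] != 0xFF or (data[1] & 0xF0) != 0xF0:
        raise ValueError("ADTS syncword not found")
    rate_index = (data[2] >> 2) & 0x0F
    if rate_index >= len(ADTS_SAMPLE_RATES):
        raise ValueError("reserved sampling frequency index")
    return {
        "profile_index": (data[2] >> 6) & 0x03,  # 0 = Main, 1 = LC, 2 = SSR
        "sample_rate": ADTS_SAMPLE_RATES[rate_index],
        "channel_configuration": ((data[2] & 0x01) << 2) | (data[3] >> 6),
        # 13-bit frame length in bytes, header included.
        "frame_length": ((data[3] & 0x03) << 11) | (data[4] << 3) | (data[5] >> 5),
    }


def iter_adts_frames(stream: bytes):
    """Yield (offset, header) for consecutive ADTS frames in `stream`."""
    offset = 0
    while offset + 7 <= len(stream):
        header = parse_adts_header(stream[offset:])
        if header["frame_length"] < 7:
            raise ValueError("invalid ADTS frame length")
        yield offset, header
        offset += header["frame_length"]
```

Because every frame starts with its own header, a reader that lands in the middle of a stream only needs to find the next syncword to resume.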

    Why AAC Audio?

    • Supports sampling frequencies from 8 kHz to 96 kHz (official MP3: 16 kHz to 48 kHz)

    • Higher coding efficiency and a simpler filter bank (pure MDCT) compared to MP3 (hybrid filter bank)

    • Improved compression provides higher-quality audio at lower bit rates.

    • Superior performance at bit rates > 64 kbps and at bit rates as low as 16 kbps.

    Factors to be considered for Multiplexing and Transmission

    • Split the video and audio coded bit streams into smaller data packets

    • Multiplex with equal priority given to all elementary streams

    • Detect packet losses and errors

    • Additional information to help synchronize audio and video

    Packetization

    [Figure: packetization; labels: H.264 encoder, MPEG encoded stream]

    • Two layers of packetization:

      • PES (Packetized Elementary Stream)

      • Transport Stream


    Packetized Elementary stream (PES)

    • Elementary streams (ES):

      • Encoded video stream

      • Encoded audio stream

      • Data stream (Optional)

  • PES contains access units that are sequentially separated and packetized

  • PES headers distinguish different ES and contain timestamp information

  • Packet size varies with the size of access units

    PES Header Description

    • 3 bytes of start code – 0x000001

    • 1 byte of stream ID

    • 2 bytes of packet length

    • 2 bytes of time stamp (Frame number)
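The 8-byte header listed above could be packed and parsed roughly as follows. This is a minimal sketch under stated assumptions: big-endian byte order, a length field that counts only the payload bytes, and a frame number that wraps at 16 bits. None of these details is given on the slide, and the function names and the stream ID value in the usage note are hypothetical.

```python
import struct

PES_START_CODE = b"\x00\x00\x01"
# Big-endian: 3-byte start code, 1-byte stream ID,
# 2-byte packet length, 2-byte frame number (time stamp).
PES_HEADER = struct.Struct(">3sBHH")


def build_pes_packet(stream_id: int, frame_number: int, access_unit: bytes) -> bytes:
    """Prefix one access unit with the 8-byte PES header."""
    if len(access_unit) > 0xFFFF:
        raise ValueError("access unit does not fit a 2-byte length field")
    header = PES_HEADER.pack(
        PES_START_CODE,
        stream_id,
        len(access_unit),          # assumption: length counts payload bytes only
        frame_number & 0xFFFF,     # assumption: frame number wraps at 16 bits
    )
    return header + access_unit


def parse_pes_header(packet: bytes):
    """Return (stream_id, payload_length, frame_number) from a PES packet."""
    start_code, stream_id, length, frame_number = PES_HEADER.unpack_from(packet)
    if start_code != PES_START_CODE:
        raise ValueError("PES start code 0x000001 not found")
    return stream_id, length, frame_number
```

For example, `build_pes_packet(0xE0, 5, b"abc")` yields an 11-byte packet whose header parses back to stream ID `0xE0`, length 3, and frame number 5.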

    Frame number as time stamp

    • Video frame rate : constant (25/30/.. fps)

      time = frame number/fps

    • Audio sampling rate : constant (8 – 96 kHz)

      Number of samples/frame (AAC) : 1024

      time = 1024*frame number/(sampling rate)
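The two formulas above translate directly into code. The sketch below assumes frame numbers start at 0 and returns the time in seconds; the function names are illustrative only.

```python
AAC_SAMPLES_PER_FRAME = 1024  # samples per AAC frame, as stated on the slide


def video_time(frame_number: int, fps: float) -> float:
    """Playback time in seconds of a video frame at a constant frame rate."""
    return frame_number / fps


def audio_time(frame_number: int, sampling_rate: int) -> float:
    """Playback time in seconds of an AAC frame at a constant sampling rate."""
    return AAC_SAMPLES_PER_FRAME * frame_number / sampling_rate
```

At 25 fps, video frame 50 falls at 2.0 s; at 48 kHz, audio frame 375 falls at 8.0 s.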

    Advantages over the method that uses clock samples as time stamps

    • Saves the extra header bytes used for sending program clock reference (PCR) information periodically

    • No synchronization problems due to clock jitter

    • No propagation of delay between audio and video

    • Less complex and more suitable for software implementation

    Transport Packets

    • PES from various elementary sources are broken into smaller packets called transport packets

    • Transport packets have a fixed length of 188 bytes

    • Constraints

      • Each packet can have data from only one PES

      • The PES header must start at the first byte of the transport packet payload.

      • Stuffing bytes are added if the above constraints are not met
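The constraints above can be sketched as a function that cuts one PES packet into transport packet payloads. The 185-byte payload size is an assumption based on the 570/185 calculation that appears in the multiplexing example of this deck (188 bytes minus a 3-byte header). The bytes an adaptation field itself would consume are ignored here.

```python
TS_PACKET_SIZE = 188
TS_PAYLOAD_SIZE = 185  # assumption: 188 bytes minus a 3-byte packet header


def split_pes(pes: bytes, payload_size: int = TS_PAYLOAD_SIZE):
    """Split one PES packet into transport packet payloads.

    Returns a list of (chunk, stuffing_bytes, starts_pes) tuples. Every
    chunk holds data from this PES only, the first chunk begins with the
    PES header, and a short final chunk is padded with stuffing bytes.
    """
    chunks = []
    for offset in range(0, len(pes), payload_size):
        chunk = pes[offset:offset + payload_size]
        stuffing = payload_size - len(chunk)
        chunks.append((chunk, stuffing, offset == 0))
    return chunks
```

A 570-byte PES gives four payloads of 185, 185, 185, and 15 bytes, with 170 stuffing bytes in the last packet.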

    Transport stream

    [Figure: transport stream packet; label: PES payload]

    Packet Header

    • PID (Packet identifier) :

      Each elementary stream has a unique PID. Some are reserved for NULL packets and PSI (Program Specific Information).

    • PSI (Program specific information) :

      Sequence parameter set and picture parameter set are sent as PSI at frequent intervals.

    • Payload unit start indicator :

      1 bit flag to indicate presence of PES header in the payload.

    • Adaptation field control :

      1 bit flag to indicate presence of any data other than PES data in payload.

    • Continuity counter :

      4-bit rolling counter, incremented by 1 for each consecutive TS packet of the same PID; used to detect packet loss.

    • Payload byte offset :

      If the adaptation field control bit is ‘1’, this field gives the byte offset of the start of the payload, that is, the length of the adaptation field.

    • Adaptation field :

      • Stuffing bytes, if the PES data is smaller than the TS packet payload

      • Additional header information
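Packet-loss detection with the 4-bit continuity counter can be sketched as below. The input format, a sequence of `(pid, continuity_counter)` pairs already extracted from the packet headers, is an assumption made to keep the example short, and the function name is hypothetical.

```python
def find_continuity_gaps(packets):
    """Report breaks in the per-PID 4-bit continuity counter.

    `packets` is an iterable of (pid, continuity_counter) pairs in
    arrival order. Returns a list of (index, pid, expected, received)
    tuples, one for each packet whose counter is not the previous value
    for that PID plus 1, modulo 16.
    """
    last_seen = {}
    gaps = []
    for index, (pid, counter) in enumerate(packets):
        if pid in last_seen:
            expected = (last_seen[pid] + 1) % 16
            if counter != expected:
                gaps.append((index, pid, expected, counter))
        last_seen[pid] = counter
    return gaps
```

Because the counter is only 4 bits wide, a loss of exactly 16 consecutive packets of one PID would go unnoticed by this check.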

    Multiplexing method adopted

    • Multiplexing method affects buffer fullness at the de-multiplexer and in turn playback

    • Video and audio timing counters are used to ensure proper multiplexing

    • Timing counters are incremented according to the playback time of each packet multiplexed

    • PES with the least timing counter value is always given preference during packet allocation
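One way to read the counter-based scheme above is sketched here. Each PES is split into TS packets, each TS packet is credited with an equal share of its frame's playback time, and the stream with the smaller timing counter supplies the next packet. The 185-byte payload size, the tie-break in favour of video, and all names are assumptions. Exact fractions are used so that ties are not affected by floating-point rounding.

```python
import math
from collections import deque
from fractions import Fraction

TS_PAYLOAD_SIZE = 185          # assumption: payload bytes per TS packet
AAC_SAMPLES_PER_FRAME = 1024


def packet_durations(pes_lengths, frame_duration):
    """Queue of playback durations, one entry per TS packet of each PES."""
    queue = deque()
    for length in pes_lengths:
        count = max(1, math.ceil(length / TS_PAYLOAD_SIZE))
        queue.extend([frame_duration / count] * count)
    return queue


def multiplex_order(video_pes_lengths, audio_pes_lengths, fps, sampling_rate):
    """Return the TS packet order as a list of 'V' and 'A' labels."""
    queues = {
        "V": packet_durations(video_pes_lengths, Fraction(1, fps)),
        "A": packet_durations(audio_pes_lengths,
                              Fraction(AAC_SAMPLES_PER_FRAME, sampling_rate)),
    }
    counters = {"V": Fraction(0), "A": Fraction(0)}
    order = []
    while queues["V"] or queues["A"]:
        ready = [name for name in ("V", "A") if queues[name]]
        # The stream with the least timing counter goes next (video on ties).
        name = min(ready, key=lambda n: counters[n])
        counters[name] += queues[name].popleft()
        order.append(name)
    return order
```

With one 570-byte video PES at 25 fps and two 100-byte audio PES at 48 kHz, `multiplex_order([570], [100, 100], 25, 48000)` interleaves the packets as V, A, V, V, A, V.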

    Multiplexing method adopted (example)

    • Video PES, fps = 25: each frame covers 1/25 s = 40 ms of playback

    • PES length = 570 bytes

    • Number of TS packets = ceil(570/185) = 4 (570/185 ≈ 3.08, rounded up so the whole PES fits)

    • Playback time per TS packet = 40/4 = 10 ms

    Multiplexed transport stream

    [Figure: multiplexed transport stream; labels: Video PES, Audio PES, Transport stream; values shown: 15 16 16 16 15 1024 16]

    Synchronization and playback

    • During playback, data is loaded from the buffer

    • IDR frame is searched from the top of the video buffer

    • Frame number of IDR frame is extracted

    • Corresponding audio frame number is calculated as follows

      audio frame number = (video frame number * sampling rate) / (1024 * fps)

    • If the result is not an integer, it is rounded to the nearest frame number and the corresponding audio frame is searched for.

    • The audio and video contents from the corresponding frame numbers are decoded with PSI and played back.

    • Then the audio and video buffers are emptied and incoming data gets buffered and the process continues.

    • If corresponding audio frame is not found, next IDR frame is searched and same process is repeated.
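The search described above might look like the following sketch. It assumes the frame numbers of the buffered IDR frames and audio frames have already been extracted from the PES headers; the function names and argument layout are hypothetical. Note that Python's `round` sends exact .5 cases to the nearest even integer, which may differ from the rounding used in the original implementation.

```python
AAC_SAMPLES_PER_FRAME = 1024


def matching_audio_frame(video_frame_number, fps, sampling_rate):
    """Audio frame number that plays at the same time as a video frame."""
    exact = video_frame_number * sampling_rate / (AAC_SAMPLES_PER_FRAME * fps)
    return round(exact)


def find_sync_point(idr_frame_numbers, buffered_audio_frames, fps, sampling_rate):
    """Return (video_frame, audio_frame) for the first IDR frame whose
    matching audio frame is in the buffer, or None if there is none."""
    available = set(buffered_audio_frames)
    for video_frame in idr_frame_numbers:
        audio_frame = matching_audio_frame(video_frame, fps, sampling_rate)
        if audio_frame in available:
            return video_frame, audio_frame
    return None
```

With 25 fps video and 48 kHz audio, IDR frame 6 maps to audio frame 11 (11.25 rounded). If that frame is not buffered, the search moves on to the next IDR frame.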

    Results

    [Figure: results]

    Conclusions

    • Audio and video are synchronized even when de-multiplexing starts from an arbitrary TS packet.

    • No visible lag between video and audio was observed.

    • Bit rate can be changed by using rate control module in the H.264 encoder

    Test Conditions

    • Single program Transport stream is generated

    • Input raw video : YUV format

    • Input raw audio : WAVE format

    • Profiles used :

      • H.264 : Main profile

      • AAC : Low complexity profile (ADTS format)

    • GOP : IBBPBB (IDR forced)

    • Video frame rate: 25fps

    • Audio sampling frequency : 48 kHz

    Future work

    • Extension of the algorithm to multiplex multiple program streams

    • Error correction method

    • Reduce initial buffering time

