Multiplexing H.264 and HEAACv2 elementary streams, de-multiplexing and achieving lip synchronization during playback

Naveen Siddaraju

Contents:

  • Introduction : Need for multiplexing
  • Overview of codecs used
  • Transport protocols
  • Multiplexing
  • De-multiplexing and synchronization
  • Results
  • Conclusions
  • Future work
  • References
Introduction: need for multiplexing
  • Digital television broadcasting
    • ATSC-M/H [17]
    • DVB-H
    • DVB-T
  • Internet streaming
    • IPTV, YouTube, etc.
Choice of codecs
  • Depends on the application:
    • Transport bandwidth
      • ATSC-M/H channel bandwidth: 19.6 Mbps
      • DVB-H channel bandwidth: 14 Mbps
    • Processing power of the target device
H.264/AVC
  • Defined in MPEG-4 Part 10
  • Jointly developed by the ITU-T VCEG and the ISO/IEC MPEG group.
  • Provides better compression than predecessors such as MPEG-2 video and MPEG-4 Part 2.
  • Suitable for a wide variety of applications.
  • Adopted as a standard in ATSC-M/H, DVB, etc.
  • Used in Blu-ray discs, HD DVD, iTunes, Flash Player, video conferencing applications, etc.
Frame types
  • Three basic types
    • Intra predictive (I) frame
    • Predictive (P) frame
    • Bi-predictive (B) frame
  • An IDR frame is a special type of I frame.
    • Indicates the start of a video sequence.

Bitstream syntax of H.264
  • Data is organized into two layers
    • VCL (video coding layer)
    • NAL (network abstraction layer)
  • NAL formatting of VCL and non-VCL data [6]
  • NAL unit header format [6]
    • Forbidden bit - 1 bit
    • NRI - 2 bits
    • Type - 5 bits
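The one-byte NAL unit header listed above can be unpacked with a few shifts and masks. The following is a minimal sketch, not code from the presentation: the names `NalHeader` and `parse_nal_header` are hypothetical, and the type codes in `NAL_TYPE_NAMES` are the values assigned by the H.264 specification to IDR slices, SPS and PPS.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    forbidden_bit: int   # 1 bit, expected to be 0 in a valid stream
    nri: int             # 2 bits
    nal_unit_type: int   # 5 bits


def parse_nal_header(first_byte: int) -> NalHeader:
    """Split the first byte of a NAL unit into its three fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    return NalHeader(
        forbidden_bit=first_byte >> 7,
        nri=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )


# Type codes from the H.264 specification for a few common unit types.
NAL_TYPE_NAMES = {5: "IDR slice", 7: "SPS", 8: "PPS"}

header = parse_nal_header(0x65)
print(header, NAL_TYPE_NAMES.get(header.nal_unit_type, "other"))
```

A de-multiplexer can use the type field in this way to locate IDR frames without decoding any video data.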

Important NAL unit types
  • IDR frames
    • Indicate the start of a new video sequence
  • Sequence parameter sets (SPS)
    • Contain parameters common to the entire sequence
    • Profile, level, size of the video, number of reference frames
  • Picture parameter sets (PPS)
    • Contain parameters that apply to a frame or to some frames in a sequence
    • Entropy coding, quantization parameters, etc.

HEAACv2
  • Also called Enhanced aacPlus
  • Developed by Coding Technologies for very low bitrate applications.
  • Defined in MPEG-4 Part 3, Amendment 2
  • Enables coding in mono, stereo and multichannel (up to 48 channels)
  • Is a combination of AAC, SBR and PS
  • Targets high perceived audio quality at very low bitrates
  • Adopted as the audio standard in ATSC-M/H, DVB and XM Satellite Radio
  • Can be stored in a variety of file formats, such as MP4 and M4A.
  • Controlled testing conducted by 3GPP [27] indicates that HEAACv2 provides good quality audio at 24 kbps.
AAC (Advanced Audio Coding)
  • Successor of the MP3 format
  • Defined both in MPEG-2 [3] and MPEG-4 [2]
  • Achieves better sound quality than MP3 at the same bitrate.
  • AAC is also the standard audio format for the Apple iPhone, iPod and iPad, the Sony PlayStation, etc.
  • Up to 48 channels (MP3 supports up to two channels in MPEG-1 mode and up to 5.1 channels in MPEG-2 mode)
  • More sampling frequencies (from 8 to 96 kHz) than MP3 (16 to 48 kHz)
  • Achieves good quality audio at 128 kbps for stereo.
SBR (spectral band replication) [2]
  • SBR is a bandwidth expansion technique
  • Exploits the correlation between the high and low frequencies.
  • Using SBR, along with AAC, high quality stereo sound can be achieved at 48 kbps.
High band reconstruction through SBR [28]
  • Figure: original audio signal [28].
  • Figure: high band reconstruction through SBR [28].
PS (parametric stereo) [2]
  • Only used for low bitrate applications (< 32 kbps)
  • Parameterizes the stereo image using cues such as time/phase differences and inter-channel intensity differences.
  • Only a monaural version of the stereo signal is encoded by the AAC encoder.
  • At the decoder side, the monaural signal is decoded first, and then the stereo signal is reconstructed using the PS parameters.
  • Using PS along with AAC and SBR, reasonable quality stereo sound can be achieved at 24 kbps.
HEAACv2 bitstream formats
  • ADIF (audio data interchange format)
    • Has just one header for the whole stream
    • Used in storage media
  • ADTS (audio data transport stream)
    • Used in transport streams
    • Has a header in every access unit
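Because ADTS repeats a header in front of every access unit, a receiver can split the stream into frames by reading each header's length field. The sketch below is illustrative only and is not taken from the presentation. It assumes the usual 7-byte ADTS header without CRC as laid out in the MPEG-2/MPEG-4 AAC specifications; `AdtsHeader`, `parse_adts_header` and `iter_adts_frames` are hypothetical names.

```python
from typing import Iterator, NamedTuple

# Sampling-frequency index table used by AAC (Hz).
SAMPLING_FREQUENCIES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                        22050, 16000, 12000, 11025, 8000, 7350]


class AdtsHeader(NamedTuple):
    profile: int             # 2-bit profile field
    sampling_frequency: int  # Hz, looked up from the 4-bit index
    channel_config: int      # 3-bit channel configuration
    frame_length: int        # bytes, header included


def parse_adts_header(b: bytes) -> AdtsHeader:
    """Decode the main fields of a 7-byte ADTS header."""
    if len(b) < 7 or b[0] != 0xFF or (b[1] & 0xF0) != 0xF0:
        raise ValueError("ADTS syncword (0xFFF) not found")
    index = (b[2] >> 2) & 0x0F
    if index >= len(SAMPLING_FREQUENCIES):
        raise ValueError("reserved sampling frequency index")
    return AdtsHeader(
        profile=b[2] >> 6,
        sampling_frequency=SAMPLING_FREQUENCIES[index],
        channel_config=((b[2] & 0x01) << 2) | (b[3] >> 6),
        frame_length=((b[3] & 0x03) << 11) | (b[4] << 3) | (b[5] >> 5),
    )


def iter_adts_frames(data: bytes) -> Iterator[bytes]:
    """Yield complete ADTS frames (header plus payload) one at a time."""
    pos = 0
    while pos + 7 <= len(data):
        header = parse_adts_header(data[pos:pos + 7])
        end = pos + header.frame_length
        if header.frame_length < 7 or end > len(data):
            break  # malformed or truncated frame: stop here
        yield data[pos:end]
        pos = end
```

Each yielded frame is one access unit, which is the granularity at which the audio stream is packetized later in the presentation.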

Transport protocols
  • Most multimedia applications involve communication channels or storage.
    • RTP (Real-time Transport Protocol)
      • Transport over IP networks
    • MPEG-2 systems
      • Digital television broadcast
      • Storage (asset management)

MPEG-2 systems
  • Defines two types of streams
    • Program stream (PS)
      • Used for storage, e.g. DVD
    • Transport stream (TS)
      • Used for digital broadcast
  • Two layers of packetization
    • PES (packetized elementary stream)
    • TS (transport stream)

PES (packetized elementary stream)
  • First layer of packetization
  • Separates the audio and video elementary streams into access units.
  • Variable length
  • Contains a header and payload (frame) data.
  • The header adds fields such as time stamp, stream ID and packet length.
Frame number as time stamp
  • For video, the frame rate (fps) is constant throughout the sequence.
  • For audio, the sampling frequency is constant throughout the sequence.
  • Hence the presentation time of a frame can be derived from its frame number alone.
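Since both rates are constant, a frame number can be converted to a presentation time by simple arithmetic. The transcript does not include the slide's equations, so the following is only a sketch under stated assumptions: frames are numbered from 0, every audio frame carries the same number of samples, and `samples_per_frame` is supplied by the caller because it depends on the encoder configuration. The function names are hypothetical.

```python
from fractions import Fraction


def video_presentation_time(frame_number: int, fps: int) -> Fraction:
    """Seconds from the start of the sequence to this video frame."""
    return Fraction(frame_number, fps)


def audio_presentation_time(frame_number: int,
                            samples_per_frame: int,
                            sampling_frequency: int) -> Fraction:
    """Seconds from the start of the sequence to this audio frame."""
    # Each audio frame lasts samples_per_frame / sampling_frequency seconds.
    return Fraction(frame_number * samples_per_frame, sampling_frequency)


# Example with 24 fps video and 24 kHz audio; the 1024 samples per
# frame used here is an illustrative value, not one from the slides.
print(float(video_presentation_time(48, 24)))
print(float(audio_presentation_time(375, 1024, 24000)))
```

Exact fractions are used so that rounding errors do not accumulate over long sequences.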
TS packets
  • Second layer of packetization
  • Fixed length (188 bytes)
  • Each PES packet is logically broken down into 188-byte packets
  • A three-byte header contains the packet ID, payload unit start flag, continuity counter, etc.
TS header description:
  • Payload unit start indicator (PUSI) flag
    • Indicates that the payload contains a PES header.
  • Adaptation field control (AFC) flag
    • Indicates that the payload is shorter than 185 bytes.
  • Continuity counter (CC) (4 bits)
    • Used to detect lost or out-of-sequence packets.
  • Packet ID (PID) (10 bits)
    • Uniquely identifies the ES to which the packet belongs.
  • Optional offset byte
    • Contains the offset value if AFC is set.
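The header fields described above can be packed into three bytes as follows. This is a sketch under loud assumptions, since the transcript gives the field widths but not their positions: the first byte is taken to be a 0x47 sync byte, and the remaining 16 bits are assumed to hold PUSI, AFC, CC and PID in that order. All names are hypothetical.

```python
from typing import NamedTuple

SYNC_BYTE = 0x47  # assumption: first header byte is a sync byte


class TsHeader(NamedTuple):
    pusi: bool  # payload starts with a PES header
    afc: bool   # payload is shorter than 185 bytes
    cc: int     # 4-bit continuity counter
    pid: int    # 10-bit packet ID


def pack_ts_header(pusi: bool, afc: bool, cc: int, pid: int) -> bytes:
    """Build the 3-byte header (assumed order: PUSI, AFC, CC, PID)."""
    if not 0 <= cc < 16 or not 0 <= pid < 1024:
        raise ValueError("cc must fit in 4 bits and pid in 10 bits")
    word = (int(pusi) << 15) | (int(afc) << 14) | (cc << 10) | pid
    return bytes([SYNC_BYTE, word >> 8, word & 0xFF])


def parse_ts_header(packet: bytes) -> TsHeader:
    """Inverse of pack_ts_header for the first three bytes of a packet."""
    if len(packet) < 3 or packet[0] != SYNC_BYTE:
        raise ValueError("sync byte not found")
    word = (packet[1] << 8) | packet[2]
    return TsHeader(bool(word >> 15), bool((word >> 14) & 1),
                    (word >> 10) & 0x0F, word & 0x3FF)


def continuity_ok(previous_cc: int, cc: int) -> bool:
    """True when cc directly follows previous_cc, wrapping at 16."""
    return cc == (previous_cc + 1) % 16
```

A receiver would keep one `previous_cc` per PID and flag a lost or out-of-sequence packet whenever `continuity_ok` returns `False`.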

Multiplexing
  • What is multiplexing?
    • Multiplexing is the process of transmitting TS packets that belong to different elementary streams over a single stream.
    • Effective muxing interleaves the TS packets in the transport stream so that the audio and video content is transmitted simultaneously.
  • Buffer overflow/underflow
    • Can cause picture loss or skips during audio-video playback.

Calculation of presentation time of a TS packet:
  • For video TS packet
  • For audio TS packet
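The equations for these two cases are not preserved in the transcript. One plausible scheme, sketched here purely as an assumption and not as the author's method, gives every TS packet of a frame an equal share of that frame's duration and always sends the next packet from whichever stream has delivered less presentation time so far. The names `packet_durations` and `interleave` are hypothetical.

```python
from fractions import Fraction
from typing import Iterable, Iterator, List


def packet_durations(packets_per_frame: Iterable[int],
                     frame_duration: Fraction) -> Iterator[Fraction]:
    """Presentation time carried by each TS packet, frame by frame."""
    for count in packets_per_frame:
        for _ in range(count):
            # Assumption: a frame's duration is shared equally
            # among the TS packets that carry it.
            yield frame_duration / count


def interleave(video: List[Fraction], audio: List[Fraction]) -> List[str]:
    """Return the send order ("V" or "A") that keeps both streams level."""
    order = []
    video_time = audio_time = Fraction(0)
    v = a = 0
    while v < len(video) or a < len(audio):
        # Send video when audio is exhausted or video is not ahead.
        send_video = a >= len(audio) or (
            v < len(video) and video_time <= audio_time)
        if send_video:
            video_time += video[v]
            v += 1
            order.append("V")
        else:
            audio_time += audio[a]
            a += 1
            order.append("A")
    return order
```

Keeping the two running totals close together is what limits the difference in buffer fullness at the receiver, which is the overflow/underflow concern raised on the multiplexing slide.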
De-multiplexing
  • The transport stream (TS) input to a receiver is separated into a video elementary stream and an audio elementary stream.
  • These elementary streams are initially written into video and audio buffers, respectively.
  • Once one of the buffers is full, the elementary stream is reconstructed from the point of synchronization.
Audio-video synchronization
  • Once the video buffer is full, it is searched for the next IDR frame.
  • The corresponding audio frame is calculated from the equation
  • The elementary streams are reconstructed from that point.
  • The reconstructed streams are merged into a container format (using mkvmerge) and then played back.
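The equation for the corresponding audio frame is not preserved in the transcript. A minimal sketch of the idea, assuming frame numbers start at 0 and act as time stamps, is to pick the audio frame whose presentation time lies closest to that of the IDR frame. The function name `matching_audio_frame` and the `samples_per_frame` parameter are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def matching_audio_frame(idr_frame_number: int, fps: int,
                         sampling_frequency: int,
                         samples_per_frame: int) -> Tuple[int, Fraction]:
    """Return (audio frame number, skew in seconds) for an IDR frame.

    A positive skew means the audio frame starts after the video frame.
    """
    video_time = Fraction(idr_frame_number, fps)
    audio_frame_duration = Fraction(samples_per_frame, sampling_frequency)
    audio_frame = round(video_time / audio_frame_duration)
    skew = audio_frame * audio_frame_duration - video_time
    return audio_frame, skew


# Illustrative values only: IDR at video frame 48, 24 fps video,
# 24 kHz audio, and an assumed 1024 samples per audio frame.
frame, skew = matching_audio_frame(48, 24, 24000, 1024)
print(frame, float(skew) * 1000, "ms")
```

With nearest-frame selection the skew is bounded by half an audio frame duration, so the achievable lip-sync accuracy depends directly on the audio frame length.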

Test conditions:
  • Video
    • H.264 baseline profile
    • Resolution: 416x240
    • GOP: IPPP (IDR forced)
    • Frame rate: 24 fps
  • Audio
    • HEAACv2
    • ADTS format
    • Sampling frequency: 24,000 Hz
  • Buffer fullness was handled effectively: the maximum difference observed between the buffers was around 20 ms of media content.
  • Audio-video synchronization was achieved with a maximum skew of 13 ms.
Future work
  • Expand the multiplexing algorithm to multiplex multiple programs
  • Implement the same multiplexing algorithm for other transport protocols like RTP/IP
  • Add error correction to TS stream.
References
  • [1] MPEG-4: ISO/IEC JTC1/SC29 14496-10: Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding, ISO/IEC, 2005.
  • [2] MPEG-4: ISO/IEC JTC1/SC29 14496-3: Information technology - Coding of audio-visual objects - Part 3: Audio, Amendment 4: Audio Lossless Coding (ALS), new audio profiles and BSAC extensions.
  • [3] MPEG-2: ISO/IEC JTC1/SC29 13818-7, Advanced Audio Coding (AAC), International Standard IS WG11, 1997.
  • [4] MPEG-2: ISO/IEC 13818-1: Information technology - Generic coding of moving pictures and associated audio - Part 1: Systems, ISO/IEC, 2005.
  • [5] Soon-kak Kwon et al., “Overview of H.264 / MPEG-4 Part 10 (pp. 186-216)”, Special issue on “Emerging H.264/AVC video coding standard”, J. Visual Communication and Image Representation, vol. 17, pp. 183-552, April 2006.
  • [6] A. Puri et al., “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.
  • [7] MPEG-4 HE-AAC v2 - audio coding for today's digital media world, article in the EBU Technical Review (01/2006) giving explanations on HE-AAC. Link:
  • [8] ETSI TS 101 154, “Implementation guidelines for the use of video and audio coding in broadcasting applications based on the MPEG-2 transport stream”.
  • [9] 3GPP TS 26.401: General audio codec audio processing functions; Enhanced aacPlus general audio codec; 2009.
  • [10] 3GPP TS 26.403: Enhanced aacPlus general audio codec; Encoder specification AAC part.
  • [11] 3GPP TS 26.404: Enhanced aacPlus general audio codec; Encoder specification SBR part.
  • [12] 3GPP TS 26.405: Enhanced aacPlus general audio codec; Encoder specification Parametric Stereo part.


  • [14] MPEG Transport Stream. Link:
  • [15] MPEG-4: ISO/IEC JTC1/SC29 14496-14: Information technology - Coding of audio-visual objects - Part 14: MP4 file format, 2003.
  • [16] DVB-H: Global mobile TV. Link:
  • [17] ATSC-M/H. Link:
  • [18] Open Mobile Video Coalition. Link:
  • [19] VC-1 Compressed Video Bitstream Format and Decoding Process (SMPTE 421M-2006), SMPTE Standard, 2006.
  • [20] Henning Schulzrinne's RTP page. Link:
  • [21] G. A. Davidson et al., “ATSC video and audio coding”, Proc. IEEE, vol. 94, pp. 60-76, Jan. 2006.
  • [22] I. E. G. Richardson, “H.264 and MPEG-4 video compression: video coding for next-generation multimedia”, Wiley, 2003.
  • [23] European Broadcasting Union,
  • [24] Shintaro Ueda et al., “NAL level stream authentication for H.264/AVC”, IPSJ Digital Courier, vol. 3, Feb. 2007.
  • [25] World DMB. Link:
  • [26] ISDB website. Link:
  • [27] 3GPP website. Link:
  • [28] Mihir Modi, “Audio compression gets better and more complex”. Link:
  • [29] P. A. Sarginson, “MPEG-2: Overview of systems layer”. Link:
  • [30] MPEG-2 ISO/IEC 13818-1: Generic coding of moving pictures and audio: Part 1 - Systems, Amendment 3: Transport of AVC video data over ITU-T Rec. H.222.0 | ISO/IEC 13818-1 streams, 2003.
  • [31] MKV merge software. Link:
  • [32] VLC media player. Link:
  • [33] Gom media player. Link:
  • [34] H. Murugan, “Multiplexing H264 video bit-stream with AAC audio bit-stream, demultiplexing and achieving lip sync during playback”, M.S.E.E. Thesis, University of Texas at Arlington, TX, May 2007.
  • [34] Gerold Blakowski, “A Media Synchronization Survey: Reference Model, Specification, and Case Studies”, IEEE Journal on Selected Areas in Communications, vol. 14, no. 1, January 1996.
  • [35] H.264/AVC JM software. Link:
  • [36] 3GPP Enhanced aacPlus reference software. Link:
  • [37] H.264 bitstream. Link: