Multiplexing H.264 and HEAACv2 elementary streams, de-multiplexing and achieving lip synchronization during playback

Naveen Siddaraju

naveen.siddaraju@mavs.uta.edu

Contents:
  • Introduction : Need for multiplexing
  • Overview of codecs used
  • Transport protocols
  • Multiplexing
  • De-multiplexing and synchronization
  • Results
  • Conclusions
  • Future work
  • References
Introduction: need for multiplexing
  • Digital television broadcasting
    • ATSC-M/H [17]
    • DVB-H
    • DVB-T
  • Internet streaming
    • IPTV, YouTube, etc.
Choice of CODECs
  • Depends on the application.
    • Transport bandwidth
      • ATSC-M/H channel bandwidth: 19.6 Mbps
      • DVB-H channel bandwidth: 14 Mbps
    • Processing power of the target device
H.264/AVC
  • Defined in MPEG-4 Part 10.
  • Jointly developed by the ITU-T VCEG and the ISO/IEC MPEG group.
  • Provides better compression than its predecessors, such as MPEG-2 video and MPEG-4 Part 2.
  • Suitable for a wide variety of applications.
  • Adopted as a standard in ATSC-M/H, DVB, etc.
  • Used in Blu-ray discs, HD DVD, iTunes, Flash Player, video conferencing applications, etc.
Frame types
  • Three basic types
    • Intra-predictive (I) frame
    • Predictive (P) frame
    • Bi-predictive (B) frame
  • An IDR frame is a special type of I frame.
    • It indicates the start of a video sequence.

Bitstream syntax of H.264
  • Data is organized into two layers
    • VCL (video coding layer)
    • NAL (network abstraction layer)
  • NAL formatting of VCL and non-VCL data [6]
  • Forbidden bit - 1 bit
  • NRI - 2 bits
  • Type - 5 bits

NAL unit format [6]
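The three fields listed above share the first byte of every NAL unit. A minimal sketch of unpacking that byte is given here; the class and function names are illustrative assumptions, not part of the presentation.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    forbidden_bit: int   # 1 bit, expected to be 0 in a valid stream
    nri: int             # 2 bits, reference importance of the unit
    nal_unit_type: int   # 5 bits, identifies what the unit carries


def parse_nal_header(first_byte: int) -> NalHeader:
    """Split the first byte of a NAL unit into its three fields."""
    return NalHeader(
        forbidden_bit=(first_byte >> 7) & 0x01,
        nri=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )


# 0x65 is 0b0_11_00101: forbidden bit 0, NRI 3, type 5
print(parse_nal_header(0x65))
```

In H.264, type 5 marks a slice of an IDR picture, type 7 a sequence parameter set and type 8 a picture parameter set.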

Important NAL unit types
  • IDR frames
    • Indicate the start of a new video sequence.
  • Sequence parameter sets (SPS)
    • Contain parameters common to the entire sequence.
    • Examples: profile, level, size of the video, number of reference frames.
  • Picture parameter sets (PPS)
    • Contain parameters that apply to a frame or to some frames in a sequence.
    • Examples: entropy coding mode, quantization parameters, etc.
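These unit types can be located in an encoded file by scanning for start codes. The sketch below assumes the video elementary stream is stored in the H.264 Annex B byte-stream format (NAL units separated by 0x000001 start codes), which the slides do not state; the function names are hypothetical.

```python
from typing import Iterator, Tuple

# Type codes from the H.264 specification
NAL_TYPE_NAMES = {1: "non-IDR slice", 5: "IDR slice", 7: "SPS", 8: "PPS"}

START_CODE = b"\x00\x00\x01"  # also matches the tail of the 4-byte form


def iter_nal_units(stream: bytes) -> Iterator[Tuple[int, int]]:
    """Yield (offset of the NAL header byte, nal_unit_type) for each unit."""
    pos = stream.find(START_CODE)
    while pos != -1:
        header_at = pos + len(START_CODE)
        if header_at >= len(stream):
            break  # start code at the very end, no header byte follows
        yield header_at, stream[header_at] & 0x1F
        pos = stream.find(START_CODE, header_at)


def first_idr_offset(stream: bytes) -> int:
    """Return the offset of the first IDR slice header byte, or -1."""
    for offset, nal_type in iter_nal_units(stream):
        if nal_type == 5:
            return offset
    return -1


sample = (b"\x00\x00\x00\x01\x67\xAA"   # SPS
          b"\x00\x00\x00\x01\x68\xBB"   # PPS
          b"\x00\x00\x01\x65\xCC")      # IDR slice
for offset, nal_type in iter_nal_units(sample):
    print(offset, NAL_TYPE_NAMES.get(nal_type, "other"))
```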

HEAACv2
  • Also called Enhanced aacPlus.
  • Developed by Coding Technologies for very low bitrate applications.
  • Defined in MPEG-4 Part 3, Amendment 2.
  • Enables coding in mono, stereo and multichannel configurations (up to 48 channels).
  • Is a combination of AAC, SBR and PS.
  • Targets high perceived audio quality at very low bitrates.
  • Adopted as the audio standard in ATSC-M/H, DVB and XM satellite radio.
  • Can be stored in a variety of file formats, such as MP4 and M4A.
  • Controlled testing conducted by 3GPP [27] indicates that HEAACv2 provides good quality audio at 24 kbps.
AAC (Advanced Audio Coding)
  • Successor of the MP3 format.
  • Defined in both MPEG-2 [3] and MPEG-4 [2].
  • Achieves better sound quality than MP3 at the same bitrate.
  • AAC is also the standard audio format for the Apple iPhone, iPod and iPad, the Sony PlayStation, etc.
  • Supports up to 48 channels (MP3 supports up to two channels in MPEG-1 mode and up to 5.1 channels in MPEG-2 mode).
  • Supports more sampling frequencies (8 to 96 kHz) than MP3 (16 to 48 kHz).
  • Achieves good quality audio at 128 kbps for stereo.
SBR (spectral band replication) [2]
  • SBR is a bandwidth extension technique.
  • Exploits the correlation between the high and low frequencies of the audio signal.
  • Using SBR along with AAC, high quality stereo sound can be achieved at 48 kbps.
High band reconstruction through SBR [28]
  • Figure: original audio signal [28].
  • Figure: high band reconstruction through SBR [28].
PS (parametric stereo) [2]
  • Only used for low bitrate applications (< 32 kbps).
  • Parameterizes the stereo image using cues such as inter-channel time/phase differences and inter-channel intensity differences.
  • Only a monaural version of the stereo signal is encoded by the AAC encoder.
  • At the decoder, the monaural signal is decoded first, and the stereo signal is then reconstructed using the PS parameters.
  • Using PS along with AAC and SBR, reasonable quality stereo sound can be achieved at 24 kbps.
HEAACv2 bitstream formats
  • ADIF (audio data interchange format)
    • Has just one header for the whole stream.
    • Used in storage media.
  • ADTS (audio data transport stream)
    • Used in transport streams.
    • Has a header in every access unit.
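Because every ADTS access unit carries its own header, a stream can be cut into frames by reading the frame length from each header. The following is a minimal sketch using the fixed 7-byte ADTS header layout from the AAC specifications; the names are illustrative, and CRC handling is ignored.

```python
from typing import Iterator, NamedTuple

# sampling_frequency_index -> Hz, as tabulated for AAC
SAMPLING_FREQUENCIES = [96000, 88200, 64000, 48000, 44100, 32000,
                        24000, 22050, 16000, 12000, 11025, 8000, 7350]
ADTS_HEADER_SIZE = 7  # bytes, without the optional CRC


class AdtsHeader(NamedTuple):
    sampling_frequency: int     # Hz
    channel_configuration: int
    frame_length: int           # header plus payload, in bytes


def parse_adts_header(buf: bytes, pos: int = 0) -> AdtsHeader:
    b = buf[pos:pos + ADTS_HEADER_SIZE]
    # The header starts with a 12-bit syncword of all ones.
    if len(b) < ADTS_HEADER_SIZE or b[0] != 0xFF or (b[1] & 0xF0) != 0xF0:
        raise ValueError("no ADTS syncword at offset %d" % pos)
    sf_index = (b[2] >> 2) & 0x0F
    if sf_index >= len(SAMPLING_FREQUENCIES):
        raise ValueError("reserved sampling frequency index %d" % sf_index)
    channels = ((b[2] & 0x01) << 2) | (b[3] >> 6)
    frame_length = ((b[3] & 0x03) << 11) | (b[4] << 3) | (b[5] >> 5)
    return AdtsHeader(SAMPLING_FREQUENCIES[sf_index], channels, frame_length)


def iter_adts_frames(buf: bytes) -> Iterator[bytes]:
    """Yield each access unit (header included) from an ADTS byte string."""
    pos = 0
    while pos + ADTS_HEADER_SIZE <= len(buf):
        header = parse_adts_header(buf, pos)
        if header.frame_length < ADTS_HEADER_SIZE:
            raise ValueError("corrupt frame length at offset %d" % pos)
        yield buf[pos:pos + header.frame_length]
        pos += header.frame_length
```

Each yielded frame is one access unit, which is the unit the packetization stage operates on.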

Transport protocols
  • Most multimedia applications involve communication channels or storage.
    • RTP (Real-time Transport Protocol)
      • Transport over IP networks
    • MPEG-2 systems
      • Digital television broadcast
      • Storage (asset management)

MPEG-2 systems
  • Defines two types of streams
    • Program stream (PS): used for storage, e.g. DVD
    • Transport stream (TS): used for digital broadcast
  • Two layers of packetization
    • PES (packetized elementary stream)
    • TS (transport stream)

PES (packetized elementary stream)
  • First layer of packetization
  • Separates the audio and video elementary streams into access units.
  • Variable length
  • Contains a header and payload (frame) data.
  • Adds fields such as time stamp, stream ID and packet length.
Frame number as time stamp
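  • For video, the frame rate (fps) is constant throughout the sequence.
  • For audio, the sampling frequency is constant throughout the sequence.
  • The presentation time of a frame can therefore be derived from its frame number, so the frame number is used as the time stamp.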
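With a constant frame rate and a constant sampling frequency, a frame number converts directly to a playback time. A small sketch of that conversion follows; the function names are hypothetical, and the default of 1024 samples per audio frame is the AAC frame size, assumed here because the slides do not state it.

```python
from fractions import Fraction


def video_presentation_time(frame_number: int, fps: int) -> Fraction:
    """Playback time in seconds of a video frame at a constant frame rate."""
    return Fraction(frame_number, fps)


def audio_presentation_time(frame_number: int, sampling_frequency: int,
                            samples_per_frame: int = 1024) -> Fraction:
    """Playback time in seconds of an audio frame.

    samples_per_frame=1024 is an assumption (the AAC frame size).
    """
    return Fraction(frame_number * samples_per_frame, sampling_frequency)


# Video frame 48 at 24 fps and audio frame 375 at 24 kHz
print(float(video_presentation_time(48, 24)))
print(float(audio_presentation_time(375, 24000)))
```

Exact fractions are used so that times from the two streams can be compared without rounding error.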
TS packets
  • Second layer of packetization
  • Fixed length (188 bytes)
  • Each PES packet is logically broken down into 188-byte packets.
  • The three-byte header contains the packet ID, payload unit start flag, continuity counter, etc.
TS header description:
  • Payload unit start indicator (PUSI) flag
    • Indicates that the payload contains a PES header.
  • Adaptation field control (AFC) flag
    • Indicates that the payload is shorter than 185 bytes.
  • Continuity counter (CC) (4 bits)
    • 4-bit counter used to detect packet losses, out-of-sequence packets, etc.
  • Packet ID (PID) (10 bits)
    • Uniquely identifies the elementary stream (ES) that the packet belongs to.
  • Optional offset byte
    • Contains the offset value if AFC is set.
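The fields above fit in three bytes if a sync byte is included (8 + 1 + 1 + 4 + 10 = 24 bits). The sketch below splits a PES packet into 188-byte TS packets under several assumptions that the slides do not confirm: the field order (sync byte, PUSI, AFC, CC, PID), the 0x47 sync value, and the meaning of the offset byte, taken here as the number of stuffing bytes placed before a short payload. All names are hypothetical.

```python
import struct
from typing import List, Tuple

TS_PACKET_SIZE = 188
HEADER_SIZE = 3
PAYLOAD_SIZE = TS_PACKET_SIZE - HEADER_SIZE  # 185 bytes
SYNC_BYTE = 0x47  # assumed to be part of the three-byte header


def pack_header(pusi: bool, afc: bool, cc: int, pid: int) -> bytes:
    # Assumed bit order after the sync byte: PUSI, AFC, CC (4), PID (10)
    if not 0 <= pid < (1 << 10):
        raise ValueError("PID must fit in 10 bits")
    flags = (int(pusi) << 15) | (int(afc) << 14) | ((cc & 0x0F) << 10) | pid
    return struct.pack(">BH", SYNC_BYTE, flags)


def unpack_header(packet: bytes) -> Tuple[bool, bool, int, int]:
    sync, flags = struct.unpack(">BH", packet[:HEADER_SIZE])
    if sync != SYNC_BYTE:
        raise ValueError("lost sync")
    return (bool(flags >> 15), bool((flags >> 14) & 1),
            (flags >> 10) & 0x0F, flags & 0x3FF)


def packetize(pes: bytes, pid: int, first_cc: int = 0) -> List[bytes]:
    """Split one PES packet into fixed-length TS packets."""
    packets = []
    for i, start in enumerate(range(0, len(pes), PAYLOAD_SIZE)):
        chunk = pes[start:start + PAYLOAD_SIZE]
        short = len(chunk) < PAYLOAD_SIZE
        header = pack_header(start == 0, short, first_cc + i, pid)
        if short:
            # Assumed offset byte: count of stuffing bytes before the payload
            stuffing = PAYLOAD_SIZE - 1 - len(chunk)
            body = bytes([stuffing]) + b"\xFF" * stuffing + chunk
        else:
            body = chunk
        packets.append(header + body)
    return packets


def payload_of(packet: bytes) -> bytes:
    """Recover the PES bytes carried by one TS packet."""
    _, afc, _, _ = unpack_header(packet)
    body = packet[HEADER_SIZE:]
    return body[1 + body[0]:] if afc else body
```

Joining `payload_of(p)` over the packets of one PID, in continuity-counter order, gives back the original PES packet.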

Multiplexing
  • What is multiplexing?
  • Multiplexing is the process of transmitting TS packets belonging to different elementary streams over a single transport stream.
  • Effective multiplexing interleaves the TS packets in the transport stream so that the audio and video content is transmitted simultaneously.
  • Buffer overflow/underflow
    • Can cause picture loss or skips during audio-video playback.

Calculation of presentation time of a TS packet:
  • For video TS packet
  • For audio TS packet
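One way to interleave by presentation time is sketched below. It assumes that each frame's duration is spread evenly over the TS packets that carry it and that the packet with the earliest time is sent next. That per-packet rule, the function names and the 1024-sample audio frame are assumptions for illustration, not the presentation's own equations.

```python
import heapq
from fractions import Fraction
from typing import Iterator, List, Tuple


def packet_times(packets_per_frame: List[int], frame_duration: Fraction,
                 stream: str) -> Iterator[Tuple[Fraction, str, int]]:
    """Yield (presentation time, stream name, packet index) in time order."""
    seq = 0
    for frame_number, n_packets in enumerate(packets_per_frame):
        for k in range(n_packets):
            # Packet k of a frame starts k/n of the way through that frame.
            t = (frame_number + Fraction(k, n_packets)) * frame_duration
            yield t, stream, seq
            seq += 1


def interleave(video_packets_per_frame: List[int],
               audio_packets_per_frame: List[int],
               fps: int, sampling_frequency: int,
               samples_per_frame: int = 1024) -> List[Tuple[str, int]]:
    """Return the transmission order as (stream name, packet index) pairs."""
    video = packet_times(video_packets_per_frame, Fraction(1, fps), "video")
    audio = packet_times(audio_packets_per_frame,
                         Fraction(samples_per_frame, sampling_frequency),
                         "audio")
    # heapq.merge lazily merges two already time-ordered sequences.
    return [(stream, seq) for _, stream, seq in heapq.merge(video, audio)]


# Two video frames of 4 and 2 packets, two audio frames of 1 packet each
for stream, seq in interleave([4, 2], [1, 1], fps=24, sampling_frequency=24000):
    print(stream, seq)
```

Sending packets in this order keeps the amount of media time delivered for each stream close together, which is what limits the difference in buffer fullness at the receiver.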
De-multiplexing
  • The transport stream (TS) input to a receiver is separated into a video elementary stream and an audio elementary stream.
  • These elementary streams are initially written into video and audio buffers, respectively.
  • Once one of the buffers is full, the elementary stream is reconstructed from the point of synchronization.
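The buffering step can be sketched as routing payloads by PID until one buffer reaches its capacity. The representation of packets as (pid, payload) pairs, the byte-based capacity and the function name are assumptions for illustration.

```python
from typing import Dict, Iterable, Tuple


def demultiplex(packets: Iterable[Tuple[int, bytes]], video_pid: int,
                audio_pid: int, buffer_capacity: int) -> Dict[str, bytearray]:
    """Route payloads into per-stream buffers until one of them is full."""
    names = {video_pid: "video", audio_pid: "audio"}
    buffers = {"video": bytearray(), "audio": bytearray()}
    for pid, payload in packets:
        name = names.get(pid)
        if name is None:
            continue  # packet belongs to a stream we are not decoding
        buffers[name] += payload
        if len(buffers[name]) >= buffer_capacity:
            break  # one buffer is full: reconstruction can begin
    return buffers


packets = [(1, b"vv"), (2, b"a"), (9, b"x"), (1, b"vv"), (2, b"a")]
print(demultiplex(packets, video_pid=1, audio_pid=2, buffer_capacity=4))
```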
Audio-video synchronization
  • Once the video buffer is full, it is searched for the next IDR frame.
  • The audio frame corresponding to that IDR frame is then calculated.
  • The elementary streams are reconstructed from that point, merged into a container format (using MKV merge) and then played back.
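A plausible form of the audio frame calculation is to convert the IDR frame number to a playback time and pick the audio frame whose start time is nearest. This is a sketch derived from the constant frame rate and sampling frequency, not the presentation's own equation; the names and the 1024-sample audio frame are assumptions.

```python
from fractions import Fraction


def matching_audio_frame(video_frame_number: int, fps: int,
                         sampling_frequency: int,
                         samples_per_frame: int = 1024) -> int:
    """Audio frame whose start time is closest to the given video frame."""
    video_time = Fraction(video_frame_number, fps)
    audio_frame_duration = Fraction(samples_per_frame, sampling_frequency)
    return round(video_time / audio_frame_duration)


def skew_ms(video_frame_number: int, fps: int, sampling_frequency: int,
            samples_per_frame: int = 1024) -> float:
    """Residual audio-minus-video offset in milliseconds after matching."""
    audio_frame = matching_audio_frame(video_frame_number, fps,
                                       sampling_frequency, samples_per_frame)
    audio_time = Fraction(audio_frame * samples_per_frame, sampling_frequency)
    video_time = Fraction(video_frame_number, fps)
    return float((audio_time - video_time) * 1000)


# IDR at video frame 48, 24 fps video, 24 kHz audio
print(matching_audio_frame(48, 24, 24000), skew_ms(48, 24, 24000))
```

Rounding to the nearest frame keeps the residual offset within half an audio frame duration under these assumptions.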

Test conditions:
  • Video
    • H.264 baseline profile
    • Resolution: 416x240
    • GOP: IPPP (IDR forced)
    • Frame rate: 24 fps
  • Audio
    • HEAACv2
    • ADTS format
    • Sampling frequency: 24,000 Hz
Conclusions
  • Buffer fullness was handled effectively: the maximum buffer difference observed was around 20 ms of media content.
  • Audio-video synchronization was achieved with a maximum skew of 13 ms.
Future work
  • Extend the multiplexing algorithm to multiplex multiple programs.
  • Implement the same multiplexing algorithm for other transport protocols, such as RTP/IP.
  • Add error correction to the transport stream.
References:
  • [1] MPEG-4: ISO/IEC JTC1/SC29 14496-10: Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding, ISO/IEC, 2005.
  • [2] MPEG-4: ISO/IEC JTC1/SC29 14496-3: Information technology - Coding of audio-visual objects - Part 3: Audio, Amendment 4: Audio Lossless Coding (ALS), new audio profiles and BSAC extensions.
  • [3] MPEG-2: ISO/IEC JTC1/SC29 13818-7, Advanced Audio Coding (AAC), International Standard IS WG11, 1997.
  • [4] MPEG-2: ISO/IEC 13818-1: Information technology - Generic coding of moving pictures and associated audio - Part 1: Systems, ISO/IEC, 2005.
  • [5] Soon-kak Kwon et al., “Overview of H.264 / MPEG-4 Part 10 (pp. 186-216)”, Special issue on “Emerging H.264/AVC video coding standard”, J. Visual Communication and Image Representation, vol. 17, pp. 183-552, April 2006.
  • [6] A. Puri et al., “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.
  • [7] “MPEG-4 HE-AAC v2 - audio coding for today's digital media world”, article in the EBU Technical Review (01/2006) giving explanations on HE-AAC. Link: http://tech.ebu.ch/docs/techreview/trev_305-moser.pdf
  • [8] ETSI TS 101 154, “Implementation guidelines for the use of video and audio coding in broadcasting applications based on the MPEG-2 transport stream”.
  • [9] 3GPP TS 26.401: General audio codec audio processing functions; Enhanced aacPlus general audio codec; 2009.
  • [10] 3GPP TS 26.403: Enhanced aacPlus general audio codec; Encoder specification AAC part.
  • [11] 3GPP TS 26.404: Enhanced aacPlus general audio codec; Encoder specification SBR part.
  • [12] 3GPP TS 26.405: Enhanced aacPlus general audio codec; Encoder specification Parametric Stereo part.
  • [13] http://www.jeroenbreebaart.com/papers/aes/aes116_2.pdf
  • [14] MPEG Transport Stream. Link: http://www.iptvdictionary.com/iptv_dictionary_MPEG_Transport_Stream_TS_definition.html
  • [15] MPEG-4: ISO/IEC JTC1/SC29 14496-14: Information technology - Coding of audio-visual objects - Part 14: MP4 file format, 2003.
  • [16] DVB-H: Global mobile TV. Link: http://www.dvb-h.org/
  • [17] ATSC-M/H. Link: http://www.atsc.org/cms/
  • [18] Open Mobile Video Coalition. Link: http://www.openmobilevideo.com/about-mobile-dtv/standards/
  • [19] VC-1 Compressed Video Bitstream Format and Decoding Process (SMPTE 421M-2006), SMPTE Standard, 2006 (http://store.smpte.org/category-s/1.htm).
  • [20] Henning Schulzrinne's RTP page. Link: http://www.cs.columbia.edu/~hgs/rtp/
  • [21] G. A. Davidson et al., “ATSC video and audio coding”, Proc. IEEE, vol. 94, pp. 60-76, Jan. 2006 (www.atsc.org).
  • [22] I. E. G. Richardson, “H.264 and MPEG-4 video compression: video coding for next-generation multimedia”, Wiley, 2003.
  • [23] European Broadcasting Union. Link: http://www.ebu.ch/
  • [24] Shintaro Ueda et al., “NAL level stream authentication for H.264/AVC”, IPSJ Digital Courier, vol. 3, Feb. 2007.
  • [25] World DMB. Link: http://www.worlddab.org/
  • [26] ISDB website. Link: http://www.dibeg.org/
  • [27] 3GPP website. Link: http://www.3gpp.org/
  • [28] Mihir Modi, “Audio compression gets better and more complex”. Link: http://www.eetimes.com/discussion/other/4025543/Audio-compression-gets-better-and-more-complex
  • [29] P. A. Sarginson, “MPEG-2: Overview of systems layer”. Link: http://downloads.bbc.co.uk/rd/pubs/reports/1996-02.pdf
  • [30] MPEG-2 ISO/IEC 13818-1: Generic coding of moving pictures and audio: Part 1 - Systems, Amendment 3: Transport of AVC video data over ITU-T Rec. H.222.0 | ISO/IEC 13818-1 streams, 2003.
  • [31] MKV merge software. Link: http://www.matroska.org/
  • [32] VLC media player. Link: http://www.videolan.org/
  • [33] Gom media player. Link: http://www.gomlab.com/
  • [34] H. Murugan, “Multiplexing H264 video bit-stream with AAC audio bit-stream, demultiplexing and achieving lip sync during playback”, M.S.E.E. Thesis, University of Texas at Arlington, TX, May 2007.
  • [34] Gerold Blakowski et al., “A Media Synchronization Survey: Reference Model, Specification, and Case Studies”, IEEE Journal on Selected Areas in Communications, vol. 14, no. 1, Jan. 1996.
  • [35] H.264/AVC JM Software. Link: http://iphome.hhi.de/suehring/tml/download/
  • [36] 3GPP Enhanced aacPlus reference software. Link: http://www.3gpp.org/ftp/
  • [37] H.264 bitstream. Link: http://sosori.com/