Multiplexing H.264 and HEAACv2 elementary streams, de-multiplexing and achieving lip synchronization during playback Naveen Siddaraju


Contents:  Introduction : Need for multiplexing  Overview of codecs used  Transport protocols  Multiplexing  De-multiplexing and synchronization  Results  Conclusions  Future work  References

Introduction: need for multiplexing  Digital television broadcasting  ATSC-M/H [17]  DVB-H  DVB-T  Internet streaming  IPTV, YouTube etc.

MPEG transport system [17]

Choice of CODECs  Depends on the application.  Transport bandwidth - ATSC-M/H channel bandwidth: 19.6 Mbps - DVB-H channel bandwidth: 14 Mbps  Processing power of the target device

H.264/AVC  Defined in MPEG-4 Part 10  Jointly developed by the ITU-T VCEG and the MPEG group of ISO/IEC.  Provides better compression than its predecessors, such as MPEG-2 video and MPEG-4 Part 2.  Suitable for a wide variety of applications.  Adopted standard in ATSC-M/H, DVB etc.  Used in Blu-ray discs, DVDs, iTunes, Flash Player, video conferencing applications etc.

Different profiles of H.264[5]

Frame types  Three basic types  Intra predictive (I) frame  Predictive (P) frame  Bi-predictive (B) frame  An IDR frame is a special type of I frame that indicates the start of a new video sequence.

Bitstream syntax of H.264  Data is organized into two layers  VCL (video coding layer)  NAL (network abstraction layer)  NAL formatting of VCL and non-VCL data [6]

NAL unit format [6]  Forbidden bit - 1 bit  NRI - 2 bits  Type - 5 bits
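The one-byte NAL unit header described above can be unpacked with a few shifts and masks. This is a minimal sketch; the function name `parse_nal_header` and the dictionary keys are illustrative choices, not taken from the slides.

```python
def parse_nal_header(first_byte: int) -> dict:
    """Split the first byte of a NAL unit into its three header fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    return {
        # Bit 7: forbidden bit, must be 0 in a valid stream.
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,
        # Bits 6-5: NRI (nal_ref_idc), 0 means not used for reference.
        "nal_ref_idc": (first_byte >> 5) & 0x03,
        # Bits 4-0: NAL unit type.
        "nal_unit_type": first_byte & 0x1F,
    }


if __name__ == "__main__":
    print(parse_nal_header(0x65))
```

For example, the byte 0x65 has a forbidden bit of 0, an NRI of 3 and a type of 5.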

NAL unit types [1]

Important NAL unit types  IDR frames - indicate the start of a new video sequence  Sequence parameter sets (SPS) - contain parameters common to the entire sequence - profile, level, size of the video, number of reference frames  Picture parameter sets (PPS) - contain parameters that apply to a frame or to some frames in a sequence - entropy coding, quantization parameters etc.
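A hedged sketch of how these NAL unit types could be located in an H.264 elementary stream. It assumes the stream is in Annex B byte-stream form, where each NAL unit is preceded by a 0x000001 start code, which the slides do not state explicitly. The type numbers 5 (IDR slice), 7 (SPS) and 8 (PPS) come from the H.264 standard; the function names are illustrative.

```python
NAL_TYPE_NAMES = {5: "IDR", 7: "SPS", 8: "PPS"}


def split_annexb(data: bytes):
    """Yield the NAL units found between 0x000001 start codes."""
    starts = []
    pos = data.find(b"\x00\x00\x01")
    while pos >= 0:
        starts.append(pos + 3)  # first byte after the start code
        pos = data.find(b"\x00\x00\x01", pos + 3)
    for k, begin in enumerate(starts):
        end = starts[k + 1] - 3 if k + 1 < len(starts) else len(data)
        # Trailing zero bytes belong to the next (4-byte) start code.
        yield data[begin:end].rstrip(b"\x00")


def list_nal_types(data: bytes):
    """Return the NAL unit type of every non-empty NAL unit in the stream."""
    return [nal[0] & 0x1F for nal in split_annexb(data) if nal]


if __name__ == "__main__":
    stream = (b"\x00\x00\x00\x01\x67\xaa"
              b"\x00\x00\x01\x68\xbb"
              b"\x00\x00\x00\x01\x65\xcc")
    for t in list_nal_types(stream):
        print(t, NAL_TYPE_NAMES.get(t, "other"))
```

Scanning for type 5 in this way is one possible way to find the IDR frames that mark the start of a video sequence.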

H.264 stream [37]

HEAACv2  Also called Enhanced aacPlus  Developed by Coding Technologies for very low bitrate applications.  Defined in MPEG-4 Part 3 Amendment 2  Enables coding in mono, stereo and multichannel (up to 48 channels)  Is a combination of AAC, SBR and PS  Provides highest perceptible quality for the lowest bitrate  Adopted as audio standard in ATSC-M/H, DVB, XM satellite radio  Can exist in a variety of file formats like mp4, m4a.  Controlled testing conducted by 3GPP [27] indicates that HEAACv2 provides good quality audio at 24 kbps.

HEAACv2 family of codecs [7]

AAC (advanced audio coding)  Successor of the MP3 format  Defined both in MPEG-2 [3] and MPEG-4 [2]  Achieves better sound quality than MP3 at the same bitrate.  AAC is also the standard audio format for Apple iPhone, iPod, iPad, Sony PlayStation etc.  Up to 48 channels (MP3 supports up to two channels in MPEG-1 mode and up to 5.1 channels in MPEG-2 mode)  More sampling frequencies (from 8 to 96 kHz) than MP3 (16 to 48 kHz)  Achieves good quality audio at 128 kbps for stereo.

SBR (spectral band replication) [2]  SBR is a bandwidth expansion technique  Exploits the correlation between the high and low frequencies.  Using SBR, along with AAC, high quality stereo sound can be achieved at 48 kbps.

High band reconstruction through SBR [28]  Original audio signal [28].  High band reconstruction through SBR [28].

PS (parametric stereo) [2]  Only used for low bitrate applications (< 32 kbps)  Parameterizes the stereo image, such as time/phase differences, inter-channel intensity differences etc.  Only a monaural version of the stereo signal is encoded by the AAC encoder.  At the decoder side the monaural signal is decoded first, and then the stereo signal is reconstructed using the PS parameters.  Using PS along with AAC and SBR, reasonable quality stereo sound can be achieved at 24 kbps.

HEAACv2 bitstream formats  ADIF (audio data interchange format) - has just one header for the whole stream - used in storage media.  ADTS (audio data transport stream) - used in transport stream. - has headers in every access unit.

ADTS header format[2][3]

Profile bits expansion [2] [3]

ADTS bit stream [3]
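Because ADTS places a header in front of every access unit, a de-multiplexer can recover frame boundaries from the header alone. The sketch below reads the fixed 7-byte ADTS header as laid out in the MPEG-2/MPEG-4 AAC specifications (12-bit syncword, 2-bit profile, 4-bit sampling frequency index, 3-bit channel configuration, 13-bit frame length). The function name and returned keys are illustrative, and CRC handling is omitted.

```python
# Sampling frequency table indexed by the 4-bit sampling_frequency_index.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350]


def parse_adts_header(h: bytes) -> dict:
    """Decode the main fields of a 7-byte ADTS header."""
    if len(h) < 7:
        raise ValueError("an ADTS header needs at least 7 bytes")
    if h[0] != 0xFF or (h[1] & 0xF0) != 0xF0:
        raise ValueError("ADTS syncword 0xFFF not found")
    sf_index = (h[2] >> 2) & 0x0F
    if sf_index >= len(SAMPLE_RATES):
        raise ValueError("reserved sampling frequency index")
    return {
        # ID bit: 1 means MPEG-2 AAC, 0 means MPEG-4 AAC.
        "mpeg_version": 2 if (h[1] >> 3) & 0x01 else 4,
        "protection_absent": h[1] & 0x01,
        # 2-bit profile field (1 = AAC LC).
        "profile": h[2] >> 6,
        "sampling_frequency": SAMPLE_RATES[sf_index],
        "channel_configuration": ((h[2] & 0x01) << 2) | (h[3] >> 6),
        # Frame length in bytes, header included.
        "frame_length": ((h[3] & 0x03) << 11) | (h[4] << 3) | (h[5] >> 5),
    }


if __name__ == "__main__":
    print(parse_adts_header(bytes.fromhex("fff15880191ffc")))
```

The `frame_length` field gives the distance in bytes to the next header, so a stream can be walked frame by frame without decoding any audio.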

Transport protocols  Most multimedia applications involve communication channels or storage.  RTP (real-time transport protocol) - transport over IP networks  MPEG-2 systems - digital television broadcast - storage (asset management)

MPEG-2 systems  Defines two types of streams - Program stream (PS) - used for storage, e.g. DVD - Transport stream (TS) - used for digital broadcast  Two layers of packetization - PES (packetized elementary stream) - TS (transport stream)

MPEG2 transport stream [22]

PES (packetized elementary stream)  First layer of packetization  Separates the audio and video elementary streams into access units.  Variable length  Contains a header and payload (frame) data.  Adds fields like time stamp, stream ID, packet length
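A minimal sketch of this first packetization layer. The slides name the header fields (time stamp, stream ID, packet length) but the exact header layout is in a figure that is not reproduced here, so the 8-byte layout below (3-byte start code prefix, 1-byte stream ID, 2-byte payload length, 2-byte frame-number time stamp) is an assumption for illustration only. The stream ID values 0xE0 (video) and 0xC0 (audio) are the first IDs of the ranges MPEG-2 Systems reserves for video and audio.

```python
import struct

PES_START_CODE = b"\x00\x00\x01"
VIDEO_STREAM_ID = 0xE0
AUDIO_STREAM_ID = 0xC0

# Assumed layout: start code (3) | stream ID (1) | length (2) | time stamp (2)
_HEADER = struct.Struct(">3sBHH")


def build_pes_packet(stream_id: int, frame_number: int, frame: bytes) -> bytes:
    """Wrap one access unit (frame) in the assumed PES-style header."""
    if len(frame) > 0xFFFF:
        raise ValueError("frame does not fit a 16-bit length field")
    header = _HEADER.pack(PES_START_CODE, stream_id,
                          len(frame), frame_number & 0xFFFF)
    return header + frame


def parse_pes_packet(packet: bytes):
    """Return (stream_id, frame_number, payload) from a packet built above."""
    prefix, stream_id, length, frame_number = _HEADER.unpack_from(packet)
    if prefix != PES_START_CODE:
        raise ValueError("PES start code prefix not found")
    payload = packet[_HEADER.size:_HEADER.size + length]
    return stream_id, frame_number, payload


if __name__ == "__main__":
    pes = build_pes_packet(VIDEO_STREAM_ID, 12, b"\x65\x88\x84")
    print(len(pes), parse_pes_packet(pes))
```

Packets are variable length because each one carries exactly one frame, whatever its size.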

Conversion of an elementary stream into PES packets [29]

PES packet header format used [4]

Frame number as time stamp  For video, the frame rate (fps) is constant throughout the sequence.  For audio, the sampling frequency is constant throughout the sequence.  The frame number alone is therefore enough to determine a frame's presentation time.
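With a constant frame rate and a constant sampling frequency, presentation time follows directly from the frame number, as the sketch below shows. The number of audio samples per frame is not given on the slide: 1024 is the AAC core frame size and is used as the default here, and it is exposed as a parameter because an SBR-based decoder outputs 2048 samples per frame. Function names are illustrative.

```python
def video_presentation_time(frame_number: int, fps: float) -> float:
    """Presentation time in seconds of a video frame at a constant frame rate."""
    return frame_number / fps


def audio_presentation_time(frame_number: int, sampling_frequency: int,
                            samples_per_frame: int = 1024) -> float:
    """Presentation time in seconds of an audio frame.

    samples_per_frame is an assumption (AAC core frame size), not a value
    taken from the slides.
    """
    return frame_number * samples_per_frame / sampling_frequency


if __name__ == "__main__":
    print(video_presentation_time(48, 24))
    print(audio_presentation_time(375, 24000))
```

At the test conditions used later (24 fps, 24000 Hz), video frame 48 is presented at 2 seconds and, with 1024 samples per frame, audio frame 375 at 16 seconds.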

TS packets  Second layer of packetization  Fixed length (188 bytes)  Each PES packet is logically broken down into 188-byte packets  The three-byte header contains the packet ID, payload unit start flag, continuity counter etc.

Transport stream (TS) packet format

TS header description:  Payload unit start indicator (PUSI) flag - indicates that the payload contains a PES header.  Adaptation field control (AFC) flag - indicates that the payload is less than 185 bytes.  Continuity counter (CC) (4 bits) - used to detect packet losses, out-of-sequence packets etc.  Packet ID (PID) (10 bits) - uniquely identifies the ES that the packet belongs to.  Optional offset byte - contains the offset value if AFC is set.
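The header fields above add up to 24 bits when a sync byte is included (8 + 1 + 1 + 4 + 10), which matches the three-byte header and 185-byte payload described in the slides. The sketch below packs and unpacks such a packet. The bit order (sync byte 0x47, then PUSI, AFC, CC, PID), the 0xFF stuffing value, and the interpretation of the offset byte as the count of stuffing bytes that follow it are all assumptions, since the packet format figure is not reproduced here.

```python
SYNC_BYTE = 0x47
TS_SIZE = 188
HEADER_SIZE = 3
PAYLOAD_SIZE = TS_SIZE - HEADER_SIZE  # 185 bytes


def build_ts_packet(pid: int, cc: int, pusi: bool, data: bytes) -> bytes:
    """Build one 188-byte packet using the assumed 3-byte header layout."""
    if not 0 <= pid < 1024:
        raise ValueError("PID must fit in 10 bits")
    if len(data) > PAYLOAD_SIZE:
        raise ValueError("payload longer than 185 bytes")
    afc = len(data) < PAYLOAD_SIZE
    flags = (int(pusi) << 15) | (int(afc) << 14) | ((cc & 0x0F) << 10) | pid
    header = bytes([SYNC_BYTE, flags >> 8, flags & 0xFF])
    if afc:
        # Assumed: offset byte = number of stuffing bytes before the data.
        stuffing = PAYLOAD_SIZE - 1 - len(data)
        body = bytes([stuffing]) + b"\xff" * stuffing + data
    else:
        body = data
    return header + body


def parse_ts_packet(packet: bytes):
    """Return (pid, cc, pusi, data) from a packet built by build_ts_packet."""
    if len(packet) != TS_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte packet")
    flags = (packet[1] << 8) | packet[2]
    pusi = bool(flags >> 15)
    afc = bool((flags >> 14) & 0x01)
    cc = (flags >> 10) & 0x0F
    pid = flags & 0x3FF
    body = packet[HEADER_SIZE:]
    data = body[1 + body[0]:] if afc else body
    return pid, cc, pusi, data


if __name__ == "__main__":
    pkt = build_ts_packet(pid=0x100, cc=5, pusi=True, data=b"hello")
    print(len(pkt), parse_ts_packet(pkt))
```

Note that the standard MPEG-2 transport stream header is four bytes with a 13-bit PID; the three-byte form here follows the slides, not the standard.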

Multiplexing  What is multiplexing?  Multiplexing is the process of transmitting TS packets belonging to different elementary streams in a single stream.  Effective muxing interleaves the TS packets in the transport stream so that the audio and video content is transmitted simultaneously.  Buffer overflow/underflow - can cause picture loss or skips during audio-video playback.

Multiplexing flowchart

Calculation of presentation time of a TS packet:  For video TS packet  For audio TS packet
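The equations for this slide are not reproduced in the transcript, so the following is only one plausible reading, offered as a sketch: each frame's playback duration is spread evenly over the TS packets that carry it, and the multiplexer always emits the packet with the earliest presentation time. The even split, the tie-breaking order and the default of 1024 audio samples per frame are assumptions; function names are illustrative.

```python
import heapq


def packet_times(packets_per_frame, frame_duration):
    """Assign a presentation time to every TS packet of every frame.

    packets_per_frame[i] is the number of TS packets carrying frame i.
    """
    times = []
    for frame_number, count in enumerate(packets_per_frame):
        for k in range(count):
            times.append((frame_number + k / count) * frame_duration)
    return times


def interleave(video_packets_per_frame, audio_packets_per_frame,
               fps, sampling_frequency, samples_per_frame=1024):
    """Return the transmission order as a list of 'V' and 'A' markers."""
    video = [(t, "V") for t in
             packet_times(video_packets_per_frame, 1.0 / fps)]
    audio = [(t, "A") for t in
             packet_times(audio_packets_per_frame,
                          samples_per_frame / sampling_frequency)]
    # Both lists are already sorted by time; merge them by earliest time.
    return [kind for _, kind in heapq.merge(video, audio)]


if __name__ == "__main__":
    order = interleave([10, 3, 3], [1, 1, 1], fps=24, sampling_frequency=24000)
    print("".join(order))
```

Ordering by presentation time keeps the audio and video buffers at the receiver filling at roughly the same rate in terms of media duration, which is the property the buffer fullness results measure.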

Video processing

Audio processing

De-multiplexing  The transport stream (TS) input to a receiver is separated into a video elementary stream and an audio elementary stream.  These elementary streams are initially written into video and audio buffers respectively.  Once one of the buffers is full, the elementary stream is reconstructed from the point of synchronization.
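The separation step can be sketched as routing each packet's payload into a per-PID buffer while using the 4-bit continuity counter to count missing packets. The input here is assumed to be already-parsed `(pid, cc, data)` tuples, and the PID values in the example are hypothetical.

```python
from collections import defaultdict


def demultiplex(packets):
    """Split parsed TS packets into per-PID buffers.

    packets: iterable of (pid, continuity_counter, payload_bytes).
    Returns (buffers, lost), where lost[pid] counts packets that the
    continuity counter shows as missing.
    """
    buffers = defaultdict(bytearray)
    lost = defaultdict(int)
    last_cc = {}
    for pid, cc, data in packets:
        if pid in last_cc:
            expected = (last_cc[pid] + 1) % 16
            if cc != expected:
                # The counter wraps at 16, so count the gap modulo 16.
                lost[pid] += (cc - expected) % 16
        last_cc[pid] = cc
        buffers[pid] += data
    return buffers, lost


if __name__ == "__main__":
    VIDEO_PID, AUDIO_PID = 0x100, 0x101  # hypothetical PID assignment
    stream = [(VIDEO_PID, 0, b"v0"), (AUDIO_PID, 0, b"a0"),
              (VIDEO_PID, 1, b"v1"), (VIDEO_PID, 3, b"v3")]
    buffers, lost = demultiplex(stream)
    print(bytes(buffers[VIDEO_PID]), bytes(buffers[AUDIO_PID]), dict(lost))
```

A gap of 16 or more consecutive lost packets on one PID cannot be detected with a 4-bit counter, which is a limitation of the counter itself rather than of this sketch.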

Audio-video synchronization  Once the video buffer is full, it is searched for the next IDR frame.  The corresponding audio frame is calculated from the equation  The elementary streams are reconstructed from that point, merged into a container format (using mkvmerge), then played back.
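The equation referred to on this slide is not reproduced in the transcript. A sketch of one way to find the matching audio frame, assuming the frame-number time stamps described earlier: convert the IDR frame number to a presentation time, then pick the audio frame whose start time is nearest. Rounding to the nearest frame and the default of 1024 samples per audio frame are assumptions, and the function names are illustrative.

```python
import math


def corresponding_audio_frame(video_frame, fps, sampling_frequency,
                              samples_per_frame=1024):
    """Audio frame number whose start time is closest to the video frame."""
    video_time = video_frame / fps
    exact = video_time * sampling_frequency / samples_per_frame
    return math.floor(exact + 0.5)  # round half up, avoids banker's rounding


def skew_seconds(video_frame, audio_frame, fps, sampling_frequency,
                 samples_per_frame=1024):
    """Audio start time minus video start time, in seconds."""
    audio_time = audio_frame * samples_per_frame / sampling_frequency
    return audio_time - video_frame / fps


if __name__ == "__main__":
    idr = 48
    audio = corresponding_audio_frame(idr, 24, 24000)
    print(audio, skew_seconds(idr, audio, 24, 24000))
```

With nearest-frame rounding the skew cannot exceed half an audio frame, which at 24000 Hz and 1024 samples per frame is about 21 ms.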

Results : Buffer fullness

Test conditions:  Video  H.264 baseline profile  Resolution: 416x240  GOP: IPPP (IDR forced)  Fps: 24  Audio  HEAACv2  ADTS format  Sampling frequency: 24,000 Hz

De-multiplexer output  Test clip: 1, 2  Clip length (sec): 30, 50  Video FPS: 24  Audio sampling frequency (Hz): 24000  AAC encoder bitrate (kbps): 32  Other table rows (values not available): total video frames, total audio frames, video raw file (.yuv) size (kB), audio raw file (.wav) size (kB), H.264 file size (kB), AAC file size (kB), video compression ratio, audio compression ratio, H.264 encoder bitrate (kBps), total TS packets, transport stream size (kB), transport stream bitrate (kBps), test clip size (kB), reconstructed clip size (kB)

Skew observed

Conclusions  Buffer fullness was handled effectively: the maximum difference observed between the buffers was around 20 ms of media content.  Audio-video synchronization was achieved with a maximum skew of 13 ms.

Future work  Expand the multiplexing algorithm to multiplex multiple programs  Implement the same multiplexing algorithm for other transport protocols like RTP/IP  Add error correction to TS stream.

References:  [1] MPEG-4: ISO/IEC JTC1/SC : Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding, ISO/IEC,  [2] MPEG-4: ISO/IEC JTC1/SC : Information technology - Coding of audio-visual objects - Part 3: Audio, Amendment 4: Audio Lossless Coding (ALS), new audio profiles and BSAC extensions  [3] MPEG-2: ISO/IEC JTC1/SC -7, advanced audio coding, AAC. International Standard IS WG11,  [4] MPEG-2: ISO/IEC Information technology - Generic coding of moving pictures and associated audio - Part 1: Systems, ISO/IEC:  [5] Soon-kak Kwon et al. “Overview of H.264 / MPEG-4 Part 10 (pp )”, Special issue on “Emerging H.264/AVC video coding standard”, J. Visual Communication and Image Representation, vol. 17, pp , April  [6] A. Puri et al. “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp , Oct  [7] MPEG-4 HE-AAC v2 - audio coding for today's digital media world, article in the EBU technical review (01/2006) giving explanations on HE-AAC. Link: http://tech.ebu.ch/docs/techreview/trev_305-moser.pdf  [8] ETSI TS “Implementation guidelines for the use of video and audio coding in broadcasting applications based on the MPEG-2 transport stream”.  [9] 3GPP TS : General Audio Codec audio processing functions; Enhanced aacPlus General Audio Codec; 2009  [10] 3GPP TS : Enhanced aacPlus general audio codec; Encoder Specification AAC part.  [11] 3GPP TS : Enhanced aacPlus general audio codec; Encoder Specification SBR part.  [12] 3GPP TS : Enhanced aacPlus general audio codec; Encoder Specification Parametric Stereo part.

 [13]  [14] MPEG Transport Stream. Link:  [15] MPEG-4: ISO/IEC JTC1/SC : Information technology — coding of audio-visual objects — Part 14 :MP4 file format, 2003  [16] DVB-H : Global mobile TV. Link :  [17] ATSC-M/H. Link :  [18] Open mobile vidéo coalition. Link :  [19] VC-1 Compressed Video Bitstream Format and Decoding Process (SMPTE 421M-2006), SMPTE Standard, 2006 (  [20] Henning Schulzrinne's RTP page. Link: Schulzrinne's RTP pagehttp://  [21] G.A Davidson et al, “ATSC video and audio coding”, Proc. IEEE, vol 94, pp , Jan (  [22] I. E.G.Richardson, “H.264 and MPEG-4 video compression: video coding for next-generation multimedia”, Wiley,  [23] European Broadcasting Union,  [24] Shintaro Ueda, et, al “NAL level stream authentication for H.264/AVC”, IPSJ Digital courier, Vol 3, Feb  [25] World DMB: link:  [26] ISDB website. Link:

 [27] 3GPP website. Link:  [28] Mihir Modi, “Audio compression gets better and more complex”. Link:  [29] P. A. Sarginson, “MPEG-2: Overview of systems layer”. Link:  [30] MPEG-2 ISO/IEC : Generic coding of moving pictures and audio: Part 1 - Systems, Amendment 3: Transport of AVC video data over ITU-T Rec H |ISO/IEC streams, 2003  [31] MKV merge software. Link:  [32] VLC media player. Link:  [33] Gom media player. Link:  [34] H. Murugan, “Multiplexing H264 video bit-stream with AAC audio bit-stream, demultiplexing and achieving lip sync during playback”, M.S.E.E. Thesis, University of Texas at Arlington, TX, May  [34] Gerold Blakowski et al., “A Media Synchronization Survey: Reference Model, Specification, and Case Studies”, IEEE Journal on Selected Areas in Communications, vol. 14, no. 1, January 1996  [35] H.264/AVC JM Software. Link:  [36] 3GPP Enhanced aacPlus reference software. Link:  [37] H.264 bitstream. Link:

Thank you