Roberta Eklund, Consultant
MPEG-4 AUDIO OVERVIEW

MPEG-4 Audio Overview
- Natural Audio
  - T/F
  - CELP
  - PARA
- Structured Audio
  - SAOL
  - SASL
  - SASBF
  - MIDI DLS version 2
- TTS
- Cross Tool (Algorithm) Functionality
  - Pitch/tempo change
  - Bitrate scalability
  - Computation complexity scalability
  - Error robustness
- Audio related effects
- Acoustic virtualization

Different Tools for Bitrates/Application

MPEG-4 Audio Tools

PROFILES
- Object Profile: defines the syntax of the bitstream for one single Object that can represent a meaningful entity in the Audio or Visual scene (an elementary bitstream).
- Composition Profile: defines which different Object Profiles can be combined in the Audio or Visual scene (combinations of elementary bitstreams).

OBJECT PROFILES

Combination Profiles

MPEG-4 Encoder Structure

MPEG-4 T/F Encoder Configuration

MPEG-4 T/F Decoder Configuration

Block Diagram of CELP Encoder

Block Diagram of CELP Decoder
Excitation signal generator:
- codebook
- regular pulse excitation (RPE)
- multi-pulse excitation (MPE)
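The block diagram itself is not reproduced here, so the following is only a minimal sketch of the general CELP decoding idea: an excitation vector is selected from a codebook, scaled by a gain, and passed through an all-pole linear-prediction synthesis filter. The function name, the toy codebook, and the filter coefficients are assumptions made for illustration; they are not taken from the MPEG-4 bitstream syntax.

```python
def celp_decode_frame(codebook, index, gain, lpc, state):
    """Decode one frame: scaled codebook excitation through 1/A(z).

    lpc holds a_1..a_p of A(z) = 1 + sum_k a_k z^-k (hypothetical values).
    state holds the last p output samples, most recent first.
    """
    excitation = [gain * v for v in codebook[index]]
    out = []
    hist = list(state)
    for e in excitation:
        # All-pole synthesis: y[n] = e[n] - sum_k a_k * y[n-k]
        y = e - sum(a * h for a, h in zip(lpc, hist))
        out.append(y)
        hist = [y] + hist[:-1]
    return out, hist


# Toy data, assumed for illustration only.
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
lpc = [-0.5]  # A(z) = 1 - 0.5 z^-1
frame, state = celp_decode_frame(codebook, 0, 2.0, lpc, [0.0])
print(frame)  # [2.0, 1.0, 0.5, 0.25]
```

A regular-pulse or multi-pulse generator would replace only the way `excitation` is built; the synthesis filter loop stays the same in this sketch.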

Block Diagram of PARA Encoder

Block Diagram of PARA Decoder

PARA is Two Codecs in One
Two operating modes:
- harmonic and noise components (HVXC)
  - for speech coding at 2 to 4 kbps
- harmonic and individual sinusoidal components + noise (HILN)
  - for coding of music signals with low-complexity content (e.g. single instruments) at 4 to 16 kbps
- combination of both modes
  - supported by syntax, defined transition
  - automatic mode selector
  - cross fade from one signal to another
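As a rough illustration of the parametric idea, the sketch below builds a signal from harmonics of a fundamental, a few individual sinusoids, and noise, then cross fades linearly between two such signals. It is not the normative HVXC or HILN decoder: the function names, the 8 kHz sample rate, the unshaped white noise, and all parameter values are assumptions.

```python
import math
import random


def synth_frame(f0, harm_amps, lines, noise_amp, n, sr=8000, seed=0):
    """Sum harmonics of f0, individual sinusoids, and white noise."""
    rng = random.Random(seed)
    out = []
    for i in range(n):
        t = i / sr
        # Harmonic component: partial k+1 at (k+1) * f0
        s = sum(a * math.sin(2 * math.pi * f0 * (k + 1) * t)
                for k, a in enumerate(harm_amps))
        # Individual sinusoids given as (frequency, amplitude) pairs
        s += sum(a * math.sin(2 * math.pi * f * t) for f, a in lines)
        # Noise component (a real coder would shape its spectrum)
        s += noise_amp * rng.uniform(-1.0, 1.0)
        out.append(s)
    return out


def crossfade(a, b):
    """Linear cross fade from signal a to signal b (equal length, n >= 2)."""
    n = len(a)
    return [x * (1 - i / (n - 1)) + y * (i / (n - 1))
            for i, (x, y) in enumerate(zip(a, b))]


speech_like = synth_frame(100.0, [0.5, 0.25], [], 0.1, 80)
music_like = synth_frame(440.0, [0.4], [(1234.0, 0.2)], 0.0, 80)
mixed = crossfade(speech_like, music_like)
```

The cross fade starts exactly on the first signal and ends exactly on the second, which is the behaviour a mode transition needs in this simplified picture.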

Text-to-Speech
- Phonemic (language-independent) syntax
- Prosody, timing cues
- Language, dialect, gender, age parameters
- Automatic synchronization with FBA
- Exact TTS synthesis is non-normative; only the interface is specified

Structured Audio
- Structured Audio: sound coding using structured descriptions
- Structured Audio decoder: music and sound-effect synthesis
- MMA, Microsoft, and EMU are now collaborating on MIDI DLS version 2 in MPEG-4

SAOL
- Downloadable BNF synthesis grammar
- Header contains descriptions of several synthesizers and effects processors, control algorithms, and routing instructions for audio flow of control
- SAOL has 100 primitive processing instructions, signal generators, and operators that fill wavetables with data

SASL and MIDI
- SASL (Structured Audio Score Language): new format for describing control parameters
  - Basically a scheduler of audio events
  - Designed to interface well with SAOL
  - New control language similar to MIDI
- MIDI (Musical Instrument Digital Interface)
  - Simpler format for describing control
  - Included as alternate control method
  - Leverages existing authoring tools
  - Gives “backwards compatibility” to SA
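The "scheduler of audio events" idea can be sketched with a time-ordered queue. This is not SASL syntax or the normative decoder behaviour; the class name, the event fields (time, instrument, parameters), and the example values are hypothetical.

```python
import heapq


class ScoreScheduler:
    """Hold timed control events and release them in time order."""

    def __init__(self):
        self._queue = []
        self._count = 0  # tie-breaker keeps insertion order for equal times

    def add(self, time, instrument, *params):
        heapq.heappush(self._queue, (time, self._count, instrument, params))
        self._count += 1

    def run_until(self, now):
        """Pop and return all events scheduled at or before `now`."""
        due = []
        while self._queue and self._queue[0][0] <= now:
            time, _, instrument, params = heapq.heappop(self._queue)
            due.append((time, instrument, params))
        return due


sched = ScoreScheduler()
sched.add(1.0, "bass", 110.0)
sched.add(0.0, "piano", 440.0)
sched.add(0.5, "piano", 660.0)
first = sched.run_until(0.5)   # the two piano events, in time order
later = sched.run_until(2.0)   # the bass event
```

Events can be added in any order; a synthesis engine would call `run_until` once per control period and hand each due event to the matching instrument.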

DLS Level 2
- Aims at consistent synthetic audio playback across a wide range of platforms
- Defines a simple wavetable synthesizer
- Bitstream includes sound samples
- Score expressed in MIDI
- Growing support from both software and hardware developers
  - DLS is part of DirectMusic in Microsoft’s DirectX 6.0

DLS-2 synthesizer model
- Simple yet powerful structure, much like many existing synthesizers on the market (e.g. in PC soundcards)
  - Uses loopable samples as sound sources (wavetable)
  - Variable routing of control sources
- 2 envelopes for amplitude control
- 2 low frequency oscillators
- 1-pole dynamic low-pass filter
  - Standardized response to MIDI controllers
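A heavily simplified voice along these lines can be sketched as a looped wavetable read, an amplitude envelope, a low frequency oscillator, and a one-pole low-pass filter. The sketch uses one envelope and one LFO rather than two of each, and every name, parameter, and default below is an assumption for illustration, not part of the DLS specification.

```python
import math


def render_voice(sample, loop_start, loop_end, step, n_out,
                 attack, release, lfo_hz, lfo_depth, cutoff, sr=8000):
    """Render n_out samples from a looped wavetable (loop_end <= len(sample))."""
    out = []
    pos = 0.0
    lp = 0.0
    for i in range(n_out):
        # Wavetable read (nearest lower sample); step sets the pitch
        x = sample[int(pos)]
        pos += step
        if pos >= loop_end:
            pos = loop_start + (pos - loop_end) % (loop_end - loop_start)
        # Linear attack and release envelope, in samples
        env = min(1.0, (i + 1) / attack, (n_out - i) / release)
        # LFO used as tremolo: gain swings between 1 - lfo_depth and 1
        trem = 1.0 - lfo_depth * 0.5 * (1.0 + math.sin(2 * math.pi * lfo_hz * i / sr))
        # One-pole low-pass: y[n] = y[n-1] + cutoff * (x[n] - y[n-1])
        lp += cutoff * (x * env * trem - lp)
        out.append(lp)
    return out


# One sine cycle as a stand-in for a downloaded sample.
table = [math.sin(2 * math.pi * k / 32) for k in range(32)]
voice = render_voice(table, 0, 32, 1.5, 400, 40, 80, 5.0, 0.3, 0.2)
```

Routing the LFO or a MIDI controller to `step` or `cutoff` instead of amplitude would be the "variable routing of control sources" in this simplified model.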

Audio BIFS
[Scene-graph diagram. Labels: AudioSource nodes for Piano (SA), Bass (SA), and Finger snaps (Parametric); AudioMix, AudioFX, AudioDelay, and HRTF nodes; audio channels; BIFS stuff; synchronization with visual.]

Demo Audio BIFS

Conclusion
- MPEG-4 Audio attempts to offer solutions to all spectra of sound.
- Some of the tools are more stable, while others are still in research and development.
- MPEG2-AAC is the best multi-channel lossy audio compression standard to date.

Acknowledgements
I would like to thank the authors from the references for providing the material presented here today.

Definitions
- T/F: Time/Frequency (MDCT transform)
- AAC: Advanced Audio Coding
- PARA: Parametric
- CELP: Code Excited Linear Prediction
- SA: Structured Audio
- PNS: Perceptual Noise Substitution
- HVXC: Harmonic Vector eXcitation Coding
- HILN: Harmonic and Individual Line + Noise
- SAOL: Structured Audio Orchestra Language
- SASL: Structured Audio Score Language
- MIDI: Musical Instrument Digital Interface
- TTS: Text to Speech

More Definitions
- CD: Committee Draft
- IS 13818-7: Advanced Audio Coding
- LC: Low Complexity
- BSAC: Bit Sliced Arithmetic Coding
- SSR: Scalable Sample Rate
- PNS: Perceptual Noise Substitution
- VBR: Variable Bit Rate
- TLSS: Tools for Large Step Scalability
- SNHC: Synthetic/Natural Hybrid Coding
- DLS: Downloadable Samples

Natural Audio Complexity

AAC Decoder Complexity Evaluation

MPEG AAC Decoder            Complexity
2-channel Main Profile      40% of 133 MHz Pentium
2-channel Low Complexity    25% of 133 MHz Pentium
5-channel Main Profile      90 sq. mm die, 0.5 micron CMOS
5-channel Low Complexity    60 sq. mm die, 0.5 micron CMOS

AAC Test Results
- Test at BBC and NHK according to ITU-R BS.1116
  - triple-stimulus/hidden-reference/double-blind
  - ITU-R 5-point impairment scale
  - 95% confidence intervals
- MPEG AAC provides “indistinguishable” quality at 320 kb/s per five channels
- MPEG AAC at 320 kb/s outperforms MPEG BC Layer II at 640 kb/s per five channels
- Recent stereo tests at NHK showed MPEG AAC provides “indistinguishable” quality at 128 kb/s per two channels
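To compare these figures on a common basis, the bitrates can be divided by the channel count. The snippet below is plain arithmetic on the numbers quoted above; the dictionary keys are labels chosen here for illustration.

```python
# (total bitrate in kb/s, number of channels), as quoted in the test results
results = {
    "AAC, 5 channels": (320, 5),
    "Layer II, 5 channels": (640, 5),
    "AAC, 2 channels": (128, 2),
}

# Average bitrate per channel in kb/s
per_channel = {name: rate / ch for name, (rate, ch) in results.items()}

for name, rate in per_channel.items():
    print(f"{name}: {rate:.0f} kb/s per channel")
```

On these numbers AAC averages 64 kb/s per channel in both the five-channel and stereo tests, half of Layer II's 128 kb/s per channel.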

References
- M. Bosi, E. Scheirer, B. Edler, Peter G. Schreiner, MPEG-4 Seminar, Fribourg, Switzerland, 1997
- S. Quackenbush, “Coding of Natural Audio in MPEG-4”, Proc. IEEE ICASSP, Seattle, 1998
- B. Grill, B. Edler, I. Kaneko, Y. Lee, M. Nishiguchi, E. Scheirer, and M. Väänänen (Eds.), ISO 14496-3 (MPEG-4 Audio) Committee Draft, MPEG document N1903
- E. Scheirer, “The MPEG-4 Structured Audio Standard”, Proc. IEEE ICASSP, Seattle, 1998
- Juergen Herre, “Updated Description for Perceptual Noise Substitution Tool”, MPEG document M2692
- E. Scheirer, R. Väänänen, J. Huopaniemi, “AudioBIFS: The MPEG-4 Standard for Effects Processing”, AES, SF, 1998
- Overview: http://www.cselt.it/mpeg/standards/mpeg-4/mpeg-4.htm
