Part II (MPEG-4) Audio TSBK01 Image Coding and Data Compression Lecture 11, 2003 Jörgen Ahlberg.

Presentation transcript:

Part II (MPEG-4) Audio TSBK01 Image Coding and Data Compression Lecture 11, 2003 Jörgen Ahlberg

2 MPEG-4 Audio: Outline
Psycho-acoustic models
Overview of MPEG-4 Audio
AAC (Advanced Audio Coding)
Specialized coders
Synthetic (structured) audio

3 Psycho-acoustic models
A psycho-acoustic model describes how humans perceive sound.
In the compression context, its main use is to tell us which parts of the signal we can remove.

4 Hearing Threshold
[Figure: hearing threshold, level in dB versus frequency in kHz. Sound below the threshold will not be heard anyway; discard!]
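
The threshold curve on slide 4 can be approximated numerically. The sketch below uses Terhardt's widely quoted closed-form approximation of the threshold in quiet. That formula is an assumption brought in for illustration and does not appear on the slides. The function names are hypothetical, and levels are taken to be calibrated in dB SPL.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate threshold in quiet (dB SPL) at freq_hz.

    Terhardt's approximation; an assumption, not taken from the slides.
    """
    f = freq_hz / 1000.0  # the formula works in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def is_audible(level_db_spl, freq_hz):
    """True if a lone tone at this level lies above the threshold in quiet."""
    return level_db_spl > threshold_in_quiet_db(freq_hz)


if __name__ == "__main__":
    for freq in (50, 1000, 3300, 16000):
        print(freq, "Hz:", round(threshold_in_quiet_db(freq), 1), "dB SPL")
```

A coder could drop any spectral component for which `is_audible` returns False, which is the "discard" region on the slide. A real psycho-acoustic model also accounts for masking by other components.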

5 Frequency Masking
[Figure: energy versus frequency.]

6 Frequency Masking
[Figure: energy versus frequency.]

7 Temporal Masking
[Figure: energy versus time around a strong sound (the "masker").]
Backward (pre) masking: < 10 ms
Forward (post) masking: approx. 100 ms

8 Psycho-acoustic Model: Demo
Music without distortion
Music with white noise
Music with perceptually distributed noise

9 Parts of MPEG-4 Audio
General natural audio
  – AAC, BSAC, TwinVQ
  – HILN (parametric)
Natural speech
  – CELP
  – HVXC (parametric)
Synthetic audio
  – TTS
  – SAOL
  – SASL
Composition
  – Mixing
  – Re-sampling
  – 3D-rendering

10 Parts of MPEG-4 Audio (cont.)
Error Protection
  – CRC
  – FEC (block code, convolutional code)
  – Interleaving
Error Resilience
  – Error resilient bitstreams
  – Error concealment
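
Two of the error protection tools on slide 10 are easy to illustrate: a CRC detects a corrupted frame, and interleaving spreads a burst of channel errors over several codewords so that each codeword sees only one error. This is a minimal sketch under stated assumptions. It uses the generic CRC-32 from Python's `zlib` and a plain row/column block interleaver, not the polynomials or interleaver layouts defined by MPEG-4, and the function names are hypothetical.

```python
import zlib


def frame_ok(payload, expected_crc):
    """Return True if the payload still matches the CRC sent with it."""
    return zlib.crc32(payload) == expected_crc


def interleave(symbols, rows, cols):
    """Write row by row, read column by column (block interleaver)."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave(): restore the original row-major order."""
    assert len(symbols) == rows * cols
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]


if __name__ == "__main__":
    # Three codewords (rows) of four symbols each, numbered 0..11.
    sent = interleave(list(range(12)), rows=3, cols=4)
    print(sent)  # a burst hitting sent[3:6] touches one symbol per codeword
    print(deinterleave(sent, rows=3, cols=4))
```

After de-interleaving, a short burst on the channel shows up as isolated errors in different codewords, which a block or convolutional code can then correct more easily.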

11 Natural Audio Coders
[Figure: coders placed by bitrate (kbit/s) against quality, from cellular telephone through AM and FM to CD.]
Parametric speech (HVXC)
High quality speech (CELP)
General audio (AAC, TwinVQ)
Parametric audio (HILN)

12 MPEG-2/4 AAC: Advanced Audio Coding
MDCT-based time/frequency coder.
Typically 16 – 64 kbit/s/channel.
"Expert listener quality" at 128 kbit/s.
Added to MPEG-2, but without the MPEG-4 features.
Half the bitrate compared to mp3, mainly due to an improved psycho-acoustic model.
[Demo table: clips of Haydn and Tracy Chapman, with columns for kbit/s and kHz. Rows: mono, 16 (one value not preserved); stereo, 32 kbit/s, 16 kHz; stereo, 64 kbit/s, 32 kHz.]

13 MPEG-4 Extensions to the AAC
TwinVQ (Transform-domain Weighted Interleave Vector Quantization)
  – Improves performance for low bitrates (6 – 18 kbit/s).
PNS (Perceptual Noise Substitution)
  – Allows coding "noise-like" parts parametrically.
LTP (Long-term prediction)
  – Allows "tone-like" parts to be coded with higher accuracy at a lower bitrate.
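
The idea behind PNS on slide 13 can be sketched as follows: for a band that looks like noise, the encoder sends only the band energy, and the decoder substitutes random noise with that energy. The detection rule here (spectral flatness against a threshold of 0.5) is a hypothetical choice for illustration. The slides do not say how an encoder decides that a band is noise-like, and all names below are assumptions.

```python
import math
import random


def band_energy(coeffs):
    return sum(c * c for c in coeffs)


def spectral_flatness(coeffs):
    """Geometric mean / arithmetic mean of the power: near 1 for noise, near 0 for tones."""
    powers = [c * c + 1e-12 for c in coeffs]  # small floor avoids log(0)
    geo = math.exp(sum(math.log(p) for p in powers) / len(powers))
    return geo / (sum(powers) / len(powers))


def pns_encode(coeffs, threshold=0.5):
    """Send only the energy for noise-like bands, the coefficients otherwise."""
    if spectral_flatness(coeffs) >= threshold:
        return ("noise", band_energy(coeffs))
    return ("coeffs", list(coeffs))


def pns_decode(band, width, rng):
    """Regenerate a band; noise bands get random values scaled to the sent energy."""
    kind, value = band
    if kind == "coeffs":
        return value
    noise = [rng.gauss(0.0, 1.0) for _ in range(width)]
    scale = math.sqrt(value / band_energy(noise))
    return [scale * n for n in noise]


if __name__ == "__main__":
    rng = random.Random(0)
    noisy_band = [1, -1, 1, 1, -1, 1, -1, -1]
    decoded = pns_decode(pns_encode(noisy_band), len(noisy_band), rng)
    print(band_energy(noisy_band), round(band_energy(decoded), 6))
```

The decoded band differs sample by sample from the original but has the same energy, which is the sense in which the band is coded parametrically.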

14 MPEG-4 Extensions to the AAC
BSAC (Bit-sliced Arithmetic Coder)
  – Adds scalability to the bitstream.
  – 16 – 64 kbit/s in steps of 1 kbit/s.
Demo: [bitrates not preserved] kbit/s
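
The scalability figures on slide 14 imply a base layer at 16 kbit/s plus enhancement layers of 1 kbit/s each up to 64 kbit/s. The sketch below only works through that arithmetic: which rates exist, and which rate a receiver can decode from a given channel. The function names are hypothetical, and nothing here reflects the actual BSAC bitstream syntax.

```python
BASE_KBPS = 16   # lowest decodable rate on the slide
TOP_KBPS = 64    # highest rate on the slide
STEP_KBPS = 1    # size of one enhancement layer


def operating_points():
    """All rates the scalable stream can be truncated to."""
    return list(range(BASE_KBPS, TOP_KBPS + 1, STEP_KBPS))


def decodable_rate(channel_kbps):
    """Highest operating point that fits the channel, or None if even the base layer does not fit."""
    if channel_kbps < BASE_KBPS:
        return None
    layers = int((channel_kbps - BASE_KBPS) // STEP_KBPS)
    return min(TOP_KBPS, BASE_KBPS + layers * STEP_KBPS)


if __name__ == "__main__":
    print(len(operating_points()), "operating points")
    print(decodable_rate(40.7), decodable_rate(100), decodable_rate(12))
```

The point of the design is that one encoded stream serves every receiver: a receiver or network node drops enhancement layers instead of asking for a re-encode.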

15 Other MPEG-4 Natural Audio Coders
Speech coders
  – High bitrate speech coder (CELP)
  – Low bitrate speech coder (HVXC)
HILN low bitrate parametric coder
  – Harmonic and Individual Lines plus Noise
  – [bitrate range not preserved] kbit/s
  – Subband coder that codes each subband as a tone or as shaped noise.

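16 MPEG-4 High Bitrate Speech Coder
High quality CELP coder.
8 or 16 kHz sampling (NB or WB mode).
4 – 24 kbit/s.
Demo: PCM (uncompressed), 16 kbit/s, 24 kbit/s
[Figure: basic principle of the CELP coder. Codebook index k selects an excitation x_k(n), which is scaled by the gain g_k and passed through the LPC filter; the difference from the input s(n) passes through a perceptual weighting filter to give the error e(n).]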
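
The CELP principle in the figure on slide 16 is an analysis-by-synthesis search: try each codebook vector x_k(n), run it through the LPC synthesis filter, pick the gain g_k that best matches the input s(n), and keep the index k with the smallest error e(n). The sketch below is a simplified illustration under stated assumptions. It omits the perceptual weighting filter, uses a tiny made-up codebook, and takes the LPC coefficients as given. It is not the MPEG-4 CELP algorithm itself, and all names are hypothetical.

```python
def synthesize(excitation, lpc):
    """All-pole LPC synthesis filter: y[n] = x[n] - sum_i lpc[i] * y[n-1-i]."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for i, a in enumerate(lpc):
            if n - 1 - i >= 0:
                acc -= a * out[n - 1 - i]
        out.append(acc)
    return out


def celp_search(target, codebook, lpc):
    """Return (index k, gain g_k, squared error) of the best codebook entry."""
    best = None
    for k, vector in enumerate(codebook):
        y = synthesize(vector, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero vector cannot match anything
        # Least-squares optimal gain for this candidate.
        gain = sum(t * v for t, v in zip(target, y)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (k, gain, error)
    return best


if __name__ == "__main__":
    codebook = [[1, 0, 0, 0], [0, 1, 0, -1], [1, -1, 1, -1]]  # made-up entries
    lpc = [-0.9]  # one-tap predictor, chosen only for the example
    target = [0.5 * v for v in synthesize(codebook[2], lpc)]
    print(celp_search(target, codebook, lpc))
```

Only the index k, the gain and the filter parameters need to be transmitted, which is how the coder reaches rates far below PCM.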

17 MPEG-4 Low Bitrate Speech Coder
HVXC – Harmonic Vector eXcitation Coder.
8 kHz sampling, 2 – 4 kbit/s.
Down to 1.2 kbit/s in variable rate mode.
Sinusoidal coding for voiced parts and CELP coding for unvoiced parts.
HVXC can be combined with HILN.
  – Automatic switching between the coders
  – Produces one bitstream.

18 MPEG-4 Natural Audio Coders: Demo
[Demo table: original audio, music coders (TwinVQ, HILN) and speech coders (CELP, HVXC), at 6 kbit/s and 2 kbit/s, for speech, simple music and complex music.]

19 Speed Change
The decoder can play back at an arbitrary speed without changing the pitch.
Demo clips: Original, Music, ~20% faster

20 Synthetic Audio
TTS – Text-To-Speech
  – MPEG-4 defines an interface, not the TTS itself
SAOL – Structured Audio Orchestra Language
  – SAOL describes how to generate instruments
SASL – Structured Audio Score Language
  – SASL describes which instruments to play when
  – MIDI is a subset of SASL
Demo:
  – Orchestra: initially 80 kB of instrument descriptions (SAOL)
  – While playing: 1 kbit/s (SASL)
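
The demo figures on slide 20 describe a one-off cost (80 kB of SAOL) followed by a very low running rate (1 kbit/s of SASL). The sketch below works out the total data sent and the point at which this beats a conventional coder. It assumes 1 kB = 1000 bytes, and the 16 kbit/s comparison rate is an illustrative choice, not a figure from this slide.

```python
def structured_audio_kbit(seconds, setup_kbytes=80, stream_kbps=1):
    """Total kbit sent after `seconds`: instrument download plus score stream."""
    return setup_kbytes * 8 + stream_kbps * seconds


def break_even_seconds(reference_kbps, setup_kbytes=80, stream_kbps=1):
    """Playing time after which structured audio has sent less data than a
    coder running at reference_kbps (assumed comparison rate)."""
    return setup_kbytes * 8 / (reference_kbps - stream_kbps)


if __name__ == "__main__":
    print(structured_audio_kbit(60), "kbit after one minute")
    print(round(break_even_seconds(16), 1), "s to break even against 16 kbit/s")
```

Under these assumptions the instrument download pays for itself within the first minute, and every further second costs 1 kbit instead of 16.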

21 BIFS – Binary Format for Scene Description
All the sound you hear is coded at 16 kbit/s.
Initial voice coded using TTS.
Current voice coded using parametric speech coder (HVXC).
Background "music" coded using Structured Audio.
Post-production specified using BIFS, using the Structured Audio tools.

22 A Scene Graph
[Figure: scene graph with AudioMix (mix the sounds), AudioFX (add reverb) and AudioSource nodes; sources: hand claps (SA decoder) and speech (CELP coder).]
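
A scene graph like the one on slide 22 is a tree whose leaves are decoded sources and whose inner nodes process or combine their children. The sketch below models it as a small data structure. The node names come from the slide, but the exact topology (reverb applied to the hand claps only, then mixed with the speech) is an assumption, since the figure itself is not preserved. The class and functions are hypothetical and are not the BIFS API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                      # e.g. "AudioMix", "AudioFX", "AudioSource"
    note: str = ""                 # what the node does or which decoder feeds it
    children: List["Node"] = field(default_factory=list)


def sources(node):
    """Collect the leaf sources depth-first, left to right."""
    if not node.children:
        return [node.note]
    found = []
    for child in node.children:
        found.extend(sources(child))
    return found


def describe(node, depth=0):
    """Return an indented text outline of the graph, one line per node."""
    lines = ["  " * depth + node.kind + ": " + node.note]
    for child in node.children:
        lines.extend(describe(child, depth + 1))
    return lines


# Assumed topology for the slide's example.
scene = Node("AudioMix", "mix the sounds", [
    Node("AudioFX", "add reverb", [
        Node("AudioSource", "hand claps (SA decoder)"),
    ]),
    Node("AudioSource", "speech (CELP coder)"),
])

if __name__ == "__main__":
    print("\n".join(describe(scene)))
```

Rendering would walk the tree bottom-up: each source decodes its own bitstream, each effect node transforms its input, and the mix node sums what reaches it.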

23 [Figure: scene graph with AudioMix, AudioFX, AudioDelay, AudioFX and AudioSource nodes; sources: Piano, Bass (SA), Finger snaps.]

That was the last slide!