Audio Henning Schulzrinne Dept. of Computer Science


Audio Henning Schulzrinne Dept. of Computer Science Columbia University Fall 2003

Common narrowband audio codecs
Columns: rate (kb/s); delay (ms); multi-rate; embedded; VBR; bit-robust/PLC; remarks
iLBC: 15.2, 13.3 kb/s; 20, 30 ms; --/X; quality higher than G.729A, no licensing
Speex: 2.15-24.6 kb/s; X
AMR-NB: 4.75-12.2 kb/s; X/X; 3G wireless
G.729: 8 kb/s; 15 ms; TDMA wireless
GSM-FR: 13 kb/s; GSM wireless (Cingular)
GSM-EFR: 12.2 kb/s; 2.5G
G.728: 16, 12.8 kb/s; 2.5 ms; H.320 (ISDN videoconferencing)
G.723.1: 5.3, 6.3 kb/s; 37.5 ms; X/--; H.323, videoconferences

Common wideband audio codecs
Columns: rate (kb/s); delay (ms); multi-rate; embedded; VBR; bit-robust/PLC; remarks
Speex: 4-44.2 kb/s; 34 ms; X; --/X; no licensing
AMR-WB: 6.6-23.85 kb/s; 20 ms; X/X; 3G wireless
G.722: 48, 56, 64 kb/s; 0.125 (1.5) ms; X/--; 2 sub-bands, now dated

iLBC – MOS behavior with packet loss

Recent audio codecs
iLBC: optimized for high packet loss rates (frames encoded independently)
AMR-NB: 3G wireless codec, 4.75-12.2 kb/s, 20 ms coding delay
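The rates listed above translate directly into per-frame payload sizes, which is what matters when each frame is carried in its own packet. The following is a minimal sketch, assuming the 20 ms figures quoted for AMR-NB and iLBC can be treated as the frame duration (the slides call them delay, so this is an assumption); the function names are illustrative only.

```python
import math


def bits_per_frame(rate_kbps: float, frame_ms: float) -> int:
    """Bits produced per frame: (kb/s) * (ms) gives bits directly."""
    return round(rate_kbps * frame_ms)


def payload_bytes(rate_kbps: float, frame_ms: float) -> int:
    """Whole bytes needed to carry one frame (last byte padded if needed)."""
    return math.ceil(bits_per_frame(rate_kbps, frame_ms) / 8)


# Assumed frame duration of 20 ms for all three entries below.
for name, rate in (("AMR-NB low", 4.75), ("AMR-NB high", 12.2), ("iLBC", 15.2)):
    print(name, bits_per_frame(rate, 20), "bits,", payload_bytes(rate, 20), "bytes")
```

At these sizes the payload is small compared with typical IP/UDP/RTP header overhead, which is one reason frame duration matters as much as bit rate for VoIP.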

Speex
Open-source, patent-free speech codec
CELP (code-excited linear prediction) codec
Operating modes:
  narrowband (8 kHz sampling rate): 2.15-24.6 kb/s, delay of 30 ms
  wideband (16 kHz sampling rate): 4-44.2 kb/s, delay of 34 ms
  ultra-wideband (32 kHz sampling rate)
Intensity stereo encoding
Variable bit rate (VBR) possible
Voice activity detection (VAD)

Ogg Vorbis
Similar in application to AAC, MP3, VQF, ..., but claims to be free of patents
Ogg = container file format (also for Speex, FLAC)
Vorbis = music/speech codec
Near-CD quality at 160 kb/s
Forward-adaptive modified DCT (discrete cosine transform), overlapping windows
floor: carries the frequency representation as a piecewise-linear interpolated representation on a dB amplitude scale and a linear frequency scale
residue: subtract out the floor -> cascaded (multi-pass) vector quantization
Entropy (Huffman) coding
Carries codec parameters in the header
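The floor/residue split can be illustrated with a toy example. This is only a sketch of the idea described above (a piecewise-linear floor on a dB scale, subtracted from the spectrum to leave a residue), not the actual Vorbis bitstream format; the control points and spectrum values are made up for illustration.

```python
def floor_curve_db(points, n_bins):
    """Piecewise-linear interpolation between (bin, dB) control points.

    Assumes the points are sorted by bin and span bins 0 .. n_bins - 1.
    """
    curve = []
    for k in range(n_bins):
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x0 <= k <= x1:
                t = (k - x0) / (x1 - x0)
                curve.append(y0 + t * (y1 - y0))
                break
    return curve


def residue_db(spectrum_db, floor_db):
    """Residue: what remains after the floor is subtracted out (in dB)."""
    return [s - f for s, f in zip(spectrum_db, floor_db)]


# Hypothetical data: a 9-bin magnitude spectrum in dB and 3 floor points.
control_points = [(0, -20.0), (4, -40.0), (8, -60.0)]
spectrum = [-18.0, -26.0, -31.0, -33.0, -40.0, -47.0, -49.0, -56.0, -60.0]

floor = floor_curve_db(control_points, len(spectrum))
residue = residue_db(spectrum, floor)
print(floor)
print(residue)
```

The residue values are small and centered near zero, which is what makes them cheaper to vector-quantize than the raw spectrum.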

Sound localization
Human ear uses 3 metrics for stereo localization:
  intensity
  time of arrival (TOA): 7 µs
  direction filtering and spectral shaping by the outer ear
For shorter wavelengths (4-20 kHz), the head casts an acoustical shadow, giving rise to a lower sound level at the ear farthest from the sound source.
At long wavelengths (20 Hz-1 kHz), the head is very small compared to the wavelength; in this case, localization is based on perceived interaural time differences (ITD).
UCSC CMPE250 Fall 2002
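To see why the two frequency ranges behave differently, it helps to compare wavelength with head size. The sketch below assumes a speed of sound of 343 m/s and a head width of about 0.18 m; neither figure comes from the slides, and both are rough typical values.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
HEAD_WIDTH_M = 0.18         # assumed: rough ear-to-ear distance of an adult


def wavelength_m(freq_hz: float) -> float:
    """Wavelength of a tone in air: lambda = c / f."""
    return SPEED_OF_SOUND_M_S / freq_hz


# Crude upper bound on the interaural time difference: the extra
# straight-line path across the head divided by the speed of sound.
max_itd_s = HEAD_WIDTH_M / SPEED_OF_SOUND_M_S

for freq in (20, 1000, 4000, 20000):
    lam = wavelength_m(freq)
    print(f"{freq:>6} Hz: wavelength {lam:.4f} m, "
          f"{lam / HEAD_WIDTH_M:.2f} head widths")
print(f"rough maximum ITD: {max_itd_s * 1e6:.0f} microseconds")
```

Under these assumptions, wavelengths at 1 kHz and below exceed the head width, so the sound diffracts around the head and little level difference arises, while at 4 kHz and above the wavelength is shorter than the head and a shadow forms.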

Audio samples
http://www.cs.columbia.edu/~hgs/audio/codecs.html
Speex: http://www.speex.org/audio/samples/ (both narrowband and wideband)