Speech and Audio Processing


Preeti Rao
Department of Electrical Engineering, I.I.T. Bombay (e-mail: prao@ee.iitb.ac.in)

Research Activities
Speech processing: low-rate coding, speech synthesis
Audio signal processing: audio compression, audio content retrieval
Major project: low-rate (< 2 kbps) codec for telephone-bandwidth speech (sponsored by B.E.L.)

Current Coding Standards: A Perspective

LR-HNM Model: Parameters
Input: windowed speech
Block: HNM model parameter estimator
Outputs: pitch, spectral amplitudes, voicing cutoff frequency
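
To make the role of the three parameters concrete, here is a minimal sketch of how one frame could be rebuilt from them in a generic harmonic-plus-noise scheme. It is not the LR-HNM synthesizer from these slides: the function names, the zero-phase harmonics, and the use of random-phase sinusoids to stand in for the noise band above the voicing cutoff are all illustrative assumptions.

```python
import math
import random

def synthesize_frame(f0_hz, amplitudes, cutoff_hz,
                     fs_hz=8000, frame_len=160, seed=0):
    """Rebuild one frame from pitch, spectral amplitudes and voicing cutoff.

    Hypothetical helper: harmonics at or below cutoff_hz are treated as
    voiced (zero phase); harmonics above it get a random phase as a crude
    stand-in for the noise band.
    """
    rng = random.Random(seed)
    frame = [0.0] * frame_len
    for k, amp in enumerate(amplitudes, start=1):
        freq = k * f0_hz
        if freq >= fs_hz / 2:
            break  # stay below the Nyquist frequency
        phase = 0.0 if freq <= cutoff_hz else rng.uniform(0.0, 2.0 * math.pi)
        for n in range(frame_len):
            frame[n] += amp * math.cos(2.0 * math.pi * freq * n / fs_hz + phase)
    return frame

def count_voiced_harmonics(f0_hz, cutoff_hz):
    """Number of harmonics k with k * f0_hz <= cutoff_hz."""
    return int(cutoff_hz // f0_hz)

frame = synthesize_frame(f0_hz=100.0, amplitudes=[1.0] * 30, cutoff_hz=2000.0)
print(len(frame), count_voiced_harmonics(100, 2000))
```

With a 100 Hz pitch and a 2 kHz cutoff, the first 20 harmonics fall in the voiced band and the remainder, up to the Nyquist frequency, are treated as noise-like.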

Speech Codec Block Diagram
Frame size = 20 ms
Sampling rate = 8 kHz
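
The frame size and sampling rate above fix the number of samples the codec sees per frame. The sketch below works that out and splits a signal into frames; the helper names, the non-overlapping framing, and the dropping of a trailing partial frame are assumptions made for illustration, not details taken from the block diagram.

```python
FRAME_MS = 20          # frame size from the slide
SAMPLE_RATE_HZ = 8000  # sampling rate from the slide

def samples_per_frame(frame_ms=FRAME_MS, sample_rate_hz=SAMPLE_RATE_HZ):
    """Samples in one frame: sample rate times frame duration."""
    return sample_rate_hz * frame_ms // 1000

def split_into_frames(signal, frame_len):
    """Cut a signal into non-overlapping frames (assumed framing scheme).

    A trailing partial frame is dropped.
    """
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

frame_len = samples_per_frame()
one_second = [0.0] * SAMPLE_RATE_HZ
frames = split_into_frames(one_second, frame_len)
print(frame_len, len(frames))  # 160 50
```

At these settings each frame holds 160 samples, and one second of speech yields 50 frames.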

Performance
Bit rate: 1.55 kbps
Delay: 60 ms
Complexity: 40 MIPS (tentative)
Quality: average MOS = 3.0 (as obtained from objective measures)
Robustness to background noise: enhancement preprocessor needed below 6-8 dB SNR
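
As a rough check on what these figures imply, the sketch below converts the bit rate into a per-frame bit budget and expresses the delay in frame durations. It assumes the 20 ms frame size from the codec block diagram applies here and that 1 kbps means 1000 bit/s; the function names are hypothetical.

```python
BIT_RATE_BPS = 1550  # 1.55 kbps, taking 1 kbps = 1000 bit/s
FRAME_MS = 20        # assumed: frame size from the block-diagram slide
DELAY_MS = 60        # delay figure from the slide

def bits_per_frame(bit_rate_bps, frame_ms):
    """Bits available to encode one frame at the given bit rate."""
    return bit_rate_bps * frame_ms / 1000

def delay_in_frames(delay_ms, frame_ms):
    """Delay expressed as a multiple of the frame duration."""
    return delay_ms / frame_ms

print(bits_per_frame(BIT_RATE_BPS, FRAME_MS))  # 31.0
print(delay_in_frames(DELAY_MS, FRAME_MS))     # 3.0
```

Under these assumptions the codec has 31 bits per frame to carry all of its parameters, and the quoted delay equals three frame durations.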

A “Query-by-Humming” System (under development, jointly with Prof. S. Dutta Roy)