Speech and Audio Processing
Preeti Rao
Department of Electrical Engineering, I.I.T. Bombay
(e-mail: prao@ee.iitb.ac.in)
Research Activities
Speech processing: low-rate coding, speech synthesis
Audio signal processing: audio compression, audio content retrieval
Major project: low-rate (< 2 kbps) codec for telephone-bandwidth speech (sponsored by B.E.L.)
Current Coding Standards: A Perspective
LR-HNM Model: Parameters
Input: windowed speech
Block: HNM model parameter estimator
Outputs: pitch, spectral amplitudes, voicing cutoff frequency
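To make the role of the three parameters concrete, the following is a minimal sketch of how one frame might be resynthesized from them. It is not the LR-HNM synthesizer itself. The function name `synthesize_frame`, its arguments, and the use of random-phase sinusoids to stand in for the noise band above the voicing cutoff are all assumptions made for illustration; a harmonic-plus-noise synthesizer would normally use spectrally shaped noise in that band.

```python
import math
import random


def synthesize_frame(pitch_hz, amplitudes, cutoff_hz,
                     sample_rate=8000, frame_ms=20, seed=0):
    """Illustrative frame synthesis from pitch, amplitudes and voicing cutoff.

    Harmonics at or below cutoff_hz are summed with zero phase (the
    "harmonic" part). Harmonics above cutoff_hz get a random phase as a
    crude stand-in for the "noise" part (an assumption, not the slide's method).
    """
    n_samples = sample_rate * frame_ms // 1000
    rng = random.Random(seed)

    partials = []
    for k, amp in enumerate(amplitudes, start=1):
        freq = k * pitch_hz
        if freq >= sample_rate / 2:
            break  # stay below the Nyquist frequency
        if freq <= cutoff_hz:
            phase = 0.0
        else:
            phase = rng.uniform(0.0, 2.0 * math.pi)
        partials.append((amp, freq, phase))

    return [
        sum(a * math.cos(2.0 * math.pi * f * t / sample_rate + p)
            for a, f, p in partials)
        for t in range(n_samples)
    ]


# Hypothetical parameter values: 100 Hz pitch, three harmonics,
# voicing cutoff at 250 Hz (so the third harmonic is treated as noise-like).
frame = synthesize_frame(100.0, [1.0, 0.5, 0.25], cutoff_hz=250.0)
```

With a 20 ms frame at 8 kHz, `frame` holds 160 samples. Raising `cutoff_hz` makes more of the spectrum harmonic (voiced); lowering it makes more of it noise-like.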
Speech Codec Block Diagram
Frame size = 20 ms
Sampling rate = 8 kHz
Performance
Bit rate: 1.55 kbps
Delay: 60 ms
Complexity: 40 MIPS (tentative)
Quality: average MOS = 3.0 (as obtained from objective measures)
Robustness to background noise: enhancement preprocessor needed below 6-8 dB SNR
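The figures above imply a tight per-frame budget. The short calculation below works it out, assuming the 20 ms frame size and 8 kHz sampling rate from the codec block diagram apply to these performance numbers. The constant and variable names are chosen here for illustration only.

```python
# Figures quoted on the slides
SAMPLE_RATE_HZ = 8000   # sampling rate
FRAME_MS = 20           # frame size
BIT_RATE_BPS = 1550     # 1.55 kbps
DELAY_MS = 60           # reported delay

samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000   # samples in one frame
frames_per_second = 1000 // FRAME_MS                    # frame rate
bits_per_frame = BIT_RATE_BPS * FRAME_MS // 1000        # bit budget per frame
delay_in_frames = DELAY_MS // FRAME_MS                  # delay in frame units

print(samples_per_frame, frames_per_second, bits_per_frame, delay_in_frames)
# prints: 160 50 31 3
```

So each 160-sample frame has about 31 bits to carry the pitch, spectral amplitudes and voicing cutoff frequency, and the reported delay corresponds to three frame durations.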
A “Query-by-Humming” System
(under development, jointly with Prof. S. Dutta Roy)