Reduction of Additive Noise in the Digital Processing of Speech Avner Halevy AMSC 663 Mid Year Progress Report December 2008 Professor Radu Balan 1
Brief Reminder Goal: reduce additive white Gaussian noise degrading a speech signal Use short time analysis (frames) Algorithms: – Spectral subtraction (done) – Iterative Wiener filtering (next) Test objectively in lab conditions 2
Spectral Subtraction Estimate the magnitude of the noise spectrum when speech is absent Subtract the estimate from the magnitude of the spectrum of the noisy signal Keep the noisy phase Inverse Fourier to obtain enhanced signal in time domain Underlying statistical assumptions 3
Spectral Subtraction 4
SS – “Filtering” View 5
Spectral Subtraction Implemented Implementation issues – Choice of analysis and synthesis windows – Ensuring nonnegative magnitude – Estimation of noise spectrum – Choice of exponent Validation Variations 6
Choice of Windows 7
SS – Preliminary Results 8
Time (sec) Frequency (Hz) 9 Clean
SS – Preliminary Results Clean Noisy 10
SS – Preliminary Results 11
SS – Preliminary Results Noisy Enhanced 12
Evaluation of Exponent Values tested: p =.1,.5, 1, 1.5, 2, 3, 4, 5 13
Testing TIMIT – database of 6300 sentences for evaluation of speech processing algorithms Segmental SNR Segmental “Filtering” SNR Segmental “Filtering” distortion Future: Perceptual Evaluation of Speech Quality (PESQ) – telecomm standard 14
Shortcomings of SS Using the noisy phase (in low SNR) “musical noise” artifacts caused by – Error in noise spectrum estimation – Flooring of negative components – Fluctuations in signal spectrum Solutions: – Over subtraction – Smoothing of signal spectrum 15
Dealing with “Musical Noise” Over subtraction Smoothing signal spectrum 16
Next Steps Test more signals Evaluate using PESQ Implement Wiener filtering Further testing Compare performance 17
Bibliography [1] Deller, J., Hansen, J., and Proakis, J. (2000) Discrete Time Processing of Speech Signals, New York, NY: Institute of Electrical and Electronics Engineers [2] Quatieri, T. (2002) Discrete Time Speech Signal Processing, Upper Saddle River, NJ: Prentice Hall [3] Loizou, P. (2007) Speech Enhancement: Theory and Practice, Boca Raton, FL: Taylor & Francis Group [4] Rabiner, L., Schafer, R. (1978) Digital Processing of Speech Signals, Englewood Cliffs, NJ: Prentice Hall 18