Statistical Signal Processing Research Laboratory(SSPRL) UT Acoustic Laboratory(UTAL) A TWO-STAGE DATA-DRIVEN SINGLE MICROPHONE SPEECH ENHANCEMENT WITH CEPSTRAL ANALYSIS PRE-PROCESSING Yu Rao, Chetan Vahanesa, Chandan K.A. Reddy, Issa M. S. Panahi
Outline of the presentation
1. Introduction
2. Review of temporal cepstral smoothing method
3. Proposed method
4. Experimental results and performance evaluation
5. Real-time implementation
6. Conclusion
This research was supported by NIH-NIDCD Project No: 1R56DC
Introduction
Problem Statement
– We live in environments surrounded by many types of noise, and this noise can have a negative effect on our daily lives.
– Conventional single-microphone speech enhancement methods do not perform well in all types of noise and may generate musical tones in some conditions, which can degrade a device's performance.
Review of temporal cepstral smoothing method [1]
[1] C. Breithaupt, T. Gerkmann and R. Martin, “A novel a priori SNR estimation approach based on selective cepstro-temporal smoothing,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2008, pp , April
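As a rough illustration of the idea behind temporal cepstral smoothing, the sketch below recursively smooths the cepstrum of a power spectrum across frames, using weak smoothing at low quefrencies (spectral envelope) and strong smoothing elsewhere. It is a simplified sketch, not the method of [1]: the pitch-bin protection and bias compensation described there are omitted, and the names `make_alpha` and `tcs_step`, the cutoff of 8 bins, and the smoothing constants 0.2 and 0.9 are all assumptions chosen for illustration.

```python
import numpy as np


def make_alpha(n_fft, cutoff=8, low=0.2, high=0.9):
    """Per-quefrency smoothing factors (hypothetical values).

    The factors are symmetric around n_fft/2 so that the smoothed
    cepstrum stays even and maps back to a real log-spectrum.
    """
    idx = np.arange(n_fft)
    quefrency = np.minimum(idx, n_fft - idx)
    return np.where(quefrency < cutoff, low, high)


def tcs_step(power_spec, prev_cepstrum, alpha):
    """Smooth one frame in the cepstral domain.

    power_spec    : one-sided power spectrum, length n_fft // 2 + 1
    prev_cepstrum : smoothed cepstrum of the previous frame, or None
    alpha         : per-quefrency smoothing factors, length n_fft
    Returns (smoothed power spectrum, smoothed cepstrum).
    """
    log_spec = np.log(np.maximum(power_spec, 1e-12))
    cepstrum = np.fft.irfft(log_spec)  # real, even, length n_fft
    if prev_cepstrum is None:
        smoothed = cepstrum
    else:
        # First-order recursive smoothing along time, per quefrency bin.
        smoothed = alpha * prev_cepstrum + (1.0 - alpha) * cepstrum
    smoothed_spec = np.exp(np.fft.rfft(smoothed).real)
    return smoothed_spec, smoothed


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha = make_alpha(512)
    state = None
    for _ in range(5):
        frame = rng.uniform(0.1, 2.0, 257)  # stand-in for |Y(k)|^2
        smoothed_frame, state = tcs_step(frame, state, alpha)
    print(smoothed_frame.shape)
```

Smoothing in the log domain introduces a bias in the resulting power estimate, so a practical implementation would need a correction step that this sketch leaves out.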
3. Proposed method
Figure 1. Block diagram of first stage: (1) TCS, (2) a priori and a posteriori SNR estimation, (3) lookup table 1, (4) MMSE-LSA estimator.
[2] J. S. Erkelens and R. Heusdens, “Tracking of nonstationary noise based on data-driven recursive noise power estimation,” IEEE Trans. Audio, Speech and Lang. Process., vol. 16, no. 6, pp , Aug. 2008
[3] Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error log-spectral amplitude estimator,” IEEE Trans. Acoust., Speech and Signal Process., vol. 33, no. 2, pp , Apr
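For orientation, the closed-form MMSE-LSA gain of [3] can be written as G = xi / (1 + xi) * exp(0.5 * E1(v)) with v = xi * gamma / (1 + xi), where xi is the a priori SNR, gamma the a posteriori SNR, and E1 the exponential integral. The sketch below evaluates this expression with the standard library only. It is a minimal illustration under stated assumptions: the slides do not say how the estimator block is computed or how the lookup tables are built, and the function names `exp1` and `mmse_lsa_gain` are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1(x):
    """Exponential integral E1(x) for x > 0."""
    if x <= 0.0:
        raise ValueError("exp1 requires x > 0")
    if x < 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total, term = 0.0, 1.0
        for k in range(1, 60):
            term *= -x / k  # term now equals (-x)^k / k!
            total -= term / k
            if abs(term) < 1e-17:
                break
        return -EULER_GAMMA - math.log(x) + total
    # Continued fraction evaluated with the modified Lentz scheme (x >= 1).
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, 500):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return h * math.exp(-x)


def mmse_lsa_gain(xi, gamma):
    """Log-spectral amplitude gain for a priori SNR xi and a posteriori SNR gamma."""
    v = max(xi * gamma / (1.0 + xi), 1e-10)  # guard against v == 0
    return xi / (1.0 + xi) * math.exp(0.5 * exp1(v))


if __name__ == "__main__":
    for xi_db in (-10, 0, 10, 20):
        xi = 10.0 ** (xi_db / 10.0)
        print(xi_db, round(mmse_lsa_gain(xi, gamma=xi + 1.0), 4))
```

Because exp(0.5 * E1(v)) is never below 1, this gain is always at least the Wiener gain xi / (1 + xi) and approaches it as v grows.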
Figure 2. Block diagram of second stage: (1) a priori and a posteriori SNR estimation, (2) lookup table 2, (3) MMSE-LSA estimator.
4. Experimental results and performance evaluation
Figure 3. PESQ and NAL comparison (panels: Driving-Car, White, Speech-Shaped). Methods compared: MMSE-LSA using the VAD-based decision-directed method (DD), MMSE-LSA using the data-driven recursive noise power tracking method (RNPT), and the proposed two-stage speech enhancement (PP).
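To make the DD baseline concrete, the sketch below shows the textbook decision-directed a priori SNR update for a single frequency bin: a weighted sum of the previous frame's clean-speech power estimate (normalized by the noise power) and the half-wave-rectified a posteriori SNR minus one. This is an illustration under stated assumptions, not the configuration used in the experiments: the smoothing factor 0.98, the -25 dB floor, and the function names are assumed, and the noise power is taken as given rather than produced by a VAD.

```python
def a_posteriori_snr(noisy_power, noise_power, floor=1e-12):
    """gamma = |Y|^2 / noise power, with a guard against division by zero."""
    return noisy_power / max(noise_power, floor)


def decision_directed_xi(prev_clean_power, noise_power, gamma,
                         alpha=0.98, xi_min=10.0 ** (-25.0 / 10.0)):
    """Decision-directed a priori SNR estimate for one frequency bin.

    prev_clean_power : |A_hat|^2 from the previous frame (gain^2 * |Y|^2)
    noise_power      : current noise power estimate for this bin
    gamma            : a posteriori SNR of the current frame
    alpha, xi_min    : assumed smoothing factor and lower bound
    """
    ml_term = max(gamma - 1.0, 0.0)  # instantaneous SNR estimate, floored at 0
    xi = alpha * prev_clean_power / max(noise_power, 1e-12) \
        + (1.0 - alpha) * ml_term
    return max(xi, xi_min)


if __name__ == "__main__":
    gamma = a_posteriori_snr(noisy_power=4.0, noise_power=1.0)
    xi = decision_directed_xi(prev_clean_power=2.0, noise_power=1.0, gamma=gamma)
    print(gamma, xi)
```

A smoothing factor close to 1 reduces musical tones at the cost of slower tracking of speech onsets, which is the usual trade-off with this estimator.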
5. Real-time implementation on smartphone
Figure 4. Block diagram of the proposed method
Figure 5. Smartphone screenshot
Conclusion
The main contributions of this work are:
– Proposing a two-stage speech enhancement algorithm that uses temporal cepstral smoothing as pre-processing.
– Comparing its objective measurement results with those of well-known single-microphone speech enhancement methods.
– Introducing a real-time framework for the proposed method and its real-time implementation on a smartphone.
Thank you for your time and participation!