Noise Suppression Techniques for Speech Enhancement Using Adaptive Filtering
Derek Shiell
03/09/2006
ECE 463: Project Presentation
Professor Michael Honig

Overview
- Objective/problem description
- Applications
- Overview of noise reduction methods
- System description
- Filter analysis
  - Linear methods: Wiener approximation, KLT preprocessing, signal subspace embedding
  - Kalman filter based methods
  - Non-linear methods
- Current results
- Future work
- Implementation/practical considerations
- Conclusions

Objective/Problem Description
The goal of my project was to research noise reduction techniques for the front end of an automatic speech recognition system, using a single microphone with no independent noise recording and no clean reference signal.

Applications
- Cell phone speech enhancement
- Automatic speech recognition
- Speaker identification
- Biomedical signal processing

Overview of Speech Enhancement
- Microphone array processing: with multiple microphones, blind source separation (BSS) techniques such as independent component analysis (ICA) can be used to distinguish one speaker from other directional or diffuse noise sources.
- Active echo/noise cancellation (ANC): the echo or noise is estimated and re-generated with opposite phase so that it destructively interferes with the original echo or noise.
- Blind noise suppression: a single speech signal is corrupted by noise; there is no separate noise recording from which to estimate the noise and no clean source signal to reference.

System Descriptions
(Block diagrams: BSS based on frequency-domain ICA [6]; active noise cancellation with a single microphone/speaker [4]; blind noise reduction schematic [1].)

Filter Analysis (1): Linear MMSE (Wiener Approximation)
The filter is designed by minimizing the MMSE cost function between the clean signal s_k and its estimate s_hat_k, computed over frames of length N.

Filter Analysis (2): Linear Estimation (continued)
The clean signal is estimated by linearly filtering the corrupted signal, s_hat_k = w^T y_k. Minimizing the MMSE cost function with respect to w gives w = R_y^{-1} p, where R_y is the autocorrelation matrix of the noisy signal and p is the cross-correlation vector. This is an approximation to the Wiener solution in which p is estimated by (r_y - r_n), similar in spirit to spectral subtraction.
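A minimal sketch of this Wiener-approximation filter in Python, assuming a noise-only segment is available as a stand-in for the noise-statistics estimate (the blind setting discussed later would have to estimate r_n from non-speech frames); the filter order and the synthetic example are illustrative choices, not values from the talk.

```python
import numpy as np
from scipy.linalg import toeplitz, solve

def autocorr(x, lags):
    """Biased autocorrelation estimate r[0..lags-1]."""
    x = x - np.mean(x)
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    return r[:lags] / len(x)

def wiener_approx_filter(noisy, noise_only, order=32):
    """FIR weights w = R_y^{-1} (r_y - r_n) applied to the noisy signal."""
    r_y = autocorr(noisy, order)        # noisy-signal autocorrelation
    r_n = autocorr(noise_only, order)   # noise autocorrelation (assumed available here)
    R_y = toeplitz(r_y)                 # autocorrelation matrix of the noisy signal
    p_hat = r_y - r_n                   # spectral-subtraction-like estimate of p
    w = solve(R_y, p_hat)
    return np.convolve(noisy, w, mode="same")

# Toy example: a 300 Hz "speech" tone in white noise at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
clean = 0.5 * np.sin(2 * np.pi * 300 * t)
noise = 0.1 * np.random.randn(fs)
enhanced = wiener_approx_filter(clean + noise, noise, order=32)
```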

Filter Analysis (3): Linear Estimation with Karhunen-Loève Transform (KLT)
Preprocessing the signal with the KLT (equivalently, PCA) separates it into its directions of greatest variance; mapping the signal into this lower-dimensional space helps decorrelate the signal from the noise. Define the KLT basis U as the eigenvectors of R_y, the autocorrelation matrix of the noisy signal; the transformed observation is then U^T y_k. For a changing signal, U must be updated adaptively. The weight vector again has a closed-form solution of the Wiener type, computed in the transformed coordinates.
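A minimal sketch of the KLT (PCA) preprocessing step, assuming frame-based processing; the frame length and the number of retained components are illustrative assumptions.

```python
import numpy as np

def klt_basis(frames, keep=None):
    """Eigenvectors of the frame autocorrelation matrix R_y, columns sorted by
    decreasing eigenvalue.  `frames` has shape (num_frames, frame_len)."""
    R_y = frames.T @ frames / frames.shape[0]   # sample autocorrelation matrix
    eigvals, U = np.linalg.eigh(R_y)            # ascending eigenvalues
    U = U[:, np.argsort(eigvals)[::-1]]         # reorder to descending
    return U[:, :keep] if keep else U

def klt_transform(frames, U):
    """Map each noisy frame y_k into the decorrelated coordinates U^T y_k."""
    return frames @ U

# Example: 20 ms frames at 8 kHz, keeping the 8 strongest directions.
fs, frame_len = 8000, 160
signal = np.random.randn(fs)                    # placeholder noisy signal
num_frames = len(signal) // frame_len
frames = signal[: num_frames * frame_len].reshape(num_frames, frame_len)
U = klt_basis(frames, keep=8)
transformed = klt_transform(frames, U)          # shape (num_frames, 8)
```

In an adaptive implementation the basis U would be re-estimated as the signal statistics change, as noted above.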

Filter Analysis (4): Signal Subspace Embedding
This method uses a matrix of gain factors W rather than a single weight vector w (a MIMO formulation), so that a simultaneous block estimate of the clean signal can be made. In addition, a matrix Q can be chosen either as the identity or so as to taper the tap weights by some factor(s), emphasizing selected components more heavily in the minimization. The MMSE cost function is minimized as before, and update equations for the filter matrix and the transform basis are obtained iteratively.

Filter Analysis (5): Kalman Filtering Approaches
Kalman filters are widely used in speech enhancement, and much theoretical work has analyzed them. The Kalman filter is the minimum mean-square estimator of the state of a linear dynamical system and can be used to derive many types of RLS filters; extended Kalman filters handle nonlinear models through a linearization step. Advantages:
- more robust (stationarity is not assumed)
- require only the previous estimate to form the next one (rather than all past values)
- computationally efficient
(Figure: standard linear state-space model for the Kalman filter.)
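A minimal sketch of a Kalman filter on the standard linear state-space model x_k = A x_{k-1} + w_k, y_k = H x_k + v_k, applied here to a scalar AR(1) signal observed in white noise; the parameters A, q, and r are illustrative values, not parameters from the talk.

```python
import numpy as np

def kalman_denoise(y, A=0.95, H=1.0, q=0.01, r=0.1):
    """Filtered state estimates x_hat for scalar observations y."""
    x_hat, P = 0.0, 1.0                           # initial state estimate and covariance
    out = np.empty(len(y))
    for k, yk in enumerate(y):
        # Predict
        x_pred = A * x_hat
        P_pred = A * P * A + q
        # Update with the noisy observation
        K = P_pred * H / (H * P_pred * H + r)     # Kalman gain
        x_hat = x_pred + K * (yk - H * x_pred)
        P = (1.0 - K * H) * P_pred
        out[k] = x_hat
    return out

# Toy example: AR(1) "speech" state corrupted by white observation noise.
rng = np.random.default_rng(0)
n = 1000
x = np.zeros(n)
for k in range(1, n):
    x[k] = 0.95 * x[k - 1] + 0.1 * rng.standard_normal()
y = x + 0.3 * rng.standard_normal(n)
x_hat = kalman_denoise(y, A=0.95, q=0.01, r=0.09)
```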

Filter Analysis (6): Nonlinear Filtering
Many nonlinear filtering methods exist for suppressing noise in noisy speech; examples include filters based on neural networks and on phase-space reconstruction. They are generally very complex to analyze, but they do not require estimates of the noise or speech spectra and are not characterized by "musical tone" artifacts.
(Figures: a feed-forward neural network; phase-space reconstructions for different speech phonemes [9].)
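A minimal sketch of phase-space (time-delay) reconstruction, one ingredient of the nonlinear methods cited here [8, 9]; the embedding dimension and delay are illustrative choices, not values from the referenced work.

```python
import numpy as np

def delay_embed(x, dim=5, tau=7):
    """Embed a 1-D signal into delay vectors [x[k], x[k+tau], ..., x[k+(dim-1)*tau]]."""
    n = len(x) - (dim - 1) * tau
    return np.stack([x[i * tau: i * tau + n] for i in range(dim)], axis=1)

# Example: embed one frame of a noisy signal into a 5-dimensional phase space.
frame = np.random.randn(400)
vectors = delay_embed(frame, dim=5, tau=7)   # shape (n, 5)
```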

Typical Results
Comparison of segmental SNR performance for different noise sources:
1) White noise (SNR 6.08 dB)
2) Pink noise (SNR 4.34 dB)
3) Factory noise (SNR 5.16 dB)
4) F16 noise (SNR 4.61 dB)
Methods compared:
a) linear estimation
b) linear estimation with KLT preprocessing
c) signal subspace embedding
d) weighted signal subspace embedding
e) NN with KLT
f) linear with clean target
g) nonlinear with clean target
h) standard spectral subtraction method (3 dB segmental SNR ~ 5 dB SNR) [1]
(Figures: segmental SNR and SNR results for the various linear and nonlinear noise reduction methods [8]; noisy speech signal (white noise), Ephraim-filtered, and Wiener-filtered waveforms.)
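For reference, a minimal sketch of the segmental SNR metric used in the comparison above; the frame length and clipping range are common conventions assumed here, not values from the cited results.

```python
import numpy as np

def segmental_snr(clean, enhanced, frame_len=160, floor=-10.0, ceil=35.0):
    """Average per-frame SNR (dB) between a clean reference and an enhanced signal."""
    n = min(len(clean), len(enhanced)) // frame_len * frame_len
    clean, err = clean[:n], clean[:n] - enhanced[:n]
    snrs = []
    for start in range(0, n, frame_len):
        s = clean[start:start + frame_len]
        e = err[start:start + frame_len]
        snr = 10.0 * np.log10((np.sum(s ** 2) + 1e-12) / (np.sum(e ** 2) + 1e-12))
        snrs.append(np.clip(snr, floor, ceil))   # clip to limit silent-frame outliers
    return float(np.mean(snrs))
```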

Future Work
- Perform ASR after noise reduction filtering.
- AVICAR database: data collected in a car environment; time-varying SNR; no independent noise recording (detecting speech is difficult).
- Experiments: KLT preprocessing + linear estimation (Wiener); Ephraim filter (ML short-time spectral amplitude estimator); nonlinear methods.

Implementation/Practical Considerations
- Real-time processing: applications require computationally efficient algorithms to be feasible.
- Determining a noise sample: with a single microphone, detecting speech in order to estimate noise statistics is difficult.
- Possible remedies: use visual information to detect speech, or use nonlinear noise reduction methods.
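To illustrate the noise-sample problem, a minimal sketch of an energy-based heuristic that collects presumed noise-only frames; this is an illustration only, not the method proposed in the talk, and the frame length and threshold are assumed values.

```python
import numpy as np

def noise_frames(signal, frame_len=160, threshold_db=6.0):
    """Return frames whose energy lies within `threshold_db` of the quietest frame."""
    n = len(signal) // frame_len * frame_len
    frames = signal[:n].reshape(-1, frame_len)
    energy_db = 10.0 * np.log10(np.sum(frames ** 2, axis=1) + 1e-12)
    mask = energy_db <= energy_db.min() + threshold_db   # presumed noise-only frames
    return frames[mask]
```

The autocorrelation of the selected frames could then supply the r_n estimate used by the Wiener-approximation filter sketched earlier.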

Conclusions
- Noise suppression methods have become increasingly important due to the proliferation of mobile devices, ASR systems, and biometrics/bioinformatics.
- Speech enhancement is a very broad field: array processing for source separation, noise cancellation, and more.
- This project focused on blind noise reduction: linear estimation, linear estimation with KLT preprocessing, signal subspace embedding, Kalman filter based methods, and non-linear methods.
- Using state-of-the-art noise reduction methods, typical SNR improvements are ~5 dB.
- Proposed experiments to test ASR improvement.

References
1. Eric A. Wan and Rudolph van der Merwe, “Noise-Regularized Adaptive Filtering for Speech Enhancement,” Proc. Eurospeech.
2. Ki Yong Lee, Byung-Gook Lee, Iickho Song, and Souguil Ann, “Robust Estimation of AR Parameters and its Application for Speech Enhancement,” Proc. IEEE ICASSP.
3. Phil S. Whitehead, David V. Anderson, and Mark A. Clements, “Adaptive, Acoustic Noise Suppression for Speech Enhancement,” Proc. IEEE ICME, pp. 565-568.
4. A. V. Oppenheim, E. Weinstein, K. C. Zangi, M. Feder, and D. Gauger, “Single Sensor Active Noise Cancellation Based on the EM Algorithm,” Proc. IEEE ICASSP, pp. 277-280.
5. T. Rutkowski, A. Cichocki, and A. K. Barros, “Speech Enhancement Using Adaptive Filters and Independent Component Analysis Approach,” Proc. AISAT.
6. H. Saruwatari, K. Sawai, A. Lee, K. Shikano, A. Kaminuma, and M. Sakata, “Speech Enhancement and Recognition in Car Environment Using Blind Source Separation and Subband Elimination Processing,” Proc. ICA, pp. 367-372.
7. Simon Haykin, Adaptive Filter Theory, Prentice-Hall, Upper Saddle River, NJ, pp. 466-501.
8. M. T. Johnson, A. C. Lindgren, R. J. Povinelli, and X. Yuan, “Performance of Nonlinear Speech Enhancement using Phase Space Reconstruction,” Proc. IEEE ICASSP, pp. 872-875.
9. Andrew C. Lindgren, “Speech Recognition Using Features Extracted from Phase Space Reconstructions,” Thesis, Marquette University, Milwaukee, WI, May 2003.

END