Published by Emory Boone. Modified over 8 years ago.
1
Yi-zhang Cai, Jeih-weih Hung
2012/08/17
Presenter: 汪逸婷
2
Outline
◦ Introduction
◦ Background methods
◦ Presented methods with NMF
◦ Experimental results
◦ Conclusion
3
The effect of noise in speech signals
◦ This thesis focuses on developing new algorithms to reduce the effect of noise on speech recognition
[Diagram: Clean Speech passes through the Channel Effect (Convolutional Noise) and is mixed with Background Noise (Additive Noise) to give Noisy Speech]
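The distortion model named on this slide (a convolutional channel effect followed by additive background noise) can be illustrated with a minimal sketch. The function name `distort`, the three-tap channel response, and the sinusoid standing in for clean speech are all assumptions made for illustration; none of them come from the presentation.

```python
import numpy as np

def distort(clean, channel, noise, snr_db):
    """Return channel-filtered speech plus noise scaled to a target SNR in dB."""
    # Convolutional noise: filter the clean signal with the channel impulse response.
    filtered = np.convolve(clean, channel)[: len(clean)]
    # Additive noise: scale the noise so the signal-to-noise power ratio equals snr_db.
    noise = noise[: len(clean)]
    sig_pow = np.mean(filtered ** 2)
    noise_pow = np.mean(noise ** 2)
    gain = np.sqrt(sig_pow / (noise_pow * 10 ** (snr_db / 10)))
    return filtered + gain * noise

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 0.01 * np.arange(4000))  # stand-in for clean speech
channel = np.array([1.0, 0.5, 0.25])                 # hypothetical channel response
noise = rng.standard_normal(4000)
noisy = distort(clean, channel, noise, snr_db=10)
```

Lowering `snr_db` raises the noise gain, which is how the different SNR conditions of a noisy test set can be imitated in a toy setting.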
4
Noise causes a serious mismatch in the modulation spectrum of speech feature streams
◦ We try to normalize the modulation spectra across different SNR conditions
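For readers unfamiliar with the term, the modulation spectrum of a feature stream is commonly taken to be the DFT of each feature dimension's trajectory over time. The sketch below assumes that definition and assumes that only the magnitude is modified while the original phase is kept for reconstruction; the function names and the FFT size are illustrative, not taken from the slides.

```python
import numpy as np

def modulation_spectrum(features, n_fft):
    """Magnitude and phase of the DFT of each feature trajectory.

    features: array of shape (num_frames, num_dims), one row per frame.
    Returns (magnitude, phase), each of shape (n_fft // 2 + 1, num_dims).
    """
    spec = np.fft.rfft(features, n=n_fft, axis=0)  # DFT along the time axis
    return np.abs(spec), np.angle(spec)

def features_from_spectrum(magnitude, phase, n_fft, num_frames):
    """Rebuild the feature stream from a magnitude spectrum and a phase spectrum."""
    spec = magnitude * np.exp(1j * phase)
    return np.fft.irfft(spec, n=n_fft, axis=0)[:num_frames]

rng = np.random.default_rng(0)
features = rng.standard_normal((100, 13))  # e.g. 100 frames of 13 cepstral coefficients
magnitude, phase = modulation_spectrum(features, n_fft=128)
rebuilt = features_from_spectrum(magnitude, phase, n_fft=128, num_frames=100)
```

With an unmodified magnitude the round trip returns the original features, so any change in the rebuilt stream comes only from how the magnitude was normalized.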
5
Nonnegative matrix factorization (NMF) is a novel method for processing the modulation spectrum, using the following criterion:
6
In general, the two nonnegative matrices W and H are obtained iteratively
◦ Starting from an initial guess of W and H, the following multiplicative update rules are applied to reach a local minimum of the cost function:
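The update formulas themselves did not survive in this transcript. As a hedged illustration, the sketch below uses the widely known Lee and Seung multiplicative updates for the squared Euclidean cost ||V - WH||²; whether the presentation used this cost or another one (for example a divergence-based cost) is an assumption, as are the function name `nmf` and its parameters.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factor a nonnegative V (m x n) as W (m x rank) @ H (rank x n)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    # Initial guess: strictly positive random matrices.
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative;
        # eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

V = np.random.default_rng(1).random((20, 30))  # toy nonnegative data
W, H = nmf(V, rank=5)
```

Because each update multiplies by a nonnegative ratio, no explicit clipping is needed to keep W and H nonnegative.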
7
Conventional NMF for processing the modulation spectrum is relatively complicated:
◦ It uses an iterative approach to find the best possible encoding vector h (with the basis matrix W held fixed):
  Iteration:
  Termination:
◦ It processes the entire modulation frequency band of the speech features
  Only the low-frequency first half is important for speech recognition
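The iteration and termination formulas are missing from this transcript. The following is a minimal sketch of one plausible form, assuming the Euclidean multiplicative update for h with W held fixed and a relative-change stopping rule; both choices, along with the name `encode` and its tolerances, are assumptions and not the presentation's stated method.

```python
import numpy as np

def encode(v, W, n_iter=100, tol=1e-6, eps=1e-12):
    """Iteratively estimate a nonnegative h such that W @ h approximates v."""
    h = np.ones(W.shape[1])   # initial guess
    WtV = W.T @ v             # fixed across iterations
    WtW = W.T @ W             # fixed across iterations
    for _ in range(n_iter):
        h_new = h * WtV / (WtW @ h + eps)
        # Assumed termination rule: stop once h changes very little.
        converged = np.linalg.norm(h_new - h) <= tol * (np.linalg.norm(h) + eps)
        h = h_new
        if converged:
            break
    return h

rng = np.random.default_rng(0)
W = rng.random((40, 5))       # hypothetical basis learned from clean speech
v = W @ rng.random(5)         # toy magnitude spectrum lying in the span of W
h = encode(v, W)
v_updated = W @ h             # updated (nonnegative) modulation spectrum
```

Every utterance needs its own run of this loop, which is the per-sequence cost that the complexity-reduction schemes aim to avoid.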
8
Two ways to reduce the complexity
◦ Orthogonal projection
  Find an orthogonal basis B for the basis matrix W, which can be done offline
  Obtain the new modulation spectrum by projecting the old modulation spectrum onto the column space of B
◦ Updating only the low-frequency modulation spectrum
  Reduces the computation while keeping the enhancement effect on the modulation spectrum
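Both ideas can be sketched together under a few assumptions: that B is obtained by a reduced QR decomposition of a full-column-rank W (any orthonormalization would do), and that the low sub-band consists of the first `n_low` frequency bins, with W trained on those bins only. The function names and sizes are illustrative.

```python
import numpy as np

def orthonormal_basis(W):
    """Orthonormal basis B for the column space of W (can be computed offline)."""
    B, _ = np.linalg.qr(W)  # reduced QR: B has orthonormal columns
    return B

def project_low_band(v, B_low, n_low):
    """Replace the first n_low bins of v by their projection onto span(B_low)."""
    out = v.copy()
    out[:n_low] = B_low @ (B_low.T @ v[:n_low])
    return out  # bins from n_low onward are left untouched

rng = np.random.default_rng(0)
W_low = rng.random((20, 4))   # hypothetical basis for a 20-bin low sub-band
B_low = orthonormal_basis(W_low)
v = rng.random(50)            # toy full-band magnitude spectrum
v_new = project_low_band(v, B_low, n_low=20)
```

The projection is a single pair of matrix-vector products with no iteration. Unlike the multiplicative update, it does not enforce nonnegativity, so projected magnitudes can in principle turn out negative.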
11
Experimental setup
◦ Aurora-2 database
  Clean-condition training
  Noise environments:
    Test set A: subway, babble, car, and exhibition noises
    Test set B: restaurant, street, airport, and train station noises
    Test set C: MIRS subway and MIRS street noises
  SNRs: clean, 20 dB, 15 dB, 10 dB, 5 dB, 0 dB, -5 dB
◦ HMM for each digit: 16 states and 20 Gaussian mixtures per state
12
The iteration function:
The projection function:
The computational complexity (for a feature sequence)
14
Occurrence of negative spectral magnitudes (averaged over a feature sequence)
◦ The probability of producing negative magnitudes in NMF(p,f) and PCA(p,f) is very small, even though neither method imposes a nonnegativity constraint
15
The recognition accuracy (%) of each method for different numbers of frequency points forming the low sub-band
17
Conclusion
◦ The two proposed schemes substantially reduce the computational complexity of NMF
◦ Using the orthogonal projection in NMF further improves the recognition accuracy
◦ Normalizing only the low sub-band gives results very similar to full-band normalization