Published by Bernadette Dawson. Modified over 9 years ago.
1
SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS
Jain-De, Lee
Emad M. Grais, Hakan Erdogan
17th International Conference on Digital Signal Processing, 2011
2
Outline
– Introduction
– Non-negative Matrix Factorization
– Signal Separation and Masking
– Experiments and Discussion
– Conclusion
3
Introduction
There are two main stages of this work
– Training stage
– Separation stage
NMF is used with different types of masks to improve the separation process
– The separation process becomes faster
– NMF can be run with fewer iterations
4
Introduction
Problem formulation
– We observe a signal $x(t)$, which is the mixture of two sources $s(t)$ and $m(t)$
– In the STFT domain, $X(t, f) = S(t, f) + M(t, f)$, where $X(t, f)$ is the STFT of $x(t)$
– Assume the sources have the same phase angle as the mixed signal, so that the magnitude spectrograms satisfy $X = S + M$
5
Non-negative Matrix Factorization
Non-negative matrix factorization algorithm
Minimization problem
– Minimize the cost function C subject to elements of B, W ≥ 0
Different cost functions C of NMF
– Euclidean distance
– KL divergence
6
Non-negative Matrix Factorization
– Euclidean distance cost function
– KL divergence cost function
– Multiplicative update algorithm
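The formulas on this slide did not survive extraction. As a minimal sketch of what a multiplicative update algorithm for the KL divergence cost can look like, the following uses the standard Lee and Seung updates. The data matrix name `V`, the function names, the random initialization, and the iteration count are assumptions for illustration, not details taken from the slides.

```python
import numpy as np

def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence between nonnegative matrices V and V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))

def nmf_kl(V, n_bases, n_iter=200, eps=1e-12, seed=0):
    """Approximate V (freq x frames) as B @ W with nonnegative B and W.

    B holds the basis vectors as columns, W holds the gains.
    Standard multiplicative updates for the KL cost (assumed here).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    B = rng.random((n_freq, n_bases)) + eps   # nonnegative random init
    W = rng.random((n_bases, n_frames)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so B and W stay >= 0.
        B *= ((V / (B @ W + eps)) @ W.T) / (ones @ W.T + eps)
        W *= (B.T @ (V / (B @ W + eps))) / (B.T @ ones + eps)
    return B, W
```

Because every factor in the updates is nonnegative, the constraint on the elements of B and W holds automatically; more iterations lower the cost at the price of more computation.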
7
Non-negative Matrix Factorization
The magnitude spectrograms S and M are each decomposed by NMF
A larger number of basis vectors gives
– Lower approximation error
– A redundant set of bases
– Longer computation time
8
Signal Separation and Masking
NMF is used to decompose the magnitude spectrogram matrix X
The initial spectrogram estimates for the speech and music signals are calculated from $W_S$ and $W_M$, respectively, where $W_S$ and $W_M$ are submatrices of the matrix $W$
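The separation step can be sketched as follows, under assumptions the slide text does not spell out: the speech and music bases (`B_speech`, `B_music`, hypothetical names) come from the training stage and are held fixed, only the gain matrix `W` is updated with the KL multiplicative rule, and each initial estimate is the product of a basis matrix with its submatrix of `W`.

```python
import numpy as np

def estimate_sources(X, B_speech, B_music, n_iter=100, eps=1e-12, seed=0):
    """Initial speech/music magnitude estimates from a mixed spectrogram X.

    Assumes trained, fixed bases; only the gains W are learned here.
    """
    rng = np.random.default_rng(seed)
    B = np.hstack([B_speech, B_music])            # concatenated bases
    W = rng.random((B.shape[1], X.shape[1])) + eps
    ones = np.ones_like(X)
    for _ in range(n_iter):
        # KL multiplicative update for W with B kept fixed
        W *= (B.T @ (X / (B @ W + eps))) / (B.T @ ones + eps)
    k = B_speech.shape[1]
    W_S, W_M = W[:k], W[k:]                       # submatrices of W
    return B_speech @ W_S, B_music @ W_M
```

Keeping the bases fixed means only `W` has to converge, which is one reason a small `n_iter` may be enough when a mask is applied afterwards.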
9
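Signal Separation and Masking
Use the initial estimated spectrograms $\tilde{S}$ and $\tilde{M}$ to build a mask as follows
$H = \frac{\tilde{S}^p}{\tilde{S}^p + \tilde{M}^p}$
Source signal reconstruction
$\hat{S} = H \otimes X$, $\hat{M} = (\mathbf{1} - H) \otimes X$
where $\mathbf{1}$ is a matrix of ones, $\otimes$ is element-wise multiplication, and the powers and division in $H$ are also element-wise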
10
Signal Separation and Masking
Two specific values of p correspond to special masks
– Wiener filter (soft mask): p = 2
– Hard mask: p = ∞
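A minimal sketch of the mask family, assuming the mask is the element-wise ratio of the speech estimate raised to the power p over the sum of both estimates raised to p. The function and argument names are illustrative. With p = 2 this gives a Wiener-style soft mask, and `p = np.inf` gives a binary (hard) mask.

```python
import numpy as np

def spectral_mask(S_init, M_init, p=2.0, eps=1e-12):
    """Element-wise mask S^p / (S^p + M^p) from initial magnitude estimates."""
    if np.isinf(p):
        # Limit p -> infinity: 1 where speech dominates, else 0.
        # Exact ties are assigned to speech here (an arbitrary choice).
        return (S_init >= M_init).astype(float)
    Sp = S_init ** p
    Mp = M_init ** p
    return Sp / (Sp + Mp + eps)

def apply_mask(X, H):
    """Split the mixed magnitude spectrogram X into speech and music parts."""
    return H * X, (1.0 - H) * X

# Usage: a bin where speech is 3x stronger than music, and the reverse.
S_init = np.array([[3.0, 1.0]])
M_init = np.array([[1.0, 3.0]])
soft = spectral_mask(S_init, M_init, p=2)        # approx [[0.9, 0.1]]
hard = spectral_mask(S_init, M_init, p=np.inf)   # [[1.0, 0.0]]
```

The two masked outputs always sum to `X`, so the mixture energy is fully distributed between the two sources; larger p pushes the mask values toward 0 or 1.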
11
Signal Separation and Masking The value of the mask versus the linear ratio for different values of p
12
Experiments and Discussion
Simulation
– 16 kHz sampling rate
– Speech
  – Training speech data: 540 short utterances
  – Testing speech data: 20 utterances
– Music
  – 38 pieces for training
  – 1 piece for testing
– Hamming window: 512 points
– FFT size: 512 points
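To make the analysis settings concrete, here is a rough sketch of a magnitude spectrogram computed with a 512-point Hamming window and a 512-point FFT. The hop size of 256 samples is an assumption, since the slides do not state the frame overlap, and the function name is illustrative.

```python
import numpy as np

def magnitude_spectrogram(x, n_fft=512, hop=256):
    """Return (magnitude, phase) of the STFT of a 1-D signal x.

    hop=256 (50% overlap) is an assumed value; len(x) must be >= n_fft.
    """
    window = np.hamming(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)],
        axis=1,
    )                                         # shape: (n_fft, n_frames)
    spec = np.fft.rfft(frames, n=n_fft, axis=0)
    return np.abs(spec), np.angle(spec)       # each: (n_fft // 2 + 1, n_frames)
```

The phase is returned alongside the magnitude because the separated signals are rebuilt using the phase of the mixed signal.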
13
Experiments and Discussion Performance measurement of the separation
14
Experiments and Discussion
17
Conclusion
– The family of masks has a parameter that controls the saturation level
– The proposed algorithm gives better results and helps speed up the separation process