Emad M. Grais Hakan Erdogan


1 SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS
Emad M. Grais, Hakan Erdogan. 17th International Conference on Digital Signal Processing, 2011. Jain-De Lee

2 Outline
Introduction
Non-negative Matrix Factorization
Signal Separation and Masking
Experiments and Discussion
Conclusion

3 Introduction
This work has two main stages:
Training stage
Separation stage
NMF is used with different types of masks to improve the separation process.
The masks also make the separation process faster, because NMF can be run with fewer iterations.

4 Introduction
Problem formulation
We observe a signal x(t), which is the mixture of two sources, s(t) and m(t).
Let X(t, f) be the STFT of x(t).
Assume the sources have the same phase angle as the mixture. The magnitude spectrograms then satisfy X = S + M.
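The same-phase assumption can be checked numerically for a single time-frequency bin. The sketch below is illustrative only: the phase angle, the magnitudes, and the names `s_bin`, `m_bin`, and `m_other` are made up for the example and do not come from the slides.

```python
import cmath

# Hypothetical single STFT bin: both sources share the phase angle theta.
theta = 0.7                      # common phase angle in radians (assumed value)
s_bin = cmath.rect(2.0, theta)   # speech bin with magnitude 2.0
m_bin = cmath.rect(3.0, theta)   # music bin with magnitude 3.0

x_bin = s_bin + m_bin            # the STFT is linear, so complex bins add

# With identical phases the magnitudes add as well: |X| = |S| + |M|.
magnitude_sum = abs(s_bin) + abs(m_bin)

# If the phases differ, |X| is smaller than |S| + |M| (triangle inequality).
m_other = cmath.rect(3.0, theta + 1.0)
x_other = s_bin + m_other
```

When the phases differ, X = S + M holds only approximately for the magnitudes, which is why it is stated as an assumption.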

5 Non-negative Matrix Factorization
Non-negative matrix factorization algorithm
Minimization problem, subject to the elements of B, W ≥ 0
Different cost functions C of NMF:
Euclidean distance
KL divergence
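A minimal sketch of NMF under the Euclidean cost, assuming the standard Lee-Seung multiplicative update rules (the slide does not say which update rules are used). The input matrix name `V`, the helper names, and the toy matrix are illustrative. Plain Python lists keep the example self-contained; a practical implementation would use an array library.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def euclidean_cost(V, B, W):
    """Squared Euclidean distance between V and the product BW."""
    approx = matmul(B, W)
    return sum((v - a) ** 2 for vr, ar in zip(V, approx) for v, a in zip(vr, ar))

def nmf(V, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix V into bases B and weights W, V ~ BW."""
    rng = random.Random(seed)
    # Positive random initialization keeps every later update nonnegative.
    B = [[rng.random() + 0.1 for _ in range(n_bases)] for _ in V]
    W = [[rng.random() + 0.1 for _ in V[0]] for _ in range(n_bases)]
    for _ in range(n_iter):
        # W <- W * (B^T V) / (B^T B W), element-wise
        Bt = transpose(B)
        num, den = matmul(Bt, V), matmul(matmul(Bt, B), W)
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
        # B <- B * (V W^T) / (B W W^T), element-wise
        Wt = transpose(W)
        num, den = matmul(V, Wt), matmul(B, matmul(W, Wt))
        B = [[b * n / (d + eps) for b, n, d in zip(br, nr, dr)]
             for br, nr, dr in zip(B, num, den)]
    return B, W

V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0]]  # toy nonnegative matrix
B, W = nmf(V, n_bases=2)
```

Because the updates only multiply by nonnegative ratios, the constraint on the elements of B and W is satisfied without any explicit projection step.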

6 Non-negative Matrix Factorization
The magnitude spectrograms S and M are calculated by NMF.
A larger number of basis vectors gives:
Lower approximation error
A redundant set of bases
More computation time

7 Signal Separation and Masking
NMF is used to decompose the magnitude spectrogram matrix X.
The initial spectrogram estimates for the speech and music signals are calculated from WS and WM, respectively, which are submatrices of the matrix W.
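A sketch of how the initial estimates might be formed. It assumes that the trained bases are concatenated column-wise as B = [B_speech B_music] and held fixed while only W is updated, using the multiplicative update for the Euclidean cost. The names `B_speech`, `B_music`, `solve_weights`, and all numbers are hypothetical.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve_weights(X, B, n_iter=500, seed=0, eps=1e-12):
    """Update only W so that X ~ BW, keeping the trained bases B fixed."""
    rng = random.Random(seed)
    W = [[rng.random() + 0.1 for _ in X[0]] for _ in B[0]]
    Bt = transpose(B)
    BtX, BtB = matmul(Bt, X), matmul(Bt, B)   # constant because B is fixed
    for _ in range(n_iter):
        den = matmul(BtB, W)
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, BtX, den)]
    return W

# Toy trained bases, one column per basis vector (values are made up).
B_speech = [[1.0], [0.0], [1.0]]
B_music = [[0.0], [1.0], [1.0]]
B = [bs + bm for bs, bm in zip(B_speech, B_music)]   # B = [B_speech  B_music]

# Toy mixture magnitude spectrogram (3 frequency bins x 2 frames).
X = [[2.0, 1.0], [1.0, 3.0], [3.0, 4.0]]

W = solve_weights(X, B)
n_s = len(B_speech[0])
W_S, W_M = W[:n_s], W[n_s:]          # rows of W belonging to each source

S_init = matmul(B_speech, W_S)       # initial speech spectrogram estimate
M_init = matmul(B_music, W_M)        # initial music spectrogram estimate
```

In this toy case X lies exactly in the span of the bases, so the two estimates sum to X. With real spectrograms the sum only approximates X.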

8 Signal Separation and Masking
The initial estimated spectrograms are used to build a mask.
The source signals are then reconstructed by applying the mask to the mixture spectrogram, where 1 is a matrix of ones and the product is element-wise multiplication.
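A sketch of the masking step, assuming the mask has the element-wise form H = S^p / (S^p + M^p), where S and M here stand for the initial estimates, and that the sources are recovered as H times X and (1 - H) times X. The slide's own formula is not reproduced in this transcript, so this form is an assumption. The function names and numbers are hypothetical.

```python
def build_mask(S_init, M_init, p=2.0, eps=1e-12):
    """Element-wise mask H = S^p / (S^p + M^p) (assumed form)."""
    return [[s ** p / (s ** p + m ** p + eps) for s, m in zip(sr, mr)]
            for sr, mr in zip(S_init, M_init)]

def apply_mask(H, X):
    """Return (H * X, (1 - H) * X) using element-wise products."""
    S_hat = [[h * x for h, x in zip(hr, xr)] for hr, xr in zip(H, X)]
    M_hat = [[(1.0 - h) * x for h, x in zip(hr, xr)] for hr, xr in zip(H, X)]
    return S_hat, M_hat

# Toy initial estimates and mixture spectrogram (made-up values).
S_init = [[2.0, 1.0], [0.5, 3.0]]
M_init = [[1.0, 1.0], [1.5, 0.0]]
X = [[3.2, 1.9], [2.1, 2.8]]

H = build_mask(S_init, M_init, p=2.0)
S_hat, M_hat = apply_mask(H, X)
```

Because the two masked estimates use H and 1 - H, they sum to the mixture spectrogram X in every bin, which the initial NMF estimates generally do not.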

9 Signal Separation and Masking
The value of the mask versus the linear ratio for different values of p

10 Signal Separation and Masking
Two specific values of p correspond to special masks:
Wiener filter (soft mask)
Hard mask
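The dependence of the mask on p can be sketched as a function of the linear ratio r = S/M. This assumes the mask form H = S^p / (S^p + M^p), which reduces to r^p / (r^p + 1). Under that assumption, p = 2 has the form of a Wiener filter, S^2 / (S^2 + M^2), and the limit p → ∞ gives a hard (binary) mask. The specific p values are inferred from the assumed formula and do not appear in this transcript.

```python
import math

def mask_value(ratio, p):
    """Mask value as a function of the linear ratio r = S/M: r^p / (r^p + 1)."""
    if math.isinf(p):
        # Limit p -> infinity: the mask saturates to a binary decision.
        if ratio > 1.0:
            return 1.0
        return 0.5 if ratio == 1.0 else 0.0
    rp = ratio ** p
    return rp / (rp + 1.0)

ratios = [0.25, 0.5, 1.0, 2.0, 4.0]
for p in (1, 2, 4, math.inf):
    # Larger p pushes the values toward 0 or 1 (stronger saturation).
    print(p, [round(mask_value(r, p), 3) for r in ratios])
```

Every curve passes through 0.5 at r = 1, and increasing p only changes how quickly the mask saturates on either side of that point.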

11 Experiments and Discussion
Simulation: 16 kHz sampling rate
Speech: 540 short utterances for training, 20 utterances for testing
Music: 38 pieces for training, one piece for testing
Hamming window: 512 points
FFT size: 512 points
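The analysis settings imply a few derived quantities, worked out below. The sampling rate, window length, and FFT size come from the slide. The symmetric Hamming formula and the count of non-redundant bins are standard conventions assumed here, and the hop size is not given, so it is left out.

```python
import math

SAMPLE_RATE_HZ = 16000   # from the slide
WINDOW_LENGTH = 512      # Hamming window length in samples, from the slide
FFT_SIZE = 512           # from the slide

frame_duration_ms = 1000.0 * WINDOW_LENGTH / SAMPLE_RATE_HZ
bin_spacing_hz = SAMPLE_RATE_HZ / FFT_SIZE
n_unique_bins = FFT_SIZE // 2 + 1   # non-redundant bins for a real-valued signal

# Symmetric Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1)).
hamming = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (WINDOW_LENGTH - 1))
           for n in range(WINDOW_LENGTH)]
```

Each frame therefore covers 32 ms of audio, and the frequency bins are spaced 31.25 Hz apart.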

16 Conclusion
The family of masks has a parameter that controls the saturation level.
The proposed algorithm gives better results and helps speed up the separation process.
