
1 A Study of Single Channel Blind Source Separation and Recognition Based on Mixed-State Prediction. Reporter: Chia-Cheng Chen. Advisor: Wen-Ping Chen. Network Application Laboratory, Department of Electrical Engineering, National Kaohsiung University of Applied Sciences.

2 Outline: Introduction and Motivation; Background; Research Methods; Experimental Results; Conclusion and Future Works; Research Results.

3 Introduction: Applications of voiceprint recognition systems: Call routing (1997), Jupiter (1997), Let’s Go! (2002), Siri (2010), Skyvi (2011), Vlingo (2011).

4 Introduction: Current status of ecological surveys: sensor networks, wireless networks, database, voiceprint recognition system. Advantages: reduces the cost of human resources and time; raw data can be saved and shared conveniently.

5 Introduction: Blind Source Separation. Source: http://metadata.froghome.org/about.php (species description data for amphibians in the Taiwan area).

6 Introduction: Blind Source Separation (figure).

7 Introduction: Voiceprint recognition.
C.J. Huang, Y.J. Yang, D.X. Yang, and Y.J. Chen, “Frog classification using machine learning techniques,” Expert Systems with Applications, Vol. 36, No. 2, pp. 3737-3743, 2009. (SCI)
S.C. Hsieh, W.P. Chen, W.C. Lin, F.S. Chou, and J.R. Lai, “Endpoint detection of frog croak syllables with using average energy entropy method,” Taiwan Journal of Forest Science, Vol. 27, No. 2, pp. 149-161, Jun. 2012. (EI)
W.P. Chen, S.S. Chen, C.C. Lin, Y.Z. Chen, and W.C. Lin, “Automatic recognition of frog call using multi-stage average spectrum,” Computers & Mathematics with Applications, Vol. 64, No. 5, pp. 1270-1281, Sep. 2012. (SCI)

8 Introduction: Single channel source separation.
M.N. Schmidt and M. Mørup, “Nonnegative matrix factor 2-D deconvolution for blind single channel source separation,” Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation, Vol. 3889, pp. 700-707, Mar. 2006. (SCI)
S. Kırbız and B. Gunsel, “Perceptually weighted non-negative matrix factorization for blind single-channel music source separation,” 21st International Conference on Pattern Recognition, Nov. 2012. (EI)

9 Motivation: Automatic frog species voiceprint recognition system; predicting the number of mixed signals; single channel blind source separation. (Users: biologists, general public.)

10 Outline: Introduction and Motivation; Background; Research Methods; Experimental Results; Conclusion and Future Works; Research Results.

11 Background: Blind Source Separation (Non-negative Matrix Factor 2-D Deconvolution); Matching (Adaptive Multi-stage Average Spectrum); Feature Extraction (Mel-frequency Cepstrum Coefficients); Endpoint Detection (time domain, frequency domain); Signal Processing (pre-emphasis, frame, window).

12 Background: Voiceprint recognition pipeline: Signal Processing → Syllable Segmentation → Feature Extraction → Matching.

13 Signal Processing: Frog signal (resampled to 44100 Hz) → Pre-emphasis → Frame → Hamming Window.
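A minimal NumPy sketch of this front end is given below. The 512-sample frames and 50% overlap come from the parameter slide later in the deck; the pre-emphasis coefficient of 0.95 is an assumed typical value rather than a figure from the thesis.

```python
import numpy as np

def preprocess(signal, frame_len=512, overlap=0.5, pre_coef=0.95):
    """Pre-emphasis -> framing -> Hamming window (sketch).

    frame_len and overlap follow the parameter slide (512 samples, 50%);
    pre_coef = 0.95 is an assumed typical value. Assumes the signal is
    at least one frame long.
    """
    # Pre-emphasis: y[n] = x[n] - a * x[n-1], boosting high frequencies
    emphasized = np.append(signal[0], signal[1:] - pre_coef * signal[:-1])

    # Split into overlapping frames
    hop = int(frame_len * (1.0 - overlap))
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    frames = np.stack([emphasized[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])

    # Apply a Hamming window to each frame
    return frames * np.hamming(frame_len)
```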

14 Syllable Segmentation: Endpoint detection algorithms.
Energy: time domain; simple (square of the amplitude or absolute value of the amplitude); vulnerable to noise.
Entropy: frequency domain; more complex; better noise immunity.

15 Average Energy Entropy: Signal transform and average energy (equations shown on slide). Notation: s(n): windowed signal; N: frame size; k: frequency component; u: the mean energy of the input signal; A(n): the amplitude of the input signal; N: the total number of input samples.

16 Average Energy Entropy: Probability density function (equation shown on slide).

17 Average Energy Entropy: H': the negative entropy of each frame.
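The slides give only the symbol definitions for the average energy entropy method, so the following is a loose sketch of an entropy-style endpoint score in that spirit: each windowed frame is transformed to the frequency domain, its energy spectrum is normalized into a probability density, and the negative entropy H' is used as the frame score. The thresholding rule at the end is an assumption, not the thesis's exact decision rule.

```python
import numpy as np

def average_energy_entropy(frames, eps=1e-12):
    """Frame-level negative-entropy scores (loose sketch of the AEE idea).

    frames: array of shape (n_frames, frame_len), already windowed.
    """
    spectra = np.abs(np.fft.rfft(frames, axis=1)) ** 2          # energy per frequency bin
    pdf = spectra / (spectra.sum(axis=1, keepdims=True) + eps)  # probability density over bins
    entropy = -np.sum(pdf * np.log(pdf + eps), axis=1)          # spectral entropy per frame
    return -entropy                                              # H': negative entropy

def detect_endpoints(frames, threshold):
    """Mark frames as active where the score exceeds a threshold (assumed rule)."""
    scores = average_energy_entropy(frames)
    return scores > threshold
```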

18 Endpoint Detection Algorithm: comparison of the raw signal, AEE, absolute energy, and square energy (figure).

19 Feature Extraction (figure).
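The features used throughout the deck are 15-dimensional Mel-frequency cepstral coefficients (per the parameter slide). As an illustrative sketch only, an off-the-shelf extractor such as librosa can produce comparable features; the exact filter-bank settings of the thesis are not given in the transcript.

```python
import librosa

def extract_mfcc(syllable, sr=44100, n_mfcc=15, frame_len=512):
    """15-dimensional MFCCs per frame (sketch; exact settings are assumed).

    frame_len = 512 samples and 50% overlap follow the parameter slide.
    """
    return librosa.feature.mfcc(y=syllable, sr=sr, n_mfcc=n_mfcc,
                                n_fft=frame_len, hop_length=frame_len // 2)
```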

20 Adaptive Multi-stage Average Spectrum: Adaptive clustering (Cluster A, Cluster B).

21 Adaptive Multi-stage Average Spectrum: Adaptive clustering (Cluster A, Cluster B), continued.

22 Adaptive Multi-stage Average Spectrum: Adaptive clustering, continued.

23 Adaptive Multi-stage Average Spectrum: Template training (Frames 1-7 grouped into Stages 1-3).

24 Adaptive Multi-stage Average Spectrum: Template training.

25 Adaptive Multi-stage Average Spectrum: Template training (minimum cumulative difference).

26 Adaptive Multi-stage Average Spectrum: Template matching (minimum cumulative difference).
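The transcript names the steps of this matcher (adaptive clustering into stages, per-stage average spectra, and matching by minimum cumulative difference) but not the equations, so the sketch below only illustrates the idea: each species template is a sequence of stage-average spectra, the test syllable's stage spectra are compared stage by stage, and the species with the smallest cumulative difference is returned. The equal-sized stage split and the Euclidean distance are assumptions.

```python
import numpy as np

def stage_average_spectra(frame_spectra, n_stages=3):
    """Average the per-frame spectra within each stage (equal split is an assumption)."""
    stages = np.array_split(frame_spectra, n_stages, axis=0)
    return np.stack([s.mean(axis=0) for s in stages])

def match_species(test_stages, templates):
    """Return the template with the minimum cumulative spectral difference.

    test_stages: array (n_stages, n_bins) for the test syllable.
    templates:   dict mapping species name -> array (n_stages, n_bins).
    """
    def cumulative_diff(a, b):
        # Sum of per-stage Euclidean distances (assumed distance measure)
        return float(np.sum(np.linalg.norm(a - b, axis=1)))

    return min(templates, key=lambda name: cumulative_diff(test_stages, templates[name]))
```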

27 Blind Source Separation: Non-negative Matrix Factor 2-D Deconvolution. α: basis matrix; β: coefficient matrix; the factorization captures the relations between time and pitch. Shift operators displace the basis along frequency (pitch) and the coefficients along time. V: original signal (spectrogram); the reconstructed signal is the sum of the shifted α-β products.

28 Non-negative Matrix Factor 2-D Deconvolution (figure).

29 Non-negative Matrix Factor 2-D Deconvolution: Cost function based on Euclidean distance or on Kullback-Leibler divergence.
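As a hedged sketch of the NMF2D model cited from Schmidt and Mørup: the spectrogram V is approximated by summing frequency-shifted basis matrices multiplied by time-shifted coefficient matrices, and the approximation error is measured with either a Euclidean or a Kullback-Leibler cost. The tensor shapes and names below are illustrative, and the multiplicative update rules that actually minimize these costs are omitted.

```python
import numpy as np

def shift_down(A, phi):
    """Shift the rows (frequency axis) of A down by phi, zero-padding at the top."""
    if phi == 0:
        return A
    out = np.zeros_like(A)
    out[phi:, :] = A[:-phi, :]
    return out

def shift_right(A, tau):
    """Shift the columns (time axis) of A right by tau, zero-padding at the left."""
    if tau == 0:
        return A
    out = np.zeros_like(A)
    out[:, tau:] = A[:, :-tau]
    return out

def nmf2d_reconstruct(W, H):
    """Lambda = sum over tau, phi of shift_down^phi(W[tau]) @ shift_right^tau(H[phi]).

    W: basis tensor of shape (n_tau, n_freq_bins, n_components)
    H: coefficient tensor of shape (n_phi, n_components, n_time_frames)
    """
    n_tau = W.shape[0]
    n_phi = H.shape[0]
    Lam = np.zeros((W.shape[1], H.shape[2]))
    for tau in range(n_tau):
        for phi in range(n_phi):
            Lam += shift_down(W[tau], phi) @ shift_right(H[phi], tau)
    return Lam

def euclidean_cost(V, Lam):
    """Cost based on Euclidean distance."""
    return 0.5 * np.sum((V - Lam) ** 2)

def kl_cost(V, Lam, eps=1e-12):
    """Cost based on (generalized) Kullback-Leibler divergence."""
    return np.sum(V * np.log((V + eps) / (Lam + eps)) - V + Lam)
```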

30 Outline: Introduction and Motivation; Background; Research Methods; Experimental Results; Conclusion and Future Works; Research Results.

31 Research Methods: Mixed-State Prediction voiceprint recognition method. Training: mixed signal states. Testing: two-stage voiceprint recognition with mixed-state prediction.

32 (figure only).

33 First Stage: An independent signal (Latouche's frog, MFCC) and a mixed signal (Moltrecht's green tree frog + Latouche's frog, MFCC) each pass through Signal Processing → Syllable Segmentation → Feature Extraction → Matching.

34 Mixed signal states (figure).

35 Mixed States: Average energy. E: the average energy of the frequency spectrum X(k); N: the length of the syllable. (Comparison of a mixed signal and an independent signal.)

36 Predicting the number of mixed signals. E: the mean spectral energy of the test syllable; a: the mean energy of the training data; T: the separation threshold.
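The slide defines E, a, and T but not the decision rule itself, so the sketch below assumes a simple relative-energy test: a syllable whose mean spectral energy exceeds the training mean by more than the threshold fraction (0.3 in the parameter slide) is flagged as mixed and sent on to blind source separation. This rule is an assumption about how the symbols combine, not a quotation of the thesis.

```python
import numpy as np

def mean_spectral_energy(syllable_frames):
    """E: mean spectral energy of a syllable over all frames and frequency bins."""
    spectra = np.abs(np.fft.rfft(syllable_frames, axis=1)) ** 2
    return float(spectra.mean())

def predict_mixed(E, a, T=0.3):
    """Assumed rule: flag the syllable as mixed if E exceeds a by more than fraction T."""
    return (E - a) / a > T
```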

37 Outline: Introduction and Motivation; Background; Research Methods; Experimental Results; Conclusion and Future Works; Research Results.

38 Experimental Results: parameter settings.
Frame Length: 512 samples
Frame Overlapping: 50%
Window Function: Hamming window
Frequency Bins: 512
Feature Parameters: Mel-Frequency Cepstral Coefficients
Feature Dimensions: 15
Separation Threshold: 0.3

39 Experimental Results: recognition experiment, independent signals.
Method | Total Syllables | Error Mixed | Correct Syllables | Accuracy (%)
DTW | 373 | 31 | 282 | 75.6
AMSAS | 373 | 31 | 317 | 84.71

40 Experimental Results: recognition experiment, mixed signals.
Method | Total Syllables | Correct Syllables | Accuracy (%)
DTW | 269 | 183 | 68.02
AMSAS | 269 | 211 | 78.43
Total Syllables | Error Mixed | Correct Syllables | Accuracy (%)
167 | 36 | 131 | 78.44

41 Experimental Results (figure).

42 Experimental Results (figure).

43 Conclusion and Future Works: The proposed method improves the recognition rate for mixed signals and introduces a method for predicting the number of mixed signals.

44 Conclusion and Future Works: Future work: study de-noising methods; collect more features that distinguish independent and mixed signals; recognize mixed signals within the same species; collect the sounds of more species and improve system performance; adopt Support Vector Machines (SVM), neural networks, etc.

45 Research Results: Competition: finalist in the 7th Digital Signal Processing Creative Design Contest with the Frog Species Voiceprint Recognition System. Supporting projects:
NSC 100-2221-E-151-0117: A study of dynamic wavelength and bandwidth allocation and quality of service in WDM-EPON
NSC 1002101010508-080702G1: A study of eco-informatics techniques applied to forest management
NSC 1002101050511-060101G4: Application and study of wireless sensor networks for forest disaster monitoring

46 Thank you for your attention!

