Published by Avis Todd. Modified over 9 years ago.
1
A Study of Single Channel Blind Source Separation and Recognition Based on Mixed-State Prediction
Reporter: Chia-Cheng Chen
Advisor: Wen-Ping Chen
Network Application Laboratory, Department of Electrical Engineering, National Kaohsiung University of Applied Sciences
2
Outline
Introduction and Motivation
Background
Research Methods
Experimental Results
Conclusion and Future Works
Research Results
3
Introduction
Applications of voiceprint recognition systems: Call routing (1997), Jupiter (1997), Let’s Go! (2002), Siri (2010), Skyvi (2011), Vlingo (2011)
4
Introduction
Current status of ecological surveys: sensor networks, wireless networks, databases, voiceprint recognition systems
Advantages: reduce the cost in human resources and time; save and share the raw data conveniently
5
Introduction: Blind Source Separation
http://metadata.froghome.org/about.php (species description data for amphibians of the Taiwan region)
6
Introduction: Blind Source Separation
7
Introduction: Voiceprint recognition
C.J. Huang, Y.J. Yang, D.X. Yang and Y.J. Chen, “Frog classification using machine learning techniques,” Expert Systems with Applications, Vol. 36, No. 2, pp. 3737-3743, 2009. (SCI)
S.C. Hsieh, W.P. Chen, W.C. Lin, F.S. Chou, and J.R. Lai, “Endpoint detection of frog croak syllables with using average energy entropy method,” Taiwan Journal of Forest Science, Vol. 27, No. 2, pp. 149-161, Jun. 2012. (EI)
W.P. Chen, S.S. Chen, C.C. Lin, Y.Z. Chen and W.C. Lin, “Automatic recognition of frog call using multi-stage average spectrum,” Computers & Mathematics with Applications, Vol. 64, No. 5, pp. 1270-1281, Sep. 2012. (SCI)
8
Introduction: Single channel source separation
M.N. Schmidt and M. Mørup, “Nonnegative matrix factor 2-D deconvolution for blind single channel source separation,” Proceedings of International Conferences Independent Component Analysis and Blind Signal Separation, Vol. 3889, pp. 700-707, Mar. 2006. (SCI)
S. Kırbız and B. Gunsel, “Perceptually weighted non-negative matrix factorization for blind single-channel music source separation,” 21st International Conference on Pattern Recognition, Nov. 2012. (EI)
9
Motivation
Automatic frog species voiceprint recognition system
Predicting the number of mixed signals
Single channel blind source separation
Users: biologists, people
11
Background
Blind Source Separation: Non-negative Matrix Factor 2-D Deconvolution
Matching: Adaptive Multi-stage Average Spectrum
Feature Extraction: Mel-frequency Cepstral Coefficients
Endpoint Detection: Time Domain, Frequency Domain
Signal Processing: Pre-emphasis, Frame, Window
12
Background: Voiceprint Recognition
Stages: Signal Processing, Syllable Segmentation, Feature Extraction, Matching
13
Signal Processing
Frog Signal, Pre-emphasis, Frame, Hamming Window
Resample: 44100 Hz
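The pre-processing chain named on this slide can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the pre-emphasis coefficient 0.97 is an assumed, conventional value that the slides do not state, while the 512-sample frame length and 50% overlap are taken from the experimental parameter slide. The function names are hypothetical, and resampling to 44100 Hz is assumed to have been done beforehand.

```python
import math


def pre_emphasis(signal, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1]; coeff=0.97 is an assumed value."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - coeff * signal[n - 1]
                          for n in range(1, len(signal))]


def hamming(length):
    """Hamming window w[n] = 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_signal(signal, frame_len=512, overlap=0.5):
    """Cut the signal into overlapping frames and window each one.

    A trailing partial frame is dropped.
    """
    hop = max(1, int(frame_len * (1 - overlap)))
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames
```

A typical call would be `frame_signal(pre_emphasis(samples))`, where `samples` is a list of floats already resampled to 44100 Hz.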
14
Syllable Segmentation: Endpoint Detection Algorithm
Energy: time domain; simple; square of the amplitude or absolute value of the amplitude; vulnerable to noise
Entropy: frequency domain; complex; noise immunity
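The time-domain energy measure described on this slide can be illustrated with a short sketch. It assumes the signal has already been cut into frames; the fixed `threshold` and the function names are hypothetical choices for illustration, not values taken from the slides.

```python
def frame_energy(frame, mode="square"):
    """Mean energy of one frame, using squared or absolute amplitude."""
    if mode == "square":
        return sum(x * x for x in frame) / len(frame)
    if mode == "absolute":
        return sum(abs(x) for x in frame) / len(frame)
    raise ValueError("mode must be 'square' or 'absolute'")


def detect_active_frames(frames, threshold, mode="square"):
    """Flag frames whose energy exceeds a fixed (hypothetical) threshold."""
    return [frame_energy(frame, mode) > threshold for frame in frames]
```

Because background noise also raises the frame energy, a fixed threshold of this kind is easily fooled in noisy recordings, which is the weakness the slide attributes to energy-based detection.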
15
Average Energy Entropy
Signal Transform: s(n): windowed signal; N: frame size; k: frequency component
Average Energy: u: the mean energy of the input signal; A(n): the amplitude of the input signal; N: total number of samples in the input signal
16
Average Energy Entropy: Probability Density Function
17
Average Energy Entropy
H': the negative entropy of each frame
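The slide text does not give the formulas, so the following is only a sketch of a common spectral-entropy construction that matches the symbols listed: the frame is transformed to the frequency domain, the power spectrum is normalized into a probability density p(k), and the negative entropy is taken as H' = Σ p(k) log p(k). How the average energy u is combined with H' in the Average Energy Entropy method is not reproduced here. A naive DFT is used only to keep the example free of external libraries, and `eps` is a hypothetical guard against log(0).

```python
import cmath
import math


def power_spectrum(frame):
    """|X(k)|^2 for k = 0..N/2, computed with a naive O(N^2) DFT."""
    n_samples = len(frame)
    spectrum = []
    for k in range(n_samples // 2 + 1):
        x_k = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                  for n in range(n_samples))
        spectrum.append(abs(x_k) ** 2)
    return spectrum


def negative_entropy(frame, eps=1e-12):
    """H' = sum_k p(k) * log p(k), with p(k) the normalized power spectrum.

    H' is close to 0 for a single dominant frequency and approaches
    -log(number of bins) for a flat, noise-like spectrum.
    """
    power = [p + eps for p in power_spectrum(frame)]
    total = sum(power)
    return sum((p / total) * math.log(p / total) for p in power)
```

Under this construction a tonal frame scores higher than a noise-like frame, which is consistent with the noise immunity the slides attribute to entropy-based detection.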
18
Endpoint Detection Algorithm
Figure panels: Signal, AEE, Absolute Energy, Square Energy
19
Feature Extraction
20
Adaptive Multi-stage Average Spectrum: Adaptive Clustering
Figure labels: Cluster A, Cluster B
21
Adaptive Multi-stage Average Spectrum: Adaptive Clustering
Figure labels: Cluster A, Cluster B
22
Adaptive Multi-stage Average Spectrum: Adaptive Clustering
23
Adaptive Multi-stage Average Spectrum: Template Training
Figure labels: Frame 1 through Frame 7; Stage 1, Stage 2, Stage 3
24
Adaptive Multi-stage Average Spectrum: Template Training
25
Adaptive Multi-stage Average Spectrum: Template Training
Minimum Cumulative Difference
26
Adaptive Multi-stage Average Spectrum: Template Matching
Minimum Cumulative Difference
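The idea of averaging frames into stages and matching by minimum cumulative difference can be sketched as below. This is an assumption-laden illustration: the slides do not state how frames are assigned to stages (an even contiguous split is assumed here, and the adaptive clustering step is not modeled), nor which distance is accumulated (a sum of absolute differences is assumed). All function names and the species labels in the usage note are hypothetical.

```python
def stage_average(frames, n_stages=3):
    """Average per-frame feature vectors within contiguous stages.

    Assumes an even contiguous split of the frames into n_stages groups.
    """
    n_frames = len(frames)
    if n_frames < n_stages:
        raise ValueError("need at least one frame per stage")
    template = []
    for s in range(n_stages):
        group = frames[s * n_frames // n_stages:(s + 1) * n_frames // n_stages]
        dim = len(group[0])
        template.append([sum(f[d] for f in group) / len(group)
                         for d in range(dim)])
    return template


def cumulative_difference(template_a, template_b):
    """Sum of absolute differences over all stages and feature dimensions."""
    return sum(abs(x - y)
               for stage_a, stage_b in zip(template_a, template_b)
               for x, y in zip(stage_a, stage_b))


def match(test_template, templates):
    """Return the label of the stored template with the smallest difference."""
    return min(templates,
               key=lambda name: cumulative_difference(test_template,
                                                      templates[name]))
```

With seven frames and three stages, this split groups the frames as 2, 2 and 3. A call such as `match(stage_average(test_frames), {"species_a": tpl_a, "species_b": tpl_b})` returns the label whose template is closest to the test syllable.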
27
Blind Source Separation: Non-negative Matrix Factor 2-D Deconvolution
α basis matrix and β coefficient matrix
Obtains the relations between time and pitch
Shift operator
Signals: V (original signal) and the reconstructed signal
28
Non-negative Matrix Factor 2-D Deconvolution
29
Non-negative Matrix Factor 2-D Deconvolution: Cost Function
Based on Euclidean distance
Based on Kullback-Leibler divergence
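The formulas themselves are not in the slide text, so the sketch below follows the commonly used formulation of NMF2D from the cited Schmidt and Mørup paper: the reconstruction sums, over time shifts τ and pitch shifts φ, the product of a basis matrix shifted down by φ rows and a coefficient matrix shifted right by τ columns. The names `W`, `H` and `approx`, and the `eps` guard, are assumptions for illustration. Only the reconstruction and the two cost functions are shown, not the update rules.

```python
import math


def reconstruct(W, H, n_rows, n_cols):
    """NMF2D reconstruction from shifted factors.

    W[tau] is an n_rows x d basis matrix for time shift tau;
    H[phi] is a d x n_cols coefficient matrix for pitch shift phi.
    Rows of W[tau] are shifted down by phi and columns of H[phi] are
    shifted right by tau, with zeros filled in.
    """
    approx = [[0.0] * n_cols for _ in range(n_rows)]
    for tau, w_tau in enumerate(W):
        for phi, h_phi in enumerate(H):
            for i in range(phi, n_rows):
                for j in range(tau, n_cols):
                    approx[i][j] += sum(w_tau[i - phi][d] * h_phi[d][j - tau]
                                        for d in range(len(h_phi)))
    return approx


def euclidean_cost(V, approx):
    """Squared Euclidean (Frobenius) distance between V and its approximation."""
    return sum((v - a) ** 2
               for row_v, row_a in zip(V, approx)
               for v, a in zip(row_v, row_a))


def kl_cost(V, approx, eps=1e-12):
    """Generalized Kullback-Leibler divergence; eps avoids log(0)."""
    return sum(v * math.log((v + eps) / (a + eps)) - v + a
               for row_v, row_a in zip(V, approx)
               for v, a in zip(row_v, row_a))
```

Both costs are zero when the reconstruction equals V, and the factorization is fitted by minimizing one of them subject to non-negativity of the factors.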
31
Research Methods
Mixed-State Prediction voiceprint recognition method
Training: mixed signal states
Testing: two-stage voiceprint recognition; Mixed-State Prediction
33
First Stage
Independent signal: Latouche's frog (MFCC)
Mixed signal: Moltrecht's green tree frog + Latouche's frog (MFCC)
Stages: Signal Processing, Syllable Segmentation, Feature Extraction, Matching
34
Mixed signal states
35
Mixed States: Average Energy
E: the average energy over the frequency components X(k); N: the length of the syllable
Figure labels: Mixed signal, Independent signal
36
Predicting the number of mixed signals
E: the mean spectral energy of the test syllable; a: the mean energy of the training data; T: the separation threshold
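The decision rule itself is not in the slide text, so the sketch below shows only one plausible reading of the three symbols: a syllable is flagged as mixed when its mean spectral energy E deviates from the training mean a by more than the fraction T. The relative-deviation form is an assumption, as are the function names; the default T = 0.3 is the separation threshold listed among the experimental parameters.

```python
def mean_spectral_energy(power_spectrum):
    """E: average of the spectral energy values |X(k)|^2 of one syllable."""
    return sum(power_spectrum) / len(power_spectrum)


def is_mixed(test_energy, training_energy, threshold=0.3):
    """HYPOTHETICAL rule: mixed if |E - a| / a exceeds the threshold T."""
    return abs(test_energy - training_energy) / training_energy > threshold
```

For example, with a training mean of 1.0, a test syllable with mean energy 1.5 would be flagged as mixed and one with 1.1 would not. This binary flag is a simplification of predicting the number of mixed signals.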
38
Experimental Results
Parameter | Value
Frame Length | 512 samples
Frame Overlapping | 50%
Window Function | Hamming Window
Frequency Bin | 512
Feature Parameters | Mel-Frequency Cepstral Coefficient
Feature Dimensions | 15
Separation Threshold | 0.3
39
Experimental Results: Recognition Experiment, Independent signals
Method | Total Syllable | Error Mixed | Correct Syllable | Accuracy(%)
DTW | 373 | 31 | 282 | 75.6%
AMSAS | 373 | 31 | 317 | 84.71%
40
Experimental Results: Recognition Experiment, Mixed signals
Method | Total Syllable | Correct Syllable | Accuracy(%)
DTW | 269 | 183 | 68.02%
AMSAS | 269 | 211 | 78.43%

Total Syllable | Error Mixed | Correct Syllable | Accuracy(%)
167 | 36 | 131 | 78.44%
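The accuracy column on this slide appears to be the number of correct syllables divided by the total number of syllables. A small sketch of that arithmetic, using the counts from the mixed-signal tables (the function name is hypothetical):

```python
def accuracy(correct, total):
    """Recognition accuracy in percent: correct syllables over total syllables."""
    return 100.0 * correct / total


# Counts taken from the mixed-signal tables on this slide.
dtw = accuracy(183, 269)
amsas = accuracy(211, 269)
second_table = accuracy(131, 167)
```

These evaluate to roughly 68.03, 78.44 and 78.44 percent, agreeing with the reported figures to within the last printed digit.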
41
Experimental Results
42
Experimental Results
43
Conclusion and Future Works
The proposed method:
Improves the mixed signal recognition rate
Predicts the number of mixed signals
44
Conclusion and Future Works: Future Works
Study de-noising methods
Collect more features that distinguish independent and mixed signals
Recognize mixed signals within the same species
Collect sounds from more species to improve the system performance
Adopt Support Vector Machines (SVM), Neural Networks…
45
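Research Results
Competition: 7th Digital Signal Processing Creative Design Competition, finalist, with the Frog Species Voiceprint Recognition System
Project assistance:
Form | Heading
NSC 100-2221-E-151-0117 | A Study of Dynamic Wavelength and Bandwidth Allocation and Quality of Service in WDM-EPON
NSC 1002101010508-080702G1 | A Study of Ecological Informatics Technology Applied to Forest Management
NSC 1002101050511-060101G4 | Application and Study of Wireless Sensor Networks in Forest Disaster Monitoring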
46
Thank you for your attention!