A Single-Endpoint DTW Algorithm for Keyword Spotting Yong-Sun Choi and Soo-Young Lee

1 A Single-Endpoint DTW Algorithm for Keyword Spotting Yong-Sun Choi and Soo-Young Lee
Brain Science Research Center and Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology

2 Contents
Keyword Spotting: Meaning & Necessity; Problems
Dynamic Time Warping (DTW): Advantages of DTW; Some conventional types & proposed DTW type
Experimental Results: Verification of proposed DTW performance; Standard threshold setting; Results under various conditions
Conclusions

3 Keyword Spotting
Meaning: Detection of pre-defined keywords in continuous speech
Example) Keywords: ‘open’, ‘window’ Input: “um…okay, uh… please open the…uh…window”
Necessity: People may say OOV (out-of-vocabulary) words and sometimes stammer, but the machine needs only a few specific words for recognition

4 Problems & Goal
Difficulties of the process: end-point detection (EPD) of the speech segment; rejection of OOVs
Difficulties of implementation: heavy computational load; complex algorithm; hard to build a real hardware system
Goal: a simple & fast algorithm

5 DTW for Keyword Spotting
Hidden Markov Model (HMM)
A statistical model: needs a large amount of training data
Complex algorithm: hard to implement in hardware
Many parameters: can cause memory problems
Dynamic Time Warping (DTW)
Advantages: small amount of training data; simple algorithm (addition & multiplication); small amount of stored data
Weak points: needs an EPD process; many calculations

6 General DTW Process
Both end points are known
Repetition of searches: finding corresponding frames
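
The general process on this slide can be illustrated with a minimal sketch of textbook DTW between a reference pattern and an already end-pointed test segment. The three-way step pattern, the Euclidean frame distance and the normalization by the sum of the two lengths are assumptions chosen for illustration; the slides do not state which variant was used.

```python
import math

def dtw_distance(ref, test):
    """Textbook DTW between two sequences of feature vectors.

    Both end points are assumed known: the path must run from the
    first frames of ref/test to their last frames.
    """
    n, m = len(ref), len(test)
    # acc[i][j]: minimum accumulated distance aligning ref[:i+1] with test[:j+1]
    acc = [[math.inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(ref[i], test[j])  # local frame distance (assumed Euclidean)
            if i == 0 and j == 0:
                acc[i][j] = d
                continue
            best = math.inf
            if i > 0:
                best = min(best, acc[i - 1][j])
            if j > 0:
                best = min(best, acc[i][j - 1])
            if i > 0 and j > 0:
                best = min(best, acc[i - 1][j - 1])
            acc[i][j] = d + best
    # Length normalization (an assumption) so patterns of different sizes compare
    return acc[n - 1][m - 1] / (n + m)

ref = [(0.0,), (1.0,), (2.0,)]
stretched = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]
print(dtw_distance(ref, stretched))  # a time-stretched copy aligns with zero cost
```

Every cell of the n-by-m grid is visited, which is the "many calculations" weak point listed on slide 5.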

7 Advanced DTW
Myers, Rabiner and Rosenberg
No EPD process
Series of small-area searches: global search within one area; the next area is set around the best match point of the local area
Reduces the amount of calculation, but it is still large
Tested in isolated word recognition

8 Proposal – Shape & Weights
No EPD process
Only one path: select the best match point and search again from that point; fewer computations
Modified weights: to compensate for weight-sum differences; for search and for distance accumulation

9 Proposal – End Point
Small search area: successive local searches; the search starts at one point
End condition: when the point is on the last frame of the reference pattern; the end point is set automatically
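
The single-path idea of slides 8 and 9 can be sketched roughly as follows. This is not the authors' exact algorithm: the step set, the weights in `STEPS`, the tie-breaking rule and the normalization by the weight sum are all hypothetical choices, since the slides give only the outline (one path, repeated local search from the best match point, stop on the last reference frame).

```python
import math

# Hypothetical step set: (reference step, input step, accumulation weight).
STEPS = [(1, 1, 2.0), (1, 0, 1.0), (0, 1, 1.0)]

def single_path_dtw(ref, frames, start):
    """Follow one greedy path starting at frames[start].

    Returns (normalized distance, index of the input frame where the path
    reached the last reference frame). No end-point detection is needed:
    the end point falls out of the search itself.
    """
    i, j = 0, start
    total = math.dist(ref[0], frames[j])
    weight_sum = 1.0
    while i < len(ref) - 1:  # end condition: last frame of the reference
        best = None
        for di, dj, w in STEPS:
            ni, nj = i + di, j + dj
            if nj >= len(frames):
                continue  # cannot move past the end of the input
            d = math.dist(ref[ni], frames[nj])
            if best is None or d < best[0]:  # ties keep the earlier step
                best = (d, ni, nj, w)
        d, i, j, w = best
        total += w * d
        weight_sum += w  # dividing by this compensates weight-sum differences
    return total / weight_sum, j

ref = [(0.0,), (1.0,), (2.0,)]
frames = [(5.0,), (0.0,), (1.0,), (2.0,), (5.0,)]
print(single_path_dtw(ref, frames, start=1))  # (0.0, 3)
```

Only three local distances are evaluated per step, so the cost grows with the path length rather than with the full grid. In a spotting setting one would run this from candidate start frames and compare each normalized distance against a per-reference threshold.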

10 Proposal – Distance
Modified distance: uses the difference between pattern lengths, since patterns of the same word have similar lengths
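
One way to fold the length difference into the score is shown below, purely as an illustration. The multiplicative form and the `alpha` parameter are assumptions; the slide says only that the distance is modified using the difference of pattern lengths.

```python
def length_adjusted_distance(distance, ref_len, matched_len, alpha=1.0):
    """Penalize a match whose length differs from the reference length.

    alpha (hypothetical) controls how strongly the relative length
    difference inflates the DTW distance.
    """
    if ref_len <= 0:
        raise ValueError("ref_len must be positive")
    relative_diff = abs(matched_len - ref_len) / ref_len
    return distance * (1.0 + alpha * relative_diff)

print(length_adjusted_distance(2.0, ref_len=10, matched_len=15))  # 3.0
```

A candidate segment that is much shorter or longer than the reference is pushed toward rejection, while a segment of equal length keeps its original distance.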

11 DTW – Computation Loads
3 types

12 Database & EX-SET
DB: RoadRally
For keyword spotting; based on telephone channel
Usage: 11 keywords (434 occurrences in total); read speech from 40 male speakers (47 min. in total) in Stonehenge
Set construction: 4 subsets (about 108 keywords per set); 3 sets for training, 1 set for testing; 2 reference patterns per keyword per set

13 Verification Result
Isolated word recognition: 3 sets for training, 1 set for testing
Recognition rate (%) by test set, General DTW / Proposed DTW
Set 1: 96.3 / 98.2
Set 2: 100.0 / 99.1
Set 3: 95.4
Set 4: 97.2
Avg.: 97.5

14 Experimental Setup
Assumption: any frame can be the last frame of a keyword
Threshold: used to reject OOV; 1 threshold per reference; standard threshold: no false alarm in the training set
Result presentation: ROC (Receiver Operating Characteristic); X-axis: false alarms / hour / keyword; Y-axis: recognition rate
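
The threshold rule and the two ROC axes can be written down as a short sketch. The function names, the `margin` parameter and the convention that a segment is accepted when its distance is at or below the threshold are assumptions for illustration.

```python
def standard_threshold(false_alarm_distances, margin=1e-9):
    """Per-reference threshold that produces no false alarm on the training set.

    false_alarm_distances: distances this reference obtained on training
    segments that are NOT its keyword. A segment is accepted when its
    distance is <= the threshold (assumed convention).
    """
    if not false_alarm_distances:
        return float("inf")
    return min(false_alarm_distances) - margin

def false_alarms_per_hour_per_keyword(n_false_alarms, duration_minutes, n_keywords):
    """X-axis of the ROC plot."""
    hours = duration_minutes / 60.0
    return n_false_alarms / (hours * n_keywords)

def recognition_rate(n_detected, n_total):
    """Y-axis of the ROC plot, in percent."""
    return 100.0 * n_detected / n_total

threshold = standard_threshold([3.2, 1.5, 2.0])
print(threshold < 1.5)                               # True
print(false_alarms_per_hour_per_keyword(11, 60, 11)) # 1.0
print(round(recognition_rate(368, 434), 1))          # 84.8
```

Loosening each threshold above this standard value trades extra false alarms for a higher recognition rate, which traces out the ROC curve.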

15 Threshold Setting & Recognition Rate of Training Set
Training set = Test set (no false alarm)
Keyword      Right  Total  %
Mountain     21     40     52.5
Secondary    38     40     95.0
Middleton    27     37     73.0
Boonsboro    32     39     82.1
Conway       33     40     82.5
Thicket      30     39     77.0
Primary      34     40     85.0
Minus        25     39     64.1
Interstate   37     40     92.5
Waterloo     35     40     87.5
Retrace      36     40     90.0
Total        368    434    84.8

16 Result – DTW & HMM ROC Curve

17 Changing Conditions
No. of Keywords; No. of References

18 Conclusion
Proposed DTW
Advantages: simple structure: addition & multiplication (good for hardware); no EPD processing; very small computation load; small amount of stored data: small memory; only keyword information is needed
Good performance
Keyword Spotting
Better than HMM when only a small amount of training data is available

