
A Single End-Point DTW Algorithm for Keyword Spotting (핵심어 검출을 위한 단일 끝점 DTW 알고리즘)



Presentation Transcript


  1. A Single End-Point DTW Algorithm for Keyword Spotting • Yong-Sun Choi and Soo-Young Lee • Brain Science Research Center and Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology

  2. Contents • Keyword Spotting • Meaning & necessity • Problems • Dynamic Time Warping (DTW) • Advantages of DTW • Conventional DTW types & the proposed DTW type • Experimental Results • Verification of the proposed DTW's performance • Standard threshold setting • Results under various conditions • Conclusions

  3. Keyword Spotting • Meaning • Detection of pre-defined keywords in continuous speech • Example • Keywords: ‘open’, ‘window’ • Input: “um… okay, uh… please open the… uh… window” • Necessity • Humans may utter OOV (Out-Of-Vocabulary) words and sometimes stammer • But the machine only needs a few specific words for recognition

  4. Problems & Goal • Difficulties of the process • End-point detection (EPD) of the speech segment • Rejection of OOVs • Difficulties of implementation • Heavy computational load • Complex algorithm • Hard to build a real hardware system • Goal • A simple & fast algorithm

  5. DTW for Keyword Spotting • Hidden Markov Model (HMM) • A statistical model: needs a large amount of training data • Complex algorithm: hard to implement in hardware • Many parameters: can cause memory problems • Dynamic Time Warping (DTW) • Advantages • Small amount of training data • Simple algorithm (addition & multiplication only) • Small amount of stored data • Weak points • Needs an EPD process and many calculations

  6. General DTW Process • Both end points are known • Repetition of searches • Finding the corresponding frames (a minimal sketch follows below)
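For reference, a minimal sketch of the conventional DTW this slide summarises: both end points are fixed and dynamic programming searches the whole distance grid for the corresponding frames. The Euclidean local distance, the symmetric step pattern, and all names are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def dtw_distance(ref, test):
    """Conventional DTW with both end points known.

    ref, test: 2-D arrays (frames x feature dims), e.g. MFCC frames.
    Returns the normalised accumulated distance of the best alignment
    between the two complete patterns.
    """
    n, m = len(ref), len(test)
    # Local frame-to-frame distances (Euclidean; an assumption).
    local = np.linalg.norm(ref[:, None, :] - test[None, :, :], axis=-1)

    acc = np.full((n, m), np.inf)
    acc[0, 0] = local[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            prev = min(
                acc[i - 1, j] if i > 0 else np.inf,                # vertical
                acc[i, j - 1] if j > 0 else np.inf,                # horizontal
                acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,  # diagonal
            )
            acc[i, j] = local[i, j] + prev
    # Normalise so patterns of different lengths stay comparable.
    return acc[-1, -1] / (n + m)
```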

  7. Advanced DTW • Myers, Rabiner and Rosenberg • No EPD process • Series of small-area searches • Global search within one area • The next area is set around the best match point of the local area • Reduces the amount of calculation, but it is still large • Tested on isolated word recognition

  8. Proposal – Shape & Weights • No EPD process • Only one path • Select the best match point and search again from that point • Fewer computations • Modified weights • To compensate for weight-sum differences • One set for the search, one for distance accumulation (see the sketch below)
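The transcript does not give the proposed path shape or weight values, so the snippet below only illustrates the stated idea: keep two weight sets, one used while searching for the next point and one used when accumulating the distance, so that weight-sum differences between paths are compensated. The values and names here are made up.

```python
# Illustrative slope weights (not the paper's values): one set drives
# the search for the next step, the other the distance accumulation.
SEARCH_W = {"diag": 1.0, "horz": 1.0, "vert": 1.0}
ACCUM_W  = {"diag": 2.0, "horz": 1.0, "vert": 1.0}

def next_step(prev_acc, local_d):
    """Pick the step with the smallest search-weighted cost, then
    accumulate the distance with the compensation weights.

    prev_acc: accumulated cost at each candidate predecessor cell.
    local_d:  local frame distance reached through each candidate step.
    """
    step = min(local_d, key=lambda s: prev_acc[s] + SEARCH_W[s] * local_d[s])
    return step, prev_acc[step] + ACCUM_W[step] * local_d[step]
```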

  9. Proposal – End Point • Small search area • Successive local searches • The search starts at one point • End condition: the path reaches the last frame of the reference pattern • The end point is set automatically (a sketch of the loop follows below)
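A hypothetical sketch of the single-path, successive local-search idea on this slide, assuming Euclidean local distances and a small square search area; the area size, step rules, and function names are not from the paper.

```python
import numpy as np

def single_endpoint_search(ref, test, start_frame, area=3):
    """Follow one warping path by repeated small local searches.

    Starting from (reference frame 0, test frame start_frame), search a
    small local area, jump to its best match point, and stop once the
    path reaches the last frame of the reference pattern, which fixes
    the test-side end point automatically.
    """
    i, j = 0, start_frame
    total = float(np.linalg.norm(ref[i] - test[j]))
    while i < len(ref) - 1 and j < len(test) - 1:
        # Candidate points inside the small search area (monotone moves).
        cands = [(i + di, j + dj)
                 for di in range(area) for dj in range(area)
                 if (di, dj) != (0, 0)
                 and i + di < len(ref) and j + dj < len(test)]
        # Greedily move to the best match point of the local area.
        i, j = min(cands, key=lambda c: np.linalg.norm(ref[c[0]] - test[c[1]]))
        total += float(np.linalg.norm(ref[i] - test[j]))
    return total, j   # accumulated distance and detected end frame
```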

  10. Proposal – Distance • Modified distance • Uses the difference between pattern lengths • Pattern lengths of the same word are similar to each other (an illustrative penalty is sketched below)
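The exact modification is not reproduced in this transcript, so the snippet below only illustrates the idea: since patterns of the same word have similar lengths, a large mismatch between the reference length and the matched segment length should raise the reported distance. The penalty form and the weight alpha are assumptions.

```python
def length_modified_distance(dtw_dist, ref_len, matched_len, alpha=1.0):
    """Illustrative length penalty: the more the matched segment length
    deviates from the reference length, the worse the final score."""
    return dtw_dist + alpha * abs(ref_len - matched_len) / ref_len
```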

  11. DTW – Computation Loads • Comparison of the three DTW types

  12. Database & Experiment Sets • DB: RoadRally • Designed for keyword spotting • Recorded over a telephone channel • Usage • 11 keywords (434 occurrences in total) • Read speech from 40 male speakers (47 min. in total) from the Stonehenge portion • Set construction • 4 sub-sets (about 108 keywords per set) • 3 sets for training, 1 set for test • 2 reference patterns per keyword per set (a split sketch follows below)
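A minimal sketch of the kind of set construction described above, assuming the keyword utterances are already grouped by keyword; the round-robin assignment and all names are illustrative, not the authors' procedure.

```python
import random

def build_subsets(utts_by_keyword, n_sets=4, refs_per_kw=2, seed=0):
    """Split utterances into n_sets sub-sets and pick refs_per_kw
    reference patterns per keyword in each sub-set."""
    rng = random.Random(seed)
    subsets = [[] for _ in range(n_sets)]
    for kw, utts in utts_by_keyword.items():
        shuffled = utts[:]
        rng.shuffle(shuffled)
        for idx, utt in enumerate(shuffled):
            subsets[idx % n_sets].append((kw, utt))
    # The first refs_per_kw occurrences of each keyword in a sub-set
    # serve as its reference patterns.
    references = []
    for sub in subsets:
        refs = {}
        for kw, utt in sub:
            refs.setdefault(kw, [])
            if len(refs[kw]) < refs_per_kw:
                refs[kw].append(utt)
        references.append(refs)
    return subsets, references

# Typical rotation: train on 3 sub-sets, test on the remaining one.
```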

  13. Verification Result • Isolated word recognition • 3 sets for training, 1 set for test

  14. Experimental Setup • Assumption • Any frame can be the last frame of a keyword • Threshold • To reject OOVs • One threshold per reference • Standard threshold: the value giving no false alarms on the training set (a sketch follows below) • Result presentation • ROC (Receiver Operating Characteristic) curve • X-axis: false alarms / hour / keyword • Y-axis: recognition rate
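A minimal sketch of the "standard threshold" rule stated above, assuming smaller DTW distances mean better matches: for each reference, the threshold is set just below the smallest distance produced by any spurious (non-keyword) detection on the training set, so the training set yields no false alarms. The margin value is an assumption.

```python
def standard_threshold(false_alarm_distances, margin=1e-6):
    """Per-reference threshold giving zero false alarms on training data.

    false_alarm_distances: distances of all spurious detections produced
    by this reference on the training set.
    """
    if not false_alarm_distances:
        return float("inf")   # nothing to reject; accept everything
    return min(false_alarm_distances) - margin

# At test time a detection with distance d is accepted only if d < threshold.
```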

  15. Threshold Setting & Recognition Rate on the Training Set • Training set = test set (no false alarms)

  16. Result – DTW & HMM • ROC Curve

  17. Changing Conditions • Number of keywords • Number of references

  18. Conclusion • Proposed DTW • Advantages • Simple structure: addition & multiplication only (good for hardware) • No EPD processing • Very small computation load • Small amount of stored data: small memory • Only keyword information is stored • Good performance • Keyword spotting • Better than HMM when the amount of training data is small
