
A Study on Detection Based Automatic Speech Recognition

A Study on Detection Based Automatic Speech Recognition. Authors: Chengyuan Ma, Yu Tsao. Professor: 陳嘉平. Reporter: 許峰閤. Outline: Introduction, Word detector design, Hypotheses combination, Experiment.


Presentation Transcript


  1. A Study on Detection Based Automatic Speech Recognition • Authors: Chengyuan Ma, Yu Tsao • Professor: 陳嘉平 • Reporter: 許峰閤

  2. Outline • Introduction • Word detector design • Hypotheses combination • Experiment

  3. Introduction • Current ASR systems are top-down; the detection-based system studied here is bottom-up. • It includes: 1. word detectors; 2. word hypothesis verification and false-alarm pruning; 3. hypothesis combination.

  4. Word detector design • There is a separate detector for each lexical item in the vocabulary. • HMMs are used for detector design. • The key issue is how to choose an appropriate grammar network.

  5. Word detector design

  6. Word verification and pruning

  7. Word verification and pruning • These detectors generate a lot of false alarms. • Three pruning strategies are presented.

  8. Word verification and pruning • Temporal-information-based pruning: for example, the duration of the word “one” should be greater than 150 ms. • Attribute-model-based pruning: each word has its own attribute sequence pattern. • Signal-based pruning: pruning based on signal features. For example, the energy of a nasal sound is often concentrated in the low-frequency region.
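The temporal pruning rule on slide 8 can be illustrated with a minimal sketch. Only the 150 ms minimum for “one” comes from the slides; the `Detection` record, its field names, and the default threshold of 0 ms for other words are assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detector firing (hypothetical record layout)."""
    word: str
    start_ms: float
    end_ms: float


# Minimum plausible duration per word, in milliseconds.
# Only the entry for "one" is taken from the slides.
MIN_DURATION_MS = {"one": 150.0}
# Assumed fallback for words without a stated threshold.
DEFAULT_MIN_DURATION_MS = 0.0


def prune_by_duration(detections):
    """Keep detections whose duration exceeds the word's minimum."""
    kept = []
    for det in detections:
        min_ms = MIN_DURATION_MS.get(det.word, DEFAULT_MIN_DURATION_MS)
        if det.end_ms - det.start_ms > min_ms:
            kept.append(det)
    return kept


candidates = [
    Detection("one", 0.0, 100.0),    # 100 ms: too short, pruned
    Detection("one", 200.0, 420.0),  # 220 ms: kept
    Detection("two", 500.0, 560.0),  # no threshold stated: kept
]
survivors = prune_by_duration(candidates)
```

In practice each word would need its own threshold, presumably estimated from training data; the slides do not say how the values are chosen.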

  9. Hypotheses combination • We investigate hypothesis combination strategies that use the outputs of all detectors to generate a word string. • A weighted directed graph is one method that can be used to combine the detector outputs into a digit string.

  10. Hypotheses combination • Each node in the graph is a detected digit boundary. • The number in the node is the time stamp. • The number beside each edge is the frame-average log-likelihood. • Dijkstra’s algorithm can be used to find the shortest path.
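The graph search on slide 10 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: each detection is a hypothetical tuple `(start, end, digit, avg_loglik)`, boundaries from different detectors are assumed to be already aligned to shared time stamps, and the edge cost is taken to be the negated frame-average log-likelihood, which must be non-negative for Dijkstra's algorithm to apply.

```python
import heapq


def build_graph(detections):
    """Map each start time stamp to its outgoing (end, digit, cost) edges."""
    edges = {}
    for start, end, digit, avg_loglik in detections:
        cost = -avg_loglik  # assumption: higher likelihood means lower cost
        if cost < 0:
            raise ValueError("Dijkstra's algorithm needs non-negative costs")
        edges.setdefault(start, []).append((end, digit, cost))
    return edges


def shortest_path(edges, source, target):
    """Dijkstra's algorithm; returns (total cost, digit labels on the path)."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, label, cost in edges.get(u, []):
            nd = d + cost
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                prev[v] = (u, label)
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, []
    # Walk back from the target to recover the digit string.
    labels = []
    node = target
    while node != source:
        node, label = prev[node]
        labels.append(label)
    labels.reverse()
    return dist[target], labels


# Made-up detector outputs: (start frame, end frame, digit, avg log-likelihood).
detections = [
    (0, 30, "one", -2.0),
    (0, 30, "nine", -3.5),
    (30, 70, "two", -1.5),
    (30, 50, "oh", -4.0),
    (50, 70, "two", -3.0),
]
total_cost, digits = shortest_path(build_graph(detections), 0, 70)
# total_cost == 3.5, digits == ["one", "two"]
```

Summing per-edge averages favors paths with fewer edges; whether the original system normalizes for this is not stated in the slides.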

  11. Experiment • Conducted on the TIDIGITS corpus. • The vocabulary consists of 11 digit words: “one” to “nine”, plus “oh” and “zero”. • 12-dimensional MFCCs are used for front-end processing.

  12. Experiment
