
Dirichlet Class Language Models for Speech Recognition




Presentation Transcript


  1. Dirichlet Class Language Models for Speech Recognition Jen-Tzung Chien, Senior Member, IEEE, and Chuang-Hua Chueh, Student Member, IEEE. IEEE Transactions, 2011. Presenter: 郝柏翰

  2. Outline • Introduction • Survey of Previous Work • Dirichlet Class Language Models (DCLM) • Experiments • Conclusion

  3. Introduction • Latent Dirichlet allocation (LDA) was successfully developed for document modeling owing to its generalization to unseen documents through latent topic modeling. LDA calculates the probability of a document with the bag-of-words scheme, without considering the order of words. • This work presents a new Dirichlet class language model (DCLM), which projects the sequence of history words onto a latent class space and calculates a marginal likelihood over the uncertainties of classes, which are expressed by Dirichlet priors. • Furthermore, the long-distance class information is continuously updated using the large-span history words and is dynamically incorporated into class mixtures for a cache DCLM.

  4. Survey of Previous Work • Class-Based Language Model • The class-based language model was originally proposed to solve the data sparseness problem in n-gram models by considering the transition probabilities between classes rather than words. • Neural Network Language Model • The neural network language model (NNLM) was also proposed to tackle the data sparseness problem by learning a distributed representation of words.
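To make the class-based factorization on this slide concrete, here is a minimal sketch of the standard class-based bigram, assuming a hard word-to-class mapping and precomputed probability tables; the names word2class, p_word_given_class, and p_class_given_class are hypothetical, not from the paper:

```python
# Class-based bigram factorization:
# p(w_i | w_{i-1}) = p(w_i | c(w_i)) * p(c(w_i) | c(w_{i-1})),
# where c(.) is a fixed word-to-class mapping learned beforehand.

def class_bigram_prob(w_prev, w_curr, word2class, p_word_given_class, p_class_given_class):
    """Probability of w_curr following w_prev under a class-based bigram model."""
    c_prev = word2class[w_prev]
    c_curr = word2class[w_curr]
    # Class transition probability times the class-conditional word emission.
    return p_class_given_class[(c_prev, c_curr)] * p_word_given_class[(c_curr, w_curr)]
```

Because probabilities are shared across all words in a class, unseen word bigrams still receive sensible estimates, which is how this model addresses data sparseness.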

  5. Survey of Previous Work • Latent Dirichlet Allocation Language Model • LDA provides a powerful mechanism for discovering the structure of a text document. The latent topic of each document is treated as a random variable, and the probability of a document w = (w_1, ..., w_N) is obtained by marginalizing over the topic mixture θ and the latent topics z_n: p(w | α, β) = ∫ p(θ | α) ∏_n Σ_{z_n} p(z_n | θ) p(w_n | z_n, β) dθ, where α denotes the Dirichlet parameters of the topic mixtures and β is a matrix that contains the multinomial entries of the topic unigrams.
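As a rough illustration of the bag-of-words likelihood above, the sketch below scores a document under a point estimate of the topic mixture θ; the true LDA likelihood integrates θ over its Dirichlet prior p(θ | α), which has no closed form. The function and variable names are assumptions, not the paper's notation:

```python
import numpy as np

def lda_doc_loglik(word_ids, theta, beta):
    """Log-likelihood of a bag of words under a fixed topic mixture theta.

    theta : (K,)   topic proportions for the document (a point estimate of the
                   Dirichlet-distributed mixture, not the full integral over it).
    beta  : (K, V) per-topic multinomial word (unigram) probabilities.
    """
    # p(w_n) = sum_k theta_k * beta[k, w_n]; word order is ignored (bag of words).
    word_probs = theta @ beta[:, word_ids]   # shape (N,)
    return float(np.sum(np.log(word_probs)))
```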

  6. DCLM • LDA builds a hierarchical Bayesian model and detects the latent topics or clusters in a document collection in an unsupervised manner. However, the bag-of-words scheme ignores word order, so LDA does not directly work for speech recognition. • We are motivated to integrate LDA and NNLM and construct a direct LDA language model for speech recognition.

  7. DCLM • DCLM acts as a Bayesian class-based language model, which involves the prior density of the class variable.
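The following is a minimal sketch of the flavor of such a Bayesian class-based prediction, not the paper's exact parameterization: the history words are projected onto history-dependent Dirichlet parameters, the expected class mixture is taken, and the class variable is marginalized out. The projection matrix A, the exp positivity trick, and all names here are assumptions:

```python
import numpy as np

def dclm_word_prob(history_ids, word_id, A, beta_wc, vocab_size):
    """Sketch of a Dirichlet class-based prediction p(w | history).

    A       : (C, (n-1)*vocab_size) hypothetical projection from the stacked
              one-hot history words to the Dirichlet parameters of C classes.
    beta_wc : (C, V) class-conditional word probabilities p(w | c).
    """
    # Encode the (n-1) history words as a stacked one-hot vector.
    h = np.zeros((len(history_ids), vocab_size))
    h[np.arange(len(history_ids)), history_ids] = 1.0
    h = h.reshape(-1)
    # History-dependent Dirichlet parameters (kept positive here via exp).
    alpha = np.exp(A @ h)
    # Expected class mixture under the Dirichlet: E[theta_c] = alpha_c / sum(alpha).
    class_mix = alpha / alpha.sum()
    # Marginalize out the class: p(w | h) = sum_c p(w | c) * E[theta_c].
    return float(class_mix @ beta_wc[:, word_id])
```

Averaging over the class prior in this way smooths the n-gram distribution, which is what lets the model assign mass to unseen n-gram events, as noted on the next slide.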

  8. Cache DCLM • Such a smoothed language model can predict unseen n-gram events. However, the long-distance information beyond the n-gram window is not captured.
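A hypothetical sketch of the cache idea follows: class evidence from all preceding words is merged into the class mixture so that frequently occurring classes in the long-span history are boosted. The paper's exact update rule may differ, and the decay factor and all names here are assumptions:

```python
import numpy as np

def cache_class_mix(alpha_ngram, class_posteriors_history, decay=1.0):
    """Hypothetical cache-style update of the class mixture.

    alpha_ngram              : (C,) Dirichlet parameters from the n-gram window.
    class_posteriors_history : list of (C,) class posteriors, one per preceding
                               word in the large-span history.
    decay                    : optional forgetting factor for older words.
    """
    # Accumulate (optionally decayed) class evidence from all preceding words.
    cache = np.zeros_like(alpha_ngram)
    for t, post in enumerate(reversed(class_posteriors_history)):
        cache += (decay ** t) * post
    # Merge the cached class occurrences into the Dirichlet parameters
    # and take the expected class mixture.
    alpha = alpha_ngram + cache
    return alpha / alpha.sum()
```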

  9. Experiments • N-best rescoring on the 1987–1989 WSJ corpus with 38M words and 86K documents.

  10. Experiments

  11. Conclusion • This work proposed a new Dirichlet class language model for continuous speech recognition. The DCLM relaxed the bag-of-words assumption made in the LDA document model and considered the order of history words in class-based language modeling. • To incorporate long-distance information, the online cache DCLM was exploited by merging the class occurrences of all preceding words to generate class mixtures. The frequent classes were generated with high probability, which improved the prediction of words in a test sentence.
