Topic-Dependent-Class-Based N-Gram Language Model


Presentation Transcript


  1. Topic-Dependent-Class-Based N-Gram Language Model • Welly Naptali, Masatoshi Tsuchiya, and Seiichi Nakagawa, Member, IEEE • IEEE Transactions on Audio, Speech, and Language Processing, 2012 • Presenter: 郝柏翰

  2. Outline • Introduction • TDC-based n-gram language model • Experimental Results • Conclusion

  3. TDC-based n-gram language model • The model is based on the belief that noun relations contain latent topic information. Hence, a semantic extraction method is employed along with a clustering method to reveal and define topics based on nouns only. • Given a word sequence, a fixed-size window is used to observe noun occurrences in the context history to decide the topic through voting. • Finally, the topic is integrated as a part of the word sequence in the n-gram model.
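
To make the voting step concrete, here is a minimal sketch (not the authors' code) of deciding a topic from noun occurrences in a fixed-size context window, assuming a hypothetical noun_to_topic map produced by hard clustering:

```python
from collections import Counter

def decide_topic(history, noun_to_topic, window_size):
    """Hard-voting sketch: each noun among the last `window_size` words of
    the context history votes for its (hard-assigned) topic class; the
    topic with the most votes is used to condition the n-gram."""
    window = history[-window_size:]
    votes = Counter(noun_to_topic[w] for w in window if w in noun_to_topic)
    if not votes:
        return None  # no nouns observed: fall back to a topic-independent model
    return votes.most_common(1)[0][0]
```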

  4. TDC-based n-gram language model • The standalone TDC model suffers from a shrinking training corpus: partitioning the data by topic leaves each topic-dependent model with only a fraction of the training data. Therefore, to achieve good results, the TDC model needs to be interpolated with a word-based n-gram as a general model.

  5. Soft Clustering • In the LSA space, vector quantization (VQ) is applied to cluster the nouns into topics. The VQ algorithm is iterated using the cosine similarity between nouns until the desired number of clusters is reached. • A confidence measure γ is defined as the distance between a word vector and its class centroid. • Previously, each noun wi was mapped into only one topic class Ci; this is known as hard clustering. • To make the model more robust, soft clustering is performed so that a noun may belong to multiple topics.
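
A minimal sketch of the soft-clustering step, assuming nouns are already embedded as rows of an LSA matrix and the VQ centroids are given; the top_k cutoff and the normalization of cosine similarities into confidences γ are illustrative assumptions, not the paper's exact definitions:

```python
import numpy as np

def soft_cluster(noun_vecs, centroids, top_k=3):
    """Assign each noun to its `top_k` closest topics by cosine similarity
    to the VQ centroids, normalizing the similarities into confidences.
    noun_vecs: (V, d) LSA vectors; centroids: (K, d) topic centroids."""
    nv = noun_vecs / np.linalg.norm(noun_vecs, axis=1, keepdims=True)
    cv = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    sim = nv @ cv.T  # (V, K) cosine similarities
    memberships = []
    for row in sim:
        top = np.argsort(row)[::-1][:top_k]
        w = np.clip(row[top], 1e-9, None)  # guard against negative cosines
        gamma = w / w.sum()                # confidences sum to 1 per noun
        memberships.append(dict(zip(top.tolist(), gamma.tolist())))
    return memberships  # one {topic: gamma} dict per noun
```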

  6. Soft Voting • A TDC with window size m leads to an LM in which the probability of a word sequence W = w1, w2, …, wN is, schematically, P(W) = ∏i P(wi | Z, wi−n+1, …, wi−1) • Z is the topic class obtained by observing the m words in the outer context of the near n-gram (the words preceding the n-gram history).

  7. Soft Voting • F is the voting score for a given window size; schematically, F(c) = Σk γ(wk, c) over the m nouns wk in the window, • where γ(wk, c) is the soft-clustering confidence of noun wk in topic class c, and the voted topic is Z = argmaxc F(c).

  8. Soft Voting
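
Below is a minimal soft-voting sketch matching the description above: each noun in the window contributes its confidence γ(w, c) to a per-topic score F(c), and the topic class Z is the argmax. The lookup structure (a noun-to-{topic: γ} map) and the None fallback when the window contains no clustered noun are assumptions for illustration:

```python
def soft_vote(history, memberships, window_size):
    """Soft voting: accumulate gamma(w, c) into F(c) for every noun w in
    the m-word window, then return Z = argmax_c F(c).
    memberships: noun -> {topic: gamma}, e.g. from the clustering step."""
    scores = {}
    for w in history[-window_size:]:
        for topic, gamma in memberships.get(w, {}).items():
            scores[topic] = scores.get(topic, 0.0) + gamma
    if not scores:
        return None  # no clustered nouns in the window
    return max(scores, key=scores.get)
```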

  9. Interpolation • Word-Based N-Gram: We used a word-based N-gram as the LM that captures local constraints, combined through linear interpolation. • Cache-Based LM: A cache-based LM is based on the notion that a word appearing in a document has an increased probability of appearing again in the same document.
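
For illustration, a minimal unigram cache LM of the kind described above; in practice it is interpolated with the other models rather than used alone, and the class name is hypothetical:

```python
from collections import Counter

class CacheLM:
    """Unigram cache: probabilities come from the words seen so far in the
    current document, so recently used words get a boost."""
    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def update(self, word):
        self.counts[word] += 1
        self.total += 1

    def prob(self, word):
        return self.counts[word] / self.total if self.total else 0.0
```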

  10. Interpolation • There are two ways of combining the LMs: the TDC can be scaled by the cache-based LM before or after it is linearly interpolated with the word-based N-gram. • Before (TDC*CACHE+NGRAM): scale the TDC with the cache LM first, then interpolate with the N-gram. • After ((TDC+NGRAM)*CACHE): interpolate the TDC with the N-gram first, then scale the result with the cache LM.
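
A hedged sketch of the two combination orders, reading "scaling" as a plain product with the cache probability (any renormalization a true product model would need is omitted); lam is an illustrative interpolation weight, not the paper's notation:

```python
def combine_before(p_tdc, p_cache, p_ngram, lam):
    """'Before' (TDC*CACHE+NGRAM): scale the TDC by the cache LM first,
    then linearly interpolate with the word-based n-gram."""
    return lam * (p_tdc * p_cache) + (1 - lam) * p_ngram

def combine_after(p_tdc, p_cache, p_ngram, lam):
    """'After' ((TDC+NGRAM)*CACHE): interpolate first, scale afterwards."""
    return (lam * p_tdc + (1 - lam) * p_ngram) * p_cache
```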

  11. Experimental Results • All results show significant improvements, especially for the standalone model: it gives a 47.0% relative reduction in perplexity, while the interpolated model gives a 7.4% relative reduction.

  12. Experimental Results • The best perplexity, achieved by TDC*CACHE+NGRAM, is 87.0. • That is a 22.0% relative improvement over the word-based 3-gram and a 9.6% relative improvement over the TDC without the cache-based LM combination.

  13. Conclusion • A TDC is a topic-dependent LM with unsupervised topic extraction that employs semantic analysis and voting on nouns. • We demonstrated that a TDC with soft clustering and/or soft voting in the training and/or test phases improved performance. • We also demonstrated that incorporating a cache-based LM improved the TDC further. • The only drawback of the TDC LM is the increase in the number of parameters when soft voting is performed in the training phase.
