
Hidden Topic Markov Models. Amit Gruber, Michal Rosen-Zvi and Yair Weiss, AISTATS 2007.






Presentation Transcript


  1. Hidden Topic Markov Models. Amit Gruber, Michal Rosen-Zvi and Yair Weiss, in AISTATS 2007. Discussion led by Chunping Wang, ECE, Duke University, March 2, 2009

  2. Outline • Motivations • Related Topic Models • Hidden Topic Markov Models • Inference • Experiments • Conclusions

  3. Motivations • Feature reduction: map extremely large text corpora to a small number of variables • Topical segmentation: segment a document according to its hidden topics • Word sense disambiguation: distinguish between different instances of the same word according to the context

  4. Related Topic Models • LDA (JMLR 2003)
    1. For each topic z = 1, …, K, draw a topic-word distribution β_z ~ Dirichlet(η)
    2. For each document d = 1, …, D,
       (a) Draw topic proportions θ_d ~ Dirichlet(α)
       (b) For each word position n = 1, …, N_d, draw a topic z_n ~ Multinomial(θ_d)
       (c) For each word position n = 1, …, N_d, draw the word w_n ~ Multinomial(β_{z_n})
  Words in a document are exchangeable; documents are also exchangeable.
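To make the generative process above concrete, here is a minimal Python sketch of sampling a toy corpus from LDA (my own illustrative code, not from the paper; the function and variable names are assumptions):

    import numpy as np

    def generate_lda_corpus(D, N, K, V, alpha=0.1, eta=0.01, seed=0):
        """Sample D documents of N words each from LDA with K topics
        and a vocabulary of size V (hyperparameters alpha, eta)."""
        rng = np.random.default_rng(seed)
        # Step 1: draw a topic-word distribution beta_z for every topic.
        beta = rng.dirichlet(np.full(V, eta), size=K)
        docs = []
        for _ in range(D):
            # Step 2(a): draw document-topic proportions theta_d.
            theta = rng.dirichlet(np.full(K, alpha))
            # Step 2(b): draw a topic z_n for every word position.
            z = rng.choice(K, size=N, p=theta)
            # Step 2(c): draw each word w_n from its topic's distribution.
            w = np.array([rng.choice(V, p=beta[k]) for k in z])
            docs.append(w)
        return docs, beta

Because every z_n is drawn independently given θ_d, shuffling the words of a document leaves its probability unchanged, which is exactly the exchangeability noted above.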

  5. Related Topic Models • Dynamic Topic Models (ICML 2006) Words in a document are exchangeable; documents are not exchangeable.

  6. Related Topic Models • Topic Modeling: Beyond Bag of Words (ICML 2006) Words in a document are not exchangeable; documents are exchangeable.

  7. Related Topic Models • Integrating Topics and Syntax (NIPS 2005) LDA generates the semantic (content) words; an HMM generates the non-semantic (syntactic) words. Words in a document are not exchangeable; documents are exchangeable.

  8. Hidden Topic Markov Models No topic transition is allowed within a sentence. Whenever a new sentence starts, either the old topic is kept (with probability 1 − ε) or a new topic is drawn according to θ_d (with probability ε).
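A minimal sketch of this generative process in Python (my own illustrative code, not the authors'; names such as generate_htmm_document and eps are assumptions):

    import numpy as np

    def generate_htmm_document(sentence_lengths, beta, theta, eps, seed=0):
        """sentence_lengths: words per sentence; beta: K x V topic-word matrix;
        theta: length-K topic proportions; eps: probability of a topic
        transition at a sentence boundary."""
        rng = np.random.default_rng(seed)
        K, V = beta.shape
        words, topics = [], []
        z = rng.choice(K, p=theta)                  # initial topic drawn from theta
        for s, length in enumerate(sentence_lengths):
            if s > 0 and rng.random() < eps:        # new sentence: switch topic w.p. eps
                z = rng.choice(K, p=theta)          # redraw the topic from theta
            for _ in range(length):                 # topic is fixed within the sentence
                words.append(rng.choice(V, p=beta[z]))
                topics.append(z)
        return np.array(words), np.array(topics)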

  9. Hidden Topic Markov Models Viewed as an HMM over topics: Transition matrices: within a sentence, or between two sentences when no transition occurs (with probability 1 − ε), the topic is copied unchanged; when a transition occurs between two sentences (with probability ε), the new topic is drawn from θ_d. Emission matrix: β, the topic-word distributions. Initial state distribution: θ_d.
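Written out (my reconstruction in the slide's notation), the effective topic-transition kernel at word n and the other HMM ingredients are:

    \[
    A_n \;=\;
    \begin{cases}
    I, & \text{word } n \text{ does not start a new sentence},\\[4pt]
    (1-\epsilon)\, I \;+\; \epsilon\, \mathbf{1}\, \theta_d^{\top}, & \text{word } n \text{ starts a new sentence},
    \end{cases}
    \]

with emission matrix \(\beta\) (row \(z\) gives \(p(w \mid z)\)) and initial state distribution \(\theta_d\).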

  10. Inference EM algorithm: • E-step Compute the posteriors over topics and transition indicators, p(z_n, ψ_n | w_1, …, w_N), using the forward-backward algorithm; • M-step Update θ_d, β and ε from the expected counts (MAP estimates with Dirichlet priors for θ_d and β; ε is the expected fraction of sentence boundaries at which a topic transition occurs).
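As a sketch of the E-step, here is a generic scaled forward-backward pass in Python (my own illustrative code, not the authors' implementation; it smooths over topics only and assumes the per-position transition matrices and emission likelihoods have already been built as on the previous slide):

    import numpy as np

    def forward_backward(pi, trans, emis):
        """pi: (K,) initial distribution; trans: (N-1, K, K) with
        trans[n-1][i, j] = p(z_n = j | z_{n-1} = i); emis: (N, K) with
        emis[n, k] = p(w_n | z_n = k). Returns posteriors p(z_n | w_1..N)
        and the data log-likelihood."""
        N, K = emis.shape
        alpha = np.zeros((N, K))
        c = np.zeros(N)                        # scaling factors p(w_n | w_1..n-1)
        alpha[0] = pi * emis[0]
        c[0] = alpha[0].sum(); alpha[0] /= c[0]
        for n in range(1, N):                  # forward (filtering) pass
            alpha[n] = (alpha[n - 1] @ trans[n - 1]) * emis[n]
            c[n] = alpha[n].sum(); alpha[n] /= c[n]
        beta = np.ones((N, K))
        for n in range(N - 2, -1, -1):         # backward (smoothing) pass
            beta[n] = trans[n] @ (emis[n + 1] * beta[n + 1]) / c[n + 1]
        gamma = alpha * beta                   # posteriors p(z_n | w_1..N)
        return gamma, np.log(c).sum()

In HTMM the hidden state actually pairs the topic z_n with the transition indicator ψ_n, but the pass has the same shape; the expected topic and transition counts collected here feed the M-step updates.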

  11. Experiments • NIPS dataset (1740 documents: 1557 for training, 183 for testing) • Data preprocessing Extract words in the vocabulary (J = 12113, no stop words); divide the text into sentences according to “.?!;”. • Compare LDA, HTMM and VHTMM1 in terms of perplexity VHTMM1: a variant of HTMM with ε = 1, i.e. a “bag of sentences” Ntest: the total length of the test document; N: the first N words of the document are observed. Average Ntest = 1300

  12. Experiments [Figures: test perplexity of LDA, HTMM and VHTMM1, for K = 100 and for N = 10.] The lower the perplexity, the better the model is at predicting unseen words.
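For reference, the perplexity plotted here follows the standard definition (my reconstruction, using Ntest and N as defined on the previous slide):

    \[
    \text{perplexity} \;=\; \exp\!\left( -\,\frac{\log p\!\left(w_{N+1}, \dots, w_{N_{\text{test}}} \mid w_1, \dots, w_N\right)}{N_{\text{test}} - N} \right),
    \]

i.e. the inverse geometric mean of the per-word predictive probability of the unseen words, so lower values mean better prediction.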

  13. Experiments • Topical segmentation [Figures: segmentation of an example document into topics by HTMM vs. LDA.]

  14. Experiments • Top words of topics [Tables: top words of the “acknowledgments”, “math” and “reference” topics under HTMM and LDA.]

  15. Experiments As more topics are available, the topics become more specific and topic transitions are more frequent.

  16. Experiments • Two toy datasets, generated using HTMM and LDA. Goal: to rule out the possibility that the perplexity of HTMM is lower than that of LDA only because HTMM has fewer degrees of freedom. With toy datasets, other criteria can be used for comparison.

  17. Conclusions • HTMM is another extension of LDA; it relaxes the “bag-of-words” assumption by modeling topic dynamics with a Markov chain. • This extension leads to a significant improvement in perplexity and makes additional inferences possible, such as topical segmentation and word sense disambiguation. • It requires more storage, since the entire word sequence of a document (rather than word counts alone) must be given to the algorithm. • It only applies to structured data, where sentences are well defined.
