Latent Dirichlet Allocation
D. Blei, A. Ng, and M. Jordan. Journal of Machine Learning Research, 3:993-1022, January 2003.
Jonathan Huang (firstname.lastname@example.org)
Advisor: Carlos Guestrin
11/15/2005

"Bag of Words" Models
Let's assume that all the words within a document are exchangeable.
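The exchangeability ("bag of words") assumption means that word order carries no information: only the count of each word matters. A minimal illustration, using two toy sentences (not from the paper):

```python
from collections import Counter

# Two toy documents containing the same words in different order.
doc_a = "the topic generates the words".split()
doc_b = "words the generates topic the".split()

# Under the bag-of-words assumption, a document is reduced to word counts,
# so these two documents are indistinguishable to the model.
bag_a = Counter(doc_a)
bag_b = Counter(doc_b)
assert bag_a == bag_b  # order is discarded; only counts remain
```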
Mixture of Unigrams Model (this is just Naïve Bayes)
For each of the M documents, first choose a single topic z, then generate each of the document's words independently from the topic's word distribution p(w | z).
In the Mixture of Unigrams model, we can only have one topic per document!
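The generative process above can be sketched as follows. The vocabulary, topic prior, and per-topic word distributions are hypothetical toy values, not parameters from the paper; the key point is that the topic z is drawn once per document:

```python
import random

random.seed(0)

# Hypothetical toy parameters for illustration only.
vocab = ["ball", "game", "vote", "party"]
p_w_given_z = {
    "sports":   [0.45, 0.45, 0.05, 0.05],  # p(w | z = sports)
    "politics": [0.05, 0.05, 0.45, 0.45],  # p(w | z = politics)
}
p_z = {"sports": 0.5, "politics": 0.5}     # topic prior p(z)

def generate_document(n_words):
    """Mixture of unigrams: draw ONE topic z for the whole document,
    then draw every word i.i.d. from p(w | z)."""
    z = random.choices(list(p_z), weights=list(p_z.values()))[0]
    words = random.choices(vocab, weights=p_w_given_z[z], k=n_words)
    return z, words

z, doc = generate_document(5)
```

Because z is sampled only once, every word in the document is forced to come from the same topic, which is exactly the limitation the slide points out.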
Probabilistic Latent Semantic Indexing (pLSI) Model
For each word of document d in the training set, choose a topic z from the document-specific distribution p(z | d), then generate the word from p(w | z).
In pLSI, documents can have multiple topics.
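The pLSI generative process can be sketched in the same toy setting (all parameters below are hypothetical illustration values). The contrast with the mixture of unigrams is that the topic is resampled for every word, so a single document can mix topics:

```python
import random

random.seed(1)

# Hypothetical toy parameters for illustration only.
vocab = ["ball", "game", "vote", "party"]
p_w_given_z = {
    "sports":   [0.45, 0.45, 0.05, 0.05],  # p(w | z = sports)
    "politics": [0.05, 0.05, 0.45, 0.45],  # p(w | z = politics)
}
# pLSI learns a separate topic mixture p(z | d) for each training document.
p_z_given_d = {"doc1": {"sports": 0.7, "politics": 0.3}}

def generate_words(d, n_words):
    """pLSI: resample the topic z for EVERY word position,
    so one document can draw words from several topics."""
    words = []
    for _ in range(n_words):
        mix = p_z_given_d[d]
        z = random.choices(list(mix), weights=list(mix.values()))[0]
        words.append(random.choices(vocab, weights=p_w_given_z[z])[0])
    return words

words = generate_words("doc1", 6)
```

Note that p(z | d) is indexed by training-document identity, which is why pLSI has no natural way to assign topic mixtures to unseen documents; this is one of the gaps LDA addresses.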
(J. Sivic, B. C. Russell, A. A. Efros, A. Zisserman, W. T. Freeman. Discovering object categories in image collections. MIT AI Lab Memo AIM-2005-005, February 2005.)
Examples of ‘visual words’