
Hierarchical Topic Models and the Nested Chinese Restaurant Process

This presentation covers hierarchical topic models and the nested Chinese restaurant process (nested CRP). The approach is Bayesian: a distribution on partitions is used to generate an appropriate prior, and a hierarchical topic model is built on top of it. The Chinese restaurant process is extended to hierarchies to obtain the nested CRP, and Gibbs sampling is used for estimation. The approach is illustrated on simulated data and then applied to real data.

gervais

Presentation Transcript


  1. Hierarchical Topic Models and the Nested Chinese Restaurant Process. Liutong Chen (lc6re), Siwei Liu (sl7vy), Shaojia Li (sl4ab)

  2. Introduction • CRP • Model • Experiment • Conclusion

  3. Introduction • Takes a Bayesian approach to generate an appropriate prior via a distribution on partitions • Builds a hierarchical topic model • Illustrates the approach on simulated data

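  4. CRP • Imagine a Chinese restaurant with an infinite number of tables, each with infinite capacity. • Customer 1 sits at the first table. • The next customer sits either at the same table as customer 1 or at the next unoccupied table. • In general, the m-th customer sits at a table drawn from the following distribution (Eq. 1): p(occupied table i | previous customers) = m_i / (γ + m − 1) and p(next unoccupied table | previous customers) = γ / (γ + m − 1), where m_i is the number of previous customers at table i and γ is a parameter.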
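The seating rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' code: it assumes the standard CRP probabilities (an occupied table is chosen in proportion to its occupancy, a new table in proportion to a parameter called `gamma` here), and the function names `crp_seat` and `crp_partition` are hypothetical.

```python
import random

def crp_seat(counts, gamma, rng):
    """Pick a table for the next customer.

    counts[i] is the number of customers already at table i.
    Returning len(counts) means "open the next unoccupied table".
    """
    m = sum(counts) + 1  # the arriving customer is the m-th one
    denom = gamma + m - 1
    # One weight per occupied table, plus one for a new table.
    weights = [c / denom for c in counts] + [gamma / denom]
    return rng.choices(range(len(counts) + 1), weights=weights)[0]

def crp_partition(num_customers, gamma, seed=0):
    """Seat customers one at a time and return the table occupancies."""
    rng = random.Random(seed)
    counts = []
    for _ in range(num_customers):
        table = crp_seat(counts, gamma, rng)
        if table == len(counts):
            counts.append(1)   # open a new table
        else:
            counts[table] += 1
    return counts

if __name__ == "__main__":
    print(crp_partition(50, gamma=1.0))
```

Larger values of `gamma` tend to open more tables, which is why a single parameter controls how finely the customers are partitioned.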

  5. Extending CRP to Hierarchies Assumption: • A city contains an infinite number of Chinese restaurants, each with an infinite number of tables. One of them is the root restaurant. • Each table has a card that refers to another restaurant. • Each restaurant is referred to exactly once.

  6. Extending CRP to Hierarchies Scene: 1. At time 1, a tourist enters the root restaurant and chooses a table using the CRP equation. 2. At time 2, the tourist goes to the restaurant referred to by that table and again chooses a table using the same equation. 3. At time L, the tourist is at the L-th restaurant visited and has established a path from the root to level L of the infinite tree. 4. After M tourists have each taken L steps, the collection of their paths is a particular L-level subtree of the infinite tree.
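The tourist story can be simulated directly. The sketch below is an illustration under stated assumptions: restaurants are identified by the tuple of table indices that leads to them (the root is the empty tuple), a path has L restaurants in total with the root as the first, and the names `draw_path`, `tree_counts`, and `gamma` are hypothetical.

```python
import random
from collections import defaultdict

def draw_path(tree_counts, depth, gamma, rng):
    """Draw one root-to-level-`depth` path and update occupancies in place.

    tree_counts maps a restaurant (tuple of table indices) to a list of
    per-table customer counts for that restaurant.
    """
    node = ()            # root restaurant
    path = [node]
    for _ in range(depth - 1):
        counts = tree_counts[node]
        seated = sum(counts)            # earlier tourists in this restaurant
        denom = gamma + seated
        weights = [c / denom for c in counts] + [gamma / denom]
        table = rng.choices(range(len(counts) + 1), weights=weights)[0]
        if table == len(counts):
            counts.append(0)            # first visitor to a new table
        counts[table] += 1
        node = node + (table,)          # follow the card on that table
        path.append(node)
    return path

if __name__ == "__main__":
    rng = random.Random(1)
    tree = defaultdict(list)
    for _ in range(5):
        print(draw_path(tree, depth=3, gamma=1.0, rng=rng))
```

After M calls, the distinct restaurants that appear in the returned paths form the L-level subtree described on the slide.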

  7. Hierarchical LDA Given an L-level tree in which each node is associated with a topic: • Choose a path from the root to a leaf. • Draw a vector of topic proportions θ from an L-dimensional Dirichlet. • Generate the words in the document from a mixture of the topics along the path from root to leaf, with mixing proportions θ.

  8. Nested CRP with LDA • Let c_1 be the root restaurant. • For each level l ∈ {2,...,L}: • Draw a table from restaurant c_{l−1} using Eq. (1). • Set c_l to be the restaurant referred to by that table. • Draw an L-dimensional topic proportion vector θ from Dir(α). • For each word n ∈ {1,...,N}: • Draw z ∈ {1,...,L} from Mult(θ). • Draw w_n from the topic associated with restaurant c_z.
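Once a path has been chosen, the remaining steps of this generative process are short enough to sketch. The code below assumes the path is already given as a list of topics, each a word-to-probability dictionary, and uses 0-based levels; `sample_dirichlet` and `generate_document` are hypothetical names, and the Dirichlet draw is done by normalizing Gamma variates.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dir(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(path_topics, alpha, num_words, rng):
    """Generate one document given the topics along its path.

    path_topics[l] is a dict mapping word -> probability for the topic
    at level l of the chosen path (level 0 is the root).
    """
    num_levels = len(path_topics)
    theta = sample_dirichlet(alpha, rng)          # topic proportions over levels
    levels, words = [], []
    for _ in range(num_words):
        z = rng.choices(range(num_levels), weights=theta)[0]   # z ~ Mult(theta)
        topic = path_topics[z]
        w = rng.choices(list(topic), weights=list(topic.values()))[0]
        levels.append(z)
        words.append(w)
    return theta, levels, words

if __name__ == "__main__":
    rng = random.Random(0)
    topics = [{"the": 0.5, "of": 0.5}, {"model": 0.7, "topic": 0.3}, {"gibbs": 1.0}]
    theta, levels, words = generate_document(topics, [1.0, 1.0, 1.0], 10, rng)
    print(theta)
    print(list(zip(levels, words)))
```

Because every document passes through the root, the root topic is shared by all documents, while deeper topics are shared only by documents whose paths overlap.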

  9. Gibbs Sampling The goal is to estimate z_{m,n}, the assignment of the n-th word in the m-th document to one of the L available topics, and c_{m,l}, the restaurant corresponding to the l-th topic in document m. First, given the current state of the CRP, we sample the z_{m,n} variables of the underlying LDA model following the algorithm:
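The algorithm referenced on this slide is not preserved in the transcript. As a stand-in, the sketch below shows a standard collapsed Gibbs update for LDA-style models, adapted to levels along a fixed path; it is an assumption about the general shape of the update, not a reproduction of the slide. The symmetric hyperparameters `alpha` and `eta`, the count structures, and the name `level_conditional` are all hypothetical.

```python
def level_conditional(word, doc_level_counts, node_word_counts, node_totals,
                      alpha, eta, vocab_size):
    """Return p(z = l | everything else) for one word, for each level l.

    All counts must already exclude the word being resampled.
    doc_level_counts[l]  : words in this document assigned to level l
    node_word_counts[l]  : dict word -> count at the path's level-l topic
    node_totals[l]       : total words assigned to that topic
    """
    weights = []
    for l, n_dl in enumerate(doc_level_counts):
        doc_term = n_dl + alpha
        word_term = (node_word_counts[l].get(word, 0) + eta) / (
            node_totals[l] + vocab_size * eta)
        weights.append(doc_term * word_term)
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    probs = level_conditional(
        "gibbs",
        doc_level_counts=[3, 1, 0],
        node_word_counts=[{"the": 5}, {"gibbs": 2}, {}],
        node_totals=[5, 2, 0],
        alpha=1.0, eta=0.1, vocab_size=10,
    )
    print(probs)
```

A new level for the word would then be drawn from the returned probabilities, and the counts updated to include it again.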

  10. Gibbs Sampling Second, given the values of the LDA hidden variables, we sample the c_{m,l} variables, which are associated with the CRP prior. The conditional distribution for c_m, the L topics associated with document m, is: p(c_m | w, c_{−m}, z) ∝ p(w_m | c, w_{−m}, z) p(c_m | c_{−m}). This expression is an instance of Bayes' rule with p(w_m | c, w_{−m}, z) as the likelihood of the data given a particular choice of c_m and p(c_m | c_{−m}) as the prior on c_m implied by the nested CRP.
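In practice, the Bayes' rule step on this slide is usually evaluated over a finite list of candidate paths, with likelihoods kept in log space to avoid underflow. The sketch below assumes the per-path log-likelihoods and nested CRP prior probabilities have already been computed elsewhere; `path_posterior` is a hypothetical helper, and all prior probabilities are assumed to be strictly positive.

```python
import math

def path_posterior(log_likelihoods, prior_probs):
    """Combine per-path log-likelihoods with prior probabilities.

    Returns normalized posterior probabilities, one per candidate path,
    proportional to likelihood * prior.
    """
    log_post = [ll + math.log(p) for ll, p in zip(log_likelihoods, prior_probs)]
    shift = max(log_post)                      # subtract the max for stability
    unnorm = [math.exp(v - shift) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

if __name__ == "__main__":
    # Two candidate paths with equal prior mass; the first explains the
    # document's words slightly better.
    print(path_posterior([-1000.0, -1001.0], [0.5, 0.5]))
```

Exponentiating the raw log-likelihoods in this example would underflow to zero, which is why the maximum is subtracted before normalizing.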

  11. Experiment 1. Compare the CRP method with the Bayes factor method. 2. Estimate five different hierarchies. 3. Demonstration on real data.

  12. Compare CRP method with Bayes Factor method Compared with the Bayes factor method, the CRP method • Is faster • Has only one free parameter to set • Is more effective

  13. Estimate five different hierarchies

  14. Demonstration on real data Dataset: 1717 NIPS abstracts • 208890 words • Vocabulary of 1600 terms

  15. Conclusion • Nested Chinese restaurant process • Gibbs sampling procedure for the model Extensions: 1. The depth of the hierarchy can vary from document to document. 2. Documents can be allowed to mix over paths.

  16. Questions?
