1 / 31

The Topic-Perspective Model for Social Tagging Systems

The Topic-Perspective Model for Social Tagging Systems. Caimei Lu et al. (KDD 2010) Presented by Anson Liang. Movitation. Users:. Documents: a rticles, web pages, etc. Annotate. Add. Tags: e.g. economy, science, facebook , twitter …. Movitation. E.g. Delicious.com

najwa
Download Presentation

The Topic-Perspective Model for Social Tagging Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. The Topic-Perspective Model for Social Tagging Systems Caimei Lu et al. (KDD 2010) Presented by Anson Liang

  2. Movitation Users: Documents: articles, web pages, etc. Annotate Add Tags: e.g. economy, science, facebook, twitter …

  3. Movitation E.g. • Delicious.com • Social bookmarking

  4. Movitation Users Web pages Tags

  5. Movitation • Different users have different perspectives! Movie, action, inspiration Movie, action, violent

  6. Movitation • Objectives • Simulate the generation process of social annotations • Automatically generate tags for documents based on user perspective

  7. Related Work • Topic Analysis using Generative Models • Latent Dirichlet Allocation (LDA) model • Discover topics for a given document • A document has many words • A document may belong to many topics • A word may belong to many topics • A topic may be associated with many words Words 1..N Documents N..N 1..N Topics

  8. Related Work • LDA – Plate notation • α is the parameter of the uniform Dirichlet prior on the per-document topic distributions. • β is the parameter of the uniform Dirichlet prior on the per-topic word distribution. • θi is the topic distribution for document i, • zij is the topic for the j th word in document i, and • wij is the specific word. • The wij are the only observable variables, and the other variables are latent variables. • φ is a K*V (V is the dimension of the vocabulary) Markov matrix each row of which denotes the word distribution of a topic. Ref: http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation

  9. Related Work • Conditionally-independent LDA (Ramage et al.) • Tags are generated from the topics (the same source as the words)

  10. Related Work • Correlated LDA (Bundschus et al.) • Generates topics associated with the words for a document • Generates tags from the topics associated with the words

  11. Related Work • Problem • Assume words and tags are generated from the same source • However, words are generated by the authors, and tags are generated by users (readers) • User information is missed in the tag generation process!

  12. Related Work • Three classification schemas of social tags

  13. Topic-Perspective Model • Overview • Two factors act on the tag generation process: • The topics of the document • The perspective adopted by the user

  14. Topic-Perspective Model User-perspective distribution • Model Formulation Standard LDA X=0: Tags are generated from user perspectives Document-topic distribution X=1: Tags are generated from the topics learned from the words Switch 0/1 Probability that each tag is generated from topics Perspective-tag distribution Topic-word distribution Topic-tag distribution

  15. Topic-Perspective Model Document-topic distribution User-perspective distribution Topic-tag distribution Topic-word distribution Perspective-tag distribution

  16. Topic-Perspective Model Probability that each tag is generated from topics

  17. Topic-Perspective Model • Parameter Estimation • document-topic distribution θ(d) • topic-word distribution φ (w) • topic-tag distribution φ (t) • user-perspective distribution θ(u) • perspective-tag distribution ψ • binomial distribution λ • Gibbs Sampling!

  18. Topic-Perspective Model • Parameter Estimation

  19. Topic-Perspective Model • Parameter Estimation

  20. Topic-Perspective Model • Parameter Estimation

  21. Topic-Perspective Model • Parameter Estimation

  22. Experiments & Results • Datasets & Setup • A social tagging dataset from delicious.com after refining • 41190 documents • 4414 users • 28740 unique tags • 129908uniques words • 90% for training, 10% for testing

  23. Experiments & Results • Evaluation Criterion • Perplexity • Reflects the ability of the model to predict tags for new unseen documents • A lower perplexity score -> better generalization performance

  24. Experiments & Results • Number of topics vs. number of perspectives

  25. Experiments & Results • Tag Perplexity

  26. Experiments & Results • Discovered Topics • Observed high semantic correlations among the top words and tags for each topic

  27. Experiments & Results • Discovered Perspectives • Perspectives are more complicated than topics • The correlation among the tags assigned to each perspective is not as obvious as those for topics

  28. Experiments & Results • The generation sources of tags Tags are generated from the topics learned from the words Tags are generated from user perspectives

  29. Future Work • Apply the results of the model for • tag recommendation • Personalized information retrieval

  30. Summary • Topic-perspective LDA model • Simulate the tag generation process in a more meaningful way • Tags are generated from both the topics in the document and the user perspective • May be applied in computer vision tagging problems

  31. Question? • Thank you!

More Related