
Chinese Blog Clustering by Hidden Sentiment Factors

Chinese Blog Clustering by Hidden Sentiment Factors. ADMA 2009 Shi Feng, Daling Wang, Ge Yu, Chao Yang, and Nan Yang. College of Information Science and Engineering, Northeastern University. Hidden Sentiment Factors(HSF). Probabilistic latent semantic analysis (PLSA)


Presentation Transcript


  1. Chinese Blog Clustering by Hidden Sentiment Factors ADMA 2009 Shi Feng, Daling Wang, Ge Yu, Chao Yang, and Nan Yang. College of Information Science and Engineering, Northeastern University

  2. Hidden Sentiment Factors (HSF)
     • Probabilistic latent semantic analysis (PLSA)
     • Blog set B = {b1, b2, …, bN}
     • Sentiment word set W = {w1, w2, …, wM}
       • NTUSD: 2,812 positive words and 8,276 negative words
       • HowNet Sentiment Dictionary: 4,566 positive words and 4,370 negative words
     • A = N×M matrix, A(i,j) = Freq(bi, wj)
     • HSF Z = {z1, z2, …, zk}
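As a minimal sketch of how the matrix A on this slide could be built, the following counts sentiment-word occurrences per blog. The blog entries and the five-word lexicon are hypothetical stand-ins (English glosses in place of segmented Chinese tokens); the real word set W comes from NTUSD and the HowNet Sentiment Dictionary, and the slides do not say which word segmenter was used.

```python
from collections import Counter

# Hypothetical, already-segmented blog entries (English glosses stand in
# for Chinese tokens; the segmenter is not specified in the slides).
blogs = [
    ["great", "moving", "like", "great"],
    ["disappointed", "boring", "disappointed"],
    ["like", "boring"],
]

# Tiny stand-in for the sentiment word set W.
sentiment_words = ["great", "moving", "like", "disappointed", "boring"]


def build_frequency_matrix(blogs, sentiment_words):
    """Return the N x M matrix A with A[i][j] = Freq(b_i, w_j)."""
    matrix = []
    for tokens in blogs:
        counts = Counter(tokens)  # missing words count as 0
        matrix.append([counts[w] for w in sentiment_words])
    return matrix


A = build_frequency_matrix(blogs, sentiment_words)
for row in A:
    print(row)
```

Words outside W are ignored, so each row describes a blog only through its sentiment vocabulary.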

  3. Hidden Sentiment Factors (HSF)

  4. Hidden Sentiment Factors (HSF): P(w|b) → P(z|b)
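The move from P(w|b) to P(z|b) can be illustrated with the standard PLSA decomposition P(w|b) = Σ_z P(w|z) P(z|b), fitted by expectation-maximization. The sketch below is a generic textbook EM loop in plain Python, not the authors' implementation; the function names, the iteration count, the random initialization, and the small count matrix are all assumptions made for illustration.

```python
import random


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa(A, k, iterations=50, seed=0):
    """Fit PLSA by EM on a count matrix A (N blogs x M sentiment words).

    Returns (p_z_given_b, p_w_given_z) where
    p_z_given_b[i][z] = P(z | b_i) and p_w_given_z[z][j] = P(w_j | z).
    """
    rng = random.Random(seed)
    n, m = len(A), len(A[0])
    p_z_given_b = [normalize([rng.random() for _ in range(k)]) for _ in range(n)]
    p_w_given_z = [normalize([rng.random() for _ in range(m)]) for _ in range(k)]

    for _ in range(iterations):
        new_zb = [[0.0] * k for _ in range(n)]
        new_wz = [[0.0] * m for _ in range(k)]
        for i in range(n):
            for j in range(m):
                if A[i][j] == 0:
                    continue
                # E-step: P(z | b_i, w_j) is proportional to P(w_j | z) * P(z | b_i)
                posterior = normalize(
                    [p_w_given_z[z][j] * p_z_given_b[i][z] for z in range(k)]
                )
                # Accumulate expected counts for the M-step
                for z in range(k):
                    weight = A[i][j] * posterior[z]
                    new_zb[i][z] += weight
                    new_wz[z][j] += weight
        # M-step: renormalize the expected counts into distributions
        p_z_given_b = [normalize(row) for row in new_zb]
        p_w_given_z = [normalize(row) for row in new_wz]
    return p_z_given_b, p_w_given_z


# Hypothetical 3-blog x 5-word count matrix, k = 2 hidden sentiment factors
A = [[2, 1, 1, 0, 0], [0, 0, 0, 2, 1], [0, 0, 1, 0, 1]]
p_z_given_b, p_w_given_z = plsa(A, k=2)
for row in p_z_given_b:
    print([round(p, 3) for p in row])
```

Each row of `p_z_given_b` is a k-dimensional description of one blog in the hidden-factor space, which is the representation the clustering step works on.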

  5. Clustering by HSF
     • K-Means algorithm
     • k': number of clusters. In this paper, k' = k.
     • Fig. 1: Similarity = 0
     • Fig. 2: Similarity = ?
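A minimal sketch of this step, assuming each blog is represented by its vector of P(z|b) values and that K-Means uses squared Euclidean distance (the slide does not name the distance measure). The four input vectors are hypothetical, and the number of clusters is set to k' = k = 2.

```python
import random


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd-style K-Means; returns (assignments, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical P(z|b) vectors for four blogs over k = 2 hidden factors
hsf_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
assignments, centroids = kmeans(hsf_vectors, k=2)
print(assignments)
```

Two blogs that share no sentiment words can still land in the same cluster here, because they are compared through their factor distributions rather than through raw word counts.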

  6. Label Words Extraction

  7. Experiment
     • Dataset 1: blog reviews of Stephen Chow's movie "CJ7" (Long River 7)
     • Dataset 2: blog entries about Liu Xiang since 2008/8/18
     • Tag 1: "Positive", "Negative", or "Neutral"
     • Tag 2: "Irrelevant" or not
     • Example: a blog may be tagged {"Positive", "Irrelevant"}, {"Neutral"}, or {"Negative", "Irrelevant"}
     • Evaluate the clustering purity.
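The slide does not spell out the purity formula, so the sketch below uses the common definition: for each cluster, count the blogs carrying that cluster's most frequent tag, sum those counts, and divide by the total number of blogs. The cluster ids and tags are made-up illustration data, not results from the experiment.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Fraction of items that carry the majority label of their cluster."""
    clusters = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        clusters[cluster].append(label)
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in clusters.values()
    )
    return majority_total / len(true_labels)


# Hypothetical clustering of six blogs and their manual sentiment tags
cluster_ids = [0, 0, 0, 1, 1, 2]
true_labels = ["Positive", "Positive", "Negative", "Negative", "Negative", "Neutral"]
score = purity(cluster_ids, true_labels)
print(round(score, 3))
```

Purity rises toward 1 as clusters become homogeneous, but it also rises trivially as the number of clusters grows, so it is most informative when the cluster count is fixed.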

  8. Experiment

  9. Experiment

  10. Experiment

  11. Experiment
