
CS581 Opinion Mining Adapting a Polarity Lexicon to a Domain




  1. CS581 Opinion Mining: Adapting a Polarity Lexicon to a Domain. Gülşen Demiröz, Berrin Yanıkoğlu

  2. Introduction • Polarity Lexicons • A lexicon assigns each word: • Negative polarity • Positive polarity • Objective polarity (for neutral words) • These polarities are used to detect the overall polarity, and hence the sentiment orientation, of the document • SentiWordNet example:
  word POS NEGATIVE OBJECTIVE POSITIVE
  offer NN 0.0 1.0 0.0
  offer VB 0.625 0.375 0.0

  3. Problem Definition • Hotel review: • “The hotel had really small rooms” • Digital camera review: • “This camera is great as it has a small size” • The assumption of SentiWordNet: • A word always has the same polarity in all circumstances • How do we adapt SentiWordNet to our domain?

  4. Related Work • Yejin Choi, Claire Cardie, 2009: Adapting a Polarity Lexicon using Integer Linear Programming for Domain-Specific Sentiment Classification • They start with an existing general-purpose polarity lexicon • Then adapt it to domain-specific lexical usage • They use integer linear programming • The polarity of each word is one of {positive, neutral, negative, negator} • They do expression-level polarity classification • Improvement: 2.8% with Vote & Flip and only 0.8% with CRFs • Theresa Wilson, Janyce Wiebe, and Paul Hoffmann, 2009: Recognizing Contextual Polarity: An Exploration of Features for Phrase-Level Sentiment Analysis • They start with the prior polarities of words • Then learn the contextual polarities of the phrases in which those words appear in the corpus • They collect many contextual clues from an NLP dependency tree for each phrase as machine learning features • They classify words and phrases, not documents

  5. Approach • Calculate:
  pos-tf-idf(wi) = log2(count of wi in positive reviews + 1) * log2(#all reviews / #reviews containing wi)
  neg-tf-idf(wi) = log2(count of wi in negative reviews + 1) * log2(#all reviews / #reviews containing wi)
  tfidf_diff(wi) = pos-tf-idf(wi) - neg-tf-idf(wi)
  • Check whether tfidf_diff(wi) is compatible with the SentiWordNet score • If not compatible, flip the polarity but still use the SentiWordNet score • If the classification rate improves, keep the flips
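The tf-idf difference above can be sketched in Python. The function name and the review representation (lists of token lists) are illustrative assumptions, not the authors' actual code:

```python
import math

def tfidf_diff(word, pos_reviews, neg_reviews):
    """tfidf_diff(w) = pos-tf-idf(w) - neg-tf-idf(w), following the slide's
    formulas. Reviews are represented as lists of token lists; this layout
    is an assumption for illustration."""
    all_reviews = pos_reviews + neg_reviews
    # Document frequency: number of reviews containing the word.
    df = sum(1 for r in all_reviews if word in r)
    if df == 0:
        return 0.0  # word never occurs; no evidence either way
    idf = math.log2(len(all_reviews) / df)
    pos_tf = sum(r.count(word) for r in pos_reviews)  # count in positive reviews
    neg_tf = sum(r.count(word) for r in neg_reviews)  # count in negative reviews
    pos_tfidf = math.log2(pos_tf + 1) * idf
    neg_tfidf = math.log2(neg_tf + 1) * idf
    return pos_tfidf - neg_tfidf
```

A positive result means the word leans positive in this corpus; if SentiWordNet says the opposite, the word becomes a flip candidate.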

  6. Approach • Which words to flip? • Top n words by positive tfidf_diff that have a negative SentiWordNet score • Negative → Positive • Bottom n words by negative tfidf_diff that have a positive SentiWordNet score • Positive → Negative • What about words in the middle tfidf_diff range with positive or negative SentiWordNet scores? • Positive → Objective • Negative → Objective • How to set the tfidf_diff thresholds? • Top, bottom, and middle range thresholds • Find the optimal thresholds using grid search • Flip one word at a time, or multiple words at once?
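The flip-candidate selection above might look like the sketch below. The threshold names and the score encoding (SentiWordNet positive score minus negative score) are assumptions, not the slide's exact implementation:

```python
def select_flips(diffs, senti, top_thr, bottom_thr, mid_band):
    """Pick flip candidates per the slide's rules.
    diffs: {word: tfidf_diff value}
    senti: {word: SentiWordNet score, encoded here as positive minus negative}
    The three thresholds would be tuned by grid search."""
    flips = {}
    for w, d in diffs.items():
        s = senti.get(w, 0.0)
        if d >= top_thr and s < 0:
            flips[w] = "Negative -> Positive"  # corpus says positive, lexicon says negative
        elif d <= bottom_thr and s > 0:
            flips[w] = "Positive -> Negative"  # corpus says negative, lexicon says positive
        elif abs(d) <= mid_band and s != 0:
            flips[w] = "-> Objective"          # corpus is indifferent, lexicon is polar
    return flips
```

A grid search would then evaluate classification accuracy for each (top_thr, bottom_thr, mid_band) combination and keep only the flips that improve it.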

  7. Classification Method • Each word wi has a polarity tuple <pol-, pol=, pol+>
  pol(wi) = -pol- , if max(pol-, pol=, pol+) = pol-
  pol(wi) = 0 , if max(pol-, pol=, pol+) = pol=
  pol(wi) = pol+ , if max(pol-, pol=, pol+) = pol+
  pol(doc) = Positive, if the average word polarity > 0
  pol(doc) = Negative, if the average word polarity <= 0
  • Only words with POS tags starting with JJ*, RB*, NN*, or VB* are used in the calculations
  • Review scores {1, 2} are labeled Negative and {3, 4, 5} Positive
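The word- and document-level rules above translate directly into code. The slide does not specify how ties between polarities are broken, so the order below (negative, then objective, then positive) is an assumption:

```python
def word_polarity(neg, obj, pos):
    """pol(w): the signed score of whichever polarity is largest.
    Ties resolve in the order negative, objective, positive (an assumption)."""
    m = max(neg, obj, pos)
    if m == neg:
        return -neg
    if m == obj:
        return 0.0
    return pos

def doc_polarity(word_scores):
    """pol(doc): Positive if the average word polarity is > 0, else Negative."""
    avg = sum(word_scores) / len(word_scores)
    return "Positive" if avg > 0 else "Negative"
```

Using the SentiWordNet entries from the introduction, offer/VB (0.625, 0.375, 0.0) contributes -0.625 to the average and offer/NN (0.0, 1.0, 0.0) contributes 0.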

  8. Example Polarity Flips • Word: tall was Negative (0.375), now Positive (0.3125): tall JJ 0.375 0.3125 0.3125 • Review: “Great Resort but.... Beach: Wonderful, white sand, sparkling blue water (baby blue), long and you can walk for miles if you wish, lots of tall palm trees and beach chairs are plentyfull, massage and braiding services are available as well as variety of water sports (must take classes in the morning) free, except snorkelling trips. …” Overall Rating: 4 • Word: beware was Positive (0.5), now Negative (0): beware VB 0.0 0.5 0.5 • Review: “Nice facility, great staff... but BEWARE of the food... they warn you about the water, but it ain't enough.... of 30 guests in our group, 20 got sick on the food...” Overall Rating: 2

  9. Example Polarity Flips • Word: unlimited was Negative (0.4166), now Positive (0.166): unlimited JJ 0.4166 0.4166 0.166 • Review: “… Breckfast and lunch are buffet with almost unlimited choices that are fresh, top quality and well presented. …” Overall Rating: 5 • Word: ill was Negative (0.625), now Objective (0.375): ill RB 0.625 0.375 0.0 • Review: “… Another tip: don't each too much fruit at breakfast. It is so fresh and delicious but it can make you ill for a day. Eat the fruit in small portions and you'll be fine.” Overall Rating: 5

  10. Conclusion • I was able to improve the classification accuracy • There were only 315 positive and negative words in my list of flip candidates • A bigger domain with more words should yield larger improvements • The word flips made sense: • unlimited, offer, beware, ill, tall, ordinary, … • Some word flips that didn't make sense: • cheerful was Positive (0.6875), now Negative (0) • negative was Negative (0.55), now Positive (0.0138) • Future Work: • Apply to larger datasets • Try a different classification method
