Sentiment Analysis on Twitter Data

    Presentation Transcript
    1. Sentiment Analysis on Twitter Data
       Authors:
         • Apoorv Agarwal
         • Boyi Xie
         • Ilia Vovsha
         • Owen Rambow
         • Rebecca Passonneau
       Presented by Kripa K S

    2. Overview:
         • twitter.com is a popular microblogging website.
         • Each tweet is at most 140 characters long.
         • Tweets are frequently used to express the author's emotion about a particular subject.
         • Some firms poll Twitter to analyse sentiment on a particular topic.
         • The challenge is to gather all the relevant data, then detect and summarize the overall sentiment on a topic.

    3. Classification Tasks and Tools:
         • Polarity classification: positive or negative sentiment
         • 3-way classification: positive, negative, or neutral
         • Baseline: 10,000 unigram features
         • 100 Twitter-specific features
         • A tree kernel based model
         • A combination of models
         • A hand-annotated dictionary for emoticons and acronyms
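As a rough illustration of the unigram baseline named on this slide, the sketch below builds a vocabulary of the most frequent unigrams and turns a tweet into a count vector. The whitespace tokenizer, the lowercasing, and the function names `build_vocabulary` and `unigram_vector` are assumptions made for this example; the slides do not say how the 10,000 features were selected.

```python
from collections import Counter


def build_vocabulary(tweets, max_features=10_000):
    """Map the max_features most frequent unigrams to column indices."""
    # Assumption: lowercase whitespace tokens stand in for a real tokenizer.
    counts = Counter(token for tweet in tweets for token in tweet.lower().split())
    return {token: i for i, (token, _) in enumerate(counts.most_common(max_features))}


def unigram_vector(tweet, vocabulary):
    """Return a count vector for one tweet; unseen tokens are ignored."""
    vector = [0] * len(vocabulary)
    for token in tweet.lower().split():
        index = vocabulary.get(token)
        if index is not None:
            vector[index] += 1
    return vector


tweets = ["great day great game", "bad day"]
vocab = build_vocabulary(tweets, max_features=3)
print(unigram_vector("great great unseen", vocab))
```

Vectors of this kind could then be passed to any classifier; the slides report using an SVM.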

    4. About Twitter and the structure of tweets:
         • The 140-character limit leads to spelling errors, acronyms, emoticons, etc.
         • The @ symbol refers to a target Twitter user.
         • # hashtags can refer to topics.
         • The data set contains 11,875 manually annotated tweets.
         • 1709 tweets each for the positive, negative, and neutral classes are used, to balance the training data.
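The structural markers on this slide can be picked out with two regular expressions. This is a minimal sketch: the patterns `@\w+` and `#\w+` are a simplification of Twitter's real username and hashtag rules, and `tweet_structure` is a hypothetical helper name.

```python
import re

# Simplified patterns; real Twitter handles and hashtags have stricter rules.
TARGET_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")


def tweet_structure(tweet):
    """Extract target users, hashtags, and the character length of a tweet."""
    return {
        "targets": TARGET_RE.findall(tweet),
        "hashtags": HASHTAG_RE.findall(tweet),
        "length": len(tweet),
    }


info = tweet_structure("@Fernando loving the #harp today #music")
print(info["targets"], info["hashtags"])
```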

    5. Preprocessing of data
         • Emoticons are replaced with their labels, e.g. :) = positive, :( = negative. The dictionary lists 170 such emoticons.
         • Acronyms are translated, e.g. 'lol' to 'laughing out loud'. The dictionary lists 5184 such acronyms.
         • URLs are replaced with a ||U|| tag and targets with a ||T|| tag.
         • All types of negation, such as no, n't, and never, are replaced by NOT.
         • Sequences of repeated characters are replaced by 3 characters.
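The preprocessing steps on this slide can be sketched as a small pipeline. Everything below is illustrative: the two-entry emoticon table and one-entry acronym table stand in for the 170-emoticon and 5184-acronym dictionaries, the regular expressions are assumptions, and the order of the steps is one plausible choice rather than the authors' documented order.

```python
import re

# Tiny stand-ins for the hand-annotated dictionaries mentioned on the slide.
EMOTICONS = {":)": "positive", ":(": "negative"}
ACRONYMS = {"lol": "laughing out loud"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TARGET_RE = re.compile(r"@\w+")
# Matches no / not / never and contractions ending in n't (straight or curly apostrophe).
NEGATION_RE = re.compile(r"\b(?:no|not|never)\b|\b\w+n['’]t\b", re.IGNORECASE)
# Four or more copies of the same character in a row.
REPEAT_RE = re.compile(r"(.)\1{3,}")


def preprocess(tweet):
    """Apply the slide's replacements to one tweet and return the result."""
    tweet = URL_RE.sub("||U||", tweet)
    tweet = TARGET_RE.sub("||T||", tweet)
    tokens = []
    for token in tweet.split():
        if token in EMOTICONS:
            tokens.append(EMOTICONS[token])
        elif token.lower() in ACRONYMS:
            tokens.append(ACRONYMS[token.lower()])
        else:
            tokens.append(token)
    tweet = " ".join(tokens)
    tweet = NEGATION_RE.sub("NOT", tweet)
    # Squeeze long runs down to exactly three characters.
    return REPEAT_RE.sub(r"\1\1\1", tweet)


print(preprocess("@Fernando this isn't a great day for playing the HARP! :)"))
```

Emoticons and acronyms are only matched as whole whitespace-separated tokens here, so an emoticon glued to a word would be missed; a fuller version would need a proper tweet tokenizer.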

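    6. Prior Polarity Scoring
         • Features are based on the prior polarity of words.
         • Using the Dictionary of Affect in Language (DAL), each word is assigned a score between 1 (negative) and 3 (positive).
         • The scores are normalized.
         • A normalized score < 0.5 means negative.
         • A normalized score > 0.8 means positive.
         • If a word is not in the dictionary, its synonyms are retrieved and looked up instead.
         • This gives a prior polarity for about 88.9% of English words.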
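The scoring rule on this slide can be sketched as follows. Several details are assumptions: the slide does not say how scores are normalized, so the sketch divides by the scale maximum of 3; it does not name the label for scores between 0.5 and 0.8, so the sketch calls them neutral; and the `DAL_SCORES` and `SYNONYMS` tables are made-up stand-ins for the real dictionary and a thesaurus lookup.

```python
# Hypothetical scores on the slide's 1 (negative) to 3 (positive) scale.
DAL_SCORES = {"great": 2.9, "day": 2.0, "terrible": 1.1}
# Hypothetical synonym table standing in for a thesaurus lookup.
SYNONYMS = {"awesome": ["great"], "awful": ["terrible"]}


def normalized_score(word):
    """Return the score scaled into (0, 1], or None if no score can be found."""
    score = DAL_SCORES.get(word)
    if score is None:
        # Fall back to the first synonym that has a dictionary entry.
        for synonym in SYNONYMS.get(word, []):
            if synonym in DAL_SCORES:
                score = DAL_SCORES[synonym]
                break
    if score is None:
        return None
    return score / 3.0  # Assumption: normalize by the scale maximum.


def prior_polarity(word):
    """Label a word using the slide's thresholds of 0.5 and 0.8."""
    score = normalized_score(word.lower())
    if score is None:
        return "unknown"
    if score < 0.5:
        return "negative"
    if score > 0.8:
        return "positive"
    return "neutral"  # Assumption: the slide does not name this middle band.


for word in ["great", "day", "terrible", "awesome", "xyzzy"]:
    print(word, prior_polarity(word))
```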

    7. Tree Kernel
         • “@Fernando this isn’t a great day for playing the HARP! :)”

    8. Features
         • The combination f2+f3+f4+f9 (senti-features) is shown to achieve better accuracy than the other features.

    9. 3-way classification
         • The chance baseline is 33.33%.
         • The senti-features and unigram models perform on par and achieve a 23.25% gain over the baseline.
         • The tree kernel model outperforms both by 4.02%.
         • Accuracy for the 3-way classification task is greatest with the combination f2+f3+f4+f9.
         • Both classification tasks used an SVM with 5-fold cross-validation.
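To make the evaluation setup concrete, the sketch below splits a data set into 5 folds and works through the accuracy arithmetic from this slide. It assumes the reported gains are absolute percentage points added to the 33.33% baseline, and it uses plain contiguous folds; the slides do not say how the folds were drawn, and in practice the data would normally be shuffled first. `k_fold_indices` is a hypothetical helper, not a library function.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Assumption: gains on the slide are absolute percentage points.
chance_baseline = 33.33
senti_or_unigram = chance_baseline + 23.25
tree_kernel = senti_or_unigram + 4.02

folds = list(k_fold_indices(10, k=5))
for train, test in folds:
    print(len(train), len(test))
print(round(senti_or_unigram, 2), round(tree_kernel, 2))
```

Each sample lands in exactly one test fold, so every tweet is used for testing once and for training four times.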