
Consumer sentiment analysis with Twitter

Consumer sentiment analysis with Twitter. Reetta Suonperä, August 2013. Two months, one csv.gz file per day. In total about 1.2 billion tweets.

Presentation Transcript


  1. Consumer sentiment analysis with Twitter • Reetta Suonperä • August 2013

  2. My dataset • Two months, one csv.gz file per day • In total about 1.2 billion tweets • Sample record: It's always easy for a person to say get over, but you don't feel what heart feels to make that statment|PrettynPinkC215|2011-02-01T04:01:16Z|2011-02-01T04:00:48Z|1296532876139018784|
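
A minimal sketch of reading one daily file, assuming each line follows the pipe-delimited layout of the sample record on this slide (text, username, two timestamps, a numeric id, trailing delimiter). The field names `time_a`, `time_b` and `tweet_id` are assumptions made for illustration; the slides do not say what the two timestamps or the final number represent.

```python
import gzip
from collections import namedtuple

# Field names are hypothetical labels; only the column order comes from the sample record.
Tweet = namedtuple("Tweet", "text user time_a time_b tweet_id")


def parse_line(line):
    """Split one pipe-delimited record into a Tweet, or return None if malformed."""
    parts = line.rstrip("\r\n").split("|")
    if parts and parts[-1] == "":
        parts.pop()  # drop the empty field left by the trailing delimiter
    if len(parts) < 5:
        return None
    # The tweet text may itself contain '|', so take the four fixed fields
    # from the right and re-join whatever is left as the text.
    *text_parts, user, time_a, time_b, tweet_id = parts
    return Tweet("|".join(text_parts), user, time_a, time_b, tweet_id)


def read_tweets(path):
    """Yield Tweet records from one gzip-compressed daily file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            tweet = parse_line(line)
            if tweet is not None:
                yield tweet
```

Reading lazily, one line at a time, matters at this scale: with about 1.2 billion tweets the files cannot sensibly be loaded into memory whole.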

  3. The tools I use • General approach: natural language processing (NLP) • The Natural Language Toolkit (NLTK)

  4. Introduction: the consumer sentiment index • A survey-based indicator of consumer confidence or sentiment • History goes back to 1946 at the University of Michigan • Ireland’s consumer sentiment index has been produced by the ESRI since 1996

  5. ESRI survey questions • Q1: Economic situation in the country (next 12 months) • Q2: Unemployment in the country (next 12 months) • Q3: Household financial situation (12 months ago) • Q4: Household financial situation (next 12 months) • Q5: Good/bad time to buy large household items • Answers: positive/neutral/negative

  6. The KBC/ESRI consumer sentiment index • This is what it looks like: (chart not reproduced in the transcript)

  7. We can speculate on what drives sentiment – but we can’t really know • On the June 2013 improvement in households’ assessment of their personal finances: “We think that the ECB rate cut in May played some role … a combination of low inflation, early summer sales and increasing signs of improvement in the residential property market could have contributed…” • On the decline in the July 2013 index: “We think reports that the Irish economy had fallen back into recession and a couple of high profile job loss announcements unnerved consumers last month.”

  8. Motivation: why using Twitter could help • More timely • Continuous information • Saves money • What drives sentiment

  9. Previous research • O’Connor et al (2010): From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series • An index based on tweets containing the word “jobs” correlates with the Michigan index and Gallup’s daily poll • Indices based on “economy” or “job” correlate poorly!

  10. The process (simplified)

  11. Initial wordlist topics

  12. Using WordNet to expand seed wordlist • Use WordNet to find synonyms for the initial keyword list: • Words have many different meanings • Include part-of-speech tag • Word doesn’t exist in WordNet? • Output does not include tenses or plurals
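
A rough sketch of the expansion step, not the author's actual code. It assumes NLTK 3.x with the `wordnet` corpus downloaded, and the helper names `wordnet_synonyms` and `expand_seeds` are made up for illustration. It keeps the part-of-speech tag with each seed, and a seed that WordNet does not know simply stays in the list on its own.

```python
def wordnet_synonyms(word, pos):
    """Collect lemma names from every WordNet synset of `word` with tag `pos`.

    `pos` is one of WordNet's tags: 'n', 'v', 'a' or 'r'.
    Requires NLTK and its 'wordnet' corpus; imported lazily so the rest of
    this sketch runs without them.
    """
    from nltk.corpus import wordnet as wn

    names = set()
    # A word with many meanings has many synsets; the POS tag narrows them down.
    for synset in wn.synsets(word, pos=pos):
        for name in synset.lemma_names():  # a method in NLTK 3.x
            names.add(name.replace("_", " ").lower())
    return names


def expand_seeds(seeds, lookup=wordnet_synonyms):
    """Map each (word, pos) seed to the set of itself plus its synonyms."""
    expanded = {}
    for word, pos in seeds:
        # An unknown word yields an empty synonym set, so only the seed is kept.
        expanded[word] = {word.lower()} | set(lookup(word, pos))
    return expanded
```

A call such as `expand_seeds([("job", "n"), ("unemployment", "n")])` would return base forms only; plurals and tenses would still have to be added separately, for example by stemming both the wordlist and the tweets.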

  13. Pre-processing tasks • Regular expressions for more basic tasks: cleaning, tokenising URLs, usernames • NLTK functionality for more complex tasks: stopword removal, stemming, POS-tagging
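
The regular-expression part of this step could look roughly like the sketch below. The patterns, the `<url>` and `<user>` placeholders and the tiny stopword set are all illustrative assumptions; in practice the stopword list, stemmer and POS tagger would come from NLTK, as the slide says.

```python
import re

URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")
# Placeholders are listed first so they survive tokenising as single tokens.
TOKEN_RE = re.compile(r"<url>|<user>|[a-z0-9']+")

# Placeholder stopword set for illustration only, not NLTK's list.
STOPWORDS = {"a", "an", "the", "to", "is", "of", "and"}


def preprocess(text):
    """Lowercase, mask URLs and usernames, tokenise, and drop stopwords."""
    text = text.lower()
    text = URL_RE.sub(" <url> ", text)
    text = USER_RE.sub(" <user> ", text)
    tokens = TOKEN_RE.findall(text)
    return [token for token in tokens if token not in STOPWORDS]
```

Masking URLs and usernames with a single placeholder each keeps the vocabulary small, while still recording that a link or a mention was present.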

  14. Fine selection – not there yet… • Do more filtering using bigrams? “I broke”, “pay cut”, “new job” • Use POS tags? • Classification?
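
Since the slide marks this step as unfinished, the following is only one possible way the bigram idea could be tried, assuming tweets have already been lowercased and tokenised. The three bigrams are the examples from the slide; the function names are hypothetical.

```python
# Example bigrams taken from the slide, stored as lowercase token pairs.
KEY_BIGRAMS = {("i", "broke"), ("pay", "cut"), ("new", "job")}


def bigrams(tokens):
    """Return adjacent token pairs, e.g. ['a', 'b', 'c'] -> [('a', 'b'), ('b', 'c')]."""
    return list(zip(tokens, tokens[1:]))


def matches_key_bigram(tokens, key_bigrams=KEY_BIGRAMS):
    """True if any adjacent token pair of the tweet is in the key bigram set."""
    return any(pair in key_bigrams for pair in bigrams(tokens))
```

One caveat with this design: if stopwords such as "i" are removed before this step, a bigram like “I broke” can no longer match, so the filter would need to run on the unfiltered tokens.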

  15. The to-do list • Finalise fine selection • Sentiment classification • Visualisation

  16. Resources • www.nltk.org • Natural Language Processing with Python: http://nltk.org/book/ • Python Text Processing with NLTK 2.0 Cookbook

  17. Resources • O’Connor et al (2010): From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series • Bollen et al (2011): Twitter mood predicts the stock market • Bollen et al (2011): Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena • Go et al (2009): Twitter sentiment classification using distant supervision • Jiang et al (2011): Target-dependent Twitter Sentiment Classification

  18. Questions?
