
Sentiment Analysis


havard




Presentation Transcript


  1. Sentiment Analysis: An Overview of Concepts and Selected Techniques

  2. Terms • Sentiment • A thought, view, or attitude, especially one based mainly on emotion instead of reason • Sentiment Analysis • also called opinion mining • the use of natural language processing (NLP) and computational techniques to automate the extraction or classification of sentiment from typically unstructured text

  3. Motivation • Consumer information • Product reviews • Marketing • Consumer attitudes • Trends • Politics • Politicians want to know voters’ views • Voters want to know politicians’ stances and who else supports them • Social • Find like-minded individuals or communities • Webpage

  4. Problem • How to interpret features for sentiment detection? • Bag of words (IR) • Annotated lexicons (WordNet, SentiWordNet) • Syntactic patterns • Which features to use? • Words (unigrams) • Phrases/n-grams • Sentences
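
The word and n-gram features listed on this slide can be sketched in a few lines of Python. This is only an illustration: the function names (`tokenize`, `ngram_counts`, `bag_of_words`) and the example sentence are assumptions made here, not part of the original presentation, and the tokenizer is deliberately naive.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def ngram_counts(tokens, n):
    """Count contiguous n-grams (n=1 gives unigrams, n=2 gives bigrams)."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)

def bag_of_words(text, max_n=2):
    """Merge the counts of all 1..max_n-grams into a single feature bag."""
    tokens = tokenize(text)
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(ngram_counts(tokens, n))
    return features

# Hypothetical review sentence, used only for illustration.
features = bag_of_words("The battery life is not good, not good at all")
print(features["good"], features["not good"])
```

The unigram "good" on its own looks positive, while the bigram "not good" keeps the negation; this is one reason phrases are considered as features alongside single words.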

  5. Challenges • Must consider other features due to… • Subtlety of sentiment expression • irony • Domain/context dependence • words/phrases can mean different things in different contexts and domains

  6. Approaches • Machine learning • Naïve Bayes • Maximum Entropy Classifier • SVM • Markov Blanket Classifier • Unsupervised methods • Use lexicons • Assume pairwise independent features
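
As a rough illustration of the first machine learning approach listed, the following is a minimal Naïve Bayes sketch over unigram features. It treats words as independent given the class, which is the independence assumption that gives the method its name. The training sentences, labels, and function names are invented for this example, and add-one (Laplace) smoothing is an assumption made here rather than something the slides specify.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs):
    """docs is a list of (tokens, label) pairs; returns the counts needed to classify."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def classify(tokens, model):
    """Return the label with the highest log prior plus summed log likelihoods."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, label) pairs from zeroing the product.
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data, invented for illustration only.
training = [
    ("excellent comfortable room".split(), "pos"),
    ("good location good staff".split(), "pos"),
    ("bad noisy room".split(), "neg"),
    ("dirty bad service".split(), "neg"),
]
model = train_naive_bayes(training)
prediction = classify("good comfortable staff".split(), model)
print(prediction)
```

A real system would train on a labelled corpus such as product or movie reviews; four sentences are only enough to show the mechanics.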

  7. Three levels of meaning • Lexical Semantics • The meanings of individual words • Sentential / Compositional / Formal Semantics • How those meanings combine to make meanings for individual sentences • Discourse or Pragmatics • How those meanings combine with each other and with other facts about various kinds of context to make meanings for a text or discourse (+ Dialog or Conversational Semantics)

  8. WordNet [1][2] • WordNet resulted from the research efforts of the Department of Linguistics and Psychology at Princeton University toward a better understanding of English language and semantics. • WordNet is available as a database, searchable via a web interface or via a variety of software APIs, providing a comprehensive database of over 150,000 unique terms organised into more than 117,000 different meanings (WORDNET, 2006). • WordNet has also grown through extensions of its structure to a number of other languages (WORDNET, 2009).

  9. WordNet • A hierarchically organized lexical database • On-line thesaurus + aspects of a dictionary • Versions for other languages are under development

     Category     Unique forms
     Noun         117,097
     Verb         11,488
     Adjective    22,141
     Adverb       4,601

  10. How is “sense” defined in WordNet? • The set of near-synonyms for a WordNet sense is called a synset (synonym set); it is WordNet's version of a sense or a concept • Example: chump as a noun meaning ‘a person who is gullible and easy to take advantage of’ • Every word in the synset shares this same gloss • Thus, for WordNet, the meaning of this sense of chump is the list of words in its synset.

  11. SentiWordNet [3] • Based on WordNet “synsets” • http://wordnet.princeton.edu/ • SentiWordNet is a lexical resource for sentiment analysis made up of synsets from WordNet, a thesaurus-like resource; each synset is assigned positive, negative, and objective sentiment scores. • These scores are generated automatically using a semi-supervised method. • Each term in the WordNet database is assigned a score from 0 to 1 in SentiWordNet, which indicates its polarity. • Terms that carry strong partiality receive higher scores, whereas less biased or less subjective terms receive lower scores.

  12. The values in the three dimensions (positive P, negative N, objective O) sum to 1. Example: P=0.75, N=0, O=0.25
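
The sum-to-one constraint means the objective score can be derived from the other two. The sketch below reproduces the slide's example under that reading; the `SentiScore` class and its method names are hypothetical and are not an API of SentiWordNet or of any library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SentiScore:
    """Hypothetical container for one SentiWordNet-style score triple."""
    positive: float
    negative: float

    def __post_init__(self):
        # Each score lies in [0, 1], and P + N cannot exceed 1 because O >= 0.
        if not (0.0 <= self.positive <= 1.0 and 0.0 <= self.negative <= 1.0):
            raise ValueError("scores must lie between 0 and 1")
        if self.positive + self.negative > 1.0:
            raise ValueError("positive + negative must not exceed 1")

    @property
    def objective(self):
        """O is whatever remains once P and N are accounted for."""
        return 1.0 - (self.positive + self.negative)

    def polarity(self):
        """Label the triple by its larger subjective component."""
        if self.positive > self.negative:
            return "positive"
        if self.negative > self.positive:
            return "negative"
        return "neutral"

# The example from the slide: P=0.75, N=0, so O=0.25.
example = SentiScore(positive=0.75, negative=0.0)
print(example.objective, example.polarity())
```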

  13. Demo • Explore the sentiment lexicons discussed here: http://sentiment.christopherpotts.net/lexicon/ • Our demo: http://www.tripadvisor.com/Hotel_Review-g187147-d290407-Reviews-Paris_France_Hotel-Paris_Ile_de_France.html • Tutorial page: http://sentiment.christopherpotts.net/lexicons.html#building

  14. Polarity classification or semantic orientation determination of sentiment-expressing phrases • a positive sentiment, a negative sentiment • Intensity or strength determination of sentiment-expressing phrases • the word excellent is a strong positive word, whereas the word good is a weak positive word • Product feature extraction • for example, battery life, image quality, and resolution in a camera domain, and seating comfort, maximum speed, wheels, and steering in a car domain • Opinion and sentiment-expressing phrase extraction • for example, extremely comfortable, not smooth, quite heavy, good, and bad
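
Polarity and intensity determination for short phrases such as those above can be sketched with a small lexicon plus rules for intensifiers and negation. Everything in this example is an assumption made for illustration: the lexicon entries, their numeric strengths, the intensifier weights, and the rule that a modifier applies to the next sentiment word are all invented here and are not taken from the presentation or from any published lexicon.

```python
# Toy lexicon with invented strengths: the sign is polarity, the magnitude is intensity.
LEXICON = {
    "excellent": 1.0, "good": 0.5, "comfortable": 0.5, "smooth": 0.5,
    "bad": -0.5, "heavy": -0.25,
}
INTENSIFIERS = {"extremely": 2.0, "quite": 1.5}  # hypothetical weights
NEGATORS = {"not"}

def phrase_score(phrase):
    """Sum the lexicon scores, letting modifiers act on the next sentiment word."""
    score = 0.0
    multiplier = 1.0
    negate = False
    for token in phrase.lower().split():
        if token in NEGATORS:
            negate = not negate
        elif token in INTENSIFIERS:
            multiplier *= INTENSIFIERS[token]
        elif token in LEXICON:
            value = LEXICON[token] * multiplier
            score += -value if negate else value
            # Modifiers are consumed by the sentiment word they precede.
            multiplier, negate = 1.0, False
    return score

def polarity(phrase):
    """Map a phrase score to a coarse polarity label."""
    score = phrase_score(phrase)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for phrase in ["extremely comfortable", "not smooth", "quite heavy", "good", "bad"]:
    print(phrase, phrase_score(phrase), polarity(phrase))
```

Simple sign flipping for negation is a crude rule; it is used here only to show why "not smooth" should not be scored the same way as "smooth".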

  15. References • http://www.answers.com/sentiment, 9/22/08 • B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” in Proc. Conf. on Empirical Methods in Natural Language Processing (EMNLP), pp. 79–86, 2002. • A. Esuli and F. Sebastiani, “SentiWordNet: A publicly available lexical resource for opinion mining,” in Proc. LREC 2006, 5th Conf. on Language Resources and Evaluation, 2006. • E. Zhang and Y. Zhang, “UCSC on TREC 2006 Blog Opinion Mining,” TREC 2006 Blog Track, Opinion Retrieval Task. • A. Devitt and K. Ahmad, “Sentiment polarity identification in financial news: A cohesion-based approach,” ACL 2007. • B. Pang and L. Lee, “A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts,” in Proc. 42nd Annual Meeting of the Association for Computational Linguistics, p. 271, July 21–26, 2004.
