LINGUISTICA GENERALE E COMPUTAZIONALE

Presentation Transcript
  1. LINGUISTICA GENERALE E COMPUTAZIONALE (General and Computational Linguistics): SENTIMENT ANALYSIS

  2. FACTS AND OPINIONS
     • Two main types of textual information on the Web: FACTS and OPINIONS
     • Current search engines search for facts (assume they are true)
     • Facts can be expressed with topic keywords.

  3. THERE ARE PLENTY OF OPINIONS ON THE WEB

  4. SENTIMENT ANALYSIS (also known as opinion mining)
     • Attempts to identify the opinion/sentiment that a person may hold towards an object

  5. Components of an opinion
     • Basic components of an opinion:
       • Opinion holder: the person or organization that holds a specific opinion on a particular object.
       • Object: the thing on which an opinion is expressed.
       • Opinion: a view, attitude, or appraisal on an object from an opinion holder.
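
The three components above map naturally onto a small record type. The following is only a minimal sketch: the class name `Opinion`, the field names, and the reduction of the opinion itself to a polarity label are assumptions made for illustration, not definitions from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """One opinion, reduced to the three basic components (illustrative names)."""
    holder: str    # person or organization holding the opinion
    target: str    # object on which the opinion is expressed
    polarity: str  # the appraisal, simplified here to a label


# Example built from the iPhone review quoted later in the slides.
example = Opinion(holder="Abc123", target="iPhone", polarity="positive")
```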

  6. SENTIMENT ANALYSIS GRANULARITY
     • At the document (or review) level:
       • Task: sentiment classification of reviews
       • Classes: positive, negative, and neutral
       • Assumption: each document (or review) focuses on a single object (not true in many discussion posts) and contains opinion from a single opinion holder.

  7. DOCUMENT-LEVEL SENTIMENT ANALYSIS EXAMPLE

  8. SENTIMENT ANALYSIS GRANULARITY
     • At the sentence level:
       • Task 1: identifying subjective/opinionated sentences
         • Classes: objective and subjective (opinionated)
       • Task 2: sentiment classification of sentences
         • Classes: positive, negative, and neutral
         • Assumption: a sentence contains only one opinion (not true in many cases).
       • Then we can also consider clauses or phrases.

  9. SENTENCE-LEVEL SENTIMENT ANALYSIS EXAMPLE Id: Abc123 on 5-1-2008 “I bought an iPhone a few days ago. It is such a nice phone. The touch screen is really cool. The voice quality is clear too. It is much better than my old Blackberry, which was a terrible phone and so difficult to type with its tiny keys. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive, …”

  10. SENTENCE-LEVEL SENTIMENT ANALYSIS

  11. SENTENCE-LEVEL SENTIMENT ANALYSIS

  12. SENTIMENT ANALYSIS GRANULARITY
      • At the feature level:
        • Task 1: Identify and extract object features that have been commented on by an opinion holder (e.g., a reviewer).
        • Task 2: Determine whether the opinions on the features are positive, negative or neutral.
        • Task 3: Group feature synonyms.
        • Produce a feature-based opinion summary of multiple reviews.

  13. SENTIMENT ANALYSIS GRANULARITY
      • At the feature level:
        • Opinion holders: identifying holders is also useful, e.g., in news articles, but in user-generated content they are usually known, i.e., they are the authors of the posts.
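
The final step of the feature-level tasks, producing a feature-based opinion summary, can be sketched as a simple aggregation. This is only an illustration: it assumes Tasks 1 and 2 have already produced (feature, polarity) pairs, and the `SYNONYMS` map standing in for Task 3 is hypothetical and hand-written.

```python
from collections import Counter, defaultdict

# Hypothetical synonym map (Task 3): surface form -> canonical feature name.
SYNONYMS = {"display": "screen", "touch screen": "screen", "sound": "voice quality"}


def summarize(opinions):
    """Aggregate (feature, polarity) pairs into per-feature polarity counts."""
    summary = defaultdict(Counter)
    for feature, polarity in opinions:
        canonical = SYNONYMS.get(feature.lower(), feature.lower())
        summary[canonical][polarity] += 1
    return {feature: dict(counts) for feature, counts in summary.items()}


# Toy input standing in for the output of Tasks 1 and 2 over several reviews.
extracted = [
    ("touch screen", "positive"),
    ("display", "positive"),
    ("voice quality", "positive"),
    ("sound", "negative"),
    ("price", "negative"),
]
summary = summarize(extracted)
```

Here `summary` groups "touch screen" and "display" under "screen", so the counts reflect features rather than surface wordings.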

  14. FEATURE-LEVEL SENTIMENT ANALYSIS

  15. ENTITY AND ASPECT (Hu and Liu, 2004; Liu, 2006)

  16. OPINION TARGET

  17. A DEFINITION OF OPINION (Liu, Ch. in NLP handbook, 2010)

  18. SENTIMENT ANALYSIS: THE TASK

  19. Applications
      • Businesses and organizations:
        • product and service benchmarking
        • market intelligence
        • Businesses spend huge amounts of money to find out consumer sentiments and opinions (consultants, surveys, focus groups, etc.)
      • Individuals: interested in others' opinions when
        • purchasing a product or using a service
        • finding opinions on political topics
      • Ad placement: placing ads in user-generated content
        • Place an ad when one praises a product.
        • Place an ad from a competitor if one criticizes a product.
      • Opinion retrieval/search: providing general search for opinions.

  20. DOCUMENT-LEVEL SENTIMENT ANALYSIS

  21. DOCUMENT-LEVEL SENTIMENT ANALYSIS

  22. DOCUMENT-LEVEL SENTIMENT ANALYSIS = TEXT CLASSIFICATION

  23. ASSUMPTIONS AND GOALS

  24. LEXICON-BASED APPROACHES
      • Use sentiment and subjectivity lexicons
      • Rule-based classifier:
        • A sentence is subjective if it has at least two words in the lexicon
        • A sentence is objective otherwise
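
The two-word rule above can be sketched in a few lines. The lexicon below is a tiny hand-picked set used only for illustration (a real system would load a published subjectivity lexicon), and the function name, tokenizer, and `threshold` parameter are assumptions.

```python
import re

# Tiny illustrative lexicon, not a published resource.
SUBJECTIVITY_LEXICON = {
    "nice", "cool", "clear", "better", "terrible", "difficult", "mad", "expensive",
}


def is_subjective(sentence, lexicon=SUBJECTIVITY_LEXICON, threshold=2):
    """Return True if the sentence contains at least `threshold` lexicon words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    hits = sum(1 for token in tokens if token in lexicon)
    return hits >= threshold


objective = is_subjective("I bought an iPhone a few days ago.")
subjective = is_subjective(
    "It is much better than my old Blackberry, which was a terrible phone "
    "and so difficult to type with its tiny keys."
)
borderline = is_subjective("It is such a nice phone.")
```

With this toy lexicon the last sentence contains only one lexicon word, so the rule labels it objective even though a reader would call it opinionated. That shows how much the rule depends on lexicon coverage and on the threshold.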

  25. SUPERVISED CLASSIFICATION
      • Treat sentiment analysis as a type of classification
      • Use corpora annotated for subjectivity and/or sentiment
      • Train machine learning algorithms:
        • Naïve Bayes
        • Decision trees
        • SVM
        • …
      • Learn to automatically annotate new text
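
As an illustration of the train-then-annotate loop above, here is a minimal multinomial Naïve Bayes classifier over bag-of-words features with add-one smoothing, written with the standard library only. The class name, the whitespace tokenization, and the four-document training set are assumptions chosen to keep the sketch small. It is not the course's implementation, and a corpus this small says nothing about real accuracy.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(documents, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(doc_count / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocabulary:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy annotated corpus (invented for this sketch).
train_docs = [
    "such a nice phone",
    "the touch screen is really cool",
    "a terrible phone",
    "too expensive and difficult to type",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_docs, train_labels)

label_a = model.predict("really nice screen")
label_b = model.predict("terrible and expensive")
```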

  26. TYPICAL SUPERVISED APPROACH

  27. FEATURES FOR SUPERVISED DOCUMENT-LEVEL SENTIMENT ANALYSIS
      • Researchers have tried a large set of features:
        • Term frequency and different IR weighting schemes, as in other work on classification
        • Part of speech (POS) tags
        • Opinion words and phrases
        • Negations
        • Syntactic dependency
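
Two of the feature types listed above, term frequency and negation, can be combined in a short sketch. The negation handling here (prefixing every token between a negation word and the next punctuation mark with `NOT_`) is one common heuristic and is an assumption of this example, not something the slides specify. POS tags and syntactic dependencies would need a tagger or parser and are left out.

```python
import re
from collections import Counter

NEGATIONS = {"not", "no", "never"}  # illustrative, not exhaustive
PUNCTUATION = set(".,!?;")


def extract_features(text):
    """Term-frequency features; tokens inside a negated span get a NOT_ prefix."""
    tokens = re.findall(r"[a-z]+|[.,!?;]", text.lower())
    features = Counter()
    negated = False
    for token in tokens:
        if token in PUNCTUATION:
            negated = False  # punctuation ends the negation scope
        elif token in NEGATIONS:
            negated = True
        else:
            features["NOT_" + token if negated else token] += 1
    return features


features = extract_features("The screen is not good, but the price is good.")
```

The resulting counter keeps "good" and "NOT_good" as separate features, so a classifier can weight them differently.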

  28. SENTIMENT ANALYSIS IN PYTHON

  29. EASIER AND HARDER PROBLEMS
      • Tweets from Twitter are probably the easiest
        • short and thus usually straight to the point
      • Reviews are next
        • entities are given (almost) and there is little noise
      • Discussions, comments, and blogs are hard
        • multiple entities, comparisons, noise, sarcasm, etc.

  30. ASPECT-BASED SENTIMENT ANALYSIS
      • Sentiment classification at the document or sentence (or clause) level is useful, but does not find what people liked and disliked.
      • It does not identify the targets of opinions, i.e., ENTITIES and their ASPECTS
      • Without knowing targets, opinions are of limited use.

  31. ASPECT-BASED SENTIMENT ANALYSIS
      • Much of the research is based on online reviews
      • For reviews, aspect-based sentiment analysis is easier because the entity (i.e., product name) is usually known
        • Reviewers simply express positive and negative opinions on different aspects of the entity.
      • For blogs, forum discussions, etc., it is harder:
        • both the entity and its aspects are unknown
        • there may also be many comparisons
        • and there is also a lot of irrelevant information.

  32. BRIEF DIGRESSION
      • Regular opinions: sentiment/opinion expressions on some target entities
        • Direct opinions: “The touch screen is really cool”
        • Indirect opinions: “After taking the drug, my pain has gone”
      • COMPARATIVE opinions: comparisons of more than one entity
        • “iPhone is better than Blackberry”

  33. Find entities (entity set expansion)
      • Although similar to traditional named entity recognition (NER), this task is somewhat different. (See next lectures)
      • E.g., one wants to study opinions on phones
        • given Motorola and Nokia, find all phone brands and models in a corpus, e.g., Samsung, Moto,

  34. Feature/Aspect extraction
      • May extract frequent nouns and noun phrases
        • Sometimes limited to a set known to be related to the entity of interest, or using part discriminators
        • e.g., for a scanner entity: “scanner”, “scanner has”
      • Opinion and target relations
        • Proximity or syntactic dependency
      • Standard IE methods
        • Rule-based or supervised learning
        • Often HMMs or CRFs (like standard IE)
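
The proximity option for relating opinions to targets can be sketched as follows: each opinion word is attached to the nearest known aspect term in the same sentence. The aspect set and the opinion-word lexicon are hypothetical and hand-written for this example. A dependency-based variant would replace the token distance with a path in a parse tree.

```python
import re

# Hypothetical inputs: aspect terms already extracted, plus a small opinion lexicon.
ASPECTS = {"screen", "keys", "price"}
OPINION_WORDS = {
    "cool": "positive",
    "clear": "positive",
    "tiny": "negative",
    "expensive": "negative",
}


def link_by_proximity(sentence):
    """Attach each opinion word to the closest aspect term (by token distance)."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    aspect_positions = [(i, t) for i, t in enumerate(tokens) if t in ASPECTS]
    links = []
    for i, token in enumerate(tokens):
        if token in OPINION_WORDS and aspect_positions:
            _, nearest = min(aspect_positions, key=lambda pair: abs(pair[0] - i))
            links.append((nearest, token, OPINION_WORDS[token]))
    return links


single = link_by_proximity("The touch screen is really cool.")
double = link_by_proximity("The screen is cool but the keys are tiny.")
```

When two aspects are equally close, `min` keeps the one that appears first in the sentence. That is an arbitrary tie-break and one reason proximity alone is a weak signal.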

  35. Aspect extraction using dependency grammar

  36. RESOURCES FOR SENTIMENT ANALYSIS
      • Annotated corpora
        • Used in statistical approaches (Hu & Liu 2004, Pang & Lee 2004)
        • MPQA corpus (Wiebe et al., 2005)
      • Tools
        • Algorithm based on minimum cuts (Pang & Lee, 2004)
        • OpinionFinder (Wiebe et al., 2005)
      • Lexicons
        • General Inquirer (Stone et al., 1966)
        • OpinionFinder lexicon (Wiebe & Riloff, 2005)
        • SentiWordNet (Esuli & Sebastiani, 2006)

  37. Lexical resources for Sentiment and Subjectivity Analysis Overview

  38. Sentiment (or opinion) lexica

  39. Sentiment lexica

  40. Sentiment-bearing words (ICWSM 2008)
      • Adjectives (Hatzivassiloglou & McKeown 1997, Wiebe 2000, Kamps & Marx 2002, Andreevskaia & Bergler 2006)
        • positive: honest, important, mature, large, patient
          • Ron Paul is the only honest man in Washington.
          • Kitchell’s writing is unbelievably mature and is only likely to get better.
          • To humour me my patient father agrees yet again to my choice of film

  41. Negative adjectives
      • Adjectives
        • negative: harmful, hypocritical, inefficient, insecure
          • It was a macabre and hypocritical circus.
          • Why are they being so inefficient?

  42. Subjective adjectives
      • Adjectives
        • Subjective (but not positive or negative sentiment): curious, peculiar, odd, likely, probable
          • He spoke of Sue as his probable successor.
          • The two species are likely to flower at different times.

  43. Other words
      • Other parts of speech (Turney & Littman 2003, Riloff, Wiebe & Wilson 2003, Esuli & Sebastiani 2006)
        • Verbs
          • positive: praise, love
          • negative: blame, criticize
          • subjective: predict
        • Nouns
          • positive: pleasure, enjoyment
          • negative: pain, criticism
          • subjective: prediction, feeling

  44. Phrases
      • Phrases containing adjectives and adverbs (Turney 2002, Takamura, Inui & Okumura 2007)
        • positive: high intelligence, low cost
        • negative: little variation, many troubles

  45. Creating sentiment lexica
      • Humans
      • Semi-automatic
      • Fully automatic

  46. (Semi) Automatic creation of sentiment lexica
      • Find relevant words, phrases, patterns that can be used to express subjectivity
      • Determine the polarity of subjective expressions

  47. FINDING POLARITY IN CORPORA USING PATTERNS

  48. USING PATTERNS
      • Lexico-syntactic patterns (Riloff & Wiebe 2003)
        • way with <np>: … to ever let China use force to have its way with …
        • expense of <np>: at the expense of the world’s security and stability
        • underlined <dobj>: Jiang’s subdued tone … underlined his desire to avoid disputes …
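
To make the idea of pattern matching concrete, the three patterns above can be approximated with regular expressions over raw text. This is a deliberately crude sketch: the `<np>` and `<dobj>` slots are matched as "the next word", whereas the patterns in the cited work are defined over syntactic analyses. The function name and dictionary layout are assumptions.

```python
import re

# Surface approximations of the three patterns; each slot captures one following word.
PATTERNS = {
    "way with <np>": re.compile(r"\bway with (\w+)"),
    "expense of <np>": re.compile(r"\bexpense of (\w+)"),
    "underlined <dobj>": re.compile(r"\bunderlined (\w+)"),
}


def match_patterns(sentence):
    """Return (pattern name, first word filling the slot) for every pattern that fires."""
    text = sentence.lower()
    return [
        (name, match.group(1))
        for name, regex in PATTERNS.items()
        for match in regex.finditer(text)
    ]


hit_expense = match_patterns("at the expense of the world's security and stability")
hit_underlined = match_patterns("Jiang's subdued tone underlined his desire to avoid disputes")
no_hit = match_patterns("I bought an iPhone a few days ago.")
```

A sentence in which one of these patterns fires is a candidate subjective expression. Deciding its polarity is a separate step.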

  49. DICTIONARY-BASED METHODS

  50. SEMI-SUPERVISED LEARNING (Esuli and Sebastiani, 2005)