Content Analytics Studio: Sentiment as an example of an Interesting Annotator

Ken Nelson

WW Solution Consultant


Requirements for Sentiment Analysis

  • Correctly detect positive and negative sentiment

    • Happy, Angry

  • Handle negated sentiment

    • I am not happy, I will never be happy, I have never been less happy

  • Index the normal form of terms

    • Great, greater, greatest … no need to distinguish

  • Substitute “NOT” for the actual negation

    • The negation term or phrase probably doesn’t matter

  • Create annotations compatible with ICA V3 Sentiment

    • OpinionPhrase with sentimentMatch, sentimentTerm, polarity, and ruleName

    • Requires Fixpack 1


Create a sample document

  • The sample document name must end with .txt

  • It should contain enough examples to create and validate the model

I am happy with the result.

We were ANGRY about not being included.

Negation

I am not happy with that product

I will never be HAPPY with that product

I was appalled with the result

I don't expect ever to be happy

I am not angry

I will never be angry

Some cases don't follow the simple negation rule

I have never been happier

I have never been more pleased


Create Dictionaries

  • Positive and Negative terms

  • Normal plus surface forms

    • Many use cases should index only the normal / common form, map synonyms to a common term, handle spelling variations, etc

    • Languages where the ending / suffix of a term is used for negation will require a different approach than outlined here
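One way to picture "normal plus surface forms" is a lookup from each surface form to the single form that gets indexed. The Python sketch below is illustrative only: the entries and the function name are assumptions, and a Studio dictionary would hold this mapping in its own format.

```python
# Hypothetical entries: surface form -> normal (indexed) form
SURFACE_TO_NORMAL = {
    "great": "great",
    "greater": "great",
    "greatest": "great",
    "angry": "angry",
    "angrier": "angry",
    "angriest": "angry",
}


def normal_form(token):
    """Return the normal form of a known surface form, or None if unknown."""
    return SURFACE_TO_NORMAL.get(token.lower())
```

With this mapping, "Greatest" and "great" are indexed under the same term, which is the behavior the requirements slide asks for.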


Two approaches to the dictionaries

  • Separate dictionaries for Positive vs Negative

    • Since there will probably be additional dictionaries beyond simple terms, this might be the best choice

  • Single dictionary with a “Polarity” column

    • Any feature of an annotation can be used in Rules

    • Performance should be slightly better with a combined dictionary

  • For this example, a single dictionary will be used
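The two layouts can be compared with a small Python sketch. It assumes a CSV-like dictionary with hypothetical column names (surface, normal, polarity) and made-up rows; the real Studio dictionary format is not shown in these slides.

```python
import csv
import io

# Hypothetical dictionary content: one row per surface form
DICTIONARY_CSV = """surface,normal,polarity
happy,happy,Positive
pleased,pleased,Positive
angry,angry,Negative
appalled,appalled,Negative
"""


def load_dictionary(text):
    """Map each lower-cased surface form to (normal form, polarity)."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["surface"].lower(): (row["normal"], row["polarity"])
        for row in reader
    }


def split_by_polarity(dictionary):
    """Derive the 'separate dictionaries' layout from the combined one."""
    separate = {}
    for surface, (_, polarity) in dictionary.items():
        separate.setdefault(polarity, set()).add(surface)
    return separate


SENTIMENT = load_dictionary(DICTIONARY_CSV)
```

The combined form needs one lookup per token, and a rule can still filter on the polarity value, which mirrors the "any feature of an annotation can be used in Rules" point.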


Is there a source for dictionaries?

  • Sometimes dictionaries can be found online

    • Many sources of Sentiment dictionaries

    • For example, search Google for: free download german "sentiment dictionary"

    • We can’t use online sources without Legal approval

    • For an interesting discussion and a source of data, see http://provalisresearch.com/wordstat/Sentiment-Analysis.html

  • Sometimes we can get data from other products

    • ICA V3 (CCI) Annotator is a possible source

    • configurations\indexservice\data\Sentiment\languages\en\dictionaries


ICA V3 Sentiment Dictionaries

  1,741  AA.dict
  1,748  ADDRESS_KW.dict
  1,973  Adverb.dict
119,668  AlwaysNegatives.dict
  7,967  AlwaysNegatives_MWE.dict
 57,747  AlwaysPositives.dict
  6,357  AlwaysPositives_MWE.dict
  1,663  AM_PM.dict
  1,970  anaphoraDict_en.dict
  1,730  Be.dict
  3,588  Budget-Budget.dict
  1,893  CARD.dict
  1,697  Core-Location.dict
  1,856  Core-Organization.dict
  1,701  Core-Person.dict
  1,674  Core-Product.dict
  1,659  Core-Unknown.dict
  1,838  Det.dict
  1,953  Emoticon-NegativeFeeling_Emoticon.dict
  1,811  Emoticon-PositiveFeeling_Emoticon.dict
  1,734  FULL_MONTH.dict
  1,695  Have.dict
  2,632  LatentSentimentModifiers_Less.dict
  2,911  LatentSentimentModifiers_More.dict
  2,923  LatentSentiment_Less.dict
  2,733  LatentSentiment_More.dict
  2,336  LatentSentiment_Negation.dict
  1,743  NotNounWords.dict
  4,562  Opinions-Contextual.dict
  6,649  Opinions-NegativeAttitude.dict
  4,511  Opinions-NegativeBudget.dict
  7,127  Opinions-NegativeCompetence.dict
  4,699  Opinions-NegativeFeeling.dict
 10,878  Opinions-NegativeFunctioning.dict
  5,598  Opinions-PositiveAttitude.dict
  2,979  Opinions-PositiveBudget.dict
 12,817  Opinions-PositiveCompetence.dict
  8,666  Opinions-PositiveFeeling.dict
  6,809  Opinions-PositiveFunctioning.dict
 53,272  Opinions-Uncertain.dict
  1,914  ORDINAL.dict
  1,819  SentimentBlockerFilterDict.dict
  2,207  SentimentBlockersDictionary.dict
  3,765  SentimentBlockersDictionary_Phrases.dict
  3,078  SentimentIntensifierDictionary.dict
  9,556  SentimentNegationTriggersDictionary.dict
  3,667  SentimentNegationTriggersDictionary_Verbs.dict
  1,651  Slang-Contextual.dict
  1,882  Slang-Negative.dict
  1,746  Slang-NegativeAttitude.dict
  1,652  Slang-NegativeFunctioning.dict
  1,899  Slang-Positive.dict
  1,934  Slang-PositiveAttitude.dict
  1,646  Slang-PositiveBudget.dict
  1,795  Slang-PositiveFeeling.dict
  1,646  Slang-PositiveFunctioning.dict
  1,820  Slang-Uncertain.dict
  1,864  Slang-Unknown.dict
  2,317  STATE.dict
  2,278  SupportNegPart.dict
  2,196  SupportWords.dict
  2,933  TaggerDependentDictionary.dict
  1,684  Variations-Unknown.dict


ICA V3 Sentiment Dictionaries – Always Positive

abiding, abidingly, abound, abounded, abounding, abounds, absolve, absolved,
absolvedly, absolvely, absolves, absolving, absorbing, absorbingly, abundance,
abundances, abundant, abundantly, accedely, acceptable, acceptably, accessible,
accessiblely, acclaim, acclaimed, acclaiming, acclaimly, acclaims, acclamation,
acclamations, accolade, accolades, accommodative, accommodatively, accomplish,
accomplished, accomplishedly, accomplishes, accomplishing, accomplishly,
accomplishment, accomplishments, accordance, accountable, accountably, accurate

  • Are these really positive terms?

  • Are they ALWAYS positive as implied by the file names? (NO)

  • The Always Positive and Negative dictionaries did a reasonable job with the Consumer Review data

    • Better accuracy would be possible by converting additional ICA V3 dictionaries into Studio dictionaries


Verify the result

  • After creating dictionaries, adding the jar(s) to the pipeline, and building resources, analyze the sample document


Negation

  • Drag a negation phrase with one intervening token into the Rule Builder

  • Change number of occurrences and feature type for the token

  • Check the polarity feature to limit selections to Positive (or Negative)
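The rule built above amounts to: a negation term, then a limited number of arbitrary tokens, then a sentiment term of the chosen polarity. The Python sketch below imitates that pattern outside Studio. The word lists, the MAX_GAP limit and the function names are assumptions made for illustration; they are not taken from the Studio rule.

```python
import re

NEGATIONS = {"not", "never", "don't"}                  # hypothetical negation terms
POLARITY = {"happy": "Positive", "angry": "Negative"}  # hypothetical dictionary
MAX_GAP = 4  # assumed upper bound on intervening tokens


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def find_negated(text, polarity):
    """Return (negation, term) pairs where a negation term is followed,
    after at most MAX_GAP other tokens, by a term of the given polarity."""
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok not in NEGATIONS:
            continue
        # Candidate positions: directly after the negation up to MAX_GAP tokens later
        for j in range(i + 1, min(i + MAX_GAP + 2, len(tokens))):
            if POLARITY.get(tokens[j]) == polarity:
                hits.append((tok, tokens[j]))
                break
    return hits
```

Running it once per polarity corresponds to checking the polarity feature for Positive in one rule and Negative in another.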


Insert the Annotation

  • Select the Sentiment term, then Insert Annotation


Add features to the annotation

  • Create a new feature using Normalization

    • ConvertToLowerCase on _coveredText

    • Ideally, the lemma would be used, but there is a problem with this in Studio


Avoid redundancy

  • In most cases, you should remove the annotation consumed by the rule.

    • The terms not consumed by the rule will remain

    • The remaining terms will be converted to OpinionPhrase

  • Studio bug: The Sentiment annotation won’t be removed if the lemma is used in the new feature, but will be removed if Covered Text is used

  • If this worked, the Negation rules could produce OpinionPhrase directly


Create OpinionPhrase for non-negated terms

  • Annotations for terms that were negated were removed by the rule

  • The remainder should be converted to OpinionPhrase

    • Drag a Sentiment term to the Builder

    • On the Annotation tab, create OpinionPhrase (without the com.… package prefix)

    • Drag the lemma to the Feature to create sentimentTerm and sentimentMatch

    • Drag the polarity to the feature

    • Create a string feature for ruleName (the property is required, but any value can be used)

  • Delete the Sentiment annotation (This doesn’t work, but is a good practice)
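As a rough picture of what this rule produces, the Python sketch below turns the remaining Sentiment annotations into OpinionPhrase records. The four feature names come from the requirements slide; the dictionary representation, the sample annotations and the rule name are assumptions.

```python
def to_opinion_phrase(sentiment, rule_name="SimpleSentiment"):
    """Build OpinionPhrase features from a remaining Sentiment annotation.

    sentimentTerm and sentimentMatch are both filled from the lemma,
    as described in the steps above.
    """
    term = sentiment["lemma"]
    return {
        "type": "OpinionPhrase",
        "sentimentTerm": term,
        "sentimentMatch": term,
        "polarity": sentiment["polarity"],
        "ruleName": rule_name,  # required, but any value can be used
    }


# Hypothetical annotations left over after the negation rules ran
remaining = [
    {"coveredText": "ANGRY", "lemma": "angry", "polarity": "Negative"},
    {"coveredText": "happy", "lemma": "happy", "polarity": "Positive"},
]
phrases = [to_opinion_phrase(s) for s in remaining]
```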


Verify the result


Convert Negated sentiment to OpinionPhrase

  • Drag negated sentiment to Builder

  • Insert annotation as OpinionPhrase

  • Add a feature for “NOT”

  • Drag the lower case term over Feature

  • Concatenate “NOT” with the term

  • Add polarity value and ruleName

  • Studio bug?

    • When creating a feature with an existing feature name, an error is sometimes displayed even before the type is selected

    • After creating the feature, it can be renamed
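The concatenation step can be sketched in Python as follows. Two details are assumptions that the slides do not spell out: a single space separates "NOT" from the term, and the polarity written to the OpinionPhrase is the opposite of the negated term's polarity. The rule name is also hypothetical.

```python
# Assumed: negating a term flips its polarity
FLIPPED = {"Positive": "Negative", "Negative": "Positive"}


def negated_opinion_phrase(covered_text, polarity, rule_name="NegatedSentiment"):
    """Build an OpinionPhrase for a negated sentiment term."""
    # "NOT" replaces whatever negation term or phrase was actually used
    term = "NOT " + covered_text.lower()
    return {
        "type": "OpinionPhrase",
        "sentimentTerm": term,
        "sentimentMatch": term,
        "polarity": FLIPPED[polarity],
        "ruleName": rule_name,
    }
```

For "I will never be HAPPY with that product" this yields the term "NOT happy", so "not happy", "never happy" and similar variants all land on one indexed value.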


Verify the result


Order of Parsing Rules


Cleanup

  • Remove annotations that are not needed in the index


Disable built-in Sentiment Annotator

  • Enable custom Sentiment Annotator instead of System T Annotator

    copy ES_NODE_ROOT/master_config/<collectionID>.indexservice/specifiers/lexical/NullAnnotator.xml

    ES_NODE_ROOT/master_config/<collectionID>.indexservice/specifiers/analytics/Sentiment.xml

    • Be sure to save the original Sentiment.xml

    • Fixpack 1 is required

  • Export to ICA (no index field or facet mapping is required)

  • Rebuild the index
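The copy step, including the reminder to save the original Sentiment.xml, could be scripted roughly as below. This is a sketch, not a supported procedure: the directory layout is taken from the paths on this slide, while the function name and the backup file name (Sentiment.xml.orig) are assumptions.

```python
import shutil
from pathlib import Path


def disable_builtin_sentiment(es_node_root, collection_id):
    """Back up Sentiment.xml, then overwrite it with NullAnnotator.xml.

    Returns the path of the backup copy.
    """
    specifiers = (
        Path(es_node_root)
        / "master_config"
        / f"{collection_id}.indexservice"
        / "specifiers"
    )
    null_annotator = specifiers / "lexical" / "NullAnnotator.xml"
    sentiment = specifiers / "analytics" / "Sentiment.xml"
    backup = sentiment.with_name("Sentiment.xml.orig")  # assumed backup name

    # Save the original only once, so a second run cannot overwrite it
    if not backup.exists():
        shutil.copy2(sentiment, backup)
    shutil.copy2(null_annotator, sentiment)
    return backup
```

The function would be called with the value of ES_NODE_ROOT and the ID of the collection being changed; copying the backup over Sentiment.xml restores the built-in annotator.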


V3 Sentiment vs Custom Annotator


Refine the model

  • “I cannot believe the poor drying performance …” “Won’t open”

    • Performance is not a sentiment, so not negated

    • Should negation terms not applied to sentiment be treated as negative sentiment?

    • How would you pick up the term(s) being modified?

  • “I have never been happier”

    • Can comparators (faster, bigger, hotter) be negated?

  • “I have never been more happy”

    • How would you handle this type of statement?

  • “I have never been less happy”

    • Does reducing the degree of a term make it negative? (Always?)

      • less, lessen, reduce, decrease, minimize, fewer, …

  • Domain specific terms

    • Agitator is not negative when associated with washing machines

  • Broad coverage vs confidence

    • V3 annotator has many terms that might not belong in a Sentiment dictionary

      • abound, absolve, absorbing, abundant, acceptable
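The "never been happier / more happy / less happy" questions are left open on this slide. The Python sketch below shows one possible answer for discussion, not a recommended design: a comparative or "more" after "never" is treated as positive, "less" as negative, and a plain positive term as negated. All word lists and the function name are hypothetical.

```python
import re

COMPARATIVES = {"happier": "happy"}  # hypothetical comparative -> normal form
POSITIVE = {"happy", "pleased"}      # hypothetical positive terms


def classify_never(text):
    """Return (term, polarity) for the first sentiment found after 'never',
    or None if the sentence has no 'never' or no known term after it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if "never" not in tokens:
        return None
    rest = tokens[tokens.index("never") + 1:]
    for i, tok in enumerate(rest):
        if tok in COMPARATIVES:
            # "never been happier": not a negation at all
            return (COMPARATIVES[tok], "Positive")
        if tok in ("more", "less") and i + 1 < len(rest) and rest[i + 1] in POSITIVE:
            # "never been more pleased" vs "never been less happy"
            return (rest[i + 1], "Positive" if tok == "more" else "Negative")
        if tok in POSITIVE:
            # plain negation: "will never be happy"
            return ("NOT " + tok, "Negative")
    return None
```

Even this small version shows why these cases do not follow the simple negation rule: the word after "never" decides whether the negation applies.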


Refine the Model

  • Specific phrases might be important to cover text that can’t be handled by simple dictionaries and rules, or for some use cases

  • Would you handle these with dictionaries (resolve x, address x) plus rules, or with a phrase dictionary? Remember that these can be negated!

    able to address all of my questions    address my issues
    able to get them worked out            address my needs
    able to resolve                        address our needs
    able to resolve my issue               address the issue
    able to resolve my issues              address your issues
    able to resolve my problem             addressed my concerns
    able to resolve my problems            addressed my issue
    able to resolve the issue              addressed my issues
    able to resolve the problem            addressed my problem
    able to resolve the problems           addressed properly
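The phrase-dictionary option can be sketched in Python as a longest-match lookup followed by a check for a negation term directly in front of the phrase. The sketch assumes that these phrases count as positive and that a preceding negation makes them negative; the short phrase list, the negation list and the function name are illustrative only.

```python
import re

# Hypothetical phrase dictionary (all entries assumed to be positive)
PHRASES = [
    "able to resolve my issue",
    "able to resolve",
    "addressed my concerns",
]
NEGATIONS = {"not", "never", "wasn't", "weren't"}  # hypothetical negation terms


def match_phrases(text):
    """Return (phrase, polarity) for each phrase found, preferring longer phrases."""
    tokens = re.findall(r"[a-z']+", text.lower())
    by_length = sorted((p.split() for p in PHRASES), key=len, reverse=True)
    results = []
    i = 0
    while i < len(tokens):
        for words in by_length:
            if tokens[i:i + len(words)] == words:
                # Remember that these phrases can be negated
                negated = i > 0 and tokens[i - 1] in NEGATIONS
                results.append((" ".join(words), "Negative" if negated else "Positive"))
                i += len(words)
                break
        else:
            i += 1  # no phrase starts at this token
    return results
```

The alternative, a dictionary of verbs such as resolve and address plus rules for their objects, would cover unseen variants at the cost of more rule-writing.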


Refine the Model

  • Are there other challenging phrases or use of language?

  • Sarcasm?

    • “Yeah, right”: probably negative

    • “That’s just great”: you can only tell from the context, and it may not be possible at all with written text.

  • Slang is common, and probably not included in many dictionaries

    • Slang also evolves over time

  • Non-textual Sentiment

    •  ;) :)


Lab Assignment

  • Create a Sentiment annotator

    • Handle “always negative” or “always positive” terms

    • Handle negation blocking terms or phrases

    • Handle unassigned negation terms (if you think that is appropriate)

  • If you used English, export to the HappyHome collection and rebuild the index

  • Be ready to discuss your approach and results

