
Computational Models of Discourse Analysis


Presentation Transcript


  1. Computational Models of Discourse Analysis Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute

  2. Warm Up Discussion • Look at the analysis I have passed out • Note: inscribed sentiment is underlined, invoked sentiment is italicized, and relatively frequent words that appear in either of these types of expressions are marked in bold • Do you see any sarcastic comments here? What connection, if any, do you see between sentiment and sarcasm? • Keeping in mind the style of templates you read about, do you see any snippets of text in these examples that you think would make good templates?

  3. Patterns I see
  • Inscribed sentiment
    • About 24% of words in underlined segments are relatively high frequency
    • 3 “useful” patterns out of 18 underlined portions
    • Examples: Like; Good; More CW than CW
  • Invoked sentiment
    • About 39% of words were relatively high frequency
    • About 7 possibly useful patterns out of 17, but only 3 look really unambiguous
    • Examples: CW like one; CW the CW of the CW; Makes little CW to CW to the CW; CW and CW of an CW; Like CW on CW; Leave you CW a little CW and CW; CW more like a CW

  4. Unit 3 Plan • 3 papers we will discuss all give ideas for using context (at different grain sizes) • Local patterns without syntax • Using bootstrapping • Local patterns with syntax • Using a parser • Rhetorical patterns within documents • Using a statistical modeling technique • The first two papers introduce techniques that could feasibly be used in your Unit 3 assignment

  5. Student Comment: Point of Discussion • To improve performance, language technologies seem to approach the task in one of two ways. Some approaches attempt to generate a better abstract model that provides the translation mechanism between a string of terms (a sentence) and our human mental model of sentiment in language. Alternatively, others start with a baseline and try to find a corpus or dictionary of terms that provides evidence for sentiment. • Please clarify

  6. Connection between Appraisal and Sarcasm • Student Comment: I’m not exactly sure how one would go about applying appraisal theory to something as elusive as sarcasm. • A sarcastic example of invoked negative sentiment appears in Martin and White, p. 72

  7. Inscribed versus Invoked • Do we see signposts that tell us how to interpret invoked appraisals?

  8. Overview of Approach
  • Start with a small amount of labeled data
  • Generate patterns from the examples
  • Select those that appear in the training data more than once and don’t appear in both a 1-star and a 5-star labeled example
  • Expand the data through search, using examples from the labeled data as queries (take the top 50 snippet results)
  • Represent the data in terms of templatized patterns
  • Classify with a modified kNN approach
  How could you do this with SIDE?
  1. Build a feature extractor to generate the set of patterns
  2. Use search to set up the expanded set of data
  3. Apply the generated patterns to the expanded set of data
  4. Use kNN classification

  9. Pattern Generation • Classify words into high-frequency words (HFWs) versus content words (CWs) • HFWs occur at least 100 times per million words • CWs occur no more than 1,000 times per million words • Also add [product], [company], [title] as additional HFWs • Constraints on patterns: 2–6 HFWs, 1–6 slots for CWs, and patterns start and end with HFWs • Would Appraisal theory suggest other categories of words?
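The frequency thresholds and pattern constraints translate fairly directly into code. Below is a minimal sketch, assuming a hypothetical `freq_per_million` dictionary mapping words to per-million corpus frequencies (a stand-in for whatever frequency resource is used). Note that words between 100 and 1,000 per million qualify as both HFW and CW; the sketch resolves the ambiguity in favor of HFW, which is a simplification rather than the paper's exact procedure.

```python
HFW_MIN = 100    # high-frequency words: at least 100 occurrences per million
CW_MAX = 1000    # content words: no more than 1,000 occurrences per million
SPECIAL_HFWS = {"[product]", "[company]", "[title]"}

def word_classes(word, freq_per_million):
    """Return the classes a word may take; words with frequency between
    100 and 1,000 per million qualify as both HFW and CW."""
    f = freq_per_million.get(word.lower(), 0.0)
    classes = set()
    if word in SPECIAL_HFWS or f >= HFW_MIN:
        classes.add("HFW")
    if f <= CW_MAX and word not in SPECIAL_HFWS:
        classes.add("CW")
    return classes

def generate_patterns(tokens, freq_per_million):
    """Enumerate contiguous token spans that form legal patterns:
    2-6 HFWs, 1-6 CW slots, and the span starts and ends with an HFW.
    CWs are abstracted to the generic slot 'CW'; HFWs keep their surface
    form. (Simplification: a word eligible for both classes is treated
    as an HFW.)"""
    abstract = [t.lower() if "HFW" in word_classes(t, freq_per_million) else "CW"
                for t in tokens]
    patterns = set()
    for i, first in enumerate(abstract):
        if first == "CW":
            continue                      # pattern must start with an HFW
        for j in range(i + 1, len(abstract)):
            if abstract[j] == "CW":
                continue                  # ...and end with an HFW
            span = abstract[i:j + 1]
            n_cw = span.count("CW")
            n_hfw = len(span) - n_cw
            if 2 <= n_hfw <= 6 and 1 <= n_cw <= 6:
                patterns.add(" ".join(span))
    return patterns
```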

  10. Expand Data: “Great for Insomniacs…” What could they have done instead?

  11. Pattern Selection • Approach: select patterns that appear in the training data more than once and don’t appear in both a 1-star and a 5-star labeled example • Could instead have used an attribute selection technique like Chi-squared attribute evaluation • What do you see as the trade-offs between these approaches?
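A rough sketch of the selection rule, assuming each training example arrives as a (pattern set, star label) pair, with the Chi-squared alternative noted in a comment:

```python
from collections import Counter, defaultdict

def select_patterns(examples):
    """Keep patterns that (a) occur more than once across the training data
    and (b) never occur in both a 1-star and a 5-star labeled example.
    `examples` is a list of (pattern_set, star_label) pairs, e.g. the output
    of generate_patterns paired with each review's star rating."""
    counts = Counter()
    stars_seen = defaultdict(set)
    for patterns, star in examples:
        for p in patterns:
            counts[p] += 1
            stars_seen[p].add(star)
    return {p for p, c in counts.items()
            if c > 1 and not {1, 5} <= stars_seen[p]}

# An attribute-selection alternative would rank patterns by association
# with the label instead of hard-filtering, e.g. running
# sklearn.feature_selection.chi2 over a pattern-count matrix and keeping
# the top-scoring patterns.
```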

  12. Representing Data as a Vector • Most of the features were from the generated patterns • Also included punctuation-based features: number of !, number of ?, number of quotes, number of capitalized words • What other features would you use? • What modifications to feature weights would you propose?
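A hedged sketch of such a feature vector, reusing `generate_patterns` from the earlier sketch. The binary pattern-match values are a simplification; the paper also assigns partial weights to incomplete matches.

```python
import re

def punctuation_features(text):
    """The punctuation-based features listed above: counts of '!', '?',
    quotation marks, and capitalized words."""
    return {
        "n_exclaim": text.count("!"),
        "n_question": text.count("?"),
        "n_quotes": sum(text.count(q) for q in ('"', "“", "”")),
        "n_capitalized": len(re.findall(r"\b[A-Z][A-Za-z]*", text)),
    }

def vectorize(text, selected_patterns, freq_per_million):
    """Binary pattern-match features plus the punctuation counts."""
    matched = generate_patterns(text.split(), freq_per_million)
    feats = {f"pat::{p}": 1.0 for p in selected_patterns if p in matched}
    feats.update(punctuation_features(text))
    return feats
```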

  13. Modified kNN • The label is a weighted average over the k nearest neighbors, so matches from the majority class count more • Is there a simpler approach?
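One way to read the weighted-average idea, as a sketch: take the k nearest labeled vectors and average their labels, giving extra weight to neighbors from the majority class. The double-weighting here is an illustrative choice, not necessarily the paper's exact scheme; the simpler alternative the slide hints at would be a plain majority vote or unweighted average.

```python
import math
from collections import Counter

def weighted_knn_label(query, train, k=5):
    """`train` is a list of (feature_dict, numeric_label) pairs; returns a
    label in the 1-5 star range. Majority-class neighbors are weighted
    double (an assumption for illustration)."""
    def dist(a, b):
        keys = a.keys() | b.keys()
        return math.sqrt(sum((a.get(f, 0.0) - b.get(f, 0.0)) ** 2
                             for f in keys))

    nearest = sorted(train, key=lambda ex: dist(query, ex[0]))[:k]
    labels = [lab for _, lab in nearest]
    majority = Counter(labels).most_common(1)[0][0]
    weights = [2.0 if lab == majority else 1.0 for lab in labels]
    return sum(w * lab for w, lab in zip(weights, labels)) / sum(weights)
```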

  14. Evaluation • Baseline technique: count as positive examples those that have a highly negative star rating but lots of positive words • Is this really a strong baseline? Look at the examples from the paper. • Student comment: I am …rather wary of the effectiveness of their approach because it seems that they cherry-picked a heuristic ‘star-sentiment’ baseline to compare their results to in Table 3 but do not offer a similar baseline for Table 2.
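For concreteness, a sketch of the star-sentiment heuristic as described on the slide; the mini-lexicon and thresholds are illustrative placeholders, not the paper's.

```python
# Illustrative mini-lexicon; the paper's actual positive-word list is not shown.
POSITIVE_WORDS = {"great", "love", "perfect", "amazing", "wonderful", "best"}

def star_sentiment_baseline(text, stars, min_positive=2, max_stars=2):
    """Flag a review when positive-worded text is paired with a highly
    negative star rating."""
    n_pos = sum(1 for w in text.lower().split()
                if w.strip('.,!?"') in POSITIVE_WORDS)
    return stars <= max_stars and n_pos >= min_positive

# e.g. star_sentiment_baseline("Great product, best purchase ever!", stars=1) -> True
```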

  15. Evaluation • What do you conclude from this? • What surprises you?

  16. Revisit: Overview of Approach
  • Start with a small amount of labeled data
  • Generate patterns from the examples
  • Select those that appear in the training data more than once and don’t appear in both a 1-star and a 5-star labeled example
  • Expand the data through search, using examples from the labeled data as queries (take the top 50 snippet results)
  • Represent the data in terms of templatized patterns
  • Classify with a modified kNN approach
  How could you do this with SIDE?
  1. Build a feature extractor to generate the set of patterns
  2. Use search to set up the expanded set of data
  3. Apply the generated patterns to the expanded set of data
  4. Use kNN classification

  17. What would it take to achieve inter-rater reliability? • You can find definitions and examples on the website, just like in the book, but it’s not enough… • Strategies • Simplify: are there distinctions that don’t buy us much anyway? • Add constraints • Identify borderline cases • Use decision trees

  18. What would it take to achieve inter-rater reliability? • Look at Beka and Elijah’s analyses in comparison with mine • What were our big disagreements? • How would we resolve them?

  19. Questions?
