
Computing Word-Pair Antonymy


Presentation Transcript


  1. Computing Word-Pair Antonymy • *Saif Mohammad, *Bonnie Dorr, φ Graeme Hirst • *Univ. of Maryland, φ Univ. of Toronto • EMNLP 2008

  2. Introduction • Antonymy: a pair of semantically contrasting words. • Ex: Strongly antonymous: Hot / Cold; Semantically contrasting: Enemy / Fan; Not antonymous: Penguin / Clown

  3. Usage • Detecting contradictions • Detecting humor • Automatic thesaurus creation

  4. Problem Definition • Given a thesaurus, identify the antonymous category pairs. • Assign a degree of antonymy to each pair of antonymous categories.

  5. Hypothesis(1) • The Co-occurrence Hypothesis of Antonyms • Antonymous word pairs occur together much more often than other word pairs.

  6. Hypothesis(1) • Empirical proof: • 1,000 antonymous pairs from WordNet • 1,000 randomly generated word pairs • Use the BNC as the corpus, with a window size of 5 • Calculate the mutual information for each word pair and average it (see the sketch below)
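A minimal sketch of this check, assuming the corpus is available as a flat list of tokens; the BNC preprocessing, exact window handling, and normalization here are simplifications, not the paper's implementation:

```python
import math
from collections import Counter

def average_pmi(pairs, tokens, window=5):
    """Average pointwise mutual information over a list of word pairs,
    counting co-occurrence within a `window`-token span to the right."""
    unigram = Counter(tokens)
    total = len(tokens)
    cooc = Counter()
    for i, w in enumerate(tokens):
        for other in tokens[i + 1: i + 1 + window]:
            cooc[(w, other)] += 1
            cooc[(other, w)] += 1
    scores = []
    for a, b in pairs:
        if cooc[(a, b)] == 0 or unigram[a] == 0 or unigram[b] == 0:
            continue  # skip pairs that never (co-)occur in the corpus
        p_ab = cooc[(a, b)] / total
        p_a = unigram[a] / total
        p_b = unigram[b] / total
        scores.append(math.log2(p_ab / (p_a * p_b)))
    return sum(scores) / len(scores) if scores else 0.0

# Compare, e.g., average_pmi(antonym_pairs, bnc_tokens) against
# average_pmi(random_pairs, bnc_tokens) to test the hypothesis.
```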

  7. Hypothesis(2) • The Distributional Hypothesis of Antonyms • Antonyms occur in similar contexts more often than non-antonymous words • Ex: work: activity of doing a job; play: activity of relaxation

  8. Hypothesis(2) • Empirical proof: • Use the same sets of word pairs as in Hypothesis(1) • Calculate the distributional distance between their categories

  9. Distributional Distance between Two Thesaurus Categories • c1, c2: thesaurus categories • I(x, y): pointwise mutual information between x and y • T(c): the set of all words w such that I(c, w) > 0
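The formula itself did not survive the transcript. A hedged reconstruction consistent with these definitions, following Lin's distributional similarity measure on which this line of work builds (the paper's exact formula may differ):

\[
\mathrm{Sim}(c_1, c_2) \;=\;
\frac{\displaystyle\sum_{w \,\in\, T(c_1)\,\cap\,T(c_2)} \bigl( I(c_1, w) + I(c_2, w) \bigr)}
     {\displaystyle\sum_{w \,\in\, T(c_1)} I(c_1, w) \;+\; \sum_{w \,\in\, T(c_2)} I(c_2, w)}
\]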

  10. Method • Determine pairs of thesaurus categories that are contrasting in meaning • Use the co-occurrence and distributional hypotheses to determine the degree of antonymy of word pairs

  11. Method • 16 affix rules were applied to the Macquarie Thesaurus • 2,734 word pairs were generated as a seed set (see the sketch below) • Exceptions such as sect / insect are relatively few
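A hedged sketch of how such affix rules generate seed pairs; only a few illustrative rules are shown, and the paper's 16 rules and the Macquarie vocabulary interface are not reproduced here:

```python
def affix_seed_pairs(vocabulary):
    """Pair a word X with a vocabulary word formed by adding a negating
    affix to X (a small illustrative subset of rules)."""
    prefixes = ["un", "in", "im", "dis", "non"]
    vocab = set(vocabulary)
    pairs = set()
    for word in vocab:
        for p in prefixes:
            if p + word in vocab:
                pairs.add((word, p + word))          # e.g. (clear, unclear)
        if word.endswith("ful") and word[:-3] + "less" in vocab:
            pairs.add((word, word[:-3] + "less"))    # e.g. (harmful, harmless)
    return pairs

# affix_seed_pairs(["clear", "unclear", "harmful", "harmless", "sect", "insect"])
# also yields the false pair (sect, insect) -- the kind of exception noted above.
```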

  12. Method • 10,807 semantically contrasting word pairs taken from WordNet

  13. Method • If any word in thesaurus category C1 is antonymous to any word in category C2 as per a seed antonym pair, then the two categories are marked as contrasting. • If no word in C1 is antonymous to any word in C2, then the categories are considered not contrasting
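A minimal sketch of this marking step, assuming the thesaurus categories are available as word sets; the data structures are illustrative, not the paper's:

```python
def contrasting_category_pairs(categories, seed_pairs):
    """Mark an (unordered) pair of thesaurus categories as contrasting if
    some seed antonym pair has one word in each category.

    categories: dict mapping category id -> set of member words
    seed_pairs: iterable of (word1, word2) seed antonym pairs
    """
    seed_pairs = list(seed_pairs)
    contrasting = set()
    ids = sorted(categories)
    for i, c1 in enumerate(ids):
        for c2 in ids[i + 1:]:
            words1, words2 = categories[c1], categories[c2]
            if any((a in words1 and b in words2) or (a in words2 and b in words1)
                   for a, b in seed_pairs):
                contrasting.add((c1, c2))
    return contrasting
```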

  14. Method • Degree of antonymy at the category level • By the distributional hypothesis of antonyms, the degree of antonymy between two contrasting thesaurus categories is taken to be directly proportional to the distributional closeness of the two categories
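Stated as a formula, as a hedged paraphrase of this claim using the similarity defined for slide 9:

\[
\mathrm{AntDeg}(c_1, c_2) \;\propto\; \mathrm{Sim}(c_1, c_2), \qquad (c_1, c_2) \text{ contrasting.}
\]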

  15. Method • Degree of antonymy at the word level • Target words that belong to the same thesaurus paragraphs as a seed antonym pair linking the two contrasting categories → highly antonymous • Target words that do not both belong to the same paragraphs as a seed antonym pair, but occur in contrasting categories → moderately antonymous • Target words with a low tendency to co-occur → weakly antonymous
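A hedged sketch of this three-level decision; the paragraph and category lookups and the seed index are assumed helpers rather than the paper's code, and the corpus co-occurrence signal for the low case is omitted for brevity:

```python
def antonymy_level(w1, w2, paragraphs_of, categories_of, contrasting, seeds_linking):
    """Classify a target word pair as 'high', 'medium', or 'low' antonymy
    using the category- and paragraph-level evidence described above."""
    best = "low"
    for c1 in categories_of(w1):
        for c2 in categories_of(w2):
            pair = (c1, c2) if (c1, c2) in contrasting else (c2, c1)
            if pair not in contrasting:
                continue
            best = "medium"  # the targets fall in contrasting categories
            for s1, s2 in seeds_linking(*pair):
                # high: each target shares a thesaurus paragraph with one seed word
                shares_seed_paragraphs = (
                    (paragraphs_of(w1) & paragraphs_of(s1)
                     and paragraphs_of(w2) & paragraphs_of(s2))
                    or (paragraphs_of(w1) & paragraphs_of(s2)
                        and paragraphs_of(w2) & paragraphs_of(s1))
                )
                if shares_seed_paragraphs:
                    return "high"
    return best
```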

  16. Method • Adjacency Heuristic • Most thesauri are ordered such that contrasting categories tend to be adjacent

  17. Evaluation • 1,112 closest-opposite questions designed to prepare students for the GRE (Graduate Record Examination) • 162 questions as the development set • 950 questions as the test set

  18. Evaluation • Closest-opposite questions • Ex: adulterate: a. renounce b. forbid c. purify d. criticize e. correct

  19. Evaluation • Closest-opposite questions • Ex: adulterate: a. renounce b. forbid c. purify d. criticize e. correct • Glosses: adulterate = make impure; renounce = formally give up; forbid = prohibit; purify = make pure; criticize = find fault with; correct = right
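Given a word-pair antonymy scorer, answering such a question reduces to picking the option with the highest score against the target; a minimal sketch, where antonymy_score is an assumed function rather than the paper's API:

```python
def answer_closest_opposite(target, options, antonymy_score):
    """Return the option with the highest antonymy score against the target."""
    return max(options, key=lambda opt: antonymy_score(target, opt))

# Example with the question above (expected answer: "purify"):
# answer_closest_opposite("adulterate",
#                         ["renounce", "forbid", "purify", "criticize", "correct"],
#                         antonymy_score)
```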

  20. Evaluation

  21. Discussion • The automatic approach does indeed mimic human intuitions of antonymy. • In languages without a wordnet, substantial accuracy can still be achieved. • WordNet seeds and affix-generated seeds are complementary.

  22. Conclusion • Proposed an empirical approach to antonymy that combines corpus co-occurrence statistics with the structure of a thesaurus. • The system can identify the degree of antonymy between word pairs. • Provided empirical evidence that antonym pairs tend to be used in similar contexts.

  23. Thanks
