Automatic Summary Evaluation

Presentation Transcript

  1. Automatic Summary Evaluation • Ross Greenwood

  2. Recap • Automatically evaluate summaries of text documents • Evaluate content coverage • Compare against one or more ideal summaries

  3. Pyramid Evaluation • Manually annotate texts for phrases expressing similar ideas (summary content units) • Judge content coverage by number of overlapping summary content units

  4. ROUGE: Four Summary Evaluation Measures • ROUGE-N: N-gram Co-Occurrence • Number of matching N-word substrings • ROUGE-L: Longest Common Subsequence • Allows for skipping words • Ex. “a b d f” is a subsequence of “a b c d e f” • ROUGE-W: Weighted LCS • Weight consecutive matches higher • ROUGE-S: Skip-bigram • Number of matching in-order word pairs, with arbitrary gaps between the two words
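The match counts behind three of these measures can be sketched in a few lines. This is a minimal illustration rather than the ROUGE package itself: the function names are made up for this example, tokenization is plain whitespace splitting, and ROUGE-W is left out because its weighting function is not given on the slide.

```python
from collections import Counter
from itertools import combinations


def ngrams(words, n):
    """Return the contiguous n-word substrings of a token list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_matches(peer, model, n):
    """ROUGE-N style count: n-grams shared by peer and model (clipped)."""
    peer_counts = Counter(ngrams(peer, n))
    model_counts = Counter(ngrams(model, n))
    return sum((peer_counts & model_counts).values())


def lcs_length(a, b):
    """ROUGE-L style count: length of the longest common subsequence."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


def skip_bigram_matches(peer, model):
    """ROUGE-S style count: in-order word pairs, any gap, shared by both."""
    peer_pairs = Counter(combinations(peer, 2))
    model_pairs = Counter(combinations(model, 2))
    return sum((peer_pairs & model_pairs).values())


# The slide's own example strings.
model = "a b c d e f".split()
peer = "a b d f".split()
print(ngram_matches(peer, model, 2))     # 1  (only "a b" is a shared bigram)
print(lcs_length(peer, model))           # 4  ("a b d f" is a subsequence)
print(skip_bigram_matches(peer, model))  # 6  (every in-order pair of the peer)
```

On the slide's example, the bigram count penalizes the skipped words heavily, while the subsequence and skip-bigram counts do not.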

  6. Precision, Recall, and F-Measure • Precision = matches/num_words_peer (peer: the summary being evaluated) • Recall = matches/num_words_models (models: the ideal summaries) • F = 2/(1/P + 1/R), the harmonic mean of P and R
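As a worked example of these three formulas, the sketch below plugs in the unigram counts for the strings “a b d f” (peer) and “a b c d e f” (model). The function name and the zero-division guards are assumptions added for the illustration.

```python
def precision_recall_f(matches, num_words_peer, num_words_models):
    """Compute P, R and F exactly as defined on the slide."""
    precision = matches / num_words_peer if num_words_peer else 0.0
    recall = matches / num_words_models if num_words_models else 0.0
    if precision == 0.0 or recall == 0.0:
        # The harmonic mean is undefined here; report 0 by convention.
        return precision, recall, 0.0
    f = 2 / (1 / precision + 1 / recall)
    return precision, recall, f


# 4 matching words, 4 words in the peer, 6 words in the model.
p, r, f = precision_recall_f(matches=4, num_words_peer=4, num_words_models=6)
print(round(p, 3), round(r, 3), round(f, 3))  # 1.0 0.667 0.8
```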

  7. Problems with ROUGE-N: False Positives • Homographs, ex: Model: … robbed the bank … Peer: … sat on the river bank …

  8. Problems with ROUGE-N: False Negatives • Synonyms, ex: Model: … held up the financial institution … Peer: … robbed the bank …
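Both failure modes show up with a plain surface-form unigram match. The sketch below runs the two model/peer fragments from the slides through such a match; `unigram_matches` is a hypothetical helper, and whitespace tokenization is an assumption.

```python
from collections import Counter


def unigram_matches(peer, model):
    """Return the words shared by peer and model, compared as raw strings."""
    overlap = Counter(peer.split()) & Counter(model.split())
    return sorted(overlap.elements())


# False positive: "bank" matches although the two senses differ.
print(unigram_matches("sat on the river bank", "robbed the bank"))
# ['bank', 'the']

# False negative: same meaning, but only the function word "the" matches.
print(unigram_matches("robbed the bank", "held up the financial institution"))
# ['the']
```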

  9. Solution: WordNet • Lexical Database • Synsets: organize words by concepts • Method: • Tag words with POS • Tag words with meaning (SenseLearner) • Look up synset in WordNet
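The effect of matching on synsets instead of surface words can be sketched with a toy lookup table. Everything here is a stand-in: `TOY_SYNSETS` is hand-written rather than taken from WordNet, the sense numbers in the `word#pos#sense` keys are illustrative only, and the inputs are assumed to be already POS- and sense-tagged, which is the job the slide assigns to the tagger and SenseLearner.

```python
from collections import Counter

# Hand-written stand-in for WordNet: sense-tagged word -> concept id.
# Sense numbers and concept ids are illustrative, not real WordNet data.
TOY_SYNSETS = {
    "bank#n#1": "concept:riverside",
    "bank#n#2": "concept:financial_institution",
    "financial_institution#n#1": "concept:financial_institution",
    "rob#v#1": "concept:rob",
    "hold_up#v#1": "concept:rob",
}


def to_concepts(tagged_words):
    """Map each sense-tagged word to its concept id (or itself if unknown)."""
    return [TOY_SYNSETS.get(word, word) for word in tagged_words]


def concept_matches(peer_tagged, model_tagged):
    """Count matches between peer and model at the concept level."""
    overlap = Counter(to_concepts(peer_tagged)) & Counter(to_concepts(model_tagged))
    return sum(overlap.values())


model_robbery = ["rob#v#1", "bank#n#2"]
peer_river = ["sit#v#1", "river#n#1", "bank#n#1"]
peer_holdup = ["hold_up#v#1", "financial_institution#n#1"]

print(concept_matches(peer_river, model_robbery))   # 0: homograph no longer matches
print(concept_matches(peer_holdup, model_robbery))  # 2: synonyms now match
```

In a real pipeline the table lookup would be replaced by a WordNet query, and the quality of the result would depend on how accurate the sense tagging is.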

  10. Architecture of Solution • [Diagram] Components: Data, POS tagger, SenseLearner, WordNet, ROUGE, Results • Example WordNet query: querySense(“run#v#3”, “syns”) returns {go#v#7, pass#v#6, lead#v#6, extend#v#2}

  11. Evaluating the Evaluator • Correlation with human evaluation scores (ROUGE, Basic Elements) • Success at reducing errors (i.e. number of false negatives/positives avoided vs. original ROUGE)
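The first criterion can be sketched as a correlation between metric scores and human scores over the same set of summaries. The slide does not say which correlation coefficient is intended, so Pearson's r is an assumption here, and the score lists are placeholder values chosen only to exercise the function, not evaluation results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder scores for four summaries (not real data).
human_scores = [1, 2, 3, 4]
metric_scores = [0.2, 0.4, 0.6, 0.8]
print(round(pearson(human_scores, metric_scores), 3))
```

A metric that ranks summaries the way the human judges do scores close to 1; one that ranks them in the opposite order scores close to -1.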

  12. References • Lin, C.-Y. (2004). ROUGE: A package for automatic evaluation of summaries. Workshop on Text Summarization Branches Out. • Fellbaum, C. (Ed.). (1998). WordNet: An electronic lexical database. Cambridge, MA: MIT Press.

  13. Questions?