
Exploiting Timelines to Enhance Multi-document Summarization



Presentation Transcript


  1. Exploiting Timelines to Enhance Multi-document Summarization. Jun-Ping Ng, Yan Chen, Min-Yen Kan and Zhoujun Li. National University of Singapore; Beihang University.

  2. ACL 2014 - Timelines in Summarization Cyclone Sidr 2007, JTWC designation: 06B “A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, …” Image Courtesy: Univ. Wisconsin-Madison

  3. “… wiping out homes and trees in what officials described as the worst storm in years.” Image Courtesy: US Navy / Wikipedia

  4. “More than 100,000 coastal villagers have been evacuated before the cyclone made landfall.” Image Courtesy: US State Department / Wikipedia

  5. 1991 Bangladesh Cyclone “The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.” Image Courtesy: US Navy / Wikipedia

  6. [2] “More than 100,000 coastal villagers have been evacuated before the cyclone made landfall.” [1] “A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, wiping out homes and trees in what officials described as the worst storm in years.” [3] “The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.”

  8. Timelines from Text: [3] “The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.” [1] “A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, wiping out homes and trees in what officials described as the worst storm in years.” [2] “More than 100,000 coastal villagers have been evacuated before the cyclone made landfall.”

  9. Key time spans are summary-worthy: [3] “The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.” [1] “A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, wiping out homes and trees in what officials described as the worst storm in years.” [2] “More than 100,000 coastal villagers have been evacuated before the cyclone made landfall.”

  10. Timelines + Summarization [Diagram labels: Timelines (per input document); Timeline-derived features; Lexical and positional features; Summarization System; Summary]

  11. Outline
     • Goal and Motivation
     • Timeline Generation
     • Integrating Timelines
        • In Scoring: (Contextual) Importance, Density
        • In Re-ordering: TimeMMR
     • Experiments
     • Discussion

  12. Timeline Generation

  13. Step 1: Event-Event Temporal Classification (Ng et al., 2013; EMNLP)

  14. Step 2: Event-Timex Temporal Classification (Ng and Kan, 2012; COLING)

  15. Step 3: Timex Normalization (HeidelTime; Strötgen and Gertz, 2013). Example: “Today” → June 6, 2014

  16. Timeline Construction
     • Map normalized timexes to timeline
     • Place events which OVERLAP with timexes onto timeline
     • Place events which OVERLAP with other events onto the timeline
     • Insert rest of events based on BEFORE/AFTER ordering
     [Figure: example timeline; label “1999”]
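The four construction steps on slide 16 could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `build_timeline`, the slot representation, and the input format (normalized timex values as ISO-like strings, relations as tuples) are all hypothetical. Like the algorithm described in the talk, the sketch makes no attempt to reconcile contradictory relations.

```python
from typing import Dict, List, Tuple


def build_timeline(
    timexes: Dict[str, str],
    event_timex_overlaps: List[Tuple[str, str]],
    event_event_relations: List[Tuple[str, str, str]],
) -> List[dict]:
    """Return an ordered list of slots; each slot lists the events placed in it."""
    # Step 1: one slot per distinct normalized timex value, in chronological order.
    # Assumption: values are ISO-like strings ("1991", "2007-11-15") that sort correctly.
    slots = [{"time": t, "events": []} for t in sorted(set(timexes.values()))]
    slot_by_time = {slot["time"]: slot for slot in slots}
    slot_of: Dict[str, dict] = {}  # event -> slot it was placed in

    def place(event: str, slot: dict) -> None:
        slot["events"].append(event)
        slot_of[event] = slot

    def position(slot: dict) -> int:
        return next(i for i, s in enumerate(slots) if s is slot)

    # Step 2: events that OVERLAP with a timex share that timex's slot.
    for event, timex in event_timex_overlaps:
        if event not in slot_of:
            place(event, slot_by_time[timexes[timex]])

    # Step 3: events that OVERLAP with an already placed event join its slot.
    changed = True
    while changed:
        changed = False
        for a, b, rel in event_event_relations:
            if rel != "OVERLAP":
                continue
            if a in slot_of and b not in slot_of:
                place(b, slot_of[a])
                changed = True
            elif b in slot_of and a not in slot_of:
                place(a, slot_of[b])
                changed = True

    # Step 4: remaining events get a new slot just before/after their anchor event.
    changed = True
    while changed:
        changed = False
        for a, b, rel in event_event_relations:
            if rel not in ("BEFORE", "AFTER"):
                continue
            # Normalise so that `first` happens before `second`.
            first, second = (a, b) if rel == "BEFORE" else (b, a)
            new_slot = {"time": None, "events": []}
            if first in slot_of and second not in slot_of:
                slots.insert(position(slot_of[first]) + 1, new_slot)
                place(second, new_slot)
                changed = True
            elif second in slot_of and first not in slot_of:
                slots.insert(position(slot_of[second]), new_slot)
                place(first, new_slot)
                changed = True
    return slots
```

Events that are not connected to any placed event or timex are simply left off the timeline in this sketch.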

  17. Integrating Timelines into SWING
     • SWING (Ng et al., COLING 2012, TAC 2011): state-of-the-art open-source extractive summarizer, https://github.com/WING-NUS/SWING
     • Basic, k of n sentence summaries
     [Diagram labels: Temporal Processing; Summarization Pipeline; Time Span Importance; Contextual Time Span Importance; Sentence Temporal Coverage Density; Time MMR]

  18. 1. Time Span Importance (TSI)
     • Time spans which contain many events are more salient
     • Sentences which reference events in these time spans are thus better candidates for a summary
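One way to picture TSI is the small Python sketch below. The transcript does not preserve the exact formula, so everything here is an assumption made for illustration: span importance is taken to be the event count normalized by the busiest span, and a sentence is scored by the most important span it touches. The names `time_span_importance` and `sentence_tsi` are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def time_span_importance(event_span: Dict[str, Hashable]) -> Dict[Hashable, float]:
    """Map each time span to (number of events in it) / (events in the busiest span)."""
    counts = Counter(event_span.values())
    if not counts:
        return {}
    peak = max(counts.values())
    return {span: n / peak for span, n in counts.items()}


def sentence_tsi(
    sentence_events: Iterable[str],
    event_span: Dict[str, Hashable],
    importance: Dict[Hashable, float],
) -> float:
    """Score a sentence by the most important time span its events fall into."""
    spans = {event_span[e] for e in sentence_events if e in event_span}
    return max((importance[s] for s in spans), default=0.0)
```

Taking the maximum rather than the sum is a design choice of this sketch; it avoids rewarding long sentences merely for mentioning many events.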

  19. 2. Contextual Time Span Importance (CTSI)
     • Time spans near important time spans are important
     • Search left and right for local peaks [equation omitted in transcript]
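The CTSI formula did not survive in the transcript, so the following Python sketch only illustrates the idea of searching left and right for local peaks. The distance discount `1 / (1 + distance)` and the function names are assumptions of this sketch, not the paper's definition.

```python
from typing import List


def local_peaks(importance: List[float]) -> List[int]:
    """Indices whose importance is positive and not lower than either neighbour."""
    peaks = []
    for i, value in enumerate(importance):
        left = importance[i - 1] if i > 0 else float("-inf")
        right = importance[i + 1] if i + 1 < len(importance) else float("-inf")
        if value > 0 and value >= left and value >= right:
            peaks.append(i)
    return peaks


def contextual_importance(importance: List[float]) -> List[float]:
    """For each span, take the nearest peak on each side, discounted by distance."""
    peaks = local_peaks(importance)
    scores = []
    for i in range(len(importance)):
        nearest_left = max((p for p in peaks if p <= i), default=None)
        nearest_right = min((p for p in peaks if p >= i), default=None)
        candidates = [
            importance[p] / (1 + abs(i - p))  # assumed discount, not from the paper
            for p in (nearest_left, nearest_right)
            if p is not None
        ]
        scores.append(max(candidates, default=0.0))
    return scores
```

Under this sketch a span with no events of its own can still receive a non-zero score when it sits next to a peak, which is the behaviour the slide describes.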

  20. 3. Sentence Temporal Coverage Density (TCD)
     • Favour sentences which
        • contain more events
        • cover a wide variety of time spans
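As a rough illustration of TCD, the Python sketch below rewards a sentence both for the share of timeline events it mentions and for the share of distinct time spans those events cover. Averaging the two ratios is an assumption made here for simplicity; the transcript does not give the actual definition, and the name `temporal_coverage_density` is hypothetical.

```python
from typing import Dict, Hashable, Iterable


def temporal_coverage_density(
    sentence_events: Iterable[str],
    event_span: Dict[str, Hashable],
) -> float:
    """Average of (events mentioned / all events) and (spans covered / all spans)."""
    total_events = len(event_span)
    if total_events == 0:
        return 0.0
    total_spans = len(set(event_span.values()))
    # Only events that were actually placed on the timeline count.
    on_timeline = [e for e in sentence_events if e in event_span]
    covered_spans = {event_span[e] for e in on_timeline}
    event_ratio = len(on_timeline) / total_events
    span_ratio = len(covered_spans) / total_spans
    return (event_ratio + span_ratio) / 2
```

With this scoring, two sentences that mention the same number of events are separated by how many different time spans those events fall into.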

  21. Identifying Redundancies
     • SWING makes use of the Maximal Marginal Relevance (MMR) algorithm to identify redundancies in selected sentences
     • MMR is based largely on surface lexical similarities
     Idea: Let’s use time as a basis to penalize the selection of sentences from redundant time periods.

  22. TimeMMR
     • Beyond lexical similarities, identify sentences which contain substantial time span overlap.
     • Candidate sentences which share many time spans with selected sentences are penalized. Penalty: proportion of overlap.
     Example (lexically dissimilar but redundant): An official in Barisal, 120 kilometres south of Dhaka, spoke of severe destruction as the 500 kilometre-wide mass of cloud passed overhead. “Many trees have been uprooted and houses and schools blown away,” Mostofa Kamal, a district relief and rehabilitation officer, told AFP by telephone. “Mud huts have been damaged and the roofs of several houses blown off,” said the state’s relief minister, Mortaza Hossain.
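The TimeMMR idea can be sketched as a greedy MMR loop whose redundancy penalty mixes lexical similarity with the proportion of shared time spans. This is a hedged illustration, not SWING's code: the Jaccard word overlap, the weights `lam` and `time_weight`, the sentence dictionary layout, and all function names are assumptions introduced for the example.

```python
from typing import Dict, List, Set


def overlap_proportion(candidate_spans: Set[int], selected_spans: Set[int]) -> float:
    """Share of the candidate's time spans already covered by a selected sentence."""
    if not candidate_spans:
        return 0.0
    return len(candidate_spans & selected_spans) / len(candidate_spans)


def jaccard(a: str, b: str) -> float:
    """Simple stand-in for a lexical similarity measure."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def time_mmr(sentences: List[Dict], k: int, lam: float = 0.6, time_weight: float = 0.5) -> List[int]:
    """Greedily pick k sentence indices, penalizing lexical and temporal redundancy."""
    selected: List[int] = []
    remaining = list(range(len(sentences)))

    def score(i: int) -> float:
        candidate = sentences[i]
        penalty = 0.0
        for j in selected:
            chosen = sentences[j]
            lexical = jaccard(candidate["text"], chosen["text"])
            temporal = overlap_proportion(candidate["spans"], chosen["spans"])
            penalty = max(penalty, (1 - time_weight) * lexical + time_weight * temporal)
        return lam * candidate["relevance"] - (1 - lam) * penalty

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


sentences = [
    {"text": "many trees have been uprooted", "relevance": 0.9, "spans": {5}},
    {"text": "mud huts have been damaged", "relevance": 0.8, "spans": {5}},
    {"text": "villagers were evacuated before landfall", "relevance": 0.6, "spans": {4}},
]
lexical_only = time_mmr(sentences, k=2, time_weight=0.0)
with_time = time_mmr(sentences, k=2, time_weight=0.5)
```

In this toy input the first two sentences share few words but the same time span, so the purely lexical setting keeps both while the time-aware setting swaps the second for the sentence about the evacuation.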

  23. Experiments
     • Data
        • TAC 2010 dataset for training
        • TAC 2011 dataset for testing
     • Temporal Processing Systems
        • HeidelTime (Strötgen and Gertz, 2013)
        • E-T temporal classification (Ng and Kan, 2012)
        • E-E temporal classification (Ng et al., 2013)
     • Summarization baseline
        • SWING (Ng et al., 2012)

  24. Results [results table not preserved in transcript] * = p < 0.1, ** = p < 0.05, against R row. Doesn’t seem very effective!

  25. Analysis: Timelines contain errors
     • Errors from underlying temporal processing systems
     • Simplifying assumptions made in timeline construction
     • Lack of consistency checking and validation
     For effective use, we must identify good timelines
     • Identify timelines which potentially contain more errors
     • Exclude these when performing summarization

  26. Reliability Filtering
     • Short timelines can result when the system fails to extract or relate events and timexes
     • Features derived from short timelines are prone to have extreme values
     • Use the length of a timeline as a gauge of its accuracy
     • Don’t use timelines shorter than average (as computed over the whole collection)
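The length-based filter on slide 26 is simple enough to sketch in a few lines of Python. The sketch assumes a timeline is a list whose length is the measure of interest and that timelines are keyed by a document identifier; the name `filter_reliable_timelines` is hypothetical.

```python
from typing import Dict, List


def filter_reliable_timelines(timelines: Dict[str, List]) -> Dict[str, List]:
    """Keep only timelines at least as long as the collection-wide average."""
    if not timelines:
        return {}
    average = sum(len(t) for t in timelines.values()) / len(timelines)
    return {doc: t for doc, t in timelines.items() if len(t) >= average}


kept = filter_reliable_timelines({
    "doc1": ["e1", "e2", "e3", "e4"],
    "doc2": ["e1"],
    "doc3": ["e1", "e2", "e3"],
})
```

Here the average length is 8/3, so the one-event timeline of `doc2` is excluded and its timeline features would not be used.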

  27. With Reliability Filtering [results table not preserved in transcript] * = p < 0.1, ** = p < 0.05, against R row. TimeMMR doesn’t seem effective! Why?

  28. Does TimeMMR actually help? [Figure labels: “Possibly Redundant?”; “R-2: 0.2643, worse by R-2”; “R-2: 0.2772, better by R-2”] Could an (automated) evaluation metric cater for time?

  29. Conclusion
     • Use of automatic timeline generation
     • Integration of timelines into summarization
        • Sentence scoring via timeline features
        • Sentence re-ordering via TimeMMR
     • Length-based timeline filtering helps to ameliorate errors
     For details on temporal processing, see: Jun Ping’s work at COLING 2012, EMNLP 2013 and his doctoral thesis (2014).
     Questions? If not, ask for more detailed analysis!

  30. Additional Slides

  31. Related Work
     • For Sentence Reordering
        • Barzilay et al., 1999
     • Recency as an indicator of salience
        • Goldstein et al., 2000; Wan, 2007; Demartini et al., 2010
     • Liu et al., 2009 (“Temporal Graph”)
     • Wu, 2008 (“Largest Cluster”)
     • TREC Temporal Summarization Track
        • Not as relevant; about monitoring an event over time
     Slide callout: “Close to our TSI”

  32. [Figure labels: “With time features; better”; “Baseline; worse”]

  33. TSI: A crane accident. With TSI, the cause of the accident in this summary is included; the alternative R1 sentence is background information and does not occur at any key time span. [Figure labels: “With TSI; better”; “Without TSI; worse”]

  34. CTSI: Coral Reef Preservation. With CTSI, the “warn” and “disappear” events were promoted in importance due to their proximity to peak P. [Figure labels: “With CTSI; better”; “Without CTSI; worse”]

  35. Timeline Caveats
     • Some events span a long period of time (e.g., “1999”)
     • Events are ordered based on the start of the duration
     • Timeline captures relative order
     • Construction algorithm does not attempt to reconcile contradictions

  36. Timex Normalization. Source: Bethard, 2013

  37. References
     • Jun-Ping Ng. Interpreting Text with Time. Doctoral Thesis, National University of Singapore, 2014.
     • Jun-Ping Ng, Min-Yen Kan, Ziheng Lin, Wei Feng, Bin Chen, Jian Su, Chew-Lim Tan. Exploiting Discourse Analysis for Article-Wide Temporal Classification. EMNLP 2013.
     • Jun-Ping Ng, Praveen Bysani, Ziheng Lin, Min-Yen Kan, Chew-Lim Tan. Exploiting Category-Specific Information for Multi-Document Summarization. COLING 2012.
     • Jun-Ping Ng, Min-Yen Kan. Improved Temporal Relation Classification using Dependency Parses and Selective Crowdsourced Annotations. COLING 2012.
