
From Extracting to Abstracting

This research paper introduces a model and approach for generating quasi-abstractive summaries, which are composed not of whole sentences but of sentence fragments drawn from the source text. The paper reports experimental results and compares the generated summaries with human-written abstracts.


Presentation Transcript


  1. From Extracting to Abstracting Generating Quasi-abstractive Summaries Zhuli Xie Application & Software Research Center Motorola Labs Barbara Di Eugenio, Peter C. Nelson Department of Computer Science University of Illinois at Chicago

  2. Outline • Introduction • Quasi-abstractive summaries • Model & Approach • Experimental Results • Conclusion & Discussion

  3. Introduction • Types of text summaries • Extractive: composed of whole sentences or clauses from source text. Paradigm adopted by most automatic text summarization systems • Abstractive: obtained using various techniques like paraphrasing. Equivalent to human-written abstracts. Still well beyond state-of-the-art.

  4. Quasi-abstractive Summaries • Composed not of whole sentences from source text but of fragments that form new sentences [Jing 02] • We will show they are more similar to human-written abstracts, as measured with cosine similarity & ROUGE-1,2 metrics

  5. Quasi-abstractive: Rationale • Two sentences from a human-written abstract: • A1: We introduce the bilingual dual-coding theory as a model for bilingual mental representation. • A2: Based on this model, lexical selection neural networks are implemented for a connectionist transfer project in machine translation. • Extractive summary (by ADAMS): • E1: We have explored an information theoretical neural network that can acquire the verbal associations in the dual-coding theory. • E2: The bilingual dual-coding theory partially answers the above questions. • Candidate sentence set for A1: • S1: The bilingual dual-coding theory partially answers the above questions. • S2: There is a well-known debate in psycholinguistics concerning the bilingual mental representation. ... • Candidate sentence set for A2: • S3: We have explored an information theoretical neural network that can acquire the verbal associations in the dual-coding theory. • S4: It provides a learnable lexical selection sub-system for a connectionist transfer project in machine translation.

  6. Model & Approach • Learn model that can identify Candidate Sentence Set (CSS) • Label: generate patterns of correspondence • Train classifier: to identify the CSS’s • Generate summary for a new document • Generate CSS’s • Realize Summary

  7. CSS’s Discovery Diagram

  8. Learn the CSS Model (1) • Label: • decomposition of abstract sentences based on string overlaps with the source text • In our test data (CMP-LG corpus), 70.8% of abstract sentences are composed of fragments of length >= 2 that can be found in the text to be summarized
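The string-overlap decomposition on this slide can be sketched as a greedy longest-match search over word spans. The helper below is an illustrative reconstruction under that assumption, not the authors' exact algorithm:

```python
def find_fragments(abstract_sent, source_sents, min_len=2):
    """Greedily find word fragments (>= min_len words) of an abstract
    sentence that occur verbatim in some source sentence.
    Returns (fragment, source_sentence_index) pairs."""
    abs_w = abstract_sent.lower().split()
    results = []
    i = 0
    while i <= len(abs_w) - min_len:
        match = None
        # try the longest span starting at position i first
        for j in range(len(abs_w), i + min_len - 1, -1):
            span = tuple(abs_w[i:j])
            for idx, sent in enumerate(source_sents):
                w = tuple(sent.lower().split())
                if any(w[k:k + len(span)] == span
                       for k in range(len(w) - len(span) + 1)):
                    match = (span, idx)
                    break
            if match:
                break
        if match:
            results.append((" ".join(match[0]), match[1]))
            i += len(match[0])          # skip past the matched fragment
        else:
            i += 1
    return results
```

Applied to the slide-5 example, this would recover "the bilingual dual-coding theory" from abstract sentence A1 as a fragment of candidate sentence S1.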

  9. Learn the CSS Model (2) • Train classifier: Given docs where all CSS’s have been labelled, transform each doc into sentence pair set. Each instance is represented by feature vector and target feature is whether pair belongs to same CSS • Used Decision Trees, also tried Support Vector Machines [Joachims, 2002] and Naïve Bayes classifiers [Borgelt, 1999] • Sparse data problem: [Japkowicz 2000; Chawla et al., 2003]
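Concretely, each sentence pair would be turned into a small numeric feature vector before being fed to a classifier. The three features below are hypothetical stand-ins, since the slide does not enumerate the actual feature set:

```python
def pair_features(s1, s2):
    """Toy feature vector for a sentence pair (hypothetical features):
    Jaccard word overlap, length ratio, and shared bigram count."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    overlap = len(w1 & w2) / max(1, len(w1 | w2))          # Jaccard overlap
    len_ratio = min(len(w1), len(w2)) / max(1, max(len(w1), len(w2)))
    bigrams = lambda s: set(zip(s.lower().split(), s.lower().split()[1:]))
    shared_bi = len(bigrams(s1) & bigrams(s2))
    return [overlap, len_ratio, shared_bi]
```

Vectors like these, with a binary "same CSS" target, could then be handed to any off-the-shelf learner such as scikit-learn's DecisionTreeClassifier, matching the slide's choice of decision trees over SVMs and Naïve Bayes.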

  10. Summary Generation • Generate CSS's for unseen documents: • Use classifier to identify sentence pairs belonging to same CSS and merge them • CSS formation exhibits a natural order, since sentences and sentence pairs are labeled sequentially: i.e., the first CSS contains at least one fragment that appears earlier in the source text than any fragment in the second CSS • Summary Realization
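The pair-merging step amounts to computing connected components over the positively classified sentence pairs. A minimal union-find sketch (my reconstruction, not the paper's code), which also yields the natural ordering by earliest sentence index:

```python
def merge_pairs(num_sents, positive_pairs):
    """Merge sentence pairs judged to belong to the same CSS into
    clusters (connected components), returned in source order."""
    parent = list(range(num_sents))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    for a, b in positive_pairs:
        union(a, b)
    groups = {}
    for i in range(num_sents):
        groups.setdefault(find(i), []).append(i)
    # order CSS's by their earliest sentence, mirroring the natural
    # order noted on the slide; drop sentences with no paired partner
    return [groups[r] for r in sorted(groups) if len(groups[r]) > 1]
```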

  11. Summary Realization • Simple Quasi-abstractive (SQa) • New sentence generated by appending new word to previously generated sequence according to n-gram probabilities calculated from CSS • Each CSS is used only once
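A minimal version of the SQa idea, assuming bigrams (the slide does not fix n) and greedy most-frequent-successor selection rather than the paper's exact decoding:

```python
from collections import defaultdict

def build_bigrams(css_sentences):
    """Bigram counts over the sentences of one CSS."""
    counts = defaultdict(lambda: defaultdict(int))
    for sent in css_sentences:
        words = ["<s>"] + sent.lower().split() + ["</s>"]
        for a, b in zip(words, words[1:]):
            counts[a][b] += 1
    return counts

def generate(counts, max_words=25):
    """Append the most frequent successor of the previously generated
    word until the end marker (or a length cap) is reached."""
    word, out = "<s>", []
    while word != "</s>" and len(out) < max_words:
        if not counts[word]:
            break
        word = max(counts[word], key=counts[word].get)
        if word != "</s>":
            out.append(word)
    return " ".join(out)
```

Per the slide, each CSS would be consumed once: one generated sentence per CSS, in CSS order.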

  12. Summary Realization • Quasi-abstractive with Salient Topics (QaST) • Salient NPs model based on social networks [Wasserman & Faust, 94; Xie 2005] • Sort predicted salient NPs according to their lengths • Traverse list of salient NPs and of CSS-based n-gram probabilities in parallel to generate sentence: use highest ranked NP which has not been used yet, and first n-gram probability model that contains this NP

  13. Topic Prediction • Salient NPs • Abstract should contain salient topics of article • Topics are often expressed by NPs • We assume that NPs in an abstract represent most salient topics in article • NP Network & NP Centrality • Collocated NPs can be connected and hence network can be formed • Social network analysis techniques used to analyze network [Wasserman & Faust 94] and calculate centrality for nodes [Xie 05]
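The NP network described above can be sketched with plain degree centrality; note this is a simplification, since [Wasserman & Faust 94] and [Xie 05] cover richer social-network centrality measures that may be the ones actually used:

```python
from itertools import combinations
from collections import defaultdict

def np_centrality(sentences_nps):
    """Link NPs that co-occur in the same sentence; score each NP by
    its number of distinct neighbours (degree centrality)."""
    neighbours = defaultdict(set)
    for nps in sentences_nps:
        for a, b in combinations(set(nps), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {np: len(ns) for np, ns in neighbours.items()}
```

The highest-scoring NPs would then serve as the salient topics that anchor QaST's choice of n-gram model.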

  14. Experiments • Data: 178 documents from CMP-LG corpus, 3-fold cross validation • Four Models: • Lead: the first sentence of each of the first m paragraphs. • ADAMS: top m sentences ranked according to the sentence ranking function ADAMS learned. • SQa: uses n-gram probabilities over the first m discovered CSS's to generate new sentences. • QaST: anchors the choice of a specific set of n-gram probabilities in salient topics. Stops after m sentences have been generated.

  15. Evaluation Metrics • Cosine similarity: bag of words method • ROUGE-1,2: [Lin 2004] • A recall measure to compare machine-generated summary and its reference summaries • Still bag of words/n-gram method • But showed high correlation with human judges

  16. Experimental Results • SQa’s performance is even lower than Lead • ADAMS achieved +13.6%, +27.9%, and +37.8% improvement over Lead for the three metrics • QaST achieved +29.4%, +31.5%, and +64.3% improvement over Lead, and +13.9%, +2.8%, +19.3% over ADAMS • All differences between QaST and others are statistically significant (two sample t-test) except for ADAMS/ROUGE-1

  17. Generated Sentence Sample • QaST (generated): In collaborative expert-consultation dialogues, two participants ( executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan. • Original source sentences: • In collaborative expert-consultation dialogues, two participants (executing agent and consultant) work together to construct a plan for achieving the executing agent’s domain goal. • The executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan.

  18. Sample Summary QaST: In this paper, we present a plan-based architecture for response generation in collaborative consultation dialogues, with emphasis on cases in which the user has indicated preferences. to an existing tripartite model might require inferring a chain of actions for addition to the shared plan, can appropriately respond to user queries that are motivated by ill-formed or suboptimal solutions, and handles in a unified manner the negotiation of proposed domain actions, proposed problem-solving actions, and beliefs proposed by discourse actions as well as the relationship amongst them. In collaborative expert-consultation dialogues, two participants( executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan. In suggesting better alternatives, our system differs from van Beek’s in a number of ways. Abstract: This paper presents a plan-based architecture for response generation in collaborative consultation dialogues, with emphasis on cases in which the system (consultant) and user (executing agent) disagree. Our work contributes to an overall system for collaborative problem-solving by providing a plan-based framework that captures the Propose-Evaluate-Modify cycle of collaboration, and by allowing the system to initiate subdialogues to negotiate proposed additions to the shared plan and to provide support for its claims. In addition, our system handles in a unified manner the negotiation of proposed domain actions, proposed problem-solving actions, and beliefs proposed by discourse actions. Furthermore, it captures cooperative responses within the collaborative framework and accounts for why questions are sometimes never answered.

  19. Conclusion & Discussion • New type of machine generated summary: Quasi-abstractive summary • N-gram model anchored by salient NPs gives good results • Further investigation needed in several aspects • CSS’s Discovery with cost-sensitive classifiers [Domingos, 1999; Ting, 2002] • Grammaticality and length of generated summaries [Wan et al, 2007]
