Sentence Compression Based on ILP Decoding Method

Sentence Compression Based on ILP Decoding Method • Hongling Wang, Yonglei Zhang, Guodong Zhou • NLP Lab, Soochow University

Outline • Introduction • Related Work • Sentence Compression based on ILP • Experiments • Conclusion

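Introduction(1) • Definition of Sentence Compression • It aims to shorten a sentence x = l1, l2, …, ln into a subsequence y = c1, c2, …, cm, where each ci ∈ {l1, l2, …, ln}. • Example: • Original Sentence: 据法新社报道，有目击者称，以军23日空袭加沙地带中部，目前尚无伤亡报告。 (According to AFP, witnesses said the Israeli army launched an air strike on the central Gaza Strip on the 23rd; no casualties have been reported so far.) • Target Sentence: 目击者称以军空袭加沙地带中部 (Witnesses said the Israeli army launched an air strike on the central Gaza Strip.)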
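The definition above can be checked mechanically: y must be obtainable from x by deleting words while keeping the remaining words in order. The sketch below illustrates this on the slide's example; the word segmentation and the function name `is_compression` are assumptions made for illustration, not taken from the slides.

```python
def is_compression(original, target):
    """Return True if `target` can be obtained from `original` by deleting tokens only."""
    remaining = iter(original)
    # `tok in remaining` consumes the iterator up to the match, so word order is enforced.
    return all(tok in remaining for tok in target)

# Hypothetical segmentation of the example sentence (assumed, not from the slides).
original = ["据", "法新社", "报道", "，", "有", "目击者", "称", "，", "以军", "23日",
            "空袭", "加沙地带", "中部", "，", "目前", "尚", "无", "伤亡", "报告", "。"]
target = ["目击者", "称", "以军", "空袭", "加沙地带", "中部"]

print(is_compression(original, target))
```

Reordered words are rejected, since a compression in this setting only deletes words and never moves them.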

Introduction(2) • Sentence compression has been widely used in: • Summarization • Automatic title generation • Search engines • Topic detection • …

Related Work(1) • Mainstream solution: corpus-driven supervised learning • Generative model • Selects the optimal target sentence by estimating the joint probability P(x, y) of the original sentence x and the target sentence y. • Discriminative model

Related Work(2) • Generative model • Knight & Marcu (2002) first applied the noisy-channel model to sentence compression. • Shortcomings: • the source model is trained on uncompressed sentences, so its training data are inaccurate • the channel model requires aligned parse trees for both compressed and uncompressed sentences in the training set; alignment is difficult and the channel probability estimates are unreliable

Related Work(3) • Discriminative model • McDonald (2006) used the Margin Infused Relaxed Algorithm (MIRA) to learn the feature weights, then ranked the subtrees and selected the tree with the highest score as the optimal target sentence. • Cohn & Lapata (2007, 2008, 2009) formulated the compression problem as tree-to-tree rewriting using a synchronous grammar. Each grammar rule is assigned a weight that is learned discriminatively within a large-margin model. • Zhang et al. (2013) compressed sentences with a Structured SVM model, which treats the compression problem as a structured learning problem.

Our Method • The sentence compression problem is treated as a structured learning problem, following Zhang et al. (2013) • A subtree of the original sentence's parse tree is learned as its compressed sentence • The problem of finding the optimal subtree is formulated as an ILP decoding problem

The Framework of SC

Sentence Compression based on ILP • Linear objective function • x is the syntactic tree of the original sentence • y is the target subtree • the feature function covers the bi-gram and trimming features from x to y • w is the vector of feature weights
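For reference, a linear objective of this kind is commonly written as follows. The symbol Φ for the feature function and 𝒴(x) for the set of candidate subtrees of x are assumed notation, and the exact form used on the slide may differ.

```latex
y^{*} = \arg\max_{y \in \mathcal{Y}(x)} \; w \cdot \Phi(x, y)
```

Because the score is linear in the features, it can serve directly as the objective of an integer linear program once each feature is expressed through 0/1 variables.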

Linear constraints • ni: a variable for each non-terminal node, constrained relative to nj where ni is the parent node of nj • wi: a variable for each terminal node • wi = nj, where nj is the POS node of word wi • fi: a variable for the ith feature • if fi = 1, the ith feature appears; otherwise, the feature does not appear • According to the restrictions on each feature's value, the corresponding linear constraints are added • e.g. fi = 1 - wi
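The interplay of these variables can be illustrated with a toy decoder. The sketch below enumerates all 0/1 assignments instead of calling an ILP solver, which is only feasible for tiny trees. The tree, the weights, the rule that a child node may be kept only if its parent is kept, and the rule that the root is always kept are all assumptions made for illustration; only wi = nj and fi = 1 - wi come from the slide.

```python
from itertools import product

# Hypothetical toy tree: node -> parent (None marks the root).
parent = {"IP": None, "PP": "IP", "P": "PP", "NP": "IP", "NN": "NP"}
# Word -> its POS node, giving the slide's constraint w_i = n_j.
pos_of = {"据": "P", "目击者": "NN"}
# Made-up weights for keeping a word and for the "word dropped" feature.
keep_weight = {"据": -1.0, "目击者": 2.0}
drop_weight = {"据": 0.5, "目击者": -0.5}

def feasible(n):
    # Assumed constraint: a child may be kept only if its parent is kept.
    return all(p is None or n[c] <= n[p] for c, p in parent.items())

def score(n):
    total = 0.0
    for word, pos in pos_of.items():
        w = n[pos]   # w_i = n_j: a word is kept iff its POS node is kept
        f = 1 - w    # f_i = 1 - w_i: the feature fires when the word is dropped
        total += keep_weight[word] * w + drop_weight[word] * f
    return total

def decode():
    nodes = list(parent)
    best = None
    for bits in product((0, 1), repeat=len(nodes)):
        n = dict(zip(nodes, bits))
        # Assumption: the root is always kept.
        if n["IP"] == 1 and feasible(n):
            if best is None or score(n) > score(best):
                best = n
    return best

def kept_words(n):
    return [word for word, pos in pos_of.items() if n[pos] == 1]

best = decode()
print(kept_words(best), score(best))
```

An actual system would hand the same variables, constraints, and linear objective to an ILP solver rather than enumerate assignments.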

Features – Word/POS Features • the POS bigrams of the remaining words • PosBigram (目击者称) = NN&VV • whether the dropped word is a stop word • IsStop (据) = 1 • whether the dropped word is the headword of the original sentence • the number of remaining words
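A minimal sketch of how three of these features could be computed from a tagged sentence and a keep/drop decision. The function name, the feature-key strings, the stop-word list, and the tag P for 据 are assumptions; the tags NN and VV follow the PosBigram example on the slide.

```python
def word_pos_features(tokens, keep, stop_words):
    """tokens: list of (word, pos) pairs; keep: parallel list of 0/1 flags."""
    feats = {}
    kept = [(w, p) for (w, p), k in zip(tokens, keep) if k]
    # POS bigrams over adjacent remaining words.
    for (_, p1), (_, p2) in zip(kept, kept[1:]):
        key = f"PosBigram={p1}&{p2}"
        feats[key] = feats.get(key, 0) + 1
    # Count dropped words that are stop words.
    for (w, _), k in zip(tokens, keep):
        if not k and w in stop_words:
            feats["IsStop"] = feats.get("IsStop", 0) + 1
    feats["NumRemaining"] = len(kept)
    return feats

tokens = [("据", "P"), ("目击者", "NN"), ("称", "VV")]
features = word_pos_features(tokens, keep=[0, 1, 1], stop_words={"据"})
print(features)
```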

Features – Syntax features • the parent-child relationship of the cut edge • del-Edge (PP) = IP-PP • the number of cut edges • the dependency relation between the dropped word and the word it depends on • dep_type(有)=DEP • the relation chain from the dropped word's POS to the POS of the word it depends on • dep_link (，) = PU-VMOD-VV • whether the dependency tree's root is deleted • del_ROOT (无) = 1 • whether each dropped word is a leaf of the dependency tree • del_Leaf (法新社) = 1
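The dependency-based features can be sketched in the same style. The input encoding (a head index per token, with -1 for the root), the function name, and the relation labels in the example are assumptions for illustration; only the feature names del_ROOT, del_Leaf, and dep_type come from the slide.

```python
def dependency_features(heads, labels, keep):
    """heads[i]: index of token i's head, or -1 for the root.
    labels[i]: dependency relation between token i and its head.
    keep[i]: 1 if token i is retained, 0 if it is dropped."""
    has_child = {h for h in heads if h >= 0}
    feats = {"del_ROOT": 0, "del_Leaf": 0}
    for i, kept in enumerate(keep):
        if kept:
            continue
        if heads[i] == -1:
            feats["del_ROOT"] = 1          # the root of the dependency tree was dropped
        if i not in has_child:
            feats["del_Leaf"] += 1         # the dropped word has no dependents
        key = f"dep_type={labels[i]}"
        feats[key] = feats.get(key, 0) + 1
    return feats

# Hypothetical three-token sentence whose root is token 1.
dep_feats = dependency_features(heads=[1, -1, 1], labels=["SUB", "ROOT", "OBJ"], keep=[0, 1, 1])
print(dep_feats)
```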

Loss Function • Function 1 • the loss ratio of the bigrams of the remaining words in the original sentence • Function 2: word loss-based function • the number of words deleted by mistake plus the number of words retained by mistake, comparing the predicted sentence with the gold target sentence
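Function 2 can be sketched directly from its description. The representation of each sentence as a list of 0/1 keep flags over the original words, and the name `word_loss`, are assumptions made for illustration.

```python
def word_loss(predicted_keep, gold_keep):
    """Count words deleted by mistake plus words retained by mistake."""
    if len(predicted_keep) != len(gold_keep):
        raise ValueError("both flag lists must cover the same original sentence")
    pairs = list(zip(predicted_keep, gold_keep))
    wrongly_deleted = sum(1 for p, g in pairs if g and not p)
    wrongly_kept = sum(1 for p, g in pairs if p and not g)
    return wrongly_deleted + wrongly_kept

# One word wrongly deleted (position 1) and one wrongly kept (position 2).
loss = word_loss([1, 0, 1, 1], [1, 1, 0, 1])
print(loss)
```

With 0/1 flags this is the Hamming distance between the predicted and gold decisions, which decomposes over words and therefore fits a linear objective.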

Evaluation • manual evaluation • Importance • Grammaticality • automatic evaluation • compression ratio (CR) (0.7~10) • BLEU score

Experimental settings • Parallel corpus extracted from news documents • Stanford Parser • Alignment tool developed in-house • Structured SVM

Experimental results • Compared with McDonald's decoding method, the system based on the ILP decoding method achieves comparable performance using simpler and fewer features.

Conclusions • The problem of sentence compression is formulated as the problem of finding an optimal subtree using the ILP decoding method. • Compared with the work using McDonald's decoding method, the system achieves comparable performance under the same conditions while using simpler and fewer features.
