Semi-Supervised Boosting for Statistical Word Alignment

Presentation Transcript


  1. Semi-Supervised Boosting for Statistical Word Alignment. Wu Hua, 2006/10/18

  2. Outline • Introduction to semi-supervised learning • Introduction to boosting • Semi-supervised boosting for word alignment • Evaluation results • Conclusion

  3. Machine Learning Methods • Supervised Learning • Labeled data • Unsupervised learning • Unlabeled data • Semi-supervised learning • Combine both labeled data and unlabeled data

  4. Semi-Supervised Learning in NLP • Word sense disambiguation • (Yarowsky, 1995; Pham et al., 2005) • Classification • (Blum and Mitchell, 1998; Thorsten, 1999) • Clustering • (Basu et al., 2004) • Named entity classification • (Collins and Singer, 1999) • Parsing • (Sarkar, 2001)

  5. Boosting – Supervised Learning [Flowchart: Initialization → Call Learner → Calculate Error Rate against the Reference Set → Re-weight Training Data → repeat until End? is reached → Build Ensemble]
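
The flowchart is the standard AdaBoost loop. A minimal sketch of that loop, assuming a learner whose per-example errors lie in [0, 1]; the names `train_learner` and `error_rate` are illustrative, not from the slides:

```python
import math

def boost(data, references, rounds, train_learner, error_rate):
    """Generic AdaBoost-style loop matching the flowchart on this slide."""
    n = len(data)
    weights = [1.0 / n] * n                      # Initialization: uniform weights
    ensemble = []                                # (learner, learner_weight) pairs
    for _ in range(rounds):
        learner = train_learner(data, weights)   # Call Learner
        # Calculate Error Rate against the reference set
        errors = [error_rate(learner, x, r) for x, r in zip(data, references)]
        eps = sum(w * e for w, e in zip(weights, errors))
        if eps <= 0.0 or eps >= 0.5:             # End? stopping check
            break
        beta = eps / (1.0 - eps)
        # Re-weight Training Data: down-weight examples the learner got right
        weights = [w * beta ** (1.0 - e) for w, e in zip(weights, errors)]
        z = sum(weights)
        weights = [w / z for w in weights]
        ensemble.append((learner, math.log(1.0 / beta)))  # Build Ensemble
    return ensemble
```

Down-weighting the examples a round already handles well is what forces later rounds to focus on the hard cases.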

  6. Boosting in NLP • Tagging and PP attachment • (Abney et al., 1999) • Word sense disambiguation • (Escudero et al., 2000) • Parser construction • (Haruno et al., 1999; Henderson and Brill, 2000) • Sentence generation • (Walker et al., 2001)

  7. Semi-Supervised Boosting • Three main problems • Semi-supervised learner • Combine labeled data and unlabeled data • Reference set • Automatically construct a reference set for unlabeled data • Error rate calculation • How to calculate the error rate with both labeled data and unlabeled data

  8. Semi-Supervised Boosting Applied to Word Alignment [Flowchart: Labeled Data → Supervised Training; Unlabeled Data → Unsupervised Training; the two models are combined by Model Interpolation; Error Rate Calculation uses the Real Reference Set; Re-weight Training Data uses the Pseudo Reference Set; repeat until End? is reached → Build Ensemble]
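
One round of the semi-supervised variant could be wired together as below. All component behavior is passed in as functions so the sketch stays self-contained; each component is sketched under its own slide further down, and every name here is illustrative:

```python
def semi_supervised_round(labeled, unlabeled, w_lab, w_unlab, lam,
                          train_sup, train_unsup, interpolate,
                          error_on_labeled, reweight_pairs):
    """One boosting round as in the flowchart; all helpers are injected."""
    model = interpolate(train_sup(labeled, w_lab),        # Supervised Training
                        train_unsup(unlabeled, w_unlab),  # Unsupervised Training
                        lam)                              # Model Interpolation
    eps = error_on_labeled(model, labeled, w_lab)         # uses the Real Reference Set
    beta = eps / (1.0 - eps)
    w_lab = reweight_pairs(model, labeled, w_lab, beta)       # against real references
    w_unlab = reweight_pairs(model, unlabeled, w_unlab, beta) # against the Pseudo Reference Set
    return model, beta, w_lab, w_unlab
```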

  9. Semi-Supervised Boosting Applied to Word Alignment • Five main components • Word alignment model interpolation • Pseudo reference set construction for unlabeled data • Error rate calculation • Weight update • Final Ensemble

  10. Word Alignment Model • Supervised alignment model • Calculate the probabilities for IBM Model 4 based on the labeled data • Unsupervised alignment model • Use GIZA++ to train IBM Model 4 • Perform model interpolation
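
A minimal sketch of the interpolation step, assuming each model exposes its parameters (e.g. the t(f|e) translation table) as a dict and that a single weight `lam` is tuned on held-out data; IBM Model 4 has several sub-models, and the same interpolation would be applied to each:

```python
def interpolate(p_sup, p_unsup, lam):
    """Linear interpolation of two probability tables:
    p(x) = lam * p_sup(x) + (1 - lam) * p_unsup(x)."""
    keys = set(p_sup) | set(p_unsup)
    return {k: lam * p_sup.get(k, 0.0) + (1.0 - lam) * p_unsup.get(k, 0.0)
            for k in keys}
```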

  11. Pseudo Reference Set Construction • Obtain bi-directional word alignment sets $S_1$ and $S_2$ on the training data • Obtain the intersection set $S_1 \cap S_2$ of these two alignment sets • Filter the union set $S_1 \cup S_2$ of the two alignment sets • Build the pseudo reference set $R_p = (S_1 \cap S_2) \cup \mathrm{Filter}(S_1 \cup S_2)$
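
A sketch of the construction, with alignment links represented as (source position, target position) pairs; the slide does not spell out the filtering criterion, so `keep_link` stands in for whatever heuristic is used:

```python
def build_pseudo_reference(s1, s2, keep_link):
    """Pseudo reference = intersection links plus union links passing the filter."""
    intersection = s1 & s2                               # links both directions agree on
    filtered = {link for link in (s1 | s2) if keep_link(link)}
    return intersection | filtered
```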

  12. Error Rate Calculation • For a sentence pair, compare the aligner's output with the reference • Calculate the error rate of an aligner as $\epsilon_l = \sum_i \tilde{w}_l(i)\, \mathrm{ER}_l(i)$ • Based on the labeled data instead of the whole data, where $\tilde{w}_l(i)$ is the normalized weight of the i-th sentence pair at the l-th round and $\mathrm{ER}_l(i)$ is the alignment error of the l-th aligner on that pair
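
In code this is a weighted average of per-pair errors over the labeled pairs only (a sketch; `per_pair_errors` would hold $\mathrm{ER}_l(i)$, e.g. the AER on each labeled pair):

```python
def weighted_error_rate(weights, per_pair_errors):
    """eps_l = sum_i w_l(i) * ER_l(i), computed over labeled pairs only."""
    return sum(w * e for w, e in zip(weights, per_pair_errors))
```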

  13. Re-Weight the Training Data • Reweight each sentence pair in the training set • For each sentence pair, there may be both correct and incorrect links as compared with the pseudo reference set • Calculate the weight of each sentence pair from its per-pair error rate $e_i = K/n$, where $K$ is the number of error links and $n$ is the total number of links in the reference
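
The exact update formula did not survive in the transcript; a standard AdaBoost.M1-style update using the per-pair error $e_i = K/n$ would look like the sketch below (an assumption, not necessarily the authors' exact form):

```python
def reweight(weights, per_pair_errors, beta):
    """w_{l+1}(i) ~ w_l(i) * beta^(1 - e_i): well-aligned pairs (small e_i)
    are down-weighted more, so later rounds focus on badly aligned pairs."""
    updated = [w * beta ** (1.0 - e) for w, e in zip(weights, per_pair_errors)]
    z = sum(updated)                      # renormalize to a distribution
    return [w / z for w in updated]
```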

  14. Final Ensemble • Obtain the final ensemble from the word aligners trained on each round: $W(s,t) = \sum_{l=1}^{L} \alpha_l\, w_l(s,t)$, where $W(s,t)$ is the final ensemble score for word alignment, $w_l(s,t)$ is the weight of the alignment link $(s,t)$ produced by the $l$-th word aligner, and $\alpha_l$ is the weight of the $l$-th word aligner
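
A sketch of weighted voting over links; in classic boosting the aligner weight would be $\alpha_l = \log\frac{1-\epsilon_l}{\epsilon_l}$, and the final link set is read off by thresholding the combined score (the threshold and the exact combination rule are assumptions here):

```python
def final_ensemble(aligners, src, tgt, threshold=0.5):
    """aligners: list of (align_fn, alpha); align_fn(src, tgt) -> {link: weight}.
    Keeps links whose normalized weighted vote clears the threshold."""
    scores = {}
    total = sum(alpha for _, alpha in aligners)
    for align_fn, alpha in aligners:
        for link, w in align_fn(src, tgt).items():
            scores[link] = scores.get(link, 0.0) + alpha * w
    return {link for link, s in scores.items() if s / total >= threshold}
```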

  15. Evaluation • Training set • Unlabeled data: 320,000 English-Chinese sentence pairs • Labeled data: 30,000 English-Chinese sentence pairs • Held-out set • 1,500 sentence pairs • Testing set • 1,000 bilingual English-Chinese sentence pairs • 8,651 alignment links in total

  16. Evaluation Metric • Word alignment • Precision and Recall • Alignment Error Rate (AER) • Phrase-based machine translation • System: Pharaoh • Metrics: NIST and BLEU
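
For reference, precision, recall, and AER in the usual Och and Ney formulation with sure links S and possible links P; the slides do not say whether the 8,651 reference links distinguish sure from possible, so this is the general form:

```python
def alignment_scores(predicted, sure, possible):
    """precision = |A∩P|/|A|, recall = |A∩S|/|S|,
    AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|); sure is a subset of possible."""
    a_s = len(predicted & sure)
    a_p = len(predicted & possible)
    precision = a_p / len(predicted)
    recall = a_s / len(sure)
    aer = 1.0 - (a_s + a_p) / (len(predicted) + len(sure))
    return precision, recall, aer
```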

  17. Word Alignment Results

  18. Weights in Ensembles • Two kinds of weights • Weights for the individual aligners • Weights for the individual alignment links • Baseline: only uses the first kind of weights • Our method: uses both kinds of weights

  Method      Precision  Recall  AER
  Baseline    0.7946     0.7775  0.2140
  Our method  0.8175     0.7858  0.1987

  19. Translation Results

  20. Conclusion • Features of our semi-supervised boosting method • Performs model interpolation • Automatically builds a pseudo reference set • Calculates the error rate of the training set on the labeled data • Uses two kinds of weights in the ensemble • One for the aligners • The other for the alignment links • Boosting does improve word alignment and translation quality • Semi-supervised boosting performs best

  21. Thanks!
