
# Symmetric Probabilistic Alignment

Symmetric Probabilistic Alignment. Jae Dong Kim. Committee: Jaime G. Carbonell, Ralf D. Brown, Peter J. Jansen. Motivation: In the CMU EBMT system, alignment has been studied less than the other components.


Presentation Transcript

Jae Dong Kim

Committee:

Jaime G. Carbonell

Ralf D. Brown

Peter J. Jansen

• In the CMU EBMT system, alignment has been studied less than the other components.

• We want to investigate a new sub-sentential aligner which uses translation probabilities in a symmetric fashion.

• Introduction

• Symmetric Probabilistic Alignment

• Experiments and Results

• Conclusions

• Future Work

• The CMU EBMT system refers to translation examples to translate an unseen source sentence

• Since it is hard to find an exactly matching example sentence, the system finds the longest match

• Encapsulated local context

• Local reordering

• The aligner should work on fragments (sub-sentences)

• Studied relatively less than the other components

• The old aligner

• Heuristic based

• Builds a correspondence table

• Finds the longest target fragment and the shortest target fragment

• Checks every substring of the longest one that contains the shortest one

• Fast but doesn’t use probabilities

• IBM models (Brown et al., 1993)

• HMM (Vogel et al., 1996)

• Explicit Syntactic Information (Yamada and Knight, 2002)

• ISA (Zhang et al., 2003)

• The SPA differs from the above in that it aligns sub-sentences using translation probabilities and some heuristics when the boundaries of the source fragment are given.

• Introduction

• Symmetric Probabilistic Alignment

• Experiments and Results

• Conclusions

• Future Work

• Assumptions:

• A bilingual probabilistic dictionary is available

• Contiguous source fragments are translated into contiguous target fragments

• Fragments are translated independently of surrounding context

• Given and

• Assume that we are considering a candidate target fragment 't2 t3 t4' given a source fragment 's7 s8 s9'

• Source -> Target Translation Score

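$$
\begin{aligned}
S_{tmp} = {} & \max\big(p(t_2 \mid s_7),\, p(t_3 \mid s_7),\, p(t_4 \mid s_7),\, \varepsilon\big) \\
& \times \max\big(p(t_2 \mid s_8),\, p(t_3 \mid s_8),\, p(t_4 \mid s_8),\, \varepsilon\big) \\
& \times \max\big(p(t_2 \mid s_9),\, p(t_3 \mid s_9),\, p(t_4 \mid s_9),\, \varepsilon\big)
\end{aligned}
$$

$$
S_{st} = S_{tmp}^{1/3}
$$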

• Source <- Target Translation Score

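$$
\begin{aligned}
S_{tmp} = {} & \max\big(p(s_7 \mid t_2),\, p(s_8 \mid t_2),\, p(s_9 \mid t_2),\, \varepsilon\big) \\
& \times \max\big(p(s_7 \mid t_3),\, p(s_8 \mid t_3),\, p(s_9 \mid t_3),\, \varepsilon\big) \\
& \times \max\big(p(s_7 \mid t_4),\, p(s_8 \mid t_4),\, p(s_9 \mid t_4),\, \varepsilon\big)
\end{aligned}
$$

$$
S_{ts} = S_{tmp}^{1/3}
$$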

• Source <->Target Translation Score

$$
\text{Score} = S_{st} \times S_{ts}
$$
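The two directional scores and their product can be sketched in a few lines of Python. This is only an illustration of the formulas above, not the SPA implementation: the dictionary layout (keys of the form `(predicted_word, given_word)`), the function names, and the value chosen for ε are all assumptions, and the exponent is generalized from 1/3 to one over the number of factors in the product.

```python
from math import prod

EPSILON = 1e-6  # assumed floor value; the slides only name it as ε


def directional_score(given_words, predicted_words, prob, eps=EPSILON):
    """Geometric mean, over given_words, of the best p(predicted | given).

    prob maps (predicted_word, given_word) -> probability (assumed layout).
    Each factor is floored at eps, as in the max(..., ε) terms above.
    """
    best = [
        max([prob.get((p, g), 0.0) for p in predicted_words] + [eps])
        for g in given_words
    ]
    return prod(best) ** (1.0 / len(given_words))


def symmetric_score(src, tgt, p_t_given_s, p_s_given_t, eps=EPSILON):
    """Score = S_st * S_ts for a source fragment and a candidate target fragment."""
    s_st = directional_score(src, tgt, p_t_given_s, eps)  # source -> target
    s_ts = directional_score(tgt, src, p_s_given_t, eps)  # source <- target
    return s_st * s_ts
```

With a toy dictionary in which each of `s7 s8 s9` translates to one of `t2 t3 t4` with probability 0.5 in both directions, both directional scores are 0.5 and the combined score is 0.25.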

• Untranslated word penalty

s7 s8 s9

t2 t3 t4

• Anchor Context

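s6 s7 s8 s9 s10

t1 t2 t3 t4 t5

(Figure: the source fragment s7 s8 s9 shown with its neighboring words s6 and s10, and the target fragment t2 t3 t4 with its neighbors t1 and t5.)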

• Length penalty

• “t2 ... t30” for “s7 s8 s9”. Realistic?

• We expect the target fragment length to be proportional to the source fragment length.

• Distance penalty

• “t45 t46 t47” for “s7 s8 s9”. Realistic? Maybe.

• Between languages with similar word order, we might expect the target fragment to sit at a proportional position.
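The slides do not give formulas for the length and distance penalties, so the following Python sketch only illustrates the stated intuition. Both functional forms, the function names, and the `len_ratio` parameter are assumptions made for this example: each function returns 1.0 when the candidate matches the proportional expectation and a smaller value as it drifts away.

```python
def length_penalty(src_len, tgt_len, len_ratio=1.0):
    """Assumed form: ratio of the shorter to the longer of expected and actual length.

    len_ratio is a hypothetical target/source length ratio for the language pair.
    Lengths are assumed to be positive.
    """
    expected = len_ratio * src_len
    return min(expected, tgt_len) / max(expected, tgt_len)


def distance_penalty(src_start, src_sent_len, tgt_start, tgt_sent_len):
    """Assumed form: 1 minus the gap between the relative start positions."""
    expected_rel = src_start / src_sent_len
    actual_rel = tgt_start / tgt_sent_len
    return 1.0 - abs(expected_rel - actual_rel)
```

Under these assumed forms, a 29-word target fragment for a 3-word source fragment gets a length factor of 3/29, while a target fragment starting at the same relative position as the source fragment gets a distance factor of 1.0.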

• Set a threshold for the SPA

• The SPA returns only results whose score is higher than the threshold

• For each source fragment

• If there is a result from the SPA -> use the SPA result

• Otherwise, use the IBM result
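The combination rule above amounts to a per-fragment fallback. A minimal Python sketch, assuming hypothetical callables `spa_align` (returning a `(target_span, score)` pair or `None`) and `ibm_align` (returning a target span); neither interface comes from the slides:

```python
def combine(source_fragments, spa_align, ibm_align, threshold):
    """Use the SPA result when its score clears the threshold, else the IBM result.

    source_fragments: iterable of hashable fragments (e.g. tuples of words).
    spa_align(fragment) -> (target_span, score) or None   (assumed interface)
    ibm_align(fragment) -> target_span                     (assumed interface)
    """
    combined = {}
    for fragment in source_fragments:
        result = spa_align(fragment)
        if result is not None and result[1] > threshold:
            combined[fragment] = result[0]
        else:
            combined[fragment] = ibm_align(fragment)
    return combined
```

Filtering inside `combine` rather than inside the SPA keeps the threshold a single tunable value in this sketch.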

• Introduction

• Symmetric Probabilistic Alignment

• Experiments and Results

• Conclusions

• Future Work

• Evaluation Metrics

• F1 (Precision, Recall) - based on positions
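The slides say only that F1 is "based on positions". One plausible reading, assumed here, is that precision and recall are computed over the sets of target word positions covered by the predicted and the reference alignment. The function name and the zero-handling are choices made for this Python sketch:

```python
def position_f1(predicted_positions, reference_positions):
    """Precision, recall and F1 over sets of target word positions (assumed reading)."""
    predicted = set(predicted_positions)
    reference = set(reference_positions)
    overlap = len(predicted & reference)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, predicting positions 2 to 4 against a reference covering positions 3 to 6 gives precision 2/3, recall 1/2, and F1 of 4/7.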

• Data

• English-Chinese

• Xinhua news wire

• Training data: 1m sentence pairs

• Trained GIZA++ with default parameters

• For the SPA, used the dictionary by GIZA++

• Test data:

• 366 sentence pairs - 3 copies by 3 people

• 20 more sentence pairs - 1 copy by another

• 27,286 source fragments, each 3 to 8 words long

• Data

• French-English

• Training data: 1m sentence pairs

• Trained GIZA++ with default parameters

• For the SPA, used the dictionary by GIZA++

• Test data

• 91 sentence pairs

• 12,466 source fragments, each 3 to 8 words long

• Alignments to be compared

• Random: random alignment to a reasonably long target fragment

• Positional: alignment to a proportionally positioned target fragment

• Oracle: the best possible contiguous human alignment

• SPA-uni: unidirectional basic alignment

• SPA-basic: bidirectional basic alignment

• SPA: the best SPA alignment with restrictions

• IBM4: non-contiguous alignment by IBM Model 4

• COMB: the combination of SPA and IBM4 alignments

• SPA-top10: the best of top 10 alignment results of SPA

• SPA-basic outperformed SPA-uni

• SPA was the best when we applied untranslated word penalty and length penalty

• Our significance test showed that the difference between IBM4 and COMB is significant

• SPA-basic outperformed SPA-uni

• SPA was the best when we applied all the restrictions

• Our significance test showed that the difference between IBM4 and COMB is not significant

Rough idea about how much humans agree on alignment

• Data

• 20k training sentence pairs

• Test

• Development set: 100 sentence pairs

• 2-reference set: 2 references for 100 source sentences

• Evaluation set: 10 X 100 sentence pairs

• Evaluation Metric

• BLEU

• SPA, IBM4 and COMB perform significantly better than EBMT (the old aligner)

• For 'Test', SPA outperformed EBMT by 28.5%

• Among SPA, IBM4 and COMB, none is significantly better than the others

• Introduction

• Symmetric Probabilistic Alignment

• Experiments and Results

• Conclusions

• Future Work

• Improvement on EBMT performance

• Combined aligner worked the best on English-Chinese set

• Bidirectional alignment worked better than unidirectional alignment

• Incorporating human dictionaries to cover more general domains

• Non-contiguous alignment

• Co-training of the SPA and a dictionary

• Experiments on different data sets and different language pairs

• Experiments with different metrics

• Speed up

• Ying Zhang, Stephan Vogel, and Alex Waibel. 2003. Integrated Phrase Segmentation and Alignment Model for Statistical Machine Translation. Submitted to Proc. of the International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE), Beijing, China.

• Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2):263-311.

• Stephan Vogel, Hermann Ney, and Christoph Tillmann. 1996. HMM-based word alignment in statistical translation. In COLING '96: The 16th Int. Conf. on Computational Linguistics, pages 836-841, Copenhagen, August.

• I. Dan Melamed. 1997. A Word-to-Word Model of Translational Equivalence. In Proc. of ACL '97, pages 490-497, Madrid, Spain.

• K. Yamada and K. Knight. 2002. A decoder for syntax-based statistical MT. In ACL '02.

Questions?

• Alignment Accuracy Calculation

• Non-contiguous Alignment