Statistical Machine Translation Part III – Phrase-based SMT / Decoding. Alex Fraser Institute for Natural Language Processing University of Stuttgart 2008.07.23 EMA Summer School. Outline. Phrase-based translation Log-linear model Tuning log-linear model Decoding.
Statistical Machine Translation Part III – Phrase-based SMT / Decoding Alex Fraser Institute for Natural Language Processing University of Stuttgart 2008.07.23 EMA Summer School
Outline • Phrase-based translation • Log-linear model • Tuning log-linear model • Decoding
Language Model • Usually a trigram language model is used for p(e) • p(the man went home) = p(the | START) p(man | START the) p(went | the man) p(home | man went) • Language models work well for comparing the grammaticality of strings of the same length • However, when comparing short strings with long strings they favor short strings, because each additional word multiplies in another probability below 1 • For this reason, a very important component of the language model is the length bonus • This is a constant > 1 that is multiplied in once for each English word in the hypothesis
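The trigram factorization and the length bonus can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `lm_log_score`, the table `TOY_PROBS`, and every probability value in it are assumptions made up for the example, not figures from the lecture, and no smoothing is applied (an unseen trigram raises `KeyError`).

```python
import math

# Hypothetical toy trigram table: (context, word) -> probability.
# All values are invented for illustration only.
TOY_PROBS = {
    (("START",), "the"): 0.5,
    (("START", "the"), "man"): 0.5,
    (("the", "man"), "went"): 0.5,
    (("man", "went"), "home"): 0.5,
}


def lm_log_score(words, probs, length_bonus=1.0):
    """Return log( p(e) * length_bonus ** len(words) ) under a trigram model.

    The context of each word is the (up to) two preceding tokens, with a
    single START token prepended, matching the factorization on the slide.
    """
    padded = ["START"] + list(words)
    total = 0.0
    for i in range(1, len(padded)):
        context = tuple(padded[max(0, i - 2):i])
        total += math.log(probs[(context, padded[i])])
    # The length bonus is multiplied in once per word, so it is added
    # once per word in log space.
    return total + len(words) * math.log(length_bonus)


long_hyp = ["the", "man", "went", "home"]
short_hyp = ["the", "man"]

# Without a bonus the shorter string scores higher.
no_bonus = (lm_log_score(short_hyp, TOY_PROBS), lm_log_score(long_hyp, TOY_PROBS))

# With a bonus > 1 (here 3.0, an arbitrary choice) the longer string can win.
with_bonus = (
    lm_log_score(short_hyp, TOY_PROBS, length_bonus=3.0),
    lm_log_score(long_hyp, TOY_PROBS, length_bonus=3.0),
)
```

With these made-up numbers the short hypothesis wins when no bonus is applied, and the long one wins once each word earns a bonus of 3.0. In a real system the bonus would be a tuned weight rather than a hand-picked constant.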
Modified from Koehn 2008
Outline • Phrase-based translation model • Log-linear model • Tuning log-linear model automatically • Decoding
Outline • Phrase-based translation model • Log-linear model • Tuning log-linear model automatically • Decoding • Basic phrase-based decoding • Dealing with complexity • Recombination • Pruning • Future cost estimation • Decoding output