What’s Wrong with Syntax-Based Machine Translation?

Steven DeNeefe, Kevin Knight, Daniel Marcu, Wei Wang

Presentation Transcript


  1. What’s Wrong with Syntax-Based Machine Translation? Steven DeNeefe, Kevin Knight, Daniel Marcu, Wei Wang

  2. Syntactic Approaches to MT • Use of syntactic information (noun, verb, etc.) in the translation process: • Manually constructed rule-based systems • Statistical systems (Wu & Wong, 1998; Yamada & Knight, 2001-2002; Galley et al, 2004) • Contrast with phrase-based statistical approaches

  3. Phrase-Based Output Gunman of police killed . 被 枪手 警方 击毙 . Decoder Hypothesis #1

  4. Phrase-Based Output Gunman of police attack . 被 枪手 警方 击毙 . Decoder Hypothesis #7

  5. Phrase-Based Output Gunman by police killed . 被 枪手 警方 击毙 . Decoder Hypothesis #12

  6. Phrase-Based Output Killed gunman by police . 被 枪手 警方 击毙 . Decoder Hypothesis #134

  7. Phrase-Based Output Gunman killed the police . 被 枪手 警方 击毙 . Decoder Hypothesis #9,329

  8. Phrase-Based Output Gunman killed by police . 被 枪手 警方 击毙 . Decoder Hypothesis #50,654 Problematic: • Output lacks English auxiliary and determiner • Re-ordering relies on too much luck, instead of on the Chinese passive marker

  9. Syntax-Based Output The/DT gunman/NN killed/VBD by/IN police/NN . [Tree nodes: NPB, PP, NP-C, VP, S] 被 枪手 警方 击毙 . Decoder Hypothesis #1

  10. Syntax-Based Output Gunman/NN by/IN police/NN shot/VBD . [Tree nodes: NPB, PP, NP-C, VP, S] 被 枪手 警方 击毙 . Decoder Hypothesis #16

  11. Syntax-Based Output The/DT gunman/NN was/AUX killed/VBN by/IN police/NN . [Tree nodes: NPB, PP, NP-C, VP, S] 被 枪手 警方 击毙 . Decoder Hypothesis #1923

  12. Why Might Syntax Help? • Phrase-based MT output is only “n-grammatical”, not grammatical: every sentence needs a subject and a verb • Re-ordering is poorly explained as “distortion” -- better explained as syntactic transformation (e.g. Arabic to English, VSO -> SVO) • Function words have syntactic effects even if they are not themselves translated (e.g. Chinese passive marker, relative clause marker) • But see the 2005 NIST MT evaluation: “None of the top seven systems uses any explicit linguistic information at all!” (Franz Och)

  13. Why Might Syntax Hurt? • Less freedom to glue pieces of output together • Search space has fewer output strings • Search space is more difficult to navigate • Rule extraction from bilingual text has limits [Figure labels: available phrase-based translations; this talk]

  14. Why Might Syntax Hurt? • Less freedom to glue pieces of output together • Search space has fewer output strings • Search space is more difficult to navigate • Rule extraction from bilingual text has limits [Figure labels: available phrase-based translations; available syntax-based translations; this talk]

  16. This Talk • Quantitatively compare • A typical phrase-based bilingual extraction algorithm (ATS, Och & Ney 2004) • A typical syntax-based bilingual extraction algorithm (GHKM, Galley et al 2004) • These algorithms picked from two good-scoring NIST-06 systems • Identify areas of improvement for syntax-based rule coverage • Make improvements, test effect on end-to-end MT

  17. Phrase-Based and Syntax-Based Pattern Extraction • ATS [Och & Ney, 2004]: (estring, alignment, cstring) … -> phrase pairs consistent with word alignment • GHKM [Galley et al 2004]: (etree, alignment, cstring) … -> syntax transformation rules consistent with word alignment

  18. ATS (Och & Ney, 2004) PHRASE PAIRS ACQUIRED: felt -> 有; felt obliged -> 有 责任; felt obliged to do -> 有 责任 尽; obliged -> 责任; obliged to do -> 责任 尽; do -> 尽; part -> 一份; part -> 一份 力; to do -> 尽 (usually skipped). Sentence pair: i felt obliged to do my part / 我 有 责任 尽 一份 力
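The extraction on this slide can be sketched in a few lines of Python. This is a minimal illustration of alignment-consistent phrase extraction, not the ATS implementation itself. The word links and the length limit `max_len=4` are assumptions (the slide shows neither), and `extract_phrase_pairs` is a hypothetical helper name.

```python
def extract_phrase_pairs(e_words, f_words, alignment, max_len=4):
    """Return all (english, foreign) phrase pairs consistent with the alignment.

    alignment is a set of (english_index, foreign_index) links.
    """
    f_aligned = {j for _, j in alignment}
    pairs = set()
    for e_start in range(len(e_words)):
        for e_end in range(e_start, min(e_start + max_len, len(e_words))):
            linked = [j for i, j in alignment if e_start <= i <= e_end]
            if not linked:
                continue  # a phrase pair needs at least one alignment link
            f_start, f_end = min(linked), max(linked)
            # Consistency: no foreign word inside the span may link outside
            # the English span.
            if any(f_start <= j <= f_end and not (e_start <= i <= e_end)
                   for i, j in alignment):
                continue
            # Also grow the foreign span over adjacent unaligned foreign words.
            fs = f_start
            while fs >= 0 and (fs == f_start or fs not in f_aligned):
                fe = f_end
                while fe < len(f_words) and (fe == f_end or fe not in f_aligned):
                    if fe - fs < max_len:
                        pairs.add((" ".join(e_words[e_start:e_end + 1]),
                                   " ".join(f_words[fs:fe + 1])))
                    fe += 1
                fs -= 1
    return pairs


english = "i felt obliged to do my part".split()
chinese = "我 有 责任 尽 一份 力".split()
# Assumed links (not given on the slide): i-我, felt-有, obliged-责任, do-尽, part-一份.
links = {(0, 0), (1, 1), (2, 2), (4, 3), (6, 4)}
for pair in sorted(extract_phrase_pairs(english, chinese, links)):
    print(" -> ".join(pair))
```

Under these assumed links, "to" and "my" are unaligned, so they never form a phrase on their own but can be absorbed into a neighbouring phrase, which is how a pair such as "to do -> 尽" arises.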

  19. GHKM (Galley et al, 2004) RULES ACQUIRED: VBD(felt) -> 有; VBN(obliged) -> 责任; VB(do) -> 尽; NN(part) -> 一份; NN(part) -> 一份 力; S(NP-C(NPB(PRP(I))) VP(x0:VBD VP-C(x1:VBN SG-C(VP(TO(to) VP-C(x2:VB NP-C(NPB(PRP$(my) x3:NN)))))))) -> 我 x0 x1 x2 x3. [Figure: English parse tree (S, NP-C, NPB, VP, VP-C, SG-C over the tags PRP, VBD, VBN, TO, VB, PRP$, NN) for "i felt obliged to do my part", aligned to 我 有 责任 尽 一份 力]

  20. GHKM (Galley et al, 2004) RULES ACQUIRED: VBD(felt) -> 有; VBN(obliged) -> 责任; VB(do) -> 尽; NN(part) -> 一份; NN(part) -> 一份 力; VP-C(x0:VBN x1:SG-C) -> x0 x1; VP(TO(to) x0:VP-C) -> x0; …; S(x0:NP-C x1:VP) -> x0 x1; + other non-lexical rules. [Figure: English parse tree (S, NP-C, NPB, VP, VP-C, SG-C over the tags PRP, VBD, VBN, TO, VB, PRP$, NN) for "i felt obliged to do my part", aligned to 我 有 责任 尽 一份 力]
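GHKM cuts the English tree into rules at nodes that are consistent with the word alignment, which Galley et al. call frontier nodes. The sketch below finds such nodes for the example sentence. It is a simplified illustration rather than the full rule extractor: the tuple encoding of the tree, the word links, and the choice to skip nodes that cover no aligned word are all assumptions, and `collect_nodes` and `frontier_nodes` are hypothetical helper names.

```python
def collect_nodes(tree, start, nodes):
    """Append (label, first_leaf, last_leaf) for every node; return next leaf index."""
    label, *children = tree
    pos = start
    for child in children:
        if isinstance(child, str):      # a word
            pos += 1
        else:                           # a subtree
            pos = collect_nodes(child, pos, nodes)
    nodes.append((label, start, pos - 1))
    return pos


def frontier_nodes(tree, alignment):
    """Nodes whose foreign span contains no word linked to English outside the node."""
    nodes = []
    collect_nodes(tree, 0, nodes)
    frontier = []
    for label, lo, hi in nodes:
        inside = [j for i, j in alignment if lo <= i <= hi]
        if not inside:
            continue  # assumption: nodes covering no aligned word are skipped
        f_lo, f_hi = min(inside), max(inside)
        outside = [j for i, j in alignment if not (lo <= i <= hi)]
        if not any(f_lo <= j <= f_hi for j in outside):
            frontier.append((label, lo, hi))
    return frontier


# Tree from the slide, encoded as (label, child, ...) with words as strings.
tree = ("S",
        ("NP-C", ("NPB", ("PRP", "i"))),
        ("VP", ("VBD", "felt"),
         ("VP-C", ("VBN", "obliged"),
          ("SG-C",
           ("VP", ("TO", "to"),
            ("VP-C", ("VB", "do"),
             ("NP-C", ("NPB", ("PRP$", "my"), ("NN", "part")))))))))
# Assumed links (not given on the slide): i-我, felt-有, obliged-责任, do-尽, part-一份.
links = {(0, 0), (1, 1), (2, 2), (4, 3), (6, 4)}
for label, lo, hi in frontier_nodes(tree, links):
    print(label, lo, hi)
```

Minimal rules are then read off by cutting the tree at these nodes. That step is omitted here.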

  21. GHKM Syntax Rules • Phrasal Translation • Non-constituent Phrases • Non-contiguous Phrases • Context-Sensitive Word Insertion • Multilevel Re-Ordering • Lexicalized Re-Ordering [Figure: one example rule per type; the Spanish strings hay, poner, está cantando are paired with English tree fragments containing "there are", "put … on", "is singing"; the re-ordering and insertion examples use NP1, NP2, VB, "the", "of"]

  23. ATS and GHKM Methods Do Not Coincide [Venn diagram: GHKM Phrase Pairs Relevant to NIST-02 vs. ATS Phrase Pairs Relevant to NIST-02; region labels 134k, 43k, 161k] • GHKM has no built-in phrase size limit -- ATS does. • GHKM misses phrases due to parse failures. • GHKM pulls unaligned English words into phrases. • GHKM forced to incorporate some unaligned foreign words into phrases. • GHKM only gets minimal rules to explain each segment pair. • GHKM forced to incorporate unaligned English words into phrases. • GHKM phrases come with applicability conditions.

  24. ATS and GHKM Methods Overlap [Venn diagram: GHKM Phrase Pairs Relevant to NIST-02 vs. ATS Phrase Pairs actually used in 1-best decodings of NIST-02 (1,994 = 2 per sentence)] • GHKM misses phrases due to parse failures. • GHKM phrases come with applicability conditions. • GHKM forced to incorporate some unaligned foreign words into phrases. • GHKM only gets minimal rules to explain each segment pair. • GHKM forced to incorporate unaligned English words into phrases.

  25. So … Some Ideas for Improving Syntax-Based Extraction • Acquire larger rules, beyond “minimal” syntactic explanation of segments 1. Composed rules (Galley et al, 06) 2. Phrasal rules (Marcu et al, 06) • Acquire more general rules 3. Restructure English trees

  26. Composed Rules • Minimal GHKM Rules: B(e1 e2) -> c1 c2; C(e3) -> c3; A(x0:B x1:C) -> x0 x1 • Additional Composed Rules: A(B(e1 e2) x0:C) -> c1 c2 x0; A(x0:B C(e3)) -> x0 c3; A(B(e1 e2) C(e3)) -> c1 c2 c3 (“big phrasal rule”) [Figure: tree A(B(e1 e2) C(e3)) aligned to c1 c2 c3]
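Composition amounts to plugging one rule into a variable of another and renumbering the variables that remain. The sketch below reproduces the three composed rules on this slide from its three minimal rules. The encoding is an assumption made for illustration: a left-hand side is a nested tuple, a variable is a `Var(idx, label)`, and a right-hand side is a list in which an integer `n` stands for `xn`. `compose` and `map_tree` are hypothetical helper names, not part of any published toolkit.

```python
from collections import namedtuple

Var = namedtuple("Var", "idx label")    # LHS variable such as x0:B


def map_tree(node, fn):
    """Rebuild an LHS tree, applying fn to every variable leaf."""
    if isinstance(node, Var):
        return fn(node)
    if isinstance(node, str):           # a word
        return node
    return (node[0],) + tuple(map_tree(c, fn) for c in node[1:])


def compose(parent, child, k):
    """Substitute `child` for variable x_k of `parent`, then renumber left to right."""
    (p_lhs, p_rhs), (c_lhs, c_rhs) = parent, child
    # Tag child variables so their indices cannot collide with the parent's.
    c_lhs = map_tree(c_lhs, lambda v: Var(("c", v.idx), v.label))
    c_rhs = [("c", t) if isinstance(t, int) else t for t in c_rhs]

    def plug(v):
        if v.idx != k:
            return v
        if v.label != c_lhs[0]:
            raise ValueError("child root label must match the variable label")
        return c_lhs

    lhs = map_tree(p_lhs, plug)
    rhs = []
    for t in p_rhs:
        rhs.extend(c_rhs if t == k else [t])

    order = {}                          # old variable id -> new index

    def renumber(v):
        order[v.idx] = len(order)
        return Var(order[v.idx], v.label)

    lhs = map_tree(lhs, renumber)
    rhs = [t if isinstance(t, str) else order[t] for t in rhs]
    return lhs, rhs


# Minimal rules from the slide.
rule_b = (("B", "e1", "e2"), ["c1", "c2"])
rule_c = (("C", "e3"), ["c3"])
rule_a = (("A", Var(0, "B"), Var(1, "C")), [0, 1])

left = compose(rule_a, rule_b, 0)       # A(B(e1 e2) x0:C) -> c1 c2 x0
right = compose(rule_a, rule_c, 1)      # A(x0:B C(e3)) -> x0 c3
big = compose(left, rule_c, 0)          # A(B(e1 e2) C(e3)) -> c1 c2 c3
print(left, right, big, sep="\n")
```

In practice the number of composed rules grows quickly with tree size, so an extractor would need some limit on how far it composes. Any such limit is left out of this sketch.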

  27. Composed Rules

  28. Phrasal Rules • SPMT Model 1 (Marcu et al 2006) • consider each foreign phrase up to length L • extract smallest possible syntax rule that does not violate alignments

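  29. Restructuring English Training Trees [Figure: two trees aligned to foreign words c1 c2 c3: the flat tree NPB(DT JJ NNP NNP NNP NNP) and the binarized tree NPB(DT NPB(JJ NPB(NNP NPB(NNP NPB(NNP NNP))))). Word labels: "the Israeli Prime Minister Ariel Sharon" and "the Israeli pr. min. Ariel Sharon"]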
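The restructuring shown here resembles right-binarization of a flat noun phrase. The sketch below produces the nested tree from the flat one. It assumes the tuple tree encoding used for illustration and reuses the parent label (NPB) for the new intermediate nodes, as the slide appears to do. The actual system may label or restructure nodes differently, and `binarize_right` is a hypothetical helper name.

```python
def binarize_right(tree):
    """Right-binarize a tree: NPB(a b c d) becomes NPB(a NPB(b NPB(c d)))."""
    if isinstance(tree, str):           # a word
        return tree
    label, *children = tree
    children = [binarize_right(c) for c in children]
    # Repeatedly fold the last two children into a new node with the same label.
    while len(children) > 2:
        children = children[:-2] + [(label, children[-2], children[-1])]
    return (label, *children)


flat = ("NPB",
        ("DT", "the"), ("JJ", "Israeli"),
        ("NNP", "Prime"), ("NNP", "Minister"),
        ("NNP", "Ariel"), ("NNP", "Sharon"))
print(binarize_right(flat))
```

The extra internal nodes give a rule extractor more places to cut, so sub-phrases such as "Ariel Sharon" can get their own rule instead of being locked inside one large flat rule.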

  30. Restructuring English Training Trees

  31. Effects of Coverage Improvements on Syntax-Based MT Accuracy NIST Bleu r4n4

  32. Conclusions • Phrase-based and syntax-based extraction algorithms have different coverage. • Syntax-based coverage can be improved. • Improvements lead to better translation accuracy: +3.6 Bleu for Chinese/English, +2.0 Bleu for Arabic/English • Lots of new techniques for syntax-based MT: Quirk et al, 2005; Liu et al, 2006; Galley et al, 2006; Marcu et al, 2006; Riezler & Maxwell, 2006; Zollmann & Venugopal, 2006
