1 / 25

Example-based Machine Translation using Structural Translation Examples

Example-based Machine Translation using Structural Translation Examples. Eiji Aramaki* Sadao Kurohashi* * University of Tokyo. Proposed System. Proposed System. Parses an Input Sentence. Selects Structural Translation Examples. Combines them to generate an output tree.

Download Presentation

Example-based Machine Translation using Structural Translation Examples

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Example-based Machine Translation using Structural Translation Examples Eiji Aramaki* Sadao Kurohashi* * University of Tokyo

  2. Proposed System

  3. Proposed System Parses an Input Sentence Selects Structural Translation Examples Combines them to generate an output tree Decides the word-order

  4. Structural Translation Examples pos 日本語の (Japanese) Give me obj 新聞を (news paper) Japanese 下さい (give) newspaper • The Advantage of High-Usability • BUT: It requires many technologies • Parsing & Tree Alignment (are still being developed) → A naive method without such technologies may be efficient in a limited domain

  5. Outline • Algorithm • Alignment Module • Translation Module • Experimental Results • Conclusion

  6. System Frame Work Input Bilingual Corpus Translation Memory Alignment module Translation module Output • Alignment module • Builds Translation Examples from Bilingual Corpus • Translation module • Selects Translation Examples • Combines them into a Translation

  7. Alignment Module (1/2) • A sentence pair is analyzed by parsers [Kurohashi1994][Charniak2000] • Correspondences are estimated by Dictionary-based Alignment method [Aramaki 2001] pos 日本語の (Japanese) Give (S (VP (VBP Give) (NP (PRP me)) (NP (DT a) (JJ Japanese) (NN newspaper.)))) me obj 新聞を (news paper) Japanese 下さい (give) newspaper

  8. Alignment Module (2/2) • Translation example = A combinations of correspondences which are connected to each other • With Surrounding phrases (= the parent and children phrases of correspondences) • for Selection of Translation Examples

  9. System Frame Work Input Bilingual Corpus Translation Memory Alignment module Translation module Output • Alignment module • Builds Translation Examples from Bilingual Corpus • Translation module • Selects of Translation Examples • Combines them into a Translation

  10. Translation Module(1/2) TRANSLATION EXAMPE INPUT pos 中国語の (Chinese) Give me obj obj 新聞を (news paper) 新聞を (news paper) 下さい (give) 下さい (give) newspaper • Equality : The number of equal phrases • Context Similarity: • calculated with a Japanese thesaurus • Alignment Confidence: • the ratio of content words which can be found in dictionaries

  11. Translation Module(1/2) TRANSLATION EXAMPE INPUT pos 中国語の (Chinese) Give me obj obj 新聞を (news paper) 新聞を (news paper) 下さい (give) 下さい (give) newspaper • Equality : The number of equal phrases • Context Similarity: • calculated with a Japanese thesaurus • Alignment Confidence: • the ratio of content words which can be found in dictionaries

  12. Context = surrounding phrases TRANSLATION EXAMPE INPUT pos 中国語の (Chinese) pos 日本語の (Japanese) Give me obj obj 新聞を (news paper) 新聞を (news paper) Japanese 下さい (give) 下さい (give) newspaper • Equality : The number of equal phrases • Context Similarity: • calculated with a Japanese thesaurus • Alignment Confidence: • the ratio of content words which can be found in dictionaries

  13. Translation Module(1/2) TRANSLATION EXAMPE INPUT pos 中国語の (Chinese) pos 日本語の (Japanese) Give me obj obj 新聞を (news paper) 新聞を (news paper) Japanese 下さい (give) 下さい (give) newspaper • Equality : The number of equal phrases • Context Similarity: • calculated with a Japanese thesaurus • Alignment Confidence: • the ratio of content words which can be found in dictionaries

  14. Translation Module(2/2) • Selection • Score:= ( Equality + Similarity ) x (λ+ Confidence) • Combine • The dependency relations & the word order in the translation examples are preserved Give Give + me = me Japanese Japanese newspaper newspaper • The dependency relations & the word order between the translation examples are decided by heuristic rules

  15. Exception: Shortcut Input Bilingual Corpus Translation Memory Alignment module Translation module Output If a Translation Example is almost equal to the input ⇒ the system outputs its target parts as it is. • Almost equal = Character-based DP Matching Similarity > 90%

  16. Outline • Algorism • Alignment Module • Translation Module • Experimental Results • Conclusion

  17. Experiments • We built Translation Examples from training-set (only given in IWSLT) Auto. Eval. Result • Dev-set & Test-set score are similar ← the system has no tuning metrics for the dev-set.

  18. Corpus size & Performance The system without a corpus can generate translations using only the translation dictionaries. The score is not saturated ⇒the system will achieve a higher performance if we obtain more corpora.

  19. Subjective Evaluation • Subjective Evaluation Result • Error Analysis • Most of the errors are classified into the following three problems: (1) Function Words (2) Word Order (3) Zero-pronoun 5: "Flawless English" 4: "Good English" 3: "Non-native English" 2: "Disfluent English" 1: "Incomprehensible"

  20. Problem1: Function words • The system selects translation examples using mainly content words ⇒ it sometimes generates un-natural function words • Determiners, prepositions

  21. Problem 2: Word Order • The word order between translation examples is decided by the heuristic rules. • The lack of rules leads to the wrong word order. Problem 3: Zero-pronoun • The input includes zero-pronoun. ⇒ outputs without a pronoun.

  22. Outline • Algorism • Alignment Module • Translation Module • Experimental Results • Conclusion

  23. Conclusions • We described an EBMT system which handles Structural translation examples • The experimental results shows the basic feasibility of this approach • In the future, as the amount of corpora increases, the system will achieve a higher performance

  24. Alignment Module (1/2) 日本語の 日本語の Give 新聞を 新聞を Japanese Japanese 下さい newspaper newspaper 日本語の 日本語の Give me Give 新聞を 新聞を Japanese Japanese 下さい 下さい newspaper newspaper Give 日本語の Give me me 新聞を 新聞を Japanese 下さい 下さい newspaper newspaper

More Related