
CSA4050: Advanced Techniques in NLP

Presentation Transcript


  1. CSA4050: Advanced Techniques in NLP. Machine Translation III: Statistical MT

  2. Statistical Translation • Robust • Domain-independent • Extensible • Does not require language specialists • Uses the noisy channel model of translation

  3. Noisy Channel Model: Sentence Translation (Brown et al. 1990) [Figure: a source sentence passes through a noisy channel to produce the observed target sentence; decoding recovers the source sentence]

  4. The Problem of Translation • Given a sentence T of the target language, seek the sentence S from which a translator produced T, i.e. find the S that maximises P(S|T) • By Bayes' theorem, P(S|T) = P(S) * P(T|S) / P(T), whose denominator is independent of S • Hence it suffices to maximise P(S) * P(T|S)
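A minimal sketch of this decision rule in Python, assuming hypothetical lm_prob and tm_prob functions standing in for a real language model P(S) and translation model P(T|S):

```python
# Sketch only: lm_prob and tm_prob are hypothetical stand-ins for a real
# language model P(S) and translation model P(T|S).
def best_source(candidate_sources, t, lm_prob, tm_prob):
    # P(T) is the same for every candidate S, so it can be dropped:
    # argmax_S P(S|T) = argmax_S P(S) * P(T|S)
    return max(candidate_sources, key=lambda s: lm_prob(s) * tm_prob(t, s))
```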

  5. A Statistical MT System [Figure: the source language model supplies P(S) and the translation model supplies P(T|S); given an observed target sentence T, the decoder searches for the source sentence S that maximises P(S) * P(T|S), i.e. P(S|T)]

  6. The Three Components of a Statistical MT Model • Method for computing language model probabilities P(S) • Method for computing translation probabilities P(T|S) • Method for searching amongst source sentences for one that maximises P(S) * P(T|S)

  7. Probabilistic Language Models • General: P(s1 s2 ... sn) = P(s1) * P(s2|s1) * ... * P(sn|s1 ... s(n-1)) • Trigram: P(s1 s2 ... sn) = P(s1) * P(s2|s1) * P(s3|s1,s2) * ... * P(sn|s(n-2),s(n-1)) • Bigram: P(s1 s2 ... sn) = P(s1) * P(s2|s1) * ... * P(sn|s(n-1))
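The bigram case can be made concrete with a small count-based sketch (unsmoothed, with an assumed <s> start symbol; a real model would need smoothing for unseen bigrams):

```python
from collections import defaultdict

def train_bigram_lm(sentences):
    # Collect counts c(w) and c(w, w') from tokenised sentences.
    unigram, bigram = defaultdict(int), defaultdict(int)
    for sent in sentences:
        words = ["<s>"] + sent
        for prev, cur in zip(words, words[1:]):
            unigram[prev] += 1
            bigram[(prev, cur)] += 1
    return unigram, bigram

def bigram_prob(sentence, unigram, bigram):
    # P(s1 ... sn) = P(s1|<s>) * P(s2|s1) * ... * P(sn|s(n-1))
    p = 1.0
    words = ["<s>"] + sentence
    for prev, cur in zip(words, words[1:]):
        if unigram[prev] == 0:
            return 0.0
        p *= bigram[(prev, cur)] / unigram[prev]
    return p
```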

  8. A Simple Alignment-Based Translation Model • Assumption: the target sentence is generated from the source sentence word by word • S: John loves Mary • T: Jean aime Marie

  9. Sentence Translation Probability • According to this model, the translation probability of the sentence is just the product of the translation probabilities of the words • P(T|S) = P(Jean aime Marie|John loves Mary) = P(Jean|John) * P(aime|loves) * P(Marie|Mary)
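The same computation as a short sketch, assuming a one-to-one alignment and a hypothetical word-translation table t_prob; the individual probabilities below are invented for illustration:

```python
# Word-by-word model under a one-to-one alignment; probabilities are made up.
def sentence_translation_prob(target_words, source_words, t_prob):
    p = 1.0
    for t_word, s_word in zip(target_words, source_words):
        p *= t_prob.get((t_word, s_word), 0.0)
    return p

t_prob = {("Jean", "John"): 0.9, ("aime", "loves"): 0.8, ("Marie", "Mary"): 0.9}
print(sentence_translation_prob(["Jean", "aime", "Marie"],
                                ["John", "loves", "Mary"], t_prob))  # 0.648
```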

  10. More Realistic Example • S: The proposal will not now be implemented • T: Les propositions ne seront pas mises en application maintenant

  11. Some Further Parameters • Word translation probability: P(t|s) • Fertility: the number of words in the target that are paired with each source word (0 to N) • Distortion: the difference in sentence position between the source word and the target word, modelled as P(i|j,l), the probability that the word in source position j is paired with target position i in a target sentence of length l
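These parameters are typically stored as simple lookup tables. The sketch below uses the estimated values for English "not" shown on a later slide; the distortion entry is invented for illustration:

```python
# P(t|s): word translation probabilities (values from the "not" slide below)
translation = {("pas", "not"): 0.469, ("ne", "not"): 0.460, ("non", "not"): 0.024}

# P(n|s): fertility probabilities, i.e. how many target words "not" produces
fertility = {("not", 2): 0.758, ("not", 0): 0.133, ("not", 1): 0.106}

# P(i|j,l): probability that the source word in position j pairs with target
# position i, given target sentence length l (this entry is invented)
distortion = {(5, 4, 9): 0.3}
```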

  12. Searching • Maintain a list of hypotheses. Initial hypothesis: (Jean aime Marie | *) • Search proceeds iteratively. At each iteration we extend the most promising hypotheses with additional words: Jean aime Marie | John(1) * ; Jean aime Marie | * loves(2) * ; Jean aime Marie | * Mary(3) * ; Jean aime Marie | Jean(1) *
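A rough, simplified sketch of the search loop, assuming a hypothetical score(partial_source, target) that combines the language-model and translation-model probabilities of a partial hypothesis (the real search also tracks which source words are paired with which target positions):

```python
import heapq

def beam_search(target_words, source_vocab, score, beam_size=5):
    # Start from the empty partial source sentence ("... | *").
    beam = [()]
    for _ in range(len(target_words)):
        # Extend every hypothesis in the beam with each possible source word,
        # then keep only the most promising extensions.
        extended = [hyp + (word,) for hyp in beam for word in source_vocab]
        beam = heapq.nlargest(beam_size, extended,
                              key=lambda hyp: score(hyp, target_words))
    return max(beam, key=lambda hyp: score(hyp, target_words))
```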

  13. Parameter Estimation • In general, large quantities of data are required • For the language model, we need only source language text • For the translation model, we need pairs of sentences that are translations of each other • Use the EM algorithm (Baum 1972) to optimise the model parameters
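A minimal sketch of the EM loop for the word-translation probabilities alone (in the spirit of the simplest of the Brown et al. models); the full training also re-estimates fertility and distortion parameters:

```python
from collections import defaultdict

def em_train(sentence_pairs, iterations=10):
    # sentence_pairs: list of (source_words, target_words) tuples
    # Start from a uniform table: every P(t|s) gets the same initial value.
    t_prob = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        counts, totals = defaultdict(float), defaultdict(float)
        # E-step: collect expected alignment counts under the current table.
        for src, tgt in sentence_pairs:
            for t_word in tgt:
                norm = sum(t_prob[(t_word, s_word)] for s_word in src)
                for s_word in src:
                    c = t_prob[(t_word, s_word)] / norm
                    counts[(t_word, s_word)] += c
                    totals[s_word] += c
        # M-step: re-estimate P(t|s) from the expected counts.
        t_prob = defaultdict(float, {pair: counts[pair] / totals[pair[1]]
                                     for pair in counts})
    return t_prob
```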

  14. Experiment 1 (Brown et al. 1990) • Hansard: 40,000 pairs of sentences, approx. 800,000 words in each language • Considered the 9,000 most common words in each language • Assumptions (initial parameter values): each of the 9,000 target words equally likely as a translation of each source word; each of the fertilities from 0 to 25 equally likely for each of the 9,000 source words; each target position equally likely given each source position and target length
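The uniform starting point described above can be written down directly; this is only a sketch of the initial parameter values, not of the full model:

```python
TARGET_VOCAB = 9000           # 9,000 most common words in each language
MAX_FERTILITY = 25

t_init = 1.0 / TARGET_VOCAB                   # every target word equally likely per source word
fertility_init = 1.0 / (MAX_FERTILITY + 1)    # fertilities 0..25 equally likely

def distortion_init(i, j, target_length):
    # Each target position equally likely, given any source position and length.
    return 1.0 / target_length
```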

  15. English: the • French translation probabilities: le .610, la .178, l’ .083, les .023, ce .013, il .012, de .009, à .007, que .007 • Fertility probabilities: 1 .871, 0 .124, 2 .004

  16. English: not • French translation probabilities: pas .469, ne .460, non .024, pas du tout .003, faux .003, plus .002, ce .002, que .002, jamais .002 • Fertility probabilities: 2 .758, 0 .133, 1 .106

  17. English: hear • French translation probabilities: bravo .992, entendre .005, entendu .002, entends .001 • Fertility probabilities: 0 .584, 1 .416

  18. Bajada 2003/4 • 400 sentence pairs from the Malta/EU accession treaty • Three different types of alignment • Paragraph (precision 97%, recall 97%) • Sentence (precision 91%, recall 95%) • Word: 2 translation models • Model 1: distortion-independent • Model 2: distortion-dependent

  19. Bajada 2003/4

  20. Experiment 2 • Perform translation using the 1,000 most frequent words in the English corpus • The 1,700 most frequently used French words in translations of sentences completely covered by the 1,000-word English vocabulary • 117,000 pairs of sentences completely covered by both vocabularies • Parameters of the English language model estimated from 570,000 sentences in the English part of the corpus

  21. Experiment 2 contd • 73 French sentences tested from elsewhere in the corpus. Results were classified as • Exact – same as the actual translation • Alternate – same meaning • Different – legitimate translation but different meaning • Wrong – could not be interpreted as a translation • Ungrammatical – grammatically deficient • Corrections to the last three categories were made and keystrokes were counted

  22. Results

  23. Results - Discussion • According to Brown et al., the system performed successfully 48% of the time (first three categories) • 776 keystrokes were needed to repair the system output, compared with 1,916 keystrokes to generate all 73 translations from scratch • According to the authors, the system therefore reduces work by about 60% (1 - 776/1916 ≈ 0.595)

  24. Bibliography • Brown et al., A Statistical Approach to Machine Translation, Computational Linguistics 16(2), 1990, pp. 79-85 (search the ACL Anthology)
