Motivations for transfer-based translation

Presentation Transcript

  1. Motivations for transfer-based translation • lexical ambiguity • structural differences See further Ingo 91

  2. Example 1 Sv. Fyll på olja i växellådan. → En. Fill gearbox with oil. (from the Scania corpus) • fyll på → fill • obj → adv • adv → obj

  3. Example 2 Sv. I oljefilterhållaren sitter en överströmningsventil. → En. The oil filter retainer has an overflow valve. (from the Scania corpus) • sitter → has • adv → subj • subj → obj

  4. Transfer-based translation • intermediary sentence structure • basic processes • analysis • transfer • generation (synthesis) • language modules • dictionary and grammar of SL • transfer dictionary and transfer rules • dictionary and grammar of TL

  5. [Figure: translation strategies between SL and TL; labels: Direct translation, Metal, Transfer, Multra, Interlingua]

  6. Levels of intermediary structure • cf. J&M, Chapter 21 • word order

  7. Metal • See H&S

  8. MULTRA Multilingual Support for Translation and Writing • translation engine • transfer-based • shake-and-bake • modular • unification-based • preference machinery • trace-able

  9. Analysis • chart parser (Lisp → C) • procedural formalism • unification and other kinds of operations • sentence structure • feature structure • grammatical relations • surface order implicit via grammatical relations See further Sågvall Hein & Starbäck (99), Weijnitz (02), Dahllöf (89)
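
Unification of feature structures, which the analysis step relies on, can be illustrated with a minimal sketch. Nested Python dicts stand in for feature structures, and the `unify` function is a hypothetical helper; it ignores reentrancy and typed features, so it should not be read as the Multra formalism itself.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts.

    Returns the merged structure, or None if two atomic values clash.
    Simplification: no reentrancy (shared substructures) and no types.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_val in b.items():
            if key in result:
                merged = unify(result[key], b_val)
                if merged is None:
                    return None  # clash somewhere below this feature
                result[key] = merged
            else:
                result[key] = b_val
        return result
    # Atomic values unify only if they are identical.
    return a if a == b else None


np_partial = {"PHR.CAT": "NP", "NUMB": "SING"}
np_head = {"PHR.CAT": "NP", "HEAD": {"LEX": "HYTT.NN.1", "WORD.CAT": "NOUN"}}

print(unify(np_partial, np_head))                 # compatible: features are merged
print(unify({"NUMB": "SING"}, {"NUMB": "PLUR"}))  # clash on NUMB
```

Two structures that agree on shared features merge into one; any disagreement on an atomic value makes the whole unification fail.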

  10. Transfer • unification-based • declarative formalism • Multra transfer formalism (Beskow 93) • lexical and structural rules • rules are partially ordered • a more specific rule takes precedence over a less specific one • specificity in terms of number of transfer equations • all applicable rules are applied • written in Prolog
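
The precedence principle (a more specific rule wins, with specificity measured by the number of equations) can be sketched as follows. The `Rule` class, the rule labels and the choice to count every equation in a rule are assumptions made for illustration; the slides do not say exactly which equations are counted.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    label: str
    equations: list = field(default_factory=list)  # equations as plain strings

    @property
    def specificity(self):
        # Assumption: specificity = number of equations in the rule.
        return len(self.equations)


def by_precedence(rules):
    """Most specific rule first; ties keep their input order (stable sort)."""
    return sorted(rules, key=lambda r: r.specificity, reverse=True)


# Hypothetical rules: a general verb rule and a more specific lexical one.
rules = [
    Rule("general-verb", ["<* word.cat> = VERB"]),
    Rule("specific-verb", ["<* word.cat> = VERB", "<* lex> = tippa.vb.1"]),
]
print([r.label for r in by_precedence(rules)])
```

Rules with the same equation count stay unordered relative to each other, which is one way to read "partially ordered".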

  11. Generation • syntactic generation • Multra syntactic generation formalism (Beskow 97a) • PATR-like style • unification • concatenation • typed features • morphological generation (Beskow 97b) • lexical insertion rules • morphological realisation and phonological finish in Prolog • written in Prolog

  12. An example: Tippa hytten.
      Tippa hytten. :
      (* = (PHR.CAT = CL
            MODE = IMP
            SUBJ = 2ND
            VERB = (WORD.CAT = VERB
                    INFF = IMP
                    DIAT = ACT
                    LEX = TIPPA.VB.1
                    VSURF = +)
            OBJ.DIR = (PHR.CAT = NP
                       NUMB = SING
                       GENDER = UTR
                       CASE = BASIC
                       DEF = DEF
                       HEAD = (LEX = HYTT.NN.1
                               WORD.CAT = NOUN))
            REG = (V1.LEM = TIPPA.VB)
            SEP = (WORD.CAT = SEP
                   LEX = STOP.SR.0)))

  13. Transfer structure
      [VERB : [WORD.CAT : VERB
               LEX : TILT.VB.0
               DIAT : ACT
               INFF : IMP]
       OBJ.DIR : [PHR.CAT : NP
                  DEF : DEF
                  NUMB : SING
                  HEAD : [WORD.CAT : NOUN
                          LEX : CAB.NN.0]]
       MODE : IMP
       SUBJ : 2ND
       VSURF : +
       SEP : [WORD.CAT : SEP
              LEX : STOP.SR.0]
       PHR.CAT : CL]

  14. Generation Tilt the cab.

  15. A grammar rule
      defrule legal.obj {
        <?1 phr.cat> = 'np,
        not <?1 case> = 'gen,
        not <?1 case> = 'subj
      }
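
Read as a plain test on a feature structure, the `legal.obj` rule could look like the sketch below. The dict representation and the function name `legal_obj` are assumptions, as is the treatment of a missing `case` feature (it is accepted here); the actual semantics of `not` in the formalism may differ.

```python
def legal_obj(fs):
    """True if fs is an NP whose case is neither genitive nor subjective.

    Assumption: values are compared as plain strings, and a structure
    without a 'case' feature passes both negative tests.
    """
    return fs.get("phr.cat") == "np" and fs.get("case") not in ("gen", "subj")


print(legal_obj({"phr.cat": "np", "case": "basic"}))
print(legal_obj({"phr.cat": "np", "case": "gen"}))
```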

  16. Transfer rules • copy feature • delete feature • transfer feature • assign feature

  17. Copy feature
      LABEL mode
      SOURCE <* mode> = ?x1
      TARGET <* mode> = ?x2
      TRANSFER

  18. Delete feature
      LABEL REG
      SOURCE <* REG> = ANY
      TARGET <*> = <*>
      TRANSFER

  19. Transfer feature
      LABEL OBJ.DIR
      SOURCE <* OBJ.DIR> = ?x1
      TARGET <* OBJ.DIR> = ?x2
      TRANSFER ?x1 <=> ?x2

  20. Define feature
      LABEL trycka.in-press
      SOURCE
        <* lex sym> = trycka.vb+in.ab.1
        <* word.cat> = VERB
      TARGET
        <* lex> = press.vb.1
        <* word.cat> = VERB
      TRANSFER
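
The four rule types can be approximated in a single recursive pass over a source structure. This is only a sketch under simplifying assumptions: feature structures are nested dicts, the `transfer` function and the `DELETE` set are hypothetical, and the lexical pairs are the ones visible in the Tippa hytten example. It does not reproduce the unification-based Multra transfer formalism.

```python
# Lexical correspondences taken from the Tippa hytten example.
LEXICON = {"TIPPA.VB.1": "TILT.VB.0", "HYTT.NN.1": "CAB.NN.0"}
# Features dropped on the way to the target side (cf. the REG delete rule).
DELETE = {"REG"}


def transfer(fs, lexicon=LEXICON, delete=DELETE):
    out = {}
    for feat, val in fs.items():
        if feat in delete:
            continue                                    # delete feature
        if isinstance(val, dict):
            out[feat] = transfer(val, lexicon, delete)  # transfer feature (?x1 <=> ?x2)
        elif feat == "LEX":
            out[feat] = lexicon.get(val, val)           # lexical rule
        else:
            out[feat] = val                             # copy feature
    return out


source = {
    "PHR.CAT": "CL",
    "MODE": "IMP",
    "VERB": {"WORD.CAT": "VERB", "LEX": "TIPPA.VB.1"},
    "OBJ.DIR": {"PHR.CAT": "NP", "DEF": "DEF",
                "HEAD": {"WORD.CAT": "NOUN", "LEX": "HYTT.NN.1"}},
    "REG": {"V1.LEM": "TIPPA.VB"},
}
print(transfer(source))
```

Atomic features such as `MODE` are copied unchanged, `REG` disappears, `OBJ.DIR` is transferred recursively, and the `LEX` values are replaced through the lexicon.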

  21. A generation rule
      LABEL CL.IMP
      X1 ---> X2 X3 X4 :
        <X1 PHR.CAT> = CL
        <X1 VERB> = <X2>
        <X1 TYPE> = IMP
        <X1 OBJ.DIR> = <X3>
        <X1 SEP> = <X4>
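
The effect of the CL.IMP rule, which orders the verb, the direct object and the separator, can be sketched as a linearisation function. The `SURFACE` table, `gen_np` and `gen_cl_imp` are hypothetical stand-ins for lexical insertion and the NP rules, and the test on the imperative feature is left out for brevity.

```python
# Hypothetical surface forms standing in for morphological generation.
SURFACE = {"TILT.VB.0": "tilt", "CAB.NN.0": "cab", "STOP.SR.0": "."}


def gen_np(np):
    """Definite NP: article followed by the head noun."""
    words = ["the"] if np.get("DEF") == "DEF" else []
    return words + [SURFACE[np["HEAD"]["LEX"]]]


def gen_cl_imp(cl):
    """X1 ---> X2 X3 X4: verb, direct object, separator, in that order."""
    if cl.get("PHR.CAT") != "CL":
        return None
    verb = [SURFACE[cl["VERB"]["LEX"]]]
    obj = gen_np(cl["OBJ.DIR"])
    sep = SURFACE[cl["SEP"]["LEX"]]
    sentence = " ".join(verb + obj) + sep
    return sentence[0].upper() + sentence[1:]


structure = {
    "PHR.CAT": "CL",
    "VERB": {"LEX": "TILT.VB.0"},
    "OBJ.DIR": {"PHR.CAT": "NP", "DEF": "DEF", "HEAD": {"LEX": "CAB.NN.0"}},
    "SEP": {"LEX": "STOP.SR.0"},
}
print(gen_cl_imp(structure))  # Tilt the cab.
```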

  22. A contextual lexical rule
      LABEL tänka.på-think.about
      SOURCE
        <* verb lex sym> = tänka.vb.1
        <* obj.prep phr.cat> = pp
        <* obj.prep prep> = ?prep
        <* obj.prep prep lex sym> = på.pp.1
        <* obj.prep rect> = ?rect1
      TARGET
        <* obj.prep phr.cat> = pp
        <* obj.prep prep word.cat> = PREP
        <* obj.prep prep lex> = about.pp.1
        <* obj.prep rect> = ?rect2
      TRANSFER
        ?rect1 <=> ?rect2
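
A contextual rule fires only when the surrounding structure matches. The sketch below mimics the tänka.på rule with nested dicts: the preposition på is rendered as about only under the verb tänka. The function name, the flattening of the `lex sym` path to a single `lex` key, the identity default for transferring the complement and the sample complement `resa.nn.1` are all assumptions.

```python
def tanka_pa_rule(src, transfer_rect=lambda fs: fs):
    """Return the target obj.prep structure if the context matches, else None."""
    verb_lex = src.get("verb", {}).get("lex")
    pp = src.get("obj.prep", {})
    if verb_lex != "tänka.vb.1" or pp.get("phr.cat") != "pp":
        return None
    if pp.get("prep", {}).get("lex") != "på.pp.1":
        return None
    return {
        "obj.prep": {
            "phr.cat": "pp",
            "prep": {"word.cat": "PREP", "lex": "about.pp.1"},
            # ?rect1 <=> ?rect2: the complement is transferred separately.
            "rect": transfer_rect(pp.get("rect")),
        }
    }


src = {
    "verb": {"lex": "tänka.vb.1"},
    "obj.prep": {"phr.cat": "pp", "prep": {"lex": "på.pp.1"},
                 "rect": {"lex": "resa.nn.1"}},  # hypothetical complement
}
print(tanka_pa_rule(src))
```

With any other verb the function returns `None`, leaving the preposition to a less specific rule.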

  23. A generation trace
      1- Applying Rule cl-sep
      1- Applying Rule cl.imp
      1- Applying Rule subj2nd-verb-obj.dir
      1- Applying Rule verb.main.act
      1- Applying Rule np.the-df
      1- Applying Rule ng.noun-def
      1- Success!

  24. Language resources in the MATS system • dictionary in a database with different views • analysis grammar • transfer grammar • incl. contextually defined lexical rules • generation grammar

  25. sv-en_LinkLexicon

  26. en_Inflections

  27. en_LemmaLexicon

  28. en_LexemeLexicon

  29. en_Lexicon

  30. en_StemLexicon

  31. sv_Inflections

  32. sv_LemmaLexicon

  33. sv_LexemeLexicon

  34. sv_Lexicon

  35. sv_StemLexicon

  36. The MATS system Frozen demo…

  37. Assignment 2: Working with MATS http://stp.ling.uu.se/~evapet/mt04/assignment2.html

  38. Lexicalistic translation • Identify (lexical) translation units in the source sentence • Translate each unit separately (considering the context) • Order the result in agreement with a model of the target language Formulation due to Lars Ahrenberg; see further AH (reading list); see also Beaven, John L., Shake-and-Bake Machine Translation. COLING-92, Nantes, 23-28 August 1992.
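
The three steps can be sketched on a toy scale. Everything here is an assumption made for illustration: units are single words, the unit dictionary `UNIT_DICT` has two entries, context is ignored in step 2, and the "model of the target language" is just a set of permitted bigrams.

```python
from itertools import permutations

UNIT_DICT = {"tippa": "tilt", "hytten": "the cab"}  # toy unit dictionary
TARGET_BIGRAMS = {("<s>", "tilt"), ("tilt", "the"),
                  ("the", "cab"), ("cab", "</s>")}   # toy target-language model


def score(words):
    """Number of adjacent word pairs licensed by the target model."""
    padded = ["<s>"] + words + ["</s>"]
    return sum((a, b) in TARGET_BIGRAMS for a, b in zip(padded, padded[1:]))


def translate(sentence):
    units = sentence.lower().rstrip(".").split()   # 1. identify units
    bag = [UNIT_DICT[u] for u in units]            # 2. translate each unit
    best = max(permutations(bag),                  # 3. order by the target model
               key=lambda p: score(" ".join(p).split()))
    return " ".join(best)


print(translate("Tippa hytten."))  # tilt the cab
```

Trying every permutation is only workable for very short sentences; it is used here to keep the ordering step visible.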

  39. T4F – a lexicalistic system • processes in T4F • tokenisation • tagging • transfer • transposition • filtering See further AH (in the reading list)

  40. Interlingua translation • See SN

  41. Applications of alignment • translation memories • translation dictionaries • lexicalistic translation • statistical machine translation • example-based translation

  42. Translation memories • based on sentence links • optionally, sub sentence links See further Macklovitch, E. (2000)
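
A translation memory built on sentence links is essentially a lookup table from source sentences to stored translations. The sketch below uses the two sentence pairs quoted in this presentation; the fuzzy fallback via Python's standard `difflib` module and the `cutoff` value are assumptions added for illustration and are not described in the slides.

```python
import difflib

# Sentence links: source sentence -> stored translation.
MEMORY = {
    "Fyll på olja i växellådan.": "Fill gearbox with oil.",
    "Tippa hytten.": "Tilt the cab.",
}


def lookup(sentence, cutoff=0.8):
    """Return (translation, similarity) or None if nothing is close enough."""
    if sentence in MEMORY:
        return MEMORY[sentence], 1.0
    close = difflib.get_close_matches(sentence, list(MEMORY), n=1, cutoff=cutoff)
    if not close:
        return None
    ratio = difflib.SequenceMatcher(None, sentence, close[0]).ratio()
    return MEMORY[close[0]], ratio


print(lookup("Tippa hytten."))
print(lookup("Tippa hytten!"))
```

An exact hit returns the stored translation with similarity 1.0; a near match returns the closest stored sentence's translation together with its similarity, so a translator can decide whether to reuse it.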

  43. Translation dictionaries • based on word links • refinement of word links

  44. Refinement of word alignment data • neutralise capital letters where appropriate • lemmatise or tag source and target units • identify ambiguities • search for criteria to resolve them • identify partial links • compounds? • remove or complete them • manual revision?
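
Two of these refinement steps, neutralising capitals and identifying ambiguities, can be sketched over a list of word links. The `refine` function is hypothetical, lowercasing everything is a simplification of "where appropriate", and the link `("sitter", "sits")` is invented to create an ambiguity.

```python
from collections import Counter, defaultdict


def refine(links):
    """Lowercase word links, count them, and report ambiguous source words."""
    counts = Counter((s.lower(), t.lower()) for s, t in links)
    by_source = defaultdict(dict)
    for (s, t), n in counts.items():
        by_source[s][t] = n
    ambiguous = {s: ts for s, ts in by_source.items() if len(ts) > 1}
    return dict(by_source), ambiguous


links = [("Olja", "oil"), ("olja", "oil"),
         ("sitter", "has"), ("sitter", "sits")]  # last link is hypothetical
dictionary, ambiguous = refine(links)
print(dictionary)
print(ambiguous)
```

The ambiguous entries are the ones for which disambiguation criteria, or manual revision, would then be needed.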

  45. Informally about statistical MT • build a translation dictionary based on word alignment • aim for fragments that are as large as possible • keep information on link frequency • build an n-gram model of the target language • implement a direct translation strategy • including alternatives ordered by length and frequency • process the output with the n-gram model, selecting the best alternatives, and adjust the translation accordingly
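
The final step, letting an n-gram model choose among dictionary alternatives, can be sketched with bigram counts. This is a toy under stated assumptions: the target corpus, the alternatives "tip" and "the cabin", and the scoring by summed raw bigram counts are all invented for illustration, and link frequency and fragment length are left out.

```python
from collections import Counter
from itertools import product


def bigram_counts(corpus):
    """Count adjacent word pairs, with sentence boundary markers."""
    counts = Counter()
    for sent in corpus:
        words = ["<s>"] + sent.split() + ["</s>"]
        counts.update(zip(words, words[1:]))
    return counts


def best_translation(source_words, dictionary, bigrams):
    """Pick the combination of alternatives with the highest bigram score."""
    candidates = product(*(dictionary[w] for w in source_words))

    def score(cand):
        words = ["<s>"] + " ".join(cand).split() + ["</s>"]
        return sum(bigrams[pair] for pair in zip(words, words[1:]))

    return " ".join(max(candidates, key=score))


dictionary = {"tippa": ["tilt", "tip"], "hytten": ["the cab", "the cabin"]}
bigrams = bigram_counts(["tilt the cab", "lower the cab"])  # toy target corpus
print(best_translation(["tippa", "hytten"], dictionary, bigrams))  # tilt the cab
```

A real system would use smoothed probabilities rather than raw counts; the counts keep the selection principle easy to follow.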

  46. Example-based MT • See HS (in the reading list)