Towards Interactive and Automatic Refinement of Translation Rules

Ariadna Font Llitjós. PhD Thesis Proposal, 5 November 2004. Jaime Carbonell (advisor), Alon Lavie (co-advisor), Lori Levin, Bonnie Dorr (Univ. Maryland).


Presentation Transcript


  1. Towards Interactive and Automatic Refinement of Translation Rules
     Ariadna Font Llitjós
     PhD Thesis Proposal
     Jaime Carbonell (advisor), Alon Lavie (co-advisor), Lori Levin, Bonnie Dorr (Univ. Maryland)
     5 November 2004

  2. Outline • Introduction • Thesis statement and scope • Preliminary Research • Interactive elicitation of error information • A framework for automatic rule adaptation • Proposed Research • Contributions and Thesis Timeline Interactive and Automatic Rule Refinement

  3. Machine Translation (MT)
     • Source Language (SL) sentence: Gaudi was a great artist
       Spanish translation: Gaudi era un gran artista
     • MT system outputs:
       *Gaudi estaba un artista grande
       *Gaudi era un artista grande

  4. Spanish Adjectives (Completed Work, Automatic Rule Adaptation)
     • General order: NP [DET ADJ N] → NP [DET N ADJ]: a big house → una casa grande (grande = big in size)
     • Exception: NP [DET ADJ N] → NP [DET ADJ N]: a great artist → un gran artista (gran = exceptional)

  5. Commercial and Online Systems
     Correct translation: Gaudi era un gran artista
     • Systran, Babelfish (Altavista), WorldLingo, Translated.net: *Gaudi era gran artista
     • ImTranslation: *El Gaudi era un gran artista
     • 1-800-Translate: *Gaudi era un fenomenal artista

  6. Post-editing
     • Current solutions:
       - Post-editing [Allen, 2003] by human linguists or editors (experts)
       - Automated post-editing module (APE) [Allen & Hogan, 2000] to alleviate the tedious task of correcting the most frequent errors over and over
     • No solution yet fully automates the post-editing process

  7. Drawbacks of Current Methods
     • Manual post-editing → corrections do not generalize:
       Gaudi era un artista grande
       Juan es un amigo grande (Juan is a great friend)
       Era una oportunidad grande (It was a great opportunity)
     • APE → humans need to predict all the errors ahead of time and code the post-editing rules; given new error …

  8. My Solution
     • Automate post-editing efforts by feeding them back into the MT system.
     • Possible alternatives:
       • Automatic learning of post-editing rules
         + system independent
         - several thousand sentences might need to be corrected for the same error
       • Automatic refinement of translation rules
         + attacks the core of the problem
         - for transfer-based MT systems (need rules to fix!)

  9. Related Work
     [Diagram relating My Thesis to three areas: Fixing Machine Translation, Rule Adaptation, Post-editing]
     • Cited work: [Corston-Oliver & Gammon, 2003], [Imamura et al. 2003], [Menezes & Richardson, 2001], [Brill, 1993], [Gavaldà, 2000], [Callison-Burch, 2004], [Su et al. 1995], [Allen & Hogan, 2000]
     • My Thesis: no pre-existing training data required; no human reference translations required; non-expert user feedback

  10. Resource-poor Scenarios (AVENUE)
      • Lack of electronic parallel data
      • Lack of computational linguists → lack of manual grammar
      Why bother?
      • Indigenous communities have difficulty accessing crucial information that directly affects their lives (such as land laws, plagues, health warnings, etc.)
      • Preservation of their language and culture
      Resource-poor languages: Mapudungun, Quechua, Aymara

  11. How is MT possible for resource-poor languages?
      Bilingual speakers

  12. AVENUE Project Overview
      [Architecture diagram. Stages: Elicitation, Morphology, Rule Learning, Run-Time System. Components: Elicitation Tool, Elicitation Corpus, Word-Aligned Parallel Corpus, Morphological analyzer, Learning Module, Handcrafted rules, Transfer Rules, Lexical Resources, Run Time Transfer System, Lattice]

  13. My Thesis
      [AVENUE architecture diagram extended with a Rule Refinement stage. Stages: Elicitation, Morphology, Rule Learning, Run-Time System, Rule Refinement. Components: Elicitation Tool, Elicitation Corpus, Word-Aligned Parallel Corpus, Morphological analyzer, Learning Module, Handcrafted rules, Transfer Rules, Lexical Resources, Run Time Transfer System, Lattice, Translation Correction Tool, Rule Refinement Module]

  14. Related Work
      [Diagram relating My Thesis to four areas: Fixing Machine Translation, Rule Adaptation, Post-editing, Resource-poor languages]

  15. Thesis Statement
      Given a rule-based Transfer MT system:
      - Extract useful information from non-expert bilingual speakers to correct MT output.
      - Automatically refine and expand translation rules, given corrected and aligned translation pairs and some error information, to improve coverage and overall MT quality.

  16. Assumptions
      • No parallel training data available
      • No human reference translations available
      • The SL sentence needs to be fully parsed by the translation grammar.
      • Bilingual speakers can give enough information about the MT errors.

  17. Scope
      Automatically refine types of errors with:
      1. Just user correction information.
      2. Correction and error information.
      3. A reasonable amount of further user interaction and available correction and error information.
      Both in manually written and automatically learned grammars [AMTA 2004].

  18. Technical Challenges
      • Elicit minimal MT information from non-expert users
      • Automatically refine and expand translation rules minimally (manually written and automatically learned)
      • Automatic evaluation of the refinement process

  19. Preliminary Work
      • Interactive elicitation of error information
      • A framework for automatic rule adaptation

  20. Error Typology for Automatic Rule Refinement (simplified) (Completed Work, Interactive elicitation of error information)
      • Missing word
      • Extra word
      • Wrong word order: local vs. long distance; word vs. phrase; + word change
      • Incorrect word: sense; form; selectional restrictions; idiom
      • Wrong agreement: missing constraint; extra constraint

  21. TCTool (Demo) (Interactive elicitation of error information)
      Actions:
      • Add a word
      • Delete a word
      • Modify a word
      • Change word order

  22. 1st Eng2Spa User Study [LREC 2004] (Completed Work, Interactive elicitation of error information)
      • MT error classification → 9 linguistically-motivated classes [Flanagan, 1994], [White et al. 1994]: word order, sense, agreement error (number, person, gender, tense), form, incorrect word, and no translation

  23. Translation Rules
      {NP,8}
      NP::NP : [DET ADJ N] -> [DET N ADJ]
      (
       (X1::Y1) (X2::Y3) (X3::Y2)
       ;; English parsing:
       ((x0 def) = (x1 def))    ; NP definiteness = DET definiteness
       (x0 = x3)                ; NP = N (N is the head of the NP)
       ;; Spanish generation:
       ((y1 agr) = (y2 agr))    ; DET agreement = N agreement
       ((y3 agr) = (y2 agr))    ; ADJ agreement = N agreement
       (y2 = x3)                ; Pass the features of English N to Spanish N
      )
      ADJ::ADJ |: [nice] -> [bonito]
      (
       (X1::Y1)
       ((x0 pos) = adj)
       ((x0 form) = nice)
       ((y0 agr num) = sg)      ; Spanish ADJ is singular in number
       ((y0 agr gen) = masc)    ; Spanish ADJ is masculine in gender
      )

  24. Automatic Rule Refinement Framework (Completed Work, Automatic Rule Adaptation)
      • Find best RR operations given a:
        • Grammar (G),
        • Lexicon (L),
        • (Set of) Source Language sentence(s) (SL),
        • (Set of) Target Language sentence(s) (TL),
        • Its parse tree (P), and
        • Minimal correction of TL (TL’)
        such that TQ2 > TQ1
      • Which can also be expressed as: max TQ(TL | TL’, P, SL, RR(G, L))

  25. Types of Refinement Operations (Completed Work, Automatic Rule Adaptation)
      1. Refine a translation rule: R0 → R1 (R0 modified, either made more specific or more general)
         R0: NP [DET ADJ N] → NP [DET N ADJ]: a nice house → *una casa bonito
         R1: NP [DET ADJ N] → NP [DET N ADJ] + [N gender = ADJ gender]: a nice house → una casa bonita

  26. Types of Refinement Operations (2) (Completed Work, Automatic Rule Adaptation)
      2. Bifurcate a translation rule: R0 → R0 (same, general rule) + R1 (R0 modified, specific rule)
         R0: NP [DET ADJ N] → NP [DET N ADJ]: a nice house → una casa bonita
         R1: NP [DET ADJ N] → NP [DET ADJ N]: a great artist → un gran artista

  27. Formalizing Error Information (Completed Work, Automatic Rule Adaptation)
      Wi = error; Wi’ = correction; Wc = clue word
      • NP [DET ADJ N] → NP [DET N ADJ]: a nice house → *una casa bonito (Wi = bonito, Wc = casa)
      • NP [DET ADJ N] → NP [DET N ADJ] + [N gender = ADJ gender]: a nice house → una casa bonita (Wi’ = bonita)

  28. Triggering Feature Detection (Completed Work, Automatic Rule Adaptation)
      Comparison at the feature level to detect triggering feature(s)
      • Delta function: δ(Wi, Wi’)
      • Examples:
        δ(bonito, bonita) = {gender}
        δ(comiamos, comia) = {person, number}
        δ(mujer, guitarra) = {}
      • If the δ set is empty, need to postulate a new binary feature (e.g. [gen = masc] vs. [gen = masc])
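
The delta function on this slide can be illustrated with a minimal Python sketch. It assumes each word maps to a flat dictionary of feature values; the `FEATURES` table and its entries are hypothetical and are not taken from the AVENUE lexicon or morphological analyzer.

```python
from typing import Dict, Set

# Hypothetical feature table for illustration only (not the AVENUE lexicon).
FEATURES: Dict[str, Dict[str, str]] = {
    "bonito": {"pos": "adj", "gender": "masc", "number": "sg"},
    "bonita": {"pos": "adj", "gender": "fem", "number": "sg"},
    "comiamos": {"pos": "v", "person": "1", "number": "pl", "tense": "imperf"},
    "comia": {"pos": "v", "person": "3", "number": "sg", "tense": "imperf"},
    "mujer": {"pos": "n", "gender": "fem", "number": "sg"},
    "guitarra": {"pos": "n", "gender": "fem", "number": "sg"},
}


def delta(wi: str, wi_prime: str,
          lexicon: Dict[str, Dict[str, str]] = FEATURES) -> Set[str]:
    """Return the names of the features whose values differ between two words."""
    a, b = lexicon[wi], lexicon[wi_prime]
    # A feature counts as different if it is missing on one side or has another value.
    return {f for f in a.keys() | b.keys() if a.get(f) != b.get(f)}


def needs_new_feature(wi: str, wi_prime: str) -> bool:
    """An empty delta set means a new binary feature has to be postulated."""
    return not delta(wi, wi_prime)
```

With this table, `delta("bonito", "bonita")` yields `{"gender"}`, and `needs_new_feature("mujer", "guitarra")` is true because the two feature sets are identical.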

  29. Deciding on the Refinement Op (Completed Work, Automatic Rule Adaptation)
      Given:
      - Action performed by the user (add, delete, modify, change word order)
      - Error information available (clue word, word alignments, etc.)
      → Refinement Action

  30. Rule Refinement Operations
      [Decision-tree figure. User actions: Modify, Add, Delete, Change W Order. Branch labels: +Wc / –Wc; +al / –al; Wi Wc vs. Wi(…)Wc; +rule / –rule; POSi = POSi’ vs. POSi ≠ POSi’. One leaf: RuleLearner]

  31. Proposed Work
      - Batch and Interactive mode
      - User Studies
      - Evaluation

  32. Rule Refinement Example (Automatic Rule Adaptation)
      Change word order
      SL: Gaudí was a great artist
      MT system output (TL): Gaudí era un artista grande
      Goal (given by user correction): *Gaudí era un artista grande → Gaudí era un gran artista

  33. 1. Error Information Elicitation (Automatic Rule Adaptation)
      Refinement Operation Typology

  34. 2. Variable Instantiation from Log File (Automatic Rule Adaptation)
      Correcting actions:
      1. Word order change (artista grande → grande artista): Wi = grande
      2. Edited grande into gran: Wi’ = gran
      Identified artista as clue word: Wc = artista
      In this case, even if the user had not identified Wc, the refinement process would have been the same.
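
As a rough illustration of this step, the sketch below instantiates Wi, Wi’ and Wc from a list of logged correcting actions. The log format is not shown in the slides, so the action records (`type`, `word`, `new_word`) and the function name are assumptions made for this example only.

```python
from typing import Dict, List, Optional


def instantiate_variables(actions: List[Dict[str, str]]) -> Dict[str, Optional[str]]:
    """Derive Wi (error), Wi_prime (correction) and Wc (clue word) from logged actions.

    The action dictionaries use a hypothetical schema, not the TCTool log format.
    """
    variables: Dict[str, Optional[str]] = {"Wi": None, "Wi_prime": None, "Wc": None}
    for action in actions:
        kind = action["type"]
        if kind == "change_word_order":
            # The moved word is taken as the error word unless one is already set.
            if variables["Wi"] is None:
                variables["Wi"] = action["word"]
        elif kind == "modify":
            if variables["Wi"] is None:
                variables["Wi"] = action["word"]
            variables["Wi_prime"] = action["new_word"]
        elif kind == "clue_word":
            variables["Wc"] = action["word"]
    return variables


# The three user actions from the Gaudí example on this slide.
log = [
    {"type": "change_word_order", "word": "grande"},
    {"type": "modify", "word": "grande", "new_word": "gran"},
    {"type": "clue_word", "word": "artista"},
]
result = instantiate_variables(log)
```

Dropping the `clue_word` record from `log` leaves `Wc` as `None`, which corresponds to the case where the user did not identify a clue word.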

  35. 3. Retrieve Relevant Lexical Entries (Automatic Rule Adaptation)
      • No lexical entry for [great → gran]
      • Duplicate lexical entry [great → grande] and change TL side:
        ADJ::ADJ |: [great] -> [grande]
        ((X1::Y1) (…) ((y0 agr num) = sg) ((y0 agr gen) = masc))
        ADJ::ADJ |: [great] -> [gran]
        ((X1::Y1) (…) ((y0 agr num) = sg) ((y0 agr gen) = masc))
      (Morphological analyzer: grande = gran)

  36. 4. Finding Triggering Feature(s) (Automatic Rule Adaptation)
      Feature δ function: δ(Wi, Wi’) = ∅ → need to postulate a new binary feature: feat1
      5. Blame assignment (from MT system output)
      tree: <((S,1 (NP,2 (N,5:1 "GAUDI") ) (VP,3 (VB,2 (AUX,17:2 "ERA") ) (NP,8 (DET,0:3 "UN") (N,4:5 "ARTISTA") (ADJ,5:4 "GRANDE") ) ) ) )>
      Grammar: S,1 … NP,1 … NP,8 …
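
One plausible way to read blame assignment off the output tree is to find the lowest rule node that dominates both the error word and the clue word. The sketch below does this for the bracketed tree string on this slide; the parsing approach and the function names are assumptions for illustration, not the thesis implementation.

```python
import re
from typing import Dict, List, Optional

TREE = ('<((S,1 (NP,2 (N,5:1 "GAUDI") ) (VP,3 (VB,2 (AUX,17:2 "ERA") ) '
        '(NP,8 (DET,0:3 "UN") (N,4:5 "ARTISTA") (ADJ,5:4 "GRANDE") ) ) ) )>')

# Tokens: parentheses, quoted words, and bare labels such as NP,8 or N,4:5.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def word_paths(tree: str) -> Dict[str, List[str]]:
    """Map each quoted word to the list of node labels from the root down to it."""
    paths: Dict[str, List[str]] = {}
    stack: List[Optional[str]] = []
    for tok in TOKEN.findall(tree):
        if tok == "(":
            stack.append(None)          # label not seen yet
        elif tok == ")":
            stack.pop()
        elif tok.startswith('"'):
            # Simplification: a repeated word would overwrite the earlier path.
            paths[tok.strip('"')] = [lbl for lbl in stack if lbl is not None]
        elif stack and stack[-1] is None:
            stack[-1] = tok             # first bare token after "(" is the label
    return paths


def blame(tree: str, wi: str, wc: str) -> Optional[str]:
    """Return the lowest rule label dominating both words (lexical nodes excluded)."""
    paths = word_paths(tree)
    p1, p2 = paths[wi.upper()][:-1], paths[wc.upper()][:-1]
    common = None
    # Simplification: ancestors are compared by label, assuming labels are
    # unique along the two paths.
    for a, b in zip(p1, p2):
        if a != b:
            break
        common = a
    return common
```

For Wi = grande and Wc = artista this returns `NP,8`, the rule that the following steps refine.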

  37. 6. Variable Instantiation in the Rules (Automatic Rule Adaptation)
      Wi = grande → POSi = ADJ = Y3, y3
      Wc = artista → POSc = N = Y2, y2
      {NP,8}
      ;; TL positions: Y1 Y2 Y3
      NP::NP : [DET ADJ N] -> [DET N ADJ]
      (
       (X1::Y1) (X2::Y3) (X3::Y2)
       ((x0 def) = (x1 def))
       (x0 = x3)
       ((y1 agr) = (y2 agr))    ; det-noun agreement
       ((y3 agr) = (y2 agr))    ; adj-noun agreement
       (y2 = x3)
      )

  38. 7. Refining Rules (Automatic Rule Adaptation)
      • Bifurcate NP,8 → NP,8 (R0) + NP,8’ (R1) (flip order of ADJ-N)
      {NP,8’}
      NP::NP : [DET ADJ N] -> [DET ADJ N]
      (
       (X1::Y1) (X2::Y2) (X3::Y3)
       ((x0 def) = (x1 def))
       (x0 = x3)
       ((y1 agr) = (y3 agr))    ; det-noun agreement
       ((y2 agr) = (y3 agr))    ; adj-noun agreement
       (y3 = x3)
       ((y2 feat1) =c +)
      )
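
The bifurcation step can be sketched in Python as copying the general rule, swapping two target-side positions, renumbering the y-indices consistently, and appending the new value constraint. The `Rule` dataclass and the helper names below are hypothetical; constraints are kept as plain strings in the slide's notation.

```python
import re
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical container for a transfer rule; not the AVENUE data structure."""
    name: str
    sl: Tuple[str, ...]
    tl: Tuple[str, ...]
    alignments: Tuple[Tuple[int, int], ...]  # (x index, y index), 1-based
    constraints: Tuple[str, ...]


def swap_y_indices(text: str, i: int, j: int) -> str:
    """Rename y<i> to y<j> and y<j> to y<i> inside a constraint string."""
    def repl(match: "re.Match[str]") -> str:
        n = int(match.group(1))
        return "y" + str(j if n == i else i if n == j else n)
    return re.sub(r"\by(\d+)\b", repl, text)


def bifurcate_flip(r0: Rule, i: int, j: int, extra: str) -> Tuple[Rule, Rule]:
    """Keep R0 unchanged and build R1 with TL positions i and j swapped."""
    tl = list(r0.tl)
    tl[i - 1], tl[j - 1] = tl[j - 1], tl[i - 1]
    swap = {i: j, j: i}
    alignments = tuple((x, swap.get(y, y)) for x, y in r0.alignments)
    constraints = tuple(swap_y_indices(c, i, j) for c in r0.constraints) + (extra,)
    r1 = replace(r0, name=r0.name + "'", tl=tuple(tl),
                 alignments=alignments, constraints=constraints)
    return r0, r1


NP_8 = Rule(
    name="NP,8",
    sl=("DET", "ADJ", "N"),
    tl=("DET", "N", "ADJ"),
    alignments=((1, 1), (2, 3), (3, 2)),
    constraints=("((x0 def) = (x1 def))", "(x0 = x3)",
                 "((y1 agr) = (y2 agr))", "((y3 agr) = (y2 agr))", "(y2 = x3)"),
)
r0, r1 = bifurcate_flip(NP_8, 2, 3, "((y2 feat1) =c +)")
```

Here `r1` has the target side `[DET ADJ N]` with alignments `(X1::Y1) (X2::Y2) (X3::Y3)`. Consistent renumbering turns the head-feature constraint into `(y3 = x3)`, since the noun is Y3 in the new rule.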

  39. 8. Refining Lexical Entries (Automatic Rule Adaptation)
      ADJ::ADJ |: [great] -> [grande]
      ((X1::Y1)
       ((x0 form) = great)
       ((y0 agr num) = sg)
       ((y0 agr gen) = masc)
       ((y0 feat1) = -))
      ADJ::ADJ |: [great] -> [gran]
      ((X1::Y1)
       ((x0 form) = great)
       ((y0 agr num) = sg)
       ((y0 agr gen) = masc)
       ((y0 feat1) = +))

  40. Done? Not yet (Automatic Rule Adaptation)
      NP,8 (R0): ADJ(grande) [feat1 = -]
      NP,8’ (R1) [feat1 =c +]: ADJ(gran) [feat1 = +]
      • Need to restrict application of general rule (R0) to just post-nominal ADJ
      Candidate translations: un artista grande; un artista gran; un gran artista; *un grande artista

  41. Add Blocking Constraint (Automatic Rule Adaptation)
      NP,8 (R0) [feat1 = -]: ADJ(grande) [feat1 = -]
      NP,8’ (R1) [feat1 =c +]: ADJ(gran) [feat1 = +]
      Can we also eliminate incorrect translations automatically?
      Candidate translations: un artista grande; *un artista gran; un gran artista; *un grande artista
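
The effect of the blocking constraint can be simulated with a small Python sketch. It assumes that `=c` requires the feature to be already present with the given value, while `=` behaves like unification and also succeeds when the feature is absent; this reading of the two operators, the toy lexicon and the function names are all assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

Constraint = Optional[Tuple[str, str, str]]  # (feature, operator, value) or None

# Toy lexicon carrying only the postulated binary feature.
LEXICON: Dict[str, Dict[str, str]] = {
    "grande": {"feat1": "-"},
    "gran": {"feat1": "+"},
}


def satisfies(features: Dict[str, str], constraint: Constraint) -> bool:
    """Check one adjective against a rule's constraint on feat1."""
    if constraint is None:
        return True
    feat, op, value = constraint
    if op == "=c":
        # Constraining equation: the value must already be there and match.
        return features.get(feat) == value
    # Plain "=": succeeds if the feature is absent or has the same value.
    return features.get(feat, value) == value


def generate(rules: List[Tuple[str, Constraint]],
             lexicon: Dict[str, Dict[str, str]] = LEXICON) -> List[str]:
    """List every noun phrase the given rules license for the toy lexicon."""
    out = []
    for template, constraint in rules:
        for adj, feats in lexicon.items():
            if satisfies(feats, constraint):
                out.append(template.format(adj=adj))
    return sorted(out)


# R0 is the general post-nominal rule, R1 the bifurcated pre-nominal rule.
before = [("un artista {adj}", None), ("un {adj} artista", ("feat1", "=c", "+"))]
after = [("un artista {adj}", ("feat1", "=", "-")),
         ("un {adj} artista", ("feat1", "=c", "+"))]
```

Under these assumptions, `generate(before)` still includes "un artista gran", whereas `generate(after)` licenses only "un artista grande" and "un gran artista".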

  42. Making the grammar tighter (Automatic Rule Adaptation)
      • If Wc = artista
        • Add [feat1 = +] to N(artista)
        • Add agreement constraint to NP,8 (R0) between N and ADJ: ((N feat1) = (ADJ feat1))
      Candidate translations: *un artista grande; *un artista gran; un gran artista; *un grande artista

  43. Batch Mode Implementation (Proposed Work, Automatic Rule Adaptation)
      • Given a set of user corrections, apply refinement module.
      • For Refinement Operations of errors that can be refined fully automatically:
        1. Just by using correction information
        2. Using correction and error information

  44. Rule Refinement Operations
      [Decision-tree figure. User actions: Modify, Add, Delete, Change W Order. Branch labels: +Wc / –Wc; +al / –al; Wi Wc vs. Wi(…)Wc; +rule / –rule; POSi = POSi’ vs. POSi ≠ POSi’. One leaf: RuleLearner]

  45. Rule Refinement Operations: 1. Correction info only
      [Decision-tree figure over the user actions Modify, Add, Delete, Change W Order]
      • Gaudi was a great artist – Gaudi era un artista grande → Gaudi era un gran artista
      • It is a nice house – Es una casa bonito → Es una casa bonita

  46. Rule Refinement Operations: 2. Correction and Error info
      [Decision-tree figure over the user actions Modify, Add, Delete, Change W Order]
      • PP → PREP NP
      • I am proud of you – Estoy orgullosa tu → Estoy orgullosa de ti

  47. Interactive Mode Implementation (Proposed Work, Automatic Rule Adaptation)
      • Extra error information is required to determine triggering context automatically → need to give other relevant sentences to the user at run-time (minimal pairs)
      • For Refinement Operations of errors that can be refined fully automatically but:
        3. require a reasonable amount of further user interaction and can be solved by available correction and error information.

  48. Rule Refinement Operations: Focus 3
      [Decision-tree figure over the user actions Modify, Add, Delete, Change W Order]
      • I see them – Veo los → Los veo

  49. Example Requiring Minimal Pair (Proposed Work, Automatic Rule Adaptation)
      1. Run SL sentence through the transfer engine: I see them → *veo los → los veo
      2. Wi = los but no Wi’ nor Wc
         • Need a minimal pair to determine appropriate refinement: I see cars → veo autos
      3. Triggering feature(s): δ(los, autos) = {pos}
         PRON(los) [pos = pron]; N(autos) [pos = n]

  50. Refining and Adding Constraints (Proposed Work)
      VP,3: VP NP → VP NP
      VP,3’: VP NP → NP VP + [NP pos =c pron]
      • Percolate triggering features up to the constituent level:
        NP: PRON → PRON + [NP pos = PRON pos]
      • Block application of general rule (VP,3):
        VP,3: VP NP → VP NP + [NP pos = (*NOT* pron)]
