First and Second Language Models to Correct Preposition Errors

Matthieu Hermet, Alain Désilets

National Research Council of Canada

Preposition Errors
  • A good case study:
    • High error rate
      • More than 17% of errors in our dataset
    • Instance of function-word errors, correctable using corpus-based methods
    • Instance of interference errors
Preposition Errors
  • 2 major causes:
  • Confusion with a preposition of the same semantic class

…à la conférence NAACL

…at the NAACL conference

…in the NAACL conference

  • Interference with L1

Écouter les intervenants

Listen to the speakers

Listen the speakers

  • Rule-based:
    • Mal-rules: cost of manual creation
    • Syntactic constraint relaxation: parser-dependent
  • Corpus-based:
    • Language models: low coverage
    • Web as a corpus: better coverage
      • Still not enough: less than 40% of our data set
  • Interference errors may be hard to address properly through corpus-based methods
    • They represent a model of L2 correctness

→ To deal with interference errors, it may be advantageous to use a model which takes L1 into account

Roundtrip MT
  • Carry out a single round-trip translation at the level of a clause or sentence
  • Use a phrase-based translation system

→ Google Translate

Roundtrip MT

L1 (en): “Police arrived at the scene of the crime”

L2 (fr): “Les policiers sont arrivés à la scène de la crime.”

Send to phrase-based translation system

To L1: Policemen arrived at the crime scene

Back to L2: Les policiers sont arrivés sur les lieux du crime

Les policiers sont arrivés à la scène du crime
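The round trip above can be sketched in a few lines. This is only an illustration: the `translate` callable and its `(text, source, target)` signature are hypothetical, and the stand-in translator below simply replays the sentences from the slide instead of calling a real MT system.

```python
from typing import Callable, Tuple

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]


def roundtrip(sentence: str, translate: Translator,
              l2: str = "fr", l1: str = "en") -> Tuple[str, str]:
    """Translate an L2 sentence into L1 and back; return (pivot, round-trip)."""
    pivot = translate(sentence, l2, l1)   # learner's French -> English
    back = translate(pivot, l1, l2)       # English -> French again
    return pivot, back


# Stand-in translator replaying the slide's example (not a real MT call).
_CANNED = {
    ("Les policiers sont arrivés à la scène de la crime.", "fr", "en"):
        "Policemen arrived at the crime scene",
    ("Policemen arrived at the crime scene", "en", "fr"):
        "Les policiers sont arrivés sur les lieux du crime",
}


def canned_translate(text: str, src: str, tgt: str) -> str:
    return _CANNED[(text, src, tgt)]


pivot, corrected = roundtrip(
    "Les policiers sont arrivés à la scène de la crime.", canned_translate
)
```

Passing the translator in as a parameter keeps the sketch independent of any particular MT service.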

  • The round-trip translated sentence can show
    • A wrong translation

N’hésitez pas de me contacter → s’il vous plait contactez moi

    • A correct translation that uses the wrong preposition

J’ai de la difficulté de formuler des phrases → je trouve difficile de formuler des phrases

    • A wrong translation that uses the correct preposition

[…] demandé à mon amie pour le corriger → […] demandé à mon amie de le fixer

  • Correctness can take at least two forms:
    • Correct translation
    • Wrong translation but correct preposition

Two strategies for evaluation:

    • Clause: the roundtrip translation is a good correction, including the preposition
    • Prep: only the preposition is correct in the roundtrip translation
  • In the Clause strategy, the RT translation is sent back as the correction
  • In the Prep strategy, we need a procedure to retrieve the preposition from the incorrect translation

→ Only the preposition is sent back as the correction

  • Greedy mining method to retrieve the preposition from the translation
  • Être proche à lui → être près de lui
  • The sequence match <prep à> lui == <prep de> lui validates the preposition de as a correction
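The slides do not spell out the mining procedure, so the following is one possible reading, offered as a sketch: look in the round-trip translation for a known preposition that is followed by the same words as the erroneous preposition in the learner's sentence, trying the longest matching context first. The function name, the use of right context only, and the small preposition inventory are all assumptions.

```python
from typing import List, Optional, Set

# Assumed inventory of French prepositions for the illustration.
PREPOSITIONS: Set[str] = {
    "à", "de", "sur", "avec", "par", "pour", "dans", "en", "chez", "sous", "vers",
}


def mine_preposition(original: List[str], prep_index: int,
                     translation: List[str],
                     prepositions: Set[str] = PREPOSITIONS) -> Optional[str]:
    """Return a preposition from `translation` whose right context matches
    the right context of the erroneous preposition in `original`."""
    right = original[prep_index + 1:]
    # Greedy: try the longest right context first, then shorten it.
    for size in range(len(right), 0, -1):
        context = right[:size]
        for i, token in enumerate(translation):
            if token in prepositions and translation[i + 1:i + 1 + size] == context:
                return token
    return None  # no preposition with a matching context was found


found = mine_preposition("être proche à lui".split(), 2, "être près de lui".split())
```

On the slide's example, the context `lui` after `à` matches the context after `de`, so `de` is proposed as the correction.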
  • An instance of a corpus-based approach
    • Web as a probabilistic language model
    • Strength of an utterance measured in number of search hits
  • In practice, the Web’s coverage is incomplete
    • Impossible to discriminate when zero hits are returned for all alternatives

→ Syntactic pruning to maximize chances of hits
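A minimal sketch of the hit-count comparison, assuming the counts have already been collected into a dictionary (the search call itself is left out, and the phrases and counts below are placeholders, not data from the study). It makes the zero-hit problem explicit: when every alternative returns zero hits, no decision is possible.

```python
from typing import Dict, Optional


def best_by_hits(hits: Dict[str, int]) -> Optional[str]:
    """Return the phrase with the most search hits, or None when
    no alternative received any hit (no basis to discriminate)."""
    if not hits or max(hits.values()) == 0:
        return None
    return max(hits, key=hits.get)


# Placeholder counts for illustration only.
decided = best_by_hits({"phrase A": 3, "phrase B": 0})
undecided = best_by_hits({"phrase A": 0, "phrase B": 0})
```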

Pruning 1
  • Sentence is parsed and reduced to a phrasal minimum around the preposition
    • S → VP or NP (or AP)

I have lived in a small town all my life → lived in a small town

I’ll get a chance to meet people → a chance to meet

  • Words are lemmatized
    • Verbs to Infinitive
    • Nouns to singular
Pruning 2
  • Suppress unnecessary words
    • Adj, when attributive:

To live in a small town → To live in a town

This is easy to understand → easy to understand

    • Adv, in all cases

Call immediately for help → call for help

    • NP or PP

Une fenêtre qui permet au soleil d’entrer

… qui permet d’entrer

… au soleil d’entrer
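The word-suppression rules for adjectives and adverbs can be sketched as a filter over tagged tokens. This is a simplification under stated assumptions: the tokens are tagged by hand here, the `Token` type and the tag names are hypothetical, and the NP/PP splitting step is not covered.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    text: str
    pos: str                   # coarse tag such as "VERB", "ADJ", "ADV" (assumed tag set)
    attributive: bool = False  # True for an adjective that directly modifies a noun


def suppress_unnecessary(tokens: List[Token]) -> List[str]:
    """Drop adverbs in all cases and adjectives only when attributive."""
    kept = []
    for tok in tokens:
        if tok.pos == "ADV":
            continue
        if tok.pos == "ADJ" and tok.attributive:
            continue
        kept.append(tok.text)
    return kept


small_town = suppress_unnecessary([
    Token("to", "PART"), Token("live", "VERB"), Token("in", "ADP"),
    Token("a", "DET"), Token("small", "ADJ", attributive=True), Token("town", "NOUN"),
])
call_help = suppress_unnecessary([
    Token("call", "VERB"), Token("immediately", "ADV"),
    Token("for", "ADP"), Token("help", "NOUN"),
])
easy = suppress_unnecessary([
    Token("easy", "ADJ"), Token("to", "PART"), Token("understand", "VERB"),
])
```

The predicative adjective in "easy to understand" is kept because the preposition depends on it, while the attributive "small" is dropped.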

Alternate Prepositions
  • Once pruned, replace the erroneous preposition by alternates
    • Most common prepositions
      • De, sur, avec, par, pour, à
    • Prepositions of the same semantic class
      • Localization, temporal, cause, goal, manner, material, possession
  • 1 input sentence = as many sentences as there are alternate prepositions
  • Input Sentence

Il y a une grande fenêtre qui permet au soleil <à> entrer

(there is a large window which lets the sun come in)

  • Syntactic Pruning and Lemmatization

permettre <à> entrer + au soleil <à> entrer

(let come in) (the sun come in)

  • Generation of alternate prepositions
    • semantically related: dans, en, chez, sur, sous, au, dans, après, avant, en, vers
    • most common: de, avec, par, pour
  • Query and sort alternative phrases

permettre d'entrer: 119 000 hits
permettre avant entrer: 12 hits
permettre à entrer: 4 hits
permettre en entrer: 2 hits

au soleil d’entrer: 397 hits
au soleil avant entrer: 0 hits
…


  • → preposition <d'> is returned as correction
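The worked example can be replayed as a short sketch: fill the preposition slot of the pruned phrase with each alternate, look up the hit count, and sort. The hit counts are the ones quoted on the slide for the "permettre … entrer" phrase; the helper names and the simple elision rule (de becomes d' before a vowel) are assumptions made for the illustration.

```python
from typing import Dict, List

VOWELS = set("aeiouyàâéèêëîïôûù")

# Hit counts quoted on the slide (replayed here, not live queries).
SLIDE_HITS: Dict[str, int] = {
    "permettre d'entrer": 119000,
    "permettre avant entrer": 12,
    "permettre à entrer": 4,
    "permettre en entrer": 2,
}


def fill_slot(before: str, after: str, prep: str) -> str:
    """Build a query phrase, eliding 'de' to d' before a vowel."""
    if prep == "de" and after[:1].lower() in VOWELS:
        return f"{before} d'{after}"
    return f"{before} {prep} {after}"


candidates: List[str] = ["à", "en", "avant", "de"]
phrases = {prep: fill_slot("permettre", "entrer", prep) for prep in candidates}

# Sort alternates by hit count, highest first; unknown phrases count as 0.
ranked = sorted(candidates, key=lambda p: SLIDE_HITS.get(phrases[p], 0), reverse=True)
best = ranked[0]
```

With these counts the top-ranked alternate is `de`, in line with the correction reported on the slide.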
  • Dataset: 133 sentences extracted from intermediate-advanced French as a second language (FSL) productions
  • The unilingual approach returns hits in only ~85% of cases
    • Impact of L1 on L2 inputs
    • Incompleteness of the Web as a language model
  • Agreement between the two strategies is only 65.4%
  • A third strategy to combine the two models
    • MT as a model of controlled incorrectness (here, anglicisms)
    • Web as a model of correctness
  • Triggered when the unilingual approach does not give any hits

→ Then send to roundtrip MT (Prep strategy)

  • Yields results of 82%
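The combined strategy amounts to a fallback: use the Web-based (unilingual) answer when it returns hits, and fall back to the roundtrip MT Prep strategy otherwise. A minimal sketch, assuming both components are available as callables; the two parameter names and their signatures are hypothetical, and the stubs in the usage lines return placeholder values.

```python
from typing import Callable, Dict, Optional


def hybrid_correct(sentence: str,
                   unilingual_hits: Callable[[str], Dict[str, int]],
                   roundtrip_prep: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return a corrected preposition for `sentence`.

    unilingual_hits maps each alternate preposition to its hit count;
    roundtrip_prep returns the preposition mined from a round-trip translation.
    """
    hits = unilingual_hits(sentence)
    if hits and max(hits.values()) > 0:
        return max(hits, key=hits.get)    # Web as a model of correctness
    return roundtrip_prep(sentence)       # triggered only when there are no hits


# Stub components with placeholder values, for illustration only.
from_web = hybrid_correct("s", lambda s: {"de": 5, "à": 1}, lambda s: "pour")
from_mt = hybrid_correct("s", lambda s: {"de": 0, "à": 0}, lambda s: "pour")
```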
Conclusion and Future Work
  • Unilingual and roundtrip MT approaches perform equivalently
  • Hybrid approach seems relevant due to the different paradigms of the two approaches
  • More Data
  • Enhance pruning
  • Study in the context of error detection
  • Extend MT approach to other error classes