First and Second Language Models to Correct Preposition Errors

Matthieu Hermet, Alain Désilets

National Research Council of Canada

Preposition Errors
  • A good case study:
    • High error rate
      • More than 17% of errors in our dataset
    • Instance of function-word errors, correctable using corpus-based methods
    • Instance of interference errors
Preposition Errors
  • 2 major causes:
    • Confusion with a preposition of the same semantic class

…à la conférence NAACL

…at the NAACL conference

…in the NAACL conference

    • Interference with L1

Écouter les intervenants

Listen to the speakers

Listen the speakers

Approaches
  • Rule-based:
    • Mal-rules: cost of manual creation
    • Syntactic constraint relaxation: parser-dependent
  • Corpus-based:
    • Language models: low coverage
    • Web as a corpus: better coverage
      • Still not enough: less than 40% of our data set
Approach
  • Interference errors may be hard to address properly through corpus-based methods
    • Corpus-based methods represent a model of L2 correctness

→ To deal with interference errors, it may be advantageous to use a model which takes L1 into account

Roundtrip MT
  • Carry out a single round-trip translation at the level of a clause or sentence
  • Use a phrase-based translation system

→ Google Translate

Roundtrip MT

L1 (en): “Police arrived at the scene of the crime”

L2 (fr): “Les policiers sont arrivés à la scène de la crime.”

Send to phrase-based translation system

To L1: Policemen arrived at the crime scene

Back to L2: Les policiers sont arrivés sur les lieux du crime

Theory

Les policiers sont arrivés à la scène du crime

Drawback
  • The round-trip translated sentence can show
    • A wrong translation

N’hésitez pas de me contacter → s’il vous plait contactez moi

    • A correct translation that uses the wrong preposition

J’ai de la difficulté de formuler des phrases → je trouve difficile de formuler des phrases

    • A wrong translation that uses the correct preposition

[…] demandé à mon amie pour le corriger → […] demandé à mon amie de le fixer

Assessment
  • Correctness can take at least two forms:
    • Correct translation
    • Wrong translation but correct preposition
  • Two strategies for evaluation:
    • Clause: the roundtrip translation is a good correction, including the preposition
    • Prep: only the preposition is correct in the roundtrip translation
Assessment
  • In the Clause strategy, the RT translation is sent back as the correction
  • In the Prep strategy, we need a procedure to retrieve the preposition from the incorrect translation

→ Only the preposition is sent back as the correction

Prep
  • Greedy mining method to retrieve the preposition from the translation
  • Être proche à lui → être près de lui
  • Matching the sequences <prep à> lui == <prep de> lui validates the preposition de as a correction
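One way to read the greedy mining step is sketched below. This is an assumption about how the matching might work, not the authors' procedure: it looks in the translation for a preposition followed by the same words that follow the erroneous preposition in the learner's phrase, trying the longest context first. The `PREPOSITIONS` inventory and the `max_context` parameter are illustrative choices.

```python
from typing import Optional, Sequence

# Assumed, incomplete inventory of French prepositions, for illustration only.
PREPOSITIONS = {"à", "de", "sur", "avec", "par", "pour",
                "dans", "en", "chez", "sous", "vers"}


def mine_preposition(source: Sequence[str], prep_index: int,
                     translation: Sequence[str],
                     max_context: int = 3) -> Optional[str]:
    """Return the preposition in `translation` that precedes the same words
    that follow the erroneous preposition in `source`, or None if no match."""
    start = prep_index + 1
    right = [w.lower() for w in source[start:start + max_context]]
    target = [w.lower() for w in translation]
    # Greedy: try the longest right-hand context first, then shrink it.
    for size in range(len(right), 0, -1):
        context = right[:size]
        for i, word in enumerate(target):
            if word in PREPOSITIONS and target[i + 1:i + 1 + size] == context:
                return word
    return None


source = "être proche à lui".split()
translation = "être près de lui".split()
suggestion = mine_preposition(source, 2, translation)
print(suggestion)  # de
```

Here the shared right-hand context is the single word "lui", so "de" is proposed as the replacement for "à".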
Unilingual
  • An instance of a corpus-based approach
    • Web as a probabilistic language model
    • Strength of an utterance measured in number of search hits
  • In practice, the Web’s coverage is incomplete
    • Impossible to discriminate when zero hits are returned for all alternatives

→ Syntactic pruning to maximize chances of hits

Pruning 1
  • Sentence is parsed and reduced to a phrasal minimum around the preposition
    • S → VP or NP (or AP)

I have lived in a small town all my life → lived in a small town

I’ll get a chance to meet people → a chance to meet

  • Words are lemmatized
    • Verbs to infinitive
    • Nouns to singular
Pruning 2
  • Suppress unnecessary words
    • Adj, when attributive:

To live in a small town → To live in a town

This is easy to understand → easy to understand

    • Adv, in all cases

Call immediately for help → call for help

    • NP or PP

Une fenêtre qui permet au soleil d’entrer

… qui permet d’entrer

… au soleil d’entrer
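The adjective and adverb rules above can be sketched over part-of-speech-tagged tokens. This is a simplified illustration under stated assumptions: the tags are supplied by hand rather than by a parser, "attributive" is approximated as "adjective directly before a noun" (which fits the English examples but not post-nominal French adjectives), and the NP/PP suppression and lemmatization steps are left out.

```python
from typing import List, Tuple

# (word, coarse POS tag); the tags are assumed to come from a tagger or parser.
Token = Tuple[str, str]


def prune(tokens: List[Token]) -> List[str]:
    """Drop adverbs everywhere and adjectives that directly precede a noun."""
    kept = []
    for i, (word, tag) in enumerate(tokens):
        if tag == "ADV":
            continue  # adverbs are suppressed in all cases
        next_tag = tokens[i + 1][1] if i + 1 < len(tokens) else None
        if tag == "ADJ" and next_tag == "NOUN":
            continue  # attributive adjective
        kept.append(word)
    return kept


call = [("call", "VERB"), ("immediately", "ADV"), ("for", "ADP"), ("help", "NOUN")]
town = [("to", "PART"), ("live", "VERB"), ("in", "ADP"),
        ("a", "DET"), ("small", "ADJ"), ("town", "NOUN")]
easy = [("easy", "ADJ"), ("to", "PART"), ("understand", "VERB")]

print(" ".join(prune(call)))
print(" ".join(prune(town)))
print(" ".join(prune(easy)))
```

The predicative adjective in "easy to understand" is kept because no noun follows it.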

Alternate prepositions
  • Once pruned, replace the erroneous preposition by alternates
    • Most common prepositions
      • De, sur, avec, par, pour, à
    • Prepositions of the same semantic class
      • Localization, temporal, cause, goal, manner, material, possession
  • 1 input sentence = as many sentences as there are alternate prepositions
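The substitution step can be sketched as below, using the most common prepositions listed on the slide. The function name and the token-list representation are illustrative assumptions, and French elision (de + entrer → d'entrer) is deliberately not handled.

```python
from typing import List

# The "most common prepositions" list given on the slide.
MOST_COMMON = ["de", "sur", "avec", "par", "pour", "à"]


def alternates(phrase: List[str], prep_index: int,
               candidates: List[str]) -> List[str]:
    """Build one candidate phrase per alternate preposition,
    skipping the preposition the learner originally used."""
    original = phrase[prep_index]
    variants = []
    for prep in candidates:
        if prep == original:
            continue
        words = phrase[:prep_index] + [prep] + phrase[prep_index + 1:]
        variants.append(" ".join(words))
    return variants


variants = alternates(["permettre", "à", "entrer"], 1, MOST_COMMON)
for v in variants:
    print(v)
```

Each variant would then be submitted as a separate query; prepositions of the same semantic class could be appended to `candidates` in the same way.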
Unilingual
  • Input Sentence

Il y a une grande fenêtre qui permet au soleil <à> entrer

(there is a large window which lets the sun come in)

  • Syntactic Pruning and Lemmatization

permettre <à> entrer + au soleil <à> entrer

(let come in) (the sun come in)

  • Generation of alternate prepositions
    • semantically related: dans, en, chez, sur, sous, au, dans, après, avant, en, vers
    • most common: de, avec, par, pour
  • Query and sort alternative phrases

permettre d'entrer: 119 000 hits        au soleil d’entrer: 397 hits
permettre avant entrer: 12 hits         au soleil avant entrer: 0 hits
permettre à entrer: 4 hits              …
permettre en entrer: 2 hits
...

  • → preposition <d'> is returned as correction
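The query-and-sort step can be sketched as follows. The `hit_count` callable is a hypothetical stand-in for a search-engine query; here it is backed by the hit counts shown on the slide for the "permettre" column only. How the two pruned phrases are combined is not specified in the slides, so this sketch ranks a single phrase.

```python
from typing import Callable, Dict, Optional


def best_preposition(candidates: Dict[str, str],
                     hit_count: Callable[[str], int]) -> Optional[str]:
    """`candidates` maps each preposition to its query phrase. Return the
    preposition whose phrase gets the most hits, or None when every phrase
    gets zero hits (no way to discriminate)."""
    scored = {prep: hit_count(phrase) for prep, phrase in candidates.items()}
    best = max(scored, key=scored.get, default=None)
    if best is None or scored[best] == 0:
        return None
    return best


# Hit counts copied from the slide; a real system would query the Web.
SLIDE_HITS = {
    "permettre d'entrer": 119000,
    "permettre avant entrer": 12,
    "permettre à entrer": 4,
    "permettre en entrer": 2,
}
candidates = {
    "d'": "permettre d'entrer",
    "avant": "permettre avant entrer",
    "à": "permettre à entrer",
    "en": "permettre en entrer",
}

choice = best_preposition(candidates, lambda q: SLIDE_HITS.get(q, 0))
print(choice)  # d'
```

Returning `None` when all counts are zero makes the "impossible to discriminate" case explicit for the caller.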
Results
  • Dataset: 133 sentences extracted from intermediate-advanced FSL (French as a Second Language) productions
  • Unilingual returns hits in only ~85% of cases
    • Impact of L1 on L2 inputs
    • Incompleteness of the Web as a language model
Hybrid
  • Agreement between the two strategies is only 65.4%
  • A third strategy to combine the two models
    • MT as a model of controlled incorrectness (here, anglicisms)
    • Web as a model of correctness
Hybrid
  • Triggered when the unilingual approach does not give any hits

→ Then send to roundtrip MT (Prep strategy)

  • Yields results of 82%
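The fallback logic of the hybrid strategy can be sketched as below. Both `unilingual` and `roundtrip_prep` are hypothetical callables assumed to return a suggested preposition, or `None` when they have nothing to propose; the lambdas are stubs used purely for illustration.

```python
from typing import Callable, Optional

# A corrector takes a sentence and returns a preposition, or None.
Corrector = Callable[[str], Optional[str]]


def hybrid_correction(sentence: str, unilingual: Corrector,
                      roundtrip_prep: Corrector) -> Optional[str]:
    """Use the Web-based suggestion when there is one; otherwise fall back
    to the preposition mined from the round-trip translation."""
    suggestion = unilingual(sentence)
    if suggestion is not None:
        return suggestion
    return roundtrip_prep(sentence)


# Stub correctors standing in for the two real components.
no_hits = lambda s: None
web_says_a = lambda s: "à"
mt_says_de = lambda s: "de"

print(hybrid_correction("être proche à lui", no_hits, mt_says_de))
print(hybrid_correction("être proche à lui", web_says_a, mt_says_de))
```

Round-trip MT is consulted only when the unilingual approach returns nothing, which mirrors the trigger condition on the slide.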
Conclusion and Future Work
  • Unilingual and roundtrip MT approaches perform equivalently
  • Hybrid approach seems relevant due to the different paradigms of the two approaches
  • More data
  • Enhance pruning
  • Study in the context of error detection
  • Extend MT approach to other error classes