
Evaluating the Waspbench

A Lexicography Tool Incorporating Word Sense Disambiguation

Rob Koeling, Adam Kilgarriff, David Tugwell, Roger Evans

ITRI, University of Brighton

Credits: UK EPSRC grant WASPS, M34971



Lexicographers need NLP



NLP needs lexicography



Word senses: nowhere truer

  • Lexicography

    • the second hardest part

  • NLP

    • Word sense disambiguation (WSD)

      • SENSEVAL-1 (1998): 77%, Hector sense inventory

      • SENSEVAL-2 (2001): 64%, WordNet senses

    • Machine Translation

      • Main cost is lexicography



Synergy

The WASPBENCH



Inputs and outputs

  • Inputs

    • Corpus (processed)

    • Lexicographic expertise



  • Outputs

    • Analysis of meaning/translation repertoire

    • Implemented:

      • Word expert

      • Can disambiguate

        A “disambiguating dictionary”
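The presentation does not show what a word expert looks like internally; as a minimal sketch, one can picture a "disambiguating dictionary" entry as a set of weighted context clues per sense. The `WordExpert` class, the clue words, and the weights below are all invented for illustration, not taken from the Waspbench.

```python
# Hypothetical sketch of a "disambiguating dictionary" entry: a word
# expert that maps context clues to senses. The class name, clues and
# weights are invented; the real Waspbench format is not shown here.
from collections import defaultdict

class WordExpert:
    def __init__(self, word):
        self.word = word
        self.clues = defaultdict(dict)  # sense -> {context word: weight}

    def add_clue(self, sense, context_word, weight=1.0):
        self.clues[sense][context_word] = weight

    def disambiguate(self, context_words):
        """Pick the sense whose clues best match the given context."""
        scores = {
            sense: sum(w for cw, w in clue_map.items() if cw in context_words)
            for sense, clue_map in self.clues.items()
        }
        return max(scores, key=scores.get)

# A lexicographer-built expert for "bank" (illustrative clues only)
expert = WordExpert("bank")
expert.add_clue("FINANCE", "account", 2.0)
expert.add_clue("FINANCE", "loan", 1.5)
expert.add_clue("RIVER", "river", 2.0)
expert.add_clue("RIVER", "water", 1.0)

print(expert.disambiguate({"walked", "along", "the", "river"}))  # RIVER
```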



MT needs rules of the form: in context C, S => T

  • Major determinant of MT quality

  • Manual production: expensive

  • Eng oil => Fr huile or pétrole?

    • SYSTRAN: 400 rules

      Waspbench output: thousands of rules
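As a concrete illustration of rules of the form "in context C, S => T", here is a minimal sketch for the oil example; the trigger words and the fallback are invented for illustration, not taken from SYSTRAN or from Waspbench output.

```python
# Minimal sketch of "in context C, S => T" transfer rules for English
# "oil". Trigger words and the fallback default are invented examples.
RULES = [
    # (source word, context trigger words, target translation)
    ("oil", {"olive", "cooking", "salad"}, "huile"),
    ("oil", {"crude", "barrel", "drilling"}, "pétrole"),
]
FALLBACK = {"oil": "huile"}  # used when no rule's context matches

def translate(word, context):
    for source, triggers, target in RULES:
        if source == word and triggers & context:
            return target
    return FALLBACK.get(word, word)

print(translate("oil", {"a", "barrel", "of", "crude"}))  # pétrole
print(translate("oil", {"olive", "and", "vinegar"}))     # huile
```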


Evaluation

Evaluation is hard:

  • Three communities

  • No precedents

  • The art and craft of lexicography

  • MT personpower budgets



Five threads

  • as WSD: SENSEVAL

  • for lexicography: MED (Macmillan English Dictionary)

  • expert reports

  • Quantitative experiments with human subjects

    • India

      • Within-group consistency

    • Leeds

      • Comparison with commercial MT


Method

  • Human1

    creates word experts, average 30 mins/word

  • Computer

    uses word experts to disambiguate test instances

  • MT system: Babelfish via AltaVista

    translates same test instances

  • Human2

    • evaluates computer and MT performance on each instance:

    • good / bad / unsure / preferred / alternative
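The results slides report percentages, presumably aggregated from these per-instance judgments. The exact scoring scheme is not given in the transcript; the tally below is a sketch under that assumption, with made-up judgment data.

```python
# Sketch: turn Human2's per-instance judgments into percentages.
# The five categories come from the slide; the scoring scheme and the
# sample data are assumptions for illustration.
from collections import Counter

CATEGORIES = ["good", "bad", "unsure", "preferred", "alternative"]

def tally(judgments):
    """Return each category's share of the judgments, in percent."""
    counts = Counter(judgments)
    total = len(judgments)
    return {cat: round(100 * counts[cat] / total, 1) for cat in CATEGORIES}

# Hypothetical judgments for one word's ~40 test instances
judgments = ["good"] * 28 + ["bad"] * 6 + ["unsure"] * 6
print(tally(judgments))  # {'good': 70.0, 'bad': 15.0, 'unsure': 15.0, ...}
```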



Words

  • mid-frequency

    • 1,500-20,000 instances in BNC

  • At least two clearly distinct meanings

    • Checked with reference to translations into French/German/Dutch

  • 33 words

    • 16 nouns, 10 verbs, 7 adjectives

  • around 40 test instances per word
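The mid-frequency criterion is mechanical enough to sketch in code; the frequency figures below are invented stand-ins, since a real run would read an actual BNC frequency list.

```python
# Sketch of the mid-frequency filter: keep words with 1,500-20,000
# occurrences in the BNC. Frequencies here are invented stand-ins.
bnc_freq = {"bank": 18000, "oil": 9500, "the": 6000000, "zygote": 40}

candidates = [w for w, n in bnc_freq.items() if 1500 <= n <= 20000]
print(candidates)  # ['bank', 'oil'] -- still needs the distinct-senses check
```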





Human subjects

  • Translation studies students, University of Leeds

    • Thanks: Tony Hartley

  • Native/near-native in English and their other language

  • twelve people, working with:

    • Chinese (4), French (3), German (2), Italian (1), Japanese (2) (no MT system for Japanese)

  • circa four days’ work:

    • introduction/training

    • two days to create word experts

    • two days to evaluate output




Results (%)

[Results table not reproduced in the transcript]

Results by POS (%)

[Results-by-POS table not reproduced in the transcript]



Observations

  • Grad student users, 4-hour training

  • 30 mins per (not-too-complex) word

  • ‘fuzzy’ words intrinsically harder

  • No great inter-subject disparities

    • (it’s the words that vary, not the people)



Conclusion

  • WSD can improve MT

    (using a tool like WASPS)



Future work

  • multiwords

  • n>2

  • thesaurus

  • other source languages

  • new corpora, bigger corpora

    • the web

