Evaluating the Waspbench

A Lexicography Tool Incorporating Word Sense Disambiguation

Rob Koeling, Adam Kilgarriff,

David Tugwell, Roger Evans

ITRI, University of Brighton

Credits: UK EPSRC grant WASPS, M34971

Word senses: nowhere truer
  • Lexicography
    • the second hardest part
  • NLP
    • Word sense disambiguation (WSD)
      • SENSEVAL-1 (1998): 77% accuracy, Hector sense inventory
      • SENSEVAL-2 (2001): 64% accuracy, WordNet sense inventory
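The SENSEVAL figures above are instance-level accuracy: the fraction of test instances whose predicted sense matches the gold-standard sense. A minimal sketch, with invented sense labels:

```python
# Sketch of SENSEVAL-style WSD accuracy: share of test instances where
# the system's sense tag matches the gold-standard tag. Data is invented.

def wsd_accuracy(gold, predicted):
    """Fraction of instances tagged with the gold-standard sense."""
    assert len(gold) == len(predicted)
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

gold = ["bank/river", "bank/money", "bank/money"]
pred = ["bank/river", "bank/money", "bank/river"]
print(round(wsd_accuracy(gold, pred), 2))  # 0.67
```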
    • Machine Translation
      • Main cost is lexicography

Synergy

The WASPBENCH

Inputs and outputs
  • Inputs
    • Corpus (processed)
    • Lexicographic expertise
Inputs and outputs
  • Outputs
    • Analysis of meaning/translation repertoire
    • Implemented:
      • Word expert
      • Can disambiguate

A “disambiguating dictionary”

Inputs and outputs

MT needs rules of form

in context C, S => T

  • Major determinant of MT quality
  • Manual production: expensive
  • Eng oil => Fr huile or pétrole?
    • SYSTRAN: 400 rules

Waspbench output: thousands of rules
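The rule format "in context C, S => T" can be sketched as a small lookup driven by context trigger words, using the oil example from the slide. The trigger words and fallback are illustrative assumptions, not rules from SYSTRAN or the Waspbench:

```python
# Minimal sketch of "in context C, S => T" translation rules, using the
# slide's Eng oil => Fr huile/pétrole example. Trigger-word sets and the
# default translation are invented for illustration.

RULES = [
    # (source word, context trigger words, target translation)
    ("oil", {"olive", "cooking", "vegetable", "sunflower"}, "huile"),
    ("oil", {"crude", "barrel", "drilling", "pipeline"}, "pétrole"),
]

DEFAULT = {"oil": "huile"}  # fallback when no rule's context matches

def translate_word(word, context_words):
    """Apply the first rule whose context triggers appear around `word`."""
    for source, triggers, target in RULES:
        if source == word and triggers & set(context_words):
            return target
    return DEFAULT.get(word, word)

print(translate_word("oil", ["a", "barrel", "of", "crude"]))  # pétrole
print(translate_word("oil", ["olive", "dressing"]))           # huile
```

A hand-built system needs hundreds of such rules per ambiguous word pair; the point of the Waspbench is to induce them semi-automatically from corpus evidence.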

Evaluation

hard

  • Three communities
  • No precedents
  • The art and craft of lexicography
  • MT personpower budgets
Five threads
  • as WSD: SENSEVAL
  • for lexicography: MED
  • expert reports
  • Quantitative experiments with human subjects
    • India
      • Within-group consistency
    • Leeds
      • Comparison with commercial MT
Words
  • mid-frequency
    • 1,500-20,000 instances in BNC
  • At least two clearly distinct meanings
    • Checked with ref to translations into Fr/Ger/Dutch
  • 33 words
    • 16 nouns, 10 verbs, 7 adjs
  • around 40 test instances per word
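The selection criterion above, mid-frequency words with 1,500–20,000 BNC instances, amounts to a simple band-pass filter over a frequency table. A sketch with an invented frequency table:

```python
# Sketch of the word-selection step: keep mid-frequency words with
# 1,500-20,000 instances in the BNC. Frequencies here are invented.

def mid_frequency(freq_table, lo=1500, hi=20000):
    """Return words whose corpus frequency falls in [lo, hi], sorted."""
    return sorted(w for w, n in freq_table.items() if lo <= n <= hi)

bnc_freq = {"the": 6_000_000, "seal": 3_400, "bank": 18_500, "sagacity": 120}
print(mid_frequency(bnc_freq))  # ['bank', 'seal']
```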
Human subjects
  • Translation studies students, Univ Leeds
    • Thanks: Tony Hartley
  • Native/near-native in English and their other language
  • twelve people, working with:
    • Chinese (4), French (3), German (2), Italian (1), Japanese (2) (no MT system available for Japanese)
  • circa four days’ work:
    • introduction/training
    • two days to create word experts
    • two days to evaluate output
Method
  • Human1

creates word experts, average 30 mins/word

  • Computer

uses word experts to disambiguate test instances

  • MT system: Babelfish via Altavista

translates same test instances

  • Human2
    • evaluates computer and MT performance on each instance:
    • good / bad / unsure / preferred / alternative
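The judging step above can be sketched as a tally over Human2's five labels, summarising each system by the share of instances judged good or preferred. The label data and the scoring choice are illustrative assumptions, not figures from the study:

```python
# Sketch of the Human2 evaluation step: each system's output on each test
# instance gets one of five labels; we summarise a system by the share of
# instances judged "good" or "preferred". Judgement lists are invented.

from collections import Counter

LABELS = {"good", "bad", "unsure", "preferred", "alternative"}

def score(judgements):
    """Fraction of instances judged good or preferred."""
    counts = Counter(judgements)
    unknown = set(counts) - LABELS
    if unknown:
        raise ValueError(f"unexpected labels: {unknown}")
    return (counts["good"] + counts["preferred"]) / len(judgements)

waspbench = ["good", "good", "preferred", "bad", "unsure"]
babelfish = ["good", "bad", "bad", "alternative", "good"]
print(score(waspbench), score(babelfish))  # 0.6 0.4
```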
Observations
  • Grad student users, 4-hour training
  • 30 mins per (not-too-complex) word
  • ‘fuzzy’ words intrinsically harder
  • No great inter-subject disparities
    • (it’s the words that vary, not the people)
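The "no great inter-subject disparities" observation corresponds to a within-group consistency check: pairwise agreement between subjects who judged the same instances. A minimal sketch with invented judgements:

```python
# Sketch of a within-group consistency check: mean pairwise agreement
# between subjects judging the same test instances. Data is invented.

from itertools import combinations

def pairwise_agreement(judgements_by_subject):
    """Mean fraction of instances on which each pair of subjects agrees."""
    rates = []
    for a, b in combinations(judgements_by_subject, 2):
        same = sum(x == y for x, y in zip(a, b))
        rates.append(same / len(a))
    return sum(rates) / len(rates)

subjects = [
    ["good", "bad", "good", "good"],
    ["good", "bad", "unsure", "good"],
    ["good", "good", "good", "good"],
]
print(round(pairwise_agreement(subjects), 2))  # 0.67
```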
Conclusion
  • WSD can improve MT

(using a tool like WASPS)

Future work
  • multiwords
  • n>2
  • thesaurus
  • other source languages
  • new corpora, bigger corpora
    • the web