A Method for Word Sense Disambiguation on Unrestricted Text

A Method for WSD on Unrestricted Text
Authors: Rada Mihalcea and Dan Moldovan
Presenter: Marian Olteanu

Introduction
• WSD methods:
  • Information in MRDs (machine-readable dictionaries)
  • Supervised training (information from a disambiguated corpus)
  • Unsupervised training (information from a raw corpus)
  • Hybrid methods

Approach
• Unsupervised learning
• Tag all content words (nouns, verbs, adjectives, adverbs)
• Use the Web as a corpus (AltaVista search engine)
• Use semantic density (computed with WordNet)

Algorithm
• Use word pairs (each word serves as context for the other)
• Verb-noun pairs (syntactically linked)
• E.g.: investigate report
• Noun senses of report: {report#1, study}, {report#2, news report, story, account, write up}

Algorithm (cont.)
• Search for “investigate report” and “investigate study” – first sense
• Search for “investigate report”, “investigate news report”, …, “investigate write up” – second sense
• Rank the senses by hit count

Algorithm (cont.)
• Repeat for verbs
• Query with both exact phrases and the NEAR operator (similar results)
• Keep the top 4 senses for nouns and verbs, the top 2 for adjectives and adverbs
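Step 1 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `rank_senses`, `toy_hit_count`, and the counts in `TOY_COUNTS` are hypothetical names and made-up numbers standing in for real search-engine queries, and summing per-phrase counts is an assumption about how the hits for one sense are combined.

```python
from typing import Callable, Dict, List, Tuple

# Cutoffs from the slide: top 4 senses for nouns and verbs,
# top 2 for adjectives (J) and adverbs (R).
TOP_K = {"N": 4, "V": 4, "J": 2, "R": 2}


def rank_senses(
    context_word: str,
    senses: Dict[int, List[str]],
    hit_count: Callable[[str], int],
    pos: str,
) -> List[Tuple[int, int]]:
    """Rank senses by the summed hit count of '<context_word> <synonym>' phrases."""
    totals = [
        (sense, sum(hit_count(f"{context_word} {syn}") for syn in synonyms))
        for sense, synonyms in senses.items()
    ]
    totals.sort(key=lambda pair: pair[1], reverse=True)
    return totals[: TOP_K[pos]]


# Made-up counts for illustration only (assumption, not measured data).
TOY_COUNTS = {
    "investigate report": 120,
    "investigate study": 80,
    "investigate news report": 5,
    "investigate story": 30,
    "investigate account": 10,
}


def toy_hit_count(query: str) -> int:
    """Stand-in for a search-engine hit count; unknown phrases count as 0."""
    return TOY_COUNTS.get(query, 0)


report_senses = {
    1: ["report", "study"],
    2: ["report", "news report", "story", "account", "write up"],
}
ranking = rank_senses("investigate", report_senses, toy_hit_count, pos="N")
print(ranking)
```

With these toy counts, sense 1 is ranked ahead of sense 2. A plain sum favors senses with many synonyms, so a real system would need to decide how to combine the counts for one sense.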

Algorithm – step 2
• Compute conceptual density
• Apply only to noun-verb pairs (WordNet lacks adequate hierarchies for adjectives and adverbs)
• Compute it between the senses selected in step 1
• Count the matches between the nouns in the verb's sub-glosses and the noun together with all of its hyponyms
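The matching count in step 2 can be illustrated with a small sketch. It covers only the raw overlap described above and applies no normalization, since the density formula itself is not reproduced in this transcript. The function names and the toy word sets are hypothetical; a real system would extract them from WordNet glosses and hyponym hierarchies.

```python
from typing import Dict, Set, Tuple


def common_concept_count(verb_gloss_nouns: Set[str], noun_hierarchy: Set[str]) -> int:
    """Count the nouns shared by a verb sense's sub-glosses and a noun sense's hierarchy."""
    return len(verb_gloss_nouns & noun_hierarchy)


def score_pairs(
    verb_glosses: Dict[int, Set[str]],
    noun_hierarchies: Dict[int, Set[str]],
) -> Dict[Tuple[int, int], int]:
    """Score every (verb sense, noun sense) pair by its raw overlap count."""
    return {
        (v, n): common_concept_count(gloss, hierarchy)
        for v, gloss in verb_glosses.items()
        for n, hierarchy in noun_hierarchies.items()
    }


# Toy data for the pair "revise law" (made up for illustration, not taken from WordNet).
verb_glosses = {
    1: {"law", "statute", "text"},  # nouns in the sub-glosses of verb sense 1
    2: {"lesson"},                  # nouns in the sub-glosses of verb sense 2
}
noun_hierarchies = {
    1: {"law", "statute", "civil law"},  # noun sense 1 plus its hyponyms
    2: {"law", "principle"},             # noun sense 2 plus its hyponyms
}

scores = score_pairs(verb_glosses, noun_hierarchies)
best_pair = max(scores, key=scores.get)
print(best_pair, scores[best_pair])
```

Here the pair (verb sense 1, noun sense 1) shares the most nouns and would be selected under this simplified count.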

Algorithm – step 2 (cont.)
• Formula:
• Presenter's note: the log term of the formula appears flawed
• Example pair: revise law:

Evaluation
• SemCor
• Step 1:
• Step 2:

Comparison
