Word sense disambiguation


PowerPoint Slideshow about ' Word Sense Disambiguation' - saeran



Word Sense Disambiguation

2000. 3. 24.

Special Lecture on Natural Language Processing


Contents

  • Introduction and preliminaries

  • Supervised Learning

    • Bayesian Classification

    • Information Theoretic Approach

  • Dictionary Based Disambiguation

    • Disambiguation based on sense definitions

    • Thesaurus-based Disambiguation

    • Disambiguation based on translations in a second-language corpus

    • One Sense per Discourse, One Sense per Collocation

  • Unsupervised Learning


Introduction

  • Word sense disambiguation

    • Word sense ambiguity

      • ‘Bank’: a river embankment or a financial institution

      • ‘Title’: the meaning differs by domain

        • heading of a work, job title, legal right, purity of gold, championship …

        • In a gallery: ‘This work doesn’t have a title’

      • ‘butter’: the meaning differs by part of speech

    • Semantic tagging


Preliminaries

  • Supervised vs. unsupervised learning

    • Supervised: classification

    • Unsupervised: clustering

  • Pseudowords

    • A way to obtain large training/test collections

      • ‘banana-door’: every occurrence of banana and door in the corpus is treated as one artificially ambiguous word

  • Upper and lower bounds

    • Upper bound: human performance

      • Gale et al.’s work: judges were given pairs of occurrences and asked whether the two had the same sense (97%~99% accuracy)

    • Lower bound: the accuracy obtained by always assigning the most frequent sense
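The pseudoword idea and the most-frequent-sense lower bound can be sketched in a few lines of Python. This is only an illustration: the function names and the three toy sentences are assumptions made here, not part of the lecture material.

```python
from collections import Counter

def make_pseudoword_instances(sentences, word_a, word_b):
    """Merge word_a and word_b into one pseudoword; the original word is the gold 'sense'."""
    pseudo = f"{word_a}-{word_b}"
    instances = []
    for tokens in sentences:
        merged = [pseudo if t in (word_a, word_b) else t for t in tokens]
        for i, tok in enumerate(tokens):
            if tok in (word_a, word_b):
                # (sentence with the pseudoword, position of the target, gold sense)
                instances.append((merged, i, tok))
    return instances

def most_frequent_sense_accuracy(gold_senses):
    """Lower bound: accuracy of always guessing the most frequent sense."""
    counts = Counter(gold_senses)
    return counts.most_common(1)[0][1] / len(gold_senses)

# Toy corpus (hypothetical data)
sentences = [
    ["i", "ate", "a", "banana"],
    ["open", "the", "door"],
    ["the", "banana", "is", "ripe"],
]
instances = make_pseudoword_instances(sentences, "banana", "door")
baseline = most_frequent_sense_accuracy([gold for _, _, gold in instances])
```

Because the gold label comes for free from the original word, any amount of untagged text can be turned into test data this way.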


Supervised Learning

  • Two approaches

    • Bayesian classification

      • Treats the words in the context window as the source of evidence

      • Does not take structure into account

    • Information-theoretic approach

      • Decides the sense from a single informative feature (indicator) in the context


Bayesian Classification

  • Bayes decision rule

    • Bayes’ rule
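The slide names the two rules without stating them. In the notation used elsewhere in this deck (senses s_k, context c), a standard way of writing them, given here as a hedged reconstruction rather than the slide's own formula, is:

```latex
P(s_k \mid c) = \frac{P(c \mid s_k)\, P(s_k)}{P(c)}
\qquad
\text{decide } s' \text{ if } P(s' \mid c) > P(s_k \mid c) \text{ for all } s_k \neq s'
```

Since P(c) is the same for every sense, this amounts to choosing the sense that maximizes P(c | s_k) P(s_k).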


Bag of words

  • Naive Bayes assumptions

    • For the context window $c$

    • Use MLE

      • $P(v_j \mid s_k) = C(v_j, s_k) / C(s_k)$

      • $P(s_k) = C(s_k) / C(w)$

      • For the sense $s'$ (p.238 Fig 7.1)
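Under the Naive Bayes assumption the context words are treated as independent given the sense, so the classifier picks the sense maximizing log P(s_k) plus the sum of log P(v_j | s_k) over the words in the context. The sketch below is one possible reading of that procedure, not the figure from the textbook. Two things are assumptions added here: add-alpha smoothing (to avoid log 0 for unseen words) and normalizing word counts by the total number of context words per sense. The training examples are invented.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(labeled_contexts):
    """labeled_contexts: iterable of (sense, list_of_context_words) pairs."""
    sense_counts = Counter()             # C(s_k)
    word_counts = defaultdict(Counter)   # C(v_j, s_k)
    vocab = set()
    for sense, words in labeled_contexts:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab

def disambiguate(context, sense_counts, word_counts, vocab, alpha=1.0):
    """Return argmax over senses of log P(s_k) + sum_j log P(v_j | s_k)."""
    total = sum(sense_counts.values())   # C(w)
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        # Smoothed denominator (assumption, not the plain MLE on the slide)
        denom = sum(word_counts[sense].values()) + alpha * len(vocab)
        for v in context:
            if v in vocab:               # words never seen in training are skipped
                score += math.log((word_counts[sense][v] + alpha) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Toy training data for 'bank' (hypothetical)
train = [
    ("finance", ["money", "loan", "deposit"]),
    ("finance", ["loan", "interest"]),
    ("river", ["water", "shore"]),
    ("river", ["water", "fishing", "shore"]),
]
model = train_naive_bayes(train)
```

With this toy model, `disambiguate(["water"], *model)` selects the river sense because "water" was only ever seen in river contexts.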



Information-theoretic approach

  • Brown et al.’s (1991) work

    • Used in a French-English translation system

    • Uses the indicator that maximizes $I(P; Q)$

      • $P$: set of translations, $Q$: set of indicator values

      • Mutual information
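For reference, the usual definition of mutual information, written with t ranging over the translations (or classes of translations) and x over the indicator values (or classes of values), is shown below. The symbols t and x are chosen here to match the algorithm slide that follows.

```latex
I(P; Q) = \sum_{t \in P} \sum_{x \in Q} p(t, x) \log \frac{p(t, x)}{p(t)\, p(x)}
```

It is zero when the indicator value tells us nothing about the translation and grows as the indicator becomes more predictive.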


Algorithm

  • Maximize $I(P; Q)$

    • Compute it for every possible indicator

    • Find the indicator and the partition of $Q$ for which $I(P; Q)$ is largest

      • Flip-Flop algorithm (p. 240, Fig 7.2)

  • Find a random partition $P = \{P_1, P_2\}$ of $\{t_1 … t_m\}$

  • While (improving) do

    • Find the partition $Q = \{Q_1, Q_2\}$ of $\{x_1 … x_n\}$ that maximizes $I(P; Q)$

    • Find the partition $P = \{P_1, P_2\}$ of $\{t_1 … t_m\}$ that maximizes $I(P; Q)$

  • End

  • ($t_1 … t_m$: translations of the ambiguous word, $x_1 … x_n$: possible values of the indicator)
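The alternating search can be sketched as follows. This is a minimal brute-force version that tries every two-way partition, which is only feasible for small sets; it is not the efficient procedure of Brown et al. It also starts from a fixed alternating partition instead of a random one so that the run is reproducible, and it needs at least two translations and two indicator values. The co-occurrence counts (translations of French "prendre" against its object noun) are invented for illustration.

```python
import math
from itertools import combinations

def mutual_information(joint, part_t, part_x):
    """I(P;Q) in bits; joint[(t, x)] is a count, part_* map items to class 0 or 1."""
    total = sum(joint.values())
    cell = {}
    for (t, x), n in joint.items():
        key = (part_t[t], part_x[x])
        cell[key] = cell.get(key, 0) + n
    row = {a: sum(n for (i, _), n in cell.items() if i == a) for a in (0, 1)}
    col = {b: sum(n for (_, j), n in cell.items() if j == b) for b in (0, 1)}
    return sum((n / total) * math.log2(n * total / (row[a] * col[b]))
               for (a, b), n in cell.items() if n > 0)

def two_way_partitions(items):
    """Yield each split of items into two non-empty classes as a dict item -> 0/1."""
    items = sorted(items)
    first, rest = items[0], items[1:]
    for r in range(len(rest)):           # class 1 always keeps at least one item
        for subset in combinations(rest, r):
            in_zero = {first, *subset}
            yield {item: (0 if item in in_zero else 1) for item in items}

def flip_flop(joint, max_iter=20):
    translations = sorted({t for t, _ in joint})
    values = sorted({x for _, x in joint})
    # Deterministic starting partition (assumption; the slide uses a random one)
    part_t = {t: i % 2 for i, t in enumerate(translations)}
    part_x, best = None, -1.0
    for _ in range(max_iter):
        part_x = max(two_way_partitions(values),
                     key=lambda q: mutual_information(joint, part_t, q))
        part_t = max(two_way_partitions(translations),
                     key=lambda p: mutual_information(joint, p, part_x))
        score = mutual_information(joint, part_t, part_x)
        if score <= best + 1e-12:        # stop when no longer improving
            break
        best = score
    return part_t, part_x, best

# Hypothetical counts: (English translation, French object noun) -> frequency
joint = {
    ("take", "mesure"): 10, ("take", "note"): 8, ("take", "exemple"): 7,
    ("make", "decision"): 9, ("speak", "parole"): 6, ("rise", "parole"): 4,
}
part_t, part_x, best = flip_flop(joint)
```

On these counts the search separates "take" from the other three translations and groups the object nouns accordingly, so the object noun acts as a perfect two-way indicator.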


Dictionary-Based Disambiguation

  • Used when no sense-labeled information about the word is available

  • Three approaches

    • Using only the sense definitions in a dictionary (Lesk, 1986)

    • Using thesaurus information (Yarowsky, 1992)

    • Using a bilingual dictionary and a second-language corpus (Dagan and Itai, 1994)


Disambiguation based on sense definitions

  • Uses dictionary definitions

    • The definitions $D_1 … D_K$ correspond to the senses $s_1 … s_K$

    • Algorithm (p.243, Fig 7.3)

    • Accuracy: 50%~70%

  • comment: Given context $c$

  • for all senses $s_k$ of $w$ do

    • score($s_k$) = overlap($D_k$, $\bigcup_{v_j \in c} E_{v_j}$)

  • end

  • $s' = \arg\max_{s_k}$ score($s_k$)

  • * $E_{v_j}$: the words in the dictionary definitions of the context word $v_j$
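The overlap scoring above can be sketched as follows, taking overlap to be the number of shared words. The definitions in the example are short toy glosses made up here, not real dictionary entries, and the fallback of using a context word itself when it has no definition is an assumption of this sketch.

```python
def lesk(context_words, sense_definitions, word_definitions):
    """Pick the sense whose definition overlaps most with the definitions of the context words.

    sense_definitions: sense -> list of words in its definition (D_k)
    word_definitions:  context word -> list of words in its definitions (E_vj)
    """
    # Union of E_vj over the context; a word without an entry stands for itself
    context_bag = set()
    for v in context_words:
        context_bag.update(word_definitions.get(v, [v]))
    scores = {
        sense: len(set(definition) & context_bag)
        for sense, definition in sense_definitions.items()
    }
    return max(scores, key=scores.get), scores

# Toy glosses for 'ash' (hypothetical, content words only)
sense_definitions = {
    "tree": ["tree", "olive", "family"],
    "residue": ["solid", "residue", "left", "combustible", "material", "burned"],
}
word_definitions = {
    "cigar": ["roll", "tobacco", "burned", "smoking"],
}
best_sense, scores = lesk(["cigar", "burns", "slowly"], sense_definitions, word_definitions)
```

Here the only overlap is the word "burned", shared by the gloss of "cigar" and the residue sense, so that sense wins with a score of 1 against 0.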


Example

  • Word ‘ash’

    • Dictionary definitions

    • Scoring


Thesaurus-based Disambiguation

  • Uses the semantic categories of a thesaurus

    • Walker’s algorithm (1987) (p.245, Fig. 7.4)

    • Yarowsky’s algorithm

      • Uses a Bayes classifier

      • Finds the categories of the contexts, uses them to find the categories of the words, and then decides the sense

comment: Walker’s algorithm, given context $c$

for all senses $s_k$ of $w$ do

score($s_k$) = $\sum_{v_j \in c} \delta(t(s_k), v_j)$

end

$s' = \arg\max_{s_k}$ score($s_k$)

* $\delta(t(s_k), v_j) = 1$ iff $t(s_k)$ is one of the subject codes of $v_j$, and 0 otherwise
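Walker's counting scheme is simple enough to sketch directly: each context word votes for a sense if the sense's thesaurus category is among that word's subject codes. The category names and subject codes below are invented placeholders, not entries from a real thesaurus.

```python
def walker(context_words, sense_categories, subject_codes):
    """score(s_k) = number of context words whose subject codes include t(s_k)."""
    scores = {}
    for sense, category in sense_categories.items():
        scores[sense] = sum(
            1 for v in context_words if category in subject_codes.get(v, ())
        )
    return max(scores, key=scores.get), scores

# Hypothetical thesaurus data for the word 'mouse'
sense_categories = {"animal": "ANIMAL", "device": "COMPUTING"}
subject_codes = {
    "keyboard": {"COMPUTING", "MUSIC"},
    "click": {"COMPUTING"},
    "cat": {"ANIMAL"},
}
best_sense, scores = walker(["click", "the", "keyboard"], sense_categories, subject_codes)
```

Words without any subject code, such as "the" here, contribute nothing to either score.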


Yarowsky’s algorithm

  • Computing the score of a context (p.246, Fig 7.5)

    • Naive Bayes assumption

      • score($c_i$, $t_l$) = $P(t_l \mid c_i)$

      • For the sense $s'$:
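The formulas for this slide are not in the transcript. A hedged reconstruction, following the Naive Bayes form used for the supervised classifier earlier in the deck and writing t(s_k) for the thesaurus category of sense s_k, would be:

```latex
P(t_l \mid c_i) = \frac{P(c_i \mid t_l)\, P(t_l)}{P(c_i)},
\qquad
P(c_i \mid t_l) = \prod_{v \in c_i} P(v \mid t_l)

s' = \arg\max_{s_k} \Big[ \log P(t(s_k)) + \sum_{v_j \in c} \log P(v_j \mid t(s_k)) \Big]
```

The first line scores each context against each category; the second picks the sense whose category best explains the words of the context. Working with logarithms does not change which category or sense ranks highest.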


Some Results

  • Roget categories


Disambiguation based on translations in a second-language corpus

  • Dagan and Itai (1994)

    • The sense is decided from the distribution of the translations

    • Algorithm (p.249, Fig 7.6)

    • The sense is decided from the corpus distribution of the translations of co-occurring words

  • comment: Given: a context $c$ in which $w$ occurs in relation $R(w, v)$

  • for all senses $s_k$ of $w$ do

    • score($s_k$) = $|\{c \in S \mid \exists w' \in T(s_k), v' \in T(v): R(w', v') \in c\}|$

  • end

  • $s' = \arg\max_{s_k}$ score($s_k$)

    • * $S$: second-language corpus

    • * $T(x)$: possible translations of $x$
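A small sketch of this counting step follows. It assumes the second-language corpus has already been parsed into sets of (relation, word, word) triples, one set per sentence; that representation, the relation label "obj", and the tiny German corpus are all assumptions made for illustration.

```python
def dagan_itai(v, relation, sense_translations, translations_of, corpus):
    """Count second-language contexts containing R(w', v') for some translations w', v'.

    sense_translations: sense of w -> list of its translations, T(s_k)
    translations_of:    first-language word -> list of translations, T(v)
    corpus:             list of sets of (relation, w', v') triples (S)
    """
    scores = {}
    for sense, w_translations in sense_translations.items():
        scores[sense] = sum(
            1 for c in corpus
            if any((relation, w2, v2) in c
                   for w2 in w_translations
                   for v2 in translations_of[v])
        )
    return max(scores, key=scores.get), scores

# Hypothetical data for 'show interest'
sense_translations = {"sense1": ["Beteiligung"], "sense2": ["Interesse"]}
translations_of = {"show": ["zeigen"]}
corpus = [
    {("obj", "Interesse", "zeigen")},
    {("obj", "Interesse", "zeigen")},
    {("obj", "Beteiligung", "erwerben")},
]
best_sense, scores = dagan_itai("show", "obj", sense_translations, translations_of, corpus)
```

Only the translation that actually co-occurs with a translation of "show" in the second-language corpus collects any counts.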


Example

  • ‘interest’

    • ‘show interest’: show → zeigen

      • zeigen occurs together with Interesse

      • Sense 2 is selected


One Sense per Discourse, One Sense per Collocation

  • One sense per discourse

    • Within a single document, a word is very likely to be used in only one sense

  • One sense per collocation

    • Nearby words tend to give strong clues to the sense of the target word

    • The sense of a word is decided using collocation information (collocation word $f$)
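One simple way to exploit the one-sense-per-discourse constraint is as a post-processing step: after some classifier has assigned provisional senses, every occurrence in a document is relabeled with that document's majority sense. The sketch below shows only this step; the sense labels are hypothetical, and it is not claimed to be the exact procedure from the lecture.

```python
from collections import Counter

def one_sense_per_discourse(tagged_documents):
    """tagged_documents: list of documents, each a list of provisional sense labels."""
    smoothed = []
    for labels in tagged_documents:
        if not labels:
            smoothed.append([])
            continue
        # Majority sense of the document (ties go to the label seen first)
        majority, _ = Counter(labels).most_common(1)[0]
        smoothed.append([majority] * len(labels))
    return smoothed

# Provisional labels for 'plant' in two documents (hypothetical)
docs = [
    ["living", "living", "factory"],
    ["factory"],
]
result = one_sense_per_discourse(docs)
```

In the first document the single "factory" label is overridden by the two "living" labels.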


Unsupervised Disambiguation

  • Completely unsupervised disambiguation

    • Sense tagging is not possible

    • Context-group discrimination

      • Contexts are grouped by clustering

      • A probabilistic model similar to the Bayes classifier of Gale et al.

        • For a fixed $K$, assume groups (senses) $s_1 … s_K$

        • Compute $P(s_k \mid c)$

        • The probabilities are estimated with the EM algorithm (p.254 Fig 7.8)
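The EM procedure can be sketched as a mixture of K Naive Bayes "senses": the E-step computes P(s_k | c_i) for every context, and the M-step re-estimates P(v_j | s_k) and P(s_k) from those soft assignments. This is a minimal illustration, not the textbook figure. The random initialization, the small smoothing constant alpha, and the fixed iteration count are assumptions added to keep the code short and numerically safe.

```python
import math
import random

def em_sense_clusters(contexts, k, iterations=30, seed=0, alpha=0.1):
    """Cluster contexts (lists of words) into k groups; returns posteriors and parameters."""
    rng = random.Random(seed)
    vocab = sorted({v for c in contexts for v in c})
    # Random initial P(v_j | s_k), uniform initial P(s_k)
    p_word = []
    for _ in range(k):
        raw = {v: rng.random() + 0.5 for v in vocab}
        z = sum(raw.values())
        p_word.append({v: x / z for v, x in raw.items()})
    p_sense = [1.0 / k] * k
    h = []
    for _ in range(iterations):
        # E-step: h[i][s] = P(s | c_i), computed in log space for stability
        h = []
        for c in contexts:
            logs = [math.log(p_sense[s]) + sum(math.log(p_word[s][v]) for v in c)
                    for s in range(k)]
            m = max(logs)
            weights = [math.exp(l - m) for l in logs]
            z = sum(weights)
            h.append([w / z for w in weights])
        # M-step: re-estimate parameters from the soft counts (with smoothing)
        for s in range(k):
            counts = {v: alpha for v in vocab}
            for c, post in zip(contexts, h):
                for v in c:
                    counts[v] += post[s]
            z = sum(counts.values())
            p_word[s] = {v: n / z for v, n in counts.items()}
            p_sense[s] = (sum(post[s] for post in h) + alpha) / (len(contexts) + k * alpha)
    return h, p_sense, p_word

# Toy contexts of an ambiguous word (hypothetical)
contexts = [
    ["money", "loan"], ["loan", "deposit"], ["money", "deposit"],
    ["river", "water"], ["water", "shore"], ["river", "shore"],
]
posteriors, p_sense, p_word = em_sense_clusters(contexts, k=2)
```

The resulting groups carry no sense labels; they are only numbered clusters. EM finds a local optimum, so the grouping can depend on the initialization.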


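Unsupervised Disambiguation (cont.)

  • Choosing the value of $K$

    • A larger $K$ gives finer sense distinctions, so more training data is needed

    • Chosen according to the amount of corpus data available

  • Sense differences can be distinguished without consulting a dictionary or a tagged corpus

    • Useful for information retrieval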


Word Sense

  • What is a word sense?

    • A mental representation of a difference in meaning

    • Criterion for choosing senses: do they faithfully reflect the mental representation?

  • Systematic polysemy

    • Co-activation (p.258 7.9, 7.10)

    • ‘the act of X’ and ‘the people doing X’

      • Organization, administration, formation …

    • Proper nouns: Brown, Bush, Army …

  • Application

