Chapter 20 Part 2

Computational Lexical Semantics

Acknowledgements: these slides include material from Rada Mihalcea, Ray Mooney, Katrin Erk, and Ani Nenkova


Knowledge-based WSD
  • Task definition
  • Knowledge-based WSD = class of WSD methods relying (mainly) on knowledge drawn from dictionaries and/or raw text
  • Resources
    • Yes
      • Machine Readable Dictionaries
      • Raw corpora
    • No
      • Manually annotated corpora
Machine Readable Dictionaries
  • In recent years, most dictionaries have been made available in machine-readable format (MRD)
    • Oxford English Dictionary
    • Collins
    • Longman Dictionary of Contemporary English (LDOCE)
  • Thesauruses – add synonymy information
    • Roget's Thesaurus
  • Semantic networks – add more semantic relations
    • WordNet
    • EuroWordNet
MRD – A Resource for Knowledge-based WSD
  • For each word in the language vocabulary, an MRD provides:
    • A list of meanings
    • Definitions (for all word meanings)
    • Typical usage examples (for most word meanings)

WordNet definitions/examples for the noun plant

  • buildings for carrying on industrial labor; "they built a large plant to manufacture automobiles"
  • a living organism lacking the power of locomotion
  • something planted secretly for discovery by another; "the police used a plant to trick the thieves"; "he claimed that the evidence against him was a plant"
  • an actor situated in the audience whose acting is rehearsed but seems spontaneous to the audience
MRD – A Resource for Knowledge-based WSD
  • A thesaurus adds:
    • An explicit synonymy relation between word meanings
  • A semantic network adds:
    • Hypernymy/hyponymy (IS-A), meronymy/holonymy (PART-OF), antonymy, etc.

WordNet synsets for the noun "plant"

1. plant, works, industrial plant

2. plant, flora, plant life

WordNet related concepts for the meaning “plant life”

{plant, flora, plant life}

hypernym: {organism, being}

hyponym: {house plant}, {fungus}, …

meronym: {plant tissue}, {plant part}

member holonym: {Plantae, kingdom Plantae, plant kingdom}
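
For illustration, the same information can be retrieved programmatically. A minimal Python sketch using NLTK's WordNet interface (assumes the NLTK WordNet data is installed; sense numbering such as plant.n.02 depends on the WordNet version):

from nltk.corpus import wordnet as wn

# List the noun synsets for "plant" with their definitions
for synset in wn.synsets('plant', pos=wn.NOUN):
    print(synset.name(), '-', synset.definition())

# Related concepts for the "plant life" sense (plant.n.02 in WordNet 3.0)
plant_life = wn.synset('plant.n.02')
print('hypernyms:', plant_life.hypernyms())
print('hyponyms: ', plant_life.hyponyms()[:3])
print('meronyms: ', plant_life.part_meronyms())
print('holonyms: ', plant_life.member_holonyms())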

Lesk Algorithm
  • (Michael Lesk 1986): identify the senses of words in context using definition overlap; that is, disambiguate more than one word at a time
  • Algorithm:
    • Retrieve from MRD all sense definitions of the words to be disambiguated
    • Determine the definition overlap for all possible sense combinations
    • Choose senses that lead to highest overlap

Example: disambiguate PINE CONE

  • PINE

1. kinds of evergreen tree with needle-shaped leaves

2. waste away through sorrow or illness

  • CONE

1. solid body which narrows to a point

2. something of this shape whether solid or hollow

3. fruit of certain evergreen trees

Pine#1 ∩ Cone#1 = 0
Pine#2 ∩ Cone#1 = 0
Pine#1 ∩ Cone#2 = 1
Pine#2 ∩ Cone#2 = 0
Pine#1 ∩ Cone#3 = 2
Pine#2 ∩ Cone#3 = 0
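
A minimal Python sketch of this pairwise overlap computation, with the toy definitions hard-coded from the slide above. The stopword list and plural stripping are illustrative choices, so individual counts can differ slightly from the table, but the winning combination (Pine#1 with Cone#3) is the same:

# Toy sense definitions from the example; a real system would
# retrieve them from a machine-readable dictionary.
PINE = {1: "kinds of evergreen tree with needle-shaped leaves",
        2: "waste away through sorrow or illness"}
CONE = {1: "solid body which narrows to a point",
        2: "something of this shape whether solid or hollow",
        3: "fruit of certain evergreen trees"}

STOPWORDS = {"of", "or", "to", "a", "with", "this", "which", "whether",
             "through", "away"}

def content_words(definition):
    # Lowercase, drop stopwords, and crudely strip a plural "s"
    return {w.rstrip("s") for w in definition.lower().split()
            if w not in STOPWORDS}

def overlap(def1, def2):
    # Number of content-word types shared by the two definitions
    return len(content_words(def1) & content_words(def2))

# Score every sense combination and keep the best one
best = max(((p, c) for p in PINE for c in CONE),
           key=lambda pc: overlap(PINE[pc[0]], CONE[pc[1]]))
print(best)  # (1, 3): "evergreen" and "tree" are shared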

Lesk Algorithm for More than Two Words?
  • I saw a man who is 98 years old and can still walk and tell jokes
    • nine open class words: see(26), man(11), year(4), old(8), can(5), still(4), walk(10), tell(8), joke(3)
  • 43,929,600 sense combinations! How to find the optimal sense combination?
  • Simulated annealing (Cowie, Guthrie, Guthrie 1992)
    • Let’s review (from CS1571)
Search Types
  • Backtracking state-space search
  • Local Search and Optimization
  • Constraint satisfaction search
  • Adversarial search
Local Search
  • Use a single current state and move only to neighbors.
  • Use little space
  • Can find reasonable solutions in large or infinite (continuous) state spaces for which the other algorithms are not suitable
Optimization
  • Local search is often suitable for optimization problems. Search for best state by optimizing an objective function.
Visualization
  • States are laid out in a landscape
  • Height corresponds to the objective function value
  • Move around the landscape to find the highest (or lowest) peak
  • Only keep track of the current state and its immediate neighbors
Simulated Annealing
  • Based on a metallurgical metaphor
    • Start with a temperature set very high and slowly reduce it.
Simulated Annealing
  • Annealing: harden metals and glass by heating them to a high temperature and then gradually cooling them
  • At the start, make lots of moves and then gradually slow down
Simulated Annealing
  • More formally…
    • Generate a random new neighbor from current state.
    • If it’s better take it.
    • If it’s worse then take it with some probability proportional to the temperature and the delta between the new and old states.
Simulated annealing
  • Probability of a move decreases with the amount ΔE by which the evaluation is worsened
  • A second parameter T is also used to determine the probability: high T allows more worse moves; T close to zero results in few or no bad moves
  • The schedule input determines the value of T as a function of the completed cycles
function Simulated-Annealing(problem, schedule) returns a solution state
  inputs: problem, a problem
          schedule, a mapping from time to "temperature"

  current ← Make-Node(Initial-State[problem])
  for t ← 1 to ∞ do
    T ← schedule[t]
    if T = 0 then return current
    next ← a randomly selected successor of current
    ΔE ← Value[next] – Value[current]
    if ΔE > 0 then current ← next
    else current ← next only with probability e^(ΔE/T)
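
A direct Python transcription of this pseudocode; the problem-specific pieces (value, random_successor, and schedule) are assumed to be supplied by the caller:

import math
import random

def simulated_annealing(initial_state, value, random_successor, schedule):
    # value(state) -> objective score to maximize
    # random_successor(state) -> a random neighboring state
    # schedule(t) -> temperature at step t (0 terminates the search)
    current = initial_state
    t = 1
    while True:
        T = schedule(t)
        if T == 0:
            return current
        nxt = random_successor(current)
        delta_e = value(nxt) - value(current)
        if delta_e > 0:
            current = nxt               # always take uphill moves
        elif random.random() < math.exp(delta_e / T):
            current = nxt               # downhill moves with probability e^(ΔE/T)
        t += 1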

Intuitions
  • The algorithm wanders around during the early parts of the search, hopefully toward a good general region of the state space
  • Toward the end, the algorithm does a more focused search, making few bad moves
Lesk Algorithm for More than Two Words?
  • I saw a man who is 98 years old and can still walk and tell jokes
    • nine open class words: see(26), man(11), year(4), old(8), can(5), still(4), walk(10), tell(8), joke(3)
  • 43,929,600 sense combinations! How to find the optimal sense combination?
  • Simulated annealing (Cowie, Guthrie, Guthrie 1992)
  • Given: W, set of words we are disambiguating
  • State: One sense for each word in W
  • Neighbors of state: the result of changing one word sense
  • Objective function: value(state)
    • Let DWs(state) be the words that appear in the union of the definitions of the senses in state;
    • value(state) = sum over words in DWs(state): # times it appears in the union of the definitions of the senses
    • The value is higher when more words appear in multiple definitions.
  • Start state: the most frequent sense of each word
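
A sketch of the state and objective in Python. Following the note above that shared vocabulary should raise the score, words occurring in only one chosen definition are ignored here; Cowie et al.'s exact redundancy measure differs in detail. The definitions mapping is a hypothetical stand-in for the MRD lookup:

import random
from collections import Counter

def value(state, definitions):
    # state: {word: chosen sense id}
    # definitions: {(word, sense_id): definition text} -- hypothetical MRD lookup
    counts = Counter()
    for word, sense in state.items():
        # Count each word type once per definition, so the score rewards
        # vocabulary shared across definitions, not repetition within one
        counts.update(set(definitions[(word, sense)].lower().split()))
    # Only words appearing in two or more of the chosen definitions contribute
    return sum(c for c in counts.values() if c > 1)

def random_successor(state, senses):
    # Neighbor: change the sense of one randomly chosen word
    # senses: {word: list of its sense ids}
    nxt = dict(state)
    word = random.choice(list(nxt))
    nxt[word] = random.choice(senses[word])
    return nxt

Both functions plug directly into the simulated_annealing sketch given earlier, with the most frequent sense of each word as the start state.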
Lesk Algorithm: A Simplified Version
  • Original Lesk definition: measure overlap between sense definitions for all words in the text
    • Identify simultaneously the correct senses for all words in the text
  • Simplified Lesk (Kilgarriff & Rosenzweig 2000): measure overlap between sense definitions of a word and its context in the text
    • Identify the correct sense for one word at a time
  • Search space significantly reduced (the context in the text is fixed for each word instance)
Lesk Algorithm: A Simplified Version
  • Algorithm for simplified Lesk:
    • Retrieve from MRD all sense definitions of the word to be disambiguated
    • Determine the overlap between each sense definition and the context of the word in the text
    • Choose the sense that leads to highest overlap

Example: disambiguate PINE in

“Pine cones hanging in a tree”

  • PINE

1. kinds of evergreen tree with needle-shaped leaves

2. waste away through sorrow or illness

Pine#1 ∩ Sentence = 1

Pine#2 ∩ Sentence = 0
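
A minimal sketch of simplified Lesk in Python, reusing the toy PINE definitions from the earlier example (a fuller implementation would remove stopwords and normalize word forms):

def simplified_lesk(word_senses, context_sentence):
    # word_senses: {sense_id: definition text}; returns the best-scoring sense id
    context = {w.lower() for w in context_sentence.split()}
    def score(sense_id):
        definition = {w.lower() for w in word_senses[sense_id].split()}
        return len(definition & context)
    return max(word_senses, key=score)

PINE = {1: "kinds of evergreen tree with needle-shaped leaves",
        2: "waste away through sorrow or illness"}
print(simplified_lesk(PINE, "Pine cones hanging in a tree"))  # -> 1 (shares "tree")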

Selectional Preferences
  • A way to constrain the possible meanings of words in a given context
  • E.g. “Wash a dish” vs. “Cook a dish”
    • WASH-OBJECT vs. COOK-FOOD
  • Alternative terminology
    • Selectional Restrictions
    • Selectional Preferences
    • Selectional Constraints
Acquiring Selectional Preferences
  • From raw corpora
    • Frequency counts
    • Information theory measures
Preliminaries: Learning Word-to-Word Relations
  • An indication of the semantic fit between two words
  • 1. Frequency counts (in a parsed corpus)
    • Pairs of words connected by a syntactic relation
  • 2. Conditional probabilities
    • Condition on one of the words
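
Both measures reduce to simple counting. A small Python sketch over hypothetical (verb, direct object) pairs extracted from a parsed corpus:

from collections import Counter

# Hypothetical (verb, direct_object) pairs from a parsed corpus
pairs = [("wash", "dish"), ("cook", "dish"), ("cook", "meal"),
         ("wash", "car"), ("cook", "dish")]

pair_counts = Counter(pairs)
verb_counts = Counter(v for v, _ in pairs)

# 1. Frequency count for a syntactically related word pair
print(pair_counts[("cook", "dish")])        # 2

# 2. Conditional probability P(noun | verb), conditioning on the verb
def p_noun_given_verb(noun, verb):
    return pair_counts[(verb, noun)] / verb_counts[verb]

print(p_noun_given_verb("dish", "cook"))    # 2/3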
Learning Selectional Preferences
  • Word-to-class relations (Resnik 1993)
    • Quantify the contribution of a semantic class using all the senses subsumed by that class (e.g., the class is an ancestor in WordNet)
Using Selectional Preferences for WSD
  • Algorithm:
    • Let N be a noun that stands in relationship R to predicate P. Let s_1 … s_k be its possible senses.
    • For i from 1 to k, compute:
      • C_i = {c | c is an ancestor of s_i}
      • A_i = max over c in C_i of A(P, c, R)
    • A_i is the score for sense i. Select the sense with the highest score.
  • For example: Letter has 3 senses in WordNet (written message; varsity letter; alphabetic character) and belongs to 19 classes in all.
  • Suppose the predicate is "write" and N is its direct object. For each sense of N, calculate a score by measuring the association of "write" (in the direct-object relation) with each ancestor of that sense.
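
A sketch of this scoring loop in Python, using NLTK's WordNet hypernym hierarchy for the ancestor classes; the association score A(P, c, R) is assumed to be given (e.g., learned from a parsed corpus as described above):

from nltk.corpus import wordnet as wn

def score_senses(noun, predicate, relation, A):
    # A(predicate, cls, relation): hypothetical association score, assumed given
    scores = {}
    for sense in wn.synsets(noun, pos=wn.NOUN):
        # C_i: the sense itself plus all of its hypernym ancestors
        ancestors = {sense} | set(sense.closure(lambda s: s.hypernyms()))
        # A_i: the best association between the predicate and any class in C_i
        scores[sense] = max(A(predicate, c, relation) for c in ancestors)
    # Select the sense with the highest score
    return max(scores, key=scores.get)

For the slide's example, one would call score_senses('letter', 'write', 'direct-object', A) and compare the three senses of letter through their ancestor classes.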