Why the “Word Sense Disambiguation Problem” can't be solved, and what should be done instead

Patrick Hanks

Masaryk University, Brno

Czech Republic

[email protected]

U. of Wolverhampton

May 2, 2007


Traditional WSD procedure

  • Take a list of senses of each word from a source (typically WordNet or LDOCE)

  • Stipulate “disambiguation criteria” for different word senses

  • Apply the criteria to unseen texts (a rough sketch of the whole procedure follows this list)

  • Results: some successes; many failures

    unresolved ambiguities; cases where no criteria are satisfied; etc.

  • Declare success.
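A minimal sketch of this traditional procedure, assuming a WordNet-style sense inventory and hand-stipulated context-word criteria. All names and data below are illustrative placeholders, not drawn from WordNet, LDOCE, or any real system:

```python
# Illustrative sketch of the traditional WSD procedure described above.
# Sense inventory and "disambiguation criteria" are toy examples only.

SENSE_INVENTORY = {
    "crane": {
        "crane.n.1": {"gloss": "large long-necked bird",
                      "criteria": {"nest", "bird", "fly", "wings"}},
        "crane.n.2": {"gloss": "machine for lifting heavy loads",
                      "criteria": {"lift", "construction", "load", "tonne"}},
    }
}

def disambiguate(word: str, context: list[str]) -> str | None:
    """Pick the sense whose stipulated criteria overlap most with the context.

    Returns None when no criterion fires or two senses tie -- exactly the
    'unresolved ambiguities' and 'cases where no criteria are satisfied'
    that the slide complains about.
    """
    senses = SENSE_INVENTORY.get(word, {})
    scores = {sid: len(s["criteria"] & set(context)) for sid, s in senses.items()}
    if not scores or max(scores.values()) == 0:
        return None                       # no criteria satisfied
    best = sorted(scores, key=scores.get, reverse=True)
    if len(best) > 1 and scores[best[0]] == scores[best[1]]:
        return None                       # unresolved ambiguity
    return best[0]

print(disambiguate("crane", "a crane had built its nest on the roof".split()))
# -> 'crane.n.1'
print(disambiguate("crane", "there is a crane in my garden".split()))
# -> None
```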


Yorick Wilks

  • A leading figure in Artificial Intelligence.

  • His theory of preference semantics has been hugely influential (on me and Paul Procter, among many others, and, through us, on lexicography)

  • Wilks rightly characterizes the Semantic Web as “the apotheosis of annotation” and asks, “But what are its semantics?”

    • However, his 2005 paper (with Nancy Ide), Making Sense about Sense, gets things wrong.


What sort of inventory?

  • “Contemporary automatic WSD assigns sense labels drawn from a pre-defined sense inventory to words in context. ... If dictionaries are not a good source of sense inventories useful in NLP, where do we turn?” -- Ide and Wilks 2005

    • Yes, but for mapping meaning onto use, you also need a source of syntagmatic inventories (including info about collocations).

    • Only Cobuild says anything systematic about the syntagmatics of each word.

    • The WSD people have never tried to use Cobuild.

    • LDOCE has no information about collocations.

    • You can’t extract information that isn’t there.


Ide and Wilks on lexicographers

  • “Whatever kind of lexicographer one is dealing with, ... their goal is and must be the explanation of meaning to one who does not know it.”

    • Ide and Wilks again

    • One might as well say:

      • “Whatever kind of computational linguist one is dealing with, ... their goal is and must be the translation of texts into a foreign language without human intervention.”

    • One of many goals of lexicographers is to compile inventories. Ask a suitably trained lexicographer to compile a syntagmatic inventory and he/she will compile one.


Ide and Wilks on “successful” WSD

  • “All successful WSD has operated at ... the homograph rather than the sense level ... (e.g. “crane” = bird or machine) ... basically those [distinctions] that can be found easily in parallel texts in different languages.”

    • Ide and Wilks again

    • You mean, like French grue, Czech jeřáb?


Does MT really succeed at the homograph level?

  • Consider:

    • Eng. crane --> Ger. Hebewerkzeug, Kranich, Kran.

  • Google Translate makes a horrible mess of:

    • A crane had built its nest on the roof vs.

    • They used a crane to lift the goods.

  • The words are ambiguous but the contexts are unambiguous.


Reformulation of the aim of WSD

  • WSD should aim to disambiguate all uses of words that are not ambiguous in the contexts in which they are used.

    • There is a crane in my garden

      • ambiguous.

    • A crane had built its nest on the roof and

    • They used a crane to lift the goods

      • not ambiguous.


What do we need?

  • We need a dictionary of contexts.

  • There isn’t one.


Tim Berners-Lee

  • Another hugely influential figure

  • Inventor of the World Wide Web

  • Co-author (with Hendler and Lassila) of an article in Scientific American (2001) predicting “the semantic web”.


Semantic Web: the dream

  • To enable computers to manipulate data meaningfully.

  • “Most of the Web's content today is designed for humans to read, not for computer programs to manipulate meaningfully.”

    • Berners-Lee et al., 2001


Why are people so excited about the Semantic Web idea?

  • It offers “unchecked exponential growth” of “data and information that can be processed automatically”

    • Berners-Lee et al., 2001

  • Distributed, not centrally controlled

    • but with scientists as “guardians of truth”? -- Wilks

  • “... paradoxes and unanswerable questions are a price that must be paid to achieve versatility.”

    • Berners-Lee et al., 2001


Semantic Web: the reality

  • RDF (Resource Description Framework) handles only HTML-tagged entities and precisely defined items.

  • In SW jargon “ontology” means a list of names, addresses, documents, and other tagged, defined entities.

  • The SW does not engage with natural language.

  • PREDICTION: If it does, then in the current state of NLP it will come unstuck.


Three senses of ‘ontology’

  • An ordered theory of everything that exists.

    • Aristotle, philosophers

  • A list of all the concept words in a language (usually one that is ordered hierarchically)

    • WordNet, computational linguists

  • A list of names, addresses, dates, and other entities

    • Semantic Web people


Semantic Web as a librarian

  • All efforts devoted to tagging and classifying documents.

  • The SW currently has neither the time nor the skill needed to look inside the documents and read what they say.

  • If the dream is to be fulfilled, then sooner or later the SW must engage with the vague, fuzzy phenomenon that is meaning in natural language. It must learn to process unstructured text.


Hypertext

  • “The power of hypertext is that anything can link to anything.”

    • Berners-Lee et al., 2001

  • Yes, but we need procedures for determining (automatically) what counts as a relevant link, e.g.

    • Firing a person is relevant to employment law.

    • Firing a gun is relevant to warfare and criminal law.


Precise definition does not help discover implicatures

  • The meaning of the English noun second is vague: “a short unit of time” and “1/60 of a minute”.

    • Wait a second.

    • He looked at her for a second.

  • It is also a very precisely defined technical term in certain scientific contexts, the basic SI unit of time:

    • “the duration of 9,192,631,770 cycles of radiation corresponding to the transition between two hyperfine levels of the ground state of an atom of caesium 133.”


Being precise about vagueness

  • Giving a precise definition to an ordinary word removes it from ordinary language.

  • When it is given a precise, stipulative definition, an ordinary word becomes a technical term.

  • “An adequate definition of a vague concept must aim not at precision but at vagueness; it must aim at precisely that level of vagueness which characterizes the concept itself.”

    • Wierzbicka 1985


A New Resource

  • The Pattern Dictionary of English Verbs and their Arguments

    • Based on detailed, painstaking corpus pattern analysis (CPA)

    • Drawing on a new, lexically based theory of language, the “theory of norms and exploitations” (TNE)

    • Being built in Brno (CZ) and Brandeis (USA)


CPA (Corpus Pattern Analysis)

  • Identify usage patterns for each word

    • Patterns include semantic types and lexical sets of arguments (valencies)

  • Associate a meaning (“implicature”) with each pattern (not with the word in isolation)

  • Match occurrences of the target word in unseen texts to the nearest pattern (“norm”)

  • If two or more patterns match, choose the most frequent (see the sketch after this list)

  • If no match is found, it is not normal usage -- it is an exploitation of a norm (or a mistake).
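A minimal sketch of this matching step, assuming patterns are stored with their semantic types and corpus frequencies. The pattern data, frequencies, and the crude type lookup are illustrative placeholders, not the actual Pattern Dictionary format:

```python
# Illustrative sketch of CPA-style matching: compare a parsed clause against
# an inventory of patterns (norms) for the target verb and return the best
# match; if nothing matches, treat the use as an exploitation (or a mistake).

PATTERNS = {
    "fire": [
        {"id": 1, "subj": "Human", "obj": "Firearm",    "freq": 104,
         "implicature": "[[Human]] causes [[Firearm]] to discharge [[Projectile]]"},
        {"id": 2, "subj": "Human", "obj": "Projectile", "freq": 60,
         "implicature": "[[Human]] causes [[Firearm]] to discharge [[Projectile]]"},
        {"id": 3, "subj": "Human", "obj": "Human",      "freq": 35,
         "implicature": "[[Employer]] discharges [[Employee]] from employment"},
    ]
}

# Toy mapping from lexical items to semantic types (in practice, an ontology).
TYPE_OF = {"dennis": "Human", "rumsfeld": "Human", "gun": "Firearm",
           "round": "Projectile", "anyone": "Human"}

def match(verb: str, subj: str, obj: str) -> str:
    """Return the implicature of the best-matching norm, or flag an exploitation."""
    candidates = [p for p in PATTERNS.get(verb, [])
                  if TYPE_OF.get(subj) == p["subj"] and TYPE_OF.get(obj) == p["obj"]]
    if not candidates:
        return "not a normal usage: exploitation of a norm (or a mistake)"
    # If two or more patterns match, prefer the most frequent one.
    return max(candidates, key=lambda p: p["freq"])["implicature"]

print(match("fire", "dennis", "gun"))       # firearm-discharge reading
print(match("fire", "rumsfeld", "anyone"))  # employment reading
```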


Dictionaries and Ontologies

  • “Patterns include semantic types” .... What are these?

  • Dictionaries don’t show semantic type structure.

  • Ontologies such as WordNet and the Brandeis Semantic Ontology (BSO) show a hierarchical structure of semantic types, e.g. a gun, pistol, revolver, rifle, cannon, mortar, Kalashnikov, ... is a:

    weapon

    artifact

    physical object

    entity


Brandeis Semantic Ontology

  • A hierarchy of semantic concepts, with links to words at the appropriate level.

  • Example (shortened and edited a bit):

    Name: gun

    Type: Firearm

    Inheritance tree: TopType > Entity > Material Entity > Artifact > Weapon > Firearm

    Telic: Attack with Weapon
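A rough sketch of how such an entry might be represented and queried, covering both the entry above and the ‘semantic inheritance’ reasoning on the next slide. The field names mimic the slide; the actual BSO encoding may differ:

```python
# Rough sketch: a BSO-style entry as a record, plus an is_a() check that
# walks the inheritance tree. Not the real Brandeis Semantic Ontology format.

from dataclasses import dataclass

@dataclass
class OntologyEntry:
    name: str
    sem_type: str
    inheritance: list[str]   # from TopType down to the entry's own type
    telic: str               # what the thing is typically used for

GUN = OntologyEntry(
    name="gun",
    sem_type="Firearm",
    inheritance=["TopType", "Entity", "Material Entity",
                 "Artifact", "Weapon", "Firearm"],
    telic="Attack with Weapon",
)

def is_a(entry: OntologyEntry, sem_type: str) -> bool:
    """Semantic inheritance: an entry 'is a' every type on its path to TopType."""
    return sem_type in entry.inheritance

print(is_a(GUN, "Weapon"))    # True -- if it's a gun, it is (normally) a weapon
print(is_a(GUN, "Artifact"))  # True
print(GUN.telic)              # 'Attack with Weapon'
```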


Ontological reasoning

EXAMPLE:

If it’s a gun, it must be a weapon, an artefact, a physical object, and an entity, and it is used for attacking people and things.

  • Otherwise known as ‘semantic inheritance’

  • So far, so good.

  • How useful is ontological information as a basis for verbal reasoning?

  • Not as useful as we would like.


Semantics and Usage (1)

  • He was pointing a gun at me

    -- is a Weapon and a Material Entity.

    BUT

    2. A child’s toy gun

    -- is an Entertainment Artifact, not a Weapon

    3. The fastest gun in the west

    -- is a Human < Animate Entity, not a Weapon

  • “must be a weapon” on the previous slide is too strong; should be “is probably a weapon”

  • Probabilities can be measured using corpus data

  • The normal semantics of terms are constantly exploited to make new concepts (as in 2 and 3)


Semantics and Usage (2)

  • Knowing the exact place of a word in a semantic ontology is not enough

  • To compute meaning, we need more info....

  • Another major source of semantic information (potentially) is usage:

    • how words go together (normally | unusually | never)

  • How do patterns of usage (syntagmatic) mesh with the information in an ontology?

    • See Church and Hanks (1989), Cobuild, etc.


The Semantics of Norms

  • Dennis closed his eyes and fired the gun

    • [[Human]] fires [[Firearm]]

  • He fired a single round at the soldiers

    • [[Human]] fires [[Projectile]] {at [[PhysObj = Target]]}

      • BOTH PATTERNS MEAN: [[Human]] causes [[Firearm]] to discharge [[Projectile]] towards [[Target]]

  • Rumsfeld fires anyone who stands up to him.

    • [[Human 1 = Employer]] fires [[Human 2 = Employee]]

      • MEANS: discharge from employment

    • The semantic roles Employer and Employee are assigned by context -- they are not part of the type structure of the language.


Complications and Distractions

Minor senses:

  • reading this new book fired me with fresh enthusiasm to visit this town

    • [[Event]] fire [[Human]] {with [[Attitude = Good]]}

  • Mr. Walker fired questions at me.

    • [[Human 1]] fire [[Speech Act]] {at [[Human 2]]}

      Named inanimate entity:

  • I ... got back on Mabel and fired her up.

    • Mabel is [[Artifact]] (a motorbike, actually)

    • [[Human]] fire [[Artifact > Energy Production Device]] {up}


What do you do with a gun?

Word Sketch Engine: freq. of gun: BNC 5,269; OEC 91,781

Collocate (verb with gun as object) | Freq. BNC | Freq. OEC | Salience (rank) BNC | Salience (rank) OEC
fire     | 104 | 1132 | 45.39 (1) | 60.96 (2)
point    |  59 | 1639 | 30.80 (2) | 61.37 (1)
carry    |  85 |  974 | 28.42 (3) | 44.87 (10)
jump     |  31 |  434 | 27.77 (4) | 46.35 (8)
brandish |  11 |   98 | 25.86 (5) | 42.55 (14)
wave     |  20 |  249 | 20.58 (6) | ---
hold     |  70 | 1504 | 20.38 (7) | 44.79 (11)
aim      |   - |  663 | -         | 54.96 (4)
load     |   - |  330 | -         | 48.70 (7)
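The salience figures above come from the Word Sketch Engine, whose exact formula is not given here; scores of this kind are typically based on lexical association measures such as the pointwise mutual information of Church and Hanks (1989). A minimal sketch of that measure follows; the verb frequency and corpus size are invented placeholders, since the slide only gives the nodeword and joint frequencies, and the Sketch Engine's own salience formula may well differ:

```python
# Minimal sketch of pointwise mutual information (Church & Hanks 1989),
# one common association measure behind collocation "salience" scores.
import math

def pmi(joint: int, f_verb: int, f_noun: int, corpus_size: int) -> float:
    """log2( P(verb, noun) / (P(verb) * P(noun)) )"""
    p_joint = joint / corpus_size
    p_verb = f_verb / corpus_size
    p_noun = f_noun / corpus_size
    return math.log2(p_joint / (p_verb * p_noun))

# Counts for illustration only -- f_fire and corpus_size are NOT real BNC figures.
corpus_size = 100_000_000          # placeholder: corpus size in tokens
f_gun = 5_269                      # 'gun' in the BNC (from the slide)
f_fire = 20_000                    # placeholder: occurrences of the verb 'fire'
joint_fire_gun = 104               # 'fire' with 'gun' as object (from the slide)

print(f"PMI(fire, gun) = {pmi(joint_fire_gun, f_fire, f_gun, corpus_size):.2f}")
```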


Shimmering Lexical Sets (1)

  • weapon: carry, surrender, possess, use, deploy, fire, acquire, conceal, seize, ...

    ____

  • gun: fire, carry, point, jump, brandish, wave, hold, cock, spike, load, reload, ...

  • rifle: fire, carry, sling (over one’s shoulder), load, reload, aim, drop, clean, ...

  • pistol: fire, load, level, hold, brandish, point, carry, wave, ...

  • revolver: empty, draw, hold, carry, take, ...


Shimmering Lexical Sets (2)

  • spear: thrust, hoist, carry, throw, brandish

  • sword: wield, draw, cross, brandish, swing, sheathe, carry, ...

  • dagger: sheathe, draw, plunge, hold

  • sabre: wield, rattle, draw

  • knife: brandish, plunge, twist, wield

  • bayonet: fix


Shimmering Lexical Sets (3)

  • missile: fire, deploy, launch

  • bullet: bite, fire, spray, shoot, put

  • shell: fire, lob; crack, ...

  • round: fire, shoot; ...

  • arrow: fire, shoot, aim; paint, follow


Shimmering Lexical Sets (4)

  • fire: shot, gun, bullet, rocket, missile, salvo ... [[Projectile]] or [[Firearm]]

  • carry: passenger, weight, bag, load, burden, tray, weapon, gun, cargo ... [polysemous]

  • aim: kick, measure, programme, campaign, blow, mischief, policy, rifle ... [polysemous]

  • point: finger, gun, way, camera, toe, pistol ... [polysemous?]

  • brandish: knife, sword, gun, shotgun, razor, stick, weapon, pistol ... [[Weapon]]

  • shoot: glance, bolt, Palestinian, rapid, policeman;

    • shoot ... with: pistol, bow, bullet, gun


Triangulation

  • Meanings attach to patterns, not words.

  • A typical pattern consists of a verb and its arguments (with semantic values), thus:

    [[Human]] fire [[Projectile]] {from [[Firearm]]} {PREP [[PhysObj]]}

  • Pattern elements are often omitted in actual usage. (See FrameNet)


Semantic Type vs. Semantic Role

[[Human]] fire [[Firearm]] {at [[PhysObj = Target]]}

[[Human]] fire [[Projectile]] {at [[PhysObj = Target]]}

Bond walks into our sights and fires his pistol at the audience

The soldier fired a single shot at me

The Italian authorities claim that three US soldiers fired at the car.

  • ‘audience’ and ‘me’ have the semantic type [[Human]], and ‘car’ has the semantic type [[Vehicle]] (both < [[PhysObj]]).

  • The context assigns them the semantic role Target.


Lexical sets don’t map neatly onto semantic types

  • calm as a transitive (causative) verb (pattern 1):

  • What do you calm? 1 lexical set, 5 semantic types:

    • him, her, me, everyone: [[Human]]

    • fear, anger, temper, rage: [[Negative Feeling]]

    • mind: [[Psychological Entity]]

    • nerves, heart: [[Body Part]] (but not toes, chest, kidney)

    • breathing, breath: [[Living Entity Relational Process]] (but not defecation, urination)

      • words from at least 3 of these types are canonical members of the set of things that get calmed


Populating a semantic type with lexical items

Pattern:

  • [[Human 1 | Event]] calm [[Human 2 | Animal]]

    • Canonical lexical items for [[Human]]:

    • him, her, me, everyone, ...

    • Attributes of [[Human 2]] in this context:

    • fear, anger, temper, rage; mind; nerves, heart; breathing, breath
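A minimal sketch of how such a pattern and its slot populations might be stored; the representation is illustrative, not the Pattern Dictionary's actual format:

```python
# Illustrative encoding of the 'calm' pattern above: each argument slot allows
# alternating semantic types, and each type is populated with canonical lexical
# items drawn from corpus evidence. Not the real Pattern Dictionary format.

CALM_PATTERN = {
    "verb": "calm",
    "subject": {"types": ["Human 1", "Event"]},
    "object": {
        "types": ["Human 2", "Animal"],
        "canonical_items": ["him", "her", "me", "everyone"],
        # Attributes of [[Human 2]] that also occur as objects of 'calm':
        "attributes": ["fear", "anger", "temper", "rage",
                       "mind", "nerves", "heart", "breathing", "breath"],
    },
}

def is_canonical_object(word: str) -> bool:
    """Is the word a canonical filler (or attribute filler) of the object slot?"""
    obj = CALM_PATTERN["object"]
    return word in obj["canonical_items"] or word in obj["attributes"]

print(is_canonical_object("nerves"))   # True
print(is_canonical_object("kidney"))   # False -- [[Body Part]], but not calmed
```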


Why don’t ontologies help WSD?

  • Ontologies such as Roget and WordNet attempt to organize the lexicon as a representation of 2,500 years of Aristotelian scientific conceptualization of the universe.

  • This is not the same as investigating how people use words to make meanings.

  • Why ever did we think it would be?


Why WSD can’t be done (as currently formulated)

  • Because words (in isolation) don’t have meanings.

  • You’re looking for something that does not exist.

    • Words have meaning potentials.

    • The meaning potential of a word is activated by context (real-world context of utterance and co-text in a discourse).


What should be done instead

  • Compare each actual usage with an inventory of norms.

  • Best match wins.

  • Don’t look for the meaning of the word -- look for the meaning of the pattern.

  • Distinguish conventional, prototypical usage of words (norms) from creativity (exploitations).

  • To do this, we need an inventory of patterned norms.

  • The Pattern Dictionary of English Verbs will be such an inventory.

