
Augmenting WordNet for Deep Understanding of Text

Peter Clark, Phil Harrison, Bill Murray, John Thompson (Boeing), Christiane Fellbaum (Princeton Univ), Jerry Hobbs (ISI/USC).


Presentation Transcript


  1. Augmenting WordNet for Deep Understanding of Text Peter Clark, Phil Harrison, Bill Murray, John Thompson (Boeing) Christiane Fellbaum (Princeton Univ) Jerry Hobbs (ISI/USC)

  2. “Deep Understanding” • Not (just) parsing + word senses • Construction of a coherent representation of the scene the text describes • Challenge: much of that representation is not in the text. Example: “A soldier was killed in a gun battle” → “The soldier died”, “The soldier was shot”, “There was a fight”, …

  3. “Deep Understanding” “A soldier was killed in a gun battle” → “The soldier died”, “The soldier was shot”, “There was a fight”, … Because: A battle involves a fight. Soldiers use guns. Guns shoot. Guns can kill. If you are killed, you are dead. … How do we get this knowledge into the machine? How do we exploit it?

  4. “Deep Understanding” “A soldier was killed in a gun battle” → “The soldier died”, “The soldier was shot”, “There was a fight”, … Because: A battle involves a fight. Soldiers use guns. Guns shoot. Guns can kill. If you are killed, you are dead. … Several partially useful resources exist. WordNet is already used a lot… can we extend it?

  5. The Initial Vision • Our vision: • Rapidly expand WordNet to be more of a knowledge base • Question-answering software to demonstrate its use

  6. The Evolution of WordNet lexical resource • v1.0 (1986) • synsets (concepts) + hypernym (isa) links • v1.7 (2001) • add in additional relationships • has-part • causes • member-of • entails-doing (“subevent”) • v2.0 (2003) • introduce the instance/class distinction • Paris isa Capital-City is-type-of City • add in some derivational links • explode related-to explosion • … • v10.0 (200?) • ????? knowledge base?

  7. Augmenting WordNet • World Knowledge • Sense-disambiguate the glosses (by hand) • Convert the glosses to logic • Similar to LCC’s Extended WordNet attempt • Axiomatize “core theories” • WordNet links • Morphosemantic links • Purpose links • Experiments

  8. Converting the Glosses to Logic “ambition#n2: A strong drive for success” • Convert gloss to form “word is gloss” • Parse (Charniak) • LFToolkit: Generate logical form fragments. Lexical output rules produce logical form fragments: “strong drive for success” → strong(x1) & drive(x2) & for(x3,x4) & success(x5)

  9. Converting the Glosses to Logic “ambition#n2: A strong drive for success” Convert gloss to form “word is gloss” Parse (Charniak) LFToolkit: Generate logical form fragments Identify equalities, add senses

  10. Converting the Glosses to Logic • Lexical output rules produce logical form fragments: “A strong drive for success” → strong(x1) & drive(x2) & for(x3,x4) & success(x5) • Composition rules identify variables: x1=x2, x2=x3, x4=x5 • Identify equalities, add senses

  11. Converting the Glosses to Logic “ambition#n2: A strong drive for success” • Convert gloss to form “word is gloss” • Parse (Charniak) • LFToolkit: Generate logical form fragments • Identify equalities, add senses. Result: ambition#n2(x1) → a(x1) & strong#a1(x1) & drive#n2(x1) & for(x1,x2) & success#a3(x2)
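The "identify equalities" step on slides 10 and 11 can be illustrated with a small sketch. This is not LFToolkit's implementation; it assumes (hypothetically) that fragments are (predicate, arguments) pairs and that equalities are merged with a simple union-find, after which variables are renumbered in order of first appearance. The function name `unify` is made up for illustration, and sense tags and the determiner predicate a(x1) are left out.

```python
def unify(fragments, equalities):
    """Merge variables declared equal, then renumber them x1, x2, ..."""
    parent = {}

    def find(v):
        # Follow parent links to the representative of v's equivalence class.
        parent.setdefault(v, v)
        while parent[v] != v:
            v = parent[v]
        return v

    for a, b in equalities:
        parent[find(b)] = find(a)

    names = {}  # representative -> fresh variable name
    unified = []
    for pred, args in fragments:
        new_args = []
        for arg in args:
            root = find(arg)
            if root not in names:
                names[root] = f"x{len(names) + 1}"
            new_args.append(names[root])
        unified.append((pred, tuple(new_args)))
    return unified


# Fragments and equalities as given on slide 10.
fragments = [
    ("strong", ("x1",)),
    ("drive", ("x2",)),
    ("for", ("x3", "x4")),
    ("success", ("x5",)),
]
equalities = [("x2", "x3"), ("x4", "x5"), ("x1", "x2")]

print(" & ".join(f"{p}({','.join(a)})" for p, a in unify(fragments, equalities)))
```

With the slide's three equalities, the five original variables collapse to two, which matches the variable structure of the final form on slide 11.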

  12. Converting the Glosses to Logic • Sometimes works well! • But often not. Primary problems: • Errors in the language processing • Only capture definitional knowledge • “Flowery” language, many gaps, metonymy, ambiguity; if logic closely follows syntax → “logico-babble”. Example: “hammer#n2: tool used to deliver an impulsive force by striking” hammer#n2(x1) → tool#n1(x1) & use#v1(e1,x2,x1) & to(e1,e2) & deliver#v2(e2,x3) & driving#a1(x3) & force#n1(x3) & by(e3,e4) & strike#v3(e4,x4). → Hammers hit things??

  13. Augmenting WordNet • World Knowledge • Sense-disambiguate the glosses (by hand) • Convert the glosses to logic • Axiomatize “core theories” • WordNet links • Morphosemantic links • Purpose links • Experiments

  14. Core Theories • Many domain-specific facts are instantiations of more general, “core” knowledge • By encoding this core knowledge, get leverage • eg 517 “vehicle” noun (senses), 185 “cover” verb (senses) • Approach: • Analysis and grouping of words in Core WordNet • Identification and encoding of underlying theories

  15. Core Theories Composite Entities: perfect, empty, relative, secondary, similar, odd, ... Scales: step, degree, level, intensify, high, major, considerable, ... Events: constraint, secure, generate, fix, power, development, ... Space: grade, inside, lot, top, list, direction, turn, enlarge, long, ... Time: year, day, summer, recent, old, early, present, then, often, ... Cognition: imagination, horror, rely, remind, matter, estimate, idea, ... Communication: journal, poetry, announcement, gesture, charter, ... Persons and their Activities: leisure, childhood, glance, cousin, jump, ... Microsocial: virtue, separate, friendly, married, company, name, ... Material World: smoke, shell, stick, carbon, blue, burn, dry, tough, ... Geo: storm, moon, pole, world, peak, site, village, sea, island, ... Artifacts: bell, button, van, shelf, machine, film, floor, glass, chair, ... Food: cheese, potato, milk, break, cake, meat, beer, bake, spoil, ... Macrosocial: architecture, airport, headquarters, prosecution, ... Economic: import, money, policy, poverty, profit, venture, owe, ...

  16. Augmenting WordNet • World Knowledge • Sense-disambiguate the glosses (by hand) • Convert the glosses to logic • Axiomatize “core theories” • WordNet links • Morphosemantic links • Purpose links • Experiments

  17. Morphosemantic Links • Often need to cross part-of-speech T: A council worker cleans up after Tuesday's violence in Budapest. H: There were attacks in Budapest on Tuesday. Can solve with WN’s derivation links: attack_v3 (“attack”) is connected to aggression_n4 (← “violence”) by the derivation link “aggress”/“aggression”.

  18. Morphosemantic Links • But can go wrong! T: Paying was slow H1: The transaction was slow H2: *The person was slow [NOT entailed] Derivation links: pay_v1 (“pay”) to payment_n1 (→ “transaction”) via “pay”/“payment”; pay_v1 (“pay”) to payer_n1 (→ “person”) via “pay”/“payer”. Problem: The type of relation matters for derivatives! (Event? Agent? ...) A pays B → the payment (event-noun) by A; A is the payer (agent-noun) of B; etc.

  19. Morphosemantic Links • Task: Classify the 22,000 links in WordNet: • Semi-automatic process • Exploit taxonomy and morphology • 15 semantic types used • agent, undergoer, instrument, result, material, destination, location, result, by-means-of, event, uses, state, property, body-part, vehicle.
      Verb Synset    Noun Synset     Relationship
      hammer_v1      hammer_n1       instrument
      execute_v1     execution_n1    event (equal)
      sign_v2        signatory_n1    agent
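A rough sketch of how typed links might be consulted before swapping a verb for a derived noun. The three entries are the rows from the table above; the dictionary layout and the helper names `nouns_for` and `can_paraphrase_event` are assumptions made for illustration, not the authors' data format.

```python
# (verb synset, noun synset) -> semantic type of the morphosemantic link.
# Entries are taken from the slide's table; the structure is hypothetical.
LINKS = {
    ("hammer_v1", "hammer_n1"): "instrument",
    ("execute_v1", "execution_n1"): "event",
    ("sign_v2", "signatory_n1"): "agent",
}


def nouns_for(verb, relation):
    """Return the nouns linked to `verb` by a link of the given type."""
    return [n for (v, n), r in LINKS.items() if v == verb and r == relation]


def can_paraphrase_event(verb, noun):
    """A verb may be replaced by a noun naming the same event only if the
    link between them is typed 'event' (not agent, instrument, ...)."""
    return LINKS.get((verb, noun)) == "event"


print(can_paraphrase_event("execute_v1", "execution_n1"))  # event link
print(can_paraphrase_event("sign_v2", "signatory_n1"))     # agent link
print(nouns_for("hammer_v1", "instrument"))
```

Checking the link type is what blocks the bad inference on the previous slide: an untyped derivation link would treat an agent noun and an event noun as equally valid paraphrases of the verb.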

  20. Experimentation

  21. Task: Recognizing Entailment • Experiment with WordNet, logical glosses, DIRT • Text interpretation to logic using Boeing’s NLP system: “A soldier was killed in a gun battle” Initial Logic: “soldier”(soldier01), “kill”(….. object(kill01,soldier01), “in”(kill01,battle01), modifier(battle01,gun01). Final Logic: isa(soldier01,soldier_n1), isa(…… object(kill01,soldier01) during(kill01,battle01) instrument(battle01,gun01) • Entailment: T → H if: • T is subsumed by H (“cat eats mouse” → “animal was eaten”) • An elaboration of T using inference rules is subsumed by H (“cat eats mouse” → “cat swallows mouse”) • No statistical similarity metrics
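The two entailment conditions above can be sketched in a few lines. This is a toy illustration, not Boeing's system: it assumes each sentence is reduced to a flat role-to-filler dictionary, and the hypernym table, the single inference rule, and all function names are hypothetical stand-ins for WordNet and for resources such as DIRT or the logical glosses.

```python
# Toy isa links (child -> parent), standing in for WordNet hypernyms.
HYPERNYMS = {"cat": "animal", "mouse": "animal"}

# Toy inference rules (verb -> verbs it licenses), e.g.
# "IF X eats Y THEN X swallows Y". Purely illustrative.
RULES = {"eat": ["swallow"]}


def isa(specific, general):
    """True if `specific` equals `general` or has it as an ancestor."""
    while specific is not None:
        if specific == general:
            return True
        specific = HYPERNYMS.get(specific)
    return False


def subsumed_by(t, h):
    """T is subsumed by H if every role in H is filled in T by
    something equal to or more specific than H's filler."""
    return all(role in t and isa(t[role], filler) for role, filler in h.items())


def elaborations(t):
    """Yield T itself plus one-step rewrites of its verb via RULES."""
    yield t
    for verb in RULES.get(t.get("verb"), []):
        yield {**t, "verb": verb}


def entails(t, h):
    return any(subsumed_by(e, h) for e in elaborations(t))


T = {"agent": "cat", "verb": "eat", "object": "mouse"}
print(entails(T, {"verb": "eat", "object": "animal"}))                     # "animal was eaten"
print(entails(T, {"agent": "cat", "verb": "swallow", "object": "mouse"}))  # via the toy rule
print(entails(T, {"agent": "mouse", "verb": "eat", "object": "cat"}))      # roles reversed
```

Because there is no similarity scoring, the answer is entirely determined by which isa links and rules are available, which is why missing or wrong rules show up directly as missed or false entailments.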

  22. Successful Examples with the Glosses • Good example 14.H4 T: Britain puts curbs on immigrant labor from Bulgaria and Romania. H: Britain restricted workers from Bulgaria.

  23. Successful Examples with the Glosses • Good example 14.H4 T: Britain puts curbs on immigrant labor from Bulgaria and Romania. H: Britain restricted workers from Bulgaria. WN: limit_v1: "restrict": place limits on. T: Britain puts curbs on immigrant labor from Bulgaria and Romania. H: Britain placed limits on workers from Bulgaria. → ENTAILED (correct)

  24. Successful Examples with the Glosses • Another (somewhat) good example 56.H3 T: The administration managed to track down the perpetrators. H: The perpetrators were being chased by the administration.

  25. Successful Examples with the Glosses • Another (somewhat) good example 56.H3 T: The administration managed to track down the perpetrators. H: The perpetrators were being chased by the administration. WN: hunt_v1 “hunt” “track down”: pursue for food or sport T: The administration managed to pursue the perpetrators [for food or sport!]. H: The perpetrators were being chased by the administration. → ENTAILED (correct)

  26. Unsuccessful examples with the glosses • More common: Being “tantalizingly close” 16.H3 T: Satomi Mitarai bled to death. H: His blood flowed out of his body.

  27. Unsuccessful examples with the glosses • More common: Being “tantalizingly close” 16.H3 T: Satomi Mitarai bled to death. H: His blood flowed out of his body. WordNet: bleed_v1: "shed blood", "bleed", "hemorrhage": lose blood from one's body So close! Need to also know: “lose liquid from container” → “liquid flows out of container” usually

  28. Unsuccessful examples with the glosses • More common: Being “tantalizingly close” 20.H2 T: The National Philharmonic orchestra draws large crowds. H: Large crowds were drawn to listen to the orchestra.

  29. Unsuccessful examples with the glosses • More common: Being “tantalizingly close” 20.H2 T: The National Philharmonic orchestra draws large crowds. H: Large crowds were drawn to listen to the orchestra. WordNet: orchestra = collection of musicians; musician: plays musical instrument; music = sound produced by musical instruments; listen = hear = perceive sound. So close!

  30. Success with Morphosemantic Links • Good example 66.H100 T: The Zoopraxiscope was invented by Mulbridge. H*: Mulbridge was the invention of the Zoopraxiscope. [NOT entailed] WordNet too permissive!: invent_v1 (“invent”) is connected to invention_n1 (“invention”) by the derivation link “invent”/“invention”. But need an agent (X verb Y → X is agent-noun of Y). Got: result-noun (“invention” is result of “invent”). So no entailment (correct!)

  31. Successful Examples with DIRT • Good example 54.H1 T: The president visited Iraq in September. H: The president traveled to Iraq. DIRT: IF Y is visited by X THEN X flocks to Y WordNet: "flock" is a type of "travel" → Entailed [correct]

  32. Unsuccessful Examples with DIRT • Bad rule 55.H100 T: The US troops stayed in Iraq although the war was over. H*: The US troops left Iraq when the war was over. [NOT entailed] DIRT: IF Y stays in X THEN Y leaves X → Entailed [incorrect]

  33. Overall Results • Note: Eschewing statistics! • BPI test suite (61%): “Straight-Forward”

  34. Overall Results • Note: Eschewing statistics! • BPI test suite (61%): Useful

  35. Overall Results • Note: Eschewing statistics! • BPI test suite (61%): Occasionally useful

  36. Overall Results • Note: Eschewing statistics! • BPI test suite (61%): Often useful but unreliable • RTE3: 55%

  37. Summary • “Understanding” • Constructing a coherent model of the scene being described • Much is implicit in text → Need lots of world knowledge • Augmenting WordNet • Made some steps forward: • More connectivity • Logicalized glosses • But still need a lot more knowledge!

  38. Thank you!
