
Putting Meaning into Your Trees


Presentation Transcript


  1. Putting Meaning into Your Trees Martha Palmer CIS630 September 13, 2004

  2. Meaning? • Complete representation of real-world knowledge - Natural Language Understanding (NLU)? • Can only build useful representations for small vocabularies • Major impediment to accurate Machine Translation, Information Retrieval and Question Answering

  3. Outline • Introduction • Background: WordNet, Levin classes, VerbNet • Proposition Bank • Captures shallow semantics • Associated lexical frame files • Supports training of an automatic tagger • Mapping PropBank to VerbNet • Mapping PropBank to WordNet • Future directions

  4. Ask Jeeves – A Q/A, IR ex. What do you call a successful movie? • Tips on Being a Successful Movie Vampire ... I shall call the police. • Successful Casting Call & Shoot for "Clash of Empires" ... thank everyone for their participation in the making of yesterday's movie. • Demme's casting is also highly entertaining, although I wouldn't go so far as to call it successful. This movie's resemblance to its predecessor is pretty vague... • VHS Movies: Successful Cold Call Selling: Over 100 New Ideas, Scripts, and Examples from the Nation's Foremost Sales Trainer. Blockbuster

  5. Ask Jeeves – filtering w/ POS tag What do you call a successful movie? • Tips on Being a Successful Movie Vampire ... I shall call the police. • Successful Casting Call & Shoot for "Clash of Empires" ... thank everyone for their participation in the making of yesterday's movie. • Demme's casting is also highly entertaining, although I wouldn't go so far as to call it successful. This movie's resemblance to its predecessor is pretty vague... • VHS Movies: Successful Cold Call Selling: Over 100 New Ideas, Scripts, and Examples from the Nation's Foremost Sales Trainer.

  6. Filtering out "call the police" • Different senses: different syntax, different participants • call(you, movie, what) ≠ call(you, police)
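
To make the filtering idea concrete, here is a minimal, purely illustrative Python sketch (the Proposition class and the argument fillers are invented for this note, not part of any Ask Jeeves or PropBank API): the query and the "call the police" hit share a verb but not an argument structure, so the hit can be discarded.

```python
# Illustrative sketch: represent each occurrence of a verb as a predicate with
# its participants, and filter hits whose argument structure differs from the
# query's. The Proposition class is hypothetical, for this note only.
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Proposition:
    predicate: str
    args: Tuple[str, ...]          # ordered participants

    def matches(self, other: "Proposition") -> bool:
        # Same predicate and same number of participants -> plausibly same sense.
        return self.predicate == other.predicate and len(self.args) == len(other.args)

query = Proposition("call", ("you", "movie", "what"))   # What do you call a successful movie?
hit   = Proposition("call", ("you", "police"))          # I shall call the police.

print(query.matches(hit))   # False: different participant structure, different sense
```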

  7. Machine Translation Lexical Choice- Word Sense Disambiguation • Iraq lost the battle. • Ilakuka centwey ciessta. • [Iraq ] [battle] [lost]. • John lost his computer. • John-i computer-lul ilepelyessta. • [John] [computer] [misplaced].

  8. Cornerstone: English lexical resource • That provides sets of possible syntactic frames for verbs. • And provides clear, replicable sense distinctions. AskJeeves: Who do you call for a good electronic lexical database for English?

  9. WordNet – Princeton (Miller 1985, Fellbaum 1998) On-line lexical reference (dictionary) • Nouns, verbs, adjectives, and adverbs grouped into synonym sets • Other relations include hypernyms (ISA), antonyms, meronyms • Typical top nodes - 5 out of 25 • (act, action, activity) • (animal, fauna) • (artifact) • (attribute, property) • (body, corpus)

  10. WordNet – call, 28 senses • 1. name, call -- (assign a specified, proper name to; "They named their son David"; …) -> LABEL • 2. call, telephone, call up, phone, ring -- (get or try to get into communication (with someone) by telephone; "I tried to call you all night"; …) -> TELECOMMUNICATE • 3. call -- (ascribe a quality to or give a name of a common noun that reflects a quality; "He called me a bastard"; …) -> LABEL • 4. call, send for -- (order, request, or command to come; "She was called into the director's office"; "Call the police!") -> ORDER
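
The sense inventory above can be browsed programmatically. A small sketch using NLTK's WordNet interface (assuming nltk is installed and the wordnet data has been downloaded; sense numbering varies across WordNet versions):

```python
# A sketch of listing the verb senses of "call" with NLTK's WordNet reader.
# Requires: pip install nltk; then nltk.download('wordnet').
from nltk.corpus import wordnet as wn

for i, synset in enumerate(wn.synsets('call', pos=wn.VERB), start=1):
    lemmas = ", ".join(l.name() for l in synset.lemmas())
    hypers = ", ".join(h.name() for h in synset.hypernyms())
    print(f"{i}. {lemmas} -- {synset.definition()} -> {hypers}")
```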

  11. WordNet – Princeton (Miller 1985, Fellbaum 1998) • Limitations as a computational lexicon • Contains little syntactic information • Comlex has syntax but no sense distinctions • No explicit lists of participants • Sense distinctions very fine-grained, • Definitions often vague • Causes problems with creating training data for supervised Machine Learning – SENSEVAL2 • Verbs > 16 senses (including call) • Inter-annotator Agreement ITA 73%, • Automatic Word Sense Disambiguation, WSD 60.2% Dang & Palmer, SIGLEX02

  12. WordNet: call, 28 senses [Diagram: the 28 WordNet senses of call, WN1–WN28, shown without group labels]

  13. WordNet: call, 28 senses, Senseval2 groups (engineering!) [Diagram: the 28 senses clustered into labeled groups: Loud cry, Bird or animal cry, Request, Label, Call a loan/bond, Challenge, Visit, Phone/radio, Bid]

  14. Grouping improved scores: ITA 82%, MaxEnt WSD 69% • Call: 31% of errors due to confusion between senses within same group 1: • name, call -- (assign a specified, proper name to; They named their son David) • call -- (ascribe a quality to or give a name of a common noun that reflects a quality; He called me a bastard) • call -- (consider or regard as being; I would not call her beautiful) • 75% with training and testing on grouped senses vs. • 43% with training and testing on fine-grained senses Palmer, Dang, Fellbaum, submitted, NLE
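
A hedged sketch of what the grouped evaluation amounts to: fine-grained senses are mapped onto coarser groups before system output is compared to the gold key. The group table below is a toy stand-in, not the actual Senseval-2 grouping.

```python
# Toy sketch: scoring WSD output at the fine-grained level vs. after mapping
# senses to coarser groups. SENSE_GROUPS is illustrative, not the real grouping.
SENSE_GROUPS = {"call%1": "label", "call%3": "label", "call%23": "label",
                "call%2": "phone", "call%4": "request"}

def accuracy(gold, pred, groups=None):
    if groups:
        gold = [groups.get(s, s) for s in gold]
        pred = [groups.get(s, s) for s in pred]
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

gold = ["call%1", "call%2", "call%4"]
pred = ["call%3", "call%2", "call%1"]
print(accuracy(gold, pred))                 # 0.33 at the fine-grained level
print(accuracy(gold, pred, SENSE_GROUPS))   # 0.67 after grouping
```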

  15. Groups • Based on VerbNet, an English lexical resource under development, • which is in turn based on Levin's English verb classes…

  16. Levin classes (Levin, 1993) • 3100 verbs, 47 top level classes, 193 second and third level • Each class has a syntactic signature based on alternations. John broke the jar. / The jar broke. / Jars break easily. John cut the bread. / *The bread cut. / Bread cuts easily. John hit the wall. / *The wall hit. / *Walls hit easily.

  17. Levin classes (Levin, 1993) • Verb class hierarchy: 3100 verbs, 47 top level classes, 193 • Each class has a syntactic signature based on alternations. John broke the jar. / The jar broke. / Jars break easily. change-of-state John cut the bread. / *The bread cut. / Bread cuts easily. change-of-state, recognizable action, sharp instrument John hit the wall. / *The wall hit. / *Walls hit easily. contact, exertion of force
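
One way to picture a Levin-style class is as a signature over alternations. The sketch below is illustrative only; the judgments mirror the break/cut/hit examples above, and the class IDs follow Levin's commonly cited numbering.

```python
# Illustrative sketch: a Levin-style class is characterized by which syntactic
# alternations its members allow.
LEVIN_SIGNATURES = {
    "break-45.1": {"causative/inchoative": True,  "middle": True},   # The jar broke. / Jars break easily.
    "cut-21.1":   {"causative/inchoative": False, "middle": True},   # *The bread cut. / Bread cuts easily.
    "hit-18.1":   {"causative/inchoative": False, "middle": False},  # *The wall hit. / *Walls hit easily.
}

def classes_allowing(alternation):
    """Classes whose syntactic signature licenses the given alternation."""
    return [cls for cls, sig in LEVIN_SIGNATURES.items() if sig.get(alternation, False)]

print(classes_allowing("middle"))                  # ['break-45.1', 'cut-21.1']
print(classes_allowing("causative/inchoative"))    # ['break-45.1']
```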

  18. Limitations to Levin Classes • Coverage of only half of the verbs (types) in the Penn Treebank (1M words, WSJ) • Usually only one or two basic senses are covered for each verb • Confusing sets of alternations • Different classes have almost identical “syntactic signatures” • or worse, contradictory signatures Dang, Kipper & Palmer, ACL98

  19. Multiple class listings • Homonymy or polysemy? • draw a picture, draw water from the well • Conflicting alternations? • Carry verbs disallow the Conative (*she carried at the ball), but the class includes {push, pull, shove, kick, yank, tug} • which are also in the Push/Pull class, which does take the Conative (she kicked at the ball)

  20. Intersective Levin Classes Dang, Kipper & Palmer, ACL98 [Diagram: intersecting classes distinguished by frames such as “apart” (CH-STATE), “across the room” (CH-LOC), and “at” (¬CH-LOC)]

  21. Intersective Levin Classes • More syntactically and semantically coherent • sets of syntactic patterns • explicit semantic components • relations between senses • VERBNET www.cis.upenn.edu/verbnet

  22. VerbNet – Karin Kipper • Class entries: • Capture generalizations about verb behavior • Organized hierarchically • Members have common semantic elements, semantic roles and syntactic frames • Verb entries: • Refer to a set of classes (different senses) • each class member linked to WN synset(s) (not all WN senses are covered) Dang, Kipper & Palmer, IJCAI00, Coling00

  23. Semantic role labels: Grace broke the LCD projector. • break(agent(Grace), patient(LCD-projector)) • cause(agent(Grace), change-of-state(LCD-projector)), broken(LCD-projector) • agent(A) -> intentional(A), sentient(A), causer(A), affector(A) • patient(P) -> affected(P), change(P), …

  24. VerbNet entry for leave – Levin class: future_having-13.3 • WordNet Senses: leave (WN 2,10,13), promise, offer, … • Thematic Roles: Agent[+animate OR +organization], Recipient[+animate OR +organization], Theme[] • Frames with Semantic Roles: "I promised somebody my time" Agent V Recipient Theme; "I left my fortune to Esmerelda" Agent V Theme Prep(to) Recipient; "I offered my services" Agent V Theme
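
VerbNet can also be queried through NLTK's corpus reader. The sketch below assumes nltk and the verbnet data are installed; the exact class IDs, members, and role inventories depend on the VerbNet version bundled with NLTK.

```python
# A sketch of looking up the VerbNet classes for "leave" with NLTK.
# Requires: pip install nltk; then nltk.download('verbnet').
from nltk.corpus import verbnet

for classid in verbnet.classids('leave'):       # e.g. 'future_having-13.3', ...
    vnclass = verbnet.vnclass(classid)          # the class as an XML element
    roles = [r.attrib['type'] for r in vnclass.findall('THEMROLES/THEMROLE')]
    members = [m.attrib['name'] for m in vnclass.findall('MEMBERS/MEMBER')]
    print(classid, roles, members[:5])
```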

  25. PropBank Handmade resources vs. Real data • VerbNet is based on linguistic theory – how useful is it? • How well does it correspond to syntactic variations found in naturally occurring text?

  26. Proposition Bank: From Sentences to Propositions (Predicates!) • Powell met Zhu Rongji / Powell and Zhu Rongji met / Powell met with Zhu Rongji / Powell and Zhu Rongji had a meeting → Proposition: meet(Powell, Zhu Rongji) • Similar predicates: debate, consult, join, wrestle, battle, … → meet(Somebody1, Somebody2) . . . • When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane. → meet(Powell, Zhu) discuss([Powell, Zhu], return(X, plane))

  27. Capturing semantic roles* • Jerry broke [PATIENT the laser pointer]. • [PATIENT The windows] were broken by the hurricane. • [PATIENT The vase] broke into pieces when it toppled over. (The grammatical SUBJ is a different constituent in each sentence.)

  28. Capturing semantic roles* • Jerry broke [ARG1 the laser pointer]. • [ARG1 The windows] were broken by the hurricane. • [ARG1 The vase] broke into pieces when it toppled over. *See also FrameNet, http://www.icsi.berkeley.edu/~framenet/

  29. A TreeBanked phrase: A GM-Jaguar pact would give the U.S. car maker an eventual 30% stake in the British company. [Parse tree over the sentence with constituents: S, VP, NP "a GM-Jaguar pact", NP "the US car maker", NP "an eventual 30% stake in the British company", and a PP-LOC]

  30. The same phrase, PropBanked: A GM-Jaguar pact would give the U.S. car maker an eventual 30% stake in the British company. • Arg0: a GM-Jaguar pact • Arg2: the US car maker • Arg1: an eventual 30% stake in the British company • give(GM-J pact, US car maker, 30% stake)

  31. Frames File example: give Roles: Arg0: giver Arg1: thing given Arg2: entity given to Example: double object The executives gave the chefs a standing ovation. Arg0: The executives REL: gave Arg2: the chefs Arg1: a standing ovation
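
The frames files can be read with NLTK's PropBank corpus reader. A minimal sketch (assuming nltk and the propbank data are available) that prints the roles of the give.01 frameset shown above:

```python
# A sketch of reading the give.01 frames-file entry with NLTK's PropBank reader.
# Requires: pip install nltk; then nltk.download('propbank').
from nltk.corpus import propbank

roleset = propbank.roleset('give.01')           # XML element for the frameset
print(roleset.attrib['name'])                   # short gloss of the frameset
for role in roleset.findall('roles/role'):
    print('Arg' + role.attrib['n'], '-', role.attrib['descr'])
```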

  32. Annotation procedure • PTB II – Extract all sentences of a verb • Create Frame File for that verb Paul Kingsbury • (3100+ lemmas, 4700 framesets, 120K predicates) • First pass: Automatic tagging Joseph Rosenzweig • Second pass: Double blind hand correction • Inter-annotator agreement 84% • Third pass: Solomonization (adjudication) • Olga Babko-Malaya

  33. Annotator accuracy – ITA 84%

  34. Trends in Argument Numbering • Arg0 = proto-typical agent (Dowty) • Arg1 = proto-typical patient • Arg2 = indirect object / benefactive / instrument / attribute / end state • Arg3 = start point / benefactive / instrument / attribute • Arg4 = end point

  35. Additional tags (arguments or adjuncts?) • Variety of ArgM’s (Arg#>4): • TMP - when? • LOC - where at? • DIR - where to? • MNR - how? • PRP - why? • REC - himself, themselves, each other • PRD - this argument refers to or modifies another • ADV - others

  36. Inflection, etc. • Verbs also marked for tense/aspect • Passive/Active • Perfect/Progressive • Third singular (is has does was) • Present/Past/Future • Infinitives/Participles/Gerunds/Finites • Modals and negations marked as ArgMs

  37. PropBank/FrameNet • Buy: Arg0: buyer, Arg1: goods, Arg2: seller, Arg3: rate, Arg4: payment • Sell: Arg0: seller, Arg1: goods, Arg2: buyer, Arg3: rate, Arg4: payment • Broader, more neutral, more syntactic – maps readily to VN, TR, FN • Rambow, et al, PMLB03

  38. Outline • Introduction • Background: WordNet, Levin classes, VerbNet • Proposition Bank • Captures shallow semantics • Associated lexical frame files • Supports training of an automatic tagger • Mapping PropBank to VerbNet • Mapping PropBank to WordNet

  39. Approach • Pre-processing: • A heuristic which filters out unwanted constituents with significant confidence • Argument Identification • A binary SVM classifier which identifies arguments • Argument Classification • A multi-class SVM classifier which tags arguments as ARG0-5, ARGA, and ARGM
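
A hedged sketch of that three-stage design in scikit-learn terms, not the Penn system itself: a pruning heuristic, a binary identifier over the surviving constituents, and a multi-class labeler over the identified arguments. The feature dictionaries and tiny training sets are invented placeholders.

```python
# Illustrative pipeline sketch: prune candidate constituents, identify arguments
# with a binary SVM, then label them with a multi-class SVM.
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def prune(candidates):
    # Heuristic filter, e.g. drop punctuation-only nodes.
    return [c for c in candidates if c["phrase_type"] != "PUNCT"]

identifier = make_pipeline(DictVectorizer(), LinearSVC())   # ARG vs. NON-ARG
labeler    = make_pipeline(DictVectorizer(), LinearSVC())   # ARG0..ARG5, ARGA, ARGM

# Tiny invented training sets, just to make the sketch runnable.
identifier.fit([{"phrase_type": "NP", "position": "before"},
                {"phrase_type": "ADVP", "position": "after"}], ["ARG", "NON-ARG"])
labeler.fit([{"phrase_type": "NP", "position": "before"},
             {"phrase_type": "NP", "position": "after"}], ["ARG0", "ARG1"])

candidates = [{"phrase_type": "NP", "position": "before"},
              {"phrase_type": "PUNCT", "position": "after"}]
args = [c for c in prune(candidates) if identifier.predict([c])[0] == "ARG"]
print([labeler.predict([c])[0] for c in args])
```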

  40. Automatic Semantic Role Labeling Gildea & Jurafsky, CL02, Gildea & Palmer, ACL02 Stochastic Model • Basic Features: • Predicate, (verb) • Phrase Type, (NP or S-BAR) • Parse Tree Path • Position (Before/after predicate) • Voice (active/passive) • Head Word of constituent • Subcategorization
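
Of the basic features, the parse tree path is the least obvious. The sketch below computes it with NLTK's Tree class; the ^ and ! separators for upward and downward steps are arbitrary choices for this note.

```python
# A sketch of the "parse tree path" feature: phrase labels from the argument
# constituent up to the lowest common ancestor and down to the predicate.
from nltk.tree import Tree

def parse_tree_path(tree, arg_position, pred_position):
    """'^' marks upward steps, '!' downward steps, e.g. NP^S!VP!VBD."""
    common = 0
    while (common < min(len(arg_position), len(pred_position))
           and arg_position[common] == pred_position[common]):
        common += 1
    up = [tree[arg_position[:i]].label() for i in range(len(arg_position), common - 1, -1)]
    down = [tree[pred_position[:i]].label() for i in range(common + 1, len(pred_position) + 1)]
    return '^'.join(up) + '!' + '!'.join(down)

t = Tree.fromstring("(S (NP (DT The) (NN pact)) (VP (VBD gave) (NP (DT a) (NN stake))))")
print(parse_tree_path(t, arg_position=(0,), pred_position=(1, 0)))   # NP^S!VP!VBD
```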

  41. Discussion Part I – Szuting Yi • Comparisons between Pradhan and Penn (SVM) • Both systems are SVM-based • Kernel: Pradhan uses a degree 2 polynomial kernel; Penn uses a degree 3 RBF kernel • Multi-classification: Pradhan uses a one-versus-others approach; Penn uses a pairwise approach • Features: Pradhan uses a richer feature set, including NE, head word POS, partial path, verb classes, verb sense, head word of PP, first or last word/POS in the constituent, constituent tree distance, constituent relative features, temporal cue words, and dynamic class context (Pradhan et al, 2004)

  42. Xue & Palmer, EMNLP04 Discussion Part II • Different features for different subtasks • Basic features analysis

  43. Discussion Part III (New Features – Bert Xue) • Syntactic frame • use NPs as “pivots” • varying with position within the frame • lexicalization with predicate • Predicate + • head word • phrase type • head word of PP parent • Position + voice
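
Most of these new features are conjunctions of existing ones. A small illustrative sketch of adding such combined features to a basic feature dictionary (feature names here follow the descriptions above, not any particular system's implementation):

```python
# Illustrative sketch: building conjoined features such as predicate+head word
# and position+voice on top of a basic feature dictionary.
def add_conjoined_features(feats):
    out = dict(feats)
    out["pred+head"] = feats["predicate"] + "|" + feats["head_word"]
    out["pred+ptype"] = feats["predicate"] + "|" + feats["phrase_type"]
    out["position+voice"] = feats["position"] + "|" + feats["voice"]
    return out

basic = {"predicate": "give", "head_word": "maker", "phrase_type": "NP",
         "position": "after", "voice": "active"}
print(add_conjoined_features(basic))
```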

  44. Results

  45. Word Senses in PropBank • Orders to ignore word sense not feasible for 700+ verbs • Mary left the room • Mary left her daughter-in-law her pearls in her will • Frameset leave.01 "move away from": Arg0: entity leaving, Arg1: place left • Frameset leave.02 "give": Arg0: giver, Arg1: thing given, Arg2: beneficiary • How do these relate to traditional word senses in VerbNet and WordNet?

  46. Frames: Multiple Framesets • Out of the 787 most frequent verbs: • 1 frameset – 521 • 2 framesets – 169 • 3+ framesets – 97 (includes light verbs) • 90% ITA • Framesets are not necessarily consistent between different senses of the same verb • Framesets are consistent between different verbs that share similar argument structures (like FrameNet)

  47. Ergative/Unaccusative Verbs Roles (no ARG0 for unaccusative verbs) Arg1= Logical subject, patient, thing rising Arg2 = EXT, amount risen Arg3* = start point Arg4= end point Sales rose 4% to $3.28 billion from $3.16 billion. The Nasdaq composite index added 1.01 to 456.6 on paltry volume.

  48. Mapping from PropBank to VerbNet

  49. Mapping from PB to VerbNet

  50. Mapping from PropBank to VerbNet • Overlap with PropBank framesets • 50,000 PropBank instances • < 50% VN entries, > 85% VN classes • Results • MATCH - 78.63%. (80.90% relaxed) • (VerbNet isn’t just linguistic theory!) • Benefits • Thematic role labels and semantic predicates • Can extend PropBank coverage with VerbNet classes • WordNet sense tags Kingsbury & Kipper, NAACL03, Text Meaning Workshop http://www.cs.rochester.edu/~gildea/VerbNet/
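
In practice such a mapping can be stored as a lookup from a PropBank frameset and argument label to a VerbNet class and thematic role. The table below is a hand-written illustration for give.01, not an excerpt from the released mapping.

```python
# Illustrative sketch of a frameset-level PropBank -> VerbNet role mapping.
# The entries are written for this note, not taken from the released tables.
PB_TO_VN = {
    ("give.01", "ARG0"): ("give-13.1", "Agent"),
    ("give.01", "ARG1"): ("give-13.1", "Theme"),
    ("give.01", "ARG2"): ("give-13.1", "Recipient"),
}

def vn_role(frameset, arg):
    # Fall back to the PropBank label when no VerbNet role is listed.
    return PB_TO_VN.get((frameset, arg), (None, arg))

print(vn_role("give.01", "ARG2"))      # ('give-13.1', 'Recipient')
print(vn_role("give.01", "ARGM-TMP"))  # (None, 'ARGM-TMP')
```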
