
Getting From the Utterance to the SemSpec (Johno Bryant)



Presentation Transcript


  1. Getting From the Utterance to the SemSpec (Johno Bryant) • Need a grammar formalism • Embodied Construction Grammar (Bergen & Chang 2002) • Need new models for language analysis • Traditional methods are too limited • Traditional methods also don’t get enough leverage out of the semantics.

  2. Embodied Construction Grammar • Semantic Freedom • Designed to be symbiotic with cognitive approaches to meaning • More expressive semantic operators than traditional grammar formalisms • Form Freedom • Free word order, over-lapping constituency • Precise enough to be implemented

  3. Traditional Parsing Methods Fall Short • PSG parsers too strict • Constructions not allowed to leave constituent order unspecified • Traditional way of dealing with incomplete analyses is ad-hoc • Making sense of incomplete analyses is important when an application must deal with “ill-formed” input (For example, modeling language learning) • Traditional unification grammar can’t handle ECG’s deep semantic operators.

  4. Our Analyzer • Replaces the FSMs used in traditional chunking (Abney 1996) with much more powerful machines, capable of backtracking, called construction recognizers • Arranges these recognizers into levels, just as in Abney’s work • But uses a chart to deal with ambiguity

  5. Our Analyzer (cont’d) • Uses specialized feature structures to deal with ECG’s novel semantic operators • Supports a heuristic evaluation metric for finding the “right” analysis • Puts partial analyses together when no complete analyses are available • The analyzer was designed under the assumption that the grammar won’t cover every meaningful utterance encountered by the system.

  6. System Architecture (diagram): the Grammar and Utterance feed the Semantic Chunker, which fills a Chunk Chart; Semantic Integration then produces Ranked Analyses, which feed the Learner.

  7. The Levels • The analyzer puts the recognizer on the level assigned by the grammar writer. • Assigned level should be greater than or equal to the levels of the construction’s constituents. • The analyzer runs all the recognizers on level 1, then level 2, etc. until no more levels. • Recognizers on the same level can be mutually recursive.
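The level-by-level control loop described above can be sketched as follows. This is a minimal sketch with illustrative names; the real recognizers are backtracking machines, not plain functions:

```python
# Minimal sketch of level-based analysis (names are illustrative):
# run every recognizer on level 1, then level 2, and so on, adding
# successful matches to a shared chart that handles ambiguity.

def analyze(utterance, recognizers_by_level):
    chart = []  # completed construction matches, e.g. (label, span)
    for level in sorted(recognizers_by_level):
        # Recognizers on the same level may be mutually recursive,
        # so iterate until this level produces no new matches.
        changed = True
        while changed:
            changed = False
            for recognize in recognizers_by_level[level]:
                for match in recognize(utterance, chart):
                    if match not in chart:
                        chart.append(match)
                        changed = True
    return chart
```

A recognizer here is any callable that inspects the utterance and the chart built by lower levels and returns new matches.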

  8. Recognizers • Each Construction is turned into a recognizer • Recognizer = active representation • seeks form elements/constituents when initiated • Unites grammar and process - grammar isn’t just a static piece of knowledge in this model. • Checks both form and semantic constraints • Contains an internal representation of both the semantics and the form • A graph data structure used to represent the form and a feature structure representation for the meaning.
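The constituent-graph bookkeeping a recognizer performs (tracking which constituents have been found and which may come next) can be sketched like this; the class and method names are assumptions for illustration, not the actual implementation:

```python
# Illustrative sketch of a recognizer's constituent-graph state:
# the graph records which constituents precede which others, and a
# constituent becomes a candidate once all its predecessors are found.

class Recognizer:
    def __init__(self, name, graph):
        self.name = name
        self.graph = graph   # constituent -> set of constituents it precedes
        self.found = []      # constituents recognized so far, in order

    def next_candidates(self):
        # Constituents not yet found whose predecessors are all found.
        preds = {c: {p for p, succs in self.graph.items() if c in succs}
                 for c in self.graph}
        return [c for c in self.graph
                if c not in self.found and preds[c] <= set(self.found)]

    def accept(self, constituent):
        if constituent in self.next_candidates():
            self.found.append(constituent)
            return True
        return False
```

For the Caused-Motion example that follows, the graph would say the Agent precedes the Action, which precedes both the Patient and the Path.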

  9. Recognizer Example: “Mary kicked the ball into the net.” The initial Constituent Graph for Caused-Motion has the nodes Agent, Action, Patient, and Path.

  10. Recognizer Example Construct: Caused-Motion Constituent: Agent Constituent: Action Constituent: Patient Constituent: Path The initial constructional tree for the instance of Caused-Motion that we are trying to create.

  11. Recognizer Example

  12. Recognizer Example: “Mary” processed in “Mary kicked the ball into the net.” A node filled with gray is removed from the Constituent Graph (nodes: Agent, Action, Patient, Path).

  13. Recognizer Example Construct: Caused-Motion RefExp: Mary Constituent: Action Constituent: Patient Constituent: Path Mary kicked the ball into the net.

  14. Recognizer Example

  15. Recognizer Example (Constituent Graph for “Mary kicked the ball into the net.”; nodes: Agent, Action, Patient, Path; processed portion marked)

  16. Recognizer Example Construct: Caused-Motion RefExp: Mary Verb: kicked Constituent: Patient Constituent: Path Mary kicked the ball into the net.

  17. Recognizer Example

  18. Recognizer Example: “Mary kicked the ball into the net.” According to the Constituent Graph (nodes: Agent, Action, Patient, Path), the next constituent can be either the Patient or the Path.

  19. Recognizer Example (Constituent Graph for “Mary kicked the ball into the net.”; nodes: Agent, Action, Patient, Path; processed portion marked)

  20. Recognizer Example Construct: Caused-Motion RefExp: Mary Verb: kicked RefExp: Det Noun Constituent: Path Det Noun Mary kicked the ball into the net.

  21. Recognizer Example

  22. Recognizer Example (Constituent Graph for “Mary kicked the ball into the net.”; nodes: Agent, Action, Patient, Path; processed portion marked)

  23. Recognizer Example Construct: Caused-Motion RefExp: Mary Verb: kicked RefExp: Det Noun Spatial-Pred: Prep RefExp RefExp Det Noun Prep Det Noun Mary kicked the ball into the net.

  24. Recognizer Example

  25. Resulting SemSpec After analyzing the sentence, the following identities are asserted in the resulting SemSpec: Scene = Caused-Motion; Agent = Mary; Action = Kick; Patient = Path.Trajector = the ball; Path = into the net; Path.Goal = the net

  26. Chunking example: “the woman in the lab coat thought you were sleeping”
      L0: D N P D N N V-tns Pron Aux V-ing
      L1: NP  P NP  VP  NP  VP
      L2: NP  PP  VP  NP  VP
      L3: S  S

  27. Construction Recognizers Example: “You want to put a cloth on your hand?” Form <-> Meaning pairings (each recognized as an NP): PP$,N <-> [Hand num:sg poss:addr]; D,N <-> [Cloth num:sg]; “you” <-> [Addressee]. Like Abney: one recognizer per rule; bottom-up and level-based. Unlike Abney: recognizers check both form and semantics; more powerful (but slower) than FSMs.

  28. Chunk Chart • Interface between chunking and structure merging • Each edge is linked to its corresponding semantics. You want to put a cloth on your hand ?

  29. Combining Partial Parses • Prefer an analysis that spans the input utterance with the minimum number of chunks. • When no spanning analysis exists, however, we still have a chart full of semantic chunks. • The system tries to build a coherent analysis out of these semantic chunks. • This is where structure merging comes in.
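The “minimum number of chunks” preference above can be sketched as a shortest-path search over the chunk chart. The edge format and function name are assumptions for illustration:

```python
# Sketch of the fewest-chunks preference: given chart edges
# (start, end, label) over word positions 0..n, find a chunk
# sequence spanning the utterance with the fewest edges.

def min_chunk_cover(edges, n):
    best = {0: []}  # position -> cheapest chunk sequence reaching it
    for pos in range(n + 1):
        if pos not in best:
            continue
        for (start, end, label) in edges:
            if start == pos:
                cand = best[pos] + [label]
                if end not in best or len(cand) < len(best[end]):
                    best[end] = cand
    return best.get(n)  # None when no spanning analysis exists
```

When the result is None, the analyzer falls back on merging the partial chunks, as described next.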

  30. Structure Merging • Closely related to abductive inference mechanisms (Hobbs) • Unify compatible structures (find fillers for frame roles) • Intuition: unify structures that would have been co-indexed had the missing construction been defined. • There are exponentially many ways to merge structures (the problem is NP-hard), but heuristics cut down the search space.

  31. Structure Merging Example Utterance: “You used to hate to have the bib put on.” Before merging: [Bib < Clothing num:sg givenness:def] and Caused-Motion-Action (Agent: [Animate], Patient: [Entity], Path: On). After merging: Caused-Motion-Action (Agent: [Addressee < Animate], Patient: [Bib < Clothing num:sg givenness:def], Path: On).

  32. Semantic Density • Semantic density is a simple heuristic for choosing between competing analyses. • Density of an analysis = (filled roles) / (total roles) • The system prefers higher-density analyses because higher density suggests that more frame roles are filled than in competing analyses. • Extremely simple yet useful, though it can certainly be improved upon.
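The density metric as defined above is a one-liner:

```python
# Semantic density = filled roles / total roles.

def semantic_density(roles):
    """roles: mapping from role name to filler (None if unfilled)."""
    filled = sum(1 for filler in roles.values() if filler is not None)
    return filled / len(roles) if roles else 0.0
```

For example, an analysis filling Agent but not Patient scores 0.5 and loses to one that fills both.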

  33. Summary: ECG • Linguistic constructions are tied to a model of simulated action and perception • Embedded in a theory of language processing • Constrains theory to be usable • Frees structures to be just structures, used in processing • Precise, computationally usable formalism • Practical computational applications, like MT and NLU • Testing of functionality, e.g. language learning • A shared theory and formalism for different cognitive mechanisms • Constructions, metaphor, mental spaces, etc.

  34. Issues in Scaling up to Language • Knowledge • Lexicon (FrameNet) • Constructicon (ECG) • Maps (Metaphors, Metonymies) (MetaNet) • Conceptual Relations (Image Schemas, X-schemas) • Computation • Representation (ECG) • expressiveness, modularity, compositionality • Inference (Simulation Semantics) • tractable, distributed, probabilistic concurrent, context-sensitive

  35. A Best-Fit Approach for Productive Analysis of Omitted Arguments Eva Mok & John Bryant University of California, Berkeley International Computer Science Institute

  36. Simplify grammar by exploiting the language understanding process • Omission of arguments in Mandarin Chinese • Construction grammar framework • Model of language understanding • Our best-fit approach

  37. Productive Argument Omission (in Mandarin) 1. Mother (I) give you this (a toy). 2. You give auntie [the peach]. 3. Oh (go on)! You give [auntie] [that]. 4. [I] give [you] [some peach]. CHILDES Beijing Corpus (Tardiff, 1993; Tardiff, 1996)

  38. Arguments are omitted with different probabilities All arguments omitted: 30.6% No arguments omitted: 6.1%

  39. Construction grammar approach • Kay & Fillmore 1999; Goldberg 1995 • Grammaticality: form and function • Basic unit of analysis: construction, i.e. a pairing of form and meaning constraints • Not purely lexically compositional • Implies early use of semantics in processing • Embodied Construction Grammar (ECG) (Bergen & Chang, 2005)

  40. Problem: Proliferation of constructions

  41. If the analysis process is smart, then... • The grammar needs only state one construction • Omission of constituents is flexibly allowed • The analysis process figures out what was omitted

  42. Best-fit analysis process takes the burden off the grammar representation. The Analyzer takes the Utterance, the Constructions, and the Discourse & Situational Context as input; it is incremental, competition-based, and psycholinguistically plausible. Its output is a Semantic Specification (image schemas, frames, action schemas), which feeds Simulation.

  43. Competition-based analyzer finds the best analysis • The best fit has the highest combined score • An analysis is made up of: a constructional tree, a set of resolutions, and a semantic specification

  44. Combined score that determines best-fit • Syntactic Fit: • Constituency relations • Combine with preferences on non-local elements • Conditioned on syntactic context • Antecedent Fit: • Ability to find referents in the context • Conditioned on syntactic information, feature agreement • Semantic Fit: • Semantic bindings for frame roles • Frame roles’ fillers are scored

  45. Analyzing ni3 gei3 yi2 (You give auntie) Two of the competing analyses under the ditransitive cxn: • Syntactic Fit: P(Theme omitted | ditransitive cxn) = 0.65; P(Recipient omitted | ditransitive cxn) = 0.42 • Theme omitted: (1-0.78)*(1-0.42)*0.65 = 0.08 • Recipient omitted: (1-0.78)*(1-0.65)*0.42 = 0.03
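The two syntactic-fit scores can be reproduced from the per-role omission probabilities given on a later slide (Agent 0.78, Recipient 0.42, Theme 0.65); the function name is illustrative:

```python
# Reproducing the syntactic-fit scores for "ni3 gei3 yi2" under the
# ditransitive construction: each expressed role contributes (1 - p),
# each omitted role contributes p, where p = P(omitted | cxn).

P_OMIT = {'Agent': 0.78, 'Recipient': 0.42, 'Theme': 0.65}

def syntactic_fit(omitted_roles):
    score = 1.0
    for role, p in P_OMIT.items():
        score *= p if role in omitted_roles else (1 - p)
    return score

theme_omitted = syntactic_fit({'Theme'})          # (1-0.78)*(1-0.42)*0.65
recipient_omitted = syntactic_fit({'Recipient'})  # (1-0.78)*(1-0.65)*0.42
```

Rounded to two places these give 0.08 and 0.03, matching the slide, so the theme-omitted reading wins on syntactic fit.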

  46. Using frame and lexical information to restrict type of reference

  47. Discourse & Situational Context Entities in context: child, mother, auntie, peach, table. • Antecedent Fit: can the omitted argument be recovered from context?

  48. Semantic Fit: How good a theme is a peach? How about an aunt?

  49. The argument omission patterns shown earlier can be covered with just ONE construction • Each cxn is annotated with probabilities of omission • A language-specific default probability can be set • P(omitted | cxn): Agent 0.78, Recipient 0.42, Theme 0.65

  50. Leverage process to simplify representation • The processing model is complementary to the theory of grammar • By using a competition-based analysis process, we can: • Find the best-fit analysis with respect to constituency structure, context, and semantics • Eliminate the need to enumerate allowable patterns of argument omission in grammar • This is currently being applied in models of language understanding and grammar learning.
