Generation of Referring Expressions: Managing Structural Ambiguities

I.H. Khan, G. Ritchie, K. van Deemter (University of Aberdeen, UK)

A natural language generator should avoid generating phrases that are too ambiguous to understand. But how can the generator know whether a phrase is too ambiguous? We use corpus-based heuristics, backed by empirical evidence, that estimate the likelihood of the different readings of a phrase and guide the generator to choose an optimal phrase from the available alternatives.

Natural Language Generation (NLG)
  • Process of generating text in natural language (e.g., English) from some non-linguistic data (Reiter & Dale, 2000)
  • Example NLG system
    • Pollen Forecast: generates reports from pollen forecast data

Grass pollen levels for Tuesday have decreased from the high levels of yesterday with values of around 4 to 5 across most parts of the country. However, in South Eastern areas, pollen levels will be high with values of 6. [courtesy E. Reiter]

Generation of Referring Expressions (GRE)
  • Referring Expression = Noun Phrase
    • e.g., the black cat; the black cats and dogs (etc.)
  • A key component in most NLG systems
  • Task of GRE:
    • Given a set of intended referents, compute the properties of these referents that distinguish them from distractors in a KB
GRE: An Example
  • Input: KB, Intended Referents R
  • Task: find properties that distinguish R from distractors

[Figure: KB]

  • Output: Distinguishing Description (DD)
    • (Black ∧ Sheep) ∨ (Black ∧ Goat)
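
The GRE task above can be illustrated with a minimal sketch. The KB table from the slide is not reproduced in this transcript, so the knowledge base below is hypothetical, and the function names `extension` and `is_distinguishing` are illustrative rather than part of the authors' system. A description in DNF is represented as a list of conjunctions, each a set of properties.

```python
# Hypothetical KB (object -> properties); NOT the KB shown on the slide.
KB = {
    "Object1": {"black", "sheep"},
    "Object2": {"white", "sheep"},
    "Object3": {"black", "goat"},
    "Object4": {"black", "sheep"},
    "Object5": {"white", "goat"},
    "Object6": {"black", "goat"},
    "Object7": {"white", "goat"},
}


def extension(dnf, kb):
    """Return the objects that satisfy at least one conjunction of the DNF."""
    return {obj for obj, props in kb.items()
            if any(conj <= props for conj in dnf)}


def is_distinguishing(dnf, referents, kb):
    """A description is distinguishing if it picks out exactly the referents."""
    return extension(dnf, kb) == set(referents)


R = {"Object1", "Object3", "Object4", "Object6"}
dd = [{"black", "sheep"}, {"black", "goat"}]   # (Black ∧ Sheep) ∨ (Black ∧ Goat)
print(is_distinguishing(dd, R, KB))            # True
```

Under this assumed KB, the description `[{"black", "sheep"}, {"goat"}]` would also cover the non-black goats, so it would not distinguish R.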
The Problem
  • Linguistic ambiguities can arise when DDs are realised
  • NP1: The black sheep and the black goats
    • (Black ∧ Sheep) ∨ (Black ∧ Goat) = {Object1, Object3, Object4, Object6}
  • NP2: The black sheep and goats
    • Can also be read as (Black ∧ Sheep) ∨ Goat = {Object1, Object3, Object4, Object5, Object6, Object7}
  • NP1 is unambiguous and long; NP2 is ambiguous and brief
  • Question: How might the generator choose between NP1 and NP2?
Our Approach
  • Psycholinguistic evidence
    • Avoidance of all ambiguity is not feasible (Abney, 1996)
  • Avoid only distractor interpretations
    • An interpretation is a distractor if it is more likely than, or almost as likely as, the intended one.
  • Question
    • How can the notion of a distractor interpretation be made precise?
  • Our solution
    • Estimate likelihood using word sketches (cf. Chantree et al., 2004)
      • Word sketches provide detailed information about word relationships, based on corpus frequencies
      • Relationships are grammatical
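
The definition of a distractor interpretation above can be written down as a small check. This is only a sketch: the slides do not say how close "almost as likely" is, so the `margin` parameter and the likelihood values below are assumptions made for illustration.

```python
def distractors(likelihoods, intended, margin=0.1):
    """Return the readings that are more likely than the intended one,
    or within `margin` of it (`margin` is an assumed threshold)."""
    p_intended = likelihoods[intended]
    return [reading for reading, p in likelihoods.items()
            if reading != intended and p >= p_intended - margin]


# Made-up likelihoods for the two readings of one phrase.
readings = {"wide-scope": 0.7, "narrow-scope": 0.3}
print(distractors(readings, "wide-scope"))    # []
print(distractors(readings, "narrow-scope"))  # ['wide-scope']
```

With these invented numbers, the phrase is safe to use when the wide-scope reading is intended, but has a distractor when the narrow-scope reading is intended.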
Pattern: the Adj N1 and N2
  • Hypothesis 1
    • If Adj modifies N1 more often than N2, then a narrow-scope reading is likely (no matter how frequently N1 and N2 co-occur).

bearded men and women

handsome men and women

  • Hypothesis 2
    • If Adj does not modify N1 more often than N2, then a wide-scope reading is likely (no matter how frequently N1 and N2 co-occur).

old men and women

tall men and trees
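
Taken together, the two hypotheses amount to a simple decision rule, sketched below. The function name and the counts are invented for illustration; in practice the frequencies would come from a word-sketch lookup, which is not modelled here.

```python
def predict_scope(adj_n1_count, adj_n2_count):
    """Hypothesis 1: Adj modifies N1 more often than N2 -> narrow scope.
    Hypothesis 2: otherwise -> wide scope.
    How often N1 and N2 co-occur is deliberately ignored."""
    return "narrow" if adj_n1_count > adj_n2_count else "wide"


# Invented modification counts, not real corpus figures.
print(predict_scope(120, 3))     # narrow
print(predict_scope(900, 950))   # wide
```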

Experiment 1

Please, remove the roaring lions and horses.

Experiment 1: Results
  • Hypothesis 2 (i.e., predictions for the wide-scope (WS) reading) is confirmed
  • Hypothesis 1 (i.e., predictions for the narrow-scope (NS) reading) is not confirmed
    • Tendency for WS (even though results are not statistically significant)
  • Tentative conclusion
    • An intrinsic bias in favour of WS reading
  • BUT: The use of *unusual* features may have made people’s judgements unreliable
Experiment 2

Please, remove the figure containing the young lions and horses.

Experiment 2 (cont.)
  • Results: Both hypotheses are confirmed

Please, remove the figure containing the barking dogs and cats.

  • Word Sketches can make reasonable predictions about how an NP would be understood.
  • But from a generation point of view we need to know more: which of the following two NPs is best?
    • The black sheep and the black goats: (Black ∧ Sheep) ∨ (Black ∧ Goat)
    • The black sheep and goats: can also be read as (Black ∧ Sheep) ∨ Goat
  • We seek the answer in the next experiment
Clarity-brevity trade-off
  • Recall the pattern: the Adj Noun1 and Noun2
  • Brief descriptions (+b) take the form
    • the Adj Noun1 and Noun2
  • Non-brief descriptions (-b) take the form
    • the Adj Noun1 and the Adj Noun2 (IR = WS)
    • the Adj Noun1 and the Noun2 (IR = NS)
  • Clear descriptions (+c)
    • Which have no distractor interpretations
  • Non-clear descriptions (-c)
    • Which have some distractor interpretations
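
The brief and non-brief forms listed above can be produced mechanically from the pattern. The following is a sketch with an illustrative function name; it covers only the surface forms and leaves out the clarity judgement, which depends on the predicted reading.

```python
def realisations(adj, n1, n2, intended):
    """Return the brief (+b) and non-brief (-b) forms for an intended reading
    ('wide' or 'narrow') of the pattern 'the Adj Noun1 and Noun2'."""
    brief = f"the {adj} {n1} and {n2}"
    if intended == "wide":
        non_brief = f"the {adj} {n1} and the {adj} {n2}"
    elif intended == "narrow":
        non_brief = f"the {adj} {n1} and the {n2}"
    else:
        raise ValueError("intended must be 'wide' or 'narrow'")
    return {"+b": brief, "-b": non_brief}


forms = realisations("young", "lions", "horses", "wide")
print(forms["+b"])   # the young lions and horses
print(forms["-b"])   # the young lions and the young horses
```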
The Hypotheses (Readers’ Preferences)
  • Hypothesis 1
    • (+c, +b) descriptions are preferred over (+c, -b)
  • Hypothesis 2
    • (+c, -b) descriptions are preferred over (-c, +b)
  • Each hypothesis is tested under two conditions
    • C1: intended reading is WS
    • C2: intended reading is NS
Experiment 3: NS Case
  • Which phrase works best to identify the filled area?
  • The barking dogs and cats
  • The barking dogs and the cats
Experiment 3: WS Case
  • Which phrase works best to identify the filled area?
  • The young lions and the young horses
  • The young lions and horses
Experiment 3: Results
  • Both hypotheses are confirmed:
    • (+c, +b) descriptions are preferred over (+c, -b)
    • (+c, -b) descriptions are preferred over (-c, +b)
  • Role of length:
    • In WS cases preferences are very strong
    • In NS cases preference is not as strong as in WS cases
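
The two confirmed preferences imply an ordering in which clarity outranks brevity: (+c, +b) before (+c, -b) before (-c, +b). A minimal sketch of that ordering follows. The `Candidate` class and the clarity flags are assumptions made for illustration; in a real system the `clear` flag would come from the distractor check, not be set by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    clear: bool   # +c: no distractor interpretations
    brief: bool   # +b: uses the short form


def best(candidates):
    """Prefer clear over unclear descriptions, then brief over non-brief ones."""
    return min(candidates, key=lambda c: (not c.clear, not c.brief))


# Clarity flags below are assumed, not taken from the experiments.
ambiguous_case = [
    Candidate("the black sheep and goats", clear=False, brief=True),
    Candidate("the black sheep and the black goats", clear=True, brief=False),
]
print(best(ambiguous_case).text)   # the black sheep and the black goats
```

If the brief form were itself judged clear, the same function would return it, in line with the first hypothesis.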
Summary of Empirical Evidence
  • For the pattern the Adj Noun1 and Noun2
    • Word Sketches can make reliable predictions
    • Keeping clarity the same, a brief NP is better than a longer one
Algorithm Development
  • Main knowledge sources
    • WordNet (for lexicalisation)
    • SketchEngine (for predicting the most likely reading)
  • Main steps
    • Choose words
    • Use these to construct description in DNF
    • Use transformations to generate alternative structures from DNF
    • Select optimal phrase
Transformation Rules
  • Input
    • Logical formula in DNF
  • Rule Base
    • (A ∧ B1) ∨ (A ∧ B2) ⇒ A ∧ (B1 ∨ B2)
    • (X ∨ Y) ⇒ (Y ∨ X)

[A = Adj, B1=B2=Noun, X=Y=(Adj and/or Noun)]

  • Output
    • Set of logical formulae
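
One way to apply the rule base is to rewrite the input formula repeatedly until no new formulae appear. The sketch below assumes formulae are nested tuples `(op, left, right)` with string atoms, and that the adjective comes first in each conjunction; this representation and the function names are illustrative, not taken from the authors' implementation.

```python
def rewrite(f):
    """Yield every formula obtained from f by one rule application,
    either at the root or inside a subformula."""
    if isinstance(f, str):          # atoms cannot be rewritten
        return
    op, left, right = f
    if op == "or":
        yield ("or", right, left)   # (X ∨ Y) ⇒ (Y ∨ X)
        if (isinstance(left, tuple) and isinstance(right, tuple)
                and left[0] == right[0] == "and" and left[1] == right[1]):
            # (A ∧ B1) ∨ (A ∧ B2) ⇒ A ∧ (B1 ∨ B2)
            yield ("and", left[1], ("or", left[2], right[2]))
    for new_left in rewrite(left):
        yield (op, new_left, right)
    for new_right in rewrite(right):
        yield (op, left, new_right)


def alternatives(dnf):
    """Return the set of all formulae reachable from the DNF input."""
    seen, frontier = {dnf}, [dnf]
    while frontier:
        for g in rewrite(frontier.pop()):
            if g not in seen:
                seen.add(g)
                frontier.append(g)
    return seen


dnf = ("or", ("and", "black", "sheep"), ("and", "black", "goats"))
alts = alternatives(dnf)
print(len(alts))   # 4
```

For this input the result contains the DNF itself, its reordering, and the two factored forms `black ∧ (sheep ∨ goats)` and `black ∧ (goats ∨ sheep)`.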
Select optimal phrase
  • (1) (black ∧ sheep) ∨ (black ∧ goats) [DNF]
  • (2) (black ∧ goats) ∨ (black ∧ sheep)
  • (3) black ∧ (goats ∨ sheep)
  • (4) black ∧ (sheep ∨ goats) [Optimal]

(4): Adj has high collocational frequency with N1 and N2, so the intended (wide-scope) reading is more likely.

Therefore, (4) is selected.

Conclusions
  • GRE should deal with surface ambiguities
  • Word sketches can make distractor interpretation precise
  • Keeping clarity the same, brief descriptions are preferred over longer ones
  • A GRE algorithm is sketched that balances clarity and brevity