
Share and Share Alike: Resources for Language Generation. Prof. Marilyn Walker University of Sheffield NSF- 20 April 2007. What type of resource is needed for generation?. What type of scientific problem is generation?


Share and Share Alike: Resources for Language Generation

Prof. Marilyn Walker

University of Sheffield NSF- 20 April 2007


What type of resource is needed for generation?

  • What type of scientific problem is generation?

  • An essential difference between language generation and language interpretation problems (parsing, WSD, relation extraction, coreference) is that there is no single right answer for language generation;

  • Language Productivity Assumption: An optimal generation resource will represent multiple outputs for each input, with a human-generated quality metric associated with each output


Dialogue vs. generation?

  • Dialogue is like generation in that there is no single right answer for how to do a task in dialogue;

  • Information gathering and information presentation in dialogue systems are generation problems;

  • DARPA evaluation for dialogue systems;

  • Fixed domain “TRAVEL PLANNING”

  • First: ATIS evaluations compared dialogue system behaviour against human behaviour in a corpus of human-wizard dialogues (Hirschman 2000);

  • This made no allowance for “mixed initiative”, different dialogue strategies, divergence of context, or user modeling;


Dialogue vs. generation?

  • Second: define context, evaluate on system response to user utterance in a particular context;

  • Much more like generation: the context is defined and the system's ‘communicative goal’ is defined;

  • Form: How is ‘the same response’ defined? Some forms for identical content may be better than others;

  • Content: User models, definitions of context. Also, a dialogue system should be able to decide on its communicative goal.


Dialogue vs. generation?

  • Third: Communicator evaluation: given user task (NYC to LHR, Continental, April 22nd, 2007), collect metrics (time to completion, ASR error, utterance output quality, concept understanding, user satisfaction);

  • Corpus semi-automatically labelled with dialogue acts (quality/strategy metrics) for system utterances (8 or more different instantiations from different systems for particular communicative goals);

  • Try to understand which metrics contribute to user satisfaction (PARADISE);

  • User utterances labelled subsequently and used in RL experiments comparing dialogue strategies;

  • Hard to compare particular scientific techniques for particular modules in systems; plug and play never worked
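
The PARADISE step above can be sketched in code. In the published PARADISE framework, each metric is normalized to a z-score and user satisfaction is modelled as a weighted linear combination of the normalized metrics, with the weights fitted by linear regression. The sketch below skips the regression and takes the weights as given. The metric names, the weights, and all numbers are hypothetical and chosen only for illustration; they are not Communicator results.

```python
from statistics import mean, stdev


def z_scores(values):
    """Normalize raw metric values to zero mean and unit sample standard deviation.

    Assumes the values are not all identical (otherwise stdev is 0).
    """
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def predicted_satisfaction(dialogues, weights):
    """Return one score per dialogue: a weighted sum of z-normalized metrics.

    dialogues: list of dicts mapping metric name -> raw value
    weights:   dict mapping metric name -> weight (hypothetical here;
               PARADISE would fit them by regression on user satisfaction)
    """
    normalized = {name: z_scores([d[name] for d in dialogues]) for name in weights}
    return [
        sum(weights[name] * normalized[name][i] for name in weights)
        for i in range(len(dialogues))
    ]


# Invented toy values for three dialogues.
dialogues = [
    {"task_success": 1.0, "time_to_completion": 120.0, "asr_error": 0.10},
    {"task_success": 0.0, "time_to_completion": 300.0, "asr_error": 0.30},
    {"task_success": 1.0, "time_to_completion": 180.0, "asr_error": 0.20},
]
# Hypothetical weights: task success helps, elapsed time and ASR error hurt.
weights = {"task_success": 0.5, "time_to_completion": -0.2, "asr_error": -0.3}

scores = predicted_satisfaction(dialogues, weights)
```

Normalizing first is what lets metrics on very different scales (seconds, error rates, binary success) be compared through their weights.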


Dialogue vs. generation: Conclusions?

  • Just having a fixed task (TRAVEL) by itself does not necessarily lead to scientific progress;

  • Want to compare particular scientific techniques for particular modules in systems;

  • Plug and play is the only way to do this;

  • BUT: very hard to define for a whole community what interfaces between modules should be


Position

  • What type of resources would be useful for scientific advancement in language generation?

  • Almost anything!

  • “If you build it they will come” - “If it's useful people will use it”

  • Can we leverage what we already have in our own research groups, share it, and make it better?


What is needed to incentivize data sharing

  • Many different domains/problems/modules => NEED LOTS OF DIFFERENT RESOURCES;

  • Resources costly (developing group not ‘finished’ yet) => FINANCIAL INCENTIVE; SCIENTIFIC INCENTIVE; CITATION INCENTIVE;

  • Costs too much to support resource preparation, maintenance, distribution and re-use => NSF/LDC FINANCIAL/SUPPORT

  • NOTE: MANY LDC RESOURCES ARE “FOUND DATA” (not explicitly commissioned)


A proposal for one shared resource


Information presentation of one or more database entities

  • Natural Language Interfaces/SDS (McKeown 85, McCoy 89, Cooperative Response literature, Carenini & Moore 01, Polifroni et al. 03, COGENTEX w/ active buyers website, Walker et al. 04, Demberg & Moore 06, etc.)

  • Different communicative goals: Summarize, Recommend, Compare, Describe (DB entities)

  • Representation not controversial (attributes and values for DB entities, relations between entity and attribute)

  • Application not dependent on NLU


What type of resource is needed for generation?

  • What type of scientific problem is generation?

  • An essential difference between language generation and language interpretation problems (parsing, WSD, relation extraction, coreference) is that there is no single right answer for language generation;

  • Language Productivity Assumption: An optimal generation resource will represent multiple outputs for each input, with a human-generated quality metric associated with each output


We could make available a resource of:

  • INPUT-1: Speech ACT, SET of DB Entities

    • SUMMARIZE(SET), DESCRIBE(ENTITY), RECOMMEND(ENTITY, SET), COMPARE(SET)

  • INPUT-2: user model, discourse/dialogue context, style parameters, etc.

  • OUTPUT-1: a set of alternative outputs possibly with TTS markup

  • OUTPUT-2: human generated ratings or rankings for the outputs oriented to the criteria specified by INPUT-2
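
One way to picture a single entry of such a resource is as a record that bundles the two inputs with the rated alternative outputs. The sketch below is only an illustration: the class and field names, the `style` parameter, the two output texts and the rating values (on an assumed 1 to 5 scale) are all invented for the example and are not taken from an existing corpus.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RatedOutput:
    text: str            # one alternative realization (OUTPUT-1)
    ratings: List[int]   # human ratings for this realization (OUTPUT-2)

    def mean_rating(self) -> float:
        return sum(self.ratings) / len(self.ratings)


@dataclass
class ResourceEntry:
    speech_act: str              # INPUT-1: SUMMARIZE, DESCRIBE, RECOMMEND or COMPARE
    entities: List[Dict]         # INPUT-1: DB entities as attribute/value records
    context: Dict                # INPUT-2: user model, dialogue context, style parameters
    outputs: List[RatedOutput]   # several outputs for the same input

    def best_output(self) -> RatedOutput:
        """Return the output with the highest mean human rating."""
        return max(self.outputs, key=lambda o: o.mean_rating())


# Hypothetical entry; texts and ratings are made up for illustration.
entry = ResourceEntry(
    speech_act="RECOMMEND",
    entities=[{"name": "Babbo", "foodquality": "superb",
               "decor": "excellent", "service": "excellent"}],
    context={"style": "concise"},
    outputs=[
        RatedOutput("Babbo is the best. It has superb food quality, "
                    "excellent decor and excellent service.", [5, 4, 5]),
        RatedOutput("Babbo has superb food quality. Babbo has excellent decor. "
                    "Babbo has excellent service. Babbo is the best.", [3, 2, 3]),
    ],
)
```

Keeping every alternative together with its ratings, rather than a single reference output, is what distinguishes this layout from a gold-standard corpus for an interpretation task.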


A Content Plan for a Recommend

  • strategy: recommend

  • relations: justify(nuc:1; sat:2);

    justify(nuc:1; sat:3);

    justify(nuc:1; sat:4)

  • content: 1. assert(best (Babbo))

    2. assert(has-att (Babbo, foodquality(superb)))

    3. assert(has-att (Babbo, decor(excellent)))

    4. assert(has-att (Babbo, service(excellent)))
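
The content plan above maps naturally onto a small data structure: numbered assertions plus rhetorical relations that refer to them by number. The following is a minimal sketch under that reading; the dictionary layout and the helper names `satellites_of` and `is_well_formed` are assumptions made for illustration, not the format used by any particular generator.

```python
# The plan from the slide: strategy, relations (name, nucleus, satellite),
# and numbered content items.
content_plan = {
    "strategy": "recommend",
    "relations": [
        ("justify", 1, 2),
        ("justify", 1, 3),
        ("justify", 1, 4),
    ],
    "content": {
        1: ("assert", "best", ("Babbo",)),
        2: ("assert", "has-att", ("Babbo", ("foodquality", "superb"))),
        3: ("assert", "has-att", ("Babbo", ("decor", "excellent"))),
        4: ("assert", "has-att", ("Babbo", ("service", "excellent"))),
    },
}


def satellites_of(plan, nucleus, relation="justify"):
    """Return the ids of content items that stand in `relation` to `nucleus`."""
    return [sat for rel, nuc, sat in plan["relations"]
            if rel == relation and nuc == nucleus]


def is_well_formed(plan):
    """Check that every relation refers only to content items that exist."""
    ids = set(plan["content"])
    return all(nuc in ids and sat in ids for _, nuc, sat in plan["relations"])


supporting = satellites_of(content_plan, 1)
```

Here `supporting` holds the ids of the three attribute assertions that justify the claim that Babbo is best.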


Human Feedback for Ranking

  • The ratings can represent any metric associated with the possible response, e.g. coherence, information quality, social appropriateness, personality.

  • Informational Coherence

    • SPARKY, a generator for MATCH

    • SPOT, a generator for AT&T COMMUNICATOR

  • Users are shown response variants then told:

    • For each variant, please rate to what extent you agree with this statement.

    • The utterance is easy to understand, well-formed and appropriate to the dialogue context.
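
Ratings collected this way can be turned into training data for a ranker. One common approach, shown here as a sketch and not necessarily the procedure used for SPARKY or SPOT, is to average the ratings per variant and emit (better, worse) pairs. The variant ids and rating values are hypothetical, and the rating scale is assumed to be numeric with higher meaning stronger agreement.

```python
from itertools import combinations
from statistics import mean


def preference_pairs(ratings_by_variant):
    """Return (better, worse) pairs of variant ids based on mean rating.

    Pairs whose mean ratings are tied carry no preference and are skipped.
    """
    means = {vid: mean(r) for vid, r in ratings_by_variant.items()}
    pairs = []
    for a, b in combinations(means, 2):
        if means[a] > means[b]:
            pairs.append((a, b))
        elif means[b] > means[a]:
            pairs.append((b, a))
    return pairs


# Invented ratings from three judges for three response variants.
ratings_by_variant = {
    "variant_a": [4, 5, 4],
    "variant_b": [2, 3, 2],
    "variant_c": [3, 4, 4],
}

pairs = preference_pairs(ratings_by_variant)
```

Pairwise preferences only compare variants of the same input, so differences in how generous individual judges are across inputs matter less than they would for absolute scores.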


Examples: Learned Rules applied to test fold


Individual Differences (Sentence Planning Preferences)


Human Feedback for Ranking (2)

  • Ten-Item Personality Inventory questionnaire (Gosling 2003)

    • PERSONAGE

  • Users are shown response variants then told:

    • For each variant, rate on a scale of 1 to 7 whether:

    • The speaker is quiet, reserved;

    • The speaker is enthusiastic;
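
The two statements shown correspond to the extraversion pair of the Ten-Item Personality Inventory, in which the "reserved, quiet" item is reverse-scored before the two ratings are averaged. A minimal sketch, assuming the 1 to 7 scale from the slide and the usual reverse-scoring convention (8 minus the rating); the function and parameter names are illustrative.

```python
def extraversion_score(enthusiastic, reserved, scale_max=7):
    """Combine the two extraversion ratings into one score.

    enthusiastic: rating for "The speaker is enthusiastic" (1..scale_max)
    reserved:     rating for "The speaker is quiet, reserved" (1..scale_max)
    The 'reserved' rating is reverse-scored, then the two are averaged.
    """
    for rating in (enthusiastic, reserved):
        if not 1 <= rating <= scale_max:
            raise ValueError(f"rating {rating} is outside 1..{scale_max}")
    reversed_reserved = (scale_max + 1) - reserved
    return (enthusiastic + reversed_reserved) / 2


# A judge who finds the speaker enthusiastic (6) and not reserved (2).
score = extraversion_score(enthusiastic=6, reserved=2)
```

With this scoring, a high value means the utterance was perceived as extraverted and a low value as introverted.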


Personality judgments: ‘Recommend Le Marais’


What else is out there?

  • Coconut corpus: referring expression generation, but add alternatives and ratings?

  • Boston directions corpus (NSF funded early 1990s)

  • Communicator corpus (8 different system outputs for dialogue contexts that can be characterized)

  • Tools: Halogen, Penman, FUF-SURGE, RealPro

  • Library of text plans, content plans, sentence planners?

