
Question Asking to Inform Preference Learning: A Case Study

Melinda Gervasio, SRI International

Karen Myers, SRI International

Marie desJardins, Univ. of Maryland Baltimore County

Fusun Yaman, BBN Technologies

AAAI Spring Symposium: Humans Teaching Agents

March 2009

POIROT: Learning from a Single Demonstration

Demonstration Trace

((lookupReqmts S42))
((lookupAirport PFAL 300m) ((ORBI 90m BaghdadIntl)))
((setPatientAPOE P1 ORBI))
((getArrivalTime P1 PFAL ORBI) (1h 3h))
((setPatientAvailable P1 3h))
((lookupHospitalLocation HKWC) ((KuwaitCity)))
((lookupAirport KuwaitCity 300m 2) ((OKBK 250m KuwaitIntl)))
((setPatientAPOD P1 OKBK))
((lookupMission ORBI OKBK 24h 3h))
((lookupAsset ORBI OKBK 24h 3h) ((C9-001 15h 2h 10)))
((initializeTentativeMission c9-001 10 ORBI OKBK 15h 2h))
((getArrivalTime P1 OKBK HKWC 17h) (18h 19h))

Learning → Generalized Problem-solving Knowledge

AAAI 2009 Spring Symposium: Humans Teaching Agents

Target Workflow

Learned Knowledge

  • Temporal ordering
  • Conditional branching
  • Iterations
  • Selection criteria
  • Method generalization


QUAIL: Question Asking to Inform Learning

Goal: improve learning performance through system-initiated question asking

Approach:

  • define question catalog to inform learning by demo
  • develop question models and representations
  • explore question asking strategies

“Tell me and I forget, show me and I remember, involve me and I understand.”

- Chinese Proverb


Question Models

Question Cost: approximates the ‘cognitive burden’ of answering

Cost(q) = wF × FormatCost(q) + wG × GroundednessCost(q), where wF + wG = 1

Question Utility: utilities normalized across learners

Utility(q) = Σ_{l∈L} wl × Utility_l(q), where Σ_{l∈L} wl = 1

Utility_l(q) = wB × BaseUtility_l(q) + wG × GoalUtility_l(q), where wB + wG = 1
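The cost and utility combinations above can be encoded in a few lines. This is a minimal sketch, not the authors' implementation: the component scores (format cost, groundedness cost) and the per-learner base/goal utility functions are hypothetical stand-ins supplied by the caller.

```python
# Sketch of QUAIL-style cost/utility aggregation. The component scores
# stored on each question dict and the base/goal utility callables are
# illustrative placeholders, not part of the original system.

def cost(q, w_f=0.5, w_g=0.5):
    """Cost(q) = wF * FormatCost(q) + wG * GroundednessCost(q), wF + wG = 1."""
    assert abs(w_f + w_g - 1.0) < 1e-9
    return w_f * q["format_cost"] + w_g * q["groundedness_cost"]

def make_learner_utility(w_b, w_g, base_utility, goal_utility):
    """Utility_l(q) = wB * BaseUtility_l(q) + wG * GoalUtility_l(q), wB + wG = 1."""
    assert abs(w_b + w_g - 1.0) < 1e-9
    return lambda q: w_b * base_utility(q) + w_g * goal_utility(q)

def utility(q, weighted_learners):
    """Utility(q) = sum_l wl * Utility_l(q), with the learner weights wl summing to 1."""
    assert abs(sum(w for w, _ in weighted_learners) - 1.0) < 1e-9
    return sum(w * u_l(q) for w, u_l in weighted_learners)
```

The normalization asserts mirror the constraints wF + wG = 1 and Σ wl = 1 from the formulas above.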


Question Selection
  • Given:
    • questions Q={q1… qn} with costs and utilities
    • budget B
  • Problem: find Q' ⊆ Q with Cost(Q') ≤ B and maximal utility
    • equivalent to the 0/1 knapsack problem (given no question dependencies)
    • admits efficient dynamic programming solutions: O(nB) time
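The selection problem above reduces to a standard 0/1 knapsack, which the following dynamic program solves in O(nB); this is an illustrative sketch, not the authors' code, and it assumes non-negative integer question costs.

```python
def select_questions(questions, budget):
    """Pick a subset of questions maximizing total utility subject to a
    cost budget (the 0/1 knapsack formulation). `questions` is a list of
    (cost, utility) pairs with non-negative integer costs; returns the
    best achievable (utility, chosen indices). Runs in O(n * B)."""
    # best[b] = (max utility, chosen indices) achievable within budget b
    best = [(0.0, [])] * (budget + 1)
    for i, (c, u) in enumerate(questions):
        new_best = list(best)
        for b in range(c, budget + 1):
            candidate = best[b - c][0] + u
            if candidate > new_best[b][0]:
                new_best[b] = (candidate, best[b - c][1] + [i])
        best = new_best  # question i is now either taken or skipped, never twice
    return best[budget]
```

For example, with questions costing (2, 3, 4) and utilities (3.0, 4.0, 5.0) under budget 5, the first two questions are chosen for total utility 7.0.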


CHARM (Charming Hybrid Adaptive Ranking Model)

[Figure: two example airports, labeled Authority: Civil / Size: Large and Authority: Military / Size: Small]
  • Learns lexicographic preference models
    • There is an order of importance on the attributes
    • For every attribute there is a preferred value

Example:

  • Airports characterized by Authority (civil, military), Size (small, medium, large)
  • Preference Model:
    • A civil airport is preferred to a military one.
    • Among civil airports, a large airport is preferred to a small airport.
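A lexicographic model of this kind can be evaluated in a few lines. This is a sketch under assumed representations: attributes are checked in importance order, and `value_rank[attr]` lists each attribute's values from most to least preferred (the dict-based encoding is illustrative).

```python
def lex_prefer(obj1, obj2, attr_order, value_rank):
    """Compare two objects under a lexicographic preference model: the
    first attribute (in importance order) on which the objects differ
    decides, in favor of the better-ranked value. Returns the preferred
    object, or None on a tie."""
    for attr in attr_order:
        v1, v2 = obj1[attr], obj2[attr]
        if v1 != v2:
            rank = value_rank[attr]
            return obj1 if rank.index(v1) < rank.index(v2) else obj2
    return None  # the objects agree on every attribute
```

With attr_order = ["Authority", "Size"] and civil ranked above military, a small civil airport beats a large military one, matching the slide's example.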


CHARM Learning
  • Idea:
    • Keep track of a set of models consistent with data of the form Obj1<Obj2
      • A partial order on the attributes and values
    • The object that is preferred by more models is more preferred
  • Algorithm for learning the models
    • Initially assume all attributes and all values are equally important
    • Loop until nothing changes
      • Given Obj1 < Obj2, predict a winner using the current model
      • If the predicted winner is actually the preferred one, do nothing
      • Otherwise, decrease the importance of the attributes/values that led to the wrong prediction
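The loop above can be sketched as a mistake-driven weight update. This is an illustrative approximation of the idea, not the authors' exact algorithm: attribute and value weights start equal, and on a wrong prediction the weights that voted for the losing object are decayed (the dict-based model layout and the decay factor are assumptions).

```python
# CHARM-style mistake-driven update, sketched: `model` holds one weight
# per attribute and one weight per attribute value, all initially equal.

def score(model, obj):
    """Weighted vote of an object's attribute values."""
    return sum(model["attr"][a] * model["val"][a][obj[a]]
               for a in model["attr"])

def update(model, preferred, other, decay=0.5):
    """Train on one pair (preferred < other): if the current model fails
    to rank `preferred` strictly higher, decay the weights of the
    attributes/values that voted for the wrong object."""
    if score(model, preferred) > score(model, other):
        return  # predicted winner is the preferred one: do nothing
    for a in model["attr"]:
        pv, ov = preferred[a], other[a]
        # this attribute's value weights favored the loser's value
        if pv != ov and model["val"][a][ov] >= model["val"][a][pv]:
            model["val"][a][ov] *= decay
            model["attr"][a] *= decay
```

After one update on a misranked pair, the preferred object scores strictly higher than the other on that pair.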


Learn From Mistakes

1) Given training data (e.g., BWI < DCA)

2) The most important attributes predict a winner

3) The ranks of the attributes that voted for the loser are updated


Learn from Mistakes

Given: BWI<Andrews


Finally
  • If the model truly is lexicographic, the ranks will converge
    • No convergence => the underlying model is not lexicographic.
  • If the training data is consistent, the learned model will correctly predict all examples



QUAIL+CHARM Case Study

Goal: investigate how different question selection strategies impact CHARM preference learning for ordering patients

Performance Metric: CHARM's accuracy in predicting pairwise ordering preferences

Learning Target: lexicographic preference model for ordering patients defined over a subset of 5 patient attributes

  • triageCode, woundType, personClass, readyForTransport, LAT

Training Input: P1 < P2, indicating that P1 is at least as preferred as P2


Question Types for CHARM
  • Object ordering: Should Patient1 be handled before Patient2?
  • Attribute relevance: Is Attr relevant to the ordering?
  • Attribute ordering: Is Attr1 preferred to Attr2?
  • Attribute value ordering: For Attr, is Val1 preferred to Val2?
  • Uniform question cost model
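The four question types under a uniform cost model might be represented as simple tagged records; this encoding (field and kind names included) is hypothetical, chosen only to make the catalog concrete.

```python
from dataclasses import dataclass

# Hypothetical encoding of the CHARM question catalog; the uniform
# question cost model is reflected in the constant default cost.

@dataclass
class Question:
    kind: str          # one of QUESTION_KINDS below
    args: tuple        # e.g., (attr,) or (attr, val1, val2)
    cost: float = 1.0  # uniform question cost model

QUESTION_KINDS = (
    "object_ordering",      # Should Patient1 be handled before Patient2?
    "attribute_relevance",  # Is Attr relevant to the ordering?
    "attribute_ordering",   # Is Attr1 preferred to Attr2?
    "value_ordering",       # For Attr, is Val1 preferred to Val2?
)
```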


Experiment Setup
  • Target preference models generated randomly
    • Draw on database of 186 patient records
  • Train on 1 problem; test on 4 problems
    • Training/test instance: a pairwise preference among 5 patients
  • 10 runs for each target preference model
    • 3 handcrafted target models with irrelevant attributes
    • 5 randomly generated target models over all 5 patient attributes


Results


Observations on Results
  • Question answering is generally useful
    • Increased number of questions (generally) results in greater performance improvements
    • Has greater impact when fewer training examples are available for learning (i.e., when the learned model is weaker)
  • A little knowledge can be a dangerous thing
    • CHARM’s incorporation of isolated answers can decrease performance
    • Related questions can lead to significant performance improvement
      • Being told {Attr1>Attr2, Attr4>Attr5} may not be useful (and may be harmful)
      • Being told {Attr1>Attr2, Attr2>Attr3} is very useful
  • Need for more sophisticated models of question utility
    • Learn the utility models


Future Directions
  • Learn utility models through controlled experimentation
    • Assess the impact of different question types in different settings
    • Features for learning:
      • Question attributes, state of learned model, training data, previously asked questions
  • Expand set of questions, support questions with differing costs
  • Expand coverage to a broader set of learners
  • Continuous model of question asking


Related Work
  • Active Learning:
    • Work to date has focused on classification, emphasizing the selection of additional training data for a human to label
  • Interactive Task Learning:
    • Allen et al.’s work on Learning by Discussion
    • Blythe’s work on Learning by Being Told
