
Learning through Interactive Behavior Specifications

Learning through Interactive Behavior Specifications. Tolga Konik (CSLI, Stanford University), Douglas Pearson (Three Penny Software), John Laird (University of Michigan). Goal: automatically generate cognitive agents; reduce the cost of agent development.



  1. Learning through Interactive Behavior Specifications • Tolga Konik (CSLI, Stanford University) • Douglas Pearson (Three Penny Software) • John Laird (University of Michigan)

  2. Goal • Automatically generate cognitive agents • Reduce the cost of agent development • Reduce the expertise required to develop agents.

  3. Domains • Autonomous cognitive agents • Dynamic virtual worlds • Real-time decisions based on knowledge and sensed data • Soar agent architecture

  4. Learning by Observation • Approach: observe expert behavior and learn to replicate it • Why? • We may want human-like agents • In complex domains, imitating humans may be easier than learning from scratch

  5. Bottleneck in Pure Learning by Observation • Problem: you cannot observe the internal reasoning of the expert • Solution: • Ask the expert for additional information (goal annotations) • Use additional knowledge sources (task and domain knowledge)

  6. Learning by Observation [Diagram. Components: Expert, Interface, Environment, Learner, Agent. Data labels: Actions, Percepts, Goal annotations, Additional Task Knowledge]

  7. Learning by Observation [Diagram. Components: Interface, Environment, Agent. References noted on the slide: ILP 2004; Machine Learning Journal (forthcoming)]

  8. Learning by Observation: Critic Mode [Diagram. Components: Expert, Interface, Environment, Agent, Learner. Label: critic]

  9. One Body, Two Minds [Diagram. Components: Expert, Agent, Interface, Environment] • How and when to switch control • How the expert and the agent program communicate

  10. Redux: Diagrammatic Behavior Specification [Diagram. Components: Expert, Redux, Environment, Agent, Learner]

  11. Redux • Visual rule editing • Diagrammatic Behavior Specification

  12. Goal Hierarchy • Task-performance knowledge is represented with a hierarchy of durative goals. [Diagram. Map with rooms r1 to r4, doors d1 to d6, and items i3 and i4. Goal nodes: Get-item(Item), Get-item-in-room(Item), Get-item-different-room(Item), Goto-next-room, Go-to-door(D), Go-to(Door), Go-through(Door)]
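
A hierarchy of parameterized goals like the one on this slide can be sketched as a small tree structure. The following is only an illustration, not the Soar representation used by the authors: the `Goal` class and its `paths` helper are hypothetical, and the parent/child links are an assumption, since the transcript lists the goal names but not the edges of the diagram. Goto-next-room and Go-to-door(D) are left out because their position in the hierarchy cannot be recovered from the transcript.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A goal with named parameters and optional subgoals (hypothetical class)."""
    name: str
    params: tuple = ()
    subgoals: list = field(default_factory=list)

    def paths(self, prefix=()):
        """Yield every root-to-leaf chain of goal names below this goal."""
        here = prefix + (self.name,)
        if not self.subgoals:
            yield here
        for sub in self.subgoals:
            yield from sub.paths(here)


# Assumed decomposition: the links below are a guess at the slide's diagram.
hierarchy = Goal("Get-item", ("Item",), [
    Goal("Get-item-in-room", ("Item",)),
    Goal("Get-item-different-room", ("Item",), [
        Goal("Go-to", ("Door",)),
        Goal("Go-through", ("Door",)),
    ]),
])

leaf_paths = list(hierarchy.paths())
for path in leaf_paths:
    print(" > ".join(path))
```

Each printed line is one way the top-level goal can be refined down to a leaf goal.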

  13. Goal Hierarchy [Diagram. Same map and goal nodes as the previous slide. Instantiated goals: Get-item(i3) with Item=i3, Get-item-in-room(i3)]

  14. Goal Hierarchy [Diagram. Same map and goal nodes. Instantiated goals: Get-item(i3) with Item=i3, Get-item-different-room(i3), Go-to(d1) with Door=d1]

  15. Goal Hierarchy [Diagram. Same map and goal nodes. Instantiated goals: Get-item(i3), Get-item-different-room(i3), Go-through(d1) with Door=d1]
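
The progression across the last two slides, from Go-to(d1) to Go-through(d1) under the same parent goals, can be sketched as a stack of active goals with variable bindings. This is a minimal sketch under assumptions: `GoalStack` and its methods are hypothetical names, and the rule that terminating a goal also removes the goals below it is an assumed reading of "durative goals", not something stated on the slides.

```python
class GoalStack:
    """Active durative goals, ordered from the top-level goal down (hypothetical)."""

    def __init__(self):
        self.active = []  # list of (goal name, bindings) pairs

    def select(self, name, **bindings):
        """Make a goal active below the currently active goals."""
        self.active.append((name, bindings))

    def terminate(self, name):
        """End a goal; assumed to end every goal selected below it as well."""
        for index, (goal, _) in enumerate(self.active):
            if goal == name:
                del self.active[index:]
                return
        raise ValueError(f"goal {name!r} is not active")

    def render(self):
        return " > ".join(
            f"{goal}({', '.join(bindings.values())})"
            for goal, bindings in self.active
        )


stack = GoalStack()
stack.select("Get-item", Item="i3")
stack.select("Get-item-different-room", Item="i3")
stack.select("Go-to", Door="d1")
at_door = stack.render()

# The agent reaches the door: Go-to ends, Go-through starts, parents persist.
stack.terminate("Go-to")
stack.select("Go-through", Door="d1")
through_door = stack.render()

print(at_door)
print(through_door)
```

The parent goals stay active while the leaf goal changes, which is the sense in which the goals are durative.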

  16. Behavior Specification [Diagram labels: Expert, Agent] • Expert draws an initial abstract situation • Expert creates a scenario by selecting actions

  17. Goal Specification [Diagram labels: Expert, Agent] • Goals are explicitly selected • The agent contributes based on the current situation, current goal, and its knowledge

  18. Goal Hierarchy • Learning by Observation perspective: unobservable mental reasoning of the expert • Learning perspective: bias the hypothesis space; the “learn agent” problem is reduced to “learn goal selection and termination” • Mixed-initiative (MI) perspective: information exchange between the expert and the agent

  19. Relevant Knowledge Specification [Diagram labels: Expert, Agent. Example shown: Prepare food] • Expert can mark important objects in a decision

  20. Rich Behavior Trace • Undesired actions and goals specified by the expert • Actions and goals of the approximately learned agent program that the expert rejected [Example shown: Watch TV]

  21. Rich Behavior Trace • Hypothetical actions and goals • Situation history: a tree structure of possible behaviors
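
A situation history that branches over selected, rejected, and hypothetical choices could be held in a tree like the one below. This is a rough sketch, not the representation used in Redux: `SituationNode`, `behaviors`, the status labels, and the predicate strings are all made up for illustration. Only the "Prepare food" and "Watch TV" labels come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class SituationNode:
    """One situation plus the behaviors that branch from it (hypothetical)."""
    situation: frozenset
    children: dict = field(default_factory=dict)  # action -> (node, status)

    def add(self, action, situation, status="selected"):
        """Attach the situation that results from taking `action` here."""
        child = SituationNode(frozenset(situation))
        self.children[action] = (child, status)
        return child


def behaviors(node, prefix=()):
    """Yield every root-to-leaf sequence of (action, status) pairs."""
    if not node.children:
        yield prefix
    for action, (child, status) in node.children.items():
        yield from behaviors(child, prefix + ((action, status),))


# Illustrative facts only; the slides do not give the underlying situations.
root = SituationNode(frozenset({"hungry(agent)"}))
cooking = root.add("prepare-food", {"cooking(agent)"}, "selected")
root.add("watch-tv", {"idle(agent)"}, "rejected")
cooking.add("eat", {"fed(agent)"}, "hypothetical")

traces = list(behaviors(root))
for trace in traces:
    print(trace)
```

Keeping rejected and hypothetical branches in the same tree gives a learner negative as well as positive decision examples.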

  22. Relational Learning by Observation • Input: relational situations; goal and action selections and rejections; additional annotations (e.g., important objects); background knowledge • Output: rule-based agent program • Learn goal/action selection/termination, generalizing over multiple examples • Inductive Logic Programming to combine rich knowledge structures

  23. Relational Learning by Observation

  24. Relational Learning by Observation Find the common structures in the decision examples
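
One very simplified way to picture "finding common structures" is to replace the objects that play the same role in each decision example with a shared variable and keep only the facts that all examples agree on. This is far weaker than the Inductive Logic Programming named in the talk and is not the authors' algorithm. The function names, predicates, and example facts are assumptions, and the role assignments are taken as given (in the talk's setting they would plausibly come from goal annotations and objects the expert marks as important).

```python
def variabilize(facts, roles):
    """Replace constants that play a named role with that role's variable."""
    return {
        (pred,) + tuple(roles.get(arg, arg) for arg in args)
        for pred, *args in facts
    }


def common_structure(examples):
    """Keep the variabilized facts shared by every decision example."""
    generalized = [variabilize(facts, roles) for facts, roles in examples]
    return set.intersection(*generalized)


# Two made-up examples of selecting a door; constants differ, structure matches.
example_1 = (
    {("in", "agent", "r1"), ("door-of", "d1", "r1"), ("leads-to", "d1", "r2"),
     ("contains", "r2", "i3"), ("wants", "agent", "i3"), ("color", "d1", "red")},
    {"d1": "Door", "r1": "Room", "r2": "Next", "i3": "Item"},
)
example_2 = (
    {("in", "agent", "r3"), ("door-of", "d4", "r3"), ("leads-to", "d4", "r4"),
     ("contains", "r4", "i4"), ("wants", "agent", "i4"), ("color", "d4", "blue")},
    {"d4": "Door", "r3": "Room", "r4": "Next", "i4": "Item"},
)

shared = common_structure([example_1, example_2])
for fact in sorted(shared):
    print(fact)
```

The door color differs between the two examples, so it drops out, while the chain linking the agent's room, the door, the next room, and the wanted item survives.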

  25. Relational Learning by Observation • Learn relations between what the agent wants, perceives, and knows. • “Select a door in the current room, which leads to a room that contains the item the agent wants to get.”
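
The quoted rule can be read as a conjunctive query over relational facts. The sketch below evaluates it over a small set of triples. The predicate names (`in`, `wants`, `door-of`, `leads-to`, `contains`), the `select_doors` function, and the room layout are assumptions for illustration; the slides do not give the map's connectivity or the learned rule's actual syntax.

```python
def select_doors(facts, agent="agent"):
    """Doors in the agent's room that lead to a room holding a wanted item."""
    current_rooms = {room for pred, who, room in facts
                     if pred == "in" and who == agent}
    wanted_items = {item for pred, who, item in facts
                    if pred == "wants" and who == agent}
    target_rooms = {room for pred, room, item in facts
                    if pred == "contains" and item in wanted_items}
    doors_here = {door for pred, door, room in facts
                  if pred == "door-of" and room in current_rooms}
    return sorted(door for pred, door, room in facts
                  if pred == "leads-to"
                  and door in doors_here
                  and room in target_rooms)


# Assumed layout: the transcript names rooms, doors, and items but not the map.
world = {
    ("in", "agent", "r1"),
    ("wants", "agent", "i3"),
    ("door-of", "d1", "r1"), ("door-of", "d2", "r1"), ("door-of", "d3", "r2"),
    ("leads-to", "d1", "r2"), ("leads-to", "d2", "r3"), ("leads-to", "d3", "r4"),
    ("contains", "r2", "i3"), ("contains", "r3", "i4"),
}

chosen = select_doors(world)
print(chosen)
```

The rule combines what the agent wants (`wants`), perceives (`in`, `door-of`), and knows (`leads-to`, `contains`), matching the three knowledge sources the slide names.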

  26. Summary Diagrammatic behavior specification approach: • To extract rich behavior knowledge • Interactive behavior specification • Communication medium between the agents (explicit goals and assumed situation) • Relational learning by observation approach to combine multiple complex knowledge sources

  27. Future Work • Improve mixed-initiative interaction of the interface • Explore domain-independent diagrammatic interface features • Allow the expert to enter context-sensitive knowledge

  28. Mixed initiative perspective • Interactive behavior specification • Diagrammatic representation of behavior • communication medium between the agents • Explicit goals and desired behavior • Facilitates interaction between the agents
