Emotion-Driven Reinforcement Learning

Bob Marinier & John Laird, University of Michigan, Computer Science and Engineering. CogSci’08.




Introduction

  • Interested in the functional benefits of emotion for a cognitive agent

    • Appraisal theories of emotion

    • PEACTIDM theory of cognitive control

  • Use emotion as a reward signal to a reinforcement learning agent

    • Demonstrates a functional benefit of emotion

    • Provides a theory of the origin of intrinsic reward


Outline

  • Background

    • Integration of emotion and cognition

    • Integration of emotion and reinforcement learning

    • Implementation in Soar

  • Learning task

  • Results


Appraisal Theories of Emotion

  • A situation is evaluated along a number of appraisal dimensions, many of which relate the situation to current goals

    • Novelty, goal relevance, goal conduciveness, expectedness, causal agency, etc.

  • Appraisals influence emotion

  • Emotion can then be coped with (via internal or external actions)

[Diagram labels: Situation, Goals, Appraisals, Emotion, Coping]




Unification of PEACTIDM and Appraisal Theories

[Diagram: PEACTIDM processing cycle annotated with appraisals.]

  • Stages shown: Perceive, Encode, Attend, Comprehend, Intend, Decode, Motor

  • Other labels on the diagram: Raw Perceptual Information, Stimulus Relevance, Stimulus chosen for processing, Current Situation Assessment, Action, Prediction, Motor Commands, Environmental Change

  • Appraisal labels: Suddenness, Unpredictability, Goal Relevance, Intrinsic Pleasantness, Causal Agent/Motive, Discrepancy, Conduciveness, Control/Power, Outcome Probability


Distinction between emotion, mood, and feeling (Marinier & Laird 2007)

  • Emotion: Result of appraisals

    • Is about the current situation

  • Mood: “Average” over recent emotions

    • Provides historical context

  • Feeling: Emotion “+” Mood

    • What agent actually perceives
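The relationship between the three terms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide puts “Average” and “+” in quotes, so the plain moving average and the clipped element-wise sum used here are stand-ins, not the authors' actual functions. The names `MoodTracker`, `feeling`, and the `window` parameter are hypothetical.

```python
from collections import deque
from statistics import fmean


class MoodTracker:
    """Keeps a per-dimension average over the most recent emotions.

    Assumption: a plain moving average stands in for the slide's "Average".
    """

    def __init__(self, dims, window=5):
        self.dims = dims
        self.history = deque(maxlen=window)  # oldest emotions fall out

    def update(self, emotion):
        """Record the emotion produced by the latest appraisals."""
        self.history.append(tuple(emotion))

    def mood(self):
        """Return the average of the stored emotions (neutral if none yet)."""
        if not self.history:
            return (0.0,) * self.dims
        return tuple(fmean(column) for column in zip(*self.history))


def feeling(emotion, mood):
    """Combine emotion and mood into what the agent perceives.

    Assumption: the slide's "+" is modeled as an element-wise sum
    clipped to [-1, 1].
    """
    return tuple(max(-1.0, min(1.0, e + m)) for e, m in zip(emotion, mood))
```

Under these assumptions, an agent would call `update` after each appraisal and pass the current emotion together with `mood()` to `feeling`, so that recent history still colors the result when the current emotion is weak.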


Intrinsically Motivated Reinforcement Learning (Sutton & Barto 1998; Singh et al. 2004)

Reward = Intensity * Valence

[Diagram: reinforcement learning agent-environment loop. Labels: Environment, External Environment, Internal Environment, “Organism”, Agent, Critic, Appraisal Process, Sensations, Actions, States, Decisions, Rewards, +/- Feeling Intensity]
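The reward formula on this slide is simple enough to write out directly. The sketch below assumes intensity lies in [0, 1] and valence is a sign (+1 or -1), which matches the “+/- Feeling Intensity” label but is not spelled out in the slides. The function name `reward_from_feeling` is hypothetical.

```python
def reward_from_feeling(intensity, valence):
    """Reward = Intensity * Valence, as stated on the slide.

    Assumptions: intensity is in [0, 1] and valence is +1 or -1.
    """
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity is assumed to lie in [0, 1]")
    if valence not in (-1, 1):
        raise ValueError("valence is assumed to be +1 or -1")
    return intensity * valence
```

With these ranges the reward is bounded in [-1, 1], and a strong negative feeling yields a large negative reward.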


Extending Soar with Emotion (Marinier & Laird 2007)

[Architecture diagram labels: Symbolic Long-Term Memories (Procedural, Semantic, Episodic); Reinforcement Learning, Chunking, Semantic Learning, Episodic Learning; Appraisal Detector; Short-Term Memory (Situation, Goals); Decision Procedure; Visual Imagery; Perception; Action; Body]


Extending Soar with Emotion (Marinier & Laird 2007)

[Architecture diagram labels: Symbolic Long-Term Memories (Procedural, Semantic, Episodic); Reinforcement Learning, Chunking, Semantic Learning, Episodic Learning; Appraisal Detector; Short-Term Memory (Situation, Goals, Feelings); Decision Procedure; Appraisals; Feelings; +/-Intensity; Visual Imagery; Perception; Action; Body; legend: Knowledge, Architecture]

  • Emotion: .5,.7,0,-.4,.3,…

  • Mood: .7,-.2,.8,.3,.6,…

  • Feeling: .9,.6,.5,-.1,.8,…


Learning task

[Figure: task environment with Start and Goal marked]


Learning task: Encoding

  • North: Passable: false; On path: false; Progress: true

  • East: Passable: false; On path: true; Progress: true

  • West: Passable: false; On path: false; Progress: true

  • South: Passable: true; On path: true; Progress: true


Learning task: Encoding & Appraisal

  • North: Intrinsic Pleasantness: Low; Goal Relevance: Low; Unpredictability: High

  • East: Intrinsic Pleasantness: Low; Goal Relevance: High; Unpredictability: High

  • West: Intrinsic Pleasantness: Low; Goal Relevance: Low; Unpredictability: High

  • South: Intrinsic Pleasantness: Neutral; Goal Relevance: High; Unpredictability: Low
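One way to read the two encoding slides together is as a mapping from encoded features to appraisal values. The Python sketch below is a guess that reproduces the four examples shown (North, East, West, South); it is not the authors' rule set. Because Progress is true in all four examples, this guess does not use it. The names `Encoding`, `appraise`, and `SLIDE_EXAMPLES` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encoding:
    """Encoded features for one direction, as listed on the Encoding slide."""
    passable: bool
    on_path: bool
    progress: bool


def appraise(enc):
    """Map an encoding to appraisal values.

    Assumption: these rules are inferred only from the four slide
    examples and may not generalize beyond them.
    """
    return {
        "intrinsic_pleasantness": "neutral" if enc.passable else "low",
        "goal_relevance": "high" if enc.on_path else "low",
        "unpredictability": "low" if enc.passable else "high",
    }


# The four directions exactly as encoded on the slide.
SLIDE_EXAMPLES = {
    "north": Encoding(passable=False, on_path=False, progress=True),
    "east": Encoding(passable=False, on_path=True, progress=True),
    "west": Encoding(passable=False, on_path=False, progress=True),
    "south": Encoding(passable=True, on_path=True, progress=True),
}
```

Calling `appraise(SLIDE_EXAMPLES["south"])` gives neutral pleasantness, high goal relevance, and low unpredictability, matching the slide's South entry.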


Learning task: Attending, Comprehending & Appraisal

  • South: Intrinsic Pleasantness: Neutral; Goal Relevance: High; Unpredictability: Low; Conduciveness: High; Control: High …



Learning task: Tasking

Optimal Subtasks


What is being learned?

  • When to Attend vs. Task

  • If Attending, what to Attend to

  • If Tasking, which subtask to create

  • When to Intend vs. Ignore




Discussion

  • Agent learns both internal (tasking) and external (movement) actions

  • Emotion provides more frequent rewards, so the agent learns faster than with standard RL

  • Mood “fills in the gaps”, allowing for even faster learning and less variability
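To make the "more frequent rewards" point concrete, here is a sketch of a one-step tabular Q-learning update in Python. The slides do not say which update rule the Soar agent uses, so standard Q-learning is swapped in purely for illustration. The function `q_update`, its parameters, and the state and action names are all hypothetical. The idea shown is that a feeling-based reward can be non-zero on every step, so each update carries learning signal, whereas a reward given only at the goal leaves most updates with `reward = 0`.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update and return the new value.

    q maps (state, action) pairs to value estimates; unseen pairs
    default to 0.0 when q is a defaultdict(float).
    """
    # Value of the best action available in the next state (0.0 if none).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Usage: a positive feeling after moving south yields an immediate update.
q_table = defaultdict(float)
q_update(q_table, "start", "south", reward=0.5,
         next_state="s1", next_actions=["north", "south"])
```

Both internal choices (such as which subtask to create) and external moves could be keyed as actions in the same table, which is one way to read the first bullet above.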


Conclusion & Future Work

  • Demonstrated a computational model that integrates emotion and cognitive control

  • Confirmed emotion can drive reinforcement learning

  • We have already successfully demonstrated similar learning in a more complex domain

  • Would like to explore multi-agent scenarios

