Learning Prospective Robot Behavior

Shichao Ou and Roderic Grupen

Laboratory for Perceptual Robotics

University of Massachusetts Amherst


A Developmental Approach

  • Infant Learning

    • In stages

      • Maturation processes

    • Parents provide constrained learning contexts

      • Protect

      • EasyComplex

        • Motion mobile for newborns

        • Use brightly colored, easy to pick up objects

        • Use building blocks

        • Association of words and objects


Application in Robotics

  • Framework for Robot Developmental Learning

    • Role of teacher: set up learning contexts that make the target concept conspicuous

    • Role of robot: acquire concepts, generalize to new contexts by autonomous exploration, provide feedback

  • Control Basis

    • Robot actions are created using combinations of <σ, φ, τ> (sensor resources σ, potential functions φ, effector resources τ)

    • Establish stages of learning by time-varying constraints on resources

      • Easy  Complex


Example

  • Learning to Reach for Objects

    • Stage 1: SearchTrack

      • Focus attention using single brightly colored object (σ)

      • Limit DOF (τ) to use head ONLY

    • Stage 2: ReachGrab

      • Limit DOF (τ) to use one arm ONLY

    • Stage 3: Handedness, Scale-Sensitive

Hart et al., 2008
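
The staging in this example can be sketched as a schedule of resource constraints. This is a minimal illustration, not the authors' implementation: the `Stage` class, the resource names ("bright_object", "head", "one_arm") and the potential-function names are all hypothetical, chosen only to mirror the bullets above.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Stage:
    name: str
    sensors: frozenset    # sigma: sensor resources the teacher exposes
    effectors: frozenset  # tau: effector resources (degrees of freedom) unlocked


# Hypothetical potential-function names (phi), for illustration only.
POTENTIALS = ("search", "track", "reach", "grab")

# Hypothetical schedule mirroring Stage 1 and Stage 2 of the example.
STAGES = (
    Stage("SearchTrack", frozenset({"bright_object"}), frozenset({"head"})),
    Stage("ReachGrab", frozenset({"bright_object"}), frozenset({"one_arm"})),
)


def candidate_actions(stage):
    """Enumerate every <sigma, phi, tau> combination the stage permits."""
    return [
        (sigma, phi, tau)
        for sigma, phi, tau in product(
            sorted(stage.sensors), POTENTIALS, sorted(stage.effectors)
        )
    ]


stage1_actions = candidate_actions(STAGES[0])
stage2_actions = candidate_actions(STAGES[1])
```

Restricting σ and τ per stage keeps the set of combinations the learner has to explore small; relaxing the constraints later enlarges it.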


Prospective Learning

  • Infants adapt to new situations by looking ahead prospectively, predicting failure, and then learning a repair strategy


Robot Prospective Learning with Human Guidance

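[Figure: three diagrams of a policy drawn as a chain of states S0, S1, ..., Si, ..., Sj, ..., Sn linked by actions a0, a1, ..., ai-1, ai, ..., aj-1, aj, ..., an-1. The second diagram is labeled "Challenge", with g(f)=0 and g(f)=1. The third diagram adds a sub-task with states Si1, ..., Sij, ..., Sin.]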

A 2D Navigation Domain Problem

  • 30x30 map

  • 6 doors, randomly closed

  • 6 buttons

  • 1 start and 1 goal

  • 3-bit door sensor on robot


Flat Learning Results

  • Flat Q-Learning

    • 5-component state

      • (x,y, door-bit1, door-bit2, door-bit3)

    • 4 actions

      • up, down, left, right

    • Reward

      • 1 for reaching the goal

      • -0.01 for every step taken

    • Learning parameters

      • α=0.1, γ=1.0, ε=0.1

  • Learned solutions after 30,000 episodes
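
As a rough illustration of the flat learner, the sketch below runs tabular Q-learning with the slide's actions, rewards, and parameters (α=0.1, γ=1.0, ε=0.1). Everything else is an assumption made to keep it short: a small 5x5 grid with no doors or buttons, the start and goal cells, the episode count, and the choice to give reward 1 (without the step cost) on the goal step.

```python
import random

# Actions and parameters as listed on the slide.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
ALPHA, GAMMA, EPSILON = 0.1, 1.0, 0.1


def step(state, action, size, goal):
    """Move one cell, clamped to the grid; return (next_state, reward, done)."""
    dx, dy = ACTIONS[action]
    x = min(max(state[0] + dx, 0), size - 1)
    y = min(max(state[1] + dy, 0), size - 1)
    nxt = (x, y)
    if nxt == goal:
        return nxt, 1.0, True   # assumption: no step cost on the goal step
    return nxt, -0.01, False


def greedy(q, state):
    """Action with the highest Q-value; unseen entries default to 0."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def q_learning(size=5, start=(0, 0), goal=(4, 4),
               episodes=2000, max_steps=200, seed=0):
    """Tabular epsilon-greedy Q-learning on a toy grid (assumed layout)."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            if rng.random() < EPSILON:
                action = rng.choice(list(ACTIONS))
            else:
                action = greedy(q, state)
            nxt, reward, done = step(state, action, size, goal)
            best_next = 0.0 if done else max(
                q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)
            state = nxt
            if done:
                break
    return q


def rollout(q, start, goal, size, max_steps=50):
    """Follow the greedy policy from start and return the visited cells."""
    state, path = start, [start]
    for _ in range(max_steps):
        if state == goal:
            break
        state, _, _ = step(state, greedy(q, state), size, goal)
        path.append(state)
    return path


q_table = q_learning()
path = rollout(q_table, (0, 0), (4, 4), 5)
```

On the full 30x30 map with door bits in the state, the same update rule applies, but the table is far larger, which is consistent with the many episodes reported above.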


Prospective Learning

  • Stage 1

    • All doors open

    • Constrain resources to use only (x,y) sensors

    • Allow the agent to learn a policy from start to goal

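[Figure: learned policy drawn as a chain of states S0, S1, ..., Si, ..., Sj, ..., Sn, labeled with the actions Down, Right, Right, Up, Right, Right, Right.]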
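
One way to see why the (x, y)-only constraint makes Stage 1 easier is to count table entries for a tabular learner. The arithmetic below assumes every cell of the 30x30 map is a distinct (x, y) state and that the three door bits are independent binary values; both are assumptions, since the slides do not spell out the encoding.

```python
GRID = 30        # 30x30 map from the slides
DOOR_BITS = 3    # 3-bit door sensor
NUM_ACTIONS = 4  # up, down, left, right

stage1_states = GRID * GRID                   # (x, y) only: 900
flat_states = stage1_states * 2 ** DOOR_BITS  # (x, y, bit1, bit2, bit3): 7200

stage1_entries = stage1_states * NUM_ACTIONS  # 3600 Q-values
flat_entries = flat_states * NUM_ACTIONS      # 28800 Q-values
```

Under these assumptions the Stage 1 learner fills a table one eighth the size of the flat learner's.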


Prospective Learning

  • Stage 2

    • Close 1 door

    • Robot learns the cause of the failure

    • Robot backtracks and finds an earlier indicator of this cause

    • Create a sub-task

    • Learn a new policy for the sub-task

    • Resume original policy
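
The Stage 2 steps can be sketched, very loosely, as "compare a failed run with a successful one, find the earliest step where a sensor reading differs, insert a sub-task there, then resume". This is only an illustration under stated assumptions: the function names, the `door_bits` readings, and the `press_button` action are hypothetical, and a fixed action list stands in for the sub-task policy that the robot would actually learn.

```python
def earliest_indicator(ok_run, failed_run, feature):
    """Index of the first step whose `feature` reading differs, or None."""
    for i, (ok_obs, bad_obs) in enumerate(zip(ok_run, failed_run)):
        if ok_obs[feature] != bad_obs[feature]:
            return i
    return None


def splice_subtask(plan, index, subtask):
    """Run `subtask` on reaching step `index`, then resume the original plan."""
    return plan[:index] + subtask + plan[index:]


# Stage 1 action sequence taken from the earlier figure labels.
plan = ["down", "right", "right", "up", "right", "right", "right"]

# Hypothetical observations: door_bits reads 0 when the path is clear.
ok_run = [{"door_bits": 0} for _ in plan]
failed_run = [
    {"door_bits": 0},
    {"door_bits": 0},
    {"door_bits": 5},  # first reading that differs from the successful run
    {"door_bits": 5},  # run aborted at the closed door
]

indicator = earliest_indicator(ok_run, failed_run, "door_bits")
# "press_button" is a placeholder for a learned sub-task policy.
repaired = splice_subtask(plan, indicator, ["press_button"])
```

Because the original plan is spliced rather than relearned, the actions before and after the inserted sub-task are unchanged.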


Prospective Learning Results

Learned solutions in fewer than 2,000 episodes


Humanoid Robot Manipulation Domain

  • Benefits of Prospective Learning

    • Adapts to new contexts while preserving most of the existing policy

    • Automatically generates sub-goals

    • Sub-tasks can be learned in a completely different state space

    • Supports interactive learning


Conclusion

  • A developmental view of robot learning

  • A framework that enables interactive, incremental learning in stages

  • An extension of the control basis learning framework using the idea of prospective learning

