Learning Prospective Robot Behavior

Shichao Ou and Roderic Grupen
Laboratory for Perceptual Robotics
University of Massachusetts Amherst


A Developmental Approach
  • Infant Learning
    • In stages
      • Maturation processes
    • Parents provide constrained learning contexts
      • Protect
      • EasyComplex
        • Motion mobile for newborns
        • Use brightly colored, easy to pick up objects
        • Use building blocks
        • Association of words and objects
Application in Robotics
  • Framework for Robot Developmental Learning
    • Role of teacher: set up learning contexts that make the target concept conspicuous
    • Role of robot: acquire concepts, generalize to new contexts by autonomous exploration, provide feedback
  • Control Basis
    • Robot actions are created from combinations of <σ, φ, τ> (sensory resources, potential functions, effector resources)
    • Establish stages of learning by time-varying constraints on resources
      • Easy → Complex
Example
  • Learning to Reach for Objects
    • Stage 1: SearchTrack
      • Focus attention using single brightly colored object (σ)
      • Limit DOF (τ) to use head ONLY
    • Stage 2: ReachGrab
      • Limit DOF (τ) to use one arm ONLY
    • Stage 3: Handedness, Scale-Sensitive

Hart et al., 2008
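
The staged constraints on resources can be pictured as a filter on which <σ, φ, τ> combinations the robot may try in each stage. The sketch below is only an illustration under stated assumptions: the `Stage` class, the resource names ("bright_object", "head", "one_arm") and the potential-function names ("track", "reach") are hypothetical labels chosen for this example, not identifiers from the control basis framework, and Stage 3 is left out because the slides do not list its resources.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Stage:
    """One learning stage: the resources the teacher makes available."""
    name: str
    sensors: tuple    # sigma: allowed sensory resources (hypothetical names)
    effectors: tuple  # tau: allowed effector resources (hypothetical names)


def allowed_actions(stage, potentials):
    """Enumerate every (sigma, phi, tau) triple permitted in this stage."""
    return list(product(stage.sensors, potentials, stage.effectors))


# Hypothetical potential functions (phi); the slides do not name them.
POTENTIALS = ("track", "reach")

# Stage 1 uses the head only, Stage 2 one arm only, as in the example above.
# Assumption: both stages attend to the same single bright object.
STAGES = (
    Stage("SearchTrack", sensors=("bright_object",), effectors=("head",)),
    Stage("ReachGrab", sensors=("bright_object",), effectors=("one_arm",)),
)

if __name__ == "__main__":
    for stage in STAGES:
        print(stage.name, allowed_actions(stage, POTENTIALS))
```

Restricting each stage to a single sensor and a single effector group keeps the number of candidate actions small, which is the point of the Easy → Complex progression.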

Prospective Learning
  • Infants adapt to new situations by looking ahead prospectively, predicting failure, and then learning a repair strategy
Robot Prospective Learning with Human Guidance

[Figure: the policy drawn as a chain of states S0, S1, …, Si, …, Sj, …, Sn connected by actions a0, a1, …, an-1, shown three times. Additional labels: "Challenge", g(f)=0 and g(f)=1, and a "sub-task" branch with states Si1, …, Sij, …, Sin.]

A 2D Navigation Domain Problem
  • 30x30 map
  • 6 doors, randomly closed
  • 6 buttons
  • 1 start and 1 goal
  • 3-bit door sensor on robot
Flat Learning Results
  • Flat Q-Learning
    • 5-dimensional state
      • (x,y, door-bit1, door-bit2, door-bit3)
    • 4 actions
      • up, down, left, right
    • Reward
      • 1 for reaching the goal
      • -0.01 for every step taken
    • Learning parameters
      • α=0.1, γ=1.0, ε=0.1
  • Learned solutions after 30,000 episodes
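
For reference, the flat baseline above is standard tabular Q-learning with an ε-greedy policy. The following is a minimal sketch, not the authors' implementation: it uses the same four actions and the same α, γ, ε values, but it runs on a small hypothetical open grid (no doors or buttons), uses only (x, y) as the state, and assumes the -0.01 step cost is also charged on the step that reaches the goal. The function names, grid size, episode count and step limit are all choices made for this example.

```python
import random
from collections import defaultdict

# The four actions from the slide, as (dx, dy) offsets.
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def step(state, action, size, goal):
    """Hypothetical open grid: move one cell, clipped at the outer walls."""
    dx, dy = ACTIONS[action]
    nxt = (min(max(state[0] + dx, 0), size - 1),
           min(max(state[1] + dy, 0), size - 1))
    done = nxt == goal
    # Assumption: -0.01 on every step, plus 1 when the goal is reached.
    reward = -0.01 + (1.0 if done else 0.0)
    return nxt, reward, done


def q_learning(start, goal, size=5, episodes=2000, alpha=0.1, gamma=1.0,
               epsilon=0.1, max_steps=200, seed=0):
    """Tabular Q-learning; any hashable state (e.g. with door bits) works."""
    rng = random.Random(seed)
    q = defaultdict(float)  # (state, action) -> estimated return
    names = list(ACTIONS)
    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.choice(names)  # explore
            else:
                action = max(names, key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action, size, goal)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in names)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, start, goal, size=5, limit=100):
    """Follow the greedy policy from start, for at most `limit` steps."""
    path, state = [start], start
    for _ in range(limit):
        if state == goal:
            break
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, _ = step(state, action, size, goal)
        path.append(state)
    return path


if __name__ == "__main__":
    q = q_learning(start=(0, 0), goal=(4, 4))
    print(greedy_path(q, (0, 0), (4, 4)))
```

Adding the three door bits to the state tuple multiplies the table size by eight, which is one way to see why the flat learner needs many more episodes than the staged approach.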
Prospective Learning
  • Stage 1
    • All doors open
    • Constrain resources to use only (x,y) sensors
    • Allow the agent to learn a policy from start to goal

[Figure: the learned policy shown as the action sequence Down, Right, Right, Up, Right, Right, Right over states S0, S1, …, Si, …, Sj, …, Sn.]

Prospective Learning
  • Stage 2
    • Close 1 door
    • Robot learns the cause of the failure
    • Robot backtracks and finds an earlier indicator of this cause
    • Create a sub-task
    • Learn a new policy for the sub-task
    • Resume original policy
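
The Stage 2 steps can be sketched roughly as follows. This is an illustration under assumptions, not the method from the presentation: it supposes the robot has logged per-step observations from runs that succeeded and runs that failed, that a candidate feature (here a hypothetical "door" bit) is already known, and that the sub-task policy is given rather than learned. The function names and the "press_button" action are invented for this example.

```python
def backtrack_indicator(success_runs, failure_runs, fail_step, feature):
    """Walk backward from the failure step and return the earliest step at
    which `feature` still separates every failed run from every success."""
    earliest = None
    for t in range(fail_step, -1, -1):
        ok = {run[t][feature] for run in success_runs}
        bad = {run[t][feature] for run in failure_runs}
        if not ok.isdisjoint(bad):
            break  # the feature no longer predicts the failure here
        earliest = t
    return earliest


def repaired_plan(policy, observations, step, feature, bad_values, sub_policy):
    """Follow the original policy, but splice in the sub-task at `step`
    when the indicator fires, then resume the original policy."""
    plan = []
    for t, action in enumerate(policy):
        if t == step and observations[t][feature] in bad_values:
            plan.extend(sub_policy)  # detour: run the sub-task first
        plan.append(action)          # original policy is kept unchanged
    return plan


if __name__ == "__main__":
    policy = ["down", "right", "right", "up", "right", "right", "right"]
    open_run = [{"door": 0} for _ in policy]
    # Hypothetical log: the door bit reads 1 from step 2 on, failure at step 5.
    closed_run = [{"door": 0}, {"door": 0}] + [{"door": 1} for _ in range(5)]
    i = backtrack_indicator([open_run], [closed_run], fail_step=5, feature="door")
    print(i, repaired_plan(policy, closed_run, i, "door", {1}, ["press_button"]))
```

When the door bit reads open, `repaired_plan` returns the original action sequence untouched, which matches the idea of keeping most of the existing policy and adding only a local repair.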
Prospective Learning Results

Learned solutions in fewer than 2,000 episodes

Humanoid Robot Manipulation Domain
  • Benefits of Prospective Learning
    • Adapts to new contexts while keeping most of the existing policy
    • Automatically generates sub-goals
    • Sub-tasks can be learned in a completely different state space
    • Supports interactive learning
Conclusion
  • A developmental view of robot learning
  • A framework that enables interactive, incremental learning in stages
  • An extension of the control basis learning framework using the idea of prospective learning