
Learning Momentum: Integration and Experimentation

Brian Lee and Ronald C. Arkin

Mobile Robot Laboratory

Georgia Tech

Atlanta, GA


Motivation

  • It’s hard to manually derive controller parameters.

    • The size of the parameter space grows exponentially with the number of parameters.

  • You don’t always have a priori knowledge of the environment.

    • Without prior knowledge, a user can't confidently choose appropriate parameter values, so the robot must adapt on its own to what it finds.

  • Obstacle densities and layout in the environment may be heterogeneous.

    • Parameters that work well for one type of environment may not work well with another type.


Adaptation and Learning Methods – DARPA MARS

  • Investigate robot shaping at five distinct levels in a hybrid robot software architecture

  • Implement algorithms within MissionLab mission specification system

  • Conduct experiments to evaluate performance of each technique

  • Combine techniques where possible

  • Integrate on a platform more suitable for realistic missions and continue development


Overview of techniques

THE LEARNING CONTINUUM:

Deliberative (premission) . . . Behavioral switching . . . Reactive (online adaptation)

  • CBR Wizardry

    • Guide the operator

  • Probabilistic Planning

    • Manage complexity for the operator

  • RL for Behavioral Assemblage Selection

    • Learn what works for the robot

  • CBR for Behavior Transitions

    • Adapt to situations the robot can recognize

  • Learning Momentum

    • Vary robot parameters in real time


Basic Concepts of LM

  • Provides adaptability to behavior-based systems

  • A crude form of reinforcement learning.

    • If the robot is doing well, keep doing what it’s doing, otherwise try something different.

  • Behavior parameters are changed in response to progress and obstacles.

  • The system is still fully reactive.

    • Although the robot changes its behavior, there is no deliberation.


Currently Used Behaviors

  • Move to Goal

    • Always returns a vector pointing toward the goal position.

  • Avoid Obstacles

    • Returns a sum of weighted vectors pointing away from obstacles.

  • Wander

    • Returns vectors pointing in random directions.
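
To make these behaviors concrete, here is a minimal sketch of each as a 2D vector field. This is our own Python/numpy rendering; the function names, the linear repulsion profile, and the uniform random wander direction are assumptions, not the MissionLab implementation.

```python
import numpy as np

def move_to_goal(pos, goal, gain):
    """Unit vector toward the goal, scaled by the goal gain Gm."""
    v = goal - pos
    d = np.linalg.norm(v)
    return gain * v / d if d > 0 else np.zeros(2)

def avoid_obstacles(pos, obstacles, gain, sphere_of_influence):
    """Weighted sum of vectors pointing away from each obstacle within S."""
    total = np.zeros(2)
    for obs in obstacles:
        v = pos - obs
        d = np.linalg.norm(v)
        if 0 < d < sphere_of_influence:
            total += (sphere_of_influence - d) * v / d  # repulsion grows as d shrinks
    return gain * total

class Wander:
    """Random unit direction held for `persistence` consecutive steps."""
    def __init__(self):
        self.steps_left = 0
        self.direction = np.zeros(2)

    def __call__(self, gain, persistence):
        if self.steps_left <= 0:
            angle = np.random.uniform(0.0, 2.0 * np.pi)
            self.direction = np.array([np.cos(angle), np.sin(angle)])
            self.steps_left = persistence
        self.steps_left -= 1
        return gain * self.direction
```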


Adjustable Parameters

  • Move to goal vector gain

  • Avoid obstacle vector gain

  • Avoid obstacle sphere of influence

    • Radius around the robot inside of which obstacles are perceived

  • Wander vector gain

  • Wander persistence

    • The number of consecutive steps the wander vector points in the same direction
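
For reference in the sketches that follow, these five parameters can be grouped into one record that the LM module mutates at run time. The field names are ours, not MissionLab's:

```python
from dataclasses import dataclass

@dataclass
class Params:
    goal_gain: float             # Gm: move-to-goal vector gain
    obstacle_gain: float         # Go: avoid-obstacle vector gain
    sphere_of_influence: float   # S: obstacle perception radius, in meters
    wander_gain: float           # Gw: wander vector gain
    wander_persistence: int      # P: steps a wander direction is held
```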


Four Predefined Situations

  • No movement

    • M < T_movement

  • Progress toward the goal

    • M > T_movement

    • P > T_progress

  • No progress with obstacles

    • M > T_movement

    • P < T_progress

    • O_count > T_obstacles

  • No progress without obstacles

    • M > T_movement

    • P < T_progress

    • O_count < T_obstacles

Definitions: M = average movement; M_goal = average movement toward the goal; P = M_goal / M; O_count = obstacles encountered; T_movement, T_progress, and T_obstacles = movement, progress, and obstacle-count thresholds.
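
The classification itself reduces to a few threshold comparisons. A minimal sketch, assuming the averages are computed over a sliding window of recent steps:

```python
def classify_situation(M, M_goal, O_count,
                       T_movement, T_progress, T_obstacles):
    """Map windowed movement statistics to one of the four situations."""
    P = M_goal / M if M > 0 else 0.0  # fraction of movement made toward the goal
    if M < T_movement:
        return "no_movement"
    if P > T_progress:
        return "progress"
    if O_count > T_obstacles:
        return "no_progress_with_obstacles"
    return "no_progress_without_obstacles"
```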


Parameter Adjustments

[Table: sample adjustment parameters for ballooning.]


Two Possible Strategies

  • Ballooning - Sphere of influence is increased when obstacles impede progress. The robot moves around large objects.

  • Squeezing - Sphere of influence is decreased when obstacles impede progress. The robot moves between closely spaced objects.
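
Each strategy can be written as a table of per-situation parameter deltas applied every step. The magnitudes below are placeholders for illustration only (the real values come from adjustment tables like the one referenced earlier), but the signs capture the key difference: ballooning grows S when obstacles impede progress, squeezing shrinks it.

```python
# Hypothetical delta tables; field names match the Params sketch above.
BALLOONING = {
    "no_movement":                   {"sphere_of_influence": +0.5, "wander_gain": +0.1},
    "progress":                      {"goal_gain": +0.1, "wander_gain": -0.1},
    "no_progress_with_obstacles":    {"obstacle_gain": -0.1, "sphere_of_influence": +0.5},
    "no_progress_without_obstacles": {"goal_gain": +0.1, "wander_gain": +0.1},
}

SQUEEZING = {
    "no_movement":                   {"sphere_of_influence": -0.5, "wander_gain": +0.1},
    "progress":                      {"goal_gain": +0.1, "wander_gain": -0.1},
    "no_progress_with_obstacles":    {"obstacle_gain": -0.1, "sphere_of_influence": -0.5},
    "no_progress_without_obstacles": {"goal_gain": +0.1, "wander_gain": +0.1},
}

def adjust(params, situation, strategy):
    """Apply the strategy's deltas for the current situation in place."""
    for field, delta in strategy[situation].items():
        # A real controller would also clamp each parameter to a valid range.
        setattr(params, field, getattr(params, field) + delta)
```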


Integration

Base System

[Diagram: sensors feed position and goal information to Move To Goal(Gm) and obstacle information to Avoid Obstacles(Go, S); together with Wander(Gw, P), the controller sums the behavior vectors into a single output direction.]

  • Gm = goal gain

  • Go = obstacle gain

  • S = obstacle sphere of influence

  • Gw = wander gain

  • P = wander persistence


Integration

Integrated System

[Diagram: the base system with an LM Module added. The module monitors sensor data and the robot's progress, and feeds new Gm, Go, S, Gw, and P parameters back to the behaviors each step.]

  • Gm = goal gain

  • Go = obstacle gain

  • S = obstacle sphere of influence

  • Gw = wander gain

  • P = wander persistence
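
Putting the pieces together, one reactive cycle of the integrated system might look like the sketch below, reusing the earlier helpers. The progress monitor and the threshold values are illustrative assumptions, not the authors' implementation:

```python
def control_step(pos, goal, obstacles, params, wander, monitor, strategy):
    """One cycle: blend behavior vectors, then let the LM module adapt parameters."""
    output = (move_to_goal(pos, goal, params.goal_gain)
              + avoid_obstacles(pos, obstacles, params.obstacle_gain,
                                params.sphere_of_influence)
              + wander(params.wander_gain, params.wander_persistence))
    # LM module: classify windowed progress statistics, nudge parameters in place.
    M, M_goal, O_count = monitor.window_stats()  # hypothetical progress monitor
    situation = classify_situation(M, M_goal, O_count,
                                   T_movement=0.05, T_progress=0.5,
                                   T_obstacles=5)  # illustrative thresholds
    adjust(params, situation, strategy)
    return output  # direction vector handed to the motor controller
```

Note that nothing in this loop deliberates; the parameters change, but every output is still a direct function of the current percepts, which is why the system remains fully reactive.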


Experiments in Simulation

  • 150m x 150m area

  • robot moves from (10m, 10m) to (140m, 90m)

  • Obstacle densities of 15% and 20% were used.

  • Obstacle radii varied between 0.38m and 1.43m.
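
A rough sketch of how such an obstacle field could be generated (our construction, not the authors' simulator; overlap between obstacles is ignored, so the density is approximate):

```python
import numpy as np

def make_obstacle_field(density, size=150.0, r_min=0.38, r_max=1.43, seed=None):
    """Scatter circular (x, y, radius) obstacles until roughly `density` of the area is covered."""
    rng = np.random.default_rng(seed)
    obstacles, covered = [], 0.0
    while covered / size**2 < density:
        r = rng.uniform(r_min, r_max)
        obstacles.append((rng.uniform(0, size), rng.uniform(0, size), r))
        covered += np.pi * r**2  # ignores overlap between obstacles
    return obstacles

field_15 = make_obstacle_field(0.15)  # 15% obstacle density
field_20 = make_obstacle_field(0.20)  # 20% obstacle density
```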


Ballooning


Observations on Ballooning

  • Covers a lot of area

  • Not as easily trapped in box canyon situations

  • May settle in locally clear areas

  • May require a high wander gain to carry the robot through closely spaced obstacles


Squeezing


Observations on Squeezing

  • Results in a straighter path

  • Moves easily through closely spaced obstacles

  • May get trapped in small box canyon situations for large amounts of time


Simulations of the Real World

[Diagram: simulated setup of the real-world environment, a 24m x 10m area with marked start and end places.]


Completion Rates For Simulation

[Charts: completion rates for uniform obstacle size (1m radii) and for varying obstacle sizes (0.38m - 1.43m radii).]


Average Steps to Completion

[Charts: average steps to completion for uniform obstacle size (1m radii) and for varying obstacle sizes (0.38m - 1.43m radii).]


Results From Simulated Real Environment

[Charts: % complete and steps to completion in the simulated real environment.]

  • As before, there is an increase in completion rates with an accompanying increase in steps to completion.


Simulation Results

  • Completion rates can be drastically improved.

  • Completion rate improvements come at a cost of time.

  • Ballooning and squeezing strategies are geared toward different situations.


Physical Robot Experiments

  • Nomad 150 robot

  • Sonar ring for obstacle avoidance

  • Traverses the length of a 24m x 10m room while negotiating obstacles


Outdoor Run (adaptive)


Outdoor Run (non-adaptive)


Physical Experiment Results

  • Non-learning robots became stuck.

  • Learning robots successfully negotiated the obstacles.

  • Squeezing was faster than ballooning in this case.

[Table: average steps to goal.]


Conclusions

  • Improved success rates come at a cost in time.

  • Performance of one strategy is very poor in situations better suited for another strategy.

  • The ballooning strategy is generally faster.

  • Ballooning robots can move through closely spaced objects faster than squeezing robots can move out of box canyon situations.


Conclusions (cont’d)

  • If some general knowledge of the terrain is known a priori, an appropriate strategy can be chosen.

  • If terrain is totally unknown, ballooning is probably the better choice.

  • A way to dynamically switch strategies should improve performance.

