
Adaptive Intelligent Mobile Robotics. Leslie Pack Kaelbling, Artificial Intelligence Laboratory, MIT




  1. Adaptive Intelligent Mobile Robotics • Leslie Pack Kaelbling • Artificial Intelligence Laboratory • MIT

  2. Progress to Date • Erik the Red • Video game environment • Optical flow implementation • Fast bootstrapped reinforcement learning

  3. Erik the Red • RWI B21 robot • camera, sonars, laser range-finder, infrareds • 3 Linux machines • ported our framework for writing debuggable code

  4. Erik the Red

  5. Crystal Space • Open-source video-game environment • complex graphics • other agents • highly modifiable

  6. Crystal Space

  7. Optical Flow • Get range information visually by computing optical flow field • nearer objects cause flow of higher magnitude • expansion pattern means you’re going to hit • rate of expansion tells you when • elegant control laws based on center and rate of expansion (derived from human and fly behavior)
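The control ideas in this slide can be made concrete with a small sketch. Below is a minimal illustration (not from the talk) of recovering time-to-contact from the divergence of a dense optical-flow field, assuming OpenCV's Farneback flow over two consecutive grayscale frames; the function name and parameter values are illustrative choices, not the original implementation.

```python
# Minimal sketch: estimate time-to-contact from the divergence of a dense
# optical-flow field. Assumes OpenCV (cv2) and two consecutive grayscale
# frames; names and constants here are illustrative.
import cv2
import numpy as np

def time_to_contact(prev_gray, curr_gray, dt=1.0 / 30.0):
    # Dense flow via Farneback's algorithm (one of several possible choices).
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, curr_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    u, v = flow[..., 0], flow[..., 1]

    # The divergence of the flow field measures the rate of expansion.
    # For pure forward translation toward a roughly frontal surface,
    # divergence ~= 2 / tau, with tau the time-to-contact in frames.
    du_dx = np.gradient(u, axis=1)
    dv_dy = np.gradient(v, axis=0)
    divergence = float(np.mean(du_dx + dv_dy))

    if divergence <= 1e-6:            # no net expansion: not approaching
        return np.inf
    return dt * 2.0 / divergence      # seconds until contact
```

This is exactly the structure the slide's control laws exploit: nearer objects produce larger flow, an expansion pattern signals an impending collision, and the rate of expansion says when it will happen.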

  8. Optical Flow in Crystal Space

  9. Making RL Really Work • Typical RL methods require far too much data to be practical in an online setting. We address the problem with • strong generalization techniques • human input to bootstrap learning

  10. JAQL • Learning a value function in a continuous state and action space • based on locally weighted regression (fancy version of nearest neighbor) • algorithm knows what it knows • use meta-knowledge to be conservative about dynamic-programming updates
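As a rough illustration of the "knows what it knows" idea, here is a sketch (an assumption-laden stand-in, not the actual JAQL code) of a kernel-weighted, nearest-neighbor-style value estimator that declines to answer when too little data lies near a query, so dynamic-programming backups are applied only where the estimate is trustworthy.

```python
# Sketch of a zeroth-order locally weighted (kernel-average) Q estimator
# with a simple "knows what it knows" test: if too little data lies near
# the query, it declines to answer and the DP update is skipped.
# Illustrative only; not the original JAQL implementation.
import numpy as np

class LWRQ:
    def __init__(self, bandwidth=0.5, min_weight=1.0):
        self.X, self.q = [], []        # (state, action) points and Q targets
        self.h = bandwidth             # Gaussian kernel width (assumed)
        self.min_weight = min_weight   # confidence threshold on local mass

    def add(self, x, q):
        self.X.append(np.asarray(x, dtype=float))
        self.q.append(float(q))

    def predict(self, x):
        """Return (estimate, known); known is False when local data is sparse."""
        if not self.X:
            return 0.0, False
        X, q = np.asarray(self.X), np.asarray(self.q)
        d2 = np.sum((X - np.asarray(x, dtype=float)) ** 2, axis=1)
        w = np.exp(-d2 / (2 * self.h ** 2))         # Gaussian kernel weights
        if w.sum() < self.min_weight:               # too far from any data:
            return 0.0, False                       # "I don't know"
        return float(np.dot(w, q) / w.sum()), True  # kernel-weighted average

def backup(qfun, s, a, r, next_sa_candidates, gamma=0.95):
    """Conservative DP update: only back up values the estimator trusts."""
    vals = [qfun.predict(sa) for sa in next_sa_candidates]
    known = [v for v, ok in vals if ok]
    if known:                                       # otherwise skip the update
        qfun.add(np.concatenate([s, a]), r + gamma * max(known))
```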

  11. Incorporating Human Input • Humans can help a lot, even if they can’t perform the task very well. • Provide some initial successful trajectories through the space • Trajectories are not used for supervised learning, but to guide the reinforcement-learning methods through useful parts of the space • Learn models of the dynamics of the world and of the reward structure • Once learned models are good, use them to update the value function and policy as well.
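A minimal sketch of the bootstrapping step, reusing the LWRQ estimator above: demonstrated transitions are replayed through the ordinary value backup rather than being used as supervised targets, so the human only needs to reach useful parts of the space, not act optimally. The helper names and interfaces are assumptions for this sketch.

```python
# Sketch of bootstrapping from human demonstrations: trajectories are
# replayed through the ordinary RL backup, not imitated directly.
# Interfaces below are illustrative assumptions.
import numpy as np

def features(s, a):
    # Hypothetical featurizer: concatenate state and action vectors.
    return np.concatenate([np.asarray(s, dtype=float),
                           np.asarray(a, dtype=float)])

def best_known_value(qfun, s, action_candidates):
    # Max over a coarse set of candidate actions, using only confident estimates.
    vals = [qfun.predict(features(s, a)) for a in action_candidates]
    known = [v for v, ok in vals if ok]
    return max(known) if known else 0.0

def replay_demonstrations(qfun, demos, action_candidates, gamma=0.95):
    """demos: list of trajectories, each a list of (s, a, r, s_next) tuples.
    Sweeping each trajectory backwards propagates reward along the
    demonstrated path in a single pass."""
    for trajectory in demos:
        for (s, a, r, s_next) in reversed(trajectory):
            target = r + gamma * best_known_value(qfun, s_next, action_candidates)
            qfun.add(features(s, a), target)   # ordinary RL backup, not imitation
```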

  12. Simple Experiment • The “hill-car” problem in two continuous dimensions (position and velocity) • Regular RL methods take thousands of trials to learn a reasonable policy • JAQL needs only 11 inefficient but eventually successful trials generated by humans to reach 80% performance • 10 more subsequent trials yield high-quality performance over the whole space
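For reference, here is a sketch of the standard continuous "hill-car" (mountain-car) dynamics commonly used for this benchmark; the slides do not give the exact constants used in these experiments, so the values below follow the familiar Moore/Sutton formulation.

```python
# Sketch of classic "hill-car" dynamics: continuous state (position,
# velocity) and continuous action. Constants follow the common
# Moore/Sutton formulation, not necessarily the original experiments.
import math

class HillCar:
    def __init__(self):
        self.reset()

    def reset(self):
        self.x, self.v = -0.5, 0.0          # start in the valley, at rest
        return (self.x, self.v)

    def step(self, a):
        a = max(-1.0, min(1.0, a))          # bounded continuous thrust
        self.v += 0.001 * a - 0.0025 * math.cos(3 * self.x)  # gravity term
        self.v = max(-0.07, min(0.07, self.v))
        self.x = max(-1.2, min(0.6, self.x + self.v))
        if self.x <= -1.2:
            self.v = 0.0                    # inelastic stop at the left wall
        done = self.x >= 0.5                # reached the hilltop goal
        reward = 0.0 if done else -1.0      # -1 per step favors short trials
        return (self.x, self.v), reward, done
```

With a -1 reward per step and success only at the hilltop, uninformed exploration rarely reaches the goal at all, which is why a handful of human trajectories shortens learning so dramatically.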

  13. Success Percentage

  14. Trial Length (200-step maximum; 54-step optimum)

  15. Next Steps • Implement optical-flow control algorithms on robot • Apply RL techniques to tune parameters in control algorithms on robot in real time • corridor following using sonar and laser • obstacle avoidance using optical flow • Build highly complex simulated environment • Integrate planning and learning in multi-layer system
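To make the tuning idea concrete, here is an illustrative corridor-following control law of the kind whose gains RL could adjust online; the sensor interface, sign conventions, and gain values are assumptions for the sketch, not the robot's actual API.

```python
# Illustrative corridor-following law whose parameters (the gains below)
# are the kind of quantities RL could tune online. Sensor interface and
# gain values are assumptions for this sketch.

def corridor_control(left_range, right_range, heading_error,
                     k_lateral=0.8, k_heading=1.5, v_forward=0.3):
    """Return (linear velocity, angular velocity) commands.

    left_range / right_range: perpendicular distances (m) to the corridor
    walls from sonar or laser; heading_error: angle (rad) between the
    robot heading and the corridor axis (positive = rotated left).
    Convention: positive angular velocity turns the robot left.
    """
    # Positive when the robot is nearer the right wall, so steer left;
    # the heading term damps oscillation by correcting the robot's angle.
    lateral_error = left_range - right_range
    omega = k_lateral * lateral_error - k_heading * heading_error
    return v_forward, omega
```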
