
Learning Behavioral Parameterization Using Spatio-Temporal Case-Based Reasoning



Presentation Transcript


1. Learning Behavioral Parameterization Using Spatio-Temporal Case-Based Reasoning
Maxim Likhachev, Michael Kaess, and Ronald C. Arkin
Mobile Robot Laboratory, Georgia Tech
This research was funded under the DARPA MARS program.

2. Motivation
• Constant parameterization of robotic behavior results in inefficient robot performance
• Manual selection of the "right" parameters is difficult and tedious work

3. Motivation (cont'd)
• Use of Case-Based Reasoning (CBR) methodology for automatic selection of optimal parameters at run-time (ICRA'01)
• Each case is a set of behavioral parameters indexed by environmental features
(figure: example "clear-to-goal" and "front-obstructed" cases)

4. Motivation for the Current Research
• The CBR module improves robot performance (in simulations and on real robots) and avoids the manual configuration of behavioral parameters
• The CBR module still requires the creation of a case library, which:
  - is dependent on the robot architecture
  - needs extensive experimentation to optimize cases
  - requires a good understanding of how CBR works
• Solution: extend the CBR module to learn new cases from scratch or optimize existing cases, in a separate training process or during missions

5. Related Work
• Use of Case-Based Reasoning in the selection of behavioral parameters: ACBARR [Georgia Tech '92], SINS [Georgia Tech '93], KINS [Chagas and Hallam]
• Automatic optimization of behavioral parameters:
  - genetic programming (e.g., GA-ROBOT [Ram et al.])
  - reinforcement learning (e.g., Learning Momentum [Lee et al.])

6. Behavioral Control and CBR Module
The CBR module controls (case output parameters):
• weights for each behavior
• BiasMove vector
• Noise Persistence
• Obstacle Sphere
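As a concrete anchor for these outputs, here is a minimal sketch of one case's output parameters as a Python container. The field names and default values are illustrative assumptions, not the paper's exact schema.

```python
from dataclasses import dataclass


@dataclass
class CaseOutput:
    """Hypothetical container for the behavioral parameters a case controls.

    Field names and defaults are illustrative; the schema-based controller
    in the paper uses analogous gains.
    """
    move_to_goal_gain: float = 1.0        # weight of the move-to-goal behavior
    noise_gain: float = 0.1               # weight of the random-noise behavior
    obstacle_gain: float = 1.2            # weight of the avoid-obstacle behavior
    bias_move_vector: tuple = (0.0, 0.0)  # fixed directional bias (x, y)
    noise_persistence: int = 5            # steps before the noise direction changes
    obstacle_sphere: float = 1.5          # radius of obstacle influence


params = CaseOutput()
```

Adapting a case then amounts to perturbing these fields, which is what the later slides' adaptation step operates on.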

7. Case Indices: Environmental Features
• Spatial features: traversability vector
  - split the environment into K = 4 angular regions
  - compute the obstacle density within each region
  - transform the density into traversability
• Temporal features:
  - short-term velocity towards the goal
  - long-term velocity towards the goal
(figure: example feature vectors, e.g. Vspatial = [f0=0.92, f1=0.58, f2=1.00, f3=0.68] with Vtemporal short-term Rs=1.0 and long-term Rl=0.7)
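The spatial-feature computation above can be sketched as follows. The density-to-traversability mapping (1 − density) and the sensor range are simplifying assumptions, not the paper's exact formulas.

```python
import math


def traversability_vector(obstacles, k=4, sensor_range=5.0):
    """Split the surroundings into k angular regions, compute the obstacle
    density in each region, and map density to a traversability in [0, 1].

    `obstacles` is a list of (x, y) points relative to the robot. The
    mapping traversability = 1 - density is an illustrative assumption.
    """
    counts = [0] * k
    in_range = 0
    for x, y in obstacles:
        if math.hypot(x, y) > sensor_range:
            continue  # ignore obstacles beyond sensor range
        in_range += 1
        angle = math.atan2(y, x) % (2 * math.pi)
        counts[int(angle // (2 * math.pi / k)) % k] += 1
    total = max(in_range, 1)  # avoid division by zero in an empty environment
    densities = [c / total for c in counts]
    return [1.0 - d for d in densities]
```

With no obstacles every region is fully traversable; obstacles clustered in one region drive that region's traversability toward zero.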

8. Overview of the Non-Learning CBR Module
• Feature Identification: the current environment is converted into spatial and temporal feature vectors
• Case Selection:
  - 1st stage: spatial features vector matching over all the cases in the library yields the set of spatially matching cases
  - 2nd stage: temporal features vector matching yields the set of spatially and temporally matching cases
  - 3rd stage: a random selection process picks the best matching case
• Case switching decision tree: choose between the best matching case and the currently used case
• Case Adaptation: adapt the chosen case to the environment
• Case Application: apply the case output parameters (behavioral assemblage parameters)

9. Making the CBR Module Learn
Changes relative to the non-learning module:
• the random selection is biased by case success as well as spatial and temporal similarities
• Old Case Performance Evaluation: the performance history of the last K cases is adjusted
• New Case Creation (if necessary): a new or existing best matching case is produced
• Case Adaptation and Case Application then proceed as before
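One control cycle of the learning module might be wired together as below. Every stage is passed in as a callable, all signatures are illustrative, and the new-case-creation branch is omitted for brevity.

```python
def learning_cbr_step(library, env_features, select, evaluate, adapt, apply):
    """Skeleton of one cycle of a learning CBR module.

    Illustrative wiring only: `select` picks a case biased by similarity
    and past success, `evaluate` adjusts the history of recently used
    cases, `adapt` tunes the case's parameters, and `apply` hands the
    resulting behavioral parameters to the controller.
    """
    case = select(library, env_features)   # similarity- and success-biased selection
    evaluate(library)                      # delayed reinforcement of the last K cases
    case = adapt(case, env_features)       # case adaptation
    return apply(case)                     # behavioral assemblage parameters
```

This is glue, not policy: each stage's actual logic lives in the functions sketched on the following slides.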

10. Extensive Exploration of Cases: Modified Case Selection Process
• Random selection of cases with the probability of selection proportional to:
  - spatial similarity with the environment (1st step)
  - temporal similarity with the environment (2nd step)
  - weighted sum of the case's past performance and its spatial and temporal similarities (3rd step)
(figure: P(selection) distributions narrowing from the set of spatially matching cases {C1, C2, C4}, to the spatially and temporally matching cases {C1, C4}, to the best matching case C1)
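A minimal roulette-wheel sketch of this similarity-proportional random selection, assuming nonnegative weights (spatial similarity in the 1st step, temporal similarity in the 2nd, the weighted sum with past success in the 3rd):

```python
import random


def select_case(cases, weights, rng=random):
    """Pick a case with probability proportional to its weight.

    Zero-weight cases are never selected; `rng` defaults to the module-level
    random generator and can be replaced by a seeded random.Random for
    reproducibility.
    """
    total = sum(weights)
    r = rng.uniform(0.0, total)
    acc = 0.0
    for case, w in zip(cases, weights):
        if w == 0.0:
            continue  # zero-weight cases can never be selected
        acc += w
        if r <= acc:
            return case
    return cases[-1]  # fallback for floating-point edge cases
```

Running the same wheel three times with different weight vectors reproduces the three-step narrowing the slide illustrates.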

11. Positive and Negative Reinforcement: Case Performance Evaluation
• Criterion for evaluating case performance: the average velocity with which the robot approaches its goal during the application of the case
  - provides opportunities for intermediate case performance evaluations
  - may not always be the right criterion: some cases exhibit no positive velocity towards the goal, so the evaluation of performance is delayed by K (= 2) cases
• case_success (represents case performance) is:
  - increased if the average velocity increased or was sustained high
  - decreased otherwise
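The reinforcement rule above can be sketched as a small update function. The high-velocity threshold, step size, and [0, 1] clipping are illustrative assumptions.

```python
def update_case_success(success, avg_velocity, prev_velocity,
                        high_velocity=0.5, delta=0.1):
    """Increase case_success when the average velocity toward the goal
    improved or stayed high; decrease it otherwise.

    `high_velocity` and `delta` are illustrative constants; the result is
    clipped to [0, 1].
    """
    if avg_velocity > prev_velocity or avg_velocity >= high_velocity:
        success += delta   # positive reinforcement
    else:
        success -= delta   # negative reinforcement
    return min(max(success, 0.0), 1.0)
```

In the module this update would be applied with a delay of K cases, so that strategies with temporarily zero goal-ward velocity are not punished prematurely.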

12. Maximization of Reinforcement: Case Adaptation
• Maximize case_success as a noisy function of the case output parameters (behavioral assemblage parameters)
• Maintain an adaptation vector A(C) for each case C
• If the last series of adaptations resulted in an increase of case_success, continue the adaptation:
  O(C) = O(C) + A(C)
• Otherwise, switch the direction of the adaptation, add a random component R, and scale proportionally to case_success, with scaling coefficients β and γ:
  A(C) = -β·A(C) + γ·R
  O(C) = O(C) + A(C)
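This hill-climbing rule can be sketched as below. The magnitude of the random component and its scaling by (1 − case_success) are illustrative assumptions standing in for the slide's β and γ coefficients.

```python
import random


def adapt(output, adapt_vec, improved, case_success, rng=random.Random(0)):
    """One hill-climbing step over a case's output parameters.

    If the last adaptations improved case_success, keep moving in the same
    direction; otherwise reverse the adaptation vector and add a random
    component scaled by (1 - case_success), so low-success cases explore
    more aggressively. Coefficients are illustrative.
    """
    if not improved:
        scale = 1.0 - case_success  # search harder when success is low
        adapt_vec = [-a + scale * rng.uniform(-0.1, 0.1) for a in adapt_vec]
    output = [o + a for o, a in zip(output, adapt_vec)]
    return output, adapt_vec
```

Each call corresponds to one application of O(C) = O(C) + A(C), with A(C) rewritten first whenever the previous direction stopped paying off.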

13. Maximization of Reinforcement: Case Adaptation (cont'd)
• Incorporate prior knowledge into the search:
  - fixed adaptation of the Noise_Gain and Noise_Persistence parameters based on the short- and long-term velocities of the robot
• Constrain the search:
  - limit Obstacle_Gain to be higher than the sum of the other schema gains (to avoid collisions)
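The collision-avoidance constraint can be enforced after each adaptation step, as in this sketch. The gain key names and the safety margin are illustrative assumptions.

```python
def enforce_constraints(gains):
    """Keep the obstacle gain above the sum of all other schema gains.

    `gains` is a dict of schema-gain names to values; the key names and the
    0.1 safety margin are illustrative.
    """
    others = sum(v for k, v in gains.items() if k != "obstacle_gain")
    if gains["obstacle_gain"] <= others:
        gains["obstacle_gain"] = others + 0.1  # restore the safety margin
    return gains
```

Clamping after adaptation keeps the hill-climbing search inside the collision-safe region without changing the search rule itself.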

14. The Growth of the Case Library: Case Creation Decision
• To avoid divergence, a new case is created whenever:
  - case_success of the selected case is high while spatial and temporal similarities with the environment are low to moderate, or
  - case_success of the selected case is low to moderate while spatial and temporal similarities are low
• The maximum size of the library is limited (10 in this work)
• A new case is initialized with:
  - the spatial and temporal features of the current environment
  - the output parameter values of the selected case
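The two creation conditions can be sketched as a small predicate. The hi/lo thresholds are illustrative; only the library-size cap (10 in this work) comes from the slide.

```python
def should_create_case(case_success, similarity, max_size, lib_size,
                       hi=0.7, lo=0.3):
    """Decide whether to spawn a new case.

    Create one when a good case matched the environment only poorly, or
    when a mediocre case matched it badly, subject to the library cap.
    The hi/lo thresholds are illustrative assumptions.
    """
    if lib_size >= max_size:
        return False  # library is full
    if case_success >= hi and similarity < hi:
        return True   # successful case, but low-to-moderate similarity
    if case_success < hi and similarity < lo:
        return True   # low-to-moderate success and low similarity
    return False
```

The new case would then copy the selected case's output parameters and take the current environment's spatial and temporal features as its index.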

15. Experimental Analysis: Example
• Learning CBR: first run (starting with an empty library)

16. Experimental Analysis: Example (cont'd)
• Learning CBR: a run after 54 training runs on various environments
  - a library of ten cases was learned
  - 36 percent shorter travel distance
(figure: for each type of environment shown, a "clear-to-goal" strategy case or a "squeezing" strategy case is learned)

17. Experiments: Statistical Results
• Simulation results (after 250 training runs for the learning CBR system)
(figures: mission completion rate and average number of steps for the non-adaptive, non-learning CBR, and learning CBR systems, in homogeneous and heterogeneous environments)

18. Real Robot Experiments: In Progress
• RWI ATRV-Jr
• Sensors: SICK laser scanners in front and back, compass, gyroscope
• Experiments in progress; no statistical results yet

19. Conclusions
• New and existing cases are learned and optimized during a training process or as part of mission executions
• Performance is substantially better than that of a non-adaptive system and comparable to a non-learning CBR system
• Neither manual selection of behavioral parameters nor careful creation and optimization of a case library is required from the user
• Future work:
  - real robot experiments
  - a case "forgetting" component
  - integration with other adaptation & learning methods (e.g., Learning Momentum, RL for behavioral assemblage selection)
