
Learning Behavioral Parameterization Using Spatio-Temporal Case-Based Reasoning



Presentation Transcript


1. Learning Behavioral Parameterization Using Spatio-Temporal Case-Based Reasoning
Maxim Likhachev, Michael Kaess, and Ronald C. Arkin
Mobile Robot Laboratory, Georgia Tech
This research was funded under the DARPA MARS program.

2. Motivation
• Constant parameterization of robotic behavior results in inefficient robot performance
• Manual selection of the "right" parameters is difficult and tedious work

3. Motivation (cont'd)
• Use of Case-Based Reasoning (CBR) methodology for automatic selection of optimal parameters at run-time (ICRA'01)
• Each case is a set of behavioral parameters indexed by environmental features
(figure: example "clear-to-goal" and "front-obstructed" cases)

4. Motivation for the Current Research
• The CBR module improves robot performance (in simulations and on real robots) and avoids the manual configuration of behavioral parameters
• The CBR module still requires the creation of a case library, which:
  - is dependent on the robot architecture
  - needs extensive experimentation to optimize cases
  - requires a good understanding of how CBR works
• Solution: extend the CBR module to learn new cases from scratch or optimize existing cases, in a separate training process or during missions

5. Related Work
• Use of Case-Based Reasoning in the selection of behavioral parameters: ACBARR [Georgia Tech '92], SINS [Georgia Tech '93], KINS [Chagas and Hallam]
• Automatic optimization of behavioral parameters:
  - genetic programming (e.g., GA-ROBOT [Ram et al.])
  - reinforcement learning (e.g., Learning Momentum [Lee et al.])

6. Behavioral Control and CBR Module
The CBR module controls (case output parameters):
• weights for each behavior
• BiasMove vector
• Noise Persistence
• Obstacle Sphere
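As a concrete anchor for these outputs, here is a minimal sketch of one case's output parameters as a Python container. The field names and default values are illustrative assumptions, not the paper's exact schema.

```python
from dataclasses import dataclass


@dataclass
class CaseOutput:
    """Hypothetical container for the behavioral parameters a case controls.

    Field names and defaults are illustrative; the schema-based controller
    in the paper uses analogous gains.
    """
    move_to_goal_gain: float = 1.0        # weight of the move-to-goal behavior
    noise_gain: float = 0.1               # weight of the random-noise behavior
    obstacle_gain: float = 1.2            # weight of the avoid-obstacle behavior
    bias_move_vector: tuple = (0.0, 0.0)  # fixed directional bias (x, y)
    noise_persistence: int = 5            # steps before the noise direction changes
    obstacle_sphere: float = 1.5          # radius of obstacle influence


params = CaseOutput()
```

Adapting a case then amounts to perturbing these fields, which is what the later slides' adaptation step operates on.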

7. Case Indices: Environmental Features
• Spatial features: traversability vector
  - split the environment into K = 4 angular regions
  - compute the obstacle density within each region
  - transform the density into traversability
• Temporal features:
  - short-term velocity towards the goal
  - long-term velocity towards the goal
(figure: example feature vectors, e.g. Vspatial = [f0=0.92, f1=0.58, f2=1.00, f3=0.68] with Vtemporal short-term Rs=1.0 and long-term Rl=0.7)
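The spatial-feature computation above can be sketched as follows. The density-to-traversability mapping (1 − density) and the sensor range are simplifying assumptions, not the paper's exact formulas.

```python
import math


def traversability_vector(obstacles, k=4, sensor_range=5.0):
    """Split the surroundings into k angular regions, compute the obstacle
    density in each region, and map density to a traversability in [0, 1].

    `obstacles` is a list of (x, y) points relative to the robot. The
    mapping traversability = 1 - density is an illustrative assumption.
    """
    counts = [0] * k
    in_range = 0
    for x, y in obstacles:
        if math.hypot(x, y) > sensor_range:
            continue  # ignore obstacles beyond sensor range
        in_range += 1
        angle = math.atan2(y, x) % (2 * math.pi)
        counts[int(angle // (2 * math.pi / k)) % k] += 1
    total = max(in_range, 1)  # avoid division by zero in an empty environment
    densities = [c / total for c in counts]
    return [1.0 - d for d in densities]
```

With no obstacles every region is fully traversable; obstacles clustered in one region drive that region's traversability toward zero.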

8. Overview of the Non-Learning CBR Module
• Feature Identification: the current environment is converted into spatial and temporal feature vectors
• Case Selection:
  - 1st stage: spatial features vector matching over all the cases in the library yields the set of spatially matching cases
  - 2nd stage: temporal features vector matching yields the set of spatially and temporally matching cases
  - 3rd stage: a random selection process picks the best matching case
• Case switching decision tree: choose between the best matching case and the currently used case
• Case Adaptation: adapt the chosen case to the environment
• Case Application: apply the case output parameters (behavioral assemblage parameters)

9. Making the CBR Module Learn
Changes relative to the non-learning module:
• the random selection is biased by case success as well as spatial and temporal similarities
• Old Case Performance Evaluation: the performance history of the last K cases is adjusted
• New Case Creation (if necessary): a new or existing best matching case is produced
• Case Adaptation and Case Application then proceed as before
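One control cycle of the learning module might be wired together as below. Every stage is passed in as a callable, all signatures are illustrative, and the new-case-creation branch is omitted for brevity.

```python
def learning_cbr_step(library, env_features, select, evaluate, adapt, apply):
    """Skeleton of one cycle of a learning CBR module.

    Illustrative wiring only: `select` picks a case biased by similarity
    and past success, `evaluate` adjusts the history of recently used
    cases, `adapt` tunes the case's parameters, and `apply` hands the
    resulting behavioral parameters to the controller.
    """
    case = select(library, env_features)   # similarity- and success-biased selection
    evaluate(library)                      # delayed reinforcement of the last K cases
    case = adapt(case, env_features)       # case adaptation
    return apply(case)                     # behavioral assemblage parameters
```

This is glue, not policy: each stage's actual logic lives in the functions sketched on the following slides.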

10. Extensive Exploration of Cases: Modified Case Selection Process
• Random selection of cases with the probability of selection proportional to:
  - spatial similarity with the environment (1st step)
  - temporal similarity with the environment (2nd step)
  - weighted sum of the case's past performance and its spatial and temporal similarities (3rd step)
(figure: P(selection) distributions narrowing from the set of spatially matching cases {C1, C2, C4}, to the spatially and temporally matching cases {C1, C4}, to the best matching case C1)
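A minimal roulette-wheel sketch of this similarity-proportional random selection, assuming nonnegative weights (spatial similarity in the 1st step, temporal similarity in the 2nd, the weighted sum with past success in the 3rd):

```python
import random


def select_case(cases, weights, rng=random):
    """Pick a case with probability proportional to its weight.

    Zero-weight cases are never selected; `rng` defaults to the module-level
    random generator and can be replaced by a seeded random.Random for
    reproducibility.
    """
    total = sum(weights)
    r = rng.uniform(0.0, total)
    acc = 0.0
    for case, w in zip(cases, weights):
        if w == 0.0:
            continue  # zero-weight cases can never be selected
        acc += w
        if r <= acc:
            return case
    return cases[-1]  # fallback for floating-point edge cases
```

Running the same wheel three times with different weight vectors reproduces the three-step narrowing the slide illustrates.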

11. Positive and Negative Reinforcement: Case Performance Evaluation
• Criterion for evaluating case performance: the average velocity with which the robot approaches its goal during the application of the case
  - provides opportunities for intermediate case performance evaluations
  - may not always be the right criterion: some cases exhibit no positive velocity towards the goal, so the evaluation of performance is delayed by K (= 2) cases
• case_success (represents case performance) is:
  - increased if the average velocity increased or was sustained high
  - decreased otherwise
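The reinforcement rule above can be sketched as a small update function. The high-velocity threshold, step size, and [0, 1] clipping are illustrative assumptions.

```python
def update_case_success(success, avg_velocity, prev_velocity,
                        high_velocity=0.5, delta=0.1):
    """Increase case_success when the average velocity toward the goal
    improved or stayed high; decrease it otherwise.

    `high_velocity` and `delta` are illustrative constants; the result is
    clipped to [0, 1].
    """
    if avg_velocity > prev_velocity or avg_velocity >= high_velocity:
        success += delta   # positive reinforcement
    else:
        success -= delta   # negative reinforcement
    return min(max(success, 0.0), 1.0)
```

In the module this update would be applied with a delay of K cases, so that strategies with temporarily zero goal-ward velocity are not punished prematurely.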

12. Maximization of Reinforcement: Case Adaptation
• Maximize case_success as a noisy function of the case output parameters (behavioral assemblage parameters)
• Maintain an adaptation vector A(C) for each case C
• If the last series of adaptations resulted in an increase of case_success, continue the adaptation:
  O(C) = O(C) + A(C)
• Otherwise, switch the direction of the adaptation, add a random component R, and scale proportionally to case_success, with scaling coefficients β and γ:
  A(C) = -β·A(C) + γ·R
  O(C) = O(C) + A(C)
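This hill-climbing rule can be sketched as below. The magnitude of the random component and its scaling by (1 − case_success) are illustrative assumptions standing in for the slide's β and γ coefficients.

```python
import random


def adapt(output, adapt_vec, improved, case_success, rng=random.Random(0)):
    """One hill-climbing step over a case's output parameters.

    If the last adaptations improved case_success, keep moving in the same
    direction; otherwise reverse the adaptation vector and add a random
    component scaled by (1 - case_success), so low-success cases explore
    more aggressively. Coefficients are illustrative.
    """
    if not improved:
        scale = 1.0 - case_success  # search harder when success is low
        adapt_vec = [-a + scale * rng.uniform(-0.1, 0.1) for a in adapt_vec]
    output = [o + a for o, a in zip(output, adapt_vec)]
    return output, adapt_vec
```

Each call corresponds to one application of O(C) = O(C) + A(C), with A(C) rewritten first whenever the previous direction stopped paying off.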

13. Maximization of Reinforcement: Case Adaptation (cont'd)
• Incorporate prior knowledge into the search:
  - fixed adaptation of the Noise_Gain and Noise_Persistence parameters based on the short- and long-term velocities of the robot
• Constrain the search:
  - limit Obstacle_Gain to be higher than the sum of the other schema gains (to avoid collisions)
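The collision-avoidance constraint can be enforced after each adaptation step, as in this sketch. The gain key names and the safety margin are illustrative assumptions.

```python
def enforce_constraints(gains):
    """Keep the obstacle gain above the sum of all other schema gains.

    `gains` is a dict of schema-gain names to values; the key names and the
    0.1 safety margin are illustrative.
    """
    others = sum(v for k, v in gains.items() if k != "obstacle_gain")
    if gains["obstacle_gain"] <= others:
        gains["obstacle_gain"] = others + 0.1  # restore the safety margin
    return gains
```

Clamping after adaptation keeps the hill-climbing search inside the collision-safe region without changing the search rule itself.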

14. The Growth of the Case Library: Case Creation Decision
• To avoid divergence, a new case is created whenever:
  - case_success of the selected case is high while spatial and temporal similarities with the environment are low to moderate, or
  - case_success of the selected case is low to moderate while spatial and temporal similarities are low
• The maximum size of the library is limited (10 in this work)
• A new case is initialized with:
  - the spatial and temporal features of the current environment
  - the output parameter values of the selected case
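The two creation conditions can be sketched as a small predicate. The hi/lo thresholds are illustrative; only the library-size cap (10 in this work) comes from the slide.

```python
def should_create_case(case_success, similarity, max_size, lib_size,
                       hi=0.7, lo=0.3):
    """Decide whether to spawn a new case.

    Create one when a good case matched the environment only poorly, or
    when a mediocre case matched it badly, subject to the library cap.
    The hi/lo thresholds are illustrative assumptions.
    """
    if lib_size >= max_size:
        return False  # library is full
    if case_success >= hi and similarity < hi:
        return True   # successful case, but low-to-moderate similarity
    if case_success < hi and similarity < lo:
        return True   # low-to-moderate success and low similarity
    return False
```

The new case would then copy the selected case's output parameters and take the current environment's spatial and temporal features as its index.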

15. Experimental Analysis: Example
• Learning CBR: first run (starting with an empty library)

16. Experimental Analysis: Example (cont'd)
• Learning CBR: a run after 54 training runs on various environments
  - a library of ten cases was learned
  - 36 percent shorter travel distance
(figure: for each type of environment shown, a "clear-to-goal" strategy case or a "squeezing" strategy case is learned)

17. Experiments: Statistical Results
• Simulation results (after 250 training runs for the learning CBR system)
(figures: mission completion rate and average number of steps for the non-adaptive, non-learning CBR, and learning CBR systems, in homogeneous and heterogeneous environments)

18. Real Robot Experiments: In Progress
• RWI ATRV-Jr
• Sensors: SICK laser scanners in front and back, compass, gyroscope
• Experiments in progress; no statistical results yet

19. Conclusions
• New and existing cases are learned and optimized during a training process or as part of mission executions
• Performance is substantially better than that of a non-adaptive system and comparable to a non-learning CBR system
• Neither manual selection of behavioral parameters nor careful creation and optimization of a case library is required from the user
• Future work:
  - real robot experiments
  - a case "forgetting" component
  - integration with other adaptation & learning methods (e.g., Learning Momentum, RL for behavioral assemblage selection)
