
Distributed Evolution for Swarm Robotics


Presentation Transcript


  1. Distributed Evolution for Swarm Robotics Suranga Hettiarachchi Computer Science Department University of Wyoming Committee Members: Dr. William Spears – Computer Science (Committee Chair / Research Advisor) Dr. Diana Spears – Computer Science Dr. Thomas Bailey – Computer Science Dr. Richard Anderson-Sprecher – Statistics Dr. David Thayer – Physics and Astronomy

  2. Outline • Goals and Contributions • Robot Swarms • Physicomimetics Framework • Offline Evolutionary Learning • Novel Distributed Online Learning • Obstacle Avoidance with Physical Robots • Conclusion and Future Work

  3. Goals • To improve the state-of-the-art of obstacle avoidance in swarm robotics. • To create a novel real-time learning algorithm for swarm robotics, to improve performance in changing environments.

  4. Contributions • Improved performance in obstacle avoidance: • Scales to far higher numbers of robots and obstacles than the norm • Invented an online population-based learning algorithm: • Demonstrate feasibility of algorithm with obstacle avoidance, in environments that change dynamically and are three times denser than the norm, with obstructed perception • Hardware Implementation • Implemented obstacle avoidance algorithm on real robots Obstacle Avoidance Online Learning Algorithm Hardware Implementation

  5. Outline • Goals and Contributions • Robot Swarms • Physicomimetics Framework • Offline Evolutionary Learning • Novel Distributed Online Learning • Obstacle Avoidance with Physical Robots • Conclusion and Future Work

  6. Robot Swarms • Robot swarms can act as distributed computers, solving problems that a single robot cannot • For many tasks, having a swarm maintain cohesiveness while avoiding obstacles and performing the task is of vital importance • Example Task: Chemical Plume Source Tracing

  7. Chemical Plume Source Tracing [movie; the link to this movie may not play properly]

  8. Outline • Goals and Contributions • Robot Swarms • Physicomimetics Framework • Offline Evolutionary Learning • Novel Distributed Online Learning • Obstacle Avoidance with Physical Robots • Conclusion and Future Work

  9. Physicomimetics for Robot Control • Biomimetics: Gain inspiration from biological systems and ethology. • Physicomimetics: Gain inspiration from physical systems. Good for formations.

  10. Physicomimetics Framework • Robots have limited sensor range, and friction for stabilization. • Robots are controlled via “virtual” forces from nearby robots, goals, and obstacles, using an F = ma control law. • Seven robots form a hexagon.
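The control loop just described can be sketched in a few lines. The constants, dictionary field names, and the toy spring-like force law below are illustrative assumptions, not the thesis's actual parameters (the real force laws appear on slide 11):

```python
import math

DT = 0.1             # integration time step (assumed)
MASS = 1.0           # robot mass in F = ma (assumed)
FRICTION = 0.8       # velocity damping used for stabilization (assumed)
SENSOR_RANGE = 100.0 # neighbors beyond this range are invisible (assumed)

def force_law(r, sep=50.0):
    # Placeholder spring-like law: attract beyond the desired
    # separation, repel inside it.
    return 0.1 * (r - sep)

def step(robot, neighbors):
    """Advance one robot by summing 'virtual' forces from its neighbors."""
    fx = fy = 0.0
    for other in neighbors:
        dx, dy = other["x"] - robot["x"], other["y"] - robot["y"]
        r = math.hypot(dx, dy)
        if 0.0 < r <= SENSOR_RANGE:
            f = force_law(r)          # + attracts, - repels
            fx += f * dx / r
            fy += f * dy / r
    # F = ma control law, with friction applied to the velocity.
    robot["vx"] = (robot["vx"] + fx / MASS * DT) * FRICTION
    robot["vy"] = (robot["vy"] + fy / MASS * DT) * FRICTION
    robot["x"] += robot["vx"] * DT
    robot["y"] += robot["vy"] * DT
```

Running this step for every robot each tick is what lets formations such as the seven-robot hexagon emerge from purely local interactions.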

  11. Two Classes of Force Laws • Left, the “classic” Newtonian force law: good for creating swarms in rigid formations. • Right, a novel use of the Lennard-Jones (LJ) force law for robot control: it more easily models fluid behavior, which is potentially better for maintaining cohesion while avoiding obstacles.

  12. What do these force laws look like? [Plot: change in force magnitude with varying distance for robot-robot interactions; Fmax = 1.0 and Fmax = 4.0; desired robot separation distance = 50]

  13. Outline • Goals and Contributions • Robot Swarms • Physicomimetics Framework • Offline Evolutionary Learning • Novel Distributed Online Learning • Obstacle Avoidance with Physical Robots • Conclusion and Future Work

  14. Swarm Learning (Offline) • Typically, the interactions between the swarm robots are learned via simulation in “offline” mode. [Diagram: Initial Rules → Swarm Simulation → Fitness → Offline Learning, such as an Evolutionary Algorithm (EA) → Final Rules that achieve the desired behavior]

  15. Swarm Simulation Environment

  16. Offline Learning Approach • An Evolutionary Algorithm (EA) is used to evolve the rules for the robots in the swarm. • A global observer assigns fitness to the rules based on the collective behavior of the swarm in the simulation. • Each member of the swarm uses the same rules; the swarm is a homogeneous distributed system. • For physicomimetics, the rules consist of force law parameters.
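A generational EA of the kind described here might look like the sketch below. The selection scheme, mutation model, and toy fitness function are assumptions for illustration; in the thesis, fitness comes from a global observer watching the simulated swarm:

```python
import random

def evolve(fitness, n_params=3, pop_size=20, generations=30,
           sigma=0.1, seed=0):
    """Evolve a force-law parameter vector that maximizes `fitness`."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_params)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half as parents.
        parents = sorted(pop, key=fitness, reverse=True)[:pop_size // 2]
        # Refill the population with mutated copies of random parents.
        pop = parents + [[g + rng.gauss(0.0, sigma)
                          for g in rng.choice(parents)]
                         for _ in range(pop_size - len(parents))]
    return max(pop, key=fitness)

# Toy fitness: the best rules sit at 0.5 in every parameter.
best = evolve(lambda rules: -sum((g - 0.5) ** 2 for g in rules))
```

Every robot is then loaded with the single best rule set found, which is what makes the swarm a homogeneous distributed system.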

  17. Force Law Parameters • Parameters of the “Newtonian” force law: G, the “gravitational” constant of robot-robot interactions; P, the power of the force law for robot-robot interactions; Fmax, the maximum force of robot-robot interactions. Similar 3-tuples govern obstacle-robot and goal-robot interactions. • Parameters of the LJ force law: ε, the strength of the robot-robot interactions; c, the non-negative attractive robot-robot parameter; d, the non-negative repulsive robot-robot parameter; Fmax, the maximum force of robot-robot interactions. Similar 4-tuples govern obstacle-robot and goal-robot interactions.

  18. Measuring Fitness • Connectivity (Cohesion): maximum number of robots connected via a communication path. • Reachability (Survivability): percentage of robots that reach the goal. • Time to Goal: time taken by at least 80% of the robots to reach the goal. High fitness corresponds to high connectivity, high reachability, and low time to goal.
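The connectivity metric amounts to finding the largest connected component of the communication graph. The sketch below assumes robots within `comm_range` of each other can communicate; the function name and inputs are illustrative:

```python
from collections import deque

def connectivity(positions, comm_range):
    """Size of the largest group of robots linked by a communication
    path, where robots within comm_range of each other are connected."""
    n = len(positions)
    seen, best = set(), 0
    for start in range(n):
        if start in seen:
            continue
        # BFS over the communication graph from this robot.
        queue, comp = deque([start]), 0
        seen.add(start)
        while queue:
            i = queue.popleft()
            comp += 1
            xi, yi = positions[i]
            for j in range(n):
                if j not in seen:
                    xj, yj = positions[j]
                    if (xi - xj) ** 2 + (yi - yj) ** 2 <= comm_range ** 2:
                        seen.add(j)
                        queue.append(j)
        best = max(best, comp)
    return best
```

Reachability and time-to-goal are simpler tallies (fraction of robots at the goal, and the tick at which 80% have arrived), so connectivity is the only metric that needs graph traversal.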

  19. Summary of Results • We compared the performance of the best “Newtonian” force law found by the EA to the best LJ force law. • The “Newtonian” force law produces more rigid structures making it difficult to navigate through obstacles. This causes poor performance, despite high connectivity. • Lennard-Jones is superior, because the swarm acts as a viscous fluid. Connectivity is maintained while allowing the robots to reach the goal in a timely manner. • The Lennard-Jones force law demonstrates scalability in the number of robots and obstacles.

  20. Connectivity of Robots

  21. Time for 80% of the Robots to Reach the Goal

  22. A Problem • The simulation assumes a certain environment. What happens if the environment changes when the swarm is fielded? • We can’t go back to the simulation world. • Can the swarm adapt “on-line” in the field? Environment trained on. Environment changes. Performance degrades.

  23. Frequently Proposed Solution • Each robot has sufficient CPU power and memory to maintain a complete map of the environment. • When the environment changes, each robot runs an EA internally, on a simulation of the new environment. • Robots wait until new rules are evolved (4 days of simulation time). • It is better to learn in the field, in real time.

  24. Outline • Goals and Contributions • Robot Swarms • Physicomimetics Framework • Offline Evolutionary Learning • Novel Distributed Online Learning • Obstacle Avoidance with Physical Robots • Conclusion and Future Work

  25. Example • The maximum velocity is increased by 1.5x. • Obstacles are tripled in size. • High obstacle density creates cul-de-sacs and robots are left behind. Collisions also occur. • Obstructed perception is also introduced. • The learned offline rules are no longer sufficient. Environment trained on. Environment changes. Performance degrades.

  26. Novel Online Learning Approach • Borrow from evolution. • Each robot in the swarm is an individual in a population that interacts with its neighbors. • Each robot contains a slightly mutated copy of the best rule set found with offline learning. • When the environment changes, some mutations perform better than others. • Better performing robots share their knowledge with poorer performing neighbors. • We call this “Distributed Agent Evolution with Dynamic Adaptation to Local Unexpected Scenarios” (DAEDALUS).
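One DAEDALUS exchange step can be sketched as below, assuming each robot tracks a scalar `worth` and adopts a mutated copy of the rules of its best-performing better neighbor. The field names, donor choice, and mutation model are illustrative assumptions:

```python
import random

def daedalus_step(robots, sigma=0.05, rng=None):
    """Poorer performers adopt a mutated copy of a better neighbor's
    rule set; better performers keep their own rules."""
    rng = rng or random.Random(0)
    for robot in robots:
        better = [n for n in robot["neighbors"]
                  if n["worth"] > robot["worth"]]
        if better:
            donor = max(better, key=lambda n: n["worth"])
            robot["rules"] = [g + rng.gauss(0.0, sigma)
                              for g in donor["rules"]]
```

Because the copy is mutated rather than exact, the swarm keeps exploring around the better rule sets instead of collapsing onto a single one.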

  27. DAEDALUS for Obstacle Avoidance • Each robot is initialized with randomly perturbed (via mutation) versions of the force laws learned with the offline simulation. • Robots are penalized if they collide with obstacles and/or are left behind. • Robots that are most successful and are moving will retain the highest worth, and share their force laws with neighboring robots that were not as successful.

  28. Experimental Setup • There are five goals to reach in a long corridor. • Between each goal is a different obstacle course. • Robots that are left behind (due to obstacle cul-de-sacs) do not proceed to the next goal. • The number of robots that survive to reach the last goal is low. We want the robots to learn to do better, while in the field.

  29. DAEDALUS Results • DAEDALUS succeeded in dramatically reducing the number of collisions and improving survivability, despite the difficulties caused by obstructed perception (20 minutes of simulation time). • Our results depended on the mutation rate. Can DAEDALUS learn that also?

  30. Further DAEDALUS Results • DAEDALUS also succeeded in learning the appropriate mutation rate for the robots. Hence, the system is striking a balance between exploration and exploitation.

  31. Effect of Mutation Rate on Survival

  32. Collision Reduction [Plot: 60 robots moving toward 5 goals through 90 obstacles between each goal]

  33. Summary of DAEDALUS • Creating rapidly adapting robots in changing environments is challenging. • Offline learning can yield initial “seed” rules, which must then be perturbed. • The key is to maintain “diversity” in the rules that control the members of the swarm. • Collective behaviors still arise from the local interactions of a diverse population of robots.

  34. Outline • Goals and Contributions • Robot Swarms • Physicomimetics Framework • Offline Evolutionary Learning • Novel Distributed Online Learning • Obstacle Avoidance with Physical Robots • Conclusion and Future Work

  35. Obstacle Avoidance with Robots • Use three Maxelbot robots • Use 2D trilateration localization algorithm (Not a part of this thesis) • Design and develop obstacle avoidance module (OAM) • Implement physicomimetics on a real outdoor robot

  36. Hardware Architecture of Maxelbot • MiniDRAGON for trilateration: provides robot coordinates; uses RF and acoustic sensors. • MiniDRAGON for motor control: executes physicomimetics and the OAM; uses A-to-D conversion and IR sensors. • The modules are connected via I2C.

  37. Physicomimetics for Obstacle Avoidance • Constant “virtual” attractive goal force in front of the leader. • “Virtual” repulsive forces from four sensors mounted on the front of the leader, if obstacles are detected. • The resultant force creates a change in velocity due to F = ma. • Power supply to the motors is changed based on the forces acting on the leader.

  38. Obstacle Avoidance Methodology • Measure the performance of physicomimetics with repulsion from obstacles. • All experiments are conducted outdoors in “Prexy’s Pasture”. • Three Maxelbots: one leader and two followers. • Graphs show the correlation between raw sensor readings and motor power. • The leader uses the physicomimetics algorithm with the obstacle avoidance module. • The focus is on obstacle avoidance by the leader, not on formation control.

  39. If there is an obstacle on the right, power to left motor is reduced

  40. If there is an obstacle on the left, power to right motor is reduced

  41. If there is an obstacle in front, power to both motors is reduced
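The three steering rules on slides 39-41 can be combined into one mapping from sensor readings to motor power. The sensor layout, repulsion scaling, and power range below are illustrative assumptions, not the Maxelbot's actual firmware:

```python
MAX_POWER = 100.0
IGNORE_BEYOND = 30.0   # inches; farther obstacles are ignored

def motor_powers(left, left_mid, right_mid, right):
    """Map the four front IR distance readings (inches) to motor powers."""
    def repulsion(d):
        # Closer obstacle -> stronger repulsion; none beyond 30 inches.
        return max(0.0, (IGNORE_BEYOND - d) / IGNORE_BEYOND)
    # An obstacle on the right reduces LEFT motor power (the robot turns
    # left, away from it), and vice versa; an obstacle dead ahead trips
    # both middle sensors and slows both motors.
    left_power = MAX_POWER * (1.0 - max(repulsion(right),
                                        repulsion(right_mid)))
    right_power = MAX_POWER * (1.0 - max(repulsion(left),
                                         repulsion(left_mid)))
    return left_power, right_power
```

The asymmetry (right-side obstacles cut the left motor) is what produces the turning behavior shown in the three preceding slides.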

  42. Further Analysis of Sensor Readings and Motor Power • Scatter plots give more information and provide a broader picture of the data. • They show the correlation of motor power with distance to an obstacle in inches (the robots ignore obstacles more than 30” away). [Movie of 3 Maxelbots; the leader has the OAM]

  43. The left sensor sees the obstacle; the left-middle sensor also sees the obstacle.

  44. Outline • Goals and Contributions • Robot Swarms • Physicomimetics Framework • Offline Evolutionary Learning • Novel Distributed Online Learning • Obstacle Avoidance with Physical Robots • Conclusion and Future Work

  45. Contributions • Improved performance in obstacle avoidance: • Applied a new force law for robot control, to improve performance • Provided novel objective performance metrics for obstacle avoiding swarms • Improved scalability of the swarm in obstacle avoidance • Improved performance of obstacle avoidance with obstructed perception • Invented a real-time learning algorithm (DAEDALUS): • Demonstrate that a swarm can improve performance by mutating and exchanging force laws • Demonstrate feasibility of DAEDALUS with obstacle avoidance, in environments three times denser than the norm • Explore the trade-offs of mutation on homogeneous and heterogeneous swarm learning • Hardware Implementation • Present a novel robot control algorithm that merges physicomimetics with obstacle avoidance.

  46. Future Work • Use DAEDALUS to provide practical solutions to real world problems • Provide obstacle avoidance capability to all the robots in the formation • Develop robots with greater data exchange capability • Adapt the physicomimetics framework to incorporate performance feedback for specific tasks and situational awareness • Extend the physicomimetics framework for sensing and performing tasks in a marine environment (with Harbor Branch) • Introduce robot/human roles and interactions to distributed evolution architecture

  47. Work Published • Spears, W., D. Spears, R. Heil, W. Kerr, and S. Hettiarachchi. An overview of physicomimetics. Lecture Notes in Computer Science - State of the Art Series, Volume 3342, 2004. Springer. • Hettiarachchi, S. and W. Spears. Moving swarm formations through obstacle fields. Proceedings of the 2005 International Conference on Artificial Intelligence, Volume 1, 97-103. CSREA Press. • Hettiarachchi, S., W. Spears, D. Green, and W. Kerr. Distributed agent evolution with dynamic adaptation to local unexpected scenarios. Proceedings of the 2005 Second GSFC/IEEE Workshop on Radical Agent Concepts. Springer. • Spears, W., D. Zarzhitsky, S. Hettiarachchi, and W. Kerr. Strategies for multi-asset surveillance. IEEE International Conference on Networking, Sensing and Control, 2005, 929-934. IEEE Press. • Hettiarachchi, S. and W. Spears (2006). DAEDALUS for agents with obstructed perception. In SMCals/06 IEEE Mountain Workshop on Adaptive and Learning Systems, pp. 195-200. IEEE Press. Best Paper Award. • Hettiarachchi, S. (2006). Distributed online evolution for swarm robotics. In Doctoral Mentoring Program AAMAS06, T. Ishida and A. B. Hassine (Eds.), Autonomous Agents and Multi Agent Systems, pp. 17-18. • Hettiarachchi, S., P. Maxim, and W. Spears (2007). An architecture for adaptive swarms. In Robotics Research Trends, X. P. Guo (Ed.). Nova Publishers (Book Chapter).

  48. Thank You! Questions?

  49. Backup Slides The next set of slides may be confusing because they are intended to be placed between slides 1-49.
