Distributed Evolution for Swarm Robotics Suranga Hettiarachchi Computer Science Department University of Wyoming Committee Members: Dr. William Spears.

Distributed Evolution for Swarm Robotics
Suranga Hettiarachchi
Computer Science Department, University of Wyoming
Committee Members:
Dr. William Spears, Computer Science (Committee Chair / Research Advisor)
Dr. Diana Spears, Computer Science
Dr. Thomas Bailey, Computer Science
Dr. Richard Anderson-Sprecher, Statistics
Dr. David Thayer, Physics and Astronomy

Outline Goals and Contributions Robot Swarms Physicomimetics Framework Offline Evolutionary Learning Novel Distributed Online Learning Obstacle Avoidance with Physical Robots Conclusion and Future Work

Goals To improve the state-of-the-art of obstacle avoidance in swarm robotics. To create a novel real-time learning algorithm for swarm robotics, to improve performance in changing environments.

Contributions
Improved performance in obstacle avoidance: scales to far higher numbers of robots and obstacles than the norm.
Invented an online population-based learning algorithm: demonstrated the feasibility of the algorithm with obstacle avoidance, in environments that change dynamically and are three times denser than the norm, with obstructed perception.
Hardware implementation: implemented the obstacle avoidance algorithm on real robots.

Robot Swarms Robot swarms can act as distributed computers, solving problems that a single robot cannot. For many tasks, having a swarm maintain cohesiveness while avoiding obstacles and performing the task is of vital importance. Example task: chemical plume source tracing.

Chemical Plume Source Tracing Link to this movie may not work properly

Physicomimetics for Robot Control Biomimetics: Gain inspiration from biological systems and ethology. Physicomimetics: Gain inspiration from physical systems. Good for formations.

Physicomimetics Framework Robots have a limited sensor range and use friction for stabilization. Robots are controlled via “virtual” forces from nearby robots, goals, and obstacles, using an F = ma control law. [Figure: seven robots form a hexagon.]
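The F = ma control law can be illustrated with a small sketch of one update step for a single robot. This is not the thesis implementation: the function name `step`, the linear friction model, the unit mass, the time step, and the velocity cap `v_max` are all assumptions chosen for illustration.

```python
import math

def step(pos, vel, force, mass=1.0, dt=1.0, friction=0.5, v_max=20.0):
    """Advance one robot by one time step under a virtual force (F = ma).

    Hypothetical sketch: friction is modeled as a fixed fraction of the
    previous velocity that is lost each step, and speed is capped at v_max.
    """
    # Acceleration from the net virtual force.
    ax, ay = force[0] / mass, force[1] / mass
    # Friction damps the old velocity; the force adds a velocity change.
    vx = (1.0 - friction) * vel[0] + ax * dt
    vy = (1.0 - friction) * vel[1] + ay * dt
    # Cap the speed so that large forces cannot produce unbounded motion.
    speed = math.hypot(vx, vy)
    if speed > v_max:
        vx, vy = vx * v_max / speed, vy * v_max / speed
    new_pos = (pos[0] + vx * dt, pos[1] + vy * dt)
    return new_pos, (vx, vy)
```

With zero force, the friction term alone brings the robot to rest over a few steps, which is the stabilizing effect the slide refers to.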

Two Classes of Force Laws The left, “Newtonian” force law is good for creating swarms in rigid formations. The right, “Lennard-Jones” (LJ) force law more easily models fluid behavior, which is potentially better for maintaining cohesion while avoiding obstacles. [Figure labels: left, the “classic” law; right, novel use of the LJ force law for robot control.]

What do these force laws look like? [Figure: change in force magnitude with varying distance for robot-robot interactions; panels for Fmax = 1.0 and Fmax = 4.0; desired robot separation distance = 50.]

Swarm Learning (Offline) Typically, the interactions between the swarm robots are learned via simulation in “offline” mode. [Diagram: initial rules feed a loop between the swarm simulation, which reports fitness, and an offline learner such as an Evolutionary Algorithm (EA), which proposes rules; the output is the final rules that achieve the desired behavior.]

Swarm Simulation Environment

Offline Learning Approach An Evolutionary Algorithm (EA) is used to evolve the rules for the robots in the swarm. A global observer assigns fitness to the rules based on the collective behavior of the swarm in the simulation. Each member of the swarm uses the same rules. The swarm is a homogeneous distributed system. For physicomimetics, the rules consist of force law parameters.
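The offline loop can be sketched as a small evolutionary algorithm in which one rule set (a parameter vector) is scored by a global fitness function. The slides do not give the EA's operators, so the truncation selection, Gaussian mutation, population size, and the `toy_fitness` stand-in for the swarm simulation are all assumptions.

```python
import random

def toy_fitness(rules):
    """Hypothetical stand-in for the swarm simulation's global observer.

    A real run would simulate the whole swarm with these force-law
    parameters and score its collective behavior.
    """
    return -sum((g - 0.5) ** 2 for g in rules)

def evolve(fitness, n_params, pop_size=20, generations=30, sigma=0.1, seed=0):
    """Evolve one shared rule set; every robot would use the same rules."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 1.0) for _ in range(n_params)]
           for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        # Keep the better half unchanged (assumed truncation selection).
        parents = ranked[: pop_size // 2]
        # Refill the population with Gaussian-mutated copies of parents.
        children = [[g + rng.gauss(0.0, sigma) for g in rng.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness), history

best_rules, best_history = evolve(toy_fitness, n_params=3)
```

Because the better half is carried over unchanged, the best fitness in `best_history` never decreases from one generation to the next.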

Force Law Parameters
Parameters of the “Newtonian” force law:
G: “gravitational” constant of robot-robot interactions
P: power of the force law for robot-robot interactions
Fmax: maximum force of robot-robot interactions
Similar 3-tuples for obstacle-robot and goal-robot interactions.
Parameters of the LJ force law:
ε: strength of the robot-robot interactions
c: non-negative attractive robot-robot parameter
d: non-negative repulsive robot-robot parameter
Fmax: maximum force of robot-robot interactions
Similar 4-tuples for obstacle-robot and goal-robot interactions.
Newtonian parameter vector (r-r = robot-robot, r-o = robot-obstacle, r-g = robot-goal): G r-r, P r-r, Fmax r-r, G r-o, P r-o, Fmax r-o, G r-g, P r-g, Fmax r-g
LJ parameter vector: ε r-r, c r-r, d r-r, Fmax r-r, ε r-o, c r-o, d r-o, Fmax r-o, ε r-g, c r-g, d r-g, Fmax r-g
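To make the parameters concrete, here is a hedged sketch of the two force magnitudes as functions of distance r and desired separation R. The slides name the parameters but not the formulas, so the functional forms below are assumptions: an inverse-power law with unit masses for the “Newtonian” case, and a generalized Lennard-Jones derivative for the LJ case, both clamped to Fmax. Sensor-range cutoffs are omitted.

```python
def newtonian_force(r, R, G, p, f_max):
    """Signed force magnitude: positive = repulsive, negative = attractive.

    Assumed form: G / r**p with unit masses, capped at f_max, repulsive
    when closer than the desired separation R and attractive beyond it.
    """
    if r <= 0.0:
        raise ValueError("distance must be positive")
    magnitude = min(G / r ** p, f_max)
    if r < R:
        return magnitude
    if r > R:
        return -magnitude
    return 0.0

def lj_force(r, R, eps, c, d, f_max):
    """Signed LJ-style force, clamped to [-f_max, f_max].

    Assumed form: 24 * eps * (2 * d * R**12 / r**13 - c * R**6 / r**7),
    where d scales the repulsive term and c scales the attractive term.
    """
    if r <= 0.0:
        raise ValueError("distance must be positive")
    raw = 24.0 * eps * (2.0 * d * R ** 12 / r ** 13 - c * R ** 6 / r ** 7)
    return max(-f_max, min(f_max, raw))
```

Under this assumed LJ form, the force changes sign at r = R (2d/c)^(1/6), so the ratio of d to c shifts the equilibrium spacing while ε scales the overall strength.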

Measuring Fitness Connectivity (Cohesion): maximum number of robots connected via a communication path. Reachability (Survivability): percentage of robots that reach the goal. Time to Goal: time taken by at least 80% of the robots to reach the goal. High fitness corresponds to high connectivity, high reachability, and low time to goal. [Figure labels: goal, connectivity, 4R, reachability.]
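The three metrics can be sketched as follows. The slide defines them in words only, so the data layout is an assumption: `positions` is a list of (x, y) tuples, `comm_range` is a hypothetical communication radius, and `arrival_times` holds one entry per robot, with `None` for robots that never reach the goal.

```python
import math
from collections import deque

def connectivity(positions, comm_range):
    """Size of the largest group of robots linked by a communication path."""
    n = len(positions)
    seen = set()
    best = 0
    for start in range(n):
        if start in seen:
            continue
        # Breadth-first search over robots within communication range.
        queue = deque([start])
        seen.add(start)
        size = 0
        while queue:
            i = queue.popleft()
            size += 1
            for j in range(n):
                if j not in seen and math.dist(positions[i], positions[j]) <= comm_range:
                    seen.add(j)
                    queue.append(j)
        best = max(best, size)
    return best

def reachability(arrival_times):
    """Percentage of robots that reached the goal."""
    arrived = sum(t is not None for t in arrival_times)
    return 100.0 * arrived / len(arrival_times)

def time_to_goal(arrival_times, fraction=0.8):
    """Time at which `fraction` of all robots have arrived, else None."""
    needed = math.ceil(fraction * len(arrival_times))
    times = sorted(t for t in arrival_times if t is not None)
    if needed == 0 or len(times) < needed:
        return None
    return times[needed - 1]
```

How the three values are combined into a single fitness score is not stated on the slide, so that step is left out here.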

Summary of Results We compared the performance of the best “Newtonian” force law found by the EA to the best LJ force law. The “Newtonian” force law produces more rigid structures making it difficult to navigate through obstacles. This causes poor performance, despite high connectivity. Lennard-Jones is superior, because the swarm acts as a viscous fluid. Connectivity is maintained while allowing the robots to reach the goal in a timely manner. The Lennard-Jones force law demonstrates scalability in the number of robots and obstacles.

Connectivity of Robots

Time for 80% of the Robots to Reach the Goal
                       Obstacles
Force Law   Robots     20     40     60     80     100
Newt        20         1160   1260   1290   1530   1920
Newt        100        -      -      -      -      -
LJ          20         470    480    490    510    520
LJ          100        640    650    670    680    690

A Problem The simulation assumes a certain environment. What happens if the environment changes when the swarm is fielded? We can’t go back to the simulation world. Can the swarm adapt “on-line” in the field? Environment trained on. Environment changes. Performance degrades.

Frequently Proposed Solution Each robot has sufficient CPU power and memory to maintain a complete map of the environment. When environment changes, each robot runs an EA internally, on a simulation of the new environment. Robots wait until new rules are evolved. It is better to learn in the field, in real time. 4 days of simulation time

Example The maximum velocity is increased by 1.5x. Obstacles are tripled in size. High obstacle density creates cul-de-sacs and robots are left behind. Collisions also occur. Obstructed perception is also introduced. The learned offline rules are no longer sufficient. Environment trained on. Environment changes. Performance degrades.

Novel Online Learning Approach Borrow from evolution. Each robot in the swarm is an individual in a population that interacts with its neighbors. Each robot contains a slightly mutated copy of the best rule set found with offline learning. When the environment changes, some mutations perform better than others. Better performing robots share their knowledge with poorer performing neighbors. We call this “Distributed Agent Evolution with Dynamic Adaptation to Local Unexpected Scenarios” (DAEDALUS).

DAEDALUS for Obstacle Avoidance Each robot is initialized with randomly perturbed (via mutation) versions of the force laws learned with the offline simulation. Robots are penalized if they collide with obstacles and/or are left behind. Robots that are most successful and are moving will retain the highest worth, and share their force laws with neighboring robots that were not as successful.
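The initialize, penalize, and share cycle described above might be sketched as below. This is an illustration under stated assumptions, not the DAEDALUS code: the `Robot` class, the starting worth of 100, the fixed penalty, the Gaussian mutation, and the rule that a robot copies from its best moving neighbor with strictly higher worth are all hypothetical choices.

```python
import random
from dataclasses import dataclass

@dataclass
class Robot:
    rules: list            # force-law parameters carried by this robot
    worth: float = 100.0   # hypothetical starting worth
    moving: bool = True

def mutate(rules, rate, rng, sigma=0.1):
    """Perturb each parameter with probability `rate` (assumed Gaussian)."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g
            for g in rules]

def init_swarm(seed_rules, n, rate, rng):
    """Give every robot its own mutated copy of the offline-learned rules."""
    return [Robot(rules=mutate(seed_rules, rate, rng)) for _ in range(n)]

def penalize(robot, collided, left_behind, penalty=10.0):
    """Lower a robot's worth for a collision and/or for being left behind."""
    if collided:
        robot.worth -= penalty
    if left_behind:
        robot.worth -= penalty

def share(robots, neighbors):
    """Each robot copies rules from its best moving, higher-worth neighbor.

    neighbors[i] lists the indices of the robots near robot i.
    """
    # Snapshot first so that sharing within one round is order-independent.
    snapshot = [(r.worth, list(r.rules), r.moving) for r in robots]
    for i, robot in enumerate(robots):
        donors = [j for j in neighbors[i]
                  if snapshot[j][2] and snapshot[j][0] > snapshot[i][0]]
        if donors:
            best = max(donors, key=lambda j: snapshot[j][0])
            robot.rules = list(snapshot[best][1])
```

All information flow is local: a robot only ever reads from the neighbors listed for it, so no global observer is needed once the swarm is fielded.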

Experimental Setup There are five goals to reach in a long corridor. Between each goal is a different obstacle course. Robots that are left behind (due to obstacle cul-de-sacs) do not proceed to the next goal. The number of robots that survive to reach the last goal is low. We want the robots to learn to do better, while in the field.

DAEDALUS Results DAEDALUS succeeded in dramatically reducing the number of collisions and improving survivability, despite the difficulties caused by obstructed perception. Our results depended on the mutation rate. Can DAEDALUS learn that also? 20 minutes of simulation time

Further DAEDALUS Results DAEDALUS also succeeded in learning the appropriate mutation rate for the robots. Hence, the system is striking a balance between exploration and exploitation.

Effect of Mutation Rate on Survival
Number of Robots Surviving with Different Mutation Rates
Total (stage)     1%    3%    5%    7%    9%
60 (start)        12    12    12    12    12
53 (goal 1)       8     10    11    12    12
45 (goal 2)       9     6     10    9     11
40 (goal 3)       7     6     10    8     9
34 (goal 4)       5     6     9     8     6
32 (goal 5)       5     5     9     7     6

Collision Reduction [Figure: 60 robots moving towards 5 goals, with 90 obstacles between each pair of goals.]

Summary of DAEDALUS Creating rapidly adapting robots in changing environments is challenging. Offline learning can yield initial “seed” rules, which must then be perturbed. The key is to maintain “diversity” in the rules that control the members of the swarm. Collective behaviors still arise from the local interactions of a diverse population of robots.

Outline Goals and Contributions Robot Swarms Physicomimetics Framework Traditional Offline Learning Novel Distributed Online Learning Obstacle Avoidance with Physical Robots Conclusion and Future Work

Obstacle Avoidance with Robots Use three Maxelbot robots Use 2D trilateration localization algorithm (Not a part of this thesis) Design and develop obstacle avoidance module (OAM) Implement physicomimetics on a real outdoor robot

Hardware Architecture of Maxelbot
MiniDRAGON for motor control: executes physicomimetics
MiniDRAGON for trilateration: provides robot coordinates
OAM: AtoD conversion
RF and acoustic sensors
IR sensors
Links between modules: I2C

Physicomimetics for Obstacle Avoidance A constant “virtual” attractive goal force acts in front of the leader. “Virtual” repulsive forces come from four sensors mounted on the front of the leader, if obstacles are detected. The resultant force creates a change in velocity due to F = ma. The power supplied to the motors is changed based on the forces acting on the leader.

Obstacle Avoidance Methodology Measure the performance of physicomimetics with repulsion from obstacles. All experiments are conducted outdoors in “Prexy’s Pasture”. Three Maxelbots: one leader and two followers. Graphs show the correlation between raw sensor readings and motor power. The leader uses the physicomimetics algorithm with the obstacle avoidance module. The focus is on obstacle avoidance by the leader, not on formation control.

If there is an obstacle on the right, power to left motor is reduced

If there is an obstacle on the left, power to right motor is reduced

If there is an obstacle in front, power to both motors is reduced
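The three rules above can be mirrored in a small sketch that maps four front-sensor distances (in inches) to left and right motor power. This is not the Maxelbot firmware: the linear repulsion ramp, the normalized power scale, and the choice to treat the two middle sensors as “front” are assumptions. The 30-inch cutoff is the one stated elsewhere in the slides.

```python
MAX_RANGE_IN = 30.0  # obstacles farther than 30 inches are ignored

def repulsion(distance_in):
    """Repulsion in [0, 1]: 0 when out of range, 1 at contact (assumed linear)."""
    if distance_in is None or distance_in > MAX_RANGE_IN:
        return 0.0
    return (MAX_RANGE_IN - max(distance_in, 0.0)) / MAX_RANGE_IN

def motor_power(left, left_mid, right_mid, right, base_power=1.0):
    """Return (left_motor, right_motor) power from four sensor distances.

    A distance of None means the sensor sees nothing.
    """
    front = max(repulsion(left_mid), repulsion(right_mid))
    left_side = repulsion(left)
    right_side = repulsion(right)
    # Obstacle on the right: slow the left motor.
    # Obstacle on the left: slow the right motor.
    # Obstacle in front: slow both motors.
    left_motor = base_power * (1.0 - max(front, right_side))
    right_motor = base_power * (1.0 - max(front, left_side))
    return left_motor, right_motor
```

For example, `motor_power(None, None, None, 15.0)` reduces only the left motor, matching the first rule.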

Further Analysis of Sensor Reading and Motor Power Scatter plots give more information and provide a broader picture of the data. They show the correlation of motor power with distance to an obstacle in inches (the robots ignore obstacles more than 30” away). [Movie: three Maxelbots; the leader has the OAM.]

Left sensor sees obstacle Left middle sensor also sees obstacle

Contributions Improved performance in obstacle avoidance: Applied a new force law for robot control, to improve performance Provided novel objective performance metrics for obstacle avoiding swarms Improved scalability of the swarm in obstacle avoidance Improved performance of obstacle avoidance with obstructed perception Invented a real-time learning algorithm (DAEDALUS): Demonstrate that a swarm can improve performance by mutating and exchanging force laws Demonstrate feasibility of DAEDALUS with obstacle avoidance, in environments three times denser than the norm Explore the trade-offs of mutation on homogeneous and heterogeneous swarm learning Hardware Implementation Present a novel robot control algorithm that merges physicomimetics with obstacle avoidance.

Future Work Use DAEDALUS to provide practical solutions to real world problems Provide obstacle avoidance capability to all the robots in the formation Develop robots with greater data exchange capability Adapt the physicomimetics framework to incorporate performance feedback for specific tasks and situational awareness Extend the physicomimetics framework for sensing and performing tasks in a marine environment (with Harbor Branch) Introduce robot/human roles and interactions to distributed evolution architecture

Work Published
Spears W., Spears D., Heil R., Kerr W. and Hettiarachchi S. An overview of physicomimetics. Lecture Notes in Computer Science - State of the Art Series Volume 3342, 2004. Springer.
Hettiarachchi S. and Spears W., Moving swarm formations through obstacle fields. Proceedings of the 2005 International Conference on Artificial Intelligence, Volume 1, 97-103, CSREA Press.
Hettiarachchi S., Spears W., Green D., and Kerr W., Distributed agent evolution with dynamic adaptation to local unexpected scenarios. Proceedings of the 2005 Second GSFC/IEEE Workshop on Radical Agent Concepts. Springer.
Spears, W., D. Zarzhitsky, S. Hettiarachchi, W. Kerr. Strategies for multi-asset surveillance. IEEE International Conference on Networking, Sensing and Control, 2005, 929-934. IEEE Press.
Hettiarachchi, S. and W. Spears (2006). DAEDALUS for agents with obstructed perception. In SMCals/06 IEEE Mountain Workshop on Adaptive and Learning Systems, pp. 195-200. IEEE Press, Best Paper Award.
Hettiarachchi, S. (2006). Distributed online evolution for swarm robotics. In Doctoral Mentoring Program AAMAS06, T. Ishida and A. B. Hassine (Eds.), Autonomous Agents and Multi Agent Systems, pp. 17-18.
Hettiarachchi, S., P. Maxim, and W. Spears (2007). An architecture for adaptive swarms. In Robotics Research Trends, X. P Guo (Ed.). Nova Publishers (Book Chapter).

Thank You Questions?

Backup Slides The next set of slides may be confusing because they are intended to be placed between slides 1-49.

DAEDALUS for Reducing Collisions Slightly mutate robot-obstacle force law interactions. Those robots that do not collide give their force laws to poorer performing robots.

DAEDALUS for Improving Survival The previous experiment did not attempt to alleviate the situation where robots are left behind. This is caused by the large number of cul-de-sacs produced by the high obstacle density. Slightly mutate the robot-robot interaction if there is a nearby moving neighbor. Rapidly mutate the robot-goal interaction if there are no neighbors.

Improved Survival The two online experiments are independent of each other.

Task: Obstacle Avoidance with Obstructed Perception Robots must organize themselves into a formation and then move toward a goal, while avoiding obstacles. A robot may not see another robot, due to the presence of obstacles. If r > minD, then robot A and robot B have their perception obstructed.
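The slide does not define r and minD. A common reading, assumed here, is that r is the radius of a circular obstacle and minD is the minimum distance from the obstacle's center to the line segment between robot A and robot B. Under that assumption the test can be sketched as:

```python
import math

def min_dist_to_segment(p, a, b):
    """Minimum distance from point p to the segment from a to b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # A and B coincide, so the segment is a single point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line, then clamp the projection to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def perception_obstructed(robot_a, robot_b, obstacle_center, obstacle_radius):
    """True if the obstacle blocks the line of sight between A and B.

    Assumes r is the obstacle radius and minD is the distance from the
    obstacle center to segment AB; obstructed when r > minD.
    """
    min_d = min_dist_to_segment(obstacle_center, robot_a, robot_b)
    return obstacle_radius > min_d
```

Clamping the projection to the segment matters: an obstacle that lies behind either robot does not block the view between them.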

DAEDALUS Results Results averaged over 100 independent runs. We do not train children on hard problems immediately; instead, we train them on easier problems first. This is counter to accepted wisdom in the EA community. DAEDALUS online learning is improving performance.

Homogeneous DAEDALUS All robots had the same mutation rate, which was 5%. The results may depend quite heavily on choosing the correct mutation rate. The best mutation rate may also depend on the environment, and should potentially change as the environment changes. We decided to explore this effect by conducting several experiments with different mutation rates.

Heterogeneous DAEDALUS We attempted to address the problem of choosing the correct mutation rate. We divided the robots into five groups of equal size. Each group of 12 robots was assigned a mutation rate of 1%, 3%, 5%, 7%, and 9%, respectively. This mimics the behavior of children that have different “comfort zones” in their rate of exploration.

Heterogeneous Results Results averaged over 100 independent runs The result at the final goal is essentially identical to the average of the five performance curves in the previous graph. Can DAEDALUS learn the proper “comfort zone”, instead?

Analogy – Children Learning Borrowed from the analogy of a “swarm” of children learning some task. They share useful information as to the rules they might use, but they also share meta-information as to the level of exploration that is actually safe! Very bold children might encourage their more timid comrades to explore more than they would initially. If a very bold child has an accident, the rest of the children will become more timid.

Extended Heterogeneous DAEDALUS - Results Results averaged over 100 independent runs. DAEDALUS now allows the robots to receive a neighbor’s mutation rate, in addition to the neighbor’s rules. The results are close to those achieved by the homogeneous DAEDALUS with the best mutation rate!
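A sketch of this extension follows, with five groups of 12 robots at mutation rates of 1%, 3%, 5%, 7%, and 9%. The `Agent` class, the starting worth, and the decision to re-mutate the received rules using the received rate are assumptions made for illustration; the slides only state that both the rules and the mutation rate are received.

```python
import random
from dataclasses import dataclass

@dataclass
class Agent:
    rules: list
    mutation_rate: float
    worth: float = 100.0   # hypothetical starting worth

def make_groups(seed_rules, rates=(0.01, 0.03, 0.05, 0.07, 0.09), group_size=12):
    """Build a heterogeneous swarm: one equal-sized group per mutation rate."""
    return [Agent(list(seed_rules), rate)
            for rate in rates for _ in range(group_size)]

def receive(agent, donor, rng, sigma=0.1):
    """Adopt the donor's mutation rate and rules.

    Assumption: the receiver perturbs the copied rules using the rate it
    just adopted, so the rate itself is subject to selection.
    """
    agent.mutation_rate = donor.mutation_rate
    agent.rules = [g + rng.gauss(0.0, sigma)
                   if rng.random() < agent.mutation_rate else g
                   for g in donor.rules]
```

Because the rate travels with the rules, rates carried by successful robots spread through the swarm, which is one way the swarm could settle on a suitable level of exploration.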

Why Physicomimetics? Capable of maintaining formations of robots Designed as a leader-follower algorithm Allows robots to move quickly, due to minimal communication Can use theory to set parameters

Physicomimetics for Formation Control The leader provides an attractive goal force for the followers. Each follower uses F = ma to compute the change in velocity that is required to follow the leader. The power supplied to the motors is changed based on the changes in velocity.

Formation Control Methodology Measure the quality of physicomimetics without repulsion from obstacles. All experiments are conducted outdoors in “Prexy’s Pasture”. Three Maxelbots: one leader and two followers. Results averaged over 10 runs. The leader is remotely controlled (no physicomimetics). The leader does not have obstacle avoidance capability. The focus is on formation control, not on obstacle avoidance.

Triangular Formation

Triangular Formation Results

Linear Formation

Linear Formation Results

Lag in stopping due to physicomimetic inertia. Helps counteract noisy sensors. Lag in starting due to physicomimetic inertia. Helps counteract noisy sensors. Left sensor sees obstacle Left middle sensor sees obstacle

Lag in starting due to physicomimetic inertia. Helps counteract noisy sensors. Lag in stopping due to physicomimetic inertia. Helps counteract noisy sensors. Right sensor sees obstacle Right middle sensor sees obstacle

Power will be reduced if the outermost sensors see an obstacle when the inner sensors do not.
