
1 Learning Momentum: Integration and Experimentation
Brian Lee and Ronald C. Arkin
Mobile Robot Laboratory, Georgia Tech, Atlanta, GA

2 Motivation
- It's hard to manually derive controller parameters: the parameter space grows exponentially with the number of parameters.
- You don't always have a priori knowledge of the environment. Without prior knowledge, a user can't confidently derive appropriate parameter values, so the robot must adapt on its own to what it finds.
- Obstacle densities and layout in the environment may be heterogeneous. Parameters that work well for one type of environment may not work well for another.

3 Adaptation and Learning Methods (DARPA MARS)
- Investigate robot shaping at five distinct levels in a hybrid robot software architecture.
- Implement algorithms within the MissionLab mission specification system.
- Conduct experiments to evaluate the performance of each technique.
- Combine techniques where possible.
- Integrate on a platform more suitable for realistic missions and continue development.

4 Overview of Techniques
- CBR Wizardry: guide the operator.
- Probabilistic Planning: manage complexity for the operator.
- RL for Behavioral Assemblage Selection: learn what works for the robot.
- CBR for Behavior Transitions: adapt to situations the robot can recognize.
- Learning Momentum: vary robot parameters in real time.
The learning continuum: deliberative (premission), behavioral switching, reactive (online adaptation).

5 Basic Concepts of LM
- Provides adaptability to behavior-based systems.
- A crude form of reinforcement learning: if the robot is doing well, keep doing what it's doing; otherwise, try something different.
- Behavior parameters are changed in response to progress and obstacles.
- The system is still fully reactive: although the robot changes its behavior, there is no deliberation.

6 Currently Used Behaviors
- Move to Goal: always returns a vector pointing toward the goal position.
- Avoid Obstacles: returns a sum of weighted vectors pointing away from obstacles.
- Wander: returns vectors pointing in random directions.
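The slides do not include source code, so the following is a minimal Python sketch of how these three behaviors could be expressed as vector-producing functions. The function names, the 2-D tuple representation, and the linear distance weighting inside the sphere of influence are illustrative assumptions, not the MissionLab implementation.

```python
import math
import random

def move_to_goal(robot_pos, goal_pos):
    """Unit vector pointing from the robot toward the goal position."""
    dx, dy = goal_pos[0] - robot_pos[0], goal_pos[1] - robot_pos[1]
    dist = math.hypot(dx, dy) or 1.0      # avoid division by zero at the goal
    return (dx / dist, dy / dist)

def avoid_obstacles(robot_pos, obstacles, sphere_of_influence):
    """Sum of weighted vectors pointing away from every obstacle inside the
    sphere of influence; closer obstacles push harder."""
    vx = vy = 0.0
    for ox, oy in obstacles:
        dx, dy = robot_pos[0] - ox, robot_pos[1] - oy
        dist = math.hypot(dx, dy)
        if 0.0 < dist < sphere_of_influence:
            weight = (sphere_of_influence - dist) / sphere_of_influence
            vx += weight * dx / dist
            vy += weight * dy / dist
    return (vx, vy)

class Wander:
    """Random unit vector, held for `persistence` consecutive steps."""
    def __init__(self, persistence):
        self.persistence = persistence
        self.steps_left = 0
        self.direction = (1.0, 0.0)

    def __call__(self):
        if self.steps_left <= 0:
            angle = random.uniform(0.0, 2.0 * math.pi)
            self.direction = (math.cos(angle), math.sin(angle))
            self.steps_left = self.persistence
        self.steps_left -= 1
        return self.direction
```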

7 Adjustable Parameters
- Move-to-goal vector gain
- Avoid-obstacle vector gain
- Avoid-obstacle sphere of influence: the radius around the robot inside of which obstacles are perceived
- Wander vector gain
- Wander persistence: the number of consecutive steps the wander vector points in the same direction
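A small container for these five parameters, so the later sketches have something to adjust; the field names and default values are placeholders, not values from the paper.

```python
from dataclasses import dataclass

@dataclass
class Params:
    """The five adjustable parameters (names and defaults are illustrative)."""
    goal_gain: float = 1.0             # move-to-goal vector gain (G_m)
    obstacle_gain: float = 1.0         # avoid-obstacle vector gain (G_o)
    sphere_of_influence: float = 2.0   # obstacle perception radius in meters (S)
    wander_gain: float = 0.0           # wander vector gain (G_w)
    wander_persistence: int = 10       # steps the wander direction is held (P)
```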

8 Four Predefined Situations
- No movement: M < T_movement
- Progress toward the goal: M > T_movement and P > T_progress
- No progress with obstacles: M > T_movement, P < T_progress, and O_count > T_obstacles
- No progress without obstacles: M > T_movement, P < T_progress, and O_count < T_obstacles
where M = average movement, M_goal = average movement toward the goal, P = M_goal / M, O_count = number of obstacles encountered, T_movement = movement threshold, T_progress = progress threshold, and T_obstacles = obstacles threshold.
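The classification follows directly from these definitions. A sketch, assuming the metrics M, M_goal, and O_count are computed over a short sliding window elsewhere; the threshold values are not given in the slides and are passed in as arguments.

```python
def classify_situation(avg_movement, avg_movement_to_goal, obstacle_count,
                       t_movement, t_progress, t_obstacles):
    """Map the progress and obstacle metrics onto the four predefined situations."""
    if avg_movement < t_movement:
        return "no_movement"
    progress = avg_movement_to_goal / avg_movement      # P = M_goal / M
    if progress > t_progress:
        return "progress"
    if obstacle_count > t_obstacles:
        return "no_progress_with_obstacles"
    return "no_progress_without_obstacles"
```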

9 Parameter Adjustments
[Table: sample adjustment parameters for ballooning]

10 Two Possible Strategies
- Ballooning: the sphere of influence is increased when obstacles impede progress, so the robot moves around large objects.
- Squeezing: the sphere of influence is decreased when obstacles impede progress, so the robot moves between closely spaced objects.
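Each strategy is just a different table of per-situation parameter deltas applied by the LM module. The slide only states the direction of the sphere-of-influence change, so the magnitudes, the other deltas, and the clamping bounds below are made-up illustrations; this builds on the Params class sketched earlier.

```python
# Per-situation deltas: (goal_gain, obstacle_gain, sphere_of_influence, wander_gain).
# Magnitudes are illustrative, not the values used in the paper.
BALLOONING = {
    "no_movement":                   (0.0,  0.0, +0.5, +0.3),
    "progress":                      (+0.1, -0.1,  0.0, -0.3),
    "no_progress_with_obstacles":    (-0.1, +0.1, +0.5, +0.3),  # grow the sphere, go around
    "no_progress_without_obstacles": (+0.1,  0.0,  0.0,  0.0),
}

SQUEEZING = dict(BALLOONING)
SQUEEZING["no_progress_with_obstacles"] = (-0.1, -0.1, -0.5, +0.1)  # shrink the sphere, slip between

def adjust(params, situation, strategy):
    """Apply the active strategy's deltas for the current situation,
    clamping so no parameter collapses to zero or goes negative."""
    dg, do, ds, dw = strategy[situation]
    params.goal_gain = max(0.1, params.goal_gain + dg)
    params.obstacle_gain = max(0.1, params.obstacle_gain + do)
    params.sphere_of_influence = max(0.5, params.sphere_of_influence + ds)
    params.wander_gain = max(0.0, params.wander_gain + dw)
```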

11 Integration: Base System
[Diagram: sensors supply position and goal information to Move To Goal(G_m), and obstacle information to Avoid Obstacles(G_o, S) and Wander(G_w, P); the weighted behavior vectors are summed (Σ) into the output direction sent to the controller.]
G_m = goal gain, G_o = obstacle gain, S = obstacle sphere of influence, G_w = wander gain, P = wander persistence

12 Integration: Integrated System
[Diagram: the same base system with an LM module added between the sensors and the behaviors; it supplies new G_m, G_o, S, G_w, and P parameters each step.]
G_m = goal gain, G_o = obstacle gain, S = obstacle sphere of influence, G_w = wander gain, P = wander persistence
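Putting the pieces together: one reactive control step in which the LM module first nudges the parameters and the behaviors are then summed exactly as in the base system. This is a sketch built on the earlier helpers (classify_situation, adjust, the behavior functions, and Params); the threshold constants and the way the metrics are supplied are assumptions.

```python
# Illustrative thresholds; the paper's actual values are not given in the slides.
T_MOVEMENT, T_PROGRESS, T_OBSTACLES = 0.1, 0.5, 3

def control_step(robot_pos, goal_pos, obstacles, params, wander_behavior,
                 avg_movement, avg_movement_to_goal, obstacle_count, strategy):
    """One step of the integrated system: LM module, then the behavioral sum."""
    # LM module: classify the current situation and adjust the parameters in place.
    situation = classify_situation(avg_movement, avg_movement_to_goal, obstacle_count,
                                   T_MOVEMENT, T_PROGRESS, T_OBSTACLES)
    adjust(params, situation, strategy)

    # Behavioral sum, as in the base system but with the updated gains.
    gx, gy = move_to_goal(robot_pos, goal_pos)
    ax, ay = avoid_obstacles(robot_pos, obstacles, params.sphere_of_influence)
    wx, wy = wander_behavior()
    return (params.goal_gain * gx + params.obstacle_gain * ax + params.wander_gain * wx,
            params.goal_gain * gy + params.obstacle_gain * ay + params.wander_gain * wy)
```

A caller would construct Params() and Wander(params.wander_persistence) once, then call control_step every cycle, choosing BALLOONING or SQUEEZING as the strategy.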

13 Experiments in Simulation
- 150m x 150m area; the robot moves from (10m, 10m) to (140m, 90m).
- Obstacle densities of 15% and 20% were used.
- Obstacle radii varied between 0.38m and 1.43m.

14 Ballooning

15 Observations on Ballooning
- Covers a lot of area.
- Not as easily trapped in box-canyon situations.
- May settle in locally clear areas.
- May require a high wander gain to carry the robot through closely spaced obstacles.

16 Squeezing

17 Observations on Squeezing
- Results in a straighter path.
- Moves easily through closely spaced obstacles.
- May get trapped in small box-canyon situations for long periods of time.

18 Simulations of the Real World
[Figure: simulated setup of the real-world environment, a 24m x 10m area with marked start and end places]

19 Completion Rates for Simulation
[Charts: completion rates for uniform obstacle size (1m radii) and varying obstacle sizes (0.38m - 1.43m radii)]

20 Average Steps to Completion
[Charts: average steps to completion for uniform obstacle size (1m radii) and varying obstacle sizes (0.38m - 1.43m radii)]

21 Results from Simulated Real Environment
[Charts: percent complete and steps to completion]
As before, there is an increase in completion rates with an accompanying increase in steps to completion.

22 Simulation Results
- Completion rates can be drastically improved.
- Completion-rate improvements come at a cost of time.
- The ballooning and squeezing strategies are geared toward different situations.

23 Physical Robot Experiments
- Nomad 150 robot with a sonar ring for obstacle avoidance.
- Traverses the length of a 24m x 10m room while negotiating obstacles.

24 Outdoor Run (adaptive)

25 Outdoor Run (non-adaptive)

26 Physical Experiment Results
- Non-learning robots became stuck.
- Learning robots successfully negotiated the obstacles.
- Squeezing was faster than ballooning in this case.
[Chart: average steps to goal]

27 Conclusions
- Improved success comes at a price in time.
- Performance of one strategy is very poor in situations better suited to the other strategy.
- The ballooning strategy is generally faster: ballooning robots can move through closely spaced objects faster than squeezing robots can move out of box-canyon situations.

28 Conclusions (cont'd)
- If some general knowledge of the terrain is known a priori, an appropriate strategy can be chosen.
- If the terrain is totally unknown, ballooning is probably the better choice.
- A way to dynamically switch strategies should improve performance.

