1
Avoidance of Multiple Dynamic Obstacles
17th International Congress of Mechanical Engineering
November 10–14, 2003 – Holiday Inn Select Jaraguá Hotel, São Paulo, SP, Brazil
Authors:
- Areolino de Almeida Neto - UFMA
- Bodo Heimann - University of Hannover
- Luiz Carlos S. Góes - ITA
- Cairo L. Nascimento Jr. - ITA
2
Objective
To drive a mobile robot along a safe path using direction indications that avoid both dynamic and static obstacles.
[Diagram: goal, robot, obstacle]
3
Reinforcement Learning
Characteristics:
- Intuitive data
- Cumulative learning
- Constructive solution
- Direct knowledge acquisition
- Well suited to decision making
4
Reinforcement Learning
States:
- Distance to the possible collision point (4 values)
- Direction of the obstacle (8 values)
- Shortest distance between the obstacle and the robot's path (8 values)
- Arrival-time condition (3 values)
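The four discretized features above multiply to 4 × 8 × 8 × 3 = 768 states. A minimal sketch of how such bins could be packed into a single state index (the feature names and bin counts come from the slide; the encoding order is an assumption):

```python
# Discretized state features and their bin counts, as listed on the slide:
#   distance to the possible collision point: 4 bins
#   direction of the obstacle:                8 bins
#   shortest obstacle-to-path distance:       8 bins
#   arrival-time condition:                   3 bins
BINS = (4, 8, 8, 3)

def state_index(dist_bin, dir_bin, path_bin, time_bin):
    """Pack the four bin indices into a single state index in [0, 768)."""
    idx = 0
    for value, size in zip((dist_bin, dir_bin, path_bin, time_bin), BINS):
        assert 0 <= value < size
        idx = idx * size + value
    return idx

n_states = 4 * 8 * 8 * 3  # = 768, matching the slide
```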
5
Reinforcement Learning
Actions:
- Lateral velocity: 3 to the right, 1 null, 3 to the left
- Frontal velocity: 3 ahead, 1 null, 3 back
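The two velocity components above give 7 lateral × 7 frontal choices, i.e. the 49 actions mentioned on the next slide. A sketch of that action set (the sign convention is an assumption, not stated on the slide):

```python
from itertools import product

# 7 lateral options (3 right, null, 3 left) and 7 frontal options
# (3 ahead, null, 3 back); here negative = right/back (assumed convention).
LATERAL = (-3, -2, -1, 0, 1, 2, 3)
FRONTAL = (-3, -2, -1, 0, 1, 2, 3)

# Every (lateral, frontal) pair is one action: 7 * 7 = 49 actions.
ACTIONS = list(product(LATERAL, FRONTAL))
```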
6
Reinforcement Learning
- States are mapped to actions using a coding scheme. There are 768 states.
- For each state there are 49 possible actions, each with a corresponding "evaluation value".
- Training means creating the "evaluation values" for each state and its possible actions.
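With 768 states and 49 actions per state, the "evaluation values" form a 768 × 49 table, i.e. the 37,632-entry state-action matrix mentioned later in the deck. A minimal sketch (the array layout is an assumption):

```python
import numpy as np

N_STATES = 768   # 4 * 8 * 8 * 3 discretized states (slide 4)
N_ACTIONS = 49   # 7 lateral x 7 frontal velocity choices (slide 5)

# One "evaluation value" per (state, action) pair; training fills this in.
eval_table = np.zeros((N_STATES, N_ACTIONS))
```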
7
Reinforcement Learning
Training using only one obstacle:
- 1st level: Monte Carlo (~450,000 runs) - direct, fast computation of the evaluation function
- 2nd level: Q-learning - necessary in around 50 situations
(t: duration of movement; a: number of actions; n: iteration number)
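The slide's evaluation-update formula did not survive extraction, so for reference only, here is the standard tabular Q-learning update (the generic textbook rule, not necessarily the authors' exact variant; the learning-rate and discount values are placeholders):

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Standard tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q
```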
8
Obstacle Avoidance
Architecture:
- Use of an a priori path (static environment)
- Detection of a possible collision
- Classification of a collision: possible or immediate
9
Obstacle Avoidance
Algorithm for avoiding multiple obstacles:
- One obstacle is defined as the main one, and actions are indicated based on this obstacle;
- The situation is divided into sectors:
10
Obstacle Avoidance
Algorithm for avoiding multiple obstacles (continued):
- If the last action chosen still has a chance to avoid the obstacles (i.e., it can drive the robot through a free and sufficiently large sector), then it is maintained;
- If not, the RL technique indicates 10 actions for the present situation. For a new situation these are the 10 best actions; otherwise they are the 10 best actions belonging to the same quadrant as the last action;
11
Obstacle Avoidance
Algorithm for avoiding multiple obstacles (continued):
- Among the 10 indicated actions, if more than one can drive the robot along a safe trajectory, the action with the smallest change in lateral velocity is chosen;
- If none can, the 10 indicated actions are reflected to the other side (left or right) and a safe trajectory is searched for again;
- If still none, an action that presents the possibility of no collision is immediately chosen from among the 10 best actions of all quadrants;
12
Obstacle Avoidance
Algorithm for avoiding multiple obstacles (continued):
- If still none, an action that presents the possibility of arriving at the collision point before or after the obstacle is immediately chosen from among the 10 best actions of all quadrants;
- Finally, if no action was found, the robot should stop.
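The fallback cascade on the last three slides can be sketched as a single selection function. The predicate callables (`is_safe`, `avoids_collision`, `arrives_off_phase`, `lateral_change`) are hypothetical stand-ins for the paper's geometric tests, not part of the original method:

```python
def choose_action(last_action, candidates, reflected, all_quadrant_best,
                  is_safe, avoids_collision, arrives_off_phase, lateral_change):
    """Fallback cascade from the slides; predicates are hypothetical stand-ins."""
    # 1. Keep the last action if it still steers through a free, wide sector.
    if last_action is not None and is_safe(last_action):
        return last_action
    # 2. Among the 10 RL-indicated actions, prefer a safe one with the
    #    smallest change in lateral velocity.
    safe = [a for a in candidates if is_safe(a)]
    if safe:
        return min(safe, key=lambda a: lateral_change(a, last_action))
    # 3. Mirror the 10 actions to the other side and search again.
    safe = [a for a in reflected if is_safe(a)]
    if safe:
        return min(safe, key=lambda a: lateral_change(a, last_action))
    # 4. Any action (10 best per quadrant) that avoids collision outright.
    for a in all_quadrant_best:
        if avoids_collision(a):
            return a
    # 5. Any action that reaches the collision point before/after the obstacle.
    for a in all_quadrant_best:
        if arrives_off_phase(a):
            return a
    # 6. Nothing works: stop the robot.
    return None  # interpreted as "stop"
```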
13
Reinforcement Learning
Neural representation:
- Problem: state-action matrix explosion (37,632 entries)
- Solution: neural representation
- Use of multiple neural networks:
  training the 1st NN: E = D − Y1
  training the 2nd NN: E = (D − Y1) − Y2
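In this scheme each network is trained on the error left by the previous ones, and the final output is the sum of all networks' outputs. A minimal sketch of that residual-training idea, with least-squares linear models standing in for the paper's neural networks:

```python
import numpy as np

def fit_residual_cascade(X, D, n_nets=2):
    """Residual training from the slide: each model fits the error left by
    the previous ones (E1 = D - Y1, E2 = (D - Y1) - Y2, ...).
    Least-squares linear models stand in for the paper's neural networks."""
    models, target = [], D.astype(float)
    for _ in range(n_nets):
        w, *_ = np.linalg.lstsq(X, target, rcond=None)
        models.append(w)
        target = target - X @ w   # the next model learns the remaining error
    return models

def predict(models, X):
    # The cascade's output is the sum of all models' outputs.
    return sum(X @ w for w in models)
```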
14
Obstacle Avoidance
Results:
15
Obstacle Avoidance
Conclusion:
- Complex avoidance behavior from primitive actions
- Direct knowledge acquisition with the Monte Carlo technique
- Knowledge improvement with Q-learning
- The neural representation compacts the state-action matrix well
Acknowledgements:
- CAPES, DAAD, UFMA, and ITA for the financial support