1
GSMDPs for Multi-Robot Sequential Decision-Making
By: Messias, Spaan, Lima
Presented by: Mike Plasker, DMES – Ocean Engineering
2
Introduction
Robotic planning under uncertainty
MDP solutions
Limited real-world application
3
Assumptions for Multi-Robot Teams
Communication (inexpensive, free, or costly)
Synchronous and steady state transitions
Discretization of the environment
4
A Different Approach
States and actions discrete (as in an MDP)
Continuous measure of time
State transitions regarded as random ‘events’
5
Advantages
Non-Markovian effects of discretization minimized
Fully reactive to changes
Communication only required for ‘events’
6
GSMDPs
Generic temporal probability distributions over events
Can model concurrent (persistently enabled) events
Solvable by discrete-time MDP algorithms, by obtaining an equivalent (semi-)Markovian model
Avoids the negative effects of synchronous alternatives
7
Why GSMDPs for Robotics
Cooperative robotics requires:
Operation in inherently continuous environments
Uncertainty in actions (and observations)
Joint decision making for optimization
Reactivity
8
Definitions
Multiagent GSMDP: a tuple ⟨d, S, X, A, T, F, R, C, h⟩ where
d = number of agents
S = state space (composed of state factors)
X = state factors
A = set of joint actions
T = transition function
F = time model
R = instantaneous reward function
C = cumulative reward rate
h = planning horizon over continuous time
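A minimal sketch of how this tuple could be held in code. The type aliases and field representations below are illustrative choices for this presentation, not the paper's notation:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Illustrative type aliases (not from the paper): a state is one value
# per state factor, a joint action is one action per agent.
State = Tuple[int, ...]
JointAction = Tuple[int, ...]
Event = str  # placeholder; events are defined on the next slide

@dataclass
class MultiagentGSMDP:
    """Container for the tuple <d, S, X, A, T, F, R, C, h>."""
    d: int                                           # number of agents
    X: List[List[int]]                               # state factor domains (S is their product)
    A: List[JointAction]                             # set of joint actions
    T: Callable[[State, Event], Dict[State, float]]  # transition function, per event
    F: Callable[[Event, float], float]               # time model: density of an event's trigger time
    R: Callable[[State, JointAction], float]         # instantaneous reward
    C: Callable[[State, JointAction], float]         # cumulative reward rate
    h: float                                         # planning horizon over continuous time
```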
9
Definitions
Event in a GSMDP: an abstraction over state transitions that share the same properties
Persistently enabled event: an event that remains enabled from step ‘t’ to step ‘t+1’ but is not triggered at step ‘t’
10
Common Approach
Synchronous actions
Pre-defined time step
Trade-off between performance and reaction time
11
GSMDPs
Persistently enabled events modeled by allowing their temporal distributions to depend on the time they were enabled
Explicit modeling of the non-Markovian effects of discretization
Communication efficiency
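The sketch below illustrates the idea of conditioning an event's temporal distribution on how long it has been enabled. The event name and the rejection-sampling scheme are made up for illustration; any non-exponential duration shows the same effect:

```python
import random

class PersistentEvent:
    """An event with a generic (possibly non-exponential) trigger-time distribution.

    When the event stays enabled across a decision step without firing,
    its clock is NOT reset, so the distribution of the remaining time
    depends on how long it has already been enabled -- the non-Markovian
    effect that a GSMDP captures explicitly.
    """
    def __init__(self, name, sample_duration):
        self.name = name
        self.sample_duration = sample_duration  # draws a total trigger time
        self.enabled_for = 0.0                  # clock: time enabled without firing

    def sample_remaining(self, n=10000):
        # Rejection-sample the residual time, conditioned on having
        # already survived `enabled_for` time units.
        samples = []
        while len(samples) < n:
            t = self.sample_duration()
            if t > self.enabled_for:
                samples.append(t - self.enabled_for)
        return samples

# Hypothetical example: a uniform(2, 4) duration is clearly non-exponential,
# so the expected residual time shrinks the longer the event stays enabled.
ev = PersistentEvent("battery_low", lambda: random.uniform(2.0, 4.0))
ev.enabled_for = 3.0
print(sum(ev.sample_remaining()) / 10000)   # roughly 0.5, not the fresh mean of 3.0
```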
12
Modeling Events
Group state transitions into events that share the same temporal distribution and transitions (e.g., ‘battery low’)
Transition function found by estimating the relative frequency of each transition in the event
Time model found by timing the transition data
Approximated as a phase-type distribution, which replaces events with acyclic Markov chains (a sketch follows below)
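A minimal sketch of this estimation step, assuming logged (next state, duration) pairs for one event. The data format, the function name, and the moment-matched Erlang fit are illustrative stand-ins; the paper only states that a phase-type distribution is fitted to the timing data:

```python
from collections import Counter

def estimate_event_model(transitions):
    """Estimate an event's transition probabilities and a simple
    phase-type (Erlang) approximation of its duration from logged data.

    `transitions` is a list of (next_state, duration) pairs observed
    whenever this event triggered (illustrative format).
    """
    # Transition function: relative frequency of each resulting state.
    counts = Counter(s for s, _ in transitions)
    total = len(transitions)
    T = {s: c / total for s, c in counts.items()}

    # Time model: match the first two moments of the observed durations
    # with an Erlang(k, rate) distribution -- a simple member of the
    # phase-type family, i.e. an acyclic chain of k exponential phases.
    durations = [d for _, d in transitions]
    mean = sum(durations) / total
    var = sum((d - mean) ** 2 for d in durations) / total
    k = max(1, round(mean ** 2 / var)) if var > 0 else 1
    rate = k / mean
    return T, (k, rate)

# Usage with made-up data for a hypothetical 'pass completed' event:
data = [("ball_with_robot_2", 1.8), ("ball_with_robot_2", 2.1), ("ball_lost", 2.6)]
T, (k, rate) = estimate_event_model(data)
print(T, k, rate)
```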
13
Events (cont.)
A direct phase-type fit is not always possible
Decompose events with a minimum duration into deterministically timed transitions
The remainder can then be better approximated by a phase-type distribution
14
Solving a GSMDP
Can be viewed as an equivalent discrete-time MDP
Almost all MDP solution algorithms then apply
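For concreteness, a plain value-iteration sketch of the kind that becomes usable once the GSMDP has been converted to an equivalent discrete-time MDP (e.g., after expanding phase-type phases into extra state variables). The data layout is an assumption, not the paper's implementation:

```python
def value_iteration(states, actions, P, R, gamma=0.95, tol=1e-6):
    """Standard value iteration on the equivalent discrete-time MDP.

    P[s][a] is a list of (probability, next_state) pairs and R[s][a] is
    the expected reward; both names are illustrative.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```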
15
Experiment
Robotic soccer
Score a goal (reward 150)
Passing around an obstacle (reward 60)
16
Results: comparison of a discrete-time MDP (time step T = 4 s) against the GSMDP model
17
Results
No idle time
Reduced communication
Improved scoring efficiency
System failures (zero goals) independent of the model
18
Example Video
19
Future Work
Extend to partially observable domains
Apply bilateral phase-type distributions to broaden the class of non-Markovian events that can be modeled
20
Questions?
21
MESSIAS, J.; SPAAN, M.; LIMA, P. GSMDPs for Multi-Robot Sequential Decision-Making. AAAI Conference on Artificial Intelligence, North America, Jun. 2013. Available at: http://www.aaai.org/ocs/index.php/AAAI/AAAI13/paper/view/6432/6843. Date accessed: 06 Apr. 2014.