1
GSMDPs for Multi-Robot Sequential Decision-Making
By: Messias, Spaan, Lima
Presented by: Mike Plasker, DMES – Ocean Engineering
2
Introduction
Robotic planning under uncertainty
MDP solutions
Limited real-world application
3
Assumptions for Multi-Robot Teams
Communication (inexpensive, free, or costly)
Synchronous and steady state transitions
Discretization of the environment
4
A Different Approach
States and actions discrete (as in an MDP)
Continuous measure of time
State transitions regarded as random ‘events’
5
Advantages
Non-Markovian effects of discretization minimized
Fully reactive to changes
Communication only required for ‘events’
6
GSMDPs
Generic temporal probability distributions over events
Can model concurrent (persistently enabled) events
Solvable by discrete-time MDP algorithms, by obtaining an equivalent (semi-)Markovian model
Avoids the negative effects of synchronous alternatives
7
Why GSMDPs for Robotics
Cooperative robotics requires:
Operation in inherently continuous environments
Uncertainty in actions (and observations)
Joint decision making for optimization
Reactivity
8
Definitions
Multiagent GSMDP: a tuple ⟨d, S, X, A, T, F, R, C, h⟩ where
d = number of agents
S = state space (composed of state factors)
X = state factors
A = set of joint actions
T = transition function
F = time model
R = instantaneous reward function
C = cumulative reward rate
h = planning horizon over continuous time
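A minimal sketch of how this tuple could be held in code. The type aliases and field representations below are illustrative choices for this presentation, not the paper's notation:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Illustrative type aliases (not from the paper): a state is one value
# per state factor, a joint action is one action per agent.
State = Tuple[int, ...]
JointAction = Tuple[int, ...]
Event = str  # placeholder; events are defined on the next slide

@dataclass
class MultiagentGSMDP:
    """Container for the tuple <d, S, X, A, T, F, R, C, h>."""
    d: int                                           # number of agents
    X: List[List[int]]                               # state factor domains (S is their product)
    A: List[JointAction]                             # set of joint actions
    T: Callable[[State, Event], Dict[State, float]]  # transition function, per event
    F: Callable[[Event, float], float]               # time model: density of an event's trigger time
    R: Callable[[State, JointAction], float]         # instantaneous reward
    C: Callable[[State, JointAction], float]         # cumulative reward rate
    h: float                                         # planning horizon over continuous time
```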
9
Definitions
Event in a GSMDP: an abstraction over state transitions that share the same properties
Persistently enabled event: an event that remains enabled from step ‘t’ to step ‘t+1’ but is not triggered at step ‘t’
10
Common Approach
Synchronous actions
Pre-defined time step
Trade-off between performance and reaction time
11
GSMDPs
Persistently enabled events modeled by allowing their temporal distributions to depend on the time they were enabled
Explicit modeling of the non-Markovian effects of discretization
Communication efficiency
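The sketch below illustrates the idea of conditioning an event's temporal distribution on how long it has been enabled. The event name and the rejection-sampling scheme are made up for illustration; any non-exponential duration shows the same effect:

```python
import random

class PersistentEvent:
    """An event with a generic (possibly non-exponential) trigger-time distribution.

    When the event stays enabled across a decision step without firing,
    its clock is NOT reset, so the distribution of the remaining time
    depends on how long it has already been enabled -- the non-Markovian
    effect that a GSMDP captures explicitly.
    """
    def __init__(self, name, sample_duration):
        self.name = name
        self.sample_duration = sample_duration  # draws a total trigger time
        self.enabled_for = 0.0                  # clock: time enabled without firing

    def sample_remaining(self, n=10000):
        # Rejection-sample the residual time, conditioned on having
        # already survived `enabled_for` time units.
        samples = []
        while len(samples) < n:
            t = self.sample_duration()
            if t > self.enabled_for:
                samples.append(t - self.enabled_for)
        return samples

# Hypothetical example: a uniform(2, 4) duration is clearly non-exponential,
# so the expected residual time shrinks the longer the event stays enabled.
ev = PersistentEvent("battery_low", lambda: random.uniform(2.0, 4.0))
ev.enabled_for = 3.0
print(sum(ev.sample_remaining()) / 10000)   # roughly 0.5, not the fresh mean of 3.0
```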
12
Modeling Events
Group state transitions into events that share the same temporal distribution and transitions (e.g., ‘battery low’)
Transition function found by estimating the relative frequency of each transition in the event
Time model found by timing the transition data
Approximated as a phase-type distribution, which replaces events with acyclic Markov chains (a sketch follows below)
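A minimal sketch of this estimation step, assuming logged (next state, duration) pairs for one event. The data format, the function name, and the moment-matched Erlang fit are illustrative stand-ins; the paper only states that a phase-type distribution is fitted to the timing data:

```python
from collections import Counter

def estimate_event_model(transitions):
    """Estimate an event's transition probabilities and a simple
    phase-type (Erlang) approximation of its duration from logged data.

    `transitions` is a list of (next_state, duration) pairs observed
    whenever this event triggered (illustrative format).
    """
    # Transition function: relative frequency of each resulting state.
    counts = Counter(s for s, _ in transitions)
    total = len(transitions)
    T = {s: c / total for s, c in counts.items()}

    # Time model: match the first two moments of the observed durations
    # with an Erlang(k, rate) distribution -- a simple member of the
    # phase-type family, i.e. an acyclic chain of k exponential phases.
    durations = [d for _, d in transitions]
    mean = sum(durations) / total
    var = sum((d - mean) ** 2 for d in durations) / total
    k = max(1, round(mean ** 2 / var)) if var > 0 else 1
    rate = k / mean
    return T, (k, rate)

# Usage with made-up data for a hypothetical 'pass completed' event:
data = [("ball_with_robot_2", 1.8), ("ball_with_robot_2", 2.1), ("ball_lost", 2.6)]
T, (k, rate) = estimate_event_model(data)
print(T, k, rate)
```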
13
Events (cont.)
A direct phase-type fit is not always possible
Decompose events with a minimum duration into deterministically timed transitions
The remainder can then be better approximated by a phase-type distribution
14
Solving a GSMDP
Can be viewed as an equivalent discrete-time MDP
Almost all MDP solution algorithms then apply
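For concreteness, a plain value-iteration sketch of the kind that becomes usable once the GSMDP has been converted to an equivalent discrete-time MDP (e.g., after expanding phase-type phases into extra state variables). The data layout is an assumption, not the paper's implementation:

```python
def value_iteration(states, actions, P, R, gamma=0.95, tol=1e-6):
    """Standard value iteration on the equivalent discrete-time MDP.

    P[s][a] is a list of (probability, next_state) pairs and R[s][a] is
    the expected reward; both names are illustrative.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```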
15
Experiment
Robotic soccer
Score a goal (reward 150)
Passing around an obstacle (reward 60)
16
Results: comparison of a discrete-time MDP (time step T = 4 s) against the GSMDP model
17
Results
No idle time
Reduced communication
Improved scoring efficiency
System failures (zero goals) independent of the model
18
Example Video
19
Future Work
Extend to partially observable domains
Apply bilateral phase-type distributions to broaden the class of non-Markovian events that can be modeled
20
Questions?
21
MESSIAS, J.; SPAAN, M.; LIMA, P. GSMDPs for Multi-Robot Sequential Decision-Making. AAAI Conference on Artificial Intelligence, North America, Jun. 2013. Available at: http://www.aaai.org/ocs/index.php/AAAI/AAAI13/paper/view/6432/6843. Date accessed: 06 Apr. 2014.