Presentation on theme: "Based on slides by Nicholas Roy, MIT Finding Approximate POMDP Solutions through Belief Compression."— Presentation transcript:

1 Finding Approximate POMDP Solutions through Belief Compression (based on slides by Nicholas Roy, MIT)

2 Reliable Navigation Conventional trajectories may not be robust to localisation error. [Figure: estimated robot position, robot position distribution, true robot position, goal position]

3 Perception and Control [Diagram: world state → perception → control algorithms]

4 Perception and Control Two extremes: assume full observability, so control acts on the most likely state argmax P(x) (brittle), or plan exactly over the POMDP, so control acts on the full distribution P(x) (intractable). [Diagram: world state → probabilistic perception model P(x) → control]

5 Perception and Control Assuming full observability is brittle; exact POMDP planning is intractable. Middle ground: compress P(x) before planning. [Diagram: world state → probabilistic perception model P(x) → compressed P(x) → control]

6 Main Insight Good policies for real-world POMDPs can be found by planning over low-dimensional representations of the belief space. [Diagram: world state → probabilistic perception model P(x) → low-dimensional P(x) → control]

7 Belief Space Structure The controller may be globally uncertain... but not usually.

8 Coastal Navigation Represent beliefs using a compact statistic (in the coastal navigation work, the state estimate augmented with the belief's entropy), then discretise into a low-dimensional belief-space MDP.
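
A minimal sketch of such a compact belief statistic, assuming NumPy and a discrete belief vector; the function name belief_summary and the mean-plus-entropy choice are illustrative, not the thesis code:

```python
import numpy as np

def belief_summary(belief, states):
    """Summarise a discrete belief as (mean state, entropy).

    belief: probability vector over states; states: array of state
    coordinates. A hypothetical low-dimensional statistic in the
    spirit of coastal navigation."""
    mean = belief @ states              # expected state coordinate(s)
    p = belief[belief > 0]              # drop zeros to avoid log(0)
    entropy = -np.sum(p * np.log(p))    # Shannon entropy (nats)
    return mean, entropy
```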

9 Coastal Navigation

10 A Hard Navigation Problem [Plot: average distance to goal vs. distance in metres]

11 Dimensionality Reduction Principal Components Analysis: original beliefs ≈ weights × characteristic beliefs.

12 Principal Components Analysis Given belief b ∈ ℝⁿ, we want b̃ ∈ ℝᵐ, m ≪ n. [Plot: collection of beliefs drawn from a 200-state problem; probability of being in state vs. state]
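
For concreteness, conventional PCA compression of a matrix of sampled beliefs might look like the sketch below (NumPy; all names are illustrative). The rows of basis play the role of the characteristic beliefs, and each row of weights is a low-dimensional belief b̃:

```python
import numpy as np

def pca_compress(B, m):
    """Project belief vectors (rows of B, each an n-dim probability
    vector) onto the top-m principal components.

    Returns weights (low-dimensional beliefs), basis (characteristic
    beliefs), and the mean, so that B ~= mu + weights @ basis."""
    mu = B.mean(axis=0)
    U, S, Vt = np.linalg.svd(B - mu, full_matrices=False)
    basis = Vt[:m]                  # m characteristic beliefs (m x n)
    weights = (B - mu) @ basis.T    # low-dim coordinates (k x m)
    return weights, basis, mu
```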

13 Principal Components Analysis Given belief b ∈ ℝⁿ, we want b̃ ∈ ℝᵐ, m ≪ n. [Plot: m = 9 gives this representation for one sample distribution; probability of being in state vs. state]

14 Principal Components Analysis Many real-world POMDP distributions are characterised by large regions of low probability. Idea: create a fitting criterion that is (exponentially) stronger in low-probability regions (E-PCA).
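
A toy sketch of such a criterion: exponential-family PCA with an exponential link, fit by plain gradient descent on the loss Σ (exp(Z) − B ∘ Z) with Z = U V, whose gradient exp(Z) − B penalises misfit in low-probability regions far more heavily than squared error does. This is an assumption-laden simplification (the actual E-PCA implementation uses a more careful Newton-style optimisation), and all names are illustrative:

```python
import numpy as np

def epca(B, m, steps=2000, lr=1e-3, seed=0):
    """Fit factors U (k x m) and V (m x n) so that exp(U @ V)
    reconstructs the belief matrix B (k beliefs over n states),
    minimising the Poisson-style loss sum(exp(Z) - B * Z)."""
    rng = np.random.default_rng(seed)
    k, n = B.shape
    U = 0.01 * rng.standard_normal((k, m))
    V = 0.01 * rng.standard_normal((m, n))
    for _ in range(steps):
        Z = U @ V
        G = np.exp(Z) - B          # dLoss/dZ
        U -= lr * (G @ V.T)        # gradient step in U
        V -= lr * (U.T @ G)        # gradient step in V
    return U, V                    # reconstruction: np.exp(U @ V)
```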

15 Example E-PCA [Plots: reconstructions using 1, 2, 3, and 4 bases; probability of being in state vs. state]

16 Example Reduction

17 Finding Dimensionality E-PCA will indicate the appropriate number of bases, depending on the beliefs encountered.
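
One way this could be operationalised, reusing the epca sketch above; the tolerance and the mean-L1 error measure are assumptions, not from the slides:

```python
import numpy as np  # assumes the epca sketch from slide 14 is in scope

def choose_num_bases(B, max_m, tol=1e-2):
    """Return the smallest number of E-PCA bases whose average
    reconstruction error falls below a (hypothetical) tolerance."""
    for m in range(1, max_m + 1):
        U, V = epca(B, m)
        R = np.exp(U @ V)                       # reconstructed beliefs
        err = np.abs(B - R).sum(axis=1).mean()  # mean L1 error per belief
        if err < tol:
            return m
    return max_m
```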

18 Planning Original POMDP (states s₁, s₂, s₃) → E-PCA → low-dimensional belief space B̃ → discretise → discrete belief-space MDP.
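
Once the discrete belief-space MDP is built, it can be solved by standard value iteration. A generic sketch, assuming integer-indexed actions and grid beliefs (nothing here is specific to the thesis implementation):

```python
import numpy as np

def value_iteration(T, R, gamma=0.95, eps=1e-6):
    """Solve the discrete belief-space MDP.

    T: (A, K, K) transition probabilities between grid beliefs,
    R: (K,) expected reward at each grid belief (both built as on
    the following slides). Returns values and a greedy policy."""
    A, K, _ = T.shape
    V = np.zeros(K)
    while True:
        Q = R[None, :] + gamma * (T @ V)   # (A, K) action values
        V_new = Q.max(axis=0)
        if np.abs(V_new - V).max() < eps:
            return V_new, Q.argmax(axis=0)
        V = V_new
```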

19 Model Parameters Reward function R(b̃): back-project the low-dimensional belief b̃ to the high-dimensional belief b, then compute the expected reward from the belief: R(b̃) = Σₛ b(s) R(s). [Diagram: states s₁, s₂, s₃ with p(s)]
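
A sketch of this reward computation, assuming the E-PCA back-projection b = exp(b̃ V) followed by normalisation; V_basis and R_states are illustrative names:

```python
import numpy as np

def belief_reward(b_tilde, V_basis, R_states):
    """Expected reward of a low-dimensional belief.

    Back-project b_tilde (m,) through the E-PCA basis V_basis (m x n)
    to a full belief, normalise it to a distribution, then take the
    expectation of the per-state reward R_states (n,)."""
    b = np.exp(b_tilde @ V_basis)   # back-project to full dimension
    b /= b.sum()                    # normalise to a distribution
    return b @ R_states             # R(b) = sum_s b(s) R(s)
```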

20 Model Parameters Transition function (low dimension ↔ full dimension):
1. For each belief b̃ᵢ and action a:
2. Recover the full belief bᵢ.
3. Propagate according to the action, giving bⱼ.
4. Propagate according to the observation, giving bⱼ.
5. Recover b̃ⱼ.
6. Set T(b̃ᵢ, a, b̃ⱼ) to the probability of the observation.
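
The six steps might be tabulated as in the sketch below, assuming integer-indexed actions and observations; project combines step 5 with the nearest-neighbour lookup of slide 28, and every name is illustrative:

```python
import numpy as np

def build_transitions(grid, project, recover, T_s, O, actions, obs):
    """Tabulate T(b~_i, a, b~_j) over the grid of beliefs.

    grid:    list of low-dimensional beliefs b~_i
    project: maps a full belief to its nearest grid index
    recover: maps b~ back to a full belief b (E-PCA back-projection)
    T_s[a]:  (n x n) state transition matrix, T_s[a][s, s'] = P(s'|s,a)
    O[z]:    (n,) observation likelihoods P(z | s)"""
    K = len(grid)
    T = np.zeros((len(actions), K, K))
    for i, b_t in enumerate(grid):
        b = recover(b_t)                      # step 2
        for a in actions:
            b_pred = T_s[a].T @ b             # step 3: action update
            for z in obs:
                p_z = O[z] @ b_pred           # probability of observation
                if p_z == 0:
                    continue
                b_next = O[z] * b_pred / p_z  # step 4: Bayes update
                j = project(b_next)           # step 5: back onto the grid
                T[a, i, j] += p_z             # step 6
    return T
```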

21 Robot Navigation Example [Figure: true (hidden) robot position, goal position, goal state, initial distribution]

22 Robot Navigation Example [Figure: true robot position, goal position]

23 Policy Comparison [Plot: average distance to goal vs. distance in metres; E-PCA with 6 bases]

24 People Finding

25 People Finding as a POMDP Robot position fully observable; position of person unknown. [Figure: robot position, true person position]

26 Finding and Tracking People [Figure: robot position, true person position]

27 People Finding as a POMDP Factored belief space: 2 dimensions for the fully-observable robot position, 6 dimensions for the distribution over person positions. A regular grid gives ≈ 10^16 states.

28 Variable Resolution Non-regular grid using sampled beliefs b̃₁, ..., b̃₅, with transitions such as T(b̃₁, a₁, b̃₂) and T(b̃₁, a₂, b̃₅). Compute model parameters using nearest-neighbour.
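
A minimal nearest-neighbour lookup consistent with this; the Euclidean metric in the low-dimensional space is an assumption, and other distances would also fit the slide:

```python
import numpy as np

def nearest_grid_belief(b_tilde, grid):
    """Snap a propagated belief onto the non-regular grid of sampled
    beliefs: return the index of the nearest grid belief, measured
    by Euclidean distance in the low-dimensional E-PCA space.

    grid: (K, m) array of grid beliefs; b_tilde: (m,) query belief."""
    d = np.linalg.norm(grid - b_tilde, axis=1)
    return int(np.argmin(d))
```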

29 Refining the Grid Sample beliefs b̃′ according to the current policy; construct a new model; keep the new belief if V(b̃′₁) > V(b̃₁).
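
A high-level sketch of one refinement round, with simulate_policy, build_model, and solve all hypothetical callables standing in for machinery on the earlier slides (solve could be the value_iteration sketch from slide 18):

```python
def refine_grid(grid, simulate_policy, build_model, solve):
    """Grow the belief grid with beliefs visited under the policy.

    simulate_policy yields (old_index, new_belief) pairs: beliefs
    b~' encountered near grid belief b~ while executing the policy."""
    V, policy = solve(*build_model(grid))
    for b_old_idx, b_new in simulate_policy(policy):
        candidate = grid + [b_new]               # tentatively add b~'
        V_new, pol_new = solve(*build_model(candidate))
        # Keep the sampled belief only if it improves the value at
        # the belief it refines: V(b~') > V(b~).
        if V_new[len(grid)] > V[b_old_idx]:
            grid, V, policy = candidate, V_new, pol_new
    return grid
```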

30 The Optimal Policy [Figures: original distribution; reconstruction using E-PCA with 6 bases; robot position; true person position]

31 E-PCA Policy Comparison [Plot: average number of actions (time) to find person; compared: E-PCA (72 states), refined E-PCA (260 states), fully observable MDP]

32 Nick’s Thesis Contributions Good policies for real-world POMDPs can be found by planning over a low-dimensional representation of the belief space, using E-PCA. POMDPs can scale to bigger, more complicated real-world problems. POMDPs can be used on real, deployed robots.

