Presentation on theme: "Based on slides by Nicholas Roy, MIT Finding Approximate POMDP Solutions through Belief Compression."— Presentation transcript:

1 Finding Approximate POMDP Solutions through Belief Compression (based on slides by Nicholas Roy, MIT)

2 Reliable Navigation Conventional trajectories may not be robust to localisation error. [Figure: estimated robot position, robot position distribution, true robot position, goal position]

3 Perception and Control [Diagram: world state → perception → control algorithms]

4 Perception and Control Two extremes: assume full observability, so control acts on the most likely state argmax P(x) (brittle), or plan exactly over the POMDP, so control acts on the full distribution P(x) (intractable). [Diagram: world state → probabilistic perception model P(x) → control]

5 Perception and Control Assuming full observability is brittle; exact POMDP planning is intractable. Middle ground: compress P(x) before planning. [Diagram: world state → probabilistic perception model P(x) → compressed P(x) → control]

6 Main Insight Good policies for real-world POMDPs can be found by planning over low-dimensional representations of the belief space. [Diagram: world state → probabilistic perception model P(x) → low-dimensional P(x) → control]

7 Belief Space Structure The controller may be globally uncertain... but not usually.

8 Coastal Navigation Represent beliefs using a compact statistic (in the coastal navigation work, the state estimate augmented with the belief's entropy), then discretise into a low-dimensional belief-space MDP.
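
A minimal sketch of such a compact belief statistic, assuming NumPy and a discrete belief vector; the function name belief_summary and the mean-plus-entropy choice are illustrative, not the thesis code:

```python
import numpy as np

def belief_summary(belief, states):
    """Summarise a discrete belief as (mean state, entropy).

    belief: probability vector over states; states: array of state
    coordinates. A hypothetical low-dimensional statistic in the
    spirit of coastal navigation."""
    mean = belief @ states              # expected state coordinate(s)
    p = belief[belief > 0]              # drop zeros to avoid log(0)
    entropy = -np.sum(p * np.log(p))    # Shannon entropy (nats)
    return mean, entropy
```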

9 Coastal Navigation

10 A Hard Navigation Problem [Plot: average distance to goal vs. distance in metres]

11 Dimensionality Reduction Principal Components Analysis: original beliefs ≈ weights × characteristic beliefs.

12 Principal Components Analysis Given belief b ∈ ℝⁿ, we want b̃ ∈ ℝᵐ, m ≪ n. [Plot: collection of beliefs drawn from a 200-state problem; probability of being in state vs. state]
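
For concreteness, conventional PCA compression of a matrix of sampled beliefs might look like the sketch below (NumPy; all names are illustrative). The rows of basis play the role of the characteristic beliefs, and each row of weights is a low-dimensional belief b̃:

```python
import numpy as np

def pca_compress(B, m):
    """Project belief vectors (rows of B, each an n-dim probability
    vector) onto the top-m principal components.

    Returns weights (low-dimensional beliefs), basis (characteristic
    beliefs), and the mean, so that B ~= mu + weights @ basis."""
    mu = B.mean(axis=0)
    U, S, Vt = np.linalg.svd(B - mu, full_matrices=False)
    basis = Vt[:m]                  # m characteristic beliefs (m x n)
    weights = (B - mu) @ basis.T    # low-dim coordinates (k x m)
    return weights, basis, mu
```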

13 Principal Components Analysis Given belief b ∈ ℝⁿ, we want b̃ ∈ ℝᵐ, m ≪ n. [Plot: m = 9 gives this representation for one sample distribution; probability of being in state vs. state]

14 Principal Components Analysis Many real-world POMDP distributions are characterised by large regions of low probability. Idea: create a fitting criterion that is (exponentially) stronger in low-probability regions (E-PCA).
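
A toy sketch of such a criterion: exponential-family PCA with an exponential link, fit by plain gradient descent on the loss Σ (exp(Z) − B ∘ Z) with Z = U V, whose gradient exp(Z) − B penalises misfit in low-probability regions far more heavily than squared error does. This is an assumption-laden simplification (the actual E-PCA implementation uses a more careful Newton-style optimisation), and all names are illustrative:

```python
import numpy as np

def epca(B, m, steps=2000, lr=1e-3, seed=0):
    """Fit factors U (k x m) and V (m x n) so that exp(U @ V)
    reconstructs the belief matrix B (k beliefs over n states),
    minimising the Poisson-style loss sum(exp(Z) - B * Z)."""
    rng = np.random.default_rng(seed)
    k, n = B.shape
    U = 0.01 * rng.standard_normal((k, m))
    V = 0.01 * rng.standard_normal((m, n))
    for _ in range(steps):
        Z = U @ V
        G = np.exp(Z) - B          # dLoss/dZ
        U -= lr * (G @ V.T)        # gradient step in U
        V -= lr * (U.T @ G)        # gradient step in V
    return U, V                    # reconstruction: np.exp(U @ V)
```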

15 Example E-PCA [Plots: reconstructions using 1, 2, 3, and 4 bases; probability of being in state vs. state]

16 Example Reduction

17 Finding Dimensionality E-PCA will indicate the appropriate number of bases, depending on the beliefs encountered.
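
One way this could be operationalised, reusing the epca sketch above; the tolerance and the mean-L1 error measure are assumptions, not from the slides:

```python
import numpy as np  # assumes the epca sketch from slide 14 is in scope

def choose_num_bases(B, max_m, tol=1e-2):
    """Return the smallest number of E-PCA bases whose average
    reconstruction error falls below a (hypothetical) tolerance."""
    for m in range(1, max_m + 1):
        U, V = epca(B, m)
        R = np.exp(U @ V)                       # reconstructed beliefs
        err = np.abs(B - R).sum(axis=1).mean()  # mean L1 error per belief
        if err < tol:
            return m
    return max_m
```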

18 Planning Original POMDP (states s₁, s₂, s₃) → E-PCA → low-dimensional belief space B̃ → discretise → discrete belief-space MDP.
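
Once the discrete belief-space MDP is built, it can be solved by standard value iteration. A generic sketch, assuming integer-indexed actions and grid beliefs (nothing here is specific to the thesis implementation):

```python
import numpy as np

def value_iteration(T, R, gamma=0.95, eps=1e-6):
    """Solve the discrete belief-space MDP.

    T: (A, K, K) transition probabilities between grid beliefs,
    R: (K,) expected reward at each grid belief (both built as on
    the following slides). Returns values and a greedy policy."""
    A, K, _ = T.shape
    V = np.zeros(K)
    while True:
        Q = R[None, :] + gamma * (T @ V)   # (A, K) action values
        V_new = Q.max(axis=0)
        if np.abs(V_new - V).max() < eps:
            return V_new, Q.argmax(axis=0)
        V = V_new
```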

19 Model Parameters Reward function R(b̃): back-project the low-dimensional belief b̃ to the high-dimensional belief b, then compute the expected reward from the belief: R(b̃) = Σₛ b(s) R(s). [Diagram: states s₁, s₂, s₃ with p(s)]
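
A sketch of this reward computation, assuming the E-PCA back-projection b = exp(b̃ V) followed by normalisation; V_basis and R_states are illustrative names:

```python
import numpy as np

def belief_reward(b_tilde, V_basis, R_states):
    """Expected reward of a low-dimensional belief.

    Back-project b_tilde (m,) through the E-PCA basis V_basis (m x n)
    to a full belief, normalise it to a distribution, then take the
    expectation of the per-state reward R_states (n,)."""
    b = np.exp(b_tilde @ V_basis)   # back-project to full dimension
    b /= b.sum()                    # normalise to a distribution
    return b @ R_states             # R(b) = sum_s b(s) R(s)
```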

20 Model Parameters Transition function (low dimension ↔ full dimension):
1. For each belief b̃ᵢ and action a:
2. Recover the full belief bᵢ.
3. Propagate according to the action, giving bⱼ.
4. Propagate according to the observation, giving bⱼ.
5. Recover b̃ⱼ.
6. Set T(b̃ᵢ, a, b̃ⱼ) to the probability of the observation.
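
The six steps might be tabulated as in the sketch below, assuming integer-indexed actions and observations; project combines step 5 with the nearest-neighbour lookup of slide 28, and every name is illustrative:

```python
import numpy as np

def build_transitions(grid, project, recover, T_s, O, actions, obs):
    """Tabulate T(b~_i, a, b~_j) over the grid of beliefs.

    grid:    list of low-dimensional beliefs b~_i
    project: maps a full belief to its nearest grid index
    recover: maps b~ back to a full belief b (E-PCA back-projection)
    T_s[a]:  (n x n) state transition matrix, T_s[a][s, s'] = P(s'|s,a)
    O[z]:    (n,) observation likelihoods P(z | s)"""
    K = len(grid)
    T = np.zeros((len(actions), K, K))
    for i, b_t in enumerate(grid):
        b = recover(b_t)                      # step 2
        for a in actions:
            b_pred = T_s[a].T @ b             # step 3: action update
            for z in obs:
                p_z = O[z] @ b_pred           # probability of observation
                if p_z == 0:
                    continue
                b_next = O[z] * b_pred / p_z  # step 4: Bayes update
                j = project(b_next)           # step 5: back onto the grid
                T[a, i, j] += p_z             # step 6
    return T
```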

21 Robot Navigation Example [Figure: true (hidden) robot position, goal position, goal state, initial distribution]

22 Robot Navigation Example [Figure: true robot position, goal position]

23 Policy Comparison [Plot: average distance to goal vs. distance in metres; E-PCA with 6 bases]

24 People Finding

25 People Finding as a POMDP Robot position fully observable; position of person unknown. [Figure: robot position, true person position]

26 Finding and Tracking People [Figure: robot position, true person position]

27 People Finding as a POMDP Factored belief space: 2 dimensions for the fully-observable robot position, 6 dimensions for the distribution over person positions. A regular grid gives ≈ 10^16 states.

28 Variable Resolution Non-regular grid using sampled beliefs b̃₁, ..., b̃₅, with transitions such as T(b̃₁, a₁, b̃₂) and T(b̃₁, a₂, b̃₅). Compute model parameters using nearest-neighbour.
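
A minimal nearest-neighbour lookup consistent with this; the Euclidean metric in the low-dimensional space is an assumption, and other distances would also fit the slide:

```python
import numpy as np

def nearest_grid_belief(b_tilde, grid):
    """Snap a propagated belief onto the non-regular grid of sampled
    beliefs: return the index of the nearest grid belief, measured
    by Euclidean distance in the low-dimensional E-PCA space.

    grid: (K, m) array of grid beliefs; b_tilde: (m,) query belief."""
    d = np.linalg.norm(grid - b_tilde, axis=1)
    return int(np.argmin(d))
```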

29 Refining the Grid Sample beliefs b̃′ according to the current policy; construct a new model; keep the new belief if V(b̃′₁) > V(b̃₁).
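
A high-level sketch of one refinement round, with simulate_policy, build_model, and solve all hypothetical callables standing in for machinery on the earlier slides (solve could be the value_iteration sketch from slide 18):

```python
def refine_grid(grid, simulate_policy, build_model, solve):
    """Grow the belief grid with beliefs visited under the policy.

    simulate_policy yields (old_index, new_belief) pairs: beliefs
    b~' encountered near grid belief b~ while executing the policy."""
    V, policy = solve(*build_model(grid))
    for b_old_idx, b_new in simulate_policy(policy):
        candidate = grid + [b_new]               # tentatively add b~'
        V_new, pol_new = solve(*build_model(candidate))
        # Keep the sampled belief only if it improves the value at
        # the belief it refines: V(b~') > V(b~).
        if V_new[len(grid)] > V[b_old_idx]:
            grid, V, policy = candidate, V_new, pol_new
    return grid
```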

30 The Optimal Policy [Figures: original distribution; reconstruction using E-PCA with 6 bases; robot position; true person position]

31 E-PCA Policy Comparison [Plot: average number of actions (time) to find person; compared: E-PCA (72 states), refined E-PCA (260 states), fully observable MDP]

32 Nick’s Thesis Contributions Good policies for real-world POMDPs can be found by planning over a low-dimensional representation of the belief space, using E-PCA. POMDPs can scale to bigger, more complicated real-world problems. POMDPs can be used on real, deployed robots.

