1
Heuristic Search Value Iteration for POMDPs
Presenter: Hui Li
January 12, 2007
2
Outline
Value Approximation
Heuristic Search
Results
Conclusions
3
Value Approximation in HSVI
The optimal value function $V_n(b)$ of a POMDP for a horizon of length n is piecewise linear and convex:
$V_n(b) = \max_k \alpha_n^k \cdot b$
where $\alpha_n^k$ is the gradient vector of $V_n(b)$ in the k-th polyhedral belief region.
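A piecewise linear and convex function of this kind can be evaluated as a maximum over dot products. The sketch below is only an illustration: the function name `value` and the three two-state vectors are hypothetical and do not come from the slides.

```python
def value(alphas, b):
    """Evaluate V(b) = max_k alpha_k . b for a finite set of gradient vectors."""
    return max(sum(a_s * b_s for a_s, b_s in zip(alpha, b)) for alpha in alphas)


# Hypothetical two-state example with three gradient vectors.
alphas = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]

print(value(alphas, (0.5, 0.5)))  # the flat vector (0.6, 0.6) wins here
print(value(alphas, (0.9, 0.1)))  # the vector (1.0, 0.0) wins here
```

Each vector is the best one in some region of the belief simplex, which is what makes the maximum piecewise linear and convex.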
4
Value Approximation in HSVI
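[Figure: the upper bound $\bar{V}(b)$, the exact value function $V^*(b)$, and the lower bound $\underline{V}(b)$ plotted over the belief b, with beliefs b1 and b2 marked.]
$\bar{V}(b)$ is the upper bound; $V^*(b)$ is the exact value function; $\underline{V}(b)$ is the lower bound.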
5
Value Approximation in HSVI
[Figure: $V^*(b)$ and a bound $V(b)$ plotted over the belief b, with beliefs b1 and b2 marked.]
Local update at b.
6
Value Approximation in HSVI
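Vector-set representation for the lower bound: $\underline{V}(b) = \max_{\alpha \in \Gamma} \alpha \cdot b$, where $\Gamma$ is a finite set of vectors.
Initialization
Updating: a local update at b adds one vector to $\Gamma$.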
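One common way to update a vector-set lower bound locally at a belief b is the standard point-based backup, sketched below. The slide's own update equation did not survive extraction, so treat this as an assumption about its form. The model layout (`T[a][s][s2]`, `O[a][s2][o]`, `R[a][s]`), the tiny two-state model, and the constant initial vector are all hypothetical.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def backup(gamma_set, b, T, O, R, discount):
    """Point-based backup: return the new vector that is best at belief b.

    T[a][s][s2] is the transition probability, O[a][s2][o] the observation
    probability, and R[a][s] the immediate reward (assumed layout).
    """
    n_states = len(b)
    n_obs = len(O[0][0])
    best_vec, best_val = None, float("-inf")
    for a in range(len(T)):
        beta = list(R[a])
        for o in range(n_obs):
            # Project every vector in the set back through (a, o).
            candidates = [
                [sum(T[a][s][s2] * O[a][s2][o] * alpha[s2] for s2 in range(n_states))
                 for s in range(n_states)]
                for alpha in gamma_set
            ]
            # Keep the projected vector with the highest value at b.
            g = max(candidates, key=lambda vec: dot(vec, b))
            beta = [beta[s] + discount * g[s] for s in range(n_states)]
        val = dot(beta, b)
        if val > best_val:
            best_vec, best_val = beta, val
    return best_vec


# Hypothetical model: 2 states, 2 actions, 2 observations.
T = [[[0.9, 0.1], [0.1, 0.9]], [[0.5, 0.5], [0.5, 0.5]]]
O = [[[0.8, 0.2], [0.2, 0.8]], [[0.8, 0.2], [0.2, 0.8]]]
R = [[1.0, 0.0], [0.2, 0.2]]
discount = 0.9

# Assumed initial lower bound: one constant vector, max_a min_s R(s, a) / (1 - discount).
r_low = max(min(r) for r in R) / (1 - discount)
gamma_set = [[r_low, r_low]]

b = (0.5, 0.5)
new_vec = backup(gamma_set, b, T, O, R, discount)
gamma_set.append(new_vec)
print(new_vec)  # approximately [2.8, 1.8]
```

With this initial vector the backed-up vector is at least as high as the old bound at b, so adding it to the set can only raise the lower bound.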
7
Value Approximation in HSVI
Point-set representation for the upper bound $\bar{V}(b)$: the upper bound is the convex hull formed by a finite set of belief/value points.
Initialization: use the MDP solution as the initial value.
Updating: add a new belief/value point at b. Evaluating the bound at a successor belief b' uses $\bar{V}(b')$, the projection of b' onto the convex hull, which can be computed by solving a linear program.
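For a two-state problem the projection onto the convex hull does not need a general LP solver, because the optimum of the linear program puts weight on at most two points that bracket the query belief. The sketch below covers only that special case. The scalar belief p = P(state 0), the function name `upper_bound`, and the point values are hypothetical.

```python
def upper_bound(points, p):
    """Evaluate the lower envelope of the convex hull of (p_i, v_i) points at p.

    Two-state case only: a belief is the scalar p = P(state 0). The point set
    is assumed to contain the corner beliefs p = 0 and p = 1.
    """
    best = float("inf")
    for p_lo, v_lo in points:
        for p_hi, v_hi in points:
            if p_lo <= p <= p_hi:
                if p_hi == p_lo:
                    cand = min(v_lo, v_hi)
                else:
                    w = (p - p_lo) / (p_hi - p_lo)
                    cand = (1 - w) * v_lo + w * v_hi  # linear interpolation
                best = min(best, cand)
    return best


# Corner values stand in for an MDP solution (hypothetical numbers).
points = [(0.0, 10.0), (1.0, 8.0)]
print(upper_bound(points, 0.5))   # interpolates between the two corners

# A local update adds one belief/value point and tightens the bound nearby.
points.append((0.5, 7.0))
print(upper_bound(points, 0.25))  # now interpolates between p = 0 and p = 0.5
```

With more than two states the same minimization over convex combinations becomes a genuine linear program.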
8
Value Approximation in HSVI
It can be proved that the lower bound $\underline{V}(b)$ and the upper bound $\bar{V}(b)$ are uniformly improvable and converge to the true value function $V^*(b)$.
[Figure: successive bounds $V_0(b)$, $V_1(b)$, ..., $V_n(b)$, with the upper bound and the lower bound labeled.]
9
Heuristic Search in HSVI
Adding one belief point at each update iteration
10
Heuristic Search in HSVI
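Interval function: $\hat{V}(b) = [\underline{V}(b), \bar{V}(b)]$
Width of the interval function: $\mathrm{width}(\hat{V}(b)) = \bar{V}(b) - \underline{V}(b)$, which is the uncertainty at b.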
11
Heuristic Search in HSVI
How to select the next belief point b
Selection of the action a*: convergence can be guaranteed only by choosing the action with the greatest upper bound, $a^* = \arg\max_a Q^{\bar{V}}(b, a)$.
Selection of the observation o*: select the o* that maximizes the uncertainty at the successor belief, weighted by the probability of the observation.
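The two selection rules can be sketched as follows, assuming the upper-bound Q-values, the observation probabilities, and the bounds at each successor belief have already been computed. All names and numbers are hypothetical. The sketch weights the raw interval width, as the slide describes; the published algorithm, as far as I recall, subtracts a depth-dependent tolerance from the width first.

```python
def width(lower, upper):
    """Uncertainty at a belief: upper bound minus lower bound."""
    return upper - lower


def select_action(q_upper):
    """Pick the action with the greatest upper bound on Q(b, a)."""
    return max(q_upper, key=q_upper.get)


def select_observation(obs_prob, succ_bounds):
    """Pick the observation with the largest probability-weighted width.

    obs_prob[o] is P(o | b, a*); succ_bounds[o] is the (lower, upper) pair
    at the successor belief reached after observing o.
    """
    return max(obs_prob, key=lambda o: obs_prob[o] * width(*succ_bounds[o]))


# Hypothetical numbers for one belief b.
q_upper = {"listen": 4.2, "open-left": 3.1}
a_star = select_action(q_upper)

obs_prob = {"hear-left": 0.7, "hear-right": 0.3}
succ_bounds = {"hear-left": (2.0, 3.0), "hear-right": (1.0, 4.0)}
o_star = select_observation(obs_prob, succ_bounds)

print(a_star, o_star)
```

In this example the less likely observation is chosen because its successor belief is much more uncertain (0.3 × 3.0 exceeds 0.7 × 1.0).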
12
Results of HSVI on Benchmark Problems
13
Results of HSVI on Benchmark Problems
Comparison between PBVI and HSVI
14
Results of HSVI on Benchmark Problems
15
Conclusions
HSVI uses an upper bound and a lower bound to approximate the value function.
The heuristic search for the next belief point gives HSVI faster convergence.