Approximate Dynamic Programming Methods for Resource Constrained Sensor Management John W. Fisher III, Jason L. Williams and Alan S. Willsky MIT CSAIL.


1 Approximate Dynamic Programming Methods for Resource Constrained Sensor Management John W. Fisher III, Jason L. Williams and Alan S. Willsky MIT CSAIL & LIDS

2 Outline
- Motivation & assumptions
- Constrained Markov Decision Process formulation
  - Communication constraint
  - Entropy constraint
- Approximations
  - Linearized Gaussian
  - Greedy sensor subset selection
  - n-scan pruning
- Simulation results

3 Problem statement
Object tracking using a network of sensors:
- Sensors provide localization in close vicinity
- Sensors can communicate locally, at a cost that grows with range
- Energy is a limited resource
  - Consumed by communication, sensing and computation
  - Communication is orders of magnitude more expensive than the other tasks
The sensor management algorithm must determine
- Which sensors to activate at each time
- Where the probabilistic model should be stored at each time
while trading off the competing objectives of estimation performance and communication cost.

5 Heuristic solution
At each time step, activate the sensor with the most informative measurement (e.g., minimum expected posterior entropy).
- Must transmit the probabilistic model of the object state every time the most informative sensor changes
- Could benefit from using more measurements
- Estimation quality vs. energy cost trade-off

6 Notation
- Time index: k
- Object state / location: x_k / L x_k
- Probabilistic model: X_k = p(x_k | z_{0:k-1})
- Leader node: l_k
- Position of sensor s: l^s
- Activated sensors: S_k ⊆ S
- Measurement of sensor s: z_k^s
- History of incorporated measurements: z_{0:k-1}

7 Formulation
We formulate the problem as a constrained Markov Decision Process:
- Maximize estimation performance subject to a constraint on communication cost, or
- Minimize communication cost subject to a constraint on estimation performance
The state is the PDF (X_k = p(x_k | z_{0:k-1})) together with the previous leader node (l_{k-1}).
The dynamic programming equation for an N-step rolling horizon is solved subject to
E{ G(X_k, l_{k-1}, u_{k:k+N-1}) | X_k, l_{k-1} } ≤ 0
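The DP recursion itself appears only as an image in the original slides; a plausible reconstruction from the stated quantities (state (X_k, l_{k-1}), controls u_i, a per-stage cost g, horizon N) is the standard finite-horizon form below. The symbols g and J are assumptions, not taken from the slide.

```latex
% N-step rolling-horizon DP recursion (reconstruction; g, J are assumed names)
J_i(X_i, l_{i-1}) = \min_{u_i} \;
  \mathbb{E}\!\left\{ g(X_i, l_{i-1}, u_i)
  + J_{i+1}(X_{i+1}, l_i) \,\middle|\, X_i, l_{i-1} \right\},
\qquad i = k, \dots, k+N-1,
```

with terminal condition J_{k+N} ≡ 0, minimized subject to the expected constraint E{G(·)} ≤ 0 above.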

8 Lagrangian relaxation
We address the constraint using a Lagrangian relaxation.
Strong duality does not hold, since the primal problem is discrete.
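The relaxed objective is not visible in the transcript; in standard form it augments the expected cost with the constraint function G weighted by a multiplier λ ≥ 0 (symbols g, G as in the formulation slide; the exact grouping is an assumption):

```latex
% Lagrangian relaxation of the constrained DP (standard form; names assumed)
J^{\lambda}(X_k, l_{k-1}) = \min_{u_{k:k+N-1}} \;
  \mathbb{E}\!\left\{ \sum_{i=k}^{k+N-1} g(X_i, l_{i-1}, u_i)
  + \lambda\, G(X_k, l_{k-1}, u_{k:k+N-1}) \,\middle|\, X_k, l_{k-1} \right\},
\qquad \lambda \ge 0 .
```

The dual problem maximizes J^λ over λ ≥ 0; because the control choices are discrete, a duality gap may remain, which is why strong duality fails.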

9 Communication-constrained formulation
The per-stage cost is chosen so that the system minimizes the joint expected conditional entropy of the object state over the planning horizon.
The constraint applies to the expected communication cost over the planning horizon.
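The two expressions referred to above are images in the original slides; a reconstruction consistent with the stated description follows, where C(u_i) denotes the communication cost of control u_i and C_max the budget (both names are assumptions):

```latex
% Communication-constrained objective (reconstruction; C, C_max assumed names)
\min_{u_{k:k+N-1}} \;
  \mathbb{E}\!\left\{ \sum_{i=k}^{k+N-1} H\!\left(x_{i+1} \mid z_{0:i+1}\right) \right\}
\quad \text{s.t.} \quad
\mathbb{E}\!\left\{ \sum_{i=k}^{k+N-1} C(u_i) \right\} \le C_{\max} .
```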

10 Communication-constrained formulation
Integrating the communication costs into the per-stage cost, the per-stage cost now contains both an information gain term and a communication cost term.

11 Information-constrained formulation
The per-stage cost is chosen so that the system minimizes the energy consumed over the planning horizon.
The constraint ensures that the joint entropy over the planning horizon is less than a given threshold.

12 Solving the dual problem
The dual optimization can be solved using a subgradient method. Iteratively:
- Evaluate the DP for a given value of λ
- Adjust the value of λ:
  - Increase it if the constraint is exceeded
  - Decrease it (to a minimum value of zero) if the constraint has slack
We employ a practical approximation suited to problems involving sequential replanning.

13 Evaluating the DP
The DP has an infinite state space, hence it cannot be evaluated exactly.
Conceptually, it could be evaluated through simulation; the complexity is O([N_s 2^{N_s}]^N N_p^N).

14 Branching due to measurement values
The first source of branching is that due to different measurement values.
If we approximate the measurement model as linearized Gaussian about a nominal trajectory, then the future costs depend only on the control choices, not on the measurement values.
Hence this source of branching can be eliminated entirely.
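A minimal sketch of why this works, assuming a linear(ized) Gaussian measurement model z = Hx + v with v ~ N(0, R): the posterior covariance, and hence the expected posterior entropy used in the cost, is a function of the prior, H and R alone. The matrices below are illustrative, not from the paper.

```python
import numpy as np

def posterior_entropy(P_prior, H, R):
    """Entropy of the Gaussian posterior after a linear measurement
    z = H x + v, v ~ N(0, R).  The posterior covariance -- and therefore
    the entropy -- depends only on the model matrices, never on the
    measured value z, so future costs can be computed without branching
    on measurement outcomes."""
    # Information-form update: P_post = (P_prior^-1 + H^T R^-1 H)^-1
    info = np.linalg.inv(P_prior) + H.T @ np.linalg.inv(R) @ H
    P_post = np.linalg.inv(info)
    # Differential entropy of N(m, P): 0.5 * log det(2*pi*e*P)
    return 0.5 * np.log(np.linalg.det(2 * np.pi * np.e * P_post))

# Hypothetical 2-D position prior and a sensor observing one coordinate
P = np.diag([4.0, 4.0])
Hm = np.array([[1.0, 0.0]])
Rm = np.array([[1.0]])
h = posterior_entropy(P, Hm, Rm)
```

The measurement value never enters `posterior_entropy`, which is exactly the property that lets the tree of measurement outcomes collapse to a single branch per control choice.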

15 Greedy sensor subset selection
For large sensor networks, complexity is high even for a single lookahead step, because of the number of sensor subsets under consideration.
We decompose each decision stage into a generalized stopping problem, where at each substage we can either
- Add an unselected sensor to the current selection, or
- Terminate with the current selection
The per-stage cost can be conveniently decomposed into a per-substage cost which directly trades off the cost of obtaining each measurement against the information it returns.
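A greedy pass through this stopping problem can be sketched as follows; the gain and cost functions, sensor names and numbers are all hypothetical, standing in for the per-substage information gain and measurement cost of the paper.

```python
def greedy_subset(sensors, info_gain, cost):
    """Greedy solution of the substage stopping problem: at each substage,
    either add the unselected sensor with the best net value (information
    gain minus measurement cost) or terminate with the current selection."""
    selected = []
    remaining = set(sensors)
    while remaining:
        best = max(remaining, key=lambda s: info_gain(s, selected) - cost(s))
        if info_gain(best, selected) - cost(best) <= 0:
            break  # no remaining sensor is worth its cost: terminate
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy illustration (hypothetical numbers): gains diminish as sensors are added
base = {'s1': 3.0, 's2': 2.0, 's3': 0.5}
gain = lambda s, sel: base[s] / (1 + len(sel))
picked = greedy_subset(['s1', 's2', 's3'], gain, lambda s: 1.0)
```

Each substage decision costs O(N_s), so a full stage is quadratic in the number of sensors rather than exponential in subset enumeration.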

16 Greedy sensor subset selection
(Figure: outer DP recursion over stages, with an inner DP sub-problem for sensor selection within each stage.)

17 Greedy sensor subset selection
(Figure: substage decision tree in which each node either adds one of the sensors s1, s2, s3 or terminates, T, with the current selection.)

18 n-Scan approximation
The greedy subset selection is embedded within an n-scan pruning method (similar to that used in the MHT), which addresses the growth due to the different choices of leader node sequence.
(Figure: tree of leader-node sequences over sensors s1, s2, s3, with greedy selection G applied at each branch before pruning.)
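The pruning step can be sketched as below. This is one plausible reading of the MHT-style rule, grouping candidate leader-node sequences by their last n choices and keeping the cheapest per group; the paper's exact grouping is not spelled out in the transcript, and all sequence names and costs are illustrative.

```python
def n_scan_prune(candidates, n):
    """n-scan pruning (MHT-style sketch): among candidate leader-node
    sequences, retain only the lowest-cost sequence within each group
    sharing the same last n leader choices.
    candidates: list of (sequence, cost) pairs."""
    best = {}
    for seq, cost in candidates:
        key = tuple(seq[-n:])  # group by the n most recent leader choices
        if key not in best or cost < best[key][1]:
            best[key] = (seq, cost)
    return sorted(best.values(), key=lambda sc: sc[1])

# Toy illustration: two sequences end at 's2'; only the cheaper survives
pruned = n_scan_prune(
    [(('s1', 's2'), 5.0), (('s3', 's2'), 3.0), (('s1', 's3'), 4.0)], n=1)
```

This keeps the hypothesis tree at a width bounded by the number of distinct recent leader choices instead of growing exponentially with the horizon.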

19 Sequential subgradient update
This method can be used to evaluate the dualized DP for a given value of the dual variable λ; the subgradient method, however, evaluates the DP for many different values of λ.
Updates between adjacent time steps will typically be small, since the planning horizons are highly overlapping (time k plans over [k, k+N-1]; time k+1 plans over [k+1, k+N]).
Thus, as an approximation, in each decision stage we plan with a single value of λ, and update it for the next time step using a single subgradient step.

20 Sequential subgradient update
We also need to constrain the value of the dual variable, to avoid anomalous behavior when the constrained problem is infeasible.
The update has two forms: additive and multiplicative.
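A single clipped update of either form can be sketched as follows; the step size, multiplicative factor and bounds are illustrative choices, not the paper's values.

```python
def update_dual(lam, excess, step=0.1, factor=1.5,
                lam_min=0.0, lam_max=100.0, multiplicative=False):
    """One subgradient step on the dual variable lam.  excess > 0 means the
    constraint is exceeded (raise lam); excess < 0 means it has slack
    (lower lam).  lam is clipped to [lam_min, lam_max] to avoid runaway
    values when the constrained problem is infeasible."""
    if multiplicative:
        lam = lam * factor if excess > 0 else lam / factor
    else:
        lam = lam + step * excess
    return min(max(lam, lam_min), lam_max)
```

The additive form takes a conventional subgradient step; the multiplicative form scales λ up or down by a fixed factor, which adapts faster when λ is far from the right scale.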

21 Surrogate constraints
The information-constrained formulation constrains the joint entropy of the object state within the planning horizon.
Commonly, a more meaningful constraint would be on the minimum entropy within the planning horizon, but this form cannot be integrated into the per-stage cost.
As an approximation, we use the joint entropy constraint in the DP, and the minimum entropy constraint in the subgradient update.

22 Example – observation / communication models
(Figure: observation model and communication cost; communication from node i = i_0 to j = i_4 via intermediate nodes i_1, i_2, i_3, as a function of range r.)

23 (Figure only in the original slides.)

24 Simulation results

25 Typical behavior of the subgradient adaptation in the communication-constrained case:
- Take many measurements when there is no sensor hand-off in the planning horizon
- Take no measurements when there is a sensor hand-off in the planning horizon
Longer planning horizons reduce this anomalous behavior.

26 Simulation results

27 Significant cost reduction
- Decomposed the cost structure into a form in which greedy approximations are able to capture the trade-off
- Complexity O([N_s 2^{N_s}]^N N_p^N) reduced to O(N N_s^3); for N_s = 20, N_p = 50, N = 10: 1.6 × 10^90 → 8 × 10^5
- Strong non-myopic planning for horizon lengths > 20 steps is possible
- Proposed an approximate subgradient update which can be used for adaptive sequential re-planning

28 Conclusions
We have presented a principled formulation for explicitly trading off estimation performance against energy consumed within an integrated objective:
- Twin formulations: optimizing estimation performance subject to an energy constraint, and optimizing energy consumption subject to an estimation performance constraint
- Decomposed the cost structure into a form in which greedy approximations are able to capture the trade-off
- Complexity O([N_s 2^{N_s}]^N N_p^N) reduced to O(N N_s^3); for N_s = 20, N_p = 50, N = 10: 1.6 × 10^90 → 8 × 10^5
- Strong non-myopic planning for horizon lengths > 20 steps is possible
- Proposed an approximate subgradient update which can be used for adaptive sequential re-planning

