Approximate Dynamic Programming Methods for Resource Constrained Sensor Management John W. Fisher III, Jason L. Williams and Alan S. Willsky MIT CSAIL.

Slides:



Advertisements
Similar presentations
A Hierarchical Multiple Target Tracking Algorithm for Sensor Networks Songhwai Oh and Shankar Sastry EECS, Berkeley Nest Retreat, Jan
Advertisements

1 University of Southern California Keep the Adversary Guessing: Agent Security by Policy Randomization Praveen Paruchuri University of Southern California.
Greedy Algorithms Be greedy! always make the choice that looks best at the moment. Local optimization. Not always yielding a globally optimal solution.
EE 553 Integer Programming
Near Optimal Rate Selection for Wireless Control Systems Abusayeed Saifullah, Chengjie Wu, Paras Tiwari, You Xu, Yong Fu, Chenyang Lu, Yixin Chen.
KAIST Adaptive Triangular Deployment Algorithm for Unattended Mobile Sensor Networks Suho Yang (September 4, 2008) Ming Ma, Yuanyuan Yang IEEE Transactions.
1 Graphical Models in Data Assimilation Problems Alexander Ihler UC Irvine Collaborators: Sergey Kirshner Andrew Robertson Padhraic Smyth.
1 Data Persistence in Large-scale Sensor Networks with Decentralized Fountain Codes Yunfeng Lin, Ben Liang, Baochun Li INFOCOM 2007.
Computational Methods for Management and Economics Carla Gomes
Non-myopic Informative Path Planning in Spatio-Temporal Models Alexandra Meliou Andreas Krause Carlos Guestrin Joe Hellerstein.
Cache Placement in Sensor Networks Under Update Cost Constraint Bin Tang, Samir Das and Himanshu Gupta Department of Computer Science Stony Brook University.
1 A Shifting Strategy for Dynamic Channel Assignment under Spatially Varying Demand Harish Rathi Advisors: Prof. Karen Daniels, Prof. Kavitha Chandra Center.
© Honglei Miao: Presentation in Ad-Hoc Network course (19) Minimal CDMA Recoding Strategies in Power-Controlled Ad-Hoc Wireless Networks Honglei.
1 A Second Stage Network Recourse Problem in Stochastic Airline Crew Scheduling Joyce W. Yen University of Michigan John R. Birge Northwestern University.
Exploiting Correlated Attributes in Acquisitional Query Processing Amol Deshpande University of Maryland Joint work with Carlos Sam
Efficient Methodologies for Reliability Based Design Optimization
EE 685 presentation Optimization Flow Control, I: Basic Algorithm and Convergence By Steven Low and David Lapsley Asynchronous Distributed Algorithm Proof.
UNC Chapel Hill Lin/Manocha/Foskey Optimization Problems In which a set of choices must be made in order to arrive at an optimal (min/max) solution, subject.
Scalable Information-Driven Sensor Querying and Routing for ad hoc Heterogeneous Sensor Networks Maurice Chu, Horst Haussecker and Feng Zhao Xerox Palo.
Extending Network Lifetime for Precision-Constrained Data Aggregation in Wireless Sensor Networks Xueyan Tang School of Computer Engineering Nanyang Technological.
Branch and Bound Algorithm for Solving Integer Linear Programming
Maximum Network lifetime in Wireless Sensor Networks with Adjustable Sensing Ranges Mihaela Cardei, Jie Wu, Mingming Lu, and Mohammad O. Pervaiz Department.
Optimal Adaptive Data Transmission over a Fading Channel with Deadline and Power Constraints Murtaza Zafer and Eytan Modiano Laboratory for Information.
Particle Filters++ TexPoint fonts used in EMF.
Chen Cai, Benjamin Heydecker Presentation for the 4th CREST Open Workshop Operation Research for Software Engineering Methods, London, 2010 Approximate.
Column Generation Approach for Operating Rooms Planning Mehdi LAMIRI, Xiaolan XIE and ZHANG Shuguang Industrial Engineering and Computer Sciences Division.
ECES 741: Stochastic Decision & Control Processes – Chapter 1: The DP Algorithm 1 Chapter 1: The DP Algorithm To do:  sequential decision-making  state.
A Framework for Distributed Model Predictive Control
An Overview of Dynamic Programming Seminar Series Joe Hartman ISE October 14, 2004.
Quasi-static Channel Assignment Algorithms for Wireless Communications Networks Frank Yeong-Sung Lin Department of Information Management National Taiwan.
1 Optimization Based Power Generation Scheduling Xiaohong Guan Tsinghua / Xian Jiaotong University.
1 1 Slide © 2000 South-Western College Publishing/ITP Slides Prepared by JOHN LOUCKS.
Maximum Network Lifetime in Wireless Sensor Networks with Adjustable Sensing Ranges Cardei, M.; Jie Wu; Mingming Lu; Pervaiz, M.O.; Wireless And Mobile.
Optimization Flow Control—I: Basic Algorithm and Convergence Present : Li-der.
EE 685 presentation Utility-Optimal Random-Access Control By Jang-Won Lee, Mung Chiang and A. Robert Calderbank.
Thinning Measurement Models and Questionnaire Design Ricardo Silva University College London CSML PI Meeting, January 2013.
Energy-Efficient Sensor Network Design Subject to Complete Coverage and Discrimination Constraints Frank Y. S. Lin, P. L. Chiu IM, NTU SECON 2005 Presenter:
Linear Programming Erasmus Mobility Program (24Apr2012) Pollack Mihály Engineering Faculty (PMMK) University of Pécs João Miranda
Optimal Fueling Strategies for Locomotive Fleets in Railroad Networks Seyed Mohammad Nourbakhsh Yanfeng Ouyang 1 William W. Hay Railroad Engineering Seminar.
Mohamed Hefeeda 1 School of Computing Science Simon Fraser University, Canada Efficient k-Coverage Algorithms for Wireless Sensor Networks Mohamed Hefeeda.
TSV-Constrained Micro- Channel Infrastructure Design for Cooling Stacked 3D-ICs Bing Shi and Ankur Srivastava, University of Maryland, College Park, MD,
EE 685 presentation Optimization Flow Control, I: Basic Algorithm and Convergence By Steven Low and David Lapsley.
Learning to Navigate Through Crowded Environments Peter Henry 1, Christian Vollmer 2, Brian Ferris 1, Dieter Fox 1 Tuesday, May 4, University of.
Outline The role of information What is information? Different types of information Controlling information.
The Restricted Matched Filter for Distributed Detection Charles Sestok and Alan Oppenheim MIT DARPA SensIT PI Meeting Jan. 16, 2002.
1  The Problem: Consider a two class task with ω 1, ω 2   LINEAR CLASSIFIERS.
SCALABLE INFORMATION-DRIVEN SENSOR QUERYING AND ROUTING FOR AD HOC HETEROGENEOUS SENSOR NETWORKS Paper By: Maurice Chu, Horst Haussecker, Feng Zhao Presented.
1  Problem: Consider a two class task with ω 1, ω 2   LINEAR CLASSIFIERS.
1 Inventory Control with Time-Varying Demand. 2  Week 1Introduction to Production Planning and Inventory Control  Week 2Inventory Control – Deterministic.
Optimization Problems In which a set of choices must be made in order to arrive at an optimal (min/max) solution, subject to some constraints. (There may.
Resource Allocation in Hospital Networks Based on Green Cognitive Radios 王冉茵
Efficient Resource Allocation for Wireless Multicast De-Nian Yang, Member, IEEE Ming-Syan Chen, Fellow, IEEE IEEE Transactions on Mobile Computing, April.
1 1 © 2003 Thomson  /South-Western Slide Slides Prepared by JOHN S. LOUCKS St. Edward’s University.
Bing Wang, Wei Wei, Hieu Dinh, Wei Zeng, Krishna R. Pattipati (Fellow IEEE) IEEE Transactions on Mobile Computing, March 2012.
Smart Sleeping Policies for Wireless Sensor Networks Venu Veeravalli ECE Department & Coordinated Science Lab University of Illinois at Urbana-Champaign.
On Mobile Sink Node for Target Tracking in Wireless Sensor Networks Thanh Hai Trinh and Hee Yong Youn Pervasive Computing and Communications Workshops(PerComW'07)
Introduction and Preliminaries D Nagesh Kumar, IISc Water Resources Planning and Management: M4L1 Dynamic Programming and Applications.
1 Power Efficient Monitoring Management in Sensor Networks A.Zelikovsky Georgia State joint work with P. BermanPennstate G. Calinescu Illinois IT C. Shah.
1 Chapter 6 Reformulation-Linearization Technique and Applications.
1 Chapter 5 Branch-and-bound Framework and Its Applications.
Discrete Optimization MA2827 Fondements de l’optimisation discrète Material from P. Van Hentenryck’s course.
6.5 Stochastic Prog. and Benders’ decomposition
Thinning Measurement Models and Questionnaire Design
Power Efficient Monitoring Management In Sensor Networks
Department of Information Management National Taiwan University
ADVISOR : Professor Yeong-Sung Lin STUDENT : Hung-Shi Wang
Maximum Lifetime of Sensor Networks with Adjustable Sensing Range
6.5 Stochastic Prog. and Benders’ decomposition
Constraints.
Presentation transcript:

Approximate Dynamic Programming Methods for Resource Constrained Sensor Management John W. Fisher III, Jason L. Williams and Alan S. Willsky MIT CSAIL & LIDS

Outline Motivation & assumptions Constrained Markov Decision Process formulation Communication constraint Entropy constraint Approximations Linearized Gaussian Greedy sensor subset selection n -Scan pruning Simulation results

Problem statement Object tracking using a network of sensors Sensors provide localization in close vicinity Sensors can communicate locally at cost / f ( range ) Energy is a limited resource Consumed by communication, sensing and computation Communication is orders of magnitude more expensive than other tasks Sensor management algorithm must determine Which sensors to activate at each time Where the probabilistic model should be stored at each time While trading off the competing objectives of estimation performance and communication cost

Heuristic solution At each time step activate the sensor with the most informative measurement (e.g. minimum expected posterior entropy)

Heuristic solution At each time step activate the sensor with the most informative measurement (e.g. minimum expected posterior entropy) Must transmit probabilistic model of object state every time most informative sensor changes Could benefit from using more measurements Estimation quality vs energy cost trade-off

Object state / location: x k / Lx k Probabilistic model: X k = p(x k | z 0:k{1 ) Notation Leader node: l k Position of sensor s : l s Time index: k Activated sensors: S k µ S Sensor s measurement: z k s History of incorporated measurements: z 0:k{1

Formulation We formulate as a constrained Markov Decision Process Maximize estimation performance subject to a constraint on communication cost Minimize communication cost subject to a constraint on estimation performance State is PDF ( X k = p(x k | z 0:k{1 ) ) and previous leader node ( l k{1 ) Dynamic programming equation for an N -step rolling horizon: such that E { G(X k, l k{1, u k:k+N{1 ) | X k, l k{1 }  0

Lagrangian relaxation We address the constraint using a Lagrangian relaxation: Strong duality does not hold since the primal problem is discrete

Communication-constrained formulation Cost-per-stage is such that the system minimizes joint expected conditional entropy of object state over planning horizon: Constraint applies to expected communication cost over planning horizon:

Communication-constrained formulation Integrating the communication costs into the per-stage cost: Per-stage cost now contains both information gain and communication cost

Information-constrained formulation Cost-per-stage is such that the system minimizes the energy consumed over the planning horizon: Constraint ensures that the joint entropy over the planning horizon is less than given threshold:

Solving the dual problem The dual optimization can be solved using a subgradient method Iteratively perform the following procedure Evaluate the DP for a given value of ¸ Adjust value of ¸ Increase if constraint is exceeded Decrease (to a minimum value of zero) if constraint has slack We employ a practical approximation which suits problems involving sequential replanning

Evaluating the DP DP has infinite state space, hence it cannot be evaluated exactly Conceptually, it could be evaluated through simulation Complexity is

Branching due to measurement values The first source of branching is that due to different values of measurement If we approximate the measurement model as linear Gaussian locally around a nominal trajectory, then the future costs are dependent only on the control choices, not on the measurement values Hence this source of branching can be eliminated entirely

Greedy sensor subset selection For large sensor networks complexity is high even for a single lookahead step due to consideration of sensor subsets We decompose each decision stage into a generalized stopping problem, where at each substage we can Add unselected sensor to the current selection Terminate with the current selection The per-stage cost can be conveniently decomposed into a per- substage cost which directly trades off the cost of obtaining each measurement against the information it returns:

Greedy sensor subset selection Outer DP recursion Inner DP sub-problem

Greedy sensor subset selection Ts1s1 s2s2 s3s3 Ts2s2 s3s3 Ts2s2 The per-stage cost can be conveniently decomposed into a per- substage cost which directly trades off the cost of obtaining each measurement against the information it returns:

n-Scan approximation The greedy subset selection is embedded within a n -scan pruning method (similar to the MHT) which addresses the growth due to different choices of leader node sequence s1s1 s2s2 s3s3 G G G GGG s1s1 s2s2 s3s3 s1s1 s2s2 s3s3 s1s1 s2s2 s3s3 s1s1 s2s2 s3s3

Sequential subgradient update This method can be used to evaluate the dualized DP for a given value of the dual variable ¸ Subgradient method evaluates DP for many different ¸ Updates between adjacent time intervals will typically be small since planning horizons are highly overlapping: Thus, as an approximation, in each decision stage we plan with a single value of ¸, and update it for the next time step using a single subgradient step Planning horizon time k Planning horizon time k+1 k k+1 k+N – 1 k+1 k+2 k+N

Sequential subgradient update We also need to constrain the value of the dual variable to avoid anomalous behavior when the constrained problem is infeasible Update has two forms: additive and multiplicative

Surrogate constraints The information constraint formulation allows you to constrain the joint entropy of object state within the planning horizon: Commonly, a more meaningful constraint would be the minimum entropy in the planning horizon: This form cannot be integrated into the per-stage cost An approximation: use the joint entropy constraint formulation in the DP, and the minimum entropy constraint in the subgradient update

Example – observation/communications Observation model:Communications cost: i = i 0 j = i 4 i1i1 i2i2 i3i3 r

Simulation results

Typical behavior of subgradient adaptation in communication-constrained case is: Take lots of measurements when there is no sensor hand-off in the planning horizon Take no measurements when there is a sensor hand-off in the planning horizon Longer planning horizons reduce this anomalous behavior

Simulation results

Significant Cost Reduction Decomposed cost structure into a form in which greedy approximations are able to capture the trade-off Complexity O([N s 2 N s ] N N p N ) reduced to O(NN s 3 ) N s = 20 / N p = 50 / N = 10 : 1.6   8  10 5 Strong non-myopic planning for horizon lengths > 20 steps possible Proposed an approximate subgradient update which can be used for adaptive sequential re- planning

Conclusions We have presented a principled formulation for explicitly trading off estimation performance and energy consumed within an integrated objective Twin formulations for optimizing estimation performance subject to energy constraint, and optimizing energy consumption subject to estimation performance constraint Decomposed cost structure into a form in which greedy approximations are able to capture the trade-off Complexity O([N s 2 N s ] N N p N ) reduced to O(NN s 3 ) N s = 20 / N p = 50 / N = 10 : 1.6   8  10 5 Strong non-myopic planning for horizon lengths > 20 steps possible Proposed an approximate subgradient update which can be used for adaptive sequential re-planning