
Rechargeable Sensor Activation under Temporally Correlated Events
Neeraj Jaggi, Assistant Professor, Dept. of Electrical Engineering and Computer Science, Wichita State University

Outline
- Sensor networks
- Rechargeable sensor system
  - Design of energy-efficient algorithms
  - The activation question – single-sensor scenario
- Temporally correlated event occurrence
  - Perfect state information: structure of the optimal policy
  - Imperfect state information: a practical algorithm with performance guarantees

Sensor Networks
- Sensor nodes
  - Tiny, low-cost devices
  - Prone to failures, hence deployed redundantly
  - Rechargeable sensor nodes
- Range of applications
- Important issues
  - Energy management
  - Quality of coverage

Rechargeable Sensor System
[System diagram: rechargeable sensors discharge while covering event phenomena (randomness, spatio-temporal correlations) and recharge from renewable energy; the activation policy is the control that trades recharge against discharge to maximize quality of coverage.]

Research Question
How should a sensor be activated ("switched on") dynamically so that the quality of coverage is maximized over time?
A sensor becomes ready. What should it do?
- Activate itself now: gain some utility in the short term
- Activate itself later: no utility in the short term, but activate when the system "needs it more"

Temporal Correlations
- Event process (e.g., forest fire)
  - On period (HOT), Off period (COLD)
  - Correlation probabilities p_on and p_off, with 0.5 < p_on, p_off < 1 (e.g., p_on = p_off = 0.8)
- Performance criterion – single sensor node
  - Fraction of events detected over time
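The event model above is a two-state Markov chain whose persistence probabilities p_on and p_off control burstiness. A minimal simulation sketch (the function name and defaults are illustrative, not from the talk):

```python
import random

def simulate_events(p_on, p_off, T, seed=0):
    """Two-state (On/Off) Markov event process.

    p_on  -- P(On in next slot | On now); p_off -- P(Off next | Off now).
    Returns a list of T states (1 = On, 0 = Off).
    """
    rng = random.Random(seed)
    state, trace = 0, []
    for _ in range(T):
        stay = p_on if state == 1 else p_off
        if rng.random() >= stay:   # leave the current state
            state = 1 - state
        trace.append(state)
    return trace

# With p_on = p_off = 0.8, On and Off periods last 1/(1-0.8) = 5 slots
# on average, so about half of all slots contain events.
```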

Sensor Energy Consumption Model
Discrete-time energy model (battery capacity K; the activation policy decides each slot):
- Operational cost δ1 per active slot (an active Off-period slot discharges δ1)
- Detection cost δ2 on top of δ1 when an event is detected (an active On-period slot discharges δ1 + δ2)
- No discharge while the sensor is not activated
- Recharge: amount c with probability q per slot (average recharge rate qc)
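One slot of these dynamics, written out as a sketch (the energy-sufficiency guard and the cap at K are my assumptions; the slide leaves them implicit):

```python
def energy_step(L, active, event_on, q, c, K, d1, d2, rng):
    """Advance the battery level by one slot and return (new_L, reward)."""
    reward = 0
    if active and L >= d1 + d2:            # assumed guard: afford a full slot
        L -= d1 + (d2 if event_on else 0)  # δ1 always, δ2 only on detection
        reward = 1 if event_on else 0      # one event detected per On slot
    if rng.random() < q:                   # recharge arrives with probability q
        L = min(K, L + c)                  # assumed cap at capacity K
    return L, reward
```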

System Observability
- Perfect state information: the sensor can always observe the state of the event process (even while inactive)
- Imperfect state information: an inactive sensor cannot observe the event process

Approach / Methodology
- Perfect state information
  - Formulate a Markov decision problem (MDP)
  - Derive the structure of the optimal policy
- Imperfect state information
  - Formulate a partially observable MDP (POMDP)
  - Transform the POMDP into an equivalent MDP (known techniques)
  - Derive the structure of the optimal policy
  - Design near-optimal practical algorithms

Perfect State Information
Markov decision process:
- State space = {(L, E); 0 ≤ L ≤ K, E ∈ {0, 1}}; L – current energy level, E – On/Off period
- Reward r – one if an event is detected, zero otherwise
- Action u ∈ {0, 1} (activate or not); transition probabilities p
Optimality equation (average-reward criterion):
- h* – relative values of the states
- λ* – optimal average reward
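The optimality equation itself was a figure on the slide; in the standard average-reward form it reads

$$\lambda^* + h^*(s) \;=\; \max_{u \in \{0,1\}} \Big[\, r(s,u) + \sum_{s'} p(s' \mid s, u)\, h^*(s') \,\Big],$$

where h* is the relative value (bias) of state s = (L, E) and λ* is the optimal long-run average reward.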

Perfect State Information (contd.)
- Approximate solution: a closed-form solution for h* does not seem to exist
- Value iteration
  - Yields an activation algorithm when L << K
  - Sensitive to system parameters when L ~ K
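A generic relative value iteration sketch for the average-reward criterion (the data layout and tolerances are illustrative; the talk does not specify the implementation):

```python
import numpy as np

def relative_value_iteration(P, R, iters=5000, tol=1e-9):
    """P[a] -- |S|x|S| transition matrix per action, R[a] -- expected reward per state.

    Returns (lambda_star, h): the optimal average reward and relative values.
    """
    n = P[0].shape[0]
    h = np.zeros(n)
    for _ in range(iters):
        # One-step lookahead over both actions (activate / do not activate)
        Th = np.max([R[a] + P[a] @ h for a in range(len(P))], axis=0)
        h_new = Th - Th[0]        # pin a reference state so h stays bounded
        if np.max(np.abs(h_new - h)) < tol:
            break
        h = h_new
    return Th[0], h               # gain at the reference state, bias vector
```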

Perfect State Information (contd.)
Optimal policy structure:
- Randomized algorithm; the activation probability P* is directly proportional to the recharge rate
- Energy balance: the average recharge rate equals the average discharge rate in steady state
Decision flow (reconstructed from the slide's flowchart): if energy is insufficient, do not activate; during an Off period, do not activate; during an On period, activate when a random draw is ≤ P*.
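A sketch of that rule, under my reading of the flowchart (the exact branch order is not recoverable from the transcript):

```python
def eb_policy(L, event_on, p_star, d1, d2, rng):
    """Energy-balancing randomized activation under perfect state info.

    p_star is chosen proportional to the recharge rate qc so that average
    recharge equals average discharge in steady state (energy balance).
    """
    if L < d1 + d2 or not event_on:   # insufficient energy, or Off period
        return 0
    return 1 if rng.random() <= p_star else 0
```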

Imperfect State Information
Partially observable Markov decision process (POMDP):
- State space and observation space
- Optimal actions depend on current and past observations (y) and on past actions (u)
Transformation to an equivalent MDP:
- State: information vector Z_t of length |X| (a probability distribution over the states)
- Z_{t+1} is recursively computable given Z_t, u_t, and y_{t+1}
- Z_t forms a completely observable MDP with equivalent rewards and actions
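The recursion is the standard Bayesian belief update (written here in generic POMDP notation, not copied from the talk):

$$Z_{t+1}(x') \;=\; \frac{P(y_{t+1} \mid x') \sum_{x} P(x' \mid x, u_t)\, Z_t(x)}{\sum_{x''} P(y_{t+1} \mid x'') \sum_{x} P(x'' \mid x, u_t)\, Z_t(x)}.$$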

Equivalent MDP Structure
- Active sensor – observation = (L, 1) or (L, 0)
  - The state equals the observation, so Z_t has only one non-zero component
- Inactive sensor – observation = (L, Φ)
  - Let E = state last observed, i = number of time slots inactive
  - Z_t has only two non-zero components
  - Let p_i = probability that the event process moved from state E to state 1 − E in i time slots
  - State = (L, E) with probability 1 − p_i; state = (L, 1 − E) with probability p_i
  - Hence Z_t is a function of (L, E, i)
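For a two-state Markov chain, p_i has the standard closed form (with second eigenvalue λ = p_on + p_off − 1 and stationary distribution π):

$$p_i \;=\; \pi_{1-E}\,\big(1 - (p_{on} + p_{off} - 1)^{\,i}\big), \qquad \pi_{off} = \frac{1 - p_{on}}{(1 - p_{on}) + (1 - p_{off})}, \quad \pi_{on} = 1 - \pi_{off}.$$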

Imperfect State Information (contd.)
Transformed MDP state space – (L, E, t):
- L – current energy level
- E – state of the event process last observed
- t – number of time slots spent in the inactive state
Optimal policy structure: wakeup threshold curves f_0 over (L, 0, t) and f_1 over (L, 1, t)
[δ1 = c = 1, δ2 = 2, correlation probabilities 0.6 and 0.9, q = 0.1]
- Last observed Off period – reluctant wakeup
- Last observed On period – aggressive wakeup
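The two non-zero belief components can be computed in closed form from (E, t) using the expression for p_i above; a small sketch:

```python
def prob_on_now(E, t, p_on, p_off):
    """Belief that the event process is On after t unobserved slots,
    given the last observed state E (closed form for the two-state chain)."""
    lam = p_on + p_off - 1
    pi_off = (1 - p_on) / ((1 - p_on) + (1 - p_off))
    pi_on = 1 - pi_off
    if E == 1:
        return pi_on + pi_off * lam ** t   # started in On
    return pi_on * (1 - lam ** t)          # started in Off
```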

Practical Algorithm
Correlation-dependent Wakeup (CW):
- Activate during On periods; deactivate during Off periods
- Sleep interval SI* derived using energy balance over a renewal interval
- ε-optimal, with ε ~ O(1/β), where β = δ2/δ1
[Timeline figure legend: A – active, I – inactive; Y – On period, N – Off period; SI – sleep duration; t1, t2 – renewal instants]
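The talk derives SI* analytically; a simulation-based stand-in for that energy-balance calculation (the function, its defaults, and the drift criterion are my assumptions):

```python
import random

def average_drift(SI, p_on, p_off, q, c, d1, d2, T=200_000, seed=1):
    """Long-run energy drift (recharge minus discharge) per slot of the CW
    policy that sleeps SI slots whenever it observes an Off period."""
    rng = random.Random(seed)
    state, sleep_left, gain = 0, 0, 0.0
    for _ in range(T):
        stay = p_on if state == 1 else p_off
        if rng.random() >= stay:           # event process changes state
            state = 1 - state
        if sleep_left > 0:                 # asleep: no discharge
            sleep_left -= 1
        else:                              # awake: pay δ1 (+ δ2 on detection)
            gain -= d1 + (d2 if state == 1 else 0)
            if state == 0:                 # observed Off -> sleep SI slots
                sleep_left = SI
        if rng.random() < q:
            gain += c
    return gain / T

# SI* is then roughly the smallest SI whose drift is non-negative, i.e. the
# point where average recharge matches average discharge (energy balance).
```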

Simulation Results
Energy-balancing sleep interval SI*:
- Correlation probabilities (0.6, 0.9): SI* = 7
- Correlation probabilities (0.7, 0.8): SI* = 18
[Common parameters: δ1 = c = 1, δ2 = 6, q = 0.5, K = 2400]

Contributions
- Structure of the optimal policy
- The energy-balancing (EB) policy is optimal under perfect state information
- The EB policy is near-optimal under imperfect state information
Coauthors: Prof. Koushik Kar, Rensselaer Polytechnic Institute; Prof. Ananth Krishnamurthy, Univ. of Wisconsin–Madison
- 5th International Symposium on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt), April 2007
- ACM/Kluwer Wireless Networks, 2008 (accepted)

Q & A
Thank you!

Policies – AW, CW
AW (Aggressive Wakeup) policy:
- Activate whenever L ≥ δ2 + δ1
- Ignores temporal correlations; optimal if events show no temporal correlation
CW (Correlation-dependent Wakeup) policies:
- Activate during On periods; deactivate during Off periods
- Upper bound (U*_CW)
- The event state is unobservable while inactive, so performance depends on the sleep duration
How long should the sensor sleep?
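The AW rule is one line; written out for contrast with CW (a trivial sketch):

```python
def aw_policy(L, d1, d2):
    """Aggressive Wakeup: activate whenever one full active slot is affordable."""
    return 1 if L >= d1 + d2 else 0
```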

MDP – State Transitions
State (L, 1), with L ≥ δ2 + δ1:
Action u = 1 (activate)
- Next state:
  - (L + c − δ1 − δ2, 1) with probability q · p_on
  - (L + c − δ1, 0) with probability q · (1 − p_on)
  - (L − δ1 − δ2, 1) with probability (1 − q) · p_on
  - (L − δ1, 0) with probability (1 − q) · (1 − p_on)
- Reward r = 1 with probability p_on; 0 otherwise
Action u = 0 (deactivate)
- Next state:
  - (L + c, 1) with probability q · p_on
  - (L + c, 0) with probability q · (1 − p_on)
  - (L, 1) with probability (1 − q) · p_on
  - (L, 0) with probability (1 − q) · (1 − p_on)
- Reward r = 0
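A sampler equivalent to this table, useful for checking a simulator against the analytical transitions (battery cap omitted, as on the slide; names are illustrative):

```python
def step_from_on(L, u, q, c, d1, d2, p_on, rng):
    """Sample (next_state, reward) from state (L, 1) under action u."""
    recharge = c if rng.random() < q else 0   # recharge arrives w.p. q
    stays_on = rng.random() < p_on            # event process stays On w.p. p_on
    if u == 1:                                # activate
        spend = d1 + (d2 if stays_on else 0)  # δ2 only if an event is detected
        reward = 1 if stays_on else 0
    else:                                     # remain inactive
        spend, reward = 0, 0
    return (L + recharge - spend, int(stays_on)), reward
```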