Network Utility Maximization over Partially Observable Markov Channels

Presentation transcript:

Network Utility Maximization over Partially Observable Markov Channels
Restless Multi-Arm Bandit
[Figure: three channels with unknown states: Channel State 1 = ?, Channel State 2 = ?, Channel State 3 = ?]

This work is from the following papers:*
Li, Neely, WiOpt 2010
Li, Neely, ArXiv 2010, submitted for conference
Neely, Asilomar 2010

Chih-Ping Li is graduating and is currently looking for post-doc positions!

*The above paper titles are given below, and are available at:
C. Li and M. J. Neely, “Exploiting Channel Memory for Multi-User Wireless Scheduling without Channel Measurement: Capacity Regions and Algorithms,” Proc. WiOpt
C. Li and M. J. Neely, “Network Utility Maximization over Partially Observable Markovian Channels,” arXiv: , Aug
M. J. Neely, “Dynamic Optimization and Learning for Renewal Systems,” Proc. Asilomar Conf. on Signals, Systems, and Computers, Nov

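Restless Multi-Arm Bandit with vector rewards
[Figure: three channels with unknown states $S_1(t) = ?$, $S_2(t) = ?$, $S_3(t) = ?$]
N-user wireless system.
Timeslots $t \in \{0, 1, 2, \ldots\}$.
Choose one channel for transmission every slot $t$.
Channels $S_i(t)$ are ON/OFF Markov; the current states $S_i(t)$ are unknown.
Process $S_i(t)$ for channel $i$: two-state ON/OFF Markov chain with transition probabilities $\varepsilon_i$ (ON to OFF) and $\delta_i$ (OFF to ON).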

Restless Multi-Arm Bandit with vector rewards
Suppose we serve channel $i$ on slot $t$:
If $S_i(t) = \text{ON}$ → ACK → reward vector $r(t) = (0, \ldots, 0, 1, 0, \ldots, 0)$, with the 1 in entry $i$.
If $S_i(t) = \text{OFF}$ → NACK → reward vector $r(t) = (0, \ldots, 0, 0, 0, \ldots, 0)$.

Let $\omega_i(t) = \Pr[S_i(t) = \text{ON}]$. If we serve channel $i$, we update:
\[ \omega_i(t+1) = \begin{cases} 1 - \varepsilon_i & \text{if we get ACK} \\ \delta_i & \text{if we get NACK} \end{cases} \]

Let $\omega_i(t) = \Pr[S_i(t) = \text{ON}]$. If we do not serve channel $i$, we update:
\[ \omega_i(t+1) = \omega_i(t)(1 - \varepsilon_i) + (1 - \omega_i(t))\,\delta_i \]
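The two belief updates (served and not served) can be combined into one small function. This is a minimal sketch; the function name `update_belief` and its argument names are chosen here for illustration and do not come from the slides.

```python
def update_belief(omega, eps, delta, served, ack=None):
    """Return omega_i(t+1) from omega_i(t) for one ON/OFF Markov channel.

    omega: current Pr[S_i(t) = ON]
    eps:   Pr[ON -> OFF], delta: Pr[OFF -> ON]
    served: whether channel i was served on slot t
    ack:   True for ACK, False for NACK (required when served)
    """
    if served:
        if ack is None:
            raise ValueError("ack must be True or False when the channel is served")
        # ACK reveals S_i(t) = ON, NACK reveals S_i(t) = OFF.
        return 1.0 - eps if ack else delta
    # Not served: push the old belief one step through the Markov chain.
    return omega * (1.0 - eps) + (1.0 - omega) * delta
```

For example, with `eps = 0.2` and `delta = 0.3`, an unserved channel with belief 0.5 moves to 0.55, drifting toward the steady-state value 0.6.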

We want to:
1) Characterize the capacity region $\Lambda$ of the system.
   $\Lambda$ = { all stabilizable input rate vectors $(\lambda_1, \ldots, \lambda_N)$ } = { all possible time average reward vectors }
2) Perform concave utility maximization over $\Lambda$.
   Maximize: $g(r_1, \ldots, r_N)$
   Subject to: $(r_1, \ldots, r_N) \in \Lambda$
[Figure: input rates $\lambda_1$, $\lambda_2$, $\lambda_3$]

What is known about such systems?
1) If $(S_1(t), \ldots, S_N(t))$ is known every slot:
   Capacity region is known [Tassiulas, Ephremides 1993].
   Greedy “Max-Weight” is optimal [Tassiulas, Ephremides 1993].
   The capacity region is the same, and Max-Weight works, for both iid vectors and time-correlated Markov vectors.
2) If $(S_1(t), \ldots, S_N(t))$ is unknown but iid over slots:
   Capacity region is known.
   Greedy Max-Weight decisions are optimal.
   [Gopalan, Caramanis, Shakkottai Allerton 2007] [Li, Neely CDC 2007, TMC 2010]
3) If $(S_1(t), \ldots, S_N(t))$ is unknown and time-correlated:
   Capacity region is unknown.
   Seems to be an intractable multi-dimensional Markov Decision Problem (MDP).
   Current decisions affect the future probability vectors $(\omega_1(t), \ldots, \omega_N(t))$.
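The greedy Max-Weight rule mentioned in cases 1 and 2 can be sketched as follows, assuming queue backlogs $Q_i(t)$ are available. The function name and the choice to pass a 0/1 vector for known states (or ON probabilities for unknown iid states) are illustrative assumptions, not details taken from the slides.

```python
def max_weight_choice(queues, on_prob):
    """Pick the channel with the largest weight Q_i * Pr[S_i = ON].

    queues:  current backlogs Q_i(t)
    on_prob: 0/1 values when states are known, or the ON probabilities
             when states are unknown but iid (assumed known here)
    Ties are broken in favor of the lowest index.
    """
    weights = [q * p for q, p in zip(queues, on_prob)]
    return max(range(len(weights)), key=weights.__getitem__)
```

With known states `[1, 1, 0]` and backlogs `[5, 3, 9]`, the rule serves channel 0: the longest queue is OFF, so the longest queue among ON channels wins.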

Our Contributions:
1) We construct an operational capacity region (inner bound).
2) We construct a novel frame-based technique for utility maximization over this region.

Assume channels are positively correlated: $\varepsilon_i + \delta_i \le 1$.
[Figure: two-state chain with transition probabilities $\varepsilon_i$, $\delta_i$; plot of $\omega_i(t)$ versus $t$, with levels $1 - \varepsilon_i$ and $\delta_i$]
After ACK → $\omega_i(t)$ > steady state $\Pr[S_i(t) = \text{ON}] = \delta_i / (\delta_i + \varepsilon_i)$
After NACK → $\omega_i(t)$ < steady state $\Pr[S_i(t) = \text{ON}] = \delta_i / (\delta_i + \varepsilon_i)$
Gives good intuition for scheduling decisions.
For the special case of channel symmetry ($\varepsilon_i = \varepsilon$, $\delta_i = \delta$ for all $i$), “round-robin” maximizes sum output rate. [Ahmad, Liu, Javidi, Zhao, Krishnamachari, Trans IT 2009]
How to use this intuition to construct a capacity region (for possibly asymmetric channels)?
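The ACK/NACK intuition can be made explicit by iterating the no-service update over $k$ consecutive idle slots. The short derivation below writes $\pi_i$ for the steady-state ON probability; that symbol is introduced here for convenience and assumes $\varepsilon_i + \delta_i > 0$.

```latex
\begin{align*}
\pi_i &= \frac{\delta_i}{\delta_i + \varepsilon_i}, \\
\omega_i(t+1) - \pi_i &= (1 - \varepsilon_i - \delta_i)\,\bigl(\omega_i(t) - \pi_i\bigr), \\
\omega_i(t+k) &= \pi_i + (1 - \varepsilon_i - \delta_i)^{k}\,\bigl(\omega_i(t) - \pi_i\bigr).
\end{align*}
```

Under positive correlation the factor $1 - \varepsilon_i - \delta_i$ lies in $[0, 1)$, so the belief never crosses $\pi_i$: it decays from $1 - \varepsilon_i$ after an ACK, or rises from $\delta_i$ after a NACK, monotonically toward the steady state.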

Inner Bound $\Lambda_{\text{int}}$ (“Operational Capacity Region”):
[Figure: $N$ queues with input rates $\lambda_1, \lambda_2, \ldots, \lambda_N$; variable length frame]
Every frame, randomly pick a subset and an ordering according to some probability distribution over the $\approx N!\,2^N$ choices.
$\Lambda_{\text{int}}$ = Convex hull of all randomized round-robin policies.
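One frame of a randomized round-robin policy might look like the sketch below. Two points are assumptions made for illustration rather than details from the slides: the subset and ordering are drawn uniformly (the slides allow any distribution), and within a frame each chosen channel is served until its first NACK before moving on, following the ACK/NACK intuition. The sketch also assumes every $\varepsilon_i > 0$ so that each service run ends.

```python
import random

def sample_frame_policy(n_channels, rng):
    """Draw a non-empty channel subset uniformly, then a uniform ordering."""
    subset = []
    while not subset:  # reject the empty subset
        subset = [i for i in range(n_channels) if rng.random() < 0.5]
    rng.shuffle(subset)
    return subset

def run_frame(order, states, eps, delta, rng):
    """Serve channels in `order`, staying on each until its first NACK.

    states: list of booleans (True = ON), updated in place each slot.
    Returns (frame_length_in_slots, acks_per_channel).
    """
    acks = [0] * len(states)
    slots = 0
    for i in order:
        while True:
            was_on = states[i]  # ACK if ON, NACK if OFF
            slots += 1
            # All channels evolve one step, served or not (restless bandit).
            for j in range(len(states)):
                p_on = 1.0 - eps[j] if states[j] else delta[j]
                states[j] = rng.random() < p_on
            if not was_on:
                break
            acks[i] += 1
    return slots, acks
```

Because every visited channel ends its run with exactly one NACK, the frame length always equals the total number of ACKs plus the number of channels in the ordering, which is why the frame length is variable.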

Inner Bound Properties:
Bound contains a huge number of policies.
Touches true capacity boundary as $N \to \infty$.
Even a good bound for $N = 2$:
[Figure: bound for $N = 2$]
Can obtain efficient algorithms for optimizing over this region!
Let’s see how…

New Lyapunov Drift Analysis Technique:
Lyapunov function: $L(t) = \sum_i Q_i(t)^2$
$T$-slot drift for frame $k$: $\Delta[k] = L(t[k] + T[k]) - L(t[k])$
[Figure: variable length frame from $t[k]$ to $t[k] + T[k]$]
New Drift-Plus-Penalty Ratio Method on each frame:
\[ \text{Minimize:} \quad \frac{E\{\, \Delta[k] + V \times \text{Penalty}[k] \mid Q(t[k]) \,\}}{E\{\, T[k] \mid Q(t[k]) \,\}} \]
Tassiulas, Ephremides 90, 92, 93 (queue stability)
Neely, Modiano 2003, 2005 (queue stability + utility optimization)
Li, Neely 2010 (queue stability + utility optimization for variable frames)
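The ratio rule can be illustrated with a toy selection step over a finite set of candidate frame policies. This is a sketch under strong assumptions: the candidate names and the triples of conditional expectations (drift, penalty, frame length) are hypothetical inputs supplied by the caller, and how to compute them is not shown here.

```python
def best_frame_policy(candidates, V):
    """Return the candidate minimizing (E[drift] + V * E[penalty]) / E[frame length].

    candidates: dict mapping a policy name to a tuple
                (exp_drift, exp_penalty, exp_frame_len), all conditioned
                on the queue vector at the start of the frame
    V:          non-negative weight trading queue drift against penalty
    """
    def ratio(stats):
        drift, penalty, frame_len = stats
        return (drift + V * penalty) / frame_len

    return min(candidates, key=lambda name: ratio(candidates[name]))
```

With candidates `{"a": (-4.0, 1.0, 2.0), "b": (-9.0, 2.0, 5.0)}`, a small weight `V = 1` selects "a" (ratio -1.5 versus -1.4), while `V = 10` selects "b" (ratio 2.2 versus 3.0). Dividing by the expected frame length is what keeps policies with long frames comparable to policies with short ones.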

Conclusions:
Multi-Armed Bandit Problem with Reward Vectors (complex MDP).
Operational Capacity Region = Convex Hull over Frame-Based Randomized Round-Robin Policies.
Stochastic Network Optimization via the Drift-Plus-Penalty Ratio method.

Quick Advertisement: New Book:
M. J. Neely, Stochastic Network Optimization with Application to Communication and Queueing Systems. Morgan & Claypool,
PDF also available from “Synthesis Lecture Series” (on digital library).
Link available on Mike Neely homepage.
Lyapunov Optimization theory (including renewal system problems).
Detailed Examples and Problem Set Questions.