Asymptotic Analysis for Large Scale Dynamic Stochastic Games Sachin Adlakha, Ramesh Johari, Gabriel Weintraub and Andrea Goldsmith DARPA ITMANET Meeting.

Presentation transcript:

Asymptotic Analysis for Large Scale Dynamic Stochastic Games Sachin Adlakha, Ramesh Johari, Gabriel Weintraub and Andrea Goldsmith DARPA ITMANET Meeting September 13-14, 2009.

Asymptotic Analysis for Large Scale Dynamic Stochastic Games (S. Adlakha, R. Johari, G. Weintraub, A. Goldsmith)
New paradigm for analyzing large scale competitive and coordination games
STATUS QUO: Many cognitive radio models do not account for the reaction of other devices to a single device's action. In prior work, we developed a general stochastic game model to tractably capture interactions of many devices.
NEW INSIGHTS: In principle, tracking the state of other devices is complex. We approximate the state of other devices via a mean field limit. [Figure labels: state of device i, state of other devices, action of device i]
ACHIEVEMENT DESCRIPTION
MAIN RESULT: Taxonomy of stochastic games. General stochastic games divide into:
– Competitive model: non-cooperative games; submodular payoff; existence results for OE; AME property.
– Coordination model: cooperative games; supermodular payoff structure; results for a special class of linear quadratic games.
HOW IT WORKS: Existence results for the competitive model are based on continuity arguments. The AME property for a competitive model is derived from the fact that opponents at higher states lead to lower payoff.
ASSUMPTIONS AND LIMITATIONS:
– Mean field requires all nodes to interact with each other, so it applies to dense networks only.
– The coordination model requires a different existence proof.
IMPACT: General framework for interaction of multiple devices. Further, our results:
– provide a common thread to analyze both competitive and coordination models;
– provide exogenous conditions for existence and AME for competitive models;
– provide results on a special class of coordination model: linear quadratic tracking games.
NEXT-PHASE GOALS: Provide existence and AME results for a general class of coordination games. Our main goal is to develop a related model that applies when a single node interacts with a small number of other nodes each period.

Modeling Interaction between Devices
Wireless spectrum sharing
– Nodes interact with each other.
– The environment for a single node comprises the active devices.
– Nodes operate in a reactive environment.
Markov Perfect Equilibrium (MPE)
– Standard solution concept for stochastic games.
– The action of each player depends on the state of everyone.
Problems:
1. MPE is hard to compute.
2. MPE requires excessive information exchange.
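To make the "hard to compute" point concrete, the sketch below counts how many inputs a policy must cover. It is an illustration only: the function names and the choice of 10 states per device are assumptions, not taken from the slides, and the counts are standard combinatorics rather than a result from the talk.

```python
from math import comb

def joint_state_count(num_states: int, num_players: int) -> int:
    """Joint state profiles when every player's individual state is tracked."""
    return num_states ** num_players

def anonymous_state_count(num_states: int, num_players: int) -> int:
    """Own state times the number of distinct state histograms of the others."""
    others = num_players - 1
    return num_states * comb(others + num_states - 1, num_states - 1)

if __name__ == "__main__":
    # Hypothetical setting: 10 states per device.
    for m in (5, 20, 50):
        print(m, joint_state_count(10, m), anonymous_state_count(10, m))
```

By comparison, a policy that reacts only to a player's own state, as in the mean field approach, needs one entry per own state.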

Oblivious Equilibrium (OE)
– Mean field equilibrium concept.
– Each device reacts to an average state of other players.
– Requires little information exchange.
– Easy to compute and implement.
Questions:
1. When do such policies exist?
2. How close is OE to MPE in terms of the payoff a device receives?
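The idea of reacting to an average state can be sketched as a fixed-point computation: fix a long-run state distribution f, find a policy that is optimal against f, compute the stationary distribution that policy induces, and repeat. The code below is a minimal sketch under loudly stated assumptions: the birth-death dynamics, the payoff function, the discount factor, and the damped iteration are all hypothetical choices made for illustration, not the authors' model or algorithm, and the iteration is a heuristic with no convergence guarantee.

```python
import numpy as np

N_STATES = 10                       # hypothetical finite state space 0..9
ACTIONS = np.array([0.0, 0.5, 1.0])  # hypothetical action grid
BETA = 0.9                          # hypothetical discount factor
STATES = np.arange(N_STATES)

def transition(a):
    """Hypothetical birth-death chain: up w.p. 0.1 + 0.4a, down w.p. 0.4."""
    p_up, p_down = 0.1 + 0.4 * a, 0.4
    P = np.zeros((N_STATES, N_STATES))
    for x in range(N_STATES):
        P[x, min(x + 1, N_STATES - 1)] += p_up
        P[x, max(x - 1, 0)] += p_down
        P[x, x] += 1.0 - p_up - p_down
    return P

P_BY_ACTION = [transition(a) for a in ACTIONS]

def payoff(x, a, f):
    """Hypothetical payoff: decreasing in the mean state of the others."""
    return np.log1p(x) / (1.0 + float(f @ STATES)) - 0.2 * a

def best_response(f, n_iter=300):
    """Value iteration for a single player facing the fixed distribution f."""
    V = np.zeros(N_STATES)
    for _ in range(n_iter):
        Q = np.stack([payoff(STATES, a, f) + BETA * P @ V
                      for a, P in zip(ACTIONS, P_BY_ACTION)])
        V = Q.max(axis=0)
    return Q.argmax(axis=0)          # action index for each own state

def stationary(policy, n_iter=2000):
    """Stationary distribution of the chain induced by the policy."""
    P = np.array([P_BY_ACTION[a_idx][x] for x, a_idx in enumerate(policy)])
    f = np.full(N_STATES, 1.0 / N_STATES)
    for _ in range(n_iter):
        f = f @ P
    return f

def oblivious_equilibrium(n_rounds=100, damping=0.5, tol=1e-8):
    """Damped fixed-point iteration on the long-run state distribution."""
    f = np.full(N_STATES, 1.0 / N_STATES)
    for _ in range(n_rounds):
        policy = best_response(f)
        f_new = damping * f + (1.0 - damping) * stationary(policy)
        if np.abs(f_new - f).sum() < tol:
            break
        f = f_new
    return policy, f

if __name__ == "__main__":
    policy, f = oblivious_equilibrium()
    print("action per state:", ACTIONS[policy])
    print("long-run distribution:", np.round(f, 3))
```

The resulting policy is a function of the player's own state only, which is what keeps the information exchange low.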

Taxonomy of Stochastic Games
Competitive models
– Non-cooperative games.
– Payoff characterized by non-increasing differences between own state and opponent states (submodular structure).
– Opponents at higher states lead to lower payoff.
Coordination models
– Cooperative games.
– Payoff has increasing differences between own state and opponent states (supermodular structure).
– Payoff depends on how close a node's state is to the states of other players.
Contribution: Provide a common thread to analyze both of these models.
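The distinction between the two classes can be checked numerically on a grid. The sketch below assumes the opponents' states are summarized by a single scalar y (for example their mean), which is a simplification of the slides' setting; the two example payoffs are hypothetical.

```python
import numpy as np

def has_increasing_differences(payoff, xs, ys, tol=1e-12):
    """True if payoff(x', y) - payoff(x, y) is non-decreasing in y for x' > x.

    xs and ys must be sorted ascending; checking adjacent grid points suffices
    because any larger difference is a sum of adjacent ones.
    """
    table = np.array([[payoff(x, y) for y in ys] for x in xs])
    dx = np.diff(table, axis=0)              # gain from raising own state
    return bool(np.all(np.diff(dx, axis=1) >= -tol))

def has_decreasing_differences(payoff, xs, ys, tol=1e-12):
    """True if the gain from raising own state is non-increasing in y."""
    return has_increasing_differences(lambda x, y: -payoff(x, y), xs, ys, tol)

if __name__ == "__main__":
    grid = np.linspace(0.0, 5.0, 11)
    tracking = lambda x, y: -(x - y) ** 2     # hypothetical coordination payoff
    congestion = lambda x, y: x / (1.0 + y)   # hypothetical competitive payoff
    print(has_increasing_differences(tracking, grid, grid))
    print(has_decreasing_differences(congestion, grid, grid))
```

The tracking payoff rewards being close to the others and has increasing differences; the congestion payoff falls as the others' state rises and has decreasing differences.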

State of the Art
– Generalized the idea of OE to general stochastic games [Allerton 07].
– Unified existing models, such as LQG games, via our framework [CDC 08].
– Exogenous conditions for approximating MPE using OE for linear dynamics and separable payoffs [Allerton 08].
Current results: exogenous conditions on model primitives which
– prove the existence of an oblivious equilibrium for competitive models;
– show that OE is close to MPE asymptotically for competitive models.

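Our model
– $m$ players.
– State of player $i$ is $x_i$; action of player $i$ is $a_i$.
– State evolution: [equation not preserved in the transcript]
– Payoff: $\pi(x_i, a_i, f_{-i})$, where $f_{-i}$ = empirical distribution of other players' states.
[Figure: histogram of $f_{-i}$; axes: state, # of players]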
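The empirical distribution of the other players' states can be computed as in the sketch below. It assumes integer-valued states in a finite range and normalizes counts to fractions; both are assumptions for illustration (the slide's figure plots raw player counts), and the function name is hypothetical.

```python
import numpy as np

def empirical_distribution_of_others(states, i, num_states):
    """Fraction of players other than i in each state 0..num_states-1."""
    others = np.delete(np.asarray(states), i)   # drop player i's own state
    counts = np.bincount(others, minlength=num_states)
    return counts / others.size

if __name__ == "__main__":
    # Hypothetical profile of four players with states in {0, 1, 2}.
    print(empirical_distribution_of_others([0, 2, 2, 1], i=0, num_states=3))
```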

Common Assumptions
[A1] The state transition function is concave in state and action and has decreasing differences in state and action.
[A2] For any action, […] is a non-increasing function of state, a non-decreasing function of action, and has negative drift at zero action.
[A3] The payoff function is jointly concave in state and action and has decreasing differences in state and action.
[A4] The first derivative of the payoff function w.r.t. state becomes negative as the state increases.
These assumptions imply that the optimal policy is non-increasing and asymptotically goes to zero.

Competitive Model - Assumptions
[A5] The payoff function has decreasing differences between state and $f_{-i}$ and between action and $f_{-i}$. The ordering relation on $f_{-i}$ is first-order stochastic dominance.
[A6] The payoff decreases as $f$ increases. That is, if $f_1 \ge f_2$, then $\pi(x, a, f_1) \le \pi(x, a, f_2)$.
[A7] The logarithm of the payoff is Gateaux differentiable w.r.t. $f_{-i}$. Define $g(y)$ […]
[A8] Assume that the payoff function is such that $g(y) \sim O(y^K)$ for some $K$.
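The ordering in [A5] and the monotonicity in [A6] can be illustrated on a finite state space. In the sketch below, first-order stochastic dominance is checked by comparing cumulative distributions; the payoff function is a hypothetical example that depends on f only through its mean, not the authors' payoff.

```python
import numpy as np

def dominates_fosd(f1, f2, tol=1e-12):
    """True if f1 first-order stochastically dominates f2.

    On an ordered finite state space this holds when the CDF of f1 lies
    at or below the CDF of f2 at every state.
    """
    return bool(np.all(np.cumsum(f1) <= np.cumsum(f2) + tol))

def example_payoff(x, a, f):
    """Hypothetical payoff that falls as the others' mean state rises."""
    mean_state = float(np.dot(f, np.arange(len(f))))
    return np.log1p(x) / (1.0 + mean_state) - 0.2 * a

if __name__ == "__main__":
    f1 = np.array([0.1, 0.3, 0.6])   # more mass on high states
    f2 = np.array([0.5, 0.3, 0.2])
    print(dominates_fosd(f1, f2))
    print(example_payoff(2, 0.5, f1) <= example_payoff(2, 0.5, f2))
```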

Main Result – Competitive Model
Under [A1]-[A8], OE exists for competitive models and the OE payoff is approximately optimal over Markov policies as $m \to \infty$. In other words, OE is approximately an MPE.
The key point is that no single player is overly influential and the true state distribution is close to the time average, so knowledge of other players' policies does not significantly improve payoff.
Advantage: Each player can use an oblivious policy without loss in performance.
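One way to write the "approximately optimal over Markov policies" claim is shown below. The notation is an assumption, not taken from the slides: $\tilde{\mu}$ denotes the oblivious policy, $\mu'$ any Markov policy, and $V^{(m)}(x, f \mid \mu', \tilde{\mu})$ the expected discounted payoff in the $m$-player game to a player who starts in state $x$ facing opponent distribution $f$ and uses $\mu'$ while all other players use $\tilde{\mu}$. The exact quantifiers over $x$ and $f$ in the authors' papers may differ from this sketch.

```latex
\[
\lim_{m \to \infty}
\Big( \sup_{\mu'} V^{(m)}(x, f \mid \mu', \tilde{\mu})
      - V^{(m)}(x, f \mid \tilde{\mu}, \tilde{\mu}) \Big) = 0
\]
```

Read this way, the gain from a unilateral deviation to any Markov policy vanishes as the number of players grows.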

Competitive vs. Coordination Models
– Competitive and coordination models have significant differences.
– Existence in competitive models comes from continuity arguments.
– For the coordination model, assumption [A6] does not hold, so a different existence proof is required.
– This dichotomy exists even in single-shot games.
– We have results for a special class of coordination games: linear quadratic tracking models (a generalization of the model by Caines et al.).

Main Contributions and Future Work
– Provide a common thread to analyze both competitive and coordination models.
– Provide exogenous conditions for existence and the AME property for the competitive model. Existence results are important for these models to be meaningful.
– Provide results on a special class of coordination model: linear quadratic tracking games.
Future work:
– Provide exogenous conditions for existence and the AME property for general coordination games.
– Develop similar models where a single node interacts with a small set of nodes at each time period.
– Apply these models to interfering transmissions between energy-constrained nodes.