Asymptotic Analysis for Large Scale Dynamic Stochastic Games Sachin Adlakha, Ramesh Johari, Gabriel Weintraub and Andrea Goldsmith DARPA ITMANET Meeting.

Asymptotic Analysis for Large Scale Dynamic Stochastic Games Sachin Adlakha, Ramesh Johari, Gabriel Weintraub and Andrea Goldsmith DARPA ITMANET Meeting September 13-14, 2009.

Asymptotic Analysis for Large Scale Dynamic Stochastic Games
S. Adlakha, R. Johari, G. Weintraub, A. Goldsmith

New Paradigm for analyzing large scale competitive and coordination games

STATUS QUO:
Many cognitive radio models do not account for the reaction of other devices to a single device's action. In prior work, we developed a general stochastic game model to tractably capture interactions of many devices.

NEW INSIGHTS:
In principle, tracking the state of other devices is complex. We approximate the state of other devices via a mean field limit.
[Figure labels: State of device i; State of other devices; Action of device i]

ACHIEVEMENT DESCRIPTION

MAIN RESULT: Taxonomy of Stochastic Games
General Stochastic Games
–Competitive Model: Non-cooperative games. Submodular payoff. Existence results for OE. AME property.
–Coordination Model: Cooperative games. Supermodular payoff structure. Results for a special class of linear quadratic games.

HOW IT WORKS:
Existence results for the competitive model are based on continuity arguments. The AME property for a competitive model is derived from the fact that opponents at higher states lead to lower payoff.

ASSUMPTIONS AND LIMITATIONS:
–Mean field requires all nodes to interact with each other; it applies to dense networks only.
–Coordination model requires a different existence proof.

IMPACT:
General framework for interaction of multiple devices. Further, our results:
–provide a common thread to analyze both competitive and coordination models.
–provide exogenous conditions for existence and AME for competitive models.
–provide results on a special class of coordination models: linear quadratic tracking games.

NEXT-PHASE GOALS:
Provide existence and AME results for a general class of coordination games. Our main goal is to develop a related model that applies when a single node interacts with a small number of other nodes each period.

Modeling Interaction between Devices
Wireless spectrum sharing
–Nodes interact with each other.
–The environment for a single node comprises the active devices.
–Nodes operate in a reactive environment.
Markov Perfect Equilibrium (MPE)
–Standard solution concept for stochastic games.
–The action of each player depends on the state of every player.
Problems:
1. MPE is hard to compute.
2. MPE requires excessive information exchange.

Oblivious Equilibrium (OE)
Mean field equilibrium concept.
Each device reacts to an average state of other players.
Requires little information exchange.
Easy to compute and implement.
Questions:
1. When do such policies exist?
2. How close is OE to MPE in terms of the payoff a device receives?
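The idea of reacting only to an average state can be sketched as a fixed-point loop: fix a distribution of other players' states, compute a policy that depends on the player's own state alone, compute the long-run state distribution that policy induces, and repeat. The sketch below is only an illustration under assumptions that are not in the slides: the state space, the two actions, the transition probabilities, the payoff form, the discount factor, and the damping step are all hypothetical, and the loop is not guaranteed to settle.

```python
# Toy oblivious-equilibrium sketch. Every number and functional form here
# is a hypothetical choice for illustration, not the authors' model.
N_STATES = 5        # assumed state space {0, ..., 4}
ACTIONS = (0, 1)    # assumed actions: 0 = idle, 1 = invest
BETA = 0.9          # assumed discount factor
COST = 0.3          # assumed per-period cost of action 1


def transition(x, a):
    """Assumed kernel: list of (next_state, probability) pairs."""
    up = 0.1 + 0.5 * a
    down = 0.3
    stay = 1.0 - up - down
    return [(min(x + 1, N_STATES - 1), up), (max(x - 1, 0), down), (x, stay)]


def payoff(x, a, f):
    """Assumed competitive payoff: lower when opponents sit at higher states."""
    mean_f = sum(s * p for s, p in enumerate(f))
    return x / (1.0 + mean_f) - COST * a


def best_response(f, sweeps=200):
    """Value iteration against a fixed distribution f of other players."""
    v = [0.0] * N_STATES

    def q(x, a):
        return payoff(x, a, f) + BETA * sum(p * v[y] for y, p in transition(x, a))

    for _ in range(sweeps):
        v = [max(q(x, a) for a in ACTIONS) for x in range(N_STATES)]
    # The policy depends on the player's own state only.
    return [max(ACTIONS, key=lambda a: q(x, a)) for x in range(N_STATES)]


def stationary(policy, sweeps=500):
    """Long-run state distribution of one player following `policy`."""
    f = [1.0 / N_STATES] * N_STATES
    for _ in range(sweeps):
        nxt = [0.0] * N_STATES
        for x, mass in enumerate(f):
            for y, p in transition(x, policy[x]):
                nxt[y] += mass * p
        f = nxt
    return f


def oblivious_equilibrium(iters=50, damping=0.5, tol=1e-9):
    """Damped iteration; stops after `iters` rounds even if f has not settled."""
    f = [1.0 / N_STATES] * N_STATES
    policy = best_response(f)
    for _ in range(iters):
        policy = best_response(f)
        target = stationary(policy)
        new_f = [(1 - damping) * a + damping * b for a, b in zip(f, target)]
        if max(abs(a - b) for a, b in zip(f, new_f)) < tol:
            return policy, new_f
        f = new_f
    return policy, f


policy, f = oblivious_equilibrium()
print("policy by state:", policy)
print("state distribution:", [round(p, 3) for p in f])
```

Each player in this sketch needs only its own state and one fixed distribution, which is the sense in which the concept requires little information exchange.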

Taxonomy of Stochastic Games
Competitive models
–Non-cooperative games.
–Payoff characterized by non-increasing differences between own state and opponent states: submodular structure.
–Opponents at higher states lead to lower payoff.
Coordination models
–Cooperative games.
–Payoff has increasing differences between own state and opponent states: supermodular structure.
–Payoff depends on how close a node's state is to the states of other players.
Contribution: Provide a common thread to analyze both of these models.
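For reference, the two payoff structures can be written out with the standard textbook definitions of increasing and decreasing differences. The symbols $\pi$, $x$, and $f$ follow the payoff notation used later in the deck; suppressing the action argument is a simplification made here, not something the slides state.

```latex
% Standard definitions, stated for a payoff \pi(x, f) with x' \ge x and f' \ge f.
% Increasing differences (supermodular structure, coordination models):
\pi(x', f') - \pi(x, f') \;\ge\; \pi(x', f) - \pi(x, f)
% Non-increasing (decreasing) differences (submodular structure, competitive models):
\pi(x', f') - \pi(x, f') \;\le\; \pi(x', f) - \pi(x, f)
```

In words: under decreasing differences, the gain from raising one's own state shrinks as opponents move to higher states; under increasing differences it grows.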

State of the Art
Generalized the idea of OE to general stochastic games [Allerton 07].
Unified existing models, such as LQG games, via our framework [CDC 08].
Exogenous conditions for approximating MPE using OE for linear dynamics and separable payoffs [Allerton 08].
Current Results: Exogenous conditions on model primitives which
–prove the existence of an oblivious equilibrium for competitive models.
–show that OE is close to MPE asymptotically for competitive models.

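Our model
$m$ players.
State of player $i$ is $x_i$; action of player $i$ is $a_i$.
State evolution: [equation not preserved in this transcript]
Payoff: $\pi(x_i, a_i, f_{-i})$, where $f_{-i}$ = empirical distribution of other players' states.
[Figure: empirical distribution of player states; axes: state, # of players]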
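As a concrete reading of $f_{-i}$, the sketch below computes the empirical distribution of the other players' states for a small example. The function name, the integer state encoding, and the sample data are assumptions made for illustration.

```python
from collections import Counter


def empirical_distribution(states, i, num_states):
    """Fraction of players other than i in each state 0..num_states-1.

    `states[j]` is the (integer) state of player j; this encoding is an
    assumption of the sketch, not part of the original model description.
    """
    others = [x for j, x in enumerate(states) if j != i]
    if not others:
        raise ValueError("need at least two players")
    counts = Counter(others)
    return [counts.get(s, 0) / len(others) for s in range(num_states)]


# Five players; player 0 looks at the other four.
states = [0, 2, 2, 1, 0]
f_minus_0 = empirical_distribution(states, i=0, num_states=3)
print(f_minus_0)
```

Player $i$'s own state is excluded, so the payoff depends on the other players only through this vector.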

Common Assumptions
[A1] The state transition function is concave in state and action and has decreasing differences in state and action.
[A2] For any action, [expression not preserved in this transcript] is a non-increasing function of state, a non-decreasing function of action, and has negative drift at zero action.
[A3] The payoff function is jointly concave in state and action and has decreasing differences in state and action.
[A4] The first derivative of the payoff function w.r.t. state becomes negative as the state increases.
These assumptions imply that the optimal policy is non-increasing in the state and goes to zero as the state grows.

Competitive Model - Assumptions
[A5] The payoff function has decreasing differences between state and $f_{-i}$ and between action and $f_{-i}$. The ordering relation on $f_{-i}$ is first-order stochastic dominance.
[A6] The payoff decreases as $f$ increases. That is, if $f_1 \geq f_2$, then $\pi(x, a, f_1) \leq \pi(x, a, f_2)$.
[A7] The logarithm of the payoff is Gateaux differentiable w.r.t. $f_{-i}$. Define $g$ [definition not preserved in this transcript].
[A8] Assume that the payoff function is such that $g(y) \sim O(y^K)$ for some $K$.
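The ordering in [A5] and the monotonicity in [A6] can be checked numerically on small examples. The sketch below tests first-order stochastic dominance by comparing cumulative sums, then evaluates a toy payoff on two ordered distributions. The payoff form and the sample distributions are hypothetical and only illustrate the direction of the inequality.

```python
from itertools import accumulate


def dominates(f1, f2, tol=1e-12):
    """True if f1 first-order stochastically dominates f2 (f1 >= f2).

    Both arguments are probability vectors over states ordered low to high.
    f1 dominates f2 when its CDF lies at or below the CDF of f2 everywhere.
    """
    if len(f1) != len(f2):
        raise ValueError("distributions must share a state space")
    cdf1 = list(accumulate(f1))
    cdf2 = list(accumulate(f2))
    return all(c1 <= c2 + tol for c1, c2 in zip(cdf1, cdf2))


def toy_payoff(x, a, f, cost=0.3):
    """Hypothetical payoff that falls as opponents move to higher states."""
    mean_f = sum(s * p for s, p in enumerate(f))
    return x / (1.0 + mean_f) - cost * a


f1 = [0.1, 0.3, 0.6]   # opponents mostly at high states
f2 = [0.5, 0.3, 0.2]   # opponents mostly at low states

print(dominates(f1, f2))   # f1 >= f2 in the dominance order
print(toy_payoff(1, 0, f1) <= toy_payoff(1, 0, f2))
```

Dominance is only a partial order, so `dominates(f1, f2)` and `dominates(f2, f1)` can both be false for a pair of distributions whose CDFs cross.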

Main Result – Competitive Model
Under [A1]-[A8], an OE exists for competitive models, and the OE payoff is approximately optimal over Markov policies as $m \to \infty$. In other words, OE is approximately an MPE (the asymptotic Markov equilibrium, or AME, property).
The key point here is that no single player is overly influential and the true state distribution is close to the time average, so knowledge of other players' policies does not significantly improve payoff.
Advantage: Each player can use an oblivious policy without loss in performance.
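One common way to formalize "approximately optimal over Markov policies" is sketched below. The value-function notation $V^{(m)}$, the oblivious policy symbol $\tilde\mu$, and the deviation $\mu'$ are assumptions introduced here; the slides do not show the exact statement, and published formulations differ in details such as taking an expectation over states.

```latex
% Sketch of the AME property, under assumed notation:
% V^{(m)}(x, f \mid \mu', \tilde\mu) is the expected discounted payoff of a player
% in state x, facing opponent distribution f in an m-player game, when that player
% uses Markov policy \mu' and all other players use the oblivious policy \tilde\mu.
\lim_{m \to \infty}
  \Big[ \sup_{\mu'} V^{(m)}(x, f \mid \mu', \tilde\mu)
        - V^{(m)}(x, f \mid \tilde\mu, \tilde\mu) \Big] = 0
```

The bracketed term is the most a single player could gain by deviating to a policy that tracks every opponent's state, so the limit says this gain vanishes as the number of players grows.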

Competitive vs. Coordination Models
Competitive and coordination models have significant differences.
Existence in competitive models comes from continuity arguments.
For the coordination model, assumption [A6] does not hold, so a different existence proof is required.
This dichotomy exists even in single-shot games.
We have results for a special class of coordination games: linear quadratic tracking models (a generalization of the model by Caines et al.).

Main Contributions and Future Work
Provide a common thread to analyze both competitive and coordination models.
Provide exogenous conditions for existence and the AME property for competitive models. Existence results are important for these models to be meaningful.
Provide results on a special class of coordination models: linear quadratic tracking games.
Future Work:
Provide exogenous conditions for existence and the AME property for general coordination games.
Develop similar models where a single node interacts with a small set of nodes at each time period.
Apply these models to interfering transmissions between energy-constrained nodes.
