Algorithms for solving two-player normal form games Tuomas Sandholm Carnegie Mellon University Computer Science Department

Recall: Nash equilibrium
Let A and B be |M| x |N| payoff matrices.
Mixed strategies: probability distributions over M and N.
If player 1 plays x and player 2 plays y, the payoffs are x^T A y and x^T B y.
Given y, player 1's best response maximizes x^T A y; given x, player 2's best response maximizes x^T B y.
(x, y) is a Nash equilibrium if x and y are best responses to each other.
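A minimal sketch (mine, not from the slides) of this definition in NumPy: since some best response is always a pure strategy, it suffices to compare each player's expected payoff against the best pure deviation. The matching-pennies game in the example is only for illustration.

import numpy as np

def is_nash(A, B, x, y, tol=1e-9):
    u1 = x @ A @ y            # player 1's expected payoff
    u2 = x @ B @ y            # player 2's expected payoff
    best1 = (A @ y).max()     # player 1's best pure deviation against y
    best2 = (x @ B).max()     # player 2's best pure deviation against x
    return u1 >= best1 - tol and u2 >= best2 - tol

# Matching pennies: the uniform mixed profile is the unique equilibrium.
A = np.array([[1, -1], [-1, 1]])
B = -A
print(is_nash(A, B, np.array([0.5, 0.5]), np.array([0.5, 0.5])))  # True
print(is_nash(A, B, np.array([1.0, 0.0]), np.array([0.5, 0.5])))  # False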

Finding Nash equilibria
Zero-sum games
– Solvable in poly-time using linear programming
General-sum games
– PPAD-complete
– Several algorithms with exponential worst-case running time:
   Lemke-Howson [1964] – linear complementarity problem
   Porter-Nudelman-Shoham [AAAI-04] = support enumeration
   Sandholm-Gilpin-Conitzer [2005] = MIP Nash, a mixed integer programming approach

Zero-sum games
Among all best responses, there is always at least one pure strategy.
Thus, player 1's optimization problem is to choose x to maximize min_j (x^T A)_j, the worst case over player 2's pure strategies j.
This is equivalent to the LP: maximize v subject to (x^T A)_j >= v for all j, sum_i x_i = 1, x >= 0.
By LP duality, player 2's optimal strategy is given by the dual variables.
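A sketch of that LP using SciPy's linprog (my own illustration; the slides give no code). The decision variables are (v, x); the dual extraction mirrors the LP-duality remark above, though the sign convention of the returned marginals depends on the SciPy backend.

import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    m, n = A.shape
    # Variables: (v, x_1, ..., x_m).  linprog minimizes, so minimize -v.
    c = np.concatenate(([-1.0], np.zeros(m)))
    # v - (x^T A)_j <= 0 for every pure strategy j of player 2.
    A_ub = np.hstack([np.ones((n, 1)), -A.T])
    b_ub = np.zeros(n)
    # Probabilities sum to 1.
    A_eq = np.concatenate(([0.0], np.ones(m))).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(None, None)] + [(0, 1)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    value, x = res.x[0], res.x[1:]
    # By LP duality, the duals of the inequality rows recover player 2's
    # optimal strategy (up to HiGHS's sign convention for marginals).
    y = -res.ineqlin.marginals
    return value, x, y

A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # matching pennies
print(solve_zero_sum(A))                   # value 0, both strategies uniform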

General-sum games: Lemke-Howson algorithm
= a pivoting algorithm similar to the simplex algorithm
We say each mixed strategy is "labeled" with the player's unplayed pure strategies and the pure best responses of the other player.
A Nash equilibrium is a completely labeled pair (i.e., the union of the two strategies' labels is the set of all pure strategies).
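A small sketch (my own, following the label definition above) that computes the labels of a strategy pair and checks whether the pair is completely labeled; it is only the equilibrium test, not the pivoting procedure itself.

import numpy as np

def labels(A, B, x, y, tol=1e-9):
    m, n = A.shape
    # x is labeled with player 1's unplayed rows and player 2's pure best responses to x.
    lx = {("row", i) for i in range(m) if x[i] <= tol}
    lx |= {("col", j) for j in range(n) if (x @ B)[j] >= (x @ B).max() - tol}
    # y is labeled with player 2's unplayed columns and player 1's pure best responses to y.
    ly = {("col", j) for j in range(n) if y[j] <= tol}
    ly |= {("row", i) for i in range(m) if (A @ y)[i] >= (A @ y).max() - tol}
    return lx, ly

def completely_labeled(A, B, x, y):
    m, n = A.shape
    lx, ly = labels(A, B, np.asarray(x, float), np.asarray(y, float))
    everything = {("row", i) for i in range(m)} | {("col", j) for j in range(n)}
    return lx | ly == everything

# Coordination game: the pure profile (row 0, column 0) is an equilibrium.
A = np.array([[2, 0], [0, 1]])
B = np.array([[1, 0], [0, 2]])
print(completely_labeled(A, B, [1, 0], [1, 0]))  # True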

Lemke-Howson Illustration Example of label definitions

Lemke-Howson Illustration Equilibrium 1

Lemke-Howson Illustration Equilibrium 2

Lemke-Howson Illustration Equilibrium 3

Lemke-Howson Illustration Run of the algorithm

Lemke-Howson Illustration

Simple Search Methods for Finding a Nash Equilibrium Ryan Porter, Eugene Nudelman & Yoav Shoham [AAAI-04, extended version in GEB]

A subroutine that we'll need when searching over supports: given a pair of candidate supports, check whether there is a NE with those supports. For fixed supports the conditions are linear, so this is solvable by LP (a sketch follows).
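A sketch of that feasibility program (my own reconstruction from the definitions above, not the slide's exact formulation): any feasible point is a NE whose support is contained in the candidate supports, because every played strategy attains the best-response value.

import numpy as np
from scipy.optimize import linprog

def ne_with_supports(A, B, S1, S2):
    m, n = A.shape
    # Variables: x (m), y (n), v1, v2.
    nv = m + n + 2
    iv1, iv2 = m + n, m + n + 1
    A_eq, b_eq, A_ub, b_ub = [], [], [], []

    def row():
        return np.zeros(nv)

    # Probabilities sum to 1; actions outside the supports get probability 0.
    r = row(); r[:m] = 1; A_eq.append(r); b_eq.append(1.0)
    r = row(); r[m:m + n] = 1; A_eq.append(r); b_eq.append(1.0)
    for i in set(range(m)) - set(S1):
        r = row(); r[i] = 1; A_eq.append(r); b_eq.append(0.0)
    for j in set(range(n)) - set(S2):
        r = row(); r[m + j] = 1; A_eq.append(r); b_eq.append(0.0)

    # Player 1: rows in S1 attain value v1, rows outside attain at most v1.
    for i in range(m):
        r = row(); r[m:m + n] = A[i]; r[iv1] = -1
        (A_eq if i in S1 else A_ub).append(r)
        (b_eq if i in S1 else b_ub).append(0.0)
    # Player 2: columns in S2 attain value v2, others attain at most v2.
    for j in range(n):
        r = row(); r[:m] = B[:, j]; r[iv2] = -1
        (A_eq if j in S2 else A_ub).append(r)
        (b_eq if j in S2 else b_ub).append(0.0)

    bounds = [(0, 1)] * (m + n) + [(None, None)] * 2
    res = linprog(np.zeros(nv),
                  A_ub=np.array(A_ub) if A_ub else None,
                  b_ub=np.array(b_ub) if b_ub else None,
                  A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=bounds, method="highs")
    return (res.x[:m], res.x[m:m + n]) if res.success else None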

Features of PNS = support enumeration algorithm
– Separately instantiate supports: for each pair of supports, test whether there is a NE with those supports (using the Feasibility Problem, solved as an LP)
– To save time, don't run the Feasibility Problem on supports that include conditionally dominated actions
   a_i is conditionally dominated, given a subset R_{-i} of the other player's actions, if some other action a_i' gives player i strictly higher utility than a_i against every action in R_{-i}
– Prefer balanced (= equal-sized for both players) supports
   Motivated by an old theorem: any nondegenerate game has a NE with balanced supports
– Prefer small supports
   Motivated by existing theoretical results for particular distributions (e.g., [MB02])

Pseudocode of two-player PNS algorithm
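A rough sketch of the two-player search (my simplified reconstruction, not the slide's pseudocode; the actual algorithm interleaves conditional-dominance pruning with the instantiation of each support more aggressively than this). It reuses ne_with_supports from the sketch above.

from itertools import combinations, product

def conditionally_dominated(payoffs, action, candidates, opp_actions):
    # action is conditionally dominated given opp_actions if some other
    # candidate action does strictly better against every opponent action.
    return any(all(payoffs[other][o] > payoffs[action][o] for o in opp_actions)
               for other in candidates if other != action)

def pns(A, B):
    m, n = A.shape
    # Prefer balanced support sizes first, then small total size.
    size_profiles = sorted(product(range(1, m + 1), range(1, n + 1)),
                           key=lambda s: (abs(s[0] - s[1]), s[0] + s[1]))
    for k1, k2 in size_profiles:
        for S1 in combinations(range(m), k1):
            # Skip supports containing rows dominated given all of player 2's actions.
            if any(conditionally_dominated(A, i, range(m), range(n)) for i in S1):
                continue
            for S2 in combinations(range(n), k2):
                # Skip supports containing columns dominated given S1.
                if any(conditionally_dominated(B.T, j, range(n), S1) for j in S2):
                    continue
                ne = ne_with_supports(A, B, S1, S2)
                if ne is not None:
                    return ne
    return None  # by Nash's theorem, some support pair succeeds (up to numerical tolerance)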

PNS: Experimental Setup
– Most previous empirical tests only on "random" games: each payoff drawn independently from the uniform distribution
– GAMUT distributions [NWSL04]
   Based on an extensive literature search
   Generates games from a wide variety of distributions
   Available at
D1: Bertrand Oligopoly
D2: Bidirectional LEG, Complete Graph
D3: Bidirectional LEG, Random Graph
D4: Bidirectional LEG, Star Graph
D5: Covariance Game: ρ = 0.9
D6: Covariance Game: ρ = 0
D7: Covariance Game: Random ρ ∈ [-1/(N-1), 1]
D8: Dispersion Game
D9: Graphical Game, Random Graph
D10: Graphical Game, Road Graph
D11: Graphical Game, Star Graph
D12: Location Game
D13: Minimum Effort Game
D14: Polymatrix Game, Random Graph
D15: Polymatrix Game, Road Graph
D16: Polymatrix Game, Small-World Graph
D17: Random Game
D18: Traveler's Dilemma
D19: Uniform LEG, Complete Graph
D20: Uniform LEG, Random Graph
D21: Uniform LEG, Star Graph
D22: War Of Attrition

PNS: Experimental results on 2-player games
– Tested on 2-player, 300-action games for each of the 22 distributions
– Capped all runs at 1800s

Mixed-Integer Programming Methods for Finding Nash Equilibria Tuomas Sandholm, Andrew Gilpin, Vincent Conitzer [AAAI-05]

Motivation of MIP Nash
Regret of pure strategy s_i is the difference in utility between playing optimally (given the other player's mixed strategy) and playing s_i.
Observation: In any equilibrium, every pure strategy either is not played or has zero regret.
Conversely, any strategy profile in which every pure strategy is either not played or has zero regret is an equilibrium.
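A small sketch (mine) of this characterization: compute each pure strategy's regret against the opponent's mixed strategy and verify that every pure strategy is either unplayed or has (numerically) zero regret.

import numpy as np

def regrets(A, B, x, y):
    r1 = (A @ y).max() - A @ y        # player 1's regret for each row
    r2 = (x @ B).max() - x @ B        # player 2's regret for each column
    return r1, r2

def is_equilibrium_by_regret(A, B, x, y, tol=1e-9):
    x, y = np.asarray(x, float), np.asarray(y, float)
    r1, r2 = regrets(A, B, x, y)
    return bool(np.all((x <= tol) | (r1 <= tol)) and
                np.all((y <= tol) | (r2 <= tol)))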

MIP Nash formulation
For every pure strategy s_i:
– There is a 0-1 variable b_{s_i} such that
   If b_{s_i} = 1, s_i is played with 0 probability
   If b_{s_i} = 0, s_i is played with positive probability, and must have 0 regret
– There is a [0,1] variable p_{s_i} indicating the probability placed on s_i
– There is a variable u_{s_i} indicating the utility from playing s_i
– There is a variable r_{s_i} indicating the regret from playing s_i
For each player i:
– There is a variable u_i indicating the utility player i receives
– There is a constant that captures the difference between her maximum and minimum possible utility

MIP Nash formulation: Only equilibria are feasible
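A sketch of such a program (my reconstruction from the variable descriptions above, using the PuLP modeling library; it is not the paper's exact formulation, and the social-welfare objective is just one example of the linear objectives mentioned on the next slide). The regret variables are folded into big-M constraints using the max-minus-min payoff constants.

import numpy as np
import pulp

def mip_nash(A, B):
    m, n = A.shape
    U1 = float(A.max() - A.min())     # bound on player 1's possible regret
    U2 = float(B.max() - B.min())     # bound on player 2's possible regret
    prob = pulp.LpProblem("mip_nash", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x_{i}", 0, 1) for i in range(m)]           # p_{s_1}
    y = [pulp.LpVariable(f"y_{j}", 0, 1) for j in range(n)]           # p_{s_2}
    bx = [pulp.LpVariable(f"bx_{i}", cat="Binary") for i in range(m)]
    by = [pulp.LpVariable(f"by_{j}", cat="Binary") for j in range(n)]
    u1, u2 = pulp.LpVariable("u1"), pulp.LpVariable("u2")
    prob += u1 + u2                   # e.g., pick a welfare-maximizing equilibrium
    prob += pulp.lpSum(x) == 1
    prob += pulp.lpSum(y) == 1
    for i in range(m):
        u_si = pulp.lpSum(float(A[i, j]) * y[j] for j in range(n))  # utility of pure strategy i
        prob += u1 >= u_si                                          # u1 >= every u_{s_i}
        prob += u1 - u_si <= U1 * bx[i]                             # regret must be 0 unless b = 1
        prob += x[i] <= 1 - bx[i]                                   # b = 1 forces probability 0
    for j in range(n):
        u_sj = pulp.lpSum(float(B[i, j]) * x[i] for i in range(m))
        prob += u2 >= u_sj
        prob += u2 - u_sj <= U2 * by[j]
        prob += y[j] <= 1 - by[j]
    prob.solve()
    return (np.array([v.value() for v in x]),
            np.array([v.value() for v in y]))

Any linear objective over these variables can be substituted for u1 + u2, which is what the next slide refers to.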

The MIP Nash formulation has the advantage of being able to specify an objective function
– Can be used to find optimal equilibria (for any linear objective)

MIP Nash formulations
The other three formulations explicitly make use of regret minimization:
– Formulation 2. Penalize regret on strategies that are played with positive probability
– Formulation 3. Penalize probability placed on strategies with positive regret
– Formulation 4. Penalize either the regret of, or the probability placed on, a strategy

MIP Nash: Comparing formulations These results are from a newer, extended version of the paper.

Games with medium-sized supports
Since PNS performs support enumeration, it should perform poorly on games with medium-sized supports.
There is a family of games with a single equilibrium whose support size is about half the number of actions.
– And none of the strategies are dominated (no elimination cascades either)

MIP Nash: Computing optimal equilibria
MIP Nash is best at finding optimal equilibria.
Lemke-Howson and PNS are good at finding sample equilibria.
– M-Enum is an algorithm similar to Lemke-Howson for enumerating all equilibria
M-Enum and PNS can be modified to find optimal equilibria by finding all equilibria and choosing the best one.
– In addition to taking exponential time, there may be exponentially many equilibria

Algorithms for solving other types of games

Structured games
Graphical games
– Payoff to i only depends on a subset of the other agents
– Poly-time algorithm for undirected trees (Kearns, Littman, Singh 2001)
– Graphs (Ortiz & Kearns 2003)
– Directed graphs (Vickrey & Koller 2002)
Action-graph games (Bhat & Leyton-Brown 2004)
– Each agent's action set is a subset of the vertices of a graph
– Payoff to i only depends on the number of agents who take neighboring actions (a toy illustration follows)
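A toy sketch (my own illustration, not from the slides; the graph and utility function are hypothetical) of the action-graph payoff structure: a player's payoff depends only on how many agents chose the action's neighboring vertices.

from collections import Counter

# Hypothetical action graph on three actions; each neighborhood includes the action itself.
neighbors = {"a": {"a", "b"}, "b": {"a", "b", "c"}, "c": {"b", "c"}}

def payoff(action, action_profile):
    # Payoff is any function of the neighborhood counts; here, congestion-style.
    counts = Counter(action_profile)
    relevant = sum(counts[v] for v in neighbors[action])
    return 10 - relevant

print(payoff("a", ["a", "b", "c", "c"]))  # depends only on the counts of "a" and "b"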

>2 players
Finding a Nash equilibrium
– The problem is no longer a linear complementarity problem, so Lemke-Howson does not apply
– Simplicial subdivision method
   Path-following method derived from Scarf's algorithm
   Exponential in the worst case
– Govindan-Wilson method
   Continuation-based method
   Can take advantage of structure in games
– Method like MIP Nash, where the indifference equations are approximated with piecewise linear functions [Ganzfried & Sandholm CMU-CS ]
– Non-globally-convergent methods (i.e., incomplete)
   Non-linear complementarity problem
   Minimizing a function
   Slow in practice