Computing Nash Equilibrium


Computing Nash Equilibrium
Presenter: Yishay Mansour

Outline
- Problem definition
- Notation
- Today: zero-sum games
- Next week: general-sum games, multiple players

Model
- Multiple players: N = {1, ..., n}
- Strategy sets: player i has m actions, Si = {si1, ..., sim}; Si are the pure actions of player i; S = ∏i Si
- Payoff functions: player i has ui : S → ℝ

Strategies
- Pure strategies: the actions themselves
- Mixed strategy: player i plays pi, a distribution over Si
- The game is played according to the product distribution P = ∏i pi
- Modified distribution: P-i is P with player i left out; (q, P-i) means player i plays q while every other player j plays pj

Notations
- Average payoff of player i: ui(P) = Es~P[ui(s)] = ∑s P(s)·ui(s), where P(s) = ∏i pi(si)
- Nash equilibrium: P* is a Nash equilibrium if for every player i and any distribution qi, ui(qi, P*-i) ≤ ui(P*)
- Best response: equivalently, at a Nash equilibrium each p*i is a best response to P*-i

Notations
- Alternative payoff: xij(P) = ui(sij, P-i) = Es~P[ui(s) | si = sij]
- Difference in payoff: zij(P) = xij(P) - ui(P)
- Improvement in payoff: gij(P) = max{zij(P), 0}
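These quantities are easy to compute for a two-player bimatrix game. The following sketch (my own illustration, with a made-up example) computes xij, zij and gij with numpy:

    import numpy as np

    def alt_payoffs(A, B, p, q):
        """xij(P): expected payoff of each pure action against the opponent's mixed strategy."""
        x1 = A @ q        # player 1: payoff of each row against q
        x2 = B.T @ p      # player 2: payoff of each column against p
        return x1, x2

    def improvements(A, B, p, q):
        """gij(P) = max(zij(P), 0) with zij(P) = xij(P) - ui(P)."""
        u1, u2 = p @ A @ q, p @ B @ q
        x1, x2 = alt_payoffs(A, B, p, q)
        return np.maximum(x1 - u1, 0), np.maximum(x2 - u2, 0)

    # Matching pennies: at the uniform profile every improvement is 0 (it is a Nash equilibrium).
    A = np.array([[1.0, -1.0], [-1.0, 1.0]]); B = -A
    p = q = np.array([0.5, 0.5])
    print(improvements(A, B, p, q))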

Fixed point theorems
- Intermediate value theorem: if f is continuous on [a, b] and f(a)·f(b) < 0, then there exists z with f(z) = 0
- Proof idea: M+ = {x | f(x) ≥ 0} and M- = {x | f(x) ≤ 0} are closed sets whose union is the connected interval [a, b], so they intersect; at any intersection point f(z) = 0

Brouwer's fixed point theorem
- f: S → S continuous, S compact and convex
- Then there exists z in S with z = f(z)
- For S = [0, 1] this follows from the previous theorem (apply it to g(x) = f(x) - x)

Kakutani's fixed point theorem
- L: S → S is a correspondence (set-valued map)
- L(x) is a non-empty convex set for every x
- L is upper semi-continuous
- S is compact and convex
- Then there exists z with z ∈ L(z)

Nash Equilibrium I: via Kakutani
- Best response correspondence: L(P) = argmaxQ {ui(qi, P-i)}, taken player by player
- L is a correspondence with convex values and is upper semi-continuous
- P* is a Nash equilibrium exactly when it is a fixed point of L: P* ∈ L(P*)
- Existence follows from Kakutani's fixed point theorem

Nash Equilibrium II: via Brouwer
- Fixed-point map K(P) with m·N coordinates: Kij(P) = (pij + gij(P)) / (1 + ∑j gij(P))
- P* is a Nash equilibrium iff P* = K(P*)
- This is Nash's original proof: K is a continuous function on a compact convex space, so Brouwer's fixed point theorem applies
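A small sketch (my own; the matrices below are made up) of Nash's map for a two-player game. Fixed points of K are exactly the Nash equilibria, but simply iterating K is a heuristic with no convergence guarantee:

    import numpy as np

    def nash_map(A, B, p, q):
        """One application of Kij(P) = (pij + gij(P)) / (1 + sum_j gij(P)) for both players."""
        u1, u2 = p @ A @ q, p @ B @ q
        g1 = np.maximum(A @ q - u1, 0)     # improvements for player 1's pure actions
        g2 = np.maximum(B.T @ p - u2, 0)   # improvements for player 2's pure actions
        return (p + g1) / (1 + g1.sum()), (q + g2) / (1 + g2.sum())

    A = np.array([[3.0, 0.0], [5.0, 1.0]])   # hypothetical payoff matrices
    B = np.array([[3.0, 5.0], [0.0, 1.0]])
    p = q = np.array([0.5, 0.5])
    for _ in range(1000):                    # heuristic iteration, not guaranteed to converge
        p, q = nash_map(A, B, p, q)
    print(p, q)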

Nash Equilibrium III: nonlinear complementarity problem (NCP)
- Recall zij(P) = xij(P) - ui(P)
- P* is a Nash equilibrium iff zij(P*) ≤ 0 for every i, j, and zij(P*)·p*ij = 0 for every player i and action sij (i.e. zi(P*) is orthogonal to p*i)

Nash Equilibrium IV: stationary point problem
- Recall x = the alternative payoff
- P* is a Nash equilibrium iff for every mixed profile P: (P - P*)·x(P*) ≤ 0, i.e. ∑i,j (pij - p*ij)·xij(P*) ≤ 0

Nash Equilibrium V: minimizing a function
- Objective function: V(P) = ∑i ∑j [gij(P)]²
- V(P) is a continuous, differentiable, non-negative function
- P* is a Nash equilibrium iff V(P*) = 0
- Caveat: minimization can get stuck in local minima with V > 0
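A tiny illustration (mine, not from the slides) of V(P) for a two-player game, using matching pennies as the example:

    import numpy as np

    def V(A, B, p, q):
        """V(P) = sum of squared improvements; V(P) = 0 exactly at a Nash equilibrium."""
        g1 = np.maximum(A @ q - p @ A @ q, 0)
        g2 = np.maximum(B.T @ p - p @ B @ q, 0)
        return float(g1 @ g1 + g2 @ g2)

    A = np.array([[1.0, -1.0], [-1.0, 1.0]]); B = -A   # matching pennies
    print(V(A, B, np.array([0.5, 0.5]), np.array([0.5, 0.5])))   # 0.0 at the equilibrium
    print(V(A, B, np.array([1.0, 0.0]), np.array([0.5, 0.5])))   # > 0 away from it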

Nash Equilibrium VI: a semi-algebraic set
- P ranges over products of distributions: ∑j pij = 1 and pij ≥ 0
- Difference in payoff: zij(P) = xij(P) - ui(P) ≤ 0 for every i, j
- Explicitly:
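Written out under the definitions above (a reconstruction, not verbatim from the slide), these are polynomial constraints in the variables pij:

\[
\sum_{j} p_{ij} = 1, \qquad p_{ij} \ge 0 \qquad \text{for all } i, j,
\]
\[
\sum_{s_{-i}} \Big(\prod_{k \ne i} p_k(s_k)\Big)\, u_i(s_{ij}, s_{-i}) \;-\; \sum_{s} \Big(\prod_{k} p_k(s_k)\Big)\, u_i(s) \;\le\; 0 \qquad \text{for all } i, j,
\]

so the set of Nash equilibria is semi-algebraic.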

Two-player games
- Payoff matrices (A, B) with m rows and n columns: player 1 has m actions, player 2 has n actions
- Mixed strategies p and q
- Payoffs: u1(p, q) = pAq^t and u2(p, q) = pBq^t
- Zero-sum game: A = -B
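In code the bilinear payoffs are one line each (a made-up example, purely for illustration):

    import numpy as np

    A = np.array([[2.0, 0.0], [0.0, 1.0]])   # player 1's payoff matrix (hypothetical)
    B = np.array([[1.0, 0.0], [0.0, 2.0]])   # player 2's payoff matrix (hypothetical)
    p = np.array([0.6, 0.4])                 # row player's mixed strategy
    q = np.array([0.3, 0.7])                 # column player's mixed strategy
    u1 = p @ A @ q                           # u1(p, q) = p A q^t
    u2 = p @ B @ q                           # u2(p, q) = p B q^t
    print(u1, u2)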

Linear programming
- Primal LP: maximize ⟨c, x⟩ subject to x in SETprimal
- Any x in SETprimal is called feasible

Linear programming
- Dual LP: minimize ⟨b, y⟩ subject to y in SETdual
- Any y in SETdual is called feasible

Duality theorem
- Weak duality: ⟨c, x⟩ ≤ ⟨b, y⟩ for any feasible x and y (proof!)
- Strong duality: if both programs have feasible solutions, then ⟨c, x⟩ = ⟨b, y⟩ for some feasible x and y (sketch of proof)
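Written out for the standard form max{⟨c, x⟩ : Ax ≤ b, x ≥ 0} with dual min{⟨b, y⟩ : Aᵀy ≥ c, y ≥ 0} (the slide does not fix a particular form, so this is one choice), weak duality is a one-line chain:

\[
\langle c, x\rangle = c^{\top}x \;\le\; (A^{\top}y)^{\top}x \;=\; y^{\top}(Ax) \;\le\; y^{\top}b = \langle b, y\rangle,
\]

where the first inequality uses Aᵀy ≥ c together with x ≥ 0, and the second uses Ax ≤ b together with y ≥ 0.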

Two-player zero-sum games
- Fix strategy q of player 2. Player 1's best response: maximize p(Aq^t) such that ∑j pj = 1 and pj ≥ 0
- Its dual LP: minimize u such that u ≥ Aq^t (coordinate-wise)
- Player 2 selects its strategy q by solving: minimize u such that u ≥ Aq^t, ∑i qi = 1 and qi ≥ 0
- The dual of this LP (a strategy for player 1): maximize v such that v ≤ pA, ∑j pj = 1 and pj ≥ 0
- By LP duality there exists a unique value v: the value of the game.
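A sketch (not from the slides) of solving the column player's LP above with scipy; by duality the optimal u is the value of the game. The example matrix is matching pennies:

    import numpy as np
    from scipy.optimize import linprog

    def zero_sum_value(A):
        """Value and an optimal q for the zero-sum game with payoff matrix A (row player maximizes).
        Variables: q (column player's mixed strategy) and the scalar u."""
        m, n = A.shape
        c = np.concatenate([np.zeros(n), [1.0]])                     # minimize u
        A_ub = np.hstack([A, -np.ones((m, 1))])                      # A q - u*1 <= 0
        b_ub = np.zeros(m)
        A_eq = np.concatenate([np.ones(n), [0.0]]).reshape(1, -1)    # sum(q) = 1
        b_eq = np.array([1.0])
        bounds = [(0, None)] * n + [(None, None)]                    # q >= 0, u free
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
        return res.x[-1], res.x[:n]

    A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # matching pennies: value 0, q = (1/2, 1/2)
    print(zero_sum_value(A))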

Example

Summary: two-player zero-sum games
- Solvable by linear programming, hence in polynomial time
- Can have multiple Nash equilibria, but a unique value!
- If (p, q) and (p', q') are Nash equilibria, then so are (p, q') and (p', q)

Online learning
- Playing with an unknown payoff matrix
- Online algorithm: at each step it selects an action (possibly stochastic or fractional), observes all possible payoffs, and updates its parameters
- Goal: achieve the value of the game
- The payoff matrix of the "game" is defined only at the end

Online learning: the algorithm
- Notation: opponent distribution Qt, our distribution Pt
- Observed costs: M(i, Qt) for every action i (i.e. the vector M·Qt)
- Goal: minimize the cumulative cost
- Algorithm: exponential weights. Action i gets weight proportional to b^L(i,t), where L(i,t) is the cumulative loss of action i up to time t

Online algorithm: notations
- Formally, a parameter b with 0 < b < 1
- wt+1(i) = wt(i) · b^M(i,Qt)
- Zt = ∑i wt(i) and Pt(i) = wt(i) / Zt
- The total number of steps T is known in advance
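A minimal sketch of the exponential-weights update (my own code, not from the slides); the loss matrix and the opponent sequence below are placeholders:

    import numpy as np

    def exponential_weights(M, opponent_dists):
        """Run exponential weights on a loss matrix M with entries in [0,1].
        opponent_dists is the sequence Q_1, ..., Q_T; returns our distributions P_1, ..., P_T."""
        n = M.shape[0]
        T = len(opponent_dists)
        b = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / T))   # the tuning from the slides
        w = np.ones(n)
        Ps = []
        for Q in opponent_dists:
            P = w / w.sum()                # P_t(i) = w_t(i) / Z_t
            Ps.append(P)
            losses = M @ Q                 # M(i, Q_t) for every action i
            w = w * b ** losses            # w_{t+1}(i) = w_t(i) * b^{M(i, Q_t)}
        return Ps

    # Placeholder example: matching-pennies losses in [0,1], opponent plays uniformly.
    M = np.array([[1.0, 0.0], [0.0, 1.0]])
    Qs = [np.array([0.5, 0.5])] * 100
    Ps = exponential_weights(M, Qs)
    print(np.mean([P @ M @ Q for P, Q in zip(Ps, Qs)]))   # close to the game value 0.5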

Online algorithm: theorem
- For any matrix M with entries in [0, 1] and any sequence of distributions Q1, ..., QT, the algorithm generates P1, ..., PT whose cumulative loss is bounded (made precise in the analysis below)
- Relative entropy: RE(A||B) = Ex~A[ln(A(x) / B(x))]

Online algorithm: analysis
- Lemma: a bound that holds for any mixed strategy P (the standard form is sketched below)
- Corollary: the average-loss bound obtained by optimizing b (next slide)
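The lemma is presumably the standard multiplicative-weights bound of Freund and Schapire; in this notation it reads

\[
\sum_{t=1}^{T} M(P_t, Q_t) \;\le\; \frac{\ln(1/b)\,\sum_{t=1}^{T} M(P, Q_t) \;+\; \mathrm{RE}(P \,\|\, P_1)}{1 - b}
\qquad \text{for every mixed strategy } P,
\]

and the corollary follows by taking P1 uniform (so RE(P||P1) ≤ ln n) and choosing b as on the next slide.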

Online algorithm: optimizing the parameter
- b = 1 / (1 + sqrt(2(ln n) / T))
- Average loss: at most v + O(sqrt((ln n) / T)), where v is the value of the game

Two-player general-sum games
- Input: matrices (A, B); there is no unique value
- Computational issues: find some Nash equilibrium, or all of them
- Player 1's best response is as in the zero-sum case: fix strategy q of player 2 and maximize p(Aq^t) such that ∑j pj = 1 and pj ≥ 0; the dual LP is: minimize u such that u ≥ Aq^t

Two-player general-sum games
- Assume the supports of the strategies are known: p has support Sp and q has support Sq
- Then a Nash equilibrium with these supports can be found by solving an LP (a linear feasibility problem), sketched below:
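One standard way to write this feasibility system for fixed supports (a sketch, not verbatim from the slide), in the variables p, q and the payoff levels u, v:

\[
\begin{aligned}
&(Aq^{\top})_i = u \ \ \forall i \in S_p, \qquad (Aq^{\top})_i \le u \ \ \forall i \notin S_p,\\
&(pB)_j = v \ \ \forall j \in S_q, \qquad\;\; (pB)_j \le v \ \ \forall j \notin S_q,\\
&\textstyle\sum_i p_i = 1,\quad p_i \ge 0,\quad p_i = 0 \ \forall i \notin S_p,\\
&\textstyle\sum_j q_j = 1,\quad q_j \ge 0,\quad q_j = 0 \ \forall j \notin S_q.
\end{aligned}
\]

Every constraint is linear once the supports are fixed, so each candidate support pair gives one LP feasibility check.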

Approximate Nash

Lemke & Howson

Example