1 Graphical Models for Game Theory by Michael Kearns, Michael L. Littman, Satinder Singh Presented by: Gedon Rosner

2 Agenda: Introduction, Motivation, Goals, Terminology, The Algorithm (Outline, Details, Proof), Back up

3 Introduction This paper describes a graphical representation of multi-player single-stage games. It presents a polynomial algorithm that provides approximations to well-defined problems that would otherwise be computationally hard, and an exact algorithm with exponential running time that will not be described here.

4 Introduction – cont. Multi-player game theory uses tables to represent games – the payoff to each player for each combination of actions. Tables require immense computational resources (space and time). In certain cases graphical structures succinctly describe the game and may be computationally less expensive as well (depending on what is computed).

5 Motivation - Tabular Form n agents with X possible actions each require n·X^n space in matrix/tabular form. With X = 2 possible actions {0,1} per agent, the possible results of the game are represented in n matrices (one per player), where each matrix has 2^n cells – one for every combination of actions (v_1, v_2, ..., v_n) the n players may perform. The representation itself is exponential in the number of players, and computation seems at least as hard.
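To make the blow-up concrete, here is a quick back-of-the-envelope sketch in Python; the loop and the sample values of n are illustrative only, not taken from the paper.

```python
# Size of the tabular form: n players, two actions each, so n payoff
# matrices with 2**n entries each.  Illustrative numbers only.
for n in (5, 10, 20, 30):
    entries = n * 2 ** n          # total payoff entries that must be stored
    print(f"n={n:2d}: {entries:,} payoff entries")
```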

6 Motivation - Graphical Form Matrices ~ graphs: special graphs (e.g. trees) are well suited to describing sparse matrices, while a full graph (V,E) is isomorphic to a matrix. Trees: graph traversal algorithms are "better" for flow computation – they represent dependencies. If a game has dependencies between sets of localized players, and mutual influence is propagated "across the board", a tree structure is inherent.

7 Motivation - Computation Nash equilibria are sets of strategies in which no player can unilaterally change his/her strategy and gain benefit (a local maximum). Example: two radio stations choosing a music format, payoffs are (A, B):

A\B        Mizrahi    MTV      Israeli
Mizrahi    25,25      50,30    50,20
MTV        30,50      15,15    30,20
Israeli    20,50      20,30    10,10

8 Nash equilibrium The danger is that both stations will choose the more profitable Mizrahi format – and split the market, getting only 25 each! Actually, there is an even worse danger: each station might assume that the other will choose Mizrahi, and each then chooses MTV, splitting that market and leaving each with a share of just 15.
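As a sanity check of this example, here is a small Python sketch (mine, not from the paper or the slides) that enumerates the pure-strategy Nash equilibria of the payoff table above; it finds exactly the two asymmetric outcomes, which is why coordination is the real problem.

```python
# Enumerate pure-strategy Nash equilibria of the radio-station game above.
# Payoffs are (row player A, column player B); action names follow the table.
actions = ["Mizrahi", "MTV", "Israeli"]
payoff = {  # (A action, B action) -> (payoff to A, payoff to B)
    ("Mizrahi", "Mizrahi"): (25, 25), ("Mizrahi", "MTV"): (50, 30), ("Mizrahi", "Israeli"): (50, 20),
    ("MTV", "Mizrahi"): (30, 50),     ("MTV", "MTV"): (15, 15),     ("MTV", "Israeli"): (30, 20),
    ("Israeli", "Mizrahi"): (20, 50), ("Israeli", "MTV"): (20, 30), ("Israeli", "Israeli"): (10, 10),
}

def is_pure_nash(a, b):
    # neither player gains by a unilateral deviation
    best_a = all(payoff[(a, b)][0] >= payoff[(a2, b)][0] for a2 in actions)
    best_b = all(payoff[(a, b)][1] >= payoff[(a, b2)][1] for b2 in actions)
    return best_a and best_b

print([(a, b) for a in actions for b in actions if is_pure_nash(a, b)])
# -> [('Mizrahi', 'MTV'), ('MTV', 'Mizrahi')]
```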

9 Nash equil. – motivation The problem for the players is to figure out which equilibrium will in fact occur. "Coordination problem": how can the players coordinate their strategies to avoid the danger of a mutually inferior outcome? Thomas Schelling (1960): any bit of information available to all participants in a coordination game might enable them all to focus on the same equilibrium and might solve the problem…

10 Goals 1. Provide a complete graphical representation for multi-player one-stage games. 2. Define how/when the graphical structure provides a representation that is more succinct by an order of magnitude (polynomial vs. exponential). 3. Provide a polynomial algorithm for computing approximate Nash equilibria in one-stage games on trees or sparse graphs.

11 Agenda: Introduction, Motivation, Goals, Terminology, The Algorithm (Outline, Details, Proof), Back up

12 Terminology Games in tabular form: an n-player, two-action game is defined by n matrices M_i, each with n indices. The entry M_i(x_1,...,x_n) specifies the payoff to player i when the combined action of the n players is x ∈ {0,1}^n. Each matrix has 2^n entries. Pure and mixed strategies: the actions 0 and 1 are pure strategies. A mixed strategy is a probability p_i with which player i plays 0.

13 Terminology – cont. Expected payoff for a mixed strategy: player i expects the payoff M_i(p), defined as Exp_{x~p}[M_i(x)]; here x~p indicates that x_j = 0 with probability p_j and x_j = 1 with probability 1 − p_j. Nash Theorem (1951): for any game, there exists a Nash equilibrium in the space of joint mixed strategies.
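A minimal sketch of this expected-payoff definition; the dictionary-based payoff table and the function name are my own choices, not notation from the paper.

```python
# Expected payoff M_i(p): sum M_i(x) over all pure profiles x in {0,1}^n,
# weighted by the probability of x under the product distribution where
# x_j = 0 with probability p[j] and x_j = 1 with probability 1 - p[j].
from itertools import product

def expected_payoff(M_i, p):
    """M_i maps a tuple x in {0,1}^n to player i's payoff; p[j] = Pr[x_j = 0]."""
    n = len(p)
    total = 0.0
    for x in product((0, 1), repeat=n):
        pr = 1.0
        for j, xj in enumerate(x):
            pr *= p[j] if xj == 0 else 1.0 - p[j]
        total += pr * M_i[x]
    return total

# tiny 2-player illustration with made-up payoffs
M_1 = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
print(expected_payoff(M_1, [0.5, 0.5]))   # 0.5
```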

14 Terminology – cont. Nash equilibrium: a joint mixed strategy p s.t. for any player i and for any alternative strategy p'_i ∈ [0,1]: M_i(p) ≥ M_i(p[i:p'_i]). This just means that no player can improve their payoff by deviating unilaterally from the Nash equilibrium. ε-Nash equilibrium: M_i(p) + ε ≥ M_i(p[i:p'_i]) – each player can improve by at most ε.
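Building on the previous sketch, a hedged check of the (ε-)Nash condition for two-action games. Because M_i(p) is linear in player i's own probability p_i, the most profitable unilateral deviation is always a pure action, so it suffices to test p_i = 1 (always play 0) and p_i = 0 (always play 1); the function reuses expected_payoff from the sketch above and is mine, not the paper's.

```python
# Check the epsilon-Nash condition M_i(p) + eps >= M_i(p[i:p'_i]) for all i, p'_i.
def is_eps_nash(payoffs, p, eps):
    """payoffs[i] is player i's table M_i; p is the joint mixed strategy."""
    for i in range(len(p)):
        current = expected_payoff(payoffs[i], p)
        for pure in (0.0, 1.0):          # best deviation is always a pure action
            deviation = list(p)
            deviation[i] = pure
            if expected_payoff(payoffs[i], deviation) > current + eps:
                return False
    return True
```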

15 Agenda: Introduction, Motivation, Goals, Terminology, The Algorithm (Outline, Details, Proof), Back up

16 Graphical Game description An n-player game is a pair (G,M): G is an undirected graph on n vertices and M is a set of n matrices, one per player. Player i is represented by the vertex labeled i. N_G(i) ⊆ {1,...,n} is the set of neighbors j of i in G, i.e. the vertices with an undirected edge (i,j) ∈ E(G), together with i itself (i ∈ N_G(i)). If |N_G(i)| ≤ k, then the expected payoff of player i is affected by at most k players, p ∈ [0,1]^k suffices, and M_i(p) = Exp_{x~p}[M_i(x)] can be computed in O(2^k) << O(2^n).
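A sketch of the graphical-game data structure just described; the class and method names are mine, and the point is only that the expected payoff is computed over 2^k local profiles rather than 2^n.

```python
# Graphical game: each player i stores only its neighborhood N_G(i) (with i
# included) and a local payoff table over the 2**k joint actions of that
# neighborhood.
from itertools import product

class GraphicalGame:
    def __init__(self, neighbors, local_payoffs):
        self.neighbors = neighbors          # neighbors[i]: tuple of vertex ids, i included
        self.local_payoffs = local_payoffs  # local_payoffs[i]: dict over {0,1}^k tuples

    def expected_payoff(self, i, p):
        """M_i(p) computed over the 2**k local profiles only, not 2**n."""
        nbrs = self.neighbors[i]
        total = 0.0
        for x in product((0, 1), repeat=len(nbrs)):
            pr = 1.0
            for j, xj in zip(nbrs, x):
                pr *= p[j] if xj == 0 else 1.0 - p[j]
            total += pr * self.local_payoffs[i][x]
        return total
```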

17 A Complete Description Claim: there is a complete mapping between the graph representation and the tabular representation – every game has a trivial representation as a graphical game by choosing the complete graph. In cases (as with Bayesian networks) where a flow or a local neighborhood can be defined and bounded by k << n, an exponential space saving occurs. Attaining Goal #1 + #2.

18 The Tree Algorithm - Abstract The graphical game is (G,M), where G is a tree. The computation produces an ε-Nash equilibrium of the game. The algorithm traverses the tree in reverse depth-first order (leaves toward the root), performing a relaxation computation at each step; inductively, a set of Nash equilibria is determined. Finally the tree is traversed in depth-first order (root toward the leaves), where a single Nash equilibrium is chosen.

19 Terminology of the game V is the parent of U; R is the root of the tree. Denote G_U – the sub-tree rooted at U, down to its leaves – and M_U^{V=v} – the subset of matrices of M corresponding to the vertices in G_U, with V's index in the matrix M_U fixed to v. Theorem 1: a Nash equilibrium of (G_U, M_U^{V=v}) is an equilibrium "downstream" from U, given that V plays v.

20 Traversing the Tree Upstream traversal – each node U_i sends its parent V all the Nash equilibria found on the corresponding sub-tree G_{U_i}. V then performs the relaxation step of the algorithm, which determines which equilibria should be passed on. In each step of the traversal, every vertex communicates a binary-valued table T, indexed by all the possible values of the mixed strategies v ∈ [0,1] of V and u_i ∈ [0,1] of U_i (!!!!).

21 The Relaxation If U is a leaf, then T(v,u) = 1 iff U = u is a best response to V = v. In general, T(v, u_i) = 1 iff there exists a Nash equilibrium for (G_{U_i}, M_{U_i}^{V=v}) in which U_i plays u_i. V uses the k tables T_i it received and computes the table for its parent W: for each pair of strategies (w,v), T(w,v) = 1 iff there exists a set of strategies u_1,...,u_k (one per child) s.t. T_i(v, u_i) = 1 for all i ≤ k and V = v is a best response to W = w and U_i = u_i. V remembers the lists (u_1,...,u_k) – the witnesses.
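The following Python sketch shows what one upstream relaxation step could look like on a discretized grid of mixed strategies (this anticipates the quantification on slide 26). The function names, the brute-force search over witnesses, and the ε-best-response test are my own simplifications, not the paper's exact procedure.

```python
# Relaxation at an internal vertex V with parent W and children U_1..U_k.
from itertools import product

def relax(grid, child_tables, payoff_V, eps):
    """child_tables[i][(v, u)] == 1 iff child i reported a downstream
    equilibrium for U_i = u given V = v.  payoff_V(v, w, us) is assumed to
    return V's expected local payoff when V plays 0 with prob. v, W with
    prob. w, and child U_i with prob. us[i].  Returns (T, witnesses) with
    T[(w, v)] == 1 iff some witness (u_1,..,u_k) exists and v is an
    eps-best response to (w, u_1,..,u_k)."""
    k = len(child_tables)
    T, witnesses = {}, {}
    for w, v in product(grid, repeat=2):
        T[(w, v)] = 0
        for us in product(grid, repeat=k):
            if any(child_tables[i][(v, us[i])] == 0 for i in range(k)):
                continue                      # not a valid downstream combination
            payoff = payoff_V(v, w, us)
            best = max(payoff_V(v2, w, us) for v2 in grid)
            if payoff >= best - eps:          # v is an approximate best response
                T[(w, v)] = 1
                witnesses[(w, v)] = us        # remember one witness per entry
                break
    return T, witnesses
```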

22 Abstract Algorithm Proof Base case: every leaf U sends its parent V the table T(v,u) for every strategy pair (v,u). General case: if T(w,v) = 1 for some pair (w,v), then there exists a witness (u_1,...,u_k) s.t. T_i(v, u_i) = 1 for all i. By the induction assumption and Theorem 1, there exists a downstream equilibrium in which each U_i = u_i; since V = v is a best response to these, this yields a downstream equilibrium from V.

23 Abstract Algorithm Proof – cont. If T(w,v) = 0, the same reasoning shows there is no downstream equilibrium in which W = w and V = v. Nash's theorem guarantees that for every game there exists at least one pair (w,v) s.t. T(w,v) = 1. R receives a table that, together with the witnesses, represents all Nash equilibria. R chooses a strategy non-deterministically and informs its children – a single joint strategy is determined at the end of the downstream pass.
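A short sketch of the downstream pass this slide describes, reusing the witnesses tables from the relaxation sketch above; the recursion and names are mine, and the root case (which has no parent value) is assumed to have picked any table entry with a stored witness before the recursion starts.

```python
# Downstream pass: once a vertex's own value v and its parent's value w are
# fixed, look up one stored witness (u_1,..,u_k) and recurse into the children.
def assign_downstream(vertex, w, v, witnesses, children, assignment):
    assignment[vertex] = v
    if not children[vertex]:                  # leaf: nothing further to assign
        return
    for child, u in zip(children[vertex], witnesses[vertex][(w, v)]):
        assign_downstream(child, v, u, witnesses, children, assignment)
```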

24 Agenda: Introduction, Motivation, Goals, Terminology, The Algorithm (Outline, Details, Proof), Back up

25 Details…Details… Claimed to find an approximation of a Nash equilibrium in O(n) – yet it looks like we have found every Nash equilibrium?? The table T(w,v) is unrealistic – w and v are continuous, not discrete. There may be exponentially many Nash equilibria – an algorithm that outputs them all explicitly cannot be polynomial.

26 Quantification Instead of continuous values – discrete values, finite in number and computationally easy to handle. Fix a grid {0, τ, 2τ, ..., 1}. Player i plays q_i ∈ {0, τ, 2τ, ..., 1} and the joint strategy is q ∈ {0, τ, 2τ, ..., 1}^n. Each table now consists of binary values over (1/τ)^2 entries. Finding best responses is a simple search across the table, and these are now approximate best responses.
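A small sketch of the τ-grid and of snapping a joint mixed strategy onto it; the helper names are mine.

```python
# tau-grid {0, tau, 2*tau, ..., 1} and nearest-point rounding.  With spacing
# tau there are 1/tau + 1 grid points per player, so a table over pairs has
# roughly (1/tau)**2 entries, as on the slide.
def make_grid(tau):
    m = round(1.0 / tau)
    return [i * tau for i in range(m + 1)]

def snap(p, tau):
    """Round each coordinate of a joint mixed strategy to the nearest grid point."""
    return [round(pi / tau) * tau for pi in p]

print(make_grid(0.25))            # [0.0, 0.25, 0.5, 0.75, 1.0]
print(snap([0.31, 0.9], 0.25))    # [0.25, 1.0]
```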

27 Agenda: Introduction, Motivation, Goals, Terminology, The Algorithm (Outline, Details, Proof), Back up

28 Determining τ 1. τ needs to ensure that the loss suffered by any player in moving to the grid is bounded. 2. τ needs to ensure that Nash equilibria are approximately preserved, i.e. that an ε-Nash equilibrium exists on the grid. 3. τ needs to scale with the size of the representation so that the algorithm remains polynomial – 1/τ = O(n^x).

29 Bound Loss of Players - #1 Let |N_G(i)| = k. Then, as defined, M_i(p) = Exp_{x~p}[M_i(x)] = Σ_{x ∈ {0,1}^k} M_i(x)·Pr[x], where Pr[x] = Π_j (p_j if x_j = 0, 1 − p_j if x_j = 1). Remember x_j ∈ {0,1}, so the product term is merely the probability that the pure profile x actually occurs.

30 Lemma 1 Let p, q ∈ [0,1]^k satisfy |p_i − q_i| ≤ τ (i = 1..k). Then, provided τ ≤ 4/(k·log^2(k/2)): |p_1·p_2·...·p_k − q_1·q_2·...·q_k| ≤ (k·log k)·τ. Proof by induction on k. Base case k = 2 (note k·log k = 2): |p_1p_2 − q_1q_2| ≤ (1/2)·(|p_1 − q_1|·(p_2 + q_2) + (p_1 + q_1)·|p_2 − q_2|) ≤ (1/2)·(2τ + 2τ) = 2τ = (k·log k)·τ.

31 Lemma 1 – Proof cont. Without loss of generality assume k is even. The lemma holds provided −k·τ + ((k/2)·log(k/2)·τ)^2 ≤ 0, i.e. τ ≤ 4/(k·log^2(k/2)).

32 Lemma 2 Let p, q be mixed strategies for (G,M) satisfying |p_i − q_i| ≤ τ (i = 1..k). Then, provided τ ≤ 4/(k·log^2(k/2)): |M_i(p) − M_i(q)| ≤ 2^k·(k·log k)·τ. This Lemma gives an upper bound on the loss suffered by any player in moving to the nearest joint strategy on the τ-grid.
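An illustrative numerical check of the Lemma 2 bound (not a proof; the normalization of payoffs to [0,1] and all helper names are my assumptions): draw a random local game on k neighbors, perturb a random p by at most τ per coordinate, and compare the payoff change to 2^k·(k·log k)·τ.

```python
# Random spot-check of |M_i(p) - M_i(q)| <= 2**k * (k*log2(k)) * tau.
import math, random
from itertools import product

k, tau = 8, 0.01                  # tau well below 4/(k*log2(k/2)**2) = 0.125
M = {x: random.random() for x in product((0, 1), repeat=k)}   # payoffs in [0, 1]

def payoff(p):
    return sum(v * math.prod(p[j] if x[j] == 0 else 1 - p[j] for j in range(k))
               for x, v in M.items())

p = [random.random() for _ in range(k)]
q = [min(1.0, max(0.0, pj + random.uniform(-tau, tau))) for pj in p]
print(abs(payoff(p) - payoff(q)), "<=", 2 ** k * (k * math.log2(k)) * tau)
```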

33 Lemma 2 - Proof (sketch) M_i(p) is a sum over the 2^k pure local profiles x of M_i(x) times a product of k probabilities; applying Lemma 1 to each such product (with payoffs assumed bounded by 1) gives |M_i(p) − M_i(q)| ≤ 2^k·(k·log k)·τ.

34 ε-Nash equilibrium - #2 Lemma 3: Let p be a Nash equilibrium for (G,M) and let q be the nearest mixed strategy on the grid. Then, provided τ ≤ 4/(k·log^2(k/2)): q is a 2^(k+1)·(k·log k)·τ-Nash equilibrium for (G,M). Proof: Let r_i be the best response of player i to q. We bound M_i(q[i:r_i]) − M_i(q), which is the benefit player i could attain by deviating from q.

35 Lemma 3 - Proof By Lemma 2: M_i(q[i:r_i]) − M_i(p[i:r_i]) ≤ 2^k·(k·log k)·τ and M_i(q) ≥ M_i(p) − 2^k·(k·log k)·τ. Since p is an equilibrium: M_i(p) ≥ M_i(p[i:r_i]), so M_i(q[i:r_i]) ≤ M_i(p) + 2^k·(k·log k)·τ. Summing the inequalities gives M_i(q[i:r_i]) − M_i(q) ≤ 2^(k+1)·(k·log k)·τ ≡ ε.

36 Polynomial scalability We now choose τ in accordance with the constraints: 2^(k+1)·(k·log k)·τ ≤ ε and τ ≤ 4/(k·log^2(k/2)). So: τ = min(ε / (2^(k+1)·(k·log k)), 4/(k·log^2(k/2))). Notice that 1/τ is exponential in k, not in n, and k << n. Each step of the algorithm computes over (1/τ)^2 table entries, examining at most (1/τ)^(2k) combinations per vertex, so for bounded k the complete running time is polynomial in n.
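A tiny helper (mine, not from the paper) that simply evaluates the two constraints above, using log base 2 as in the base case of Lemma 1.

```python
# Pick tau satisfying both the epsilon constraint (Lemma 3) and the grid
# constraint (Lemma 1), then report the resulting grid size per player.
import math

def choose_tau(eps, k):
    from_lemma3 = eps / (2 ** (k + 1) * k * math.log2(k))   # 2^(k+1)*(k*log k)*tau <= eps
    from_lemma1 = 4.0 / (k * math.log2(k / 2) ** 2)         # grid fine enough for Lemma 1
    return min(from_lemma3, from_lemma1)

tau = choose_tau(eps=0.1, k=3)
print(tau, "-> grid points per player:", int(1 / tau) + 1)
```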

37 Back up Graphical Models for Game Theory

