Game Theory Lecture 8
Problem set 8, from Osborne's Introduction to Game Theory: Exercises (426.1), 428.1, 429.1, 430.1, 431.1, (431.2)
A reminder: repeated games.
The grim (trigger) strategy: begin by playing C and do not initiate a deviation from C; if the other player ever played D, play D forever after.
Is the pair (grim, grim) a Nash equilibrium? Not in a finitely repeated Prisoners' Dilemma.

            C        D
     C    2 , 2    0 , 3
     D    3 , 0    1 , 1

Punishment does not seem to work in the finitely repeated game.
Claim: every Nash equilibrium of the finitely repeated P.D. generates a path along which the players play only D.
Proof: Consider the last period in which either player plays C along the Nash equilibrium path (assume it is player 1). After that period both players play D:

     player 1:  …  C  D  D  …
     player 2:  …  ?  D  D  …

If player 1 switches to D in that period:

     player 1:  …  D  D  D  …
     player 2:  …  ?  D  D  …

he is better off: he gains in that period, and loses nothing afterwards, since both players play D in any case. This contradicts the assumption that the path is a Nash equilibrium path, so C is never played.
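The arithmetic behind this deviation can be checked directly. The sketch below (not from the lecture; the 5-period path is a hypothetical example) compares player 1's total payoff on a path whose last C is at period 3 with the payoff when that C is switched to D, holding player 2's play fixed, which is what the proof establishes for the continuation:

```python
# Payoff of player 1 in the one-shot P.D. (first entry = player 1's action).
G1 = {('C', 'C'): 2, ('C', 'D'): 0, ('D', 'C'): 3, ('D', 'D'): 1}

def total_payoff(path1, path2):
    """Player 1's total (undiscounted) payoff along a finite path."""
    return sum(G1[(a, b)] for a, b in zip(path1, path2))

# Hypothetical 5-period path: player 1's last C is at period 3,
# after which both players play D.
path1 = ['C', 'C', 'C', 'D', 'D']
path2 = ['C', 'C', 'D', 'D', 'D']

# Switching that last C to D; later play is D either way:
deviation1 = ['C', 'C', 'D', 'D', 'D']

print(total_payoff(path1, path2))       # 6
print(total_payoff(deviation1, path2))  # 7: strictly better
```

The deviation gains one unit in the period of the switch (D is a strict best response) and changes nothing later, which is the heart of the proof.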
A reminder: an infinitely repeated Prisoners' Dilemma.
[Game tree: players 1 and 2 choose C or D at every stage; each subgame following a history is itself an infinitely repeated P.D.]
An infinitely repeated game. A history at time t is {a1, a2, …, at}, where ai is the vector of actions taken at time i; ai is (C, C) or (D, C), etc. A strategy is a function that assigns an action to each history.
An infinitely repeated game. The payoff of player 1 following a history {a1, a2, …, at, …} is the stream {G1(a1), G1(a2), …, G1(at), …}, where G1 is his payoff function in the one-shot game.
An infinitely repeated game. If the payoff stream of a player is a cycle of length n:

     w0, w1, w2, …, wn-1, w0, w1, w2, …, wn-1, w0, w1, w2, …, wn-1, …

his utility, with discount factor δ < 1, is:

     U = w0 + δw1 + … + δ^(n-1)·wn-1 + δ^n·w0 + δ^(n+1)·w1 + …
       = (w0 + δw1 + … + δ^(n-1)·wn-1) / (1 - δ^n)

(summing the geometric series over the repetitions of the cycle).
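The closed form can be verified against a direct summation of the discounted stream; a small sketch (the function names and the example cycle are mine, not from the lecture):

```python
def cycle_utility(w, delta):
    """Closed-form discounted utility of the infinite cyclic stream
    w[0], w[1], ..., w[n-1], w[0], w[1], ...  with discount factor delta < 1."""
    n = len(w)
    head = sum(delta**i * w[i] for i in range(n))
    return head / (1 - delta**n)

def cycle_utility_truncated(w, delta, periods=10_000):
    """Direct summation of the first `periods` terms, for comparison."""
    return sum(delta**t * w[t % len(w)] for t in range(periods))

w = [3, 0, 2]                            # a hypothetical payoff cycle
print(cycle_utility(w, 0.9))
print(cycle_utility_truncated(w, 0.9))   # agrees up to the truncated tail
```

For a constant stream (n = 1) the formula reduces to the familiar w0 / (1 - δ).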
An infinitely repeated game. Is the pair (grim, grim) a N.E.? (grim, grim) is a N.E. in the infinitely repeated P.D. if the discount factor is sufficiently large, i.e. if the future is sufficiently important.
Assume that player 2 plays grim, and that at some time t player 1 considers deviating from C for the first time:

     time:      …  t-2  t-1   t   t+1  t+2  …
     player 1:  …   C    C    D    D    D   …
     player 2:  …   C    C    C    D    D   …

From t+1 on player 2 plays D forever, so player 1's best continuation is also to play D forever. If instead he does not deviate:

     time:      …  t-2  t-1   t   t+1  t+2  …
     player 1:  …   C    C    C    C    C   …
     player 2:  …   C    C    C    C    C   …

The payoffs from time t on, read from the stage game

            C        D
     C    2 , 2    0 , 3
     D    3 , 0    1 , 1

are:

     deviating:       3, 1, 1, 1, …
     not deviating:   2, 2, 2, 2, …
Player 1 will not deviate if:

     2 + 2δ + 2δ² + … ≥ 3 + δ + δ² + …

i.e.  2 / (1 - δ) ≥ 3 + δ / (1 - δ),  which holds iff  δ ≥ 1/2.

So (grim, grim) is a N.E. if the discount factor is sufficiently large, i.e. if the future is sufficiently important.
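The threshold δ ≥ 1/2 can be checked numerically; a small sketch using the closed forms of the two streams (function names are mine, not from the lecture):

```python
def cooperate_value(delta):
    """Discounted value of the stream 2, 2, 2, ... (never deviating)."""
    return 2 / (1 - delta)

def deviate_value(delta):
    """Discounted value of 3, 1, 1, 1, ... (deviate once, then grim punishes forever)."""
    return 3 + delta * 1 / (1 - delta)

# Deviation is unprofitable exactly when 2/(1-d) >= 3 + d/(1-d), i.e. d >= 1/2.
for delta in (0.3, 0.5, 0.7):
    print(delta, cooperate_value(delta) >= deviate_value(delta))
# 0.3 False, 0.5 True (indifferent), 0.7 True
```

At δ = 1/2 the two values are both exactly 4: the one-period gain of 1 is exactly offset by losing 1 in every future period.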
However, (grim, grim) is not a subgame perfect equilibrium of the game. Consider the subgame following a history whose last period was (C, D):

     player 1:  …  C | D  D  D  …
     player 2:  …  D | C  D  D  …

If player 1 follows grim, his reaction is to play D forever, while player 2 (who has seen only C from player 1) plays C once and then joins the punishment; player 1's payoff stream is 3, 1, 1, 1, …. But player 1 could do better (for δ ≥ 1/2) by ignoring the deviation and playing C, restoring cooperation and the stream 2, 2, 2, …. So grim's prescribed punishment is not optimal in this subgame.
Strategies as finite automata. A finite automaton has a finite number of states (+ an initial state). Each state is characterized by an action. Inputs change the state of the automaton.
The grim strategy as an automaton:

     state C (initial; play C):  on input C stay in C; on input D move to D
     state D (play D):           on input C or D stay in D

Each state has an action; the inputs are the actions of the other player, {C, D}; the transition describes how inputs change the state.
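Such an automaton is easy to program directly. A minimal sketch (the `Automaton` class and its interface are mine, not from the lecture):

```python
class Automaton:
    """A strategy as a finite automaton: each state is labelled with an
    action, and the other player's observed action drives the transitions."""
    def __init__(self, initial, actions, transitions):
        self.state = initial
        self.actions = actions          # state -> action played in that state
        self.transitions = transitions  # (state, observed action) -> next state

    def play(self):
        return self.actions[self.state]

    def observe(self, other_action):
        self.state = self.transitions[(self.state, other_action)]

# The grim strategy: two states; 'C' is initial, a single D by the
# other player moves the automaton to the absorbing state 'D'.
grim = Automaton(
    initial='C',
    actions={'C': 'C', 'D': 'D'},
    transitions={('C', 'C'): 'C', ('C', 'D'): 'D',
                 ('D', 'C'): 'D', ('D', 'D'): 'D'},
)

for other in ['C', 'C', 'D', 'C']:
    print(grim.play(), end=' ')   # prints: C C C D
    grim.observe(other)
```

Once the other player's D is observed, the automaton is stuck in state D, which is exactly the "play D forever after" clause.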
Some more strategies: Modified Grim.
[Automaton diagram with four states 1-4: state 1 plays C (looping on C); a D by the other player moves the automaton through states 2, 3 and 4, which all play D.]
Some more strategies: Tit for Tat.

     state C (initial; play C):  on input C stay in C; on input D move to D
     state D (play D):           on input D stay in D; on input C move back to C
Axelrod's tournament. Robert Axelrod, The Evolution of Cooperation, 1984. Tit for Tat won the tournament; it is a "nice" strategy (it is never the first to defect).
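A tournament-style match is easy to simulate. The sketch below (my own illustration, not Axelrod's code; strategy names and the `match` helper are assumptions) plays Tit for Tat against itself and against unconditional defection:

```python
# Strategies as generators: they yield an action and receive the
# other player's action in return.
def tit_for_tat():
    last = 'C'                 # start nice, then copy the opponent
    while True:
        other = yield last
        last = other

def always_defect():
    while True:
        yield 'D'

def match(strat1, strat2, rounds, payoff):
    """Play two strategies against each other; return total payoffs."""
    p1, p2 = strat1(), strat2()
    a1, a2 = next(p1), next(p2)
    t1 = t2 = 0
    for _ in range(rounds):
        u1, u2 = payoff[(a1, a2)]
        t1 += u1
        t2 += u2
        a1, a2 = p1.send(a2), p2.send(a1)
    return t1, t2

PD = {('C', 'C'): (2, 2), ('C', 'D'): (0, 3),
      ('D', 'C'): (3, 0), ('D', 'D'): (1, 1)}

print(match(tit_for_tat, tit_for_tat, 10, PD))    # (20, 20): full cooperation
print(match(tit_for_tat, always_defect, 10, PD))  # (9, 12): TFT loses only round 1
```

Against itself Tit for Tat cooperates forever, and against a defector it gives up at most one round's worth of payoff, which is the flavor of "niceness" that did well in the tournament.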
Can you still bite???
[Automaton diagrams: a Modified Tit for Tat, "simpler" than Tit for Tat; a strategy that exploits its weakness; and a modification.]
What payoffs are N.E. payoffs of the infinitely repeated P.D.?
The FEASIBLE payoffs are the convex combinations of the four pure payoff vectors of the stage game:

            C        D
     C    2 , 2    0 , 3
     D    3 , 0    1 , 1

[Figure: the feasible set in the (π1, π2) plane, the quadrilateral with vertices (2,2), (0,3), (3,0), (1,1).]
The folk theorem (R. Aumann, J. Friedman).
Clearly, Nash equilibrium payoffs are ≥ (1, 1): each player can guarantee himself at least 1 per period by always playing D.
All feasible payoffs above (1, 1) can be obtained as Nash equilibrium payoffs.
Proof: choose a point (π1, π2) in this region. It can be represented as a convex combination of the four pure payoff vectors:

     (π1, π2) = α1·(2,2) + α2·(0,3) + α3·(3,0) + α4·(1,1),   αi ≥ 0,  Σ αi = 1
The coefficients αi can be approximated by rational numbers with a common denominator: αi ≈ mi / n.
Consider the cycle of length n in which the players play (C,C) for m1 periods, (C,D) for m2 periods, (D,C) for m3 periods and (D,D) for m4 periods. If the players follow this cycle, their payoff is approximately the chosen point when the discount factor is close to 1.
A strategy: follow the sequence of the cycle as long as the other player does; if not, play D forever.
This pair of strategies is a N.E.: a deviation yields a one-period gain of at most 3, followed by 1 forever, which for δ close to 1 is worse than continuing the cycle, whose payoff per period is above 1.
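The cycle construction can be made concrete. The sketch below (my own illustration; the target point and weights are a hypothetical example, not from the lecture) picks rational weights, builds the corresponding cycle, and checks that the normalized discounted payoff approaches the target as δ → 1:

```python
from fractions import Fraction

# The four pure payoff vectors of the P.D.
OUTCOMES = {('C', 'C'): (2, 2), ('C', 'D'): (0, 3),
            ('D', 'C'): (3, 0), ('D', 'D'): (1, 1)}

# Hypothetical target: weights 1/2 on (C,C), 1/4 on (C,D), 1/4 on (D,C),
# giving the payoff point (7/4, 7/4) for the two players' roles combined.
weights = {('C', 'C'): Fraction(1, 2),
           ('C', 'D'): Fraction(1, 4),
           ('D', 'C'): Fraction(1, 4)}

# Common denominator n = 4 gives the cycle: each outcome mi times per cycle.
n = 4
cycle = [o for o, w in weights.items() for _ in range(int(w * n))]

def normalized_value(cycle, delta, player):
    """(1 - delta) times the discounted value of repeating the cycle forever."""
    m = len(cycle)
    head = sum(delta**i * OUTCOMES[a][player] for i, a in enumerate(cycle))
    return (1 - delta) * head / (1 - delta**m)

target = sum(w * Fraction(OUTCOMES[o][0]) for o, w in weights.items())
print(float(target))                                     # 1.75
for delta in (0.9, 0.99, 0.999):
    print(round(normalized_value(cycle, delta, 0), 4))   # -> 1.75 as delta -> 1
```

The normalization by (1 - δ) expresses the discounted value in per-period terms, which is what converges to the chosen point.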
One-deviation property and agent equilibria.
The one-deviation property: no player can increase his payoff in any subgame in which he is the first to move by changing his action at that node only.
Theorem: a strategy profile is a subgame perfect equilibrium of an extensive game with perfect information iff it has the one-deviation property. (No proof.)