Unit III: The Evolution of Cooperation. Can Selfishness Save the Environment? Repeated Games: the Folk Theorem. Evolutionary Games. A Tournament. How to Promote Cooperation.

Presentation transcript:

Unit III: The Evolution of Cooperation. Can Selfishness Save the Environment? Repeated Games: the Folk Theorem. Evolutionary Games. A Tournament. How to Promote Cooperation / Unit Review.

Repeated Games. Some questions: What happens when a game is repeated? Can threats and promises about the future influence behavior in the present? Cheap talk. Finitely repeated games: backward induction. Indefinitely repeated games: trigger strategies.

Repeated Games. Can threats and promises about future actions influence behavior in the present? Consider the following game, played twice (see Gibbons):

        C     D
  C   3,3   0,5
  D   5,0   1,1

Repeated Games. Draw the extensive form game. [Game tree: each player chooses C or D in each of the two rounds; the 16 terminal payoffs are the two-round sums, ranging from (6,6) after mutual cooperation in both rounds to (2,2) after mutual defection in both, with e.g. (10,0) when one player defects against a cooperator twice.]

Repeated Games. Now consider three repeated-game strategies: D (ALWAYS DEFECT): defect on every move. C (ALWAYS COOPERATE): cooperate on every move. T (TRIGGER): cooperate on the first move, then keep cooperating as long as the other cooperates; if the other ever defects, defect forever.

Repeated Games. If the game is played twice, the V(alue) to a player using ALWAYS DEFECT (D) against an opponent using ALWAYS DEFECT (D) is:
V(D/D) = 1 + 1 = 2
V(C/C) = 3 + 3 = 6
V(T/T) = 3 + 3 = 6
V(D/C) = 5 + 5 = 10
V(D/T) = 5 + 1 = 6
V(C/D) = 0 + 0 = 0
V(C/T) = 3 + 3 = 6
V(T/D) = 0 + 1 = 1
V(T/C) = 3 + 3 = 6
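
These values can be checked by direct simulation. A minimal Python sketch (mine, not from the slides); the `next_move` and `value` helpers and the one-letter strategy codes are illustrative:

```python
# Stage-game payoffs: (my move, their move) -> my payoff
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def next_move(strategy, own_history, other_history):
    """Move for ALWAYS DEFECT (D), ALWAYS COOPERATE (C), or TRIGGER (T)."""
    if strategy == 'D':
        return 'D'
    if strategy == 'C':
        return 'C'
    # TRIGGER: cooperate until the other side ever defects, then defect forever.
    return 'D' if 'D' in other_history else 'C'

def value(s1, s2, rounds):
    """Total payoff to a player using s1 against an opponent using s2."""
    h1, h2, total = [], [], 0
    for _ in range(rounds):
        m1 = next_move(s1, h1, h2)
        m2 = next_move(s2, h2, h1)
        total += PAYOFF[(m1, m2)]
        h1.append(m1)
        h2.append(m2)
    return total

for a in 'DCT':
    for b in 'DCT':
        print(f"V({a}/{b}) = {value(a, b, 2)}")  # e.g., V(D/T) = 5 + 1 = 6
```

Dividing `value(a, b, n)` by `n` gives the time-average payoffs used on the next slides.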

Repeated Games. And 3x:
V(D/D) = 1 + 1 + 1 = 3
V(C/C) = 3 + 3 + 3 = 9
V(T/T) = 3 + 3 + 3 = 9
V(D/C) = 5 + 5 + 5 = 15
V(D/T) = 5 + 1 + 1 = 7
V(C/D) = 0 + 0 + 0 = 0
V(C/T) = 3 + 3 + 3 = 9
V(T/D) = 0 + 1 + 1 = 2
V(T/C) = 3 + 3 + 3 = 9

Repeated Games. Time average payoffs, n = 3:
V(D/D) = 3/3 = 1
V(C/C) = 9/3 = 3
V(T/T) = 9/3 = 3
V(D/C) = 15/3 = 5
V(D/T) = 7/3
V(C/D) = 0/3 = 0
V(C/T) = 9/3 = 3
V(T/D) = 2/3
V(T/C) = 9/3 = 3

Repeated Games. Time average payoffs as the number of rounds n grows:
V(D/D) = (1 + 1 + ... + 1)/n = 1
V(C/C) = (3 + 3 + ... + 3)/n = 3
V(T/T) = (3 + 3 + ... + 3)/n = 3
V(D/C) = (5 + 5 + ... + 5)/n = 5
V(D/T) = (5 + 1 + ... + 1)/n = 1 + ε
V(C/D) = (0 + 0 + ... + 0)/n = 0
V(C/T) = (3 + 3 + ... + 3)/n = 3
V(T/D) = (0 + 1 + ... + 1)/n = 1 - ε
V(T/C) = (3 + 3 + ... + 3)/n = 3
where ε → 0 as n → ∞.

Repeated Games. Now draw the matrix form of this game, played 1x:

        C     D     T
  T   3,3   0,5   3,3
  C   3,3   0,5   3,3
  D   5,0   1,1   5,0

Repeated Games. Time average payoffs:

        C        D          T
  T   3,3   1-ε, 1+ε      3,3
  C   3,3      0,5        3,3
  D   5,0      1,1     1+ε, 1-ε

If the game is repeated, ALWAYS DEFECT is no longer dominant.

Repeated Games. Time average payoffs:

        C        D          T
  T   3,3   1-ε, 1+ε      3,3
  C   3,3      0,5        3,3
  D   5,0      1,1     1+ε, 1-ε

... and TRIGGER achieves "a NE with itself."

Repeated Games. Time average payoffs, with T(emptation) > R(eward) > P(unishment) > S(ucker):

        C        D          T
  T   R,R   P-ε, P+ε      R,R
  C   R,R      S,T        R,R
  D   T,S      P,P     P+ε, P-ε

Discounting. The discount parameter, δ, is the weight of the next payoff relative to the current payoff. In an indefinitely repeated game, δ can also be interpreted as the likelihood of the game continuing for another round (so that the expected number of moves per game is 1/(1-δ)). The V(alue) to someone using ALWAYS DEFECT (D) when playing against someone using TRIGGER (T) is the sum of T for the first move, δP for the second, δ²P for the third, and so on (Axelrod: 13-4): V(D/T) = T + δP + δ²P + ... "The Shadow of the Future"

Discounting. Writing this as V(D/T) = T + δP + δ²P + ..., we have the following:
V(D/D) = P + δP + δ²P + ... = P/(1-δ)
V(C/C) = R + δR + δ²R + ... = R/(1-δ)
V(T/T) = R + δR + δ²R + ... = R/(1-δ)
V(D/C) = T + δT + δ²T + ... = T/(1-δ)
V(D/T) = T + δP + δ²P + ... = T + δP/(1-δ)
V(C/D) = S + δS + δ²S + ... = S/(1-δ)
V(C/T) = R + δR + δ²R + ... = R/(1-δ)
V(T/D) = S + δP + δ²P + ... = S + δP/(1-δ)
V(T/C) = R + δR + δ²R + ... = R/(1-δ)
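
These closed forms follow from the geometric series 1 + δ + δ² + ... = 1/(1-δ). A quick numerical check (mine; the payoff values and the `series` helper are assumptions, chosen to match the earlier example):

```python
# Verify the closed forms against truncated geometric sums,
# with T=5, R=3, P=1, S=0 and delta = 0.9.
T, R, P, S = 5, 3, 1, 0
delta = 0.9
N = 10_000  # enough terms for the truncated sums to converge

def series(first, rest):
    """Payoff `first` now, then `rest` every later period, discounted by delta."""
    return first + sum(rest * delta**k for k in range(1, N))

assert abs(series(P, P) - P / (1 - delta)) < 1e-6                # V(D/D)
assert abs(series(R, R) - R / (1 - delta)) < 1e-6                # V(C/C), V(T/T)
assert abs(series(T, P) - (T + delta * P / (1 - delta))) < 1e-6  # V(D/T)
assert abs(series(S, P) - (S + delta * P / (1 - delta))) < 1e-6  # V(T/D)
print("closed forms agree with the truncated sums")
```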

Discounting. Discounted payoffs, with T > R > P > S and 0 < δ < 1:

          C                      D                            T
  T   R/(1-δ), R/(1-δ)   S+δP/(1-δ), T+δP/(1-δ)   R/(1-δ), R/(1-δ)
  C   R/(1-δ), R/(1-δ)   S/(1-δ), T/(1-δ)         R/(1-δ), R/(1-δ)
  D   T/(1-δ), S/(1-δ)   P/(1-δ), P/(1-δ)         T+δP/(1-δ), S+δP/(1-δ)

Discounting. Discounted payoffs, with T > R > P > S and 0 < δ < 1:

          C                      D                            T
  T   R/(1-δ), R/(1-δ)   S+δP/(1-δ), T+δP/(1-δ)   R/(1-δ), R/(1-δ)
  C   R/(1-δ), R/(1-δ)   S/(1-δ), T/(1-δ)         R/(1-δ), R/(1-δ)
  D   T/(1-δ), S/(1-δ)   P/(1-δ), P/(1-δ)         T+δP/(1-δ), S+δP/(1-δ)

T weakly dominates C.

Discounting. Now consider what happens to these values as δ varies (from 0 to 1):
V(D/D) = P + δP + δ²P + ... = P/(1-δ)
V(C/C) = R + δR + δ²R + ... = R/(1-δ)
V(T/T) = R + δR + δ²R + ... = R/(1-δ)
V(D/C) = T + δT + δ²T + ... = T/(1-δ)
V(D/T) = T + δP + δ²P + ... = T + δP/(1-δ)
V(C/D) = S + δS + δ²S + ... = S/(1-δ)
V(C/T) = R + δR + δ²R + ... = R/(1-δ)
V(T/D) = S + δP + δ²P + ... = S + δP/(1-δ)
V(T/C) = R + δR + δ²R + ... = R/(1-δ)

Discounting. Writing V(D/D) = P + δP + δ²P + ... = P + δP/(1-δ) and comparing it with V(T/D) = S + δP/(1-δ): since P > S, V(D/D) > V(T/D) for every δ, so D is a best response to D.

Discounting. What about D against T? Compare V(D/T) = T + δP/(1-δ) with V(T/T) = R/(1-δ): which is larger depends on δ.

Discounting. Now consider what happens to these values as δ varies (from 0 to 1). For all values of δ: V(D/T) > V(D/D) > V(T/D) and V(T/T) > V(D/D) > V(T/D). Is there a value of δ s.t. V(D/T) = V(T/T)? Call this δ*. If δ < δ*, the following ordering holds: V(D/T) > V(T/T) > V(D/D) > V(T/D), so D is dominant: GAME SOLVED.

V(D/T) = V(T/T)
T + δP/(1-δ) = R/(1-δ)
T - δT + δP = R
T - R = δ(T - P)
δ* = (T-R)/(T-P)
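
As a worked check (my numbers, using the example payoffs T=5, R=3, P=1 from the earlier slides), δ* = (5-3)/(5-1) = 1/2:

```python
# Hypothetical check of the threshold with the example payoffs T=5, R=3, P=1.
T, R, P = 5, 3, 1
delta_star = (T - R) / (T - P)          # = 0.5
print(delta_star)
for delta in (0.25, 0.75):              # one value below, one above the threshold
    v_DT = T + delta * P / (1 - delta)  # V(D/T)
    v_TT = R / (1 - delta)              # V(T/T)
    print(delta, "defect pays" if v_DT > v_TT else "trigger pays", (v_DT, v_TT))
```

Below δ*, defecting against TRIGGER pays (5.33 vs 4 at δ = 0.25); above it, cooperating does (8 vs 12 at δ = 0.75).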

Discounting. Now consider what happens to these values as δ varies (from 0 to 1). For all values of δ: V(D/T) > V(D/D) > V(T/D) and V(T/T) > V(D/D) > V(T/D). Is there a value of δ s.t. V(D/T) = V(T/T)? Call this δ*. δ* = (T-R)/(T-P). If δ > δ*, the following ordering holds: V(T/T) > V(D/T) > V(D/D) > V(T/D). D is a best response to D; T is a best response to T; multiple NE.

Discounting. Graphically: the V(alue) to a player using ALWAYS DEFECT (D) against TRIGGER (T), and V(T/T), as functions of the discount parameter δ. [Graph: V(D/T) = T + δP/(1-δ) starts at T, V(T/T) = R/(1-δ) starts at R; both rise with δ and cross at δ*, after which V(T/T) is larger.]

The Folk Theorem. The payoff set of the repeated PD is the convex closure of the points [(T,S); (R,R); (S,T); (P,P)].

The Folk Theorem. The shaded area is the set of payoffs that Pareto-dominate the one-shot NE (P,P).

The Folk Theorem. Theorem: Any payoff that Pareto-dominates the one-shot NE can be supported in a SPNE of the repeated game, if the discount parameter is sufficiently high.

The Folk Theorem. In other words, in the repeated game, if the future matters "enough", i.e., δ > δ*, there are zillions of equilibria!

The Folk Theorem. The theorem tells us that, in general, repeated games give rise to a very large set of Nash equilibria. In the repeated PD, these are Pareto-rankable, i.e., some are efficient and some are not. In this context, evolution can be seen as a process that selects for repeated-game strategies with efficient payoffs. "Survival of the Fittest"

Evolutionary Games Fifteen months after I had begun my systematic enquiry, I happened to read for amusement ‘Malthus on Population’... It at once struck me that... favorable variations would tend to be preserved, and unfavorable ones to be destroyed. Here then I had at last got a theory by which to work. Charles Darwin

Evolutionary Games. Evolutionary Stability (ESS); Hawk-Dove: an example; The Replicator Dynamic; The Trouble with TIT FOR TAT; Designing Repeated Game Strategies; Finite Automata.

Evolutionary Games. Biological evolution: under the pressure of natural selection, any population (capable of reproduction and variation) will evolve so as to become better adapted to its environment, i.e., will develop in the direction of increasing "fitness." Economic evolution: firms that adopt efficient "routines" will survive, expand, and multiply, whereas others will be "weeded out" (Nelson and Winter, 1982).

Evolutionary Stability. Evolutionarily Stable Strategy (ESS): a strategy is evolutionarily stable if it cannot be invaded by a mutant strategy (Maynard Smith & Price, 1973). A strategy A is ESS if, for every mutant strategy B ≠ A: i) V(A/A) ≥ V(B/A); and ii) if V(A/A) = V(B/A), then V(A/B) > V(B/B).
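
The two conditions are mechanical to test for pure strategies. A sketch (mine, not from the slides), using the Hawk-Dove payoffs introduced below (v = 4, c = 6); V[a][b] is the payoff to a playing against b:

```python
# Maynard Smith & Price ESS test for pure strategies in a symmetric game.
V = {'H': {'H': -1, 'D': 4}, 'D': {'H': 0, 'D': 2}}  # Hawk-Dove, v=4, c=6

def is_ess(a, strategies):
    for b in strategies:
        if b == a:
            continue
        if V[a][a] < V[b][a]:          # condition i) fails: B invades outright
            return False
        if V[a][a] == V[b][a] and V[a][b] <= V[b][b]:
            return False               # condition ii) fails: B drifts in
    return True

for s in 'HD':
    print(s, is_ess(s, 'HD'))  # both False: neither pure strategy is ESS here
```

Testing the mixed strategy M requires adding it as a row and column with expected payoffs, as the slides do below.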

Hawk-Dove: an example. Imagine a population of Hawks and Doves competing over a scarce resource (say, food in a given area). The share of each type in the population changes according to the payoff matrix, so that payoffs determine the number of offspring left to the next generation. v = value of the resource; c = cost of fighting. H/D: Hawk gets the resource; Dove flees (v, 0). D/D: share the resource (v/2, v/2). H/H: share the resource less the cost of fighting ((v-c)/2, (v-c)/2). (See Hargreaves Heap and Varoufakis; Casti.)

Hawk-Dove: an example. v = value of resource; c = cost of fighting.

        H                  D
  H   (v-c)/2, (v-c)/2   v, 0
  D   0, v               v/2, v/2

Hawk-Dove: an example. v = value of resource = 4; c = cost of fighting = 6.

        H         D
  H   -1, -1    4, 0
  D   0, 4      2, 2

Hawk-Dove: an example.

        H         D
  H   -1, -1    4, 0
  D   0, 4      2, 2

NE = {(1,0); (0,1); (2/3, 2/3)}: two asymmetric pure NE (unstable as population states) and a symmetric mixed NE (stable). The mixed NE corresponds to a population that is 2/3 Hawks and 1/3 Doves: playing H with probability p makes the opponent indifferent when -p + 4(1-p) = 2(1-p), i.e., p = 2/3.

Hawk-Dove: an example.

        H         D
  H   -1, -1    4, 0
  D   0, 4      2, 2

NE = {(1,0); (0,1); (2/3, 2/3)}. Is any strategy ESS?

Hawk-Dove: an example.

        H         D
  H   -1, -1    4, 0
  D   0, 4      2, 2

NE = {(1,0); (0,1); (2/3, 2/3)}. A strategy A is ESS if, for every B ≠ A: i) V(A/A) ≥ V(B/A); and ii) if V(A/A) = V(B/A), then V(A/B) > V(B/B).

Hawk-Dove: an example.

        H         D
  H   -1, -1    4, 0
  D   0, 4      2, 2

NE = {(1,0); (0,1); (2/3, 2/3)}. Condition i) says that to be ESS, a strategy must be a NE with itself. Neither H nor D is ESS (for these payoffs): V(H/H) = -1 < V(D/H) = 0, and V(D/D) = 2 < V(H/D) = 4.

Hawk-Dove: an example.

        H         D
  H   -1, -1    4, 0
  D   0, 4      2, 2

NE = {(1,0); (0,1); (2/3, 2/3)}. What about the mixed NE strategy?

Hawk-Dove: an example. Let M be the mixed strategy 2/3 Hawk, 1/3 Dove. With V(H/H) = -1, V(H/D) = 4, V(D/H) = 0, V(D/D) = 2:
V(H/M) = 2/3 V(H/H) + 1/3 V(H/D) = 2/3 (-1) + 1/3 (4) = 2/3
V(M/H) = 2/3 V(H/H) + 1/3 V(D/H) = 2/3 (-1) + 1/3 (0) = -2/3
V(D/M) = 2/3 V(D/H) + 1/3 V(D/D) = 2/3 (0) + 1/3 (2) = 2/3
V(M/D) = 2/3 V(H/D) + 1/3 V(D/D) = 2/3 (4) + 1/3 (2) = 10/3
V(M/M) = 4/9 V(H/H) + 2/9 V(H/D) + 2/9 V(D/H) + 1/9 V(D/D) = 2/3
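
These expected values can be recomputed exactly; a small sketch (mine), with `p` the weight M puts on Hawk:

```python
# Recompute the table above with exact fractions; M mixes 2/3 Hawk, 1/3 Dove.
from fractions import Fraction

V = {('H', 'H'): Fraction(-1), ('H', 'D'): Fraction(4),
     ('D', 'H'): Fraction(0),  ('D', 'D'): Fraction(2)}
p = Fraction(2, 3)  # weight on Hawk in the mixed strategy M

def vs_mixed(a):  # V(a/M) = p V(a/H) + (1-p) V(a/D)
    return p * V[(a, 'H')] + (1 - p) * V[(a, 'D')]

def mixed_vs(b):  # V(M/b) = p V(H/b) + (1-p) V(D/b)
    return p * V[('H', b)] + (1 - p) * V[('D', b)]

V_MM = p * vs_mixed('H') + (1 - p) * vs_mixed('D')
print(vs_mixed('H'), vs_mixed('D'), V_MM)  # 2/3, 2/3, 2/3
print(mixed_vs('H'), mixed_vs('D'))        # -2/3, 10/3
```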

Hawk-Dove: an example. For M to be an ESS: i) V(M/M) ≥ V(B/M) for all B; and ii) if V(M/M) = V(B/M), then V(M/B) > V(B/B).

Hawk-Dove: an example. i) V(M/M) = V(H/M) = V(D/M) = 2/3, so condition i) holds with equality; ii) V(M/H) > V(H/H): -2/3 > -1, and V(M/D) > V(D/D): 10/3 > 2. So M is an ESS.

Evolutionary Stability in IRPD? Evolutionarily Stable Strategy (ESS): a strategy is evolutionarily stable if it cannot be invaded by a mutant strategy (Maynard Smith & Price, 1973). Is D an ESS? Consider a mutant strategy called SUSPICIOUS TIT FOR TAT (STFT): STFT defects on the first round, then plays like TFT. i) Is V(D/D) > V(STFT/D)? ii) Is either V(D/D) > V(STFT/D), or V(D/STFT) > V(STFT/STFT)?

Evolutionary Stability in IRPD? Against D, STFT defects from the first round onward, so i) V(D/D) = V(STFT/D); likewise ii) V(D/STFT) = V(STFT/STFT). Neither strict inequality holds: D and STFT are "neutral mutants," and D cannot drive STFT out.

Evolutionary Stability in IRPD? Axelrod & Hamilton (1981) demonstrated that D is not an ESS, opening the way to subsequent tournament studies of the game. This is a sort of Folk Theorem for evolutionary games: in the one-shot Prisoner's Dilemma, DEFECT is strictly dominant, but in the repeated game ALWAYS DEFECT (D) can be invaded by a mutant strategy, e.g., SUSPICIOUS TIT FOR TAT (STFT). Many cooperative strategies do better than D, so they can gain a foothold and grow as a share of the population. Depending on the initial population, the equilibrium reached can exhibit any amount of cooperation. Is STFT an ESS?

Evolutionary Stability in IRPD? It can be shown that there is no ESS in the IRPD (Boyd & Lorberbaum, 1987; Lorberbaum, 1994). There can be stable polymorphisms among neutral mutants whose realized behaviors are indistinguishable from one another (this is the case, for example, for a population of C and TFT). Noise: if the system is perturbed by "noise," these behaviors become distinct, and differences in their reproductive success rates are amplified. As a result, interest has shifted from proving the existence of a solution to designing repeated-game strategies that perform well against other sophisticated strategies.

Replicator Dynamics. Consider a population of strategies competing over a niche that can maintain only a fixed number of individuals, i.e., the population's size is bounded above by the system's carrying capacity. In each generation, each strategy is matched against every other, itself, and RANDOM in pairwise games. Between generations, the strategies reproduce, where the chance of successful reproduction ("fitness") is determined by the payoffs (i.e., payoffs play the role of reproductive rates). Strategies that do better than average grow as a share of the population, and those that do worse than average eventually die out...

There is a very simple way to describe this process. Let: x(A) = the proportion of the population using strategy A in a given generation; V(A) = strategy A's tournament score; V̄ = the population's average score. Then A's population share in the next generation is:

x'(A) = x(A) V(A) / V̄
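
A minimal sketch of this update (mine; the strategy names and scores are placeholder inputs, not tournament results):

```python
def replicate(shares, scores):
    """One replicator step: x'(A) = x(A) * V(A) / V_bar.

    shares: {strategy: population share x(A)}, summing to 1
    scores: {strategy: tournament score V(A)}
    """
    v_bar = sum(shares[s] * scores[s] for s in shares)  # population average score
    return {s: shares[s] * scores[s] / v_bar for s in shares}

shares = {'TFT': 0.5, 'D': 0.5}
scores = {'TFT': 2.0, 'D': 1.5}   # assumed scores, for illustration only
print(replicate(shares, scores))  # TFT grows to ~0.571, D shrinks to ~0.429
```

In a full simulation, `scores` would be recomputed each generation from pairwise matches, since each strategy's score depends on the current population mix.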

Replicator Dynamics. For any finite set of strategies, the replicator dynamic will attain a fixed point, where population shares do not change and all surviving strategies are equally fit, i.e., V(A) = V(B) for all A, B. However, the dynamic described is population-specific. For instance, if the population consists entirely of naive cooperators (ALWAYS COOPERATE), then x(A) = x'(A) = 1, and the process is at a fixed point. To be sure, the population is in equilibrium, but only in a very weak sense: if a single D strategy were to "invade" the population, the system would be driven away from equilibrium, and C would be driven toward extinction.

Simulating Evolution. An evolutionary model includes three components: reproduction, selection, and variation. [Diagram: a population of strategies feeds into a selection mechanism (reproduction, competition) and a variation mechanism (mutation or learning, invasion), which feed back into the population.]

The Trouble with TIT FOR TAT. TIT FOR TAT is susceptible to two types of perturbations. Mutations: random Cs can invade TFT (TFT is not ESS), which in turn allows exploiters to gain a foothold. Noise: a "mistake" between a pair of TFTs induces CD, DC cycles ("mirroring" or "echo" effect). TIT FOR TAT never beats its opponent; it wins because it elicits reciprocal cooperation. It never exploits "naively" nice strategies. (See Poundstone; Casti.)

The Trouble with TIT FOR TAT. Noise, in the form of random errors in implementing or perceiving an action, is a common problem in real-world interactions. Such misunderstandings may lead "well-intentioned" cooperators into periods of alternating or mutual defection, resulting in lower tournament scores.

TFT: C C C C
TFT: C C C D   <- "mistake"

The Trouble with TIT FOR TAT. Noise, in the form of random errors in implementing or perceiving an action, is a common problem in real-world interactions. Such misunderstandings may lead "well-intentioned" cooperators into periods of alternating or mutual defection, resulting in lower tournament scores.

TFT: C C C C D C D ...
TFT: C C C D C D C ...   <- "mistake"

After the mistake the pair locks into alternating CD/DC, and the average payoff falls from R toward (T+S)/2.
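
A sketch of the echo effect (mine), using the example payoffs T=5, R=3, P=1, S=0 and a single forced error:

```python
# Two TIT FOR TATs play 200 rounds; player 1 defects by mistake in round 4.
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def tft(other_history):
    """TIT FOR TAT: cooperate first, then copy the opponent's last move."""
    return other_history[-1] if other_history else 'C'

h1, h2 = [], []
for t in range(200):
    m1, m2 = tft(h2), tft(h1)
    if t == 3:
        m1 = 'D'  # implementation error: player 1 defects by mistake
    h1.append(m1)
    h2.append(m2)

print(''.join(h1[:10]))  # CCCDCDCDCD -- the CD/DC echo after the mistake
print(''.join(h2[:10]))  # CCCCDCDCDC
avg = sum(PAYOFF[(a, b)] for a, b in zip(h1, h2)) / len(h1)
print(round(avg, 2))     # 2.52, close to (T+S)/2 = 2.5 instead of R = 3
```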

The Trouble with TIT FOR TAT. Nowak and Sigmund (1993) ran an extensive series of computer-based experiments and found that the simple learning rule PAVLOV outperformed TIT FOR TAT in the presence of noise. PAVLOV (win-stay, lose-shift): cooperate after both cooperated or both defected; otherwise defect.

The Trouble with TIT FOR TAT. PAVLOV cannot be invaded by random C; PAVLOV is an exploiter (it will "fleece a sucker" once it discovers there is no need to fear retaliation). A mistake between a pair of PAVLOVs causes only a single round of mutual defection, followed by a return to mutual cooperation.

PAV: C C C C D
PAV: C C C D D   <- "mistake"

The Trouble with TIT FOR TAT. PAVLOV cannot be invaded by random C; PAVLOV is an exploiter (it will "fleece a sucker" once it discovers there is no need to fear retaliation). A mistake between a pair of PAVLOVs causes only a single round of mutual defection, followed by a return to mutual cooperation.

PAV: C C C C D C C
PAV: C C C D D C C   <- "mistake"
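
The same single-error experiment with PAVLOV (a sketch, mine), using the slide's rule: cooperate after CC or DD, otherwise defect:

```python
# Two PAVLOVs play 8 rounds; player 1 defects by mistake in round 4.
def pavlov(own_history, other_history):
    if not own_history:
        return 'C'
    # Win-stay, lose-shift: cooperate iff both players did the same thing.
    return 'C' if own_history[-1] == other_history[-1] else 'D'

h1, h2 = [], []
for t in range(8):
    m1 = pavlov(h1, h2)
    m2 = pavlov(h2, h1)
    if t == 3:
        m1 = 'D'  # the mistake
    h1.append(m1)
    h2.append(m2)

print(''.join(h1))  # CCCDDCCC -- one round of mutual defection, then recovery
print(''.join(h2))  # CCCCDCCC
```

Unlike the TFT pair above, the two PAVLOVs are back to mutual cooperation two rounds after the error.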

Simulating Evolution. [Figure: population shares over generations for the tournament strategies; the numbers mark each strategy's position after the 1st generation, with 1 (TFT) growing to dominate. Source: Axelrod 1984, p. 51.]

Simulating Evolution. [Figure: population shares over generations for 6 RPD strategies: PAVLOV, TFT, GRIM (TRIGGER), D, R(ANDOM), and C, with noise at the 0.01 level. GTFT?]

Next Time (4/16): A Tournament; How to Promote Cooperation? Axelrod, Ch. 6-9.