
1 Unit III: The Evolution of Cooperation Can Selfishness Save the Environment? Repeated Games: the Folk Theorem Evolutionary Games A Tournament How to Promote Cooperation/Unit Review 4/14 7/28 4/6

2 Repeated Games Some Questions: What happens when a game is repeated? Can threats and promises about the future influence behavior in the present? Cheap talk. Finitely repeated games: backward induction. Indefinitely repeated games: trigger strategies.

3 The Folk Theorem

[Diagram: the set of feasible payoffs, with vertices (R,R), (T,S), (S,T), and (P,P).]

Theorem: Any payoff that Pareto-dominates the one-shot NE can be supported in a SPNE of the repeated game, if the discount parameter is sufficiently high.

4 The Folk Theorem

[Diagram: the same feasible payoff set.]

In other words, in the repeated game, if the future matters “enough,” i.e., (δ > δ*), there are zillions of equilibria!
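The critical discount factor δ* can be made concrete for the GRIM trigger strategy: cooperating forever is worth R/(1−δ), while a one-shot deviation yields T today and P thereafter, T + δP/(1−δ). Cooperation is sustainable iff δ ≥ (T−R)/(T−P). A minimal sketch (the standard PD payoffs T=5, R=3, P=1 are assumed for illustration):

```python
def critical_delta(T, R, P):
    """Smallest discount factor at which a GRIM trigger sustains cooperation:
    R/(1-d) >= T + d*P/(1-d)  <=>  d >= (T-R)/(T-P)."""
    return (T - R) / (T - P)

def cooperation_sustainable(delta, T, R, P):
    """True if the discounted value of cooperating beats the best deviation."""
    return delta >= critical_delta(T, R, P)

d_star = critical_delta(5, 3, 1)   # 0.5 for these payoffs
```

So with T=5, R=3, P=1, any δ above one half makes mutual cooperation an equilibrium outcome of the repeated game.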

5 The Folk Theorem The theorem tells us that in general, repeated games give rise to a very large set of Nash equilibria. In the repeated PD, these are Pareto-rankable, i.e., some are efficient and some are not. In this context, evolution can be seen as a process that selects for repeated game strategies with efficient payoffs. “Survival of the Fittest”

6 Evolutionary Games Fifteen months after I had begun my systematic enquiry, I happened to read for amusement ‘Malthus on Population’... It at once struck me that... favorable variations would tend to be preserved, and unfavorable ones to be destroyed. Here then I had at last got a theory by which to work. Charles Darwin

7 Evolutionary Games Evolutionary Stability (ESS) Hawk-Dove: an example The Replicator Dynamic The Trouble with TIT FOR TAT Designing Repeated Game Strategies Finite Automata

8 Evolutionary Games Biological Evolution: Under the pressure of natural selection, any population (capable of reproduction and variation) will evolve so as to become better adapted to its environment, i.e., will develop in the direction of increasing “fitness.” Economic Evolution: Firms that adopt efficient “routines” will survive, expand, and multiply; whereas others will be “weeded out” (Nelson and Winter, 1982).

9 Evolutionary Stability Evolutionarily Stable Strategy (ESS): A strategy is evolutionarily stable if it cannot be invaded by a mutant strategy. (Maynard Smith & Price, 1973) A strategy, A, is ESS if, for all B: i) V(A/A) ≥ V(B/A); and ii) either V(A/A) > V(B/A) or V(A/B) > V(B/B).

10 Hawk-Dove: an example Imagine a population of Hawks and Doves competing over a scarce resource (say, food in a given area). The share of each type in the population changes according to the payoff matrix, so that payoffs determine the number of offspring left to the next generation. v = value of the resource; c = cost of fighting. H/D: Hawk gets the resource; Dove flees: (v, 0). D/D: share the resource: (v/2, v/2). H/H: share the resource less the cost of fighting: ((v-c)/2, (v-c)/2). (See Hargreaves Heap and Varoufakis: 195-214; Casti: 71-75.)

11 Hawk-Dove: an example

        H                   D
H   (v-c)/2, (v-c)/2    v, 0
D   0, v                v/2, v/2

v = value of resource; c = cost of fighting

12 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

v = value of resource = 4; c = cost of fighting = 6

13 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)} — the pure-strategy equilibria are unstable; the mixed NE is stable. The mixed NE corresponds to a population that is 2/3 Hawks and 1/3 Doves.
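The mixed-equilibrium share of Hawks follows from equating the expected payoffs of H and D against a population with Hawk share p: p(v−c)/2 + (1−p)v = (1−p)v/2, which solves to p* = v/c. A quick sketch (v=4, c=6 from the slide; the function names are mine):

```python
def hawk_share(v, c):
    """Mixed-equilibrium share of Hawks, p* = v/c (valid when c > v)."""
    assert c > v > 0, "Hawk-Dove requires fighting to cost more than the prize"
    return v / c

def expected_payoff(strategy, p, v, c):
    """Expected payoff of 'H' or 'D' against a population with Hawk share p."""
    if strategy == 'H':
        return p * (v - c) / 2 + (1 - p) * v
    return (1 - p) * v / 2

p_star = hawk_share(4, 6)   # 2/3, matching the slide
# At p*, Hawks and Doves earn the same expected payoff (2/3 here).
```

Note that at any p below p*, Hawks do strictly better and their share grows; above p*, Doves do better, which is why the mixed population state is the stable one.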

14 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}

Is any strategy ESS?

15 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}

A strategy, A, is ESS if, for all B: i) V(A/A) ≥ V(B/A); and ii) either V(A/A) > V(B/A) or V(A/B) > V(B/B).

16 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}

A strategy, A, is ESS only if i) V(A/A) ≥ V(B/A), for all B. In other words, to be ESS, a strategy must be a NE with itself.

17 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}

A strategy, A, is ESS only if i) V(A/A) ≥ V(B/A), for all B. In other words, to be ESS, a strategy must be a NE with itself. Neither H nor D is ESS (for these payoffs).

18 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}

A strategy, A, is ESS if, for all B: i) V(A/A) ≥ V(B/A); and ii) either V(A/A) > V(B/A) or V(A/B) > V(B/B). What about the mixed NE strategy?

19 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}. Let M be the mixed strategy 2/3 Hawk, 1/3 Dove:

V(H/H) = -1    V(H/D) = 4    V(D/H) = 0    V(D/D) = 2
V(H/M) = 2/3 V(H/H) + 1/3 V(H/D) = 2/3
V(M/H) = 2/3 V(H/H) + 1/3 V(D/H) = -2/3
V(D/M) = 2/3 V(D/H) + 1/3 V(D/D) = 2/3
V(M/D) = 2/3 V(H/D) + 1/3 V(D/D) = 10/3
V(M/M) = 4/9 V(H/H) + 2/9 V(H/D) + 2/9 V(D/H) + 1/9 V(D/D) = 2/3

20 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}. Let M be the mixed strategy 2/3 Hawk, 1/3 Dove:

V(H/H) = -1    V(H/D) = 4    V(D/H) = 0    V(D/D) = 2
V(H/M) = 2/3 (-1) + 1/3 (4) = 2/3
V(M/H) = 2/3 (-1) + 1/3 (0) = -2/3
V(D/M) = 2/3 V(D/H) + 1/3 V(D/D) = 2/3
V(M/D) = 2/3 V(H/D) + 1/3 V(D/D) = 10/3
V(M/M) = 4/9 V(H/H) + 2/9 V(H/D) + 2/9 V(D/H) + 1/9 V(D/D) = 2/3

21 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}. Let M be the mixed strategy 2/3 Hawk, 1/3 Dove:

V(H/H) = -1    V(H/D) = 4    V(D/H) = 0    V(D/D) = 2
V(H/M) = 2/3 V(H/H) + 1/3 V(H/D) = 2/3
V(M/H) = 2/3 V(H/H) + 1/3 V(D/H) = -2/3
V(D/M) = 2/3 V(D/H) + 1/3 V(D/D) = 2/3
V(M/D) = 2/3 V(H/D) + 1/3 V(D/D) = 10/3
V(M/M) = 4/9 V(H/H) + 2/9 V(H/D) + 2/9 V(D/H) + 1/9 V(D/D) = 2/3

22 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}

To be an ESS: i) V(M/M) ≥ V(B/M), for all B; ii) either V(M/M) > V(B/M) or V(M/B) > V(B/B), for all B.

23 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}

To be an ESS: i) holds with equality: V(M/M) = V(H/M) = V(D/M) = 2/3; so condition ii) must do the work: either V(M/M) > V(B/M) or V(M/B) > V(B/B), for all B.

24 Hawk-Dove: an example

        H         D
H   -1, -1     4, 0
D    0, 4      2, 2

NE = {(1,0); (0,1); (2/3,2/3)}

To be an ESS: i) V(M/M) = V(H/M) = V(D/M) = 2/3; ii) either V(M/M) > V(B/M) or V(M/B) > V(B/B), for all B. Checking: V(M/H) > V(H/H), since -2/3 > -1; and V(M/D) > V(D/D), since 10/3 > 2. So M is ESS.
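These two conditions can be checked mechanically. A sketch (payoffs from the slide; strategies are represented by their Hawk probability, and the function names are mine):

```python
# Payoff matrix from the slide: v = 4, c = 6.
PAYOFF = {('H', 'H'): -1, ('H', 'D'): 4, ('D', 'H'): 0, ('D', 'D'): 2}

def V(a, b):
    """Expected payoff of playing Hawk with prob a against Hawk-prob b."""
    return sum(PAYOFF[(x, y)]
               * (a if x == 'H' else 1 - a)
               * (b if y == 'H' else 1 - b)
               for x in 'HD' for y in 'HD')

def is_ess(a, rivals, eps=1e-9):
    """Maynard Smith conditions: for every mutant b != a, either
    V(a/a) > V(b/a), or V(a/a) = V(b/a) and V(a/b) > V(b/b)."""
    return all(V(a, a) > V(b, a) + eps
               or (abs(V(a, a) - V(b, a)) < eps and V(a, b) > V(b, b) + eps)
               for b in rivals if abs(b - a) > eps)

M = 2 / 3   # the mixed NE: 2/3 Hawk, 1/3 Dove
# is_ess(M, [0.0, 0.5, 1.0]) holds; is_ess(1.0, ...) and is_ess(0.0, ...) fail.
```

Against M every strategy earns the same 2/3, so stability rests entirely on condition ii), exactly as the slide shows.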

25 Evolutionary Stability in IRPD? Evolutionarily Stable Strategy (ESS): A strategy is evolutionarily stable if it cannot be invaded by a mutant strategy. (Maynard Smith & Price, 1973) Consider a mutant strategy, e.g., SUSPICIOUS TIT FOR TAT (STFT): STFT defects on the first round, then plays like TFT. Is D an ESS? i) V(D/D) > V(STFT/D)? ii) either V(D/D) > V(STFT/D) or V(D/STFT) > V(STFT/STFT)?

26 Evolutionary Stability in IRPD? Evolutionarily Stable Strategy (ESS): A strategy is evolutionarily stable if it cannot be invaded by a mutant strategy. (Maynard Smith & Price, 1973) Consider a mutant strategy, e.g., SUSPICIOUS TIT FOR TAT (STFT): STFT defects on the first round, then plays like TFT. Is D an ESS? No: i) V(D/D) = V(STFT/D), and ii) V(D/STFT) = V(STFT/STFT), so neither strict condition holds. D and STFT are “neutral mutants.”

27 Evolutionary Stability in IRPD? Axelrod & Hamilton (1981) demonstrated that D is not an ESS, opening the way to subsequent tournament studies of the game. This is a sort of Folk Theorem for evolutionary games: in the one-shot Prisoner’s Dilemma, DEFECT is strictly dominant, but in the repeated game ALWAYS DEFECT (D) can be invaded by a mutant strategy, e.g., SUSPICIOUS TIT FOR TAT (STFT). Many cooperative strategies do better than D, so they can gain a foothold and grow as a share of the population. Depending on the initial population, the equilibrium reached can exhibit any amount of cooperation. Is STFT an ESS?

28 Evolutionary Stability in IRPD? It can be shown that there is no ESS in the IRPD (Boyd & Lorberbaum, 1987; Lorberbaum, 1994). There can be stable polymorphisms among neutral mutants, whose realized behaviors are indistinguishable from one another (this is the case, for example, for a population of C and TFT). Noise: if the system is perturbed by “noise,” these behaviors become distinct, and differences in their reproductive success rates are amplified. As a result, interest has shifted from proving the existence of a solution to designing repeated game strategies that perform well against other sophisticated strategies.

29 Replicator Dynamics Consider a population of strategies competing over a niche that can only maintain a fixed number of individuals, i.e., the population’s size is upwardly bounded by the system’s carrying capacity. In each generation, each strategy is matched against every other, itself, and RANDOM in pairwise games. Between generations, the strategies reproduce, where the chance of successful reproduction (“fitness”) is determined by the payoffs (i.e., payoffs play the role of reproductive rates). Then, strategies that do better than average will grow as a share of the population and those that do worse than average will eventually die out...

30 Replicator Dynamics There is a very simple way to describe this process. Let: x(A) = the proportion of the population using strategy A in a given generation; V(A) = strategy A’s tournament score; V̄ = the population’s average score. Then A’s population share in the next generation is:

x′(A) = x(A) · V(A) / V̄

31 Replicator Dynamics For any finite set of strategies, the replicator dynamic will attain a fixed point, where population shares do not change and all strategies are equally fit, i.e., V(A) = V(B), for all B. However, the dynamic described is population-specific. For instance, if the population consists entirely of naive cooperators (ALWAYS COOPERATE), then x(C) = x′(C) = 1, and the process is at a fixed point. To be sure, the population is in equilibrium, but only in a very weak sense. For if a single D strategy were to “invade” the population, the system would be driven away from equilibrium, and C would be driven toward extinction.
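The update rule x′(A) = x(A)·V(A)/V̄ is easy to iterate directly. A minimal sketch (the one-shot PD payoffs T=5, R=3, P=1, S=0 are assumed), showing that an all-C population is a fixed point that a 1% D invasion destroys:

```python
# One-shot PD payoffs: (my move, opponent's move) -> my payoff.
PD = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def replicator_step(shares):
    """x'(A) = x(A) * V(A) / Vbar, where V(A) is A's expected score
    against the current population mix and Vbar is the population average."""
    score = {a: sum(PD[(a, b)] * xb for b, xb in shares.items())
             for a in shares}
    vbar = sum(shares[a] * score[a] for a in shares)
    return {a: shares[a] * score[a] / vbar for a in shares}

# All-C is a (weak) fixed point: one step leaves the shares unchanged.
x = replicator_step({'C': 1.0, 'D': 0.0})

# But a 1% D invasion grows until C is driven toward extinction.
x = {'C': 0.99, 'D': 0.01}
for _ in range(200):
    x = replicator_step(x)
# Now x['D'] is close to 1.
```

Because D strictly outscores C against every population mix in the one-shot game, its share rises monotonically, which is the "very weak sense" of equilibrium the slide describes.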

32 Simulating Evolution An evolutionary model includes three components: reproduction, selection, and variation. [Diagram: a population of strategies subject to a selection mechanism (reproduction, competition, invasion) and a variation mechanism (mutation or learning).]

33 The Trouble with TIT FOR TAT TIT FOR TAT is susceptible to two types of perturbation: Mutations: random Cs can invade TFT (TFT is not ESS), which in turn allows exploiters to gain a foothold. Noise: a “mistake” between a pair of TFTs induces CD, DC cycles (“mirroring” or “echo” effect). TIT FOR TAT never beats its opponent; it wins because it elicits reciprocal cooperation. It never exploits “naively” nice strategies. (See Poundstone: 242-248; Casti: 76-84.)

34 The Trouble with TIT FOR TAT Noise in the form of random errors in implementing or perceiving an action is a common problem in real-world interactions. Such misunderstandings may lead “well-intentioned” cooperators into periods of alternating or mutual defection, resulting in lower tournament scores.

TFT: C C C C
TFT: C C C D   ← “mistake”

35 The Trouble with TIT FOR TAT Noise in the form of random errors in implementing or perceiving an action is a common problem in real-world interactions. Such misunderstandings may lead “well-intentioned” cooperators into periods of alternating or mutual defection, resulting in lower tournament scores.

TFT: C C C C D C D …
TFT: C C C D C D C …   ← “mistake”

Avg payoff falls from R to (T+S)/2.

36 The Trouble with TIT FOR TAT Nowak and Sigmund (1993) ran an extensive series of computer-based experiments and found that the simple learning rule PAVLOV outperformed TIT FOR TAT in the presence of noise. PAVLOV (win-stay, lose-shift): cooperate after both cooperated or both defected; otherwise defect.

37 The Trouble with TIT FOR TAT PAVLOV cannot be invaded by random C; PAVLOV is an exploiter (it will “fleece a sucker” once it discovers there is no need to fear retaliation). A mistake between a pair of PAVLOVs causes only a single round of mutual defection followed by a return to mutual cooperation.

PAV: C C C C D C C
PAV: C C C D D C C   ← “mistake”
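The echo and recovery patterns on the last few slides can be reproduced with a deterministic simulation in which one player's move is flipped once (the memory-1 encodings of TFT and PAVLOV below are mine):

```python
def tft(my_last, opp_last):
    """TIT FOR TAT: copy the opponent's last move; cooperate first."""
    return opp_last if opp_last else 'C'

def pavlov(my_last, opp_last):
    """PAVLOV (win-stay, lose-shift): cooperate after CC or DD."""
    if my_last is None:
        return 'C'
    return 'C' if my_last == opp_last else 'D'

def play(strategy, rounds=8, flip_round=3):
    """Self-play with a single implementation error by player 1."""
    h1, h2 = [], []
    for t in range(rounds):
        m1 = strategy(h1[-1] if h1 else None, h2[-1] if h2 else None)
        m2 = strategy(h2[-1] if h2 else None, h1[-1] if h1 else None)
        if t == flip_round:
            m1 = 'D' if m1 == 'C' else 'C'   # the "mistake"
        h1.append(m1); h2.append(m2)
    return ''.join(h1), ''.join(h2)

# TFT vs TFT: one mistake starts a persistent CD/DC echo:
#   play(tft)    -> ('CCCDCDCD', 'CCCCDCDC')
# PAVLOV vs PAVLOV: one round of mutual defection, then recovery:
#   play(pavlov) -> ('CCCDDCCC', 'CCCCDCCC')
```

The contrast is exactly the slide's point: TFT mirrors the error forever, while PAVLOV's lose-shift rule steers both players back to mutual cooperation after one DD round.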

38 Simulating Evolution [Figure: population shares (0.02–0.14) of the tournament strategies over 800 generations; numbers indicate each strategy’s position after the 1st generation; strategy 1 (TFT) rises to the top. Source: Axelrod 1984, p. 51.]

39 Simulating Evolution [Figure: population shares (0.00–0.50) over generations for 6 RPD strategies — PAV, TFT, GRIM (TRIGGER), D, R (RANDOM), and C — with noise at the 0.01 level. GTFT?]

40 Bounded Rationality In the Repeated Prisoner’s Dilemma, it has been suggested that “uncooperative behavior is the result of ‘unbounded rationality’, i.e., the assumed availability of unlimited reasoning and computational resources to the players” (Papadimitriou, 1992: 122). If players are boundedly rational, on the other hand, the cooperative outcome may emerge as the result of a “muddling” process: they reason inductively and adapt (imitate or learn) locally superior strategies. Thus, not only is bounded rationality a more “realistic” approach, it may also solve some deep analytical problems, e.g., the resolution of finite-horizon paradoxes.

41 Tournament Assignment Design a strategy to play an Evolutionary Prisoner’s Dilemma Tournament. Entries will meet in a round robin tournament, with 1% noise (i.e., for each intended choice there is a 1% chance that the opposite choice will be implemented). Games will last at least 1000 repetitions (each generation), and after each generation, population shares will be adjusted according to the replicator dynamic, so that strategies that do better than average will grow as a share of the population whereas others will be driven to extinction. The winner or winners will be those strategies that survive after at least 10,000 generations.

42 Designing Repeated Game Strategies Imagine a very simple decision making machine playing a repeated game. The machine has very little information at the start of the game: no knowledge of the payoffs or “priors” over the opponent’s behavior. It merely makes a choice, receives a payoff, then adapts its behavior, and so on. The machine, though very simple, is able to implement a strategy against any possible opponent, i.e., it “knows what to do” in any possible situation of the game.

43 Designing Repeated Game Strategies A repeated game strategy is a map from a history to an action. A history is all the actions in the game thus far:

            …  T-3 T-2 T-1 T0
Player 1:  C C C C D C C
Player 2:  C C C D D C D

History at time T0 → ?

44 Designing Repeated Game Strategies A repeated game strategy is a map from a history to an action. A history is all the actions in the game thus far, subject to the constraint of a finite memory:

            …  T-3 T-2 T-1 T0
Player 1:  C C C C D C C
Player 2:  C C C D D C C

History of memory-4 → ?

45 Designing Repeated Game Strategies TIT FOR TAT is a remarkably simple repeated game strategy. It merely requires recall of what happened in the last round (memory-1).

            …  T-3 T-2 T-1 T0
Player 1:  C C C C D D C
Player 2:  C C C D D C D

History of memory-1 → ?
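A memory-1 strategy like TIT FOR TAT can be written as a simple lookup table from the opponent's last move to an action (this encoding is mine, for illustration):

```python
# TIT FOR TAT as a memory-1 map: opponent's last move -> my action.
# None stands for "no history yet" (the first round).
TFT_MEMORY1 = {None: 'C', 'C': 'C', 'D': 'D'}

def next_move(strategy, opp_last):
    """Look up the action a memory-1 strategy plays after opp_last."""
    return strategy[opp_last]
```

A memory-k strategy would instead key the table on the last k rounds of joint play, so the table grows exponentially in k; memory-1 strategies need only three entries.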

46 Finite Automata A FINITE AUTOMATON (FA) is a mathematical representation of a simple decision-making process. An FA is completely described by: a finite set of internal states; an initial state; an output function; and a transition function. The output function determines an action, C or D, in each state. The transition function determines how the FA changes states in response to the inputs it receives (e.g., the actions of other FA). (Rubinstein, “Finite Automata Play the Repeated Prisoner’s Dilemma,” JET, 1986)

47 Finite Automata FA will implement a strategy against any possible opponent, i.e., they “know what to do” in any possible situation of the game. FA meet in 2-player repeated games and make a move in each round (either C or D). Depending upon the outcome of that round, they “decide” what to play on the next round, and so on. FA are very simple: they have no knowledge of the payoffs, no priors over the opponent’s behavior, and no deductive ability. They simply read and react to what happens. Nonetheless, they are capable of a crude form of “learning” — they receive payoffs that reinforce certain behaviors and “punish” others.

48 Finite Automata [Diagram: “TIT FOR TAT” as a two-state automaton: a C-state and a D-state, starting in C; each state plays its label and transitions to the state matching the opponent’s last move.]

49 Finite Automata [Diagram: an automaton implementing “TIT FOR TWO TATS,” which defects only after two consecutive defections by the opponent.]

50 Finite Automata Some examples: [Diagrams: automata for “ALWAYS DEFECT,” “TIT FOR TAT,” “GRIM (TRIGGER),” “PAVLOV,” and “M5.”]
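The four components listed on the earlier slide (states, initial state, output function, transition function) map directly onto a small data structure. A sketch (this encoding is mine, not the course's fa tool format):

```python
# A finite automaton: initial state, output per state, and a transition
# table keyed by (state, opponent's move).
TFT = {
    'init': 'c',
    'out': {'c': 'C', 'd': 'D'},
    'next': {('c', 'C'): 'c', ('c', 'D'): 'd',
             ('d', 'C'): 'c', ('d', 'D'): 'd'},
}

GRIM = {  # GRIM (TRIGGER): defect forever after the first defection.
    'init': 'c',
    'out': {'c': 'C', 'd': 'D'},
    'next': {('c', 'C'): 'c', ('c', 'D'): 'd',
             ('d', 'C'): 'd', ('d', 'D'): 'd'},
}

ALLD = {  # ALWAYS DEFECT: one state, never leaves it.
    'init': 'd',
    'out': {'d': 'D'},
    'next': {('d', 'C'): 'd', ('d', 'D'): 'd'},
}

def run(fa1, fa2, rounds):
    """Play two automata against each other; return the two move strings."""
    s1, s2 = fa1['init'], fa2['init']
    h1, h2 = [], []
    for _ in range(rounds):
        m1, m2 = fa1['out'][s1], fa2['out'][s2]
        h1.append(m1); h2.append(m2)
        s1 = fa1['next'][(s1, m2)]   # each FA reads the other's move
        s2 = fa2['next'][(s2, m1)]
    return ''.join(h1), ''.join(h2)
```

For example, run(TFT, GRIM, 5) yields mutual cooperation throughout, while run(ALLD, TFT, 4) shows TFT cooperating once and then defecting ever after.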

51 Calculating Automata Payoffs Time-average payoffs can be calculated because any pair of FA will achieve cycles, since each FA takes as input only the actions in the previous period (i.e., it is “Markovian”). For example, consider the following pair of FA: [Diagrams: “PAVLOV” and “M5.”]

52 Calculating Automata Payoffs [Diagrams: “PAVLOV” and “M5.”]

PAVLOV: C
M5:     D

53 Calculating Automata Payoffs [Diagrams: “PAVLOV” and “M5.”]

PAVLOV: C D
M5:     D C

54 Calculating Automata Payoffs [Diagrams: “PAVLOV” and “M5.”]

PAVLOV:  C D D C D D C D …   payoffs: 0 5 1 0 5 1 0 5   AVG = 2
M5:      D C D D C D D C …   payoffs: 5 0 1 5 0 1 5 0   AVG = 2

(Play settles into a repeating three-round cycle.)
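Because each automaton's next state depends only on the current joint state, the joint state sequence must eventually repeat, and the time-average payoff is the average over that cycle. A sketch (the dict encoding of automata is mine; M5's transition table is not recoverable from the slide, so PAVLOV vs ALWAYS DEFECT is used for illustration):

```python
PD = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

PAVLOV = {  # win-stay, lose-shift as a two-state automaton
    'init': 'c',
    'out': {'c': 'C', 'd': 'D'},
    'next': {('c', 'C'): 'c', ('c', 'D'): 'd',   # CC: stay; CD: shift to D
             ('d', 'C'): 'd', ('d', 'D'): 'c'},  # DC: stay; DD: shift to C
}

ALLD = {'init': 'd', 'out': {'d': 'D'},
        'next': {('d', 'C'): 'd', ('d', 'D'): 'd'}}

def time_average(fa1, fa2):
    """Run until the joint state pair repeats, then average payoffs
    over the cycle (the pre-cycle transient is discarded)."""
    s1, s2 = fa1['init'], fa2['init']
    seen = {}   # joint state -> index of first visit
    payoffs = []
    while (s1, s2) not in seen:
        seen[(s1, s2)] = len(payoffs)
        m1, m2 = fa1['out'][s1], fa2['out'][s2]
        payoffs.append((PD[(m1, m2)], PD[(m2, m1)]))
        s1, s2 = fa1['next'][(s1, m2)], fa2['next'][(s2, m1)]
    cycle = payoffs[seen[(s1, s2)]:]
    n = len(cycle)
    return sum(p for p, _ in cycle) / n, sum(q for _, q in cycle) / n
```

Here PAVLOV against ALWAYS DEFECT falls into a two-round CD/DD cycle, averaging 0.5 to 3.0, while PAVLOV against itself cooperates every round for an average of 3 each.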

55 Tournament Assignment To design your strategy, access the programs through your fas Unix account. The Finite Automaton Creation Tool (fa) will prompt you to create a finite automaton to implement your strategy. Select the number of internal states, designate the initial state, and define the output and transition functions, which together determine how the automaton “behaves.” The program also allows you to specify probabilistic output and transition functions. Simple probabilistic strategies such as GENEROUS TIT FOR TAT have been shown to perform particularly well in noisy environments, because they avoid the costly sequences of alternating defections that undermine sustained cooperation.

56 Tournament Assignment Some examples: [Diagrams: automata for “ALWAYS DEFECT,” “TIT FOR TAT,” and “GENEROUS PAVLOV” (note the 0.9-probability transition).] A number of test runs will be held and the results will be distributed to the class. You can revise your strategy as often as you like before the final submission date. You can also create your own tournament environment and test various designs before submitting. Entries must be submitted by 5pm, Friday, May 6.

57 Computer Instructions: Creating your automaton To create a finite automaton (fa), you need to run the fa creation program. Log into your Unix account via an ice server and at the ice% prompt, type:

~neugebor/simulation/fa

From there, simply follow the instructions provided. Use your user name as the name for the fa. If anything goes wrong, simply press ctrl-c and start over.

58 Computer Instructions: Creating your automaton The program prompts the user to specify the number of states in the automaton, with an upper limit of 50. For each state, the program asks you to choose an action (cooperate or defect) and, in response to cooperate (defect), the state to transition to. Finally, the program asks you to specify the initial state. The program also allows the user to specify probabilistic outputs and transitions.

59 Computer Instructions: Submitting your automaton After creating the fa, submit it by typing:

cp username.fa ~neugebor/ece1040.11
chmod 744 ~neugebor/ece1040.11/username.fa

where username is your user name. You may resubmit as often as you like before the submission deadline.

60 Computer Instructions: Testing your automaton You may wish to test your fa before submitting it. You can do this by running sample tournaments with different fa’s you’ve created. To run the tournament program, you must first copy it into your own account:

mkdir simulation
cp ~neugebor/simulation/* simulation

To change into the directory with the tournament program, type:

cd simulation

Then, to run the tournament, type:

./tournament

61 Computer Instructions: Testing your automaton Follow the instructions provided. Note that running a tournament with many fa’s can be computationally intensive and may take a long time to complete. Use your favorite text editor to view the results of the tournament (“less” is an easy option if you are unfamiliar with Unix; type “less textfilename” to open a text file). To create extra automata to test in your tournament, type:

./fa

Name each fa whatever you want by entering any name you wish to use instead of your user name. Initially, six different kinds of fa’s are in the directory: D, C, TFT, GRIM, PAVLOV, and RANDOM. Experiment with these and others as you like.

62 Next Time (4/13) How to Promote Cooperation? Axelrod, Ch. 6-9: 104-191.

