Evolution and Repeated Games D. Fudenberg (Harvard) E. Maskin (IAS, Princeton)


2 Theory of repeated games: an important central model for explaining how self-interested agents can cooperate; used in economics, biology, political science, and other fields.

3 But the theory has a serious flaw: although cooperative behavior is possible, so is uncooperative behavior (and everything in between); the theory doesn't favor one behavior over another, so it doesn't make sharp predictions.

4 Evolution (biological or cultural) can promote efficiency: one might hope that uncooperative behavior will be "weeded out"; this view is expressed in Axelrod (1984).

5 Basic idea: start with a population playing the repeated-game strategy Always D. Consider a small group of mutants using Conditional C (play C until someone plays D, thereafter play D):
– it does essentially the same against Always D as Always D does against itself
– it does much better against Conditional C than Always D does
Thus Conditional C will invade Always D, and uncooperative behavior is driven out.
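The invasion argument can be checked numerically. Below is a minimal sketch, assuming the Prisoner's Dilemma stage payoffs implied by later slides ((C,C) = 2, (D,C) = 3, (C,D) = −1, (D,D) = 0) and long-run average payoffs; the function names are illustrative, not from the paper.

```python
# Stage-game payoffs for the row player, inferred from later slides:
# (C,C)=2, (D,C)=3, (C,D)=-1, (D,D)=0.
PAYOFF = {("C", "C"): 2, ("C", "D"): -1, ("D", "C"): 3, ("D", "D"): 0}

def always_d(my_hist, their_hist):
    return "D"

def conditional_c(my_hist, their_hist):
    # Play C until someone plays D, thereafter play D.
    if "D" in my_hist or "D" in their_hist:
        return "D"
    return "C"

def average_payoff(strat1, strat2, periods=1000):
    """Long-run average payoff of strat1 when playing against strat2."""
    h1, h2, total = [], [], 0
    for _ in range(periods):
        a1 = strat1(h1, h2)
        a2 = strat2(h2, h1)
        total += PAYOFF[(a1, a2)]
        h1.append(a1)
        h2.append(a2)
    return total / periods
```

With these payoffs, Conditional C earns 2 per period against itself, while Always D earns almost nothing against Conditional C (3 in the first period, then 0 forever), which is the sense in which the mutants invade.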

6 But consider ALT: alternate between C and D until the pattern is broken, thereafter play D. ALT can't be invaded by any other strategy:
– the other strategy would have to alternate too, or else it would do much worse against ALT than ALT does against itself
Thus ALT is "evolutionarily stable". But ALT is quite inefficient (average payoff 1).
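ALT's average payoff of 1 against itself can be verified directly: two copies of ALT stay in phase, so play alternates between (C, C) and (D, D). A small check, again assuming the stage payoffs implied by later slides:

```python
# Stage-game payoffs inferred from later slides.
PAYOFF = {("C", "C"): 2, ("C", "D"): -1, ("D", "C"): 3, ("D", "D"): 0}

def alt_action(t):
    # ALT alternates C, D, C, D, ... as long as the pattern is unbroken.
    return "C" if t % 2 == 0 else "D"

def alt_vs_alt_average(periods=1000):
    # Both copies of ALT start with C, so they stay in phase:
    # the joint outcomes alternate (C,C), (D,D), (C,C), ...
    return sum(PAYOFF[(alt_action(t), alt_action(t))] for t in range(periods)) / periods
```

The payoffs alternate between 2 and 0, for an average of 1, well below the fully cooperative payoff of 2.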

7 Still, ALT is highly inflexible:
– it relies on perfect alternation
– if the pattern is broken, the players get D forever
What if there is a (small) probability of a mistake in execution?

8 Consider a mutant strategy identical to ALT except that if (by mistake) the alternating pattern is broken, it signals an "intention" to cooperate by playing C in the following period:
– if the other strategy plays C too, cooperation resumes
– if the other strategy plays D, the mutant reverts to playing D, as ALT would

9 Main results in the paper (for 2-player symmetric repeated games):
(1) If s is evolutionarily stable and
– the discount rate r is small (the future is important)
– the mistake probability p is small (but p > 0)
then s is (almost) "efficient".
(2) If payoffs (v, v) are "efficient", then there exists an ES strategy s (almost) attaining (v, v), provided
– r is small
– p is small relative to r
This generalizes Fudenberg–Maskin (1990), in which r = p = 0.

10 Finite symmetric 2-player game g; payoffs are normalized so that each player's minimax value is 0.

11 A (symmetric) action pair is strongly efficient if it maximizes the sum of the two players' payoffs.

12 Repeated game: g repeated infinitely many times; h_t = history of play through period t; H = set of all histories; a repeated-game strategy s prescribes an action after each history
– assume s is finitely complex (playable by a finite computer)
In each period, with probability p, player i makes a mistake in execution
– he then chooses an action at random (equal probabilities for all actions)
– mistakes are independent across players
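The mistake technology can be sketched as a wrapper around the intended action: with probability p the realized action is drawn uniformly from the action set. A minimal illustration (the names here are not from the paper):

```python
import random

def realized_action(intended, actions=("C", "D"), p=0.01, rng=random):
    """Return the action actually played given the intended one.

    With probability p the player trembles and chooses uniformly at random
    among ALL actions (so the intended action may still come out); with
    probability 1 - p the intended action is played.
    """
    if rng.random() < p:
        return rng.choice(actions)
    return intended
```

Because the tremble draws uniformly over the whole action set, the probability of an *observable* mistake in a two-action game is p/2 per player per period; trembles are drawn independently for each player.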


14 Informally, s is evolutionarily stable (ES) if no mutant s' can invade a population with a big proportion playing s and a small proportion playing s'. Formally, s is ES w.r.t. (r, p) if, for every mutant s' and every sufficiently small invasion share, s earns a strictly higher average payoff in the invaded population than s' does. Evolutionary stability is
– expressed statically here
– but can be given a precise dynamic meaning

15 Consider a large population, with time measured in "epochs" T = 1, 2, .... The strategy state in epoch T:
– most players in the population use the current strategy
– a group of mutants (of size a) plays s', with a drawn randomly from a small interval and s' drawn randomly from the finitely complex strategies
M random drawings of pairs of players
– each pair plays the repeated game
The next epoch's strategy state = the strategy with the highest average score.

16 Theorem 1 (roughly): for the dynamic process of the previous slide, (i) if s is not ES, some mutant group eventually displaces it; (ii) if s is ES, then with high probability s persists as the population strategy.

17 Let v* denote the strongly efficient symmetric payoff. Theorem 2 (roughly): given ε > 0, there exist thresholds for r and p such that, for all smaller r and p, if s is ES w.r.t. (r, p), then the payoff to (s, s) is within ε of v*.


19 Proof: Suppose s is ES but not almost efficient; we will construct a mutant s' that can invade. Let the "worst" history be one after which the continuation payoff under (s, s) is lowest; e.g., if s = ALT, the worst history is any history in which the alternating pattern is broken.

20 Construct s' so that if h is not a continuation of the worst history, s' plays exactly as s does. After the worst history, s'
– "signals" its willingness to cooperate by playing differently from s for 1 period (assume s is a pure strategy)
– if the other player responds positively, plays strongly efficiently thereafter
– if not, plays according to s thereafter
Also after the worst history, s'
– responds positively if the other strategy has signaled, and thereafter plays strongly efficiently
– plays according to s otherwise

21 Because this history is already the worst, s' loses for only 1 period by signaling (a small loss if r is small). If p is small, the probability that s' "misreads" the other player's intention is small. Hence s' does nearly as well against s as s does against itself (even after the worst history), and s' does very well against itself (playing strongly efficiently after the worst history).

22 It remains to check how well s does against s'. By definition of the worst history, s's continuation payoff there is (ignoring the effect of p) as low as it ever gets. Moreover, after a deviation by s', the punishment is started again. Hence s does appreciably worse against s' than s' does against itself.

23 Summing up, we have shown that s' can invade, so s is not ES.

24 For the Prisoner's Dilemma, Theorem 2 implies that any ES strategy is (almost) fully cooperative when r and p are small; but it doesn't rule out punishments of arbitrary (finite) length.

25 Consider a strategy s with "cooperative" and "punishment" phases:
– in the cooperative phase, play C
– stay in the cooperative phase until one player plays D, in which case go to the punishment phase
– in the punishment phase, play D
– stay in the punishment phase for m periods (and then go back to the cooperative phase) unless at some point some player chooses C, in which case restart the punishment
For any m, such a strategy can be (almost) efficient when r and p are small.
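This two-phase strategy can be rendered as a small state machine. A sketch, with the punishment length m as a parameter; the restart-on-C clause follows the slide's description:

```python
class PunishmentStrategy:
    """Cooperative phase: play C. Any D by either player triggers an
    m-period punishment phase of mutual D; a C observed during the
    punishment restarts the full m-period count."""

    def __init__(self, m):
        self.m = m
        self.remaining = 0  # punishment periods left; 0 means cooperative phase

    def action(self):
        return "C" if self.remaining == 0 else "D"

    def observe(self, my_action, their_action):
        if self.remaining == 0:
            if "D" in (my_action, their_action):
                self.remaining = self.m      # enter punishment phase
        else:
            if "C" in (my_action, their_action):
                self.remaining = self.m      # restart punishment
            else:
                self.remaining -= 1          # count down; at 0, cooperate again
```

A usage trace with m = 2: after one observed D, the strategy prescribes D for exactly two periods of mutual D, then returns to C.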

26 Theorem 2 can be sharpened for the Prisoner's Dilemma: for r and p small enough, an ES strategy s cannot entail a punishment lasting more than a bounded number of periods. Proof: very similar to that of Theorem 2.

27 For r and p too big, an ES strategy s may not be "efficient": with heavy discounting or frequent mistakes, even fully cooperative strategies in the Prisoner's Dilemma generate payoffs well below the efficient level.

28 Theorem 3 (roughly): Let (v, v) be efficient payoffs. For all ε > 0, if r is small and p is small relative to r, there exists an ES strategy s attaining payoffs within ε of (v, v).

29 Proof: Construct s so that along the equilibrium path of (s, s), payoffs are (approximately) (v, v), and punishments are nearly strongly efficient:
– the deviating player (say 1) is minimaxed long enough to wipe out the gain from deviating
– thereafter play goes to a strongly efficient point
– overall payoffs after a deviation are therefore nearly efficient
If r and p are small, (s, s) is a subgame perfect equilibrium.

30 In the Prisoner's Dilemma, consider the strategy s that
– plays C in the first period
– thereafter plays C if and only if either both players played C the previous period or neither did
This strategy s
– is efficient
– entails punishments that are as short as possible
– is a modification of Tit-for-Tat (C in the first period; thereafter, do what the other player did the previous period)
Tit-for-Tat itself is not ES:
– if a mistake (D, C) occurs, there is a wave of alternating punishments (C, D), (D, C), (C, D), ... until another mistake is made
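The contrast between Tit-for-Tat and its modification can be simulated. The sketch below flips one action to D to model a single execution mistake; the wave of alternating punishments under Tit-for-Tat, and the quick recovery under the modified rule, both show up in the outcome paths:

```python
def tit_for_tat(last):
    # last = (my_last_action, their_last_action), or None in the first period.
    return "C" if last is None else last[1]

def modified_tft(last):
    # Play C iff both players took the same action last period.
    return "C" if last is None or last[0] == last[1] else "D"

def play(strategy, periods=6, mistake_period=1):
    """Both players use `strategy`; player 1's action is forced to D in
    `mistake_period` to model a single execution mistake."""
    outcomes, last1, last2 = [], None, None
    for t in range(periods):
        a1, a2 = strategy(last1), strategy(last2)
        if t == mistake_period:
            a1 = "D"
        outcomes.append((a1, a2))
        last1, last2 = (a1, a2), (a2, a1)
    return outcomes
```

Tit-for-Tat produces (C,C), (D,C), (C,D), (D,C), (C,D), ... with the mistake echoing forever, while the modified rule produces (C,C), (D,C), (D,D), (C,C), (C,C), ...: one period of mutual D and then full cooperation again.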

31 Let s = play d as long as, in all past periods, either
– both players played d, or
– neither played d
If a single player deviates from d:
– henceforth, that player plays b
– the other player plays a
s is ES even though inefficient:
– any attempt to improve on efficiency is punished forever
– a mutant can't invade during the punishment, because the punishment itself is efficient

32 Consider a potential invader s'. For any history h, s' cannot do better against s than s does against itself, since (s, s) is an equilibrium. For s' to invade, it would have to do strictly better than s somewhere. Claim: this would require a history h' involving a deviation from the equilibrium path of (s, s). The only other possibility is that
– s' differs from s on the equilibrium path
– but then s' is punished forever, and so does strictly worse
Hence s' cannot invade, and s is ES.

33 For Theorem 3 to hold, p must be small relative to r. Consider modified Tit-for-Tat against itself (play C if and only if both players took the same action last period). Each mistake causes an expected loss of
2 − (½ · 3 + ½ · (−1)) = 1 in the period of the mistake, and
2 − 0 = 2 in the following period (when both players play D),
so the overall expected loss from mistakes grows in proportion to p. By contrast, a mutant strategy that signals, etc., and doesn't punish at all against itself loses far less from mistakes, so if r is small enough relative to p, the mutant can invade.
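The per-mistake loss arithmetic on this slide can be checked directly, using the stage payoffs (C,C) = 2, (D,C) = 3, (C,D) = −1, (D,D) = 0:

```python
# Period of the mistake: instead of (C,C) worth 2, the mistaken player
# earns 3 (playing D against C) or the victim earns -1 (playing C
# against D), each case with probability 1/2.
loss_period_1 = 2 - (0.5 * 3 + 0.5 * (-1))   # = 1
# The following period, modified Tit-for-Tat has both players play D:
loss_period_2 = 2 - 0                         # = 2
per_mistake_loss = loss_period_1 + loss_period_2  # = 3
```

So each mistake costs about 3 in total stage payoff before cooperation resumes, which is why the aggregate loss scales with the mistake rate p.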