Robert Axelrod’s Tournaments

Presentation transcript:

Robert Axelrod’s Tournaments
As reported in:
- Axelrod, Robert. 1980a. “Effective Choice in the Prisoner’s Dilemma.” Journal of Conflict Resolution 24.
- Axelrod, Robert. 1980b. “More Effective Choice in the Prisoner’s Dilemma.” Journal of Conflict Resolution 24 (3).
- Axelrod, Robert. The Evolution of Cooperation.

Tournament Num. 1 (1980)
- non-zero-sum setting with a given payoff matrix (R=3, T=5, S=0, P=1)
- round-robin tournament (each entry plays all other entrants, its own twin, and RANDOM)
- each entrant was told to write a program that selects C or D on every move and may use the history of the game so far in making that decision
- entrants were sent copies of a preliminary tournament in which TFT scored second, so TFT was known to be a powerful competitor; they were also told that RANDOM was somewhere in the competition; as a result, entrants tried to improve on the TFT principle
- known number of moves per game: 200
- the entire round robin was run 5 times, for a total of 120,000 moves and 240,000 choices
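The totals on this slide can be checked with a little arithmetic. The sketch below assumes 15 rules in the round robin (the 14 entries plus RANDOM) and that every rule, RANDOM included, plays one game against its own twin. The function names are illustrative and are not taken from Axelrod's programs.

```python
from itertools import combinations

# Payoffs from the slide: R=3 (mutual C), T=5 (temptation),
# S=0 (sucker), P=1 (mutual D).
R, T, S, P = 3, 5, 0, 1

def payoff(mine, theirs):
    """Return (my points, their points) for one move; moves are 'C' or 'D'."""
    table = {
        ("C", "C"): (R, R),
        ("C", "D"): (S, T),
        ("D", "C"): (T, S),
        ("D", "D"): (P, P),
    }
    return table[(mine, theirs)]

def tournament_size(n_rules=15, moves_per_game=200, repetitions=5):
    """Count games per round robin, total moves, and total individual choices."""
    # Each rule meets every other rule once, plus one game against its twin
    # (assumption: RANDOM is the 15th rule and also gets a twin game).
    games = len(list(combinations(range(n_rules), 2))) + n_rules
    moves = games * moves_per_game * repetitions
    choices = 2 * moves  # two players choose on every move
    return games, moves, choices

print(tournament_size())  # (120, 120000, 240000)
```

Under these assumptions the count reproduces the slide's 120,000 moves and 240,000 choices.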

14 Entrants
- 3 countries, 5 disciplines (psychology, math, economics, sociology, political science)
- scores range from 0 to 1000, but a “useful benchmark for very good performance is 600,” attained if both players always cooperate
- “very poor performance [benchmark] is 200 points” (if both always defect)
- the winner, Tit for Tat (TFT), scored 504 (but if P is changed to 2, it does not win)
- the top 8 entries were nice (defined as never being the first to defect); the rest were not
- nice entries scored from 472 to 504, while the best of the mean (not nice) entries scored only 401 points (a huge disparity!)
- the logic: nice rules cooperate with one another, and this is how TFT wins (though in any single game it cannot get a score higher than its opponent’s)
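The two benchmarks, and the remark that TFT never outscores its opponent within a game, can be reproduced with a minimal simulation. This is a sketch under the slide's payoffs and the 200-move game length. `always_defect` is a hypothetical opponent used only for illustration; it was not one of the entries.

```python
R, T, S, P = 3, 5, 0, 1
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}

def tit_for_tat(own_history, opp_history):
    """Cooperate on the first move, then copy the opponent's last move."""
    return opp_history[-1] if opp_history else "C"

def always_defect(own_history, opp_history):
    """Hypothetical illustrative opponent: defects on every move."""
    return "D"

def play(rule_a, rule_b, moves=200):
    """Play one game and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(moves):
        a = rule_a(hist_a, hist_b)
        b = rule_b(hist_b, hist_a)
        pts_a, pts_b = PAYOFF[(a, b)]
        score_a += pts_a
        score_b += pts_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))      # (600, 600): mutual cooperation
print(play(always_defect, always_defect))  # (200, 200): mutual defection
print(play(tit_for_tat, always_defect))    # (199, 204): TFT loses the first move
```

TFT gives up 5 points on the first move against a defector and then matches it move for move, so it can tie or trail in a single game but never lead.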

14 Entrants
- it is important to be nice and forgiving
- 2 kingmakers (defined as players who do not do well themselves but “LARGELY determine the rankings among the top contenders”): GRAASKAMP and DOWNING
- DOWNING was the most important kingmaker, since it had the largest range of scores achieved with the nice rules; note that DOWNING was not based on the TFT principle
- next: the actual results, then an examination of the strategies (strategies other than TFT are denoted simply by the name of their creator)

STRATEGIES!
1. Tit for Tat (TFT): the winner with 504 points, from Toronto (psychology). As we all know, it cooperates on the first move, then does whatever the opponent did on the last move, “eye for eye” style. 4 lines of FORTRAN.
2. TIDEMAN and CHIERUZZI (score missing), from the US (economics): begins with cooperation/TFT, but after the opponent finishes its second run of D it institutes extra punishment, increasing the number of punishments (D) by 1 with each run of the opponent’s defections. It then decides whether to give the opponent a fresh start and begin with TFT again, based on these conditions:
- it has 10+ points more than the opponent
- the opponent has not started another run of D’s
- it has been 20+ moves since the last fresh start
- there are 10+ moves left
- the number of the opponent’s D’s “differs from generator by at least 3 standard deviations”
41 lines of code.

STRATEGIES!
3. NYDEGGER (score missing): plays TFT for the first 3 moves, unless it was the only one to cooperate on the first move and the only one to defect on the second move, in which case it defects on the third move. After the third move it chooses based on a weighted sum: each turn scores 2 points for the opponent’s D and 1 point for its own D, and the past three turns are weighted 16 (most recent turn), then 4, then 1. If the sum is 63, i.e. three turns of mutual defection, it cooperates.
4. GROFMAN (score missing): always cooperates unless the players did not do the same thing on the last move, in which case it cooperates with probability 2/7.
5. SHUBIK (score missing): cooperates until the opponent plays D, then defects once; if the other defects again after it begins again with cooperation, it retaliates for longer. In general, the “length of retaliation is increased by one for each departure from mutual cooperation.”
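NYDEGGER's weighted sum is easy to misread in prose, so here is a small sketch of the arithmetic as the slide describes it. It computes only the sum. Apart from the 63 case, the slide does not say which sums lead to defection, so no decision rule is implemented. The function names are illustrative.

```python
def turn_points(own_move, opp_move):
    """2 points if the opponent defected, plus 1 point if the rule itself defected."""
    return 2 * (opp_move == "D") + 1 * (own_move == "D")

def nydegger_sum(last_three):
    """Weighted sum over the last three turns, given oldest first.

    last_three is a list of three (own_move, opp_move) pairs. The weights
    1, 4, 16 run from the oldest turn to the most recent one.
    """
    weights = (1, 4, 16)
    return sum(w * turn_points(own, opp)
               for w, (own, opp) in zip(weights, last_three))

print(nydegger_sum([("D", "D")] * 3))  # 63: three turns of mutual defection
print(nydegger_sum([("C", "C")] * 3))  # 0: three turns of mutual cooperation
```

Each turn contributes 0 to 3 points, so the sum works like a three-digit base-4 number and every possible three-turn history maps to a distinct value between 0 and 63.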

STRATEGIES!
6. STEIN (score missing): TFT, except that it always cooperates on the first four moves and defects on the last 2 moves (moves 199 and 200 of the game). Every 15 moves it checks whether the opponent is RANDOM, using a chi-squared test of the opponent’s transition probabilities and of alternating CD/DC moves.
7. FRIEDMAN (score missing): cooperates until the opponent defects, then defects forever.
8. DAVIS (score missing): last of the nice guys. Cooperates for the first 10 moves; after that, if there is a defection, it defects forever.
9. GRAASKAMP (score missing): one of the kingmakers. Plays TFT for 50 moves, defects on move 51, then plays 5 more moves of TFT and checks whether the opponent is RANDOM; if so, it plays D from then on (it also checks for TFT, ANALOGY, and CLONE). Otherwise it randomly defects every 5-15 moves, counting on enough trust.

STRATEGIES!
10. DOWNING: the main kingmaker. Starts with D because it assumes the opponent is unresponsive (i.e. it initially assumes 1/2 for the conditional probabilities, its downfall!). From then on it assesses and updates the probabilities (that the opponent cooperates if DOWNING defects, etc.) and calculates the choice that maximizes its long-term expected payoff. If the 2 conditional probabilities have similar values, DOWNING determines that it pays to defect; conversely, if the opponent is responsive (much more likely to play C after DOWNING plays C than after D), it cooperates.
11. FELD (score missing): starts with TFT, then gradually lowers the probability of playing C after the other plays C, reaching 1/2 by the 200th move.
12. JOSS: cooperates 90% of the time after the opponent’s C, always plays D after a D.
13. TULLOCK: cooperates on the first 11 moves, then cooperates 10% less than the opponent has on the preceding 10 moves.
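To make DOWNING's notion of a "responsive" opponent concrete, here is a minimal sketch of estimating the two conditional probabilities from the history of play. It uses plain frequency counts with a prior of 1/2 when there is no data yet. That is an assumption for illustration: the slide does not describe DOWNING's actual update rule, and its payoff-maximizing calculation is not reproduced here.

```python
def conditional_cooperation(own_history, opp_history, prior=0.5):
    """Estimate P(opp plays C | I played C) and P(opp plays C | I played D).

    Each of my moves is paired with the opponent's move on the following
    turn. With no observations for a case, the estimate falls back to `prior`.
    """
    # my move -> [opponent cooperations on the next turn, total observations]
    counts = {"C": [0, 0], "D": [0, 0]}
    for my_move, opp_next in zip(own_history, opp_history[1:]):
        counts[my_move][1] += 1
        counts[my_move][0] += opp_next == "C"

    def estimate(move):
        cooperations, total = counts[move]
        return cooperations / total if total else prior

    return estimate("C"), estimate("D")

# Against a TFT-like opponent, which copies my previous move:
mine = list("DDCCDC")
theirs = ["C"] + mine[:-1]
print(conditional_cooperation(mine, theirs))  # (1.0, 0.0): fully responsive
print(conditional_cooperation([], []))        # (0.5, 0.5): no data yet
```

A large gap between the two estimates marks a responsive opponent, the case in which the slide says DOWNING cooperates. Similar values mark an unresponsive one, where it defects.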

Last of STRATEGIES!
14. GRADUATE STUDENT NAME WITHHELD (score missing): starts with a 30% probability of playing C, which is updated every 10 moves if the opponent seems very cooperative, very uncooperative, or random. After 130 moves, if it is losing, the probability is adjusted. This complex process kept the probability of C between 30% and 70%, making it seem random to most opponents.
15. RANDOM (score missing): plays C with probability 1/2 and D with probability 1/2.

Tournament Num. 2 (1980)
- same non-zero-sum setting, again a round-robin tournament (play all)
- each entrant was sent the report of the first tournament and given the same task
- instead of a known number of moves per game, the “length of the game was determined probabilistically with chance of ending with each given move” (one way to include w); w was chosen so that the expected median length was 200 moves (w = [value missing] in the second tournament)
- the average length turned out to be shorter: closer to 150 moves
- endgame effects were successfully avoided this time
- features of the entries do not relate to success (length of program, nationality, type of program, etc.)
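The link between w and the median game length can be worked out under a simple model. The sketch below assumes that after each move the game continues with the same probability w, so the chance of a game lasting beyond move n is w**n. Setting that chance to 1/2 at the median gives w. This is a derived illustration under that assumption, not a value recovered from the slide, and the function names are hypothetical.

```python
def continuation_for_median(median_moves):
    """Return w such that a game lasts beyond `median_moves` with probability 1/2.

    Assumes a constant per-move continuation probability w, so that
    P(length > n) = w ** n and the median satisfies w ** median = 1/2.
    """
    return 0.5 ** (1.0 / median_moves)

def survival_probability(w, n):
    """Probability that the game is still running after n moves."""
    return w ** n

w = continuation_for_median(200)
print(w)                             # just below 1
print(survival_probability(w, 200))  # approximately 0.5 by construction
```

Because the end of the game is never known in advance under this scheme, a rule such as STEIN's "defect on the last 2 moves" has nothing to key on, which fits the slide's remark that endgame effects were avoided.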

63 Entrants
- 6 countries; contestants were largely recruited via journals, etc.
- everyone from the first tournament was re-invited; entrants ranged from 11-year-old Steve Newman to professors from many disciplines, this time including computer science and evolutionary biology
- more than half of the entries were nice; Tit for Tat (TFT) won again
- Tit for Two Tats: too forgiving. It was suggested after Tourney 1 and submitted to Tourney 2 by an evolutionary biologist, but ended up in the bottom half of the group
- a rule’s scores against 5 representative rules can predict how it did with all 63 rules: GRAASKAMP & KATZEN (S_6), PINKLEY (S_30), ADAMS (S_35), GLADSTEIN (S_46), and FEATHERS (S_27)
- predicted tournament score: T = (.202) S_6 + (.198) S_30 + (.110) S_35 + (.072) S_46 + (.086) S_27