Evolving Strategies for the Prisoner’s Dilemma Jennifer Golbeck University of Maryland, College Park Department of Computer Science July 23, 2002

Overview
– Previous Research
– Prisoner’s Dilemma
– The Genetic Algorithm
– Results
– Conclusions

Previous Research

Axelrod
Robert Axelrod’s experiments of the 1980s served as the starting point for this research
– The implementation closely adheres to the configuration of his experiments
– Same model for the Prisoner’s Dilemma
– Minor variation in the implementation of the Genetic Algorithm

Prisoner’s Dilemma

The Prisoner’s Dilemma Model
The basic two-player Prisoner’s Dilemma:
– Both players are arrested for the same crime
– Each has a choice:
  – Confess: cooperate with the authorities (admit to the crime)
  – Deny: defect against the other player (claim the other person is responsible)
– Neither player knows the “opponent’s” action

Payoff Matrix Optimization
– If both players cooperate, each receives 3 points
– If both players defect, each receives 1 point
– In a mixed outcome, the defector gets 5 points and the cooperator gets 0 points
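The payoffs above can be written as a small lookup table. The sketch below is illustrative Python only; the `PAYOFF` name and the 'C'/'D' move labels are assumptions, not code from the talk.

```python
# Payoff table for one round of the Prisoner's Dilemma described above.
# Keys are (my_move, opponent_move); values are (my_score, opponent_score).
PAYOFF = {
    ('C', 'C'): (3, 3),  # mutual cooperation
    ('D', 'D'): (1, 1),  # mutual defection
    ('D', 'C'): (5, 0),  # defector gets 5, cooperator gets 0
    ('C', 'D'): (0, 5),
}

def score_round(move_a, move_b):
    """Return the pair of scores for a single round."""
    return PAYOFF[(move_a, move_b)]
```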

Iterated Game
– In simulation, the endpoint of the game is unknown to the players, making it effectively an infinitely iterated game
– Each player has a memory of the previous three rounds on which to base its strategy
– Strategies are deterministic: for a given history h, a player always makes the same move
– With 4 possible configurations in each round and a history of 3 rounds, each strategy consists of 4³ = 64 moves
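One way to realize this encoding, sketched here as an assumption rather than the original implementation, is to map each three-round history to an index into a 64-character move string:

```python
# A deterministic strategy is a lookup table with one move per possible
# three-round history: 4 joint outcomes per round, 4**3 = 64 histories.
# The outcome codes and helper names are illustrative assumptions.
OUTCOME = {('C', 'C'): 0, ('C', 'D'): 1, ('D', 'C'): 2, ('D', 'D'): 3}

def history_index(history):
    """Map the last three (my_move, their_move) pairs to an index in 0..63."""
    idx = 0
    for pair in history:          # oldest to newest
        idx = idx * 4 + OUTCOME[pair]
    return idx

def next_move(strategy, history):
    """strategy is a 64-character string of 'C'/'D'; history has length 3."""
    return strategy[history_index(history)]
```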

Previous Results
– Axelrod tournaments: using the three-round history model, teams submitted strategies that competed in a round-robin tournament
– Tit for Tat won the tournaments
– The Pavlov strategy, developed after these tournaments, was shown to be an effective strategy as well
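For reference, a minimal sketch of the move rules behind these two benchmark strategies; the function names and the standard choice of cooperating on the first move are assumptions used for illustration.

```python
# last_pair is (my_last_move, their_last_move); 'C' cooperate, 'D' defect.

def tit_for_tat(last_pair):
    """Cooperate on the first move, then repeat the opponent's previous move."""
    if last_pair is None:          # no history yet
        return 'C'
    return last_pair[1]            # opponent's last move

def pavlov(last_pair):
    """Win-stay, lose-shift: keep my previous move if it earned 3 or 5 points,
    otherwise switch."""
    if last_pair is None:
        return 'C'
    my_move, their_move = last_pair
    won = (my_move, their_move) in {('C', 'C'), ('D', 'C')}  # earned 3 or 5
    if won:
        return my_move
    return 'D' if my_move == 'C' else 'C'
```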

The Genetic Algorithm

The Model
– Darwinian survival of the fittest
– Genetic representation of entities
– Fitness function
– Select the most fit individuals to reproduce
– Mutate
– Traits of the most fit will be passed on
– Over time, the population will evolve to be more fit, closer to optimal
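A minimal, self-contained sketch of this loop on a toy fitness function (counting 1 bits); every parameter, helper, and selection scheme here is illustrative and not taken from the experiments described in this talk.

```python
import random

def toy_ga(pop_size=20, length=64, generations=200, mut_rate=0.01):
    # Random starting population of bit strings.
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=sum, reverse=True)        # fitness here: number of 1 bits
        parents = pop[:pop_size // 2]          # select the most fit to reproduce
        children = []
        for _ in range(pop_size):
            a, b = random.sample(parents, 2)   # two distinct parents
            cut = random.randrange(1, length)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit if random.random() > mut_rate else 1 - bit
                     for bit in child]         # occasional mutation
            children.append(child)
        pop = children                         # fitter traits are passed on
    return max(pop, key=sum)                   # best evolved individual
```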

GAs and the Prisoner’s Dilemma
– Population: 20 individuals
– Chromosome: 64-bit string where each bit represents the Cooperate or Defect move played for a specific three-round history

GAs and PD II
– Fitness: each player competes against every other player for 64 consecutive rounds, and a cumulative score is maintained
– Selection: roulette-wheel selection
– Reproduction: random-point crossover with replacement
– Mutation rate
– Generations: 200,000
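A hedged sketch of the selection and reproduction operators named above, applied to 64-character 'C'/'D' strategy strings; the exact operator details and mutation rate of the original experiments may differ.

```python
import random

def roulette_select(population, fitness):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitness)
    r, acc = random.uniform(0, total), 0.0
    for ind, f in zip(population, fitness):
        acc += f
        if acc >= r:
            return ind
    return population[-1]

def crossover(parent_a, parent_b):
    """Random single-point crossover of two 64-character strategies."""
    cut = random.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]

def mutate(strategy, rate=0.005):
    """Flip each move with a small probability (the rate is an assumed value)."""
    return ''.join(('D' if m == 'C' else 'C') if random.random() < rate else m
                   for m in strategy)
```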

Simulation and Results

Hypothesis
– Past research has looked at which strategy was “best”; this research looks at what makes a “good” strategy
– Tit for Tat and Pavlov both perform very well and share two traits:
  – Defend against defectors
  – Cooperate with other cooperators

Hypothesis
– All populations evolve over time to possess and exhibit these two traits
– This behavior evolves regardless of the initial makeup of the population

Experiment I
Five initial populations:
– All “Always Cooperate (Confess)” (AllC)
– All “Always Defect (Deny)” (AllD)
– All Tit for Tat
– All Pavlov
– All randomly generated (independently)
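These starting populations can be encoded as 64-move lookup strings. The construction below reuses the history encoding sketched earlier and is an illustrative assumption, not the original code.

```python
import random

PAIRS = [('C', 'C'), ('C', 'D'), ('D', 'C'), ('D', 'D')]
HISTORIES = [(a, b, c) for a in PAIRS for b in PAIRS for c in PAIRS]  # all 64 histories

def strategy_from_rule(rule):
    """Build a 64-move strategy from a rule that looks only at the most recent
    (my_move, their_move) pair; Tit for Tat and Pavlov need nothing older."""
    return ''.join(rule(hist[-1]) for hist in HISTORIES)

all_c  = 'C' * 64
all_d  = 'D' * 64
tft    = strategy_from_rule(lambda last: last[1])                         # copy opponent
pavlov = strategy_from_rule(lambda last: 'C' if last[0] == last[1] else 'D')
random_pop = [''.join(random.choice('CD') for _ in range(64)) for _ in range(20)]
```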

Experiment II
Controls: Tit for Tat and Pavlov
– Statistically equal performance
Support the hypothesis by showing:
– The traits are not present in the other initial populations
– Over time, populations evolve to exhibit those traits and perform as well as Tit for Tat and Pavlov

Experiment II
To show that the hypothesized traits evolve, populations must demonstrate that:
– In the presence of defectors, evolved populations perform identically to the controls
– In the presence of cooperators, evolved populations perform identically to the controls

Part 1: Defend Against Defectors I
Mix each initial population with a small set of AllD players:
– Tit for Tat and Pavlov (controls) perform at about 80% of maximum
– All others perform significantly worse than Tit for Tat and Pavlov
– The AllC and random populations perform significantly worse than their normal behavior
– This shows that, a priori, the AllC and random populations cannot defend against defectors

Part 1: Defend Against Defectors II
Evolve each population and then mix it with a small set of AllD players:
– All populations now perform as well as each other, and as well as the TFT and Pavlov controls
– Fitness is at about 80% of maximum

Part 2: Cooperate with Cooperators
As before, each initial population is mixed with a small set of AllC players:
– TFT and Pavlov do very well
– AllC does exceptionally well
– The others do significantly worse
Evolve each population and then add AllC:
– All populations perform as well as each other
– Performance is identical to TFT and Pavlov

Performance of Different Experiments

Conclusions

Conclusions I
– Performance measures show that AllC, AllD, and random populations do not generally possess the defensive or cooperative traits a priori
– After evolution, all populations have changed to incorporate both traits
– Evolved strategies perform as well as TFT and Pavlov, the traditional “best” strategies

Conclusions II
– In both experiments there is no statistical difference between the performance of evolved populations before and after the introduction of AllC or AllD players
– This indicates that the populations not only exhibit the hypothesized traits under experimental conditions, but that doing so is their normal behavior

Future Work

Non-deterministic Players
– This work shows results for players with deterministic strategies
– Much previous research has been done on stochastic strategies
– Preliminary results suggest that the findings presented here apply to stochastic strategies as well, but a formal study is necessary
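One natural representation for such players, sketched here purely as an assumption, replaces each of the 64 deterministic moves with a probability of cooperating:

```python
import random

def stochastic_move(probabilities, history_idx):
    """probabilities is a list of 64 cooperation probabilities in [0, 1];
    history_idx indexes the three-round history as in the deterministic case."""
    return 'C' if random.random() < probabilities[history_idx] else 'D'

# One randomly generated stochastic strategy.
random_stochastic = [random.random() for _ in range(64)]
```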