Iterated Prisoner’s Dilemma Game in Evolutionary Computation 2003. 10. 2 Seung-Ryong Yang.

Presentation transcript:

Iterated Prisoner’s Dilemma Game in Evolutionary Computation Seung-Ryong Yang

2 Agenda
- Motivation
- Iterated Prisoner’s Dilemma Game
- Related Works
- Strategic Coalition
- Improving Generalization Ability
- Experimental Results
- Conclusion

3 Motivation
Evolutionary approach
- Understand complex behaviors by investigating simulation results from an evolutionary process
- Provide a way to find optimal strategies in a dynamic environment
IPD game
- Models complex phenomena such as social and economic behaviors
- Provides a testbed for modeling a dynamic environment
Objectives
- Obtain multiple good strategies
- Form coalitions to improve generalization ability

4 Iterated Prisoner’s Dilemma Game (1/2)
Overview
- Prisoner’s possible choices: Defection, Cooperation
- Characteristics: Non-cooperative, Non-zero-sum
- Types of game: 2IPD (2-player Iterated Prisoner’s Dilemma) game, NIPD (N-player Iterated Prisoner’s Dilemma) game

             Cooperate   Defect
Cooperate    R / R       T / S
Defect       S / T       P / P

Payoff Matrix of 2IPD Game by Axelrod, R. (1984)

             Cooperate   Defect
Cooperate    3 / 3       0 / 5
Defect       5 / 0       1 / 1
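The numeric payoff matrix above can be written down directly in code. The following is a minimal Python sketch, not part of the original slides. It assumes each cell reads "row player / column player", as in the numeric table, and the names `PAYOFF`, `play_round`, and `is_prisoners_dilemma` are hypothetical. The ordering check T > R > P > S and 2R > T + S is the standard Prisoner's Dilemma condition and is not stated on the slide.

```python
# Payoff values taken from the numeric 2IPD matrix on the slide.
T, R, P, S = 5, 3, 1, 0

# Keys are (own move, opponent move); values are (own payoff, opponent payoff).
# Assumption: each table cell reads "row player / column player".
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def is_prisoners_dilemma(t, r, p, s):
    """Standard conditions (not on the slide): T > R > P > S and 2R > T + S."""
    return t > r > p > s and 2 * r > t + s


def play_round(own_move, opponent_move):
    """Return (own payoff, opponent payoff) for one simultaneous round."""
    return PAYOFF[(own_move, opponent_move)]
```

For example, `play_round("D", "C")` gives the defector 5 and the cooperator 0.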

5 Iterated Prisoner’s Dilemma Game (2/2)
Representation of Strategy
- History table: own history (recent action ∙∙∙ last action) and opponent’s history (recent action ∙∙∙ last action)
- Example: history length l = 2
[Figure: history table with example bit string 010 ∙∙∙ 1; remaining labels not recoverable]
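A history-table strategy of this kind can be sketched as a lookup table indexed by the recent moves of both players. The Python sketch below is an illustration under stated assumptions, not the presenter's encoding: it assumes 0 means cooperate and 1 means defect, that the own history comes before the opponent's history in the index, and that both histories already hold at least l moves. The names `history_index` and `next_move` are hypothetical.

```python
def history_index(own, opp, l=2):
    """Pack the last l own moves and the last l opponent moves into an integer.

    Assumption: moves are bits (0 = cooperate, 1 = defect), own history first.
    """
    if len(own) < l or len(opp) < l:
        raise ValueError("both histories need at least l moves")
    index = 0
    for bit in list(own[-l:]) + list(opp[-l:]):
        index = (index << 1) | bit
    return index


def next_move(genotype, own, opp, l=2):
    """Look up the move that the genotype stores for the current history."""
    if len(genotype) != 4 ** l:
        raise ValueError("genotype needs 4**l entries")
    return genotype[history_index(own, opp, l)]
```

With l = 2 there are 4² = 16 possible histories, so a genotype in this sketch is a list of 16 bits.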

6 Related Works
Previous Studies
- Paul J. Darwen and Xin Yao (1997): Speciation as Automatic Categorical Modularization
- Onn M. Shehory, et al. (1998): Multi-agent Coordination through Coalition Formation
- Y. G. Seo and S. B. Cho (1999): Exploiting Coalition in Co-Evolutionary Learning
Issues
- Work on coalition formation in multi-agent environments covers broad topics
- Darwen and Yao studied coalitions in the IPD game, but with a different approach
- Earlier work focused on cooperation, the number of players, payoff variances, etc.

7 What is Different?
Co-evolutionary Learning
- Selection methods: rank-based, roulette wheel, tournament
Coalition Formation
- A coalition keeps surviving into the next generation
- The condition for forming a coalition is flexible
Decision Making in Coalition
- Several decision-making methods are adapted to the coalition
- Borda function, Condorcet function
- Average payoff, highest payoff
- Weighted voting
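The three selection methods named on this slide can be sketched as follows. This is a minimal Python illustration of the textbook forms of each method, not the presenter's code. The function names, the `rng` parameter, the tournament size `k`, and the choice to give the weakest individual rank 1 are all assumptions.

```python
import random


def roulette_wheel(fitness, rng):
    """Pick an index with probability proportional to non-negative fitness."""
    pick = rng.random() * sum(fitness)
    running = 0.0
    for index, value in enumerate(fitness):
        running += value
        if pick < running:
            return index
    return len(fitness) - 1  # fallback for rounding or all-zero fitness


def tournament(fitness, rng, k=2):
    """Draw k distinct contenders at random and return the fittest of them."""
    contenders = rng.sample(range(len(fitness)), k)
    return max(contenders, key=lambda i: fitness[i])


def rank_based(fitness, rng):
    """Roulette wheel over ranks (weakest = rank 1) instead of raw fitness."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    ranks = [0] * len(fitness)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return roulette_wheel(ranks, rng)
```

Rank-based selection reduces the pull of a single very fit individual, while tournament selection only needs pairwise comparisons.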

8 Evolving Strategy
To evolve strategies, we use:
- Genetic algorithm
- Co-evolutionary learning
- Strategic coalition
Evolutionary Process

9 Evolution of Agents (1/2)
[Figure: coalitions in the before, current, and next populations (Ci, C1, Ck; then Cj added; then Cl added)]
Evolution of Agents
- Agents can develop their strategies using co-evolutionary learning
- Weak agents are removed from the population
Evolution of Coalition
- A formed coalition survives into the next generation
- Agents can join a coalition generation by generation
- A coalition survives or grows

10 Evolution of Agents (2/2)
Problem: evolution may be driven by weak agents
- Caused by removing from the population the better agents, which belong to coalitions
- Addressed by making new agents that mix the better agents within a coalition
[Figure: agents A1 and A2 are taken from coalitions (Ci, Cj, Ck) in the population by random extraction, then mutation produces a new agent Ai; repeated as many times as there are agents belonging to coalitions]

11 Strategic Coalition (1/2)
What is a Coalition?
- A cooperative game as a set A of agents in which each subset of A is called a coalition - Matthias Klusch and Andreas Gerber, 2002
- A group of agents that work jointly in order to accomplish their tasks - Onn M. Shehory, 1995
Coalition in the IPD game
- Coalitions form through round-robin games
- Coalitions pursue more payoff using generalization ability
- Coalitions form autonomously, without supervision

12 Strategic Coalition (2/2)
Definitions
- Definition 1: Coalition Value
- Definition 2: Payoff Function
- Definition 3: Coalition Identification
- Definition 4: Decision Making
- Definition 5: Payoff Distribution
[Equations (1), (2), (3): not preserved in the transcript]

13 Coalition Formation (1/2)
[Figure: the initial population of agents (A1, A2, ..., An) plays the 2IPD game and forms coalitions, giving a population that includes coalitions (C1, C2, ..., Ci)]

14 Coalition Formation (2/2)
Algorithm
[Flowchart nodes: 2IPD Game; Exceeds iteration per generation? (Y/N); Game type? (Agent vs. Agent, Agent vs. Coalition, Coalition vs. Coalition); Satisfy condition for forming coalition? (Y/N); Forming Coalition; Joining Coalition; Genetic Operation; Satisfy condition? (Y/N); Stop]
Forming a coalition
1. Round-robin 2IPD game
2. Obtain rank
3. Determine the confidence of each agent according to its rank
Joining a coalition
1. Round-robin 2IPD game
2. Obtain rank
3. If the number of agents > max. number of agents within a coalition, remove the weakest agent
4. Determine the confidence of each agent
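The forming and joining steps above can be sketched in a few lines. The Python code below is a hedged illustration, not the presenter's implementation. It takes the round-robin scores as given, and the rule that turns a rank into a confidence (linearly decreasing weights that sum to 1) is a hypothetical choice, since the slides do not give the formula. All function names are assumptions.

```python
def rank_agents(scores):
    """Order agent ids from strongest to weakest round-robin score."""
    return sorted(scores, key=scores.get, reverse=True)


def rank_confidences(ranked):
    """Hypothetical rule: linearly decreasing confidences that sum to 1."""
    n = len(ranked)
    total = n * (n + 1) / 2
    return {agent: (n - pos) / total for pos, agent in enumerate(ranked)}


def join_coalition(scores, newcomer, newcomer_score, max_size):
    """Add a newcomer, drop the weakest agent if the coalition is too large,
    then recompute every member's confidence from the new ranking."""
    updated = dict(scores)
    updated[newcomer] = newcomer_score
    ranked = rank_agents(updated)
    if len(ranked) > max_size:
        weakest = ranked.pop()  # last in the ranking is the weakest
        del updated[weakest]
    return updated, rank_confidences(ranked)
```

For example, joining an agent with score 25 to a full two-member coalition with scores 30 and 20 removes the agent that scored 20.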

15 Coalition Decision Making
Decision making
- Decides the coalition’s opinion
- Uses the weighted voting method
Sharing profits
- Payoff is distributed according to each agent’s confidence
- Rank influences each weight
Determining the next action of a coalition
- Weight for cooperation of coalition Ci
- Weight for defection of coalition Ci
[Figure: the previous actions of members Ci, Cj, Ck, Cl are summed (∑) to give the next action, C or D]
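Weighted voting and confidence-based profit sharing can be sketched as below. This Python code is an illustration only. The slides do not preserve the weight formulas, so the sketch assumes each member votes "C" or "D", the coalition sums the weights on each side, and a tie resolves to defection. Those choices and all names are hypothetical.

```python
def coalition_action(votes, weights):
    """Return the coalition's next action by weighted voting.

    votes maps member -> "C" or "D"; weights maps member -> confidence.
    Assumption: a tie resolves to "D".
    """
    weight_c = sum(weights[m] for m, v in votes.items() if v == "C")
    weight_d = sum(weights[m] for m, v in votes.items() if v == "D")
    return "C" if weight_c > weight_d else "D"


def distribute_payoff(payoff, weights):
    """Split a coalition payoff in proportion to each member's confidence."""
    total = sum(weights.values())
    return {member: payoff * w / total for member, w in weights.items()}
```

A single high-confidence member can outvote several low-confidence members, which is how rank influences the coalition's decision in this sketch.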

16 Weight of Agents
Adjusting weight
- Gives an incentive to agents in the coalition
- Reflects the decision making of the coalition
[Figure: the previous actions of members Ci, Cj, Ck, Cl are summed (∑) to give the next action, C or D, followed by an "Adjusting weight" step]

17 Improving Generalization Ability (1/2)
Problem of one good strategy
- Not adaptive to a dynamic environment
- Obtain multiple good strategies for specific environments
- Ex) Biological immune system
Method
- Fitness sharing
- Adjust the confidences of multiple strategies by evolution
- Co-evolution
- Coalition formation
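Fitness sharing, listed above as one of the methods, can be sketched as follows. This Python code uses the common textbook formulation (raw fitness divided by a niche count built from a triangular sharing function), which the slides do not spell out. The Hamming distance, the parameters `sigma` and `alpha`, and the function names are assumptions.

```python
def hamming(a, b):
    """Number of positions at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def sharing(distance, sigma, alpha=1.0):
    """Sharing function: 1 at distance 0, falling to 0 at the niche radius."""
    if distance >= sigma:
        return 0.0
    return 1.0 - (distance / sigma) ** alpha


def shared_fitness(population, raw_fitness, sigma):
    """Divide each raw fitness by its niche count (sum of sharing values)."""
    result = []
    for individual, fitness in zip(population, raw_fitness):
        niche = sum(sharing(hamming(individual, other), sigma)
                    for other in population)
        result.append(fitness / niche)
    return result
```

Two identical genotypes split their fitness, while an isolated genotype keeps all of it, so selection pressure favors keeping several distinct good strategies in the population.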

18 Improving Generalization Ability (2/2)
Generalization ability: how well a player performs against an unknown player
Evaluation
[Figure labels: Random Generation of 100 Strategies; 2IPD Game; Extract Top Strategies in the Population; Top Strategies; Genetically Evolved Strategies; IPD Game]
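One way to read this evaluation is to score a strategy by its average payoff against 100 randomly generated opponents. The Python sketch below follows that reading and is an assumption, not the presenter's procedure. It further assumes a 16-bit lookup-table genotype with history length 2, bits where 0 means cooperate and 1 means defect, an all-cooperate pre-game history, and 50 rounds per game. All names are hypothetical.

```python
import random

# Own payoff for (own move, opponent move); 0 = cooperate, 1 = defect.
PAYOFF = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}


def move(genotype, own, opp):
    """Look up the next move from the last two moves of each player."""
    index = (own[-2] << 3) | (own[-1] << 2) | (opp[-2] << 1) | opp[-1]
    return genotype[index]


def average_payoff(genotype, opponent, rounds=50):
    """Average per-round payoff of genotype in one game against opponent."""
    own, opp = [0, 0], [0, 0]  # assumed all-cooperate pre-game history
    total = 0
    for _ in range(rounds):
        a = move(genotype, own, opp)
        b = move(opponent, opp, own)
        total += PAYOFF[(a, b)]
        own.append(a)
        opp.append(b)
    return total / rounds


def generalization_score(genotype, n_opponents=100, seed=0):
    """Mean payoff against randomly generated lookup-table strategies."""
    rng = random.Random(seed)
    opponents = [[rng.randint(0, 1) for _ in range(16)]
                 for _ in range(n_opponents)]
    return sum(average_payoff(genotype, o) for o in opponents) / n_opponents
```

Fixing the seed keeps the set of unseen opponents identical across evaluated strategies, so their scores are comparable.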

19 Test Strategy
Test Strategies

Strategy      Characteristics
Tit-for-Tat   Initially cooperate, and then follow the opponent
Trigger       Initially cooperate. Once the opponent defects, defect continuously
AllD          Always defect
CDCD          Cooperate and defect over and over
CCD           Cooperate, cooperate, and defect
Random        Random move

[Table: example strategy for Tit-for-Tat, Trigger, AllD, CDCD, CCD, and Random; entries not preserved]
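The six test strategies in the table can be written as small functions of the two move histories. This Python sketch follows the characteristics column. The function signatures, the `play` helper, and the reading of CDCD and CCD as repeating cycles that start with cooperation are assumptions for illustration.

```python
import random

# (own payoff, opponent payoff) for (own move, opponent move).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(own, opp):
    return "C" if not opp else opp[-1]


def trigger(own, opp):
    return "D" if "D" in opp else "C"


def all_d(own, opp):
    return "D"


def cdcd(own, opp):
    return "CD"[len(own) % 2]


def ccd(own, opp):
    return "CCD"[len(own) % 3]


def make_random(seed=None):
    rng = random.Random(seed)
    return lambda own, opp: rng.choice("CD")


def play(strategy_a, strategy_b, rounds):
    """Play an iterated game and return the total payoffs (a, b)."""
    hist_a, hist_b, total_a, total_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        total_a, total_b = total_a + pay_a, total_b + pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b
```

For example, `play(tit_for_tat, all_d, 5)` returns `(4, 9)`: Tit-for-Tat loses only the first round and then mirrors the defection.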

20 Example of Game
Tit-for-Tat vs. Evolved Strategy
[Figure: history and payoff of the game; values not preserved]

21 Test Environment
- Population size: 100
- Crossover rate: 0.3
- Mutation rate:
- Number of generations: 200
- Number of iterations: a third of the population
- Training set: 6 well-known strategies
Experimental Result
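The settings above can be collected as constants next to the genetic operators they would drive. The Python sketch below is illustrative only. The slides name a genetic algorithm and a crossover rate but not the operators, so one-point crossover and bit-flip mutation are assumptions. The mutation rate is missing from the transcript, so it is left as a required argument and not given a value.

```python
import random

POPULATION_SIZE = 100
CROSSOVER_RATE = 0.3
GENERATIONS = 200
ITERATIONS_PER_GENERATION = POPULATION_SIZE // 3  # "a third of population"


def one_point_crossover(a, b, rng, rate=CROSSOVER_RATE):
    """With probability rate, swap the tails of two bit lists at a random cut."""
    if len(a) < 2 or rng.random() >= rate:
        return list(a), list(b)
    cut = rng.randrange(1, len(a))
    return list(a[:cut]) + list(b[cut:]), list(b[:cut]) + list(a[cut:])


def bit_flip_mutation(genotype, rng, rate):
    """Flip each bit independently with probability rate.

    rate has no default because the slide's value is not preserved.
    """
    return [1 - bit if rng.random() < rate else bit for bit in genotype]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.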

22 Evolved Strategy vs. Random
[Table: Rank; Genotype of evolved strategy; Avg. Payoff and S.D. for the evolved strategy and for Random; values not preserved]
- Random is one of the weakest strategies in the 2IPD game.
- In this game, the evolved strategies perform well.
- All evolved strategies win against the Random test strategy with high payoffs.

23 Evolved Strategy vs. Tit-for-Tat
[Table: Rank; Genotype of evolved strategy; Avg. Payoff and S.D. for the evolved strategy and for Tit-for-Tat; values not preserved]
- Tit-for-Tat is a mimic strategy that plays “cooperation” on the first move of the 2IPD game.
- The evolved strategies respond appropriately so as not to lose the game.
- This demonstrates the generalization ability of the evolved strategies.

24 Evolved Strategy vs. Trigger
[Table: Rank; Genotype of evolved strategy; Avg. Payoff and S.D. for the evolved strategy and for Trigger; values not preserved]
- Trigger never forgives an opponent’s defection.
- The way to win a game against Trigger is likewise to choose “defection” repeatedly.

25 Evolved Strategy vs. AllD
[Table: Rank; Genotype of evolved strategy; Avg. Payoff and S.D. for the evolved strategy and for AllD; values not preserved]
- The only way not to lose against AllD is to choose “defection” on every move.
- There is no room for cooperation in this game.

26 Number of Coalitions
[Figure: axes labeled Generation and Coalition]
- Coalitions survive into the next generation.
- Most coalitions are formed early in the evolutionary process.
- This keeps genetic diversity high and gives better choices against opponents.
- A coalition can grow if the agents’ conditions are satisfied.

27 Comparing the Results
- The evolved strategies earn more payoff against Random, CCD and CDCD than against Tit-for-Tat, Trigger and AllD.
- This indicates that the evolved strategies exploit the opponent’s actions well.

28 Bias of the Strategy
[Figure: axes labeled Generation and Bias]
- Bias shows how a strategy selects its next choice against its opponents.
- A higher bias rate means that the strategy chooses “cooperation” more often than “defection”, and vice versa.

29 Conclusions
Conclusion
- Strategic coalition might be a robust method that can adapt to a dynamic environment
- Decision-making methods influence the results, but not seriously
- The strategies evolved by coalition generalize well against various opponents
Discussion
- Can the strategic coalition be adapted to the NIPD game?
- Which parameters of the IPD game influence generalization ability?
- How can opponent strategies be constructed for testing?
- How can this problem be adapted to the real world?

30 Examples (1) Market Observer

31 Examples (2) Forest Prediction