Presentation transcript:

Changing Perspective…

Common themes throughout past papers:
- Repeated simple games with small number of actions
- Mostly theoretical papers
- Known available actions
- Game-theory perspective

How are these papers different?
- Evolutionary approach
- Empirical papers (no analytic proofs)
- Complex games, large strategy space
- AI perspective

Comparing Today’s Papers

Similarities:
- Use a genetic algorithm to generate strategies
- Evaluate generated strategies against a fixed set of opponents
- Describe a (potentially) iterative approach

Differences:
- Phelps focuses on the process within each iteration
- Ficici focuses on movement from one iteration to the next
- Different types of games (constant-sum vs. general-sum)

A Novel Method for Automatic Strategy Acquisition in N-Player Non-Zero-Sum Games (Phelps et al.)
- Double-auction setting --> potentially infinitely large strategy space (intractable game)
- Basis of an iterative approach (not really discussed in the paper) that improves a set of heuristic strategies through search
- Uses a genetic algorithm to find the best strategy for current market conditions
- Evaluates a strategy using replicator dynamics: the size of its basin of attraction (“market share”) quantifies its fitness

Replicator Dynamics: Calculating the Payoff Matrix
- Given a starting point, predicts the trajectory of the population mix of pure strategies using the replicator equation:
    ṁ_j = [u(e_j, m) − u(m, m)] m_j
  where m is the population mix, e_j is the j-th pure strategy, and u is expected utility
- The utility is based on a roughly calibrated heuristic payoff matrix, generated by simulating the game to get the expected payoff for each agent
- Payoffs are treated as independent of type, justified by the type-dependent variations in the actual game being averaged out via sampling --> simplifies the payoff matrix
- Assumes the game can be simulated
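The replicator equation above can be integrated numerically. A minimal numpy sketch (not from the paper; the payoff matrix, step size, and horizon are illustrative assumptions):

```python
import numpy as np

def replicator_step(m, payoff, dt=0.01):
    """One Euler step of the replicator dynamics
    m_j' = [u(e_j, m) - u(m, m)] * m_j,
    with u read from a (heuristic) payoff matrix."""
    u_pure = payoff @ m   # u(e_j, m) for each pure strategy j
    u_mix = m @ u_pure    # u(m, m)
    return m + dt * m * (u_pure - u_mix)

def trajectory(m0, payoff, steps=5000, dt=0.01):
    """Follow the population mix from starting point m0."""
    m = np.asarray(m0, dtype=float)
    for _ in range(steps):
        m = replicator_step(m, payoff, dt)
        m = np.clip(m, 0.0, None)
        m /= m.sum()      # stay on the simplex despite numerical drift
    return m
```

With a toy payoff matrix in which strategy 0 strictly dominates, every interior starting mix flows toward the pure-strategy corner for strategy 0.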

Replicator Dynamics: Finding the Candidate Strategy
- After running many trajectories, calculate the size of the basin of attraction attributable to each pure strategy
- Run a perturbation analysis (increase the payoffs of a candidate strategy) and replot the replicator-dynamics direction field
- Answers the question of which strategy is worth concentrating improvement efforts on
- In the paper’s example, the algorithm was run on 3 strategies: truth-telling (TT), Roth-Erev (RE), and Gjerstad-Dickhaut (GD); RE was found to be the best strategy to focus improvement efforts on
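The "many trajectories" step can be sketched by sampling random starting mixes on the simplex and attributing each converged trajectory to its nearest pure strategy. A self-contained illustration (sample count, horizon, and step size are assumptions, not the paper's settings):

```python
import numpy as np

def basin_shares(payoff, n_samples=200, steps=5000, dt=0.01, seed=None):
    """Estimate each pure strategy's 'market share': the fraction of
    random starting mixes whose replicator trajectory ends nearest to
    that strategy's corner of the simplex. (Sketch; assumes every
    trajectory settles near some pure-strategy attractor.)"""
    rng = np.random.default_rng(seed)
    k = payoff.shape[0]
    wins = np.zeros(k)
    for _ in range(n_samples):
        m = rng.dirichlet(np.ones(k))  # uniform random point on the simplex
        for _ in range(steps):         # crude fixed-horizon integration
            u_pure = payoff @ m
            m = m + dt * m * (u_pure - m @ u_pure)
            m = np.clip(m, 0.0, None)
            m /= m.sum()
        wins[np.argmax(m)] += 1
    return wins / n_samples
```

For a matrix with one strictly dominant strategy the estimated share of that strategy approaches 1, matching the intuition that its basin covers the whole simplex.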

The Novel Method

For each individual in a generation of the GA:
1. Sample over types and simulate games
2. Build the heuristic payoff matrix
3. Run replicator dynamics (RD)
4. Assign a fitness value

Then create the next generation in the GA based on the fitness values, and repeat for each generation.
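The outer loop above is a standard generational GA. A minimal stdlib sketch of that skeleton (the bit-string encoding, tournament selection, and toy fitness function are illustrative stand-ins for the paper's simulate-games -> payoff-matrix -> RD pipeline):

```python
import random

def run_ga(fitness, pop_size=20, genome_len=8, generations=30, seed=0):
    """Minimal generational GA mirroring the slide's loop: evaluate each
    individual with a caller-supplied fitness function, then breed the
    next generation via tournament selection, one-point crossover, and
    point mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]

        def pick():  # 2-way tournament selection
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[i] if scores[i] >= scores[j] else pop[j]

        nxt = []
        while len(nxt) < pop_size:
            a, b = pick(), pick()
            cut = rng.randrange(1, genome_len)  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:              # point mutation
                k = rng.randrange(genome_len)
                child[k] ^= 1
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# Toy fitness: number of 1-bits stands in for the RD market share.
best = run_ga(sum)
```

In the paper's method, `fitness` would instead run the sampled game simulations and replicator dynamics for each candidate strategy.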

Results
- The optimized strategy (OS) turned out to be an RE strategy with stateless Q-learning
- OS’s market share against the original RE, GD, and TT is 65%, greater than TT (32%), GD (3%), and the original RE (0%)
- However, it took 1800 CPU hours (over 2 months) to compute
- Keep in mind, this is only one iteration. Presumably, the next step would be to substitute OS for RE in the set of strategies and run the algorithm again, but Phelps does not go into this.

Some Strengths and Weaknesses of the Phelps Paper

Strengths:
- Application to a real-world setting (double auction)
- Applies to general-sum games with (infinitely) large strategy spaces

Weaknesses:
- Very large computation time; moreover, time increases exponentially with the number of strategies (due to the RD computation)
- Depends on the RD having attractors (there exist games for which this does not hold)

Any more? Other remarks?

Ficici and Pollack Paper
- Phelps does not address how to iterate his approach
- The second paper, A Game-Theoretic Memory Mechanism for Coevolution (Ficici and Pollack), uses a similar approach but focuses on moving from one iteration to the next
- In each iteration, it searches for strategies (via a GA)
- Evaluates the fitness of strategies by playing them against the most fit opponent from the last iteration (N)
- Also keeps a set of potentially useful strategies in memory (M), updating the memory every iteration

Motivation / Setup for the Second Paper
- For symmetric zero-sum 2-player games (generalizes to asymmetric constant-sum 2-player games)
- In coevolutionary algorithms, the population both carries the genetic diversity needed for effective search for new strategies AND often represents the solution to the problem --> these two objectives can conflict
- Avoid “forgetting” - problematic when the game has intransitive cycles
- This paper separates the two functions: a memory mechanism holds the solution and previously encountered strategies, which frees the population to maintain the genetic diversity useful for effective search

Nash Memory Mechanism: Some Definitions
- For a mixed strategy m:
  - C(m) = support set of m
  - S(m) = security set of m = {all s : E(m, s) ≥ 0}
- Nash memory mechanism: N and M are mutually exclusive sets
  - N: holds the best mixed strategy over time
  - M: holds strategies not in N that may be useful later; has limited capacity c
- H: search heuristic (in this paper, a GA)
- Q: set of strategies delivered by H
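The two definitions are straightforward to compute once strategies are indexed against a payoff matrix. A small sketch (the matrix A and the convention that E(m, s) = m·A[:, s] are assumptions reconstructed from the slide's definitions, not code from the paper):

```python
import numpy as np

def support(m, eps=1e-12):
    """C(m): pure strategies with positive weight in the mix m."""
    return {i for i, p in enumerate(m) if p > eps}

def security_set(m, A, eps=1e-12):
    """S(m): pure strategies s with E(m, s) >= 0, where E(m, s) is the
    expected payoff of mix m (over rows of A) against pure strategy s."""
    scores = np.asarray(m) @ A  # E(m, s) for each column s
    return {s for s, v in enumerate(scores) if v >= -eps}
```

For a rock-paper-scissors matrix, pure Rock is secure against Rock and Scissors but not Paper, while the uniform mix is secure against everything.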

Memory Mechanism Updating
- In each iteration, the GA evolves strategies by testing them against the fixed opponent N
- At the end of the iteration, test each q ∈ Q to see if it beats N
- Let W = “winners from the GA” = {q ∈ Q : E(q, N) > 0}
- Use a polynomial-time linear program to update N -> N′ and M -> M′, defined by the following constraints:
  - C(N′) ⊆ (W ∪ N ∪ M)
  - S(N′) ⊇ (W ∪ N ∪ M); note: S(N′) is not necessarily ⊇ S(N)
  - C(M′) ⊆ (W ∪ N ∪ M); specifically, M′ gets unnecessary strategies from W, strategies released from N, and unneeded leftovers from M
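The N′ that satisfies these constraints is the maximin mix of the zero-sum game restricted to W ∪ N ∪ M, which the paper computes exactly with a polynomial-time LP. As a dependency-free stand-in (a deliberate substitution, not the paper's method), fictitious play approximates the same maximin mix, since its empirical frequencies converge to equilibrium in zero-sum games (Robinson 1951):

```python
import numpy as np

def maximin_mix(A, iters=20000):
    """Approximate the row player's maximin mixed strategy of the
    zero-sum matrix game A by two-player fictitious play. Each round,
    each player best-responds to the other's empirical frequencies."""
    n, m = A.shape
    x_counts = np.zeros(n)
    y_counts = np.zeros(m)
    x_counts[0] += 1  # arbitrary opening moves
    y_counts[0] += 1
    for _ in range(iters):
        # Row player best-responds to the column player's empirical mix.
        x_counts[np.argmax(A @ (y_counts / y_counts.sum()))] += 1
        # Column player best-responds (minimizes row payoff) to the row mix.
        y_counts[np.argmin((x_counts / x_counts.sum()) @ A)] += 1
    return x_counts / x_counts.sum()

# An intransitive 3-cycle (rock-paper-scissors): the equilibrium mixes all three.
rps = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
mix = maximin_mix(rps)
```

The intransitive-cycle example is exactly the situation where the memory matters: no pure strategy is secure against the whole candidate set, so N′ must be a genuine mixture.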

Memory Mechanism Updating (diagram)
[Diagram: winners W arrive from the search heuristic; the pre-update state (N, M) maps to the post-update state (N′, M′); some strategies are discarded if M′ would exceed capacity]

Strengths / Weaknesses of the Ficici Paper

Strengths:
- Incorporates the game-theoretic notion of Nash equilibrium into the solution
- Avoids the problems caused by forgetting in games with intransitive cycles
- Time-efficient (polynomial-time memory update)
- Outperformed other memory mechanisms (BOG, DT) in tests

Weaknesses:
- Allows some memory loss in the update rule
- No analytical proofs
- Strong results only (seem to) hold for 2-player constant-sum games
- For general-sum games, efficiency is lost and there are multiple equilibria
- Unclear how it performs when intransitive cycles are not important

Concluding Remarks
- How do these papers compare to what we’ve seen previously?
- What are the values and contexts for the different approaches?
- Other comments?