The Evolution of Conventions
H. Peyton Young
Presented by Na Li and Cory Pender

What is a convention?
– Customary behavior
– Self-enforcing: each person follows it given that other people do
– Not always symmetric
– Examples: driving on the right, eating with utensils, men proposing to women

How are conventions "chosen"?
– A convention is an equilibrium, but there could be others
– Some equilibria are inherently more reasonable (Harsanyi and Selten)
– One equilibrium is more prominent than the rest (Schelling)

Evolutionary explanation
– Past plays influence players' choices
– One equilibrium eventually becomes more prevalent
– This paper shows that, under some restrictions on the game, behavior converges over time to a Nash equilibrium

The model
– n people are randomly selected from a large population
– They base their actions on a sample of plays from the recent past
– No individual learning
– Mistakes are possible
– "Adaptive play"

Goals
– In weakly acyclic games: if samples are sufficiently incomplete and memory is finite, play converges to a Nash equilibrium
– With mistakes: play almost always converges to one particular equilibrium

Adaptive play
– n-person game G; player i has strategy set S_i
– The population N is divided into classes C_1, C_2, ..., C_n
– G is played once per period, t = 1, 2, ...
– The play at time t is s(t) = (s_1(t), s_2(t), ..., s_n(t))
– A player in class C_i has utility u_i(s)
– The history of plays is h(t) = (s(1), s(2), ..., s(t))

Choosing strategies
– Choose m and k such that 1 ≤ k ≤ m
– In period t + 1 (for t ≥ m), each player sees k plays from the past m periods
– k/m is the completeness of information
– Plays are not necessarily equally likely to be seen

– The first m plays are random
– H consists of all sequences of length m drawn from ∏ S_i
– This gives a finite Markov chain on H with initial state h(m)
– Each h ∈ H has a successor h′ ∈ H
– For s ∈ S_i, p_i(s|h) is given by a best-reply distribution P_i(·):
  – p_i(s|h) > 0 iff s is a best reply by i to some size-k sample from h
  – p_i(s|h) is independent of t
– The probability of moving from h to h′ is ∏_{i=1,...,n} p_i(s_i|h)
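The adaptive-play dynamics above can be sketched in a few lines. This is an illustrative simulation, not the paper's code: the function name `adaptive_play`, the two-player restriction, and the coordination-game payoffs are all assumptions made for the example.

```python
import random

def adaptive_play(strategies, payoff, m=6, k=2, periods=200, seed=0):
    """Two-player sketch of adaptive play (illustrative only).

    payoff maps (action0, action1) -> (u0, u1).  Each period, each player
    draws k of the last m plays and best-replies to the opponent's actions
    in that sample.
    """
    rng = random.Random(seed)
    # The first m plays are random.
    history = [(rng.choice(strategies), rng.choice(strategies)) for _ in range(m)]
    for _ in range(periods):
        play = []
        for i in (0, 1):
            # Sample k plays from the last m and keep the opponent's actions.
            sample = [h[1 - i] for h in rng.sample(history[-m:], k)]
            def total(a, i=i):
                return sum(payoff[(a, o) if i == 0 else (o, a)][i] for o in sample)
            play.append(max(strategies, key=total))
        history.append(tuple(play))
    return history

# A pure coordination game: (L, L) and (R, R) are strict Nash equilibria,
# and each is absorbing once it fills the m-period memory.
coord = {("L", "L"): (1, 1), ("L", "R"): (0, 0),
         ("R", "L"): (0, 0), ("R", "R"): (1, 1)}
hist = adaptive_play(("L", "R"), coord)
```

Because the coordination game is weakly acyclic and k < m, the theorem below applies, and a typical run quickly settles into one of the two conventions.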

Convergence of adaptive play
– h is an absorbing state iff h = (s, s, ..., s), where s is a strict Nash equilibrium played m times
– Convergence ⇒ strict Nash, but strict Nash does not guarantee convergence (play can cycle)
– Hence the restriction to weakly acyclic games

Best-reply graph
[Figure: best-reply graph on states s, s′, s*; in a weakly acyclic game, from every state some directed best-reply path leads to a strict Nash equilibrium s*]

Theorem
– Let G be a weakly acyclic n-person game
– L(s) = length of the shortest best-reply path from s to a strict Nash equilibrium; L_G = max_s L(s)
– If k ≤ m/(L_G + 2), adaptive play almost surely converges to a convention
– Main idea: if information is sufficiently incomplete, adaptive play converges

Proof
There is positive probability that:
– at some period t + 1, all agents sample the same last k plays (call this sample µ)
– from periods t + 1 to t + k, all agents again draw the sample µ
– each agent makes the same best reply to µ, k times in a row
So there is positive probability of a run (s, s, ..., s) from t + 1 to t + k

If s is a strict Nash equilibrium:
– There is positive probability that from t + k + 1 to t + m, each agent samples the last k plays
– Then s is played for m − k more periods, and the absorbing state has been reached

If s is not a strict Nash equilibrium:
– There is a best-reply path from s to a strict Nash equilibrium s^r: s → s^1 → s^2 → ... → s^r
– For the step s → s^1:
  – Player i samples from periods t + 1 to t + k (i.e., samples s)
  – Everyone else samples µ
  – There is positive probability that this occurs for the next k periods
– By a similar argument, play can move from s^1 to s^2, and so on to s^r
– This is why the size of k must be limited

Example
– Battle of the sexes: opera vs. football game; each player can Yield or Not Yield
– Payoffs (Man, Woman), with Man as the row player:

                Yield       Not Yield
Yield           0, 0        1, √2
Not Yield       √2, 1       0, 0

Why must we limit k?
– Suppose k = m
– Consider an initial sequence in which, in every round, either both yielded or both did not yield
– To decide the next round, each player picks the choice with the highest expected payoff; here, each yields iff 1 − f > f√2, where f is the sampled frequency of the opponent yielding
– What would happen if k were bounded as adaptive play specifies?
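The slide's decision rule is easy to check numerically. In this sketch, `yields` is a hypothetical helper for the first payoff table: against an opponent seen yielding with frequency f, Yield's expected payoff is 1 − f and Not Yield's is f·√2.

```python
import math

def yields(f):
    """Best reply in the table above when the opponent is seen yielding
    with frequency f: Yield pays 1 only when the opponent does not yield,
    so its expected payoff is 1 - f; Not Yield's is f * sqrt(2)."""
    return (1 - f) > f * math.sqrt(2)

# The switch point is f = 1 / (1 + sqrt(2)) = sqrt(2) - 1, about 0.414.
```

So a player yields only when the sampled yielding frequency is below roughly 0.414, which is what drives the cycling behavior when k = m.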

Is this the best we can do?
– The theorem guarantees convergence to an equilibrium, but which equilibrium?
– Also, it seems unlikely that people would always play a best response perfectly

Back to our example, with slightly different payoffs (Man, Woman):

                Yield           Not Yield
Yield           0, 0            1, √2
Not Yield       √2/2, 1/2       0, 0

Let k = 1, m = 3
– Imagine a situation where both yield in the first round and neither yields in the second
– In the 3rd round, the woman samples the yielding round and the man samples the not-yielding round
– What would be each player's best reply? And in the next round?
– Play gets stuck in a suboptimal equilibrium
– Perhaps introducing mistakes could solve this problem
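The slide's scenario can be traced in code. The sketch below uses a hypothetical `best_reply` helper, and assumes Man is the row player of the second table; it reproduces the two seeded rounds and computes the round-3 best replies.

```python
import math

# Payoffs from the second table, indexed (man's action, woman's action),
# assuming Man is the row player: Y = Yield, N = Not Yield.
PAYOFF = {("Y", "Y"): (0.0, 0.0),
          ("Y", "N"): (1.0, math.sqrt(2)),
          ("N", "Y"): (math.sqrt(2) / 2, 0.5),
          ("N", "N"): (0.0, 0.0)}

def best_reply(player, opponent_action):
    """Best reply to a single sampled opponent action (the k = 1 case)."""
    idx = 0 if player == "man" else 1
    def u(a):
        profile = (a, opponent_action) if idx == 0 else (opponent_action, a)
        return PAYOFF[profile][idx]
    return max(("Y", "N"), key=u)

# Rounds 1-2 as on the slide: both yield, then neither yields.
history = [("Y", "Y"), ("N", "N")]
# Round 3: the woman samples round 1 (man played Y); the man samples
# round 2 (woman played N).
round3 = (best_reply("man", "N"), best_reply("woman", "Y"))
history.append(round3)
```

Round 3 comes out as (Yield, Not Yield). Since each of those actions is a best reply to the other, the profile reproduces itself whenever both players happen to sample it, and with m = 3 the memory soon contains nothing else.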

Simulation
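A sketch of what such a simulation might look like, with mistakes added as the previous slide suggests: with probability eps a player picks a random action instead of a best reply to one sampled past play (k = 1). The function name, parameters, and payoff orientation are assumptions for illustration, not the paper's code.

```python
import math
import random

def adaptive_play_with_mistakes(m=3, eps=0.05, periods=2000, seed=1):
    """Perturbed adaptive play for the modified Battle of the Sexes above:
    with probability eps a player chooses a random action instead of a
    best reply to a single sampled past play."""
    rng = random.Random(seed)
    payoff = {("Y", "Y"): (0.0, 0.0), ("Y", "N"): (1.0, math.sqrt(2)),
              ("N", "Y"): (math.sqrt(2) / 2, 0.5), ("N", "N"): (0.0, 0.0)}
    def reply(i, opp):
        return max("YN", key=lambda a: payoff[(a, opp) if i == 0 else (opp, a)][i])
    # Seed the first m plays at random.
    history = [tuple(rng.choice("YN") for _ in range(2)) for _ in range(m)]
    counts = {}  # how often each profile is played
    for _ in range(periods):
        play = tuple(rng.choice("YN") if rng.random() < eps
                     else reply(i, rng.choice(history[-m:])[1 - i])
                     for i in (0, 1))
        history.append(play)
        counts[play] = counts.get(play, 0) + 1
    return counts
```

Tallying how often each profile is played over a long horizon is one way to see which equilibrium the perturbed process favors, in line with the paper's claim that with mistakes play almost always converges to one particular equilibrium.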