Repeated Games with Costly Observations. Eilon Solan, Tel Aviv University; joint with Ehud Lehrer, Tel Aviv University.

Literature Review. Ben-Porath and Kahneman (2003). Players do not observe each other's actions. After each stage, for each subset Q of players, player i can pay c_i(Q) and secretly learn the actions that the players in Q just played. Players can make public announcements. The discount factor is close to 1. Result: a folk theorem as the discount factor goes to 1.

Literature Review. Miyagawa, Miyahara, and Sekiguchi (2008). Players costlessly observe random signals; payoffs depend on the private signal and on one's own action. After each stage, for each subset Q of players, player i can pay c_i(Q) and secretly learn the actions that the players in Q just played. A public randomization device is available. The discount factor is close to 1. Result: under suitable assumptions on the signalling structure, a folk theorem as the discount factor goes to 1.

Literature Review. Flesch and Perea (2009). Players can purchase information on any subset of past stages, paying c for each such stage (and then observing all players' actions at those stages). |N| ≥ 3 and |A_i| ≥ 4. The discount factor is close to 1. Result: under a full-dimensionality condition, a folk theorem as the discount factor goes to 1.

Literature Review. Fudenberg and Levine (1994). Repeated games with imperfect monitoring: there are public signals and private signals. The discount factor is close to 1. Result: a characterization of the limit set of public perfect equilibrium payoffs as the discount factor goes to 1.

Literature Review. Fudenberg and Levine (1994). A repeated game with costly observation can be presented as a repeated game with imperfect monitoring. The base game:

         L        R
T       2,3      0,1
B       6,7      4,5

Each action is paired with a decision whether to observe the opponent (at cost c) or not:

          L+no      R+no      L+obs       R+obs
T+no      2,3       0,1       2,3-c       0,1-c
T+obs     2-c,3     0-c,1     2-c,3-c     0-c,1-c
B+no      6,7       4,5       6,7-c       4,5-c
B+obs     6-c,7     4-c,5     6-c,7-c     4-c,5-c
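The reduction on this slide can be made concrete with a small script. The function name, the dictionary encoding, and the value c = 0.5 are mine for illustration; the base-game payoffs are the slide's example.

```python
# Sketch: encode a repeated game with costly observation as a game with
# imperfect monitoring, by pairing each action with an observe/no flag.

def extend_with_observation(u1, u2, c):
    """Given 2x2 payoff dicts u1[a][b], u2[a][b], return the 4x4 game in
    which each player also chooses whether to pay c to observe the other."""
    row_actions = list(u1)                      # e.g. ["T", "B"]
    col_actions = list(next(iter(u1.values()))) # e.g. ["L", "R"]
    table = {}
    for a in row_actions:
        for o1 in ("no", "obs"):
            for b in col_actions:
                for o2 in ("no", "obs"):
                    p1 = u1[a][b] - (c if o1 == "obs" else 0)
                    p2 = u2[a][b] - (c if o2 == "obs" else 0)
                    table[(a + "+" + o1, b + "+" + o2)] = (p1, p2)
    return table

u1 = {"T": {"L": 2, "R": 0}, "B": {"L": 6, "R": 4}}
u2 = {"T": {"L": 3, "R": 1}, "B": {"L": 7, "R": 5}}
game = extend_with_observation(u1, u2, c=0.5)
print(game[("T+obs", "L+no")])  # Player 1 pays the cost: (1.5, 3)
```

Only the observer's own payoff is reduced by c, which is exactly the pattern in the 4x4 table above.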

The Model (Informal). Consider a two-player repeated game in continuous time. The players sit in different rooms and do not observe each other. After every time instant, each player can open a window between the rooms and observe the action the other has just played, at a fixed cost c. The fact that a window is opened is common knowledge. The payoff of Player i, if he observes the other at times (τ(i,k))_k, is

∫ exp(-rt) u_i(a(t)) dt - c Σ_k exp(-r τ(i,k))

The Model (Formal). Two players, 1 and 2. A_i = the (finite) set of actions of Player i; A := A_1 × A_2. u_i : A → R is the payoff function of Player i. The base game: G = (A_1, A_2, u_1, u_2). The repeated game G(r,Δ,c): r = the discount factor, Δ = the time between two consecutive stages, c = the cost of each observation. The payoff of Player i, if he observes the other at stages (τ(i,k))_k, is

Σ_n (1-r^Δ) r^(Δ(n-1)) u_i(a_n) - c Σ_k r^(Δ(τ(i,k)-1))
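The discrete-time payoff formula can be evaluated numerically. The helper below is my own illustrative code; it computes the slide's expression term by term, with stages numbered from 1 as on the slide.

```python
# Numerical sketch of the repeated-game payoff with costly observations:
# stage payoffs are discounted by r^Δ per stage, and each observation at
# stage τ costs c, discounted back to the start of play.

def repeated_payoff(stage_payoffs, obs_stages, r, delta, c):
    """Sum_n (1 - r^Δ) r^{Δ(n-1)} u_i(a_n)  -  c Sum_k r^{Δ(τ_k - 1)}."""
    d = r ** delta
    flow = sum((1 - d) * d ** (n - 1) * u
               for n, u in enumerate(stage_payoffs, start=1))
    cost = c * sum(d ** (tau - 1) for tau in obs_stages)
    return flow - cost

# Sanity check: a constant stage payoff of 1, never observing, is worth 1
# (up to the truncation of the infinite sum):
payoffs = [1.0] * 10_000
print(round(repeated_payoff(payoffs, [], r=0.9, delta=1.0, c=0.1), 6))  # 1.0
```

Observing at stage 1 simply subtracts the full cost c, since r^(Δ·0) = 1.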

Differences from the Literature. Time is "continuous", which implies that the cost of observation is "infinite" relative to the stage payoff. Observing the other player is common knowledge. No correlation device or public announcements. Two players.

The Minmax Value. Theorem: The minmax value in mixed strategies of player i in the game G(r,Δ,c) equals v_i, his minmax value in the base game G.

Equilibria without Observation. Theorem: The set of public perfect equilibrium payoffs in the game G(r,Δ,c) contains the convex hull of the set of equilibrium payoffs of the base game G (provided Δ is small).
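Since the first theorem reduces the repeated game's minmax value to that of the base game G, it can be computed from G alone. A minimal sketch, using a grid search over the opponent's mixed strategies rather than the exact linear program one would use in serious work; the function name and the example game are mine.

```python
# Sketch: approximate Player 1's minmax value in a base game where
# Player 2 has two pure actions, by searching over Player 2's mixtures.

def minmax_player1(u1, grid=100_001):
    """min over q = P(column "L") of the max, over Player 1's pure
    actions, of his expected payoff against the mixture (q, 1-q)."""
    best = float("inf")
    for k in range(grid):
        q = k / (grid - 1)
        # Against a fixed mixture, a pure best response suffices:
        val = max(q * row["L"] + (1 - q) * row["R"] for row in u1.values())
        best = min(best, val)
    return best

# Matching-pennies-style payoffs for Player 1: the minmax value is 0.5.
u1 = {"T": {"L": 1, "R": 0}, "B": {"L": 0, "R": 1}}
print(round(minmax_player1(u1), 3))  # 0.5
```

With more actions per player the grid becomes impractical and the standard LP formulation of the minmax problem should be used instead.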

No Folk Theorem (when Δ is small)! E_N(r,Δ,c) = the set of Nash equilibrium payoffs in the repeated game. The Prisoner's Dilemma:

        C        D
C      3,3      0,4
D      4,0      1,1

x_1 = (1-δ)u_1 + δ·Cont_1 - (??)c ≤ (1-δ)u_1 + δ·x_1 - (??)c, hence x_1 ≤ u_1 - (??)c/(1-δ).

Main Result. Theorem: Assume that for each player i there is an equilibrium in the base game that yields that player a payoff strictly higher than his minmax value. Then E_P(r) = E_N(r) = the orange set (in the figure, whose axes are marked v_1, v_2, M_1, M_2).

E_P(r,Δ,c) := the set of public perfect equilibrium payoffs.
E_P(r,c) := lim_(Δ→0) E_P(r,Δ,c).
E_P(r) := limsup_(c→0) E_P(r,c).
M_1 := the maximum, over pairs (α_1,α_2) in which α_2 is a best response to α_1, of min{u_1(a_1,α_2) : a_1 in the support of α_1}.
M_2 := the analogous quantity for Player 2.

Role 1: Throwing Away Utility. Theorem: Let x be an equilibrium payoff in the base game G, and let y be a vector satisfying v < y < x (in each coordinate). If c is sufficiently small, then y is a Nash equilibrium payoff of G(r,Δ,c).

Proof: The players play the mixed action pair that supports x as an equilibrium in the base game. To waste utility, Player 1 observes Player 2 at predetermined stages whose total discounted observation cost is x_1 - y_1; similarly, Player 2 observes Player 1 at predetermined stages whose total discounted observation cost is x_2 - y_2. A deviation triggers punishment at the minmax level.
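The "predetermined stages" in this proof can be chosen greedily: keep adding observation stages while the discounted cost still fits under the amount x_1 - y_1 to be burned. A sketch under my own greedy rule, with illustrative numbers; since stage costs shrink geometrically, the residual can be made arbitrarily small.

```python
# Sketch of "throwing away utility": pick observation stages τ so that
# the total discounted cost  c * Σ r^{Δ(τ-1)}  approximates `target`
# (= x1 - y1) from below.

def burn_utility(target, r, delta, c, max_stage=10_000):
    """Greedy: take stage τ whenever its discounted cost still fits."""
    d = r ** delta
    stages, total = [], 0.0
    for tau in range(1, max_stage + 1):
        cost = c * d ** (tau - 1)
        if total + cost <= target + 1e-12:
            stages.append(tau)
            total += cost
    return stages, total

stages, total = burn_utility(target=0.25, r=0.9, delta=1.0, c=0.1)
print(round(total, 4))  # very close to 0.25, never above it
```

The approximation error is what makes the equilibrium-payoff set "(almost)" rather than exactly attainable at a fixed c; it vanishes as c → 0.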

Role 2: Knowing the Other Player's Payoff. Theorem: The set of Nash equilibrium payoffs of G(r,Δ,c) is (almost) convex.

Proof: Suppose the players would like to randomize between two equilibria. At the first stage, the players throw away utility to ensure that there is no profitable deviation at the first stage.

Additional Equilibria? Suppose that at stage t the players do not play an equilibrium of the base game G. Then any player who does not play a best response must be monitored with positive probability. Observation is costly, and therefore must be compensated. The benefit of a deviation is O(1-r^Δ); the expected loss due to a deviation is O(p), where p is the probability that the other player observes you.

Role 3: Probabilistic Observation for Threat. α_1 is a best response to α_2; Player 2 is indifferent among his pure actions in the support of α_2; u(α_1,α_2) = (0,1). β_2 is a best response to β_1; Player 1 is indifferent among his pure actions in the support of β_1; u(β_1,β_2) = (1,0).

To implement a payoff x, the players play (α_1,α_2) and Player 1 observes with probability p. The continuation payoff is z if Player 1 observed and y if he did not, so that

x_1 = (1-δ)u_1(α_1,α_2) + δ·y_1 = (1-δ)u_1(α_1,α_2) + δ·z_1 - c.
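On my reading of this slide's indifference condition, Player 1 must get the same payoff whether or not he pays c to observe, so the continuation payoff after observing must exceed the one after not observing by exactly c/δ. A quick numeric check, with all numbers chosen purely for illustration:

```python
# Numeric check of the observation-indifference condition: with
# z1 = y1 + c/δ, Player 1's payoff is the same along both branches.

delta, c = 0.9, 0.05
u1_alpha = 0.4          # stage payoff u_1(α_1, α_2) (illustrative)
y1 = 0.6                # continuation payoff if Player 1 did not observe
z1 = y1 + c / delta     # continuation payoff if he did observe

no_obs = (1 - delta) * u1_alpha + delta * y1
obs = (1 - delta) * u1_alpha + delta * z1 - c
print(abs(no_obs - obs) < 1e-12)  # True
```

This is the sense in which costly observation "must be compensated": the observing branch carries a continuation bonus of c/δ.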

A Public Perfect Equilibrium with Low Payoffs. Let z = u(α_1,α_2) be an equilibrium payoff in the base game G, let β_i be a minmax strategy of Player i in G, and let γ_i be a best response of Player i to β_(3-i). For N stages the players play (β_1,β_2) and observe each other. For M stages the players play (γ_1,β_2) and Player 1 observes Player 2. After stage N+M the players play (α_1,α_2) and throw away utility to maintain indifference. If a deviation is detected, they start from the beginning.

Thank you!