Attainability in Repeated Games with Vector Payoffs
Eilon Solan, Tel Aviv University
Joint with: Dario Bauso, University of Palermo; Ehud Lehrer, Tel Aviv University



Two players play a repeated game with $d$-dimensional vector payoffs. The total payoff up to stage $n$ is $G_n$.

Definition (Blackwell, 1956): A set of payoff vectors $A$ is approachable by player 1 if player 1 has a strategy such that the average payoff up to stage $n$, $G_n/n$, converges to $A$, regardless of the strategy of player 2.

Definition: A set of payoff vectors $A$ is attainable by player 1 if player 1 has a strategy such that the total payoff up to stage $n$, $G_n$, converges to $A$, regardless of the strategy of player 2.
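The gap between the two definitions can be seen on a toy payoff stream. The sketch below is only an illustration under stated assumptions: the stream (a payoff of 1 at the first stage and 0 afterwards) and the helper name `running_totals` are hypothetical and do not come from the talk, and the payoffs are scalar and fixed in advance rather than generated by a game. The average payoff tends to 0 while the total payoff stays at 1, so convergence of $G_n/n$ to the set {0} does not give convergence of $G_n$ to {0}.

```python
def running_totals(stage_payoffs):
    """Return the list of total payoffs G_1, G_2, ..., G_n."""
    totals = []
    total = 0.0
    for g in stage_payoffs:
        total += g
        totals.append(total)
    return totals


# Hypothetical stream: payoff 1 at stage 1, payoff 0 at every later stage.
stage_payoffs = [1.0] + [0.0] * 999
totals = running_totals(stage_payoffs)

# Average payoff G_n / n, the quantity used in approachability.
averages = [G / n for n, G in enumerate(totals, start=1)]

# The average shrinks toward 0, but the total stays at distance 1 from 0.
print(averages[-1], totals[-1])
```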

Motivation 1: Control theory
$d_n$ is the demand at stage $n$ (multi-dimensional, unknown).
$s_n$ is the supply at stage $n$ (multi-dimensional, controlled by the decision maker).
$s_n - d_n$ is the excess supply, the amount that is left in our storeroom.
We need to bound the total excess supply.

Motivation 2: Banking, Capital Adequacy Ratio
$c_n$ = bank's capital at stage $n$.
$a_n$ = bank's risk-weighted assets at stage $n$.
$c_n / a_n$ = capital adequacy ratio at stage $n$.

Definition: A set of payoff vectors $A$ is attainable by player 1 if player 1 has a strategy such that the total payoff up to stage $n$, $G_n$, converges to $A$, regardless of the strategy of player 2.

The Model

A repeated game with $d$-dimensional vector payoffs, $(A_1, A_2, u)$.
The game is played in continuous time.
We consider non-anticipating behavior strategies: $\sigma_i = (\sigma_i(t))$ is a process with values in $\Delta(A_i)$ such that there is an increasing sequence of stopping times $\tau_i^1 < \tau_i^2 < \tau_i^3 < \dots$ that satisfies: for each $t$ with $\tau_i^k \le t < \tau_i^{k+1}$, $\sigma_i(t)$ is measurable w.r.t. the information at time $\tau_i^k$.

The Model

$g_t$ = payoff at time $t$ (given the mixed actions of the players).
$G_t = \int_0^t g_s \, ds$, where $g_s$ is evaluated at the mixed action pair at time $s$.

Definition: A set $A$ in $\mathbb{R}^d$ is strongly attainable by player 1 if player 1 has a strategy that guarantees that $\lim_{t \to \infty} d(A, G_t) = 0$, regardless of player 2's strategy.

Definition: A set $A$ in $\mathbb{R}^d$ is attainable by player 1 if for every $\varepsilon > 0$ the set $B(A, \varepsilon)$ is strongly attainable by player 1, where $B(A, \varepsilon) := \{ x : d(x, A) \le \varepsilon \}$.

Theorem: The set of vectors attainable by player 1 is a closed and convex cone.

If the vector $x$ is attainable, then there is a strategy $\sigma_1$ that ensures that $\lim_{t \to \infty} \int_0^t g_s \, ds = x$ for every strategy $\sigma_2$ of player 2.
The strategy $\sigma_1$, accelerated by a factor of $\beta$, attains $x/\beta$.
If the vectors $x$ and $y$ are attainable, then to attain $x + y$, first attain $x$, then forget past play and attain $y$.
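The acceleration step can be written out as a change of variables. This is a heuristic sketch, not the proof from the talk: the notation $\sigma_1^\beta$ and $G_t^\beta$ is introduced here for illustration, and the computation assumes that play against the accelerated strategy is the time-rescaled version of some play against $\sigma_1$, so that the payoff rate at time $s$ is $g_{\beta s}$.

```latex
% Accelerated strategy (hypothetical notation): \sigma_1^\beta(t) := \sigma_1(\beta t).
% Assumption: the payoff rate under accelerated play at time s equals g_{\beta s}.
\begin{align*}
G_t^\beta
  &= \int_0^t g_{\beta s} \, ds
   = \frac{1}{\beta} \int_0^{\beta t} g_r \, dr   % substitute r = \beta s
   = \frac{1}{\beta} \, G_{\beta t}, \\
\lim_{t \to \infty} G_t^\beta
  &= \frac{1}{\beta} \lim_{t \to \infty} G_{\beta t}
   = \frac{x}{\beta}.
\end{align*}
```

Combined with the concatenation argument for $x + y$, closure under positive scaling and under addition is what makes the attainable set a convex cone.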

Theorem: The vector $x$ is attainable by player 1 if and only if
a) The vector 0 is attainable by player 1.
b) For every function $f : \Delta(A_1) \to \Delta(A_2)$, the vector $x$ is in the cone generated by $\{ u(p, f(p)) : p \in \Delta(A_1) \}$.

If (b) does not hold: For a function $f$ that violates (b), player 2 plays $f(\alpha)$ whenever player 1 plays the mixed action $\alpha$.
If (a) + (b) hold: Consider an auxiliary one-shot game in which player 1 chooses a distribution over $\Delta(A_1)$ and player 2 chooses $f : \Delta(A_1) \to \Delta(A_2)$. For every strategy of player 2, player 1 has a response such that the average payoff is $x$. Therefore player 1 has a strategy that "pushes towards $x$" whatever $f$ player 2 chooses.

Theorem: The following conditions are equivalent:
a) The vector 0 is attainable by player 1.
b) One has $v_\lambda \ge 0$ for every $\lambda$ in $\mathbb{R}^d$, where $v_\lambda$ is the value of the game projected in the direction $\lambda$.

If (b) does not hold: There is $q$ in $\Delta(A_2)$ such that the payoff is in some open halfspace. If player 2 always plays this $q$, the payoff does not converge to 0.
If (b) holds: Player 1 plays in small intervals. In each interval he pushes the payoff towards 0.
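Condition (b) can be probed numerically in a small case. The sketch below is a rough check under several assumptions that are not taken from the talk: each player has two actions, $d = 2$, the projected game is read as the zero-sum matrix game with scalar payoff $\langle \lambda, u(a_1, a_2) \rangle$ in which player 1 maximizes, and only finitely many unit directions $\lambda$ are sampled (unit directions suffice in principle because $v_\lambda$ scales with positive multiples of $\lambda$). The function names and both example games are hypothetical.

```python
import math


def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game in which the row player maximizes."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # Pure saddle point.
        return maximin
    # No saddle point: standard closed form for the mixed-strategy value.
    return (a * d - b * c) / (a + d - b - c)


def projected_value(u, lam):
    """v_lambda: value of the scalar game with payoffs <lam, u[i][j]>."""
    m = [[sum(l * x for l, x in zip(lam, u[i][j])) for j in range(2)]
         for i in range(2)]
    return value_2x2(m)


def passes_condition_b(u, n_dirs=360, tol=1e-9):
    """Check v_lambda >= 0 on a grid of unit directions (a sample, not a proof)."""
    for k in range(n_dirs):
        theta = 2.0 * math.pi * k / n_dirs
        lam = (math.cos(theta), math.sin(theta))
        if projected_value(u, lam) < -tol:
            return False
    return True


# Hypothetical example 1: matching pennies in the first coordinate, 0 in the second.
pennies = [[(1.0, 0.0), (-1.0, 0.0)],
           [(-1.0, 0.0), (1.0, 0.0)]]

# Hypothetical example 2: the payoff is the constant vector (1, 0).
constant = [[(1.0, 0.0), (1.0, 0.0)],
            [(1.0, 0.0), (1.0, 0.0)]]

print(passes_condition_b(pennies), passes_condition_b(constant))
```

In the first game every projected value is 0, so the sampled check passes. In the second game the direction $\lambda = (-1, 0)$ gives $v_\lambda = -1$, so the check fails, which matches the intuition that a total payoff growing at constant rate $(1, 0)$ cannot converge to 0.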

Further Questions

1) Characterization of attainable sets.
2) Characterization of strongly attainable sets and vectors.
3) Characterization of attainable sets in discrete time.
4) Characterization of attainable sets when payoff is discounted.