Download presentation
Presentation is loading. Please wait.
Published byElinor Moody Modified over 9 years ago
1
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 1/39 n-Player Stochastic Games with Additive Transitions Frank Thuijsman János Flesch & Koos Vrieze Maastricht University European Journal of Operational Research 179 (2007) 483–497
2
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 2/39 Outline Model Brief History of Stochastic Games Additive Transitions Examples
3
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 3/39 a s = (a 1 s, a 2 s, , a n s ) joint action r s (a s ) = (r 1 s (a s ), r 2 s (a s ), , r n s (a s )) rewards p s (a s ) = (p s (1|a s ), p s (2|a s ), , p s (z|a s )) transitions Infinite horizon Complete Information Perfect Recall Independent and Simultaneous Choices Finite Stochastic Game 1sz rs(as)rs(as) ps(as)ps(as)
4
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 4/39 3-Player Stochastic Game 0, 0, 0 1, 3, 00, 3, 13, 0, 1 1 2 34 (1, 0, 0, 0) (2/3, 1/3, 0, 0) (2/3, 0, 1/3, 0) (2/3, 0, 0, 1/3) (1/3, 1/3, 0, 1/3) (1/3, 0, 1/3, 1/3) (1/3, 1/3, 1/3, 0) (0, 1/3, 1/3, 1/3) (0, 1, 0, 0) (0, 0, 1, 0)(0, 0, 0, 1) 0, 0, 0 T L F R B N 1 2 3
5
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 5/39 general strategy i : N × S × H → X i (k, s, h) → X i s Markov strategy f i : N × S → X i (k, s) → X i s stationary strategy x i : S → X i (s) → X i s opponents’ strategy i, f i and x i Strategies mixed actions
6
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 6/39 -Discounted rewards (with 0 < < 1) i s ( ) = E s ((1 ) k k 1 R i k ) Limiting average rewards i s ( ) = E s (lim K → K 1 K k=1 R i k ) Rewards
7
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 7/39 MinMax Values -Discounted minmax v i s inf i sup i i s ( ) Limiting average minmax v i s inf i sup i i s ( ) Highest rewards player i can defend
8
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 8/39 Equilibrium = ( i ) i N is an -equilibrium if i s ( i, i ) ≤ i s ( ) + for all i, for all i and for all s.
9
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 9/39 Question (0, 0, 1, 0)(0, 1, 0, 0) (0, 0, 0, 1) (1, 0, 0, 0) (0, 0.5, 0.5, 0) 3 2 1 0, 0 3, 1 0, 0 Any -equilibrium ? (0, 0, 0, 1) 4 2,1
10
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 10/39 Highlights from (Finite) Stochastic Games History Shapley, 1953 0-sum, “discounted” Everett, 1957 0-sum, recursive, undiscounted Gillette, 1957 0-sum, big match problem
11
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 11/39 Highlights from (Finite) Stochastic Games History Fink, 1964 & Takahashi, 1964 n-player, discounted Blackwell & Ferguson, 1968 0-sum, big match solution Liggett & Lippmann, 1969 0-sum, perfect inf., undiscounted
12
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 12/39 Highlights from (Finite) Stochastic Games History Kohlberg, 1974 0-sum, absorbing, undiscounted Mertens & Neyman, 1981 0-sum, undiscounted Sorin, 1986 2-player, Paris Match, undiscounted
13
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 13/39 Highlights from (Finite) Stochastic Games History Vrieze & Thuijsman, 1989 2-player, absorbing, undiscounted Thuijsman & Raghavan, 1997 n-player, perfect inf., undiscounted Flesch, Thuijsman, Vrieze, 1997 3-player, absorbing example, undiscounted
14
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 14/39 Highlights from (Finite) Stochastic Games History Solan, 1999 3-player, absorbing, undiscounted Vieille, 2000 2-player, undiscounted Solan & Vieille, 2001 n-player, quitting, undiscounted
15
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 15/39 Additive Transitions p s (a s ) = n i=1 i s p i s (a i s ) p i s (a i s ) transition probabilities controlled by player i in state s i s transition power of player i in state s 0 ≤ i s ≤ 1 and i i s = 1 for each s
16
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 16/39 Example for 2-Player Additive Transitions p s (a s ) = n i=1 i s p i s (a i s ) (1, 0, 0) (0.7, 0, 0.3)(0, 0.7, 0.3) (0.3, 0.7, 0) p 1 1 (2) = (0, 0, 1) p 1 1 (1) = (1, 0, 0) p 2 1 (1)=(1, 0, 0)p 2 1 (2)=(0, 1, 0) 1 1 = 0.3 2 1 = 0.7 1
17
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 17/39 Results 1.0-equilibria for n-player AT games (threats!) 2.0-opt. stationary strat. for 0-sum AT games 3.Stat. -equilibria for 2-player abs. AT games 4.Result 3 can not be strengthened, neither to 3-player abs. AT games, nor to 2-player non-abs. AT games, nor to give stat. 0-equilibria
18
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 18/39 The Essential Observation Additive Transitions induce a Complete Ordering of the Actions If a i s is “better” than b i s against some strategy, Then a i s is “better” than b i s against any strategy.
19
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 19/39 “Better” Consider strategies a i s and b i s for player i If for some strategy a i s we have t S p s (t | a i s, a i s ) v i t ≥ t S p s (t | b i s, a i s ) v i t Then for all strategies b i s we have t S p s (t | a i s, b i s ) v i t ≥ t S p s (t | b i s, b i s ) v i t
20
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 20/39 Because …. If t S p s (t | a i s, a i s )v i t ≥ t S p s (t | b i s, a i s )v i t Then i s t S p s (t | a i s ) v i t + j i j s t S p s (t | a j s ) v i t ≥ i s t S p s (t | b i s ) v i t + j i j s t S p s (t | a j s ) v i t
21
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 21/39 which implies that …. i s t S p s (t | a i s ) v i t + j i j s t S p s (t | b j s ) v i t ≥ i s t S p s (t | b i s ) v i t + j i j s t S p s (t | b j s ) v i t And therefore t S p s (t | a i s, b i s )v i t ≥ t S p s (t | b i s, b i s )v i t
22
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 22/39 “Best” The “best” actions for player i in state s are those that maximize the expression t S p s (t | a i s ) v i t
23
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 23/39 The Restricted Game Let G be the original AT game and let G* be the restricted AT game, where each player is restricted to his “best” actions.
24
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 24/39 The Restricted Game Now v * i ≥ v i for each player i. In G* : t S p s (t | a* s ) v* i t = v* i s i, s, a* s In G : t S p s (t | b i s, a* i s ) v i t < v i s i, s, a* i s, b i s
25
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 25/39 The Restricted Game If x* i yields at least v* i in G*, then x* i yields at least v* i in G as well.
26
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 26/39 Ex. 1: 2-Player Absorbing AT Game (0, 0, 1)(0, 1, 0) (1, 0, 0)(0.5, 0.5, 0) (0, 0.5, 0.5) (0.5, 0, 0.5) 3 2 1 0, 0 1, 3 3, 1 (1, 0, 0) (0, 1, 0) (0, 0, 1) 0.5
27
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 27/39 (0, 0, 1)(0, 1, 0) (1, 0, 0)(0.5, 0.5, 0) (0, 0.5, 0.5) (0.5, 0, 0.5) 3 2 1 0, 0 1, 3 3, 1 T, L B, L B, RT, R 0, 0 -1, 3 -2, 2-3, 1 T, L 0, 0 NO stationary 0-equilibrium Ex. 1: 2-Player Absorbing AT Game
28
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 28/39 (0, 0, 1)(0, 1, 0) (1, 0, 0)(0.5, 0.5, 0) (0, 0.5, 0.5) (0.5, 0, 0.5) 3 2 1 0, 0 stationary -equilibrium with >0 1 1 0 equilibrium rewards ≈ (( 1 1, 3 3, 1 Ex. 1: 2-Player Absorbing AT Game
29
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 29/39 (0, 0, 1)(0, 1, 0) (1, 0, 0)(0.5, 0.5, 0) (0, 0.5, 0.5) (0.5, 0, 0.5) 3 2 1 0, 0 1, 3 3, 1 non-stationary -equilibrium Player 1: B, T, T, T, …. Player 2: R, R, R, R, …. equilibrium rewards (( 2 2 3, 1 1, 3 Ex. 1: 2-Player Absorbing AT Game
30
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 30/39 Ex. 2: 2-Player Non-Absorbing AT Game (0, 0, 1, 0)(0, 1, 0, 0) (0, 0, 0, 1) (1, 0, 0, 0) (0, 0.5, 0.5, 0) 3 2 1 0, 0 3, 1 0, 0 q > 0 p < q 1 q 1 p p q = 0 p > 1 q > 0 NO stationary -equilibrium with >0 (0, 0, 0, 1) 4 2,1
31
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 31/39 non-stationary -equilibrium Player 1: T, B, B, B, …. Player 2: R, R, R, R, …. (0, 0, 1, 0)(0, 1, 0, 0) (0, 0, 0, 1) (1, 0, 0, 0) (0, 0.5, 0.5, 0) 3 2 1 0, 0 3, 1 0, 0 (0, 0, 0, 1) 4 2,1 Ex. 2: 2-Player Non-Absorbing AT Game equilibrium rewards ((2, 1), (
32
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 32/39 alternative -equilibrium Player 1: T, T, T, T, …. Player 2: L, R, R, R, …. (0, 0, 1, 0)(0, 1, 0, 0) (0, 0, 0, 1) (1, 0, 0, 0) (0, 0.5, 0.5, 0) 3 2 1 0, 0 3, 1 0, 0 (0, 0, 0, 1) 4 2,1 Ex. 2: 2-Player Non-Absorbing AT Game equilibrium rewards ((2, 1), (
33
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 33/39 Ex. 3: 3-Player Absorbing AT Game 0, 0, 0 1, 3, 00, 1, 33, 0, 1 1 2 34 (1, 0, 0, 0) (2/3, 1/3, 0, 0) (2/3, 0, 1/3, 0) (2/3, 0, 0, 1/3) (1/3, 1/3, 0, 1/3) (1/3, 0, 1/3, 1/3) (1/3, 1/3, 1/3, 0) (0, 1/3, 1/3, 1/3) (0, 1, 0, 0) (0, 0, 1, 0)(0, 0, 0, 1) 0, 0, 0 T L F R B N How to share 4 among three people if only few solutions are allowed? 1 2 3
34
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 34/39 0, 0, 0 1, 3, 0 0, 1, 3 3, 0, 1 1, 3, 00, 1, 33, 0, 1 1 2 34 (1, 0, 0, 0) (2/3, 1/3, 0, 0) (2/3, 0, 1/3, 0) (2/3, 0, 0, 1/3) (1/3, 1/3, 0, 1/3) (1/3, 0, 1/3, 1/3) (1/3, 1/3, 1/3, 0) (0, 1/3, 1/3, 1/3) (0, 1, 0, 0) (0, 0, 1, 0)(0, 0, 0, 1) 1/2, 2, 3/2 2, 3/2, 1/2 3/2, 1/2, 2 4/3, 4/3, 4/3 T L F R B N Ex. 3: 3-Player Absorbing AT Game
35
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 35/39 0, 0, 0 1, 3, 0 0, 1, 3 3, 0, 1 1/3 * 2/3 * 1 * 1/2, 2, 3/2 2, 3/2, 1/2 3/2, 1/2, 2 4/3, 4/3, 4/3 T L F R B N NO stationary -equilibrium Ex. 3: 3-Player Absorbing AT Game 1, 3, 0 0, 1, 3 3, 0, 1
36
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 36/39 0, 0, 0 1, 3, 0 0, 1, 3 3, 0, 1 1/3 * 2/3 * 1 * 1/2, 2, 3/2 2, 3/2, 1/2 3/2, 1/2, 2 4/3, 4/3, 4/3 T L F R B N Player 1 on B: 1, ¾, 0, 0, 0, 0, 1, ¾, 0, 0, 0, 0, …. Player 2 on R: 0, 0, 1, ¾, 0, 0, 0, 0, 1, ¾, 0, 0, …. Player 3 on F: 0, 0, 0, 0, 1, ¾, 0, 0, 0, 0, 1, ¾, …. non-stationary -equilibrium Ex. 3: 3-Player Absorbing AT Game
37
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 37/39 0, 0, 0 1, 3, 0 0, 1, 3 3, 0, 1 1/3 * 2/3 * 1 * 1/2, 2, 3/2 2, 3/2, 1/2 3/2, 1/2, 2 4/3, 4/3, 4/3 T L F R B N equilibrium rewards (1, 2, 1) Ex. 3: 3-Player Absorbing AT Game Player 1 on B: 1, ¾, 0, 0, 0, 0, 1, ¾, 0, 0, 0, 0, …. Player 2 on R: 0, 0, 1, ¾, 0, 0, 0, 0, 1, ¾, 0, 0, …. Player 3 on F: 0, 0, 0, 0, 1, ¾, 0, 0, 0, 0, 1, ¾, ….
38
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 38/39 Results 1.0-equilibria for n-player AT games (threats!) 2.0-opt. stationary strat. for 0-sum AT games 3.Stat. -equilibria for 2-player abs. AT games 4.Result 3 can not be strengthened, neither to 3-player abs. AT games, nor to 2-player non-abs. AT games, nor to give stat. 0-equilibria
39
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 39/39 frank@math.unimaas.nl ?
40
05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 40/39 GAME VER
41
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 41/39 GAME VER
42
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 42/39 GAME VER
43
frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 43/39 GAME VER
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.