Presentation is loading. Please wait.

Presentation is loading. Please wait.

05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 1/39 n-Player Stochastic Games with Additive Transitions.

Similar presentations


Presentation on theme: "05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 1/39 n-Player Stochastic Games with Additive Transitions."— Presentation transcript:

1 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 1/39 n-Player Stochastic Games with Additive Transitions Frank Thuijsman János Flesch & Koos Vrieze Maastricht University European Journal of Operational Research 179 (2007) 483–497

2 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 2/39 Outline Model Brief History of Stochastic Games Additive Transitions Examples

3 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 3/39 a s = (a 1 s, a 2 s, , a n s ) joint action r s (a s ) = (r 1 s (a s ), r 2 s (a s ), , r n s (a s )) rewards p s (a s ) = (p s (1|a s ), p s (2|a s ), , p s (z|a s )) transitions Infinite horizon Complete Information Perfect Recall Independent and Simultaneous Choices Finite Stochastic Game 1sz rs(as)rs(as) ps(as)ps(as)

4 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 4/39 3-Player Stochastic Game 0, 0, 0 1, 3, 00, 3, 13, 0, 1 1 2 34 (1, 0, 0, 0) (2/3, 1/3, 0, 0) (2/3, 0, 1/3, 0) (2/3, 0, 0, 1/3) (1/3, 1/3, 0, 1/3) (1/3, 0, 1/3, 1/3) (1/3, 1/3, 1/3, 0) (0, 1/3, 1/3, 1/3) (0, 1, 0, 0) (0, 0, 1, 0)(0, 0, 0, 1) 0, 0, 0 T L F R B N 1 2 3

5 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 5/39 general strategy   i : N × S × H → X i (k, s, h) → X i s Markov strategy f i : N × S → X i (k, s) → X i s stationary strategy x i : S → X i (s) → X i s opponents’ strategy    i, f  i and x  i Strategies mixed actions

6 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 6/39  -Discounted rewards (with 0 <  < 1)  i  s (  ) = E s  ((1  )  k  k  1 R i k ) Limiting average rewards  i s (  ) = E s  (lim K →  K  1  K k=1 R i k ) Rewards

7 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 7/39 MinMax Values  -Discounted minmax v i  s  inf   i sup  i  i  s (  ) Limiting average minmax v i s  inf   i sup  i  i s (  ) Highest rewards player i can defend

8 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 8/39  Equilibrium  = (  i ) i  N is an  -equilibrium if  i s (  i,   i ) ≤  i s (  ) +  for all  i, for all i and for all s.

9 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 9/39 Question (0, 0, 1, 0)(0, 1, 0, 0) (0, 0, 0, 1) (1, 0, 0, 0) (0, 0.5, 0.5, 0) 3 2 1 0, 0 3,  1 0, 0 Any  -equilibrium ? (0, 0, 0, 1) 4 2,1

10 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 10/39 Highlights from (Finite) Stochastic Games History Shapley, 1953 0-sum, “discounted” Everett, 1957 0-sum, recursive, undiscounted Gillette, 1957 0-sum, big match problem

11 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 11/39 Highlights from (Finite) Stochastic Games History Fink, 1964 & Takahashi, 1964 n-player, discounted Blackwell & Ferguson, 1968 0-sum, big match solution Liggett & Lippmann, 1969 0-sum, perfect inf., undiscounted

12 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 12/39 Highlights from (Finite) Stochastic Games History Kohlberg, 1974 0-sum, absorbing, undiscounted Mertens & Neyman, 1981 0-sum, undiscounted Sorin, 1986 2-player, Paris Match, undiscounted

13 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 13/39 Highlights from (Finite) Stochastic Games History Vrieze & Thuijsman, 1989 2-player, absorbing, undiscounted Thuijsman & Raghavan, 1997 n-player, perfect inf., undiscounted Flesch, Thuijsman, Vrieze, 1997 3-player, absorbing example, undiscounted

14 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 14/39 Highlights from (Finite) Stochastic Games History Solan, 1999 3-player, absorbing, undiscounted Vieille, 2000 2-player, undiscounted Solan & Vieille, 2001 n-player, quitting, undiscounted

15 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 15/39 Additive Transitions p s (a s ) =  n i=1 i s p i s (a i s ) p i s (a i s ) transition probabilities controlled by player i in state s i s transition power of player i in state s 0 ≤ i s ≤ 1 and  i i s = 1 for each s

16 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 16/39 Example for 2-Player Additive Transitions p s (a s ) =  n i=1 i s p i s (a i s ) (1, 0, 0) (0.7, 0, 0.3)(0, 0.7, 0.3) (0.3, 0.7, 0) p 1 1 (2) = (0, 0, 1) p 1 1 (1) = (1, 0, 0) p 2 1 (1)=(1, 0, 0)p 2 1 (2)=(0, 1, 0) 1 1 = 0.3 2 1 = 0.7 1

17 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 17/39 Results 1.0-equilibria for n-player AT games (threats!) 2.0-opt. stationary strat. for 0-sum AT games 3.Stat.  -equilibria for 2-player abs. AT games 4.Result 3 can not be strengthened, neither to 3-player abs. AT games, nor to 2-player non-abs. AT games, nor to give stat. 0-equilibria

18 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 18/39 The Essential Observation Additive Transitions induce a Complete Ordering of the Actions If a i s is “better” than b i s against some strategy, Then a i s is “better” than b i s against any strategy.

19 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 19/39 “Better” Consider strategies a i s and b i s for player i If  for some strategy  a  i s we have  t  S p s (t | a i s, a  i s ) v i t ≥  t  S p s (t | b i s, a  i s ) v i t Then for all strategies b  i s we have  t  S p s (t | a i s, b  i s ) v i t ≥  t  S p s (t | b i s, b  i s ) v i t

20 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 20/39 Because …. If  t  S p s (t | a i s, a  i s )v i t ≥  t  S p s (t | b i s, a  i s )v i t Then i s  t  S p s (t | a i s ) v i t +  j  i j s  t  S p s (t | a  j s ) v i t ≥ i s  t  S p s (t | b i s ) v i t +  j  i j s  t  S p s (t | a  j s ) v i t

21 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 21/39 which implies that …. i s  t  S p s (t | a i s ) v i t +  j  i j s  t  S p s (t | b  j s ) v i t ≥ i s  t  S p s (t | b i s ) v i t +  j  i j s  t  S p s (t | b  j s ) v i t And therefore  t  S p s (t | a i s, b  i s )v i t ≥  t  S p s (t | b i s, b  i s )v i t

22 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 22/39 “Best” The “best” actions for player i in state s are those that maximize the expression  t  S p s (t | a i s ) v i t

23 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 23/39 The Restricted Game Let G be the original AT game and let G* be the restricted AT game, where each player is restricted to his “best” actions.

24 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 24/39 The Restricted Game Now v * i ≥ v i for each player i. In G* :  t  S p s (t | a* s ) v* i t = v* i s  i, s, a* s In G :  t  S p s (t | b i s, a*  i s ) v i t < v i s  i, s,  a*  i s, b i s

25 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 25/39 The Restricted Game If x* i yields at least v* i in G*, then x* i yields at least v* i in G as well.

26 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 26/39 Ex. 1: 2-Player Absorbing AT Game (0, 0, 1)(0, 1, 0) (1, 0, 0)(0.5, 0.5, 0) (0, 0.5, 0.5) (0.5, 0, 0.5) 3 2 1 0, 0  1, 3  3, 1 (1, 0, 0) (0, 1, 0) (0, 0, 1) 0.5

27 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 27/39 (0, 0, 1)(0, 1, 0) (1, 0, 0)(0.5, 0.5, 0) (0, 0.5, 0.5) (0.5, 0, 0.5) 3 2 1 0, 0  1, 3  3, 1 T, L B, L B, RT, R 0, 0 -1, 3 -2, 2-3, 1 T, L 0, 0 NO stationary 0-equilibrium Ex. 1: 2-Player Absorbing AT Game

28 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 28/39 (0, 0, 1)(0, 1, 0) (1, 0, 0)(0.5, 0.5, 0) (0, 0.5, 0.5) (0.5, 0, 0.5) 3 2 1 0, 0 stationary  -equilibrium with  >0 1  1 0 equilibrium rewards ≈ ((  1   1, 3  3, 1 Ex. 1: 2-Player Absorbing AT Game

29 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 29/39 (0, 0, 1)(0, 1, 0) (1, 0, 0)(0.5, 0.5, 0) (0, 0.5, 0.5) (0.5, 0, 0.5) 3 2 1 0, 0  1, 3  3, 1 non-stationary  -equilibrium Player 1: B, T, T, T, …. Player 2: R, R, R, R, …. equilibrium rewards ((  2  2  3, 1  1, 3  Ex. 1: 2-Player Absorbing AT Game

30 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 30/39 Ex. 2: 2-Player Non-Absorbing AT Game (0, 0, 1, 0)(0, 1, 0, 0) (0, 0, 0, 1) (1, 0, 0, 0) (0, 0.5, 0.5, 0) 3 2 1 0, 0 3,  1 0, 0 q > 0 p <  q 1  q 1  p p q = 0 p > 1   q > 0 NO stationary  -equilibrium with  >0 (0, 0, 0, 1) 4 2,1

31 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 31/39 non-stationary  -equilibrium Player 1: T, B, B, B, …. Player 2: R, R, R, R, …. (0, 0, 1, 0)(0, 1, 0, 0) (0, 0, 0, 1) (1, 0, 0, 0) (0, 0.5, 0.5, 0) 3 2 1 0, 0 3,  1 0, 0 (0, 0, 0, 1) 4 2,1 Ex. 2: 2-Player Non-Absorbing AT Game equilibrium rewards ((2, 1), ( 

32 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 32/39 alternative  -equilibrium Player 1: T, T, T, T, …. Player 2: L, R, R, R, …. (0, 0, 1, 0)(0, 1, 0, 0) (0, 0, 0, 1) (1, 0, 0, 0) (0, 0.5, 0.5, 0) 3 2 1 0, 0 3,  1 0, 0 (0, 0, 0, 1) 4 2,1 Ex. 2: 2-Player Non-Absorbing AT Game equilibrium rewards ((2, 1), ( 

33 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 33/39 Ex. 3: 3-Player Absorbing AT Game 0, 0, 0 1, 3, 00, 1, 33, 0, 1 1 2 34 (1, 0, 0, 0) (2/3, 1/3, 0, 0) (2/3, 0, 1/3, 0) (2/3, 0, 0, 1/3) (1/3, 1/3, 0, 1/3) (1/3, 0, 1/3, 1/3) (1/3, 1/3, 1/3, 0) (0, 1/3, 1/3, 1/3) (0, 1, 0, 0) (0, 0, 1, 0)(0, 0, 0, 1) 0, 0, 0 T L F R B N How to share 4 among three people if only few solutions are allowed? 1 2 3

34 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 34/39 0, 0, 0 1, 3, 0 0, 1, 3 3, 0, 1 1, 3, 00, 1, 33, 0, 1 1 2 34 (1, 0, 0, 0) (2/3, 1/3, 0, 0) (2/3, 0, 1/3, 0) (2/3, 0, 0, 1/3) (1/3, 1/3, 0, 1/3) (1/3, 0, 1/3, 1/3) (1/3, 1/3, 1/3, 0) (0, 1/3, 1/3, 1/3) (0, 1, 0, 0) (0, 0, 1, 0)(0, 0, 0, 1) 1/2, 2, 3/2 2, 3/2, 1/2 3/2, 1/2, 2 4/3, 4/3, 4/3 T L F R B N Ex. 3: 3-Player Absorbing AT Game

35 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 35/39 0, 0, 0 1, 3, 0 0, 1, 3 3, 0, 1 1/3 * 2/3 * 1 * 1/2, 2, 3/2 2, 3/2, 1/2 3/2, 1/2, 2 4/3, 4/3, 4/3 T L F R B N NO stationary  -equilibrium Ex. 3: 3-Player Absorbing AT Game 1, 3, 0 0, 1, 3 3, 0, 1

36 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 36/39 0, 0, 0 1, 3, 0 0, 1, 3 3, 0, 1 1/3 * 2/3 * 1 * 1/2, 2, 3/2 2, 3/2, 1/2 3/2, 1/2, 2 4/3, 4/3, 4/3 T L F R B N Player 1 on B: 1, ¾, 0, 0, 0, 0, 1, ¾, 0, 0, 0, 0, …. Player 2 on R: 0, 0, 1, ¾, 0, 0, 0, 0, 1, ¾, 0, 0, …. Player 3 on F: 0, 0, 0, 0, 1, ¾, 0, 0, 0, 0, 1, ¾, …. non-stationary  -equilibrium Ex. 3: 3-Player Absorbing AT Game

37 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 37/39 0, 0, 0 1, 3, 0 0, 1, 3 3, 0, 1 1/3 * 2/3 * 1 * 1/2, 2, 3/2 2, 3/2, 1/2 3/2, 1/2, 2 4/3, 4/3, 4/3 T L F R B N equilibrium rewards (1, 2, 1) Ex. 3: 3-Player Absorbing AT Game Player 1 on B: 1, ¾, 0, 0, 0, 0, 1, ¾, 0, 0, 0, 0, …. Player 2 on R: 0, 0, 1, ¾, 0, 0, 0, 0, 1, ¾, 0, 0, …. Player 3 on F: 0, 0, 0, 0, 1, ¾, 0, 0, 0, 0, 1, ¾, ….

38 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 38/39 Results 1.0-equilibria for n-player AT games (threats!) 2.0-opt. stationary strat. for 0-sum AT games 3.Stat.  -equilibria for 2-player abs. AT games 4.Result 3 can not be strengthened, neither to 3-player abs. AT games, nor to 2-player non-abs. AT games, nor to give stat. 0-equilibria

39 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 39/39 frank@math.unimaas.nl ?

40 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 40/39 GAME VER

41 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 41/39 GAME VER

42 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 42/39 GAME VER

43 frank@math.unimaas.nl 05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 43/39 GAME VER


Download ppt "05-11-2006 Center for the Study of Rationality Hebrew University of Jerusalem 1/39 n-Player Stochastic Games with Additive Transitions."

Similar presentations


Ads by Google