Presentation is loading. Please wait.

Presentation is loading. Please wait.

Stochastic -Regular Games

Similar presentations


Presentation on theme: "Stochastic -Regular Games"— Presentation transcript:

1 Stochastic -Regular Games
Krishnendu Chatterjee UC Berkeley Dissertation Talk 11/6/2018

2 Games on Graphs Games on graphs Model interaction between components.
Control of finite state systems Solving two player games. 11/6/2018

3 Graph Models of Processes
vertices = states edges = transitions paths = behaviors 11/6/2018

4 Game Models of Processes
vertices = states edges = transitions paths = behaviors Different processes or processes in open environment. 11/6/2018

5 Graphs vs. Games a a a b a b 11/6/2018

6 Game Models Enable -synthesis [Church, Rabin, Ramadge/Wonham, Pnueli/Rosner] -receptiveness [Dill, Abadi/Lamport] -semantics of interaction [Abramsky] -reasoning about adversarial behavior -non-emptiness of non-deterministic automata -behavioral type systems and interface automata [deAlfaro/ Henzinger] -modular reasoning [Kupferman/Vardi, Alur/deAlfaro/Henzinger /Mang] -early error detection [deAlfaro/Henzinger/Mang] -model-based testing [Gurevich/Veanes et al.] -etc. 11/6/2018

7 Game Models Always about open systems:
-players = processes / components / agents -demonic or adversarial non-determinism 11/6/2018

8 Stochastic Games 11/6/2018

9 Stochastic Games Where: What for: How:
Arena: Game graphs with probabilistic transitions. What for: Objectives: -regular. How: Strategies. 11/6/2018

10 Game Graphs Two broad class Turn-based games Concurrent games
Players make moves in turns. Example: tic-tac-toe, chess. Concurrent games Players make moves simultaneously and independently. Example: matrix games. 11/6/2018

11 Goals Computational issues. Strategy classification
Characterize simplest class of strategies that achieve the optimal payoff. 11/6/2018

12 Three key components Classes of game graphs. Objectives.
Strategies and notion of values. 11/6/2018

13 Markov Decision Process
A Markov Decision Process (MDP) is defined as G=((V,E),(V1,V0)), where (V,E) is a finite graph. (V1,V0) is a partition of V. V0 are random nodes chooses between successors uniformly at random. 11/6/2018

14 A Markov Decision Process
1 1 1 1 1 11/6/2018

15 Markov Decision Process
Games against nature or randomness. Control of systems in presence of uncertainty. We call them 1 ½ -player games. 11/6/2018

16 Two-player Games A two-player game is defined as
G=((V,E),(V1,V2)), where (V,E) is a finite graph. (V1,V2) is a partition of V. V1 player 1 makes moves. V2 player 2 makes moves. 11/6/2018

17 Two-player Games Games against adversary. Control in open environment
controller synthesis. We call them 2-player games. 11/6/2018

18 Turn-based Probabilistic Games
A turn-based probabilistic game is defined as G=((V,E),(V1,V2,V0)), where (V,E) is a finite graph. (V1,V2,V0) is a partition of V. V1 player 1 makes moves. V2 player 2 makes moves. V0 randomly chooses successors. 11/6/2018

19 Turn-based Probabilistic Games
Games against adversary and nature. Controller synthesis in presence of uncertainty controller synthesis of stochastic reactive systems. We call them 2 ½ -player games. Also known as perfect-information stochastic games. 11/6/2018

20 Game Played Token placed on an initial vertex. If current vertex is
Player 1 vertex then player 1 chooses successor. Player 2 vertex then player 2 chooses successor. Player random vertex proceed to successors uniformly at random. Generates infinite sequence of vertices. 11/6/2018

21 Concurrent Games Players make move simultaneously.
Finite set of vertices V. Finite set of actions A. Action assignments 1,2:V ! 2A n ;. Probabilistic transition function (s, a1, a2)(t) = Pr [ t | s, a1, a2] 11/6/2018

22 A Concurrent Game Actions at s0: a, b for player 1, c, d for player 2.
ad Actions at s0: a, b for player 1, c, d for player 2. s0 ac,bd bc s2 s1 11/6/2018

23 Concurrent Games Games with simultaneous interaction.
Model synchronous interaction. 11/6/2018

24 Stochastic Games 1 ½ pl. 2 pl. 2 ½ pl. Conc. games 11/6/2018

25 Three key components Classes of game graphs. Objectives.
Strategies and notion of values. 11/6/2018

26 Plays Plays: infinite sequence of vertices or infinite trajectories.
V: set of all infinite plays or infinite trajectories. 11/6/2018

27 Objectives Plays: infinite sequence of vertices.
Objectives: subset of plays, 1 µ V. Play is winning for player 1 if it is in 1. Zero-sum game: 2 = V n1. 11/6/2018

28 Reachability and Safety
Let R µ V set of target vertices. Reachability objective requires to visit the set R of vertices. Let F µ V set of safe vertices. Safety objective requires never to visit any vertex outside F. 11/6/2018

29 Reachability and Safety
11/6/2018

30 Buechi and coBuechi Let B µ V a set of Buechi vertices.
Buechi objective requires that the set B is visited infinitely often. Let C µ V a set of coBuechi vertices. coBuechi objective requires that vertices outside C visited finitely often. 11/6/2018

31 Buechi and coBuechi C B coBuchi Buchi 11/6/2018

32 Rabin and Streett Let {(G1,R1), (G2,R2),…, (Gd,Rd)} set of vertex set pairs. Rabin: requires there is a pair (Gj,Rj) such that Gj finitely often and Rj infinitely often. Streett: requires for every pair (Gj,Rj) if Rj infinitely often then Gj infinitely often. Rabin-chain: both a Rabin-Streett, also known as parity objectives; it is a complementation closed subset of Rabin. 11/6/2018

33 Mueller Objective Mueller objective Fµ 2V is specified as a set of subset of vertices and requires that the set of states that are visited infinitely often is in F. Rabin, Streett, Rabin-chain objectives are all Mueller objectives. 11/6/2018

34 Objectives -regular: [,¢ , *,. Safety, Reachability, Liveness, etc.
Rabin and Streett canonical ways to express. Borel -regular 11/6/2018

35 Three key components Classes of game graphs. Objectives.
Strategies and notion of values. 11/6/2018

36 Strategies Given a finite sequence of vertices, (that represents the history of play) a strategy  for player 1 is a probability distribution over the set of successors.  : V* ¢ V1 ! D(V) 11/6/2018

37 Strategies 1/2 1 1 1/2 1 1 1 11/6/2018

38 Strategies 3/4 1 1 1/4 1 1 1 11/6/2018

39 Strategies with Memory
It has memory to remember the history. Let M represent a set called memory. Strategy with memory represented as two functions: m : V £ M ! M s : V1 £ M ! D(V) A finite memory strategy is one with finite memory size M. 11/6/2018

40 Subclass of Strategies
Memoryless (stationary) strategies: Strategy is independent of the history of the play and depends on the current vertex, i.e., |M|=1. : V1 ! D(V) Pure strategies: chooses a successor rather than a probability distribution. Pure-memoryless: both pure and memoryless (simplest class). 11/6/2018

41 Strategies The set of strategies
Set of strategy  for player 1; strategies . Set of strategy  for player 2; strategies . 11/6/2018

42 Values Given objectives 1 and 2 = V n 1 the value for the players are val1(1)(v) = sup 2  inf 2  Prv,(1). val2(2)(v) = sup 2  inf 2  Prv,(2). 11/6/2018

43 Determinacy Determinacy Determinacy means
val1(1)(v) + val2(2)(v) =1. Determinacy means sup inf = inf sup. von Neumann’s minmax theorem in matrix games. 11/6/2018

44 Optimal Strategies A strategy  is optimal for objective 1 if
val1(1)(v) = inf Prv, (1). Analogous definition for player 2. 11/6/2018

45 Computational Issues Algorithms to compute values in games.
Identify the simplest class of strategies that suffices for optimality. 11/6/2018

46 Outline of Results: Turn-based Games
11/6/2018

47 History and Results Two-player games.
Determinacy (sup inf = inf sup) theorem for Borel objectives. [Mar75] Strong determinacy: values only 0 and 1. Finite memory determinacy (i.e., finite memory optimal strategy exists) for -regular objective. [GurHar82] Pure memoryless optimal strategy exists for Rabin objectives. [EmeJut88] NP-complete. coNP-complete for Streett objectives. Mueller objectives Precisely characterize memory requirement by a number mF [DzeJurWal97]. PSPACE-complete [HunDaw05]. 11/6/2018

48 History and Results 2 ½ -player games.
Determinacy (sup inf = inf sup) theorem for Borel objectives. [Mar98] Strong determinacy (values only 0 and 1) does not hold. 11/6/2018

49 History and Results 2 ½ -player games Reachability objectives [Con92]
Pure memoryless optimal strategy exists. Decided in NP Å coNP. Decision problem: given a 2 ½ -player game, an objective 1, a vertex v, and a rational r, whether val1(1)(v) ¸ r. 11/6/2018

50 History and Results 2 ½ -player games
-regular objectives [deAl Maj01] Parity games: 2EXPTIME. Rabin, Streett and Mueller: 3EXPTIME. Algorithms obtained as a special case of an algorithm for the general case of concurrent games. 11/6/2018

51 Overview of Results Borel Mar75 1 ½ pl. GH82 2 pl. -regular EJ88
Mar98, dAM01 Conc. games 11/6/2018

52 Our Results: Turn-based Games
11/6/2018

53 Results on 2 ½ player Games
The complexity of Rabin, Streett and parity objectives [C/deAl Hen 05] Pure memoryless optimal strategies exist for Rabin objectives. NP-complete for Rabin (previously was 3EXPTIME). coNP-complete for Streett (previously was 3EXPTIME). NP Å coNP for parity (previously was 2EXPTIME). 11/6/2018

54 Rabin and Strett Objectives
2 ½ pl. EJ88 :PM NP comp. 2 pl. PM, NP comp. Reach Con 92: PM 11/6/2018

55 Results on 2 ½ player Games
The complexity of Mueller objectives [C 07] Winning strategies with memory mF suffices (the same memory upper bound as 2 player games). There is a matching lower bound also. PSPACE-complete (previously known 3EXPTIME). 11/6/2018

56 Results on 2 ½ player Games
Practical algorithms Strategy improvement algorithms: iterate over local optimizations of strategies, that must converge to global optimal. Converges fast in practice though worst case bound in exponential. Strategy improvement for parity objectives [C/Hen 06a]. Strategy improvement for Rabin and Streett objectives [C/Hen 06b]. 11/6/2018

57 Outline of Results: Concurrent Games
11/6/2018

58 Concurrent Games: Example
ad Actions at s0: a, b for player 1, c, d for player 2. Objective for player 1: to reach s1 Pure strategy not enough: a -> d b -> c s0 ac,bd bc s1 s2 11/6/2018

59 Concurrent Games: Example
ad Fact: For any strategy for player 1 she cannot win with prob. 1. As long it plays a deterministically play d, else play c with positive probability. s0 ac,bd bc s1 s2 11/6/2018

60 Concurrent Games: Example
1- ad Fix strategy for player 1 a !1- b !  c s0 d ac,bd bc 1- s1 s2 For every positive  player 1 can win with probability 1 -. 11/6/2018

61 Concurrent Games: Example
ad Actions at s0: a, b for player 1, c, d for player 2. Objective for player 1: to reach s1 infinitely often. Infinite memory is required. s0 ac,bd bc s2 s1 11/6/2018

62 Concurrent Games Optimal strategies need not exist:
-optimal strategies for all >0. Require randomization: pure strategies not enough. For -regular objectives -optimal strategies require infinite memory. 11/6/2018

63 History and Results Detailed analysis of concurrent games [Fil Vri 97]. Determinacy theorem for all Borel objectives [Mar 98]. Concurrent -regular games: Reachability objectives [deAl Hen Kup98]. Rabin-chain objectives [deAl Hen00]. Rabin-chain objectives [deAl Maj01]. 11/6/2018

64 History and Results Qualitative analysis Quantitative analysis
Reachability objectives [deAl Hen Kup98] Quadratic time. Parity objectives [deAl Hen 00] NP and coNP. Quantitative analysis Reachability objectives [Ete Yan 06] PSPACE. 11/6/2018

65 History and Results Concurrent games with parity objectives
3EXPTIME [deAl Maj 01] Quantitative -calculus formula. Formula in the theory of reals with length exponential in the size of the game and exponentially many quantifier alternations in the number of parities. Rabin, Streett and Mueller: 4EXPTIME. 11/6/2018

66 Overview of Results Borel 1 ½ pl. 2 pl. -regular 2 ½ pl. dAH00,dAM01
Reach dAH00,dAM01 Conc. games EY06 Mar98 11/6/2018

67 Our Results: Concurrent Games
11/6/2018

68 Results on Concurrent Games
The complexity of concurrent parity games [C/deAl Hen06]. PSPACE A polynomial guess followed by a formula in existential theory of reals. Previous formula in the theory of reals required exponential quantifier alternations and was 3EXPTIME. The complexity of concurrent Rabin, Streett and Mueller objectives EXPSPACE – reduction to parity (previous known was 4EXPTIME). 11/6/2018

69 Results on Concurrent Games
Practical algorithms Strategy improvement algorithm for reachability objectives [C/deAl Hen 06b]. Some improved analysis of strategies for concurrent reachability games. A new combinatorial proof of existence of memoryless  optimal strategies, for all >0. 11/6/2018

70 Outline of Results: Non-zero-sum Games
11/6/2018

71 Non-zero-sum Games So far we discussed strictly competitive games where the objectives were complementary. Non-zero-sum games Each player has own objective. Concept of rationality: Nash equilibrium, i.e., a strategy profile of all the players such that no player has incentive to deviate if the other players keep playing the strategies in the profile. -Nash equilibrium: can only gain at most  by deviation. 11/6/2018

72 Closed sets (Safety). [Sec Sud 02]
History and Results Two-player nonzero-sum stochastic games with limit-average payoff [Vie 00a, Vie 00b] Closed sets (Safety). [Sec Sud 02] Discounted/ Halting Games. [Fink 64] 11/6/2018

73 Overview of Results Borel -reg Nash:SecSud02 S n pl. conc. R
n pl. turn-based 2 pl. conc.  Nash:Vie00 Lim. avg 11/6/2018

74 Our Results: Non-zero-sum Games
11/6/2018

75 Our Results: Non-zero-sum Concurrent Games
An n-player non-zero-sum reachability game has an  Nash equilibrium in memoryless strategies for all >0 .[C/ Maj Jur 04] A two-player non-zero-sum parity game has an  Nash equilibrium for all  >0. [C 05] 11/6/2018

76 Our Results: Non-zero-sum Turn Based Games
Results from [C/ Maj Jur 04] n-player turn based probabilistic games with Borel payoffs have -Nash equilibria in deterministic (pure) strategies. n-player turn based deterministic games with Borel payoffs have Nash equilibria in deterministic (pure) strategies. 11/6/2018

77 Can it be strengthened? Can -Nash equilibrium be strengthened to Nash equilibrium? 11/6/2018

78 Can it be strengthened? Can -Nash equilibrium be strengthened to Nash equilibrium? No. Counter-example. 11/6/2018

79 Las Vegas Game Work 1/2 Jackpot Sorry you lose Go to Vegas Play again
11/6/2018

80 For every >0, Las Vegas game has a (1-)-optimal winning strategy
For  < 1/2n, work for n days before heading to Vegas. But no optimal winning strategy. 11/6/2018

81 -Regular? The Las Vegas game is not -regular
Turn based probabilistic non-zero-sum games with -regular objectives have pure Nash equilibria. 11/6/2018

82 Results in Non-zero-sum Games
Borel -reg Nash:SecSud02 S n pl. conc.  Nash Nash R n pl. turn-based  Nash  Nash 2 pl. conc. Lim. avg  Nash:Vil00 11/6/2018

83 Conclusion and Open Problems
11/6/2018

84 Conclusion Stochastic Games
Model for controller synthesis in the presence of uncertainty. Improved strategy complexity, computational complexity, and practical algorithms for various classes of games with different classes of objectives. Existence of Nash and  Nash equilibrium in various classes of games. 11/6/2018

85 Open Problems Complexity of parity games
NP Å coNP for 2-player and 2 ½ -player games. Polynomial time algorithm is a major open question. Complexity of 2 ½ player reachability games NP Å coNP. Polynomial time algorithm is a major open problem. Other open problems Algorithms for concurrent parity games. Better algorithmic analysis of 2-player, 2 ½ -player and concurrent games with sub-classes of parity objectives. 11/6/2018

86 Open Problems Several open problems in non-zero-sum games
Related to existence of equilibria and also the related complexity problems. 11/6/2018

87 Results in Non-zero-sum Games
Borel -reg Nash:SecSud02 S n pl. conc.  Nash Nash R n pl. turn-based  Nash  Nash 2 pl. conc. Lim. avg  Nash:Vil00 11/6/2018

88 Open Problems Borel -reg S n pl. conc. R n pl. turn-based 2 pl. conc.
Lim. avg 11/6/2018

89 Acknowledgments Joint work with Thanks to you all !!! Luca de Alfaro
Thomas A. Henzinger Marcin Jurdzinski Rupak Majumdar Thanks to you all !!! 11/6/2018

90 Questions ??? 11/6/2018


Download ppt "Stochastic -Regular Games"

Similar presentations


Ads by Google