1
Stochastic ω-Regular Games
Krishnendu Chatterjee, UC Berkeley. Dissertation Talk.
2
Games on Graphs
Games on graphs model the interaction between components. Control of finite-state systems amounts to solving two-player games.
3
Graph Models of Processes
vertices = states, edges = transitions, paths = behaviors.
4
Game Models of Processes
vertices = states, edges = transitions, paths = behaviors. Models different processes, or a process in an open environment.
5
Graphs vs. Games (figure: a graph and a game over the same states, with transitions labeled a and b.)
6
Game Models Enable
- synthesis [Church, Rabin, Ramadge/Wonham, Pnueli/Rosner]
- receptiveness [Dill, Abadi/Lamport]
- semantics of interaction [Abramsky]
- reasoning about adversarial behavior
- non-emptiness of non-deterministic automata
- behavioral type systems and interface automata [deAlfaro/Henzinger]
- modular reasoning [Kupferman/Vardi, Alur/deAlfaro/Henzinger/Mang]
- early error detection [deAlfaro/Henzinger/Mang]
- model-based testing [Gurevich/Veanes et al.]
- etc.
7
Game Models Always about open systems:
- players = processes / components / agents
- demonic or adversarial non-determinism
8
Stochastic Games
9
Stochastic Games
Where: an arena, i.e., game graphs with probabilistic transitions.
What for: objectives, which are ω-regular.
How: strategies.
10
Game Graphs
Two broad classes:
Turn-based games: players make moves in turns. Example: tic-tac-toe, chess.
Concurrent games: players make moves simultaneously and independently. Example: matrix games.
11
Goals
Computational issues.
Strategy classification: characterize the simplest class of strategies that achieve the optimal payoff.
12
Three key components
Classes of game graphs. Objectives. Strategies and the notion of values.
13
Markov Decision Process
A Markov Decision Process (MDP) is defined as G = ((V,E),(V1,V0)), where (V,E) is a finite graph and (V1,V0) is a partition of V. V1 is the set of player-1 vertices; V0 is the set of random vertices, which choose among their successors uniformly at random.
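A minimal sketch of this definition in Python (our own illustration; the vertex names and edges are made up):

import random

# MDP G = ((V,E),(V1,V0)): V1 vertices are controlled, V0 vertices are random.
V1 = {"s0", "s1"}                                      # player-1 vertices
V0 = {"r0"}                                            # random vertices
E = {"s0": ["r0"], "s1": ["s1"], "r0": ["s0", "s1"]}   # successor lists

def step(v, choose):
    """One move: player 1 chooses at V1 vertices, nature at V0 vertices."""
    if v in V0:
        return random.choice(E[v])   # uniform over successors, as on the slide
    return choose(v)                 # player 1 picks some successor in E[v]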
14
A Markov Decision Process
(figure: an example MDP; edges are labeled with transition probabilities.)
15
Markov Decision Process
Games against nature or randomness. Control of systems in the presence of uncertainty. We call them 1½-player games.
16
Two-player Games
A two-player game is defined as G = ((V,E),(V1,V2)), where (V,E) is a finite graph and (V1,V2) is a partition of V. At V1 vertices player 1 makes moves; at V2 vertices player 2 makes moves.
17
Two-player Games
Games against an adversary. Control in an open environment: controller synthesis. We call them 2-player games.
18
Turn-based Probabilistic Games
A turn-based probabilistic game is defined as G = ((V,E),(V1,V2,V0)), where (V,E) is a finite graph and (V1,V2,V0) is a partition of V. At V1 vertices player 1 makes moves; at V2 vertices player 2 makes moves; V0 vertices choose successors uniformly at random.
19
Turn-based Probabilistic Games
Games against an adversary and nature. Controller synthesis in the presence of uncertainty, i.e., controller synthesis of stochastic reactive systems. We call them 2½-player games. Also known as perfect-information stochastic games.
20
Game Played
A token is placed on an initial vertex. If the current vertex is a player-1 vertex, player 1 chooses a successor; if it is a player-2 vertex, player 2 chooses a successor; if it is a random vertex, the game proceeds to a successor chosen uniformly at random. This generates an infinite sequence of vertices.
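A minimal sketch (our own illustration, not from the talk) of this token game; strat1 and strat2 are assumed strategy callbacks that return a successor of the current vertex given the history:

import random

def play_prefix(v, E, V1, V2, V0, strat1, strat2, steps=100):
    """Generate a finite prefix of the (in principle infinite) play."""
    history = [v]
    for _ in range(steps):
        if v in V1:
            v = strat1(history)       # player 1 picks a successor of v
        elif v in V2:
            v = strat2(history)       # player 2 picks a successor of v
        else:                         # random vertex in V0
            v = random.choice(E[v])   # uniform over successors
        history.append(v)
    return history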
21
Concurrent Games
Players make moves simultaneously. Finite set of vertices V. Finite set of actions A. Action assignments Γ1, Γ2 : V → 2^A ∖ ∅. Probabilistic transition function δ(s, a1, a2)(t) = Pr[t | s, a1, a2].
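A minimal sketch (our own, with illustrative names) of how the transition function above combines with mixed moves of the two players:

def next_distribution(s, x1, x2, delta):
    """Distribution over successor vertices when player 1 plays the mixed
    action x1 = {a: prob} and player 2 plays x2 = {b: prob} at vertex s;
    delta[(s, a1, a2)] is a dict {t: Pr[t | s, a1, a2]} as on the slide."""
    dist = {}
    for a1, p1 in x1.items():
        for a2, p2 in x2.items():
            for t, p in delta[(s, a1, a2)].items():
                dist[t] = dist.get(t, 0.0) + p1 * p2 * p
    return dist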
22
A Concurrent Game
Actions at s0: a, b for player 1; c, d for player 2. (figure: from s0, the move pair ad stays at s0, ac and bd lead to s1, and bc leads to s2.)
23
Concurrent Games
Games with simultaneous interaction. They model synchronous interaction.
24
Stochastic Games (diagram: the classes of 1½-player, 2-player, 2½-player, and concurrent games.)
25
Three key components
Classes of game graphs. Objectives. Strategies and the notion of values.
26
Plays
Plays: infinite sequences of vertices, i.e., infinite trajectories. V^ω: the set of all infinite plays or infinite trajectories.
27
Objectives
Plays: infinite sequences of vertices. Objectives: subsets of plays, Φ1 ⊆ V^ω. A play is winning for player 1 if it is in Φ1. Zero-sum game: Φ2 = V^ω ∖ Φ1.
28
Reachability and Safety
Let R ⊆ V be a set of target vertices. The reachability objective requires visiting the set R. Let F ⊆ V be a set of safe vertices. The safety objective requires never visiting any vertex outside F.
29
Reachability and Safety
30
Buechi and coBuechi
Let B ⊆ V be a set of Buechi vertices. The Buechi objective requires that the set B is visited infinitely often. Let C ⊆ V be a set of coBuechi vertices. The coBuechi objective requires that vertices outside C are visited only finitely often.
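For intuition only (our own illustration, not material from the talk): on an ultimately periodic "lasso" play prefix · cycle^ω in a finite game, the vertices visited infinitely often are exactly those on the cycle, so these objectives reduce to simple set checks:

def reach(prefix, cycle, R):      # reachability: visit R at least once
    return any(v in R for v in prefix + cycle)

def safety(prefix, cycle, F):     # safety: never leave the safe set F
    return all(v in F for v in prefix + cycle)

def buechi(prefix, cycle, B):     # Buechi: visit B infinitely often
    return any(v in B for v in cycle)

def cobuechi(prefix, cycle, C):   # coBuechi: eventually only vertices of C
    return all(v in C for v in cycle)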
31
Buechi and coBuechi (figure: the coBuechi set C and the Buechi set B.)
32
Rabin and Streett
Let {(G1,R1), (G2,R2), …, (Gd,Rd)} be a set of pairs of vertex sets. Rabin: requires that there is a pair (Gj,Rj) such that Gj is visited only finitely often and Rj infinitely often. Streett: requires that for every pair (Gj,Rj), if Rj is visited infinitely often then Gj is visited infinitely often. Rabin-chain: both a Rabin and a Streett condition, also known as parity objectives; the Rabin-chain objectives are the complementation-closed subset of the Rabin objectives.
33
Mueller Objective
A Mueller objective F ⊆ 2^V is specified as a set of subsets of vertices and requires that the set of vertices visited infinitely often is in F. Rabin, Streett, and Rabin-chain objectives are all Mueller objectives.
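Continuing the same lasso illustration (our own, not from the talk): with Inf = set(cycle), the set of vertices visited infinitely often, the remaining objectives are also simple set checks:

def rabin(inf, pairs):     # some (G, R): G visited finitely often, R infinitely often
    return any(not (inf & G) and (inf & R) for (G, R) in pairs)

def streett(inf, pairs):   # every (G, R): R infinitely often implies G infinitely often
    return all((inf & G) or not (inf & R) for (G, R) in pairs)

def mueller(inf, F):       # the infinity set itself must belong to F
    return inf in F        # F: a collection (e.g. a list) of vertex sets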
34
Objectives
ω-regular: built from union, concatenation, *, and ω operators; includes safety, reachability, liveness, etc. Rabin and Streett conditions are canonical ways to express them. The ω-regular objectives are a subclass of the Borel objectives.
35
Three key components
Classes of game graphs. Objectives. Strategies and the notion of values.
36
Strategies
Given a finite sequence of vertices (the history of the play) ending in a player-1 vertex, a strategy for player 1 gives a probability distribution over the set of successors: σ : V* · V1 → D(V).
37
Strategies (figure: an example strategy that splits probability 1/2, 1/2 between two successors.)
38
Strategies (figure: the same example with probabilities 3/4 and 1/4.)
39
Strategies with Memory
A strategy may use memory to remember the history. Let M be a set called memory. A strategy with memory is represented by two functions: a memory-update function σ_m : V × M → M and a next-move function σ_s : V1 × M → D(V). A finite-memory strategy is one with a finite memory set M.
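Purely as an illustration (the names are ours), the two functions above can be packaged as a small state machine:

class FiniteMemoryStrategy:
    """Illustrative wrapper around the two functions above."""
    def __init__(self, m0, sigma_m, sigma_s):
        self.m = m0                  # current memory state, an element of M
        self.sigma_m = sigma_m       # sigma_m : V x M -> M (memory update)
        self.sigma_s = sigma_s       # sigma_s : V1 x M -> D(V), e.g. {vertex: prob}
    def observe(self, v):            # update memory on every visited vertex
        self.m = self.sigma_m(v, self.m)
    def move(self, v):               # distribution over successors at a V1 vertex
        return self.sigma_s(v, self.m)

# With |M| = 1 this degenerates to a memoryless strategy; if sigma_s always
# puts probability 1 on a single successor, the strategy is pure.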
40
Subclass of Strategies
Memoryless (stationary) strategies: the strategy is independent of the history of the play and depends only on the current vertex, i.e., |M| = 1: σ : V1 → D(V). Pure strategies: choose a single successor rather than a probability distribution. Pure memoryless: both pure and memoryless (the simplest class).
41
Strategies
The set of strategies for player 1 is Σ, with typical strategy σ ∈ Σ. The set of strategies for player 2 is Π, with typical strategy π ∈ Π.
42
Values
Given objectives Φ1 and Φ2 = V^ω ∖ Φ1, the values for the players are
val1(Φ1)(v) = sup_{σ ∈ Σ} inf_{π ∈ Π} Pr_v^{σ,π}(Φ1),
val2(Φ2)(v) = sup_{π ∈ Π} inf_{σ ∈ Σ} Pr_v^{σ,π}(Φ2).
43
Determinacy
Determinacy means val1(Φ1)(v) + val2(Φ2)(v) = 1, i.e., sup inf = inf sup. It is the analogue of von Neumann's minimax theorem for matrix games.
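As a concrete aside (our own illustration, not a slide from the talk): for a finite matrix game, the minimax value and an optimal mixed strategy can be computed with a small linear program, e.g. using scipy:

import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    """Value of the zero-sum matrix game with payoff matrix A (row player
    maximizes): maximize v subject to x^T A >= v componentwise, sum x = 1, x >= 0."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                                  # variables (x_1..x_m, v); minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])     # v - x^T A[:, j] <= 0 for every column j
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)   # sum of x_i equals 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[-1], res.x[:m]                   # game value, optimal mixed strategy

# Example: matching pennies, matrix_game_value([[1, -1], [-1, 1]]),
# has value 0 with optimal strategy (1/2, 1/2).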
44
Optimal Strategies
A strategy σ is optimal for objective Φ1 if val1(Φ1)(v) = inf_{π ∈ Π} Pr_v^{σ,π}(Φ1). The definition for player 2 is analogous.
45
Computational Issues
Algorithms to compute values in games. Identify the simplest class of strategies that suffices for optimality.
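For the simplest class above, 1½-player games (MDPs) with a reachability objective, one standard way to compute the values is value iteration. The sketch below is our own illustration (not an algorithm from the talk), using the uniform-random MDP model from the earlier slides:

def mdp_reach_values(V1, V0, E, R, iters=1000):
    """Approximate val(v) = max Pr[reach R] in an MDP ((V,E),(V1,V0)) with
    uniform-random V0 vertices, by iterating the Bellman operator."""
    val = {v: 1.0 if v in R else 0.0 for v in V1 | V0}
    for _ in range(iters):
        new = {}
        for v in val:
            if v in R:
                new[v] = 1.0                            # target already reached
            elif v in V1:
                new[v] = max(val[w] for w in E[v])      # player 1 maximizes
            else:
                new[v] = sum(val[w] for w in E[v]) / len(E[v])   # uniform random
        val = new
    return val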
46
Outline of Results: Turn-based Games
47
History and Results
Two-player games.
- Determinacy (sup inf = inf sup) theorem for Borel objectives [Mar75]. Strong determinacy: values are only 0 and 1.
- Finite-memory determinacy (i.e., finite-memory optimal strategies exist) for ω-regular objectives [GurHar82].
- Pure memoryless optimal strategies exist for Rabin objectives [EmeJut88]; NP-complete. coNP-complete for Streett objectives.
- Mueller objectives: the memory requirement is precisely characterized by a number m_F [DzeJurWal97]; PSPACE-complete [HunDaw05].
48
History and Results
2½-player games.
- Determinacy (sup inf = inf sup) theorem for Borel objectives [Mar98].
- Strong determinacy (values only 0 and 1) does not hold.
49
History and Results
2½-player games, reachability objectives [Con92]:
- Pure memoryless optimal strategies exist.
- Decidable in NP ∩ coNP.
Decision problem: given a 2½-player game, an objective Φ1, a vertex v, and a rational r, decide whether val1(Φ1)(v) ≥ r.
50
History and Results
2½-player games, ω-regular objectives [deAlMaj01]:
- Parity games: 2EXPTIME.
- Rabin, Streett and Mueller: 3EXPTIME.
The algorithms are obtained as a special case of an algorithm for the general case of concurrent games.
51
Overview of Results
(table: known results by game class (1½-player, 2-player, 2½-player, concurrent) and objective class (ω-regular, Borel), citing [Mar75], [GH82], [EJ88], [Mar98], [dAM01].)
52
Our Results: Turn-based Games
53
Results on 2 ½ player Games
The complexity of Rabin, Streett and parity objectives [C/deAlHen05]:
- Pure memoryless optimal strategies exist for Rabin objectives.
- NP-complete for Rabin (previously 3EXPTIME).
- coNP-complete for Streett (previously 3EXPTIME).
- NP ∩ coNP for parity (previously 2EXPTIME).
54
Rabin and Streett Objectives
(table: pure memoryless (PM) strategies and NP-completeness; 2-player Rabin [EJ88]: PM, NP-complete; 2½-player reachability [Con92]: PM; 2½-player Rabin: PM, NP-complete.)
55
Results on 2 ½ player Games
The complexity of Mueller objectives [C07]:
- Winning strategies with memory m_F suffice (the same memory upper bound as for 2-player games); there is a matching lower bound.
- PSPACE-complete (previously known to be in 3EXPTIME).
56
Results on 2 ½ player Games
Practical algorithms. Strategy improvement algorithms iterate over local optimizations of strategies and must converge to a global optimum; they converge fast in practice, though the worst-case bound is exponential. Strategy improvement for parity objectives [C/Hen06a]. Strategy improvement for Rabin and Streett objectives [C/Hen06b].
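A schematic sketch (ours, not the specific algorithms of [C/Hen06a, 06b]) of the generic strategy-improvement loop described above; evaluate and improve_locally stand for problem-specific subroutines:

def strategy_improvement(initial_strategy, evaluate, improve_locally):
    """Iterate local improvements until no switch increases the value.
    evaluate(sigma) -> value vector of sigma against a best response;
    improve_locally(sigma, values) -> improved strategy, or None if optimal."""
    sigma = initial_strategy
    values = evaluate(sigma)
    while True:
        improved = improve_locally(sigma, values)
        if improved is None:          # no profitable local switch: sigma is optimal
            return sigma, values
        sigma = improved
        values = evaluate(sigma)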
57
Outline of Results: Concurrent Games
58
Concurrent Games: Example
Actions at s0: a, b for player 1; c, d for player 2. Objective for player 1: reach s1. (figure: from s0, the move pair ad stays at s0, ac and bd lead to s1, and bc leads to s2.) A pure strategy is not enough: against a, player 2 plays d; against b, player 2 plays c.
59
Concurrent Games: Example
Fact: no strategy for player 1 wins with probability 1. Player 2's counter-strategy: as long as player 1 plays a deterministically, play d; otherwise, play c with positive probability.
60
Concurrent Games: Example
Fix a strategy for player 1: play a with probability 1 - ε and b with probability ε. For every positive ε, player 1 then wins (reaches s1) with probability at least 1 - ε.
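A small sanity check of this claim (our own calculation, not a slide from the talk), using the transitions read off the example (ad stays at s0, ac and bd reach s1, bc reaches s2) and assuming player 2 plays c with a fixed probability q in each round:

def eventual_win_prob(eps, q):
    """Player 1 plays a w.p. 1-eps and b w.p. eps; player 2 plays c w.p. q.
    Per round: reach s1 on ac or bd, reach s2 on bc, stay at s0 on ad."""
    win = (1 - eps) * q + eps * (1 - q)   # probability the round reaches s1
    lose = eps * q                        # probability the round reaches s2
    if win + lose == 0:                   # eps = 0 and q = 0: stay at s0 forever
        return 0.0
    return win / (win + lose)             # the first decisive round decides the play

# e.g. eventual_win_prob(0.01, 1.0) is about 0.99; the same per-round bound
# gives at least 1 - eps against any behaviour of player 2.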
61
Concurrent Games: Example
Actions at s0: a, b for player 1; c, d for player 2. Objective for player 1: reach s1 infinitely often. Infinite memory is required.
62
Concurrent Games
Optimal strategies need not exist: only ε-optimal strategies, for every ε > 0. Strategies require randomization: pure strategies are not enough. For ω-regular objectives, ε-optimal strategies may require infinite memory.
63
History and Results
- Detailed analysis of concurrent games [FilVri97].
- Determinacy theorem for all Borel objectives [Mar98].
- Concurrent ω-regular games: reachability objectives [deAlHenKup98]; Rabin-chain objectives [deAlHen00]; Rabin-chain objectives [deAlMaj01].
64
History and Results
Qualitative analysis:
- Reachability objectives [deAlHenKup98]: quadratic time.
- Parity objectives [deAlHen00]: NP and coNP.
Quantitative analysis:
- Reachability objectives [EteYan06]: PSPACE.
65
History and Results
Concurrent games with parity objectives: 3EXPTIME [deAlMaj01], via a quantitative μ-calculus formula; the formula in the theory of reals has length exponential in the size of the game and exponentially many quantifier alternations in the number of parities. Rabin, Streett and Mueller: 4EXPTIME.
66
Overview of Results
(table: known results for concurrent games by objective class (reachability, ω-regular, Borel), citing [dAH00], [dAM01], [EY06], [Mar98].)
67
Our Results: Concurrent Games
68
Results on Concurrent Games
The complexity of concurrent parity games [C/deAlHen06]: PSPACE, via a polynomial-size guess followed by a formula in the existential theory of the reals; the previous formula in the theory of reals required exponentially many quantifier alternations and gave 3EXPTIME. The complexity of concurrent Rabin, Streett and Mueller objectives: EXPSPACE, by reduction to parity (previously 4EXPTIME).
69
Results on Concurrent Games
Practical algorithms: a strategy improvement algorithm for reachability objectives [C/deAlHen06b]. Some improved analysis of strategies for concurrent reachability games. A new combinatorial proof of the existence of memoryless ε-optimal strategies, for all ε > 0.
70
Outline of Results: Non-zero-sum Games
71
Non-zero-sum Games
So far we discussed strictly competitive games, where the objectives are complementary. In non-zero-sum games each player has her own objective. Concept of rationality: Nash equilibrium, i.e., a strategy profile of all the players such that no player has an incentive to deviate if the other players keep playing the strategies in the profile. ε-Nash equilibrium: a player can gain at most ε by deviating.
72
History and Results
- Two-player non-zero-sum stochastic games with limit-average payoff [Vie00a, Vie00b].
- Closed sets (safety) [SecSud02].
- Discounted / halting games [Fink64].
73
Overview of Results
(table: known equilibrium results by game class (n-player concurrent, n-player turn-based, 2-player concurrent) and objective (safety, reachability, ω-regular, Borel, limit-average); Nash equilibria known for safety [SecSud02] and for limit-average [Vie00].)
74
Our Results: Non-zero-sum Games
75
Our Results: Non-zero-sum Concurrent Games
An n-player non-zero-sum reachability game has an ε-Nash equilibrium in memoryless strategies, for all ε > 0 [C/MajJur04]. A two-player non-zero-sum parity game has an ε-Nash equilibrium, for all ε > 0 [C05].
76
Our Results: Non-zero-sum Turn Based Games
Results from [C/MajJur04]:
- n-player turn-based probabilistic games with Borel payoffs have ε-Nash equilibria in deterministic (pure) strategies.
- n-player turn-based deterministic games with Borel payoffs have Nash equilibria in deterministic (pure) strategies.
77
Can it be strengthened? Can ε-Nash equilibrium be strengthened to Nash equilibrium?
78
Can it be strengthened? Can ε-Nash equilibrium be strengthened to Nash equilibrium? No; there is a counter-example.
79
Las Vegas Game
(figure: the player may Work and Play again, or Go to Vegas; the figure shows a probability-1/2 branch and the outcomes Jackpot and Sorry, you lose.)
80
For every ε > 0, the Las Vegas game has a strategy that wins with probability at least 1 - ε: given ε, pick n with 1/2^n < ε and work for n days before heading to Vegas. But there is no optimal winning strategy.
81
ω-Regular? The Las Vegas game is not ω-regular.
Turn-based probabilistic non-zero-sum games with ω-regular objectives have pure Nash equilibria.
82
Results in Non-zero-sum Games
(table: equilibrium results by game class (n-player concurrent, n-player turn-based, 2-player concurrent) and objective (safety, reachability, ω-regular, Borel, limit-average); the previously known Nash results [SecSud02], [Vie00] together with the ε-Nash and Nash results of this work.)
83
Conclusion and Open Problems
84
Conclusion Stochastic Games
A model for controller synthesis in the presence of uncertainty. Improved strategy complexity, computational complexity, and practical algorithms for various classes of games with different classes of objectives. Existence of ε-Nash and Nash equilibria in various classes of games.
85
Open Problems
Complexity of parity games: NP ∩ coNP for 2-player and 2½-player games; a polynomial-time algorithm is a major open question.
Complexity of 2½-player reachability games: NP ∩ coNP; a polynomial-time algorithm is a major open problem.
Other open problems: algorithms for concurrent parity games; better algorithmic analysis of 2-player, 2½-player and concurrent games with subclasses of parity objectives.
86
Open Problems
Several open problems remain in non-zero-sum games, related to the existence of equilibria and the associated complexity questions.
87
Results in Non-zero-sum Games
(table: equilibrium results by game class (n-player concurrent, n-player turn-based, 2-player concurrent) and objective (safety, reachability, ω-regular, Borel, limit-average); the previously known Nash results [SecSud02], [Vie00] together with the ε-Nash and Nash results of this work.)
88
Open Problems
(table: the same grid of game classes and objectives, highlighting the combinations for which the existence of equilibria remains open.)
89
Acknowledgments
Joint work with Luca de Alfaro, Thomas A. Henzinger, Marcin Jurdzinski, and Rupak Majumdar. Thanks to you all!
90
Questions?