1
Games, Times, and Probabilities: Value Iteration in Verification and Control. Krishnendu Chatterjee, Tom Henzinger
2
Graph Models of Systems: vertices = states; edges = transitions; paths = behaviors.
3
Extended Graph Models. Base model: graph. CONTROL: game graph. OBJECTIVE: $\omega$-automaton. PROBABILITIES: Markov decision process. CLOCKS: timed automaton. Combined models: stochastic game; regular game; stochastic hybrid system.
4
Graphs vs. Games [figure: two transition diagrams with labels a and b]
5
Games model Open Systems. Two players: environment / controller / input vs. system / plant / output. Multiple players: processes / components / agents. Stochastic players: nature / randomized algorithms.
6
Example.
P1: init x := 0; loop choice | x := x+1 mod 2 | x := 0 end choice end loop. Objective $\varphi_1$: (x = y).
P2: init y := 0; loop choice | y := x | y := x+1 mod 2 end choice end loop. Objective $\varphi_2$: (y = 0).
7
Graph Questions: $\forall\,(x = y)$; $\exists\,(x = y)$. CTL
8
Graph Questions: $\forall\,(x = y)$; $\exists\,(x = y)$. CTL [figure: state graph over the states 00, 10, 11, 01, with an X mark]
9
Zero-Sum Game Questions: $\langle\langle P1\rangle\rangle\,(x = y)$; $\langle\langle P2\rangle\rangle\,(y = 0)$. ATL [Alur/H/Kupferman]
10
Zero-Sum Game Questions: $\langle\langle P1\rangle\rangle\,(x = y)$; $\langle\langle P2\rangle\rangle\,(y = 0)$. ATL [Alur/H/Kupferman] [figure: game graph over the states 00, 10, 01, 11]
11
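Zero-Sum Game Questions: $\langle\langle P1\rangle\rangle\,(x = y)$; $\langle\langle P2\rangle\rangle\,(y = 0)$. ATL [Alur/H/Kupferman] [figure: game graph over the states 00, 10, 01, 11, with an X mark]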
13
Nonzero-Sum Game Questions: $\langle\langle P1\rangle\rangle\,(x = y)$; $\langle\langle P2\rangle\rangle\,(y = 0)$. Secure equilibria [Chatterjee/H/Jurdzinski] [figure: game graph over the states 00, 10, 01, 11]
15
Strategies. Strategies $x, y\colon Q^* \to Q$. From a state $q$, a pair $(x,y)$ of a player-1 strategy $x \in \Sigma_1$ and a player-2 strategy $y \in \Sigma_2$ gives a unique infinite path $\mathrm{Outcome}_{x,y}(q) \in Q^\omega$.
16
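Strategies. $\langle\langle P1\rangle\rangle\varphi_1 = (\exists x \in \Sigma_1)(\forall y \in \Sigma_2)\ \varphi_1(x,y)$. Short for: $q \models \langle\langle P1\rangle\rangle\varphi_1$ iff $(\exists x \in \Sigma_1)(\forall y \in \Sigma_2)\ (\mathrm{Outcome}_{x,y}(q) \models \varphi_1)$. Strategies $x, y\colon Q^* \to Q$. From a state $q$, a pair $(x,y)$ of a player-1 strategy $x \in \Sigma_1$ and a player-2 strategy $y \in \Sigma_2$ gives a unique infinite path $\mathrm{Outcome}_{x,y}(q) \in Q^\omega$.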
17
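Strategies. $\langle\langle P1\rangle\rangle\varphi_1 = (\exists x \in \Sigma_1)(\forall y \in \Sigma_2)\ \varphi_1(x,y)$. Secure equilibrium: $\langle\langle P1\rangle\rangle\varphi_1\ \langle\langle P2\rangle\rangle\varphi_2 = (\exists x \in \Sigma_1)(\exists y \in \Sigma_2)\ [\ (\varphi_1 \land \varphi_2)(x,y) \land (\forall y' \in \Sigma_2)\ (\varphi_2 \to \varphi_1)(x,y') \land (\forall x' \in \Sigma_1)\ (\varphi_1 \to \varphi_2)(x',y)\ ]$. Strategies $x, y\colon Q^* \to Q$. From a state $q$, a pair $(x,y)$ of a player-1 strategy $x \in \Sigma_1$ and a player-2 strategy $y \in \Sigma_2$ gives a unique infinite path $\mathrm{Outcome}_{x,y}(q) \in Q^\omega$.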
18
Objectives $\varphi_1$ and $\varphi_2$. Qualitative: reachability; Büchi; parity ($\omega$-regular). Quantitative: max; lim sup; lim avg.
19
Normal Forms of $\omega$-Regular Sets. Borel-1: Reachability $\Diamond a$; Safety $\Box a = \neg\Diamond\neg a$.
20
Normal Forms of $\omega$-Regular Sets. Borel-1: Reachability $\Diamond a$; Safety $\Box a = \neg\Diamond\neg a$. Borel-2: Büchi $\Box\Diamond a$; coBüchi $\Diamond\Box a = \neg\Box\Diamond\neg a$.
21
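Normal Forms of $\omega$-Regular Sets. Borel-1: Reachability $\Diamond a$; Safety $\Box a = \neg\Diamond\neg a$. Borel-2: Büchi $\Box\Diamond a$; coBüchi $\Diamond\Box a = \neg\Box\Diamond\neg a$. Borel-2.5: Streett $\bigwedge (\Box\Diamond a \to \Box\Diamond b) = \bigwedge (\Diamond\Box\neg a \lor \Box\Diamond b)$; Rabin $\bigvee (\Diamond\Box a \land \Box\Diamond b)$; Parity: complement-closed subset of Streett/Rabin.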
22
Büchi Game [figure: game graph with states $q_0$, $q_1$, $q_2$, $q_3$, $q_4$ and the marked sets G and B]
23
[figure: game graph with states $q_0$, $q_1$, $q_2$, $q_3$, $q_4$ and the marked sets G and B] Secure equilibrium $(x,y)$ at $q_0$. $x$: if $q_1 \to q_0$, then $q_2$ else $q_4$. $y$: if $q_3 \to q_1$, then $q_0$ else $q_4$. Strategies require memory.
24
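Zero-Sum Games: Determinacy. $\varphi_1 = \neg\varphi_2$. [figure: regions $W_1$, labeled $\langle\langle P1\rangle\rangle\varphi_1$, and $W_2$, labeled $\langle\langle P2\rangle\rangle\varphi_2$]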
25
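Nonzero-sum Games. [figure: regions $W_{10}$, labeled $\langle\langle P1\rangle\rangle(\varphi_1 \land \neg\varphi_2)$; $W_{01}$, labeled $\langle\langle P2\rangle\rangle(\varphi_2 \land \neg\varphi_1)$; $W_{11}$; $W_{00}$; together with $\langle\langle P1\rangle\rangle\varphi_1$ and $\langle\langle P2\rangle\rangle\varphi_2$]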
26
Objectives. Qualitative: reachability; Büchi; parity ($\omega$-regular). Quantitative: max; lim sup; lim avg.
27
Objectives. Qualitative: reachability; Büchi; parity ($\omega$-regular). Quantitative: max; lim sup; lim avg. Borel levels, in the same order: Borel-1; Borel-2; Borel-3.
28
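Quantitative Games: $\langle\langle P1\rangle\rangle \limsup$; $\langle\langle P1\rangle\rangle \lim\mathrm{avg}$. [figure: game graph with weights 4, 2, 2, 0, 2, 0, 0, 4, 3]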
29
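Quantitative Games: $\langle\langle P1\rangle\rangle \limsup = 3$; $\langle\langle P1\rangle\rangle \lim\mathrm{avg}$. [figure: game graph with weights 4, 2, 2, 0, 2, 0, 0, 4, 3]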
30
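Quantitative Games: $\langle\langle P1\rangle\rangle \limsup = 3$; $\langle\langle P1\rangle\rangle \lim\mathrm{avg} = 1$. [figure: game graph with weights 4, 2, 2, 0, 2, 0, 0, 4, 3]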
31
Solving Games by Value Iteration. Generalization of the $\mu$-calculus: computing fixpoints of transfer functions (pre; post). Generalization of dynamic programming: iterative optimization. Region $R\colon Q \to V$. [figure: state $q$ with a successor $q'$ carrying the value $R(q')$]
32
Solving Games by Value Iteration. Generalization of the $\mu$-calculus: computing fixpoints of transfer functions (pre; post). Generalization of dynamic programming: iterative optimization. Region $R\colon Q \to V$. Update: $R(q) := \mathrm{pre}(R(q'))$. [figure: state $q$ with a successor $q'$ carrying the value $R(q')$]
33
Graph. $Q$: states; $\Lambda$: transition labels; $\delta\colon Q \times \Lambda \to Q$: transition function.
34
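Graph. $Q$: states; $\Lambda$: transition labels; $\delta\colon Q \times \Lambda \to Q$: transition function; $\mathcal{R} = [Q \to \{0,1\}]$: regions with $V = \mathbb{B}$. $\exists\mathrm{pre}$: $q \in \exists\mathrm{pre}(R)$ iff $(\exists\lambda \in \Lambda)\ \delta(q,\lambda) \in R$. $\forall\mathrm{pre}$: $q \in \forall\mathrm{pre}(R)$ iff $(\forall\lambda \in \Lambda)\ \delta(q,\lambda) \in R$.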
35
[figure: states a, b, c] $\exists\Diamond c = (\mu X)(c \lor \exists\mathrm{pre}(X))$
36
Graph [figure: states a, b, c] $\exists\Diamond c = (\mu X)(c \lor \exists\mathrm{pre}(X))$
38
Graph [figure: states a, b, c] $\exists\Diamond c = (\mu X)(c \lor \exists\mathrm{pre}(X))$; $\forall\Diamond c = (\mu X)(c \lor \forall\mathrm{pre}(X))$
39
Graph Reachability. Given $R \subseteq Q$, find the states from which some path leads to $R$: $\exists\Diamond R$.
40
Graph Reachability. Given $R \subseteq Q$, find the states from which some path leads to $R$. Iterates: $R$; $R \cup \mathrm{pre}(R)$. $\exists\Diamond R = (\mu X)(R \lor \exists\mathrm{pre}(X))$.
41
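Graph Reachability. Given $R \subseteq Q$, find the states from which some path leads to $R$. Iterates: $R$; $R \cup \mathrm{pre}(R)$; $R \cup \mathrm{pre}(R) \cup \mathrm{pre}^2(R)$. $\exists\Diamond R = (\mu X)(R \lor \exists\mathrm{pre}(X))$.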
42
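Graph Reachability. Given $R \subseteq Q$, find the states from which some path leads to $R$. Iterates: $R$; $R \cup \mathrm{pre}(R)$; $R \cup \mathrm{pre}(R) \cup \mathrm{pre}^2(R)$; ... Limit: $\exists\Diamond R = (\mu X)(R \lor \exists\mathrm{pre}(X))$.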
43
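Graph Reachability. Given $R \subseteq Q$, find the states from which all paths lead to $R$. Iterates: $R$; $R \cup \mathrm{pre}(R)$; $R \cup \mathrm{pre}(R) \cup \mathrm{pre}^2(R)$; ... Limit: $\forall\Diamond R = (\mu X)(R \lor \forall\mathrm{pre}(X))$.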
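The two reachability fixpoints can be tried out on a small graph. The following is a minimal sketch, assuming a hypothetical three-state graph; the names `DELTA`, `exists_pre`, `forall_pre`, and `reach` are illustrative and do not come from the slides.

```python
# Hypothetical example graph: DELTA[q][label] is the successor of q under label.
DELTA = {
    "a": {"l": "a", "r": "b"},
    "b": {"l": "c"},
    "c": {"l": "c"},
}

def exists_pre(delta, region):
    """States with at least one successor inside region."""
    return {q for q, succ in delta.items()
            if any(s in region for s in succ.values())}

def forall_pre(delta, region):
    """States whose successors all lie inside region.
    Assumes every state has at least one successor."""
    return {q for q, succ in delta.items()
            if all(s in region for s in succ.values())}

def reach(delta, target, pre):
    """Least fixpoint (mu X)(target or pre(X)), iterated from the empty set."""
    x = set()
    while True:
        nxt = set(target) | pre(delta, x)
        if nxt == x:
            return x
        x = nxt

some_path = reach(DELTA, {"c"}, exists_pre)   # states with some path to c
all_paths = reach(DELTA, {"c"}, forall_pre)   # states where every path reaches c
```

In this example state a can loop forever, so it belongs to the existential result but not to the universal one.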
44
Value Iteration Algorithms consist of
A. LOCAL PART: $\exists\mathrm{pre}$ and $\forall\mathrm{pre}$ computation
B. GLOBAL PART: evaluation of a fixpoint expression
We need to generalize both parts to solve games.
45
Turn-based Game. $Q_1, Q_2$: states ($Q = Q_1 \cup Q_2$); $\Lambda$: transition labels; $\delta\colon Q \times \Lambda \to Q$: transition function.
46
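Turn-based Game. $Q_1, Q_2$: states ($Q = Q_1 \cup Q_2$); $\Lambda$: transition labels; $\delta\colon Q \times \Lambda \to Q$: transition function; $\mathcal{R} = [Q \to \{0,1\}]$: regions with $V = \mathbb{B}$. 1pre: $q \in \mathrm{1pre}(R)$ iff $q \in Q_1 \land (\exists\lambda \in \Lambda)\ \delta(q,\lambda) \in R$, or $q \in Q_2 \land (\forall\lambda \in \Lambda)\ \delta(q,\lambda) \in R$.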
47
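Turn-based Game. $Q_1, Q_2$: states ($Q = Q_1 \cup Q_2$); $\Lambda$: transition labels; $\delta\colon Q \times \Lambda \to Q$: transition function; $\mathcal{R} = [Q \to \{0,1\}]$: regions with $V = \mathbb{B}$. 1pre: $q \in \mathrm{1pre}(R)$ iff $q \in Q_1 \land (\exists\lambda \in \Lambda)\ \delta(q,\lambda) \in R$, or $q \in Q_2 \land (\forall\lambda \in \Lambda)\ \delta(q,\lambda) \in R$. 2pre: $q \in \mathrm{2pre}(R)$ iff $q \in Q_1 \land (\forall\lambda \in \Lambda)\ \delta(q,\lambda) \in R$, or $q \in Q_2 \land (\exists\lambda \in \Lambda)\ \delta(q,\lambda) \in R$.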
48
[figure: states a, b, c]
49
[figure: states a, b, c] $\langle\langle P1\rangle\rangle\Diamond c = (\mu X)(c \lor \mathrm{1pre}(X))$
50
Turn-based Game [figure: states a, b, c] $\langle\langle P1\rangle\rangle\Diamond c = (\mu X)(c \lor \mathrm{1pre}(X))$
51
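Turn-based Game [figure: states a, b, c] $\langle\langle P1\rangle\rangle\Diamond c = (\mu X)(c \lor \mathrm{1pre}(X))$; $\langle\langle P2\rangle\rangle\Diamond c = (\mu X)(c \lor \mathrm{2pre}(X))$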
54
Reachability Game. Given $R \subseteq Q$, find the states from which player 1 has a strategy to force the game to $R$: $\langle\langle P1\rangle\rangle\Diamond R$.
55
Reachability Game. Given $R \subseteq Q$, find the states from which player 1 has a strategy to force the game to $R$: $\langle\langle P1\rangle\rangle\Diamond R$. Iterates: $R$; $R \cup \mathrm{1pre}(R)$.
56
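Reachability Game. Given $R \subseteq Q$, find the states from which player 1 has a strategy to force the game to $R$: $\langle\langle P1\rangle\rangle\Diamond R$. Iterates: $R$; $R \cup \mathrm{1pre}(R)$; $R \cup \mathrm{1pre}(R) \cup \mathrm{1pre}^2(R)$.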
57
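Reachability Game. Given $R \subseteq Q$, find the states from which player 1 has a strategy to force the game to $R$. Iterates: $R$; $R \cup \mathrm{1pre}(R)$; $R \cup \mathrm{1pre}(R) \cup \mathrm{1pre}^2(R)$; ... Limit: $\langle\langle P1\rangle\rangle\Diamond R = (\mu X)(R \lor \mathrm{1pre}(X))$.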
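The only change from the graph case is the local step: 1pre asks for some successor at player-1 states and for all successors at player-2 states. A minimal sketch on a hypothetical five-state game follows; `OWNER`, `DELTA`, `pre1`, and `reach1` are illustrative names, not taken from the slides.

```python
# Hypothetical turn-based game: OWNER[q] says which player moves at q.
OWNER = {"a": 1, "b": 2, "c": 1, "d": 1, "e": 2}
DELTA = {
    "a": {"l": "a", "r": "b"},
    "b": {"l": "c", "r": "d"},
    "c": {"l": "c"},
    "d": {"l": "c"},
    "e": {"l": "e", "r": "c"},
}

def pre1(owner, delta, region):
    """1pre: player 1 needs some successor in region at its own states,
    and all successors in region at player-2 states."""
    out = set()
    for q, succ in delta.items():
        inside = [s in region for s in succ.values()]
        if (owner[q] == 1 and any(inside)) or (owner[q] == 2 and all(inside)):
            out.add(q)
    return out

def reach1(owner, delta, target):
    """Least fixpoint (mu X)(target or 1pre(X)), iterated from the empty set."""
    x = set()
    while True:
        nxt = set(target) | pre1(owner, delta, x)
        if nxt == x:
            return x
        x = nxt

win1 = reach1(OWNER, DELTA, {"c"})
```

State e is excluded because player 2 can stay in e forever, even though a path from e to c exists.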
58
Safety Game. Given $R \subseteq Q$, find the states from which player 1 has a strategy to keep the game in $R$: $\langle\langle P1\rangle\rangle\Box R$.
59
Safety Game. Given $R \subseteq Q$, find the states from which player 1 has a strategy to keep the game in $R$: $\langle\langle P1\rangle\rangle\Box R$. Iterates: $R$; $R \cap \mathrm{1pre}(R)$.
60
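Safety Game. Given $R \subseteq Q$, find the states from which player 1 has a strategy to keep the game in $R$: $\langle\langle P1\rangle\rangle\Box R$. Iterates: $R$; $R \cap \mathrm{1pre}(R)$; $R \cap \mathrm{1pre}(R) \cap \mathrm{1pre}^2(R)$.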
61
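Safety Game. Given $R \subseteq Q$, find the states from which player 1 has a strategy to keep the game in $R$. Iterates: $R$; $R \cap \mathrm{1pre}(R)$; $R \cap \mathrm{1pre}(R) \cap \mathrm{1pre}^2(R)$; ... Limit: $\langle\langle P1\rangle\rangle\Box R = (\nu X)(R \land \mathrm{1pre}(X))$.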
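Safety is the dual computation: a greatest fixpoint that starts from all states and shrinks. A minimal sketch, assuming a hypothetical four-state game; `OWNER`, `DELTA`, `pre1`, and `safe1` are illustrative names.

```python
# Hypothetical turn-based game: OWNER[q] says which player moves at q.
OWNER = {"a": 1, "b": 2, "c": 1, "d": 1}
DELTA = {
    "a": {"l": "a", "r": "b"},
    "b": {"l": "c", "r": "d"},
    "c": {"l": "c"},
    "d": {"l": "c"},
}

def pre1(owner, delta, region):
    """1pre: some successor in region at player-1 states,
    all successors in region at player-2 states."""
    out = set()
    for q, succ in delta.items():
        inside = [s in region for s in succ.values()]
        if (owner[q] == 1 and any(inside)) or (owner[q] == 2 and all(inside)):
            out.add(q)
    return out

def safe1(owner, delta, region):
    """Greatest fixpoint (nu X)(region and 1pre(X)), iterated from all states."""
    x = set(delta)
    while True:
        nxt = set(region) & pre1(owner, delta, x)
        if nxt == x:
            return x
        x = nxt

stay = safe1(OWNER, DELTA, {"a", "b"})
```

Here b drops out in the second iteration because player 2 must leave the safe region from b, while player 1 can keep looping at a.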
62
Quantitative Game. $Q_1, Q_2$: states ($Q = Q_1 \cup Q_2$); $\Lambda$: transition labels; $\delta\colon Q \times \Lambda \to \mathbb{N} \times Q$: transition function.
63
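Quantitative Game. $Q_1, Q_2$: states ($Q = Q_1 \cup Q_2$); $\Lambda$: transition labels; $\delta\colon Q \times \Lambda \to \mathbb{N} \times Q$: transition function, with components $\delta_1$ (weight) and $\delta_2$ (successor); $\mathcal{R} = [Q \to \mathbb{N}]$: regions with $V = \mathbb{N}$. 1pre: $\mathrm{1pre}(R)(q) = (\max_{\lambda \in \Lambda})\ \max(\delta_1(q,\lambda), R(\delta_2(q,\lambda)))$ if $q \in Q_1$, and $(\min_{\lambda \in \Lambda})\ \max(\delta_1(q,\lambda), R(\delta_2(q,\lambda)))$ if $q \in Q_2$.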
64
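Quantitative Game. $Q_1, Q_2$: states ($Q = Q_1 \cup Q_2$); $\Lambda$: transition labels; $\delta\colon Q \times \Lambda \to \mathbb{N} \times Q$: transition function, with components $\delta_1$ (weight) and $\delta_2$ (successor); $\mathcal{R} = [Q \to \mathbb{N}]$: regions with $V = \mathbb{N}$. 1pre: $\mathrm{1pre}(R)(q) = (\max_{\lambda \in \Lambda})\ \max(\delta_1(q,\lambda), R(\delta_2(q,\lambda)))$ if $q \in Q_1$, and $(\min_{\lambda \in \Lambda})\ \max(\delta_1(q,\lambda), R(\delta_2(q,\lambda)))$ if $q \in Q_2$. 2pre: $\mathrm{2pre}(R)(q) = (\min_{\lambda \in \Lambda})\ \max(\delta_1(q,\lambda), R(\delta_2(q,\lambda)))$ if $q \in Q_1$, and $(\max_{\lambda \in \Lambda})\ \max(\delta_1(q,\lambda), R(\delta_2(q,\lambda)))$ if $q \in Q_2$.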
65
Maximizing Game [figure: states a, b, c; weights 0, 1, 2, 5, 3]
66
[figure: states a, b, c; weights 0, 1, 2, 5, 3] $\langle\langle P1\rangle\rangle\,0 = (\mu X) \max(0, \mathrm{1pre}(X))$. Values shown: 0, 0, 0.
67
Maximizing Game [figure: states a, b, c; weights 0, 1, 2, 5, 3] $\langle\langle P1\rangle\rangle\,0 = (\mu X) \max(0, \mathrm{1pre}(X))$. Values shown: 1, 0, 0.
68
Maximizing Game [figure: states a, b, c; weights 0, 1, 2, 5, 3] $\langle\langle P1\rangle\rangle\,0 = (\mu X) \max(0, \mathrm{1pre}(X))$. Values shown: 1, 2, 0.
69
Maximizing Game [figure: states a, b, c; weights 0, 1, 2, 5, 3] $\langle\langle P1\rangle\rangle\,0 = (\mu X) \max(0, \mathrm{1pre}(X))$. Values shown: 2, 2, 0.
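The same iteration works when regions map states to natural numbers instead of booleans. The sketch below assumes a hypothetical weighted game, since the exact edge layout of the slide's figure is not recoverable from the transcript; `OWNER`, `DELTA`, `pre1`, and `max_value` are illustrative names.

```python
# Hypothetical weighted game: DELTA[q][label] = (weight, successor).
OWNER = {"a": 1, "b": 2, "c": 1}
DELTA = {
    "a": {"l": (1, "a"), "r": (2, "b")},
    "b": {"l": (5, "c"), "r": (3, "a")},
    "c": {"l": (0, "c")},
}

def pre1(owner, delta, r):
    """Quantitative 1pre: player 1 maximizes and player 2 minimizes
    max(weight of the move, current value of the successor)."""
    out = {}
    for q, moves in delta.items():
        options = [max(w, r[s]) for w, s in moves.values()]
        out[q] = max(options) if owner[q] == 1 else min(options)
    return out

def max_value(owner, delta):
    """Least fixpoint (mu X) max(0, 1pre(X)), iterated from the all-zero region."""
    x = {q: 0 for q in delta}
    while True:
        p = pre1(owner, delta, x)
        nxt = {q: max(0, p[q]) for q in delta}
        if nxt == x:
            return x
        x = nxt

values = max_value(OWNER, DELTA)
```

In this example the iterates for (a, b, c) are (0, 0, 0), (2, 3, 0), and (3, 3, 0): player 2 at b prefers the weight-3 move back to a over the weight-5 move to c.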
70
Büchi Graph. Given $B \subseteq Q$, find the states from which some path visits $B$ infinitely often: $\exists\Box\Diamond B$.
71
Büchi Graph. Given $B \subseteq Q$, find the states from which some path visits $B$ infinitely often. $R_1 = \exists\Diamond\,\mathrm{pre}(B)$, with iterates $\mathrm{pre}(B)$; $\mathrm{pre}(B) \cup \mathrm{pre}^2(B)$; ...
72
Büchi Graph. Given $B \subseteq Q$, find the states from which some path visits $B$ infinitely often. $R_1 = \exists\Diamond\,\mathrm{pre}(B)$.
73
Büchi Graph. Given $B \subseteq Q$, find the states from which some path visits $B$ infinitely often. $R_1 = \exists\Diamond\,\mathrm{pre}(B)$; $R_2 = \exists\Diamond\,\mathrm{pre}(B \cap R_1)$.
74
Büchi Graph. Given $B \subseteq Q$, find the states from which some path visits $B$ infinitely often. $\exists\Box\Diamond B = (\nu Y)\ \exists\Diamond(B \land \exists\mathrm{pre}(Y))$.
75
Büchi Graph. Given $B \subseteq Q$, find the states from which some path visits $B$ infinitely often. $\exists\Box\Diamond B = (\nu Y)(\mu X)((B \land \exists\mathrm{pre}(Y)) \lor \exists\mathrm{pre}(X))$.
76
Büchi Game. Given $B \subseteq Q$, find the states from which player 1 has a strategy to force the game to $B$ infinitely often: $\langle\langle P1\rangle\rangle\Box\Diamond B$.
77
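Büchi Game. Given $B \subseteq Q$, find the states from which player 1 has a strategy to force the game to $B$ infinitely often. $R_1 = \langle\langle P1\rangle\rangle\Diamond\,\mathrm{1pre}(B)$; $R_2 = \langle\langle P1\rangle\rangle\Diamond\,\mathrm{1pre}(B \cap R_1)$; ... Limit: $\langle\langle P1\rangle\rangle\Box\Diamond B = (\nu Y)(\mu X)((B \land \mathrm{1pre}(Y)) \lor \mathrm{1pre}(X))$.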
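The nested fixpoint can be evaluated with two loops: an outer greatest fixpoint over Y and, for each Y, an inner least fixpoint over X. This is a minimal sketch with the straightforward nested evaluation, assuming a hypothetical five-state game; `OWNER`, `DELTA`, `pre1`, and `buchi1` are illustrative names.

```python
# Hypothetical turn-based game: OWNER[q] says which player moves at q.
OWNER = {"a": 1, "b": 2, "c": 1, "d": 1, "e": 2}
DELTA = {
    "a": {"l": "a", "r": "b"},
    "b": {"l": "c", "r": "d"},
    "c": {"l": "c"},
    "d": {"l": "c"},
    "e": {"l": "e", "r": "c"},
}

def pre1(region):
    """1pre on the game above: some successor in region at player-1 states,
    all successors in region at player-2 states."""
    out = set()
    for q, succ in DELTA.items():
        inside = [s in region for s in succ.values()]
        if (OWNER[q] == 1 and any(inside)) or (OWNER[q] == 2 and all(inside)):
            out.add(q)
    return out

def buchi1(b):
    """(nu Y)(mu X)((B and 1pre(Y)) or 1pre(X)) by nested iteration."""
    y = set(DELTA)                      # outer greatest fixpoint starts at Q
    while True:
        target = set(b) & pre1(y)       # B-states from which play can stay in Y
        x = set()                       # inner least fixpoint starts empty
        while True:
            nxt = target | pre1(x)
            if nxt == x:
                break
            x = nxt
        if x == y:
            return y
        y = x

win_c = buchi1({"c"})
win_b = buchi1({"b"})
```

With B = {c}, player 1 wins from every state except e, where player 2 can loop forever. With B = {b}, player 1 wins nowhere, because b can be visited at most once before player 2 moves the play into the sink c.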
78
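From Graphs to Games. Can we use the same value iteration scheme? Yes, iff the fixpoint expression computes correctly on all single-player (player 1 and player 2) structures. Reachability: $\exists\Diamond p = (\mu X)(p \lor \exists\mathrm{pre}(X))$; $\forall\Diamond p = (\mu X)(p \lor \forall\mathrm{pre}(X))$. Hence: $\langle\langle P1\rangle\rangle\Diamond p = (\mu X)(p \lor \mathrm{1pre}(X))$; $\langle\langle P2\rangle\rangle\Diamond p = (\mu X)(p \lor \mathrm{2pre}(X))$.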
79
Complexity of Turn-based Games
1. Reachability, safety: linear time (P-complete)
2. Büchi: quadratic time (optimal ???)
3. Parity: NP $\cap$ coNP (in P ???)
80
Complexity of Turn-based Games
1. Reachability, safety: linear time (P-complete)
2. Büchi: quadratic time (optimal ???); on graphs linear
3. Parity: NP $\cap$ coNP (in P ???); on graphs polynomial
81
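Beyond Graphs as Finite Carrier Sets. Graph-based (finite-carrier) systems: $Q = \mathbb{B}^m$; $\mathcal{R}$ = boolean formulas [e.g. BDDs]; $\exists\mathrm{pre} = (\exists x \in \mathbb{B})$. Timed and hybrid systems: $Q = \mathbb{B}^m \times \mathbb{R}^n$; $\mathcal{R}$ = formulas of $(\mathbb{Q}, \le, +)$ [e.g. polyhedral sets]; $\exists\mathrm{pre} = (\exists x \in \mathbb{Q})$.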
82
Concurrent Game. $Q$: states; $\Gamma_1, \Gamma_2$: moves of both players; $\delta\colon Q \times \Gamma_1 \times \Gamma_2 \to Q$: transition function.
83
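Concurrent Game. $Q$: states; $\Gamma_1, \Gamma_2$: moves of both players; $\delta\colon Q \times \Gamma_1 \times \Gamma_2 \to Q$: transition function; $\mathcal{R} = [Q \to \{0,1\}]$: regions with $V = \mathbb{B}$. 1pre: $q \in \mathrm{1pre}(R)$ iff $(\exists\gamma_1 \in \Gamma_1)(\forall\gamma_2 \in \Gamma_2)\ \delta(q,\gamma_1,\gamma_2) \in R$.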
84
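Concurrent Game. $Q$: states; $\Gamma_1, \Gamma_2$: moves of both players; $\delta\colon Q \times \Gamma_1 \times \Gamma_2 \to Q$: transition function; $\mathcal{R} = [Q \to \{0,1\}]$: regions with $V = \mathbb{B}$. 1pre: $q \in \mathrm{1pre}(R)$ iff $(\exists\gamma_1 \in \Gamma_1)(\forall\gamma_2 \in \Gamma_2)\ \delta(q,\gamma_1,\gamma_2) \in R$. 2pre: $q \in \mathrm{2pre}(R)$ iff $(\exists\gamma_2 \in \Gamma_2)(\forall\gamma_1 \in \Gamma_1)\ \delta(q,\gamma_1,\gamma_2) \in R$.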
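A small example shows why the quantifier order matters once moves are chosen simultaneously. The sketch below assumes a hypothetical matching-pennies-style game, not the one drawn on the slides; `MOVES1`, `MOVES2`, `DELTA`, `pre1`, and `pre2` are illustrative names.

```python
# Hypothetical concurrent game: both players pick a move in {1, 2} at once.
# From s, equal moves lead to w and different moves lead to l; w and l are sinks.
MOVES1 = (1, 2)
MOVES2 = (1, 2)
DELTA = {
    "s": {(1, 1): "w", (1, 2): "l", (2, 1): "l", (2, 2): "w"},
    "w": {(1, 1): "w", (1, 2): "w", (2, 1): "w", (2, 2): "w"},
    "l": {(1, 1): "l", (1, 2): "l", (2, 1): "l", (2, 2): "l"},
}

def pre1(region):
    """1pre: player 1 has a move that lands in region against every player-2 move."""
    return {q for q, t in DELTA.items()
            if any(all(t[(m1, m2)] in region for m2 in MOVES2) for m1 in MOVES1)}

def pre2(region):
    """2pre: player 2 has a move that lands in region against every player-1 move."""
    return {q for q, t in DELTA.items()
            if any(all(t[(m1, m2)] in region for m1 in MOVES1) for m2 in MOVES2)}

p1_forces_w = pre1({"w"})
p2_forces_l = pre2({"l"})
```

Neither player can force the outcome from s with a fixed move: s is in neither `pre1({"w"})` nor `pre2({"l"})`, even though the two targets are complementary.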
85
[figure: concurrent game with states a, b, c; edges labeled with the move pairs (1,1), (1,2), (2,1), (2,2)]
86
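[figure: concurrent game with states a, b, c; edges labeled with the move pairs (1,1), (1,2), (2,1), (2,2)] $\langle\langle P2\rangle\rangle\Diamond c = (\mu X)(c \lor \mathrm{2pre}(X))$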
87
Concurrent Game [figure: states a, b, c; edges labeled with the move pairs (1,1), (1,2), (2,1), (2,2)] $\langle\langle P2\rangle\rangle\Diamond c = (\mu X)(c \lor \mathrm{2pre}(X))$
88
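Concurrent Game [figure: states a, b, c; edges labeled with the move pairs (1,1), (1,2), (2,1), (2,2)] $\langle\langle P2\rangle\rangle\Diamond c = (\mu X)(c \lor \mathrm{2pre}(X))$. Randomized move: Pr(1): 0.5, Pr(2): 0.5.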
89
Extended Graph Models. Base model: graph. CONTROL: game graph. OBJECTIVE: $\omega$-automaton. PROBABILITIES: Markov decision process. CLOCKS: timed automaton. Combined models: stochastic game; regular game; stochastic hybrid system.
90
Graph: 1 Player. Nondeterministic closed system. [figure: states q1, q2, q3 with labels a and b]
91
MDP: 1.5 Players. Probabilistic closed system. [figure: states q1 to q5 with labels a, b, c; one probabilistic branch with probabilities 0.4 and 0.6]
92
Turn-based Game: 2 Players. Asynchronous open system. [figure: states q1 to q5 with labels a, b, c]
93
Turn-based Stochastic Game: 2.5 Players. Probabilistic asynchronous open system. [figure: states q1 to q7 with labels a, b, c; one probabilistic branch with probabilities 0.4 and 0.6]
94
Concurrent Game. Synchronous open system. [figure: states q1 to q5 with labels a and b; edges labeled with the move pairs 1,1; 1,2; 2,1; 2,2]
95
Concurrent Stochastic Game. Probabilistic synchronous open system. Matrix game at each vertex. At q1, with moves 1 and 2 for each player, the four move pairs give the successor distributions {q2: 0.3, q3: 0.2, q4: 0.5}; {q2: 0.1, q3: 0.1, q4: 0.5, q5: 0.3}; {q3: 0.2, q4: 0.1, q5: 0.7}; {q2: 1.0}. [figure: states q1 to q5 with labels a and b]
96
Graph: nondeterministic generator of behaviors (possibly stochastic). Strategy: deterministic selector of behaviors (possibly randomized). Graph + Strategies for both players $\to$ Behavior.
97
Model = graph. Pure behavior = path. Two pure strategies at q1: “left” and “right”. Two pure behaviors: ab; aa. [figure: states q1, q2, q3 with labels a and b]
98
Model = MDP. Pure behavior = probability distribution on paths = p-path. Two pure strategies at q1: “left” and “right”. Two pure behaviors: {ab: 1}; {aac: 0.4, aaa: 0.6}. [figure: states q1 to q5 with labels a, b, c; one probabilistic branch with probabilities 0.4 and 0.6]
99
Model = turn-based game. Pure behavior = path. Two pure pl. 1 strategies at q1: “left” and “right”. Two pure pl. 2 strategies at q3: “left” and “right”. Three pure behaviors: ab; aac; aaa. [figure: states q1 to q5 with labels a, b, c]
100
Model = turn-based game. Pure behavior = path. General (randomized) behavior = p-path. Three pure behaviors: ab; aac; aaa. Infinitely many behaviors, e.g. {aac: 0.5, aaa: 0.5}. [figure: states q1 to q5 with labels a, b, c]
101
The objective of each player is to find a strategy that optimizes the value of the resulting behavior. How do we define “value”? A. Assign a value to each path B. Assign a value to each behavior (expected value of A.) C. Assign a value to each state (strategy sup inf of B.)
102
A. Value of Paths. Qualitative value function: $\varphi\colon Q^\omega \to \{0,1\}$, e.g. $\omega$-regular subsets of $Q^\omega$.
103
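B. Value of Behaviors. Path $t$: $\varphi(T) = \varphi(t)$. P-path $T$: $\varphi(T) = \mathrm{Exp}\{\varphi(t)\}$, the expected value of $\varphi(t)$ for $t$ drawn from $T$. Example: $T$ = {aaa: 0.2, aab: 0.7, bbb: 0.1}; $(\Diamond b)(T) = 0.7 + 0.1 = 0.8$.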
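The example value can be checked mechanically. This is a minimal sketch in which paths are written as finite label strings, as on the slide; `eventually` and `value` are illustrative names, not taken from the slides.

```python
import math

# The p-path from the example: a probability distribution over paths.
T = {"aaa": 0.2, "aab": 0.7, "bbb": 0.1}

def eventually(label):
    """Qualitative path value: 1 if the label occurs somewhere on the path, else 0."""
    return lambda path: 1 if label in path else 0

def value(phi, p_path):
    """Value of a behavior: expected value of phi over the paths of the p-path."""
    return sum(prob * phi(path) for path, prob in p_path.items())

v = value(eventually("b"), T)
print(math.isclose(v, 0.8))  # compares with a tolerance because of float rounding
```

Only aab and bbb contain a b, so the expectation is the sum of their probabilities.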
104
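C. Value of States. $\langle\langle 1\rangle\rangle\varphi(q) = \sup_x \inf_y\ \varphi(\mathrm{Outcome}_{x,y}(q))$; $\langle\langle 2\rangle\rangle\varphi(q) = \sup_y \inf_x\ \varphi(\mathrm{Outcome}_{x,y}(q))$.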
105
Concurrent Stochastic Game. $Q$: states; $\Gamma_1, \Gamma_2$: moves of both players; $\delta\colon Q \times \Gamma_1 \times \Gamma_2 \to \mathrm{Dist}(Q)$: probabilistic transition function; $\mathcal{R} = [Q \to [0,1]]$: regions with $V = [0,1]$.
106
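Concurrent Stochastic Game. $Q$: states; $\Gamma_1, \Gamma_2$: moves of both players; $\delta\colon Q \times \Gamma_1 \times \Gamma_2 \to \mathrm{Dist}(Q)$: probabilistic transition function; $\mathcal{R} = [Q \to [0,1]]$: regions with $V = [0,1]$. 1pre: $\mathrm{1pre}(R)(q) = (\sup \gamma_1 \in \Gamma_1)(\inf \gamma_2 \in \Gamma_2)\ R(\delta(q,\gamma_1,\gamma_2))$.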
107
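Concurrent Stochastic Game. $Q$: states; $\Gamma_1, \Gamma_2$: moves of both players; $\delta\colon Q \times \Gamma_1 \times \Gamma_2 \to \mathrm{Dist}(Q)$: probabilistic transition function; $\mathcal{R} = [Q \to [0,1]]$: regions with $V = [0,1]$. 1pre: $\mathrm{1pre}(R)(q) = (\sup \gamma_1 \in \Gamma_1)(\inf \gamma_2 \in \Gamma_2)\ R(\delta(q,\gamma_1,\gamma_2))$. 2pre: $\mathrm{2pre}(R)(q) = (\sup \gamma_2 \in \Gamma_2)(\inf \gamma_1 \in \Gamma_1)\ R(\delta(q,\gamma_1,\gamma_2))$.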
108
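Concurrent Stochastic Game [figure: states a, b, c, with a matrix game at two of them (Pl.1 moves 1, 2; Pl.2 moves 1, 2). First matrix, successor distributions: {a: 0.6, b: 0.4}; {a: 0.1, b: 0.9}; {a: 0.5, b: 0.5}; {a: 0.2, b: 0.8}. Second matrix, successor distributions: {a: 0.0, c: 1.0}; {a: 0.7, c: 0.3}; {a: 0.0, c: 1.0}.]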
109
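Concurrent Stochastic Game [figure: states a, b, c, with a matrix game at two of them (Pl.1 moves 1, 2; Pl.2 moves 1, 2). First matrix, successor distributions: {a: 0.6, b: 0.4}; {a: 0.1, b: 0.9}; {a: 0.5, b: 0.5}; {a: 0.2, b: 0.8}. Second matrix, successor distributions: {a: 0.0, c: 1.0}; {a: 0.7, c: 0.3}; {a: 0.0, c: 1.0}.] $\langle\langle P1\rangle\rangle\Diamond c = (\mu X) \max(c, \mathrm{1pre}(X))$. Values shown: 0, 1, 0.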
110
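Concurrent Stochastic Game [figure: states a, b, c, with a matrix game at two of them (Pl.1 moves 1, 2; Pl.2 moves 1, 2). First matrix, successor distributions: {a: 0.6, b: 0.4}; {a: 0.1, b: 0.9}; {a: 0.5, b: 0.5}; {a: 0.2, b: 0.8}. Second matrix, successor distributions: {a: 0.0, c: 1.0}; {a: 0.7, c: 0.3}; {a: 0.0, c: 1.0}.] $\langle\langle P1\rangle\rangle\Diamond c = (\mu X) \max(c, \mathrm{1pre}(X))$. Values shown: 0, 1, 1.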
111
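Concurrent Stochastic Game [figure: states a, b, c, with a matrix game at two of them (Pl.1 moves 1, 2; Pl.2 moves 1, 2). First matrix, successor distributions: {a: 0.6, b: 0.4}; {a: 0.1, b: 0.9}; {a: 0.5, b: 0.5}; {a: 0.2, b: 0.8}. Second matrix, successor distributions: {a: 0.0, c: 1.0}; {a: 0.7, c: 0.3}; {a: 0.0, c: 1.0}.] $\langle\langle P1\rangle\rangle\Diamond c = (\mu X) \max(c, \mathrm{1pre}(X))$. Values shown: 0.8, 1, 1.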
112
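Concurrent Stochastic Game [figure: states a, b, c, with a matrix game at two of them (Pl.1 moves 1, 2; Pl.2 moves 1, 2). First matrix, successor distributions: {a: 0.6, b: 0.4}; {a: 0.1, b: 0.9}; {a: 0.5, b: 0.5}; {a: 0.2, b: 0.8}. Second matrix, successor distributions: {a: 0.0, c: 1.0}; {a: 0.7, c: 0.3}; {a: 0.0, c: 1.0}.] $\langle\langle P1\rangle\rangle\Diamond c = (\mu X) \max(c, \mathrm{1pre}(X))$. Values shown: 0.96, 1, 1.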
113
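Concurrent Stochastic Game [figure: states a, b, c, with a matrix game at two of them (Pl.1 moves 1, 2; Pl.2 moves 1, 2). First matrix, successor distributions: {a: 0.6, b: 0.4}; {a: 0.1, b: 0.9}; {a: 0.5, b: 0.5}; {a: 0.2, b: 0.8}. Second matrix, successor distributions: {a: 0.0, c: 1.0}; {a: 0.7, c: 0.3}; {a: 0.0, c: 1.0}.] $\langle\langle P1\rangle\rangle\Diamond c = (\mu X) \max(c, \mathrm{1pre}(X))$. Values in the limit: 1, 1, 1.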
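The sequence 0, 0.8, 0.96, ... at state a can be reproduced under some assumptions. The sketch below assumes a particular arrangement of the listed distributions into rows (player-1 moves) and columns (player-2 moves), invents the fourth cell at state b, which the transcript does not list, and gives c a self-loop. It also takes sup and inf over pure moves only, which happens to suffice for this layout; concurrent games in general need randomized moves at each step. All names (`DELTA`, `pre1`, `iterate`) are illustrative.

```python
# DELTA[q][i][j] is the successor distribution when player 1 plays row i and
# player 2 plays column j. The row/column layout is an assumption, as are the
# last cell at b and the self-loop at c.
DELTA = {
    "a": [[{"a": 0.6, "b": 0.4}, {"a": 0.5, "b": 0.5}],
          [{"a": 0.1, "b": 0.9}, {"a": 0.2, "b": 0.8}]],
    "b": [[{"a": 0.0, "c": 1.0}, {"a": 0.0, "c": 1.0}],
          [{"a": 0.7, "c": 0.3}, {"a": 1.0}]],
    "c": [[{"c": 1.0}]],
}

def pre1(delta, r):
    """Pure-move 1pre: max over rows of the min over columns of the expected
    value of r under the successor distribution."""
    return {q: max(min(sum(p * r[s] for s, p in dist.items()) for dist in row)
                   for row in matrix)
            for q, matrix in delta.items()}

def iterate(delta, target, steps):
    """Iterates of (mu X) max(target, 1pre(X)), starting from the all-zero region."""
    x = {q: 0.0 for q in delta}
    history = []
    for _ in range(steps):
        p = pre1(delta, x)
        x = {q: max(1.0 if q in target else 0.0, p[q]) for q in delta}
        history.append(x)
    return history

history = iterate(DELTA, {"c"}, 4)
for step in history:
    print({q: round(v, 4) for q, v in step.items()})
```

Under these assumptions the value at a approaches 1 without reaching it in finitely many steps, since each step maps the current value v at a to 0.2 v + 0.8.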
114
Solving Games by Value Iteration Reachability / max: Buechi / lim sup: Parity: …
115
Solving Games by Value Iteration Reachability / max: Buechi / lim sup: Parity: … Many open questions: How do different evaluation orders compare? How fast do these algorithms converge? When are they optimal?
116
Summary: Classification of Games
1. Number of players: 1, 1.5, 2, 2.5
2. Alternation: turn-based or concurrent
3. Strategies: pure or randomized
4. Value of a path: qualitative (boolean) or quantitative (real)
5. Objective: Borel 1, 2, 3
6. Zero-sum vs. nonzero-sum
117
Summary: Zero-Sum Games
The two players have complementary path values: $\varphi_2(t) = 1 - \varphi_1(t)$
-reachability vs. safety / max vs. min
-Büchi vs. coBüchi / lim sup vs. lim inf
-Rabin vs. Streett
Main Theorem [Martin75, Martin98]: The concurrent stochastic games are determined for all Borel objectives, i.e., $\langle\langle 1\rangle\rangle\varphi_1(q) + \langle\langle 2\rangle\rangle\varphi_2(q) = 1$ (sup inf = inf sup).
118
Summary: Zero-Sum Games. Columns: 1.5 players; 2 players; 2.5 players; concurrent. Row: parity. Entries, in order: CY98, dAl97: polynomial; GH82, EJ88; dAM01; dAH00, CdAH06: NP $\cap$ coNP.
119
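Concurrent Games are Difficult
-optimal strategies may not exist
-limit values may not be rational
-$\varepsilon$-close strategies, for fixed $\varepsilon$, may require infinite memory
-no determinacy for pure strategies
Example: $\langle\langle P1\rangle\rangle(\Diamond a)(q_1) = 0$; $\langle\langle P2\rangle\rangle(\Diamond b)(q_1) = 0$. [figure: state q1 with successors labeled a and b; edges labeled with the move pairs 1,1; 1,2; 2,1; 2,2]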
120
Turn-based Games are More Pleasant
-optimal strategies always exist [McIver/Morgan]
-in the non-stochastic case, pure finite-memory optimal strategies exist for $\omega$-regular objectives [Gurevich/Harrington]
-for parity objectives, pure memoryless optimal strategies exist [Emerson/Jutla: non-stochastic Rabin; Condon: stochastic reachability; Chatterjee/deAlfaro/H: stochastic Rabin], hence NP $\cap$ coNP
121
Turn-based Games are More Pleasant
-optimal strategies always exist [McIver/Morgan]
-in the non-stochastic case, pure finite-memory optimal strategies exist for $\omega$-regular objectives [Gurevich/Harrington]
-for parity objectives, pure memoryless optimal strategies exist [Emerson/Jutla: non-stochastic Rabin; Condon: stochastic reachability; Chatterjee/deAlfaro/H: stochastic Rabin], hence NP $\cap$ coNP
Whether these problems are solvable in P is open, both for non-stochastic parity games and for stochastic reachability games.
122
Summary Verification and control are very special (boolean) cases of graph-based optimization problems. They can be generalized to solve questions that involve multiple players, quantitative resources, probabilistic transitions, and continuous state spaces. The theory and practice of this is still wide open …