Published by Melvyn Cross. Modified over 9 years ago.
1
A Strategy Improvement Algorithm for Mean Payoff Games
Presentation for the lecture: AK Design and Verification
by Robert Könighofer (robert.koenighofer@student.tugraz.at)
Institute for Applied Information Processing and Communications (IAIK), TU Graz/Computer Science/IAIK
Graz, 2009
2
http://www.iaik.tugraz.at
Contents
Main source: H. Björklund, S. Sandberg, S. Vorobyov: A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games. [1]
  Recap: Mean Payoff Games
  0-Mean Partition Problem
    Longest Shortest Path Problem
    Algorithm
    Improvements
  Ergodic Partition Problem
  Complexity
  Appendix: proof sketches of main theorems
3
Recap: Mean Payoff Games (MPG)
Given: finite, directed, edge-weighted, leafless graph G = (V, E, w)
  V = V_MAX ∪ V_MIN
  w: E → {-W, …, 0, …, W}
Example: [figure: game graph with edge weights 2, -8, 4, 1, 7; the legend distinguishes V_MAX and V_MIN vertices]
4
Recap: Mean Payoff Games (MPG)
Notation:
  Two players: MIN and MAX
  Play ρ = e_0 e_1 e_2 e_3 …
  payoff(ρ) = val(ρ) = average(w(e_i))
  Positional strategy for MAX: σ_MAX: V_MAX → V such that (v, σ_MAX(v)) ∈ E
Goals:
  MAX: maximize val(ρ)
  MIN: minimize val(ρ)
5
MPG: Properties
Optimal strategies are positional
val(v) = val(ρ) when ρ starts in v and both players play optimally
  Optimal σ_MAX ensures payoff(ρ) ≥ val(v)
  Optimal σ_MIN ensures payoff(ρ) ≤ val(v)
When both players play positionally, the play is ρ = finite stem + loop, so val(ρ) = average(loop)
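The last property can be illustrated with a few lines of Python. This is only a sketch: the function name play_value and the weights are made up for illustration and are not taken from the slides. It shows that the finite stem does not influence the mean payoff, only the loop does.

```python
from fractions import Fraction

def play_value(stem_weights, loop_weights):
    """Mean payoff of a play that follows a finite stem and then repeats a loop forever.

    Hypothetical helper: the stem is visited once, so its share of the
    average vanishes in the limit; only the loop average remains.
    """
    if not loop_weights:
        raise ValueError("the loop must contain at least one edge")
    return Fraction(sum(loop_weights), len(loop_weights))

def prefix_average(stem_weights, loop_weights, rounds):
    """Average weight of the finite prefix: stem followed by `rounds` loop traversals."""
    total = sum(stem_weights) + rounds * sum(loop_weights)
    count = len(stem_weights) + rounds * len(loop_weights)
    return Fraction(total, count)

# Illustrative weights (assumed, not the slide example).
stem, loop = [2, -8], [4, 1, 7]
print(play_value(stem, loop))               # 4
print(float(prefix_average(stem, loop, 1000)))  # close to 4
```

The prefix averages approach the loop average as the number of loop traversals grows, which is why val(ρ) = average(loop).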
6
Computational Problems
Decision problem: can MAX guarantee payoff > p from v_0?
p-mean partition: divide V into V_≤p and V_>p
  MAX can guarantee payoff > p from all v ∈ V_>p
  MIN can guarantee payoff ≤ p from all v ∈ V_≤p
Ergodic partition: compute val(v) for all v ∈ V
7
0-Mean Partition: Approach
MPG → LSP (Longest Shortest Path Problem)
Solve LSP by strategy improvement:
  σ = σ_0
  while (σ changes):
    σ = Improve(σ)
8
Longest Shortest Path Problem
Given: finite, directed, edge-weighted graph G = (V, t, E, w)
  V = V_MAX ∪ V_MIN
  t = unique sink, t ∉ V_MAX
  here: σ_0, avoiding negative cycles
Find: positional σ such that the shortest path from every v to t is as long as possible in G_σ = G ∩ σ
9
Transformation MPG → LSP
Insert 'retreat vertex' t
For all v_i ∈ V_MAX: add edge e_i = (v_i, t) with w(e_i) = 0
Add edge (t, t) with w(t, t) = 0
Example: [figure: the example graph (weights 2, -8, 4, 1, 7) extended by the sink t and 0-weight retreat edges]
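The transformation is mechanical, so a short Python sketch may help. The function name mpg_to_lsp, the dictionary-based edge representation and the two-vertex graph are assumptions chosen for illustration; they do not come from the slides.

```python
def mpg_to_lsp(v_max, v_min, edges, sink="t"):
    """Turn an MPG into an LSP instance by adding a retreat vertex.

    edges maps (source, target) pairs to integer weights.
    Returns the vertex sets, the extended edge map and the sink name.
    """
    if sink in v_max or sink in v_min:
        raise ValueError("the sink name must not clash with a game vertex")
    lsp_edges = dict(edges)          # keep all original edges
    for v in v_max:
        lsp_edges[(v, sink)] = 0     # MAX may retreat to t at cost 0
    lsp_edges[(sink, sink)] = 0      # self-loop keeps the graph leafless
    return set(v_max), set(v_min), lsp_edges, sink

# Hypothetical two-vertex game: "a" belongs to MAX, "b" to MIN.
_, _, lsp_edges, t = mpg_to_lsp({"a"}, {"b"}, {("a", "b"): 2, ("b", "a"): -1})
print(sorted(lsp_edges.items()))
```

Note that only MAX vertices receive a retreat edge; MIN vertices keep exactly their original successors.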
10
Relation between MPG and LSP
MPG                       LSP
v ∈ V_>0                  dist(v,t) = ∞
MAX: enforce pos. loop    MAX: enforce pos. loop
MIN: enforce neg. loop    MAX: retreat, dist(v,t) < ∞
[figures: the example graph (weights 2, -8, 4, 1, 7) and a variant (weights 2, -8, -2, 1, 7), each shown as MPG and as LSP with sink t]
11
Relation between MPG and LSP
Admissible strategy: enforces positive loops OR retreat
We iterate over admissible strategies only
σ_0: go to t from every v ∈ V_MAX
12
Remember our approach:
MPG → LSP (Longest Shortest Path Problem)
Solve LSP by strategy improvement:
  σ = σ_0
  while (σ changes):
    σ = Improve(σ)
13
Quality of a Strategy
Only admissible strategies
MIN: take shortest path to t (any other loop is positive)
val_σ(v): shortest distance from v to t in G_σ
σ is better than σ* (σ > σ*) iff:
  ∀v ∈ V: val_σ(v) ≥ val_σ*(v) AND
  ∃v ∈ V: val_σ(v) > val_σ*(v)
14
Computing val_σ(v): Shortest Path Problem
Given: finite, directed, edge-weighted graph G = (V, t, E, w)
  t = unique sink
Find: shortest path from every v to t
Algorithms:
  Dijkstra's algorithm: only non-negative weights
  Bellman-Ford algorithm: also negative weights
15
Bellman-Ford Algorithm [3]
foreach v in V:
  distance[v] = ∞
  succ[v] = None
distance[t] = 0
succ[t] = t
do |V|-1 times:
  foreach (u,v) in E:
    if (distance[v] + w(u,v) < distance[u]):
      distance[u] = distance[v] + w(u,v)
      succ[u] = v
[figure: relaxing an edge (u,v); labels 2, 5, 2 before and 2, 4, 2 after the update]
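The pseudocode above translates almost line by line into Python. The following is a sketch under some assumptions: edges are stored in a dictionary keyed by (source, target), the early exit is an optional shortcut not shown on the slide, and the small graph is invented for illustration.

```python
import math

def bellman_ford_to_sink(vertices, edges, sink):
    """Shortest distance from every vertex to `sink`, plus the chosen successor.

    Assumes the graph has no negative cycle. Vertices that cannot reach
    the sink keep distance math.inf and successor None.
    """
    distance = {v: math.inf for v in vertices}
    succ = {v: None for v in vertices}
    distance[sink] = 0
    succ[sink] = sink
    for _ in range(len(vertices) - 1):
        changed = False
        for (u, v), w in edges.items():
            # Relax edge (u, v): going through v may shorten the way from u.
            if distance[v] + w < distance[u]:
                distance[u] = distance[v] + w
                succ[u] = v
                changed = True
        if not changed:      # nothing improved, further rounds are useless
            break
    return distance, succ

# Hypothetical graph: "d" only loops on itself and never reaches t.
vertices = ["t", "a", "b", "c", "d"]
edges = {("a", "t"): 5, ("a", "b"): 1, ("b", "t"): 2,
         ("c", "a"): -1, ("d", "d"): 1, ("t", "t"): 0}
distance, succ = bellman_ford_to_sink(vertices, edges, "t")
print(distance)  # {'t': 0, 'a': 3, 'b': 2, 'c': 2, 'd': inf}
```

In the LSP setting an infinite distance is meaningful: it marks the vertices from which the sink is never reached.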
16
Bellman-Ford Algorithm
Example: [figure: example LSP graph with sink t and vertices v0, v1, v2, v3 (V_MAX and V_MIN marked), its restriction with edges e1 … e5, and a table listing the distances of t, v0, v1, v2, v3 after each iteration]
17
Bellman-Ford Algorithm
Another example: [figure: graph with sink t and vertices v0, v1, edges e1, e2, e3 (weights include 12 and -2), and a table of distances per iteration in which the values keep decreasing]
Bellman-Ford does not work with negative loops
18
Switching the strategy
Picking another successor in a V_MAX vertex:
Notation: σ[x→y]
  σ[x→y](x) = y
  σ[x→y](a) = σ(a) for a ≠ x
Switch σ[v→u] is:
  attractive iff w(v,u) + val_σ(u) > val_σ(v)
  profitable iff σ[v→u] > σ (expensive to check)
[figure: switch at v towards u; labels 3, 4, 2 before and 3, 5, 2 after the switch]
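The attractiveness test is a purely local comparison, which the following Python sketch tries to make concrete. The names is_attractive and attractive_switches as well as the numbers are assumptions for illustration; val is assumed to hold the distances val_σ computed for the current strategy.

```python
import math

def is_attractive(val, weights, v, u):
    """True iff switching v to successor u is attractive under the current values."""
    return weights[(v, u)] + val[u] > val[v]

def attractive_switches(val, weights, sigma):
    """All edges (v, u) of MAX vertices that are not used by sigma and are attractive."""
    return [(v, u) for (v, u) in weights
            if v in sigma and sigma[v] != u and is_attractive(val, weights, v, u)]

# Hypothetical situation: v currently retreats to t and has value 4.
val = {"v": 4, "u": 2, "x": math.inf, "t": 0}
weights = {("v", "t"): 4, ("v", "u"): 3, ("v", "x"): -5}
sigma = {"v": "t"}
print(is_attractive(val, weights, "v", "u"))    # True, since 3 + 2 > 4
print(attractive_switches(val, weights, sigma))  # [('v', 'u'), ('v', 'x')]
```

A switch towards a vertex with infinite value is attractive whenever the current value is finite, because math.inf compares greater than every finite number.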
19
Main Theorems
Theorem 5.1: switch is attractive ⇒ switch is profitable
  Also holds for combinations of switches
Theorem 5.2: no more attractive switches ⇒ strategy is at least as good as any other admissible strategy
20
Putting the pieces together
solve_0-mean_partition(G'):
  G = MPGtoLSP(G')
  σ_0 = computeInitialAdmissibleStrategy(G)
  σ = σ_0
  while (σ changes):
    (σ, distance) = Improve(σ, G)
  V_MAX = V_MIN = emptySet
  foreach v in (G.V \ t):
    distance[v] == ∞ ? V_MAX.add(v) : V_MIN.add(v)
  return (V_MAX, V_MIN, σ)

Improve(σ, G):
  G_σ = restrictGraph(G, σ)
  distance = BellmanFord(G_σ)
  (v, u, failed) = findAttractiveSwitch(distance)
  if (failed):
    return (σ, distance)
  return (σ[v->u], None)

findAttractiveSwitch(distance):
  foreach (v,u) in (G.E \ G_σ.E):
    if (w(v,u) + distance[u] > distance[v]):
      return (v, u, 0)
  return (None, None, 1)
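A compact Python sketch of the whole loop might look as follows. It is not the implementation from [1]: the function names, the dictionary-based graph and the four-vertex game are invented for illustration, only single switches are applied, and the sketch assumes that the initial strategy σ_0 (retreat everywhere) is admissible, i.e. that MIN vertices alone form no negative cycle.

```python
import math

def bellman_ford(vertices, edges, sink):
    """Shortest distances to sink; assumes no negative cycle in edges."""
    dist = {v: math.inf for v in vertices}
    dist[sink] = 0
    for _ in range(len(vertices) - 1):
        for (u, v), w in edges.items():
            if dist[v] + w < dist[u]:
                dist[u] = dist[v] + w
    return dist

def solve_zero_mean_partition(v_max, v_min, edges, sink="t"):
    """Return (vertices with value > 0, vertices with value <= 0, final strategy)."""
    # MPG -> LSP: retreat edges for MAX and a self-loop at the sink.
    g = dict(edges)
    for v in v_max:
        g[(v, sink)] = 0
    g[(sink, sink)] = 0
    vertices = set(v_max) | set(v_min) | {sink}
    sigma = {v: sink for v in v_max}        # sigma_0: always retreat
    while True:
        # G_sigma keeps all MIN edges but only the chosen edge of each MAX vertex.
        g_sigma = {(u, v): w for (u, v), w in g.items()
                   if u not in sigma or sigma[u] == v}
        dist = bellman_ford(vertices, g_sigma, sink)
        switch = next(((v, u) for (v, u), w in g.items()
                       if v in sigma and sigma[v] != u
                       and w + dist[u] > dist[v]), None)
        if switch is None:                   # no attractive switch left
            break
        sigma[switch[0]] = switch[1]
    above = {v for v in vertices if v != sink and dist[v] == math.inf}
    return above, vertices - above - {sink}, sigma

# Hypothetical game: loop a-b has weight 3, loop c-d has weight -2.
above, at_most, sigma = solve_zero_mean_partition(
    {"a", "c"}, {"b", "d"},
    {("a", "b"): 2, ("b", "a"): 1, ("c", "d"): 1, ("d", "c"): -3})
print(sorted(above), sorted(at_most), sigma["a"], sigma["c"])
# ['a', 'b'] ['c', 'd'] b t
```

In this example MAX keeps the positive loop between a and b, so both end with infinite distance, while c retreats to t because its only loop is negative.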
21
Putting the pieces together
Example: [figure sequence on the example graph (weights 2, -8, 4, 1, 7): MPG to LSP with sink t; σ = σ_0; then three applications of σ = Improve(σ), each showing the current strategy and the vertex distances (among them ∞, 1, 0, -7)]
22
Improvements: Switches
Any combination of attractive switches improves the strategy
  Multiple switches per iteration
Try heuristics for selecting single or multiple attractive switches
  Random, all attractive switches, ...
  Initial Multiple Switching
  Proceeding in Stages
23
Improvements: Randomization
Order of switches is crucial for complexity
Facet F[u→v] = set of strategies where succ[u] = v, u ∈ V_MAX
Randomization scheme [4]:
find_best_strategy(σ, G):
  if (G == G_σ):
    return σ
  while (true):
    randomly pick some F[u->v] not containing σ
    σ* = find_best_strategy(σ, G\(u,v))
    if (σ* is optimal in G):
      return σ*
    G = F[u->v]
    σ = σ[u->v]
24
Improvements: Randomization
Example: [figure sequence on the example graph with vertices v0, v1, v2, v3 and sink t]
  (G, σ): pick F[v1→v0]
  (G\(v1,v0), σ): σ* = find_best_strategy(σ, G\(v1,v0)) (recursive call)
  (G, σ*): σ* optimal in G? NO! There is an attractive switch!
  σ = σ*, G = F[v1→v0]
  (G, σ): continue in the facet
25
Improvements: Randomization
Example continued: [figure sequence on the example graph with vertices v0, v1, v2, v3 and sink t]
  (G, σ): pick F[v0→t]
  (G\(v0,t), σ): σ* = find_best_strategy(σ, G\(v0,t)) (recursive call)
  (G, σ*): σ* optimal in G? YES! No more attractive switches!
26
Improvements: Recomputing the measure
Switch from σ to σ* = σ[v→u]
val_σ(v) = val_σ*(v) for some v
Compute nodes that change their value
Bellman-Ford algorithm only for these nodes
27
Which values change?
σ* = σ[u_1→v_1][u_2→v_2]...
U = {u_1, u_2, ...}
Mark all vertices in U
while (U not empty):
  u = U.pop()
  foreach unmarked predecessor p of u in G_σ*:
    if w(p,x) + d[x] > d[p] for all unmarked succ x of p in G_σ*:
      mark p
      U.push(p)
[figure: vertex u with predecessor p, which has further successors x1 and x2; labels 5, 2, 3, 4, 6]
28
Which values change?
Example: switch σ* = σ[v_0→v_3]
[figures: (G, σ), (G, σ*) and (G_σ*, σ*) on the example graph with vertices v0, v1, v2, v3 and sink t]
marked    U      u    {p_i}   {x_j}   condition                                     TRUE for all x_j?
v0        {v0}   v0   v2      v3      w(v_2,v_3) + d[v_3] > d[v_2]: 7 - 1 > -8      YES
v0, v2    {v2}   v2   -       -       -                                             -
29
Complexity: p-Mean Partition
Without improvement:
  n·W finite values → n²·W switches
  Bellman-Ford: O(n·m)
  → O(n³·m·W)
With improvement:
  Bellman-Ford: O(n_i·m), n_i = |changing nodes|
  Σ(n_i) = n²·W
  → O(n²·m·W)
30
Complexity: p-Mean Partition
With randomization [4]:
All together:
31
Ergodic Partition
w-, w+: smallest and biggest edge weights
val(ρ) = average(w(e_i))
  in [w-, w+]
  denominator ≤ n
Repeated p-mean partitioning:
  Break interval [w-, w+] into parts of length 1/n²
  Decide which v are in each interval
  Min. difference between 2 values:
  Unique value in each interval
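Why an interval of length 1/n² can contain at most one candidate value can be checked by brute force. The sketch below relies on a standard number-theoretic fact that is not spelled out in the slide text: two distinct fractions with denominators at most n differ by at least 1/(n·(n-1)), which is larger than 1/n². The function name min_gap is made up, and the search is restricted to [0, 1], which suffices because shifting by an integer does not change differences.

```python
from fractions import Fraction

def min_gap(n):
    """Smallest difference between two distinct fractions in [0, 1] with denominator <= n."""
    values = sorted({Fraction(a, b) for b in range(1, n + 1) for a in range(b + 1)})
    # The minimum over all pairs is attained by two neighbours in sorted order.
    return min(y - x for x, y in zip(values, values[1:]))

for n in range(2, 9):
    gap = min_gap(n)
    assert gap == Fraction(1, n * (n - 1))   # the standard bound is tight
    assert gap > Fraction(1, n * n)          # so 1/n² intervals separate the values
print(min_gap(5))  # 1/20
```

For n = 5 the closest pair is 1/5 and 1/4, which are 1/20 apart, while the intervals used for the search have length 1/25.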
32
Complexity: Comparison
                     This Approach    Zwick / Paterson [2]
Ergodic Partition
p-Mean Partition
33
Summary
The first strongly subexponential algorithm
Algorithm for p-Mean Partition problem
  Longest Shortest Path Problem
  Strategy improvement
  Improvements
Extended to Ergodic Partition problem
34
References
[1] H. Björklund, S. Sandberg, S. Vorobyov. A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games. In: Proc. 29th International Symposium on Mathematical Foundations of Computer Science (MFCS), Vol. 3153 of Lecture Notes in Computer Science, Springer-Verlag, 2004, pp. 673–685.
[2] U. Zwick, M. Paterson. The complexity of mean payoff games on graphs. Theor. Comput. Sci., 158:343–359, 1996.
[3] T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein. Introduction to Algorithms. MIT Press and McGraw-Hill Book Company, Cambridge, MA, 2nd edition, 2001.
[4] J. Matousek, M. Sharir, E. Welzl. A subexponential bound for linear programming. In: 8th ACM Symp. on Computational Geometry, pages 1–8, 1992.
35
Questions / Discussion
... thank you for your attention
36
Appendix
Proof sketches for Theorems 5.1 and 5.2
37
Proof sketch: Theorem 5.1 (attractive ⇒ profitable)
Value increases in at least one vertex:
  Attractive switch σ* = σ[v→u]: w(v,u) + val_σ(u) > val_σ(v) ⇒ val_σ*(v) > val_σ(v)
Values do not decrease:
  New loops are positive
  New paths to the sink are longer
38
Proof sketch: Theorem 5.1 (attractive ⇒ profitable)
New loops are positive:
  Switch σ* = σ[v→u]: a new loop must contain the switching vertex v
  [figure: v with old distance y to t; the new edge (v,u) closes a loop of weight x through u]
  y = val_σ(v) < w(v,u) + val_σ(u) ≤ x + y ⇒ x > 0
    first inequality: the switch is attractive
    second inequality: val_σ(u) ≤ x - w(v,u) + y
39
Proof sketch: Theorem 5.1 (attractive ⇒ profitable)
New paths to the sink are longer:
  Switch σ* = σ[v→u]: a new path from any vertex n to t must contain the switching vertex v
  [figure: vertex n reaches v at distance a; v has old distance y to t and a new path of length x through u]
  y = val_σ(v) < w(v,u) + val_σ(u) ≤ x ⇒ y < x
    first inequality: the switch is attractive
    second inequality: val_σ(u) ≤ x - w(v,u)
40
Proof sketch: Theorem 5.2 (stable ⇒ optimal)
Proof for one-player games: MIN has no choices
  Finite values cannot become infinite:
    no more attractive switches ⇒ no more new positive loops
  Finite values do not improve finitely:
    no more attractive switches ⇒ no more new longer paths to t
Extension to two-player games:
  MIN does not need choices
41
Proof sketch: Theorem 5.2 (stable ⇒ optimal)
No more new positive loops:
  Assumption: there is a new positive loop
  [figure: v with distance y to t; edge (v,u) closes a loop of weight x]
  switch attractive ⇔ y < x + y
  no attractive switches ⇒ y ≥ x + y ⇒ x ≤ 0
42
Proof sketch: Theorem 5.2 (stable ⇒ optimal)
No more new longer paths to t:
  Assumption: there is a new longer path to t
  [figure: vertex n reaches v at distance a; v has distance y to t and a new path of length x through u]
  switch attractive ⇔ y < x
  no attractive switches ⇒ y ≥ x ⇒ cannot have better finite values