Institute for Applied Information Processing and Communications (IAIK) 1 TU Graz/Computer Science/IAIK Graz, 2009 AK Design and Verification Presentation.


1 Institute for Applied Information Processing and Communications (IAIK), TU Graz/Computer Science/IAIK, Graz, 2009
AK Design and Verification
A Strategy Improvement Algorithm for Mean Payoff Games
Presentation for the lecture AK Design and Verification by Robert Könighofer (robert.koenighofer@student.tugraz.at)

2 http://www.iaik.tugraz.at Institute for Applied Information Processing and Communications (IAIK), TU Graz/Computer Science/IAIK, AK Design and Verification
Contents
• Main source: H. Björklund, S. Sandberg, S. Vorobyov: A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games. [1]
• Recap: Mean Payoff Games
• 0-Mean Partition Problem
• Longest Shortest Path Problem
• Algorithm
• Improvements
• Ergodic Partition Problem
• Complexity
• Appendix: proof sketches of main theorems

3 Recap: Mean Payoff Games (MPG)
Given:
• Finite, directed, edge-weighted, leafless graph: G = (V, E, w)
• V = V_MAX ∪ V_MIN, w: E → {-W, …, 0, …, W}
Example: [Figure: game graph with edge weights 2, -8, 4, 1, 7; the legend distinguishes V_MAX and V_MIN vertices]

4 Recap: Mean Payoff Games (MPG)
Notation:
• 2 players: MIN and MAX
• Play ρ = e_0 e_1 e_2 e_3 …
• payoff(ρ) = val(ρ) = average(w(e_i))
• Positional strategy for MAX: σ_MAX: V_MAX → V so that (v, σ_MAX(v)) ∈ E
Goals:
• MAX: maximize val(ρ)
• MIN: minimize val(ρ)

5 MPG: Properties
• Optimal strategies are positional
• val(v) = val(ρ) when ρ starts in v and both players play optimally
• Optimal σ_MAX: ensures payoff(ρ) ≥ val(v)
• Optimal σ_MIN: ensures payoff(ρ) ≤ val(v)
• Play ρ = finite stem + loop
    ⇒ val(ρ) = average(loop)
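The last property can be checked numerically. The following is a minimal sketch, not taken from the slides: the function names `lasso_payoff` and `prefix_average` are hypothetical, and a play is assumed to be given as a list of stem weights followed by a list of loop weights that repeats forever.

```python
from fractions import Fraction

def lasso_payoff(stem_weights, loop_weights):
    """Payoff of a play 'stem + loop repeated forever': the mean weight of the loop."""
    if not loop_weights:
        raise ValueError("the loop must contain at least one edge")
    # The finite stem does not influence the long-run average.
    return Fraction(sum(loop_weights), len(loop_weights))

def prefix_average(stem_weights, loop_weights, k):
    """Average weight of the first k edges of the play (assumption: k >= 1)."""
    total = 0
    for i in range(k):
        if i < len(stem_weights):
            total += stem_weights[i]
        else:
            total += loop_weights[(i - len(stem_weights)) % len(loop_weights)]
    return Fraction(total, k)

# A heavy stem edge (100) followed by a loop with weights 4, -8, 7 (mean 1).
print(lasso_payoff([100], [4, -8, 7]))
print(float(prefix_average([100], [4, -8, 7], 30001)))
```

The prefix averages approach the loop mean as k grows, which is why only the loop matters for val(ρ).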

6 Computational Problems
Decision Problem:
• Can MAX guarantee payoff > p from v_0?
p-Mean Partition:
• Divide V into V_≤p and V_>p
• MAX can guarantee payoff > p from all v ∈ V_>p
• MIN can guarantee payoff ≤ p from all v ∈ V_≤p
Ergodic Partition:
• Compute val(v) for all v ∈ V

7 0-Mean Partition: Approach
• MPG → LSP (Longest Shortest Path Problem)
• Solve LSP by strategy improvement:
    σ = σ_0
    while(σ changes):
        σ = Improve(σ)

8 Longest Shortest Path Problem
Given:
• Finite, directed, edge-weighted graph: G = (V, t, E, w)
• V = V_MAX ∪ V_MIN
• t = unique sink, t ∉ V_MAX
• here: an initial strategy σ_0 that avoids negative cycles
Find:
• positional σ such that the shortest path from every v to t is as long as possible in G_σ = G ∩ σ (G restricted to the edges that σ selects in V_MAX vertices)

9 Transformation MPG → LSP
• Insert 'retreat vertex' t
• For all v_i ∈ V_MAX: add edge e_i = (v_i, t), w(e_i) = 0
• Add edge (t, t) with w(t, t) = 0
• Example: [Figure: the example graph (edge weights 2, -8, 4, 1, 7) extended by the sink t and weight-0 edges]
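The transformation is mechanical, as the following minimal sketch suggests. The graph representation (a dict mapping edge pairs to weights plus a set of MAX vertices) and the name `mpg_to_lsp` are assumptions made for illustration, not part of the slides.

```python
def mpg_to_lsp(edges, v_max, sink="t"):
    """Return the edge dict of the LSP instance built from an MPG.

    edges: dict {(u, v): weight} of the mean payoff game (assumed representation)
    v_max: set of vertices owned by MAX
    sink:  name of the new retreat vertex t (must not occur in the game)
    """
    if any(sink in edge for edge in edges):
        raise ValueError("sink name already used as a vertex")
    lsp = dict(edges)              # keep every original edge
    for v in v_max:
        lsp[(v, sink)] = 0         # retreat edge with weight 0 for each MAX vertex
    lsp[(sink, sink)] = 0          # self-loop on t with weight 0
    return lsp

game = {("a", "b"): 2, ("b", "a"): -8}
print(mpg_to_lsp(game, v_max={"a"}))
```

MIN vertices get no retreat edge, so only MAX can decide to leave the game towards t.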

10 Relation MPG ↔ LSP
• v ∈ V_>0 ⇔ dist(v, t) = ∞
MPG                      | LSP
MAX: enforce pos. loop   | MAX: enforce pos. loop
MIN: enforce neg. loop   | MAX: retreat, dist(v, t) < ∞
[Figures: two example games with their LSP counterparts; edge weights 2, -8, 4, 1, 7 in the first and 2, -8, -2, 1, 7 in the second]

11 Relation MPG ↔ LSP
Admissible strategy:
• enforces positive loops
• OR retreats
• we iterate over admissible strategies only
• σ_0: go to t from every v ∈ V_MAX

12 Remember our approach:
• MPG → LSP (Longest Shortest Path Problem)
• Solve LSP by strategy improvement:
    σ = σ_0
    while(σ changes):
        σ = Improve(σ)

13 Quality of a Strategy
• Only admissible strategies
• MIN: take shortest path to t (any other loop is positive)
• val_σ(v): shortest distance from v to t in G_σ
• σ is better than σ* (σ > σ*) iff:
    • ∀v ∈ V: val_σ(v) ≥ val_σ*(v) AND
    • ∃v ∈ V: val_σ(v) > val_σ*(v)
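The comparison of two strategies is a pointwise comparison of their value vectors. A minimal sketch, assuming the values are stored in dicts keyed by vertex and that `math.inf` stands for ∞ (the name `is_better` is hypothetical):

```python
import math

def is_better(val_a, val_b):
    """True iff val_a >= val_b in every vertex and val_a > val_b in at least one."""
    vertices = val_a.keys()
    no_worse_anywhere = all(val_a[v] >= val_b[v] for v in vertices)
    strictly_better_somewhere = any(val_a[v] > val_b[v] for v in vertices)
    return no_worse_anywhere and strictly_better_somewhere

print(is_better({"a": 1, "b": math.inf}, {"a": 1, "b": 5}))
print(is_better({"a": 2, "b": 0}, {"a": 0, "b": 2}))
```

The order is only partial: in the second call neither strategy is better than the other.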

14 Computing val_σ(v): Shortest Path Problem
Given:
• Finite, directed, edge-weighted graph: G = (V, t, E, w)
• t = unique sink
Find:
• shortest path from every v to t
Algorithms:
• Dijkstra's algorithm: only non-negative weights
• Bellman-Ford algorithm: also negative weights

15 Bellman-Ford Algorithm [3]
    foreach v in V:
        distance[v] = ∞
        succ[v] = None
    distance[t] = 0
    succ[t] = t
    do |V|-1 times:
        foreach (u,v) in E:
            if(distance[v] + w(u,v) < distance[u]):
                distance[u] = distance[v] + w(u,v)
                succ[u] = v
[Figure: relaxing an edge (u, v) of weight 2 with distance[v] = 2 lowers distance[u] from 5 to 4]
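A runnable version of this pseudocode could look as follows. This is a minimal sketch under the assumption that the graph is given as a vertex list and a dict of weighted edges; the name `bellman_ford_to_sink` is hypothetical. As on the slide, distances are measured towards the sink t, and negative cycles are not detected.

```python
import math

def bellman_ford_to_sink(vertices, edges, sink):
    """Shortest distance from every vertex to the sink, plus the successor on that path.

    vertices: iterable of vertex names
    edges:    dict {(u, v): weight}
    Vertices that cannot reach the sink keep distance math.inf.
    """
    vertices = list(vertices)
    distance = {v: math.inf for v in vertices}
    succ = {v: None for v in vertices}
    distance[sink] = 0
    succ[sink] = sink
    for _ in range(len(vertices) - 1):
        for (u, v), w in edges.items():
            # Relax edge (u, v): going through v may shorten the way from u to the sink.
            if distance[v] + w < distance[u]:
                distance[u] = distance[v] + w
                succ[u] = v
    return distance, succ

edges = {("a", "t"): 5, ("b", "a"): -2, ("b", "t"): 4, ("t", "t"): 0}
dist, succ = bellman_ford_to_sink(["t", "a", "b", "c"], edges, "t")
print(dist, succ)
```

In the usage example, b prefers the detour via a (total -2 + 5 = 3) over its direct edge of weight 4, and the isolated vertex c keeps distance ∞.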

16 Bellman-Ford Algorithm
Example: [Figure: the example LSP graph with vertices v0, v1, v2, v3 and sink t (edge weights 2, -8, 4, 1, 7 and weight-0 retreat edges), followed by the graph restricted to a strategy with edges e1 … e5; a table lists the distances of t, v0, v1, v2, v3 after iterations 0 to 3, but its cell values are not recoverable from the transcript]

17 Bellman-Ford Algorithm
Another example: [Figure: graph with vertices v0, v1 and sink t, edges e1, e2, e3; visible edge weights 0, 12, -2]
distances:
iteration | 0 | 1  | 2
t         | 0 | 0  | 0
v0        | ∞ | 12 | 9
v1        | ∞ | ∞  | 10
⇒ Bellman-Ford does not work with negative loops

18 Switching the strategy
• Picking another successor in a V_MAX vertex:
• Notation: σ[x → y]
    • σ[x → y](x) = y
    • σ[x → y](a) = σ(a) for a ≠ x
Switch σ[v → u] is:
• attractive iff: w(v,u) + val_σ(u) > val_σ(v)
• profitable iff: σ[v → u] > σ
    → expensive to check
[Figure: switch example with w(v,u) = 3 and val(u) = 2; val(v) rises from 4 to 5]
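Attractiveness is a purely local test, which is what makes it cheap compared with checking profitability. A minimal sketch, assuming strategies and values are dicts keyed by vertex (the names `switch` and `is_attractive` are hypothetical):

```python
def switch(sigma, x, y):
    """Return sigma[x -> y]: a copy of sigma in which x now moves to y."""
    new_sigma = dict(sigma)
    new_sigma[x] = y
    return new_sigma

def is_attractive(weights, val, v, u):
    """Local test from the slide: w(v,u) + val_sigma(u) > val_sigma(v)."""
    return weights[(v, u)] + val[u] > val[v]

# Numbers from the figure: w(v,u) = 3, val(u) = 2, val(v) = 4, so 3 + 2 > 4.
print(is_attractive({("v", "u"): 3}, {"v": 4, "u": 2}, "v", "u"))
print(switch({"v": "t"}, "v", "u"))
```

The test needs only the current values, whereas profitability would require recomputing all values for the switched strategy.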

19 Main Theorems
Theorem 5.1:
• Switch is attractive ⇒ switch is profitable
• Also holds for combinations of switches
Theorem 5.2:
• No more attractive switches ⇒ strategy is at least as good as any other admissible strategy.

20 Putting the pieces together
solve_0-mean_partition(G'):
    G = MPGtoLSP(G')
    σ_0 = computeInitialAdmissableStrategy(G)
    σ = σ_0
    while(σ changes):
        (σ, distance) = Improve(σ, G)
    V_MAX = V_MIN = emptySet
    foreach v in (G.V \ t):
        distance[v] == ∞ ? V_MAX.add(v) : V_MIN.add(v)
    return (V_MAX, V_MIN, σ)

Improve(σ, G):
    G_σ = restrictGraph(G, σ)
    distance = BellmanFord(G_σ)
    (v, u, failed) = findAttractiveSwitch(distance)
    if(failed):
        return (σ, distance)
    return (σ[v->u], None)

findAttractiveSwitch(distance):
    foreach (v,u) in (G.E \ G_σ.E):
        if(w(v,u) + distance[u] > distance[v]):
            return (v,u,0)
    return (None, None, 1)
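The pseudocode above can be condensed into a short runnable sketch. It is an illustration under stated assumptions rather than the authors' implementation: the game is assumed to be a dict of weighted edges plus a set of MAX vertices, the helper names are hypothetical, the result sets are called `win_max` and `win_min` to avoid confusion with the vertex sets V_MAX and V_MIN, and one attractive switch is applied per iteration.

```python
import math

def bellman_ford(vertices, edges, sink):
    """Shortest distances to the sink; assumes the graph has no negative cycles."""
    dist = {v: math.inf for v in vertices}
    dist[sink] = 0
    for _ in range(len(vertices) - 1):
        for (u, v), w in edges.items():
            if dist[v] + w < dist[u]:
                dist[u] = dist[v] + w
    return dist

def solve_zero_mean_partition(edges, v_max, sink="t"):
    # MPG -> LSP: add the retreat vertex, retreat edges and the self-loop.
    g = dict(edges)
    for v in v_max:
        g[(v, sink)] = 0
    g[(sink, sink)] = 0
    vertices = {x for edge in g for x in edge}
    # Initial admissible strategy: every MAX vertex retreats to the sink.
    sigma = {v: sink for v in v_max}
    while True:
        # G_sigma: MAX vertices keep only the edge chosen by sigma.
        g_sigma = {(u, v): w for (u, v), w in g.items()
                   if u not in v_max or sigma[u] == v}
        dist = bellman_ford(vertices, g_sigma, sink)
        # Look for one attractive switch among the unused MAX edges.
        attractive = next(((v, u) for (v, u), w in g.items()
                           if v in v_max and sigma[v] != u
                           and w + dist[u] > dist[v]), None)
        if attractive is None:
            break
        sigma[attractive[0]] = attractive[1]
    win_max = {v for v in vertices if v != sink and dist[v] == math.inf}
    win_min = vertices - win_max - {sink}
    return win_max, win_min, sigma

# MAX vertex a, MIN vertex b, single loop a -> b -> a with mean weight +1.
print(solve_zero_mean_partition({("a", "b"): 3, ("b", "a"): -1}, {"a"}))
```

In the usage example the loop has positive mean, so MAX switches from retreating to entering the loop and both vertices end up with distance ∞, i.e. in V_>0.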

21 Putting the pieces together
Example: [Figure sequence on the example graph (edge weights 2, -8, 4, 1, 7): MPG to LSP, σ = σ_0, then three steps σ = Improve(σ); the vertex labels show the distances to t after each step (values appearing in the transcript: 0, 0; 1, 0, -7; ∞, ∞, ∞, ∞)]

22 Improvements: Switches
• Any combination of attractive switches improves the strategy
    → multiple switches per iteration
• Try heuristics for selecting single or multiple attractive switches
    • Random, all attractive switches, ...
    • Initial Multiple Switching
    • Proceeding in Stages

23 Improvements: Randomization
• Order of switches is crucial for complexity
• Facet F[u → v] = set of strategies where succ[u] = v, u ∈ V_MAX
• Randomization scheme [4]:
    find_best_strategy(σ, G):
        if(G == G_σ):
            return σ
        while(true):
            randomly pick some F[u->v] not containing σ
            σ* = find_best_strategy(σ, G\(u,v))
            if(σ* is optimal in G):
                return σ*
            G = F[u->v]
            σ = σ*[u->v]

24 Improvements: Randomization
Example: [Figure sequence on the example LSP graph with vertices v0, v1, v2, v3 and sink t]
• (G, σ): pick F[v1 → v0]
• Recursive call on (G\(v1,v0), σ): σ* = find_best_strategy(σ, G\(v1,v0))
• (G, σ*): σ* optimal in G? NO! There is an attractive switch!
• σ = σ*, G = F[v1 → v0]

25 Improvements: Randomization
Example continued: [Figure sequence on the example LSP graph with vertices v0, v1, v2, v3 and sink t]
• (G, σ): pick F[v0 → t]
• Recursive call on (G\(v0,t), σ): σ* = find_best_strategy(σ, G\(v0,t))
• (G, σ*): σ* optimal in G? YES! No more attractive switches!

26 Improvements: Recomputing the measure
• Switch from σ to σ* = σ[v → u]
• val_σ(v) = val_σ*(v) for some v
    → Compute nodes that change their value
    → Bellman-Ford algorithm only for these nodes

27 Which values change?
• σ* = σ[u_1 → v_1][u_2 → v_2]...
• U = {u_1, u_2, ...}
    Mark all vertices in U
    while(U not empty):
        u = U.pop()
        foreach unmarked predecessor p of u in G_σ*:
            if w(p,x) + d[x] > d[p] for all unmarked succ x of p in G_σ*:
                mark p
                U.push(p)
[Figure: vertex p with successors u, x1, x2; numbers shown: 5, 2, 3, 4, 6]

28 Which values change?
Example: [Figure: (G, σ), (G, σ*) and (G_σ*, σ*) on the example graph for the switch σ* = σ[v_0 → v_3]]
marked  | U    | u  | {p_i} | {x_j} | condition                                  | TRUE for all x_j?
v0      | {v0} | v0 | v2    | v3    | w(v2,v3) + d[v3] > d[v2], i.e. 7 - 1 > -8  | YES
v0, v2  | {v2} | v2 | -     | -     | -                                          | -

29 Complexity: p-Mean Partition
Without improvement:
• n·W finite values → n²·W switches
• Bellman-Ford: O(n·m)
→ O(n³·m·W)
With improvement:
• Bellman-Ford: O(n_i · m), n_i = |changing nodes|
• Σ(n_i) = n²·W
→ O(n²·m·W)
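The two bounds follow by multiplying the number of switches with the cost per switch. The short derivation below spells this out; it assumes n = |V| and m = |E| (the slide does not define them explicitly) and W as the maximal absolute edge weight from the problem definition.

```latex
\begin{align*}
T_{\text{without}} &= \underbrace{n^{2} W}_{\text{switches}} \cdot
                      \underbrace{O(n\,m)}_{\text{Bellman-Ford per switch}}
                    = O(n^{3}\, m\, W), \\
T_{\text{with}}    &= \sum_{i} O(n_i\, m)
                    = O\Bigl(m \sum_{i} n_i\Bigr)
                    = O(n^{2}\, m\, W)
                    \quad\text{since } \sum_{i} n_i = n^{2} W .
\end{align*}
```

The improvement therefore saves a factor of n by charging each recomputation only for the nodes whose value actually changes.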

30 Complexity: p-Mean Partition
With Randomization [4]: [formula not preserved in the transcript]
All together: [formula not preserved in the transcript]

31 Ergodic Partition
• w^-, w^+: smallest and biggest edge weights
• val(ρ) = average(w(e_i)) in [w^-, w^+]
• denominator ≤ n
• Repeated p-mean partitioning:
    • Break interval [w^-, w^+] into parts of length 1/n²
    • decide which v are in each interval
• Min. difference between 2 values: [formula not preserved in the transcript]
    → Unique value in each interval
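The formula for the minimal difference did not survive in the transcript. The following derivation is a standard argument and presumably what the slide intended; it uses only the fact stated above that every value is a fraction with denominator at most n.

```latex
Let $\frac{a}{b} \neq \frac{c}{d}$ be two values with integers $a, c$ and
$1 \le b, d \le n$, $n \ge 2$. Then
\[
  \left| \frac{a}{b} - \frac{c}{d} \right|
  = \frac{|ad - bc|}{bd}
  \;\ge\; \frac{1}{bd}.
\]
If $b = d$, the difference is $\frac{|a - c|}{b} \ge \frac{1}{n}$.
If $b \neq d$, then $bd \le n(n-1)$, so the difference is at least
$\frac{1}{n(n-1)} > \frac{1}{n^{2}}$.
In both cases two distinct values differ by more than $\frac{1}{n^{2}}$.
```

An interval of length 1/n² can thus contain at most one candidate value, which is why locating each vertex in such an interval determines val(v) exactly.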

32 Complexity: Comparison
                   | This Approach | Zwick / Paterson
Ergodic Partition  |               |
p-Mean Partition   |               |
[The complexity formulas in the table cells are not preserved in the transcript]

33 Summary
• The first strongly subexponential algorithm
• Algorithm for p-Mean Partition problem
    • Longest Shortest Path Problem
    • Strategy improvement
    • Improvements
• Extended to Ergodic Partition problem

34 References
[1] H. Björklund, S. Sandberg, S. Vorobyov. A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games. In Proc. 29th International Symposium on Mathematical Foundations of Computer Science (MFCS), Vol. 3153 of Lecture Notes in Computer Science, Springer-Verlag, 2004, pp. 673–685.
[2] U. Zwick, M. Paterson. The complexity of mean payoff games on graphs. Theor. Comput. Sci., 158:343–359, 1996.
[3] T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein. Introduction to Algorithms. MIT Press and McGraw-Hill Book Company, Cambridge, MA, 2nd edition, 2001.
[4] J. Matousek, M. Sharir, E. Welzl. A subexponential bound for linear programming. In 8th ACM Symp. on Computational Geometry, pages 1–8, 1992.

35 Questions / Discussion
... thank you for your attention

36 Appendix
• Proof sketches for Theorems 5.1 and 5.2

37 Proof sketch: Theorem 5.1 (attractive ⇒ profitable)
Value increases in at least one vertex:
• Attractive switch σ* = σ[v → u]:
• w(v,u) + val_σ(u) > val_σ(v) ⇒ val_σ*(v) > val_σ(v)
Values do not decrease:
• New loops are positive
• New paths to the sink are longer

38 Proof sketch: Theorem 5.1 (attractive ⇒ profitable)
New loops are positive:
• Switch σ* = σ[v → u]:
• New loop must contain switching vertex v
[Figure: v reaches t along the old path of length y; the new loop through u has total weight x]
• y = val_σ(v) < w(v,u) + val_σ(u) ≤ x + y ⇒ x > 0
    • first inequality: switch is attractive
    • second inequality: val_σ(u) ≤ x – w(v,u) + y

39 Proof sketch: Theorem 5.1 (attractive ⇒ profitable)
New paths to the sink are longer:
• Switch σ* = σ[v → u]:
• New path from any vertex n to t must contain switching vertex v
[Figure: n reaches v along a path of length a; the old path from v to t has length y, the new path through u has length x]
• y = val_σ(v) < w(v,u) + val_σ(u) ≤ x ⇒ y < x
    • first inequality: switch is attractive
    • second inequality: val_σ(u) ≤ x – w(v,u)

40 Proof sketch: Theorem 5.2 (stable ⇒ optimal)
Proof for one-player games:
• MIN has no choices
• Finite values cannot become infinite
    • no more attractive switches ⇒ no more new positive loops
• Finite values do not improve finitely
    • no more attractive switches ⇒ no more new longer paths to t
Extension to two-player games:
• MIN does not need choices

41 Proof sketch: Theorem 5.2 (stable ⇒ optimal)
No more new positive loops:
• Assumption: there is a new positive loop
[Figure: v reaches t along a path of length y; the new loop through u has total weight x]
• switch attractive ⇔ y < x + y
• no attractive switches ⇒ y ≥ x + y ⇒ x ≤ 0

42 Proof sketch: Theorem 5.2 (stable ⇒ optimal)
No more new longer paths to t:
• Assumption: there is a new longer path to t
[Figure: n reaches v along a path of length a; the old path from v to t has length y, the new path through u has length x]
• switch attractive ⇔ y < x
• no attractive switches ⇒ y ≥ x ⇒ cannot have better finite values

