Published by Georgina Cook. Modified over 9 years ago.
2
Uri Zwick – Tel Aviv Univ. Randomized pivoting rules for the simplex algorithm: Lower bounds. MDS summer school “The Combinatorics of Linear and Semidefinite Programming”, August 14-16, 2012
3
Deterministic pivoting rules
Largest improvement
Largest slope
Dantzig’s rule – Largest modified cost
Bland’s rule – avoids cycling
Lexicographic rule – also avoids cycling
All known to require an exponential number of steps in the worst case: Klee-Minty (1972), Jeroslow (1973), Avis-Chvátal (1978), Goldfarb-Sit (1979), …, Amenta-Ziegler (1996)
4
Klee-Minty cubes (1972). [Figure: taken from a paper by Gärtner-Henk-Ziegler]
5
Randomized pivoting rules
Random-Edge: Choose a random improving edge.
Random-Facet: Described in previous lecture ☺ [Kalai (1992)] [Matoušek-Sharir-Welzl (1996)]
Random-Facet is sub-exponential!
Are Random-Edge and Random-Facet polynomial ???
6
Abstract objective functions (AOFs). Acyclic Unique Sink Orientations (AUSOs): every face should have a unique sink.
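The unique-sink condition can be checked by brute force on small cubes. The sketch below is only an illustration: the `out` encoding (a bitmask of outgoing coordinates per vertex) and the function name `is_auso` are assumptions introduced here, not notation from the slides.

```python
def is_auso(n, out):
    """Check whether `out` describes an acyclic unique sink orientation of the n-cube.

    Vertices are the integers 0 .. 2**n - 1 (bit i = coordinate i).
    out[v] is a bitmask: bit i is set when the edge {v, v ^ (1 << i)} leaves v.
    (Hypothetical encoding, chosen for this sketch.)
    """
    size = 1 << n

    # Every edge must be oriented exactly one way.
    for v in range(size):
        for i in range(n):
            w = v ^ (1 << i)
            if ((out[v] >> i) & 1) == ((out[w] >> i) & 1):
                return False

    # Every face (free coordinates `free`, fixed part `base`) needs exactly one sink.
    for free in range(size):
        for base in range(size):
            if base & free:
                continue
            sinks = 0
            sub = free
            while True:  # enumerate all submasks of `free`
                if out[base | sub] & free == 0:
                    sinks += 1
                if sub == 0:
                    break
                sub = (sub - 1) & free
            if sinks != 1:
                return False

    # Acyclicity: peel off sources one at a time (Kahn's algorithm).
    indegree = [n - bin(out[v]).count("1") for v in range(size)]
    stack = [v for v in range(size) if indegree[v] == 0]
    removed = 0
    while stack:
        v = stack.pop()
        removed += 1
        for i in range(n):
            if (out[v] >> i) & 1:
                w = v ^ (1 << i)
                indegree[w] -= 1
                if indegree[w] == 0:
                    stack.append(w)
    return removed == size


# Orientation induced by "fewer 1-bits is better": edges leave v along its 1-bits.
decreasing_popcount = list(range(8))
# A directed 4-cycle on the 2-cube: 0 -> 1 -> 3 -> 2 -> 0 (no sink at all).
directed_square = [1, 2, 2, 1]
```

The check is exponential in n (it visits all 3^n faces), so it is only meant for experimenting with small examples.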
7
AUSOs of n-cubes: 2n facets, 2^n vertices. USOs and AUSOs: Stickney, Watson (1978); Morris (2001); Szabó, Welzl (2001); Gärtner (2002). The directed diameter is exactly n. Exercise: Prove it.
8
AUSO results
Random-Facet is sub-exponential [Kalai (1992)] [Matoušek-Sharir-Welzl (1996)]
Sub-exponential lower bound for Random-Facet [Matoušek (1994)]
Sub-exponential lower bound for Random-Edge [Matoušek-Szabó (2006)]
Lower bounds do not correspond to actual linear programs. Can geometry help?
9
Random-Edge and Random-Facet are not polynomial for LPs. Consider LPs that correspond to Markov Decision Processes (MDPs): Simplex ⇔ Policy iteration. Obtain sub-exponential lower bounds for the Random-Edge and Random-Facet variants of the Policy Iteration algorithm for MDPs.
10
Randomized Pivoting Rules. [Table: Algorithm | Lower bound | Upper bound; rows RANDOM EDGE, RANDOM FACET; cell formulas missing.] Citations shown: [Kalai ’92] [Matoušek-Sharir-Welzl ’92] [Friedmann-Hansen-Z ’11]. Lower bounds obtained for LPs whose diameter is n.
11
3-bit counter
12
Turn-based 2-Player Stochastic Games [Shapley ’53] [Gillette ’57] … [Condon ’92]. Versions: total reward, discounted, limiting average. Both players have optimal positional strategies. Can optimal strategies be found in polynomial time?
13
Stopping condition. For the total reward version assume: No matter what the players do, the game stops with probability 1. Exercise: Show that discounted games correspond directly to stopping total reward games.
14
Strategies / Policies
A deterministic strategy specifies which action to take given every possible history.
A mixed strategy is a probability distribution over deterministic strategies.
A memoryless strategy is a strategy that depends only on the current state.
A positional strategy is a deterministic memoryless strategy.
15
Values. Both players have positional optimal strategies. [Equation with terms labelled positional / general.] There are positional strategies that are optimal for every starting position.
16
Markov Decision Processes [Shapley ’53] [Bellman ’57] [Howard ’60] … Versions: total reward, discounted, limiting average. Optimal positional policies can be found using LP. Is there a strongly polynomial time algorithm?
17
Stochastic shortest paths (SSPs): Minimize the expected cost of getting to the target.
18
Turn-based non-Stochastic Games [Ehrenfeucht-Mycielski (1979)]. Versions: total reward, discounted, limiting average. Both players have optimal positional strategies. Still no polynomial time algorithms known! Easy
19
Turn-based Stochastic Games (SGs): long-term planning in a stochastic and adversarial environment (2½-players)
Non-Stochastic Games (MPGs): adversarial, non-stochastic (2-players)
Markov Decision Processes (MDPs): non-adversarial, stochastic (1½-players)
Deterministic MDPs (DMDPs): non-stochastic, non-adversarial (1-player)
20
Parity Games (PGs): a simple example. [Figure: game graph with vertex priorities.] EVEN wins if the largest priority seen infinitely often is even.
21
Parity Games (PGs). [Figure: an EVEN vertex with priority 3 and an ODD vertex with priority 8.] EVEN wins if the largest priority seen infinitely often is even. Equivalent to many interesting problems in automata and verification: non-emptiness of ω-tree automata; modal μ-calculus model checking.
22
Parity Games (PGs) → Mean Payoff Games (MPGs). [Figure: an EVEN vertex with priority 3 and an ODD vertex with priority 8.] Replace priority k by payoff (−n)^k. Move payoffs to outgoing edges. [Stirling (1993)] [Puri (1995)]
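A minimal sketch of this reduction, assuming n is the number of vertices and that each vertex's payoff is copied onto all of its outgoing edges. The function name and the dictionary-based graph encoding are hypothetical, chosen only for this illustration.

```python
def parity_to_mean_payoff(priority, edges):
    """Turn vertex priorities into edge payoffs (-n)**k.

    priority: dict vertex -> priority k (a non-negative integer)
    edges:    iterable of (u, v) pairs
    Returns a dict (u, v) -> payoff, where the payoff of u is moved
    onto every edge leaving u. n is taken to be the number of vertices
    (an assumption of this sketch).
    """
    n = len(priority)
    return {(u, v): (-n) ** priority[u] for (u, v) in edges}


# Two vertices on a cycle; the largest priority (2) is even, so EVEN wins
# the parity game, and the cycle's total payoff should come out positive.
example_priority = {"a": 2, "b": 1}
example_edges = [("a", "b"), ("b", "a")]
example_payoffs = parity_to_mean_payoff(example_priority, example_edges)
```

With base −n, an even priority contributes a positive payoff and an odd one a negative payoff, and the largest priority on a cycle dominates the sum of the smaller ones.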
23
Let’s focus on MDPs
24
Evaluating a policy. MDP + policy ⇒ Markov chain. Values of a fixed policy can be found by solving a system of linear equations.
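As an illustration, here is a small sketch of policy evaluation for the discounted version, where the values satisfy v = r + γ P v. The discounted setting, the exact arithmetic with `Fraction`, and all names below are assumptions made for this example.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination (a square and nonsingular)."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [x / scale for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def evaluate_policy(policy, gamma):
    """Values of a fixed policy in a discounted MDP.

    policy[s] = (reward, {next_state: probability}) is the action used in state s.
    Solves (I - gamma * P) v = r for the induced Markov chain.
    """
    n = len(policy)
    a = [[Fraction(int(s == t)) - gamma * Fraction(policy[s][1].get(t, 0))
          for t in range(n)] for s in range(n)]
    b = [Fraction(policy[s][0]) for s in range(n)]
    return solve_linear(a, b)


# State 0 earns 1 and moves to state 1; state 1 loops forever with reward 0.
chain = [(1, {1: 1}), (0, {1: 1})]
values = evaluate_policy(chain, Fraction(1, 2))
```

Exact rational arithmetic keeps the example free of rounding issues; for larger instances a numerical linear solver would be the usual choice.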
25
Improving a policy (using a single switch)
27
Policy iteration for MDPs [Howard ’60]
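A hedged sketch of the Switch-All variant of policy iteration, assuming a discounted MDP in which rewards are maximized. The data layout (`actions[s]` as a list of `(reward, distribution)` pairs) and the function names are illustrative assumptions, not the slides' notation.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination (a square and nonsingular)."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [x / scale for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def evaluate(actions, policy, gamma):
    """Values of `policy` (one action index per state) via (I - gamma P) v = r."""
    n = len(actions)
    chosen = [actions[s][policy[s]] for s in range(n)]
    a = [[Fraction(int(s == t)) - gamma * Fraction(chosen[s][1].get(t, 0))
          for t in range(n)] for s in range(n)]
    return solve_linear(a, [Fraction(c[0]) for c in chosen])


def action_value(action, v, gamma):
    """One-step lookahead value of an action under the current values v."""
    reward, dist = action
    return reward + gamma * sum(Fraction(p) * v[t] for t, p in dist.items())


def policy_iteration(actions, gamma):
    """Switch-All: in every state, switch to the best improving action."""
    policy = [0] * len(actions)
    while True:
        v = evaluate(actions, policy, gamma)
        improved = False
        for s, available in enumerate(actions):
            best = max(range(len(available)),
                       key=lambda i: action_value(available[i], v, gamma))
            # An improving switch strictly increases the value of state s.
            if action_value(available[best], v, gamma) > v[s]:
                policy[s] = best
                improved = True
        if not improved:
            return policy, v


# State 0: stay (reward 1) or move to state 1 (reward 0); state 1: loop with reward 3.
toy_mdp = [[(1, {0: 1}), (0, {1: 1})], [(3, {1: 1})]]
best_policy, best_values = policy_iteration(toy_mdp, Fraction(1, 2))
```

Restricting the inner loop to a single improving switch per round would give a single-switch variant; which switch is chosen is exactly what distinguishes the pivoting rules.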
28
Dual LP formulation for MDPs
29
Basic solution ⇔ (positional) policy. a is not an improving switch.
30
Primal LP formulation for MDPs. Vertex ⇔ complement of a policy.
31
TB2SG ∈ NP ∩ co-NP. TB2SG ∈ P ???
32
Policy iteration variants
33
Random-Facet for MDPs
Choose a random action not in the current policy and ignore it.
Solve recursively without this action.
If the ignored action is not an improving switch with respect to the returned policy, we are done.
Otherwise, switch to the ignored action and solve recursively.
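The recursion described here can be sketched as follows, again assuming a discounted MDP with maximized rewards. The encoding (a global list of actions `(state, reward, distribution)`, policies as dicts from state to action id) is hypothetical, and the second recursive call is made with the ignored action available again, which is one common way to state the rule.

```python
import random
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination (a square and nonsingular)."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [x / scale for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def evaluate(mdp, policy, gamma):
    """Values of `policy` (dict state -> action id) in the discounted MDP."""
    n = len(policy)
    a, b = [], []
    for s in range(n):
        _, reward, dist = mdp[policy[s]]
        a.append([Fraction(int(s == t)) - gamma * Fraction(dist.get(t, 0))
                  for t in range(n)])
        b.append(Fraction(reward))
    return solve_linear(a, b)


def random_facet(mdp, gamma, available, policy, rng):
    """Random-Facet: `available` is a set of action ids containing the policy's actions."""
    unused = sorted(available - set(policy.values()))
    if not unused:
        return policy  # the policy is the only one left, hence optimal here
    ignored = rng.choice(unused)
    # Solve recursively without the ignored action.
    policy = random_facet(mdp, gamma, available - {ignored}, policy, rng)
    v = evaluate(mdp, policy, gamma)
    state, reward, dist = mdp[ignored]
    lookahead = reward + gamma * sum(Fraction(p) * v[t] for t, p in dist.items())
    if lookahead <= v[state]:
        return policy  # the ignored action is not an improving switch
    switched = dict(policy)
    switched[state] = ignored
    return random_facet(mdp, gamma, available, switched, rng)


# Actions: 0 = stay in state 0 (reward 1), 1 = move 0 -> 1 (reward 0), 2 = loop in state 1 (reward 3).
toy_actions = [(0, 1, {0: 1}), (0, 0, {1: 1}), (1, 3, {1: 1})]
result = random_facet(toy_actions, Fraction(1, 2), {0, 1, 2}, {0: 0, 1: 2}, random.Random(0))
```

The second call terminates because the switched policy is strictly better than anything achievable without the ignored action, so the recursion never returns to an earlier policy.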
34
Policy iteration for 2-player games. Keep a strategy of player 1 and an optimal counter-strategy of player 2. Perform improving switches for player 1 and recompute an optimal counter-strategy for player 2. Exercise: Does it really work? Random-Facet yields a sub-exponential algorithm for turn-based 2-player stochastic games!
35
Lower bounds for Policy Iteration
Switch-All for Parity Games is exponential [Friedmann ’09]
Switch-All for MDPs is exponential [Fearnley ’10]
Sub-exponential lower bound for Random-Facet for Parity Games [Friedmann-Hansen-Z ’11]
Sub-exponential lower bounds for Random-Facet and Random-Edge for MDPs, and hence for LPs [FHZ ’11]
36
Lower bound for Random-Facet: implement a randomized counter.
37
Lower bound for Random-Facet: implement a randomized counter. Lower bound for Random-Edge: implement a standard counter.
38
Upper bounds for Policy Iteration
Dantzig’s pivoting rule, and the standard policy iteration algorithm, Switch-All, are polynomial for discounted MDPs, with a fixed discount factor [Ye ’10]
Switch-All is almost linear for discounted MDPs and discounted turn-based 2-player Stochastic Games, with a fixed discount factor [Hansen-Miltersen-Z ’11]
39
Deterministic Algorithms. [Table: Algorithm | Discounted | Non-discounted; rows SWITCH BEST, SWITCH ALL; cell contents missing.] Citations shown: [Ye ’10] [Hansen-Miltersen-Z ’11] [Friedmann ’09] [Fearnley ’10] [Condon ’93]
40
3-bit counter [Figure label: (−N)^15]
41
3-bit counter 010
42
3-bit counter (010) – Improving switches. Random-Edge can choose either one of these improving switches…
43
Cycle gadgets. Cycles close one edge at a time. Shorter cycles close faster.
44
Cycle gadgets. Cycles open “simultaneously”.
45
3-bit counter [Figure labels: 2 3 010 1]
46
From b to b+1 in seven phases
1. B_k-cycle closes
2. C_k-cycle closes
3. U-lane realigns
4. A_i-cycles and B_i-cycles for i<k open
5. A_k-cycle closes
6. W-lane realigns
7. C_i-cycles of 0-bits open
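At the level of the counter alone, one step from b to b+1 sets the lowest 0-bit k and resets all bits i<k. The sketch below shows only this abstract behaviour, which the phases are meant to simulate; it does not model the cycles or lanes, and the least-significant-bit-first list encoding is an assumption of the example.

```python
def increment(bits):
    """One step of a binary counter.

    bits[0] is the least significant bit. Returns (new_bits, k), where k is
    the index of the lowest 0-bit: bit k is set ("closes") and all bits
    below k are reset ("open"). Raises ValueError if the counter is full.
    """
    k = bits.index(0)
    return [0] * k + [1] + bits[k + 1:], k


# The state written 010 (most significant bit first) is [0, 1, 0] here.
after_010, k_010 = increment([0, 1, 0])
after_011, k_011 = increment(after_010)
```

Counting through all 2^n states of an n-bit counter is what forces the exponentially many improving switches in the standard-counter construction.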
47
3-bit counter [Figure labels: 3 4 01 1]
48
Size of cycles
Various cycles and lanes compete with each other. Some are trying to open while some are trying to close. We need to make sure that our candidates win!
Length of all A-cycles = 8n
Length of all C-cycles = 22n
Length of B_i-cycles = 25 i^2 n
O(n^4) vertices for an n-bit counter
Can be improved using a more complicated construction and an improved analysis (work in progress)
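A rough count that is consistent with the stated vertex bound, assuming one cycle of each type per bit i = 1, …, n (the per-bit count is an assumption of this sketch; the slide only lists the lengths):

```latex
\underbrace{n \cdot 8n}_{A\text{-cycles}}
\;+\; \underbrace{n \cdot 22n}_{C\text{-cycles}}
\;+\; \underbrace{\sum_{i=1}^{n} 25\, i^{2} n}_{B_i\text{-cycles}}
\;=\; 30 n^{2} + 25 n \cdot \frac{n(n+1)(2n+1)}{6}
\;=\; O(n^{4})
```

Under this assumption the B_i-cycles dominate, since their total length grows like (25/3) n^4.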
49
Related results
Sub-exponential lower bound for Zadeh’s pivoting rule [Friedmann ’10]
Dantzig’s pivoting rule, and the standard policy iteration algorithm, Switch-All, are polynomial for discounted MDPs, with a fixed discount factor [Ye ’10]
Switch-All is almost linear for discounted MDPs and discounted turn-based 2-player Stochastic Games, with a fixed discount factor [Hansen-Miltersen-Z ’11]
50
Concluding remarks and open problems
A “game-theoretic” perspective helps to understand the behavior of randomized pivoting rules.
Polynomial pivoting rule?
Polynomial bound on diameter?
Strongly polynomial algorithms for MDPs?
Polynomial algorithms for 2-player games?
51
THE END
52
Which AUSOs can result from MDPs / PGs / MPGs / SPGs ??? [Figure: nested classes labelled AUSOs, 2½-player games, 2-player games, 1½-player games, 1-player games, LP on cubes, Parity games.] Are all containments strict?
53
Hard Parity Games for Random Facet [Friedmann-Hansen-Z ’10]