Nesterov’s excessive gap technique and poker Andrew Gilpin CMU Theory Lunch Feb 28, 2007 Joint work with: Samid Hoda, Javier Peña, Troels Sørensen, Tuomas Sandholm

Outline
Two-person zero-sum sequential games
First-order methods for convex optimization
Nesterov’s excessive gap technique (EGT)
EGT for sequential games
Heuristics for EGT
Application to Texas Hold’em poker

We want to solve:
min_{x ∈ Q1} max_{y ∈ Q2} x^T A y
If Q1 and Q2 are simplices, this is the Nash equilibrium problem for two-person zero-sum matrix games.
If Q1 and Q2 are complexes, this is the Nash equilibrium problem for two-person zero-sum sequential games.

What’s a complex?
It’s just like a simplex, but more complex.
Each player’s complex encodes her set of realization plans in the game.
In particular, player 1’s complex is
Q1 = {x : Ex = e, x ≥ 0}
where E and e depend on the game.
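To make the constraint form concrete, here is a minimal sketch for a hypothetical two-level complex (the game, the sequence names a, b, c, d, and the helper functions are illustrative assumptions, not taken from the slides). Player 1 first chooses between a and b; after a she chooses between c and d, so the mass on c and d must add up to the mass on a.

```python
# Hypothetical two-level complex for player 1 (not taken from the slides).
# Sequence order in x: [a, b, c, d]; c and d are only reachable after a.
E = [
    [1, 1, 0, 0],    # x_a + x_b = 1
    [-1, 0, 1, 1],   # x_c + x_d = x_a
]
e = [1, 0]


def is_realization_plan(x, E, e, tol=1e-9):
    """Return True if x >= 0 and E x = e, up to the tolerance tol."""
    if any(xi < -tol for xi in x):
        return False
    for row, rhs in zip(E, e):
        if abs(sum(c * xi for c, xi in zip(row, x)) - rhs) > tol:
            return False
    return True


def behavioral_to_realization(p_a, p_c):
    """Turn behavioral probabilities (P[a], P[c | a]) into a realization plan."""
    return [p_a, 1 - p_a, p_a * p_c, p_a * (1 - p_c)]


x = behavioral_to_realization(0.6, 0.25)
print(x, is_realization_plan(x, E, e))
```

With a single decision point the second row disappears and the set reduces to an ordinary simplex, which is the sense in which a complex generalizes it.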

[Figure: labels A, B, C, D, E, F, G, H]

Recall our problem:
min_{x ∈ Q1} max_{y ∈ Q2} x^T A y
where Q1 and Q2 are complexes.
Since Q1 and Q2 have a linear description, this problem can be solved as an LP.
However, current LP solution methods do not scale.

(Un)scalability of LP solvers
Rhode Island poker [Shi & Littman 01]
– LP has 91 million rows and columns
– Applying the GameShrink automated abstraction algorithm yields an LP with only 1.2 million rows and columns, and 50 million non-zeros [G. & Sandholm, 06a]
– Solution requires 25 GB RAM and over a week of CPU time
Texas Hold’em poker
– ~10^18 nodes in game tree
– Lossy abstractions need to be performed
– Limitations of current solver technology are the primary obstacle to achieving expert-level strategies [G. & Sandholm 06b, 07a]
Instead of standard LP solvers, what about a first-order method?

Convex optimization
Suppose we want to solve
min_{x ∈ R^n} f(x)
where f is convex.
For general f, convergence requires O(1/ε^2) iterations (e.g., for subgradient methods).
For smooth, strongly convex f with Lipschitz-continuous gradient, it can be done in O(1/ε^(1/2)) iterations.
Note that this formulation captures ALL convex optimization problems (the feasible space can be modeled using an indicator function).
The analysis is based on the black-box oracle access model. Can we do better by looking inside the box?
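A quick back-of-the-envelope comparison shows how far apart these rates are. The sketch below ignores all constants and only compares orders of growth; the function name and labels are illustrative, and the O(1/ε) row is included on the assumption that the duality gap of the excessive gap technique decays like 1/N.

```python
import math


def iteration_orders(inv_eps):
    """Iteration counts up to constants for target accuracy eps = 1 / inv_eps."""
    return {
        "subgradient, O(1/eps^2)": inv_eps ** 2,
        "excessive gap, O(1/eps)": inv_eps,
        "smooth optimal, O(1/sqrt(eps))": math.ceil(math.sqrt(inv_eps)),
    }


for inv_eps in (100, 10_000):
    print(inv_eps, iteration_orders(inv_eps))
```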

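Strong convexity
A function d is strongly convex on Q if there exists σ > 0 such that
d(αx + (1 − α)y) ≤ α d(x) + (1 − α) d(y) − (σ/2) α(1 − α) ‖x − y‖^2
for all α ∈ [0, 1] and all x, y ∈ Q.
σ is the strong convexity parameter of d.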

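Recall our problem:
min_{x ∈ Q1} max_{y ∈ Q2} x^T A y
where Q1 and Q2 are complexes.
Equivalently:
min_{x ∈ Q1} Φ(x) = max_{y ∈ Q2} f(y)
where Φ(x) = max_{y ∈ Q2} x^T A y and f(y) = min_{x ∈ Q1} x^T A y.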

Unfortunately, Φ and f are non-smooth.
Fortunately, they have a special structure.
Let d1, d2 be smooth and strongly convex on Q1, Q2. These are called prox-functions.
Now let μ > 0 and consider:
Φ_μ(x) = max_{y ∈ Q2} {x^T A y − μ d2(y)}
f_μ(y) = min_{x ∈ Q1} {x^T A y + μ d1(x)}
These are well-defined smooth functions.
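The effect of smoothing is easiest to see in the simplest case. The sketch below assumes the feasible set is a simplex and the prox function is the entropy d(y) = ln n + Σ y_i ln y_i (both assumptions made for illustration). Under those assumptions the smoothed maximum of a linear function c·y has the closed form μ ln((1/n) Σ exp(c_i/μ)), which lies between max(c) − μ ln n and max(c) and approaches max(c) as μ shrinks.

```python
import math


def smoothed_max(c, mu):
    """max over the simplex of c.y - mu * d(y), with d(y) = ln n + sum_i y_i ln y_i.

    Closed form: mu * ln((1/n) * sum_i exp(c_i / mu)), evaluated in a
    numerically stable way by factoring out the largest entry of c.
    """
    n = len(c)
    m = max(c)
    s = sum(math.exp((ci - m) / mu) for ci in c)
    return m + mu * math.log(s / n)


c = [1.0, 3.0, 2.0]
for mu in (1.0, 0.1, 0.01):
    print(mu, smoothed_max(c, mu), max(c))
```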

Excessive gap condition
From weak duality, we have that f(y) ≤ Φ(x).
The excessive gap condition requires that the smoothed functions satisfy the reverse inequality:
f_μ(y) ≥ Φ_μ(x)   (EGC)
The algorithm maintains (EGC) and gradually decreases μ.
As μ decreases, the smoothed functions approach the non-smooth functions, and thus iterates satisfying (EGC) converge to optimal solutions.
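Why maintaining the condition forces convergence can be written out in a few lines. The derivation below is a sketch under stated assumptions: the smoothed functions are taken to be f_{μ1}(y) = min over Q1 of x^T A y + μ1 d1(x) and Φ_{μ2}(x) = max over Q2 of x^T A y − μ2 d2(y), each prox function has minimum 0 and maximum D_i over its set, and the excessive gap condition is taken in the form f_{μ1}(y) ≥ Φ_{μ2}(x). The separate parameters μ1, μ2 and the constants D1, D2 are notation introduced here.

```latex
% Assumptions: \min_{Q_i} d_i = 0, \quad D_i = \max_{z \in Q_i} d_i(z),
% f_{\mu_1}(y) = \min_{x \in Q_1} \{ x^T A y + \mu_1 d_1(x) \},
% \Phi_{\mu_2}(x) = \max_{y \in Q_2} \{ x^T A y - \mu_2 d_2(y) \}.
\begin{align*}
f_{\mu_1}(y) &\le f(y) + \mu_1 D_1, \\
\Phi_{\mu_2}(x) &\ge \Phi(x) - \mu_2 D_2, \\
0 \le \Phi(x) - f(y)
  &\le \bigl(\Phi_{\mu_2}(x) - f_{\mu_1}(y)\bigr) + \mu_1 D_1 + \mu_2 D_2 \\
  &\le \mu_1 D_1 + \mu_2 D_2 .
\end{align*}
```

The last step uses the excessive gap condition, so the true duality gap is at most μ1 D1 + μ2 D2 and goes to zero as both parameters are driven down.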

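Nesterov’s main theorem
Theorem [Nesterov 05] There exists an algorithm such that after at most N iterations, the iterates have duality gap at most
(4 ‖A‖ / (N + 1)) · sqrt(D1 D2 / (σ1 σ2))
where D_i is the maximum of d_i over Q_i and σ_i is the strong convexity parameter of d_i.
Furthermore, each iteration only requires solving three problems of the form
max_{x ∈ Q} {g^T x − d(x)}
and performing three matrix-vector product operations on A.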

Nice prox functions
A prox function d for Q is nice if:
1. It is strongly convex and continuous everywhere in Q, and differentiable in the relative interior of Q
2. The min of d over Q is 0
3. The following maps are easily computable:
smax(g) = max_{x ∈ Q} {g^T x − d(x)}
sargmax(g) = argmax_{x ∈ Q} {g^T x − d(x)}

Nice simplex prox function 1: Entropy
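As an illustration of why the entropy function is convenient, the sketch below assumes the usual form d(x) = ln n + Σ x_i ln x_i on the n-dimensional simplex (the formula and the function names are assumptions made here). With that choice the maximizer of g·x − d(x) is the softmax of g, and the maximum value is ln((1/n) Σ exp(g_i)), so both maps cost O(n).

```python
import math


def entropy_prox(x):
    """d(x) = ln n + sum_i x_i ln x_i, with the convention 0 ln 0 = 0."""
    return math.log(len(x)) + sum(xi * math.log(xi) for xi in x if xi > 0)


def entropy_sargmax(g):
    """argmax over the simplex of g.x - d(x): the softmax of g."""
    m = max(g)
    w = [math.exp(gi - m) for gi in g]
    total = sum(w)
    return [wi / total for wi in w]


def entropy_smax(g):
    """max over the simplex of g.x - d(x): ln((1/n) * sum_i exp(g_i))."""
    m = max(g)
    return m + math.log(sum(math.exp(gi - m) for gi in g) / len(g))


g = [0.0, 1.0, 2.0]
x = entropy_sargmax(g)
objective = sum(gi * xi for gi, xi in zip(g, x)) - entropy_prox(x)
print(x, objective, entropy_smax(g))
```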

Nice simplex prox function 2: Euclidean
sargmax can be computed in O(n log n) time
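One way to reach the O(n log n) bound is the standard sort-and-threshold projection onto the simplex. The sketch below assumes the Euclidean prox function has the form d(x) = (1/2) Σ (x_i − 1/n)^2 (an assumption made here); maximizing g·x − d(x) over the simplex is then the same as projecting g onto the simplex, because shifting every coordinate by the same constant does not change the projection.

```python
def euclidean_sargmax(g):
    """argmax over the simplex of g.x - 0.5 * sum_i (x_i - 1/n)^2.

    Implemented as the Euclidean projection of g onto the simplex.
    The sort dominates the running time, giving O(n log n).
    """
    u = sorted(g, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for k, uk in enumerate(u, start=1):
        cumulative += uk
        t = (cumulative - 1.0) / k
        if uk - t > 0:
            tau = t  # the last k that passes this test fixes the threshold
    return [max(gi - tau, 0.0) for gi in g]


print(euclidean_sargmax([0.5, 0.2, -1.0]))
print(euclidean_sargmax([0.0, 0.0, 0.0]))
```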

From the simplex to the complex
Theorem [Hoda, G., Peña 06] A nice prox function can be constructed for the complex via a recursive application of any nice prox function for the simplex.

Prox function example Let be any nice simplex prox function. The prox function for this matrix is:
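The recursive idea can be sketched on a small case. The example below does not use the matrix from the slide; it assumes a hypothetical two-level complex {x_a + x_b = 1, x_c + x_d = x_a, x ≥ 0} and one plausible form of the construction: apply the simplex prox function at the root, then add the simplex prox function of the child decision, evaluated on the rescaled child variables and weighted by the parent variable x_a. Whether this matches the slide's formula term by term is an assumption.

```python
import math


def entropy_prox(p):
    """Entropy prox function on the simplex: ln n + sum_i p_i ln p_i (0 ln 0 = 0)."""
    return math.log(len(p)) + sum(pi * math.log(pi) for pi in p if pi > 0)


def complex_prox(x, simplex_prox=entropy_prox):
    """Prox value on the hypothetical complex {x_a + x_b = 1, x_c + x_d = x_a, x >= 0}.

    The root simplex contributes simplex_prox([x_a, x_b]). The child simplex
    is rescaled by its parent x_a so that it sums to one, and its prox value
    is weighted by x_a.
    """
    x_a, x_b, x_c, x_d = x
    value = simplex_prox([x_a, x_b])
    if x_a > 0:
        value += x_a * simplex_prox([x_c / x_a, x_d / x_a])
    return value


print(complex_prox([0.5, 0.5, 0.25, 0.25]))  # uniform play at both decisions
print(complex_prox([1.0, 0.0, 1.0, 0.0]))    # a pure strategy
```

Any other nice simplex prox function can be passed as simplex_prox, which is the point of building the complex version recursively.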

Solving

(similar to b(i-vii))

Heuristics [G., Hoda, Peña, Sandholm 07]
Heuristic 1: Aggressive μ reduction
– The μ given in the previous algorithm is a conservative choice guaranteeing convergence
– In practice, we can do much better by aggressively pushing μ, while checking that the excessive gap condition is satisfied
Heuristic 2: Balanced μ reduction
– To prevent one μ from dominating the other, we also perform periodic adjustments to keep them within a small factor of one another
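The "push, then check" logic of the first heuristic can be illustrated on a tiny matrix game. This is only a sketch under several assumptions: both strategy sets are simplices with the entropy prox function, the excessive gap condition is taken as f_μ1(y) ≥ Φ_μ2(x), a single shared μ is used for brevity, and the iterates x, y are held fixed to isolate the check (in EGT itself the iterates move together with μ). The function names, the matrix, and the halving factor are all hypothetical.

```python
import math


def f_smooth(A, y, mu1):
    """min over the simplex of x.(A y) + mu1 * d(x), with entropy d."""
    Ay = [sum(a * yj for a, yj in zip(row, y)) for row in A]
    lo = min(Ay)
    s = sum(math.exp(-(v - lo) / mu1) for v in Ay)
    return lo - mu1 * math.log(s / len(Ay))


def phi_smooth(A, x, mu2):
    """max over the simplex of (A^T x).y - mu2 * d(y), with entropy d."""
    ATx = [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]
    hi = max(ATx)
    s = sum(math.exp((v - hi) / mu2) for v in ATx)
    return hi + mu2 * math.log(s / len(ATx))


def excessive_gap_holds(A, x, y, mu1, mu2):
    """Check the condition f_mu1(y) >= Phi_mu2(x)."""
    return f_smooth(A, y, mu1) >= phi_smooth(A, x, mu2)


def shrink_mu(A, x, y, mu, factor=0.5, max_steps=50):
    """Shrink a shared mu geometrically while the condition still holds."""
    for _ in range(max_steps):
        candidate = mu * factor
        if not excessive_gap_holds(A, x, y, candidate, candidate):
            break  # shrinking further would violate the condition
        mu = candidate
    return mu


A = [[2.0, -1.0], [-1.0, 1.0]]
x, y = [0.0, 1.0], [1.0, 0.0]
print(shrink_mu(A, x, y, 64.0))
```

The second heuristic would sit on top of this with two separate parameters, rescaling whichever one has drifted too far from the other.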

Matrix-vector multiplication in poker [G., Hoda, Peña, Sandholm 07]
The main time and space bottleneck of the algorithm is the matrix-vector product on A.
Instead of storing the entire matrix, we can represent it as a composition of Kronecker products.
We can also effectively take advantage of parallelization in the matrix-vector product to achieve near-linear speedup.
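The saving from a Kronecker representation comes from a standard identity: (B ⊗ C) vec(V) = vec(B V C^T) for row-major vec, so the product can be computed from the small factors alone. The sketch below demonstrates only that identity on arbitrary small matrices; the actual decomposition of the poker payoff matrix is not shown here, and all names are illustrative.

```python
def matvec(M, v):
    """Dense matrix-vector product with plain lists."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def kron_matvec(B, C, v):
    """Compute (B kron C) v without forming the Kronecker product.

    B is p x q, C is r x s, and v has length q * s, read as a row-major
    q x s matrix V. Uses (B kron C) vec(V) = vec(B V C^T).
    """
    q, s, r = len(B[0]), len(C[0]), len(C)
    V = [v[j * s:(j + 1) * s] for j in range(q)]
    VCt = [matvec(C, row) for row in V]  # q x r matrix V C^T
    out = []
    for b_row in B:  # one row of B V C^T per row of B
        out.extend(sum(b * VCt[j][k] for j, b in enumerate(b_row)) for k in range(r))
    return out


def kron(B, C):
    """Explicit Kronecker product, used here only to check the result."""
    return [[b * c for b in b_row for c in c_row] for b_row in B for c_row in C]


B = [[1, 2], [3, 4]]
C = [[0, 1, 2], [1, 0, 1]]
v = [1, 2, 3, 4, 5, 6]
print(kron_matvec(B, C, v), matvec(kron(B, C), v))
```

The explicit product stores p·r·q·s entries, while the factored form stores only p·q + r·s, which is where the memory saving comes from.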

Memory usage comparison
Instance   CPLEX IPM   CPLEX Simplex   EGT
10k        0.082 GB    >0.051 GB       0.012 GB
160k       2.25 GB     >0.664 GB       0.035 GB
RI         25.2 GB     >3.45 GB        0.15 GB
Texas      >458 GB                     2.49 GB

Poker
Poker is a recognized challenge problem in AI because (among other reasons)
– the other players’ cards are hidden;
– bluffing and other deceptive strategies are needed in a good player;
– there is uncertainty about future events.
Texas Hold’em: most popular variant of poker
Two-player game tree has ~10^18 nodes

Potential-aware automated abstraction [G., Sandholm, Sørensen 07]
Most prior automated abstraction algorithms employ a myopic expected value computation as a similarity metric
– This ignores hands like flush draws where, although the probability of winning is small, the payoff could be high
Our newest algorithm considers higher-dimensional spaces consisting of histograms over abstracted classes of states from later stages of the game
This enables our bottom-up abstraction algorithm to automatically take into account positive and negative potential

Solving the four-round model
Computed abstraction with
– 20 first-round buckets
– 800 second-round buckets
– 4800 third-round buckets
– 28800 fourth-round buckets
Algorithm using 30 GB RAM
– Simply representing as an LP requires 32 TB
– Outputs new, improved solution every 2.5 days

[G., Sandholm, Sørensen 07]

Future research
Customizing second-order methods (e.g., interior-point methods) for the equilibrium problem
Additional heuristics for improving practical performance of the EGT algorithm
Techniques for finding an optimal solution from an ε-solution

Thank you ☺
