Computing equilibria in extensive form games Andrew Gilpin Advanced AI – April 7, 2005.


1 Computing equilibria in extensive form games Andrew Gilpin Advanced AI – April 7, 2005

2 This talk
Extensive form games
 – Representation
 – Computing equilibrium
Poker AI
 – History of poker research
 – Current research

3 Extensive form representation
1. $I = \{0, 1, \dots, n\}$ – players
2. $(V, E)$, terminals $Z$ – tree
3. $P: V \setminus Z \to H$ – controlling player
4. $H = \{H_0, \dots, H_n\}$ – information sets
5. $A = \{A_0, \dots, A_n\}$ – actions
6. $u: Z \to \mathbb{R}^n$ – payoffs
7. $p$ – chance probabilities
Perfect recall assumption: players never forget information.
Game from: Bernhard von Stengel. Efficient Computation of Behavior Strategies. Games and Economic Behavior 14:220-246, 1996.

4 Computing equilibria via normal form
The normal form is exponential in the size of the game tree, both in the worst case and in practice (e.g. poker).

5 Sequence form
Instead of a move for every information set, consider the choices necessary to reach each information set and each leaf.
These choices are sequences, and they constitute the pure strategies in the sequence form.
$S_1 = \{\{\}, l, r, L, R\}$, $S_2 = \{\{\}, c, d\}$, where $\{\}$ denotes the empty sequence.
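A minimal sketch of the size comparison behind the sequence form, assuming a hypothetical encoding of a player's information sets as a dict from an information-set name to its list of actions (the names `h1`, `h2` and both function names are illustrative, not from the slides). A normal-form pure strategy picks one action at every information set, so the count is a product; the sequence form has the empty sequence plus one sequence per action, so the count is a sum. The sum relies on perfect recall, under which each action at an information set extends exactly one sequence.

```python
from math import prod

def normal_form_size(info_sets):
    """Number of pure strategies: one action chosen at every information set."""
    return prod(len(actions) for actions in info_sets.values())

def sequence_form_size(info_sets):
    """Number of sequences: the empty sequence plus one per action."""
    return 1 + sum(len(actions) for actions in info_sets.values())

# Hypothetical encoding of player 1 from the slide: two information sets.
player1 = {"h1": ["l", "r"], "h2": ["L", "R"]}

# A larger hypothetical player with 20 binary information sets.
big = {f"h{i}": ["a", "b"] for i in range(20)}
```

For `player1` the sequence count is 5, matching the five elements of S_1; for `big` the normal form has 2^20 pure strategies while the sequence form has 41 sequences.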

6 Realization plans
Players' strategies are specified as realization plans over sequences.
Prop. Realization plans are equivalent to behavior strategies.
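For reference, the constraints that define a realization plan in the standard sequence-form construction can be written as follows. The notation is an assumption and does not appear in the slides: $\sigma_h$ is the unique sequence of the player's own moves leading to information set $h$, $A_h$ is the set of actions at $h$, and $\beta$ is the behavior strategy.

```latex
\begin{align*}
  x(\emptyset) &= 1, \\
  \sum_{a \in A_h} x(\sigma_h a) &= x(\sigma_h)
      && \text{for every information set } h \text{ of the player}, \\
  x(\sigma) &\ge 0 && \text{for every sequence } \sigma, \\
  \beta(h, a) &= \frac{x(\sigma_h a)}{x(\sigma_h)}
      && \text{whenever } x(\sigma_h) > 0 .
\end{align*}
```

The last line is the direction of the equivalence that recovers a behavior strategy from a realization plan; the other direction multiplies the behavior probabilities along a sequence.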

7 Computing equilibria via sequence form
Players 1 and 2 have realization plans x and y.
Realization constraint matrices E and F specify the constraints on realizations.
[Figure: matrix E with columns {}, l, r, L, R and rows {}, v, v'; matrix F with columns {}, c, d and rows {}, u]

8 Computing equilibria via sequence form
Payoffs for players 1 and 2 are $x^T A y$ and $x^T B y$, for suitable matrices A and B.
Creating the payoff matrix:
 – Initialize each entry to 0
 – For each leaf, there is a (unique) pair of sequences corresponding to an entry in the payoff matrix
 – Add the leaf's payoff to that entry, weighted by the product of chance probabilities along the path from the root to the leaf
[Figure: payoff matrix with columns {}, c, d and rows {}, l, r, L, R]
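The payoff-matrix construction can be sketched as follows, assuming a hypothetical input format in which each leaf is given as a tuple of player 1's sequence, player 2's sequence, the product of chance probabilities on the path to the leaf, and player 1's payoff. The leaves listed are made up for illustration and are not the game from the slides; the empty string stands for the empty sequence.

```python
from collections import defaultdict
from fractions import Fraction

def build_payoff_matrix(leaves):
    """Sparse sequence-form payoff matrix keyed by (seq1, seq2).

    Each leaf is (seq1, seq2, chance_prob, payoff), where chance_prob is the
    product of chance probabilities on the path from the root to the leaf.
    """
    A = defaultdict(Fraction)  # every entry starts at 0
    for seq1, seq2, chance_prob, payoff in leaves:
        # Several leaves can share one pair of sequences when they differ
        # only in chance moves, so contributions are accumulated.
        A[(seq1, seq2)] += chance_prob * payoff
    return dict(A)

# Hypothetical leaves (not taken from the slides).
half = Fraction(1, 2)
leaves = [
    ("l", "c", half, 4),
    ("l", "d", half, -2),
    ("R", "", half, 6),
    ("l", "c", half, 2),  # second leaf reached by the same pair of sequences
]
A = build_payoff_matrix(leaves)
```

Storing only nonzero entries reflects that the matrix has at most one entry per leaf, so it stays linear in the size of the game tree.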

9 Computing equilibria via sequence form
Holding x fixed, compute best response [Figure: primal and dual LPs]
Holding y fixed, compute best response [Figure: primal and dual LPs]

10 Computing equilibria via sequence form: An example
min p1
subject to
  x1: p1 - p2 - p3 >= 0
  x2: 0 y1 + p2 >= 0
  x3: -y2 + y3 + p2 >= 0
  x4: 2 y2 - 4 y3 + p3 >= 0
  x5: -y1 + p3 >= 0
  q1: -y1 = -1
  q2: y1 - y2 - y3 = 0
bounds
  y1 >= 0
  y2 >= 0
  y3 >= 0
  p1 Free
  p2 Free
  p3 Free
end
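A sanity check on this LP, offered as a sketch and not as a substitute for an LP solver: constraints q1 and q2 force y1 = 1 and y3 = 1 - y2, so the feasible y lie on a one-parameter segment. For a fixed y, the smallest feasible p1 can be read directly off constraints x1 to x5. The code scans a grid over y2 using exact fractions; the function names and the grid resolution are illustrative choices.

```python
from fractions import Fraction

def min_p1_given_y(y1, y2, y3):
    """Smallest p1 satisfying x1..x5 for a fixed realization plan y."""
    p2 = max(Fraction(0), y2 - y3)   # x2: p2 >= 0,   x3: p2 >= y2 - y3
    p3 = max(y1, 4 * y3 - 2 * y2)    # x5: p3 >= y1,  x4: p3 >= 4*y3 - 2*y2
    return p2 + p3                   # x1: p1 >= p2 + p3

def scan(steps=100):
    """Scan y2 over a grid; q1 and q2 give y1 = 1 and y3 = 1 - y2."""
    best = None
    for k in range(steps + 1):
        y2 = Fraction(k, steps)
        value = min_p1_given_y(Fraction(1), y2, 1 - y2)
        if best is None or value < best[0]:
            best = (value, y2)
    return best

value, y2 = scan()
```

On this grid the minimum is p1 = 1, attained at y = (1, 1/2, 1/2) with p2 = 0 and p3 = 1.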

11 Sequence form summary
Poly-time algorithm for computing Nash equilibria in 2-player zero-sum games
Poly-size linear complementarity problem (LCP) for computing Nash equilibria in 2-player general-sum games
Major shortcomings:
 – Not well understood when there are more than two players
 – Sometimes, polynomial is still slow (e.g. poker)

12 Poker
Poker is a wildly popular card game
 – This year’s World Series of Poker is expected to have prizes totaling almost $50 million
Challenges
 – Incomplete information
 – Risk assessment
 – Deception and counter-deception
Sequence form does not directly apply
 – Two-player Texas Hold’em has ~10^18 nodes

13 Hold’em Poker
Every player receives hole cards
Some cards are placed on the table (flop, turn, river)
Betting rounds after each deal of cards
 – Players can bet, raise, check, fold, call
At the end of the game, the player with the best hand takes the pot

14 Previous work in poker research
Rule-based
Simulation/Learning
Game-theoretic
 – Manual abstraction: “Approximating Game-Theoretic Optimal Strategies for Full-scale Poker”, Billings, Burch, Davidson, Holte, Schaeffer, Schauenberg, Szafron, IJCAI-03. Distinguished Paper Award.
 – Automated abstraction

15 Finding equilibria in large sequential games of incomplete information
(Joint with Tuomas Sandholm, 2005)
Outline:
 – Extensive game isomorphism
 – Restricted game isomorphic abstraction transformation
 – GameShrink – automatically shrinking games
 – Application to poker
 – Approximation methods

16 Extensive game isomorphism: example

18 Extensive game isomorphism: definition
Let $G = (I, V, E, P, H, A, u, p)$ and $G' = (I', V', E', P', H', A', u', p')$ be given. A bijection $f: V \to V'$ is an extensive game isomorphism if:
1. $f$ induces a graph isomorphism between $(V, E)$ and $(V', E')$
2. For each information set $h$ in $G$, $f$ induces a bijection between the nodes of $h$ and some $h'$ in $G'$
3. $P(x) = P'(f(x))$ for all $x \in V \setminus Z$
4. $u(x) = u'(f(x))$ for all $x \in Z$
5. $p(h, a) = p'(f(h), f(a))$ for all $h \in H_0$

19 Restricted game isomorphic abstraction transformation
The restricted game $G_x$ is obtained from $G$ by removing all nodes except $x$ and its descendants.
$(G_x, G_y)$ is contractible within $G$ if:
1. $x$ and $y$ are in the same information set
2. Every node in that information set has the same parent, and the parent is either in a singleton information set or a chance node
3. $G_x$ and $G_y$ are extensive game isomorphic
For contractible $(G_x, G_y)$, the restricted game isomorphic abstraction transformation is the game in which $G_x$ and $G_y$ are “merged”.

20 Restricted game isomorphic abstraction transformation: example

23 Main equilibrium result
Thm. Let G be a sequential game with observable actions, let G’ be obtained by one application of the restricted game isomorphic abstraction transformation, and let s’ be a Nash equilibrium of G’. Then the corresponding strategy profile s for G is a Nash equilibrium of G.

24 Computing ExtensiveGameIsomorphic?(x,y)
1. If x and y are both leaves, return u(x) == u(y)
2. If x and y have a different number of children, or if different players control them, return false
3. Construct the bipartite graph $G_{x,y}$ (see next slide)
4. Return true if $G_{x,y}$ has a perfect matching; otherwise return false
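The four steps can be sketched in a simplified setting. The assumptions here are deliberate simplifications and not the authors' implementation: every child is treated as its own singleton information set, and chance probabilities and action labels are ignored. The `Node` class and both function names are hypothetical. The perfect-matching test uses a plain augmenting-path search.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    player: int = -1       # controlling player; ignored for leaves
    payoff: tuple = ()     # payoff vector; used only for leaves
    children: list = field(default_factory=list)

def has_perfect_matching(edges, n):
    """edges[i] lists the right vertices adjacent to left vertex i."""
    match_right = [None] * n  # left vertex matched to each right vertex

    def try_augment(i, seen):
        for j in edges[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_right[j] is None or try_augment(match_right[j], seen):
                match_right[j] = i
                return True
        return False

    # A perfect matching exists only if every left vertex can be matched.
    return all(try_augment(i, set()) for i in range(n))

def isomorphic(x, y):
    # Step 1: two leaves are isomorphic when their payoffs agree.
    if not x.children and not y.children:
        return x.payoff == y.payoff
    # Step 2: child counts and controlling players must agree.
    if len(x.children) != len(y.children) or x.player != y.player:
        return False
    # Step 3: bipartite graph linking isomorphic children of x and y.
    n = len(x.children)
    edges = [[j for j in range(n) if isomorphic(x.children[i], y.children[j])]
             for i in range(n)]
    # Step 4: isomorphic exactly when the graph has a perfect matching.
    return has_perfect_matching(edges, n)

a = Node(player=1, children=[Node(payoff=(1, -1)), Node(payoff=(0, 0))])
b = Node(player=1, children=[Node(payoff=(0, 0)), Node(payoff=(1, -1))])
c = Node(player=2, children=[Node(payoff=(1, -1)), Node(payoff=(0, 0))])
d = Node(player=1, children=[Node(payoff=(1, -1)), Node(payoff=(2, -2))])
```

Here `a` and `b` differ only in the order of their children, so they are isomorphic; `c` differs in the controlling player and `d` in one payoff, so neither is isomorphic to `a`.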

25 Constructing $G_{x,y}$
Each vertex corresponds to an information set containing a child node.
Edges connect information sets where there exists a bijection between extensive game isomorphic vertices (extensive game isomorphic information sets).


32 GameShrink: Efficiently computing restricted game isomorphic abstraction transformations
1. Bottom-up pass: compute the ExtensiveGameIsomorphic relation for each pair of equal-depth nodes.
2. Top-down pass:
   For i from 0 to height(G):
     For each information set h at level i whose nodes share a common parent:
       Apply the restricted game isomorphic abstraction transformation to each applicable x and y in h

33 Enhancements
Disjoint-set data structure for storing isomorphisms
Implicit enumeration of game tree nodes
Necessary conditions for extensive game isomorphism
Payoff histogram database
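A minimal sketch of a disjoint-set (union-find) structure of the kind the first enhancement names, with path compression and union by rank. How GameShrink actually uses it is not described in the slides; the idea assumed here is that nodes found to be isomorphic are merged into one class, so that a later isomorphism query reduces to comparing class representatives. The class name, method names, and node labels are hypothetical.

```python
class DisjointSet:
    """Union-find over hashable items, created lazily on first use."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def find(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.rank[x] = 0
            return x
        # Locate the representative of x's class.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1

    def same(self, x, y):
        return self.find(x) == self.find(y)

# Hypothetical node labels: two nodes recorded as isomorphic, one not.
classes = DisjointSet()
classes.union("J1", "J2")
```

After the single `union` call, `classes.same("J1", "J2")` holds while `classes.same("J1", "K")` does not.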

34 Application to poker
Theorem. In poker, isomorphisms can be computed by considering only the card tree.
[Figure: card tree with nodes labeled J1, J2 and K]

35 Rhode Island Hold’em
Invented as a testbed for AI research [Shi & Littman 2001]
More than 3.1 billion game tree nodes
Applying sequence form:
 – LP has 91 million rows and columns
Applying GameShrink:
 – LP has 1.2 million rows and columns
 – Solvable in about 1 week
 – GameShrink itself takes less than 1 second; the LP solve still dominates

36 Future poker research
More difficult games
 – Multi-player
   LP only handles two players
   Possible mapping of n-player strategy to (n+1)-player strategy
 – Tournament
   Size of bankroll changes aggressiveness of players
   Maximally vs. optimally
 – Opponent modeling

