Computing equilibria in extensive form games

Computing equilibria in extensive form games. Andrew Gilpin. Mathematical Games, March 29, 2005.

This talk: Extensive form games (representation, computing equilibrium). Poker AI (history of poker research, current research).

Extensive form representation: I = {0, 1, …, n}: players. (V,E), terminals Z: tree. P: V \ Z → H: controlling player. H = {H0, …, Hn}: information sets. A = {A0, …, An}: actions. u: Z → R^n: payoffs. p: chance probabilities. Perfect recall assumption: players never forget information. Game from: Bernhard von Stengel. Efficient Computation of Behavior Strategies. In Games and Economic Behavior 14:220-246, 1996.
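
The components listed above can be held in a small tree data structure. The following is a minimal sketch, not the representation used in the talk: the class name `Node`, its field names, and the toy coin-flip game are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

CHANCE = 0  # player 0 is chance, following I = {0, 1, ..., n}


@dataclass
class Node:
    """One vertex of the game tree (V, E). All field names are hypothetical."""
    player: Optional[int] = None            # controlling player; None at terminals
    info_set: Optional[str] = None          # label of the information set containing this node
    children: Dict[str, "Node"] = field(default_factory=dict)      # action -> child
    chance_probs: Dict[str, float] = field(default_factory=dict)   # p, used at chance nodes only
    payoff: Optional[Tuple[float, ...]] = None                     # u(z), used at terminals only

    def is_terminal(self) -> bool:
        return not self.children


def count_nodes(v: Node) -> int:
    """Number of nodes in the subtree rooted at v."""
    return 1 + sum(count_nodes(c) for c in v.children.values())


def leaf(a: float, b: float) -> Node:
    return Node(payoff=(a, b))


# Toy game (an assumption, not from the slides): chance flips a coin, then
# player 1 chooses a or b without observing the flip, so both player-1
# nodes carry the same information set label.
root = Node(
    player=CHANCE,
    chance_probs={"heads": 0.5, "tails": 0.5},
    children={
        "heads": Node(player=1, info_set="h1",
                      children={"a": leaf(1, -1), "b": leaf(0, 0)}),
        "tails": Node(player=1, info_set="h1",
                      children={"a": leaf(-1, 1), "b": leaf(0, 0)}),
    },
)
```

Storing the information set as a label on each node keeps the tree and the partition H in one structure; a separate map from labels to node lists would work equally well.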

Computing equilibria via normal form: The normal form is exponential in the size of the game tree, both in the worst case and in practice (e.g. poker).

Sequence form: Instead of a move for every information set, consider the choices necessary to reach each leaf. These choices are sequences, and they constitute the pure strategies in the sequence form. S1 = {{}, l, r, L, R}; S2 = {{}, c, d}, where {} is the empty sequence.

Realization plans: Players' strategies are specified as realization plans over sequences. Prop. Realization plans are equivalent to behavior strategies.
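
The defining equations did not survive in this transcript. As a hedged reconstruction, the standard definition from the von Stengel paper cited earlier reads as follows for player 1. The symbols $\sigma_h$ (the sequence of player 1's own choices leading to information set $h$) and $A_h$ (the actions available at $h$) are notation assumed here, not taken from the slide.

```latex
\begin{align*}
  x(\emptyset) &= 1, \\
  x(\sigma_h) &= \sum_{a \in A_h} x(\sigma_h a) && \text{for every information set } h \text{ of player 1}, \\
  x(\sigma) &\ge 0 && \text{for every sequence } \sigma \in S_1 .
\end{align*}
% Recovering a behavior strategy from a realization plan:
\[
  \beta(h, a) = \frac{x(\sigma_h a)}{x(\sigma_h)} \qquad \text{whenever } x(\sigma_h) > 0 .
\]
```

The second display is what makes the equivalence in the proposition concrete: dividing by the weight of the parent sequence turns a realization plan back into action probabilities at each information set.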

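Computing equilibria via sequence form: Players 1 and 2 have realization plans x and y. Realization constraint matrices E and F specify the constraints on realizations. [Matrices not recoverable from the transcript: E has columns indexed by the sequences {}, l, r, L, R and rows indexed by {}, v, v’; F has columns indexed by {}, c, d and rows indexed by {}, u.]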
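
A minimal sketch of how such constraint matrices can be assembled. The function names are hypothetical, and the structure of the example is an assumption: information sets v (actions l, r) and v’ (actions L, R) are both taken to follow the empty sequence, which is what the coefficients of the example LP later in the talk suggest.

```python
def constraint_matrix(sequences, info_sets):
    """Build (M, rhs) such that M z = rhs characterizes realization plans z.

    sequences: list of sequence labels, "{}" being the empty sequence.
    info_sets: list of (parent_sequence, [sequences created by each action]).
    """
    col = {s: j for j, s in enumerate(sequences)}
    first = [0] * len(sequences)
    first[col["{}"]] = 1                      # z({}) = 1
    rows = [first]
    for parent, extended in info_sets:
        row = [0] * len(sequences)
        row[col[parent]] = -1                 # -z(parent) + sum of extensions = 0
        for s in extended:
            row[col[s]] = 1
        rows.append(row)
    rhs = [1] + [0] * len(info_sets)
    return rows, rhs


def is_realization_plan(M, rhs, z):
    """Check z >= 0 and M z = rhs."""
    if any(v < 0 for v in z):
        return False
    return all(sum(m * v for m, v in zip(row, z)) == b for row, b in zip(M, rhs))


# Player 1 (assumed structure: v and v' both follow the empty sequence).
E, e = constraint_matrix(["{}", "l", "r", "L", "R"],
                         [("{}", ["l", "r"]), ("{}", ["L", "R"])])
# Player 2: a single information set u with actions c and d.
F, f = constraint_matrix(["{}", "c", "d"], [("{}", ["c", "d"])])

x = [1, 0.5, 0.5, 1, 0]   # mix l and r evenly at v, always play L at v'
```

Here `is_realization_plan(E, e, x)` holds, while a vector such as `[1, 1, 1, 1, 0]` fails because the weights of l and r exceed the weight of their parent sequence.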

Computing equilibria via sequence form: Payoffs for players 1 and 2 are x^T A y and x^T B y for suitable matrices A and B. Creating the payoff matrix: Initialize each entry to 0. For each leaf, there is a (unique) pair of sequences corresponding to an entry in the payoff matrix. Weight the leaf's payoff by the product of chance probabilities along the path from the root to the leaf and add it to that entry. [Matrix not recoverable from the transcript: rows indexed by {}, l, r, L, R; columns indexed by {}, c, d.]
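
The construction just described can be sketched in a few lines. The function name and the leaf list are hypothetical; the leaves below are made up for illustration and are not the game from the slide. Because several leaves (differing only in chance outcomes) can share the same pair of sequences, the sketch accumulates into each entry rather than overwriting it.

```python
def payoff_matrix(S1, S2, leaves):
    """Sequence-form payoff matrix for player 1.

    leaves: iterable of (seq1, seq2, chance_prob, payoff), where seq1 and seq2
    are the players' sequences leading to the leaf and chance_prob is the
    product of chance probabilities on the path from the root.
    """
    row = {s: i for i, s in enumerate(S1)}
    col = {s: j for j, s in enumerate(S2)}
    A = [[0.0] * len(S2) for _ in S1]          # initialize each entry to 0
    for seq1, seq2, chance_prob, payoff in leaves:
        A[row[seq1]][col[seq2]] += chance_prob * payoff
    return A


S1 = ["{}", "l", "r"]
S2 = ["{}", "c", "d"]
# Hypothetical leaves: (player 1 sequence, player 2 sequence, chance prob, payoff)
leaves = [
    ("l", "c", 0.5, 4.0),
    ("l", "d", 0.5, -2.0),
    ("r", "{}", 0.25, 8.0),   # two leaves share the pair (r, {})
    ("r", "{}", 0.25, 0.0),
]
A = payoff_matrix(S1, S2, leaves)
```

The resulting matrix is sparse: it has at most one nonzero entry per leaf, which is why the sequence form stays linear in the size of the game tree.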

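Computing equilibria via sequence form: Holding x fixed, compute player 2's best response as a linear program (primal and dual). Holding y fixed, compute player 1's best response as a linear program (primal and dual). [Equations not recoverable from the transcript.]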
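
The linear programs themselves are missing from the transcript. The following is a hedged reconstruction of the standard zero-sum case ($B = -A$), written with the constraint matrices $E$, $F$ and payoff matrix $A$ from the preceding slides. The right-hand sides $e$ and $f$ (a 1 in the first component, 0 elsewhere) and the dual variables $p$ and $q$ are assumed notation, chosen to agree with the variable names in the example LP that follows.

```latex
% Holding y fixed: player 1's best response and its dual
\begin{align*}
  \text{Primal:}\quad & \max_{x}\; x^{\top}(A y) && \text{s.t. } E x = e,\; x \ge 0 \\
  \text{Dual:}\quad   & \min_{p}\; e^{\top} p    && \text{s.t. } E^{\top} p \ge A y
\end{align*}
% Holding x fixed: player 2's best response (player 2 minimizes x^T A y) and its dual
\begin{align*}
  \text{Primal:}\quad & \min_{y}\; (x^{\top} A)\, y && \text{s.t. } F y = f,\; y \ge 0 \\
  \text{Dual:}\quad   & \max_{q}\; f^{\top} q       && \text{s.t. } F^{\top} q \le A^{\top} x
\end{align*}
% Letting the opponent's plan vary as well gives one LP per player
\begin{align*}
  & \min_{y,\,p}\; e^{\top} p && \text{s.t. } E^{\top} p - A y \ge 0,\; F y = f,\; y \ge 0 \\
  & \max_{x,\,q}\; f^{\top} q && \text{s.t. } F^{\top} q - A^{\top} x \le 0,\; E x = e,\; x \ge 0
\end{align*}
```

The step from a best-response dual to an equilibrium LP is that the dual's constraints are linear in the opponent's realization plan, so that plan can be made a variable without losing linearity.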

Computing equilibria via sequence form: An example

    min p1
    subject to
      x1: p1 - p2 - p3 >= 0
      x2: 0y1 + p2 >= 0
      x3: -y2 + y3 + p2 >= 0
      x4: 2y2 - 4y3 + p3 >= 0
      x5: -y1 + p3 >= 0
      q1: -y1 = -1
      q2: y1 - y2 - y3 = 0
    bounds
      y1 >= 0
      y2 >= 0
      y3 >= 0
      p1 Free
      p2 Free
      p3 Free
    end
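
This example is small enough to check by hand. The equality constraints force y1 = 1 and y3 = 1 - y2, after which the smallest feasible p1 is a function of y2 alone. The sketch below (function name hypothetical) evaluates that function exactly and scans a grid of y2 values; it is a sanity check for this one LP under that elimination, not a general LP solver.

```python
from fractions import Fraction as Fr


def best_response_value(y2):
    """Smallest feasible p1 in the example LP for a given y2 in [0, 1]."""
    y1 = Fr(1)                          # q1: -y1 = -1
    y3 = y1 - y2                        # q2: y1 - y2 - y3 = 0
    p2 = max(Fr(0), y2 - y3)            # x2: p2 >= 0,  x3: p2 >= y2 - y3
    p3 = max(4 * y3 - 2 * y2, y1)       # x4: p3 >= 4*y3 - 2*y2,  x5: p3 >= y1
    return p2 + p3                      # x1: p1 >= p2 + p3


grid = [Fr(k, 100) for k in range(101)]
best_y2 = min(grid, key=best_response_value)
print(best_y2, best_response_value(best_y2))   # prints: 1/2 1
```

On this grid the minimum is at y2 = y3 = 1/2 with objective value p1 = 1. For y2 below 1/2 the constraint x4 dominates and pushes p3 above 1; for y2 above 1/2 the constraint x3 makes p2 positive.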

Sequence form summary: Polynomial-time algorithm for computing Nash equilibria in 2-player zero-sum games. Polynomial-size linear complementarity problem (LCP) for computing Nash equilibria in 2-player general-sum games. Major shortcomings: not well understood for more than two players; sometimes polynomial is still slow (e.g. poker).

Poker: Poker is a wildly popular card game. This year’s World Series of Poker is expected to have prizes totaling almost $50 million. Challenges: incomplete information; risk assessment; deception and counter-deception. Sequence form does not directly apply: two-player Texas Hold’em has ~10^18 nodes.

Hold’em Poker: Every player receives hole cards. Some cards are placed on the table (flop, turn, river). Betting rounds follow each deal of cards. Players can bet, raise, check, fold, or call. At the end of the game, the player with the best hand takes the pot.

Previous work in poker research: Rule-based. Simulation/Learning. Game-theoretic: manual abstraction (“Approximating Game-Theoretic Optimal Strategies for Full-scale Poker”, Billings, Burch, Davidson, Holte, Schaeffer, Schauenberg, Szafron, IJCAI-03, Distinguished Paper Award) and automated abstraction.

Finding equilibria in large sequential games of incomplete information (joint with Tuomas Sandholm). Outline: Extensive game isomorphism. Restricted game isomorphic abstraction transformation. GameShrink: automatically shrinking games. Application to poker. Approximation methods.

Extensive game isomorphism: example

Extensive game isomorphism: definition. Let G = (I,V,E,P,H,A,u,p) and G’ = (I’,V’,E’,P’,H’,A’,u’,p’) be given. A bijection f: V → V’ is an extensive game isomorphism if: (1) f induces a graph isomorphism between (V,E) and (V’,E’); (2) for each information set h in G, f induces a bijection between the nodes of h and some h’ in G’; (3) P(x) = P’(f(x)) for all x in V \ Z; (4) u(x) = u’(f(x)) for all x in Z; (5) p(h,a) = p’(f(h), f(a)) for all h in H0.

Restricted game isomorphic abstraction transformation: The restricted game Gx is obtained from G by removing all nodes except x and its descendants. (Gx,Gy) is contractible within G if (1) x and y are in the same information set; (2) every node in that information set has the same parent, and the parent is either in a singleton information set or a chance node; and (3) Gx and Gy are extensive game isomorphic. For (Gx,Gy) contractible, the restricted game isomorphic abstraction transformation is the game where Gx and Gy are “merged”.

Restricted game isomorphic abstraction transformation: example

Main equilibrium result: Thm. Let G be a sequential game with observable actions, let G’ be obtained by one application of the restricted game isomorphic abstraction transformation, and let s’ be a Nash equilibrium for G’. Then the corresponding s for G is a Nash equilibrium.

Computing ExtensiveGameIsomorphic?(x,y): (1) If x and y are both leaves, return u(x) == u(y). (2) If x and y have a different number of children, or if different players control them, return false. (3) Construct the bipartite graph Gx,y (see next slide). Return true if Gx,y has a perfect matching; otherwise return false.
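
The three steps above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes every child sits in its own singleton information set, so the bipartite graph has one vertex per child, and it ignores chance probabilities and action labels. The class and function names are hypothetical. The perfect matching test uses a standard augmenting-path algorithm.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    player: Optional[int] = None                 # controlling player; None at leaves
    payoff: Optional[Tuple[float, ...]] = None   # u(z) at leaves
    children: List["Node"] = field(default_factory=list)


def has_perfect_matching(adj: List[List[int]], n_right: int) -> bool:
    """adj[i] lists the right-side vertices adjacent to left vertex i."""
    match_right = [-1] * n_right

    def augment(i: int, seen: set) -> bool:
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                if match_right[j] == -1 or augment(match_right[j], seen):
                    match_right[j] = i
                    return True
        return False

    # Every left vertex must be matched; both sides have equal size here.
    return all(augment(i, set()) for i in range(len(adj)))


def isomorphic(x: Node, y: Node) -> bool:
    # Step 1: two leaves are isomorphic iff their payoffs agree.
    if not x.children and not y.children:
        return x.payoff == y.payoff
    # Step 2: mismatched branching factor or controlling player.
    if len(x.children) != len(y.children) or x.player != y.player:
        return False
    # Step 3: edge (i, j) iff child i of x is isomorphic to child j of y.
    adj = [[j for j, cy in enumerate(y.children) if isomorphic(cx, cy)]
           for cx in x.children]
    return has_perfect_matching(adj, len(y.children))


a = Node(player=1, children=[Node(payoff=(1, -1)), Node(payoff=(0, 0))])
b = Node(player=1, children=[Node(payoff=(0, 0)), Node(payoff=(1, -1))])
c = Node(player=1, children=[Node(payoff=(2, -2)), Node(payoff=(0, 0))])
```

Here `isomorphic(a, b)` is true because the children match up after a permutation, while `isomorphic(a, c)` is false because no child of `c` matches the leaf with payoff (1, -1). The recursion recomputes child comparisons; storing results per pair of nodes would avoid that.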

Constructing Gx,y: Each vertex corresponds to an information set containing a child node. Edges connect information sets where there exists a bijection between extensive game isomorphic vertices (extensive game isomorphic information sets).

GameShrink: Efficiently computing restricted game isomorphic abstraction transformations. Bottom-up pass: compute the ExtensiveGameIsomorphic relation for each pair of equal-depth nodes. Top-down pass: for i from 0 to height(G), for each information set h at level i whose nodes share a common parent, apply the restricted game isomorphic abstraction transformation to each applicable x and y in h.

Enhancements: Disjoint-set data structure for storing isomorphisms. Implicit enumeration of game tree nodes. Necessary conditions for extensive game isomorphism. Payoff histogram database.
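
As an illustration of the first enhancement, here is a minimal sketch of a disjoint-set (union-find) structure with path compression and union by rank. How GameShrink actually keys or stores its entries is not stated in the talk; the class name and the string labels used below are hypothetical.

```python
class DisjointSet:
    """Union-find over hashable items; items are added lazily on first use."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def find(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.rank[x] = 0
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra                  # attach the shallower tree
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1

    def same(self, a, b):
        return self.find(a) == self.find(b)


classes = DisjointSet()
classes.union("J1", "J2")       # hypothetical: two nodes found to be isomorphic
classes.union("J2", "J3")
```

After these calls `classes.same("J1", "J3")` holds by transitivity even though that pair was never compared directly, which is the property that makes the structure a natural fit for recording an equivalence relation such as isomorphism.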

Application to poker: Theorem. In poker, isomorphisms can be computed by considering only the card tree. [Figure: card tree over the cards J1, J2 and K, with leaf payoffs -1, -1, 1, 1.]

Rhode Island Hold’em: Invented as a testbed for AI research [Shi & Littman 2001]. More than 3.1 billion game tree nodes. Applying sequence form: the LP has 91 million rows and columns. Applying GameShrink: the LP has 1.2 million rows and columns, solvable in about 1 week. GameShrink itself takes less than 1 second; the LP solve still dominates.

Future poker research: More difficult games. Multi-player. Tournament. Maximally vs. optimally.