Strategy Grafting in Extensive Games

Presentation transcript:

Strategy Grafting in Extensive Games
Kevin Waugh, Nolan Bard and Michael Bowling. NIPS 2009 (Dec).
Presented by Boyoung Kim, GT, 2010-07-07.

Introduction
Extensive games are often used to model the interaction of multiple agents within an environment.
Trend: the size of extensive games that can be feasibly solved keeps increasing. E.g., 2-player limit Texas Hold'em has approximately 10^18 states.
- The classic linear programming technique: approximately 10^7 states.
- More recent techniques: approximately 10^12 states. (Andrew Gilpin, Samid Hoda, Javier Peña, and Tuomas Sandholm. Gradient-Based Algorithms for Finding Nash Equilibria in Extensive Form Games. In Proceedings of the Eighteenth International Conference on Game Theory, 2007; and Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. Regret Minimization in Games with Incomplete Information. In Advances in Neural Information Processing Systems Twenty, pages 1729–1736, 2008. A longer version is available as University of Alberta Technical Report TR07-14.)

Introduction
Abstraction technique: reduce the original game to an abstract game → solve the abstract game → play the resulting strategy in the original game.

Background
Def 1 (Extensive Game) A finite extensive game w/ imperfect information is denoted Γ and has the following components:
- A finite set N of players.
- A finite set H of sequences: the possible histories of actions. The empty sequence is in H, and every prefix of a sequence in H is also in H.
- Terminal histories Z ⊆ H s.t. no sequence in Z is a strict prefix of any sequence in H. A(h) = {a : (h,a) ∈ H} is the set of actions available after a non-terminal history h ∈ H\Z.
- A player function P that assigns to each non-terminal history a member of N ∪ {c}, where c represents chance. P(h) is the player who takes an action after the history h. Hi is the set of histories where player i chooses the next action.
Example: H = {Φ, A, B, AC, AD, BE, BF, BFG, BFH}; Z = {AC, AD, BE, BFG, BFH}; A(BF) = {G, H}; P(A) = 2, P(BF) = 1; H1 = {Φ, BF}, H2 = {A, B}.
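The history set from the slide's example can be sketched in code; this is a minimal illustration of Definition 1, not anything from the paper. The empty history Φ is represented as the empty string.

```python
# The slide's example game: histories, terminal histories, and A(h).
H = {"", "A", "B", "AC", "AD", "BE", "BF", "BFG", "BFH"}  # Φ is ""
Z = {"AC", "AD", "BE", "BFG", "BFH"}                      # terminal histories

def prefix_closed(histories):
    """Def 1 requires that every prefix of a history is itself a history."""
    return all(h[:k] in histories for h in histories for k in range(len(h)))

def actions(h, histories):
    """A(h) = {a : (h, a) is a history} for a non-terminal history h."""
    return {g[len(h)] for g in histories if g.startswith(h) and len(g) == len(h) + 1}

assert prefix_closed(H)
assert actions("BF", H) == {"G", "H"}   # matches A(BF) on the slide
assert actions("", H) == {"A", "B"}
```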

Background
- A function fc that associates w/ every history h ∈ Hc a probability distribution fc(·|h) on A(h). fc(a|h) is the probability that a occurs given h.
- For each player i ∈ N, a utility function ui that assigns each terminal history a real value; ui(z) is rewarded to player i for reaching terminal history z. If N = {1,2} and u1(z) = -u2(z) for all z ∈ Z, the extensive game is said to be zero-sum.
- For each player i ∈ N, a partition Ii of Hi w/ the property that A(h) = A(h') whenever h and h' are in the same member of the partition. Ii is the information partition of player i; a set Ii ∈ Ii is an information set of player i. □
In this paper we focus only on 2-player zero-sum games w/ perfect recall. Perfect recall is a restriction on the information partitions that excludes unrealistic situations where a player is forced to forget his own past information or decisions.

Background
Def 2 (Strategy) A strategy for player i, σi, is a function that assigns a probability distribution over A(h) to each h ∈ Hi. This function is constrained so that σi(h) = σi(h') whenever h and h' are in the same information set. A strategy is pure if no randomization is required. Σi is the set of all strategies for player i. □
Def 3 (Strategy Profile) A strategy profile in extensive game Γ is a set of strategies, σ = {σ1, σ2, …, σN}, that contains one strategy for each player. σ-i is the set of strategies for all players except player i. Σ is the set of all strategy profiles. □
ui(σ): the expected utility of player i when all players play according to the strategy profile σ. ui(σi, σ-i): the expected utility of player i when all other players play according to σ-i and player i plays according to σi.
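The expected utility ui(σ) can be computed by recursing over the game tree. The sketch below does this for the toy tree from the Def 1 example; the payoffs and probabilities are made up for illustration, and only the definitions come from the slides.

```python
# Terminal payoffs z -> (u1(z), u2(z)); hypothetical, chosen to be zero-sum.
Z = {"AC": (1.0, -1.0), "AD": (-1.0, 1.0), "BE": (0.0, 0.0),
     "BFG": (2.0, -2.0), "BFH": (-2.0, 2.0)}
P = {"": 1, "A": 2, "B": 2, "BF": 1}     # player to move after each history
sigma = {                                # a strategy profile (Defs 2 and 3)
    1: {"": {"A": 0.5, "B": 0.5}, "BF": {"G": 1.0, "H": 0.0}},
    2: {"A": {"C": 1.0, "D": 0.0}, "B": {"E": 0.5, "F": 0.5}},
}

def expected_utility(h=""):
    """u(σ) from history h: weight each subtree by the mover's action probability."""
    if h in Z:
        return Z[h]
    u1 = u2 = 0.0
    for a, p in sigma[P[h]][h].items():
        v1, v2 = expected_utility(h + a)
        u1 += p * v1
        u2 += p * v2
    return u1, u2

u1, u2 = expected_utility()
assert abs(u1 - 1.0) < 1e-9
assert abs(u1 + u2) < 1e-9   # zero-sum: u1(σ) = -u2(σ)
```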

Background
Def 4 (Nash Equilibrium) A Nash equilibrium is a strategy profile σ where, for all i ∈ N and for all σi' ∈ Σi, ui(σi, σ-i) ≥ ui(σi', σ-i). An approximation of a Nash equilibrium, or ε-Nash equilibrium, is a strategy profile σ where, for all i ∈ N and for all σi' ∈ Σi, ui(σi, σ-i) + ε ≥ ui(σi', σ-i). □
A Nash equilibrium exists in all finite extensive games. Recall (Essentials of Game Theory, Thm 2.3.1; Nash, 1951): every game w/ a finite number of players and action profiles has at least one Nash equilibrium.
In a zero-sum game, we say it is optimal to play any strategy belonging to an equilibrium, because this guarantees the equilibrium player the highest expected utility in the worst case. In this sense, computing an equilibrium in a zero-sum game is called solving the game.
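The ε-Nash condition in Def 4 can be checked by measuring how much each player could gain by deviating to a best response. A small sketch on a zero-sum matrix game (rock-paper-scissors, which is not from the paper, just a standard example):

```python
# Row player's payoff matrix; the column player receives the negation.
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]

def value(x, y):
    """Expected payoff to the row player under mixed strategies x, y."""
    return sum(x[i] * A[i][j] * y[j] for i in range(3) for j in range(3))

def exploitability(x, y):
    """Total gain available to the two players from best-response deviations.
    A profile is an eps-Nash equilibrium iff this is at most 2*eps here."""
    br_row = max(sum(A[i][j] * y[j] for j in range(3)) for i in range(3))
    br_col = max(sum(-A[i][j] * x[i] for i in range(3)) for j in range(3))
    return (br_row - value(x, y)) + (br_col + value(x, y))

uniform = [1/3, 1/3, 1/3]
assert exploitability(uniform, uniform) < 1e-9   # exact equilibrium: eps = 0
assert exploitability([1, 0, 0], uniform) > 0.5  # pure "rock" is exploitable
```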

Background
Many games are too large to solve directly → abstraction (reduce the game to one of manageable size). The abstract game is solved, and the resulting strategy is presumed to be strong in the original game. Abstraction can be achieved by: 1) merging information sets together, 2) restricting the actions a player can take from a given history, or 3) a combination of both.
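The first kind of abstraction, merging information sets, can be sketched as mapping several histories into one abstract information set, so that a strategy for the abstract game assigns them a single distribution. The bucketing rule and probabilities below are invented for illustration.

```python
# A(h) for two histories; Def 1 requires merged histories to share actions.
available = {"A": frozenset({"C", "D"}), "B": frozenset({"C", "D"})}

def abstract_info_set(h):
    """Crude merge rule: histories with the same available actions share
    one abstract information set (illustrative only)."""
    return available[h]

# One strategy entry covers every history mapped to this abstract set.
abstract_sigma = {frozenset({"C", "D"}): {"C": 0.7, "D": 0.3}}

def play(h):
    return abstract_sigma[abstract_info_set(h)]

assert play("A") == play("B")   # merged histories get identical play
```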

Background Strategies for abstract games are defined in the same manner.

Strategy grafting
There is no guarantee that optimal strategies in abstract games are strong in the original game ([1] Kevin Waugh, David Schnizlein, Michael Bowling, and Duane Szafron. Abstraction Pathologies in Extensive Games. In Proceedings of the Eighth International Joint Conference on Autonomous Agents and Multi-Agent Systems, pages 781–788, 2009.). …ⓐ Nevertheless, these strategies empirically perform well.
It is easy to show that a strategy space must include strategies at least as good as (if not better than) those in a smaller space that it refines ([1]). …ⓑ
ⓑ would seem to imply that a larger abstraction is always better. Is that so? → It depends on the method of selecting a strategy.

Strategy grafting
Def 6 (Dominated Strategy) A dominated strategy for player i is a pure strategy, σi, s.t. there exists another strategy, σi', where for all opponent strategies σ-i, ui(σi', σ-i) ≥ ui(σi, σ-i), and the inequality holds strictly for at least one opponent strategy. □
Abstraction does not necessarily preserve strategy domination: when abstracting, one can merge a dominated strategy in w/ a non-dominated strategy. In the abstract game, this combined strategy might become part of an equilibrium → the abstract strategy can make mistakes. A finer abstraction may better preserve domination.
Decomposition: a natural approach for using larger strategy spaces w/o additional computational cost. In extensive games w/ imperfect information, however, straightforward decomposition can cause trouble: the opponent might be able to determine which sub-game is being played.
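Def 6's weak-dominance condition is easy to check between pure strategies in a small matrix game. The payoff vectors below are hypothetical, chosen only so that one strategy dominates the other.

```python
# Row player's payoffs u_i(s_i, s_-i) against three opponent strategies.
U = {
    "a": [3, 1, 2],
    "b": [3, 0, 1],   # "b" never does better than "a" and sometimes worse
}

def dominates(s_prime, s):
    """Def 6: s_prime weakly dominates s iff it is >= against every
    opponent strategy and strictly > against at least one."""
    pairs = list(zip(U[s_prime], U[s]))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

assert dominates("a", "b")       # "b" is a dominated strategy
assert not dominates("b", "a")
```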

Strategy grafting - The solutions to these sub-games are called grafts.

Strategy grafting
Start out w/ a base strategy for the player and use the same base strategy for all grafts; this is the only information shared between grafts. For each graft, only one portion of the game is allowed to vary (a block of the grafting partition); the remaining parts are played by the base strategy. We are not interested in the pair of strategies: when we construct a graft, the opponent must learn a strategy for the entire game.
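The idea on this slide can be sketched as follows: a grafted strategy plays the retrained graft on its block of the grafting partition and falls back to the shared base strategy everywhere else. The block and the action probabilities here are a made-up toy, not the paper's construction.

```python
# Shared base strategy over the player's histories (toy values).
base = {"": {"A": 0.5, "B": 0.5}, "BF": {"G": 0.5, "H": 0.5}}

# A graft retrained only on its block of the grafting partition.
block = {"BF"}
graft = {"BF": {"G": 1.0, "H": 0.0}}

def grafted(h):
    """Play the graft on its block; the base strategy everywhere else."""
    return graft[h] if h in block else base[h]

assert grafted("BF") == {"G": 1.0, "H": 0.0}   # graft overrides its block
assert grafted("") == {"A": 0.5, "B": 0.5}     # base plays the rest
```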

Strategy grafting - The quality of the grafted strategies.