Continuation Methods for Structured Games Ben Blum Christian Shelton Daphne Koller Stanford University.

Outline. Game Theory: Description; Nash Equilibria: Solutions to Games; How to Find Them: Continuation Methods. Normal Form Games: Govindan and Wilson 2002. Graphical Games; Results for Graphical Games. Multi-Agent Influence Diagrams: Overview of MAIDs; Continuation methods for MAIDs.

Game Theory. Normal-form games: model the joint behavior of multiple agents; all players move once, simultaneously; each player’s payoff depends on the actions of all others; the representation is exponential in the number of players. Structured games (graphical games, MAIDs): computer science’s contribution to game theory is to exploit structure and independencies, giving more compact, elegant representations.

Nash Equilibria. Strategy profile: assigns a strategy to every player. A pure strategy chooses one action; a mixed strategy is a distribution over pure strategies. A Nash equilibrium is a “solution” to the game: a strategy profile where no player can improve his payoff by unilaterally deviating from his strategy. Not the perfect notion of a solution, but useful.
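The definition above can be checked mechanically for a small two-player game. The following is a minimal sketch, not code from the talk: the function names and the matching-pennies payoffs are illustrative assumptions. It tests whether any pure deviation improves either player's expected payoff.

```python
def expected_payoff(payoff, x, y):
    """Expected payoff in a 2-player game for mixed strategies x (row) and y (column)."""
    return sum(x[i] * y[j] * payoff[i][j]
               for i in range(len(x)) for j in range(len(y)))


def is_nash(payoff_row, payoff_col, x, y, tol=1e-9):
    """True if neither player gains more than tol by deviating to a pure strategy."""
    n, m = len(x), len(y)
    value_row = expected_payoff(payoff_row, x, y)
    value_col = expected_payoff(payoff_col, x, y)
    # Checking pure deviations suffices: a mixed deviation is a weighted
    # average of pure ones, so it cannot beat the best pure deviation.
    for i in range(n):
        pure = [1.0 if k == i else 0.0 for k in range(n)]
        if expected_payoff(payoff_row, pure, y) > value_row + tol:
            return False
    for j in range(m):
        pure = [1.0 if k == j else 0.0 for k in range(m)]
        if expected_payoff(payoff_col, x, pure) > value_col + tol:
            return False
    return True


# Matching pennies (illustrative): row wins on a match, column on a mismatch.
PENNIES_ROW = [[1, -1], [-1, 1]]
PENNIES_COL = [[-1, 1], [1, -1]]

uniform_is_nash = is_nash(PENNIES_ROW, PENNIES_COL, [0.5, 0.5], [0.5, 0.5])
pure_is_nash = is_nash(PENNIES_ROW, PENNIES_COL, [1.0, 0.0], [1.0, 0.0])
```

In this game the uniform mixed profile passes the check while the pure profile does not, since the column player would switch.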

Why Compute Equilibria? Descriptive power: describes stable outcomes of systems; useful for testing the accuracy of an economic model (see if the model’s equilibria correspond to real behavior); fast computation of equilibria required. Prescriptive power: choose the way an agent should act; the minimum requirement for an optimal strategy; lets us prune the continuous space of strategies to a few discrete possibilities.

Big Computational Idea. Govindan and Wilson ’02: continuation method for normal-form games. Perturb the game by a vector of bonuses: pay each player an additional bonus for each one of their actions independently. If the perturbation is large, the game is easily solvable: there is a unique pure equilibrium in which each player chooses the action with the highest bonus.
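To see why a large perturbation makes the game easy, here is a small sketch under stated assumptions: the bonus values, the scale factor of 100, and all function names are hypothetical, and the bonuses are fixed rather than random so the result is reproducible. It enumerates pure equilibria before and after adding scaled bonuses to a game that has no pure equilibrium of its own.

```python
from itertools import product


def pure_equilibria(payoffs, num_actions):
    """Enumerate pure Nash equilibria; payoffs[p][profile] is player p's payoff."""
    equilibria = []
    for profile in product(*(range(k) for k in num_actions)):
        stable = True
        for p, k in enumerate(num_actions):
            for a in range(k):
                deviation = profile[:p] + (a,) + profile[p + 1:]
                if payoffs[p][deviation] > payoffs[p][profile]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria


def perturb(payoffs, bonuses, lam):
    """Add lam * bonuses[p][a] to player p's payoff whenever p plays action a."""
    return [
        {profile: value + lam * bonuses[p][profile[p]]
         for profile, value in table.items()}
        for p, table in enumerate(payoffs)
    ]


# Matching pennies has no pure equilibrium on its own.
ACTIONS = (2, 2)
GAME = [
    {(0, 0): 1, (0, 1): -1, (1, 0): -1, (1, 1): 1},
    {(0, 0): -1, (0, 1): 1, (1, 0): 1, (1, 1): -1},
]
BONUSES = [[0.3, 0.9], [0.7, 0.2]]  # illustrative values; the talk draws b at random

unperturbed = pure_equilibria(GAME, ACTIONS)
heavily_perturbed = pure_equilibria(perturb(GAME, BONUSES, lam=100.0), ACTIONS)
```

Once the scaled bonus gaps exceed the range of the original payoffs, each player's best action no longer depends on the others, so the only pure equilibrium is the profile of highest-bonus actions.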

Big Idea: The Picture. Choose a random bonus vector b and perturb game g by $\lambda b$, with $\lambda$ a scale factor. Follow the path back from the perturbed game to g. Finds equilibria of g: the solutions lie on the vertical axis of the picture. Multiple equilibria can be found with a single ray b.

Continuation Methods. General framework for satisfying continuous constraints: start at an easily solved perturbed problem, then trace the solution back to the original problem. Task specification: $\lambda$ is the scale factor for the perturbation (1 is fully perturbed, 0 is unperturbed). Formulate a set of constraints $F(s, \lambda)$ on the solution, s, to the $\lambda$-perturbed problem: F is continuous, and zero iff s is a solution to the perturbed problem at $\lambda$.

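Continuation Methods: General Methodology. Start with $(s, \lambda)$ such that $F(s, \lambda) = 0$. Take the time derivative: $\nabla_s F \, \frac{ds}{dt} + \frac{\partial F}{\partial \lambda} \frac{d\lambda}{dt} = 0$. If $\lambda$ and s change by small amounts in these directions, F is unchanged. Follow the differential system with small discrete steps. Requires inverting the matrix $\nabla_s F$ at each step.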
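The general methodology can be illustrated on a toy scalar problem. This is a simplified sketch, not the Global Newton Method: the homotopy $F(s, \lambda) = (1 - \lambda) g(s) + \lambda (s - s_{start})$, the cubic g, and all names are assumptions chosen so that the path has no turning points. Each step predicts along the direction that keeps F unchanged and then corrects with a few Newton iterations.

```python
def trace_path(g, dg, s_start, steps=100, newton_iters=5):
    """Follow F(s, lam) = (1 - lam) * g(s) + lam * (s - s_start) from lam = 1 to 0."""
    s = s_start  # at lam = 1 the perturbed problem is solved by s = s_start
    lam = 1.0
    for k in range(1, steps + 1):
        next_lam = 1.0 - k / steps
        # Predictor: dF = F_s ds + F_lam dlam = 0, so ds = -(F_lam / F_s) dlam.
        f_s = (1.0 - lam) * dg(s) + lam
        f_lam = -g(s) + (s - s_start)
        s += -(f_lam / f_s) * (next_lam - lam)
        lam = next_lam
        # Corrector: a few Newton steps in s pull the iterate back onto F = 0.
        for _ in range(newton_iters):
            f = (1.0 - lam) * g(s) + lam * (s - s_start)
            f_s = (1.0 - lam) * dg(s) + lam
            s -= f / f_s
    return s


def g(s):
    """Toy unperturbed problem: g(s) = 0 has the single real root s = 1."""
    return s ** 3 + 2.0 * s - 3.0


def dg(s):
    """Derivative of g."""
    return 3.0 * s ** 2 + 2.0


root = trace_path(g, dg, s_start=0.0)
```

In the scalar case the matrix inverse reduces to a division by `f_s`. A full implementation would also have to handle paths that turn back in $\lambda$, which this sketch deliberately avoids.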

Continuation Method for Games. The set of constraints, F, is based on the homeomorphism between the space of games and their equilibria [Kohlberg & Mertens ’86]. One constraint for every action, a, of each player, p. In the Jacobian, the entry at (a, a’) corresponds to the payoff to player p when p and p’ deviate from s by playing a and a’ respectively.

Implementation of GNM. Implemented Govindan and Wilson’s normal form algorithm (the Global Newton Method). Much faster than other leading algorithms for normal-form games. Available on the web under the GNU General Public License: http://dags.stanford.edu/Games/gametracer.html. Still too slow for large games: calculation of the Jacobian requires exponential time.

Graphical Games. How to reduce the exponential representation? Exploit structure! Only connect players whose payoffs depend on each other. Each player has a payoff matrix that is a function of neighbors and self only. Representation: exponential only in d, where d is the number of neighbors.
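The saving can be made concrete by counting payoff entries. This is a back-of-the-envelope sketch that assumes every player has the same number of actions and the same number of neighbors; the sizes used (36 players, 2 actions, 4 neighbors) are hypothetical.

```python
def normal_form_size(num_players, num_actions):
    """Payoff entries in a normal-form game: one per player per joint action."""
    return num_players * num_actions ** num_players


def graphical_game_size(num_players, num_actions, num_neighbors):
    """Payoff entries when each player's matrix covers only itself and its neighbors."""
    return num_players * num_actions ** (num_neighbors + 1)


# Hypothetical sizes: 36 players, 2 actions each, 4 neighbors per player.
full = normal_form_size(36, 2)
compact = graphical_game_size(36, 2, 4)
```

Here `compact` is 1152 entries, while `full` is 36 times 2 to the 36th power.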

Example Graphical Games. Road game: landowners along a road. Grid game: territorial issues.

Solving Graphical Games. Want to find equilibria in graphical games: graphical structure is a useful way to represent large games. Current exact algorithms are impractical: they enforce unreasonable restrictions [Kearns, Littman, Singh ’01] (2 actions per player, tree structure) or are too slow. Approximate equilibria are problematic [Vickrey & Koller ’01]: no guarantee of an exact equilibrium in the vicinity, and granularity must be crude for reasonable execution time. Best of both worlds: a general exact algorithm.

Our Algorithm. Based on Govindan and Wilson’s: uses the same continuation method constraint function and traces solutions along a ray of perturbed games. Efficient computation using structure: game structure allows us to compute the components of the Jacobian locally. If two players aren’t adjacent, their payoffs don’t depend on each other, so the derivative is zero; otherwise, we can use the local game matrix. Exponential in family size, NOT in game size.
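A local computation of this kind can be sketched as follows. The code is an illustration under assumptions, not the authors' implementation: the family of three players, the payoff table, the strategies, and the function name are all hypothetical. It computes the expected payoff from one player's local table when two family members play fixed actions and the rest follow their mixed strategies.

```python
from itertools import product


def deviation_payoff(table, family, strategies, fixed):
    """Expected payoff from a local table when the players in `fixed` play given
    actions and every other family member mixes according to `strategies`."""
    free = [player for player in family if player not in fixed]
    total = 0.0
    # The loop is exponential in the family size only, not in the number of players.
    for actions in product(*(range(len(strategies[player])) for player in free)):
        assignment = dict(fixed)
        weight = 1.0
        for player, action in zip(free, actions):
            assignment[player] = action
            weight *= strategies[player][action]
        total += weight * table[tuple(assignment[player] for player in family)]
    return total


# Hypothetical family: player "p" with neighbors "q" and "r", two actions each.
FAMILY = ("p", "q", "r")
TABLE = {(a, b, c): 4 * a + 2 * b + c
         for a in (0, 1) for b in (0, 1) for c in (0, 1)}
STRATEGIES = {"p": [0.5, 0.5], "q": [0.5, 0.5], "r": [0.25, 0.75]}

entry = deviation_payoff(TABLE, FAMILY, STRATEGIES, fixed={"p": 0, "q": 1})
```

Players outside the family never appear in the table, which is why the corresponding entries are zero and need no computation.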

Graphical Game Results. 6x6 grid (intractable for most algorithms): 27 s. Results for different road sizes: [plot not recoverable from the transcript]. Equilibrium error: $10^{-4}$ (Vickrey & Koller) versus $<10^{-14}$ (GNM).

Multi-Agent Influence Diagrams. Directed acyclic graph, like a BN. Three types of nodes: chance nodes (acts of nature), decision nodes (acts of players), and utility nodes (payoffs for a player; can’t be parents). Multiple agents (players): payoff is the expected sum of owned utility nodes. Strategies: entries in the CPTs of owned decision nodes.

Tree Killer Example. Alice (dark gray) must decide whether to poison Bob’s tree to get a better patio view. Bob (light gray) must decide whether to call a tree doctor.

MAIDs vs. Other Games. MAIDs correspond to extensive form games (game trees). Different from normal form games: sequential actions. A different homeomorphism and constraint function are needed, and a different strategy representation is required.

Finding Equilibria in MAIDs. Another continuation method, based on Govindan and Wilson’s extensive form constraint function. A MAID induces an extensive form game that is exponentially larger than the MAID itself, but the induced strategy profiles are just as compact as in the MAID. We can therefore use the extensive form constraint function, with all computations done inside the MAID. New compact strategy representation: non-exclusion probabilities.

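Non-Exclusion Probabilities. Assumes a game with perfect recall: no player “forgets” anything that he has learned, so if one of his decisions comes after another, everything known at the earlier decision is still known at the later one. Non-exclusion probability representation: for player i, topologically sort the decision variables; there is one non-exclusion probability for each instantiation of a decision variable and its parents. For outcome z, [formula missing in the transcript].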

Non-Exclusion Constraints. [Figure, built up over four slides. Nodes: A1 (a0, a1), A, A2 (a2, a3), B1 (b0, b1). Decision node CPTs: A1 over a0, a1 with entry 0.4; A2 given A1, B1, with columns a0,b0; a0,b1; a1,b0; a1,b1 and visible entries 0.5, 0.25 for a2 and 0.5, 0.75 for a3. Non-exclusion probabilities for A2 given A1, B1: visible entries 0.2, 0.1 for a2 and 0.2, 0.3 for a3, with “+” signs marking entries that are added.] Strategy representation now has additional constraints.
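The numbers on this slide are consistent with one reading, sketched here as an assumption rather than as the authors' definition: 0.4 is the probability of a0 at A1, the visible CPT columns of A2 are the ones conditioned on (a0, b0) and (a0, b1), and each non-exclusion probability is the product of the player's own choice probabilities along the way. The function name is hypothetical.

```python
def non_exclusion_probabilities(prior, cpt):
    """Multiply each CPT column of a later decision by the probability of the
    player's own earlier action that the column is conditioned on."""
    result = {}
    for (own_earlier, other), column in cpt.items():
        result[(own_earlier, other)] = {
            action: prior[own_earlier] * prob for action, prob in column.items()
        }
    return result


# Numbers from the slide, under the reading stated above.
A1 = {"a0": 0.4}
A2_CPT = {
    ("a0", "b0"): {"a2": 0.5, "a3": 0.5},
    ("a0", "b1"): {"a2": 0.25, "a3": 0.75},
}

nonexcl = non_exclusion_probabilities(A1, A2_CPT)
# Each column of the non-exclusion table sums to the earlier entry it extends.
column_sums = {parents: sum(column.values()) for parents, column in nonexcl.items()}
```

Under this reading the products reproduce the slide's 0.2, 0.1, 0.2, 0.3 up to floating-point rounding, and each column sums to 0.4, which is one way to interpret the "+" signs and the additional constraints.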

Calculation of Jacobian. One component of F for each non-exclusion probability. In the Jacobian, each element is again the payoff to one player when he and another deviate. This can be calculated by changing the decision node CPTs (whose entries are then no longer probabilities): zero out entries leading to other outcomes, then run “inference” to find reward node expectations.

What’s next? Implement a MAID algorithm. We have a continuation method constraint function F. Calculation of the Jacobian can be done with standard probabilistic inference in the MAID. Calculation of the retraction operator is linear in the size of the strategy representation, because the constraints are orthogonal and we can project onto each one in turn. Now we just have to implement it.
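The remark about orthogonal constraints rests on a general fact: when linear equality constraints have mutually orthogonal normals, projecting onto each one once, in any order, lands on the projection onto all of them. The sketch below illustrates only that fact; the two constraints are hypothetical, nonnegativity is not handled, and it is not the retraction operator from the talk.

```python
def project_onto_hyperplane(x, normal, offset):
    """Euclidean projection of x onto {y : normal . y = offset}."""
    dot = sum(n * v for n, v in zip(normal, x))
    scale = (dot - offset) / sum(n * n for n in normal)
    return [v - scale * n for v, n in zip(x, normal)]


def project_in_turn(x, constraints):
    """Project onto each constraint once, in order."""
    for normal, offset in constraints:
        x = project_onto_hyperplane(x, normal, offset)
    return x


# Hypothetical constraints with orthogonal normals (they touch disjoint coordinates).
CONSTRAINTS = [
    ([1.0, 1.0, 0.0, 0.0], 1.0),  # first two coordinates sum to 1
    ([0.0, 0.0, 1.0, 1.0], 1.0),  # last two coordinates sum to 1
]

projected = project_in_turn([0.9, 0.5, 0.2, 0.2], CONSTRAINTS)
```

Each projection costs time linear in the vector length, and a later projection does not disturb a constraint satisfied earlier because it moves the point along a direction orthogonal to the earlier normal.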

Conclusions. Applied new methods from economics to structured games. Graphical games: fastest general algorithm; exact. MAIDs: adapted the extensive form theoretical framework to calculations entirely within the MAID. Software available for download.
