Temporal Action-Graph Games: A New Representation for Dynamic Games Albert Xin Jiang University of British Columbia Kevin Leyton-Brown University of British.


1 Temporal Action-Graph Games: A New Representation for Dynamic Games Albert Xin Jiang University of British Columbia Kevin Leyton-Brown University of British Columbia Avi Pfeffer Charles River Analytics

2 Game Theory In One Slide A game: – an interaction between two or more self-interested agents – each agent independently chooses a strategy – each agent derives utility from the resulting strategy profile Strategies: – simultaneous-move games: choosing from a set of actions – dynamic games: choosing actions at multiple points in time; conditioned on observations – can randomize over actions Reasoning about games: – Often involves computation of solution concepts e.g. Nash equilibrium

3 Representations of Games
No utility structure: normal form (simultaneous-move); extensive form (temporal)

4 Representations of Games
No utility structure: normal form (simultaneous-move); extensive form (temporal)
Strict utility independence: Graphical Games [Kearns, Littman & Singh 2001] (simultaneous-move); Multi-agent influence diagrams (MAIDs) [Koller & Milch 2001] (temporal)

5 Representations of Games
No utility structure: normal form (simultaneous-move); extensive form (temporal)
Strict utility independence: Graphical Games [Kearns, Littman & Singh 2001] (simultaneous-move); Multi-agent influence diagrams (MAIDs) [Koller & Milch 2001] (temporal)
Context-specific indep., anonymity: Action-Graph Games (AGGs) [Bhat & Leyton-Brown 2004] [Jiang & Leyton-Brown 2006] (simultaneous-move)

6 Representations of Games
No utility structure: normal form (simultaneous-move); extensive form (temporal)
Strict utility independence: Graphical Games [Kearns, Littman & Singh 2001] (simultaneous-move); Multi-agent influence diagrams (MAIDs) [Koller & Milch 2001] (temporal)
Context-specific indep., anonymity: Action-Graph Games (AGGs) [Bhat & Leyton-Brown 2004] [Jiang & Leyton-Brown 2006] (simultaneous-move); Temporal Action-Graph Games (TAGGs) (temporal)

7 Overview AGGs TAGGs Computation Experiments

8 AGGs played on a set of action nodes each agent chooses an action node from a subset of action nodes for each action node, an action count is tallied utility dependence expressed by the action graph – utility of an agent depends only on the action chosen by the agent and the action counts on the neighbors of the chosen action (the configuration) representation is compact when action graph has small in-degrees
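A minimal sketch of how an AGG utility lookup could work, assuming a toy action graph in which every action node has only itself as a neighbor. The names `neighbors`, `utility` and `agg_utilities`, and the "minus the count" payoff, are illustrative assumptions and not part of the presentation.

```python
from collections import Counter

# Hypothetical action graph: each action node lists the nodes whose counts
# its utility depends on. Here every node has only a self-edge.
neighbors = {"L1": ["L1"], "L2": ["L2"], "L3": ["L3"]}


def utility(action, configuration):
    """Illustrative payoff (an assumption): minus the count on the chosen node."""
    (count,) = configuration
    return -count


def agg_utilities(action_profile):
    """Return each agent's utility, looking only at its chosen action and
    the action counts on that action's neighbors (the configuration)."""
    counts = Counter(action_profile)  # tally the action count of every node
    result = []
    for action in action_profile:
        configuration = tuple(counts[nb] for nb in neighbors[action])
        result.append(utility(action, configuration))
    return result


profile = ["L1", "L1", "L2", "L3", "L1"]
print(agg_utilities(profile))
```

Because the lookup never consults agent identities, two agents who pick the same node always receive the same utility, which is the anonymity property in code form.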

9 Example: simultaneous-move tollbooth game tollbooth with 3 lanes (L1, L2, L3) – 5 cars arrive – cars choose lanes simultaneously – utility depends on number of cars choosing same lane context-specific independence (CSI): different independencies under different contexts (player’s own action) anonymity: utility depends on the numbers of agents taking certain actions, not their identities

10 Example: Dynamic Tollbooth Game tollbooth with 3 lanes – 20 cars arrive in 4 waves of 5 cars each – in each wave, cars choose lanes simultaneously – driver can observe number of cars in each lane – utility depends on number of cars choosing same lane, either before him or at the same time Extending AGGs to multiple time steps: – Action counts accumulate over time – Need to be able to specify agents’ observations – Model uncertainty using chance variables
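The accumulation of action counts over waves can be sketched as follows. This is an illustration under assumptions: lanes are indexed 0, 1, 2, the lane choices are given as input rather than produced by strategies, and the payoff is taken to be minus the accumulated count in the chosen lane. The function name `play_dynamic_tollbooth` is hypothetical.

```python
def play_dynamic_tollbooth(waves, num_lanes=3):
    """Accumulate lane counts wave by wave and score every car.

    waves: list of waves, each a list of lane indices chosen simultaneously.
    Returns the final counts and, per wave, each car's utility.
    """
    counts = [0] * num_lanes
    utilities = []
    for wave in waves:
        # All cars in a wave move at once, so update the counts first.
        for lane in wave:
            counts[lane] += 1
        # Assumed payoff: minus the number of cars that chose the same lane
        # either earlier or in the same wave.
        utilities.append([-counts[lane] for lane in wave])
    return counts, utilities


waves = [[0, 0, 1, 2, 2], [1, 1, 1, 0, 2]]
final_counts, utilities = play_dynamic_tollbooth(waves)
print(final_counts)
print(utilities)
```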

11 Defining TAGGs A TAGG is a tuple – N: set of players – T: duration – set of actions – set of chance variables – set of decisions – set of utility functions

12 TAGGs A decision D – Action set: a subset of – Observation set O[D]: of actions, decisions and chance vars – Set of payoff times pt(D) Utility function – One for each action at each time step – Set of parents – Utility depends only on its parents’ instantiation at time – Evaluated at payoff times of decisions

13 Strategies A behavior strategy at decision D is a mapping from an instantiation of O[D] at time t(D)-1 to a distribution over its action set. A behavior strategy for player i is a tuple consisting of a behavior strategy for each of her decisions Behavior strategy profile: tuple of behavior strategies for all players
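As a rough illustration, a behavior strategy can be written as a function from an observation to a distribution over the action set. The sketch below assumes the observation is the tuple of lane counts from the tollbooth example; `least_crowded_strategy` and `sample_action` are hypothetical names, and the particular strategy is only an example, not one proposed in the presentation.

```python
import random


def least_crowded_strategy(observed_counts):
    """Map an observation (lane counts) to a distribution over lanes:
    uniform over the lanes with the fewest observed cars."""
    fewest = min(observed_counts)
    best = [lane for lane, c in enumerate(observed_counts) if c == fewest]
    return {
        lane: (1.0 / len(best) if lane in best else 0.0)
        for lane in range(len(observed_counts))
    }


def sample_action(strategy, observation, rng):
    """Draw one action from the distribution the strategy prescribes."""
    dist = strategy(observation)
    lanes = list(dist)
    return rng.choices(lanes, weights=[dist[lane] for lane in lanes])[0]


rng = random.Random(0)
print(least_crowded_strategy((2, 0, 0)))
print(sample_action(least_crowded_strategy, (2, 0, 0), rng))
```

A behavior strategy for a player would then be a tuple of such mappings, one per decision, and a profile a tuple of those per-player tuples.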

14 Induced Bayes Net Induced BN of a TAGG given a strategy profile – Formally describes how the TAGG is played out – Decisions, chance variables and utilities correspond to random variables in the BN – Action counts are time-dependent: we have a separate action count variable for each action at each time step – Decision-payoff variable as utility of decision D received at payoff time – Expected utility of a player is the sum of the expected values of her decisions’ decision-payoff variables. – Can similarly define induced MAID of a TAGG

15 Induced BN / MAID: tollbooth example

16 TAGGs and MAIDs Any TAGG is utility-equivalent to its induced MAID – However, induced MAID (and BN) has large in-degree; exponentially larger representation size For the other direction, any MAID can be efficiently encoded as a TAGG with same space complexity

17 Overview AGGs TAGGs Computation Experiments

18 Computing Expected Utility Computing expected utility of a player, given a behavior strategy profile – An essential step in many game-theoretic computations Can be cast as inference problem on the induced BN: compute expected values of the decision-payoff variables – Can apply standard BN inference algorithm – TAGGs have additional structure; can be exploited to speed up computation

19 Exploiting anonymity Induced BN has large in-degrees for action-count variables – Their CPDs are counting functions with causal independence structure [Heckerman & Breese 1996] – Can reduce in-degree by transforming the BN Create nodes representing intermediate counts
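The idea of intermediate counts can be illustrated with a small sketch. It assumes that, given the observations, each agent chooses the action independently with a known probability. The distribution of the total count is then built up one agent at a time, so each intermediate count depends on only two things: the previous partial count and one agent's choice. The function name `count_distribution` is hypothetical.

```python
def count_distribution(probs):
    """Distribution of how many agents choose a given action.

    probs[i] is the (assumed independent) probability that agent i picks
    the action. dist[k] is the probability that exactly k agents pick it.
    """
    dist = [1.0]  # before any agent is considered, the count is 0
    for p in probs:
        # New intermediate count = previous intermediate count + indicator.
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this agent does not pick the action
            nxt[k + 1] += mass * p      # this agent picks the action
        dist = nxt
    return dist


print(count_distribution([0.5, 0.5]))
```

With n agents this takes on the order of n squared additions, whereas a single counting node with n parents would need a table over all 2^n joint choices.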

20 Exploiting Temporal Structure A network satisfies the Markov property if parents of variables at time t are at t or t-1. – Parts of the transformed BN (action counts) satisfy MP – Can transform it to one satisfying MP by duplicating variables Adapt the interface algorithm for Dynamic BNs – interface: set of variables in time t that have children in time t+1 d-separates “past” from “future” – algorithm eliminates variables in the temporal order; keeping distribution over interface variables

21 Tollbooth example interface at t: the action count variables at time t
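A simplified forward pass in the spirit of the interface algorithm, for the tollbooth example, might look like the sketch below. It is not the authors' implementation. It assumes every car in a wave uses the same behavior strategy and observes the accumulated lane counts, and it keeps only a distribution over the interface (the vector of action counts) between time steps. `step`, `forward_pass` and `uniform` are hypothetical names.

```python
from collections import defaultdict


def step(interface_dist, strategy, cars_per_wave):
    """Advance the distribution over lane-count vectors by one time step."""
    new_dist = defaultdict(float)
    for counts, prob in interface_dist.items():
        lane_probs = strategy(counts)  # every car in the wave sees `counts`
        partial = {counts: prob}
        # Add the cars one at a time (intermediate counts within the step).
        for _ in range(cars_per_wave):
            nxt = defaultdict(float)
            for c, p in partial.items():
                for lane, q in enumerate(lane_probs):
                    if q > 0.0:
                        updated = list(c)
                        updated[lane] += 1
                        nxt[tuple(updated)] += p * q
            partial = nxt
        for c, p in partial.items():
            new_dist[c] += p
    return dict(new_dist)


def forward_pass(strategy, num_lanes, cars_per_wave, num_steps):
    """Eliminate variables in temporal order, keeping only the interface."""
    dist = {(0,) * num_lanes: 1.0}
    for _ in range(num_steps):
        dist = step(dist, strategy, cars_per_wave)
    return dist


def uniform(counts):
    """Illustrative strategy: ignore the observation, pick a lane uniformly."""
    return [1.0 / len(counts)] * len(counts)


final = forward_pass(uniform, num_lanes=2, cars_per_wave=2, num_steps=2)
print(final)
```

The memory used at any time is proportional to the number of reachable count vectors, not to the number of joint action histories, which is why a bounded interface matters for tractability.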

22 Computation, Ctd Further exploiting the structure of transformed BN within the same time step We can exploit CSI for further speedup Our algorithm computes EU in poly time if – the number of interface variables at each time is bounded – inference over the chance variables can be done efficiently Our methods can be applied to speed up computation of Nash equilibria

23 Overview AGGs TAGGs Computation Experiments

24 Compute EU for tollbooth games (3 lanes) Approach 1: standard clique tree algorithm on induced BN Approach 2: same clique tree algorithm on transformed BN Approach 3: our algorithm

25 Experiments: Tollbooth (5 cars per time step, varying T)

26 Experiments: Tollbooth (T=3, varying # cars per time step)

27 Conclusions Temporal Action-Graph Games (TAGGs) – novel compact representation for dynamic games – extends AGGs to dynamic setting – compactly express wider variety of utility structure, including CSI and anonymity – exploit such structure for efficient computation of expected utility and Nash equilibrium

30 Game Theory In One Slide A game: – an interaction between two or more self-interested agents – each agent independently chooses a strategy – each agent derives utility from the resulting strategy profile Strategies: – simultaneous-move games: choosing from a set of actions – dynamic games: choosing actions at multiple points in time; conditioned on observations – can randomize over actions Best Response: – I play a strategy that maximizes my own utility, given a particular strategy profile for the other agents Nash Equilibrium: – a strategy profile with the property that every agent’s strategy is a best response to the strategies of the others
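The best-response and Nash equilibrium definitions above can be checked mechanically for pure strategies in a small game. The sketch below assumes a three-car, three-lane simultaneous tollbooth with payoff equal to minus the number of cars in the chosen lane. The payoff form and the names `is_pure_nash` and `lane_utility` are illustrative assumptions.

```python
from itertools import product


def is_pure_nash(profile, action_sets, utility):
    """True if no single agent can strictly gain by deviating unilaterally."""
    for i, actions in enumerate(action_sets):
        current = utility(i, profile)
        for a in actions:
            deviation = profile[:i] + (a,) + profile[i + 1:]
            if utility(i, deviation) > current:
                return False  # agent i is not playing a best response
    return True


def lane_utility(i, profile):
    """Assumed payoff: minus the number of cars in agent i's lane."""
    return -profile.count(profile[i])


lanes = (0, 1, 2)
action_sets = [lanes, lanes, lanes]  # three cars, three lanes

equilibria = [
    p for p in product(*action_sets)
    if is_pure_nash(p, action_sets, lane_utility)
]
print(equilibria)
```

In this toy game the pure equilibria are exactly the profiles in which each car has a lane to itself.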

31 Representations of Games Goal: modeling real-world systems Standard Representations – normal form for simultaneous-move games: specify utilities for every action profile – extensive form (game trees) for dynamic games: specify utilities for every possible sequence of actions – such representations ignore structure in the games’ utilities – sizes grow exponentially in parameters of the games Compact Representations – simultaneous-move: Graphical Games [Kearns, Littman & Singh 2001], Action-Graph Games (AGGs) [Bhat & Leyton-Brown 2004][Jiang & Leyton-Brown 2006] – dynamic: Multi-Agent Influence Diagrams (MAIDs) [Koller & Milch 2001]
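To make the exponential growth concrete, here is a small worked calculation for the tollbooth examples. The normal-form and game-tree counts follow directly from the definitions on this slide. The AGG count assumes each lane node has only a self-edge, so its utility table has one entry per possible count from 1 to n. The function names are hypothetical.

```python
def normal_form_size(num_players, num_actions):
    """One utility per player for every action profile."""
    return num_players * num_actions ** num_players


def game_tree_leaves(num_movers, num_actions):
    """Leaves of a game tree in which each mover picks one of the actions."""
    return num_actions ** num_movers


def agg_size_self_edges_only(num_players, num_actions):
    """Assumed AGG structure: each action node depends only on its own
    count, which ranges over 1..num_players for an agent who chose it."""
    return num_actions * num_players


# Simultaneous-move tollbooth: 5 cars, 3 lanes.
print(normal_form_size(5, 3))
print(agg_size_self_edges_only(5, 3))

# Dynamic tollbooth: 20 cars in total, 3 lanes each.
print(game_tree_leaves(20, 3))
```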

32 Representing Dynamic Games Multi-Agent Influence Diagrams (MAIDs): directed acyclic graph over – Chance nodes – Decision nodes: a player chooses an action, observing instantiation of parents – Utility nodes: utility for a player, function of instantiation of parents MAIDs are compact when utility functions exhibit strict independencies – Does not capture other kinds of structure

33 Temporal Action-Graph Games TAGGs: compact representation for dynamic games – Extension of AGGs to dynamic setting – Can efficiently encode arbitrary MAIDs – Can compactly represent dynamic games with CSI and/or anonymity structure – Efficient computation

34 TAGGs at a high level Played over a series of time steps 1 … T – At each time step, an AGG is played by a set of players, on a set of action nodes – Action counts on the action nodes are accumulated over time Model uncertainty using chance variables Decisions: player at each decision may observe a set of previous decisions, chance variables and action counts Utility functions: action-specific and time-specific

35 TAGG: tollbooth example N corresponds to the cars T=4 One action node for each lane At each time step, we have 5 decisions – Action set is the entire set – Payoff time same as decision time: pt(D) = {t(D)} Utility function has A as its only parent Size of representation
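One way to picture the decisions of the tollbooth TAGG is as a list of records, one per car. This is only a data-structure sketch: the `Decision` class, its field names, and the choice to let each car observe all lane counts are assumptions made for illustration, not the authors' encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    player: int          # which car makes this decision
    time: int            # time step t(D) at which the decision is made
    action_set: tuple    # action nodes available (here: every lane)
    observations: tuple  # what the player may observe (assumed: lane counts)
    payoff_times: tuple  # pt(D); here equal to {t(D)}


def build_tollbooth_decisions(num_lanes=3, num_steps=4, cars_per_step=5):
    """Create one decision per car: cars_per_step decisions at each step."""
    lanes = tuple(f"lane{k}" for k in range(1, num_lanes + 1))
    decisions = []
    for t in range(1, num_steps + 1):
        for j in range(cars_per_step):
            player = (t - 1) * cars_per_step + j
            decisions.append(Decision(player, t, lanes, lanes, (t,)))
    return decisions


decisions = build_tollbooth_decisions()
print(len(decisions))
print(decisions[0])
```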

