Graphical Models. Michael Kearns, Michael L. Littman, Satinder Singh. Presenter: Shay Cohen.

So far we have seen … Players' payoffs and the games are represented in tabular form. n agents with 2 actions: n matrices of exponential size, 2^n entries each. Needed: more compact representations and algorithms for manipulating them.

Graphical models (not formal): an n-player game is given by an undirected graph with n vertices and n matrices. A player's payoff is determined only by its neighbors: “local games” compose the “global game”.

Examples: games with geographical aspects involved (salespersons); topology of computer networks with a limited set of neighbors; … and so on.

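Reminder … n-player two-action game: n matrices M_i of size 2^n; M_i(x) specifies the payoff to player i for pure strategy x, and M_i(p) the expected payoff under mixed strategy p. Nash equilibrium: a mixed strategy p with M_i(p) ≥ M_i(p[i : p'_i]) (for all i and for all p'_i), where p[i : p'_i] is p with player i's strategy replaced by p'_i. ε-Nash equilibrium: M_i(p) + ε ≥ M_i(p[i : p'_i]).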
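The equilibrium condition above can be checked numerically for small tabular games. The following is a minimal sketch, not taken from the slides: the names `expected_payoff` and `is_eps_nash` are hypothetical, each payoff table is assumed to be a dict keyed by joint pure actions, and p[j] is assumed to be player j's probability of playing action 1. Because the expected payoff is linear in a player's own mixed strategy, it is enough to test the two pure deviations.

```python
from itertools import product

def expected_payoff(M_i, p):
    """Expected payoff M_i(p); p[j] is player j's probability of action 1 (assumed convention)."""
    total = 0.0
    for x in product((0, 1), repeat=len(p)):
        prob = 1.0
        for x_j, p_j in zip(x, p):
            prob *= p_j if x_j == 1 else 1.0 - p_j
        total += prob * M_i[x]
    return total

def is_eps_nash(M, p, eps=0.0):
    """True if no player gains more than eps by deviating unilaterally."""
    for i, M_i in enumerate(M):
        current = expected_payoff(M_i, p)
        for pure in (0.0, 1.0):  # pure deviations suffice by linearity
            deviation = list(p)
            deviation[i] = pure
            if expected_payoff(M_i, deviation) > current + eps + 1e-12:
                return False
    return True

# Illustrative game (matching pennies): player 0 wants to match, player 1 to mismatch.
M0 = {x: (1.0 if x[0] == x[1] else -1.0) for x in product((0, 1), repeat=2)}
M1 = {x: -val for x, val in M0.items()}
print(is_eps_nash([M0, M1], (0.5, 0.5)))  # uniform mixing
print(is_eps_nash([M0, M1], (1.0, 1.0)))  # both players play action 1
```

The loop over all 2^n joint actions is exactly the exponential cost that the tabular form incurs.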

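Graphical Games. Graphical game: (G,M). G is an undirected graph on n vertices. M is a set of n matrices M_i, each representing the payoff of player i as a function of its own action and its neighbors' actions. Size of M_i is 2^k when the neighborhood of i (including i itself) has k vertices.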
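To make the saving concrete, the sketch below counts payoff entries in both representations. The function names and the path-shaped example graph are illustrative assumptions, and neighborhood sizes are assumed to include the player itself.

```python
def tabular_size(n: int) -> int:
    """Total payoff entries for n players with 2 actions each: n tables of 2**n entries."""
    return n * 2 ** n

def graphical_size(neighborhood_sizes: list[int]) -> int:
    """Total entries when player i's table covers only its neighborhood of size k_i."""
    return sum(2 ** k for k in neighborhood_sizes)

# A path of 20 players: endpoints have neighborhoods of size 2, interior players of size 3.
n = 20
path_neighborhoods = [2] + [3] * (n - 2) + [2]
print(tabular_size(n), graphical_size(path_neighborhoods))
```

For bounded neighborhood size the graphical representation grows linearly in n, while the tabular one grows exponentially.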

Algorithm TreeNash. Works in two passes: the downstream pass and the upstream pass. Downstream: passes indicator tables (with witnesses) from the leaves to the root. Upstream: selects witnesses from the root to the leaves (see the attached appendix).

TreeNash – more details. Downstream: a parent U sends to its child V a binary-valued table T(v,u) s.t. T(v,u)=1 ⇔ there is a NE for the subgame upstream of U (with V fixed to v) in which U=u (v, u are mixed strategies). Upstream: a child V is set to V=v, and each of its parents U_i is then set to a witness value u_i with T(v,u_i)=1.

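Downstream in general (W: child, V: current node, U_1,…,U_k: parents; b.r.: best response): T(w,v)=1 ⇔ for some u_1,…,u_k: 1. T(v,u_i)=1 for all i; 2. V=v is a b.r. to U_1=u_1,…,U_k=u_k, W=w.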

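How? - Downstream (example tree: leaves U and V are the parents of W, and W is the parent of the root Z; b.r.: best response). T(w,u)=1 ⇔ u is a b.r. to w. T(w,v)=1 ⇔ v is a b.r. to w. T(z,w)=1 ⇔ for some (u,v): 1. T(w,u)=1 and T(w,v)=1; 2. W=w is a b.r. to U=u, V=v, Z=z. T(z)=1 ⇔ for some w: 1. T(z,w)=1; 2. Z=z is a b.r. to W=w.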

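How? – Upstream (same example tree: U and V are the parents of W, and W is the parent of the root Z). Choose Z=z, W=w s.t. T(z,w)=1. Then choose U=u, V=v s.t. T(w,u)=1 and T(w,v)=1, using the witnesses recorded for T(z,w)=1.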

TreeNash. Theorem: TreeNash computes a Nash equilibrium for the tree game (G,M). Non-deterministic choices: select all of them, and all NE will be found. But the tables are continuous … How do we compute them?

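Approximate TreeNash. Tables will be of finite size: each mixed strategy is restricted to a grid of resolution τ on [0,1]. All computations of best responses become computations of ε-best responses on the grid. Each table has on the order of (1/τ)^2 entries, therefore the running time per table is (1/τ)^{O(k)} for a vertex with k parents.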
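The grid version of the two passes can be sketched in a few dozen lines. This is an illustrative sketch under stated assumptions rather than the presenter's implementation: the function names, the `tau_steps` parameter (number of grid steps, so τ = 1/tau_steps), and the payoff signature `payoff[V](a, x)` (own pure action a, then a tuple x of neighbors' pure actions ordered parents first, child last) are all hypothetical. Strategies are stored as the probability of action 1, and each table maps (child value, own value) to one witness tuple of parent values.

```python
from itertools import product

def action_values(payoff, neighbor_probs):
    """Expected payoff of pure actions 0 and 1 given neighbors' P(action 1)."""
    vals = [0.0, 0.0]
    for x in product((0, 1), repeat=len(neighbor_probs)):
        pr = 1.0
        for x_j, q in zip(x, neighbor_probs):
            pr *= q if x_j else 1.0 - q
        for a in (0, 1):
            vals[a] += pr * payoff(a, x)
    return vals

def is_eps_br(payoff, v, neighbor_probs, eps):
    """True if mixing with P(action 1) = v is an eps-best response."""
    e0, e1 = action_values(payoff, neighbor_probs)
    return v * e1 + (1 - v) * e0 >= max(e0, e1) - eps - 1e-12

def approx_tree_nash(order, parents, child, payoff, tau_steps, eps):
    """order lists vertices leaves first, root last; returns {vertex: P(action 1)} or None."""
    grid = [i / tau_steps for i in range(tau_steps + 1)]
    tables = {}
    for V in order:  # downstream pass
        down = grid if child[V] is not None else [None]
        T = {}
        for w, v in product(down, grid):
            for us in product(grid, repeat=len(parents[V])):
                if any((v, u) not in tables[U] for U, u in zip(parents[V], us)):
                    continue  # some parent cannot play u when V plays v
                nbrs = list(us) + ([w] if w is not None else [])
                if is_eps_br(payoff[V], v, nbrs, eps):
                    T[(w, v)] = us  # keep one witness per table entry
                    break
        tables[V] = T
    root = order[-1]
    if not tables[root]:
        return None
    _, v_root = next(iter(tables[root]))
    assignment, stack = {}, [(root, None, v_root)]
    while stack:  # upstream pass: follow the stored witnesses
        V, w, v = stack.pop()
        assignment[V] = v
        for U, u in zip(parents[V], tables[V][(w, v)]):
            stack.append((U, v, u))
    return assignment

# Two-vertex tree: leaf U wants to match root Z, and Z wants to mismatch U.
order = ["U", "Z"]
parents = {"U": [], "Z": ["U"]}
child = {"U": "Z", "Z": None}
payoff = {
    "U": lambda a, x: 1.0 if a == x[0] else 0.0,
    "Z": lambda a, x: 1.0 if a != x[0] else 0.0,
}
print(approx_tree_nash(order, parents, child, payoff, tau_steps=2, eps=0.0))
```

The innermost loop enumerates every grid setting of the k parents, which is where the exponential dependence on k comes from. Keeping only one witness per entry finds a single equilibrium; storing all witnesses would be needed to enumerate every grid equilibrium.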

Approximate TreeNash (2). Lemma: Let p be a NE for (G,M) and let q be the nearest mixed strategy on the τ-grid. Then, provided τ is sufficiently small, q is an approximate NE for (G,M), with an approximation error controlled by τ.

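Approximate TreeNash (3). Theorem: For any ε>0, if the grid resolution τ is chosen sufficiently small (as a function of ε and the number of parents), then ApproximateTreeNash computes an ε-NE for the tree game (G,M).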

Exact TreeNash. Tables will be made of finite unions of rectangles. Each table T(v,u) will be represented by a v-list: a partition of [0,1] into v-intervals, where for each v-interval there is a subset of [0,1] made of disjoint u-intervals on which T(v,u)=1.
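One possible concrete encoding of such a v-list is a list of rows, each pairing a closed v-interval with its disjoint closed u-intervals. The sketch below is an assumption about the data layout, not the presenter's code; the example table is for a hypothetical leaf U whose payoff is 1 when it matches its child V, with v and u read as probabilities of action 1.

```python
# Each row: ((v_lo, v_hi), [(u_lo, u_hi), ...]); a rectangle is v-interval x u-interval.
# Leaf U matching its child V: best response is u=0 for v<0.5, anything at v=0.5, u=1 for v>0.5.
matching_leaf = [
    ((0.0, 0.5), [(0.0, 0.0)]),
    ((0.5, 0.5), [(0.0, 1.0)]),
    ((0.5, 1.0), [(1.0, 1.0)]),
]

def table_value(v_list, v, u):
    """Return T(v,u): 1 if (v,u) lies in some stored rectangle, else 0."""
    for (v_lo, v_hi), u_intervals in v_list:
        if v_lo <= v <= v_hi and any(lo <= u <= hi for lo, hi in u_intervals):
            return 1
    return 0

print(table_value(matching_leaf, 0.2, 0.0))  # U plays action 0 against a V that mostly plays 0
print(table_value(matching_leaf, 0.5, 0.3))  # U is indifferent when V mixes uniformly
print(table_value(matching_leaf, 0.2, 0.3))  # mixing is not a best response here
```

A linear scan keeps the lookup simple and handles the degenerate interval at v=0.5; a sorted breakpoint list with binary search would be the natural refinement for long v-lists.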

Exact TreeNash (2). Assume the tables T(v,u_i) of all parents of V share a common v-list (by merging). Downstream: how do we find T(w,v) using them, and keep such a representation by rectangles?

Exact TreeNash (3). Fix a v-interval and, for each parent, one of the intervals associated with that v-interval. The region where T(w,v)=1 is then of the form W × I. Why? What is the region W for which some v in the interval is a b.r. to u, w?

Exact TreeNash (4) Denote expected payoff of V Lemma: If then W is either empty, a continuous interval in [0,1] or union of two intervals.

Exact TreeNash (5). It can be shown that the tables of the leaves can be represented using at most 3 rectangles. Therefore the representation can be maintained, and its size is exponential in the number of vertices. Witnesses can be found easily, because the representation is finite.

ExactTreeNash. Theorem: ExactTreeNash computes a Nash equilibrium for the tree game (G,M). The algorithm runs in time exponential in the number of vertices of G.

Polynomial algorithm. Uses a downstream pass and an upstream pass as well. Passes breakpoint policies (W child of V): Interpretation (“b.p. for V”):

How? - Downstream. Denote: - the ordered set of breakpoints of V's parents; - the set of values that W can play that allow V to play any strategy, given; - the set of values W can play such that, when V's parents play according to V=b, V=b is a best response.

How? - Downstream (continued). Lemma: each such set is either empty, a single interval, or the union of two intervals. Lemma: Construct the policy for V by covering [0,1] with them; this produces a set of at most 2+l breakpoints. How do we start at the leaves?
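The covering step can be illustrated with a generic greedy interval cover. This is a sketch of the general idea only and not necessarily the procedure used in the talk: the function name, the (lo, hi, label) input format, and the use of labels to stand for "the setting of V under which this w-interval is valid" are all assumptions. A set made of two intervals would simply be passed as two entries with the same label.

```python
def cover_unit_interval(labelled):
    """Greedily cover [0,1] with closed intervals given as (lo, hi, label).

    Returns consecutive pieces (start, end, label), or None if a gap remains.
    """
    pieces, pos = [], 0.0
    while True:
        reach = [(hi, label) for lo, hi, label in labelled if lo <= pos <= hi]
        if not reach:
            return None  # nothing covers the current position
        hi, label = max(reach, key=lambda c: c[0])  # extend as far right as possible
        pieces.append((pos, hi, label))
        if hi >= 1.0:
            return pieces
        if hi <= pos:
            return None  # cannot make progress past pos
        pos = hi

intervals = [(0.0, 0.4, "a"), (0.3, 0.7, "b"), (0.6, 1.0, "c"), (0.1, 0.2, "d")]
pieces = cover_unit_interval(intervals)
breakpoints = [end for _, end, _ in pieces[:-1]]
print(pieces)
print(breakpoints)
```

The interior endpoints of the chosen pieces play the role of the breakpoints of the resulting policy: between two consecutive breakpoints, the label says which setting to use.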

How? - Upstream. Add a dummy root with constant payoff and no influence on the real root. Once we select a value for the child, the values for the parents are determined according to the policies.

Running time Sorting and computing new breakpoint policy: (t – number of breakpoints) Number of breakpoints is bounded by 2n, therefore total running time:

Summary. The first framework gave us: 1. finding an approximate NE in polynomial time for graphical games that are trees; 2. finding NE for trees in exponential time (with a representation of all of the NEs). The second algorithm: finding a NE in polynomial time.
