Exploiting Graphical Structure in Decision-Making
Ben Van Roy
Stanford University
Overview
- Graphical models in decision-making
  - Singly-connected graphs: efficient computation
  - General decision problems: intractable
  - Sparsity: reduction in computation?
- Sequential decision-making
  - Curse of dimensionality
  - Sparsity or other graphical structure: reduced computational requirements?
  - Structured value functions and/or policies?
- Preliminary results and research directions
Graphical Models in Inference
- Conditional independencies simplify inference
- Singly-connected graphs (trees)
- General sparse graphs
  - Preprocessing
  - Approximations?
[Figure: graph over nodes x1, x2, x3, x4]
Graphical Models in Decision-Making
- Deterministic dynamic programming
- Nonserial dynamic programming
- General sparse graphs
  - Preprocessing
  - Approximations?
[Figure: graph over nodes x1, x2, x3, x4]
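As a minimal sketch of the deterministic dynamic programming idea on the simplest singly-connected graph, the code below maximizes a sum of pairwise terms along a chain by eliminating one variable at a time. The function names (`chain_max`, `brute_force_max`), the pairwise form of the objective, and the example terms are assumptions made for illustration; the slides do not specify them. Nonserial dynamic programming applies the same elimination step to more general sparse graphs.

```python
from itertools import product


def chain_max(factors, domain):
    """Maximize sum_i factors[i](x_i, x_{i+1}) over a chain x_1, ..., x_n.

    Variables are eliminated left to right, so the cost grows as
    len(factors) * len(domain)**2 rather than len(domain)**n.
    """
    # message[v]: best value of the terms eliminated so far,
    # given that the current variable takes the value v.
    message = {v: 0.0 for v in domain}
    for f in factors:
        message = {
            w: max(message[v] + f(v, w) for v in domain)
            for w in domain
        }
    return max(message.values())


def brute_force_max(factors, domain):
    """Reference answer by enumerating every joint assignment."""
    n = len(factors) + 1
    return max(
        sum(f(x[i], x[i + 1]) for i, f in enumerate(factors))
        for x in product(domain, repeat=n)
    )


# Hypothetical pairwise terms linking x1-x2, x2-x3, and x3-x4.
example_factors = [
    lambda a, b: a * b - a,
    lambda a, b: -(a - b) ** 2,
    lambda a, b: a + 2 * b,
]
example_domain = range(3)
```

Calling `chain_max(example_factors, example_domain)` and `brute_force_max(example_factors, example_domain)` should agree; the brute-force version is only there as a check and becomes impractical as the chain grows.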
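Sequential Decision-Making
[Figure: block diagram of the system and the strategy, linked by the decision u(t) and the state x(t)]
- Bellman’s equation: $J^*(x(t)) = \max_{u(t)} E\left[\, g(x(t), u(t)) + \alpha J^*(x(t+1)) \mid x(t) \,\right]$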
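One standard way to solve Bellman's equation when the state space is small enough to enumerate is value iteration. The sketch below is illustrative only: it reads the coefficient in front of J*(x(t+1)) as a discount factor `alpha` in (0, 1), and the function name, the table layout (`P[u][x][y]`, `g[x][u]`), and the two-state example are assumptions rather than anything given in the slides.

```python
def value_iteration(P, g, alpha, tol=1e-10, max_iter=10_000):
    """Iterate J <- max_u ( g(x, u) + alpha * sum_y P(y | x, u) J(y) ).

    P[u][x][y] is the probability of moving from state x to state y
    under decision u, and g[x][u] is the one-step reward.
    Returns the approximate value function and a greedy policy.
    """
    n_states = len(g)
    n_actions = len(g[0])
    J = [0.0] * n_states
    for _ in range(max_iter):
        # Q[x][u]: value of taking decision u in state x, then following J.
        Q = [
            [
                g[x][u] + alpha * sum(P[u][x][y] * J[y] for y in range(n_states))
                for u in range(n_actions)
            ]
            for x in range(n_states)
        ]
        J_new = [max(row) for row in Q]
        diff = max(abs(a - b) for a, b in zip(J_new, J))
        J = J_new
        if diff < tol:
            break
    policy = [max(range(n_actions), key=lambda u: Q[x][u]) for x in range(n_states)]
    return J, policy


# Hypothetical two-state example: decision 0 stays put, decision 1 switches
# state, and a reward of 1 is earned only by staying in state 1.
P_example = [
    [[1.0, 0.0], [0.0, 1.0]],  # decision 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # decision 1: switch
]
g_example = [[0.0, 0.0], [1.0, 0.0]]
J_example, policy_example = value_iteration(P_example, g_example, alpha=0.5)
```

The table `J` holds one entry per state, which is exactly what becomes infeasible once the state is a vector of many variables.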
The Curse of Dimensionality
- The number of states is exponential in the number of variables
- The value function encodes one value per state
  - Storage is intractable
  - Computation is intractable
- Research objective: exploit sparsity and other special graphical structure to reduce the computational requirements of sequential decision problems
Dynamic Bayesian Networks
[Figure: dynamic Bayesian network with nodes x1(t), ..., x4(t) and x1(t+1), ..., x4(t+1)]
Example: Multiclass Queueing Networks
Can We Exploit Proximity?
- Idea: variables that are “far” from each other don’t interact much
- Does this allow us to decompose the problem?
Yes…
- The value function decomposes
- N(i) = a neighborhood, i.e., a set of nodes within some “distance” of i
- Complexity: $O(n^d)$ reduces to $O(d\,n^N)$
- …but there’s a problem here…
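To give a feel for the scale of the reduction, the sketch below counts table entries under one reading of the complexity expressions: n values per variable, d variables, and neighborhoods of N variables each, so that a full table has n^d entries and a decomposed representation has d tables of n^N entries. That reading, along with the function names, is an assumption; the slide does not define the symbols.

```python
def full_table_size(n_values, n_vars):
    """Entries needed to store one value per joint state: n^d."""
    return n_values ** n_vars


def decomposed_table_size(n_values, n_vars, neighborhood_size):
    """Entries needed for one local table per variable: d * n^N."""
    return n_vars * n_values ** neighborhood_size


# Hypothetical sizes: 30 binary variables, neighborhoods of 5 variables.
full = full_table_size(2, 30)
local = decomposed_table_size(2, 30, 5)
```

With these numbers the full table needs 2^30 entries while the local tables need 30 * 2^5, and only the latter grows linearly as variables are added.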
Optimal Decisions Depend on Global State Information
[Figure: dynamic Bayesian network over x1, ..., x4 at times t and t+1, with decision u1(t+1)]
Things Still Work Out…
- Conjecture: if decision $u_i$ influences only $x_i$, then near-optimal decisions can be made based only on variables “near” $x_i$
- Consequence:
[Figure: dynamic Bayesian network over x1, ..., x4 at times t and t+1, with decision u1(t+1)]
The Underlying Problem
[Figure: graph over nodes x1, ..., x7]
- Which $f_{ij}$’s do I need to know to choose a near-optimal $u_k$ (without coordination)?
A Simple Case
[Figure: graph over nodes x1, ..., x6]
- Let N(i) = nodes within r steps
- Result: loss of optimality ~ O(1/r)
- Note: the amount of information required is independent of the graph size
(Rusmevichientong and Van Roy, 2000)
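The neighborhood N(i) of nodes within r steps can be computed with a breadth-first search that stops at depth r. The sketch below is a minimal illustration; the adjacency-dictionary representation, the function names, and the use of a line graph as the example are assumptions, not details taken from the slides.

```python
from collections import deque


def neighborhood(adj, i, r):
    """Return the set of nodes within r steps of node i, including i itself.

    adj maps each node to an iterable of its neighbors.
    """
    dist = {i: 0}
    queue = deque([i])
    while queue:
        v = queue.popleft()
        if dist[v] == r:
            continue  # do not expand beyond radius r
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)


def line_graph(n):
    """Hypothetical example graph: nodes 0, ..., n-1 connected in a line."""
    return {k: [j for j in (k - 1, k + 1) if 0 <= j < n] for k in range(n)}
```

On a line graph an interior node has at most 2r + 1 nodes in its neighborhood no matter how long the line is, which matches the note that the information required does not depend on the graph size.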
Future Work
- Extending this result to general graphs
- Exploring practical implications
- Expected practical utility: reduction of complexity in approximation algorithms
  - Problem is no longer $O(n^d)$
  - May instead be $O(d\,n^r)$
  - Still computationally prohibitive, but not exponential in problem size
- Simplification of decision-supporting information?
More Future Work
- Current work exploits proximity
- Many graphs arising in practical problems possess additional special structure (e.g., symmetries, multiple “layers” of relationships)
- Can we also exploit such structure? (e.g., are there sometimes appropriate hierarchical representations?)