CS774. Markov Random Field: Theory and Application
Lecture 04
Kyomin Jung, KAIST
Sep
Basic Idea of Belief Propagation (BP)
(Figure: a tree with nodes i, j, k, …)
For each node j, consider the marginal probability of the MRF on the subtree rooted at j, likewise for each child k of j, and so on.
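Belief Propagation (BP)
(Figure: neighboring nodes i, j, k exchanging messages)
Message from i to j at time t+1, in standard sum-product notation for a pairwise MRF with node potentials $\phi_i$ and edge potentials $\psi_{ij}$:
$m_{i \to j}^{t+1}(x_j) \propto \sum_{x_i} \phi_i(x_i)\, \psi_{ij}(x_i, x_j) \prod_{k \in N(i) \setminus \{j\}} m_{k \to i}^{t}(x_i)$
Belief at node i at time t:
$b_i^{t}(x_i) \propto \phi_i(x_i) \prod_{j \in N(i)} m_{j \to i}^{t}(x_i)$
Here $N(i)$ is the set of neighbors of i.
On a tree with n nodes, for t > n the messages no longer change, and $b_i^{t}(x_i)$ equals the exact marginal probability of $x_i$.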
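The update and belief computation can be sketched in a few lines of Python. This is a minimal sketch, assuming a pairwise MRF with node potentials `phi` and edge potentials `psi` and the standard sum-product update; the function names `bp_beliefs` and `exact_marginals`, the example tree, and the random potentials are all this sketch's own choices, not taken from the lecture. It compares the BP beliefs on a small tree with marginals computed by brute-force enumeration.

```python
import itertools
import numpy as np


def bp_beliefs(n, edges, phi, psi, iters):
    """Synchronous sum-product BP on a pairwise MRF; returns an (n, q) belief array.

    phi[i] is the potential of node i; psi[(i, j)][a][b] is the potential of
    edge (i, j) at x_i = a, x_j = b. (Hypothetical interface for this sketch.)
    """
    phi = np.asarray(phi, dtype=float)
    q = phi.shape[1]
    nbrs = {i: [] for i in range(n)}
    pot = {}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
        pot[(i, j)] = np.asarray(psi[(i, j)], dtype=float)
        pot[(j, i)] = pot[(i, j)].T
    # msgs[(i, j)] is the message from i to j, a function of x_j.
    msgs = {(i, j): np.full(q, 1.0 / q) for i in range(n) for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for i, j in msgs:
            h = phi[i].copy()
            for k in nbrs[i]:
                if k != j:  # all neighbors of i except the recipient j
                    h *= msgs[(k, i)]
            m = pot[(i, j)].T @ h  # sum over x_i
            new[(i, j)] = m / m.sum()
        msgs = new
    beliefs = phi.copy()
    for i in range(n):
        for k in nbrs[i]:
            beliefs[i] *= msgs[(k, i)]
    return beliefs / beliefs.sum(axis=1, keepdims=True)


def exact_marginals(n, edges, phi, psi):
    """Marginals by enumerating all q**n configurations (tiny graphs only)."""
    q = len(phi[0])
    marg = np.zeros((n, q))
    for x in itertools.product(range(q), repeat=n):
        w = 1.0
        for i in range(n):
            w *= phi[i][x[i]]
        for i, j in edges:
            w *= psi[(i, j)][x[i]][x[j]]
        for i in range(n):
            marg[i, x[i]] += w
    return marg / marg.sum(axis=1, keepdims=True)


# Example: a 5-node tree with arbitrary positive potentials.
rng = np.random.default_rng(0)
n = 5
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
phi = rng.uniform(0.5, 2.0, size=(n, 2))
psi = {e: rng.uniform(0.5, 2.0, size=(2, 2)) for e in edges}
bp = bp_beliefs(n, edges, phi, psi, iters=n)
exact = exact_marginals(n, edges, phi, psi)
print(np.abs(bp - exact).max())
```

On a tree, n synchronous iterations are more than the diameter, so the printed difference should be at the level of floating-point rounding.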
Properties of BP (and MP)
Exact for trees
- Removing a node separates the graph into disjoint components
- On a tree, the BP algorithm converges in time proportional to the diameter of the graph, which is at most linear in the number of nodes
For general graphs
- Exact inference is NP-hard
- Constant-factor approximate inference is also hard
Loopy Belief Propagation: Approaches for general graphs
Exact Inference
- Computation tree based approach (for graphs with large girth)
- Junction Tree algorithm (for graphs of bounded tree width)
- Graph cut algorithm (for submodular MRFs)
Approximate Inference
- Loopy BP
- Sampling based algorithms
- Graph decomposition based approximation
Loopy Belief Propagation
- If BP is used on graphs with loops, messages may circulate indefinitely
- Empirically, a good approximation is still achievable
  - Stop after a fixed number of iterations
  - Stop when there is no significant change in beliefs
- If the solution does not oscillate but converges, it is usually a good approximation
- Example: LDPC codes
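The two stopping rules can be illustrated on a single 4-cycle. This is a minimal sketch, assuming the standard sum-product update on a pairwise MRF with one symmetric edge potential shared by all edges; the names `loopy_bp`, `exact_marginals`, `max_iters`, and `tol`, and the specific potentials, are hypothetical choices made for the example.

```python
import itertools
import numpy as np


def loopy_bp(n, edges, phi, psi, max_iters=100, tol=1e-9):
    """Loopy sum-product BP; returns (beliefs, iterations used, converged flag)."""
    phi = np.asarray(phi, dtype=float)
    psi = np.asarray(psi, dtype=float)  # assumed symmetric, shared by all edges
    q = phi.shape[1]
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    msgs = {(i, j): np.full(q, 1.0 / q) for i in range(n) for j in nbrs[i]}

    def beliefs_from(m):
        b = phi.copy()
        for i in range(n):
            for k in nbrs[i]:
                b[i] *= m[(k, i)]
        return b / b.sum(axis=1, keepdims=True)

    beliefs = beliefs_from(msgs)
    for t in range(1, max_iters + 1):
        new = {}
        for i, j in msgs:
            h = phi[i].copy()
            for k in nbrs[i]:
                if k != j:
                    h *= msgs[(k, i)]
            m = psi.T @ h  # sum over x_i
            new[(i, j)] = m / m.sum()
        msgs = new
        new_beliefs = beliefs_from(msgs)
        change = np.abs(new_beliefs - beliefs).max()
        beliefs = new_beliefs
        if change < tol:  # rule 2: no significant change in beliefs
            return beliefs, t, True
    return beliefs, max_iters, False  # rule 1: fixed iteration budget used up


def exact_marginals(n, edges, phi, psi):
    """Marginals by brute-force enumeration (tiny graphs only)."""
    q = len(phi[0])
    marg = np.zeros((n, q))
    for x in itertools.product(range(q), repeat=n):
        w = 1.0
        for i in range(n):
            w *= phi[i][x[i]]
        for i, j in edges:
            w *= psi[x[i]][x[j]]
        for i in range(n):
            marg[i, x[i]] += w
    return marg / marg.sum(axis=1, keepdims=True)


# A 4-cycle with an attractive coupling and a bias on node 0 only.
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
phi = np.array([[2.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]])
psi = np.array([[2.0, 1.0], [1.0, 2.0]])
beliefs, iters_used, converged = loopy_bp(n, edges, phi, psi)
exact = exact_marginals(n, edges, phi, psi)
err = np.abs(beliefs - exact).max()
print(converged, iters_used, err)
```

In this example the beliefs settle quickly and are close to, but not equal to, the exact marginals: the bias at node 0 travels around the loop and is counted again, which makes the belief at node 0 slightly overconfident.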
Fixed point of BP
- The messages of BP at time t form a finite-dimensional real vector. Let M(t) be this vector.
- Normalizing the messages does not change the output of BP (the marginal probabilities).
- The BP algorithm is a continuous function that maps M(t) to M(t+1).
- Hence, by the Brouwer Fixed Point Theorem, BP has at least one fixed point (since the domain, the set of normalized message vectors, is a convex, compact set).
Fixed point of BP
Now the important questions are:
- "Is there a unique fixed point?"
- "Does BP converge to a fixed point?"
- "If it does, how fast?"
These questions are topics of current research.
- E.g., studying them for restricted classes of MRFs (e.g. graphs with large girth)
- Studying relations of BP fixed points with other quantities (e.g. minima of the Bethe free energy)
Girth of a Graph
- For a graph G=(V,E), the girth of G is the length of a shortest cycle contained in G.
- If G has large girth and bounded degree, and the MRF satisfies exponential (spatial) correlation decay, then BP computes a good approximation of the solution.
  - Proof: by considering the computation tree of BP
- This can be used to design systems based on MRFs
  - Ex: LDPC codes
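The definition of girth can be made concrete with a short routine. This is a sketch, assuming a simple undirected graph (no self-loops or parallel edges) given as an edge list on vertices 0..n-1; it runs a breadth-first search from every vertex and records the shortest cycle closed by a non-tree edge. The function name `girth` is this sketch's own.

```python
from collections import deque


def girth(n, edges):
    """Length of a shortest cycle, or float('inf') if the graph is a forest."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    best = float("inf")
    for s in range(n):
        dist = [-1] * n
        parent = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] == -1:  # tree edge: first visit
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:  # non-tree edge closes a cycle
                    best = min(best, dist[u] + dist[v] + 1)
    return best


cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
path4 = [(0, 1), (1, 2), (2, 3)]
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(girth(5, cycle5), girth(4, path4), girth(4, k4))
```

A single search may overestimate the cycle length, but the minimum over all start vertices is the girth, since a search started on a shortest cycle detects it at its true length.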
Computation Tree of BP
(Figure: a graph G, and the computation tree of G at x1)
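The computation tree can also be built explicitly. The sketch below assumes the usual construction: the root is a copy of the chosen node, and each copy of a node v reached from a copy of u has one child for every neighbor of v other than u. The names `computation_tree` and `count_nodes` and the nested-tuple representation are choices made for this illustration.

```python
def computation_tree(adj, root, depth):
    """Computation tree of the graph at `root`, as nested (label, children) tuples."""
    def expand(v, parent, d):
        if d == 0:
            return (v, [])
        # Unroll every neighbor except the one we just came from.
        return (v, [expand(w, v, d - 1) for w in adj[v] if w != parent])
    return expand(root, None, depth)


def count_nodes(tree):
    return 1 + sum(count_nodes(child) for child in tree[1])


# A triangle x1, x2, x3: the tree keeps growing as the loop is unrolled.
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
tree2 = computation_tree(triangle, 1, 2)
tree3 = computation_tree(triangle, 1, 3)

# A path x1 - x2 - x3: the graph is already a tree, so unrolling stops.
path = {1: [2], 2: [1, 3], 3: [2]}
tree_path = computation_tree(path, 2, 5)
print(count_nodes(tree2), count_nodes(tree3), count_nodes(tree_path))
```

When the depth is smaller than half the girth, no node of the graph appears twice in the computation tree, so the tree is the local neighborhood of the root. This is why large girth helps in the analysis.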
Correlation Decay
(Temporal) Decay of correlations in Markov chains
- A finite, irreducible Markov chain with transition matrix P satisfies decay of correlation (mixes) if and only if it is aperiodic
(Spatial) Decay of correlations
- Same thing, but time is replaced by a "spatial" distance
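The temporal statement can be checked numerically on two-state chains. This is a small sketch using two example transition matrices chosen for illustration (they are not from the lecture): one aperiodic, one with period 2. The helper `row_gap` is a hypothetical name; it measures how much the distribution after t steps still depends on the starting state.

```python
import numpy as np


def row_gap(P, t):
    """Largest difference between the two rows of P^t (two-state chains)."""
    Pt = np.linalg.matrix_power(P, t)
    return np.abs(Pt[0] - Pt[1]).max()


P_aperiodic = np.array([[0.9, 0.1],
                        [0.2, 0.8]])
P_periodic = np.array([[0.0, 1.0],
                       [1.0, 0.0]])

gap_aperiodic = row_gap(P_aperiodic, 50)
gap_periodic_even = row_gap(P_periodic, 50)
gap_periodic_odd = row_gap(P_periodic, 51)
print(gap_aperiodic, gap_periodic_even, gap_periodic_odd)
```

For the aperiodic chain the gap shrinks geometrically (here at rate 0.7, the second eigenvalue of the matrix), so the starting state is forgotten. The periodic chain alternates between two deterministic states and never forgets where it started.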
Correlation Decay
- A sequence of spatially (graph) related random variables exhibits correlation decay (long-range independence) if two variables are nearly independent when the graph distance between them is large
- Principal motivation: statistical physics. Uniqueness of Gibbs measures on infinite lattices, Dobrushin [60s].
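A small example of spatial correlation decay: on a path-shaped MRF, fixing the value at one end changes the conditional distribution of a node at distance d by an amount that shrinks geometrically in d. The sketch below assumes a binary pairwise MRF on a path with the same symmetric edge potential on every edge and no node potentials; the potential values and the names `conditional` and `influence` are illustrative choices.

```python
import itertools


def conditional(n, psi, d, x0):
    """P(x_d = 0 | x_0 = x0) on the path 0-1-...-(n-1), by enumeration."""
    num = den = 0.0
    for rest in itertools.product(range(2), repeat=n - 1):
        x = (x0,) + rest
        w = 1.0
        for i in range(n - 1):
            w *= psi[x[i]][x[i + 1]]
        den += w
        if x[d] == 0:
            num += w
    return num / den


def influence(n, psi, d):
    """How much the value at node 0 shifts the distribution at node d."""
    return abs(conditional(n, psi, d, 0) - conditional(n, psi, d, 1))


psi = [[2.0, 1.0], [1.0, 2.0]]
n = 7
decay = [influence(n, psi, d) for d in range(1, n)]
print(decay)
```

For this potential the influence at distance d works out to (1/3)^d, which is exponential decay in the graph distance.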
What is known about correlation decay?
- Weitz [05]: independent sets (graphs)
- Goldberg, Martin & Paterson [05]: coloring (general graphs)
- Jonasson [01]: coloring (regular trees)
Notation: $\Delta$ is the maximum vertex degree of G. $\lambda$ in the independent set problem is the weight for each vertex (i.e. the weight for an independent set of size I is $\lambda^{I}$). q in the coloring problem is the number of possible colors.
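The weighting of independent sets can be made concrete by enumeration on tiny graphs. This is a sketch, assuming the per-vertex weight is a single positive number (called `lam` here, a name chosen for this example) so that an independent set with k vertices gets weight lam**k; the function names are hypothetical and the enumeration is exponential in the number of vertices.

```python
import itertools


def independent_sets(n, edges):
    """Yield every independent set of the graph on vertices 0..n-1."""
    for r in range(n + 1):
        for subset in itertools.combinations(range(n), r):
            chosen = set(subset)
            if not any(u in chosen and v in chosen for u, v in edges):
                yield chosen


def partition_function(n, edges, lam):
    """Total weight of all independent sets, each weighted lam**size."""
    return sum(lam ** len(s) for s in independent_sets(n, edges))


def occupation_prob(n, edges, lam, v):
    """Probability that vertex v belongs to a random weighted independent set."""
    z = partition_function(n, edges, lam)
    return sum(lam ** len(s) for s in independent_sets(n, edges) if v in s) / z


path3 = [(0, 1), (1, 2)]
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
z_path = partition_function(3, path3, 1)    # sets: {}, {0}, {1}, {2}, {0, 2}
z_cycle = partition_function(4, cycle4, 2)  # 1 + 4*2 + 2*2**2
p_end = occupation_prob(3, path3, 1, 0)
print(z_path, z_cycle, p_end)
```

The occupation probability of a vertex is the marginal that BP would try to compute in this model, and correlation decay asks how strongly it depends on the configuration far away.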