Convergent and Correct Message Passing Algorithms
Nicholas Ruozzi and Sekhar Tatikonda
Yale University


Overview
- A simple derivation of a new family of message passing algorithms
- Conditions under which the parameters guarantee correctness upon convergence
- An asynchronous algorithm that generalizes the notion of "bound minimizing" algorithms [Meltzer, Globerson, Weiss 2009]
- A simple choice of parameters that guarantees both convergence and correctness of the asynchronous algorithm

Previous Work
- Convergent message passing algorithms:
  - Serial TRMP [Kolmogorov 2006]
  - MPLP [Globerson and Jaakkola 2007]
  - Max-sum diffusion [Werner 2007]
  - Norm-product BP [Hazan and Shashua 2008]
- Upon convergence, any assignment that simultaneously minimizes all of the beliefs must minimize the objective function

Outline
- Review of the min-sum algorithm
- The splitting algorithm
- Correctness
- Convergence
- Conclusion

Min-Sum
- Minimize an objective function that factorizes as a sum of potentials:

  f(x) = \sum_i \phi_i(x_i) + \sum_{\alpha \in A} \psi_\alpha(x_\alpha)

- Here A is some multiset whose elements are subsets of the variables

Corresponding Graph
[Figure: factor graph over the variables 1, 2, and 3]

Min-Sum
- Messages at time t (the standard min-sum updates):

  m^t_{\alpha \to i}(x_i) = \min_{x_{\alpha \setminus i}} \Big[ \psi_\alpha(x_\alpha) + \sum_{j \in \alpha \setminus i} m^{t-1}_{j \to \alpha}(x_j) \Big]

  m^t_{i \to \alpha}(x_i) = \phi_i(x_i) + \sum_{\beta \ni i,\ \beta \neq \alpha} m^{t-1}_{\beta \to i}(x_i)

- The normalization factor is arbitrary for each message
- Initial messages must be finite

Computing Beliefs
- At each step, construct a set of beliefs:

  b^t_i(x_i) = \phi_i(x_i) + \sum_{\alpha \ni i} m^t_{\alpha \to i}(x_i)

- Estimate the optimal assignment as \hat{x}_i \in \arg\min_{x_i} b^t_i(x_i)
- The algorithm converges if we obtain a fixed point of the message updates
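As a concrete sketch of these updates and beliefs, here is a minimal pairwise min-sum implementation in Python. The chain model, potentials, and iteration count are illustrative assumptions, not taken from the talk; on a tree like this one, min-sum is exact, so the belief minimizers match a brute-force search.

```python
import itertools

# Minimal min-sum sketch on a binary chain x1 - x2 - x3.
# Pairwise potentials stand in for the general factors psi_alpha;
# all numbers here are illustrative.
phi = {1: [0.0, 1.0], 2: [0.5, 0.0], 3: [0.2, 0.7]}   # unary potentials phi_i
psi = {(1, 2): [[0.0, 1.0], [1.0, 0.0]],              # pairwise potentials
       (2, 3): [[0.0, 1.0], [1.0, 0.0]]}
edges = list(psi)
neighbors = {1: [2], 2: [1, 3], 3: [2]}

def pair_cost(i, j, xi, xj):
    return psi[(i, j)][xi][xj] if (i, j) in psi else psi[(j, i)][xj][xi]

# m[(i, j)][x_j]: message from variable i to variable j, initialized finite
m = {(i, j): [0.0, 0.0] for i in neighbors for j in neighbors[i]}

for _ in range(10):  # plenty for a tree of this size to reach a fixed point
    m = {(i, j): [min(phi[i][xi] + pair_cost(i, j, xi, xj)
                      + sum(m[(k, i)][xi] for k in neighbors[i] if k != j)
                      for xi in (0, 1))
                  for xj in (0, 1)]
         for (i, j) in m}

# beliefs b_i(x_i) = phi_i(x_i) + sum of incoming messages
belief = {i: [phi[i][x] + sum(m[(k, i)][x] for k in neighbors[i])
              for x in (0, 1)] for i in neighbors}
estimate = tuple(min((0, 1), key=lambda x: belief[i][x]) for i in (1, 2, 3))

# brute-force check: on a tree, min-sum recovers the exact minimizer
def f(xs):
    assign = dict(zip((1, 2, 3), xs))
    return (sum(phi[i][assign[i]] for i in assign)
            + sum(pair_cost(i, j, assign[i], assign[j]) for (i, j) in edges))

best = min(itertools.product((0, 1), repeat=3), key=f)
print(estimate, best)  # the two should agree on this tree
```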

Outline
- Review of the min-sum algorithm
- The splitting algorithm
- Correctness
- Convergence
- Conclusion

“Splitting” Heuristic
- Start from the original graph, split the (1,2) factor into two copies, and reduce the potentials on each copied edge by a factor of 1/2
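The reason the heuristic is safe is that two copies of a factor at half strength sum back to the original potential, so the objective function is unchanged. A quick Python check with hypothetical potentials:

```python
import itertools

# Hypothetical potentials for a two-variable model; the (1,2) factor is
# split into two copies, each carrying half of the original potential.
phi1 = [0.0, 0.3]
phi2 = [0.7, 0.0]
psi12 = [[0.0, 1.0], [1.0, 0.0]]

def f_original(x1, x2):
    return phi1[x1] + phi2[x2] + psi12[x1][x2]

def f_split(x1, x2):
    # two copies of the factor, each potential reduced by 1/2
    half = lambda a, b: 0.5 * psi12[a][b]
    return phi1[x1] + phi2[x2] + half(x1, x2) + half(x1, x2)

same = all(f_original(a, b) == f_split(a, b)
           for a, b in itertools.product((0, 1), repeat=2))
print(same)  # True: splitting preserves the objective
```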

Min-Sum with Splitting
- Consider the message passed across one of the split edges:
- If the initial messages are the same for the copied edges, then, at each time step, the messages passed across the duplicated edges are the same


Min-Sum with Splitting
- We can split any factor α into c_α copies and divide its potential by a factor of c_α
- Messages are passed on the original factor graph
- The update equation is similar to TRMP's

General Splitting Algorithm
- Split both the variable nodes and the factor nodes:
- We recover min-sum if all parameters are chosen equal to one

Beliefs
- Corresponding beliefs:
- Not the same as before

Outline
- Review of the min-sum algorithm
- The splitting algorithm
- Correctness
- Convergence
- Conclusion

Admissibility and Min-Consistency
- A vector of beliefs, b, is admissible for a function f if

  f(x) = \sum_i c_i\, b_i(x_i) + \sum_\alpha c_\alpha\, b_\alpha(x_\alpha)

- This makes sense for any non-zero, real-valued parameters
- A vector of beliefs, b, is min-consistent if, for all α and all i ∈ α:

  \min_{x_{\alpha \setminus i}} b_\alpha(x_\alpha) = b_i(x_i)

Conical Combinations
- If there is a unique x* that simultaneously minimizes each belief, does it minimize f?
- f can be written as a conical combination of the beliefs if there exists a vector, d, of non-negative reals such that

  f(x) = \sum_i d_i\, b_i(x_i) + \sum_\alpha d_\alpha\, b_\alpha(x_\alpha)

Global Optimality
Theorem: Given an objective function, f, suppose that:
- b is a vector of admissible and min-consistent beliefs for the function f,
- c is chosen such that f can be written as a conical combination of the beliefs, and
- there exists an x* that simultaneously minimizes all of the beliefs.
Then x* minimizes the objective function.
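Why these conditions suffice can be sketched in one line (a sketch consistent with the theorem statement, not the paper's full proof): if d ≥ 0 witnesses the conical combination and x* minimizes every belief pointwise, then for any x,

```latex
f(x^\ast) = \sum_i d_i\, b_i(x^\ast_i) + \sum_\alpha d_\alpha\, b_\alpha(x^\ast_\alpha)
\;\le\; \sum_i d_i\, b_i(x_i) + \sum_\alpha d_\alpha\, b_\alpha(x_\alpha) \;=\; f(x),
```

since each term is minimized at x* and the weights are non-negative.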

Global Optimality
- Examples of globally optimal parameters:
  - TRMP: the parameter c_α is the “edge appearance probability”
  - MPLP
  - Others
- Weaker conditions on c guarantee local optimality (with respect to the Hamming distance)

Outline
- Review of the min-sum algorithm
- The splitting algorithm
- Correctness
- Convergence
- Conclusion

Notion of Convergence
- Let c be chosen as:
- Consider the lower bound:
- We will say that the algorithm has converged if this lower bound cannot be improved by further iteration
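Assuming the beliefs are admissible with non-negative parameters c, the lower bound has the standard decomposition form (a reconstruction consistent with the admissibility equation, not copied from the slide): minimizing each belief independently can only under-count the joint minimum,

```latex
\min_x f(x) \;=\; \min_x \Big[ \sum_i c_i\, b_i(x_i) + \sum_\alpha c_\alpha\, b_\alpha(x_\alpha) \Big]
\;\ge\; \sum_i c_i \min_{x_i} b_i(x_i) \;+\; \sum_\alpha c_\alpha \min_{x_\alpha} b_\alpha(x_\alpha).
```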

Convergence
- For certain c, there is an asynchronous message passing schedule that is guaranteed not to decrease this bound
- All "bound minimizing" algorithms have similar guarantees [Meltzer, Globerson, Weiss 2009]
- Ensure min-consistency over one subtree of the graph at a time

Asynchronous Algorithm
- Update the blue edges, then the red edges
- After this update, beliefs are min-consistent with respect to the variable x


Asynchronous Convergence
- Order the variables and perform the one-step update for a different variable at each step of the algorithm
- This cannot decrease the lower bound
- This update can be thought of as a coordinate ascent algorithm in an appropriate dual space

An Example: MWIS
- 100x100 grid graph
- Weights randomly chosen between 0 and 1
- Convergence is monotone, as predicted
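To make the experiment concrete, maximum-weight independent set (MWIS) fits the min-sum framework via a standard encoding: node weights become negated unary potentials and each edge gets an infinite pairwise penalty when both endpoints are selected, so minimizing f maximizes the weight of an independent set. A small hand-made instance (a 4-cycle, not the 100x100 grid from the slide), checked by brute force:

```python
import itertools

INF = float('inf')

# A small hypothetical MWIS instance: a 4-cycle with node weights.
w = [0.9, 0.4, 0.8, 0.3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

def f(x):
    """Min-sum objective whose minimizer is the max-weight independent set."""
    for i, j in edges:                 # psi_ij = inf if both endpoints chosen
        if x[i] == 1 and x[j] == 1:
            return INF
    return -sum(wi * xi for wi, xi in zip(w, x))   # phi_i(x_i) = -w_i * x_i

best = min(itertools.product((0, 1), repeat=4), key=f)
weight = sum(wi * xi for wi, xi in zip(w, best))
print(best, weight)  # opposite corners 0 and 2 form the heaviest set
```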

Conclusion
- A simple derivation of a new family of message passing algorithms
- Conditions under which the parameters guarantee correctness upon convergence
- An asynchronous algorithm that generalizes the notion of "bound minimizing" algorithms [Meltzer, Globerson, Weiss 2009]
- A simple choice of parameters that guarantees both convergence and correctness of the asynchronous algorithm

Questions? Preprint available online at:

Other Results
- We can extend the computation-tree story to this new algorithm
- We can use graph covers to understand when the algorithm can converge to a collection of beliefs that admits a unique estimate
- The lower bound is always tight for pairwise binary graphical models
- A new proof that uses graph covers instead of duality