Survey Propagation Algorithm

Presentation transcript:

Survey Propagation Algorithm. Elitza Maneva (UC Berkeley). Joint work with Elchanan Mossel and Martin Wainwright.

The Plan
Background: random SAT
Finding solutions by inference on a Markov random field (MRF)
Belief propagation algorithm (BP) [Pearl '88]
Survey propagation algorithm (SP) [Mezard, Parisi, Zecchina '02]
Survey propagation is a belief propagation algorithm [Maneva, Mossel, Wainwright '05]: an MRF on partial assignments, and the relation of this MRF to the structure of the solution space of a random instance

Survey propagation is an algorithm designed for random SAT problems below the satisfiability threshold. It appeared four years ago and is much superior to all previous algorithms for random SAT problems. Since it really works only for random instances, it appeared in the list of loser algorithms in Oliver Kullmann's talk. However, it is still extremely interesting for at least two reasons. First, it gives very strong evidence that there is truth in the statistical-physics picture of 3-SAT. Physicists appear to have better tools for analyzing this kind of constrained random structure, and we ought to work on making these methods rigorous; the same methods are used for calculating the satisfiability threshold, for which we have only rough rigorous bounds. Second, it gives an indication that message-passing algorithms are worth studying in the context of constraint satisfaction problems; perhaps we can design them so that they work for more practical instances.

So here is the plan. I will first introduce some background and define random SAT. I will show how one can solve CSPs via inference on a Markov random field. I will describe the BP algorithm, which is an inference heuristic for computing the marginals of a general MRF; it is a message-passing algorithm originating in statistical learning theory. I will describe it in the context of 3-SAT, but I want to emphasize that although it is a very widely used algorithm, there is no rigorous analysis of its performance, neither for 3-SAT nor for most of the other applications, and until very recently it had not received enough attention in the theoretical CS community. Then I will define survey propagation, which is also a message-passing algorithm, but was designed based on statistical-physics methods by Mezard, Parisi, and Zecchina. Experimentally it is vastly superior to any previously known heuristic for 3-SAT, and it has thus generated a lot of interest in the application of these methods to computation. Then I will describe some of my work. In joint work with Elchanan Mossel and Martin Wainwright, we show that the survey propagation algorithm can also be thought of as a BP algorithm: we define an MRF on partial assignments and show that the BP equations for this MRF are identical to the SP equations. I will also discuss some combinatorial properties of this new distribution and how it relates to the structure of the solution space of typical instances.

Boolean CSP
Input: n Boolean variables x1, x2, …, xn and m constraints.
Question: Is there an assignment to the variables such that all constraints are satisfied? If so, find one.
Applications: verification; planning and scheduling. The problem is also of major theoretical interest.

Examples of Boolean CSP
Constraints come from a fixed set of relations. Examples:
2-SAT: (x1 ∨ x2) ∧ (¬x1 ∨ x3)
3-SAT: (¬x1 ∨ x2 ∨ x3) ∧ (x2 ∨ ¬x3 ∨ x4)
3-XOR-SAT: (¬x1 ⊕ x2 ⊕ x3) ∧ (x2 ⊕ x3 ⊕ ¬x4)
1-in-3-SAT: exactly one literal is true in each of (¬x1, x2, x3) and (x2, ¬x3, x4)
Schaefer's Dichotomy Theorem [1978]: Every Boolean CSP is either in P (e.g. 2-SAT, Horn-SAT, XOR-SAT) or NP-complete (e.g. 3-SAT, NAE-3-SAT).
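To make the CNF representation concrete, here is a minimal sketch (my illustration, not part of the original talk) of how a formula can be encoded and an assignment checked. Clauses are lists of signed integers in the DIMACS style, with -v denoting the negation of variable v.

```python
# A clause is a list of nonzero ints: v means x_v, -v means NOT x_v.
# A formula is a list of clauses; an assignment maps variables to bools.

def satisfies(formula, assignment):
    """Return True iff every clause has at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

# (x1 OR x2) AND (NOT x1 OR x3), the 2-SAT example above
formula = [[1, 2], [-1, 3]]
print(satisfies(formula, {1: True, 2: False, 3: True}))   # True
print(satisfies(formula, {1: True, 2: False, 3: False}))  # False
```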

Graph representation
[Figure: a bipartite factor graph with variable nodes x1, …, x8 on one side and constraint nodes on the other; each constraint is connected to the variables it involves.]

Graph representation of 3-SAT
[Figure: the factor graph of a 3-CNF formula over x1, …, x8, with edges drawn differently for positive and negative literals; e.g. the clause (¬x1 ∨ ¬x3 ∨ x5) is a constraint node with negative edges to x1 and x3 and a positive edge to x5.]

We can find solutions via inference
Suppose the formula is satisfiable, and consider the uniform distribution over its satisfying assignments.
Simple claim: If we can compute Pr[xi = 1] under this distribution, then we can find a solution fast.
Decimation: Assign the variables one by one, each to a value of highest marginal probability, simplifying the formula after each step. No backtracking in this talk!
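To make the decimation strategy concrete, here is a minimal sketch (my illustration, not from the talk). It assumes a hypothetical oracle estimate_marginals(formula) returning Pr[x_v = 1] for each unassigned variable; with exact marginals of the uniform distribution over solutions this never gets stuck on a satisfiable formula, while in practice the oracle is a heuristic such as BP or SP.

```python
def simplify(formula, var, value):
    """Fix var := value: drop satisfied clauses, shrink the rest."""
    lit = var if value else -var
    out = []
    for clause in formula:
        if lit in clause:
            continue                      # clause satisfied, drop it
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None                   # empty clause: contradiction
        out.append(reduced)
    return out

def decimate(formula, variables, estimate_marginals):
    """Greedy decimation driven by a marginal oracle; no backtracking."""
    assignment = {}
    while formula:
        marginals = estimate_marginals(formula)  # {var: Pr[x_var = 1]}
        # fix the most biased unassigned variable
        var = max(marginals, key=lambda v: abs(marginals[v] - 0.5))
        value = marginals[var] >= 0.5
        assignment[var] = value
        formula = simplify(formula, var, value)
        if formula is None:
            raise RuntimeError("decimation reached a contradiction")
    for v in variables:                   # unconstrained leftovers
        assignment.setdefault(v, False)
    return assignment
```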

Fact: We cannot hope to compute Pr[xi = 1] exactly.
Heuristics for guessing the best variable to assign:
1. Pure Literal Rule (PLR): Choose a variable that appears always positive or always negative.
2. Myopic rule: Choose a variable based on its numbers of positive and negative occurrences and on the density of 2-clauses and 3-clauses.
3. Belief propagation: Estimate Pr[xi = 1] by belief propagation and choose the variable with the largest estimated bias.
4. Survey propagation: Estimate the probability that a variable is frozen in a cluster of solutions, and choose the variable with the maximum probability of being frozen.

Random 3-SAT
n variables; m = αn clauses, each drawn uniformly at random.
[Figure: the clause-to-variable density axis for random 3-SAT, marking the regions where each algorithm finds solutions: PLR up to α ≈ 1.63, myopic rules up to α ≈ 3.52, belief propagation up to α ≈ 3.95, and WalkSAT and survey propagation up to about α ≈ 4.27; above α ≈ 4.51 formulas are provably unsatisfiable w.h.p.]

Let's look at how the 3-SAT problem behaves at different densities of clauses to variables. The conjectured threshold is around 4.2; green denotes conjectures and heuristics. What we know rigorously is that below density 3.52 the formula is satisfiable w.h.p., and the proof is by analyzing an algorithm [Kaporis, Kirousis, Lalas]. We also know that there are no solutions w.h.p. above 4.51, by clever applications of Markov's inequality [Dubois, Boufkhad, Mandler]. As far as heuristics go, survey propagation is a really dramatic success, because it works all the way up to where the threshold for satisfiability is conjectured to be. Previously it was believed that formulas with density just below the threshold are the hard instances of 3-SAT; for example, they are used as benchmarks.
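For experiments, an instance from the random 3-SAT distribution described above is easy to generate. A minimal sketch (my illustration, not from the talk): draw m = round(α·n) clauses, each on three distinct variables with independent random signs.

```python
import random

def random_3sat(n, alpha, seed=None):
    """Random 3-SAT: m = round(alpha * n) clauses over variables 1..n.

    Each clause picks 3 distinct variables uniformly at random and
    negates each independently with probability 1/2.
    """
    rng = random.Random(seed)
    formula = []
    for _ in range(round(alpha * n)):
        vars3 = rng.sample(range(1, n + 1), 3)
        formula.append([v if rng.random() < 0.5 else -v for v in vars3])
    return formula

# instances near the conjectured threshold (alpha ~ 4.27) are hardest
f = random_3sat(n=100, alpha=4.2, seed=0)
print(len(f), f[0])
```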

Computing Pr[x1 = 0] on a tree formula (3-SAT)
[Figure: a tree-structured formula rooted at x1, with each variable annotated by the pair (#solutions with it set to 0, #solutions with it set to 1), e.g. (3, 4) near the leaves and (108, 192) at the root.]

For 3-SAT the computation goes as follows. Each leaf sends a message saying in how many assignments of its subtree it appears as 0 and in how many as 1. Each clause can then use the information from its two leaf variables to tell its third variable in how many assignments it appears as 0 and as 1, and so on, until the messages received by the root can be combined. It is not hard to show that, since we are only interested in the ratio between the 0 and 1 counts, each message can be normalized.

Vectors can be normalized
[Figure: the same tree with each count pair rescaled to a probability vector, e.g. (.36, .64) at the root and (.43, .57) and (.5, .5) below.]

… and thought of as messages
[Figure: the normalized vectors drawn as messages passed along the edges of the tree toward the root x1.]
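The counting procedure on a tree can be written as a short recursive message-passing routine. Below is a sketch (my reconstruction of the idea, not code from the talk); it assumes the formula's factor graph is a tree rooted at a variable and given by explicit child lists.

```python
from itertools import product

def count_messages(root, var_children, clause_children, clause_lits):
    """Exact solution counting on a tree-structured CNF.

    var_children[i]   : child clause nodes of variable i
    var_children[i]   may be empty (leaf variable)
    clause_children[c]: child variable nodes of clause c
    clause_lits[c]    : dict var -> True for a positive literal
    Returns (N0, N1): #solutions of the tree with root = 0 / root = 1.
    """
    def var_msg(i):
        n = [1, 1]                        # componentwise product
        for c in var_children[i]:
            m = clause_msg(c, i)
            n = [n[0] * m[0], n[1] * m[1]]
        return n

    def clause_msg(c, parent):
        kids = clause_children[c]
        msgs = [var_msg(j) for j in kids]
        out = [0, 0]
        for xp in (0, 1):
            for vals in product((0, 1), repeat=len(kids)):
                sat = clause_lits[c][parent] == bool(xp) or any(
                    clause_lits[c][j] == bool(v)
                    for j, v in zip(kids, vals))
                if sat:                   # count only satisfying combos
                    w = 1
                    for m, v in zip(msgs, vals):
                        w *= m[v]
                    out[xp] += w
        return out

    return tuple(var_msg(root))

# Example: the single clause (x1 OR x2 OR x3), rooted at x1:
vc = {1: ["c"], 2: [], 3: []}
cc = {"c": [2, 3]}
cl = {"c": {1: True, 2: True, 3: True}}
print(count_messages(1, vc, cc, cl))      # (3, 4), so Pr[x1 = 0] = 3/7
```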

What if the graph is not a tree? Belief propagation

Belief propagation
[Figure: a loopy factor graph on x1, …, x11, with one factor ψ(x1, x2, x3) highlighted.]
Pr[x1, …, xn] ∝ ∏_a ψ_a(x_{N(a)})
Belief propagation can be applied to any distribution that factorizes into functions, each on a small number of variables.

Belief Propagation [Pearl '88]
[Figure: a factor graph on x1, …, x7 with factors ψ_a(x1, x3), ψ_b(x1, x2), ψ_c(x1, x4), …]

Given: Pr[x1, …, x7] ∝ ψ_a(x1, x3) · ψ_b(x1, x2) · ψ_c(x1, x4) · …, i.e. a Markov random field (MRF). The distribution is given not explicitly but as a product of functions, each on a small number of variables; these functions are given explicitly.
Goal: Compute the marginal Pr[x1]. This can take exponential time in general; belief propagation is a fast heuristic which is exact if the graph is a tree.

Message-passing rules:
M_{i→c}(xi) = ∏_{b ∈ N(i)\c} M_{b→i}(xi)
M_{c→i}(xi) = Σ_{xj : j ∈ N(c)\i} ψ_c(x_{N(c)}) ∏_{j ∈ N(c)\i} M_{j→c}(xj)
Estimated marginals:
μ_i(xi) ∝ ∏_{c ∈ N(i)} M_{c→i}(xi)

Belief propagation is a dynamic programming algorithm. It is exact only when the recurrence relation holds, i.e. if the graph is a tree, or if the graph behaves like a tree (only large cycles).
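For 3-SAT the clause-to-variable sum collapses to a closed form: the message gives weight 1 to the value of xi that satisfies the clause and weight 1 − ∏_j p_j to the violating value, where p_j is the probability that neighbor j takes its own violating value. Here is a minimal loopy-BP sketch built on that observation (my illustration, not code from the talk):

```python
import random
from collections import defaultdict

def bp_marginals(formula, iters=100, seed=0):
    """Loopy-BP estimates of Pr[x_v = 1] under the uniform distribution
    over satisfying assignments of a CNF formula (signed-int clauses).
    Exact on trees; a heuristic approximation on loopy factor graphs.
    """
    rng = random.Random(seed)
    occurs = defaultdict(list)            # var -> [(clause index, sign)]
    for c, clause in enumerate(formula):
        for lit in clause:
            occurs[abs(lit)].append((c, lit > 0))
    # p[(v, c)]: prob. that the message from variable v to clause c
    # puts v on the value VIOLATING clause c
    p = {(abs(l), c): rng.uniform(0.4, 0.6)
         for c, cl in enumerate(formula) for l in cl}

    def clause_weight(b, v):
        """Weight clause b sends to v's violating value: 1 - prod p_j."""
        w = 1.0
        for l2 in formula[b]:
            if abs(l2) != v:
                w *= p[(abs(l2), b)]
        return 1.0 - w

    for _ in range(iters):
        new_p = {}
        for c, clause in enumerate(formula):
            for lit in clause:
                v, sign = abs(lit), lit > 0
                w_sat = w_uns = 1.0       # weights of v's two values
                for b, bsign in occurs[v]:
                    if b == c:
                        continue
                    if bsign == sign:     # same sign in b as in c
                        w_uns *= clause_weight(b, v)
                    else:
                        w_sat *= clause_weight(b, v)
                z = w_sat + w_uns
                new_p[(v, c)] = w_uns / z if z > 0 else 0.5
        p = new_p

    marginals = {}
    for v, occ in occurs.items():
        w0 = w1 = 1.0
        for b, bsign in occ:
            if bsign:                     # x_v appears positively in b
                w0 *= clause_weight(b, v)
            else:
                w1 *= clause_weight(b, v)
        z = w0 + w1
        marginals[v] = w1 / z if z > 0 else 0.5
    return marginals
```

Plugged into the decimation loop sketched earlier, bp_marginals plays the role of the estimate_marginals oracle.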

Applications of belief propagation
Statistical learning theory; vision; error-correcting codes (Turbo, LDPC, LT); constraint satisfaction; lossy data compression; computational biology; sensor networks; Nash equilibria.

This algorithm has found a large array of applications in the last decade, because many problems can be represented as a Markov random field and solved by computing its marginal probabilities. In all of these areas the algorithm is not exact and is notoriously hard to analyze rigorously, but it performs well in practice. I have listed the areas in rough chronological order. The algorithm originates in statistical learning theory and is very widely applied in vision. At the end of the 1990s it revolutionized the area of error-correcting codes, when graphical codes were introduced; Turbo codes, LDPC codes, and LT codes are all decoded by belief propagation. The application I am describing is to constraint satisfaction problems. It is also applicable to lossy data compression, to computational biology, to sensor networks, and most recently even to game theory, for computing the Nash equilibria of graphical games.

Survey propagation algorithm
Designed by Mezard, Parisi, Zecchina, 2002, using approximation methods of statistical physics: Parisi's 1-step replica symmetry breaking (cavity method).
Instances with 10^6 variables and 4.25 × 10^6 clauses are solved within a few minutes.
It is a message-passing algorithm, like belief propagation.

These methods had been used before to estimate the value of the threshold, but this is the first time they were used to design an algorithm. The success of this algorithm is clear evidence that the physical picture is largely correct, and it should be studied with rigorous methods.

Survey propagation
[Figure: survey propagation messages on a factor graph; e.g. a variable's state is summarized by a triple of probabilities such as (.12, .81, .07) for the values 0, 1, and ∗.]

Survey propagation
[Figure: a factor graph on x1, …, x8. A variable tells a clause: "I'm 0 with prob. 10%, 1 with prob. 70%, whichever (i.e. ∗) with prob. 20%." A clause tells a variable: "You have to satisfy me with prob. 60%."]

Message-passing rules, where N^s_c(i) and N^u_c(i) denote the clauses other than c that contain xi with, respectively, the same and the opposite sign as in c:

M_{c→i} = ∏_{j ∈ N(c)\i} M^u_{j→c} / (M^u_{j→c} + M^s_{j→c} + M^∗_{j→c})
M^u_{i→c} = [1 − ∏_{b ∈ N^u_c(i)} (1 − M_{b→i})] · ∏_{b ∈ N^s_c(i)} (1 − M_{b→i})
M^s_{i→c} = [1 − ∏_{b ∈ N^s_c(i)} (1 − M_{b→i})] · ∏_{b ∈ N^u_c(i)} (1 − M_{b→i})
M^∗_{i→c} = ∏_{b ∈ N(i)\c} (1 − M_{b→i})

These rules were largely a mystery to computer scientists when the result came out, but the remarkable performance of the algorithm made it necessary to try to understand them. Our work gives the first combinatorial interpretation of the algorithm. But first let me give you an idea of what the statistical-physics picture is.
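Translated into code, the update reads as follows. This is a sketch following the standard Mezard-Parisi-Zecchina formulation (my reconstruction, not the authors' implementation): eta[(c, v)] is the survey, i.e. the estimated probability that clause c sends a warning to variable v, and a decimation step would then fix the variable with the largest estimated probability of being frozen.

```python
import random
from collections import defaultdict

def sp_surveys(formula, iters=200, seed=0):
    """Survey propagation on a CNF formula (signed-int clauses).
    Returns eta[(c, v)], the survey from clause c to variable v.
    Convergence is not guaranteed on loopy instances.
    """
    rng = random.Random(seed)
    occurs = defaultdict(list)                 # var -> [(clause, sign)]
    for c, clause in enumerate(formula):
        for lit in clause:
            occurs[abs(lit)].append((c, lit > 0))
    eta = {(c, abs(l)): rng.random()
           for c, cl in enumerate(formula) for l in cl}

    def pis(v, c, sign):
        """(M_u, M_s, M_star) for variable v's message to clause c."""
        prod_same = prod_opp = prod_all = 1.0
        for b, bsign in occurs[v]:
            if b == c:
                continue
            f = 1.0 - eta[(b, v)]
            prod_all *= f
            if bsign == sign:
                prod_same *= f
            else:
                prod_opp *= f
        m_u = (1.0 - prod_opp) * prod_same     # forced to violate c
        m_s = (1.0 - prod_same) * prod_opp     # forced to satisfy c
        return m_u, m_s, prod_all              # prod_all = unforced

    for _ in range(iters):
        new = {}
        for c, clause in enumerate(formula):
            for lit in clause:
                survey = 1.0
                for lit2 in clause:
                    if lit2 == lit:
                        continue
                    m_u, m_s, m_star = pis(abs(lit2), c, lit2 > 0)
                    z = m_u + m_s + m_star
                    survey *= m_u / z if z > 0 else 0.0
                new[(c, abs(lit))] = survey
        eta = new
    return eta
```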

Survey propagation
[Figure: the density axis again, now annotated with the conjectured geometry of the solution space: a single cluster of solutions at low density, multiple clusters from roughly α ≈ 3.9 up to the threshold α ≈ 4.2, and no solutions above α ≈ 4.5. PLR, Unit Clause, myopic rules, and belief propagation succeed only up to about 1.6, 3.5, and 3.9; WalkSAT and survey propagation reach into the clustered phase.]

Multiple clusters of assignments confuse algorithms.

Clustering of solutions
[Figure: the 5-dimensional hypercube {0, 1}^5, with the solution set of a formula shrinking as constraints are added and eventually splitting into clusters, each described by a string over {0, 1, ∗} such as 0∗1∗∗.]

This graph is the 5-dimensional hypercube. Think of a formula with 5 variables and its set of solutions, and suppose we add the constraints one by one. Initially every assignment is a solution; as we add more constraints, the set of solutions shrinks. At some point it becomes disconnected, and eventually it becomes empty. When it is clustered, the clusters can be described by {0, 1, ∗} assignments that indicate which variables are frozen within the cluster and to what value. SP can be described as searching for a cluster instead of a solution: after it finds a cluster assignment, a simpler algorithm is applied to complete it to a satisfying assignment.

Difficult problems are in the multiple-clusters phase
[Figure: the density axis with the single-cluster phase at low density, the multiple-clusters phase just below the threshold α ≈ 4.2, and no solutions above α ≈ 4.5; the hard instances lie in the clustered region.]

Question: Can survey propagation be interpreted as computing the marginals of an MRF on {0, 1, ∗}^n? That is, is there a distribution/MRF on which it is working?

Theorem [Maneva, Mossel, Wainwright '05]: Survey propagation is equivalent to belief propagation on a non-uniform distribution over such partial assignments.

Plan:
Definition of the distribution
Expressing the distribution as an MRF (in order to apply BP)
Combinatorial properties of the distribution

Definition of the new distribution
[Figure: a portion of the partial order of valid partial assignments in {0, 1, ∗}^n for a small formula, with assignments such as 1111, 0111, 011∗, and ∗111, each annotated with its values of n_∗(σ) and n_o(σ).]

The valid assignments can be arranged in a partial order: a partial assignment y lies below x if they differ on exactly one variable, which is a ∗ in y.

1. The distribution includes all partial assignments without contradictions or implications.
2. Weight of a partial assignment σ: Pr[σ] ∝ λ^{n_∗(σ)} (1 − λ)^{n_o(σ)}, where n_∗(σ) is the number of ∗ variables and n_o(σ) the number of free assigned variables.
3. Varying λ ∈ [0, 1] gives a family of belief propagation algorithms: λ = 0 is vanilla BP, λ = 1 is SP.

The distribution is an MRF
Pr[σ] ∝ λ^{n_∗(σ)} (1 − λ)^{n_o(σ)}
Every variable is either ∗, implied, or free; n_∗(σ) is the number of ∗ variables, and n_o(σ) is the number of free variables. Variables know whether they are implied or free based on the set of clauses that constrain them, so extend the domain: xi ∈ {0, 1, ∗} × {subsets of clauses that contain xi}. In the new domain the distribution can be expressed in factorized form, and belief propagation can be applied to it.
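To pin down the definitions, here is a small sketch (my illustration, not code from the talk) that checks validity of a partial assignment and computes its unnormalized weight. Note the two limits: λ = 0 puts all weight on full solutions (vanilla BP), and λ = 1 puts all weight on assignments with no free variables (SP).

```python
def classify(formula, sigma):
    """Validity and weight data for a partial assignment.

    sigma maps each variable to 0, 1, or '*'.  Invalid if some clause
    has all literals false, or has a single '*' and all other literals
    false (that star would be implied).  An assigned variable is
    constrained if it is the unique satisfying literal of some clause
    whose other literals are all false; otherwise it is free.
    """
    constrained = set()
    for clause in formula:
        sat, stars = [], []
        for lit in clause:
            v, want = abs(lit), 1 if lit > 0 else 0
            if sigma[v] == '*':
                stars.append(v)
            elif sigma[v] == want:
                sat.append(v)
        if not sat and len(stars) == 0:
            return False, 0, 0             # contradiction
        if not sat and len(stars) == 1:
            return False, 0, 0             # clause would force the star
        if len(sat) == 1 and not stars:
            constrained.add(sat[0])
    n_star = sum(1 for v in sigma if sigma[v] == '*')
    n_free = sum(1 for v in sigma
                 if sigma[v] != '*' and v not in constrained)
    return True, n_star, n_free

def weight(formula, sigma, lam):
    """Unnormalized Pr[sigma]: lam^n_star * (1 - lam)^n_free."""
    valid, n_star, n_free = classify(formula, sigma)
    return (lam ** n_star) * ((1 - lam) ** n_free) if valid else 0.0
```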

What is the relation of the distribution to clustering?
[Figure: the hypercube picture of clustered solutions again, with a cluster summarized by a partial assignment such as 0∗1∗∗.]

Space of partial assignments
[Figure: the space of partial assignments layered by the number of unassigned (∗) variables, from full assignments such as 110001 at level 0 up to the all-∗ assignment at level n.]

Pr[]   (1- ) n() no() {0, 1}n assignments Partial assignments 0110 =0 =1 core core Cluster assignments correspond to assignments in the new distribution that have the largest weight. The new distribution connects these assignments better, and allows you to single out a cluster. # stars Vanilla BP SP 1011  01101 1 This is the correct picture for 9-SAT and above. [Achlioptas, Ricci-Tersenghi ‘06] 

Clustering for k-SAT: what is known?
2-SAT: a single cluster.
3-SAT to 7-SAT: not known.
8-SAT: exponential number of clusters [Mezard, Mora, Zecchina '05].
9-SAT and above: exponential number of clusters, and they have non-trivial cores [Achlioptas, Ricci-Tersenghi '06].

Experiments to find cores for 3-SAT
[Figure sequence: starting from a satisfying assignment, assigned variables are replaced by ∗ one at a time, frame by frame.]

Peeling experiment for 3-SAT, n = 10^5
[Figure: results of the peeling experiment.]
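The peeling experiment is easy to state in code. This sketch (my reconstruction of the experiment described above, not the authors' code) stars one unconstrained variable per round until only constrained variables remain; the surviving assigned variables form the core.

```python
def peel(formula, assignment):
    """Peeling: start from a satisfying assignment (var -> 0/1) and
    repeatedly turn an unconstrained variable into '*'.  The fixed
    point is the core of the assignment's cluster (possibly all-'*').
    """
    sigma = dict(assignment)
    while True:
        constrained = set()
        for clause in formula:
            sat = [abs(l) for l in clause if sigma[abs(l)] != '*'
                   and sigma[abs(l)] == (1 if l > 0 else 0)]
            stars = [l for l in clause if sigma[abs(l)] == '*']
            if len(sat) == 1 and not stars:
                constrained.add(sat[0])   # unique satisfying literal
        peelable = [v for v in sigma
                    if sigma[v] != '*' and v not in constrained]
        if not peelable:
            return sigma                  # every assigned var is frozen
        sigma[peelable[0]] = '*'
```

For random 3-SAT these experiments typically peel all the way down to the all-∗ assignment, as the frame sequence above suggests.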

Clusters and partial assignments
[Figure: the space of partial assignments ordered by the number of stars, with full assignments such as 01101 and 1011∗ lying below their cluster assignments.]

Cluster assignments correspond to the assignments of largest weight in the new distribution. The new distribution connects these assignments better, and allows you to single out a cluster.

Unresolved questions
Why do the marginals of this distribution lead to an algorithm for finding solutions?
Why does BP for this distribution converge, while BP on the uniform distribution over satisfying assignments does not (in the clustered phase)?

Thank you