Computing the Density of States of Boolean Formulas


Computing the Density of States of Boolean Formulas Stefano Ermon, Carla Gomes, and Bart Selman Cornell University, September 2010

Motivation: significant progress in SAT
From 100 variables and 200 constraints (early 90's) to over 1,000,000 variables and 5,000,000 clauses in 20 years.
Applications: hardware and software verification, planning, scheduling, optimal control, protocol design, routing, multi-agent systems, e-commerce (e-auctions and electronic trading agents), etc.
SAT: given a Boolean formula Φ in CNF, Φ = C1 ∧ C2 ∧ … ∧ Cm, does Φ have a satisfying assignment?

Extending SAT technology
Model counting (#SAT, the number of distinct satisfying assignments): probabilistic inference problems, multi-agent / adversarial reasoning (bounded) [Roth '96, Littman et al. '01, Sang et al. '04, Darwiche '05, Domingos '06].
MAX-SAT and weighted MAX-SAT: find a truth assignment that maximizes the number of satisfied clauses or the sum of their weights; beyond decision (NP) [Hansen et al. '90]; hard and soft constraints [Heras et al. '08, Cohen et al. '06, Park '02].
How can we combine both challenges?

Density of states
Given a Boolean formula Φ in CNF, Φ = C1 ∧ C2 ∧ … ∧ Cm with m clauses, the density of states is the function n(i) that gives the number of truth assignments violating exactly i clauses, for i = 0, …, m.
n(0) = number of assignments that violate 0 clauses (models)
n(1) = number of assignments that violate exactly 1 clause
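For concreteness, here is a minimal brute-force sketch of this definition (the clause encoding and helper name are illustrative assumptions, not from the slides): clauses are lists of DIMACS-style literals, and all 2^n assignments are enumerated, which is only feasible for tiny formulas.

```python
from itertools import product

def density_of_states(n_vars, clauses):
    """Brute-force DOS: clauses are lists of non-zero ints (DIMACS-style
    literals), e.g. [1, -2] means (x1 OR NOT x2). Returns n(0), ..., n(m)."""
    m = len(clauses)
    dos = [0] * (m + 1)
    for assignment in product([False, True], repeat=n_vars):
        violated = sum(
            1 for clause in clauses
            if not any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        )
        dos[violated] += 1
    return dos

# Example: (x1 OR x2) AND (NOT x1 OR x2) over 2 variables
print(density_of_states(2, [[1, 2], [-1, 2]]))  # -> [2, 2, 0]
```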

Density of states: a challenging problem
Generalizes SAT: Φ is satisfiable if and only if n(0) > 0.
Generalizes MAX-SAT: the MAX-SAT optimum corresponds to the minimum i such that n(i) > 0.
Generalizes #SAT: the number of models is n(0).

Statistical physics
More generally, the density of states (DOS) gives the number of microstates with energy E.
Microstates = truth assignments; energy = number of violated clauses; ground states = maximally satisfying assignments.
The DOS is a compact, very informative characterization of a physical system: macroscopic thermodynamic quantities (free energy, internal energy, …), partition function, phase transitions, …
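As a reminder of why the DOS is such a complete summary (standard statistical-physics facts, not spelled out on the slide): the partition function, and from it the usual thermodynamic quantities, follows directly from n(E), with E the number of violated clauses:

```latex
Z(\beta) \;=\; \sum_{\sigma} e^{-\beta E(\sigma)} \;=\; \sum_{E=0}^{m} n(E)\, e^{-\beta E},
\qquad
F(\beta) = -\tfrac{1}{\beta}\log Z(\beta), \qquad
\langle E \rangle = -\tfrac{\partial}{\partial \beta}\log Z(\beta).
```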

Motivation
The DOS provides a finer characterization of the structure of a combinatorial search space.
Statistical physics and CSPs: insights on problem structure and hardness, new algorithms, Survey Propagation [Montanari et al. '07, Monasson et al. '96, Mézard '02, Parisi '02].
By defining different energy functions, the same machinery can be used naturally for probabilistic-style inference (e.g. Markov Logic [Domingos '06]).

Talk outline
- Prior work
- A novel sampling strategy: MCMC-FlatSat
- Empirical validation: small formulas with ground truth, synthetic formulas, random 3-SAT, large structured instances, model counting
- Conclusions

Density of states: prior work
Exact method: enumeration (exponential).
Approximate methods:
- Uniform sampling [Belaidouni et al. '02]: estimate the density from K random truth assignments. Impractical: unlikely to hit rare assignments (e.g. solutions).
- Metropolis sampling [Rose et al. '96]: in theory, the density can be extracted from the Boltzmann distribution. Impractical: difficult choice of the temperature, slow mixing times [Wei et al. '04], focus on low-energy states.
How can we improve the sampling strategy?

The flat histogram idea
Idea: set up a Markov chain that visits all energy levels equally often [F. Wang and Landau '02, J. Wang et al. '99, De Oliveira et al. '96], e.g. an equal amount of time at the set of truth assignments with 0 unsat clauses, 1 unsat clause, …
How? Flip a variable and accept the new state σ' with probability p(σ → σ') = min{1, n(E)/n(E')}, where E and E' are the energies (numbers of violated clauses) of σ and σ'.
This always accepts "rarer" states (when n(E') < n(E)).

Example
RED ↔ 0 unsat clauses, n(0) = 1
GREEN ↔ 1 unsat clause, n(1) = 3
BLUE ↔ 2 unsat clauses, n(2) = 10
Goal: visit all energy levels (colors) equally often.
(On the slides this is animated as a random walk over the small state space above: moves toward a rarer color are always accepted, with probability 1, while moves toward a more common color are accepted with probability n(E)/n(E'), e.g. 3/10 from a GREEN to a BLUE state, 1/3 from a RED to a GREEN state, and 1/10 from a RED to a BLUE state.)
…and finally the walk has spent roughly the same total time in each color.

Flat histogram: intuition
Detailed balance holds with these transition probabilities.
The walk is properly biased towards "rare" states, but there are few "rare" states, so the total time spent at each type of state (RED, GREEN, BLUE) is the same: a flat visit histogram.
Note: in contrast, simulated annealing concentrates sampling around low-energy states (more greedy!).

Flat histogram
But… how can we even run the Markov chain? The acceptance probability min{1, n(E)/n(E')} depends on the density n(·), which is unknown and is precisely what we want to compute!

Adaptive sampling
Start with an initial guess g (our estimate of the true density n).
Random walk: use g in place of n in the acceptance probability. The chain will initially not sample uniformly across energy levels.
At each step, adjust g using a modification factor F, and keep track of the visit histogram H.
When we see a flat H, we must have the right density: g(·) is an estimate of n(·).

The modification factor F
F controls the tradeoff between convergence rate and accuracy.
Use a large modification factor F at the beginning to get rough estimates quickly (fast convergence); keep reducing F to get finer estimates.
Analogous to an annealing process.

MCMC-FlatSat: inner loop
After initialization, each move flips a variable and lands at a state with a new energy level E. The inner loop performs adaptive sampling (updating g and the visit histogram) until the visit histogram is flat; g then becomes our new guess for n.

MCMC-FlatSat: outer loop
Reduce the modification factor and repeat the inner loop until g ≈ n.
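Putting the inner and outer loops together, here is a minimal sketch of the flat-histogram scheme as described on these slides. The clause representation, flatness threshold, and F-reduction schedule (here F ← √F) are illustrative assumptions, not the authors' exact settings.

```python
import math, random

def mcmc_flatsat(n_vars, clauses, f0=1.5, f_final=1.0001,
                 flat_tol=0.10, check_every=1000):
    """Wang-Landau-style estimate of the density of states of a CNF formula.
    Works with log-densities; returns an unnormalized log_g over energies 0..m."""
    m = len(clauses)

    def energy(assign):
        return sum(1 for c in clauses
                   if not any(assign[abs(l) - 1] == (l > 0) for l in c))

    log_g = [0.0] * (m + 1)                  # initial guess: g(E) = 1 for all E
    log_f = math.log(f0)
    assign = [random.random() < 0.5 for _ in range(n_vars)]
    e = energy(assign)

    while log_f > math.log(f_final):         # outer loop
        hist = [0] * (m + 1)
        steps = 0
        while True:                          # inner loop
            var = random.randrange(n_vars)
            assign[var] = not assign[var]
            e_new = energy(assign)
            # accept with prob min(1, g(E)/g(E')), using the current estimate g
            delta = log_g[e] - log_g[e_new]
            if delta >= 0 or random.random() < math.exp(delta):
                e = e_new
            else:
                assign[var] = not assign[var]          # reject: undo the flip
            log_g[e] += log_f                          # adjust the estimate
            hist[e] += 1
            steps += 1
            if steps % check_every == 0:
                visited = [h for h in hist if h > 0]   # simplified flatness test
                if min(visited) >= (1 - flat_tol) * max(visited):
                    break                              # histogram is flat
        log_f /= 2.0                                   # reduce modification factor
    return log_g
```

The returned log_g is only determined up to an additive constant; it can be normalized afterwards (see the implementation-details slide later in the deck).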

Outline of empirical validation
Empirical validation on combinatorial problems: convergence, efficiency (number of samples vs. search space size), accuracy.
We study: small formulas with ground truth, large synthetic formulas, random 3-SAT, large structured instances, model counting.

Formulas with known ground truth
Instances from the MAX-SAT 2007 competition (Ramsey, spin glass, max clique). Direct enumeration is possible (n ≤ 28), so we can compare our estimate with the ground truth.
Metrics: KL divergence between the normalized ground-truth and estimated densities, and the maximum relative error per point.
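A minimal sketch of the two accuracy metrics, assuming both densities are given as arrays of counts over energy levels; the exact normalization used on the slides is not shown, so treating both as probability distributions (and measuring the relative error on raw counts rather than log-counts) is an assumption here.

```python
import numpy as np

def kl_divergence(n_true, g_est, eps=1e-300):
    """KL(p || q) between the normalized true and estimated densities."""
    p = np.asarray(n_true, dtype=float); p /= p.sum()
    q = np.asarray(g_est, dtype=float);  q /= q.sum()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))

def max_relative_error(n_true, g_est):
    """Largest per-level relative error |g(E) - n(E)| / n(E) over levels with n(E) > 0."""
    n_true = np.asarray(n_true, dtype=float)
    g_est = np.asarray(g_est, dtype=float)
    mask = n_true > 0
    return float(np.max(np.abs(g_est[mask] - n_true[mask]) / n_true[mask]))
```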

Spin glass instance, 27 variables, 162 clauses
[Figure: log-density vs. energy (# unsat clauses), estimate g against ground truth n, with the relative error (%) per energy level; 3.1% maximum relative error.]
Needs ~8×10^6 flips, far less than the search space size (2^27 ≈ 1.3×10^8).

[Figures: log-density vs. energy (# unsat clauses) for Ramsey and max clique instances, estimate g against ground truth n.]

| Instance         | variables | clauses | KL-divergence | Max rel. error | Entropy |
|------------------|-----------|---------|---------------|----------------|---------|
| ramk3n7.ra0      | 21        | 70      | 3.9E-05       | 2.39%          | 2.45    |
| ramk3n8.ra0      | 28        | 126     | 1.1E-05       | 5.1%           | 3.93    |
| johnson8-2-4.clq |           | 420     | 4.5E-05       | 5.5%           | 2.90    |
| T3pm3-5555.spn   | 27        | 162     | 1.3E-05       | 3.15%          | 3.34    |

Synthetic formulas
How do we get ground truth for larger formulas? Construct synthetic formulas for which we can derive a closed-form solution for the density.
Key result on composition: if F is the logical conjunction of l independent copies of Φ (copies that do not share variables), then the density of F is the convolution of the density of Φ with itself l times.
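A small sketch of this composition rule; np.convolve plays the role of the convolution, and the clause/density representation is the same hypothetical one as in the earlier sketches.

```python
import numpy as np

def compose_density(dos_phi, l):
    """Density of states of the conjunction of l variable-disjoint copies of a
    formula whose DOS is dos_phi: the l-fold convolution of dos_phi with itself."""
    dos = np.array([1.0])            # DOS of the empty formula: one level, count 1
    for _ in range(l):
        dos = np.convolve(dos, dos_phi)
    return dos

# Example: one clause over k=3 fresh variables has DOS [7, 1]
# (7 local assignments satisfy it, 1 violates it); 10 disjoint such clauses:
print(compose_density([7, 1], 10))   # binomial-shaped counts summing to 8**10
```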

Convolution of uniform densities; convolution of pigeon-hole formulas
[Figures: log-density vs. energy (# unsat clauses), estimate g against ground truth n.]
Needs ~2×10^7 flips, far less than the state space size (2^50 ≈ 10^15).

| Instance       | variables | clauses | KL-divergence | Max relative error | Entropy |
|----------------|-----------|---------|---------------|--------------------|---------|
| UnifConv50_100 | 50        | 100     | 1.19E-05      | 3.05%              | 7.3     |
| PigHoleConv410 | 200       | 750     | 1.26E-07      | 2.2%               | 33.7    |

Random k-SAT formulas
Well-known phase transition for the satisfiability property in terms of the clause-to-variable ratio α (Φ is satisfiable ↔ n_Φ(0) > 0).
Analytic result on the average density: for a fixed truth assignment, each random k-SAT clause is violated with probability 1/2^k, independently across clauses.
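From the independence statement above, the expected density over random k-SAT formulas is a scaled binomial. The sketch below is my derivation from that statement, not a formula reproduced from the slides.

```python
from math import comb

def expected_density(n_vars, m_clauses, k):
    """E[n(i)] for random k-SAT: there are 2^n_vars assignments, each violating a
    given clause with probability 2^-k independently, so the expected number of
    assignments violating exactly i clauses is 2^n_vars * Binomial(m, 2^-k) at i."""
    p = 2.0 ** (-k)
    return [2 ** n_vars * comb(m_clauses, i) * p ** i * (1 - p) ** (m_clauses - i)
            for i in range(m_clauses + 1)]

# Example: n = 50, alpha = 4.0 (m = 200), k = 3
avg = expected_density(50, 200, 3)
print(avg[0])   # expected number of models, E[n(0)] = 2^50 * (7/8)^200
```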

Average densities for random 3-SAT, n = 50 variables (average over 1000 instances)
[Figure: average log-densities; axis labels: log-density, ratio of clauses to variables α.]
Needs ~10^8 flips, far less than the state space size (2^50 ≈ 10^15).

Large structured instances
No ground truth is known, so we use consistency checks:
- Number of models, when exact model counting is feasible (# models = n(0)).
- Method of moments (see the sketch below): sample K assignments at random and compute their energies (unsat clauses); compute the sample moments (e.g. average energy, second-order moment of the energy, …) and compare with the moments obtained from the estimated density g.
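A minimal sketch of that consistency check, reusing a hypothetical energy helper in the style of the earlier sketches; the moments from the estimated density treat g as proportional to the distribution of the energy under uniform sampling.

```python
import random
import numpy as np

def sample_moments(n_vars, clauses, energy_fn, num_samples=100_000, order=2):
    """Empirical energy moments from uniformly random assignments."""
    energies = []
    for _ in range(num_samples):
        assign = [random.random() < 0.5 for _ in range(n_vars)]
        energies.append(energy_fn(assign, clauses))
    energies = np.array(energies, dtype=float)
    return [float(np.mean(energies ** r)) for r in range(1, order + 1)]

def density_moments(g, order=2):
    """Energy moments implied by an (unnormalized) density-of-states estimate g."""
    g = np.asarray(g, dtype=float)
    probs = g / g.sum()
    levels = np.arange(len(g), dtype=float)
    return [float(np.sum(probs * levels ** r)) for r in range(1, order + 1)]
```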

Large structured instances (Ms(r) = sample moment of order r from random assignments, M(r) = moment from the estimated density g)

| Instance       | variables | clauses | g(0) | # models | Ms(1) | M(1)  | Ms(2)   | M(2)    |
|----------------|-----------|---------|------|----------|-------|-------|---------|---------|
| brock400_2.clq | 40        | 1188    |      |          | 297.0 |       | 88365.9 | 88372.3 |
| Spinglass5.pm  | 125       | 750     |      |          | 187.4 |       | 35249.2 | 35247.1 |
| MANN_a27.clq   | 42        | 1690    |      |          | 422.4 |       | 178709  | 178703  |
| bw_large.a     | 459       | 4675    | 1    | 1        | 995.2 | 995.3 | 996349  | 996634  |

Logistics instance
Planning problem: n = 459 variables, m = 4675 clauses. Huge search space (2^459 truth assignments), but MCMC-FlatSat returns within hours.
Remarkable precision: it finds the only existing model, g(0) = n(0) = 1! The mode of the estimated distribution is e^300 times larger than the number of models, so g(0) counts the needles and the haystack (and the moments method indicates that g is accurate).
[Figure: log-density vs. energy (# unsat clauses) for this instance.]

Model counting
Comparison with state-of-the-art model counters: SampleCount [Gomes et al. '06] and SampleMiniSAT [Gogate et al. '07].
MCMC-FlatSat is very accurate; timings are competitive when the clause-to-variable ratio is not too large.
The DOS provides guidance (information about what is not a model), but also some overhead, because it computes more information than just the model count.

Model counting comparison

| Instance  | variables | clauses | Exact # models | SampleCount models | Time (s) | SampleMiniSAT models | Time (s) | MCMC-FlatSat models | Time (s) |
|-----------|-----------|---------|----------------|--------------------|----------|----------------------|----------|---------------------|----------|
| 2bitmax   | 252       | 766     | 2.10×10^29     | >2.40×10^28        | 29       | 2.08×10^29           | 345      | 1.96×10^29          | 1863     |
| wff-3-3.5 | 150       | 525     | 1.40×10^14     | >1.60×10^13        | 145      | 1.60×10^13           | 240      | 1.34×10^14          | 393      |
| wff-3.1.5 | 100       |         | 1.80×10^21     | >1.00×10^20        |          | 1.58×10^21           | 128      | 1.83×10^21          | 21       |
| wff-4-5.0 |           | 500     |                | >8.00×10^15        | 120      | 1.09×10^17           | 191      | 8.64×10^16          | 189      |
| ls8-norm  | 301       | 1603    | 5.40×10^11     | >3.10×10^10        | 1140     | 2.22×10^11           | 168      | 5.93×10^11          | 2693     |

Conclusions
Computing the density of states is a hard problem (it encompasses SAT, MAX-SAT, and #SAT).
MCMC-FlatSat: a sampling strategy adapted from physics to combinatorial spaces that adaptively explores the space while collecting statistics. Extremely accurate and very efficient (few samples).
It provides a compact, rich description of the search space: new insights about structure and local search.
Very general method: any property can be used to partition the search space. Many applications to counting and inference problems.
(Speaker notes: Kirkpatrick's simulated annealing used Metropolis sampling for combinatorial problems; here we showed the effectiveness, for combinatorial problems, of a method originally developed for spin glasses. The partitioning of the search space is completely arbitrary. The method can also produce large numbers of assignments that violate a given number of clauses, e.g. "give me an assignment that violates exactly 2000 clauses", which is not easy to do with current methods (MIN-SAT). Overall: new insights about combinatorial search spaces, a finer description of the space.)

Extra slides

Future work
- SAT-specific improvements: energy saturation; energy barriers (and related normalization issues); WalkSAT heuristics (with Metropolis-Hastings updates)
- Direct application to inference in Markov Logic
- Formal proof of convergence / counterexamples
- Application to other counting problems in combinatorial spaces

Runtime for random 3-SAT: search space of size 2^50, ~10^8 ≈ 2^26 flips.

Related work on random 3-SAT
Lots of work on the i = 0 (i.e. SAT/UNSAT) case [Gent et al. '94].
Previous experimental work for i > 0 [Zhang '01] uses a different definition ("no more than i unsat clauses" versus our "exactly i unsat clauses"); the phase transitions apparently occur at the same location.
Analytic results [Achlioptas et al. '05].
We can see two phase transitions: assignments are also unlikely to violate a large number of clauses.

Other phase transition 50 variables, 1000 instances

Other structured instances

Histogram flatness (formal)
Flatness of the visit histogram H is a necessary condition for convergence.
If g is equal to the true density n, detailed balance is satisfied by the chain with acceptance probability min{1, n(E)/n(E')}; upon convergence, the steady-state probability of a state is proportional to the reciprocal of the density of its energy level.
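In symbols, a one-line check of that claim (my own restatement, assuming a symmetric proposal such as a uniformly random variable flip): detailed balance with acceptance probability min{1, n(E(σ))/n(E(σ'))} gives a stationary distribution

```latex
\pi(\sigma) \;\propto\; \frac{1}{n(E(\sigma))}
\quad\Longrightarrow\quad
\Pr[E(\sigma) = E] \;=\; \sum_{\sigma:\,E(\sigma)=E} \pi(\sigma)
\;\propto\; n(E)\cdot\frac{1}{n(E)} \;=\; 1,
```

so every energy level has the same stationary probability, i.e. a flat visit histogram.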

Histogram flatness (formal)
Flatness of the visit histogram H is a necessary condition for convergence: if g = n, the visit histogram H will be flat, i.e. energy levels are visited equally often.
Problem: the density n(·) is unknown and is precisely what we want to compute!

Propositional Satisfiability (SAT)
Satisfiability (SAT): given a formula in propositional calculus, does it have a model, i.e., is there an assignment to its variables making it true? Example: (a ∨ b ∨ c) AND (b ∨ c) AND (a ∨ c).
SAT is the prototypical hard combinatorial search and reasoning problem; it is NP-complete (Cook 1971).

Prior work: Metropolis
Approximate [Rose et al. '96]: Metropolis sampling at temperature T, with energy = number of violated clauses. Measure the empirical probability P(E) of the energy levels, then rescale to get the density n(E).
Drawbacks: slow mixing times [Wei et al. '04], difficult choice of the temperature.
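The rescaling step relies on the Boltzmann form of the equilibrium distribution (a standard fact, not spelled out on the slide): at temperature T the chain samples each energy level with probability proportional to n(E) e^{-E/T}, so

```latex
P(E) \;=\; \frac{n(E)\, e^{-E/T}}{Z(T)}
\qquad\Longrightarrow\qquad
n(E) \;\propto\; P(E)\, e^{\,E/T}.
```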

Theoretical convergence
Stated as a conjecture in Lee, Okabe, and Landau, "Convergence and Refinement of the Wang-Landau Algorithm", Computer Physics Communications, 2006.
Proof of convergence in Atchadé and Liu, "The Wang-Landau Algorithm for Monte Carlo Computation in General State Spaces", Statistica Sinica, 2009.

Closed forms
For k-SAT with m clauses such that each variable appears in exactly one clause, the DOS is n(i) = 2^n · C(m, i) · q^i · (1 − q)^(m−i), where q is the fraction of assignments of the variables in a single clause not satisfying it (e.g. in 3-SAT, only 1 assignment out of 8 does not satisfy a clause, so q = 1/8).

Implementation details
Use of log-densities.
New assignments are generated by flipping a variable chosen uniformly at random.
Flatness condition: the visit counts must be within 10% of the maximum.
Normalization: the density is obtained only up to a constant factor, so we fix it with a normalization constraint (e.g. the counts over all energy levels must sum to the total number of assignments, 2^n).
F0 = 1.5, reduced at every outer iteration; flatness is checked every 1000 moves.
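For instance, a log-space normalization step consistent with the constraint mentioned above (assuming the sum-to-2^n constraint; the authors' exact choice of normalization is not shown in this transcript):

```python
import numpy as np

def normalize_log_density(log_g, n_vars):
    """Shift an unnormalized log-density so that sum_E exp(log_g[E]) = 2**n_vars,
    i.e. the estimated counts add up to the total number of assignments."""
    log_g = np.asarray(log_g, dtype=float)
    m = log_g.max()
    log_total = m + np.log(np.exp(log_g - m).sum())   # stable log-sum-exp
    return log_g + n_vars * np.log(2.0) - log_total
```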

Random k-SAT formulas
k-SAT: each clause has at most k literals.
Random k-SAT: formulas Φ are generated with n variables and m clauses, each clause produced independently by randomly choosing a set of k variables out of the n available and negating each one with probability 0.5.
α is the clause-to-variable ratio (m/n); this generative model induces a probability measure over formulas Φ.
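A minimal sketch of this generative model, returning clauses in the same DIMACS-style list-of-literals form used in the earlier sketches:

```python
import random

def random_ksat(n_vars, m_clauses, k, rng=random):
    """Random k-SAT: each clause picks k distinct variables uniformly at random
    and negates each chosen variable independently with probability 0.5."""
    clauses = []
    for _ in range(m_clauses):
        variables = rng.sample(range(1, n_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

# Example: a random 3-SAT formula with 50 variables at ratio alpha = 4.26
formula = random_ksat(50, int(4.26 * 50), 3)
```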

Composition
Take the logical conjunction F of l copies of a formula Φ that do not share variables. The density of F is the convolution of the density of Φ with itself l times, analogous to the probability density of the sum of l independent random variables.

Closed forms
Starting with a formula Φ with a uniform density, we construct F as above; the resulting density is that of a sum of independent uniform random variables. The sum of n s-sided dice has the closed-form probability distribution
P(sum = j) = s^(−n) · Σ_{k=0..⌊(j−n)/s⌋} (−1)^k · C(n, k) · C(j − s·k − 1, n − 1).
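A quick sketch checking that closed form against direct convolution (hypothetical helper names; the per-die count vector plays the role of a uniform per-copy density):

```python
import numpy as np
from math import comb

def dice_sum_counts_closed_form(n_dice, s):
    """Number of ways to obtain each total j = n_dice .. n_dice*s with n_dice s-sided dice."""
    counts = []
    for j in range(n_dice, n_dice * s + 1):
        counts.append(sum((-1) ** k * comb(n_dice, k) * comb(j - s * k - 1, n_dice - 1)
                          for k in range((j - n_dice) // s + 1)))
    return np.array(counts)

def dice_sum_counts_convolution(n_dice, s):
    """Same counts obtained by convolving the single-die count vector with itself."""
    single = np.ones(s, dtype=int)        # one way each for faces 1..s
    counts = np.array([1])
    for _ in range(n_dice):
        counts = np.convolve(counts, single)
    return counts

assert (dice_sum_counts_closed_form(5, 6) == dice_sum_counts_convolution(5, 6)).all()
```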