
Solution Counting Methods for Combinatorial Problems Ashish Sabharwal [ Cornell University] Based on joint work with: Carla Gomes, Willem-Jan van Hoeve, Lukas Kroc, Bart Selman INFORMS, Oct 2008, Washington, D.C.

Context

- Constraint Satisfaction Problems (CSPs)
- In particular, Boolean Satisfiability, or SAT:
  - Given a Boolean formula F in conjunctive normal form, e.g. F = (a or b) and (¬a or ¬c or d) and (b or c), determine whether F is satisfiable
  - NP-complete
  - Widely used in practice, e.g. in hardware & software verification, design automation, AI planning, …

How many satisfying assignments does F have?

- #F, the “model count” of F, the solution count of F
- #SAT is #P-complete
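To make the model count concrete, here is a minimal brute-force sketch. It assumes the example formula reads (a or b) and (not a or not c or d) and (b or c); the clause encoding with signed integers and the function names are illustrative choices, not something from the talk. Enumeration like this only works for a handful of variables, which is the reason the methods in the rest of the talk exist.

```python
from itertools import product

# Signed-integer clauses: 1..4 stand for a, b, c, d; a negative number is a negated variable.
# Assumed reading of the slide's example formula.
F = [[1, 2], [-1, -3, 4], [2, 3]]

def satisfies(assignment, clauses):
    """assignment[i] is the truth value of variable i + 1."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def count_models(clauses, num_vars):
    """Count satisfying assignments by trying all 2^num_vars of them."""
    return sum(
        1
        for assignment in product([False, True], repeat=num_vars)
        if satisfies(assignment, clauses)
    )

print(count_models(F, 4))  # 8 of the 16 assignments satisfy F
```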

Model Counting for SAT

- Inspired by the success of SAT solvers, there has been a lot of activity in the last few years in attacking the solution counting problem
  - Aside: “success of SAT” = scalability, industrial applications, black-box nature, and standardized input, making it ‘easy’ for users
- Many different approaches, many different counting goals
  - A “zoo” of techniques!
- This talk: a brief overview of these techniques, many of which were contributed by our group at Cornell
- Further reading and refs: Model Counting chapter in the upcoming Handbook of Satisfiability (draft available on my webpage), with Carla Gomes and Bart Selman

What shall we count?

[Figure: number line from 0 to 2^N with #F marked. E.g., F has N = 1000 variables and 10^150 ≈ 2^500 solutions.]

Counting goals:

- Exact count
- Estimate, no guarantees
- Upper bound (appears hard!)
- Lower bound
- Strict “(ε, δ)” guarantee
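The slide does not spell out the strict guarantee. One common formalization, given here as an assumption about what is meant, asks that a randomized estimate (called \hat{c} below, a name introduced only for this illustration) falls within a factor 1 + ε of #F with probability at least 1 − δ. The second line checks the size comparison used in the example.

```latex
\Pr\!\left[\frac{\#F}{1+\varepsilon} \;\le\; \hat{c} \;\le\; (1+\varepsilon)\,\#F\right] \;\ge\; 1-\delta

\log_2\!\left(10^{150}\right) = 150 \log_2 10 \approx 150 \times 3.32 \approx 498,
\qquad\text{so } 10^{150} \approx 2^{500}.
```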

Problem Space: why are upper bounds hard?

[Figure: number line from 0 to 2^N with #F marked. E.g., F has N = 1000 variables and 10^150 ≈ 2^500 solutions.]

- Number of solutions is often a minuscule fraction of the search space size
  - Limits our ability to reason about upper bounds
  - E.g., after having searched half the space, could still have 2^999 potential solutions remaining in the worst case! (off by a factor of 2^499)
- Probabilistic methods work better for lower bounds
  - E.g., if expected value = true count, Markov’s inequality says we can’t get high numbers too often, because 0’s can’t compensate enough
  - Reverse Markov’s inequality doesn’t help: we can get low numbers too often, because a single 2^N can compensate for a lot of low numbers!
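The asymmetry between lower and upper bounds can be written out as a short derivation. It assumes a nonnegative randomized estimate X with E[X] = #F and X ≤ 2^N; the symbols X and c are introduced here for illustration only.

```latex
% Lower bounds: Markov's inequality, for any c > 1
\Pr\!\left[X \ge c \cdot \#F\right] \;\le\; \frac{\mathbb{E}[X]}{c \cdot \#F} \;=\; \frac{1}{c},
\qquad\text{so } X/c \text{ is a valid lower bound on } \#F \text{ with probability at least } 1 - \tfrac{1}{c}.

% Upper bounds: reverse Markov needs the cap X <= 2^N
\Pr\!\left[X \le \frac{\#F}{c}\right] \;\le\; \frac{2^N - \#F}{2^N - \#F/c} \;\approx\; 1
\qquad\text{when } \#F \ll 2^N,
```

so the reverse inequality gives essentially no information when solutions are a tiny fraction of the search space.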

The “Zoo” of Counting Methods

Solution counting

- Exact methods
  - Categories: “Only” the count; Count + many by-products
  - Methods: DPLL-style backtrack search; Knowledge compilation; FPT: branch-width, tree-width, …
- Approximate methods
  - Categories: Practical bounds with a guarantee; Estimation without any guarantee
  - Methods: Using backtrack-free space; Sampling + multipliers; Sampling + randomization; FPRAS: MCMC sampling; XOR streamlining (randomized); Backtrack search + randomization + statistics; Belief propagation + randomization
- Diagram tags L and U (lower / upper bound) mark individual methods
- Note: not an exhaustive listing

I. Exact Methods

- Categories: “Only” the count; Count + many by-products
- DPLL-style backtrack search: [“CDP”, Birnbaum-Lozinskii-99] [“relsat”, Bayardo-Pehoushek-00] [“cachet”, Sang et al-04] [“sharpSAT”, Thurley-06]
- Knowledge compilation
- FPT: branch-width, tree-width, …: [tree-width: Gottlob-Scarcello-Sideri-02] [branch-width: Bacchus-Dalmao-Pitassi-03] [cluster-width: Fischer-Makowsky-Ravve-08]
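As a rough illustration of DPLL-style counting, the sketch below branches on a variable, simplifies the clause set, and adds 2^(number of free variables) whenever all clauses are satisfied. It is a toy under stated assumptions (signed-integer clauses, naive branching on the first literal, hypothetical function names) and leaves out everything that makes the cited counters fast.

```python
def simplify(clauses, lit):
    """Assign lit = True: drop satisfied clauses, remove the falsified literal."""
    reduced = []
    for clause in clauses:
        if lit in clause:
            continue  # clause is satisfied
        reduced.append([l for l in clause if l != -lit])
    return reduced

def count_dpll(clauses, free_vars):
    """Count models of `clauses` over the variables in `free_vars`."""
    if any(len(clause) == 0 for clause in clauses):
        return 0  # an empty clause means this branch has no solutions
    if not clauses:
        return 2 ** len(free_vars)  # every remaining variable is unconstrained
    var = abs(clauses[0][0])  # naive branching choice
    rest = free_vars - {var}
    return (count_dpll(simplify(clauses, var), rest)
            + count_dpll(simplify(clauses, -var), rest))

# Assumed reading of the example formula: (a or b) and (not a or not c or d) and (b or c)
F = [[1, 2], [-1, -3, 4], [2, 3]]
print(count_dpll(F, frozenset({1, 2, 3, 4})))  # 8
```

Unlike plain enumeration, a branch that satisfies every clause early contributes its whole subtree in one step.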

Knowledge Compilation for Counting

- Main idea: convert F into a different “form” from which one can easily read off the solution count (and many other quantities of interest) [DNNF, “c2d”, Darwiche et al. 2001-05]
- d-DNNF: Deterministic, Decomposable Negation Normal Form
  - Think of the formula as a directed acyclic graph (DAG)
  - Negations allowed only at the leaves (NNF)
  - Children of an AND node don’t share any variables (different “components”): can multiply the counts
  - Children of an OR node don’t share any solutions: can add the counts
- Once converted to d-DNNF, can answer many queries in linear time
  - Satisfiability, tautology, logical equivalence, solution counts, …
  - Any query that a BDD could answer
- Our recent result: can count the number of “clusters” of solutions: how many different kinds/families of solutions are there? [To appear in NIPS-08]
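The multiply-at-AND, add-at-OR rule can be sketched on a tiny hand-built d-DNNF. The tuple encoding and function names below are assumptions made for this illustration and are not the c2d format. Because the two OR children here mention different variables, each child's count is scaled by 2 for every variable it omits before adding.

```python
def count(node):
    """Return (model count over the node's own variables, set of those variables)."""
    kind = node[0]
    if kind == "lit":
        return 1, frozenset([abs(node[1])])
    results = [count(child) for child in node[1]]
    all_vars = frozenset().union(*(vs for _, vs in results))
    if kind == "and":
        total = 1
        for c, _ in results:
            total *= c  # decomposable: children share no variables
        return total, all_vars
    if kind == "or":
        # deterministic: children share no solutions, so counts add up
        # once each child is lifted to the same variable set
        total = sum(c * 2 ** (len(all_vars) - len(vs)) for c, vs in results)
        return total, all_vars
    raise ValueError(f"unknown node kind: {kind}")

def model_count(root, num_vars):
    c, vs = count(root)
    return c * 2 ** (num_vars - len(vs))  # variables absent from the DAG are free

# (x1 and x2) or (not x1 and x3): the OR children disagree on x1, so they share no solutions
dnnf = ("or", [
    ("and", [("lit", 1), ("lit", 2)]),
    ("and", [("lit", -1), ("lit", 3)]),
])
print(model_count(dnnf, 3))  # 4
```

This version walks a tree; on a real DAG with shared sub-nodes, memoizing `count` per node would be needed to keep the traversal linear in the size of the graph.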

II. Approximate Methods

- Categories: Practical bounds with a guarantee; Estimation without any guarantee
- Using backtrack-free space: [“SampleMinisat”, Gogate-Dechter-07]
- Sampling + multipliers
- Sampling + randomization
- XOR streamlining (randomized)
- Backtrack search + randomization + statistics: [“MiniCount”, CPAIOR-08]
- Belief propagation + randomization
- FPRAS: MCMC sampling: [Karp-Luby-85] [Karp-Luby-Madras-89]
- Diagram tags L and U (lower / upper bound) mark individual methods

XOR Streamlining for Bounds on #F

- Main idea: rather than modifying the algorithm for solving, modify the problem, run the solver, deduce the count [“Mbound”, AAAI-06]
- Pipeline: CNF formula + random XOR constraints → streamlined formula → off-the-shelf SAT solver → model count (ideal when systematic search works well!)
- Randomized algorithm, expected value = true count
- Can be converted into bounds with correctness guarantees
  - Lower bounds easier in practice (XORs of any “length” work)
  - Upper bounds possible but not so easy
- Empirical evidence: can get by with “very short” XORs [SAT-07]
- Can be extended to general CSPs [AAAI-07; see Willem’s talk]
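The "expected value = true count" property can be illustrated with a toy sketch. Each random XOR below includes every variable with probability 1/2 and has a random parity bit, so any fixed assignment survives one XOR with probability 1/2. This is not the MBound procedure itself: brute-force enumeration stands in for the off-the-shelf SAT solver, survivors are counted rather than merely tested for existence, and all names are hypothetical.

```python
import random
from itertools import product

def satisfies_cnf(assignment, clauses):
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def random_xor(num_vars, rng):
    """Pick each variable with probability 1/2, plus a random required parity."""
    variables = [v for v in range(num_vars) if rng.random() < 0.5]
    parity = rng.random() < 0.5
    return variables, parity

def satisfies_xor(assignment, xor):
    variables, parity = xor
    return (sum(assignment[v] for v in variables) % 2 == 1) == parity

def streamlined_estimate(clauses, num_vars, num_xors, rng):
    """Count solutions of the streamlined formula and scale by 2^num_xors."""
    xors = [random_xor(num_vars, rng) for _ in range(num_xors)]
    survivors = sum(
        1
        for a in product([False, True], repeat=num_vars)
        if satisfies_cnf(a, clauses) and all(satisfies_xor(a, x) for x in xors)
    )
    return survivors * 2 ** num_xors

# Assumed reading of the example formula, which has 8 models
F = [[1, 2], [-1, -3, 4], [2, 3]]
rng = random.Random(0)
trials = [streamlined_estimate(F, 4, 2, rng) for _ in range(4000)]
print(sum(trials) / len(trials))  # averages out close to the true count of 8
```

A single run can be far from the truth, which is why the guarantees mentioned above come from repeating the experiment and reasoning about the outcomes statistically.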

Sampling for Estimates + Lower Bound

- Main idea: “find” a balanced variable, one that appears roughly equally as True and as False in solutions; fix it to one value, count that sub-problem, and re-scale with the appropriate multiplier
- Finding balanced variables is not so easy
  - Use solution sampling: ideal when local search works well!
  - Use Belief Propagation for “marginal” probability estimates: ideal when message passing works well!
- Randomize the process: expected value = true count, as before!
- Great lower bounds, but variance too high for good upper bounds

[Figure: branching on x = ? into T and F, with 40% of solutions on one branch and 60% on the other. E.g., if x=T holds in 60% of solutions, count #F|x=T and scale up by a factor of 100/60.]

[“ApproxCount”, Wei-Selman-05] [“BPCount”, CPAIOR-08] [“SampleCount”, IJCAI-07]
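The re-scaling step and its randomized variant can be sketched as follows. Exact marginals computed by enumeration stand in for the sampled or belief-propagation estimates a real method would use, so this shows only the arithmetic and not ApproxCount, SampleCount, or BPCount themselves. Function names and the clause encoding are assumptions made for illustration.

```python
import random
from fractions import Fraction
from itertools import product

def satisfies(assignment, clauses):
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def count_with(clauses, num_vars, var, value):
    """Count solutions of the sub-problem in which variable `var` is fixed to `value`."""
    return sum(
        1
        for a in product([False, True], repeat=num_vars)
        if a[var - 1] == value and satisfies(a, clauses)
    )

def exact_marginal(clauses, num_vars, var, value):
    """Fraction of all solutions with var = value (assumes at least one solution)."""
    total = (count_with(clauses, num_vars, var, True)
             + count_with(clauses, num_vars, var, False))
    return Fraction(count_with(clauses, num_vars, var, value), total)

def rescaled_count(clauses, num_vars, var, value, marginal):
    """Multiplier approach: sub-problem count divided by the (estimated) marginal."""
    return count_with(clauses, num_vars, var, value) / marginal

def coin_flip_estimate(clauses, num_vars, var, rng):
    """Randomized approach: fix var to a random value and scale by 2."""
    value = rng.random() < 0.5
    return 2 * count_with(clauses, num_vars, var, value)

# Assumed reading of the example formula, which has 8 models
F = [[1, 2], [-1, -3, 4], [2, 3]]
m = exact_marginal(F, 4, 2, True)            # b is True in 7 of the 8 solutions
print(m, rescaled_count(F, 4, 2, True, m))   # 7/8 8
print(coin_flip_estimate(F, 4, 2, random.Random(0)))  # 14 or 2; the average over the coin flip is 8
```

Variable b is far from balanced here, which is why the coin-flip estimate swings between 2 and 14; a balanced variable such as a (4 solutions on each side) would return 8 on every run.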

