Exact Model Counting: Limitations of SAT-Solver-Based Methods
Paul Beame, Jerry Li, Sudeepa Roy, Dan Suciu (University of Washington)
[UAI '13], [ICDT '14]
Model Counting
Model Counting Problem: Given a Boolean formula/circuit F, compute #F = the number of models (satisfying assignments) of F.
Traditional cases of interest: F is CNF or DNF.
Recent: F is given by a small circuit from a class of simple circuits.
Probability Computation Problem: Given F and independent probabilities Pr(x), Pr(y), Pr(z), …, compute Pr(F).
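To make the two problems concrete, here is a minimal brute-force sketch in Python (my own illustration, not from the talk; the list-of-clauses CNF encoding and all names are assumptions):

    from itertools import product

    # CNF as a list of clauses; each literal is a signed variable index (DIMACS-style).
    # Example: F = (x1 OR x2) AND (NOT x1 OR x3)
    F = [[1, 2], [-1, 3]]
    n = 3  # number of variables

    def satisfies(assignment, cnf):
        # assignment[i] is the truth value of variable i+1
        return all(any(assignment[abs(l) - 1] == (l > 0) for l in clause)
                   for clause in cnf)

    # Enumerate all 2^n assignments: #F, and Pr(F) under the uniform distribution.
    count = sum(satisfies(a, F) for a in product([False, True], repeat=n))
    print(count, count / 2 ** n)   # prints 4 and 0.5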
Model Counting is #P-hard
▫ Even for formulas where satisfiability is easy to check.
Yet practical model counters can compute #F or Pr(F) for many CNF formulas with hundreds to tens of thousands of variables.
Exact Model Counters for CNF [survey by Gomes et al. '09]
Search-based / DPLL-based (explore the assignment space and count the satisfying assignments): CDP [Birnbaum et al. '99], Relsat [Bayardo Jr. et al. '97, '00], Cachet [Sang et al. '05], SharpSAT [Thurley '06], …
Knowledge-compilation-based (compile F into a "computation-friendly" form): c2d [Darwiche '04], Dsharp [Muise et al. '12], …
Both techniques explicitly or implicitly use DPLL-based algorithms and produce FBDD or Decision-DNNF compiled forms [Huang-Darwiche '05, '07].
Compiled Size vs. Search Time
Desiderata:
▫ Compiled format makes model counting simple
▫ Compiled format is concise
▫ Compiled format is easy to find
Compiled size ≤ search time
▫ Even if construction of the compiled form is only implicit
▫ The gap can be exponential in the number of variables, e.g. an unsatisfiable formula has constant compiled size
Model Counters Use Extensions to DPLL
Caching subformulas ▫ Cachet, SharpSAT, c2d, Dsharp
Component analysis ▫ Relsat, c2d, Cachet, SharpSAT, Dsharp
Conflict-directed clause learning ▫ Cachet, SharpSAT, c2d, Dsharp
Traces: DPLL + caching (+ clause learning) → FBDD; DPLL + caching + components (+ clause learning) → Decision-DNNF.
How much does component analysis help? I.e., how much smaller are decision-DNNFs than FBDDs?
Outline
Review of DPLL-based algorithms for #SAT
▫ Extensions (caching & component analysis)
▫ Knowledge compilation (FBDD & Decision-DNNF)
Decision-DNNF to FBDD conversion theorem
▫ Implications of the conversion
Applications
▫ Probabilistic databases
▫ Separation between lifted vs. grounded inference
Proof sketch for the conversion theorem
Open problems
DPLL Algorithms
F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
[Decision-tree figure: branch on x; the x = 0 subtree evaluates y ∧ (u ∨ w) (probability 3/8), the x = 1 subtree evaluates u ∨ w ∨ z (probability 7/8); overall probability 5/8.]
Assume the uniform distribution for simplicity.
// basic DPLL:
Function Pr(F):
  if F = false then return 0
  if F = true then return 1
  select a variable x
  return ½ Pr(F|x=0) + ½ Pr(F|x=1)
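A runnable rendering of this basic DPLL in Python (a sketch under the same uniform-distribution assumption; conditioning is done by syntactic simplification of the signed-integer clause list from the earlier sketch):

    def condition(cnf, lit):
        # Simplify a clause list after setting literal lit to true:
        # drop satisfied clauses, remove falsified literals.
        out = []
        for clause in cnf:
            if lit in clause:
                continue                       # clause satisfied, drop it
            rest = [l for l in clause if l != -lit]
            if not rest:
                return False                   # empty clause: F is false
            out.append(rest)
        return True if not out else out        # no clauses left: F is true

    def pr(cnf):
        if cnf is False: return 0.0
        if cnf is True:  return 1.0
        x = abs(cnf[0][0])                     # select a variable
        return 0.5 * pr(condition(cnf, -x)) + 0.5 * pr(condition(cnf, x))

    # The slide's example, with variables 1..5 standing for x, y, u, w, z:
    F = [[1, 2], [1, 3, 4], [-1, 3, 4, 5]]
    print(pr(F))                               # 0.625 = 5/8, matching the figure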
DPLL Algorithms
F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
[Same decision-tree figure as the previous slide.]
The trace is a decision tree for F; each internal node is a decision node.
Caching
F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
[Decision-tree figure: the subformula u ∨ w appears on several branches.]
// basic DPLL as before, plus:
// DPLL with caching: cache F and Pr(F); look it up before computing.
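A sketch of the caching extension (again my own Python, reusing condition() from the previous sketch; real counters such as Cachet use carefully engineered component caches, not this naive canonical key):

    cache = {}

    def pr_cached(cnf):
        if cnf is False: return 0.0
        if cnf is True:  return 1.0
        # Canonical form of the residual formula, used as the cache key.
        key = tuple(sorted(tuple(sorted(c)) for c in cnf))
        if key not in cache:
            x = abs(cnf[0][0])
            cache[key] = 0.5 * pr_cached(condition(cnf, -x)) + \
                         0.5 * pr_cached(condition(cnf, x))
        return cache[key]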
Caching & FBDDs
F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
[Figure: the decision tree with the shared u ∨ w subgraph merged into one node.]
The trace is a decision-DAG for F: every variable is tested at most once on any path. This is an FBDD (Free Binary Decision Diagram), also known as a 1-BP (read-once branching program).
Component Analysis
F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
[Figure: under x = 0 the residual formula y ∧ (u ∨ w) splits into the components y and u ∨ w.]
// DPLL with component analysis (and caching):
if F = G ∧ H where G and H have disjoint sets of variables,
  Pr(F) = Pr(G) × Pr(H)
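A sketch of component analysis on top of the same representation (hypothetical code; components are found by union-find over the variables, and condition() is from the earlier sketch):

    def components(cnf):
        # Group clauses into connected components (clauses sharing a variable).
        parent = {}
        def find(v):
            while parent.setdefault(v, v) != v:
                parent[v] = parent[parent[v]]   # path halving
                v = parent[v]
            return v
        for clause in cnf:
            r = find(abs(clause[0]))
            for l in clause[1:]:
                parent[find(abs(l))] = r
        groups = {}
        for clause in cnf:
            groups.setdefault(find(abs(clause[0])), []).append(clause)
        return list(groups.values())

    def pr_comp(cnf):
        if cnf is False: return 0.0
        if cnf is True:  return 1.0
        comps = components(cnf)
        if len(comps) > 1:                      # F = G ∧ H on disjoint variables
            result = 1.0
            for g in comps:
                result *= pr_comp(g)            # Pr(F) = Pr(G) × Pr(H)
            return result
        x = abs(cnf[0][0])
        return 0.5 * pr_comp(condition(cnf, -x)) + 0.5 * pr_comp(condition(cnf, x))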
Components & Decision-DNNF
F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
[Figure: decision DAG with an AND node combining the components y and u ∨ w.]
The trace is a Decision-DNNF [Huang-Darwiche '05, '07]: an FBDD plus "decomposable" AND nodes (the two sub-DAGs of an AND node do not share variables).
How much power do they add?
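Once the trace is compiled, model counting is a single bottom-up pass over the DAG. A sketch with a made-up tuple encoding of nodes (this is my illustration, not the c2d/Dsharp file format):

    def prob(node, p):
        # p maps each variable to Pr(var = true); DAG sharing is handled
        # by memoizing on node identity, so each node is visited once.
        seen = {}
        def go(u):
            if id(u) in seen:
                return seen[id(u)]
            if u[0] == "leaf":
                r = float(u[1])
            elif u[0] == "dec":                     # decision node on variable u[1]
                _, x, lo, hi = u
                r = (1 - p[x]) * go(lo) + p[x] * go(hi)
            else:                                   # decomposable AND: disjoint vars
                _, left, right = u
                r = go(left) * go(right)
            seen[id(u)] = r
            return r
        return go(node)

    # The y ∧ (u ∨ w) branch of the slide's decision-DNNF:
    leaf0, leaf1 = ("leaf", 0), ("leaf", 1)
    u_or_w = ("dec", "u", ("dec", "w", leaf0, leaf1), leaf1)
    node = ("and", ("dec", "y", leaf0, leaf1), u_or_w)
    print(prob(node, {"y": 0.5, "u": 0.5, "w": 0.5}))   # 0.375 = 3/8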
New Conversion Theorem
Theorem: A decision-DNNF for F of size N can be converted into an FBDD for F of size at most N^(log N + 1). If F is a k-DNF or k-CNF, the FBDD has size at most N^k.
The conversion algorithm runs in time linear in the size of its output.
Decomposable Logic Decision Diagrams (DLDDs)
Generalization of decision-DNNFs:
▫ not just decomposable AND nodes
▫ also NOT nodes and decomposable binary OR, XOR, etc. (the sub-DAGs feeding each node are labelled by disjoint sets of variables)
Theorem: The conversion works even for DLDDs.
Implications
Many previous exponential lower bounds for 1-BPs/FBDDs:
▫ 2^Ω(n) lower bounds for certain 2-DNF formulas based on combinatorial designs [Bollig-Wegener '00], [Wegener '02]
Our conversion theorem implies 2^Ω(n) bounds on decision-DNNF size, and hence on the running time of SAT-solver-based exact model counters.
Outline
Review of DPLL-based algorithms for #SAT
▫ Extensions (caching & component analysis)
▫ Knowledge compilation (FBDD & Decision-DNNF)
Decision-DNNF to FBDD conversion theorem
▫ Implications of the conversion
Applications
▫ Probabilistic databases
▫ Separation between lifted vs. grounded inference
Proof sketch for the conversion theorem
Open problems
Applications of Exact Model Counters
Finite model theory ▫ first-order formulas over finite domains
Bayesian inference
Statistical relational models ▫ combinations of logic and probability
Probabilistic databases ▫ monotone restrictions of statistical relational models
Relational Databases
AsthmaPatient: Ann, Bob
Friend: (Ann, Joe), (Ann, Tom), (Bob, Tom)
Smoker: Joe, Tom
Boolean query Q: ∃x ∃y AsthmaPatient(x) ∧ Friend(x, y) ∧ Smoker(y)
Probabilistic Databases
Same database D and query Q: ∃x ∃y AsthmaPatient(x) ∧ Friend(x, y) ∧ Smoker(y).
Tuples are probabilistic (and independent)
▫ e.g. "Ann" is present with probability 0.3: Pr(x1) = 0.3
What is the probability that Q is true on D?
▫ Assign a unique Boolean variable to each tuple: x1, x2 for AsthmaPatient; y1, y2, y3 for Friend; z1, z2 for Smoker
▫ Boolean formula F_{Q,D} = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)
▫ Q is true on D ⇔ F_{Q,D} is true
Probabilistic Databases
Boolean formula F_{Q,D} = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)
▫ Q is true on D ⇔ F_{Q,D} is true
Query probability computation = model counting: compute Pr(F_{Q,D}) given Pr(x1), Pr(x2), …
Monotone database query Q ⇒ monotone k-DNF F_{Q,D}
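Concretely, Pr(F_{Q,D}) for this lineage can be computed by summing over possible worlds (a brute-force Python sketch; only Pr(x1) = 0.3 comes from the slide, the other tuple probabilities here are invented for illustration):

    from itertools import product

    # The monotone 3-DNF lineage of Q on D.
    terms = [("x1", "y1", "z1"), ("x1", "y2", "z2"), ("x2", "y3", "z2")]
    p = {"x1": 0.3, "x2": 0.5, "y1": 0.9, "y2": 0.4,
         "y3": 0.7, "z1": 0.6, "z2": 0.8}

    prob_q = 0.0
    vars_ = sorted(p)
    for bits in product([0, 1], repeat=len(vars_)):
        world = dict(zip(vars_, bits))          # one possible world of the database
        if any(all(world[v] for v in t) for t in terms):   # is F_{Q,D} true here?
            w = 1.0
            for v in vars_:
                w *= p[v] if world[v] else 1 - p[v]
            prob_q += w
    print(prob_q)                               # Pr(Q is true on D)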
A Class of DB Examples
H1(x,y) = (R(x) ∨ S(x,y)) ∧ (S(x,y) ∨ T(y))
Hk(x,y) = (R(x) ∨ S1(x,y)) ∧ … ∧ (Si(x,y) ∨ Si+1(x,y)) ∧ … ∧ (Sk(x,y) ∨ T(y))
Write hk0, …, hki, …, hkk for the k+1 clauses of Hk, each universally quantified over x and y (hk0 is R(x) ∨ S1(x,y); hkk is Sk(x,y) ∨ T(y)).
Dichotomy Theorem [Dalvi-Suciu '12]: Model counting for a Boolean combination of hk0, …, hkk is either
▫ #P-hard, e.g. Hk, or
▫ polynomial-time computable using "lifted inference" (inclusion-exclusion), e.g. (h30 ∨ h32) ∧ (h30 ∨ h33) ∧ (h31 ∨ h33),
and there is a simple condition to tell which case holds.
New Lower Bounds
“Lifted” vs. “Grounded” Inference
“Grounded” inference ▫ work with propositional groundings of the first-order expressions given by the model
“Lifted” inference ▫ work with the first-order formulas and do higher-level calculations
Folklore sentiment: lifted inference is strictly stronger than grounded inference.
Our examples give the first clear proof of this.
Outline
Review of DPLL-based algorithms for #SAT
▫ Extensions (caching & component analysis)
▫ Knowledge compilation (FBDD & Decision-DNNF)
Decision-DNNF to FBDD conversion theorem
▫ Implications of the conversion
Applications
▫ Probabilistic databases
▫ Separation between lifted vs. grounded inference
Proof sketch for the conversion theorem
Open problems
Proof of Simulation
[Diagram:] Decision-DNNF of size N → (efficient construction) → FBDD of size N^(log N + 1); if the decision-DNNF represents a k-DNF, size N^k.
Goal: convert decomposable AND nodes to decision nodes while representing the same formula F (Decision-DNNF → FBDD).
First Attempt
[Diagram: an AND node with sub-DAGs G and H becomes an FBDD by routing the 1-sink of G into the root of H, keeping the 0-sink; G ∧ H becomes "if G then H else 0".]
G and H do not share variables, so every variable is still tested at most once on any path, and the result is an FBDD.
But What If Sub-DAGs Are Shared?
[Diagram: the sub-DAG H is shared by two AND nodes, one pairing it with G and one with g′.]
Conflict! A shared sub-DAG would have to route its 1-sink to different places depending on which AND node it was entered from.
Obvious Solution: Replicate Nodes
Replicating the shared sub-DAG removes the conflict, so the original idea applies.
But replication may need to be recursive, and can cause exponential blowup!
Main Idea: Replicate the Smaller Sub-DAG
[Diagram: an AND node with a smaller sub-DAG and a larger sub-DAG; edges arrive from other nodes in the decision-DNNF.]
Each AND node creates a private copy of its smaller sub-DAG.
Light and Heavy Edges
At each AND node, call the edge into the smaller sub-DAG light and the edge into the larger sub-DAG heavy.
Each AND node creates a private copy of its smaller sub-DAG. Recursively, each node u is replicated once for each time it lies inside a smaller sub-DAG:
#copies of u = #sequences of light edges leading to u.
Quasipolynomial Conversion
Let L = max #light edges on any path. Then L ≤ log N: each light edge enters the smaller of the two sub-DAGs, so N ≥ N_small + N_big ≥ 2·N_small, and the size at least halves along every light edge, giving N ≥ 2^L.
#copies of each node ≤ N^L ≤ N^(log N), so #nodes in the FBDD ≤ N · N^(log N) = N^(log N + 1).
We also show that our analysis is tight.
Polynomial Conversion for k-DNFs
L = max #light edges on any path ≤ k − 1
#nodes in the FBDD ≤ N · N^L ≤ N^k
Summary
Quasipolynomial conversion of any decision-DNNF or DLDD into an FBDD (polynomial for k-DNF or k-CNF)
Exponential lower bounds on model counting algorithms
Applications in probabilistic databases involving simple 2-DNF formulas where lifted inference is exponentially better than propositional model counting
Separation Results
[Diagram: containments among FBDD, Decision-DNNF, AND-FBDD, and d-DNNF.]
FBDD: decision-DAG in which each variable is tested at most once along any path
Decision-DNNF: FBDD + decomposable AND nodes (disjoint sub-DAGs)
AND-FBDD: FBDD + AND nodes (not necessarily decomposable) [Wegener '00]
d-DNNF: decomposable AND nodes + OR nodes whose sub-DAGs are not simultaneously satisfiable [Darwiche '01, Darwiche-Marquis '02]
Exponential separation: there are formulas with poly-size AND-FBDDs or d-DNNFs but an exponential lower bound on decision-DNNF size.
Open Problems
A polynomial conversion of decision-DNNFs to FBDDs?
▫ We have examples we believe require quasipolynomial blowup
What about SDDs [Darwiche '11]? Other syntactic subclasses of d-DNNFs?
Approximate model counting?
Thank You
Questions?