Hierarchical Hardness Models for SAT Lin Xu, Holger H. Hoos and Kevin Leyton-Brown University of British Columbia {xulin730, hoos,
2 One Application of Hierarchical Hardness Models: SATzilla-07. Old SATzilla [Nudelman, Devkar, et al., 2003]: 2nd Random, 2nd Handmade (SAT), 3rd Handmade. SATzilla-07 [Xu et al., 2007]: 1st Handmade, 1st Handmade (UNSAT), 1st Random, 2nd Handmade (SAT), 3rd Random (UNSAT).
3 Outline: Introduction; Motivation; Predicting the Satisfiability of SAT Instances; Hierarchical Hardness Models; Conclusions and Future Work.
Introduction
5 NP-hardness and solving big problems. Using empirical hardness models to predict an algorithm’s runtime. Applications: identifying the factors underlying an algorithm’s performance; inducing hard problem distributions; automated algorithm tuning [Hutter et al., 2006]; automatic algorithm selection [Xu et al., 2007]; …
6 Empirical Hardness Model (EHM): predicts an algorithm’s runtime from poly-time computable features. Features: anything that characterizes the problem instance and can be represented by a real number; 9 categories of features [Nudelman et al., 2004]. Prediction: any machine learning technique that returns a continuous-valued prediction; linear basis function regression [Leyton-Brown et al., 2002; Nudelman et al., 2004].
7 Linear Basis Function Regression: features ($\phi$) → runtime ($y$), with $f_w(\phi) = w^{T}\phi$ …
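The regression step can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the quadratic basis expansion, the small ridge penalty `delta`, the use of log10 runtime as the target, and all function names are hypothetical choices that the slide does not specify.

```python
import math
from itertools import combinations_with_replacement

def quadratic_basis(x):
    """Expand raw features into [1, x_i, x_i * x_j] (assumed basis choice)."""
    pairs = combinations_with_replacement(range(len(x)), 2)
    return [1.0] + list(x) + [x[i] * x[j] for i, j in pairs]

def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w

def fit_ridge(features, runtimes, delta=1e-6):
    """Fit w = (delta*I + Phi^T Phi)^-1 Phi^T y on log10 runtimes (assumed target)."""
    phi = [quadratic_basis(x) for x in features]
    y = [math.log10(t) for t in runtimes]
    k = len(phi[0])
    gram = [[sum(p[i] * p[j] for p in phi) + (delta if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(p[i] * yi for p, yi in zip(phi, y)) for i in range(k)]
    return solve(gram, rhs)

def predict_log_runtime(w, x):
    """Evaluate f_w(phi) = w^T phi for one instance."""
    return sum(wi * pi for wi, pi in zip(w, quadratic_basis(x)))
```

Calling `fit_ridge` on a list of feature vectors and measured runtimes returns the weight vector; `predict_log_runtime` then gives the predicted log10 runtime for a new instance. The ridge term mainly keeps the linear system well conditioned when basis functions are correlated.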
Motivation
9 If we train EHMs for SAT and UNSAT instances separately (M_sat, M_unsat), the models are more accurate and much simpler [Nudelman et al., 2004]. Panels: M_oracular (always uses the right model); M_uncond. (trained on a mixture of instances). Solver: satelite; dataset: quasigroup completion problem (QCP).
10 Motivation. If we train EHMs for SAT and UNSAT instances separately (M_sat, M_unsat), the models are more accurate and much simpler, but using the wrong model can cause huge errors. Panels: M_unsat (only the UNSAT model is used); M_sat (only the SAT model is used). Solver: satelite; dataset: quasigroup completion problem (QCP).
11 Motivation. Question: how do we select the right model for a given problem? Attempt to guess the satisfiability of the given instance!
Predicting Satisfiability of SAT Instances
13 Predicting the Satisfiability of SAT Instances. SAT is NP-complete, so unless P = NP no poly-time method classifies instances with 100% accuracy. BUT for many types of SAT instances, classification with poly-time features is possible, and classification accuracies are very high.
14 Performance of Classification. Classifier: SMLR [Krishnapuram et al., 2005]. Features: same as for regression. Datasets: rand3sat-var, rand3sat-fix, QCP, SW-GCP. Table: classification accuracy (sat., unsat., overall) for each of rand3sat-var, rand3sat-fix, QCP, and SW-GCP.
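To make the classification step concrete, here is a minimal sketch of a sparse binary classifier. It is a stand-in, not SMLR itself: it uses L1-penalised logistic regression trained by proximal gradient descent, which shares SMLR's sparsity-promoting idea in the two-class case. The function names, the penalty `l1`, the learning rate, and the epoch count are all hypothetical.

```python
import math

def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty; shrinks v toward zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def fit_sparse_logistic(xs, labels, l1=0.01, lr=0.5, epochs=2000):
    """Proximal gradient descent on the L1-penalised logistic loss.

    xs: feature vectors; labels: 1 for satisfiable, 0 for unsatisfiable.
    Returns (weights, bias); the bias is not penalised.
    """
    n, d = len(xs), len(xs[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, t in zip(xs, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * x[j] / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * l1) for wj, gj in zip(w, grad_w)]
    return w, b

def p_sat(w, b, x):
    """Classifier output s = P(sat) for one feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

The value returned by `p_sat` plays the role of the classifier output s = P(sat); thresholding it at 0.5 gives a hard SAT/UNSAT guess, while the raw probability can serve as a confidence score.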
15 Performance of Classification (QCP)
16 Performance of Classification. For all datasets: much higher accuracies than random guessing; classifier output correlates with classification accuracy; high accuracies with few features.
Hierarchical Hardness Models
18 Back to Question. Question: how do we select the right model for a given problem? Can we just use the output of the classifier? NO, we need to consider the error distribution. Note: the best performance would be achieved by a model selection oracle! Question: how do we approximate the model selection oracle based on features?
19 Hierarchical Hardness Models. Graphical model: ($\phi$, s) features and classifier output, with s: P(sat); z: model selection oracle; y: runtime. This is a mixture-of-experts problem with FIXED experts; EM is used to find the parameters for z [Murphy, 2001].
20 Importance of Classifier’s Output. Two ways to use it: (A) using the classifier’s output as a feature; (B) using the classifier’s output for EM initialization; or both (A+B).
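The mixture-of-experts idea with fixed experts can be sketched as follows. The slide does not give the functional form of the gate or the noise model, so the logistic gate, the fixed-variance Gaussian likelihood, the gradient-based M-step, and every name below are assumptions made for illustration only. The two experts' predictions (`pred_sat`, `pred_unsat`) are taken as given and are never refitted; EM adjusts only the gate weights `v`.

```python
import math

def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def gauss(y, mean, sigma):
    """Gaussian density, used as an assumed noise model around each expert."""
    return math.exp(-0.5 * ((y - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def gate(v, u):
    """P(z = sat | gate inputs u), modelled here as a logistic function."""
    return sigmoid(sum(vi * ui for vi, ui in zip(v, u)))

def fit_gate_em(inputs, ys, pred_sat, pred_unsat, v0, sigma=1.0,
                em_iters=20, grad_steps=50, lr=0.5):
    """EM over the gate weights v only; both experts stay fixed."""
    v = list(v0)
    n = len(ys)
    for _ in range(em_iters):
        # E-step: posterior responsibility of the SAT expert per instance.
        resp = []
        for u, y, fs, fu in zip(inputs, ys, pred_sat, pred_unsat):
            g = gate(v, u)
            a = g * gauss(y, fs, sigma)
            b = (1.0 - g) * gauss(y, fu, sigma)
            resp.append(a / (a + b) if a + b > 0 else g)
        # M-step: gradient ascent on the expected log-likelihood of the gate.
        for _ in range(grad_steps):
            grad = [0.0] * len(v)
            for u, r in zip(inputs, resp):
                diff = r - gate(v, u)
                for j, uj in enumerate(u):
                    grad[j] += diff * uj / n
            v = [vj + lr * gj for vj, gj in zip(v, grad)]
    return v

def predict(v, u, f_sat, f_unsat):
    """Hierarchical prediction: gate-weighted average of the two experts."""
    g = gate(v, u)
    return g * f_sat + (1.0 - g) * f_unsat
```

One possible design, again an assumption: take the gate input as `u = [1.0, logit(s)]` and start from `v0 = [0.0, 1.0]`. The gate then initially equals the classifier output s, so the classifier output acts both as a gate feature and as the EM starting point.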
21 Big Picture of HHM Performance. First table: RMSE of the Oracular, Uncond., and Hier. models on rand3-var and on rand3-fix, by solver (satz, march_dl, kcnfs, Oksolver). Second table: RMSE of the same three models on QCP and on SW-GCP, by solver (Zchaff, Minisat, Satzoo, Satelite, Sato, oksolver).
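For reference, the RMSE used to compare the three models is the square root of the mean squared difference between predicted and actual values. A minimal sketch (the function name is hypothetical, and whether runtimes are compared on a log scale is not stated on the slide):

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(actual))
```

Evaluating `rmse` for the oracular, unconditional, and hierarchical predictions against the same test runtimes gives one row of such a comparison.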
22 Example for rand3-var (satz). Left: unconditional model; right: hierarchical model.
23 Example for rand3-fix (satz). Left: unconditional model; right: hierarchical model.
24 Example for QCP (satelite). Left: unconditional model; right: hierarchical model.
25 Example for SW-GCP (zchaff). Left: unconditional model; right: hierarchical model.
26 Correlation Between Prediction Error and Classifier’s Confidence. Left: classifier’s output vs. runtime prediction error; right: classifier’s output vs. RMSE. Solver: satelite; dataset: QCP.
Conclusions and Future Work
28 Conclusions. Models conditioned on SAT/UNSAT predict runtime much more accurately. A classifier can distinguish SAT from UNSAT instances with high accuracy. Conditional models can be combined into a hierarchical model with better performance. The classifier’s confidence correlates with prediction error.
29 Future Work. Better features for SW-GCP; tests on more real-world problems; extending the underlying experts beyond satisfiability.