Frank Hutter, Holger Hoos, Kevin Leyton-Brown

Presentation transcript:

Sequential Model-based Optimization for General Algorithm Configuration
Frank Hutter, Holger Hoos, Kevin Leyton-Brown
University of British Columbia
LION 5, Rome, January 18, 2011

Motivation
- Most algorithms have parameters
  - Decisions that are left open during algorithm design
  - Instantiate them to optimize empirical performance
  - E.g. local search: neighbourhoods, restarts, perturbation types, tabu length (or a range for it), ...
  - E.g. IBM ILOG CPLEX: preprocessing, balance of branching vs. cutting, type of cuts, etc.; 76 parameters, mostly categorical
- General algorithm configuration: find the parameter configuration with the best empirical performance across a benchmark instance set
  - E.g. optimize CPLEX parameters for a given benchmark set
- Use machine learning to predict algorithm runtime, given the parameter configuration used and characteristics of the instance being solved
- Use these predictions for general algorithm configuration
- Two new methods for general algorithm configuration: ROAR and SMAC

Related work
- General algorithm configuration
  - Racing algorithms, F-Race [Birattari et al., GECCO '02-present]
  - Iterated local search, ParamILS [Hutter et al., AAAI '07 & JAIR '09]
  - Genetic algorithms, GGA [Ansotegui et al., CP '09]
- Model-based optimization of algorithm parameters
  - Sequential Parameter Optimization [Bartz-Beielstein et al., '05-present]
    - SPO toolbox: interactive tools for parameter optimization
  - Our own previous work
    - SPO+: fully automated & more robust; studied SPO components [Hutter et al., GECCO '09]
    - TB-SPO: reduced computational overheads due to the models; first practical SPO variant given a time budget [Hutter et al., LION 2010]
- Here: extend to general algorithm configuration
  - Sets of problem instances
  - Many, categorical parameters

Outline
1. ROAR
2. SMAC
3. Experimental Evaluation

A key component of ROAR and SMAC: aggressive racing
- Compare a configuration θ vs. the current incumbent θ*
  - Racing approach: few runs for a poor θ, many runs for a good θ; once confident enough, update θ* ← θ
  - Inspired by a similar mechanism in FocusedILS
- Iteratively perform runs with configuration θ until either
  - the empirical mean for θ is worse than for θ* → discard θ, or
  - θ has as many runs as θ* → update θ* ← θ (differences clearly not significant yet)
- Aggressively rejects poor configurations, very often after a single run
- A minimal sketch of this racing step is shown below
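The following is a minimal sketch of this racing step (not the authors' implementation): it assumes a hypothetical helper run_algorithm(config) that executes one target-algorithm run and returns its runtime, and it compares plain empirical means, ignoring details such as matching runs on the same instances.

    import statistics

    def race(challenger, incumbent_runtimes, run_algorithm):
        """Race a challenger configuration against the incumbent's recorded runtimes.

        Returns True if the challenger should become the new incumbent.
        """
        challenger_runtimes = []
        while len(challenger_runtimes) < len(incumbent_runtimes):
            challenger_runtimes.append(run_algorithm(challenger))
            n = len(challenger_runtimes)
            # Aggressive rejection: discard the challenger as soon as its empirical
            # mean is worse than the incumbent's mean over the same number of runs.
            if statistics.mean(challenger_runtimes) > statistics.mean(incumbent_runtimes[:n]):
                return False
        # The challenger matched the incumbent's number of runs without ever
        # looking worse: accept it as the new incumbent.
        return True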

ROAR: a simple method for algorithm configuration
- ROAR = Random Online Aggressive Racing
- Main ROAR loop:
  1. Select a configuration θ uniformly at random
  2. Compare θ to the current incumbent θ* (online, one θ at a time), using the aggressive racing from the previous slide
- A minimal sketch of this loop follows below
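A minimal sketch of the ROAR loop under the same assumptions as above; candidate_configurations is a hypothetical finite list standing in for the full configuration space, and race is the racing sketch from the previous slide.

    import random

    def roar(candidate_configurations, incumbent, run_algorithm, race, n_iterations):
        """Random Online Aggressive Racing: sample uniformly at random and race
        each sampled configuration against the current incumbent."""
        incumbent_runtimes = [run_algorithm(incumbent)]
        for _ in range(n_iterations):
            challenger = random.choice(candidate_configurations)  # uniform random selection
            if race(challenger, incumbent_runtimes, run_algorithm):
                incumbent = challenger                             # challenger survived the race
                incumbent_runtimes = [run_algorithm(incumbent)]
            else:
                # Give the incumbent another run so its runtime estimate improves over time.
                incumbent_runtimes.append(run_algorithm(incumbent))
        return incumbent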

Outline
1. ROAR
2. SMAC: Sequential Model-based Algorithm Configuration
3. Experimental Evaluation

SMAC in a Nutshell
- Construct a model to predict algorithm performance
  - Supervised machine learning; Gaussian processes (aka kriging) are a common choice
  - Here: a random forest model f: Θ → R
- Use that model to select promising configurations, balancing
  - Exploitation: configurations predicted to be good
  - Exploration: configurations where the model is uncertain
- Compare each selected configuration to the incumbent, using the same aggressive racing as ROAR
- A high-level sketch of this loop follows below
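A high-level sketch of the SMAC loop (assumed structure, not the authors' code); fit_random_forest, select_by_expected_improvement, and race_and_record are hypothetical helpers standing in for the components described on the following slides.

    def smac(candidate_configurations, incumbent, run_algorithm, n_iterations):
        """Sequential Model-based Algorithm Configuration, heavily simplified."""
        history = [(incumbent, run_algorithm(incumbent))]  # (configuration, runtime) pairs
        for _ in range(n_iterations):
            model = fit_random_forest(history)             # learn a performance model
            challenger = select_by_expected_improvement(model, candidate_configurations)
            accepted, new_runs = race_and_record(challenger, incumbent, run_algorithm)
            history.extend(new_runs)                       # reuse all run data for the next model
            if accepted:
                incumbent = challenger
        return incumbent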

Fitting a Regression Tree to Data: Example
- Example splits: first on param3 ∈ {red} vs. param3 ∈ {blue, green}, then e.g. on param2 ≤ 3.5 vs. param2 > 3.5
- In each internal node: only store the split criterion used
- In each leaf: store the mean of the runtimes that fall into that leaf (e.g. 3.7 in one leaf, 1.65 in another)

Predictions for a new parameter configuration
- E.g. θ_{n+1} = (true, 4.7, red)
- Walk down the tree and return the mean runtime stored in the corresponding leaf → 1.65
- A minimal sketch of this fit/predict cycle on toy data is shown below
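A minimal sketch (not the authors' code) of fitting a regression tree to toy runtime data and predicting for a new configuration θ = (true, 4.7, red); the data values and the hand-rolled one-hot encoding are made up for illustration.

    from sklearn.tree import DecisionTreeRegressor

    def encode(param1, param2, param3):
        """Encode a configuration (bool, float, categorical) as a numeric vector."""
        return [int(param1), param2] + [int(param3 == c) for c in ("red", "blue", "green")]

    # Toy training data: four configurations and their observed runtimes.
    X = [encode(True, 2.0, "red"), encode(False, 5.0, "red"),
         encode(True, 4.0, "blue"), encode(False, 1.0, "green")]
    y = [3.7, 4.1, 1.65, 1.8]

    tree = DecisionTreeRegressor(min_samples_leaf=1).fit(X, y)
    # Walking down the tree returns the mean runtime stored in the matching leaf.
    print(tree.predict([encode(True, 4.7, "red")]))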

Random Forests: sets of regression trees
- Training
  - Subsample the data T times (with repetitions)
  - For each subsample, fit a regression tree
- Prediction
  - Predict with each of the T trees
  - Return the empirical mean and variance across these T predictions
- A minimal sketch of this mean/variance prediction follows below
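A minimal sketch (assumed, not the paper's implementation) of obtaining a mean and a variance prediction from the individual trees of a random forest; the training data are made-up encoded configurations and runtimes.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Rows: encoded configurations (as in the previous sketch); y: observed runtimes.
    X = np.array([[1, 2.0, 1, 0, 0],
                  [0, 5.0, 1, 0, 0],
                  [1, 4.0, 0, 1, 0],
                  [0, 1.0, 0, 0, 1]])
    y = np.array([3.7, 4.1, 1.65, 1.8])

    forest = RandomForestRegressor(n_estimators=10, bootstrap=True).fit(X, y)

    def predict_mean_var(forest, x):
        """Empirical mean and variance across the per-tree predictions."""
        per_tree = np.array([tree.predict([x])[0] for tree in forest.estimators_])
        return per_tree.mean(), per_tree.var()

    mean, var = predict_mean_var(forest, [1, 4.7, 1, 0, 0])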

Predictions For Different Instances
- Runtime data now also includes instance features: configuration θ_i, runtime r_i, and instance features x_i = (x_{i,1}, ..., x_{i,m})
- Fit a model g: Θ × R^m → R, where Θ = Θ_1 × ... × Θ_k
- Trees can now split on instance features (e.g. feat2 ≤ 3.5, feat7 ≤ 17) as well as on parameters (e.g. param3 ∈ {red})
- Predict runtime for previously unseen combinations (θ_{n+1}, x_{n+1})
- A sketch extending the forest to joint configuration/feature inputs follows below
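A sketch (assumed) of a joint model g(θ, x) over configuration parameters and instance features, trained on concatenated inputs; all data values are made up for illustration.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Each training example: an encoded configuration, the features of the instance
    # it was run on, and the observed runtime.
    thetas = np.array([[1, 2.0, 1, 0, 0],
                       [0, 5.0, 0, 1, 0],
                       [1, 4.0, 0, 0, 1]])
    instance_features = np.array([[3.1, 12.0],
                                  [4.2, 20.0],
                                  [2.9, 15.5]])
    runtimes = np.array([3.7, 1.65, 2.0])

    g = RandomForestRegressor(n_estimators=10).fit(
        np.hstack([thetas, instance_features]), runtimes)

    # Predict runtime for a previously unseen (configuration, instance) combination.
    new_theta = np.array([[0, 4.7, 1, 0, 0]])
    new_features = np.array([[3.8, 17.0]])
    print(g.predict(np.hstack([new_theta, new_features])))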

Visualization of Runtime Across Instances and Parameter Configurations
- [Figure: heatmaps of true vs. predicted log10 runtime over a parameter configuration × instance matrix; darker is faster]
- Performance of a configuration θ across instances: average of θ's predicted row

Summary of SMAC Approach
- Construct a model to predict algorithm performance
  - Random forest model g: Θ × R^m → R
  - Marginal predictions f: Θ → R
- Use that model to select promising configurations
  - Standard "expected improvement (EI)" criterion combines predicted mean and uncertainty
  - Find the configuration with the highest EI: optimization by local search (a sketch of the EI computation is given below)
- Compare each selected configuration to the incumbent θ*
  - Using the same aggressive racing as ROAR
  - Save all run data → use it to construct the models in the next iteration
  - Update θ* once a better configuration is found
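A sketch (assumed; the authors' exact formulation operates on log runtimes) of the standard expected improvement criterion for minimization, given a configuration's predicted mean and variance and the incumbent's performance f_min; predict_mean_var refers to the forest sketch above.

    import math
    from scipy.stats import norm

    def expected_improvement(mean, var, f_min):
        """EI of a configuration whose predicted performance is N(mean, var),
        relative to the best performance f_min observed so far (smaller is better)."""
        std = math.sqrt(var)
        if std == 0.0:
            return max(f_min - mean, 0.0)
        z = (f_min - mean) / std
        return (f_min - mean) * norm.cdf(z) + std * norm.pdf(z)

    # Rank candidate configurations by EI, e.g. using the forest's mean/variance:
    # best = max(candidates, key=lambda c: expected_improvement(*predict_mean_var(forest, c), f_min))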

Outline
1. ROAR
2. SMAC
3. Experimental Evaluation

Experimental Evaluation: Setup
- Compared SMAC, ROAR, FocusedILS, and GGA
- On 17 small configuration scenarios:
  - Local search and tree search SAT solvers SAPS and SPEAR
  - Leading commercial MIP solver CPLEX
  - Short target algorithm runtimes of up to 5 seconds
- For each configurator and each scenario:
  - 25 configuration runs with a 5-hour time budget each
  - The final configuration of each run evaluated on an independent test set
- Over a year of CPU time in total (25 runs of each of the 4 configurators)
- Will be available as a reproducible experiment package in HAL (see Chris Nell's talk tomorrow @ 17:20)

Experimental Evaluation: Results
- [Figure: per-scenario boxplots; y-axis: test performance (runtime, smaller is better); S = SMAC, R = ROAR, F = FocusedILS, G = GGA]
- Improvement factors (means over 25 runs): 0.93x to 2.25x (vs. FocusedILS), 1.01x to 2.76x (vs. GGA)
- Improvements significant in 11/17 scenarios (vs. FocusedILS) and 13/17 (vs. GGA); never significantly worse
- But: SMAC's performance depends on instance features

Conclusion
- Generalized model-based parameter optimization to:
  - Sets of benchmark instances
  - Many, categorical parameters
- Two new procedures for general algorithm configuration:
  - Random Online Aggressive Racing (ROAR): simple yet surprisingly effective
  - Sequential Model-based Algorithm Configuration (SMAC): a state-of-the-art configuration procedure
- Modest but significant improvements over FocusedILS and GGA (for discrete-valued and finite parameter domains)

Future Work
- Improve algorithm configuration further
  - Cut off poor runs early (like adaptive capping in ParamILS)
  - Handle the resulting "censored" data in the models
  - Combine model-free and model-based methods
- Use SMAC's models to gain scientific insights
  - Importance of each parameter
  - Interaction of parameters and instance features
- Use SMAC's models for per-instance algorithm configuration
  - Compute instance features, then pick the configuration predicted to be best
- Algorithm simplification: drop useless parameters
- Algorithm configuration on the cloud: exploiting parallelism