Extending the Scope of Algebraic MINLP Solvers to Black- and Grey-Box Optimization
Nick Sahinidis
Acknowledgments: Alison Cozad, David Miller, Zach Wilson


SIMULATION OPTIMIZATION (Carnegie Mellon University)
Pulverized coal plant: Aspen Plus® simulation provided by the National Energy Technology Laboratory.

PROCESS DISAGGREGATION
– Process simulation: disaggregate the process into process blocks (each block: simulator, then model generation).
– Surrogate models: build simple and accurate models with a functional form tailored for an optimization framework.
– Optimization model: add algebraic constraints (design specs, heat/mass balances, and logic constraints).

LEARNING PROBLEM
A process simulation or experiment maps independent variables (operating conditions, inlet flow properties, unit geometry) to dependent variables (efficiency, outlet flow conditions, conversions, heat flow, etc.).

HOW TO BUILD THE SURROGATES
We aim to build surrogate models that are
– Accurate: they should reflect the true nature of the simulation
– Simple: tailored for algebraic optimization
– Generated from a minimal data set: reduce experimental and simulation requirements

ALAMO: Automated Learning of Algebraic Models for Optimization
Start with an initial sampling of the black-box function and build a surrogate model. Then iterate: estimate the model error, use adaptive sampling to select new points, update the training data set, and rebuild the model. If the model has converged, stop; otherwise repeat.
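The loop structure can be sketched in a few lines of Python. This is a toy illustration, not the ALAMO implementation: the linear surrogate, the convergence test, and the random adaptive-sampling placeholder are all assumptions made for the sketch.

```python
import numpy as np

def alamo_loop(black_box, lb, ub, n_init=10, tol=1e-3, max_iter=20, seed=0):
    """Sketch of the ALAMO feedback loop: sample, fit, check error, resample."""
    rng = np.random.default_rng(seed)
    # Initial sampling: space-filling random design over [lb, ub]
    X = lb + (ub - lb) * rng.random((n_init, len(lb)))
    y = np.array([black_box(x) for x in X])
    for _ in range(max_iter):
        coef = fit_surrogate(X, y)                   # build surrogate model
        x_new = adaptive_sample(lb, ub, rng)         # pick a new sample point
        err = abs(black_box(x_new) - predict(coef, x_new))
        if err < tol:                                # model converged?
            return coef
        # Update training data set with the new point and rebuild
        X = np.vstack([X, x_new])
        y = np.append(y, black_box(x_new))
    return coef

def fit_surrogate(X, y):
    # Toy surrogate: linear least squares on [1, x] bases
    A = np.hstack([np.ones((len(X), 1)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

def predict(coef, x):
    return coef[0] + coef[1:] @ x

def adaptive_sample(lb, ub, rng):
    # Placeholder: a random candidate (ALAMO instead maximizes model error)
    return lb + (ub - lb) * rng.random(len(lb))
```

The adaptive-sampling step is where ALAMO's error maximization (later slides) plugs in.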

MODEL COMPLEXITY TRADEOFF
Model accuracy grows with model complexity: linear response surfaces are the simplest; kriging [Krige, 63], neural nets [McCulloch-Pitts, 43], and radial basis functions [Buhmann, 00] are more accurate but more complex. The preferred region lies in between: simple enough for algebraic optimization, accurate enough to be useful.

MODEL IDENTIFICATION
Goal: identify the functional form and complexity of the surrogate models.
– The general functional form is unknown: our method identifies models built from combinations of simple basis functions.

OVERFITTING AND TRUE ERROR
Step 1: define a large set of potential basis functions. Step 2: model reduction.
As complexity grows, the empirical error keeps falling while the true error first falls and then rises: models that are too small underfit, models that are too large overfit, and the ideal model sits at the minimum of the true error.

MODEL REDUCTION TECHNIQUES
Qualitative tradeoffs of model reduction methods (CPU modeling cost vs. efficacy):
– Backward elimination [Oosterhof, 63], forward selection [Hamaker, 62], stepwise regression [Efroymson, 60]
– Regularized regression techniques: penalize the least-squares objective using the magnitude of the regressors [Tibshirani, 95]
– Best subset methods: enumerate all possible subsets

MODEL SIZING
Complexity = number of terms allowed in the model. Solve for the best one-term model, then the best two-term model, and so on, scoring each with a goodness-of-fit measure that is sensitive to overfitting (AICc, BIC, Cp). In the example shown, the 6th term was not worth the added complexity, so the final model includes 5 terms.

BASIS FUNCTION SELECTION
Find the T-term model with the least error. We solve this selection problem for increasing T until the goodness-of-fit measure determines the final model size.
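The sizing-and-selection procedure can be illustrated with a brute-force sketch: for increasing T, enumerate the best T-term subset and score it with BIC. ALAMO solves the selection as an integer program; the exhaustive enumeration below is only a stand-in that works for small basis sets, and all names are illustrative.

```python
import itertools
import numpy as np

def bic(sse, n, p):
    # BIC = n ln(SSE/n) + p ln(n): a fit term plus a complexity penalty
    return n * np.log(max(sse, 1e-12) / n) + p * np.log(n)

def size_model(B, y, T_max):
    """B: n-by-k matrix of basis-function values, y: responses.
    Solve for the best one-term model, best two-term model, ...,
    and keep the size with the lowest BIC."""
    n, k = B.shape
    best = (np.inf, None, None)            # (BIC, chosen bases, coefficients)
    for T in range(1, T_max + 1):
        # Best T-term model: enumerate all T-subsets of the k bases
        for S in itertools.combinations(range(k), T):
            A = B[:, list(S)]
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            sse = float(np.sum((A @ coef - y) ** 2))
            score = bic(sse, n, T)
            if score < best[0]:
                best = (score, S, coef)
    return best                            # lowest-BIC model over all sizes
```

Because the BIC penalty grows with every added term, the search naturally stops rewarding terms that are "not worth the added complexity".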

ALAMO: Automated Learning of Algebraic Models for Optimization (continued)
The adaptive sampling step is error maximization sampling: estimate the model error, sample where the error is largest, update the training data set, and rebuild until the model converges.

ERROR MAXIMIZATION SAMPLING
Search the problem space for areas of model inconsistency or model mismatch: find points that maximize the surrogate model's error with respect to the independent variables.
– Optimized using a black-box, derivative-free solver (SNOBFIT) [Huyer and Neumaier, 08]
– Derivative-free solvers work well in low-dimensional spaces [Rios and Sahinidis, 12]
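A minimal sketch of the idea, with a crude random candidate search standing in for a derivative-free solver such as SNOBFIT; the function names and search strategy are assumptions for illustration, not ALAMO's implementation.

```python
import numpy as np

def error_max_sample(black_box, surrogate, lb, ub, n_candidates=500, seed=0):
    """Pick the next training point where the squared surrogate error
    is largest over the box [lb, ub]. A random candidate search stands
    in here for a proper derivative-free optimizer."""
    rng = np.random.default_rng(seed)
    cands = lb + (ub - lb) * rng.random((n_candidates, len(lb)))
    errs = np.array([(black_box(x) - surrogate(x)) ** 2 for x in cands])
    return cands[np.argmax(errs)]          # point of maximum model mismatch
```

Each call costs `n_candidates` simulator evaluations in this naive form; a real derivative-free solver spends far fewer, which is the point of using one.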

COMPUTATIONAL RESULTS
Goal: compare methods on three target metrics: (1) model accuracy, (2) data efficiency, (3) model simplicity.
Modeling methods compared:
– ALAMO modeler: the proposed methodology
– The lasso: lasso regularization
– Ordinary regression: ordinary least-squares regression
Sampling methods compared (over the same data set size):
– ALAMO sampler: the proposed error maximization technique
– Single LH: a single Latin hypercube (no feedback)

MODEL ACCURACY
[Performance profile: fraction of problems solved vs. normalized test error, for the ALAMO modeler, the lasso, and ordinary regression.] With the ALAMO modeler, 70% of the problems were solved exactly and 80% had ≤0.5% normalized test error.

DATA EFFICIENCY
[Performance profiles: fraction of problems solved vs. normalized test error, comparing the ALAMO sampler against a single Latin hypercube for each of the three modeling methods (ALAMO modeler, the lasso, ordinary regression).]

MODEL SIMPLICITY
[Median model complexity by modeling type: some methods produce more complexity than required.] Results over a test set of 45 known functions treated as black boxes, with bases that are available to all modeling methods.

MODEL SELECTION CRITERIA
Balance fit (sum of squared errors) with model complexity (number of terms in the model, denoted by p).
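The slide does not spell the criteria out; for reference, the standard least-squares forms (with n data points, p model terms, and SSE the sum of squared errors) are:

```latex
\mathrm{AIC}  = n \ln\!\left(\frac{\mathrm{SSE}}{n}\right) + 2p, \qquad
\mathrm{AICc} = \mathrm{AIC} + \frac{2p(p+1)}{n-p-1}, \qquad
\mathrm{BIC}  = n \ln\!\left(\frac{\mathrm{SSE}}{n}\right) + p \ln n
```

AICc adds a small-sample correction to AIC, and BIC's p ln n penalty grows with the data set size, which is why BIC tends to favor smaller models.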

CPU TIME COMPARISON
Eight benchmarks from the UCI and CMU data sets; seventy noisy data sets were generated with multicollinearity and increasing problem size (number of bases). BIC solves more than two orders of magnitude faster than AIC, MSE, and HQC: it can be optimized directly via a single mixed-integer convex quadratic model.

MODEL QUALITY COMPARISON
BIC leads to smaller, more accurate models: it imposes a larger penalty for model complexity.

ALAMO REMARKS
– Expanding the scope of algebraic optimization: low-complexity surrogate models strike a balance between optimal decision-making and model fidelity.
– Surrogate model identification: simple, accurate model identification via integer optimization.
– Error maximization sampling: more information found per simulated data point.

THEORY UTILIZATION
Use freely available system knowledge to strengthen the model:
– Physical limits
– First-principles knowledge
– Intuition
Non-empirical restrictions can be applied to general regression problems, combining empirical data with non-empirical information.

CONSTRAINED REGRESSION
Standard regression is easy; constraining the surrogate model is tough because of the semi-infinite nature of the regression constraints, which must hold for every point in the problem space.
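To make the contrast concrete, here is a minimal sketch of a finitely constrained regression, where a simple coefficient constraint (β ≥ 0, purely illustrative) stands in for the β-space constraints derived from a parametric restriction. It uses projected gradient descent, not ALAMO's actual solution method.

```python
import numpy as np

def nnls_pg(B, y, iters=5000):
    """Projected gradient for min ||B beta - y||^2 s.t. beta >= 0.
    Once a sufficient set of beta-space constraints has been identified
    (here simply beta >= 0), the semi-infinite parametric constraint is
    replaced by a finitely constrained least-squares problem."""
    L = np.linalg.norm(B, 2) ** 2                 # step size from spectral norm
    beta = np.zeros(B.shape[1])
    for _ in range(iters):
        grad = B.T @ (B @ beta - y)               # least-squares gradient (up to 2x)
        beta = np.maximum(beta - grad / L, 0.0)   # gradient step, then project onto >= 0
    return beta
```

The hard part, of course, is getting from "z(x) must satisfy a property for all x" to a finite set of β constraints; that is the implied-restriction step on the next slide.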

IMPLIED PARAMETER RESTRICTIONS
Step 1: formulate the constraint in z- and x-space. Step 2: identify a sufficient set of β-space constraints. In the example shown, 1 parametric constraint yields 4 β-constraints.
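The slide's own example is not reproduced in the transcript; a smaller illustrative case of the same two-step recipe: for the surrogate z(x) = β₁x + β₂x², the single parametric constraint z(x) ≥ 0 for all x ∈ [0, 1] is implied by two β-space constraints, because z = x(β₁ + β₂x) and the linear factor β₁ + β₂x attains its extremes over [0, 1] at the endpoints:

```latex
z(x) = \beta_1 x + \beta_2 x^2 \;\ge\; 0 \quad \forall\, x \in [0,1]
\quad \Longleftarrow \quad
\beta_1 \ge 0, \qquad \beta_1 + \beta_2 \ge 0
```

The implied constraints are linear in β, so they can be added directly to the regression problem; richer basis sets and domains generally require more such constraints, as in the slide's 1-to-4 example.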

TYPES OF RESTRICTIONS
– Multiple responses: mass balances, sum-to-one, state variables
– Individual responses: mass and energy balances, physical limitations
– Response bounds: pressure, temperature, compositions
– Response derivatives: monotonicity, numerical properties, convexity
– Alternative domains: safe extrapolation beyond the problem space (extrapolation zone)
– Boundary conditions: e.g., adding a no-slip velocity profile to the model

CARBON CAPTURE SYSTEM DESIGN
– Discrete decisions: How many units? Parallel trains? What technology is used for each reactor?
– Continuous decisions: unit geometries; operating conditions (vessel temperature and pressure, flow rates, compositions)
Surrogate models are built for each reactor and technology used.

SUPERSTRUCTURE OPTIMIZATION
Mixed-integer nonlinear programming model:
– Economic model
– Process model: material balances, hydrodynamic/energy balances, reactor surrogate models
– Link between economic model and process model
– Binary variable constraints
– Bounds for variables

GLOBAL MINLP SOLVERS ON CMU/IBMLIB
[Performance comparison of global MINLP solvers on the CMU/IBMLIB test set: BARON 13, BARON 14, LINDOGLOBAL, COUENNE, SCIP, ANTIGONE.]

CONCLUSIONS
ALAMO provides algebraic models that are
– Accurate and simple
– Generated from a minimal number of function evaluations
ALAMO's constrained regression facility allows modeling of
– Bounds on response variables
– Convexity/monotonicity of response variables
On-going efforts: uncertainty quantification, symbolic regression.
ALAMO site: archimedes.cheme.cmu.edu/?q=alamo