Optimization COS 323

Ingredients
– Objective function
– Variables
– Constraints
– Find values of the variables that minimize or maximize the objective function while satisfying the constraints

Different Kinds of Optimization
[Figure from: Optimization Technology Center]

Different Optimization Techniques
– Algorithms have very different flavor depending on the specific problem
  – Closed form vs. numerical vs. discrete
  – Local vs. global minima
  – Running times ranging from O(1) to NP-hard
– Today: focus on continuous numerical methods

Optimization in 1-D
– Look for analogies to bracketing in root-finding
– What does it mean to bracket a minimum? Three points (x_left, f(x_left)), (x_mid, f(x_mid)), (x_right, f(x_right)) with
  x_left < x_mid < x_right,  f(x_mid) < f(x_left),  f(x_mid) < f(x_right)

Optimization in 1-D
– Once we have these properties, there is at least one local minimum between x_left and x_right
– Establishing the bracket initially:
  – Given x_initial, increment
  – Evaluate f(x_initial), f(x_initial + increment)
  – If decreasing, step until you find an increase
  – Else, step in the opposite direction until you find an increase
  – Grow the increment at each step
– For maximization: substitute –f for f
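A minimal sketch of this bracketing procedure (the function name, the initial step, and the growth factor are illustrative assumptions, not from the slides):

```python
def bracket_minimum(f, x0, step=1.0, grow=2.0, max_iter=50):
    """Walk downhill from x0, growing the step, until f turns back up.

    Returns (x_left, x_mid, x_right) with f at the middle point below f at both ends.
    """
    a, b = x0, x0 + step
    fa, fb = f(a), f(b)
    if fb > fa:                 # f increases in this direction: walk the other way
        a, b = b, a
        fa, fb = fb, fa
        step = -step
    for _ in range(max_iter):
        step *= grow            # grow the increment at each step
        c = b + step
        fc = f(c)
        if fc > fb:             # found an increase: a, b, c bracket a minimum
            return (a, b, c) if a < c else (c, b, a)
        a, b, fa, fb = b, c, fb, fc   # keep walking downhill
    raise RuntimeError("no bracket found; f may be monotonic")
```

For example, bracket_minimum(lambda x: (x - 3.0) ** 2, 0.0) walks to the right and returns a triple whose middle point has the lowest function value.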

Optimization in 1-D
– Strategy: evaluate function at some x_new
[Figure: the bracket points (x_left, f(x_left)), (x_mid, f(x_mid)), (x_right, f(x_right)) and the new point (x_new, f(x_new))]

Optimization in 1-D
– Strategy: evaluate function at some x_new
  – Here, the new "bracket" points are x_new, x_mid, x_right

Optimization in 1-D
– Strategy: evaluate function at some x_new
  – Here, the new "bracket" points are x_left, x_new, x_mid

Optimization in 1-D
– Unlike with root-finding, can't always guarantee that the interval will be reduced by a factor of 2
– Let's find the optimal place for x_mid, relative to left and right, that will guarantee the same factor of reduction regardless of outcome

Optimization in 1-D
– Let x_mid sit a fraction α of the way across the bracket, with x_new placed the same fraction α of the way across the larger sub-interval (so x_new is at fraction α² overall)
– If f(x_new) < f(x_mid): new interval = α
– Else: new interval = 1 − α²
(interval sizes measured as fractions of the current bracket width)

Golden Section Search
– To assure the same interval in either case, want α = 1 − α²
– So α² + α − 1 = 0, giving α = (√5 − 1)/2 = 0.618… – the "golden ratio"
– So the interval shrinks to 0.618 of its width (a reduction of about 38%) per iteration
  – Linear convergence
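A minimal golden-section search following the slide's α ≈ 0.618 (the tolerance and the midpoint return value are illustrative choices):

```python
import math

def golden_section(f, x_left, x_right, tol=1e-8):
    """Shrink a bracket [x_left, x_right] around a minimum of f.

    Each iteration keeps the interval ratio alpha = (sqrt(5) - 1) / 2 ~ 0.618,
    so only one new function evaluation is needed per step.
    """
    alpha = (math.sqrt(5.0) - 1.0) / 2.0          # ~0.618
    a, b = x_left, x_right
    # Interior points placed symmetrically at fractions (1 - alpha) and alpha
    c = b - alpha * (b - a)
    d = a + alpha * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:            # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - alpha * (b - a)
            fc = f(c)
        else:                  # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + alpha * (b - a)
            fd = f(d)
    return 0.5 * (a + b)
```

For example, golden_section(lambda x: (x - 2.0) ** 2, 0.0, 5.0) converges linearly to 2.0.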

Error Tolerance
– Around the minimum the derivative is 0, so f(x) ≈ f(x_min) + ½ f″(x_min) (x − x_min)²: a change of order δ in x changes f by only order δ²
– Rule of thumb: pointless to ask for more accuracy in x than √ε, where ε is the machine precision
  – Can use double precision if you want a single-precision result (and/or have single-precision data)

Faster 1-D Optimization
– Trade off super-linear convergence for worse robustness
  – Combine with Golden Section search for safety
– Usual bag of tricks:
  – Fit a parabola through 3 points, find its minimum
  – Compute derivatives as well as positions, fit a cubic
  – Use second derivatives: Newton
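A sketch of the first trick above: fit a parabola through the three bracket points and take its vertex as the next trial point (the function name and the degeneracy guard are illustrative; a robust routine such as Brent's method interleaves this with golden-section steps):

```python
def parabolic_step(a, fa, b, fb, c, fc):
    """Vertex of the parabola through (a, fa), (b, fb), (c, fc).

    Returns the x of the parabola's minimum, or None if the three
    points are (nearly) collinear so no unique vertex exists.
    """
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if abs(den) < 1e-15:
        return None
    return b - 0.5 * num / den
```

For a true parabola such as f(x) = x², one step already lands on the minimum: parabolic_step(0, 0, 1, 1, 2, 4) returns 0.0.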

Newton’s Method

– At each step: x_{k+1} = x_k − f′(x_k) / f″(x_k)
– Requires 1st and 2nd derivatives
– Quadratic convergence
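A minimal sketch of this iteration, assuming the caller supplies f′ and f″ (the stopping rule and iteration cap are illustrative):

```python
def newton_minimize_1d(df, d2f, x0, tol=1e-10, max_iter=50):
    """Newton's method for a 1-D minimum: x_{k+1} = x_k - f'(x_k) / f''(x_k).

    Converges quadratically near a minimum where f'' > 0, but can diverge
    or land on a maximum if started far away.
    """
    x = x0
    for _ in range(max_iter):
        step = df(x) / d2f(x)
        x -= step
        if abs(step) < tol:
            break
    return x
```

For f(x) = (x − 3)², newton_minimize_1d(lambda x: 2 * (x - 3), lambda x: 2.0, 0.0) reaches 3.0 in one step, since f is exactly quadratic.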

Multi-Dimensional Optimization
– Important in many areas
  – Fitting a model to measured data
  – Finding the best design in some parameter space
– Hard in general
  – Weird shapes: multiple extrema, saddles, curved or elongated valleys, etc.
  – Can't bracket (but there are "trust region" methods)
– In general, easier than root-finding
  – Can always walk "downhill"

Newton's Method in Multiple Dimensions
– Replace the 1st derivative with the gradient ∇f and the 2nd derivative with the Hessian H, where H_ij = ∂²f / ∂x_i ∂x_j

Newton's Method in Multiple Dimensions
– Replace the 1st derivative with the gradient and the 2nd derivative with the Hessian
– So, at each step: x_{k+1} = x_k − H(x_k)⁻¹ ∇f(x_k)
– Tends to be extremely fragile unless the function is very smooth and the start is close to the minimum
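A minimal NumPy sketch of this update, assuming callables for the gradient and Hessian are available (names and stopping rule are illustrative):

```python
import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
    """Multi-dimensional Newton: x_{k+1} = x_k - H(x_k)^{-1} grad(x_k).

    Solves the linear system at each step rather than forming the inverse.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(hess(x), grad(x))
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x
```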

Steepest Descent Methods
– What if you can't / don't want to use the 2nd derivative?
– "Quasi-Newton" methods estimate the Hessian
– Alternative: walk along (the negative of) the gradient…
  – Perform 1-D minimization along the line passing through the current point in the direction of the gradient
  – Once done, re-compute the gradient, iterate
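A sketch of this loop, using SciPy's bounded 1-D minimizer for the line search (the fixed search range [0, max_step] is a crude illustrative choice; the golden-section routine sketched earlier would also work here):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(f, grad, x0, tol=1e-6, max_iter=500, max_step=2.0):
    """Repeated 1-D minimization along the negative-gradient direction."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:         # gradient (nearly) zero: done
            break
        d = -g / np.linalg.norm(g)          # unit downhill direction
        # 1-D minimization of t -> f(x + t d) over an illustrative fixed range
        t = minimize_scalar(lambda t: f(x + t * d),
                            bounds=(0.0, max_step), method="bounded").x
        x = x + t * d
    return x
```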

Problem With Steepest Descent
[Figure: in a long, narrow valley steepest descent zigzags – each exact line search stops where the new gradient is perpendicular to the previous step, so successive steps partially undo each other]

Conjugate Gradient Methods
– Idea: avoid "undoing" minimization that's already been done
– Walk along direction d_{k+1} = −g_{k+1} + β_k d_k  (g = gradient)
– Polak and Ribière formula: β_k = g_{k+1}ᵀ (g_{k+1} − g_k) / (g_kᵀ g_k)

Conjugate Gradient Methods
– Conjugate gradient implicitly obtains information about the Hessian
– For a quadratic function in n dimensions, gets the exact solution in n steps (ignoring roundoff error)
– Works well in practice…
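A sketch of nonlinear conjugate gradient using the Polak–Ribière β from the previous slide (the fixed line-search range and the β ≥ 0 restart safeguard are illustrative choices, not from the slides):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def conjugate_gradient(f, grad, x0, tol=1e-6, max_iter=500, max_step=2.0):
    """Nonlinear CG with the Polak-Ribiere choice of beta."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # 1-D minimization along d (illustrative fixed search range)
        t = minimize_scalar(lambda t: f(x + t * d),
                            bounds=(0.0, max_step), method="bounded").x
        x = x + t * d
        g_new = grad(x)
        beta = g_new @ (g_new - g) / (g @ g)     # Polak-Ribiere formula
        beta = max(beta, 0.0)                    # common restart safeguard
        d = -g_new + beta * d
        g = g_new
    return x
```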

Value-Only Methods in Multi-Dimensions
– If you can't evaluate gradients, life is hard
– Can use approximate (numerically evaluated) gradients: ∂f/∂x_i ≈ [f(x + h e_i) − f(x)] / h for a small step h along each coordinate direction e_i
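A minimal forward-difference sketch of the approximation above (the step size h is an illustrative choice; central differences are more accurate at twice the cost):

```python
import numpy as np

def approx_gradient(f, x, h=1e-6):
    """Forward-difference approximation to the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    fx = f(x)
    g = np.zeros_like(x)
    for i in range(x.size):
        xh = x.copy()
        xh[i] += h              # perturb one coordinate at a time
        g[i] = (f(xh) - fx) / h
    return g
```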

Generic Optimization Strategies
– Uniform sampling:
  – Cost rises exponentially with # of dimensions
– Simulated annealing:
  – Search in random directions
  – Start with large steps, gradually decrease
  – "Annealing schedule" – how fast to cool?
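A minimal simulated-annealing sketch consistent with the bullets above (the geometric cooling schedule, Gaussian steps, and Metropolis-style acceptance rule are standard but illustrative choices, not specified on the slide):

```python
import math
import random

def simulated_annealing(f, x0, step=1.0, T0=1.0, cooling=0.95,
                        iters_per_T=50, T_min=1e-4):
    """Random search with gradually decreasing temperature and step size."""
    x, fx = list(x0), f(x0)
    best_x, best_f = x[:], fx
    T = T0
    while T > T_min:
        for _ in range(iters_per_T):
            cand = [xi + random.gauss(0.0, step) for xi in x]   # random direction
            fc = f(cand)
            # Always accept downhill moves; accept uphill moves with
            # probability exp(-(increase) / T)
            if fc < fx or random.random() < math.exp(-(fc - fx) / T):
                x, fx = cand, fc
                if fx < best_f:
                    best_x, best_f = x[:], fx
        T *= cooling          # "annealing schedule": geometric cooling
        step *= cooling       # shrink the steps as we cool (illustrative)
    return best_x, best_f
```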

Downhill Simplex Method (Nelder-Mead)
– Keep track of n+1 points in n dimensions
  – Vertices of a simplex (triangle in 2D, tetrahedron in 3D, etc.)
– At each iteration: the simplex can move, expand, or contract
  – Sometimes known as the amoeba method: the simplex "oozes" along the function

Downhill Simplex Method (Nelder-Mead)
– Basic operation: reflection – the worst point (highest function value) is reflected through the centroid of the remaining points
[Figure: the worst point and the location probed by the reflection step]

Downhill Simplex Method (Nelder-Mead)
– If the reflection resulted in the best (lowest) value so far, try an expansion
– Else, if the reflection helped at all, keep it
[Figure: the location probed by the expansion step]

Downhill Simplex Method (Nelder-Mead)
– If the reflection didn't help (the reflected point is still the worst), try a contraction
[Figure: the location probed by the contraction step]

Downhill Simplex Method (Nelder-Mead)
– If all else fails, shrink the simplex around the best point

Downhill Simplex Method (Nelder-Mead)
– Method is fairly efficient at each iteration (typically 1–2 function evaluations)
– Can take lots of iterations
– Somewhat flaky – sometimes needs a restart after the simplex collapses on itself, etc.
– Benefits: simple to implement, doesn't need derivatives, doesn't care about function smoothness, etc.
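Rather than hand-rolling the simplex bookkeeping, one can call a library implementation; a minimal example assuming SciPy is available:

```python
import numpy as np
from scipy.optimize import minimize

# Minimize a smooth 2-D bowl, f(x, y) = (x - 1)^2 + 2 (y + 0.5)^2,
# without supplying any derivatives.
f = lambda p: (p[0] - 1.0) ** 2 + 2.0 * (p[1] + 0.5) ** 2

res = minimize(f, x0=np.array([5.0, 5.0]), method="Nelder-Mead")
print(res.x, res.nfev)   # -> approximately [1.0, -0.5], plus the evaluation count
```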

Rosenbrock's Function
– Designed specifically for testing optimization techniques
– Curved, narrow valley
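The usual form of this test function (the standard a = 1, b = 100 parameterization; the slide itself shows it graphically), whose minimum sits at (1, 1) at the bottom of the curved valley:

```python
def rosenbrock(p):
    """f(x, y) = (1 - x)^2 + 100 (y - x^2)^2, with minimum 0 at (1, 1)."""
    x, y = p
    return (1.0 - x) ** 2 + 100.0 * (y - x * x) ** 2

print(rosenbrock((1.0, 1.0)))   # 0.0 at the bottom of the valley
```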

Constrained Optimization
– Equality constraints: optimize f(x) subject to g_i(x) = 0
– Method of Lagrange multipliers: convert to a higher-dimensional problem
– Minimize f(x) + Σ_i λ_i g_i(x) w.r.t. x and the multipliers λ_1 … λ_k
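As a concrete (made-up) example, minimize x² + y² subject to x + y = 1; the Lagrange-multiplier solution is x = y = 1/2. A numerical check using SciPy's SLSQP solver, which accepts equality constraints directly:

```python
import numpy as np
from scipy.optimize import minimize

# Minimize f(x, y) = x^2 + y^2 subject to g(x, y) = x + y - 1 = 0.
f = lambda p: p[0] ** 2 + p[1] ** 2
g = {"type": "eq", "fun": lambda p: p[0] + p[1] - 1.0}

res = minimize(f, x0=np.array([2.0, -3.0]), method="SLSQP", constraints=[g])
print(res.x)   # -> approximately [0.5, 0.5]
```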

Constrained Optimization
– Inequality constraints are harder…
– If the objective function and constraints are all linear, this is "linear programming"
– Observation: the minimum must lie at a corner of the region formed by the constraints
– Simplex method: move from vertex to vertex, minimizing the objective function
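A tiny made-up linear program solved with SciPy's linprog (which minimizes cᵀx subject to A_ub·x ≤ b_ub), illustrating that the optimum lands on a vertex of the feasible region:

```python
from scipy.optimize import linprog

# Minimize  -x - 2y   subject to   x + y <= 4,  x <= 3,  x >= 0,  y >= 0.
# The optimum sits at the corner (x, y) = (0, 4) of the feasible polygon.
res = linprog(c=[-1.0, -2.0],
              A_ub=[[1.0, 1.0], [1.0, 0.0]],
              b_ub=[4.0, 3.0],
              bounds=[(0, None), (0, None)])
print(res.x, res.fun)   # -> approximately [0, 4], objective -8
```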

Constrained Optimization
– General "nonlinear programming" is hard
– Algorithms exist for special cases (e.g. quadratic programming)

Global Optimization
– In general, can't guarantee that you've found the global (rather than local) minimum
– Some heuristics:
  – Multi-start: try local optimization from several starting positions
  – Very slow simulated annealing
  – Use analytical methods (or graphing) to determine behavior, guide methods to correct neighborhoods
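A minimal multi-start sketch: run a local optimizer (here SciPy's Nelder-Mead, an illustrative choice) from several random starting points and keep the best result; the bounds, number of starts, and test function below are made up for illustration:

```python
import numpy as np
from scipy.optimize import minimize

def multi_start(f, bounds, n_starts=20, seed=0):
    """Local optimization from several random starts; return the best result."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    best = None
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi)                      # random starting position
        res = minimize(f, x0, method="Nelder-Mead")   # local optimization
        if best is None or res.fun < best.fun:
            best = res
    return best

# Example: a multimodal function with many local minima
bumpy = lambda p: np.sin(3 * p[0]) * np.cos(3 * p[1]) + 0.1 * (p[0] ** 2 + p[1] ** 2)
print(multi_start(bumpy, bounds=[(-3, 3), (-3, 3)]).x)
```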