Why Function Optimization?

Tutorial 11: Unconstrained Optimization. Steepest Descent and Newton's Method.

Why Function Optimization?
There are three main reasons why most problems in robotics, vision, and arguably every other science or endeavor take on the form of optimization problems:
1. The desired goal may not be achievable, so we try to get as close to it as possible.
2. There may be more than one way to achieve the goal, so we can choose among the solutions by assigning a quality measure to each of them and selecting the best one.
3. We may not know how to solve the system of equations f(x) = 0, so instead we minimize the norm ||f(x)||, which is a scalar function of the unknown vector x.

Characteristics of Optimization Algorithms
We seek x* = arg min f(x). Algorithms are compared along three axes:
1. Stability: under what conditions will the minimum be reached?
2. Convergence speed: N, the order of the algorithm (usually N = 1 or 2, rarely 3).
3. Complexity: how much time (CPU operations) each iteration takes.

Line Search
A line search could run as follows. Let phi(alpha) = f(x_k + alpha * p_k) be the scalar function of alpha representing the values of f(x) along the direction p_k. Let (a, b, c) be three values of alpha, with b between a and c, such that a single (constrained) minimizer alpha* lies between a and c: a < alpha* < c. Then the following algorithm approaches alpha* arbitrarily closely:
If phi(a) >= phi(c), set u = (a + b)/2;
  if phi(u) < phi(b), then (a, b, c) = (a, u, b); else (a, b, c) = (u, b, c).
If phi(a) < phi(c), set u = (b + c)/2;
  if phi(u) < phi(b), then (a, b, c) = (b, u, c); else (a, b, c) = (a, b, u).
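Below is a minimal Python sketch of this bracket-shrinking line search; the helper names (line_search, phi), the tolerance, and the example bracket (0, 0.1, 1) are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def line_search(phi, a, b, c, tol=1e-8):
    """Shrink a bracketing triplet a < b < c with phi(b) <= phi(a), phi(c)
    until the bracket [a, c] is narrower than tol; return the middle point."""
    while c - a > tol:
        if phi(a) >= phi(c):            # refine the left sub-interval
            u = 0.5 * (a + b)
            if phi(u) < phi(b):
                a, b, c = a, u, b
            else:
                a, b, c = u, b, c
        else:                           # refine the right sub-interval
            u = 0.5 * (b + c)
            if phi(u) < phi(b):
                a, b, c = b, u, c
            else:
                a, b, c = a, b, u
    return b

# Usage: minimize f along a direction p starting from x0
f = lambda x: (x[0] - 1)**2 + (2*x[1] - 2)**2
x0 = np.array([0.0, 0.0])
p = np.array([2.0, 8.0])                # e.g. the negative gradient at x0
phi = lambda alpha: f(x0 + alpha * p)
alpha_star = line_search(phi, 0.0, 0.1, 1.0)
```

A practical implementation would cache the values phi(a), phi(b), phi(c) between iterations instead of re-evaluating them.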

Taylor Series
The Taylor series of a scalar function f(x) about a point x_0 is
f(x) = f(x_0) + f'(x_0)(x - x_0) + (1/2) f''(x_0)(x - x_0)^2 + ... = sum_k f^(k)(x_0) (x - x_0)^k / k!,
where f^(k) denotes the k-th derivative. The coefficients can be derived by successive differentiation of a polynomial representation f(x) = sum_k a_k (x - x_0)^k: differentiating k times and evaluating at x_0 gives a_k = f^(k)(x_0) / k!. For a function of n variables the expansion is
f(x) = f(x_0) + grad f(x_0)^T (x - x_0) + (1/2)(x - x_0)^T H(x_0)(x - x_0) + ...,
where grad f is the gradient vector and H is the Hessian matrix of second derivatives.

2D Taylor Series: Example
Consider the elliptic function f(x, y) = (x - 1)^2 + (2y - 2)^2 and find the first three terms of its Taylor expansion.
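As a worked illustration (the expansion point (0, 0) is an assumption; the slide does not state which point is used), the three terms can be computed as follows:

```latex
% Worked expansion of f(x,y) = (x-1)^2 + (2y-2)^2 about the (assumed) point (0,0);
% requires amsmath for pmatrix.
\[
\nabla f(x,y) = \begin{pmatrix} 2(x-1) \\ 4(2y-2) \end{pmatrix}, \qquad
H = \begin{pmatrix} 2 & 0 \\ 0 & 8 \end{pmatrix}, \qquad
f(0,0) = 5, \quad \nabla f(0,0) = \begin{pmatrix} -2 \\ -8 \end{pmatrix}.
\]
\[
f(x,y) = 5 - 2x - 8y + \frac{1}{2}
\begin{pmatrix} x & y \end{pmatrix}
\begin{pmatrix} 2 & 0 \\ 0 & 8 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= x^2 + 4y^2 - 2x - 8y + 5,
\]
% which reproduces f exactly, because f is quadratic and the series terminates.
```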

Steepest Descent: Example
Consider the same elliptic function f(x, y) = (x - 1)^2 + (2y - 2)^2 and find the first step of steepest descent from (0, 0). The descent direction is the negative gradient, -grad f(0, 0) = (2, 8)^T. Now the line search can be applied along this direction to choose the step size. Is the resulting point a minimum? What is the next step?
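A minimal sketch of this first steepest-descent step, reusing the line_search helper from the sketch above (so the two snippets belong in the same file); the bracket (0, 0.1, 1) is an illustrative assumption.

```python
import numpy as np

def f(x):
    return (x[0] - 1)**2 + (2*x[1] - 2)**2

def grad_f(x):
    return np.array([2*(x[0] - 1), 4*(2*x[1] - 2)])

x = np.array([0.0, 0.0])
p = -grad_f(x)                            # steepest-descent direction (2, 8)
phi = lambda alpha: f(x + alpha * p)
alpha = line_search(phi, 0.0, 0.1, 1.0)   # 1-D minimization along p
x_next = x + alpha * p

# x_next is not the minimizer (1, 1): after an exact line search the new
# gradient is orthogonal to p but still nonzero, so further steps are needed.
```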

Newton's Method
Steepest descent uses only the gradient term of the Taylor expansion to find the minimization direction, and therefore has a linear convergence rate. Newton's method also uses the second derivatives to find both the direction and the step size, and is applicable where the function f(x) near the minimum x* can be approximated by a paraboloid,
f(x) ~ f(x_k) + g_k^T (x - x_k) + (1/2)(x - x_k)^T H_k (x - x_k),
in other words, where the Hessian H is positive definite (PD). Requiring the gradient of this quadratic model to vanish at the minimum gives
g_k + H_k (x - x_k) = 0, i.e. x_{k+1} = x_k - H_k^{-1} g_k.

Newton's Method: Example
Consider the elliptic function f(x) = (x_1 - 1)^2 + 4(x_2 - 2)^2 and find the first step of Newton's method. In this simple case the description of the function by the first three Taylor terms is exact, and the first iteration converges to the minimum.
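A minimal sketch of this single Newton step (the starting point (0, 0) and the use of numpy.linalg.solve are illustrative assumptions):

```python
import numpy as np

def f(x):
    return (x[0] - 1)**2 + 4*(x[1] - 2)**2

def grad_f(x):
    return np.array([2*(x[0] - 1), 8*(x[1] - 2)])

def hess_f(x):
    return np.array([[2.0, 0.0],
                     [0.0, 8.0]])         # constant, because f is quadratic

x = np.array([0.0, 0.0])
g, H = grad_f(x), hess_f(x)
p = np.linalg.solve(H, -g)                # Newton step: solve H p = -g
x_next = x + p                            # lands exactly at the minimizer (1, 2)
```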

Convergence rate 1/2
Before analyzing the convergence rate of the steepest descent and Newton's methods, we write the Taylor series of the function about the minimum x* once more:
f(x) = f(x*) + grad f(x*)^T (x - x*) + (1/2)(x - x*)^T H(x*)(x - x*) + O(||x - x*||^3).
To simplify the argument we consider an upper bound on this expansion and take the norm of each term. Near the minimum the first derivative vanishes, grad f(x*) = 0, so the gradient of the function near the minimum behaves as
grad f(x) = H(x*)(x - x*) + O(||x - x*||^2).

Convergence rate 2/2
Now consider step k of Newton's method,
x_{k+1} = x_k - H_k^{-1} g_k.
The step is chosen to zero out the first two terms of the gradient expansion, and therefore
||grad f(x_{k+1})|| = O(||x_k - x*||^2).
Thus the derivative converges to zero at second order. Since a point of zero derivative corresponds to the minimum of the function, Newton's method is of second order.
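One way to fill in the algebra behind this claim, as a hedged sketch in the notation above:

```latex
% A sketch of the algebra behind the second-order claim, with g_k = \nabla f(x_k),
% H_k = H(x_k), and x_{k+1} = x_k - H_k^{-1} g_k (requires amsmath):
\begin{align*}
\nabla f(x_{k+1})
  &= g_k + H_k\,(x_{k+1} - x_k) + O\!\left(\|x_{k+1} - x_k\|^2\right) \\
  &= g_k - H_k H_k^{-1} g_k + O\!\left(\|x_{k+1} - x_k\|^2\right)
   = O\!\left(\|x_k - x^*\|^2\right),
\end{align*}
% since near the minimum \|x_{k+1} - x_k\| = \|H_k^{-1} g_k\| = O(\|x_k - x^*\|).
```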

Complexity 1/2
For example, for a quadratic function
f(x) = (1/2) x^T Q x - b^T x,
steepest descent takes many iterations to converge in the general case Q != I, while Newton's method requires only one step. However, this single iteration of Newton's method is more expensive, because it requires both the gradient g_k and the Hessian H_k to be evaluated, for a total of n + n(n+1)/2 derivatives. In addition, the Hessian must be inverted or, at least, the linear system H_k p_k = -g_k must be solved. The explicit solution of this system requires about O(n^3) operations and O(n^2) memory, which is very expensive.
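A small sketch contrasting the two ways of computing the Newton direction; the random test problem and its size are illustrative assumptions.

```python
import numpy as np

n = 1000
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)               # a symmetric positive-definite "Hessian"
g = rng.standard_normal(n)

# Preferred: solve the n-by-n system H p = -g directly.
p_solve = np.linalg.solve(H, -g)

# Avoid: forming the explicit inverse and multiplying; same O(n^3) order,
# but a larger constant and worse numerical behavior.
p_inv = -np.linalg.inv(H) @ g
```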

Complexity 2/2
In contrast, steepest descent requires only the gradient g_k for selecting the step direction p_k, and a line search in the direction p_k to find the step size. These cheaper iterations can be advantageous over the faster convergence of Newton's method when the dimensionality of x is large, which can exceed many thousands. In the next tutorial we will discuss the method of conjugate gradients, which is motivated by the desire to accelerate convergence relative to steepest descent without paying the computation and storage cost of Newton's method.