Presentation transcript:

What you can do for one variable, you can do for many (in principle)

Method of Steepest Descent The method of steepest descent (also known as the gradient method) is the simplest example of a gradient-based method for minimizing a function of several variables. Its core is the following recursion formula: x(k+1) = x(k) + αk S(k). Remember: Direction = d(k) = S(k) = -∇F(x(k)). Refer to Section 3.5 for the algorithm and stopping criteria. Advantage: simple. Disadvantage: seldom converges reliably.
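Below is a minimal NumPy sketch of this recursion. The quadratic test function, its gradient, and the fixed step length α = 0.1 (standing in for the stopping criteria and step-size rules of Section 3.5) are illustrative assumptions, not part of the slides.

```python
import numpy as np

def steepest_descent(grad_F, x0, alpha=0.1, tol=1e-6, max_iter=1000):
    """Minimize F by stepping along the negative gradient direction d(k) = -grad F(x(k))."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d = -grad_F(x)                 # search direction S(k) = -grad F(x(k))
        if np.linalg.norm(d) < tol:    # simple stopping criterion on the gradient norm
            break
        x = x + alpha * d              # recursion x(k+1) = x(k) + alpha * d(k)
    return x

# Illustrative example (not from the slides): F(x) = (x1 - 1)^2 + 2*(x2 + 3)^2
x_min = steepest_descent(lambda x: np.array([2*(x[0]-1), 4*(x[1]+3)]), x0=[0.0, 0.0])
print(x_min)   # should approach [1, -3]
```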

Newton's Method (multi-variable case) The recursion comes from a second-order Taylor expansion of F about x(k); the higher-order remainder is dropped (see Sec. 1.4): x(k+1) = x(k) - H(x(k))^-1 ∇F(x(k)). Does Newton's method, like the Steepest Descent Method, search in the negative gradient direction? No: its search direction is the negative gradient premultiplied by the inverse Hessian, -H^-1 ∇F(x(k)). Don't confuse H^-1 with the step length α.
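A corresponding sketch of the Newton recursion; the test function and its gradient/Hessian are assumed for illustration, and solving H d = -g stands in for forming the inverse Hessian explicitly.

```python
import numpy as np

def newton(grad_F, hess_F, x0, tol=1e-8, max_iter=50):
    """Newton recursion x(k+1) = x(k) - H(x(k))^{-1} grad F(x(k))."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_F(x)
        if np.linalg.norm(g) < tol:
            break
        # Solve H d = -g rather than explicitly inverting the Hessian.
        d = np.linalg.solve(hess_F(x), -g)
        x = x + d                      # full Newton step (step length alpha = 1)
    return x

# Illustrative example (assumed, not from the slides): F(x) = x1^4 + x1^2 + x2^2
grad = lambda x: np.array([4*x[0]**3 + 2*x[0], 2*x[1]])
hess = lambda x: np.array([[12*x[0]**2 + 2, 0.0], [0.0, 2.0]])
x_min = newton(grad, hess, x0=[1.0, 1.0])
print(x_min)   # should approach [0, 0]
```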

Properties of Newton's Method Good properties (fast convergence) if started near the solution. However, it needs modifications if started far away from the solution. Also, the (inverse) Hessian is expensive to calculate. To overcome this, several modifications are often made. One of them is to add a step-length parameter α in front of the Newton step H^-1 ∇F (similar to steepest descent). This is often referred to as the modified Newton's method. Other modifications focus on enhancing the properties of the combined second- and first-order gradient information. Quasi-Newton methods build up curvature information by observing the behavior of the objective function and its first-order gradient. This information is used to generate an approximation of the Hessian.
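As one way to picture the quasi-Newton idea, here is a rough sketch that maintains an inverse-Hessian approximation updated from observed gradient changes (a BFGS-style formula). The specific update, the fixed step length in place of a line search, and the test problem are all assumptions for illustration; the slide does not name a particular quasi-Newton method.

```python
import numpy as np

def quasi_newton_sketch(grad_F, x0, alpha=0.5, tol=1e-6, max_iter=200):
    """Quasi-Newton sketch: keep an approximation B of the inverse Hessian,
    refine it from gradient changes (BFGS-style update), and step x += -alpha*B*g."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    B = np.eye(n)                      # initial inverse-Hessian approximation
    g = grad_F(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        d = -B @ g                     # quasi-Newton search direction
        x_new = x + alpha * d          # fixed step length used for simplicity
        g_new = grad_F(x_new)
        s, y = x_new - x, g_new - g    # step taken and observed gradient change
        sy = s @ y
        if sy > 1e-12:                 # curvature condition; skip the update otherwise
            rho = 1.0 / sy
            I = np.eye(n)
            B = (I - rho*np.outer(s, y)) @ B @ (I - rho*np.outer(y, s)) + rho*np.outer(s, s)
        x, g = x_new, g_new
    return x

# Same illustrative quadratic as in the steepest-descent sketch (assumed example).
x_min = quasi_newton_sketch(lambda x: np.array([2*(x[0]-1), 4*(x[1]+3)]), x0=[0.0, 0.0])
print(x_min)   # should approach [1, -3]
```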

Conjugate Directions Method Conjugate direction methods can be regarded as lying somewhere between steepest descent and Newton's method, having the positive features of both. Motivation: the desire to accelerate the slow convergence of steepest descent while avoiding the expensive evaluation, storage, and inversion of the Hessian. Application: Conjugate direction methods are invariably invented and analyzed for the quadratic problem: Minimize (½)xTQx - bTx. Note: The condition for optimality is that the gradient vanishes, ∇f(x) = Qx - b = 0, i.e. Qx = b (a linear equation). Note: The textbook uses “A” instead of “Q”.
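A quick numerical check of that optimality condition, with an assumed symmetric positive definite Q and vector b:

```python
import numpy as np

# Illustrative Q (symmetric positive definite) and b -- assumed values, not from the slides.
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# Minimizing f(x) = 0.5 x^T Q x - b^T x is equivalent to solving the linear system Qx = b.
x_star = np.linalg.solve(Q, b)

grad_at_x_star = Q @ x_star - b        # gradient of f at the minimizer
print(grad_at_x_star)                  # ~ [0, 0]: the optimality condition Qx - b = 0 holds
```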

Basic Principle Definition: Given a symmetric matrix Q, two vectors d1 and d2 are said to be Q-orthogonal or Q-conjugate (with respect to Q) if d1TQd2 = 0. Note that orthogonal vectors (d1Td2 = 0) are a special case of conjugate vectors. So, since the vectors di are independent, the solution to the n×n quadratic problem can be written as x* = α0d0 + ... + αn-1dn-1. Multiplying by Q and taking the scalar product with di, you can express αi in terms of di, Q, and either x* or b: αi = diTQx* / (diTQdi) = diTb / (diTQdi). Note that A is used instead of Q in your textbook.
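The same assumed Q and b from the previous sketch can illustrate this expansion of x* in Q-conjugate directions; the particular directions below, built by a simple Q-orthogonalization, are an illustrative choice.

```python
import numpy as np

# Same illustrative Q and b as in the previous sketch (assumed values).
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# Two Q-conjugate directions: d0 chosen arbitrarily, d1 obtained by
# Q-orthogonalizing the second unit vector against d0.
d0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])
d1 = e1 - (d0 @ Q @ e1) / (d0 @ Q @ d0) * d0
assert abs(d0 @ Q @ d1) < 1e-12        # d0 and d1 are Q-conjugate

# Expansion coefficients alpha_i = d_i^T b / (d_i^T Q d_i)
alphas = [(d @ b) / (d @ Q @ d) for d in (d0, d1)]
x_star = alphas[0]*d0 + alphas[1]*d1

print(np.allclose(Q @ x_star, b))      # True: x* = sum_i alpha_i d_i solves Qx = b
```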

Conjugate Gradient Method The conjugate gradient method is the conjugate direction method obtained by selecting the successive direction vectors as conjugate versions of the successive gradients obtained as the method progresses: you generate the conjugate directions as you go along, with dk being the search direction at iteration k. Three advantages: 1) Unless the solution has been reached, the gradient is nonzero and linearly independent of all previous direction vectors. 2) A simple formula determines the new direction; it is only slightly more complicated than steepest descent. 3) The process makes good progress because it is based on gradients.

“Pure” Conjugate Gradient Method (Quadratic Case) 0 - Starting at any x0, define d0 = -g0 = b - Qx0, where gk is the column vector of gradients of the objective function evaluated at xk. 1 - Using dk, calculate the new point xk+1 = xk + ak dk, where ak = -gkTdk / (dkTQdk). 2 - Calculate the new conjugate gradient direction dk+1 according to dk+1 = -gk+1 + bk dk, where bk = gk+1TQdk / (dkTQdk). Note that ak is calculated in closed form rather than by a line search; this is slightly different from your current textbook.
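A sketch of these three steps for the quadratic case, reusing the assumed Q and b from the earlier examples:

```python
import numpy as np

def conjugate_gradient_quadratic(Q, b, x0, tol=1e-10):
    """'Pure' CG for f(x) = 0.5 x^T Q x - b^T x with Q symmetric positive definite."""
    x = np.asarray(x0, dtype=float)
    g = Q @ x - b                      # g0: gradient at x0
    d = -g                             # d0 = -g0 = b - Q x0
    for _ in range(len(b)):            # at most n steps in exact arithmetic
        if np.linalg.norm(g) < tol:
            break
        Qd = Q @ d
        a = -(g @ d) / (d @ Qd)        # a_k = -g_k^T d_k / (d_k^T Q d_k)
        x = x + a * d                  # x_{k+1} = x_k + a_k d_k
        g = Q @ x - b                  # g_{k+1}
        beta = (g @ Qd) / (d @ Qd)     # b_k = g_{k+1}^T Q d_k / (d_k^T Q d_k)
        d = -g + beta * d              # d_{k+1} = -g_{k+1} + b_k d_k
    return x

Q = np.array([[4.0, 1.0], [1.0, 3.0]])     # assumed example values
b = np.array([1.0, 2.0])
x_star = conjugate_gradient_quadratic(Q, b, x0=np.zeros(2))   # exact in <= 2 steps here
print(np.allclose(Q @ x_star, b))          # True
```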

Non-Quadratic Conjugate Gradient Methods For non-quadratic cases you have the problem that you do not know Q, so you would have to make an approximation. One approach is to substitute the Hessian H(xk) for Q; the problem is that the Hessian then has to be evaluated at each point. Other approaches avoid Q completely by using line searches. Examples: the Fletcher-Reeves and Polak-Ribière methods. Differences from the “pure” conjugate gradient algorithm: ak is found through a line search, and bk is calculated with different formulas.

Polak-Ribière & Fletcher-Reeves Method for Minimizing f(x) 0 - Starting at any x0, define d0 = -g0, where g is the column vector of gradients of the objective function evaluated at x. 1 - Using dk, find the new point xk+1 = xk + ak dk, where ak is found using a line search that minimizes f(xk + ak dk). 2 - Calculate the new conjugate gradient direction dk+1 according to dk+1 = -gk+1 + bk dk, where bk varies depending on which update formula you use. Fletcher-Reeves: bk = gk+1Tgk+1 / (gkTgk). Polak-Ribière: bk = gk+1T(gk+1 - gk) / (gkTgk). Note: gk+1 is the gradient of the objective function at the point xk+1.
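A sketch of this algorithm with both β update formulas. The line search uses scipy.optimize.minimize_scalar, and the Rosenbrock test function is an assumed example rather than anything from the slides.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def nonlinear_cg(f, grad_f, x0, variant="FR", tol=1e-6, max_iter=200):
    """Nonlinear CG with Fletcher-Reeves ('FR') or Polak-Ribiere ('PR') beta."""
    x = np.asarray(x0, dtype=float)
    g = grad_f(x)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Line search: a_k approximately minimizes f(x_k + a d_k) along d_k.
        a = minimize_scalar(lambda a: f(x + a * d)).x
        x = x + a * d
        g_new = grad_f(x)
        if variant == "FR":
            beta = (g_new @ g_new) / (g @ g)            # Fletcher-Reeves
        else:
            beta = (g_new @ (g_new - g)) / (g @ g)      # Polak-Ribiere
        d = -g_new + beta * d
        g = g_new
    return x

# Illustrative test problem (assumed): the Rosenbrock function, minimized at [1, 1].
f = lambda x: (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2
grad = lambda x: np.array([-2*(1 - x[0]) - 400*x[0]*(x[1] - x[0]**2),
                           200*(x[1] - x[0]**2)])
x_min = nonlinear_cg(f, grad, x0=[-1.2, 1.0], variant="PR")
print(x_min)   # should approach [1, 1]
```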

Fletcher-Reeves Method for Minimizing f(x) 0 - Starting at any x0, define d0 = -g0, where g is the column vector of gradients of the objective function evaluated at x. 1 - Using dk, find the new point xk+1 = xk + ak dk, where ak is found using a line search that minimizes f(xk + ak dk). 2 - Calculate the new conjugate gradient direction dk+1 according to dk+1 = -gk+1 + bk dk, where bk = gk+1Tgk+1 / (gkTgk). See also Example 3.9 (page 73) in your textbook.

Conjugate Gradient Method Advantages The simple formulae for updating the direction vector are attractive: the method is only slightly more complicated than steepest descent, yet converges faster. See ‘em in action! For animations of all of the preceding search techniques, check out: http://www.esm.vt.edu/~zgurdal/COURSES/4084/4084-Docs/Animation.html