CS5321 Numerical Optimization
07 Large-Scale Unconstrained Optimization
11/25/2019
Large-scale optimization

- The problem size $n$ may be in the thousands to millions.
- Storing the Hessian takes $O(n^2)$ memory. Even if the Hessian is sparse, its factorizations (LU, Cholesky, ...) and its quasi-Newton approximations (BFGS, SR1, ...) are generally not.
- Methods:
  - Inexact Newton methods
    - Line search: truncated Newton method
    - Trust region: Steihaug's algorithm
  - Limited-memory BFGS; sparse quasi-Newton updates
  - Algorithms for partially separable functions
Inexact Newton method

- Inexact Newton methods use an iterative solver to compute the Newton direction $p = -H^{-1} g$ inexactly.
- The inexactness is measured by the residual $r_k = H_k p_k + g_k$. The inner iteration stops when $\|r_k\| \le \eta_k \|g_k\|$ for some forcing term $0 < \eta_k < 1$.
- Convergence (Theorems 7.1, 7.2): if $H$ is s.p.d. for $x$ near $x^*$ and $x_0$ is close enough to $x^*$, the inexact Newton method converges to $x^*$. If $\eta_k \to 0$, the convergence is superlinear. If, in addition, $H$ is Lipschitz continuous for $x$ near $x^*$, the convergence is quadratic.
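A minimal sketch of this outer/inner structure, assuming a CG inner solver, the common forcing-term choice $\eta_k = \min(0.5, \sqrt{\|g_k\|})$, and a simple backtracking line search; the function names and constants are our own choices, not part of the lecture:

```python
import numpy as np

def cg_inexact(H, g, eta):
    """CG on H p = -g, stopped once ||H p + g|| <= eta * ||g||."""
    p, r = np.zeros_like(g), g.copy()   # residual r = H p + g; at p = 0, r = g
    d, tol = -r, eta * np.linalg.norm(g)
    for _ in range(2 * g.size):
        if np.linalg.norm(r) <= tol:
            break
        Hd = H @ d
        alpha = (r @ r) / (d @ Hd)
        p, r_new = p + alpha * d, r + alpha * Hd
        d = -r_new + ((r_new @ r_new) / (r @ r)) * d
        r = r_new
    return p

def inexact_newton(f, grad, hess, x, tol=1e-8, max_iter=100):
    for _ in range(max_iter):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm <= tol:
            break
        eta = min(0.5, np.sqrt(gnorm))      # eta_k -> 0 gives a superlinear rate
        p = cg_inexact(hess(x), g, eta)
        t = 1.0                             # backtracking for sufficient decrease
        while f(x + t * p) > f(x) + 1e-4 * t * (g @ p) and t > 1e-10:
            t *= 0.5
        x = x + t * p
    return x
```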
Truncated Newton method

- CG + line search + termination conditions.
- All CG needs is Hessian-vector products; the Hessian itself is never formed.
- Use a finite difference to approximate the Hessian times a vector $d$:
  $\nabla^2 f(x_k)\, d \approx \dfrac{\nabla f(x_k + h d) - \nabla f(x_k)}{h}$
  The extra cost per product is one gradient evaluation, $\nabla f(x_k + h d)$.
- Additional termination condition: when negative curvature is detected ($p^T H p < 0$), return $-g$, the steepest-descent direction. (See the sketch below.)
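A sketch of the finite-difference Hessian-vector product; the step-size heuristic for $h$ is our assumption:

```python
import numpy as np

def hessvec_fd(grad, x, d):
    """Approximate (∇²f(x)) d by forward-differencing the gradient.
    Costs one extra gradient evaluation per product."""
    nd = np.linalg.norm(d)
    if nd == 0.0:
        return np.zeros_like(d)
    # heuristic step size balancing truncation and rounding error (assumed)
    h = np.sqrt(np.finfo(float).eps) * (1.0 + np.linalg.norm(x)) / nd
    return (grad(x + h * d) - grad(x)) / h
```

For example, with $f(x) = \tfrac{1}{2} x^T A x$ and `grad = lambda x: A @ x`, the call `hessvec_fd(grad, x, d)` reproduces `A @ d` up to rounding error.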
Steihaug's algorithm

- CG + trust region + termination conditions.
- Notation change from standard CG: $x \to z$ (iterate), $p \to d$ (search direction); iteration index $j$.
- Additional termination conditions:
  - negative curvature is detected: $d_j^T H d_j < 0$;
  - the step leaves the trust region: $\|z_j\| \ge \Delta_k$.
- It can be shown that $\|z_j\|$ increases monotonically.
- When CG stops abnormally, it returns $z_j + \tau d_j$, where $\tau$ minimizes $m_k(z_j + \tau d_j)$ subject to $\|z_j + \tau d_j\| = \Delta_k$.
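A minimal dense sketch of Steihaug CG under these stopping rules; variable names follow the slide's $z$/$d$ notation, and the quadratic-root helper is ours:

```python
import numpy as np

def steihaug_cg(H, g, delta, tol=1e-8):
    """Approximately minimize m(p) = g.p + 0.5 p.H.p subject to ||p|| <= delta."""
    z, r = np.zeros_like(g), g.copy()
    d = -r
    if np.linalg.norm(r) <= tol:
        return z
    for _ in range(2 * g.size):
        Hd = H @ d
        dHd = d @ Hd
        if dHd <= 0:                       # negative curvature: run to the boundary
            return z + _boundary_tau(z, d, delta) * d
        alpha = (r @ r) / dHd
        z_next = z + alpha * d
        if np.linalg.norm(z_next) >= delta:  # step leaves the region: stop on boundary
            return z + _boundary_tau(z, d, delta) * d
        r_next = r + alpha * Hd
        if np.linalg.norm(r_next) <= tol:
            return z_next
        beta = (r_next @ r_next) / (r @ r)
        z, r, d = z_next, r_next, -r_next + beta * d
    return z

def _boundary_tau(z, d, delta):
    """Positive root tau of ||z + tau d|| = delta (z is strictly inside)."""
    a, b, c = d @ d, 2 * (z @ d), z @ z - delta**2
    return (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
```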
Limited memory BFGS

- Review of BFGS: $p_k = -H_k g_k$, $x_{k+1} = x_k + \alpha_k p_k$, $s_k = x_{k+1} - x_k$, $y_k = g_{k+1} - g_k$, and the inverse-Hessian update
  $H_{k+1} = V_k^T H_k V_k + \rho_k s_k s_k^T$, where $V_k = I - \rho_k y_k s_k^T$ and $\rho_k = 1 / (y_k^T s_k)$.
- L-BFGS: do not form $H_k$ explicitly. Store $s_k$ and $y_k$ for the last $m$ iterations ($m \ll n$). Unrolling the update $m$ times gives
  $H_k = (V_{k-1}^T \cdots V_{k-m}^T) H_k^0 (V_{k-m} \cdots V_{k-1}) + \rho_{k-m} (V_{k-1}^T \cdots V_{k-m+1}^T) s_{k-m} s_{k-m}^T (V_{k-m+1} \cdots V_{k-1}) + \cdots + \rho_{k-1} s_{k-1} s_{k-1}^T$.
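The product $H_k g$ can be evaluated from the stored pairs without forming any matrix, via the standard two-loop recursion. A sketch, assuming the common scaling $H_k^0 = \gamma_k I$ with $\gamma_k = s_{k-1}^T y_{k-1} / y_{k-1}^T y_{k-1}$:

```python
import numpy as np

def lbfgs_direction(g, s_list, y_list, gamma):
    """Compute p = -H_k g by the two-loop recursion.
    s_list, y_list: the last m pairs (oldest first); H_k^0 = gamma * I."""
    q = g.copy()
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    alphas = []
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * (s @ q)
        alphas.append(a)                 # stored newest first
        q -= a * y
    r = gamma * q                        # apply H_k^0
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        b = rho * (y @ r)
        r += (a - b) * s
    return -r
```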
Compact form of L-BFGS

- With $B_k^0 = \delta_k I$, the Hessian approximation can be written in the compact form
  $B_k = \delta_k I - \begin{pmatrix} \delta_k S_k & Y_k \end{pmatrix} \begin{pmatrix} \delta_k S_k^T S_k & L_k \\ L_k^T & -D_k \end{pmatrix}^{-1} \begin{pmatrix} \delta_k S_k^T \\ Y_k^T \end{pmatrix}$
  where $S_k = (s_{k-m}, \ldots, s_{k-1})$, $Y_k = (y_{k-m}, \ldots, y_{k-1})$,
  $(L_k)_{i,j} = s_{k-m+i-1}^T y_{k-m+j-1}$ if $i > j$ and $0$ otherwise, and
  $D_k = \mathrm{diag}(s_{k-m}^T y_{k-m}, \ldots, s_{k-1}^T y_{k-1})$.
- Only $S_k$, $Y_k$, $L_k$, $D_k$ need to be stored: $S_k$, $Y_k$ are $n \times m$; $L_k$ is $m \times m$ strictly lower triangular; $D_k$ is $m \times m$ diagonal.
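A sketch of a matrix-vector product $B_k v$ using this compact representation; the column ordering (oldest pair first) and the dense solve of the small $2m \times 2m$ middle matrix are our simplifications:

```python
import numpy as np

def compact_lbfgs_matvec(S, Y, delta, v):
    """Compute B_k @ v from the compact form with B_k^0 = delta * I.
    S, Y: n-by-m arrays whose columns are the stored s_i, y_i (oldest first)."""
    StY = S.T @ Y
    L = np.tril(StY, k=-1)            # (L)_{ij} = s_i^T y_j for i > j
    D = np.diag(np.diag(StY))
    M = np.block([[delta * (S.T @ S), L],
                  [L.T,               -D]])
    W = np.hstack([delta * S, Y])     # n x 2m
    return delta * v - W @ np.linalg.solve(M, W.T @ v)
```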
Sparse quasi-Newton updates

- Compute $B_{k+1}$ that is symmetric, has the same sparsity pattern as the exact Hessian, and satisfies the secant equation $B_{k+1} s_k = y_k$.
- Sparsity pattern: $\Omega = \{(i, j) \mid [\nabla^2 f(x)]_{ij} \ne 0\}$.
- The update solves
  $\min_B \|B - B_k\|_F^2 = \sum_{(i,j)\in\Omega} [B_{ij} - (B_k)_{ij}]^2$
  subject to $B s_k = y_k$, $B = B^T$, and $B_{ij} = 0$ for $(i, j) \notin \Omega$.
- This is a constrained least-squares problem. Its solution may fail to be positive definite.
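A toy illustration of this update as an equality-constrained least-squares problem solved through its KKT system; the parameterization over the symmetric pattern and the dense solve are our own construction for small $n$, not a practical large-scale method:

```python
import numpy as np

def sparse_secant_update(B_k, s, y, pattern):
    """min ||B - B_k||_F^2  s.t.  B s = y,  B = B^T,  B_ij = 0 off the pattern.
    pattern: set of (i, j) pairs (symmetric) allowed to be nonzero."""
    free = sorted({(min(i, j), max(i, j)) for (i, j) in pattern})
    nvar, n = len(free), len(s)
    w = np.array([1.0 if i == j else 2.0 for (i, j) in free])  # Frobenius weights
    x0 = np.array([B_k[i, j] for (i, j) in free])
    A = np.zeros((n, nvar))           # linear constraint B s = y in the free vars
    for k, (i, j) in enumerate(free):
        A[i, k] += s[j]
        if i != j:
            A[j, k] += s[i]
    # KKT system: stationarity rows, then the secant constraint rows
    K = np.block([[np.diag(2 * w), A.T],
                  [A, np.zeros((n, n))]])
    rhs = np.concatenate([2 * w * x0, y])
    sol = np.linalg.lstsq(K, rhs, rcond=None)[0]   # lstsq guards a singular KKT
    B = np.zeros_like(B_k)
    for k, (i, j) in enumerate(free):
        B[i, j] = B[j, i] = sol[k]
    return B
```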
Partially separable functions

- Separable objective function: $f(x) = f_1(x_1, x_3) + f_2(x_2, x_4)$.
- Partially separable objective function: $f(x)$ can be decomposed as a sum of element functions $f_i$, each of which depends on only a few components of $x$:
  $f(x) = \sum_{i=1}^{\ell} f_i(x)$
- Since the gradient and Hessian are linear operators,
  $\nabla f(x) = \sum_{i=1}^{\ell} \nabla f_i(x)$ and $\nabla^2 f(x) = \sum_{i=1}^{\ell} \nabla^2 f_i(x)$,
  so the gradient and Hessian of each element function can be computed separately (see the sketch below).
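A minimal sketch of assembling the full gradient and Hessian from element functions; representing each element as an (idx, grad_i, hess_i) triple acting on its own few variables is our assumption:

```python
import numpy as np

def assemble(elements, x, n):
    """Sum per-element gradients/Hessians into the full gradient and Hessian.
    elements: list of (idx, grad_i, hess_i), where grad_i/hess_i act on x[idx]."""
    g = np.zeros(n)
    H = np.zeros((n, n))
    for idx, grad_i, hess_i in elements:
        xi = x[idx]
        g[idx] += grad_i(xi)                  # scatter-add the element gradient
        H[np.ix_(idx, idx)] += hess_i(xi)     # scatter-add the element Hessian block
    return g, H

# example: f(x) = x0*x2 + (x1 + x3)^2, with elements on (x0, x2) and (x1, x3)
elements = [
    ([0, 2], lambda z: np.array([z[1], z[0]]),
             lambda z: np.array([[0.0, 1.0], [1.0, 0.0]])),
    ([1, 3], lambda z: 2 * (z[0] + z[1]) * np.ones(2),
             lambda z: 2 * np.ones((2, 2))),
]
g, H = assemble(elements, np.array([1.0, 2.0, 3.0, 4.0]), 4)
```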