Conjugate Gradient

Presentation transcript:

Conjugate Gradient

Problem: SD (steepest descent) is too slow to converge if the N×N matrix H is ill-conditioned.
SD: $dx = -g$ (slow, but no inverse to store or compute)
CG: $dx = a\,p$ (fast, and no inverse to compute or store)
GN: $dx = -H^{-1} g$ (fast but expensive)

Solution: Conjugate Gradient converges in N iterations if the N×N matrix H is S.P.D. (symmetric positive definite).

Quasi-Newton condition: $g' - g = H\,dx'$, i.e. $(g' - g)/dx' \approx dg/dx = d^2 f/dx^2$

Outline
- CG Algorithm
- Step Length: Polak-Ribiere vs Fletcher-Reeves
- CG Solution to Even-Determined and Overdetermined Equations
- Regularized CG
- Preconditioned CG
- Non-Linear CG

Conjugate Gradient

[Figure: contours around the bullseye x*, showing the previous step dx ending at the kiss point (where $dx^T g = 0$), the direction $-g$, and the new step dx' pointing to x*.]

Quasi-Newton condition: $g' - g = H\,dx'$ (1)

For dx' ending at the bullseye x*, $g' = \nabla f(x^*) = 0$, so $dx'^T g' = 0$. Multiplying eqn. 1 on the left by $dx^T$ and recalling $dx^T g = 0$ gives $dx^T (g' - g) = 0$ at the bullseye. Hence,

Conjugacy condition: $0 = dx^T H\,dx'$ (2)

$x' = x + a\,p$, where p is conjugate to the previous direction (3)

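Conjugate Gradient

Quasi-Newton condition: $g' - g = H\,dx'$ (1)
Conjugacy condition: $0 = dx^T H\,dx'$ (2)
$x' = x + a\,p$, where p is conjugate to the previous direction and a linear combination of dx and g (3)

For i = 1:nit
    find b: set $p = b\,dx - g$ and solve for b such that dx is conjugate to dx':
        $0 = dx^T H (b\,dx - g)$, so $b = \dfrac{dx^T H g}{dx^T H\,dx}$
    find a: solve for a such that $dx' = a\,p$ kisses a contour (no longer going downhill):
        $0 = p^T g' = p^T (g + a H p)$, so $a = -\dfrac{p^T g}{p^T H p}$
    $dx' = a\,p$
    $x = x + dx'$
end

[Figure: the previous step dx ends at a kiss point; the new step dx' runs from there toward x*, and the direction $-g$ is shown for comparison.]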

Conjugate Gradient

For i = 1:nit
    find b: $b = \dfrac{dx^T H g}{dx^T H\,dx}$
    $p = b\,dx - g$
    find a: $a = -\dfrac{p^T g}{p^T H p}$
    $dx' = a\,p$
    $x = x + dx'$
end

Recall eqn. 1 with $dx' = a\,d^{(k-1)}$: $a H d^{(k-1)} = g^{(k)} - g^{(k-1)}$
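The loop above can be sketched in plain Python as follows. This is a minimal illustration, not the presenter's code: it assumes the quadratic misfit $f(x) = \frac{1}{2} x^T H x - rhs^T x$ with SPD H, so that $g = Hx - rhs$, and the names `conjugate_gradient`, `matvec`, `dot` and `rhs` are hypothetical. The first iteration has no previous step, so it falls back to the steepest-descent direction.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(H, v):
    """Product of a dense matrix (list of rows) with a vector."""
    return [dot(row, v) for row in H]


def conjugate_gradient(H, rhs, nit=None, tol=1e-12):
    """Solve H x = rhs for SPD H using the b / p / a loop of the slides.

    Assumption: g = H x - rhs is the gradient of 0.5 x^T H x - rhs^T x.
    """
    n = len(rhs)
    nit = n if nit is None else nit
    x = [0.0] * n
    dx = None                                   # previous step, none yet
    for _ in range(nit):
        g = [hx - r for hx, r in zip(matvec(H, x), rhs)]
        if dot(g, g) ** 0.5 < tol:              # already at the bullseye
            break
        if dx is None:
            p = [-gi for gi in g]               # first step: steepest descent
        else:
            Hdx = matvec(H, dx)
            b = dot(Hdx, g) / dot(dx, Hdx)      # b = dx^T H g / dx^T H dx
            p = [b * di - gi for di, gi in zip(dx, g)]
        a = -dot(p, g) / dot(p, matvec(H, p))   # a = -p^T g / p^T H p
        dx = [a * pi for pi in p]               # dx' = a p
        x = [xi + di for xi, di in zip(x, dx)]  # x = x + dx'
    return x


# A 2x2 SPD system; two iterations reach the solution up to rounding.
x = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

Recomputing `g` from `x` each pass costs an extra matrix-vector product; production codes usually update the gradient with $g' = g + a H p$ instead.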


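Conjugate Gradient

For i = 1:nit
    find b (Fletcher-Reeves or Polak-Ribiere form)
    $p = b\,dx - g$
    find a: $a = -\dfrac{p^T g}{p^T H p}$
    $dx' = a\,p$
    $x = x + dx'$
end

Using $a H d^{(k-1)} = g^{(k)} - g^{(k-1)}$, the products with H in $b = \dfrac{dx^T H g}{dx^T H\,dx}$ can be replaced by gradient differences. Fletcher-Reeves and Polak-Ribiere are two such forms of b.

Not going downhill if moving perpendicular to the gradient direction $-g$.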
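For reference, the two step-length formulas named on this slide are usually written as below. This uses the common textbook notation rather than the slide's own symbols, so it is an assumption about what the missing formulas showed: the direction update is $p^{(k)} = -g^{(k)} + \beta^{(k)} p^{(k-1)}$, and because $dx = a\,p^{(k-1)}$ the slide's b corresponds to $\beta / a$.

```latex
\[
\beta_{FR}^{(k)} = \frac{g^{(k)T} g^{(k)}}{g^{(k-1)T} g^{(k-1)}},
\qquad
\beta_{PR}^{(k)} = \frac{g^{(k)T}\left(g^{(k)} - g^{(k-1)}\right)}{g^{(k-1)T} g^{(k-1)}}
\]
```

For an exactly quadratic misfit, successive gradients are orthogonal ($g^{(k)T} g^{(k-1)} = 0$), so the two expressions coincide. They differ only for non-quadratic functions.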


Conjugate Gradient: $Lx = d$

[Figure: contour sketches for $Lx = d$, showing x*, the direction $-g$, the kiss point where $dx^T g = 0$, and successive search directions.]

Conjugate Gradient: $L^T L x = L^T d$

Compared to a square system of equations, the gradient for an overdetermined system of equations has an extra $L^T$. However, $L^T L$ has the squared condition number of L.
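A minimal sketch of CG applied to the normal equations, under a few assumptions: the misfit is $\frac{1}{2}\|Lx - d\|^2$, so the gradient is $g = L^T (Lx - d)$, and the Fletcher-Reeves form of the step-length factor is used. The product $L^T L$ is never formed; L and $L^T$ are applied one after the other. All function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def apply_L(L, x):
    """Compute L x for a dense matrix stored as a list of rows."""
    return [dot(row, x) for row in L]


def apply_LT(L, r):
    """Compute L^T r without building the transpose."""
    return [sum(L[i][j] * r[i] for i in range(len(L))) for j in range(len(L[0]))]


def cg_normal_equations(L, d, nit=None, tol=1e-12):
    """Least-squares solution of L x = d by CG on L^T L x = L^T d."""
    n = len(L[0])
    nit = n if nit is None else nit
    x = [0.0] * n
    g = apply_LT(L, [-di for di in d])       # g = L^T (L x - d) at x = 0
    p = [-gi for gi in g]                    # first direction: steepest descent
    for _ in range(nit):
        gg = dot(g, g)
        if gg ** 0.5 < tol:
            break
        Lp = apply_L(L, p)
        a = gg / dot(Lp, Lp)                 # -p^T g / p^T L^T L p, with -p^T g = g^T g
        x = [xi + a * pi for xi, pi in zip(x, p)]
        g = [gi + a * ti for gi, ti in zip(g, apply_LT(L, Lp))]
        b = dot(g, g) / gg                   # Fletcher-Reeves factor
        p = [b * pi - gi for pi, gi in zip(p, g)]
    return x


# Three equations, two unknowns: x0 = 1, x1 = 2, x0 + x1 = 4.
x = cg_normal_equations([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 4.0])
```

Here the normal equations are $2x_0 + x_1 = 5$ and $x_0 + 2x_1 = 6$, whose solution is $(4/3, 7/3)$.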

Conjugate Gradient Convergence

[Figure: two panels, labelled "Well conditioned in most dimensions" and "Poorly conditioned in every dimension".]

If the N×N matrix H is SPD (a linear problem), CG converges in at most N iterations in exact arithmetic, and in practice often much sooner. Stopping sooner is a form of regularization, because it excludes the small-eigenvalue components.


Regularized Conjugate Gradient

Balance between the solution that minimizes the misfit and the one that minimizes the penalty.
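The slide does not say which penalty is used. As an illustration only, assume the simplest damping penalty $\|x\|^2$ with a trade-off weight $\lambda > 0$ (both are assumptions, not taken from the slides). The objective, its gradient and its Hessian would then be:

```latex
\[
\epsilon(x) = \tfrac{1}{2}\,\|Lx - d\|^2 + \tfrac{\lambda}{2}\,\|x\|^2,
\qquad
g = L^T (Lx - d) + \lambda x,
\qquad
H = L^T L + \lambda I
\]
```

Under this assumption CG is applied to $(L^T L + \lambda I)\,x = L^T d$. A larger $\lambda$ favours a small penalty, and a smaller $\lambda$ favours a small misfit.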


Preconditioned Conjugate Gradient

Find a cheap approximate inverse $P \approx H^{-1}$ so that $PH \approx I$. Thus,
Ill-conditioned system of equations: $Hx = -g$
Well-conditioned system of equations: $PHx = -Pg$

A cheap approximate inverse is the diagonal one, $[H^{-1}]_{ii} \approx 1/H_{ii}$.

Warning: PH should be SPD.
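A minimal sketch of CG with the diagonal preconditioner $P_{ii} = 1/H_{ii}$ mentioned above. It follows the standard preconditioned CG recurrence, which may differ in detail from the presenter's implementation. The right-hand side is written as `rhs` (it plays the role of $-g$ on the slide), and the function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(H, v):
    """Product of a dense matrix (list of rows) with a vector."""
    return [dot(row, v) for row in H]


def jacobi_pcg(H, rhs, nit=None, tol=1e-12):
    """Solve H x = rhs for SPD H with the diagonal preconditioner P = 1/diag(H)."""
    n = len(rhs)
    nit = n if nit is None else nit
    P = [1.0 / H[i][i] for i in range(n)]    # cheap approximate inverse of H
    x = [0.0] * n
    r = list(rhs)                            # residual rhs - H x at x = 0
    z = [Pi * ri for Pi, ri in zip(P, r)]    # preconditioned residual
    p = list(z)
    for _ in range(nit):
        if dot(r, r) ** 0.5 < tol:
            break
        rz = dot(r, z)
        Hp = matvec(H, p)
        a = rz / dot(p, Hp)                  # step length along p
        x = [xi + a * pi for xi, pi in zip(x, p)]
        r = [ri - a * hi for ri, hi in zip(r, Hp)]
        z = [Pi * ri for Pi, ri in zip(P, r)]
        b = dot(r, z) / rz                   # preconditioned direction update
        p = [zi + b * pi for zi, pi in zip(z, p)]
    return x


# Badly scaled SPD system whose exact solution is (1, 2).
x = jacobi_pcg([[100.0, 1.0], [1.0, 1.0]], [102.0, 3.0])
```

The diagonal choice helps most when the ill-conditioning comes from poorly scaled variables, as in this small example.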


Non-linear Conjugate Gradient

The objective is only locally quadratic, so reset to the gradient (steepest-descent) direction approximately every 3-5 iterations.
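A minimal sketch of non-linear CG with periodic resets, under stated assumptions: the Polak-Ribiere factor is clipped at zero, the step length comes from a simple backtracking (Armijo) line search instead of the quadratic formula for a, and the direction is reset every 4 iterations, a value picked from the 3-5 range above. The test function and all names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def nonlinear_cg(f, grad, x0, restart=4, max_iter=20000, tol=1e-6):
    """Minimize f with Polak-Ribiere CG, resetting to -g every `restart` steps."""
    x = list(x0)
    g = grad(x)
    p = [-gi for gi in g]
    for k in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        if dot(g, p) >= 0.0:                 # not a descent direction: reset
            p = [-gi for gi in g]
        # Backtracking line search: halve t until f decreases sufficiently.
        t, fx, slope = 1.0, f(x), dot(g, p)
        while t > 1e-16 and f([xi + t * pi for xi, pi in zip(x, p)]) > fx + 0.25 * t * slope:
            t *= 0.5
        x = [xi + t * pi for xi, pi in zip(x, p)]
        g_new = grad(x)
        if (k + 1) % restart == 0:
            b = 0.0                          # periodic reset to the gradient direction
        else:
            diff = [gn - go for gn, go in zip(g_new, g)]
            b = max(0.0, dot(g_new, diff) / dot(g, g))   # clipped Polak-Ribiere
        p = [b * pi - gn for pi, gn in zip(p, g_new)]
        g = g_new
    return x


# A smooth, non-quadratic test function with its minimum at (1, -2).
def f(v):
    return (v[0] - 1.0) ** 4 + (v[0] - 1.0) ** 2 + 2.0 * (v[1] + 2.0) ** 2


def grad_f(v):
    return [4.0 * (v[0] - 1.0) ** 3 + 2.0 * (v[0] - 1.0), 4.0 * (v[1] + 2.0)]


x_min = nonlinear_cg(f, grad_f, [-1.0, 1.0])
```

The reset discards conjugacy information that is no longer reliable once the iterate has moved away from the region where the quadratic model was built.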