Conjugate Gradient


Conjugate Gradient

Problem: SD is too slow to converge if the NxN Hessian H is ill-conditioned.
    SD: dx = -g           (slow, but no inverse to store or compute)
    CG: dx = -p           (fast, and no inverse to compute or store)
    GN: dx = -H^{-1} g    (fast, but expensive)
Solution: Conjugate Gradient converges in at most N iterations if the NxN H is S.P.D.
Quasi-Newton Condition: g' - g = H dx', i.e. (g' - g)/dx' ~ d^2f/dx^2, a finite-difference estimate of H.
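
To make the contrast concrete, here is a minimal Python sketch (my own, not from the slides) comparing steepest descent with the Gauss-Newton step dx = -H^{-1} g on a deliberately ill-conditioned quadratic; the 2x2 Hessian and the starting point are illustrative choices.

```python
import numpy as np

# Ill-conditioned quadratic f(x) = 0.5 x^T H x, with gradient g(x) = H x.
H = np.diag([1.0, 100.0])              # condition number 100
x0 = np.array([100.0, 1.0])            # a starting point that makes SD zig-zag

# Steepest descent: dx = -a g, with the exact line-search step a = g^T g / g^T H g.
x = x0.copy()
for _ in range(50):
    g = H @ x
    a = (g @ g) / (g @ H @ g)
    x = x - a * g
print("SD after 50 iterations :", x)   # still far from the minimum at (0, 0)

# Gauss-Newton step: dx = -H^{-1} g reaches the minimum of a quadratic in one step.
g = H @ x0
print("GN after 1 step        :", x0 - np.linalg.solve(H, g))   # (0, 0) to machine precision
```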

Outline
- CG Algorithm
- Step Length: Polak-Ribiere vs. Fletcher-Reeves
- CG Solution to Even & Overdetermined Equations
- Regularized CG
- Preconditioned CG
- Non-Linear CG

Conjugate Gradient

[Figure: contour plot showing the previous step dx tangent to a contour at its kiss point (dx^T g = 0), the new step dx' reaching the minimum x*, and the gradient direction -g.]

Quasi-Newton Condition: g' - g = H dx'   (1)

At a kiss point the step no longer goes downhill, so dx^T g = 0; likewise dx'^T g' = dx'^T grad f(x*) = 0.

For dx' ending at the bullseye x*, g' = 0. Multiplying eqn. (1) by dx^T and recalling dx^T g = 0, the left side dx^T (g' - g) is zero at the bullseye. Hence,

Conjugacy Condition: 0 = dx^T H dx'   (2)
x' = x + a p, where p is conjugate to the previous direction.   (3)
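
The derivation above can be checked numerically. The sketch below (my own, with an arbitrary SPD H) takes one exact line-search step dx, builds p = b dx - g' with b from the conjugacy condition, and verifies that dx^T g' = 0 at the kiss point and dx^T H dx' = 0 for the next step.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
H = A @ A.T + 3.0 * np.eye(3)            # an arbitrary SPD Hessian
c = rng.standard_normal(3)
grad = lambda x: H @ x + c               # gradient of f(x) = 0.5 x^T H x + c^T x

x = rng.standard_normal(3)
g = grad(x)

# First step: exact line search along -g, so the step ends at a kiss point.
a0 = (g @ g) / (g @ H @ g)
dx = -a0 * g
x = x + dx
g_new = grad(x)
print("kiss point :  dx^T g'   =", dx @ g_new)        # ~0: no longer going downhill

# Next direction p = b dx - g', with b chosen so that dx^T H p = 0.
b = (dx @ H @ g_new) / (dx @ H @ dx)
p = b * dx - g_new
a = -(p @ g_new) / (p @ H @ p)                        # exact step along p
dx_new = a * p
print("conjugacy  :  dx^T H dx' =", dx @ H @ dx_new)  # ~0: eqn (2) holds
```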

Conjugate Gradient

Quasi-Newton Condition: g' - g = H dx'   (1)
Conjugacy Condition: 0 = dx^T H dx'   (2)
x' = x + a p, where p is conjugate to the previous direction and a linear combination of dx and g.   (3)

For i = 1:nit
    find b: solve 0 = dx^T H (b dx - g) so that dx is conjugate to dx', giving
            b = (dx^T H g) / (dx^T H dx)
    p = b dx - g
    find a: solve for a so that dx' kisses the contour (no longer going downhill), giving
            a = -(p^T g) / (p^T H p)
    dx' = a p
    x = x + dx'
end

Conjugate Gradient

For i = 1:nit
    find b:  b = (dx^T H g) / (dx^T H dx)
    p = b dx - g
    find a:  a = -(p^T g) / (p^T H p)
    dx' = a p
    x = x + dx'
end

Recall that a H d^(k-1) = g^(k) - g^(k-1), so products with H can be replaced by differences of gradients.
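
Putting the loop together, here is a short linear CG sketch for solving H x = -g with SPD H. It is a standard-form restatement of the loop above (the variable names and the gradient-update shortcut g' = g + a H p are mine), not code copied from the slides.

```python
import numpy as np

def conjugate_gradient(H, g0, nit=50, tol=1e-10):
    """Minimize f(x) = 0.5 x^T H x + g0^T x, i.e. solve H x = -g0, for SPD H."""
    x = np.zeros_like(g0)
    g = g0.copy()                      # gradient at the starting point x = 0
    p = -g                             # first search direction: steepest descent
    for _ in range(nit):
        Hp = H @ p
        a = -(p @ g) / (p @ Hp)        # step length: kiss the contour along p
        x = x + a * p                  # x' = x + a p
        g_new = g + a * Hp             # quasi-Newton relation: g' = g + H dx'
        if np.linalg.norm(g_new) < tol:
            break
        b = (g_new @ Hp) / (p @ Hp)    # conjugacy: 0 = p_old^T H p_new
        p = -g_new + b * p             # new direction: combination of -g' and old p
        g = g_new
    return x

# Example: 4x4 SPD system, so CG converges in at most N = 4 iterations.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
H = A @ A.T + np.eye(4)
g0 = rng.standard_normal(4)
x = conjugate_gradient(H, g0)
print("residual norm:", np.linalg.norm(H @ x + g0))
```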

Conjugate Gradient

For i = 1:nit
    find b
    p = b dx - g
    find a
    dx' = a p
    x = x + dx'
end

Using a H d^(k-1) = g^(k) - g^(k-1), the coefficient b can be computed from gradients alone, with no Hessian:
    Fletcher-Reeves:  b = (g^(k)^T g^(k)) / (g^(k-1)^T g^(k-1))
    Polak-Ribiere:    b = (g^(k)^T (g^(k) - g^(k-1))) / (g^(k-1)^T g^(k-1))

Note: we are not going downhill if we move perpendicular to the gradient -g.
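
For reference, a small sketch of the two b (beta) formulas named on the slide, computed from gradients only. The formulas are the standard Fletcher-Reeves and Polak-Ribiere expressions; the clipping at zero in the Polak-Ribiere version is a common practical choice, not something stated on the slide.

```python
import numpy as np

def beta_fletcher_reeves(g_new, g_old):
    # b = g_k^T g_k / g_{k-1}^T g_{k-1}
    return (g_new @ g_new) / (g_old @ g_old)

def beta_polak_ribiere(g_new, g_old):
    # b = g_k^T (g_k - g_{k-1}) / g_{k-1}^T g_{k-1}, clipped at 0 (automatic restart)
    return max(0.0, (g_new @ (g_new - g_old)) / (g_old @ g_old))
```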

Conjugate Gradient: Lx = d

[Figure: contour plots of the misfit for Lx = d, showing the gradient -g, successive search directions d^(k-1) and d^(k), the kiss points (dx^T g = 0), and the minimum x*.]

Conjugate Gradient: L^T L x = L^T d

Compared to a square system of equations, the gradient for an overdetermined system of equations has an extra L^T. However, L^T L has a squared condition number.
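
A sketch (my own, standard CGNR-style formulation) of CG applied to the normal equations of an overdetermined system: the extra L^T shows up as one extra matrix-vector product per iteration, and L^T L itself is never formed.

```python
import numpy as np

def cg_normal_equations(L, d, nit=100, tol=1e-10):
    """Solve L^T L x = L^T d by CG, using only products with L and L^T."""
    x = np.zeros(L.shape[1])
    r = L.T @ (d - L @ x)          # residual of the normal equations (= -gradient)
    p = r.copy()
    rr = r @ r
    for _ in range(nit):
        Lp = L @ p
        a = rr / (Lp @ Lp)         # p^T (L^T L) p = ||L p||^2
        x = x + a * p
        r = r - a * (L.T @ Lp)
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

# Example: least-squares fit, checked against numpy's direct solver.
rng = np.random.default_rng(2)
L = rng.standard_normal((30, 5))
d = rng.standard_normal(30)
print(np.allclose(cg_normal_equations(L, d), np.linalg.lstsq(L, d, rcond=None)[0]))
```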

Conjugate Gradient Convergence

[Figure: contours of a well-conditioned problem (good progress in most dimensions) vs. a poorly conditioned problem (slow progress in every dimension).]

If the NxN H of a linear problem is SPD, CG converges in at most N iterations, but in practice much sooner. Stopping sooner is a form of regularization, since it excludes the components associated with small eigenvalues.
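
The early-stopping remark can be illustrated with a small experiment (my own construction, not from the slides): on a noisy, ill-conditioned least-squares problem, the model error of the CG-on-normal-equations iterates typically first decreases and then grows as later iterations start fitting noise in the small-singular-value components.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 40
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -6, n)                        # smoothly decaying singular values
L = U @ np.diag(s) @ V.T
x_true = rng.standard_normal(n)
d = L @ x_true + 1e-3 * rng.standard_normal(n)   # noisy data

# Plain CG on L^T L x = L^T d, tracking the model error per iteration.
x = np.zeros(n)
r = L.T @ (d - L @ x)
p = r.copy()
rr = r @ r
for k in range(1, n + 1):
    Lp = L @ p
    a = rr / (Lp @ Lp)
    x = x + a * p
    r = r - a * (L.T @ Lp)
    rr_new = r @ r
    p = r + (rr_new / rr) * p
    rr = rr_new
    if k % 5 == 0:
        print(k, np.linalg.norm(x - x_true))     # error typically drops, then grows again
```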

Regularized Conjugate Gradient

Balance between the solution that minimizes the misfit and the one that minimizes the penalty term.
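
As a concrete (assumed) example of such a balance, take a simple damping penalty eps^2 ||x||^2; the slide's actual penalty is not visible in the transcript. Minimizing ||L x - d||^2 + eps^2 ||x||^2 is ordinary least squares on an augmented system, so the same normal-equations CG applies; the sketch below uses a direct solver as a stand-in.

```python
import numpy as np

def augment(L, d, eps):
    """Augmented system whose least-squares solution minimizes ||Lx - d||^2 + eps^2 ||x||^2."""
    n = L.shape[1]
    L_aug = np.vstack([L, eps * np.eye(n)])
    d_aug = np.concatenate([d, np.zeros(n)])
    return L_aug, d_aug

# Example: larger eps buys a smaller-norm solution at the price of a larger misfit.
rng = np.random.default_rng(3)
L = rng.standard_normal((30, 5))
d = rng.standard_normal(30)
for eps in (0.0, 1.0, 10.0):
    L_aug, d_aug = augment(L, d, eps)
    x = np.linalg.lstsq(L_aug, d_aug, rcond=None)[0]   # stand-in for the CG solver above
    print(eps, np.linalg.norm(L @ x - d), np.linalg.norm(x))
```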

Preconditioned Conjugate Gradient

Find a cheap approximate inverse P ~ H^{-1} so that PH ~ I. Thus:
    Ill-conditioned system of equations:   H x = -g
    Well-conditioned system of equations:  P H x = -P g

A cheap approximate inverse is [H^{-1}]_ii ~ 1/H_ii (the diagonal, or Jacobi, preconditioner). Warning: PH should be S.P.D.
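
A sketch of preconditioned CG using the diagonal (Jacobi) approximation [H^{-1}]_ii ~ 1/H_ii mentioned above; this is the standard PCG recurrence, with the variable names and the test matrix being my own choices.

```python
import numpy as np

def preconditioned_cg(H, g0, nit=200, tol=1e-10):
    """Solve H x = -g0 for SPD H with a Jacobi (diagonal) preconditioner."""
    P_diag = 1.0 / np.diag(H)          # cheap approximate inverse of H
    x = np.zeros_like(g0)
    r = -g0                            # residual of H x = -g0 at x = 0
    z = P_diag * r                     # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(nit):
        Hp = H @ p
        a = rz / (p @ Hp)
        x = x + a * p
        r = r - a * Hp
        if np.linalg.norm(r) < tol:
            break
        z = P_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Example: SPD matrix with a badly scaled diagonal, where diagonal scaling usually helps.
rng = np.random.default_rng(4)
A = rng.standard_normal((30, 30))
H = A @ A.T + np.diag(np.logspace(0, 3, 30))
g0 = rng.standard_normal(30)
x = preconditioned_cg(H, g0)
print("relative residual:", np.linalg.norm(H @ x + g0) / np.linalg.norm(g0))
```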

Non-linear Conjugate Gradient

The objective is only locally quadratic, so reset the search direction to the gradient direction after approximately every 3-5 iterations.
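
A sketch of non-linear CG with the periodic restart described above; the Rosenbrock test function, the backtracking line search, and the (clipped) Polak-Ribiere beta are illustrative assumptions, not details from the slides.

```python
import numpy as np

def nonlinear_cg(f, grad, x0, nit=2000, restart_every=5, tol=1e-8):
    x = x0.astype(float).copy()
    g = grad(x)
    p = -g
    for k in range(nit):
        if g @ p >= 0:                 # safeguard: make sure p goes downhill
            p = -g
        a = 1.0                        # crude Armijo backtracking line search
        while a > 1e-12 and f(x + a * p) > f(x) + 1e-4 * a * (g @ p):
            a *= 0.5
        x = x + a * p
        g_new = grad(x)
        if np.linalg.norm(g_new) < tol:
            break
        if (k + 1) % restart_every == 0:
            p = -g_new                 # reset to the gradient direction
        else:
            b = max(0.0, g_new @ (g_new - g) / (g @ g))   # clipped Polak-Ribiere
            p = -g_new + b * p
        g = g_new
    return x

# Rosenbrock function: only locally quadratic, so restarts keep CG well behaved.
f = lambda x: (1 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([-2*(1 - x[0]) - 400*x[0]*(x[1] - x[0]**2),
                           200*(x[1] - x[0]**2)])
print(nonlinear_cg(f, grad, np.array([-1.2, 1.0])))   # should end up near (1, 1)
```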