Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS5321 Numerical Optimization

Similar presentations


Presentation on theme: "CS5321 Numerical Optimization"— Presentation transcript:

1 CS5321 Numerical Optimization
05 Conjugate Gradient Methods 12/30/2018

2 Conjugate gradient methods
For convex quadratic problems, the steepest descent method is slow in convergence. the Newton’s method is expensive in solving Ax=b. the conjugate gradient method solves Ax=b iteratively. Outline Conjugate directions Linear conjugate gradient method Nonlinear conjugate gradient method 12/30/2018

3 Quadratic optimization problem
Consider the quadratic optimization problem A is symmetric positive definite The optimal solution is at f (x) = 0 Define r(x) = f (x) = Ax − b (the residual). Solve Ax=b without inverting A. (Iterative method) m i n f ( x ) = 1 2 T A b r ( 1 2 x T A b ) = 12/30/2018

4 Steepest descent+line search
Given an initial guess x0. The search direction: pk= −fk= −rk= b −Axk The optimal step length: The optimal solution is Update xk+1= xk + kpk. Goto 2. m i n f ( x ) = 1 2 T A b m i n f ( x k + p ) k = r T p A 12/30/2018

5 Conjugate direction For a symmetric positive definite matrix A, one can define A-inner-product as x, yA= xTAy. A-norm is defined as Two vectors x and y are A-conjugate for a symmetric positive definite matrix A if xTAy=0 x and y are orthogonal under A-inner-product. The conjugate directions are a set of search directions {p0, p1, p2,…}, such that piApj=0 for any i  j. k x A = p T 12/30/2018

6 Example A = µ 1 4 ¶ ; b x Conjugate direction Steepest descent
; b x Conjugate direction Steepest descent 12/30/2018

7 Conjugate gradient A better result can be obtained if the current search direction combines the previous one pk+1= −rk + kpk Let pk+1 be A-conjugate to pk. ( ) p T k + 1 A = p T k + 1 A = r k = p T A r + 1 12/30/2018

8 The linear CG algorithm
With some linear algebra, the algorithm can be simplified as Given x0, r0=Ax0−b, p0= −r0 For k = 0,1,2,… until ||rk|| = 0 k = r T p A x + 1 12/30/2018

9 Properties of linear CG
One matrix-vector multiplication per iteration. Only four vectors are required. (xk, rk, pk, Apk) Matrix A can be stored implicitly The CG guarantees convergence in r iterations, where r is the number of distinct eigenvalues of A If A has eigenvalues 1  2  …  n, k x + 1 2 A n 12/30/2018

10 CG for nonlinear optimization
The Fletcher-Reeves method Given x0. Set p0= −f0, For k = 0,1,…, until f0=0 Compute optimal step length k and set xk+1=xk+ kpk Evaluate fk+1 k + 1 = r f T p 12/30/2018

11 Other choices of  Polak-Ribiere: Hestens-Siefel: Y.Dai and Y.Yuan
k + 1 = r f T ( ) Polak-Ribiere: Hestens-Siefel: Y.Dai and Y.Yuan (1999) WW.Hager and H.Zhang (2005) k + 1 = r f T ( ) p k + 1 = r f 2 ( ) T p k + 1 = y 2 p T r f 12/30/2018


Download ppt "CS5321 Numerical Optimization"

Similar presentations


Ads by Google