®Copyright of Shun-Feng Su

Outline
Preface
Fundamentals of Optimization
Unconstrained Optimization
Ideas of finding solutions
One-Dimensional Search
Gradient Methods
Newton's Method and Its Variations
Gradient Methods
The incremental approach is to find which direction can improve the current situation based on the current error (a back-forward approach). Usually, an incremental approach updates the parameter vector as x(k+1) = x(k) + Δx. In fact, such an approach is usually realized as a gradient approach; that is, Δx = −α ∂f(x)/∂x. We need a relationship between the current error and the change of the variable considered; that is why Δx = −α ∂f(x)/∂x is employed.
Sept, 2010
Gradient Methods
These methods use the gradient of the given function in searching for the minimizer. The gradient points in the direction in which, for a given small displacement, the function increases more than in any other direction: for any d with ‖d‖ = 1, ⟨∇f, d⟩ ≤ ‖∇f‖ (Cauchy–Schwarz inequality), while ⟨∇f, ∇f/‖∇f‖⟩ = ‖∇f‖. Note we are now considering multi-variable functions; ⟨·,·⟩ denotes the inner product.
Gradient Methods
Thus, the iteration algorithm is x(k+1) = x(k) − αk∇f(x(k)), where αk is called the step size. This is often referred to as the gradient descent algorithm. The issue is how to select αk. Usually, it is a constant and is selected in an ad hoc manner: too small → long searching time; too large → zigzag path to the minimizer.
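The fixed-step iteration above can be sketched as follows (a minimal sketch; the quadratic test function and all names are illustrative assumptions, not from the slides):

```python
import numpy as np

def gradient_descent(grad, x0, alpha, num_iters=1000):
    """Fixed-step gradient descent: x(k+1) = x(k) - alpha * grad(x(k))."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iters):
        x = x - alpha * grad(x)
    return x

# Illustrative quadratic f(x) = x1^2 + 10*x2^2, minimizer at the origin.
grad = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
x_small = gradient_descent(grad, [1.0, 1.0], alpha=0.01)  # small step: slow
x_ok = gradient_descent(grad, [1.0, 1.0], alpha=0.09)     # larger step: faster, zigzags
```

With the small step the iterate creeps toward the origin; with the larger step it zigzags but still converges (for this particular function, any fixed step above 0.1 would diverge).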
Gradient Methods
Idea of level sets: a sequence of steepest-descent steps (figure).
Gradient Methods
Steepest descent selects αk to achieve the maximum amount of decrease of the function, i.e., to minimize φk(α) = f(x(k) − α∇f(x(k))):
αk = arg min α≥0 f(x(k) − α∇f(x(k)))
"arg" means the argument that achieves what is required; "arg min α≥0" means the value of α that achieves the minimum over α ≥ 0. Thus, we can conduct a line search in the direction of −∇f(x(k)) to find x(k+1). This is called the steepest descent method.
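The line search for αk can be implemented with any one-dimensional method; here is a golden-section sketch (the bracket [0, 1], the test function, and all names are assumptions for illustration):

```python
import numpy as np

def golden_section(phi, a=0.0, b=1.0, tol=1e-8):
    """Minimize a unimodal 1-D function phi on [a, b] by golden-section search."""
    r = (np.sqrt(5.0) - 1.0) / 2.0  # golden ratio conjugate, ~0.618
    c, d = b - r * (b - a), a + r * (b - a)
    while (b - a) > tol:
        if phi(c) < phi(d):
            b, d = d, c               # minimizer lies in [a, d]
            c = b - r * (b - a)
        else:
            a, c = c, d               # minimizer lies in [c, b]
            d = a + r * (b - a)
    return (a + b) / 2.0

def steepest_descent_step(f, grad, x):
    """One steepest-descent step: minimize phi(alpha) = f(x - alpha * grad f(x))."""
    g = grad(x)
    phi = lambda alpha: f(x - alpha * g)
    return x - golden_section(phi) * g

# Illustrative quadratic; the bracket [0, 1] happens to contain alpha_k here.
f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
grad = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
x0 = np.array([1.0, 1.0])
x1 = steepest_descent_step(f, grad, x0)
x2 = steepest_descent_step(f, grad, x1)
```

Each step decreases f, and consecutive steps come out (numerically) orthogonal, matching the orthogonality property proved on the next slide.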
Gradient Methods
If {x(k)} (k = 0, 1, …) is a steepest descent sequence for a given function, then for each k, (x(k+1) − x(k)) is orthogonal to (x(k+2) − x(k+1)). Orthogonal means ⟨x(k+1) − x(k), x(k+2) − x(k+1)⟩ = 0.
Proof: ⟨x(k+1) − x(k), x(k+2) − x(k+1)⟩ = αk αk+1 ⟨∇f(x(k)), ∇f(x(k+1))⟩. Note αk = arg min α≥0 f(x(k) − α∇f(x(k))) = arg min α≥0 φk(α). By the FONC, φk′(αk) = 0 = ∇f(x(k) − αk∇f(x(k)))ᵀ(−∇f(x(k))) = −⟨∇f(x(k+1)), ∇f(x(k))⟩. The proof is complete.
Gradient Methods
Let {x(k)} (k = 0, 1, …) be a steepest descent sequence for a given function. If ∇f(x(k)) ≠ 0, then f(x(k+1)) < f(x(k)).
Proof: x(k+1) = x(k) − αk∇f(x(k)) and αk = arg min α≥0 φk(α). Thus, φk(αk) ≤ φk(α) for all α ≥ 0. It is easy to see f(x(k+1)) = φk(αk) ≤ f(x(k)) = φk(0). (Not sufficient: this only gives ≤, not strict decrease.)
Gradient Methods
Let {x(k)} (k = 0, 1, …) be a steepest descent sequence for a given function. If ∇f(x(k)) ≠ 0, then f(x(k+1)) < f(x(k)).
Proof: x(k+1) = x(k) − αk∇f(x(k)) and αk = arg min α≥0 φk(α). Thus, φk(αk) ≤ φk(α) for all α ≥ 0, and f(x(k+1)) = φk(αk) ≤ f(x(k)) = φk(0). Consider φk′(0) = ∇f(x(k))ᵀ(−∇f(x(k))) = −‖∇f(x(k))‖². Since ∇f(x(k)) ≠ 0, φk′(0) < 0. This implies there exists an ᾱ > 0 such that φk(ᾱ) < φk(0). Hence f(x(k+1)) = φk(αk) ≤ φk(ᾱ) < φk(0) = f(x(k)). The proof is complete.
Gradient Methods
If ∇f(x(k)) = 0, then x(k+1) = x(k) and f(x(k+1)) = f(x(k)). It means x(k) satisfies the FONC, so ∇f(x(k)) = 0 is a stopping (termination) criterion. However, this criterion is not directly suitable as a practical stopping criterion because ∇f(x(k)) = 0 may not be obtained exactly in practical cases. A practical stopping criterion is to check whether ‖∇f(x(k))‖ is less than a pre-specified threshold ε, or to check whether |f(x(k+1)) − f(x(k))| < ε (or, relatively, divided by |f(x(k))|). Another alternative is ‖x(k+1) − x(k)‖ < ε (or, relatively, divided by ‖x(k)‖). The relative versions are preferable.
Gradient Methods
Relative criteria are preferable because they are scale-independent (scaling the objective function does not change whether the criterion is satisfied). A relative criterion like |f(x(k+1)) − f(x(k))| / |f(x(k))| < ε may encounter problems when |f(x(k))| is very small. Thus, sometimes we use |f(x(k+1)) − f(x(k))| / max(1, |f(x(k))|) < ε.
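The guarded relative test can be sketched as (the function name and the default ε are assumptions):

```python
def should_stop(f_prev, f_curr, eps=1e-8):
    """Relative-decrease stopping test, guarded against very small |f(k)|:
    |f(k+1) - f(k)| / max(1, |f(k)|) < eps."""
    return abs(f_curr - f_prev) / max(1.0, abs(f_prev)) < eps
```

The max(1, ·) guard keeps the ratio finite when |f(x(k))| is near zero, while leaving the scale-independent relative test intact for |f(x(k))| ≥ 1.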
Gradient Methods
Example: consider f(x) = (x1 − 4)⁴ + (x2 − 3)² + 4(x3 + 5)⁴.
Ans: Let the initial point be x(0) = [4, 2, −1]ᵀ. ∇f(x) = [4(x1 − 4)³, 2(x2 − 3), 16(x3 + 5)³]ᵀ, so ∇f(x(0)) = [0, −2, 1024]ᵀ. α0 = arg min α≥0 f(x(0) − α∇f(x(0))); by using the secant method, α0 = 3.967×10⁻³. x(1) = [4.0, 2.008, −5.062]ᵀ.
Any method can be used to perform the one-dimensional search for αk (here, the secant method).
Gradient Methods
∇f(x(1)) = [0, −1.984, −0.003875]ᵀ. α1 = arg min α≥0 f(x(1) − α∇f(x(1))) = 0.5, so x(2) = [4.0, 3.0, −5.060]ᵀ. ∇f(x(2)) ≈ [0.0, 0.0, −0.0035]ᵀ. α2 = arg min α≥0 f(x(2) − α∇f(x(2))) = 16.29, so x(3) ≈ [4.0, 3.0, −5.003]ᵀ. Note that the minimizer is [4, 3, −5]ᵀ. In three iterations, the sequence almost reaches the minimizer.
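To sanity-check the numbers, here is a sketch that assumes the objective f(x) = (x1 − 4)⁴ + (x2 − 3)² + 4(x3 + 5)⁴, which is consistent with the stated gradient values and the minimizer [4, 3, −5]ᵀ:

```python
import numpy as np

# Assumed objective, consistent with the stated gradient and minimizer.
f = lambda x: (x[0] - 4) ** 4 + (x[1] - 3) ** 2 + 4 * (x[2] + 5) ** 4
grad = lambda x: np.array([4 * (x[0] - 4) ** 3,
                           2 * (x[1] - 3),
                           16 * (x[2] + 5) ** 3])

x0 = np.array([4.0, 2.0, -1.0])
g0 = grad(x0)                # matches the slides: [0, -2, 1024]
x1 = x0 - 3.967e-3 * g0      # step size alpha_0 found by the secant method
```

Here g0 reproduces [0, −2, 1024]ᵀ, x1 ≈ [4.0, 2.008, −5.062]ᵀ, and f drops from 1025 to about 0.98 in one step.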
Gradient Methods
Consider a quadratic function in steepest descent: f(x) = ½xᵀQx − bᵀx, so ∇f(x) = Qx − b. Assume Q is a symmetric matrix. (If not, say A ≠ Aᵀ: since xᵀAx is a scalar, xᵀAx = (xᵀAx)ᵀ = xᵀAᵀx, so xᵀAx = ½(xᵀAx + xᵀAᵀx) = ½xᵀ(A + Aᵀ)x = ½xᵀQx with Q = A + Aᵀ symmetric.) The Hessian matrix of f is H(x) = ∇²f(x) = Q. The steepest descent iteration is x(k+1) = x(k) − αk∇f(x(k)).
Gradient Methods
Gradient Methods
To find arg min α≥0 f(x(k) − αg(k)): define g(k) = ∇f(x(k)) and φk(α) = f(x(k) − αg(k)). Assume g(k) ≠ 0 (if g(k) = 0, x(k) = x*).
φk(α) = ½(x(k) − αg(k))ᵀQ(x(k) − αg(k)) − (x(k) − αg(k))ᵀb
φk′(α) = (x(k) − αg(k))ᵀQ(−g(k)) − bᵀ(−g(k))
Setting φk′(αk) = 0, we obtain the explicit formula αk = (g(k)ᵀg(k)) / (g(k)ᵀQg(k)), or
x(k+1) = x(k) − [(g(k)ᵀg(k)) / (g(k)ᵀQg(k))] g(k).
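The closed-form step size can be sketched directly (Q and b below are illustrative data, not from the slides):

```python
import numpy as np

def steepest_descent_quadratic(Q, b, x0, num_iters=50):
    """Steepest descent on f(x) = 0.5 x^T Q x - b^T x using the closed-form
    step alpha_k = (g^T g) / (g^T Q g), where g = grad f(x) = Q x - b."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iters):
        g = Q @ x - b
        if np.linalg.norm(g) < 1e-12:  # g = 0 means x already satisfies Qx = b
            break
        x = x - (g @ g) / (g @ Q @ g) * g
    return x

Q = np.array([[4.0, 1.0], [1.0, 3.0]])  # illustrative symmetric positive definite Q
b = np.array([1.0, 2.0])
x_star = steepest_descent_quadratic(Q, b, [0.0, 0.0])
```

The minimizer of this quadratic satisfies Qx = b, so the result can be checked against np.linalg.solve(Q, b).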
Gradient Methods
Note that the above (an explicit form of the steepest descent approach) is for the quadratic form only. There are also analyses of its convergence property and convergence rate, but the quadratic form is usually a simple problem. If you are studying convergence properties, you can check those details in the references.
Gradient Methods
An important result concerns the fixed-step-size gradient algorithm (still for a quadratic form): for a fixed step size α, x(k) → x* for any x(0) if and only if 0 < α < 2/λmax(Q), where λmax(Q) denotes the maximal eigenvalue of Q. Note that this holds only for the quadratic form, but it can also be used in convergence analysis for general problems.
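A quick numerical check of the 2/λmax threshold (Q is illustrative; λmax is hard-coded for this diagonal example):

```python
import numpy as np

Q = np.diag([1.0, 10.0])      # f(x) = 0.5 x^T Q x, so grad f(x) = Q x (b = 0)
lam_max = 10.0                # largest eigenvalue of this diagonal Q
x0 = np.array([1.0, 1.0])

def run(alpha, iters=500):
    """Fixed-step gradient descent; returns the norm of the final iterate."""
    x = x0.copy()
    for _ in range(iters):
        x = x - alpha * (Q @ x)
    return np.linalg.norm(x)

converged = run(0.9 * 2.0 / lam_max)  # alpha < 2/lam_max: converges to x* = 0
diverged = run(1.1 * 2.0 / lam_max)   # alpha > 2/lam_max: blows up
```

Just below the threshold the iterate shrinks to the minimizer x* = 0; just above it, the component along the largest eigenvalue grows without bound.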
Gradient Methods
Gradient Methods
Selected homework in Prob 3: 8.5, 8.6, 8.13, 8.14, and 8.17.