CS5321 Numerical Optimization 17 Penalty and Augmented Lagrangian Methods 11/28/2018
Outline
Both active-set methods and interior-point methods require a feasible initial point. Penalty methods do not.
- Quadratic penalty method
- Nonsmooth exact penalty method
- Augmented Lagrangian methods
1. Quadratic penalty function
For the equality-constrained problem

    \min_x f(x) \quad \text{s.t.} \quad c_i(x) = 0, \; i \in E,

the quadratic penalty function is

    Q(x; \mu) = f(x) + \frac{\mu}{2} \sum_{i \in E} c_i^2(x),

where \mu > 0 is the penalty parameter. For the problem with both equality and inequality constraints,

    \min_x f(x) \quad \text{s.t.} \quad c_i(x) = 0, \; i \in E; \quad c_i(x) \ge 0, \; i \in I,

it becomes

    Q(x; \mu) = f(x) + \frac{\mu}{2} \sum_{i \in E} c_i^2(x) + \frac{\mu}{2} \sum_{i \in I} \left([c_i(x)]^-\right)^2,

where [y]^- = \max\{0, -y\}.
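As a concrete illustration (not from the slides), here is a minimal sketch of Q(x; \mu) for a toy equality-constrained problem; the objective and constraint below are assumptions chosen for illustration only:

```python
import numpy as np

# Toy equality-constrained problem (chosen for illustration):
#   min f(x) = x1 + x2   s.t.   c(x) = x1^2 + x2^2 - 2 = 0,
# whose solution is x* = (-1, -1).

def f(x):
    return x[0] + x[1]

def c(x):
    return x[0]**2 + x[1]**2 - 2.0

def Q(x, mu):
    # Quadratic penalty: Q(x; mu) = f(x) + (mu/2) * c(x)^2
    return f(x) + 0.5 * mu * c(x)**2
```

At any feasible point the penalty term vanishes, so Q(x; \mu) = f(x) there; infeasible points are penalized in proportion to \mu.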
Quadratic penalty method
Given \mu_0 > 0, a tolerance sequence \{\tau_k\} with \tau_k \to 0, and a starting point x_0^s:
For k = 0, 1, 2, \ldots
(a) Find an approximate minimizer x_k of Q(\,\cdot\,; \mu_k), starting at x_k^s, such that \|\nabla_x Q(x_k; \mu_k)\| \le \tau_k.
(b) If the final convergence test is satisfied, stop.
(c) Choose \mu_{k+1} > \mu_k and a new starting point x_{k+1}^s.
Theorem 17.1: If each x_k is the exact global minimizer of Q(\,\cdot\,; \mu_k) in step (a) and \mu_k \to \infty, then every limit point of \{x_k\} is a global solution of the original problem.
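A minimal sketch of this framework in Python, assuming SciPy's BFGS for the inner minimization and a toy problem (min x1+x2 s.t. x1^2+x2^2 = 2, solution (-1,-1)); the update factors 10 for \mu and 0.1 for \tau are illustrative choices, not prescribed by the method:

```python
import numpy as np
from scipy.optimize import minimize

def Q(x, mu):
    # Quadratic penalty for  min x1+x2  s.t.  x1^2 + x2^2 - 2 = 0
    cval = x[0]**2 + x[1]**2 - 2.0
    return x[0] + x[1] + 0.5 * mu * cval**2

def grad_Q(x, mu):
    # Analytic gradient: grad f + mu * c(x) * grad c(x)
    cval = x[0]**2 + x[1]**2 - 2.0
    return np.array([1.0, 1.0]) + mu * cval * 2.0 * x

def quadratic_penalty(x0, mu=1.0, tau=1.0, n_outer=8):
    x = np.asarray(x0, dtype=float)
    for _ in range(n_outer):
        # (a) approximately minimize Q(.; mu_k) to tolerance tau_k,
        #     warm-starting from the previous solution
        res = minimize(Q, x, args=(mu,), jac=grad_Q,
                       method="BFGS", options={"gtol": tau})
        x = res.x
        mu *= 10.0    # (c) choose mu_{k+1} > mu_k
        tau *= 0.1    # tighten the subproblem tolerance
    return x

x = quadratic_penalty([-1.5, -1.5])   # approaches (-1, -1)
```

Warm-starting each subproblem from the previous solution is what keeps the increasingly ill-conditioned inner minimizations tractable.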
The Hessian matrix
Let A(x)^T = [\nabla c_i(x)]_{i \in E}. The Hessian of Q is

    \nabla_{xx}^2 Q(x; \mu_k) = \nabla^2 f(x) + \sum_{i \in E} \mu_k c_i(x) \nabla^2 c_i(x) + \mu_k A(x)^T A(x).

Step (a) needs to solve the Newton system \nabla_{xx}^2 Q(x; \mu_k)\, p = -\nabla_x Q(x; \mu_k).
Since A^T A only has rank m (m < n), the system becomes ill-conditioned as \mu_k increases.
Alternative: introduce \zeta = \mu_k (A(x) p + c(x)) and solve a larger system with better conditioning:

    \begin{bmatrix} \nabla^2 f(x) + \sum_{i \in E} \mu_k c_i(x) \nabla^2 c_i(x) & A(x)^T \\ A(x) & -\frac{1}{\mu_k} I \end{bmatrix} \begin{bmatrix} p \\ \zeta \end{bmatrix} = \begin{bmatrix} -\nabla f(x) \\ -c(x) \end{bmatrix}
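To see the ill-conditioning numerically, a sketch on the same assumed toy problem (min x1+x2 s.t. x1^2+x2^2 = 2), using the near-minimizer x = (-1,-1) and the estimate c(x_k) \approx -\lambda^*/\mu with \lambda^* = -1/2 (values worked out by hand for this example): the Hessian's condition number grows like \mu, while the larger system's stays bounded.

```python
import numpy as np

x = np.array([-1.0, -1.0])             # near-minimizer of Q for large mu
conds_H, conds_K = [], []
for mu in [1e2, 1e4, 1e6, 1e8]:
    cval = 0.5 / mu                    # c(x_k) ~ -lambda*/mu, lambda* = -1/2
    A = 2.0 * x.reshape(1, 2)          # constraint Jacobian  grad c(x)^T
    B = 2.0 * mu * cval * np.eye(2)    # grad^2 f + mu*c*grad^2 c (grad^2 f = 0)
    H = B + mu * A.T @ A               # full Hessian of Q(.; mu)
    K = np.block([[B, A.T],            # augmented system matrix
                  [A, -np.eye(1) / mu]])
    conds_H.append(np.linalg.cond(H))
    conds_K.append(np.linalg.cond(K))
```

Here cond(H) grows roughly like 8\mu, while cond(K) stays a small constant, which is the point of the reformulation.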
2. Nonsmooth penalty function
The \ell_1 exact penalty function is

    \phi_1(x; \mu) = f(x) + \mu \sum_{i \in E} |c_i(x)| + \mu \sum_{i \in I} [c_i(x)]^-,

where [y]^- = \max\{0, -y\}. \phi_1 is not differentiable, but the functions inside |\cdot| and [\cdot]^- are. Approximate them by linear functions (with W a Hessian approximation of the Lagrangian):

    q(p; \mu) = f(x) + \nabla f(x)^T p + \frac{1}{2} p^T W p + \mu \sum_{i \in E} |c_i(x) + \nabla c_i(x)^T p| + \mu \sum_{i \in I} [c_i(x) + \nabla c_i(x)^T p]^-.
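A small sketch of evaluating \phi_1; the problem data below are assumptions for illustration (objective x1+x2, one equality constraint x1^2+x2^2-2 = 0, one inequality x1+1 >= 0):

```python
def neg_part(y):
    # [y]^- = max{0, -y}: how far y falls short of being nonnegative
    return max(0.0, -y)

def phi1(x, mu):
    # phi_1(x; mu) = f(x) + mu*sum |c_i| over E + mu*sum [c_i]^- over I
    fval = x[0] + x[1]
    c_eq = [x[0]**2 + x[1]**2 - 2.0]   # equality constraints (i in E)
    c_ineq = [x[0] + 1.0]              # inequality constraints (i in I)
    return (fval
            + mu * sum(abs(ci) for ci in c_eq)
            + mu * sum(neg_part(ci) for ci in c_ineq))
```

Unlike the quadratic penalty, \phi_1 is exact: for \mu above a finite threshold its minimizer coincides with the constrained solution, but the kinks at c_i(x) = 0 make it nonsmooth.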
Smoothed objective function
The model q(p; \mu) can be rewritten as a smooth quadratic program by introducing artificial variables r, s, t \ge 0:

    \min_{p, r, s, t} \; f(x) + \nabla f(x)^T p + \frac{1}{2} p^T W p + \mu \sum_{i \in E} (r_i + s_i) + \mu \sum_{i \in I} t_i

    \text{s.t.} \quad \nabla c_i(x)^T p + c_i(x) = r_i - s_i, \; i \in E; \qquad \nabla c_i(x)^T p + c_i(x) \ge -t_i, \; i \in I; \qquad r, s, t \ge 0.
3. Augmented Lagrangian
For the equality-constrained problem \min_x f(x) \; \text{s.t.} \; c_i(x) = 0, \; i \in E, define the augmented Lagrangian function

    L_A(x; \lambda, \mu) = f(x) - \sum_{i \in E} \lambda_i c_i(x) + \frac{\mu}{2} \sum_{i \in E} c_i^2(x).

Theorem 17.5: If x^* is a solution of the original problem at which the gradients \nabla c_i(x^*) are linearly independent and the second-order sufficient conditions are satisfied with multipliers \lambda^*, then there is a \mu^* such that for all \mu \ge \mu^*, x^* is a strict local minimizer of L_A(x; \lambda^*, \mu).
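A sketch illustrating the theorem on the same assumed toy problem (min x1+x2 s.t. x1^2+x2^2 = 2, with x^* = (-1,-1) and \lambda^* = -1/2 obtained by solving the KKT conditions by hand): with \lambda = \lambda^*, the gradient of L_A vanishes at x^* for every \mu, since c(x^*) = 0 kills the penalty term.

```python
import numpy as np

def grad_LA(x, lam, mu):
    # grad L_A = grad f - (lam - mu*c(x)) * grad c(x)
    # for f = x1 + x2,  c = x1^2 + x2^2 - 2
    cval = x[0]**2 + x[1]**2 - 2.0
    return np.array([1.0, 1.0]) - (lam - mu * cval) * 2.0 * x

x_star, lam_star = np.array([-1.0, -1.0]), -0.5
gradients = [grad_LA(x_star, lam_star, mu) for mu in (0.0, 1.0, 100.0)]
# c(x*) = 0, so each gradient reduces to grad f - lam* grad c = 0
```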
Lagrange multiplier
The gradient of L_A(x; \lambda, \mu) is

    \nabla_x L_A(x; \lambda, \mu) = \nabla f(x) - \sum_{i \in E} [\lambda_i - \mu c_i(x)] \nabla c_i(x).

Comparing with the optimality condition of the Lagrangian, \nabla f(x^*) = \sum_{i \in E} \lambda_i^* \nabla c_i(x^*), the optimal multiplier satisfies \lambda_i^* \approx (\lambda_i)_k - \mu c_i(x_k). Thus

    c_i(x_k) \approx -\frac{1}{\mu} [\lambda_i^* - (\lambda_i)_k].

To make c_i(x) \to 0, either \mu \to \infty or (\lambda_i)_k \to \lambda_i^*. The previous penalty methods use only \mu \to \infty to make c_i(x) \to 0. The multiplier estimate can instead be updated as

    (\lambda_i)_{k+1} = (\lambda_i)_k - \mu_k c_i(x_k).
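Putting the update together, a minimal sketch of the augmented Lagrangian loop (same assumed toy problem; SciPy's BFGS for the inner minimization; \mu held fixed at 10 to emphasize that the multiplier update, not \mu \to \infty, drives c(x) \to 0):

```python
import numpy as np
from scipy.optimize import minimize

def LA(x, lam, mu):
    # L_A(x; lam, mu) for  min x1+x2  s.t.  c(x) = x1^2 + x2^2 - 2 = 0
    cval = x[0]**2 + x[1]**2 - 2.0
    return x[0] + x[1] - lam * cval + 0.5 * mu * cval**2

def grad_LA(x, lam, mu):
    cval = x[0]**2 + x[1]**2 - 2.0
    return np.array([1.0, 1.0]) - (lam - mu * cval) * 2.0 * x

def augmented_lagrangian(x0, lam=0.0, mu=10.0, n_outer=20):
    x = np.asarray(x0, dtype=float)
    for _ in range(n_outer):
        # inner problem: minimize L_A(.; lam_k, mu) from a warm start
        x = minimize(LA, x, args=(lam, mu), jac=grad_LA,
                     method="BFGS").x
        cval = x[0]**2 + x[1]**2 - 2.0
        lam = lam - mu * cval          # (lam)_{k+1} = (lam)_k - mu*c(x_k)
    return x, lam

x, lam = augmented_lagrangian([-1.5, -1.5])
```

With \mu fixed, the constraint violation is driven to zero by \lambda_k converging to \lambda^* = -1/2, avoiding the ill-conditioning that \mu \to \infty causes in the pure penalty method.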
Inequality constraints
For inequality constraints, add slack variables:

    c_i(x) - s_i = 0, \quad s_i \ge 0, \quad \text{for all } i \in I.

Bound-constrained Lagrangian (BCL) formulation:

    \min_x L_A(x; \lambda, \mu) = f(x) - \sum_{i=1}^m \lambda_i c_i(x) + \frac{\mu}{2} \sum_{i=1}^m c_i^2(x) \quad \text{s.t.} \quad l \le x \le u.

How to solve this with the gradient projection method will be discussed in Chapter 18.