Chapter 3: Convex Functions and Separation Theorems

In this chapter we focus mainly on convex functions and their properties in relation to optimization.

4.1 Convex Functions

Definition: Let S ⊆ ℝⁿ be a convex set. A function f : S → ℝ is said to be convex (over S) if for any x₁, x₂ ∈ S and any λ ∈ [0,1],

f(λx₁ + (1−λ)x₂) ≤ λf(x₁) + (1−λ)f(x₂).

f is said to be concave iff −f is convex; so, f is concave iff

f(λx₁ + (1−λ)x₂) ≥ λf(x₁) + (1−λ)f(x₂).

f is strictly convex if for every distinct x₁, x₂ ∈ S and every λ ∈ (0,1),

f(λx₁ + (1−λ)x₂) < λf(x₁) + (1−λ)f(x₂).

[Figures: graphs y = f(x) of a convex function, a concave function, and a function that is neither convex nor concave.]
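The defining inequality is easy to probe numerically. The following sketch (a sanity check, not a proof; the function name, sample range, and tolerance are illustrative choices of mine) tests the inequality on random pairs of points: one failed pair disproves convexity, while passing every trial is merely evidence.

```python
import numpy as np

def is_convex_on_samples(f, points, n_trials=1000, seed=0):
    """Test f(lam*x1 + (1-lam)*x2) <= lam*f(x1) + (1-lam)*f(x2)
    on randomly chosen pairs from `points`. A single failure
    disproves convexity; passing all trials is only evidence."""
    rng = np.random.default_rng(seed)
    for _ in range(n_trials):
        x1, x2 = points[rng.integers(len(points), size=2)]
        lam = rng.uniform()
        lhs = f(lam * x1 + (1 - lam) * x2)
        rhs = lam * f(x1) + (1 - lam) * f(x2)
        if lhs > rhs + 1e-9:              # small tolerance for rounding
            return False
    return True

pts = np.random.default_rng(1).uniform(-5, 5, size=(200, 2))
print(is_convex_on_samples(lambda x: np.sum(x**2), pts))   # True: convex
print(is_convex_on_samples(lambda x: -np.sum(x**2), pts))  # False: concave
```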

Examples:

1. Given x₀ ∈ ℝⁿ, let d : ℝⁿ → ℝ be given by d(x) = ||x − x₀||. Then d is a convex function.

2. Let q(x) = xᵀAx, where A is an n × n symmetric matrix. Then (i) q is convex if A is positive semidefinite; (ii) q is strictly convex if A is positive definite. [Note: A PSD ⟹ 2xᵀAy ≤ xᵀAx + yᵀAy, since (x − y)ᵀA(x − y) ≥ 0; the inequality is strict when A is PD and x ≠ y.]

3. Let f(x) = aᵀx + b, where a ∈ ℝⁿ, b ∈ ℝ (an affine function). Then f is convex. (In fact, an affine function is also concave.)

Note: f : S → ℝ is convex iff for all points x₁, x₂, …, xₘ in S and all λ₁, …, λₘ ≥ 0 with λ₁ + ⋯ + λₘ = 1, f(λ₁x₁ + ⋯ + λₘxₘ) ≤ λ₁f(x₁) + ⋯ + λₘf(xₘ) (Jensen's inequality).
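For Example 2, definiteness of a symmetric matrix is read off from its eigenvalues, so the convexity class of q(x) = xᵀAx can be checked mechanically. A small sketch (the function name and tolerance are my own):

```python
import numpy as np

def quadratic_form_convexity(A, tol=1e-10):
    """Classify q(x) = x^T A x from the eigenvalues of symmetric A:
    all > 0  -> strictly convex (A positive definite),
    all >= 0 -> convex (A positive semidefinite)."""
    A = np.asarray(A, dtype=float)
    eigs = np.linalg.eigvalsh((A + A.T) / 2)   # symmetrize defensively
    if np.all(eigs > tol):
        return "strictly convex (A is PD)"
    if np.all(eigs >= -tol):
        return "convex (A is PSD)"
    return "not convex"

print(quadratic_form_convexity([[2, 0], [0, 3]]))   # strictly convex
print(quadratic_form_convexity([[1, 1], [1, 1]]))   # convex: eigenvalues 0, 2
print(quadratic_form_convexity([[1, 0], [0, -1]]))  # not convex
```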

Definition: Let f : S → ℝ.
1. The epigraph of f is the set epi(f) := { (x, y) ∈ S × ℝ : f(x) ≤ y }.
2. The hypograph of f is the set hyp(f) := { (x, y) ∈ S × ℝ : f(x) ≥ y }.

[Figures: the graph y = f(x) with epi(f) shaded above it and hyp(f) shaded below it.]

Theorem: Let S ⊆ ℝⁿ be convex and f : S → ℝ. Then f is a convex function ⟺ epi(f) is a convex set.

4.2 Some Properties of Convex Functions

In this section, let S ⊆ ℝⁿ be a convex set unless stated otherwise.

Theorem 1: If f : S → ℝ is a convex function, then the level set S_α = { x ∈ S | f(x) ≤ α }, where α ∈ ℝ, is a convex set.

Proof: Take x₁, x₂ ∈ S_α, λ ∈ [0,1], and x = λx₁ + (1−λ)x₂. By convexity, f(λx₁ + (1−λ)x₂) ≤ λf(x₁) + (1−λ)f(x₂) ≤ λα + (1−λ)α = α. Hence x = λx₁ + (1−λ)x₂ ∈ S_α. ∎

Theorem 2: Let gᵢ : S → ℝ, i = 1,2,…,m, be convex functions. Then S̄ = { x ∈ S | gᵢ(x) ≤ 0, i = 1,2,…,m } is a convex set.

Proof: Let Sᵢ₀ = { x ∈ S | gᵢ(x) ≤ 0 }, i = 1,2,…,m. Each Sᵢ₀ is convex (Theorem 1), so S̄ = ∩ᵢ Sᵢ₀ is convex as an intersection of convex sets. ∎

Theorem 3: If f₁, f₂ : S → ℝ are convex functions, then f(x) = α₁f₁(x) + α₂f₂(x), where α₁, α₂ ≥ 0, is a convex function.

Proof: Direct computation. ∎

Theorem 4: Let S be open and f : S → ℝ be differentiable. Then f is convex iff

f(x) ≥ f(x₀) + ∇f(x₀)ᵀ(x − x₀) for every x₀, x ∈ S

(called the subgradient inequality).

Proof: (⟹) For λ ∈ (0,1], convexity gives f(x₀ + λ(x − x₀)) ≤ f(x₀) + λ[f(x) − f(x₀)], so the directional derivative satisfies Df(x₀; x − x₀) = ∇f(x₀)ᵀ(x − x₀) ≤ f(x) − f(x₀).
(⟸) Let x₀ = λx₁ + (1−λ)x₂. Then f(x₁) ≥ f(x₀) + ∇f(x₀)ᵀ(x₁ − x₀) and f(x₂) ≥ f(x₀) + ∇f(x₀)ᵀ(x₂ − x₀). Multiply the first inequality by λ and the second by (1−λ) and add; the gradient terms cancel since λ(x₁ − x₀) + (1−λ)(x₂ − x₀) = 0, giving λf(x₁) + (1−λ)f(x₂) ≥ f(x₀). ∎

Theorem 5: Let S be open and f : S → ℝ be twice differentiable. Then f is convex iff its Hessian ∇²f(x) is positive semidefinite at each x ∈ S.

Proof: Follows from the second-order Taylor theorem, continuity of ∇²f(x), and Theorem 4 above.

Examples:
1. Show that f(x, y, z) = x⁴ + y² + eᶻ − 5y is convex on ℝ³.
2. Find a domain (set) on which f(x, y) = −x³ + y² + y is convex.
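For Example 1 the Hessian can be written down by hand and its positive semidefiniteness confirmed at sampled points; the sketch below does exactly that (the sampling range is an arbitrary choice). For Example 2 the same computation gives Hessian diag(−6x, 2), which is PSD exactly when x ≤ 0, so f is convex on { (x, y) : x ≤ 0 }.

```python
import numpy as np

def hessian_f(x, y, z):
    """Hessian of f(x,y,z) = x^4 + y^2 + e^z - 5y, worked out by
    hand; all mixed partials vanish, so it is diagonal."""
    return np.diag([12 * x**2, 2.0, np.exp(z)])

# The diagonal entries 12x^2 >= 0, 2 > 0, e^z > 0 are the eigenvalues,
# so the Hessian is PSD everywhere and f is convex on R^3 (Theorem 5).
rng = np.random.default_rng(0)
for _ in range(5):
    x, y, z = rng.uniform(-3, 3, size=3)
    eigs = np.linalg.eigvalsh(hessian_f(x, y, z))
    assert np.all(eigs >= -1e-12), (x, y, z, eigs)
print("Hessian is PSD at all sampled points")
```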

4.3 Minimizing Convex Functions

Theorem 6: Let S be a convex set and f : S → ℝ a convex function. Then x₀ is a local minimum of f over S iff x₀ is a global minimum of f over S.

Proof sketch: If x₀ were a local but not global minimum, pick x₁ ∈ S with f(x₁) < f(x₀). By convexity, the points λx₁ + (1−λ)x₀, which lie arbitrarily close to x₀ for small λ, have value at most λf(x₁) + (1−λ)f(x₀) < f(x₀), contradicting local minimality. ∎

The problem of minimizing a convex function over a convex set is called convex programming. That is, if S is a convex set and f is a convex function (on S), then

min f(x) s.t. x ∈ S

is a convex programming problem. In particular, if f, gᵢ : ℝⁿ → ℝ, i = 1,2,…,m, are all convex functions and hⱼ : ℝⁿ → ℝ, j = 1,2,…,k, are all affine functions, then

min f(x)
s.t. gᵢ(x) ≤ 0, i = 1,2,…,m
     hⱼ(x) = 0, j = 1,2,…,k
     x ≥ 0

is a convex programming problem. (The feasible set is convex; this follows from Theorem 2.)
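As an illustration, a small convex program of this form can be handed to a general nonlinear solver; any local solution it returns is automatically global by Theorem 6. A sketch using SciPy's SLSQP method (the solver choice and the particular objective and constraint are my own illustration, not from the slides):

```python
import numpy as np
from scipy.optimize import minimize

# min (x-1)^2 + (y-2)^2  s.t.  x + y <= 1,  x >= 0,  y >= 0.
# Convex objective, affine constraint: a convex program, so the
# local solution found below is the global minimizer.
f = lambda v: (v[0] - 1)**2 + (v[1] - 2)**2
cons = [{"type": "ineq", "fun": lambda v: 1 - v[0] - v[1]}]  # 1-x-y >= 0
res = minimize(f, x0=np.zeros(2), method="SLSQP",
               bounds=[(0, None), (0, None)], constraints=cons)
print(res.x)   # ~ (0, 1): the projection of (1, 2) onto the feasible set
```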

Theorem 7: Let f : ℝⁿ → ℝ be a differentiable convex function. Then x₀ is a minimum of f (over ℝⁿ) iff ∇f(x₀) = 0.

Theorem 8 (Convex Programming Optimality Conditions): Let S ⊆ ℝⁿ be convex and f : S → ℝ be a differentiable convex function. Then,
1. x₀ ∈ S minimizes f on S iff ⟨∇f(x₀), x − x₀⟩ ≥ 0, ∀x ∈ S.
2. x₀ ∈ int(S) minimizes f on S iff ∇f(x₀) = 0. (In particular, if ∇f(x₀) = 0 at x₀ ∈ ℝⁿ, then x₀ is the minimizer of f on ℝⁿ.)
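Condition 1 (a variational inequality) is easy to verify numerically for a concrete case. Below is a sketch for f(x) = ||x − c||² over the box S = [0,1]² with c outside the box; the candidate minimizer is the componentwise clip of c onto the box (the point c and the box are illustrative choices):

```python
import numpy as np

# f(x) = ||x - c||^2 on S = [0,1]^2, with grad f(x) = 2(x - c).
# The minimizer is the projection of c onto the box.
c = np.array([2.0, -0.5])
x0 = np.clip(c, 0.0, 1.0)          # candidate minimizer: (1, 0)
grad = 2 * (x0 - c)

rng = np.random.default_rng(0)
xs = rng.uniform(0.0, 1.0, size=(1000, 2))      # samples from S
assert np.all((xs - x0) @ grad >= -1e-12)       # <grad f(x0), x - x0> >= 0
print("Theorem 8(1) holds at x0 =", x0, "on all samples")
```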

4.4 Quasi-Convex Functions

Definition: Let S be a convex set and f : S → ℝ. f is said to be quasi-convex if the level set S_α(f) = { x ∈ S | f(x) ≤ α } is convex for every α ∈ ℝ.

Examples:
1. f : ℝ → ℝ, f(x) = x³. f is not convex, but f is quasi-convex.
2. Every convex function is quasi-convex.
3. If f : ℝ → ℝ is monotonic (increasing/decreasing), then f is quasi-convex.

A quasi-convex function can also be characterized as follows:

Theorem 8: Let S be convex and f : S → ℝ. Then f is quasi-convex iff for each x₁, x₂ ∈ S and λ ∈ [0,1],

f(λx₁ + (1−λ)x₂) ≤ max{ f(x₁), f(x₂) }.
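The max characterization gives another samplable test, in the same spirit as the convexity check earlier (a failed trial disproves quasi-convexity; passing is only evidence). The sketch below confirms x³ and rejects −x², whose level sets { x : −x² ≤ α } are non-convex for α < 0:

```python
import numpy as np

def is_quasiconvex_on_samples(f, lo, hi, n_trials=2000, seed=0):
    """Test f(lam*x1 + (1-lam)*x2) <= max(f(x1), f(x2)) on random
    1-D samples from [lo, hi]."""
    rng = np.random.default_rng(seed)
    for _ in range(n_trials):
        x1, x2 = rng.uniform(lo, hi, size=2)
        lam = rng.uniform()
        if f(lam * x1 + (1 - lam) * x2) > max(f(x1), f(x2)) + 1e-9:
            return False
    return True

print(is_quasiconvex_on_samples(lambda x: x**3, -5, 5))   # True: monotone
print(is_quasiconvex_on_samples(lambda x: -x**2, -5, 5))  # False
```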

Note: Some important properties of convex functions also hold for quasi-convex functions, for instance the following two theorems.

Theorem 9: Let S be convex and f : S → ℝ be a quasi-convex function. Suppose M = { x₀ ∈ S | f(x₀) ≤ f(x), ∀x ∈ S } (the set of minimal points). Then M is convex.

Proof: Let α = f(x₀), where x₀ is a minimizer of f on S. Notice that M = S_α(f), which is convex since f is quasi-convex. ∎

Definition: Let S be convex and f : S → ℝ. f is said to be strictly quasi-convex if for every x₁, x₂ ∈ S with f(x₁) ≠ f(x₂) and every λ ∈ (0,1), we have f(λx₁ + (1−λ)x₂) < max{ f(x₁), f(x₂) }.

Theorem 10: Let S be convex and f : S → ℝ be strictly quasi-convex. If x₀ ∈ S is a local minimizer of f over S, then it is a global minimizer of f over S.

Proof: Similar to the proof of Theorem 6.

4.5 Approximations and Separation Theorems

In the sequel, V is a normed real vector space.

Given a nonempty S ⊆ V and y ∈ V \ S, the theory of best approximation deals with the problem of finding an x̄ ∈ S which is closest to y.

Definition: Let S ⊆ V and y ∈ V \ S. x̄ is called the best approximation of y in S if x̄ ∈ S and ||y − x̄|| ≤ ||y − x||, ∀x ∈ S. That is, a best approximation of y in S is a solution of the minimization problem

min { ||y − x|| : x ∈ S }.

Examples of best approximation problems:

1. Let V = { f : [−1,1] → ℝ : |f(x)| < ∞ } and S = C¹[−1,1].
   a. What is the best approximation of y₁(x) in S?
   b. Let y₂(x) = |x|. What is the best approximation of y₂(x) in S?

2. Let V = ℝ², S = { X ∈ ℝ² : ||X|| ≤ 1 }, and Y = (2,3). What is the best approximation of the point Y in S?

Definition: Let S ⊆ V. S is said to be proximinal if for every y ∈ V there is a best approximation of y in S. That is, a set S ⊆ V is proximinal if the problem min { ||y − x|| : x ∈ S } has a solution for every y ∈ V.
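Example 2 can be answered in closed form: projecting onto the Euclidean unit ball leaves interior points alone and radially rescales exterior ones. A minimal sketch (the function name is mine):

```python
import numpy as np

def project_to_unit_ball(y):
    """Best Euclidean approximation of y in the closed unit ball:
    y itself if ||y|| <= 1, otherwise the radial point y/||y||."""
    n = np.linalg.norm(y)
    return y if n <= 1.0 else y / n

Y = np.array([2.0, 3.0])
x_bar = project_to_unit_ball(Y)
print(x_bar)                       # (2,3)/sqrt(13) ~ (0.5547, 0.8321)
print(np.linalg.norm(Y - x_bar))   # distance sqrt(13) - 1 ~ 2.6056
```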

The following theorem gives a sufficient condition for the existence of a solution of the approximation problem in ℝⁿ.

Theorem 11: Let S ⊆ ℝⁿ be nonempty and closed. Then,
1. S is proximinal;
2. if, in addition, S is convex, then for any y ∈ ℝⁿ its best approximation in S is unique.

Proof: (1) Given any y ∈ ℝⁿ, define d : S → ℝ by d(x) = ||x − y||. Pick an x₀ ∈ S and let α = d(x₀) = ||x₀ − y||. Then S_α = { x ∈ S : d(x) ≤ α } is compact (closed and bounded), so the continuous function d has a minimizer on S_α, and consequently d has a minimizer on S.
(2) Uniqueness follows from the strict convexity of the squared distance x ↦ ||x − y||².

Definition: Let V be an inner product space and S ⊆ V. A nonzero u ∈ V is said to be normal to the set S at x̄ ∈ S if ⟨u, x − x̄⟩ ≤ 0, ∀x ∈ S.

Examples: Let S = { (x, y)ᵀ ∈ ℝ² : 0 ≤ x, y ≤ 2 }.
1) u = (−1, 0)ᵀ is normal to S at x̄ = (0, 1)ᵀ.
2) u = (0, 1)ᵀ is normal to S at x̄ = (1, 2)ᵀ.
3) u = (a, b)ᵀ, where a, b ≤ 0 (but not both 0), is normal to S at x̄ = (0, 0)ᵀ.

Theorem 12: Let S be a nonempty closed convex subset of V and y ∈ V \ S. Then x̄ ∈ S is the best approximation of y in S iff y − x̄ is normal to S at x̄.

Proof: See below.

Proof of Theorem 12: (By Theorem 11 the best approximation exists.)

(⟸) Let y − x̄ be normal to S at x̄, i.e., ⟨y − x̄, x − x̄⟩ ≤ 0 ∀x ∈ S, and take an arbitrary x ∈ S. Then

||y − x||² = ||y − x̄ − (x − x̄)||² = ||y − x̄||² + ||x − x̄||² − 2⟨y − x̄, x − x̄⟩ ≥ ||y − x̄||².

(⟹) Let ||y − x̄|| ≤ ||y − x||, ∀x ∈ S. Take an arbitrary x ∈ S and λ ∈ (0,1], and let x_λ = x̄ + λ(x − x̄) ∈ S (by convexity). Then

||y − x̄||² ≤ ||y − x_λ||² = ||y − x̄ − λ(x − x̄)||² = ||y − x̄||² − 2λ⟨y − x̄, x − x̄⟩ + λ²||x − x̄||²,

so ⟨y − x̄, x − x̄⟩ ≤ (λ/2)||x − x̄||². Taking λ → 0⁺, we get ⟨y − x̄, x − x̄⟩ ≤ 0. ∎
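Theorem 12 can be watched in action for a box, where the projection is a componentwise clip. The sketch below checks both the normality condition and the best-approximation property on random samples (the point and box are arbitrary choices of mine):

```python
import numpy as np

# S = [0,2]^2; the clip of y onto the box is its best approximation,
# and by Theorem 12, y - x_bar must be normal to S at x_bar.
y = np.array([3.0, -1.0])
x_bar = np.clip(y, 0.0, 2.0)        # projection onto the box: (2, 0)

rng = np.random.default_rng(0)
xs = rng.uniform(0.0, 2.0, size=(1000, 2))           # samples from S
assert np.all((xs - x_bar) @ (y - x_bar) <= 1e-12)   # normality
assert np.all(np.linalg.norm(xs - y, axis=1)
              >= np.linalg.norm(x_bar - y) - 1e-12)  # best approximation
print("normality and best-approximation checks pass")
```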

Definition: Let S ⊆ V, and let H = { x ∈ V : ⟨u, x⟩ = α } be a hyperplane for some nonzero u ∈ V* and α ∈ ℝ. H is said to support S at x₀ iff x₀ ∈ S ∩ H and either S ⊆ H⁻ or S ⊆ H⁺.

[Figures: one unique supporting line at x₀; a supporting line touching S at more than one point (x₁, x₂); several supporting lines at a corner point x₀; a non-convex set S with no supporting line at a boundary point x₀.]

Exercise: Let S be a nonempty closed convex subset of ℝⁿ and y ∈ ℝⁿ \ S. Show that x₀ ∈ S is the best approximation of y in S if and only if H = { x ∈ ℝⁿ : ⟨u, x − x₀⟩ = 0 } supports S at x₀ with S ⊆ H⁻, where u = y − x₀.

Definition: Let S₁ and S₂ be nonempty subsets of V. S₁ and S₂ are said to be
1) separable if there is a nonzero u ∈ V* and α ∈ ℝ such that ⟨u, x⟩ ≤ α ≤ ⟨u, y⟩, ∀x ∈ S₁, ∀y ∈ S₂;
2) strongly separable if there is a nonzero u ∈ V* such that sup { ⟨u, x⟩ | x ∈ S₁ } < inf { ⟨u, x⟩ | x ∈ S₂ }.

That is, considering the hyperplane H = { x ∈ V : ⟨u, x⟩ = α }, S₁ and S₂ are
1) separable if S₁ ⊆ H⁻ and S₂ ⊆ H⁺ (in this case, we say H separates S₁ and S₂);
2) strongly separable if S₁ ⊆ H⁻, S₂ ⊆ H⁺, and H supports neither S₁ nor S₂ (in this case, we say H strongly separates S₁ and S₂).

[Figures: a separable pair S₁, S₂ with separating hyperplane H and normal u; a strongly separable pair; an overlapping pair that is not separable.]
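For two disjoint closed balls a strongly separating hyperplane can be written down explicitly: with u = c₂ − c₁, the supremum of ⟨u, x⟩ over B(c₁, r₁) is ⟨u, c₁⟩ + r₁||u|| and the infimum over B(c₂, r₂) is ⟨u, c₂⟩ − r₂||u|| (by Cauchy-Schwarz), and any α strictly between them works. A sketch with made-up centers and radii:

```python
import numpy as np

# Strongly separate B(c1, r1) and B(c2, r2) with u = c2 - c1.
c1, r1 = np.array([0.0, 0.0]), 1.0
c2, r2 = np.array([4.0, 1.0]), 1.5
u = c2 - c1
sup1 = u @ c1 + r1 * np.linalg.norm(u)  # sup of <u,x> over the first ball
inf2 = u @ c2 - r2 * np.linalg.norm(u)  # inf of <u,x> over the second ball
assert sup1 < inf2                      # holds because the balls are disjoint
alpha = 0.5 * (sup1 + inf2)
print(f"H = {{ x : <u, x> = {alpha:.3f} }} with u = {u} strongly separates them")
```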

Theorem 13: Let S ⊆ ℝⁿ be convex and 0 ∉ cl(S). Then,
1) if a ∈ cl(S) is the element of minimal norm, then ⟨a, x⟩ ≥ ||a||² > 0, ∀x ∈ cl(S);
2) {0} and S are strongly separable.

Proof: (1) … (2) follows directly from (1): if we take α = ½||a||², then H = { x ∈ ℝⁿ : ⟨a, x⟩ = α } strongly separates {0} and S ({0} ⊆ H⁻, S ⊆ H⁺, and H supports neither of them).

Theorem 14: Let S ⊆ ℝⁿ be a closed convex set and y ∉ S. Then {y} and S can be strongly separated.

Theorem 15: Let S ⊆ ℝⁿ be a closed convex set and x₀ ∈ bd(S). Then there is a hyperplane that supports S at x₀.

Theorem 16: Let S ⊆ ℝⁿ be a convex set and x₀ ∈ ∂(S). Then there is a nonzero u ∈ ℝⁿ such that ⟨u, x − x₀⟩ ≤ 0, ∀x ∈ S.

Theorem 17 (Separation Theorem for Two Sets): Let S₁ and S₂ be two disjoint convex subsets of ℝⁿ. Then there is a hyperplane that separates S₁ and S₂.

Proof: …

Corollary 18: Let S₁ and S₂ be two disjoint convex subsets of ℝⁿ. Then there is a nonzero u ∈ ℝⁿ such that sup { ⟨u, x⟩ | x ∈ S₁ } ≤ inf { ⟨u, x⟩ | x ∈ S₂ }.

Proof: Follows directly from Theorem 17.

4.6 Subdifferentials

Definition: Let S ⊆ V be convex, x₀ ∈ S, and f : S → ℝ a function. A vector ξ ∈ V (or ξ ∈ V*) is called a subgradient of f at x₀ if

f(x) ≥ f(x₀) + ⟨ξ, x − x₀⟩, for all x ∈ S.

The set of all subgradients of f at x₀, denoted by ∂f(x₀), is called the subdifferential of f at x₀; i.e.,

∂f(x₀) = { ξ ∈ V : f(x) ≥ f(x₀) + ⟨ξ, x − x₀⟩ for all x ∈ S }.

If ∂f(x₀) ≠ Ø, then f is said to be subdifferentiable at x₀.

Example: Let f(x) = ||x|| on ℝⁿ. Then ∂f(0) = { ξ ∈ ℝⁿ : ||ξ|| ≤ 1 }.

Theorem 19: ∂f(x₀) is a convex set.

Note: The subgradient at a point may not be unique. We will show that the subgradient is unique at a point where f is differentiable.
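The example ∂f(0) for f = ||·|| is just Cauchy-Schwarz: ⟨ξ, x⟩ ≤ ||ξ|| ||x|| ≤ ||x|| whenever ||ξ|| ≤ 1, so the subgradient inequality at 0 holds. A quick numerical confirmation (the sample sizes and dimension are arbitrary):

```python
import numpy as np

# f(x) = ||x||, x0 = 0: check f(x) >= f(0) + <xi, x> for many
# xi with ||xi|| <= 1 and many test points x.
rng = np.random.default_rng(0)
xs = rng.normal(size=(500, 3))
for _ in range(100):
    xi = rng.normal(size=3)
    xi /= max(1.0, np.linalg.norm(xi))    # force xi into the unit ball
    assert np.all(np.linalg.norm(xs, axis=1) >= xs @ xi - 1e-12)
print("every sampled xi in the unit ball is a subgradient of ||.|| at 0")
```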

Theorem 20: Let S ⊆ V be a convex set and f : S → ℝ. If f is subdifferentiable on int(S), then f is convex.

Theorem 21: Let S ⊆ V be convex and f : S → ℝ be a convex functional. Then ∂f(x₀) ≠ Ø at every x₀ ∈ int(S).

Corollary 22: Let S ⊆ V be convex and open, and f : S → ℝ be a functional. Then f is convex iff ∂f(x₀) ≠ Ø at each x₀ ∈ S.

Theorem 23: Let S ⊆ V be convex and f : S → ℝ be a convex functional. Then x₀ ∈ S minimizes f on S iff there exists ξ ∈ ∂f(x₀) such that ⟨ξ, x − x₀⟩ ≥ 0, ∀x ∈ S.

Corollary 24: Let S ⊆ V be convex and f : S → ℝ be a convex functional. Then x₀ ∈ int(S) minimizes f on S iff 0 ∈ ∂f(x₀).

4.7 Subgradient Optimization Method

Let S be a convex set and consider the convex programming problem

(P) min { f(x) : x ∈ S },

where f : S → ℝ is convex but not necessarily differentiable.

The subgradient method for solving (P):

Step 1: Start with an initial point x₀ ∈ S.
Step 2: At the current iterate xₖ, find a subgradient ξₖ of f at xₖ.
Step 3: If ξₖ = 0, STOP (xₖ is an optimal solution). Otherwise, set dₖ = −ξₖ/||ξₖ|| and let y = xₖ + λₖdₖ, where λₖ > 0 is a suitable step length. The next iterate is xₖ₊₁ = y if y ∈ S; else xₖ₊₁ = P_S(y), where P_S(y) ∈ S is the best approximation (projection) of y in S.
Step 4: Repeat Steps 2 and 3 until a stopping condition holds.

Note:
1. For the subgradient method to be practical, there should be a tractable way to identify a subgradient of f at every iterate and to perform the projection operation P_S(y); this depends on the specific problem. (A code sketch follows below.)
2. The step direction at each iterate xₖ is dₖ = −ξₖ/||ξₖ||, where ξₖ ∈ ∂f(xₖ). This direction need not be a descent direction. However, if x* is an optimal solution of (P) and λₖ > 0 is small, we get ||xₖ₊₁ − x*|| < ||xₖ − x*|| for each k; i.e., each xₖ₊₁ is closer to x* than xₖ is. This is so because for every non-optimal xₖ we have ⟨−ξₖ, x* − xₖ⟩ > 0, since f(x*) ≥ f(xₖ) + ⟨ξₖ, x* − xₖ⟩ and f(x*) < f(xₖ).
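Here is a compact sketch of Steps 1-4, assuming the user supplies the subgradient oracle and the projection; both are easy in the example below, since sign(x − c) is a subgradient of the ℓ₁-distance ||x − c||₁ and projecting onto a box is a componentwise clip. The problem instance, function names, and the diminishing step rule λₖ = 1/(k+1) are my own illustrative choices:

```python
import numpy as np

def subgradient_method(subgrad, project, x0, step, n_iter=200):
    """Projected subgradient method: x_{k+1} = P_S(x_k - step(k)*d_k)
    with d_k = xi_k / ||xi_k||; stops early if 0 is returned as a
    subgradient (then x_k is optimal)."""
    x = np.asarray(x0, dtype=float)
    for k in range(n_iter):
        xi = subgrad(x)
        norm = np.linalg.norm(xi)
        if norm == 0.0:
            return x
        x = project(x - step(k) * xi / norm)
    return x

# Example: minimize f(x) = ||x - c||_1 over the box S = [0,1]^2,
# with c outside the box; sign(x - c) is a subgradient of f at x.
c = np.array([2.0, -0.5])
x_star = subgradient_method(
    subgrad=lambda x: np.sign(x - c),
    project=lambda y: np.clip(y, 0.0, 1.0),
    x0=np.zeros(2),
    step=lambda k: 1.0 / (k + 1),          # diminishing step lengths
)
print(x_star)    # approaches (1, 0), the box point closest to c
```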
