Chapter 3: Convex Functions and Separation Theorems
Berhanu G (Dr)

In this chapter we focus mainly on convex functions and their properties in relation to optimization.

3.1: Convex Functions

Definition: Let S ⊆ ℝⁿ be a convex set. A function f : S → ℝ is said to be convex (over S) if for any x₁, x₂ ∈ S and any λ ∈ [0,1],
f(λx₁ + (1−λ)x₂) ≤ λf(x₁) + (1−λ)f(x₂).

f is said to be concave iff −f is convex [so f is concave iff f(λx₁ + (1−λ)x₂) ≥ λf(x₁) + (1−λ)f(x₂)].

f is strictly convex if for every distinct x₁, x₂ ∈ S and λ ∈ (0,1),
f(λx₁ + (1−λ)x₂) < λf(x₁) + (1−λ)f(x₂).

[Figures: graphs of a convex function, a concave function, and a function that is neither convex nor concave.]

Examples:
1. Given x₀ ∈ ℝⁿ, let d : ℝⁿ → ℝ be given by d(x) = ‖x − x₀‖. Then d is a convex function.
2. Let q(x) = xᵀAx, where A is an n×n symmetric matrix. Then
 (i) q is convex if A is positive semidefinite;
 (ii) q is strictly convex if A is positive definite.
 [Note: A is PSD ⇒ 2xᵀAy ≤ xᵀAx + yᵀAy, since (x−y)ᵀA(x−y) ≥ 0; the inequality is strict when A is PD and x ≠ y.]
3. Let f(x) = aᵀx + b, where a ∈ ℝⁿ, b ∈ ℝ (an affine function). Then f is convex. (In fact, an affine function is also concave.)

Note: f : S → ℝ is convex iff for every choice of points x₁, x₂, …, xₙ in S and scalars λ₁, …, λₙ ≥ 0 with λ₁ + … + λₙ = 1, we have f(λ₁x₁ + … + λₙxₙ) ≤ λ₁f(x₁) + … + λₙf(xₙ) (Jensen's inequality).
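A quick numerical illustration of Example 2 (a sketch, not part of the original slides): build a symmetric PSD matrix A = BBᵀ, check positive semidefiniteness via its eigenvalues, and verify the defining convexity inequality of q(x) = xᵀAx at sample points.

```python
import numpy as np

rng = np.random.default_rng(0)

B = rng.standard_normal((3, 3))
A = B @ B.T                      # B @ B.T is symmetric positive semidefinite
q = lambda x: x @ A @ x

print(np.all(np.linalg.eigvalsh(A) >= -1e-12))   # PSD check via eigenvalues

x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
for lam in np.linspace(0, 1, 11):
    lhs = q(lam * x1 + (1 - lam) * x2)
    rhs = lam * q(x1) + (1 - lam) * q(x2)
    assert lhs <= rhs + 1e-9     # convexity inequality holds numerically
```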

Definition: Let f : S → ℝ.
1. The epigraph of f is the set epi(f) := { (x, y) ∈ S×ℝ : f(x) ≤ y }.
2. The hypograph of f is the set hyp(f) := { (x, y) ∈ S×ℝ : f(x) ≥ y }.

[Figures: the regions epi(f) above and hyp(f) below the graph y = f(x).]

Theorem: Let S ⊆ ℝⁿ be convex and f : S → ℝ. Then f is a convex function ⇔ epi(f) is a convex set.

3.2: Some Properties of Convex Functions

In this section, let S ⊆ ℝⁿ be a convex set unless stated otherwise.

Theorem 1: If f : S → ℝ is a convex function, then for every α ∈ ℝ the level set Sα = { x ∈ S | f(x) ≤ α } is a convex set.
Proof: Take x₁, x₂ ∈ Sα, λ ∈ [0,1] and x = λx₁ + (1−λ)x₂. By convexity, f(λx₁ + (1−λ)x₂) ≤ λf(x₁) + (1−λ)f(x₂) ≤ λα + (1−λ)α = α. Hence λx₁ + (1−λ)x₂ ∈ Sα.

Theorem 2: Let gᵢ : S → ℝ, i = 1, 2, …, m, be convex functions. Then S̄ = { x ∈ S | gᵢ(x) ≤ 0, i = 1, 2, …, m } is a convex set.
Proof: Let Sᵢ₀ = { x ∈ S | gᵢ(x) ≤ 0 }, i = 1, 2, …, m. Each Sᵢ₀ is convex (Theorem 1), so S̄ = ∩ᵢ Sᵢ₀ is convex.

Theorem 3: If f₁, f₂ : S → ℝ are convex functions, then f(x) = α₁f₁(x) + α₂f₂(x), where α₁, α₂ ≥ 0, is a convex function.
Proof: Direct computation.

Theorem 4: Let S be open and f : S → ℝ be differentiable. Then f is convex iff
f(x) ≥ f(x₀) + ∇f(x₀)ᵀ(x − x₀) for every x₀, x ∈ S (the subgradient inequality).
Proof: (⇒) By convexity, f(x₀ + λ(x − x₀)) ≤ f(x₀) + λ[f(x) − f(x₀)] for λ ∈ (0,1]. Dividing by λ and letting λ → 0⁺ gives the directional derivative Df(x₀; x − x₀) = ∇f(x₀)ᵀ(x − x₀) ≤ f(x) − f(x₀).
(⇐) Let x₀ = λx₁ + (1−λ)x₂. Then f(x₁) ≥ f(x₀) + ∇f(x₀)ᵀ(x₁ − x₀) and f(x₂) ≥ f(x₀) + ∇f(x₀)ᵀ(x₂ − x₀). Multiplying the first inequality by λ, the second by (1−λ), and adding gives λf(x₁) + (1−λ)f(x₂) ≥ f(x₀).

Theorem 5: Let S be open and f : S → ℝ be twice differentiable. Then f is convex iff its Hessian ∇²f(x) is positive semidefinite at each x ∈ S.
Proof: Follows from the second-order Taylor theorem, continuity of ∇²f(x), and Theorem 4 above.

Examples:
1. Show that f(x, y, z) = x⁴ + y² + eᶻ − 5y is convex on ℝ³.
2. Find a domain (set) on which f(x, y) = −x³ + y² + y is convex.
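A symbolic sketch for Example 1 (not part of the original slides): by Theorem 5, f is convex if its Hessian is positive semidefinite everywhere.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
f = x**4 + y**2 + sp.exp(z) - 5*y

H = sp.hessian(f, (x, y, z))
print(H)
# H = diag(12*x**2, 2, exp(z)); a diagonal matrix is PSD iff its diagonal
# entries are nonnegative, and 12*x**2 >= 0, 2 > 0, exp(z) > 0 for all real x, z,
# so f is convex on R^3.
```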

3.3: Minimizing Convex Functions

Theorem 6: Let S be a convex set and f : S → ℝ a convex function. Then x₀ is a local minimum of f over S iff x₀ is a global minimum of f over S.
Proof: Suppose x₀ is a local minimum but not a global minimum, so there is x̄ ∈ S with f(x̄) < f(x₀). For λ ∈ (0,1], convexity gives f(x₀ + λ(x̄ − x₀)) ≤ (1−λ)f(x₀) + λf(x̄) < f(x₀), and x₀ + λ(x̄ − x₀) ∈ S. Taking λ small contradicts local minimality of x₀.

A problem of minimizing a convex function over a convex set is called a convex programming problem. That is, if S is a convex set and f is a convex function (on S), then
  min f(x) s.t. x ∈ S
is a convex program. In particular, if f, gᵢ : ℝⁿ → ℝ, i = 1, 2, …, m, are all convex functions and hⱼ : ℝⁿ → ℝ, j = 1, 2, …, k, are all affine functions, then
  min f(x)
  s.t. gᵢ(x) ≤ 0, i = 1, 2, …, m
     hⱼ(x) = 0, j = 1, 2, …, k
     x ≥ 0
is a convex programming problem. (The feasible set is convex by Theorem 2.)

Theorem 7: Let f : ℝⁿ → ℝ be a differentiable convex function. Then x₀ is a minimum of f (over ℝⁿ) iff ∇f(x₀) = 0.

Theorem 8 (Convex Programming Optimality Conditions): Let S ⊆ ℝⁿ be convex and f : S → ℝ be a differentiable convex function. Then,
1. x₀ ∈ S minimizes f on S iff ⟨∇f(x₀), x − x₀⟩ ≥ 0 for all x ∈ S.
2. x₀ ∈ int(S) minimizes f on S iff ∇f(x₀) = 0. (In particular, if ∇f(x₀) = 0 at x₀ ∈ ℝⁿ, then x₀ is a minimizer of f on ℝⁿ.)
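An illustration of Theorem 7 on an assumed least-squares example (not from the slides): f(x) = ‖Ax − b‖² is differentiable and convex, so a numerical minimizer should have a vanishing gradient ∇f(x) = 2Aᵀ(Ax − b).

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
A, b = rng.standard_normal((5, 3)), rng.standard_normal(5)

f    = lambda x: np.sum((A @ x - b) ** 2)
grad = lambda x: 2 * A.T @ (A @ x - b)

res = minimize(f, x0=np.zeros(3), jac=grad)
print(res.x)                        # numerical minimizer
print(np.linalg.norm(grad(res.x)))  # gradient is (numerically) zero there
```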

3.4: Quasi-Convex Functions

Definition: Let S be a convex set and f : S → ℝ. f is said to be quasi-convex if the level set Sα(f) = { x ∈ S | f(x) ≤ α } is convex for every α ∈ ℝ.

Examples:
1. f : ℝ → ℝ, f(x) = x³ is quasi-convex (though it is not convex).
2. Every convex function is quasi-convex.
3. If f : ℝ → ℝ is monotonic (increasing or decreasing), then f is quasi-convex.

A quasi-convex function can also be characterized as follows:
Theorem 8: Let S be convex and f : S → ℝ. Then f is quasi-convex iff for each x₁, x₂ ∈ S and λ ∈ [0,1],
f(λx₁ + (1−λ)x₂) ≤ max{ f(x₁), f(x₂) }.
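A small numerical illustration of Example 1 and the max characterization (a sketch, not from the slides): f(x) = x³ should satisfy f(λx₁ + (1−λ)x₂) ≤ max{f(x₁), f(x₂)} at any sampled points.

```python
import numpy as np

f = lambda x: x ** 3
rng = np.random.default_rng(2)

for _ in range(1000):
    x1, x2 = rng.uniform(-5, 5, size=2)
    lam = rng.uniform(0, 1)
    assert f(lam * x1 + (1 - lam) * x2) <= max(f(x1), f(x2)) + 1e-12
print("quasi-convexity inequality held at all sampled points")
```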

Note: Some important properties of convex functions also hold for quasi-convex functions, for instance the following two theorems.

Theorem 9: Let S be convex and f : S → ℝ be a quasi-convex function. Let M = { x₀ ∈ S | f(x₀) ≤ f(x) for all x ∈ S } (the set of minimal points). Then M is convex.
Proof: Let α = f(x₀), where x₀ is a minimizer of f on S. Then M = Sα(f), which is convex since f is quasi-convex.

Definition: Let S be convex and f : S → ℝ. f is said to be strictly quasi-convex if for every x₁, x₂ ∈ S with f(x₁) ≠ f(x₂) and λ ∈ (0,1), we have f(λx₁ + (1−λ)x₂) < max{ f(x₁), f(x₂) }.

Theorem 10: Let S be convex and f : S → ℝ be strictly quasi-convex. If x₀ ∈ S is a local minimizer of f over S, then it is a global minimizer of f over S.
Proof: Similar to the proof of Theorem 6.

3.5: Approximations and Separation Theorems

In the sequel, V is a normed real vector space.

Given a nonempty S ⊆ V and y ∈ V \ S, the theory of best approximation deals with the problem of finding an x̄ ∈ S which is closest to y.

Definition: Let S ⊆ V and y ∈ V \ S. x̄ is called the best approximation of y in S if x̄ ∈ S and ‖y − x̄‖ ≤ ‖y − x‖ for all x ∈ S. That is, a best approximation of y in S is a solution of the minimization problem
  min { ‖y − x‖ : x ∈ S }.

Examples of best approximation problems:
1. Let V = { f : [−1, 1] → ℝ | |f(x)| < ∞ } and S = C¹[−1, 1].
 a. What is the best approximation of y₁(x) in S?
 b. Let y₂(x) = |x|. What is the best approximation of y₂(x) in S?
2. Let V = ℝ², S = { X ∈ ℝ² | ‖X‖ ≤ 1 } and Y = (2, 3). What is the best approximation of the point Y in S?

Definition: Let S ⊆ V. S is said to be proximinal if for every y ∈ V there is a best approximation of y in S. That is, a set S ⊆ V is proximinal if the problem min { ‖y − x‖ : x ∈ S } has a solution for every y ∈ V.
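A sketch for Example 2 (not part of the original slides): the best approximation of Y in the closed unit ball S is the Euclidean projection of Y onto S, which for ‖Y‖ > 1 is simply Y/‖Y‖.

```python
import numpy as np

def project_onto_unit_ball(y):
    """Euclidean projection onto S = { x : ||x|| <= 1 }."""
    norm = np.linalg.norm(y)
    return y if norm <= 1.0 else y / norm

Y = np.array([2.0, 3.0])
x_best = project_onto_unit_ball(Y)
print(x_best)                      # (2, 3) / sqrt(13) ~= (0.5547, 0.8321)
print(np.linalg.norm(Y - x_best))  # minimal distance, sqrt(13) - 1
```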

The following theorem gives a sufficient condition for existence of a solution of the best approximation problem in ℝⁿ.

Theorem 11: Let S ⊆ ℝⁿ be nonempty and closed. Then:
1. S is proximinal.
2. If, in addition, S is convex, then for any y ∈ ℝⁿ its best approximation in S is unique.
Proof: (1) Given any y ∈ ℝⁿ, define d : S → ℝ by d(x) = ‖x − y‖. Pick an x₀ ∈ S and let α = d(x₀) = ‖x₀ − y‖. Then Sα = { x ∈ S : d(x) ≤ α } is nonempty and compact, so the continuous function d has a minimizer on Sα, and consequently on S.
(2) Uniqueness follows from the fact that the squared distance d² is strictly convex on the convex set S.

Definition: Let V be an inner product space and S ⊆ V. A nonzero u ∈ V is said to be normal to the set S at x̄ ∈ S if ⟨u, x − x̄⟩ ≤ 0 for all x ∈ S.

Examples: Let S = { (x, y)ᵀ ∈ ℝ² : 0 ≤ x, y ≤ 2 }.
1) u = (−1, 0)ᵀ is normal to S at x̄ = (0, 1)ᵀ.
2) u = (0, 1)ᵀ is normal to S at x̄ = (1, 2)ᵀ.
3) u = (a, b)ᵀ, where a, b ≤ 0 (but not both 0), is normal to S at x̄ = (0, 0)ᵀ.

Theorem 12: Let S be a nonempty closed convex subset of V and y ∈ V \ S. Then x̄ ∈ S is the best approximation of y in S iff y − x̄ is normal to S at x̄.

Proof of Theorem 12: (By Theorem 11, the best approximation exists.)
(⇐) Let y − x̄ be normal to S at x̄, i.e. ⟨y − x̄, x − x̄⟩ ≤ 0 for all x ∈ S, and take an arbitrary x ∈ S. Then
‖y − x‖² = ‖y − x̄ − (x − x̄)‖² = ‖y − x̄‖² + ‖x − x̄‖² − 2⟨y − x̄, x − x̄⟩ ≥ ‖y − x̄‖².
Hence x̄ is the best approximation of y in S.
(⇒) Let ‖y − x̄‖ ≤ ‖y − x‖ for all x ∈ S. Take an arbitrary x ∈ S, λ ∈ [0,1], and let x_λ = x̄ + λ(x − x̄). Since S is convex, x_λ ∈ S, so
‖y − x̄‖² ≤ ‖y − x_λ‖² = ‖y − x̄ − λ(x − x̄)‖² = ‖y − x̄‖² − 2λ⟨y − x̄, x − x̄⟩ + λ²‖x − x̄‖².
Thus ⟨y − x̄, x − x̄⟩ ≤ (λ/2)‖x − x̄‖², and taking λ → 0⁺ we get ⟨y − x̄, x − x̄⟩ ≤ 0.
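A numerical illustration of Theorem 12 on an assumed example (not from the slides): project a point y onto the box S = [0, 2]×[0, 2] and check that y − x̄ is normal to S at x̄, i.e. ⟨y − x̄, x − x̄⟩ ≤ 0 for sampled points x ∈ S.

```python
import numpy as np

def project_onto_box(y, lo=0.0, hi=2.0):
    return np.clip(y, lo, hi)      # componentwise clipping is the projection

rng = np.random.default_rng(3)
y = np.array([3.0, -1.0])          # a point outside S
x_bar = project_onto_box(y)        # best approximation of y in S, here (2, 0)

samples = rng.uniform(0.0, 2.0, size=(1000, 2))      # points of S
inner = (samples - x_bar) @ (y - x_bar)
print(x_bar, inner.max() <= 1e-12)  # all inner products are <= 0
```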

Definition: Let S ⊆ V, and let H = { x ∈ V | ⟨u, x⟩ = α } be a hyperplane for some nonzero u ∈ V* and α ∈ ℝ. H is said to support S at x₀ iff x₀ ∈ S ∩ H and either S ⊆ H⁻ or S ⊆ H⁺.

[Figures: examples of a unique supporting line at x₀, a supporting line touching S at more than one point, several supporting lines at x₀, and a point x₀ with no supporting line.]

Exercise: Let S be a nonempty closed convex subset of ℝⁿ and y ∈ ℝⁿ \ S. Show that x₀ ∈ S is the best approximation of y in S if and only if H = { x ∈ ℝⁿ | ⟨u, x − x₀⟩ = 0 } supports S at x₀ with S ⊆ H⁻, where u = y − x₀.

Definition: Let S₁ and S₂ be nonempty subsets of V. S₁ and S₂ are said to be
1) separable if there are a nonzero u ∈ V* and α ∈ ℝ such that ⟨u, x⟩ ≤ α ≤ ⟨u, y⟩ for all x ∈ S₁, y ∈ S₂;
2) strongly separable if there is a nonzero u ∈ V* such that sup { ⟨u, x⟩ | x ∈ S₁ } < inf { ⟨u, x⟩ | x ∈ S₂ }.

That is, considering the hyperplane H = { x ∈ V | ⟨u, x⟩ = α }, S₁ and S₂ are
1) separable if S₁ ⊆ H⁻ and S₂ ⊆ H⁺ (in this case we say H separates S₁ and S₂);
2) strongly separable if S₁ ⊆ H⁻, S₂ ⊆ H⁺ and H supports neither S₁ nor S₂ (in this case we say H strongly separates S₁ and S₂).

[Figures: a separating hyperplane, a strongly separating hyperplane, and two overlapping sets that are not separable.]

Theorem 13: Let S ⊆ ℝⁿ be convex with 0 ∉ cl(S). Then:
1) If a ∈ cl(S) is the element of minimal norm, then ⟨a, x⟩ ≥ ‖a‖² > 0 for all x ∈ cl(S).
2) {0} and S are strongly separable.
Proof: (1) Omitted here. (2) follows directly from (1): if we take α = ½‖a‖², then H = { x ∈ ℝⁿ | ⟨a, x⟩ = α } strongly separates {0} and S ({0} ⊆ H⁻, S ⊆ H⁺, and H supports neither of them).

Theorem 14: Let S ⊆ ℝⁿ be a closed convex set and y ∉ S. Then {y} and S are strongly separable.

Theorem 15: Let S ⊆ ℝⁿ be a closed convex set and x₀ ∈ bd(S). Then there is a hyperplane that supports S at x₀.
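A concrete sketch of Theorem 13 on an assumed set (not from the slides): S = [1, 2]×[1, 2] is closed, convex, and does not contain 0; its minimal-norm element is a = (1, 1), and ⟨a, x⟩ ≥ ‖a‖² > 0 should hold on S, so H = { x : ⟨a, x⟩ = ½‖a‖² } strongly separates {0} and S.

```python
import numpy as np

rng = np.random.default_rng(4)

a = np.array([1.0, 1.0])                 # minimal-norm element of S
alpha = 0.5 * a @ a                      # separating level, alpha = 1

samples = rng.uniform(1.0, 2.0, size=(1000, 2))         # points of S
print((samples @ a).min() >= a @ a)                     # <a, x> >= ||a||^2 = 2 on S
print(a @ np.zeros(2) < alpha < (samples @ a).min())    # {0} and S on opposite sides of H
```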

Theorem 16: Let S ⊆ ℝⁿ be a convex set and x₀ ∈ ∂S. Then there is a nonzero u ∈ ℝⁿ such that ⟨u, x − x₀⟩ ≤ 0 for all x ∈ S.

Theorem 17 (Separation Theorem for two sets): Let S₁ and S₂ be two disjoint convex subsets of ℝⁿ. Then there is a hyperplane that separates S₁ and S₂.
Proof: Apply Theorem 13 or Theorem 16 to the convex set S₁ − S₂ = { x₁ − x₂ : x₁ ∈ S₁, x₂ ∈ S₂ }, which does not contain 0.

Corollary 18: Let S₁ and S₂ be two disjoint convex subsets of ℝⁿ. Then there is a nonzero u ∈ ℝⁿ such that sup { ⟨u, x⟩ | x ∈ S₁ } ≤ inf { ⟨u, x⟩ | x ∈ S₂ }.

3.6: Subdifferentials

Definition: Let S ⊆ V be convex, x₀ ∈ S, and f : S → ℝ a function. A vector ξ ∈ V (or ξ ∈ V*) is called a subgradient of f at x₀ if
f(x) ≥ f(x₀) + ⟨ξ, x − x₀⟩ for all x ∈ S.
The set of all subgradients of f at x₀, denoted ∂f(x₀), is called the subdifferential of f at x₀; i.e.,
∂f(x₀) = { ξ ∈ V : f(x) ≥ f(x₀) + ⟨ξ, x − x₀⟩ for all x ∈ S }.
If ∂f(x₀) ≠ ∅, then f is said to be subdifferentiable at x₀.

Example: Let f(x) = ‖x‖ on ℝⁿ. Then ∂f(0) = { ξ ∈ ℝⁿ : ‖ξ‖ ≤ 1 }.

Theorem 19: ∂f(x₀) is a convex set.

Note: A subgradient at a point may not be unique. We will show that the subgradient is unique at a point where f is differentiable.
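A small numerical check of the example above (a sketch, not from the slides): for f(x) = ‖x‖, any vector ξ with ‖ξ‖ ≤ 1 satisfies the subgradient inequality at x₀ = 0, i.e. ‖x‖ ≥ ⟨ξ, x⟩ for all x, which is just the Cauchy–Schwarz inequality.

```python
import numpy as np

rng = np.random.default_rng(5)

for _ in range(1000):
    xi = rng.standard_normal(4)
    xi /= max(1.0, np.linalg.norm(xi))     # force ||xi|| <= 1
    x = rng.standard_normal(4)
    assert np.linalg.norm(x) >= xi @ x - 1e-12
print("subgradient inequality held at all sampled points")
```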

Theorem 20: Let S ⊆ V be a convex set and f : S → ℝ. If f is subdifferentiable on int(S), then f is convex.

Theorem 21: Let S ⊆ V be convex and f : S → ℝ be a convex functional. Then ∂f(x₀) ≠ ∅ at every x₀ ∈ int(S).

Corollary 22: Let S ⊆ V be convex and open, and f : S → ℝ be a functional. Then f is convex iff ∂f(x₀) ≠ ∅ at each x₀ ∈ S.

Theorem 23: Let S ⊆ V be convex and f : S → ℝ be a convex functional. Then x₀ ∈ S minimizes f on S iff there exists ξ ∈ ∂f(x₀) such that ⟨ξ, x − x₀⟩ ≥ 0 for all x ∈ S.

Corollary 24: Let S ⊆ V be convex and f : S → ℝ be a convex functional. Then x₀ ∈ int(S) minimizes f on S iff 0 ∈ ∂f(x₀).

3.7: Subgradient Optimization Method

Let S be a convex set and consider the convex programming problem
(P) min { f(x) : x ∈ S },
where f : S → ℝ is convex but not necessarily differentiable.

The subgradient method for solving (P):
Step 1: Start with an initial point x₀ ∈ S.
Step 2: At the current iterate xₖ, find a subgradient ξₖ of f at xₖ.
Step 3: If ξₖ = 0, STOP (xₖ is an optimal solution). Otherwise, set dₖ = −ξₖ/‖ξₖ‖ and let y = xₖ + λₖdₖ, where λₖ > 0 is a suitable step length. The next iterate is xₖ₊₁ = y if y ∈ S; else xₖ₊₁ = P_S(y), where P_S(y) ∈ S is the best approximation (projection) of y in S.
Step 4: Repeat Steps 2 and 3 until a stopping condition holds.
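A minimal sketch of the method on assumed problem data (not from the slides): minimize the non-differentiable convex function f(x) = ‖x − c‖₁ over the unit ball S = { x : ‖x‖ ≤ 1 }, using sign(x − c) as a subgradient, a diminishing step length λₖ = 1/k, and projection onto S. The helper name subgradient_method is for illustration only.

```python
import numpy as np

def subgradient_method(c, n_iter=500):
    """Minimize f(x) = ||x - c||_1 over S = { x : ||x|| <= 1 } (illustrative sketch)."""
    f = lambda x: np.sum(np.abs(x - c))
    subgrad = lambda x: np.sign(x - c)            # a subgradient of the l1 objective
    project = lambda y: y if np.linalg.norm(y) <= 1 else y / np.linalg.norm(y)

    x = np.zeros_like(c)                          # x0 is in S
    best_x, best_f = x, f(x)
    for k in range(1, n_iter + 1):
        g = subgrad(x)
        if np.linalg.norm(g) == 0:                # 0 is a subgradient: x is optimal
            break
        d = -g / np.linalg.norm(g)                # normalized step direction
        x = project(x + (1.0 / k) * d)            # diminishing step, then project onto S
        if f(x) < best_f:                         # subgradient steps need not descend,
            best_x, best_f = x, f(x)              # so keep the best iterate seen
    return best_x, best_f

x_star, f_star = subgradient_method(np.array([2.0, 0.0]))
print(x_star, f_star)     # expected to approach x = (1, 0) with f = 1
```

Since the step direction need not be a descent direction (see the note below), the sketch tracks the best objective value seen so far rather than relying on monotone decrease.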

Notes:
1. For the subgradient method to be practical, there should be a tractable way to identify a subgradient of f at every iterate and to perform the projection operation P_S(y). This depends on the specific problem.
2. The step direction at each iterate xₖ is dₖ = −ξₖ/‖ξₖ‖, where ξₖ ∈ ∂f(xₖ). This direction need not be a descent direction. However, if x* is an optimal solution of (P) and λₖ > 0 is small, then ‖xₖ₊₁ − x*‖ < ‖xₖ − x*‖ for each k; i.e., xₖ₊₁ is closer to x* than xₖ is. This holds because, for every non-optimal xₖ, we have ⟨−ξₖ, x* − xₖ⟩ > 0, since f(x*) ≥ f(xₖ) + ⟨ξₖ, x* − xₖ⟩ and f(x*) < f(xₖ).
