CSE 203B: Convex Optimization Week 4 Discuss Session

Slides:

Advertisements

Similar presentations

3.2 Inverse Functions and Logarithms 3.3 Derivatives of Logarithmic and Exponential functions.

Advertisements

If R = {(x,y)| y = 3x + 2}, then R -1 = (1) x = 3y + 2 (2) y = (x – 2)/3 (3) {(x,y)| y = 3x + 2} (4) {(x,y)| y = (x – 2)/3} (5) {(x,y)| y – 2 = 3x} (6)

Computational Intelligence Winter Term 2014/15 Prof. Dr. Günter Rudolph Lehrstuhl für Algorithm Engineering (LS 11) Fakultät für Informatik TU Dortmund.

Primitive Recursive Functions (Chapter 3)

Approximation Algorithms Chapter 14: Rounding Applied to Set Cover.

Tangent lines Recall: tangent line is the limit of secant line The tangent line to the curve y=f(x) at the point P(a,f(a)) is the line through P with slope.

How Bad is Selfish Routing? By Tim Roughgarden Eva Tardos Presented by Alex Kogan.

Copyright © 2013, 2009, 2005 Pearson Education, Inc.

LIAL HORNSBY SCHNEIDER

+ Convex Functions, Convex Sets and Quadratic Programs Sivaraman Balakrishnan.

A KTEC Center of Excellence 1 Convex Optimization: Part 1 of Chapter 7 Discussion Presenter: Brian Quanz.

SE- 521: Nonlinear Programming and Applications S. O. Duffuaa.

What does say about f ? Increasing/decreasing test

1 Outline  multi-period stochastic demand  base-stock policy  convexity.

Convex Sets (chapter 2 of Convex programming) Keyur Desai Advanced Machine Learning Seminar Michigan State University.

Exploiting Duality (Particularly the dual of SVM) M. Pawan Kumar VISUAL GEOMETRY GROUP.

Unconstrained Optimization Problem

Theorem 1-a Let be a sequence: L L a) converges to a real number L iff every subsequence of converge to L Illustrations 0 0.

Easy Optimization Problems, Relaxation, Local Processing for a single variable.

Optimality Conditions for Nonlinear Optimization Ashish Goel Department of Management Science and Engineering Stanford University Stanford, CA 94305, U.S.A.

CSE 245: Computer Aided Circuit Simulation and Verification

Collaborative Filtering Matrix Factorization Approach

Chapter 9 Theory of Differential and Integral Calculus Section 1 Differentiability of a Function Definition. Let y = f(x) be a function. The derivative.

Introduction to Real Analysis Dr. Weihu Hong Clayton State University 8/21/2008.

Section 5.1 The Natural Log Function: Differentiation

Sec 15.6 Directional Derivatives and the Gradient Vector

Non-Bayes classifiers. Linear discriminants, neural networks.

CSE 245: Computer Aided Circuit Simulation and Verification Matrix Computations: Iterative Methods I Chung-Kuan Cheng.

OPERATIONS ON FUNCTIONS. Definition. Sum, Difference, Product, Quotient and Composite Functions Let f and g be functions of the variable x. 1. The sum.

CPSC 536N Sparse Approximations Winter 2013 Lecture 1 N. Harvey TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAAAAAAAAA.

Lecture Note 2 – Calculus and Probability Shuaiqiang Wang Department of CS & IS University of Jyväskylä

Introduction to Optimization

Generalization Error of pac Model  Let be a set of training examples chosen i.i.d. according to  Treat the generalization error as a r.v. depending on.

Summary of the Last Lecture This is our second lecture. In our first lecture, we discussed The vector spaces briefly and proved some basic inequalities.

Network Systems Lab. Korea Advanced Institute of Science and Technology No.1 Maximum Norms & Nonnegative Matrices  Weighted maximum norm e.g.) x1x1 x2x2.

Operations of Functions Given two functions  and g, then for all values of x for which both  (x) and g (x) are defined, the functions  + g,

Regularized Least-Squares and Convex Optimization.

Learning Goals: 3.9: Derivatives of Exponential & Logarithmic Functions Derivatives of exponential and logarithmic functions. Logarithmic differentiation.

Chapter 3 The Real Numbers.

Markov Chains and Mixing Times

What does say about f ? Increasing/decreasing test

The Quotient Rule The Quotient Rule is used to find derivatives for functions written as a fraction:

Computational Optimization

Convex functions Lecture 4

Chapter 8 Infinite Series.

Aim: How do we determine if a function is differential at a point?

Chapter 1. Introduction Mathematical Programming (Optimization) Problem: min/max

Chapter P Prerequisites. Chapter P Prerequisites.

CSE 245: Computer Aided Circuit Simulation and Verification

Unit 6 – Fundamentals of Calculus Section 6

Collaborative Filtering Matrix Factorization Approach

CS5321 Numerical Optimization

The Curve Merger (Dvir & Widgerson, 2008)

CSE 245: Computer Aided Circuit Simulation and Verification

Aviv Rosenberg 10/01/18 Seminar on Experts and Bandits

Techniques Of Differentiation

Sequences and Series of Functions

Composition OF Functions.

Composition OF Functions.

AREA Section 4.2.

Chapter 5 Limits and Continuity.

Checking the Consistency of Local Density Matrices

Combinations of Functions

AREA Section 4.2.

Extreme values of functions

8/7/2019 Berhanu G (Dr) 1 Chapter 3 Convex Functions and Separation Theorems In this chapter we focus mainly on Convex functions and their properties in.

Chapter 4 Graphing and Optimization

Derivatives of Logarithmic and Exponential functions

CSE 203B: Convex Optimization Week 2 Discuss Session

Presentation transcript:

CSE 203B: Convex Optimization Week 4 Discuss Session Xinyuan Wang 01/30/2019

Contents Convex functions (Ref. Chap.3) Review: definition, first order condition, second order condition, operations that preserve convexity Epigraph Conjugate function

Review of Convex Function Definition ( often simplified by restricting to a line) First order condition: a global underestimator Second order condition Operations that preserve convexity Nonnegative weighted sum Composition with affine function Pointwise maximum and supremum Composition Minimization Perspective See Chap 3.2, try to prove why the convexity is preserved with those operations.

Epigraph 𝛼-sublevel set of 𝑓: 𝑅 𝑛 →𝑅 𝐶 𝛼 = 𝑥∈𝑑𝑜𝑚 𝑓 𝑓 𝑥 ≤𝛼} 𝐶 𝛼 = 𝑥∈𝑑𝑜𝑚 𝑓 𝑓 𝑥 ≤𝛼} sublevel sets of a convex function are convex for any value of 𝛼. Epigraph of 𝑓: 𝑅 𝑛 →𝑅 is defined as 𝐞𝐩𝐢 𝑓= (𝑥,𝑡) 𝑥∈𝑑𝑜𝑚 𝑓,𝑓 𝑥 ≤𝑡}⊆ 𝑅 𝑛+1 A function is convex iff its epigraph is a convex set.

Relation between convex sets and convex functions A function is convex iff its epigraph is a convex set. Consider a convex function 𝑓 and 𝑥, 𝑦∈𝑑𝑜𝑚 𝑓 𝑡≥𝑓 𝑦 ≥𝑓 𝑥 +𝛻𝑓 𝑥 𝑇 (𝑦−𝑥) The hyperplane supports epi 𝒇 at (𝑥, 𝑓 𝑥 ), for any 𝑦,𝑡 ∈𝐞𝐩𝐢 𝑓⇒ 𝛻𝑓 𝑥 𝑇 𝑦−𝑥 +𝑓 𝑥 −𝑡≤0 ⇒ 𝛻𝑓 𝑥 −1 𝑇 𝑦 𝑡 − 𝑥 𝑓 𝑥 ≤0 First order condition for convexity epi 𝒇 𝑡 𝑥 Supporting hyperplane, derived from first order condition

Pointwise Supremum If for each 𝑦∈𝑈, 𝑓(𝑥, 𝑦): 𝑅 𝑛 →𝑅 is convex in 𝑥, then function 𝑔(𝑥)= sup 𝑦∈𝑈 𝑓(𝑥,𝑦) is convex in 𝑥. Epigraph of 𝑔 𝑥 is the intersection of epigraphs with 𝑓 and set 𝑈 𝐞𝐩𝐢 𝑔= ∩ 𝑦∈𝑈 𝐞𝐩𝐢 𝑓(∙ ,𝑦) knowing 𝐞𝐩𝐢 𝑓(∙ ,𝑦) is a convex set (𝑓 is a convex function in 𝑥 and regard 𝑦 as a const), so 𝐞𝐩𝐢 𝑔 is convex. An interesting method to establish convexity of a function: the pointwise supremum of a family of affine functions. (ref. Chap 3.2.4)

Conjugate Function Given function 𝑓: 𝑅 𝑛 →𝑅 , the conjugate function 𝑓 ∗ 𝑦 = sup 𝑥∈𝑑𝑜𝑚 𝑓 𝑦 𝑇 𝑥 −𝑓(𝑥) The dom 𝑓 ∗ consists 𝑦∈ 𝑅 𝑛 for which sup 𝑥∈𝑑𝑜𝑚 𝑓 𝑦 𝑇 𝑥 −𝑓(𝑥) is bounded Theorem: 𝑓 ∗ (𝑦) is convex even 𝑓 𝑥 is not convex. Supporting hyperplane: if 𝑓 is convex in that domain Slope 𝑦=−1 ( 𝑥 , 𝑓( 𝑥 )) where 𝛻 𝑓 𝑇 𝑥 =𝑦 if 𝑓 is differentiable Slope 𝑦=0 (0, − 𝑓 ∗ (0)) (Proof with pointwise supremum) 𝒚 𝑻 𝒙 −𝒇(𝒙) is affine function in 𝒚

Examples of Conjugates Derive the conjugates of 𝑓:𝑅→𝑅 𝑓 ∗ 𝑦 = sup 𝑥∈𝑑𝑜𝑚 𝑓 𝑦𝑥 −𝑓(𝑥) Affine quadratic Norm See the extension to domain 𝑹 𝒏 in Example 3.21

Examples of Conjugates Log-sum-exp: 𝑓 𝑥 = log ( Σ 𝑖=1 n 𝑒 𝑥 𝑖 ) , 𝑥∈ 𝑅 𝑛 . The conjugate is 𝑓 ∗ 𝑦 = sup 𝑥∈𝑑𝑜𝑚 𝑓 𝑦 𝑇 𝑥 −𝑓(𝑥) Since 𝑓(𝑥) is differentiable, first calculate the gradient of 𝑔 𝑥 = 𝑦 𝑇 𝑥 −𝑓(𝑥) 𝛻𝑔 𝑥 =𝑦− 𝑍 𝟏 𝑇 𝑍 where 𝑍= 𝑒 𝑥 1 , ⋯ 𝑒 𝑥 𝑛 𝑇 . The maximum of 𝑔 𝑥 is attained at 𝛻𝑔 𝑥 =0, so that 𝑦= 𝑍 𝟏 𝑇 𝑍 . 𝑦 𝑖 = 𝑒 𝑥 𝑖 Σ 𝑖=1 n 𝑒 𝑥 𝑖 , 𝑖=1,…, 𝑛 Does the bounded supremum (not →∞) exist for all 𝒚∈ 𝑹 𝒏 ?

Examples of Conjugates Does the bounded supremum (not →∞) exist for all 𝒚∈ 𝑹 𝒏 ? If 𝑦 𝑖 <0, then the gradient on 𝛻 𝑖 𝑔= 𝑦 𝑖 − 𝑒 𝑥 𝑖 Σ 𝑖=1 n 𝑒 𝑥 𝑖 <0 for all 𝑥 𝑖 ∈𝑅. Then 𝑔 𝑥 reaches maximum at ∞ at 𝑥→−∞. So 𝑦≽0. To have 𝑦= 𝑍 1 𝑇 𝑍 solvable, we notice that 𝟏 𝑇 𝑦= 𝟏 𝑇 𝑍 1 𝑇 𝑍 =1. If 𝟏 𝑇 𝑦≠1 and 𝑦≽0, we choose 𝑥=𝑡𝟏 s.t. 𝑔 𝑥 = 𝑦 𝑇 𝑥 −𝑓 𝑥 =𝑡 𝑦 𝑇 𝟏− log 𝑛 𝑒 𝑡 =𝑡 𝑦 𝑇 𝟏−1 − log 𝑛 𝑔 𝑥 reaches maximum at ∞ when we let 𝑡→∞(−∞), which is not bounded. The conjugate function 𝑓 ∗ 𝑦 = Σ 𝑖=1 n 𝑦 𝑖 log 𝑦 𝑖 if 𝟏 𝑇 𝑦=1 and 𝑦≽0 ∞ otherwise