Nonlinear Programming Models


Nonlinear Programming Models In LP, the objective function and constraints are linear and the problems are “easy” to solve. Most real-world problems have nonlinear elements and are hard to solve.

General NLP: Minimize f(x) s.t. gi(x) (≤, ≥, =) bi, i = 1,…,m, where x is the n-dimensional vector of decision variables, f(x) is the objective function, the gi(x) are the constraint functions, and the bi are fixed, known constants.
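To make the general form concrete, here is a minimal sketch of how such a problem is handed to a numerical solver. The objective, constraint, and starting point below are made-up illustrations, not data from the lecture; only scipy.optimize.minimize and its SLSQP method are assumed.

```python
# Minimal NLP sketch: min f(x) s.t. g(x) <= b, via SciPy's SLSQP solver.
# The objective, constraint, and starting point are made-up illustrations.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 1)**2 + (x[1] + 2)**2           # nonlinear objective f(x)
# SciPy's 'ineq' convention is fun(x) >= 0, so g(x) <= b is written b - g(x) >= 0.
cons = [{"type": "ineq", "fun": lambda x: 4 - (x[0]**2 + x[1]**2)}]

res = minimize(f, x0=np.zeros(2), method="SLSQP", constraints=cons)
print(res.x, res.fun)    # minimizer lands on the boundary circle x1^2 + x2^2 = 4
```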

Example 1: Max 3x1 + 2x2² s.t. x1 + x2 ≤ 1, x1 ≥ 0, x2 unrestricted. Example 2: Max e^(c1x1) e^(c2x2) ⋯ e^(cnxn) s.t. Ax = b, x ≥ 0. Example 3 (problems with “decreasing efficiencies”): Min Σ(j=1 to n) fj(xj) s.t. Ax = b, x ≥ 0, where each fj(xj) is convex (each additional unit of xj costs more; the shape of fj was shown in a figure on the slide). Examples 2 and 3 can be reformulated as LPs.
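Why Example 2 reduces to an LP: the objective equals exp(c1x1 + ⋯ + cnxn), and exp is monotone increasing, so maximizing it is equivalent to maximizing the linear function c1x1 + ⋯ + cnxn over the same feasible set. A sketch with made-up data, assuming scipy.optimize.linprog:

```python
# Example 2 as an LP: since exp is monotone increasing, maximizing
# exp(c1*x1)*...*exp(cn*xn) = exp(c.x) is equivalent to maximizing c.x.
# All data below are made up for illustration.
import numpy as np
from scipy.optimize import linprog

c = np.array([3.0, 2.0])                              # made-up exponent coefficients
A_eq, b_eq = np.array([[1.0, 1.0]]), np.array([4.0])  # made-up system Ax = b
res = linprog(-c, A_eq=A_eq, b_eq=b_eq)               # linprog minimizes; default bounds give x >= 0
print(res.x, np.exp(c @ res.x))                       # LP maximizer and original objective value
```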

NLP Graphical Solution Method Max f(x1, x2) = x1x2 s.t. 4x1 + x2 ≤ 8, x1, x2 ≥ 0. [Figure: the feasible region with objective contours f(x) = 1 and f(x) = 2.] The optimal solution will lie on the line g(x) = 4x1 + x2 – 8 = 0.

Solution Characteristics The gradient of f is ∇f(x1, x2) = (∂f/∂x1, ∂f/∂x2)T. This gives ∂f/∂x1 = x2, ∂f/∂x2 = x1, and ∂g/∂x1 = 4, ∂g/∂x2 = 1. At optimality we have ∇f(x1, x2) = λ∇g(x1, x2), which yields x2* = 4 and x1* = 1. The solution is not a vertex of the feasible region. For this particular problem the solution is on the boundary of the feasible region; this is not always the case.
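A quick numerical check of the condition ∇f = λ∇g at the reported optimum, using only the problem data above:

```python
# Verify grad f = lambda * grad g at (x1, x2) = (1, 4) for
# max x1*x2 s.t. 4*x1 + x2 = 8 (the constraint binds at the optimum).
import numpy as np

x1, x2 = 1.0, 4.0
grad_f = np.array([x2, x1])       # (df/dx1, df/dx2) = (x2, x1)
grad_g = np.array([4.0, 1.0])     # gradient of g(x) = 4*x1 + x2 - 8
lam = grad_f[0] / grad_g[0]       # solve the first component for lambda
print(lam, np.allclose(grad_f, lam * grad_g))   # lambda = 1.0, condition holds
```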

Nonconvex Function [Figure: a function f(x) with a local min, a global max, and a stationary point marked.] Let S ⊆ Rn be the set of feasible solutions to an NLP. Definition: A global minimum is any x0 ∈ S such that f(x0) ≤ f(x) for all feasible x not equal to x0.

Function with Unique Global Minimum at x = (1, –3). What is the optimal solution if x1 ≥ 0 and x2 ≥ 0?

Function with Multiple Maxima and Minima: Min {f(x) = sin(x) : 0 ≤ x ≤ 5π}

Constrained Function with Unique Global Maximum and Unique Global Minimum

Convexity Convex function: if you draw a straight line between any two points on the graph of f(x), the line will lie on or above the graph. Concave function: if f(x) is convex, then –f(x) is concave. Convexity for univariate f: d²f(x)/dx² ≥ 0 for all x. Linear functions are both convex and concave.

Definition of Convexity Let x1 and x2 be two points in S ⊆ Rn. A function f(x) is convex if and only if f(λx1 + (1–λ)x2) ≤ λf(x1) + (1–λ)f(x2) for all 0 < λ < 1. It is strictly convex if the inequality ≤ is replaced with <. [Figure: a 1-dimensional example.]
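The chord inequality can be probed numerically: one sampled violation certifies that a function is not convex, while passing every trial is only supporting evidence, not a proof. A sketch:

```python
# Randomized chord test: one violation of
# f(l*x1 + (1-l)*x2) <= l*f(x1) + (1-l)*f(x2) certifies nonconvexity;
# passing every trial is only evidence of convexity, not a proof.
import numpy as np

def chord_test(f, dim, trials=10_000, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x1, x2 = rng.uniform(-5, 5, dim), rng.uniform(-5, 5, dim)
        lam = rng.uniform()
        if f(lam*x1 + (1-lam)*x2) > lam*f(x1) + (1-lam)*f(x2) + 1e-9:
            return False              # found a chord below the function
    return True                       # no violation found

print(chord_test(lambda x: float(x @ x), 2))         # x1^2 + x2^2: convex -> True
print(chord_test(lambda x: float(np.sin(x[0])), 1))  # sin(x): nonconvex -> False
```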

Nonconvex, Nonconcave Function [Figure: the graph of such a function f(x).]

Theoretical Result for Convex Functions A positively weighted sum of convex functions is convex: if fk(x), k = 1,…,m, are convex and α1,…,αm ≥ 0, then f(x) = Σ(k=1 to m) αkfk(x) is convex. Hessian of f at x: H(x) = [∂²f/∂xi∂xj]. Example: f(x) = 2x1³ + 3x2² – 4x1²x2 + 5x1 – 8 has H(x) = [12x1 – 8x2, –8x1; –8x1, 6].

Determining Convexity Single-dimensional functions: A function f(x) ∈ C1 is convex if and only if it is underestimated by linear extrapolation; i.e., f(x2) ≥ f(x1) + (df(x1)/dx)(x2 – x1) for all x1 and x2. [Figure: the tangent line at x1 lying below f(x).] A function f(x) ∈ C2 is convex if and only if its second derivative is nonnegative: d²f(x)/dx² ≥ 0 for all x. If the inequality is strict (>), the function is strictly convex.

Multiple Dimensional Functions Definition: The Hessian matrix H(x) associated with f(x) is the n × n symmetric matrix of second partial derivatives of f(x) with respect to the components of x. When f(x) is quadratic, H(x) has only constant entries; when f(x) is linear, H(x) is the zero matrix. Example: f(x) = 3(x1)² + 4(x2)³ – 5x1x2 + 4x1 has H(x) = [6, –5; –5, 24x2].

Properties of the Hessian How can we use the Hessian to determine whether or not f(x) is convex? H(x) is positive semi-definite (PSD) if and only if xTHx ≥ 0 for all x (and, distinguishing PSD from PD, there exists an x ≠ 0 such that xTHx = 0). H(x) is positive definite (PD) if and only if xTHx > 0 for all x ≠ 0. H(x) is indefinite if and only if xTHx > 0 for some x and xTHx < 0 for some other x.
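For a symmetric matrix these conditions correspond to the signs of the eigenvalues (all positive: PD; all nonnegative: PSD; and so on). A sketch of an eigenvalue-based classifier, using the Hessians of Examples 1 and 2 below as test cases:

```python
# Classify a symmetric matrix by eigenvalue signs:
# all > 0: PD; all >= 0: PSD; all < 0: ND; all <= 0: NSD; mixed: indefinite.
import numpy as np

def classify(H, tol=1e-10):
    w = np.linalg.eigvalsh(H)         # eigenvalues of a symmetric matrix
    if np.all(w > tol):   return "PD"
    if np.all(w >= -tol): return "PSD"
    if np.all(w < -tol):  return "ND"
    if np.all(w <= tol):  return "NSD"
    return "indefinite"

print(classify(np.array([[2.0, 3.0], [3.0, 6.0]])))     # PD  (Example 1 below)
print(classify(np.array([[18.0, 24.0], [24.0, 32.0]]))) # PSD (Example 2 below)
```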

Multiple Dimensional Functions and Convexity f(x) is convex if and only if f(x2) ≥ f(x1) + ∇Tf(x1)(x2 – x1) for all x1 and x2. f(x) is convex (strictly convex) if its associated Hessian matrix H(x) is positive semi-definite (positive definite) for all x. f(x) is concave if and only if f(x2) ≤ f(x1) + ∇Tf(x1)(x2 – x1) for all x1 and x2. f(x) is concave (strictly concave) if its associated Hessian matrix H(x) is negative semi-definite (negative definite) for all x. f(x) is neither convex nor concave if its associated Hessian matrix H(x) is indefinite.

Testing for Definiteness Let H = [∂²f/∂xi∂xj] be the Hessian. Definition: The ith leading principal submatrix of H is the matrix formed by taking the intersection of its first i rows and first i columns. Let Hi be the value of the corresponding determinant (the ith leading principal minor).

Definition The kth-order principal submatrices of an n × n symmetric matrix A are the k × k matrices obtained by deleting n – k rows and the corresponding n – k columns of A (where k = 1,…,n). For example, a 3 × 3 matrix has three 2nd-order principal submatrices, obtained by deleting row and column 1, 2, or 3.

Rules for Definiteness H is positive definite if and only if the determinants of all the leading principal submatrices are positive; i.e., Hi > 0 for i = 1,…,n. H is negative definite if and only if H1 < 0 and the remaining leading principal determinants alternate in sign: H2 > 0, H3 < 0, H4 > 0, … H is positive semi-definite if and only if all principal submatrices (not just the leading ones) have nonnegative determinants. H is negative semi-definite if and only if every odd-order principal minor is ≤ 0 and every even-order principal minor is ≥ 0.
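A sketch of the leading-principal-minor tests for PD and ND (recall from the rules above that the semi-definite cases need all principal minors, not just the leading ones):

```python
# Leading-principal-minor tests for PD and ND. The semi-definite cases
# need all principal minors, not just the leading ones (see the rules above).
import numpy as np

def leading_minors(H):
    return [np.linalg.det(H[:i, :i]) for i in range(1, H.shape[0] + 1)]

def is_pd(H):                         # H1 > 0, H2 > 0, ..., Hn > 0
    return all(d > 0 for d in leading_minors(H))

def is_nd(H):                         # H1 < 0, H2 > 0, H3 < 0, ...
    return all(d < 0 if i % 2 == 0 else d > 0
               for i, d in enumerate(leading_minors(H)))

H = np.array([[2.0, 3.0], [3.0, 6.0]])
print(leading_minors(H), is_pd(H), is_nd(-H))   # [2.0, 3.0] True True
```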

Quadratic Functions Example 1: f(x) = 3x1x2 + x1² + 3x2², so H = [2, 3; 3, 6], H1 = 2 and H2 = 12 – 9 = 3. Conclusion: f(x) is strictly convex because H(x) is positive definite.

Quadratic Functions Example 2: f(x) = 24x1x2 + 9x1² + 16x2², so H = [18, 24; 24, 32], H1 = 18 and H2 = 576 – 576 = 0 → H is not PD. H is positive semi-definite (the determinants of all principal submatrices are nonnegative) → f(x) is convex. Note that xTHx = 18(x1 + (4/3)x2)² ≥ 0.

Nonquadratic Functions Example 3: f(x) = (x2 – x1²)² + (1 – x1)², with Hessian H(x) = [12x1² – 4x2 + 2, –4x1; –4x1, 2]. Thus the Hessian depends on the point under consideration. At x = (1, 1), H = [10, –4; –4, 2], which is positive definite. At x = (0, 1), H = [–2, 0; 0, 2], which is indefinite. Thus f(x) is not convex, although it is strictly convex near (1, 1).
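The point-dependence in Example 3 is easy to reproduce symbolically; a sketch assuming SymPy:

```python
# Reproduce Example 3's point-dependent Hessian with SymPy.
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
f = (x2 - x1**2)**2 + (1 - x1)**2
H = sp.hessian(f, (x1, x2))
print(H.subs({x1: 1, x2: 1}))   # Matrix([[10, -4], [-4, 2]])  -> positive definite
print(H.subs({x1: 0, x2: 1}))   # Matrix([[-2, 0], [0, 2]])    -> indefinite
```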

Example: Is the matrix A shown on the slide PD, PSD, ND, NSD, or indefinite?

Convex Sets Definition: A set S ⊆ Rn is convex if any point on the line segment connecting any two points x1, x2 ∈ S is also in S. Mathematically, this is equivalent to x0 = λx1 + (1–λ)x2 ∈ S for all λ such that 0 ≤ λ ≤ 1. [Figure: examples of convex and nonconvex sets.]

(Nonconvex) Feasible Region S = {(x1, x2) : (0.5x1 – 0.6)x2 ≤ 1; 2(x1)² + 3(x2)² ≥ 27; x1, x2 ≥ 0}

Convex Sets and Optimization Let S = {x ∈ Rn : gi(x) ≤ bi, i = 1,…,m}. Fact: If gi(x) is a convex function for each i = 1,…,m, then S is a convex set. Convex Programming Theorem: Let x ∈ Rn and let f(x) be a convex function defined over a convex constraint set S. If a finite solution exists to the problem Minimize {f(x) : x ∈ S}, then all local optima are global optima. If f(x) is strictly convex, the optimum is unique.

Note Let s = {x ∈ Rn : g(x) ≤ b}. Fact: If g(x) is a convex function, then s is a convex set. Let S = {x ∈ Rn : gi(x) ≤ bi, i = 1,…,m}. Fact: If each gi(x) is a convex function, then S is a convex set. Let t = {x ∈ Rn : g(x) ≥ b}. Fact: If g(x) is a concave function, then t is a convex set. Let T = {x ∈ Rn : gi(x) ≥ bi, i = 1,…,m}. Fact: If each gi(x) is a concave function, then T is a convex set.

Convex Programming Min f(x1,…,xn) s.t. gi(x1,…,xn) ≤ bi, i = 1,…,m is a convex program if f is convex and each gi is convex. Max f(x1,…,xn) s.t. gi(x1,…,xn) ≤ bi, i = 1,…,m, x1 ≥ 0,…,xn ≥ 0 is a convex program if f is concave and each gi is convex.

Linearly Constrained Convex Function with Unique Global Maximum Maximize f(x) = (x1 – 2)² + (x2 – 2)² subject to –3x1 – 2x2 ≤ –6, –x1 + x2 ≤ 3, x1 + x2 ≤ 7, 2x1 – 3x2 ≤ 4

(Nonconvex) Optimization Problem

Importance of Convex Programs Commercial optimization software cannot guarantee that a solution is globally optimal for a nonconvex program. NLP algorithms try to find a point where the gradient of the Lagrangian function is zero – a stationary point – and complementary slackness holds. Given L(x, μ) = f(x) + μ(g(x) – b), we want ∇xL(x, μ) = 0, g(x) – b ≤ 0, μ[g(x) – b] = 0, x ≥ 0, μ ≥ 0. For a convex program, however, all local solutions are global optima.
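As a sketch, these conditions can be checked numerically at a candidate point. Below, the earlier graphical example is rewritten in the ≤ form used here (min –x1x2 s.t. 4x1 + x2 – 8 ≤ 0) and the conditions are verified at x* = (1, 4) with multiplier μ* = 1:

```python
# Check the KKT conditions at x* = (1, 4), mu* = 1 for the earlier example
# rewritten in <= form: min -x1*x2 s.t. g(x) = 4*x1 + x2 - 8 <= 0.
import numpy as np

x, mu = np.array([1.0, 4.0]), 1.0
grad_f = np.array([-x[1], -x[0]])     # gradient of -x1*x2
grad_g = np.array([4.0, 1.0])         # gradient of g
g = 4*x[0] + x[1] - 8

stationary = np.allclose(grad_f + mu * grad_g, 0.0)   # grad_x L(x, mu) = 0
feasible   = g <= 1e-9                                # g(x) - b <= 0
comp_slack = abs(mu * g) <= 1e-9                      # mu * [g(x) - b] = 0
print(stationary, feasible, comp_slack)               # True True True
```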

Example: Cylinder Design We want to build a cylinder (with a top and a bottom) of maximum volume such that its surface area is no more than S units. Max V(r, h) = πr²h s.t. 2πr² + 2πrh = S, r ≥ 0, h ≥ 0. [Figure: a cylinder with radius r and height h.] There are a number of ways to approach this problem. One way is to solve the surface area constraint for h and substitute the result into the objective function.

Solution by Substitution h = (S – 2πr²)/(2πr), so Volume = V = πr²[(S – 2πr²)/(2πr)] = rS/2 – πr³. Setting dV/dr = S/2 – 3πr² = 0 gives r = (S/6π)^(1/2), h = S/(2πr) – r = 2(S/6π)^(1/2), and V = πr²h = 2π(S/6π)^(3/2). Is this a global optimal solution?

Test for Convexity V(r) = rS/2 – πr³ ⇒ dV(r)/dr = S/2 – 3πr² ⇒ d²V(r)/dr² = –6πr ≤ 0 for all r ≥ 0. Thus V(r) is concave on r ≥ 0, so the solution is a global maximum.
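A numeric cross-check of the closed-form answer against a direct solve (a sketch; the value S = 100 is an arbitrary choice):

```python
# Cross-check the closed-form cylinder solution against a direct solve.
# S = 100 is an arbitrary choice for illustration.
import numpy as np
from scipy.optimize import minimize

S = 100.0
r_star = np.sqrt(S / (6 * np.pi))     # r = (S/6pi)^(1/2) from the slides
h_star = 2 * r_star                   # h = 2r

res = minimize(lambda v: -np.pi * v[0]**2 * v[1],     # maximize volume
               x0=[1.0, 1.0], method="SLSQP",
               bounds=[(0, None), (0, None)],
               constraints=[{"type": "eq",
                             "fun": lambda v: 2*np.pi*v[0]**2 + 2*np.pi*v[0]*v[1] - S}])
print(res.x, (r_star, h_star))        # the two answers should agree
```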

Advertising (with Diminishing Returns) A company wants to advertise in two regions. The marketing department says that if $x1 is spent in region 1, sales volume will be 6(x1)^(1/2); if $x2 is spent in region 2, sales volume will be 4(x2)^(1/2). The advertising budget is $100. Model: Max f(x) = 6(x1)^(1/2) + 4(x2)^(1/2) s.t. x1 + x2 ≤ 100, x1 ≥ 0, x2 ≥ 0. Solution: x1* = 69.2, x2* = 30.8, f(x*) = 72.1. Is this a global optimum?
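It is: the objective is a positively weighted sum of concave square-root functions, hence concave, and the constraints are linear, so this is a convex program and the local maximum is global. The reported numbers are easy to reproduce; a sketch with SciPy:

```python
# Reproduce the advertising solution numerically with SciPy's SLSQP.
import numpy as np
from scipy.optimize import minimize

def neg_f(x):                          # maximize f => minimize -f
    x = np.maximum(x, 0.0)             # guard sqrt against tiny negative iterates
    return -(6*np.sqrt(x[0]) + 4*np.sqrt(x[1]))

res = minimize(neg_f, x0=[50.0, 50.0], method="SLSQP",
               bounds=[(0, None), (0, None)],
               constraints=[{"type": "ineq", "fun": lambda x: 100 - x[0] - x[1]}])
print(res.x, -res.fun)                 # roughly (69.2, 30.8) and 72.1
```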

Excel Add-in Solution

Portfolio Selection with Risky Assets (Markowitz) Suppose that we may invest in (up to) n stocks. Investors worry about (1) expected gain and (2) risk. Let μj = expected return of stock j and σjj = the variance of its return. We are also concerned with the covariance terms σij = cov(ri, rj). If σij > 0, then the returns on i and j are positively correlated; if σij < 0, the returns are negatively correlated.

Decision Variables: xj = number of shares of stock j purchased. Expected return of the portfolio: R(x) = Σ(j=1 to n) μjxj. Variance (measure of risk): V(x) = Σ(i=1 to n) Σ(j=1 to n) σijxixj. Example: with σ11 = σ22 = 2 and σ12 = σ21 = –2, if x1 = x2 = 1 we get V(x) = σ11x1x1 + σ12x1x2 + σ21x2x1 + σ22x2x2 = 2 + (–2) + (–2) + 2 = 0. Thus we can construct a “risk-free” portfolio (from a variance point of view) if we can find stocks that are “fully” negatively correlated.
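In matrix form, R(x) = μᵀx and V(x) = xᵀΣx. A sketch of the example computation; the covariance entries are the ones implied by the slide's arithmetic, while the expected returns are made up:

```python
# R(x) = mu.x and V(x) = x' Sigma x. Covariances are those implied by the
# slide's arithmetic; the expected returns are made up for illustration.
import numpy as np

mu    = np.array([0.10, 0.12])         # made-up expected returns
Sigma = np.array([[ 2.0, -2.0],
                  [-2.0,  2.0]])       # fully negatively correlated pair
x = np.array([1.0, 1.0])               # one share of each stock

print(mu @ x)                          # expected return R(x)
print(x @ Sigma @ x)                   # variance V(x) = 0: a "risk-free" mix
```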

If the returns on stocks 1 and 2 are perfectly positively correlated (ρ12 = +1), then purchasing stock 2 is just like purchasing additional shares of stock 1.

Nonlinear Optimization Models Let pj = price of stock j, b = our total budget, and β = a risk-aversion factor (β = 0 means risk is not a factor). Consider 3 different models: 1) Max f(x) = R(x) – βV(x) s.t. Σ(j=1 to n) pjxj ≤ b, xj ≥ 0, j = 1,…,n, where β ≥ 0 is determined by the decision maker.

2) Max f(x) = R(x) s.t. V(x) ≤ α, Σ(j=1 to n) pjxj ≤ b, xj ≥ 0, j = 1,…,n, where α ≥ 0 is determined by the investor. Smaller values of α represent greater risk aversion. 3) Min f(x) = V(x) s.t. R(x) ≥ γ, Σ(j=1 to n) pjxj ≤ b, xj ≥ 0, j = 1,…,n, where γ ≥ 0 is the desired rate of return (minimum expectation) selected by the investor.
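A sketch of Model 3 as a numerical program, with all data (returns, covariances, prices, budget, and the return floor γ) made up for illustration:

```python
# Model 3 sketch: min V(x) s.t. R(x) >= gamma, p.x <= b, x >= 0.
# All data (returns, covariances, prices, budget, gamma) are made up.
import numpy as np
from scipy.optimize import minimize

mu    = np.array([0.10, 0.06, 0.12])            # expected returns
Sigma = np.array([[0.09, 0.01, 0.02],
                  [0.01, 0.04, 0.00],
                  [0.02, 0.00, 0.16]])          # covariance matrix (PSD)
p, b, gamma = np.array([10.0, 20.0, 15.0]), 100.0, 0.5

res = minimize(lambda x: x @ Sigma @ x, x0=np.ones(3),
               method="SLSQP", bounds=[(0, None)] * 3,
               constraints=[{"type": "ineq", "fun": lambda x: mu @ x - gamma},
                            {"type": "ineq", "fun": lambda x: b - p @ x}])
print(res.x, res.x @ Sigma @ res.x)             # shares and portfolio variance
```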