Thursday, April 18 Nonlinear Programming (NLP)


15.053 Thursday, April 18 Nonlinear Programming (NLP) – Modeling Examples – Convexity – Local vs. Global Optima Handouts: Lecture Notes

Linear Programming Model
Maximize c_1 x_1 + c_2 x_2 + … + c_n x_n
subject to
a_{11} x_1 + a_{12} x_2 + … + a_{1n} x_n ≤ b_1
a_{21} x_1 + a_{22} x_2 + … + a_{2n} x_n ≤ b_2
…
a_{m1} x_1 + a_{m2} x_2 + … + a_{mn} x_n ≤ b_m
x_1, x_2, …, x_n ≥ 0
ASSUMPTIONS:
Proportionality Assumption
– Objective function
– Constraints
Additivity Assumption

What is a non-linear program?
maximize 3 sin x + xy + y^3 - 3z + log z
subject to
x^2 + y^3 = 1
x + 4z ≥ 2
z ≥ 0
A non-linear program is permitted to have non-linear constraints or objectives. A linear program is a special case of non-linear programming!

Nonlinear Programs (NLP)
Let x = (x_1, x_2, …, x_n)
Max f(x)
subject to g_i(x) ≤ b_i
Nonlinear objective function f(x) and/or nonlinear constraints g_i(x).
Could include x_i ≥ 0 by adding the constraints x_i = y_i^2 for i = 1, …, n.

Unconstrained Facility Location
This is the warehouse location problem with a single warehouse that can be located anywhere in the plane. Distances are “Euclidean.”
     Loc.      Dem.
A:   (8,2)     19
B:   (3,10)    7
C:   (8,15)    2
D:   (14,13)   5
P:   ?

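An NLP
Costs proportional to distance; known daily demands.
Let P = (x, y). With Euclidean distances,
d(P,A) = sqrt((x - 8)^2 + (y - 2)^2)
…
d(P,D) = sqrt((x - 14)^2 + (y - 13)^2)
minimize 19 d(P,A) + … + 5 d(P,D)
subject to: P is unconstrained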
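The objective above can be evaluated directly. The following is a minimal sketch, not the lecture's own method: it assumes Euclidean distances as stated, uses the four demand points from the previous slide, and searches a hypothetical grid of integer locations (the grid bounds are an assumption made for illustration).

```python
import math

# Demand points from the slide: name -> ((x, y), daily demand)
SITES = {
    "A": ((8, 2), 19),
    "B": ((3, 10), 7),
    "C": ((8, 15), 2),
    "D": ((14, 13), 5),
}


def total_cost(px, py):
    """Demand-weighted sum of Euclidean distances from P = (px, py)."""
    return sum(d * math.hypot(px - x, py - y) for (x, y), d in SITES.values())


# Hypothetical search grid: integer points with 0 <= x, y <= 16.
candidates = [(x, y) for x in range(17) for y in range(17)]
best = min(candidates, key=lambda p: total_cost(*p))
best_cost = total_cost(*best)
print(best, round(best_cost, 2))
```

On this grid the cheapest location coincides with A at (8, 2), which fits the fact that A carries more than half of the total demand (19 of 33). A grid search only inspects the listed candidates; it is a way to explore the objective, not a general NLP solver.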

Here are the objective values for 55 different locations.

Facility Location. What happens if P must be within a specified region?

The model Minimize Subject to

0-1 integer programs as NLPs
minimize Σ_j c_j x_j
subject to Σ_j a_ij x_j = b_i for all i
x_j is 0 or 1 for all j
is “nearly” equivalent to
minimize Σ_j c_j x_j + 10^6 Σ_j x_j (1 - x_j)
subject to Σ_j a_ij x_j = b_i for all i
0 ≤ x_j ≤ 1 for all j
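A small sketch of why the penalty term works: x_j (1 - x_j) is zero exactly when x_j is 0 or 1 and positive in between, so any fractional value is charged heavily. The cost vector and the helper name `penalized_objective` are hypothetical, and the equality constraints are left out to keep the illustration short.

```python
BIG_M = 10**6  # penalty weight used on the slide


def penalized_objective(c, x, big_m=BIG_M):
    """c.x plus big_m * sum_j x_j (1 - x_j); the constraints are not modeled here."""
    linear = sum(cj * xj for cj, xj in zip(c, x))
    penalty = big_m * sum(xj * (1 - xj) for xj in x)
    return linear + penalty


c = [3, 5, 2]  # hypothetical cost coefficients

binary_value = penalized_objective(c, [1, 0, 1])        # no penalty at 0/1 points
fractional_value = penalized_objective(c, [1, 0.5, 1])  # x_2 = 0.5 is penalized
print(binary_value, fractional_value)
```

The penalized objective is concave in each x_j, so the reformulated problem is a non-convex NLP; the reformulation shows how expressive NLPs are rather than offering an easier way to solve integer programs.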

Some comments on non-linear models The fact that non-linear models can model so much is perhaps a bad sign – How can we solve non-linear programs if we have trouble with integer programs? – Recall, in solving integer programs we use techniques that rely on the integrality. Fact: some non-linear models can be solved, and some are WAY too difficult to solve. More on this later.

Variant of exercise from Bertsimas and Freund
Buy a machine and keep it for t years, and then sell it. (0 ≤ t ≤ 10)
– all values are measured in $ million
– Cost of machine = 1.5
– Revenue = 4(1 - .75^t)
– Salvage value = 1/(1 + t)

[Figure: machine values in millions of dollars versus time, with curves for revenue, salvage, and total.]

How long should we keep the machine? Work with your partner on how long we should keep the machine, and why?

Nonlinearities Because of Time
Discount rates
Decreasing value of equipment over time
– wear and tear, improvements in technology
Tax implications (depreciation)
Salvage value
Secondary focus of the previous model(s): finding the right model can be subtle.

Nonlinearities in Pricing The price of an item may depend on the number sold – quantity discounts for a small seller – price elasticity for monopolist Complex interactions because of substitutions: – Lowering the price of GM automobiles will decrease the demand for the competitors

Non-linearities because of congestion The time it takes to go from MIT to Harvard by car depends non-linearly on the congestion. As congestion increases just to its limit, the traffic sometimes comes to a near halt.

Portfolio Optimization In the following slides, we will show how to model portfolio optimization as NLPs The key concept is that risk can be modeled using non-linear equations Since this is one of the most famous applications of non-linear programming, we cover it in much more detail

Risk vs. Return
In finance, one trades off risk and return. For a given rate of return, one wants to minimize risk. For a given level of risk, one wants to maximize return. Return is modeled as expected value. Risk is modeled as variance (or standard deviation).

Portfolio Selection: The value of diversification. Suppose that the following investments all have an expected return of 10% per year, and have similar variance. You can choose any of the following 3 pairs. Penguin Umbrellas, and Bay Watch Sunglasses (negatively correlated) Cogswell Cogs and Gilligan’s Cruise Tours (no correlation) CSX Railroad, Burlington Northern Railroad (positively correlated)

On Correlations These variables have a correlation of .998

More on correlations Finding the best linear fit is itself a nonlinear program. Regression programs do this “automatically” using a least squares fit.

The best fit regression line minimizes the sum of the squares of the residuals. The vertical red lines are the residuals. The goal is to select the line that minimizes the sum of the squared residuals. It is a non-linear program.

Correlations that are 0 (or close to 0). Correlation is related to the best linear fit. These Variables have a correlation of -.026 These variables are dependent but have a correlation of 0

Key Formula for Expected Values Let X and Y be random variables, and E( ) denote the expected value. Expected values act in a linear manner. For all constants a and b, E(aX + bY) = a E(X) + b E(Y) e.g., E(.3X + .7Y) = .3 E(X) + .7 E(Y)

Mixing distributions: Expected Values
Suppose that E(X) = 5 and E(Y) = 10. What is the expected value of pX + (1-p)Y as p varies from 0 to 1?
[Figure: E(pX + (1-p)Y) versus p.]

Key Formula for Variances
Let X and Y be random variables, and let Var(X) and Var(Y) denote their variances. (risk ~ variance)
The variance of aX + bY depends on the covariance of X and Y, which depends on how correlated the two variables X and Y are.
For all constants a and b,
Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X,Y)
For example, Var(.3X + .7Y) = .09 Var(X) + .49 Var(Y) + .42 Cov(X,Y)
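The variance formula can be checked numerically. This is a minimal sketch on hypothetical paired observations of X and Y (the data are made up for illustration); it uses population variance and covariance, for which the identity holds exactly up to round-off.

```python
def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    """Population covariance; cov(u, u) is the population variance."""
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / len(u)


# Hypothetical paired observations of X and Y.
X = [1.0, 2.0, 4.0, 7.0]
Y = [3.0, 1.0, 5.0, 2.0]
a, b = 0.3, 0.7

W = [a * x + b * y for x, y in zip(X, Y)]  # W = aX + bY, observation by observation
direct = cov(W, W)
by_formula = a**2 * cov(X, X) + b**2 * cov(Y, Y) + 2 * a * b * cov(X, Y)
print(direct, by_formula)
```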

On Reducing Variance if X and Y are independent
If two variables X and Y are independent, then their covariance is 0. For 0 ≤ p ≤ 1,
Var(pX + (1-p)Y) = p^2 Var(X) + (1-p)^2 Var(Y) ≤ p Var(X) + (1-p) Var(Y).

Mixing Uncorrelated Distributions Here X and Y both have a standard deviation of 5, and they have a correlation of 0. Let W = pX + (1-p)Y, as p goes from 0 to 1.

On reducing variance if X and Y are negatively correlated
If two variables X and Y are negatively correlated then their covariance is negative. For 0 < p < 1,
Var(pX + (1-p)Y) = p^2 Var(X) + (1-p)^2 Var(Y) + 2p(1-p) Cov(X,Y) < p Var(X) + (1-p) Var(Y).
The most extreme example is if the correlation is –1.

Mixing Negatively Correlated Distributions
Suppose X and Y both have a standard deviation of 5, and they have a correlation of –1. Let W = pX + (1-p)Y, as p goes from 0 to 1.
[Figure: standard deviation of W versus p.]

On reducing variance if X and Y are positively correlated
If two variables X and Y are positively correlated then their covariance is positive. If 0 < p < 1, and if the positive correlation is less than 1, then
Var(pX + (1-p)Y) = p^2 Var(X) + (1-p)^2 Var(Y) + 2p(1-p) Cov(X,Y) < p Var(X) + (1-p) Var(Y).
If the correlation is 1 and X and Y have the same variance, the above holds with equality.

Mixing Positively Correlated Distributions
Suppose X and Y both have a standard deviation of 5, and they have a correlation of 1. Let W = pX + (1-p)Y, as p goes from 0 to 1.
[Figure: standard deviation of W versus p.]
Conclusion: Covariances are important!
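The three mixing cases (correlation 0, –1, and 1, each with standard deviations of 5) can be tabulated from the variance formula. This is a minimal sketch; the helper name `std_mix` and the choice of p values are assumptions made for illustration.

```python
import math


def std_mix(p, sd_x, sd_y, corr):
    """Standard deviation of W = pX + (1-p)Y for the given correlation."""
    cov_xy = corr * sd_x * sd_y
    var_w = p**2 * sd_x**2 + (1 - p) ** 2 * sd_y**2 + 2 * p * (1 - p) * cov_xy
    return math.sqrt(max(var_w, 0.0))  # guard against tiny negative round-off


for corr in (0, -1, 1):
    row = [round(std_mix(k / 4, 5, 5, corr), 2) for k in range(5)]  # p = 0, .25, ..., 1
    print(corr, row)
```

With correlation 0 the standard deviation dips to about 3.54 at p = 0.5, with correlation –1 it reaches 0 there, and with correlation 1 it stays at 5 for every p.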

Summary of reducing risk Diversification is a method of reducing risk, even when investments are positively correlated (which they often are). If only two investments are made, then the risk reduction depends on the covariance. Diversifying over investments that are negatively correlated has a powerful impact on risk reduction.

Portfolio Selection Example
When trying to design a financial portfolio, investors seek to simultaneously minimize risk and maximize return. Risk is often measured as the variance of the total return, a nonlinear function.
FACT: var(x_1 + x_2 + … + x_n) = var(x_1) + … + var(x_n) + Σ_{i≠j} cov(x_i, x_j)

Portfolio Selection (cont’d) Two Methods are commonly used: – Min Risk s.t. Expected Return ≥ Bound – Max Expected Return - θ (Risk) where θ reflects the tradeoff between return and risk.

Portfolio Selection Example
There are 3 candidate assets for our portfolio, X, Y and Z. The expected returns are 30%, 20% and 8% respectively (if possible we would like at least a 12% return). Suppose the covariance matrix is (entries as implied by the objective function of the model that follows):
      X      Y      Z
X     3      1     -0.5
Y     1      2     -0.4
Z    -0.5   -0.4    1
What are the variables? Let X, Y, Z be the fraction of the portfolio invested in each asset.

Portfolio Selection Example
Min 3X^2 + 2Y^2 + Z^2 + 2XY − XZ − 0.8YZ
st 1.3X + 1.2Y + 1.08Z ≥ 1.12
X + Y + Z = 1
X ≥ 0, Y ≥ 0, Z ≥ 0

Max 1.3X + 1.2Y + 1.08Z − θ(3X^2 + 2Y^2 + Z^2 + 2XY − XZ − 0.8YZ)
st X + Y + Z = 1
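To get a feel for the first model (minimize risk subject to a return bound), the feasible set can be explored by brute force. This is a rough sketch, not a substitute for an NLP solver such as Excel Solver: it assumes a hypothetical grid in which each weight is a multiple of 1%, and it keeps the lowest-risk grid point that meets the return bound.

```python
def risk(x, y, z):
    """Variance expression from the slide's objective."""
    return 3 * x * x + 2 * y * y + z * z + 2 * x * y - x * z - 0.8 * y * z


def gross_return(x, y, z):
    """Expected gross return, as in the slide's constraint."""
    return 1.3 * x + 1.2 * y + 1.08 * z


STEPS = 100  # hypothetical grid resolution: weights in multiples of 1%
best = None
for i in range(STEPS + 1):
    for j in range(STEPS + 1 - i):
        x, y = i / STEPS, j / STEPS
        z = (STEPS - i - j) / STEPS       # enforces X + Y + Z = 1 with all weights >= 0
        if gross_return(x, y, z) < 1.12:  # required return bound
            continue
        candidate = (risk(x, y, z), x, y, z)
        if best is None or candidate < best:
            best = candidate

best_risk, best_x, best_y, best_z = best
print(best_x, best_y, best_z, round(best_risk, 4))
```

On this grid the search lands near X = 0.15, Y = 0.25, Z = 0.60, where the expected return is about 14%, so the 12% bound is not binding for these data. Because the objective is a convex quadratic and the constraints are linear, a proper solver would find the exact global minimum.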

More on Portfolio Selection
There can be institutional constraints as well, especially for mutual funds.
– No more than 15% in the energy sector
– Between 20% and 25% in high growth
– At most 3% in any one firm
– etc.
We end up with a large non-linear program. The unconstrained version becomes the “CAPM” model in finance.

Determining best linear fits
A famous application in finance of determining the best linear fit is determining the β of a stock. CAPM assumes that the return of a stock s in a given time period is
r_s = a + β r_m + ε,
where
r_s = return on stock s in the time period
r_m = return on the market in the time period
β = sensitivity to the market: a 1% increase in the market return leads, on average, to a β% increase in the return on s

Regression, and estimating β
What is the best linear fit for this data? What does one mean by best?
[Figure: return on Stock A versus market return.]

Regression. The vertical red lines are the residuals. The goal is to select the line that minimizes the sum of the squared residuals. It is a non-linear program.

Regression, and estimating β
[Figure: return on Stock A versus market return, with the regression line.]
The value β is the slope of the regression line. Here it is around .6 (lower expected gain than the market, and lower risk).
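For a single explanatory variable, the least squares line has a closed form, so β can be estimated without a general NLP solver. The sketch below uses hypothetical return data constructed to lie exactly on a line with slope 0.6; the data and the helper name `least_squares_line` are assumptions made for illustration, not values from the lecture.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical returns (in %), built to satisfy r_s = 0.5 + 0.6 r_m exactly.
market = [-4.0, -1.0, 0.0, 2.0, 5.0]
stock = [0.5 + 0.6 * r for r in market]

a, beta = least_squares_line(market, stock)
print(a, beta)
```

With noisy real data the residuals would not all be zero, and the fitted slope would be an estimate of β rather than an exact recovery.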

Difficulties of NLP Models Linear Program: Nonlinear Programs:

Difficulties of NLP Models (contd.)
Def’n: Let x be a feasible solution, then
– x is a global max if f(x) ≥ f(y) for every feasible y.
– x is a local max if f(x) ≥ f(y) for every feasible y sufficiently close to x (i.e., x_j - ε ≤ y_j ≤ x_j + ε for all j and some small ε > 0).
There may be several locally optimal solutions.

Convex Functions
Convex functions: f(λy + (1-λ)z) ≤ λ f(y) + (1-λ) f(z) for every y and z and for 0 ≤ λ ≤ 1.
e.g., f((y+z)/2) ≤ f(y)/2 + f(z)/2
We say “strict” convexity if the sign is “<” for 0 < λ < 1.
The line joining any two points is above the curve.

Concave Functions
Concave functions: f(λy + (1-λ)z) ≥ λ f(y) + (1-λ) f(z) for every y and z and for 0 ≤ λ ≤ 1.
e.g., f((y+z)/2) ≥ f(y)/2 + f(z)/2
We say “strict” concavity if the sign is “>” for 0 < λ < 1.
The line joining any two points is below the curve.

Classify as convex or concave or both or neither.

Recognizing convex functions
For functions of one variable, if the 2nd derivative is always positive, then the function is convex.
The sum of convex functions is convex
– e.g., f(x,y) = x^2 + e^x + 3(y-7)^4 - log_2 y
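The defining inequality for convexity can also be tested on sample points. This is a minimal sketch under stated assumptions: the helper name `violates_convexity`, the sample grid, and the test functions are hypothetical. A violation proves a function is not convex on the sampled range, while the absence of one is only evidence, not a proof.

```python
import math


def violates_convexity(f, points, lambdas=(0.25, 0.5, 0.75), tol=1e-9):
    """True if some sampled y, z, lam gives f(lam*y + (1-lam)*z) > lam*f(y) + (1-lam)*f(z)."""
    for y in points:
        for z in points:
            for lam in lambdas:
                on_curve = f(lam * y + (1 - lam) * z)
                on_chord = lam * f(y) + (1 - lam) * f(z)
                if on_curve > on_chord + tol:
                    return True
    return False


grid = [k / 4 for k in range(-12, 13)]  # sample points in [-3, 3]


def convex_example(x):
    return x**2 + math.exp(x)  # sum of two convex functions of one variable


print(violates_convexity(convex_example, grid))  # no violation expected
print(violates_convexity(math.sin, grid))        # sin is not convex on [-3, 3]
```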

Recognizing convex feasible regions
If all constraints are linear, then the feasible region is convex.
The intersection of convex regions is convex.
If for all feasible x and y, the midpoint of x and y is feasible, then the region is convex (except in totally non-realistic examples).

Local Maximum (Minimum) Property
A local max of a concave function on a convex feasible region is also a global max. A local min of a convex function on a convex feasible region is also a global min. Strict convexity or concavity implies that the global optimum is unique.
Given this, we can exactly solve:
– Maximization problems with a concave objective function and linear constraints
– Minimization problems with a convex objective function and linear constraints

More on local optimality
The techniques for non-linear optimization usually find local optima. This is useful when a locally optimal solution is a globally optimal solution. It is not so useful in many other situations.
Conclusion: if you solve an NLP, try to find out how good the local optimal solutions are.
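The dependence on the starting point is easy to see on a small non-convex example. The sketch below is an illustration under stated assumptions: the objective f(x) = x^4 - 3x^2 + x, the step size, and the iteration count are all hypothetical choices, and plain gradient descent stands in for whatever local method a solver actually uses.

```python
def f(x):
    return x**4 - 3 * x**2 + x  # hypothetical non-convex objective with two local minima


def df(x):
    return 4 * x**3 - 6 * x + 1  # derivative of f


def gradient_descent(x0, step=0.01, iters=5000):
    """Fixed-step gradient descent; it converges to a nearby local minimum."""
    x = x0
    for _ in range(iters):
        x -= step * df(x)
    return x


left = gradient_descent(-2.0)   # start to the left
right = gradient_descent(2.0)   # start to the right
print(round(left, 3), round(f(left), 3))
print(round(right, 3), round(f(right), 3))
```

The two starting points end at different local minima with different objective values, and only the left one is the global minimum. Trying several starting points and comparing the results is one simple way to judge how good a local solution is.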

Solving NLP’s by Excel Solver

Summary Applications of NLP to location problems, portfolio management, regression Non-linear programming is very general and very hard to solve Special case of convex minimization NLP is easier, because a local minimum is a global minimum