Inequalities for Stochastic Linear Programming Problems By Albert Madansky Presented by Kevin Byrnes.



Outline
- Introduction
- Definition of Terms
- Convexity and Continuity
- Putting It All Together: Jensen's Inequality
- Conditions for Equality
- An Application
- Improving Bounds
- Critique

© Kevin Byrnes, 2006

Introduction In this presentation, we shall consider stochastic linear programs with recourse, i.e. problems of the form:

(i) Minimize c^T x + f^T y
Subject to: Ax + By = b
x, y >= 0

where the distribution of b is assumed known, but the specific value is not.

Introduction We may interpret (i) in two different ways. First, that we wish to solve the problem all in one stage, and thus wish to find min_x c^T x + E[min_y f^T y].

Introduction We may interpret (i) in two different ways. Second, that we wish to generate realizations of b and solve a sequence of deterministic linear programs. In this case, for each realization we are really solving: min_x (c^T x + min_{y|x} f^T y).

Introduction We would generally expect that:

min_x c^T x + E[min_y f^T y] >= E[min_x (c^T x + min_{y|x} f^T y)]

And, indeed, this will be the case. Madansky's paper investigates the circumstances under which equality holds above (i.e. when the 'here and now' objective function value is equal to the expected value of the 'wait and see' approach).
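To make the comparison concrete, here is a small numerical sketch on a hypothetical toy instance (not from Madansky's paper): a scalar first-stage variable x, recourse variables y = (y1, y2) with constraint x + y1 - y2 = b, first-stage cost 1, recourse cost 3 per unit, and b equal to 2 or 4 with probability 1/2 each. The 'here and now' value is one extensive-form LP sharing x across scenarios; the 'wait and see' value averages the per-scenario optima.

```python
# Toy two-stage instance (illustrative only):
#   minimize  x + 3*(y1 + y2)  subject to  x + y1 - y2 = b,  x, y >= 0,
# so the recourse pays 3 per unit of mismatch |b - x|.
from scipy.optimize import linprog

probs = [0.5, 0.5]
b_vals = [2.0, 4.0]

# 'Here and now': one x shared by all scenarios (extensive-form LP).
# Variables: [x, y1_s1, y2_s1, y1_s2, y2_s2]; recourse costs are
# probability-weighted (0.5 * 3 = 1.5).
c_ext = [1.0, 1.5, 1.5, 1.5, 1.5]
A_eq = [[1, 1, -1, 0, 0],
        [1, 0, 0, 1, -1]]
here_now = linprog(c_ext, A_eq=A_eq, b_eq=b_vals).fun

# 'Wait and see': solve each scenario separately, then average.
wait_see = sum(p * linprog([1.0, 3.0, 3.0],
                           A_eq=[[1, 1, -1]], b_eq=[b]).fun
               for p, b in zip(probs, b_vals))

assert here_now >= wait_see - 1e-9
```

On this instance the here-and-now value (5) strictly exceeds the wait-and-see value (3), so the inequality above can be strict.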

Introduction Finally, a related, but computationally simpler problem is to solve:

(ii) Minimize c^T x + f^T y
Subject to: Ax + By = E[b]
x, y >= 0

which is simply an LP with the random vector b replaced by its mean. We shall shortly see that the value of (ii) is, in fact, a useful lower bound for both interpretations of (i).
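A minimal sketch of (ii) on a hypothetical toy instance (minimize x + 3*(y1 + y2) subject to x + y1 - y2 = b, x, y >= 0, with b equal to 2 or 4 with probability 1/2): replace b by its mean and solve one deterministic LP.

```python
from scipy.optimize import linprog

mean_b = 0.5 * 2.0 + 0.5 * 4.0       # E[b] = 3
# One deterministic LP in (x, y1, y2) with b replaced by E[b].
# linprog's default bounds already impose x, y >= 0.
res = linprog([1.0, 3.0, 3.0], A_eq=[[1, 1, -1]], b_eq=[mean_b])
L = res.fun                          # the lower bound min_x C(E[b], x)
```

On this instance L = 3, the value of the mean-value LP.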

Definition of Terms Let our stochastic LP be given by (i), and let our 'approximate LP' be given by (ii). Denote by C(b,x) the value min_y c^T x + f^T y such that (x,y) is feasible for (i) (i.e. C is a function of the realization of the random variable and of our decision variable).

Definition of Terms Then we wish to find min_x E[C(b,x)] and E[min_x C(b,x)] for (i), and min_x C(E[b],x) for (ii).

Convexity and Continuity Claim: min_x C(b,x) is convex in b. Proof: Let b = t*b_1 + (1-t)*b_2, and let x(b_i) be a vector that minimizes C(b_i, x) subject to our constraints. Let x = t*x(b_1) + (1-t)*x(b_2); clearly x is then also feasible for (i) with right-hand side b.

Convexity and Continuity Now:

Ax = t*A*x(b_1) + (1-t)*A*x(b_2) = t*b_1 + (1-t)*b_2 = b

Thus, by linearity of the objective:

C(b,x) = t*C(b_1, x(b_1)) + (1-t)*C(b_2, x(b_2))

Since min_x C(b,x) <= C(b,x), we have:

min_x C(b,x) <= t*min_x C(b_1,x) + (1-t)*min_x C(b_2,x), as desired.

Convexity and Continuity So min_x C(b,x) is a convex function in b.
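The claim can be checked numerically. A sketch on a hypothetical instance (minimize x + 3*(y1 + y2) subject to x + y1 - y2 = b, with the extra bound x <= 3, which makes the value function genuinely piecewise linear rather than linear in b):

```python
# phi(b) = min_x C(b, x) for the toy LP; convexity says
# phi(t*b1 + (1-t)*b2) <= t*phi(b1) + (1-t)*phi(b2).
from scipy.optimize import linprog

def phi(b):
    res = linprog([1.0, 3.0, 3.0], A_eq=[[1, 1, -1]], b_eq=[b],
                  bounds=[(0, 3), (0, None), (0, None)])
    return res.fun

b1, b2, t = 2.0, 6.0, 0.5
lhs = phi(t * b1 + (1 - t) * b2)        # phi(4)
rhs = t * phi(b1) + (1 - t) * phi(b2)   # average of phi(2) and phi(6)
assert lhs <= rhs + 1e-9
```

Here phi(b) = b for b <= 3 and phi(b) = 3b - 6 beyond, so the convexity inequality is strict for this choice of b1, b2, t.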

Convexity and Continuity So min_x C(b,x) is a convex function in b. Now we claim that min_x C(b,x) is also continuous in b. Proof: Clearly the set of b for which there exists an x minimizing C(b,x) is convex (it is the image of a polyhedron under a linear map).

Convexity and Continuity And we've just proven that min_x C(b,x) is a convex function. From elementary analysis it follows that min_x C(b,x) is continuous at every point in the interior of its domain (a convex function on a convex domain is continuous on the interior). Getting continuity at the boundary is a routine exercise in limits.

Convexity and Continuity Let b be a boundary point of our domain. Consider a sequence b_i -> b such that x(b_i) is defined as before. Now:

lim_{i->inf} min_x C(b_i,x) = lim_{i->inf} C(b_i, x(b_i)) >= min_x C(lim_{i->inf} b_i, x) = C(b, x(b))

Convexity and Continuity By convexity of min_x C(b,x), we also have:

lim_{i->inf} min_x C(b_i,x) <= min_x C(b,x)

Hence lim_{i->inf} min_x C(b_i,x) = min_x C(b,x), thus proving continuity. (Note: if there were no boundary point b, then we could have stopped with continuity in the interior.)

Putting It All Together: Jensen's Theorem We've just shown that min_x C(b,x) is a convex, continuous function on a convex domain. Recall that Jensen's Theorem states that for any convex function J and any real-valued (integrable) function f, we have:

E[J(f)] >= J(E[f])

where E[·] denotes 'average of'.

Putting It All Together: Jensen's Theorem In particular, we have that: E[min_x C(b,x)] >= min_x C(E[b],x). Now observe that:

1) E[C(b,x(E[b]))] >= min_x E[C(b,x)]

since the expected value of C(b,x) at the fixed choice x(E[b]) is certainly at least as large as the minimum of the expectation of C(b,x) over x.

Putting It All Together: Jensen's Theorem 2) min_x E[C(b,x)] >= E[min_x C(b,x)]

To see 2), let x' be the value of x that minimizes E[C(b,x)], and let x(b) be defined as before. Then:

min_x E[C(b,x)] = E[C(b,x')], and
E[min_x C(b,x)] = E[C(b,x(b))]

Putting It All Together: Jensen's Theorem Since C(b,x') >= C(b,x(b)) for every b, we have that E[C(b,x')] >= E[C(b,x(b))].

Putting It All Together: Jensen's Theorem Putting all of our inequalities together, we have:

E[C(b,x(E[b]))] >= min_x E[C(b,x)] >= E[min_x C(b,x)] >= min_x C(E[b],x)

With these inequalities, we shall demonstrate a necessary condition for min_x E[C(b,x)] = E[min_x C(b,x)] by achieving 'tightness'.
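All four quantities in the chain can be computed on a small hypothetical instance (minimize x + 3*(y1 + y2) subject to x + y1 - y2 = b, x, y >= 0, with b in {2, 4}, each with probability 1/2); a sketch:

```python
from scipy.optimize import linprog

probs, b_vals = [0.5, 0.5], [2.0, 4.0]
c2 = [1.0, 3.0, 3.0]                 # objective of a single-scenario LP
row = [[1, 1, -1]]                   # x + y1 - y2 = b

def solve(b):                        # min_x C(b, x)
    return linprog(c2, A_eq=row, b_eq=[b])

# L = min_x C(E[b], x): the mean-value LP, which also yields x(E[b]).
mean_b = sum(p * b for p, b in zip(probs, b_vals))
mean_res = solve(mean_b)
L, x_mean = mean_res.fun, mean_res.x[0]

# E[min_x C(b, x)]: wait-and-see, averaging per-scenario optima.
ws = sum(p * solve(b).fun for p, b in zip(probs, b_vals))

# min_x E[C(b, x)]: here-and-now via the extensive form.
hn = linprog([1.0, 1.5, 1.5, 1.5, 1.5],
             A_eq=[[1, 1, -1, 0, 0], [1, 0, 0, 1, -1]],
             b_eq=b_vals).fun

# U = E[C(b, x(E[b]))]: fix x = x(E[b]) and solve only the recourse.
def recourse(b, x):
    return x + linprog([3.0, 3.0], A_eq=[[1, -1]], b_eq=[b - x]).fun

U = sum(p * recourse(b, x_mean) for p, b in zip(probs, b_vals))

assert U >= hn >= ws >= L
```

The computed values are 6 >= 5 >= 3 >= 3; the last inequality is tight here because min_x C(b,x) happens to be linear in b for this instance.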

Conditions for Equality First consider the well-known result (cf. Savage, The Foundations of Statistics, 1954, p. 265) that when the probability measure on the set of b's is sigma-additive (i.e. the measure of the union of countably many disjoint parts of the domain is equal to the sum of their measures), or the set of b's is finite with probability one, then:

Conditions for Equality E[min_x C(b,x)] = min_x C(E[b],x) if and only if min_x C(b,x) is a linear function of b.

Conditions for Equality A simple sufficient condition for equality is that C(b,x) be a linear function of b. To see why, note that in this case:

min_x E[C(b,x)] = min_x C(E[b],x)

(compare with the result of Charnes, Cooper and Thompson). Now, by our earlier derived inequalities, we must have:

min_x E[C(b,x)] = E[min_x C(b,x)] = min_x C(E[b],x)
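The first equality is immediate when C(b,x) is linear in b for each fixed x, since the expectation passes inside C. A one-line sketch (writing the assumed linear form as g(x) + h(x)^T b):

```latex
C(b,x) = g(x) + h(x)^{\top} b
\;\Longrightarrow\;
\mathbb{E}[C(b,x)] = g(x) + h(x)^{\top}\,\mathbb{E}[b] = C(\mathbb{E}[b],\,x)
\;\Longrightarrow\;
\min_x \mathbb{E}[C(b,x)] = \min_x C(\mathbb{E}[b],\,x).
```

The two functions of x coincide pointwise, so their minima coincide.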

An Application Now let us consider an application of the results we've proven thus far. In recent assignments, we were asked to find min_x E[C(b,x)], where b has some known distribution. In particular, consider the case where b is an environmental quality level.

An Application Let b_1 be a random vector with a given distribution, and let b_2 be another, independent random vector with a given distribution. Define the random vector b by:

b = t*b_1 + (1-t)*b_2, for a fixed value of t in [0,1]

An Application By convexity, we know that for known realizations of b_1 and b_2:

min_x C(b,x) <= t*min_x C(b_1,x) + (1-t)*min_x C(b_2,x)

Taking the expectation over b_1, b_2 of both sides and using independence and linearity, we get:

E[min_x C(b,x)] <= t*E[min_x C(b_1,x)] + (1-t)*E[min_x C(b_2,x)]

An Application Now, let b' be a random vector with the same distribution as b (but b' is not necessarily equal to b). Does this convexity-of-expectation inequality also hold for b'? Under the assumption that (i) is feasible and bounded for every realization of b or b', the optimal values min_x C(b,x) and min_x C(b',x) must be equal to those of their duals, by strong duality.

An Application In particular, we have:

D(i) = max b^T z
Subject to: A^T z <= c
B^T z <= f

D(i') = max b'^T z
Subject to: A^T z <= c
B^T z <= f

An Application Notice that the feasible region of both of these problems is the same; let us call it F. Then max_z b^T z subject to z in F is the random variable g(b), and max_z b'^T z subject to z in F is the random variable g(b'). Clearly g(b) and g(b') have the same distribution, and thus the same mean.
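This can be checked on a small hypothetical instance: for the primal min x + 3*(y1 + y2) subject to x + y1 - y2 = b, x, y >= 0, the dual is max b*z subject to z <= 1, z <= 3, -z <= 3, i.e. z in F = [-3, 1], so g(b) = max_{z in F} b*z. A sketch verifying strong duality numerically:

```python
from scipy.optimize import linprog

def primal(b):                       # min_x C(b, x)
    return linprog([1.0, 3.0, 3.0], A_eq=[[1, 1, -1]], b_eq=[b]).fun

def dual(b):                         # g(b) = max b*z over F = [-3, 1]
    res = linprog([-b], bounds=[(-3, 1)])   # maximize via minimizing -b*z
    return -res.fun

for b in [0.5, 2.0, 4.0]:
    assert abs(primal(b) - dual(b)) < 1e-9  # strong duality holds
```

Since g(b) is a deterministic function of the realization b, two random vectors with the same distribution induce g-values with the same distribution, as claimed.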

An Application Thus:

E[min_x C(b',x)] = E[g(b')] = E[g(b)] = E[min_x C(b,x)]

And so the inequality holds for any b' with the same distribution as b. In particular, if the b_i were normal, we could have generated a random vector with the (easily computable) distribution of their convex combination to make this hold.

An Application Now, under Madansky's equality criterion, that C(b,x) be a linear function of b (which it is here via the dual formulation), we have that E[min_x C(b,x)] = min_x E[C(b,x)], and so the optimal objective function value that we observed should have been 'convex in distribution'.

Improving Bounds From the string of inequalities we previously derived, we have upper and lower bounds on min_x E[C(b,x)], the quantity we wish to minimize in (i). These bounds are:

L = min_x C(E[b],x) and U = E[C(b,x(E[b]))]

Improving Bounds L can be found by simply solving the deterministic approximate LP. U can be found by simply taking an expectation, since x(E[b]) is a fixed quantity. A natural question to ask is: if we add more structure to the problem, do we get tighter, easily computable bounds?
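When b is continuous, U can be estimated by Monte Carlo sampling, precisely because x(E[b]) is fixed. A sketch under stated assumptions (hypothetical instance: minimize x + 3*(y1 + y2) subject to x + y1 - y2 = b with b ~ Uniform(2, 4); then the mean-value LP gives x(E[b]) = 3, and for fixed x the recourse cost has the closed form C(b, x) = x + 3*|b - x|):

```python
import numpy as np

rng = np.random.default_rng(0)
x_mean = 3.0                          # optimal x of the mean-value LP
samples = rng.uniform(2.0, 4.0, size=20_000)
# C(b, x_mean) = x_mean + 3*|b - x_mean| for this instance
U_hat = np.mean(x_mean + 3.0 * np.abs(samples - x_mean))
# exact value: 3 + 3*E|b - 3| = 3 + 3*(1/2) = 4.5
assert abs(U_hat - 4.5) < 0.05
```

With 20,000 samples the standard error is well under the 0.05 tolerance used in the assertion.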

Improving Bounds Perhaps not, but as was seen in 'An Application', E[min_x C(b,x)] is also a naturally arising quantity, which shares the same bounds. Claim: If b is defined on a bounded m-dimensional rectangle I_m, and the components of b are independent, then:

Improving Bounds E[min_x C(b,x)] <= H*(E[b])

Improving Bounds What can we do if b is a normally distributed vector? If the components b_i of b are independent, then we may replace each of their distributions with an approximation. In particular, we can construct smooth, compactly supported approximating functions that match the distributions of the b_i except on a set of arbitrarily small measure.

Improving Bounds These approximations are not necessarily a bad thing: they may, in some sense, be more accurate models than our normal distributions (especially when a random vector is not permitted to assume negative values). For details, consult the proof of Urysohn's Lemma.

Critique From Madansky's paper, we have:
1) Characterized the function min_x C(b,x)
2) Derived easily computable bounds for (i) and (ii)
3) Used these bounds to determine when equality holds between E[min_x C(b,x)] and min_x E[C(b,x)]
4) Determined tighter bounds on E[min_x C(b,x)] under assumptions on the distribution of the b_i

Critique We did not:
1) Further develop any of our intermediate results.
2) Develop applications for these equality conditions.