Inequalities for Stochastic Linear Programming Problems By Albert Madansky Presented by Kevin Byrnes
Outline Introduction Definition of Terms Convexity and Continuity Putting it all together: Jensen’s Inequality Conditions for Equality An Application Improving Bounds Critique ‘An Application’ © Kevin Byrnes, 2006
Introduction In this presentation, we shall consider stochastic linear programs with recourse, i.e. problems of the form: (i) Minimize $c^T x + f^T y$ subject to: $Ax + By = b$, $x, y \ge 0$, where the distribution of $b$ is assumed known, but its specific value is not.
Introduction We may interpret (i) in two different ways. First, that we wish to solve the problem all in one stage, and thus wish to find $\min_x \left( c^T x + E[\min_y f^T y] \right)$.
Introduction We may interpret (i) in two different ways. Second, that we wish to generate realizations of $b$ and solve a sequence of deterministic linear programs. In this case, for each realization we are really solving: $\min_x \left( c^T x + \min_{y|x} f^T y \right)$.
Introduction We would generally expect that: $\min_x \left( c^T x + E[\min_y f^T y] \right) \ge E\left[ \min_x \left( c^T x + \min_{y|x} f^T y \right) \right]$. And, indeed, this will be the case. Madansky’s paper investigates the circumstances under which equality holds above (i.e. when the ‘Here and Now’ objective function value is equal to the expected value of the ‘Wait and See’ approach).
Introduction Finally, a related, but computationally simpler problem is to solve: (ii) Minimize $c^T x + f^T y$ subject to: $Ax + By = E[b]$, $x, y \ge 0$, which is simply an LP with the random vector $b$ replaced by its mean. We shall shortly see that the value of (ii) is, in fact, a useful lower bound for both interpretations of (i).
Definition of Terms Let our stochastic LP be given by (i), and let our ‘approximate LP’ be given by (ii). Let $C(b,x)$ denote $\min_y \left( c^T x + f^T y \right)$ over all $y$ such that $(x,y)$ is feasible for (i) (i.e. $C$ is a function of the realization of the random vector $b$ and of our decision variable $x$).
Definition of Terms Then we wish to find $\min_x E[C(b,x)]$ and $E[\min_x C(b,x)]$ for (i), and $\min_x C(E[b],x)$ for (ii).
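The three quantities just defined can be computed by hand on a very small instance. The sketch below is only an illustration under stated assumptions: the cost `C`, the constants `C_UNIT`, `F_UNIT`, `CAP`, and the three-point distribution `SCENARIOS` are made up for this example and do not come from Madansky's paper. The toy problem buys $x$ units now at unit cost `C_UNIT` and covers any shortfall $b - x$ later at unit cost `F_UNIT`; because the cost is piecewise linear and convex in $x$, it is enough to check the kinks instead of calling an LP solver.

```python
# Hypothetical toy instance (not from the paper): scalar x with 0 <= x <= CAP.
C_UNIT, F_UNIT, CAP = 1.0, 3.0, 10.0
SCENARIOS = [(2.0, 0.3), (6.0, 0.4), (14.0, 0.3)]  # (value of b, probability)

def C(b, x):
    """Cost of decision x once b is known: pay for x, then cover the shortfall."""
    return C_UNIT * x + F_UNIT * max(b - x, 0.0)

def min_over_x(cost, breakpoints):
    """Minimise a convex piecewise-linear cost on [0, CAP] by checking its kinks."""
    xs = {0.0, CAP} | {min(max(k, 0.0), CAP) for k in breakpoints}
    return min(cost(x) for x in xs)

def expect(fn):
    """Expectation of fn(b) under the finite distribution SCENARIOS."""
    return sum(p * fn(b) for b, p in SCENARIOS)

mean_b = expect(lambda b: b)
kinks = [b for b, _ in SCENARIOS]

# min_x E[C(b,x)]: one x chosen before b is revealed.
here_and_now = min_over_x(lambda x: expect(lambda b: C(b, x)), kinks)
# E[min_x C(b,x)]: a separate x for each realization of b.
wait_and_see = expect(lambda b: min_over_x(lambda x: C(b, x), [b]))
# min_x C(E[b],x): the deterministic problem with b replaced by its mean.
mean_value = min_over_x(lambda x: C(mean_b, x), [mean_b])

print(here_and_now, wait_and_see, mean_value)
```

On this instance the three printed values come out in decreasing order, which is the ordering the rest of the presentation establishes in general.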
Convexity and Continuity Claim: $\min_x C(b,x)$ is convex in $b$. Proof: Let $b = t b_1 + (1-t) b_2$ with $t \in [0,1]$, and let $x(b_i)$ be a vector that minimizes $C(b_i,x)$ subject to our constraints. Let $x = t x(b_1) + (1-t) x(b_2)$; clearly $x$ is then feasible for (i) with right-hand side $b$.
Convexity and Continuity Now, suppressing the recourse term $By$ for brevity: $b = Ax = t A x(b_1) + (1-t) A x(b_2) = t b_1 + (1-t) b_2$. Thus: $C(b,x) \le t C(b_1,x(b_1)) + (1-t) C(b_2,x(b_2))$, since averaging the corresponding recourse vectors $y$ gives a feasible (though not necessarily optimal) $y$ for $(b,x)$. Since $\min_x C(b,x) \le C(b,x)$ and $C(b_i,x(b_i)) = \min_x C(b_i,x)$, we have that: $\min_x C(b,x) \le t \min_x C(b_1,x) + (1-t) \min_x C(b_2,x)$, as desired.
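The convexity claim can be spot-checked numerically. The following is a minimal sketch on a made-up scalar recourse problem (the cost `C` and the constants `C_UNIT`, `F_UNIT`, `CAP` are assumptions for illustration, not taken from the paper): it evaluates $v(b) = \min_x C(b,x)$ on a grid and looks for any triple $(b_1, b_2, t)$ that violates the convexity inequality. A grid check of this kind is evidence only, not a proof.

```python
# Hypothetical toy cost: buy x now (0 <= x <= CAP), cover shortfall b - x later.
C_UNIT, F_UNIT, CAP = 1.0, 3.0, 10.0

def C(b, x):
    return C_UNIT * x + F_UNIT * max(b - x, 0.0)

def v(b):
    """v(b) = min_x C(b, x); the optimum sits at 0, CAP, or the kink x = b."""
    return min(C(b, x) for x in (0.0, CAP, min(max(b, 0.0), CAP)))

def convexity_violations(grid, ts, tol=1e-9):
    """Return every (b1, b2, t) with v(t*b1 + (1-t)*b2) > t*v(b1) + (1-t)*v(b2)."""
    bad = []
    for b1 in grid:
        for b2 in grid:
            for t in ts:
                b = t * b1 + (1 - t) * b2
                if v(b) > t * v(b1) + (1 - t) * v(b2) + tol:
                    bad.append((b1, b2, t))
    return bad

grid = [0.5 * k for k in range(41)]  # b from 0 to 20 in steps of 0.5
violations = convexity_violations(grid, [0.25, 0.5, 0.75])
print(len(violations))
```

For this toy cost, $v$ is piecewise linear with slope `C_UNIT` below `CAP` and slope `F_UNIT` above it, so no violations should be reported.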
Convexity and Continuity So $\min_x C(b,x)$ is a convex function in $b$. Now we claim that $\min_x C(b,x)$ is also continuous. Proof: Clearly the set of $b$ for which there exists an $x$ minimizing $C(b,x)$ is convex (it’s a polyhedron).
Convexity and Continuity And we’ve just proven that $\min_x C(b,x)$ is a convex function. From elementary analysis it follows that $\min_x C(b,x)$ is continuous at every point in the interior of its domain. Getting continuity at the boundary is a routine exercise in limits.
Convexity and Continuity Let $b$ be a boundary point of our domain. Consider a sequence $b_i \to b$ such that $x(b_i)$ is defined as before. Now: $\lim_{i\to\infty} \min_x C(b_i,x) = \lim_{i\to\infty} C(b_i,x(b_i)) \ge \min_x C(\lim_{i\to\infty} b_i, x) = C(b,x(b))$.
Convexity and Continuity By convexity of $\min_x C(b,x)$ we have: $\lim_{i\to\infty} \min_x C(b_i,x) \le \min_x C(b,x)$. Hence $\lim_{i\to\infty} \min_x C(b_i,x) = \min_x C(b,x)$, thus proving continuity. (Note: if the domain had no boundary points, we could’ve stopped with continuity in the interior.)
Putting It All Together: Jensen’s Theorem We’ve just shown that $\min_x C(b,x)$ is a convex, continuous function on a convex domain. Recall that Jensen’s Theorem states that for any convex function $J$ and any real-valued function $f$, we have that: $\langle J(f) \rangle \ge J(\langle f \rangle)$, where $\langle \cdot \rangle$ denotes ‘average of’.
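A two-point example may make Jensen's inequality concrete. The numbers below are chosen purely for illustration: take the convex function $J(f) = f^2$ and let $f$ take the values 0 and 2 with probability one half each, writing $\langle \cdot \rangle$ for the average.

```latex
\[
\langle f \rangle = \tfrac12 \cdot 0 + \tfrac12 \cdot 2 = 1,
\qquad
\langle J(f) \rangle = \tfrac12 \cdot 0^2 + \tfrac12 \cdot 2^2 = 2,
\]
\[
\langle J(f) \rangle = 2 \;\ge\; 1 = J(\langle f \rangle).
\]
```

The gap between the two sides is the variance of $f$ in this particular case, which is why it vanishes only when $f$ is constant.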
Putting It All Together: Jensen’s Theorem In particular, we have that: $E[\min_x C(b,x)] \ge \min_x C(E[b],x)$. Now observe that: 1) $E[C(b,x(E[b]))] \ge \min_x E[C(b,x)]$, since $x(E[b])$ is one particular feasible choice of $x$, so its expected cost is certainly at least as large as the minimum of the expectation of $C(b,x)$ over $x$.
Putting It All Together: Jensen’s Theorem 2) $\min_x E[C(b,x)] \ge E[\min_x C(b,x)]$. To see 2), let $x'$ be the value of $x$ that minimizes $E[C(b,x)]$, and let $x(b)$ be defined as before. Then: $\min_x E[C(b,x)] = E[C(b,x')]$, and $E[\min_x C(b,x)] = E[C(b,x(b))]$.
Putting It All Together: Jensen’s Theorem Since: $C(b,x') \ge C(b,x(b))$ for every $b$, we have that: $E[C(b,x')] \ge E[C(b,x(b))]$.
Putting It All Together: Jensen’s Theorem Putting all of our inequalities together, we have: $E[C(b,x(E[b]))] \ge \min_x E[C(b,x)] \ge E[\min_x C(b,x)] \ge \min_x C(E[b],x)$. With these inequalities, we shall demonstrate a sufficient condition for $\min_x E[C(b,x)] = E[\min_x C(b,x)]$ by achieving ‘tightness’.
Conditions for Equality First consider the well-known (cf. Savage, Foundations of Statistics, 1954, p.265) result that when the probability measure on the set of $b$’s is sigma-additive (i.e. the sum of the measures of countably many disjoint parts of the domain is equal to the measure of their union) or the set of $b$’s is finite with probability one, then:
Conditions for Equality $E[\min_x C(b,x)] = \min_x C(E[b],x)$ if and only if $\min_x C(b,x)$ is a linear function of $b$.
Conditions for Equality A simple condition for equality is that $C(b,x)$ be a linear function of $b$. To see why, note that in this case $E[C(b,x)] = C(E[b],x)$ for every $x$, so: $\min_x E[C(b,x)] = \min_x C(E[b],x)$ (compare with Charnes, Cooper and Thomson’s result). Now, since $E[\min_x C(b,x)]$ is sandwiched between these two quantities by our earlier derived inequalities, we must have: $\min_x E[C(b,x)] = E[\min_x C(b,x)] = \min_x C(E[b],x)$.
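The linear case can be written out explicitly. As a sketch, suppose that for each fixed $x$ the cost has the form $C(b,x) = \alpha(x)^T b + \beta(x)$; the symbols $\alpha$ and $\beta$ are hypothetical names introduced here and do not appear in the original slides. Linearity of expectation then gives:

```latex
\[
E[C(b,x)] = \alpha(x)^T E[b] + \beta(x) = C(E[b],x)
\quad \text{for every } x,
\]
\[
\min_x E[C(b,x)] = \min_x C(E[b],x).
\]
```

Since the outer two terms of $\min_x E[C(b,x)] \ge E[\min_x C(b,x)] \ge \min_x C(E[b],x)$ coincide, the middle term is forced to equal them as well.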
An Application Now let us consider an application of the results we’ve proven thus far. In recent assignments, we’ve been asked to compute $\min_x E[C(b,x)]$, where $b$ has some known distribution. In particular, consider the case where $b$ is an environmental quality level.
An Application Let $b_1$ be a random vector with a given distribution, and let $b_2$ be another, independent random vector with a given distribution. Let the random vector $b$ be defined as: $b = t b_1 + (1-t) b_2$, for a fixed value of $t$ in $[0,1]$.
An Application By convexity, we know that for known realizations of $b_1$ and $b_2$, we have that: $\min_x C(b,x) \le t \min_x C(b_1,x) + (1-t) \min_x C(b_2,x)$. Taking the expectation over $b_1$, $b_2$ of both sides and using independence and linearity, we get: $E[\min_x C(b,x)] \le t E[\min_x C(b_1,x)] + (1-t) E[\min_x C(b_2,x)]$.
An Application Now, let $b'$ be a random vector with the same distribution as $b$ (but $b'$ is not necessarily equal to $b$). Does this convexity of expectation also hold for $b'$? Under the assumption that (i) is feasible and bounded for any realization of $b$ or $b'$, we see that the optimal values of these problems, $\min_x C(b,x)$ and $\min_x C(b',x)$, must be equal to those of their duals by Strong Duality.
An Application In particular, we have that: D(i) = max $b^T z$ subject to: $A^T z \le c$, $B^T z \le f$; and D(i’) = max $b'^T z$ subject to: $A^T z \le c$, $B^T z \le f$.
An Application Notice that the feasible region for both of these problems is the same; let us call it $F$. Then $\max_z b^T z$ subject to $z \in F$ is the random variable $g(b)$, and $\max_z b'^T z$ subject to $z \in F$ is the random variable $g(b')$. Clearly $g(b)$ and $g(b')$ have the same distribution, and thus the same mean.
An Application Thus: $E[\min_x C(b',x)] = E[g(b')] = E[g(b)] = E[\min_x C(b,x)]$. And so the inequality holds for any $b'$ with the same distribution as $b$. In particular, if the $b_i$ were normal, we could’ve generated a random vector with the (easily computable) distribution of their convex combination, to have this hold.
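The dual view lends itself to a small numerical illustration. The sketch below assumes that $F$ is a bounded polytope described by a short list of vertices; the vertex list `VERTICES` and the two discrete distributions `B1` and `B2` are invented for this example. It evaluates $g(b) = \max_{z \in F} b^T z$ as a maximum over vertices and checks the 'convexity of expectation' inequality exactly, by enumerating the product distribution of two independent vectors.

```python
from itertools import product

# Hypothetical vertices of the dual feasible region F (assumed bounded).
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (1.0, 1.5)]

def g(b):
    """g(b) = max over z in F of b^T z, attained at a vertex of the polytope."""
    return max(sum(bi * zi for bi, zi in zip(b, z)) for z in VERTICES)

# Two independent discrete random vectors as lists of (value, probability).
B1 = [((1.0, -1.0), 0.5), ((3.0, 2.0), 0.5)]
B2 = [((-2.0, 1.0), 0.25), ((0.0, 4.0), 0.75)]

def expect_g(dist):
    """E[g(b)] for a finite distribution; depends only on the distribution."""
    return sum(p * g(b) for b, p in dist)

def mix(t):
    """Distribution of t*b1 + (1-t)*b2 when b1 and b2 are independent."""
    return [(tuple(t * u + (1 - t) * w for u, w in zip(b1, b2)), p1 * p2)
            for (b1, p1), (b2, p2) in product(B1, B2)]

t = 0.4
lhs = expect_g(mix(t))                                # E[g(t*b1 + (1-t)*b2)]
rhs = t * expect_g(B1) + (1 - t) * expect_g(B2)       # t*E[g(b1)] + (1-t)*E[g(b2)]
print(lhs, rhs)
```

Because `expect_g` takes only a list of values and probabilities, any other random vector with the same distribution as the mixture yields the same left-hand side, which is the point made on the slide.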
An Application Now, under Madansky’s equality criterion, that $C(b,x)$ be a linear function of $b$ (which it is obviously via the dual formulation), we have that $E[\min_x C(b,x)] = \min_x E[C(b,x)]$, and so the optimal objective function value that we observed should have been ‘convex in distribution’.
Improving Bounds From the string of inequalities we previously determined, we have found upper and lower bounds on $\min_x E[C(b,x)]$, the quantity that we wish to compute for (i). These bounds are: $L = \min_x C(E[b],x)$ and $U = E[C(b,x(E[b]))]$.
Improving Bounds L can be found by simply solving the deterministic approximate LP. U can be found by simply taking an expectation, since x(E[b]) is a fixed quantity. A natural question to ask is, if we add more structure to the problem, do we get tighter, easily computable bounds?
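The recipe for the two bounds can be sketched on a toy instance: solve the mean-value problem once to get $x(E[b])$ and $L$, then average the cost of that fixed decision over the scenarios to get $U$. Everything concrete here (the cost `C`, the constants, the distribution `SCENARIOS`, and the helper `argmin_x`) is a hypothetical stand-in for a real LP and its solver.

```python
# Hypothetical toy instance: buy x now (0 <= x <= CAP), cover shortfall later.
C_UNIT, F_UNIT, CAP = 1.0, 3.0, 10.0
SCENARIOS = [(2.0, 0.3), (6.0, 0.4), (14.0, 0.3)]  # (value of b, probability)

def C(b, x):
    return C_UNIT * x + F_UNIT * max(b - x, 0.0)

def argmin_x(b):
    """x(b): a minimiser of C(b, x) on [0, CAP], found by checking 0, CAP and the kink."""
    return min((0.0, CAP, min(max(b, 0.0), CAP)), key=lambda x: C(b, x))

mean_b = sum(p * b for b, p in SCENARIOS)
x_mean = argmin_x(mean_b)                              # x(E[b]), a fixed decision

lower = C(mean_b, x_mean)                              # L = min_x C(E[b], x)
upper = sum(p * C(b, x_mean) for b, p in SCENARIOS)    # U = E[C(b, x(E[b]))]
print(lower, upper)
```

Only one optimization is needed for both bounds; computing $U$ is a plain weighted sum because `x_mean` no longer depends on $b$.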
Improving Bounds Perhaps not, but as was seen in ‘An Application’, $E[\min_x C(b,x)]$ is also a naturally arising quantity, which shares the same bounds. Claim: If $b$ is defined on a bounded $m$-dimensional rectangle $I_m$, and the components $b_i$ of $b$ are independent, then:
Improving Bounds E[min x C(b,x)]<= =H*(E[b])
Improving Bounds What can we do if $b$ is a normally distributed vector? If the elements $b_i$ of $b$ are independent, then we may replace each of their distributions with an approximation. In particular, we can construct smooth, compactly supported approximating functions that match the distribution of each $b_i$ on all but a set of arbitrarily small measure.
Improving Bounds These approximations are not necessarily bad things, because they may, in some sense, be more accurate functions than our normal distributions (especially when a random vector is not permitted to assume negative values). For details, consult the proof of Urysohn’s Lemma.
Critique From Madansky’s paper, we have: 1) Characterized the function $\min_x C(b,x)$. 2) Derived easily computable bounds for (i) and (ii). 3) Used these bounds to determine when equality holds between $E[\min_x C(b,x)]$ and $\min_x E[C(b,x)]$. 4) Determined tighter bounds on $E[\min_x C(b,x)]$ under assumptions on the distribution of the $b_i$.
Critique We did not: 1) Further develop any of our intermediate results. 2)Develop applications for using these equality conditions.