Chap 3. The simplex method

Standard form problem:
  minimize $c'x$
  subject to $Ax = b$, $x \ge 0$
where $A$ is $m \times n$ with full row rank.

From earlier results, we know that if the LP has a finite optimal value, there exists an extreme point (b.f.s.) that is optimal. The simplex method searches among the b.f.s.'s to find an optimal one.

If the LP is unbounded, there exists an extreme ray $d^i$ in the recession cone $K = \{x : Ax = 0,\ x \ge 0\}$ such that $c'd^i < 0$. The simplex method finds such a direction $d^i$ when the LP is unbounded, hence providing a proof of unboundedness.

Linear Programming 2018
3.1 Optimality conditions

A strategy for an algorithm: given a feasible solution $x$, look in the neighborhood of $x$ for a feasible point that gives an improved objective value. If no such point exists, we are at a local minimum. In general, a local minimum is not a global minimum. However, if we minimize a convex function over a convex set (a convex program), every local minimum is a global minimum, and this is the case for linear programming. (HW later)

Def: Let $P$ be a polyhedron. Given $x \in P$, a vector $d \in R^n$ is a feasible direction at $x$ if there exists $\theta > 0$ such that $x + \theta d \in P$.

Given a b.f.s. $x \in P$ with $x_B = B^{-1}b$, $x_N = 0$ ($N$: nonbasic) and $B = [A_{B(1)}, A_{B(2)}, \dots, A_{B(m)}]$, we want to find a new point $x + \theta d$ that satisfies $Ax = b$ and $x \ge 0$ and gives an improved objective value.
Consider moving to
$$x + \theta d = \begin{pmatrix} x_B \\ x_N \end{pmatrix} + \theta \begin{pmatrix} d_B \\ d_N \end{pmatrix},$$
where $d_j = 1$ for some nonbasic variable $x_j$, $j \in N$, $d_i = 0$ for the other nonbasic variables, and $x_B \leftarrow x_B + \theta d_B$.

We require $A(x + \theta d) = b$ for $\theta > 0$, which holds if and only if $Ad = 0$:
$$0 = Ad = \sum_{i=1}^{n} A_i d_i = \sum_{i=1}^{m} A_{B(i)} d_{B(i)} + A_j = B d_B + A_j \ \Longrightarrow\ d_B = -B^{-1} A_j.$$

Assuming that the columns of $A$ are permuted so that $A = [B, N]$, $x = (x_B, x_N)$, $d = (d_B, d_N)$, the vector
$$d = \begin{pmatrix} -B^{-1} A_j \\ e_j \end{pmatrix}$$
is called the $j$-th basic direction.
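The construction of the $j$-th basic direction can be checked numerically. The following is a minimal sketch, not part of the original slides: the matrix `A`, the chosen basis, and the helper names `solve` and `basic_direction` are illustrative assumptions. It uses exact rational arithmetic, builds $d$ with $d_B = -B^{-1}A_j$, $d_j = 1$, and verifies $Ad = 0$.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M u = rhs exactly by Gauss-Jordan elimination (M square, nonsingular)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # find a row with a nonzero pivot and move it into place
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * p for a, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def basic_direction(A, basis, j):
    """Return the j-th basic direction d for the basis given by column indices `basis`."""
    m, n = len(A), len(A[0])
    B = [[A[i][k] for k in basis] for i in range(m)]
    u = solve(B, [A[i][j] for i in range(m)])   # u = B^{-1} A_j
    d = [Fraction(0)] * n                       # other nonbasic components stay 0
    for pos, k in enumerate(basis):
        d[k] = -u[pos]                          # d_B = -B^{-1} A_j
    d[j] = Fraction(1)                          # entering variable moves at unit rate
    return d

# Hypothetical example data (0-based column indices): basis = {x1, x4}, entering x2.
A = [[1, 1, 1, 0],
     [1, 3, 0, 1]]
d = basic_direction(A, basis=[0, 3], j=1)
Ad = [sum(a * dk for a, dk in zip(row, d)) for row in A]
print(d, Ad)   # d = (-1, 1, 0, -2) and A d = (0, 0)
```

Exact fractions are used only to keep the check free of rounding; a practical implementation would work with a factorization of $B$ in floating point.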
Note that there are $n - m$ basic directions when $B$ is given.

Recall that $\{d : Ad = 0\}$ is the null space of $A$ and a basis for it is given by the columns of $P \begin{pmatrix} -B^{-1}N \\ I_{n-m} \end{pmatrix}$, where $P$ is a permutation matrix with $AP = [B, N]$:
$$P \begin{pmatrix} -B^{-1}N \\ I_{n-m} \end{pmatrix} = P \begin{pmatrix} -B^{-1}A_1 & \cdots & -B^{-1}A_j & \cdots & -B^{-1}A_{n-m} \\ e_1 & \cdots & e_j & \cdots & e_{n-m} \end{pmatrix}$$
(here $A_1, \dots, A_{n-m}$ denote the columns of $N$). Each column gives a basic direction.

Those $n - m$ basic directions constitute a basis for the null space of $A$ (from earlier). Hence we can move along any direction $d$ that is a linear combination of these basis vectors while satisfying $A(x + \theta d) = b$, $\theta > 0$.
However, we also need to satisfy the nonnegativity constraints in addition to $Ax = b$ to remain feasible. Since $d_j = 1 > 0$ for the nonbasic index $j$ and $x_j = 0$ at the current solution ($x_j$ is a nonbasic variable), moving along $-d^j$ (the negative of the $j$-th basic direction) violates $x_j \ge 0$ immediately. Hence we do not consider moving along $-d^j$.

Therefore, the directions in which we can move are the nonnegative linear combinations of the basic directions, i.e. the cone generated by the $n - m$ basic directions.

Note that if a basic direction satisfies $d_B = -B^{-1}A_j \ge 0$, it is an extreme ray of the recession cone of $P$ (recall HW).

In the simplex method, we choose one of the basic directions as the direction of movement.
Two cases:
(a) The current solution $x$ is nondegenerate: $x_B > 0$ guarantees that $x_B + \theta d_B > 0$ for some $\theta > 0$.
(b) $x$ is degenerate: some basic variable $x_{B(i)} = 0$. It may happen that the $i$-th component of $d_B = -B^{-1}A_j$ is negative. Then $x_{B(i)}$ becomes negative if we move along $d$, so we cannot take $\theta > 0$. Details later.
[Figure 3.2: $n = 5$, $n - m = 2$. A polyhedron bounded by the lines $x_1 = 0, \dots, x_5 = 0$, with vertices E, F, G marked.]
$x_1, x_3$ are nonbasic at E. $x_3, x_5$ are nonbasic at F ($x_4$ is basic at 0).
Now consider the cost function. We want to choose a direction that improves the objective value, i.e. $c'(x + \theta d^j) - c'x = \theta c'd^j < 0$, where $d^j$ is the $j$-th basic direction:
$$c'd^j = (c_B', c_N') \begin{pmatrix} -B^{-1}A_j \\ e_j \end{pmatrix} = c_j - c_B' B^{-1} A_j \equiv \bar{c}_j \quad \text{(called the reduced cost)}$$
If $\bar{c}_j < 0$, $j \in N$, then the objective value improves if we move to $x + \theta d^j$ for some $\theta > 0$. ($N$: index set of nonbasic variables)

Note) For the $i$-th basic variable, the reduced cost may be computed using the same formula:
$$\bar{c}_{B(i)} = c_{B(i)} - c_B' B^{-1} A_{B(i)} = c_{B(i)} - c_B' e_i = 0, \quad i = 1, \dots, m.$$
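The reduced cost formula can be evaluated for all columns at once by first solving $B'p = c_B$ and then computing $\bar{c}_j = c_j - p'A_j$. The sketch below is an illustration under assumed data (the LP, the basis, and the helper names `solve` and `reduced_costs` are not from the slides); it also confirms that the basic variables get reduced cost 0.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M u = rhs exactly by Gauss-Jordan elimination (M square, nonsingular)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * p for a, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def reduced_costs(A, c, basis):
    """Return the vector cbar with cbar_j = c_j - c_B' B^{-1} A_j for every column j."""
    m, n = len(A), len(A[0])
    # B' (transpose of the basis matrix): row k is the basic column basis[k]
    Bt = [[A[i][k] for i in range(m)] for k in basis]
    p = solve(Bt, [c[k] for k in basis])        # p' = c_B' B^{-1}
    return [c[j] - sum(p[i] * A[i][j] for i in range(m)) for j in range(n)]

# Hypothetical example: minimize -2 x1 - 3 x2 with basis {x1, x4} (0-based indices 0 and 3).
A = [[1, 1, 1, 0],
     [1, 3, 0, 1]]
c = [-2, -3, 0, 0]
cbar = reduced_costs(A, c, basis=[0, 3])
print(cbar)   # reduced costs (0, -1, 2, 0): only x2 is an improving entering candidate
```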
Thm 3.1: (optimality condition)
Consider a b.f.s. $x$ with basis matrix $B$. Let $\bar{c}$ be the reduced cost vector.
(a) If $\bar{c} \ge 0$, then $x$ is optimal (sufficient condition for optimality).
(b) If $x$ is optimal and nondegenerate, then $\bar{c} \ge 0$.

Pf) (a) Assume $\bar{c} \ge 0$ and let $y$ be an arbitrary point in $P$. Let $d = y - x$. Then
$$Ad = 0 \ \Longrightarrow\ B d_B + \sum_{i \in N} A_i d_i = 0 \ \Longrightarrow\ d_B = -\sum_{i \in N} B^{-1} A_i d_i,$$
$$c'd = c'y - c'x = c_B' d_B + \sum_{i \in N} c_i d_i = \sum_{i \in N} (c_i - c_B' B^{-1} A_i) d_i = \sum_{i \in N} \bar{c}_i d_i \ge 0$$
($\bar{c}_i \ge 0$, and $d_i \ge 0$ since $y_i \ge 0$, $x_i = 0$ for $i \in N$ and $d = y - x$).

(b) Suppose $x$ is a nondegenerate b.f.s. and $\bar{c}_j < 0$ for some $j$. Then $x_j$ must be a nonbasic variable, and we can obtain an improved solution by moving to $x + \theta d^j$ with $\theta > 0$ and small. Hence $x$ is not optimal.
Note that the condition $\bar{c} \ge 0$ is a sufficient condition for optimality of a b.f.s. $x$, but it is not necessary. Necessity is guaranteed only when $x$ is nondegenerate.

Def 3.3: A basis matrix $B$ is said to be optimal if
(a) $B^{-1}b \ge 0$,
(b) $\bar{c}' = c' - c_B' B^{-1} A \ge 0'$.
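A small worked example may help show why $\bar{c} \ge 0$ is not necessary at a degenerate b.f.s. The LP below is an illustrative assumption, not taken from the slides. Its only feasible point is $x = (0, 0)$, which is therefore optimal, yet one of its two bases has a negative reduced cost.

```latex
\begin{align*}
&\text{minimize } -x_1 \quad \text{subject to } x_1 + x_2 = 0,\ x \ge 0
  \qquad (m = 1,\ n = 2,\ P = \{(0,0)\}) \\[4pt]
&\text{Basis } B = [A_2] = [1]:\quad x_B = x_2 = B^{-1}b = 0,\quad
  \bar{c}_1 = c_1 - c_B' B^{-1} A_1 = -1 - 0 \cdot 1 = -1 < 0 \\[4pt]
&\text{Basis } B = [A_1] = [1]:\quad x_B = x_1 = B^{-1}b = 0,\quad
  \bar{c}_2 = c_2 - c_B' B^{-1} A_2 = 0 - (-1) \cdot 1 = 1 \ge 0
\end{align*}
```

Both bases describe the same degenerate b.f.s., but only the second one is an optimal basis in the sense of Def 3.3. From the first basis, the basic direction for $x_1$ is $d = (1, -1)$, and the largest feasible step along it is $\theta = 0$.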
3.2 Development of the simplex method

(Assume nondegenerate b.f.s. for the time being.)

Suppose we are at a b.f.s. $x$ and have computed $\bar{c}_j$, $j \in N$.
If $\bar{c}_j \ge 0$ for all $j \in N$, the current solution is optimal.
Otherwise, choose $j \in N$ such that $\bar{c}_j < 0$ and find the vector $d$ (the $j$-th basic direction):
$d_j = 1$, $d_i = 0$ for $i \ne B(1), \dots, B(m), j$, and $d_B = -B^{-1}A_j$.
(For a maximization problem, choose $j \in N$ such that $\bar{c}_j > 0$.)

We want to find $\theta^* = \max\{\theta \ge 0 : x + \theta d \in P\}$. The cost change is $\theta^* c'd = \theta^* \bar{c}_j$.

The vector $d$ satisfies $A(x + \theta d) = b$; we also want to satisfy $x + \theta d \ge 0$.
(a) If $d \ge 0$, then $x + \theta d \ge 0$ for all $\theta \ge 0$. Hence $\theta^* = \infty$.
(b) If $d_i < 0$ for some $i$, then $x_i + \theta d_i \ge 0 \iff \theta \le -x_i / d_i$.
For nonbasic variables, $d_i \ge 0$, $i \in N$. Hence we only need to consider the basic variables:
$$\theta^* = \min_{\{i = 1, \dots, m \,:\, d_{B(i)} < 0\}} \left( -\frac{x_{B(i)}}{d_{B(i)}} \right) \quad \text{(called the minimum ratio test)}$$

Let $y = x + \theta^* d$. We have $y_j = \theta^* > 0$ for the entering nonbasic variable $x_j$ (we assumed nondegeneracy, hence $x_{B(i)} > 0$ for all basic variables).

Let $l$ be the index of the basic variable selected in the minimum ratio test, i.e.
$$-\frac{x_{B(l)}}{d_{B(l)}} = \min_{\{i = 1, \dots, m \,:\, d_{B(i)} < 0\}} \left( -\frac{x_{B(i)}}{d_{B(i)}} \right) = \theta^*.$$
Then $x_{B(l)} + \theta^* d_{B(l)} = 0$.
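The minimum ratio test is short enough to write out directly. The following sketch assumes a full-length current point `x` and direction `d` (hypothetical example data, 0-based indices) and returns both $\theta^*$ and the index of the variable that reaches zero. The function name `ratio_test` is an illustrative choice.

```python
from fractions import Fraction

def ratio_test(x, d):
    """Return (theta_star, index of the blocking variable).

    Only components with d_i < 0 can block the move; if there are none,
    theta* is infinite and (None, None) is returned.
    """
    candidates = [(-Fraction(x[i]) / Fraction(d[i]), i)
                  for i in range(len(x)) if d[i] < 0]
    if not candidates:
        return None, None
    return min(candidates)          # ties are broken by the smaller index

# Hypothetical data: b.f.s. x = (4, 0, 0, 2) and the basic direction for entering x2.
x = [4, 0, 0, 2]
d = [-1, 1, 0, -2]
theta, leaving = ratio_test(x, d)
y = [xi + theta * di for xi, di in zip(x, d)]
print(theta, leaving, y)   # theta* = 1, index 3 (x4) leaves, y = (3, 1, 0, 0)
```

In the new point `y` the entering variable equals $\theta^*$ and the leaving variable is exactly 0, as stated above.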
Replace $x_{B(l)}$ in the basis with the entering variable $x_j$. The new basis matrix is
$$\bar{B} = \begin{bmatrix} A_{B(1)} & \cdots & A_{B(l-1)} & A_j & A_{B(l+1)} & \cdots & A_{B(m)} \end{bmatrix}.$$
Also replace the set $\{B(1), \dots, B(m)\}$ of basic indices by $\{\bar{B}(1), \dots, \bar{B}(m)\}$ given by
$$\bar{B}(i) = \begin{cases} B(i), & i \ne l, \\ j, & i = l. \end{cases}$$
Thm 3.2:
(a) $A_{B(i)}$, $i \ne l$, and $A_j$ are linearly independent. Hence $\bar{B}$ is a basis matrix.
(b) $y = x + \theta^* d$ is a b.f.s. with basis $\bar{B}$.

Pf) (a) Suppose $A_{\bar{B}(i)}$, $i = 1, \dots, m$, are linearly dependent. Then there exist $\lambda_1, \dots, \lambda_m$, not all zero, such that
$$\sum_{i=1}^{m} \lambda_i A_{\bar{B}(i)} = \bar{B}\lambda = 0 \ \Longrightarrow\ \sum_{i=1}^{m} \lambda_i B^{-1} A_{\bar{B}(i)} = 0.$$
Hence the vectors $B^{-1}A_{\bar{B}(i)}$ are linearly dependent.
But $B^{-1}A_{B(i)} = e_i$ for $i = 1, \dots, m$, $i \ne l$, and $B^{-1}A_j = -d_B$, whose $l$-th component $-d_{B(l)} \ne 0$ by the definition of $l$. Hence $B^{-1}A_j$ and $B^{-1}A_{B(i)}$, $i \ne l$, are linearly independent, a contradiction.
(b) We have $y \ge 0$, $Ay = b$, and $y_i = 0$ for $i \notin \{\bar{B}(1), \dots, \bar{B}(m)\}$. The columns of $\bar{B}$ are linearly independent. Hence $y$ is a b.f.s.
See the text for a complete description of an iteration of the simplex method.

Thm 3.3: Assume the standard form polyhedron $P \ne \emptyset$ and every b.f.s. is nondegenerate. Then the simplex method terminates after a finite number of iterations. At termination, there are two possibilities:
(a) We have an optimal basis $B$ and an optimal b.f.s.
(b) We have found a vector $d$ satisfying $Ad = 0$, $d \ge 0$, and $c'd < 0$, and the optimal cost is $-\infty$.
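The two termination cases can be seen in a compact implementation of the iteration developed above. This is a minimal sketch under stated assumptions, not the textbook's algorithm statement: it recomputes $B^{-1}$ products from scratch with exact fractions, picks the entering variable with the smallest index among those with $\bar{c}_j < 0$, and requires a starting feasible basis. The names `solve` and `simplex` and both example LPs are hypothetical.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M u = rhs exactly by Gauss-Jordan elimination (M square, nonsingular)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * p for a, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def simplex(A, b, c, basis):
    """Minimize c'x s.t. Ax = b, x >= 0, starting from a feasible basis (column indices)."""
    m, n = len(A), len(A[0])
    basis = list(basis)
    while True:
        B = [[A[i][k] for k in basis] for i in range(m)]
        x_B = solve(B, b)                                    # x_B = B^{-1} b
        Bt = [[A[i][k] for i in range(m)] for k in basis]    # transpose of B
        p = solve(Bt, [c[k] for k in basis])                 # p' = c_B' B^{-1}
        cbar = [c[j] - sum(p[i] * A[i][j] for i in range(m)) for j in range(n)]
        entering = [j for j in range(n) if j not in basis and cbar[j] < 0]
        if not entering:                                     # case (a): optimal basis
            x = [Fraction(0)] * n
            for pos, k in enumerate(basis):
                x[k] = x_B[pos]
            return "optimal", x, basis
        j = entering[0]                                      # smallest index with cbar_j < 0
        u = solve(B, [A[i][j] for i in range(m)])            # u = B^{-1} A_j = -d_B
        ratios = [(x_B[i] / u[i], i) for i in range(m) if u[i] > 0]
        if not ratios:                                       # case (b): d >= 0, cost -> -inf
            d = [Fraction(0)] * n
            for pos, k in enumerate(basis):
                d[k] = -u[pos]
            d[j] = Fraction(1)
            return "unbounded", d, basis
        _, l = min(ratios)                                   # minimum ratio test
        basis[l] = j                                         # x_j replaces x_B(l)

# Hypothetical LP with slack basis {x3, x4} (0-based indices 2 and 3).
A = [[1, 1, 1, 0],
     [1, 3, 0, 1]]
status, x, basis = simplex(A, [4, 6], [-2, -3, 0, 0], [2, 3])
print(status, x, basis)   # optimal at x = (3, 1, 0, 0) with basis columns [0, 1]
```

Calling `simplex([[1, -1, 1]], [1], [-1, -1, 0], [2])` on a second hypothetical LP ends in case (b) and returns the direction $d = (1, 1, 0)$, which satisfies $Ad = 0$, $d \ge 0$, $c'd = -2 < 0$.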
Remarks

$$Ax = b \iff [B, N] \begin{pmatrix} x_B \\ x_N \end{pmatrix} = b, \qquad d = \begin{pmatrix} d_B \\ d_N \end{pmatrix} = \begin{pmatrix} -B^{-1}A_j \\ e_j \end{pmatrix}$$

(1) Suppose $x$ is a nondegenerate b.f.s. and we move to $x + \theta d$, $\theta > 0$.
Consider the point $y = x + \theta^* d$ with $\theta^* > 0$ and $y$ feasible (nondegeneracy of $x$ guarantees the existence of such $\theta^* > 0$ with $y$ feasible). Then $A(x + \theta^* d) = b$ and, writing $y = (y_B, y_N)$,
$$y_B = x_B + \theta^* d_B > 0 \quad \text{for sufficiently small } \theta^* > 0,$$
$$y_N = x_N + \theta^* d_N = 0 + \theta^* e_j = (0, \dots, \theta^*, 0, \dots, 0).$$
Since $n - m - 1$ of the constraints $x_j \ge 0$ are active and the $m$ constraints $Ax = b$ are active, $n - 1$ constraints are active at $x + \theta^* d$ (and the active constraints are linearly independent), and no other inequalities are active.
(continued)
Hence $y$ is in the face defined by the active constraints, which is one-dimensional since the equality set of the face has rank $n - 1$. So $y$ is in a one-dimensional face of $P$ (an edge) and in no other proper face of it.

When $\theta^*$ is such that at least one of the basic variables becomes 0 (say $x_l$), the entering nonbasic variable replaces $x_l$ in the basis, the new basis matrix is nonsingular, and the leaving basic variable satisfies $x_l = 0$ (the constraint $x_l \ge 0$ becomes active). Hence we get a new b.f.s., which is a 0-dimensional face of $P$.

For a nondegenerate simplex iteration, we start from a b.f.s. (0-dimensional face), then follow an edge (1-dimensional face) of $P$ until we reach another b.f.s. (0-dimensional face).
(continued)
(2) If $d = \begin{pmatrix} d_B \\ d_N \end{pmatrix} = \begin{pmatrix} -B^{-1}A_j \\ e_j \end{pmatrix} \ge 0$, then $x + \theta d \ge 0$ for all $\theta > 0$, hence feasible.

The recession cone of $P$ is $K = \{y : Ay = 0,\ y \ge 0\}$. ($P = K + Q$)

Since $d \in K$ and $n - 1$ independent rows are active at $d$, $d$ is an extreme ray of $K$ (recall HW), and $c'd = c_j - c_B' B^{-1} A_j < 0$ implies that the LP is unbounded.

Hence, given a basis (b.f.s.), finding an extreme ray $d$ (basic direction) in the recession cone with $c'd < 0$ provides a proof of unboundedness of the LP.
Simplex method for degenerate problems

If degeneracy is allowed, there are two possibilities:
(a) The current b.f.s. is degenerate. Then $\theta^*$ may be 0 (if, for some $l$, $x_{B(l)} = 0$ and $d_{B(l)} < 0$). Perform the iteration as usual with $\theta^* = 0$. The new basis matrix $\bar{B}$ is still nonsingular (the solution does not change, only the basis changes), hence the current solution is a b.f.s. with a different basis $\bar{B}$. (Note that we may have a nondegenerate iteration even though the current solution is degenerate.)
(b) Although $\theta^*$ may be positive, more than one of the original basic variables may become 0 at the new point. Only one of them exits the basis, and the resulting solution is degenerate. (This happens when there are ties in the minimum ratio test.)
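Case (a) can be illustrated with a tiny degenerate b.f.s. In the sketch below, the data are an assumed example (not from the slides) with a slack basis, so $B = I$ and $B^{-1}A_j = A_j$. One entering choice gives a degenerate pivot with $\theta^* = 0$, while another gives a nondegenerate step from the same degenerate point.

```python
from fractions import Fraction

def ratio_test(x_B, u):
    """Minimum ratio test on the basic variables, with u = B^{-1} A_j (so d_B = -u).

    Returns (theta_star, position l of the leaving variable in the basis),
    or (None, None) if no basic variable blocks the move.
    """
    ratios = [(Fraction(x_B[i]) / Fraction(u[i]), i)
              for i in range(len(u)) if u[i] > 0]
    return min(ratios) if ratios else (None, None)

# Hypothetical data: A = [[1, 1, 1, 0], [1, -1, 0, 1]], b = (1, 0), basis (x3, x4), B = I.
x_B = [1, 0]      # x4 is basic at value 0, so this b.f.s. is degenerate
u_x1 = [1, 1]     # B^{-1} A_1: both basic variables decrease when x1 enters
u_x2 = [1, -1]    # B^{-1} A_2: x3 decreases, x4 increases when x2 enters

step_x1 = ratio_test(x_B, u_x1)
step_x2 = ratio_test(x_B, u_x2)
print(step_x1)    # theta* = 0 at position 1: x4 leaves, the point does not move
print(step_x2)    # theta* = 1 at position 0: x3 leaves, a nondegenerate iteration
```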
[Figure 3.3: $n - m = 2$. Constraint lines $x_1 = 0, \dots, x_6 = 0$, points $x$ and $y$, and directions $f$, $g$, $-g$, $h$.]
$x_4, x_5$ are nonbasic ($g, f$ are the basic directions). Then pivot with $x_4$ entering and $x_6$ exiting the basis ($h, -g$ are the basic directions). Now if $x_5$ enters the basis, we follow the direction $h$ until $x_1 \ge 0$ becomes active, in which case $x_1$ leaves the basis.
Cycling: a sequence of basis changes that leads back to the initial basis (only the basis changes, the solution does not).

Cycling may occur if there is degeneracy, in which case finite termination of the simplex method is not guaranteed. Special rules for entering and/or leaving variable selection are needed to avoid cycling (later).

Although cycling hardly occurs in practice, prolonged sequences of degenerate iterations may happen frequently, especially in well-structured problems. Hence how to get out of degenerate iterations as early as possible is of practical concern.
Pivot Selection
(a) Smallest (largest) coefficient rule: choose $x_j$ with $j = \arg\min_{j \in N} \{\bar{c}_j : \bar{c}_j < 0\}$.
(b) Largest increase rule: choose $x_j$ with $\bar{c}_j < 0$ for which the improvement $\theta^* |\bar{c}_j|$ is maximum.
(c) Steepest edge rule.
(d) Maintain a candidate list.
(e) Smallest subscript rule (avoids cycling).
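Rules (a) and (e) differ only in how they pick among the nonbasic indices with negative reduced cost, which a few lines of code make concrete. The function names and the reduced cost vector below are illustrative assumptions. Only the entering-variable part of the smallest subscript rule is shown; avoiding cycling also requires the matching tie-breaking rule for the leaving variable.

```python
def smallest_coefficient(cbar, nonbasic):
    """Rule (a): entering index with the most negative reduced cost, or None if optimal."""
    candidates = [j for j in nonbasic if cbar[j] < 0]
    return min(candidates, key=lambda j: cbar[j]) if candidates else None

def smallest_subscript(cbar, nonbasic):
    """Rule (e), entering part: smallest index with a negative reduced cost, or None."""
    candidates = [j for j in nonbasic if cbar[j] < 0]
    return min(candidates) if candidates else None

# Hypothetical reduced costs; indices 0 and 2 are basic (reduced cost 0).
cbar = [0, -1, 0, -5, 2]
nonbasic = [1, 3, 4]
print(smallest_coefficient(cbar, nonbasic))  # index 3, since -5 is the most negative
print(smallest_subscript(cbar, nonbasic))    # index 1, the first improving index
```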
Review of calculus

Purpose: interpret the value $c'd^j \equiv \bar{c}_j$ in a different way and derive the logic for the steepest edge rule.

Def: Let $p > 0$ be an integer and $h : R^n \to R$. Then $h(x) = o(\|x\|^p)$ if and only if
$$\lim_{k \to \infty} \frac{h(x_k)}{\|x_k\|^p} = 0$$
for all sequences $\{x_k\}$ with $x_k \ne 0$ for all $k$ that converge to 0.

Def: $f : R^n \to R$ is called differentiable at $x$ if and only if there exists a vector $\nabla f(x)$ (called the gradient) such that
$$f(z) = f(x) + \nabla f(x)'(z - x) + o(\|z - x\|),$$
or in other words,
$$\lim_{z \to x} \frac{f(z) - f(x) - \nabla f(x)'(z - x)}{\|z - x\|} = 0. \quad \text{(Frechet differentiability)}$$
Def: Let $f : R^n \to R$. The one-sided directional derivative of $f$ at $x$ with respect to a vector $y$ is defined as
$$f'(x; y) \equiv \lim_{\lambda \downarrow 0} \frac{f(x + \lambda y) - f(x)}{\lambda}$$
if it exists.

Note that $-f'(x; -y) = \lim_{\lambda \uparrow 0} \frac{f(x + \lambda y) - f(x)}{\lambda}$. Hence the one-sided directional derivative $f'(x; y)$ is two-sided if and only if $f'(x; -y)$ exists and $f'(x; -y) = -f'(x; y)$.

Def: The $i$-th partial derivative of $f$ at $x$ is
$$\frac{\partial f(x)}{\partial x_i} = \lim_{\lambda \to 0} \frac{f(x + \lambda e_i) - f(x)}{\lambda}$$
if it exists (two-sided).

(Gateaux differentiability)
($f$ is called Gateaux differentiable at $x$ if all (two-sided) directional derivatives of $f$ at $x$ exist and $f'(x; y)$ is a linear function of $y$. F differentiability implies G differentiability, but not conversely. We do not need to distinguish F and G differentiability for our purposes here.)
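Suppose $f$ is F differentiable at $x$. Then for any $y \ne 0$,
$$0 = \lim_{\lambda \downarrow 0} \frac{f(x + \lambda y) - f(x) - \nabla f(x)'(\lambda y)}{\lambda \|y\|} = \frac{1}{\|y\|} \left[ f'(x; y) - \nabla f(x)'y \right].$$
Hence $f'(x; y)$ exists and $f'(x; y) = \nabla f(x)'y$ (a linear function of $y$).

If $f$ is F differentiable, the above implies $f'(x; -y) = -f'(x; y)$, hence $f'(x; y)$ is two-sided.

In particular,
$$\nabla f(x)' e_i = \lim_{\lambda \to 0} \frac{f(x + \lambda e_i) - f(x)}{\lambda} = \frac{\partial f}{\partial x_i}(x).$$
Hence $\nabla f(x) = \left( \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_n}(x) \right)$.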
In the simplex algorithm, the moving direction is $d = \begin{pmatrix} -B^{-1}A_j \\ e_j \end{pmatrix}$ for $x_j$ entering. With $f(x) = c'x$ we have $\nabla f(x) = c$, so
$$f'(x; d) = \nabla f(x)'d = c'd = (c_B', c_N') \begin{pmatrix} -B^{-1}A_j \\ e_j \end{pmatrix} = c_j - c_B' B^{-1} A_j = \bar{c}_j.$$
Hence the rate of change $c'd$ in the objective function when we move in the direction $d$ from $x$ is the directional derivative. So $c_j - c_B' B^{-1} A_j$ is the rate of change of $f$ when we move in the direction $d$.

But $f'(x; d)$ is sensitive to the size (norm) of $d$: $f'(x; kd) = \nabla f(x)'(kd) = k f'(x; d)$.
To make a fair comparison among basic directions, use the normalized vector $d / \|d\|$ to compute the directional derivative.
$$\nabla f(x)' \frac{d^j}{\|d^j\|} = \frac{\bar{c}_j}{\|d^j\|} = \frac{\bar{c}_j}{\sqrt{\|B^{-1}A_j\|^2 + 1}}$$
$\left( \|d\| = \left( \sum_i d_i^2 \right)^{1/2},\ d^j = \begin{pmatrix} -B^{-1}A_j \\ e_j \end{pmatrix} \right)$

Hence, among basic directions with $\bar{c}_j < 0$, choose the one with the smallest normalized directional derivative. (Steepest edge rule: choose the basic direction that makes the smallest angle with the vector $-c$.)

The problem here is that we need to compute $\|d^j\|$ (additional effort needed). But once $\|d^j\|$ is computed, it can be updated efficiently in subsequent iterations.
The rule is competitive (especially in its dual form) against other rules in real implementations.
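The effect of the normalization can be seen on a small example. The sketch below is an illustration with assumed data (reduced costs and columns $B^{-1}A_j$ for two candidate entering variables) and computes the norms from scratch rather than with the efficient update mentioned above. The function name `steepest_edge` is hypothetical.

```python
import math

def steepest_edge(cbar, u_cols):
    """Pick the entering index minimizing cbar_j / sqrt(||B^{-1} A_j||^2 + 1).

    cbar:   dict {j: reduced cost} over the nonbasic indices
    u_cols: dict {j: list holding the vector B^{-1} A_j}
    Returns (j, normalized value), or None if no reduced cost is negative.
    """
    best = None
    for j, cj in cbar.items():
        if cj < 0:
            norm_d = math.sqrt(sum(v * v for v in u_cols[j]) + 1.0)  # ||d^j||
            score = cj / norm_d
            if best is None or score < best[1]:
                best = (j, score)
    return best

# Hypothetical data for a slack basis (B = I), so B^{-1} A_j = A_j.
cbar = {0: -2, 1: -3}
u_cols = {0: [1, 1], 1: [1, 3]}
choice = steepest_edge(cbar, u_cols)
dantzig = min(cbar, key=cbar.get)
print(choice[0], dantzig)   # steepest edge picks index 0, smallest coefficient picks index 1
```

Here $-2/\sqrt{3} \approx -1.155$ is smaller than $-3/\sqrt{11} \approx -0.905$, so the direction with the less negative reduced cost is the steeper one once the lengths of the directions are taken into account.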