Chapter 6. Large Scale Optimization

6.1 Delayed column generation

Consider $\min c'x$ subject to $Ax = b$, $x \ge 0$, where $A$ has full row rank and a large number of columns.
It is impractical to have all columns initially. Start with a few columns and a basic feasible solution (restricted problem).
Want to generate (find) an entering nonbasic variable (column) as needed ((delayed) column generation).
If the reduced cost $\bar{c}_i < 0$, then $x_i$ can enter the basis. Hence solve $\min_i \bar{c}_i$ over all $i$.
If $\min_i \bar{c}_i < 0$, have found an entering variable (column).
If $\min_i \bar{c}_i \ge 0$, no entering column exists, hence the current basis is optimal.
If an entering column is found, add the column to the restricted problem and solve the restricted problem again to optimality:
\[ \min \sum_{i \in I} c_i x_i \quad \text{s.t.} \quad \sum_{i \in I} A_i x_i = b, \quad x \ge 0 \]
($I$: index set of variables (columns) we have at hand.)
Then continue to find entering columns.

Linear Programming 2018
6.2. Cutting stock problem

[Figure: a raw of width W = 70 cut into finals of widths 17, 17, 17 and 15; the remainder is scrap.]
Rolls of paper of width $W$ (called raws) are to be cut into smaller pieces (called finals).
$b_i$ rolls of width $w_i$, $i = 1, 2, \dots, m$, need to be produced.
How to cut the raws to minimize the number of raws used while satisfying the order (equivalent to minimizing scrap)?
ex) $W = 70$: then 3 finals of width $w_1 = 17$ and 1 final of width $w_2 = 15$ can be produced from one raw. This way of production can be represented as the pattern $(3, 1, 0, 0, \dots, 0)$.
The $j$-th pattern $(a_{1j}, a_{2j}, \dots, a_{mj})$, where $a_{ij}$ is the number of $i$-th finals produced in the $j$-th pattern, is feasible if
\[ \sum_{i=1}^{m} a_{ij} w_i \le W. \]
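The feasibility condition for a pattern can be checked mechanically. The following is a minimal sketch; the function name `pattern_usage` and its return convention are illustrative assumptions, not part of the lecture.

```python
def pattern_usage(a, w, W):
    """Check a cutting pattern against the raw width.

    a : list of nonnegative integers, a[i] = number of finals of width w[i]
    w : list of final widths
    W : width of one raw
    Returns (feasible, scrap), where scrap = W - sum_i a[i] * w[i].
    (Hypothetical helper for illustration only.)
    """
    if len(a) != len(w):
        raise ValueError("pattern and width lists must have the same length")
    used = sum(ai * wi for ai, wi in zip(a, w))
    nonneg_int = all(isinstance(ai, int) and ai >= 0 for ai in a)
    return nonneg_int and used <= W, W - used


# The pattern (3, 1) from the example uses 3*17 + 1*15 = 66 of the 70 units.
print(pattern_usage([3, 1], [17, 15], 70))
```

For the example pattern this reports a feasible pattern with 4 units of scrap; a pattern such as (5, 0) would need 85 units and is rejected.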
Formulation
\[
\begin{array}{rl}
\min & \sum_{j=1}^{n} x_j \\
\text{s.t.} & \sum_{j=1}^{n} a_{ij} x_j = b_i, \quad i = 1, \dots, m \\
& x_j \ge 0 \text{ and integer}, \quad j = 1, \dots, n,
\end{array}
\]
where $a_{ij}$ is the number of $i$-th finals produced in the $j$-th pattern.
$x_j$: the number of raws to be cut using cutting pattern $j$.
$n$: total number of possible cutting patterns, which can be very large.
(In column form: $\sum_{j=1}^{n} A_j x_j = b$, where $A_j$ is the vector denoting the $j$-th cutting pattern.)
We need an integer solution, but the LP relaxation can be used to find a good approximate solution if the solution values are large (round down the solution to obtain an integer solution). Use a few more raws to meet the unsatisfied demand.
For the initial b.f.s., for $j = 1, \dots, m$, let the $j$-th pattern consist of one final of width $w_j$ and none of the other widths.
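The rounding step can be illustrated as follows. This is a sketch under stated assumptions: the helper name `round_down`, the tolerance `tol`, and the sample data (two patterns for widths 17 and 15 with $W = 70$) are made up for illustration.

```python
import math


def round_down(patterns, x, b, tol=1e-9):
    """Round a fractional LP solution down and report unmet demand.

    patterns : list of patterns, patterns[j][i] = a_ij
    x        : fractional LP values, x[j] = raws cut with pattern j
    b        : demands b[i]
    tol      : small tolerance so that values like 2.9999999999 round to 3
               (an assumption about solver noise, not from the lecture)
    Returns (x_int, residual) with residual[i] = b[i] - sum_j a_ij * x_int[j].
    """
    x_int = [math.floor(xj + tol) for xj in x]
    residual = [
        b[i] - sum(patterns[j][i] * x_int[j] for j in range(len(patterns)))
        for i in range(len(b))
    ]
    return x_int, residual


# Patterns (4, 0) and (0, 4), demand b = (10, 6): LP solution x = (2.5, 1.5).
x_int, residual = round_down([(4, 0), (0, 4)], [2.5, 1.5], [10, 6])
print(x_int, residual)
```

In this example the rounded solution uses 3 raws and leaves a residual demand of 2 finals of each width, which has total width 2*17 + 2*15 = 64 and therefore fits into one additional raw.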
After computing the vector $p$ ($p' = c_B' B^{-1}$) from the optimal solution to the restricted problem, we try to find an entering nonbasic variable (column).
A candidate for the entering column (pattern) is any nonbasic variable with reduced cost $1 - p'A_j < 0$ (i.e. $p'A_j > 1$), hence solve $\min\,(1 - p'A_j)$ over all possible patterns.
Equivalently, solve $\max p'A_j$ over all possible patterns (integer knapsack problem):
\[
\begin{array}{rl}
z = \max & \sum_{i=1}^{m} p_i a_i \\
\text{s.t.} & \sum_{i=1}^{m} w_i a_i \le W \\
& a_i \ge 0 \text{ and integer}, \quad i = 1, \dots, m
\end{array}
\]
If $z > 1$, have found a cutting pattern (nonbasic variable) that can enter the basis.
Otherwise ($z \le 1$), the current optimal solution to the restricted problem is optimal to the whole problem.
Here, $p_i$ can be interpreted as the value of the $i$-th final at the current solution (current basis $B$).

Dynamic programming algorithm for the integer knapsack problem
(Can assume $w_i > 0$ and integer. Knapsack is NP-hard, so no polynomial time algorithm is known.)
Let $F(v)$ be the optimal value of the problem when the knapsack capacity is $v$, and let $w_{\min} = \min_i w_i$.
For $v < w_{\min}$, $F(v) = 0$.
For $v \ge w_{\min}$,
\[ F(v) = \max_{i = 1, \dots, m} \{ F(v - w_i) + p_i : v \ge w_i \}. \]
(The recursion assumes $p_i \ge 0$; a final with $p_i < 0$ is never used in an optimal pattern and can be dropped.)
pf) Suppose $a^0$ is an optimal solution when the r.h.s. is $v - w_i$; then $a^0 + e_i$ is a feasible solution when the r.h.s. is $v$.
Hence $F(v) \ge F(v - w_i) + p_i$, $i = 1, \dots, m$, $v \ge w_i$.
Suppose $a^*$ is a nonzero optimal solution when the r.h.s. is $v \ge w_{\min}$ (one exists since $p \ge 0$); then there exists some $k$ with $a_k^* > 0$ and $v \ge w_k$.
Hence $a^* - e_k$ is a feasible solution when the r.h.s. is $v - w_k$.
So $F(v - w_k) \ge F(v) - p_k$ (i.e. $F(v) \le F(v - w_k) + p_k$ for some $k$).
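The recursion for $F(v)$ can be sketched in a few lines. The function name `knapsack_pattern` is an assumption for illustration; the code additionally records which final attains the maximum at each capacity so that a pattern can be read off by walking back through the table. Taking the maximum with 0 is a safeguard that coincides with the recursion above whenever $p \ge 0$.

```python
def knapsack_pattern(p, w, W):
    """Integer knapsack by dynamic programming over capacities 0..W.

    p : values p_i of the finals (dual prices of the restricted problem)
    w : positive integer widths w_i
    W : integer raw width
    Returns (z, a): optimal value F(W) and a pattern a attaining it.
    (Hypothetical helper; runs in O(m * W) time.)
    """
    m = len(w)
    F = [0.0] * (W + 1)          # F[v] = optimal value with capacity v
    choice = [None] * (W + 1)    # final added last at capacity v, if any
    for v in range(1, W + 1):
        for i in range(m):
            if w[i] <= v and F[v - w[i]] + p[i] > F[v]:
                F[v] = F[v - w[i]] + p[i]
                choice[v] = i
    # Walk back through the recorded choices to recover the pattern.
    a = [0] * m
    v = W
    while v > 0 and choice[v] is not None:
        i = choice[v]
        a[i] += 1
        v -= w[i]
    return F[W], a


# With the initial basis of single-final patterns, B = I and p = (1, 1).
z, a = knapsack_pattern([1.0, 1.0], [17, 15], 70)
print(z, a)   # z > 1 means the pattern a can enter the basis
```

With the initial identity basis the prices are $p = (1, 1)$, so the knapsack simply packs as many finals as possible into one raw, and the resulting pattern has $z > 1$.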
The actual solution is recovered by backtracking the recursion.
The running time of the algorithm is $O(mW)$, which is not polynomial in the length of the encoding. It is called pseudopolynomial running time (polynomial in the data $W$ itself).
Note: the running time would be polynomial if it were polynomial with respect to $m$ and $\log_2 W$, but $W = 2^{\log_2 W}$ is exponential, not polynomial, in $\log_2 W$.
Many practical problems can be naturally formulated similarly to the cutting stock problem, especially 0-1 IPs with many columns.
For the cutting stock problem, we only obtained a fractional solution. But for 0-1 IP, a fractional solution can be of little help and we need a mechanism to find an optimal integer solution (branch-and-price approach: column generation combined with branch-and-bound).
6.3. Cutting plane methods

Dual of column generation (constraint generation).
Consider
\[ \max p'b \quad \text{s.t.} \quad p'A_i \le c_i, \quad i = 1, \dots, n \qquad (1) \]
($n$ can be very large).
Solve
\[ \max p'b \quad \text{s.t.} \quad p'A_i \le c_i, \quad i \in I, \quad I \subseteq \{1, \dots, n\} \qquad (2) \]
and get an optimal solution $p^*$ to (2).
If $p^*$ is feasible to (1), then it is also optimal to (1).
If $p^*$ is infeasible to (1), find a violated constraint in (1), add it to (2), and reoptimize (2). Repeat.
Separation problem: Given a polyhedron $P$ (described with possibly many inequalities) and a vector $p^*$, determine if $p^* \in P$. If $p^* \notin P$, find a (valid) inequality violated by $p^*$.
Solve $\min_i\,(c_i - (p^*)'A_i)$ over all $i$.
If the optimal value $\ge 0$ $\Rightarrow$ $p^* \in P$.
If the optimal value $< 0$ $\Rightarrow$ $c_i < (p^*)'A_i$ for the minimizing $i$ (violated).
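When the constraints are available as an explicit list, the separation problem can be sketched by direct enumeration. The function name `separate` and the tolerance `tol` are illustrative assumptions; in large problems the enumeration would be replaced by an optimization problem over the columns, such as the knapsack problem of Section 6.2.

```python
def separate(p, columns, costs, tol=1e-9):
    """Separation by enumeration for the constraints p'A_i <= c_i.

    p       : candidate dual vector p*
    columns : list of columns A_i
    costs   : list of costs c_i
    tol     : violation threshold (assumption, guards against rounding noise)
    Returns the index of a most violated constraint, or None if p is feasible.
    """
    best_i, best_val = None, -tol
    for i, (A_i, c_i) in enumerate(zip(columns, costs)):
        val = c_i - sum(pk * ak for pk, ak in zip(p, A_i))  # c_i - p'A_i
        if val < best_val:
            best_i, best_val = i, val
    return best_i


columns = [(1, 0), (0, 1), (4, 0)]
costs = [1, 1, 1]
print(separate((1.0, 1.0), columns, costs))    # third constraint is violated
print(separate((0.25, 1.0), columns, costs))   # no violated constraint
```

Returning the most violated constraint is one common choice; any violated constraint would be enough for the method to proceed.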
6.4. Dantzig-Wolfe decomposition

Use the decomposition theorem to represent a specially structured LP problem in a different form. Column generation is used to solve the problem.
Consider an LP in the following form:
\[
\begin{array}{rl}
\min & c_1'x_1 + c_2'x_2 \\
\text{s.t.} & D_1 x_1 + D_2 x_2 = b_0 \\
& F_1 x_1 = b_1 \\
& F_2 x_2 = b_2 \\
& x_1, x_2 \ge 0
\end{array}
\qquad (1)
\]
$x_1, x_2$: dimension $n_1, n_2$; $b_0, b_1, b_2$: dimension $m_0, m_1, m_2$.
Let $P_i = \{ x_i \ge 0 : F_i x_i = b_i \}$, $i = 1, 2$. Assume $P_i \ne \emptyset$.
Note that the nonnegativity constraints guarantee that $P_i$ is pointed, hence $S = \{0\}$ (in the decomposition $P = S + K + Q$).
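Problem (1) can be written as
\[
\begin{array}{rl}
\min & c_1'x_1 + c_2'x_2 \\
\text{s.t.} & D_1 x_1 + D_2 x_2 = b_0 \\
& x_1 \in P_1, \; x_2 \in P_2
\end{array}
\qquad (2)
\]
$x_i \in P_i$ can be represented as
\[ x_i = \sum_{j \in J_i} \lambda_i^j x_i^j + \sum_{k \in K_i} \theta_i^k w_i^k, \qquad \lambda_i^j, \theta_i^k \ge 0, \quad \sum_{j \in J_i} \lambda_i^j = 1. \]
Plug into (2) to get the master problem:
\[
\begin{array}{rll}
\min & \sum_{j \in J_1} \lambda_1^j c_1'x_1^j + \sum_{k \in K_1} \theta_1^k c_1'w_1^k + \sum_{j \in J_2} \lambda_2^j c_2'x_2^j + \sum_{k \in K_2} \theta_2^k c_2'w_2^k & \\
\text{s.t.} & \sum_{j \in J_1} \lambda_1^j D_1 x_1^j + \sum_{k \in K_1} \theta_1^k D_1 w_1^k + \sum_{j \in J_2} \lambda_2^j D_2 x_2^j + \sum_{k \in K_2} \theta_2^k D_2 w_2^k = b_0 & (\text{dual vec. } q) \\
& \sum_{j \in J_1} \lambda_1^j = 1 & (\text{dual var. } r_1) \\
& \sum_{j \in J_2} \lambda_2^j = 1 & (\text{dual var. } r_2) \\
& \lambda_i^j, \theta_i^k \ge 0, \quad \forall\, i, j, k &
\end{array}
\]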
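Alternatively, its columns can be viewed as
\[
\sum_{j \in J_1} \lambda_1^j \begin{bmatrix} D_1 x_1^j \\ 1 \\ 0 \end{bmatrix}
+ \sum_{j \in J_2} \lambda_2^j \begin{bmatrix} D_2 x_2^j \\ 0 \\ 1 \end{bmatrix}
+ \sum_{k \in K_1} \theta_1^k \begin{bmatrix} D_1 w_1^k \\ 0 \\ 0 \end{bmatrix}
+ \sum_{k \in K_2} \theta_2^k \begin{bmatrix} D_2 w_2^k \\ 0 \\ 0 \end{bmatrix}
= \begin{bmatrix} b_0 \\ 1 \\ 1 \end{bmatrix}
\]
The new formulation has many variables (columns), but it can be solved by the column generation technique.
The actual solution $x_1, x_2$ can be recovered from $\lambda$ and $\theta$.
$x_i$ is expressed as a convex combination of extreme points of $P_i$ plus a conical combination of extreme rays of $P_i$.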
Decomposition algorithm

Start with a restricted master problem with a few columns, providing an initial b.f.s. to the restricted master.
Suppose we have an optimal b.f.s. to the restricted master problem, with dual vector $p = (q, r_1, r_2)$, $q \in R^{m_0}$, $r_1, r_2 \in R$.
Then the reduced costs (for $\lambda_1^j$, $\theta_1^k$) are
\[ c_1'x_1^j - \begin{bmatrix} q' & r_1 & r_2 \end{bmatrix} \begin{bmatrix} D_1 x_1^j \\ 1 \\ 0 \end{bmatrix} = (c_1' - q'D_1) x_1^j - r_1 \]
\[ c_1'w_1^k - \begin{bmatrix} q' & r_1 & r_2 \end{bmatrix} \begin{bmatrix} D_1 w_1^k \\ 0 \\ 0 \end{bmatrix} = (c_1' - q'D_1) w_1^k \]
An entering variable (column) is identified if the reduced cost of some variable (not currently present in the restricted master) is $< 0$. Hence solve
\[ \min\, (c_1' - q'D_1) x_1 \quad \text{s.t.} \quad x_1 \in P_1 \qquad \text{(called subproblem)} \]
(a) Optimal cost is $-\infty$ $\Rightarrow$ simplex returns an extreme ray $w_1^k$ with $(c_1' - q'D_1) w_1^k < 0$.
Generate a column for $\theta_1^k$, i.e. $\begin{bmatrix} D_1 w_1^k \\ 0 \\ 0 \end{bmatrix}$.
(b) Optimal cost is finite and $< r_1$ $\Rightarrow$ simplex returns an extreme point $x_1^j$ with $(c_1' - q'D_1) x_1^j < r_1$.
Generate a column for $\lambda_1^j$, i.e. $\begin{bmatrix} D_1 x_1^j \\ 1 \\ 0 \end{bmatrix}$.
(c) Optimal cost $\ge r_1$ $\Rightarrow$ $(c_1' - q'D_1) x_1^j \ge r_1 \;\forall\, x_1^j$ and $(c_1' - q'D_1) w_1^k \ge 0 \;\forall\, w_1^k$
$\Rightarrow$ no entering variable among $\lambda_1^j, \theta_1^k$.
Perform the same for $\lambda_2^j, \theta_2^k$.
The method can also be used when there are more than 2 blocks or just one block in the constraints (see ex. 6.2, 6.3).
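How a subproblem answer turns into a master column and a reduced cost can be sketched as follows. The helper names `master_column` and `reduced_cost`, the argument layout, and the small data are illustrative assumptions; the subproblem itself is assumed to have been solved elsewhere.

```python
def master_column(D, v, block, num_blocks, is_ray):
    """Build the master column [D v; e_block] (extreme point) or [D v; 0] (ray).

    D          : coupling matrix D_i as a list of rows
    v          : extreme point x_i^j or extreme ray w_i^k returned by the subproblem
    block      : 0-based index of the block the vector belongs to
    num_blocks : number of convexity constraints in the master
    """
    Dv = [sum(d * x for d, x in zip(row, v)) for row in D]
    convexity = [0] * num_blocks
    if not is_ray:
        convexity[block] = 1   # extreme points take part in sum_j lambda_i^j = 1
    return Dv + convexity


def reduced_cost(c, D, v, q, r_block, is_ray):
    """Reduced cost (c' - q'D) v - r_i for a point, (c' - q'D) v for a ray."""
    Dv = [sum(d * x for d, x in zip(row, v)) for row in D]
    value = sum(ci * vi for ci, vi in zip(c, v)) - sum(qt * t for qt, t in zip(q, Dv))
    return value if is_ray else value - r_block


D1, c1, q, r1 = [[1, 2]], [1, 1], [1.0], 0.5
print(master_column(D1, [1, 1], 0, 2, is_ray=False), reduced_cost(c1, D1, [1, 1], q, r1, False))
print(master_column(D1, [1, 0], 0, 2, is_ray=True), reduced_cost(c1, D1, [1, 0], q, r1, True))
```

In this made-up instance the extreme point has a negative reduced cost and would enter the restricted master, while the ray has reduced cost 0 and would not.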
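Starting the algorithm

Find extreme points $x_1^1$, $x_2^1$ of $P_1$ and $P_2$.
May assume that $D_1 x_1^1 + D_2 x_2^1 \le b_0$ (if not, multiply both sides of the corresponding constraint by $-1$), then solve the phase 1 problem
\[
\begin{array}{rl}
\min & \sum_{t=1}^{m_0} y_t \\
\text{s.t.} & \sum_{i=1,2} \left( \sum_{j \in J_i} \lambda_i^j D_i x_i^j + \sum_{k \in K_i} \theta_i^k D_i w_i^k \right) + y = b_0 \\
& \sum_{j \in J_1} \lambda_1^j = 1 \\
& \sum_{j \in J_2} \lambda_2^j = 1 \\
& \lambda_i^j \ge 0, \; \theta_i^k \ge 0, \; y_t \ge 0, \quad \forall\, i, j, k, t
\end{array}
\]
Initial b.f.s.: $\lambda_1^1 = \lambda_2^1 = 1$, $\lambda_i^j = 0$ for $j \ne 1$, $\theta_i^k = 0 \;\forall\, k$, $y = b_0 - D_1 x_1^1 - D_2 x_2^1$.
From an optimal solution of phase 1 with objective value 0, can find an initial b.f.s. for phase 2. (If the optimal value is positive, the master problem is infeasible.)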
Termination and computational experience

Fast improvement in early iterations, but convergence becomes slow in the tail of the sequence.
Revised simplex is more competitive in terms of running time.
Suitable for large, structured problems.
There is ongoing research on improving the convergence speed. Adding a column is equivalent to adding a violated constraint in the dual problem. The dual extreme point solution oscillates much in the dual space, which is not desirable → stabilized column generation.
Think in the dual space: how to obtain a dual optimal solution fast?
Column generation is also used to solve the LP relaxation of D-W decomposed integer programs. Need an integer solution → branch-and-price.
An advantage of the decomposition approach also lies in its capability to handle (isolate) difficult structures in the subproblem when we consider large integer programs (e.g., constrained shortest path, robust knapsack problem type).
Recall the alternative formulation for the communication path problem in Chapter 1. It is a D-W decomposition.
Bounds on the optimal cost

Thm 6.1: Suppose the optimal cost $z^*$ is finite. Let $z$ be the cost of the current b.f.s. of the master problem (an upper bound on $z^*$), $r_i$ the dual variable value for the $i$-th convexity constraint, and $z_i$ the finite optimal cost of the $i$-th subproblem. Then
\[ z + \sum_i (z_i - r_i) \le z^* \le z. \]
pf) Modify the current dual solution into a dual feasible solution by decreasing the value of $r_i$ to $z_i$.
The dual of the master problem is
\[
\begin{array}{rl}
\max & q'b_0 + r_1 + r_2 \\
\text{s.t.} & q'D_1 x_1^j + r_1 \le c_1'x_1^j, \quad \forall\, j \in J_1 \\
& q'D_1 w_1^k \le c_1'w_1^k, \quad \forall\, k \in K_1 \\
& q'D_2 x_2^j + r_2 \le c_2'x_2^j, \quad \forall\, j \in J_2 \\
& q'D_2 w_2^k \le c_2'w_2^k, \quad \forall\, k \in K_2
\end{array}
\]
(continued)
Suppose we have a b.f.s. to the master problem with cost $z$ and dual values $q, r_1, r_2$. Then $q'b_0 + r_1 + r_2 = z$.
The optimal cost $z_1$ of the first subproblem is finite, so
\[ \min_{j \in J_1} (c_1'x_1^j - q'D_1 x_1^j) = z_1, \qquad \min_{k \in K_1} (c_1'w_1^k - q'D_1 w_1^k) \ge 0. \]
Note that currently we have $z_1 = \min_{j \in J_1} (c_1' - q'D_1) x_1^j < r_1$ (reduced cost $(c_1' - q'D_1) x_1^j - r_1 < 0$ for the entering variable).
If we use $z_1$ in place of $r_1$, we get dual feasibility for the first two sets of dual constraints. Similarly, use $z_2$ in place of $r_2$.
The cost of this dual feasible solution is $q'b_0 + z_1 + z_2$, so by weak duality
\[ z^* \ge q'b_0 + z_1 + z_2 = q'b_0 + r_1 + r_2 + (z_1 - r_1) + (z_2 - r_2) = z + (z_1 - r_1) + (z_2 - r_2). \]
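The bound of Thm 6.1 is cheap to evaluate at every iteration once the subproblems have been solved. A minimal sketch, with a hypothetical helper name `dw_bounds` and made-up numbers:

```python
def dw_bounds(z, sub_costs, r):
    """Bounds z + sum_i (z_i - r_i) <= z* <= z from Thm 6.1.

    z         : cost of the current b.f.s. of the master problem
    sub_costs : finite optimal subproblem costs z_i
    r         : dual values r_i of the convexity constraints
    Returns (lower, upper). (Illustrative helper, not from the lecture.)
    """
    lower = z + sum(zi - ri for zi, ri in zip(sub_costs, r))
    return lower, z


lower, upper = dw_bounds(10.0, [2.0, 3.0], [3.0, 3.5])
print(lower, upper)
```

The gap between the two values can serve as a stopping criterion when the tail convergence of column generation is slow; the bound applies only when every subproblem has a finite optimal cost.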