1 One Size Fits All? : Computational Tradeoffs in Mixed Integer Programming Software Bob Bixby, Mary Fenelon, Zonghao Gu, Ed Rothberg, and Roland Wunderling ILOG, Inc.
2 A MIP Code is a Bag of Tricks q Presolve q Cutting planes q Gomory cuts q Knapsack cuts q Etc. q Node presolve q Heuristics q Node selection strategy q Etc.
3 1. Improvement on a test set 2. Evaluation and tuning on a wider problem set 3. Implementation and deployment in a MIP code 4. Modelers (silently) benefit Typical Path to Widespread Adoption of a New Technique
4 q Cuts 53.7X q Gomory2.5X q MIR1.8X q Knapsack1.4X q Flow covers1.2X q Implied bounds1.2X q … q Presolve10.8X q Heuristics1.4X q Node presolve1.3X q Probed dives1.1X A Few Examples [Bixby, Fenelon, Gu, Rothberg, Wunderling, 2002]
5 1. Improvement on a test set 2. Overall degradation on a wider problem set 3. Not implemented in a MIP Code 4. Modeler reads paper and implements technique Unlikely Path to Widespread Adoption of a New Technique
6 1.Improvement on a test set 2.Overall degradation on a wider problem set 3.Implemented in a MIP code anyway 4.Modeler: 1.Reads documentation or paper 2.Recognizes that technique will be effective on his models 3.Enables non-default option in MIP code Another Unlikely Path to Widespread Adoption of a New Technique
7 q Explore logical consequences of fixing binary variables to 0/1 q Variable fixing q Coefficient lifting q Implied bound cuts q Model mod011.mps q Moderately difficult model from MIPLIB set Example : Probing [Brearley, Mitra, Williams, 1975]
8 Model mod011.mps without probing Nodes Cuts/ Node Left Objective IInf Best Integer Best Node ItCnt Gap e e … e Flowcuts: * e e % * e e % * e e % * e e % * e e % * e e % * e e % * e e % * e e % * e e % * e e % * e e % Implied bound cuts applied: 410 Flow cuts applied: 695 Flow path cuts applied: 117 Gomory fractional cuts applied: 14 Integer optimal solution: Objective = e+007 Solution time = sec. Iterations = Nodes = 16437
9 Model mod011.mps with probing Reduced MIP has 1558 rows, 6895 columns, and nonzeros. … Probing added 286 nonzeros Probing time = 0.31 sec. Nodes Cuts/ Node Left Objective IInf Best Integer Best Node ItCnt Gap e e … e Cuts: * e e % * e e % * e e % Implied bound cuts applied: 2398 Flow cuts applied: 496 Flow path cuts applied: 207 Gomory fractional cuts applied: 3 Integer optimal solution: Objective = e+007 Solution time = sec. Iterations = Nodes = 110
10 q Mean performance ratio: q 1s0.68 q 10s0.59 q 100s0.34 q 500s0.31 q 1000s0.25 Probing on a Wider Set of Models
11 q Use dual simplex to choose branching variable q Estimate objective degradation by performing a limited number of simplex iterations q Maximize the minimum degradation q Finding optimal solutions for large Traveling Salesman Problems q Crucial for improving objective lower bound Another Example : Strong Branching [Applegate, Bixby, Chvatal, and Cook, 1995]
12 q Mean performance ratio: q 1s0.96 q 10s0.76 q 100s0.72 q 500s0.66 q 1000s0.69 Strong Branching on a Wider Set of Models
13 Consistency Between User Goals and Code Goals
14 q Emphasis before CPLEX 7.0: q Minimize time to proven optimality q Important components of approach: q Aggressive cut generation q Search strategy that attempts to avoid unnecessary work m Depth First Search until first feasible found m Best Bound Search until termination A MIP Code Has An (Implicit) Emphasis
15 q User reaction q Depends heavily on user metric q Common metric: m Time to first “good” feasible solution q Reevaluate the bag of tricks q Time to first feasible? q Time to 10% (20%?) gap? Potential Mismatch Between Goals
16 q Mean performance improvement (CPLEX 7.5) : Desired optimality gap 10%20%30%finite q 1s q 10s q 100s q 500s q 1000s Performance for Feasibility Emphasis
17 q Simple underlying algorithmic changes q Less aggressive application of cuts q More time spent near leaves of search tree q Could be achieved with parameter changes q Users pleased nonetheless q Describe goals rather than understanding and choosing techniques User Emphasis Setting: Feasibility Instead of Optimality
18 q Improvements reduce the importance of emphasis: q More heuristics produce feasible solutions faster and more consistently q Probed dives make dives more likely to lead to good feasible solutions User Emphasis Setting in 8.0
19 q Mean performance improvement (CPLEX 8.0) : Desired optimality gap 10%20%30%Finite q 1s q 10s q 100s q 500s q 1000s Performance for Feasibility Emphasis
20 q Test set: 978 models Selected from our library with over 1500 models q 100,000 seconds limit on ES40 Compaq Alpha q Solved to optimality 775 (77%) q Among those not solved to optimality 116 had gap less than 10% (11.9%) 32 had no integral solution (3.2%) q MIP emphasis feasibility on the 32 models 25 found no feasible solution (2.6%) MIP Results of CPLEX8.0 [Bixby, Fenelon, Gu, Rothberg, Wunderling, 2002]
21 q Natural progression: 1. Try default settings 2. Specify an emphasis 3. Change parameter settings 4. Use priority order 5. Reformulate model q Some models still unsolved after all these steps Hard Problems
22 Exploiting User Knowledge
23 q Users sometimes have domain knowledge that can help solution q Crucial variables q Heuristics for finding good feasible solutions q Strategies to decompose the problem through branching q Special cutting planes q Etc. q User knowledge on the original model q MIP code solves the presolved model User Knowledge
24 q MIP callbacks q Cut callback q Heuristic callback q Branch callback q Incumbent callback q Advanced presolve q Allows user to express domain knowledge in terms of the original model Advanced Features
25 q Before CPLEX 7.0 q Usually need to turn off presolve (lose 10.8X) q All constraints must be explicit q Since CPLEX 7.0 q Can choose whether callbacks will work with original or presolved model q Can obtain mappings for variables and constraints in the original model q No need to specify all constraints up front m Lazy constraints Adding User Cuts or Lazy Constraints
26 q Original model max {x 1 +x 2 +2x 3 : 3x 1 +3x 2 +4x 3 6, x 1, x 2, x 3 B} q Presolved model max {y+2x 3 : 3y+4x 3 6, 0 y 2, 0 x 3 1, y, x 3 Z} q Cut for the original model x 1 + x 3 1 It cannot be transformed and added to the presolved model q Need to turn off non-linear reductions, such as parallel column reduction Caution: User Cuts
27 q Model max {x: 5x + 3y 10, x - y 0, x 0, y 0, x Z} q x - y 0 treated as a lazy constraint Presolve will fix y to 0 and x to 2 x – y 0 becomes 2 – 0 0 Augmented model is infeasible? q Need to turn off dual reductions q Reductions that depend on the objective function Caution: Lazy Constraints
28 Side Constraints
29 q Many types of “side constraints” q SOS constraints q Semi-continuous variables q Cardinality constraints q Min, Max and Abs functions q Logical expressions m e.g., x = 1 implies y+z <= 3 q Tour requirements (TSP) q Etc. Side Constraints
30 q Example: SOS1(z 1,z 2 ) q Introduce auxiliary binary variables b 1, b 2 q z 1 u 1 b 1 ; z 2 u 2 b 2 ; b 1 + b 2 1 q Pros: q All MIP tricks apply (cuts, presolve, heuristics, etc.) q No need to handle special cases in MIP code q Cons: q Model size increases q Often leads to large big-M coefficients Handling Side Constraints - Linearize
31 q Example: SOS1(z 1,z 2 ) q When both z 1 > 0 and z 2 > 0 at a node… q Branch on SOS1: m Left child: z 1 = 0 m Right child: z 2 = 0 q Cons: q Special case for each construct q No presolve, cuts, heuristics, … q Looser relaxation Handling Side Constraints - Branching
32 q Is it possible to tighten relaxation without an explicit linearization? q Specialized cuts or lazy constraints E.g., cardinality constraints [ de Farias and Nemhauser], TSP [Applegate, Bixby, Chvátal, and Cook ] q Need to derive for each type of non-linear constraint q Alternative: extension to Gomory cuts Tighter Implicit Formulation?
33 q Given y, x j Z +, and y + a ij x j = d = d + f, f > 0 q Rounding: Where a ij = a ij + f j, define t = y + ( a ij x j : f j f) + ( a ij x j : f j > f) Z q Then (f j x j : f j f) + (f j -1)x j : f j > f) = d - t q Disjunction: t d (f j x j : f j f) f t d ((1-f j )x j : f j > f) 1-f q Combining: ((f j /f)x j : f j f) + ([(1-f j )/(1-f)]x j : f j > f) 1 Gomory Cut Review
34 q Typical disjunctive set of constraints x must satisfy at least k of n sets of linear constraints, S i = {x: A i x b i } for i = 1, …, n q Modeling with binary variables Dantzig (1957), Nemhauser and Wolsey (1988) q Side constraints in the class q SOS constraints q Semi-continuous variables q Cardinality constraints q Min, Max and Abs functions q Logical linear expressions An Important Class: Disjunctive Constraints
35 q Definition At most m variables of x 1,…, x n can be positive q Use typical disjunctive set to express S i = {x: -x i 0} for i = 1, …, n k = n - m Cardinality Constraint
36 q Given x j R +, and x is not in S 1, S 2, …, S m, with m > n – k Note x should be in at least k - (n - m) of the above sets q Pick a violated constraints from each set a ij x j d i, i = 1, …, m q Substitute basic variables with nonbasic ones f ij x j g i, i = 1, …, m Note g i > 0. Let h ij = max (0, f ij / g i ), then h ij x j 1, i = 1, …, m q Combine h ij x j m + k – n Gomory Cut Extension (with Puget)
37 q Default works well to prove optimality or to find good feasible solutions for most models q Try it first q CPLEX has an emphasis setting. q Using it to specify a goal may help for some models q Several features are off by default q Turning them on or changing parameter settings can help solving hard models q CPLEX provides advanced routines for exploiting user knowledge q They can be helpful for hard models, e.g. their use for extending Gomory cuts for handling side constraints One Size Fits All?