Formulations and Reformulations in Integer Programming Michael Trick Carnegie Mellon University Workshop on Modeling and Reformulation, CP 2004.



Goals
- Provide a perspective on what makes a “good” integer programming formulation for a problem
- Give examples of automatic versus manual reformulation of problems
- Outline some challenges in the automatic reformulation of integer programs (and perhaps constraint programs?)

Outline
- Quick review of key concepts in integer programming
- Two models
  - Truck-route contracting
  - Traveling Tournament Problem
- General comments

Integer Program (IP)
Minimize cx (linear objective)
Subject to Ax = b, l <= x <= u (linear constraints)
some or all of x_j integral (makes things hard!)
x: variables

Rules of the Game
- Must put in that form!
- Seems limiting, but 50 years of experience gives “tricks of the trade”
- Many formulations for same problem

Simple example
- Variables x, y are both binary (0-1) variables
- Formulate the requirement that x can be 1 only if y is 1
- Formulation 1: x ≤ y; x, y ∈ {0,1}
- Formulation 2: x ≤ 20y; x, y ∈ {0,1}
- Are they different? Do we care which we use?

Differences
- From a modeling point of view, they are the same: they both correctly model the given requirement
- From an algorithmic point of view, they may be different, depending on the algorithm used

Solving Integer Programming problems
Most common method is some form of branch and bound
- Use linear relaxation to bound objective value
- Branch on fractional values in linear relaxation solution
- Stop branching when subproblem is
  - Infeasible
  - Integer
  - Fathomed (cannot be better than best found so far)
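The branch-and-bound loop described on this slide can be sketched on a small 0-1 knapsack. This is an illustrative sketch only, not the method of any particular solver: it is a maximization problem (the slides state the IP as a minimization), chosen because its linear relaxation can be solved greedily without an LP library. The function names `lp_relaxation` and `branch_and_bound` and the sample data are assumptions made for the example, and weights are assumed positive.

```python
def lp_relaxation(values, weights, capacity, fixed):
    """LP relaxation of a 0-1 knapsack with some variables fixed to 0 or 1.

    Returns None if the subproblem is infeasible, otherwise
    (bound, index of the fractional variable or None if the solution is integral).
    """
    room = capacity - sum(weights[i] for i, v in fixed.items() if v == 1)
    if room < 0:
        return None  # the items fixed to 1 already overflow the knapsack
    bound = float(sum(values[i] for i, v in fixed.items() if v == 1))
    free = sorted((i for i in range(len(values)) if i not in fixed),
                  key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if room == 0:
            break  # nothing more fits; remaining variables are 0
        if weights[i] <= room:
            room -= weights[i]
            bound += values[i]
        else:
            # item i is taken fractionally: this is the variable to branch on
            return bound + values[i] * room / weights[i], i
    return bound, None


def branch_and_bound(values, weights, capacity):
    best = 0.0        # the empty knapsack is always feasible
    nodes = 0
    stack = [{}]      # each node is a dict of fixed variables
    while stack:
        fixed = stack.pop()
        nodes += 1
        result = lp_relaxation(values, weights, capacity, fixed)
        if result is None:
            continue  # stop: infeasible
        bound, frac = result
        if bound <= best:
            continue  # stop: fathomed (cannot beat the best found so far)
        if frac is None:
            best = bound  # stop: relaxation solution is integer
            continue
        stack.append({**fixed, frac: 0})  # branch on the fractional variable
        stack.append({**fixed, frac: 1})
    return best, nodes


best, nodes = branch_and_bound([60, 100, 120], [10, 20, 30], 50)
```

The three `continue` statements correspond to the three stopping conditions on the slide. A tighter relaxation bound makes the "fathomed" test succeed earlier, which is why the quality of the relaxation matters so much.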

Linear Relaxation
Minimize cx (linear objective)
Subject to Ax = b, l <= x <= u (linear constraints)
The requirement "some or all of x_j integral" (which makes things hard!) is dropped
x: variables

Illustration

Linear Relaxation

Key is linear relaxation
If the linear relaxation is very different from the integer program then
- Choose wrong variables to branch on
- Fathoming will be done less often

Ideal
Formulation gives the convex hull of the feasible integer points

Simple example (binary variables)
[Figure: two panels, x ≤ y and x ≤ 20y, each drawn on axes x and y]
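The difference between the two formulations can be checked numerically. The sketch below (the helper names `form1` and `form2` and the test point are assumptions made for illustration) enumerates the binary points each formulation allows, then tests one fractional point with 0 ≤ x, y ≤ 1.

```python
from itertools import product


def form1(x, y):
    return x <= y          # Formulation 1: x <= y


def form2(x, y):
    return x <= 20 * y     # Formulation 2: x <= 20y


# Same set of feasible 0-1 points: both model "x can be 1 only if y is 1".
binary_1 = {(x, y) for x, y in product((0, 1), repeat=2) if form1(x, y)}
binary_2 = {(x, y) for x, y in product((0, 1), repeat=2) if form2(x, y)}

# A fractional point that only the weaker relaxation admits.
x_frac, y_frac = 1.0, 0.25
in_relaxation_1 = form1(x_frac, y_frac)   # False: 1.0 <= 0.25 fails
in_relaxation_2 = form2(x_frac, y_frac)   # True: 1.0 <= 5.0 holds
```

Both formulations agree on every integer point, but the relaxation of Formulation 2 contains extra fractional points such as (1, 0.25), so it is farther from the convex hull of the integer solutions.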

Fundamental Mantra of Integer Programming Formulations
Use formulations with good linear relaxations!
- Other issues in formulations (avoiding symmetry issues, keeping problem size down, scaling, etc.) will not be covered here
- This guideline is quite misleading!

Model 1: Truck Route Contracting
- Real application
- Highly simplified version (which shows everything I learned)

[Figure: locations A and B, with three trucks labelled as follows]

TRUCK DATA
Departure time (D)   Arrival time (A)   Cost ($)   Capacity (C)
8                    12                 $150       100
9                    1                  $250       80
10                   2                  $200       125

Sample package: Size: 10, Time Available: 9, Time Needed: 2

Problem: Purchase trucks sufficient to move all packages on time

Model
Variables:
- y(i) = 1 if truck i purchased, 0 else
- x(j,i) = 1 if package j on i, 0 else
Objective: Minimize truck costs
Constraints:
- Packages fit on assigned truck
- Use only paid for trucks
- Every package on some truck
- No partial trucks or package splitting

Formulation: declarations
model "Transportation Planning"
  uses "mmxprs"

  declarations
    TRUCKS =
    PACKAGES =
    capacity: array(TRUCKS) of real
    size: array(PACKAGES) of real
    cost: array(TRUCKS) of real
    can_use: array(PACKAGES,TRUCKS) of real
    x: array(PACKAGES,TRUCKS) of mpvar
    y: array(TRUCKS) of mpvar
  end-declarations

  capacity := [100,200,100,200,100,200,100,200,100,200]
  size := [17,21,54,45,87,34,23,45,12,43,
           54,39,31,26,75,48,16,32,45,55]
  cost := [1,1.8,1,1.8,1,1.8,1,1.8,1,1.8]
  can_use := [0-1 matrix whether package can go on truck]

Formulation: Constraints
  Total := sum(i in TRUCKS) cost(i)*y(i)

  forall(i in TRUCKS)
    sum(j in PACKAGES) size(j)*x(j,i) <= capacity(i)      ! (1) packages fit

  forall(i in TRUCKS)
    sum(j in PACKAGES) x(j,i) <= NUM_PACKAGE*y(i)         ! (2) use only paid for trucks

  forall(j in PACKAGES)
    sum(i in TRUCKS) can_use(j,i)*x(j,i) = 1              ! (3) every package on truck

  forall(i in TRUCKS) y(i) is_binary                      ! (4) no partial trucks
  forall(i in TRUCKS, j in PACKAGES) x(j,i) is_binary     ! (5) no package splitting

  minimize(Total)
end-model

“Improving the Formulation”
Every integer programmer will immediately spot the improvement:

  forall(i in TRUCKS)
    sum(j in PACKAGES) x(j,i) <= NUM_PACKAGE*y(i)         ! (2) use only paid for trucks

should be replaced with

  forall(i in TRUCKS, j in PACKAGES)
    x(j,i) <= y(i)                                        ! (2’) tighter formulation

which we saw as “tighter” (though bigger)
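Why (2’) is tighter than (2) can be seen on a single truck. The sketch below uses a hypothetical truck with four packages; the function names and the sample point are assumptions for illustration, not part of the Mosel model.

```python
def satisfies_aggregated(x, y):
    """Constraint (2) for one truck: sum_j x_j <= NUM_PACKAGE * y."""
    return sum(x) <= len(x) * y


def satisfies_disaggregated(x, y):
    """Constraint (2') for one truck: x_j <= y for every package j."""
    return all(xj <= y for xj in x)


# One package fully assigned to a truck that is only "a quarter purchased".
x = [1.0, 0.0, 0.0, 0.0]
y = 0.25

weak_ok = satisfies_aggregated(x, y)       # 1.0 <= 4 * 0.25 holds
strong_ok = satisfies_disaggregated(x, y)  # 1.0 <= 0.25 fails
```

Summing the four constraints x_j <= y gives back constraint (2), so every point allowed by (2’) is also allowed by (2). The converse fails at the point above, which is what "tighter" means here. The price is one constraint per package and truck instead of one per truck.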

Other improvements
Integer programmers are good at spotting opportunities:

  forall(i in TRUCKS)
    sum(j in PACKAGES) size(j)*x(j,i) <= capacity(i)      ! (1) packages fit

can be strengthened with

  forall(i in TRUCKS)
    sum(j in PACKAGES) size(j)*x(j,i) <= capacity(i)*y(i) ! (1’) packages fit

Results
- Weak Formulation: 11.2 sec, 31,825 nodes
- Strong Formulation: 22.1 sec, 50,631 nodes
WHAT HAPPENED?

Automatic versus Manual Reformulations
- XPRESS-MP (ILOG’s CPLEX will work the same) “knows” about this form of tightening. It will do it automatically.
- In fact, it will do it “better”, only including constraints that the linear relaxation points to as relevant
- Automatic reformulation trumps manual reformulation in this case!

Naïve code
If you use a naïve code that doesn’t understand this, then the tightened formulation is critical:
- Weak formulation: Unsolved after 3600 seconds (gap is 1.22 – 8.4)
- Strong formulation: 1851 seconds, 2.4 million nodes
But who would use such a code for real work?

Gets more confusing
Consider the constraint

  sum(i in TRUCKS) capacity(i)*y(i) >= sum(j in PACKAGES) size(j)   ! (6) have sufficient capacity

Such a constraint does not tighten the formulation (it is a linear combination of existing constraints): the fundamental mantra says don’t add it.
Solution time: 0.1 seconds, 1 node

What happened
- XPRESS (and other sophisticated codes) knows a lot about “knapsack” constraints and does automatic tightening on those
- It can’t identify the knapsack constraint on its own, but once the constraint is identified by the user, it can tighten (a lot!)
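To get a feel for how much a single knapsack constraint can say, the sketch below looks at constraint (6) in isolation, using the capacity, size, and cost data from the declarations slide. It is only an illustration under stated assumptions: it ignores every other constraint, so the numbers are lower bounds on the truck cost rather than the optimal value of the full model, and brute-force enumeration stands in for whatever knapsack tightening XPRESS actually performs. The function names are hypothetical.

```python
from itertools import product

capacity = [100, 200, 100, 200, 100, 200, 100, 200, 100, 200]
size = [17, 21, 54, 45, 87, 34, 23, 45, 12, 43,
        54, 39, 31, 26, 75, 48, 16, 32, 45, 55]
cost = [1, 1.8, 1, 1.8, 1, 1.8, 1, 1.8, 1, 1.8]

total_size = sum(size)  # right-hand side of constraint (6)


def fractional_cover_cost():
    """Cheapest way to satisfy (6) when 0 <= y(i) <= 1 may be fractional."""
    need, total = float(total_size), 0.0
    # Buy capacity in order of cost per unit of capacity.
    for i in sorted(range(len(cost)), key=lambda i: cost[i] / capacity[i]):
        if need <= 0:
            break
        take = min(1.0, need / capacity[i])
        total += take * cost[i]
        need -= take * capacity[i]
    return total


def integer_cover_cost():
    """Cheapest way to satisfy (6) when every y(i) must be 0 or 1."""
    best = None
    for y in product((0, 1), repeat=len(cost)):
        if sum(c * yi for c, yi in zip(capacity, y)) >= total_size:
            c = sum(ci * yi for ci, yi in zip(cost, y))
            if best is None or c < best:
                best = c
    return best


lp_bound = fractional_cover_cost()
ip_bound = integer_cover_cost()
```

With fractional trucks, (6) alone only requires a cost of about 7.2. Once integrality is taken into account, the same constraint requires about 8.2. That gap is the kind of information a solver can exploit when it recognizes a knapsack structure.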

Summary of model 1
- Standard tightening methods applied by the user make things slower
- Creative addition of a constraint that does not appear to tighten the relaxation makes things much faster

Model 2: Traveling Tournament Problem
Given an n by n distance matrix D = [d(i,j)] and an integer k, find a double round robin (every team plays at every other team) schedule such that:
- The total distance traveled by the teams is minimized (teams are assumed to start at home and must return home at the end of the tournament), and
- No team is away more than k consecutive games, or home more than k consecutive games.
(For the instances that follow, there is an additional constraint: if i is at j in slot t, then j is not at i in slot t+1.)
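The objective and the consecutive-game rule can be made concrete with a small evaluator. The sketch below is an assumption-laden illustration: the four-team schedule, the distances (teams placed at points 0, 1, 2, 3 on a line), and the function names are all made up for the example and are not one of the NL instances. Each slot is a list of (home, away) pairs.

```python
def team_sites(schedule, team):
    """Venue (index of the home team) where `team` plays in each slot."""
    sites = []
    for games in schedule:
        for home, away in games:
            if team in (home, away):
                sites.append(home)
    return sites


def total_distance(schedule, dist):
    """Sum over teams of the trip home -> venue 1 -> ... -> last venue -> home."""
    total = 0
    for team in range(len(dist)):
        route = [team] + team_sites(schedule, team) + [team]
        total += sum(dist[a][b] for a, b in zip(route, route[1:]))
    return total


def respects_k(schedule, n, k):
    """True if no team has more than k consecutive home or away games."""
    for team in range(n):
        run, previous = 0, None
        for site in team_sites(schedule, team):
            at_home = site == team
            run = run + 1 if at_home == previous else 1
            previous = at_home
            if run > k:
                return False
    return True


# Hypothetical double round robin for 4 teams, 6 slots, (home, away) pairs.
schedule = [
    [(0, 1), (2, 3)],
    [(2, 0), (1, 3)],
    [(0, 3), (1, 2)],
    [(1, 0), (3, 2)],
    [(0, 2), (3, 1)],
    [(3, 0), (2, 1)],
]
dist = [[abs(i - j) for j in range(4)] for i in range(4)]

distance = total_distance(schedule, dist)
feasible_k3 = respects_k(schedule, 4, 3)
feasible_k2 = respects_k(schedule, 4, 2)
```

In this example team 3 plays three away games in a row, so the schedule is acceptable for k = 3 but not for k = 2. The additional no-repeat condition mentioned above is not checked by this sketch.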

Sample Instance
NL6: Six teams from the National League of (American) Major League Baseball.
Distances: [distance matrix not recoverable]
k is 3

Sample Solution
Distance: (Easton May 7, 1999)
[Table: one row per slot, columns ATL, NYM, PHI, MON, FLA, PIT; entries not recoverable]

Simple Problem, yes?
NL teams
Feasible Solution:
- (Rottembourg and Laburthe May 2001)
- (Larichi, Lapierre, and Laporte July )
- (Cardemil, July )
- (Dorrepaal July 16, 2002)
- (Zhang, August )
- (Cardemil, November )
- (Van Hentenryck January 14, 2003)
- (Van Hentenryck February 26, 2003)
- (Van Hentenryck June 26, 2003)
- (Langford February 16, 2004)
- (Langford February 27, 2004)
- (Langford March 12, 2004)
- (Van Hentenryck May 13, 2004)
Lower Bound: (Waalewign August 2001)

Formulation as IP
Straightforward formulation is possible:
- plays(i,j,t) = 1 if i at j in slot t
Need auxiliary variables
- location(i,j,t) = 1 if i in location j in slot t
- follows(i,j,k,t) = 1 if i travels from j to k after slot t

Formulation
Rest of formulation in paper (pages 9 and 10 in proceedings)
Result is a mess
- N=6
- After 1800 seconds gap is 5434 – (optimal is 23,916)
Anything XPRESS is doing is not helping enough!

Reformulation
[Figure: diagram with labels H, X1, X2, X3, Y1, Y2]

Constraints
One thing per time: X1+X2+Y1+Y2 ≤ [right-hand side not preserved]
[Figure: diagram with labels H, X1, X2, Y1, Y2]

Constraints
No Away followed by Away: X1+X3 [relation and right-hand side not preserved]
[Figure: diagram with labels @NY, X2, X3]

Rest of formulation
Rest of formulation is straightforward (in proceedings, looking more complicated than it needs to)
Result:
- Initial relaxation (for n=6): 21,624.7
- Optimal: 4136 seconds, 66,000 nodes

Strengthening the Constraints
Stronger: X1+X2+X3+Y2 [relation and right-hand side not preserved]
[Figure: diagram with labels H, X1, X2, X3, Y2]

Result
- Initial relaxation same, solution time a little longer
- What happened: “Strengthening” is a type of clique inequality, known by XPRESS
- Without clique inequalities: unsolved after more than 36,000 seconds
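The general idea of a clique inequality can be shown on three mutually exclusive binary variables. This is a generic sketch with hypothetical variables, not the specific X and Y variables of the reformulation above: pairwise exclusions x_i + x_j ≤ 1 are replaced by the single inequality x_1 + x_2 + x_3 ≤ 1.

```python
from itertools import combinations, product


def pairwise_ok(x):
    """Every pair of variables satisfies x_i + x_j <= 1."""
    return all(x[i] + x[j] <= 1 for i, j in combinations(range(len(x)), 2))


def clique_ok(x):
    """The single clique inequality: the sum of all variables is at most 1."""
    return sum(x) <= 1


# On 0-1 points the two descriptions agree.
same_on_binary = all(pairwise_ok(p) == clique_ok(p)
                     for p in product((0, 1), repeat=3))

# A fractional point that the pairwise version allows and the clique cuts off.
point = (0.5, 0.5, 0.5)
allowed_by_pairs = pairwise_ok(point)    # each pair sums to exactly 1
allowed_by_clique = clique_ok(point)     # the sum is 1.5
```

Both descriptions accept the same integer solutions, but the clique inequality removes fractional points from the relaxation. This is the kind of strengthening a solver can derive automatically once the variables expose the conflict structure.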

Conclusions for Model 2
- Initial formulation almost hopeless
- Manual reformulation needed to redefine variables
- Then, automatic reformulation can improve results tremendously

Questions
- What is the role of manual versus automatic reformulation?
  - Model 1: manual needed to identify hidden constraint
  - Model 2: manual needed to redefine the variables
- Is this an ever-moving line, or are some aspects intrinsically difficult to determine?
- How can software be developed to better
  - Do automatic reformulation
  - Provide flexibility to experiment with different reformulations/reformulation levels

Resources
- Introduction to Integer Programming (by Bob Bosch and me) and this talk
  - Will be at
- XPRESS-MP and ILOG’s OPL Studio provide great software to experiment with