1 Combinatorial Problems in Cooperative Control: Complexity and Scalability Carla P. Gomes and Bart Selman Cornell University Muri Meeting June 2002
2 Overview
Goal: principled dynamic control of communication and computational resources in large distributed autonomous systems, allowing for:
- Scalability
- Time-critical applications
- Robustness guarantees
Overall approach:
- Identify phase transitions in problem hardness
- Leverage randomization in computation
- Develop procedures that recognize and react to structure in problem hardness
Use the findings in both the design and operation of complex (distributed) systems: (computationally) hardness-aware systems.
3 Outline ROBOFLAG Drill – Computational Issues Capturing Structure in Combinatorial Problems Randomization and Approximations Conclusions
4 ROBOFLAG Drill Problem is hybrid, combining discrete and continuous components, with multiple constraints. Overall the Roboflag control problem provides an excellent test bed for the development of scalable techniques for complex optimization.
5 Problem Representation: ROBOFLAG Drill. Formulation by Raff D’Andrea and Matt Earl. Represented as a mixed logical dynamical (MLD) system in which the objective is to compute optimal control policies that minimize the total score of the game. Mathematical formulation of the optimization problem: Mixed Integer Linear Program.
6 We are investigating how to scale up solutions of the ROBOFLAG Drill, focusing on:
- Mixed Integer Program (MIP) formulations
- Randomization and approximation methods
- Combining MIP and constraint search techniques
- Portfolios of algorithms
7 Scaling Up Mixed Integer Linear Program Formulations (MILP) Standard approach for solving MILP: Branch and Bound How can we improve upon Branch and Bound strategies? Ideas: Different search strategies for node selection Randomization Portfolios of algorithms
8 Branch & Bound: Depth First vs. Best Bound. Critical to the performance of Branch & Bound is the way in which the next node to be expanded is selected. Standard approach: best-bound, which selects the node with the best LP bound. Alternative: depth-first, which often quickly reaches an integer solution (but may take longer to produce an overall optimal value). Tradeoffs between these choices depend on the underlying problem structure (Gomes et al. 2001).
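The difference between the two node-selection rules can be illustrated with a minimal sketch. It is not the ROBOFLAG MILP or the CPLEX implementation: as an assumption for illustration, it uses a toy 0/1 knapsack whose fractional relaxation plays the role of the LP bound, and the names `fractional_bound` and `branch_and_bound` are hypothetical. The only thing that changes between strategies is whether the frontier is a priority queue (best-bound) or a stack (depth-first).

```python
import heapq


def fractional_bound(items, capacity, level, value):
    """Upper bound from the fractional relaxation of the remaining items.

    `items` must be sorted by value/weight ratio, highest first.
    """
    bound = value
    for item_value, weight in items[level:]:
        if weight <= capacity:
            capacity -= weight
            bound += item_value
        else:
            bound += item_value * capacity / weight
            break
    return bound


def branch_and_bound(items, capacity, strategy="best"):
    """Solve a 0/1 knapsack; return (optimal value, number of expanded nodes).

    strategy="best"  pops the node with the best relaxation bound (heap).
    strategy="depth" pops the most recently created node (stack).
    """
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    best_value, expanded = 0, 0
    # A node is (-bound, level, value so far, remaining capacity).
    frontier = [(-fractional_bound(items, capacity, 0, 0), 0, 0, capacity)]
    while frontier:
        if strategy == "best":
            neg_bound, level, value, cap = heapq.heappop(frontier)
        else:
            neg_bound, level, value, cap = frontier.pop()
        if -neg_bound <= best_value:
            continue  # prune: this subtree cannot beat the incumbent
        expanded += 1
        if level == len(items):
            best_value = value  # leaf: a full assignment better than the incumbent
            continue
        item_value, weight = items[level]
        children = [(value, cap)]  # branch 1: skip the item
        if weight <= cap:
            children.append((value + item_value, cap - weight))  # branch 2: take it
        for child_value, child_cap in children:
            bound = fractional_bound(items, child_cap, level + 1, child_value)
            if bound > best_value:
                node = (-bound, level + 1, child_value, child_cap)
                if strategy == "best":
                    heapq.heappush(frontier, node)
                else:
                    frontier.append(node)
    return best_value, expanded
```

Both strategies are complete and return the same optimum; they differ in how many nodes they expand and how early they find a first feasible solution, which is the tradeoff the slide describes.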
9 ROBOFLAG Testbed: hybrid node selection (Best Bound and Depth First). Depth-first search works well: problems that could not be solved before using best bound were solved with depth first. Current largest problem solved with CPLEX using depth-first search (8 attackers and 3 defenders): integer variables = 4040; continuous variables = 400; constraints = [...]; time = [...] secs (Matt Earl 2002).
10 Much room for improvement… We are not yet using other problem formulations, nor are we yet exploiting randomization and parallelism. Doing so should allow us to solve problems at least one or two orders of magnitude larger (100,000 to 500,000 variables and 1,000,000+ constraints). We should also be able to include more complex constraints.
11 Capturing Structure in Combinatorial Problems the importance of problem representation…
12 Completing Latin Squares: An Abstraction for Real World Applications (Gomes and Selman 96). A Latin square is an n-by-n matrix such that each row and column is a permutation of the same n colors. Figure: Latin square of order 4 with 32% preassignment.
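The definition can be made concrete with a small checker. This is a sketch under assumed conventions that are not in the slides: colors are the integers 0..n-1, an uncolored cell is `None`, and the function names are hypothetical.

```python
def is_partial_latin_square(grid):
    """True if no color repeats in any row or column (None = uncolored cell)."""
    n = len(grid)
    lines = [list(row) for row in grid] + [list(col) for col in zip(*grid)]
    for line in lines:
        colors = [c for c in line if c is not None]
        if any(not (0 <= c < n) for c in colors):
            return False  # color outside the allowed range
        if len(colors) != len(set(colors)):
            return False  # a color repeats in this row or column
    return True


def is_latin_square(grid):
    """True if the grid is fully colored and every row/column is a permutation."""
    fully_colored = all(c is not None for row in grid for c in row)
    return fully_colored and is_partial_latin_square(grid)
```

Completing a Latin square then means filling the `None` cells of a grid that passes `is_partial_latin_square` so that the result passes `is_latin_square`.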
13 Switches in Fiber Optic Networks. Dynamic wavelength routing in fiber optic networks can be directly mapped into the Latin Square Problem (Barry and Humblet 93, Cheung et al. 90, Green 92, Kumar et al. 99): each channel cannot be repeated in the same input port (row constraints); each channel cannot be repeated in the same output port (column constraints). Figure: conflict-free Latin router, input ports vs. output ports.
14 LP Based Formulations
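15 Assignment Formulation. Figure: cubic representation of QCP (the quasigroup completion problem), with axes rows, columns, and colors.
16 QCP Assignment Formulation: maximize the number of colored cells, subject to one constraint per row/color line, per column/color line, and per row/column line.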
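The three line constraints named on this slide can be written out as follows. The slide's own equations did not survive, so the notation is an assumption: $x_{ijk} = 1$ means that cell $(i,j)$ receives color $k$, and preassigned cells are fixed to their color.

```latex
\begin{align*}
\max \quad & \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} x_{ijk} \\
\text{s.t.} \quad
& \sum_{j=1}^{n} x_{ijk} \le 1 && \forall\, i, k && \text{(row/color line)} \\
& \sum_{i=1}^{n} x_{ijk} \le 1 && \forall\, j, k && \text{(column/color line)} \\
& \sum_{k=1}^{n} x_{ijk} \le 1 && \forall\, i, j && \text{(row/column line)} \\
& x_{ijk} = 1 && \text{if cell } (i,j) \text{ is preassigned color } k \\
& x_{ijk} \in \{0, 1\} && \forall\, i, j, k
\end{align*}
```

Each constraint sums along one axis of the cube of rows, columns, and colors, which is why the slide refers to them as lines.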
17 Using a MIP formulation and Branch and Bound, we can only find solutions for Latin squares up to order 15 (15 x 15). We can do better, even with an LP-based formulation, by using a less obvious encoding.
18 Packing Formulation (Gomes and Shmoys 2002). Maximize the number of colored cells in the selected patterns, s.t. one pattern per family, and a cell is covered by at most one pattern. Figure: families of patterns (partial patterns are not shown).
19 Packing Formulation: Definitions. Compatible matching for color k: any extension of a partial solution with respect to color k. $\mathcal{M}_k$: family of all compatible matchings for color k. $y_{k,M}$: variable denoting each compatible matching M in $\mathcal{M}_k$. $|M|$: number of colored cells in a compatible matching M.
20 QCP Packing Formulation: maximize the number of colored cells, subject to one pattern per color and at most one pattern covering each cell.
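Written out, the packing formulation takes roughly the following form. The symbols are an assumption, since the slide's equations did not survive: $\mathcal{M}_k$ is the family of compatible matchings for color $k$, $y_{k,M}$ selects matching $M$ for color $k$, and $|M|$ is the number of cells that $M$ colors.

```latex
\begin{align*}
\max \quad & \sum_{k=1}^{n} \sum_{M \in \mathcal{M}_k} |M| \, y_{k,M} \\
\text{s.t.} \quad
& \sum_{M \in \mathcal{M}_k} y_{k,M} = 1 && \forall\, k && \text{(one pattern per color)} \\
& \sum_{k=1}^{n} \; \sum_{M \in \mathcal{M}_k :\, (i,j) \in M} y_{k,M} \le 1 && \forall\, i, j && \text{(at most one pattern covering each cell)} \\
& y_{k,M} \in \{0, 1\} && \forall\, k,\; M \in \mathcal{M}_k
\end{align*}
```

The LP relaxation replaces the last line with $0 \le y_{k,M} \le 1$.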
21 Any feasible solution to the packing LP relaxation is also a solution to the assignment LP relaxation. The value of the assignment relaxation is at least the bound implied by the packing formulation => the packing formulation provides a tighter upper bound than the assignment formulation. Limitation: the size of the formulation is exponential in n (one may apply column generation techniques).
22 Approximation Based on Packing Formulation. Randomization scheme: for each color k, choose a pattern M with probability $y^*_{k,M}$, its value in the optimal solution of the packing LP relaxation (so that some matching is selected for each color). As a result we have one pattern per color. Problem: some patterns may overlap, even though, in expectation, the constraints imply that the number of matchings in which a cell is involved is at most 1.
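The selection step can be sketched as below. The data layout and the function name `sample_patterns` are hypothetical, and the overlap rule (the first color to claim a cell keeps it) is an assumption made for illustration; the slide only states that overlaps can occur.

```python
import random


def sample_patterns(lp_solution, rng):
    """Pick one pattern per color according to the LP weights.

    lp_solution maps color -> list of (pattern, weight), where a pattern is a
    frozenset of (row, column) cells and the weights of a color sum to 1.
    Returns a dict cell -> color. If two selected patterns overlap on a cell,
    the color processed first keeps it (assumed tie rule).
    """
    coloring = {}
    for color, candidates in lp_solution.items():
        patterns = [pattern for pattern, _ in candidates]
        weights = [weight for _, weight in candidates]
        chosen = rng.choices(patterns, weights=weights, k=1)[0]
        for cell in chosen:
            coloring.setdefault(cell, color)
    return coloring


def uncolored_cells(holes, coloring):
    """Cells that no selected pattern covers."""
    return [cell for cell in holes if cell not in coloring]
```

Calling `sample_patterns(lp_solution, random.Random(0))` repeatedly with different seeds gives an empirical estimate of how many holes stay uncolored.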
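24 (1-1/e)-Approximation Based on Packing Formulation. Let’s assume that the partial Latin square (PLS) is completable, so $Z^* = h$, the number of holes. What is the expected number of cells left uncolored by our randomized procedure due to overlapping conflicts? From $Z^* = h$ and the constraint that each cell is covered by at most one pattern, we can compute, for every hole $(i,j)$: $\sum_{k=1}^{n} p_k = 1$, where $p_k = \sum_{M \in \mathcal{M}_k : (i,j) \in M} y^*_{k,M}$ is the probability that the pattern chosen for color k covers $(i,j)$. So the desired probability corresponds to the probability of a cell not being colored with any color, i.e.: $\prod_{k=1}^{n} (1 - p_k)$.
25 (1-1/e)-Approximation Based on Packing Formulation. This expression is maximized when all the $p_k$ are equal, therefore: $\prod_{k=1}^{n} (1 - p_k) \le (1 - 1/n)^n \le 1/e$. So the expected number of uncolored cells is at most $h/e$, and at least $(1 - 1/e)\,h$ holes are expected to be filled by this technique. $1 - 1/e \approx 0.632$. This is a very good guarantee for a polynomial time algorithm!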
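The bound behind the guarantee is easy to check numerically. The sketch below assumes the worst case named on the slide, in which each of the n colors covers a given hole with equal probability 1/n; the function name is hypothetical.

```python
import math


def uncolored_probability(n):
    """Probability that a hole stays uncolored when each of the n colors
    covers it independently with probability 1/n (the worst case)."""
    return (1 - 1 / n) ** n


if __name__ == "__main__":
    for n in (2, 5, 10, 40, 100):
        # The value rises toward 1/e as n grows but never exceeds it.
        print(n, round(uncolored_probability(n), 4), round(1 / math.e, 4))
```

Since the probability never exceeds 1/e, the expected fraction of filled holes is at least 1 - 1/e for every order n.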
26 Another Formulation Constraint Satisfaction Problem
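27 QCP as a CSP. Variables: $x_{ij} \in \{1, \ldots, n\}$, the color of cell $(i,j)$, for all $i,j$; $x_{ij} = k$ for each cell preassigned color $k$. Constraints: $\mathrm{alldiff}(x_{i1}, \ldots, x_{in})$ for every row $i$; $\mathrm{alldiff}(x_{1j}, \ldots, x_{nj})$ for every column $j$.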
28 Exploiting Structure for Domain Reduction A very successful strategy for domain reduction in CSP is to exploit the structure of groups of constraints and treat them as global constraints. Example using Network Flow Algorithms: All-different constraints
29 Exploiting Structure in QCP: ALLDIFF as a Global Constraint. An all-different constraint corresponds to a matching on a bipartite graph (variables vs. values). In the example the matching problem has two solutions, so we can update the domains of the column variables. Analogously, we can update the domains of the other variables.
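A minimal sketch of matching-based domain reduction for a single all-different constraint follows. It is a deliberately naive version, not the network-flow algorithm used in production solvers: for every variable/value pair it simply asks whether the remaining variables can still be matched to distinct values. The function names are hypothetical.

```python
def has_full_matching(domains):
    """True if every variable can be given a distinct value from its domain
    (augmenting-path bipartite matching)."""
    match = {}  # value -> variable currently holding it

    def try_assign(var, seen):
        for val in domains[var]:
            if val in seen:
                continue
            seen.add(val)
            # Take a free value, or evict the holder if it can move elsewhere.
            if val not in match or try_assign(match[val], seen):
                match[val] = var
                return True
        return False

    return all(try_assign(var, set()) for var in domains)


def filter_alldiff(domains):
    """Remove every value that appears in no solution of the constraint."""
    filtered = {}
    for var, values in domains.items():
        keep = set()
        for val in values:
            # Fix var = val, then check that the other variables still match.
            rest = {v: d - {val} for v, d in domains.items() if v != var}
            if has_full_matching(rest):
                keep.add(val)
        filtered[var] = keep
    return filtered
```

For example, with domains `{"x": {1, 2}, "y": {1, 2}, "z": {1, 2, 3}}` the values 1 and 2 are both needed by `x` and `y`, so the filter reduces the domain of `z` to `{3}`, a deduction that pairwise not-equal constraints would miss.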
30 Pure CSP approaches solve QCP instances up to order 33 (1089 variables) relatively well. (LP-based approaches: only up to order 15, i.e. 225 variables.)
31 We are exploring more direct encodings for the ROBOFLAG DRILL: representations avoiding discretization based on time; constraint-based abstractions closer to the physical system, e.g., based on movements / trajectories.
32 Randomization
33 Background. Stochastic strategies have been very successful in the area of local search: simulated annealing, genetic algorithms, tabu search, Walksat and variants. Limitation: the inherently incomplete nature of local search methods.
34 Randomized Backtrack Search. Randomized variable and/or value selection: lots of different ways. Example: randomly breaking ties in variable and/or value selection. Compare with standard lexicographic tie-breaking. Note: no problem maintaining the completeness of the algorithm!
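A minimal sketch of randomized tie-breaking in a complete backtrack search, applied to Latin square completion. The heuristic (smallest domain first, ties broken at random, values tried in random order) and the function name are assumptions for illustration, not the authors' solver. Every value of the chosen cell is still tried, so completeness is preserved.

```python
import random


def complete_latin_square(grid, rng):
    """Fill the None cells of `grid` in place; return True on success.

    Colors are the integers 0..n-1. The search is exhaustive, so False
    means the partial Latin square has no completion.
    """
    n = len(grid)

    def allowed(r, c):
        used = set(grid[r]) | {grid[i][c] for i in range(n)}
        return [k for k in range(n) if k not in used]

    empties = [(r, c) for r in range(n) for c in range(n) if grid[r][c] is None]
    if not empties:
        return True
    # Variable selection: smallest domain first, ties broken randomly
    # instead of lexicographically.
    smallest = min(len(allowed(r, c)) for r, c in empties)
    ties = [(r, c) for r, c in empties if len(allowed(r, c)) == smallest]
    r, c = rng.choice(ties)
    # Value selection: random order, but all values are tried.
    values = allowed(r, c)
    rng.shuffle(values)
    for k in values:
        grid[r][c] = k
        if complete_latin_square(grid, rng):
            return True
    grid[r][c] = None  # undo and backtrack
    return False
```

Different seeds for `random.Random` lead to different search trees on the same instance, which is what makes the run-time distribution of a single instance observable.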
35 Empirical Evidence of Heavy-Tails: Erratic Behavior of Mean (Gomes et al). Figure: sample mean vs. number of runs. Easy instance, 15% preassigned cells. Median = 1! Time:(*)3011(*)7, where (*) = no solution found, reached cutoff: 2000.
36 Decay of Distributions. Standard distributions (finite mean and variance) show exponential decay, e.g. Normal: $\Pr[X > x] \sim C e^{-x^2}$. Heavy-tailed distributions show power-law decay, e.g. Pareto-Levy: $\Pr[X > x] \sim C x^{-\alpha}$ for $x > 0$, with infinite variance (for $\alpha < 2$) and infinite mean (for $\alpha \le 1$).
37 Exploiting Heavy-Tailed Behavior. Heavy-tailed behavior has been observed in several domains: QCP, graph coloring, planning, scheduling, circuit synthesis, decoding, etc. Consequence for algorithm design: use restarts or parallel / interleaved runs to exploit the extreme variance in performance. Restarts provably eliminate heavy-tailed behavior (Gomes et al. 2000). Figure: unsolved fraction 1-F(x) vs. number of backtracks (log scale); annotations: 70% unsolved; 250 (62 restarts): 0.001% unsolved.
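The effect of restarts can be explored with a small simulation. This is a sketch under stated assumptions: run lengths are drawn from a Pareto distribution as a stand-in for a heavy-tailed backtrack search, and the names `run_with_restarts` and `pareto_runtime` are hypothetical.

```python
import random


def run_with_restarts(sample_runtime, cutoff, rng, max_restarts=10_000):
    """Restart a randomized run whenever it exceeds `cutoff`.

    `sample_runtime(rng)` returns the cost one independent run would need.
    Returns (total cost including abandoned runs, number of restarts).
    """
    total = 0
    for restarts in range(max_restarts):
        runtime = sample_runtime(rng)
        if runtime <= cutoff:
            return total + runtime, restarts
        total += cutoff  # run abandoned after `cutoff` units of work
    raise RuntimeError("no run finished within the cutoff")


def pareto_runtime(rng, alpha=0.5):
    """Heavy-tailed stand-in for a run length: Pr[X > x] = x**(-alpha), x >= 1."""
    return rng.paretovariate(alpha)


if __name__ == "__main__":
    rng = random.Random(0)
    costs = [run_with_restarts(pareto_runtime, cutoff=10, rng=rng)[0] for _ in range(1000)]
    print("mean cost with restarts:", sum(costs) / len(costs))
```

With `alpha=0.5` a single run has infinite mean cost, while the restarted procedure has a finite one, because each attempt is capped at the cutoff and succeeds with a fixed positive probability.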
38 Using randomization and restarts we can solve considerably larger QCP instances, up to order 40 (1600 variables). Note: the difficulty of this problem grows exponentially, so instances of order 40 are much more difficult than instances of order 33! We are also experimenting with randomization in the ROBOFLAG DRILL.
39 Hybrid MIP/CSP Approaches
40 Ingredients: CSP model; LP model + LP randomized rounding; heavy tails. We want to maintain completeness. How do we combine all these ingredients? A HYBRID COMPLETE CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH
41 HYBRID CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH. Central features of the algorithm:
- Complete backtrack search algorithm
- It maintains two formulations: a CSP model and a relaxed LP model
- LP randomized rounding for setting values at the top of the tree
- CSP + LP inference
42 HYBRID CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH. Figure: flow diagram. Populate CSP model and perform propagation; populate LP solver and solve LP; variable setting controlled by LP randomized rounding; CSP & LP inference; search & inference controlled by CSP. Parameters: %LP, Interleave-LP, adaptive cutoff.
43 HYBRID CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH
1. Initialize the CSP model and perform propagation of constraints (Ilog Solver).
2. Solve the LP model (Ilog Cplex Barrier). The LP provides good heuristic guidance and pruning information for the search. However, solving the LP is relatively expensive.
3. Two parameters control the LP effort. %LP: the percentage of variables set based on the LP rounding (%LP=0 is a pure CSP strategy). Interleave-LP: the frequency with which we re-solve the LP.
4. Randomized rounding scheme: rank variables according to their LP value. Select the highest-ranked variable and set its value to 1 with probability p given by its LP value. With probability (1-p), randomly select a color from the colors allowed in the CSP model.
5. Perform CSP propagation after each variable setting. (A total of Interleave-LP variables are assigned this way without re-solving the LP.)
6. Use a cutoff value to restart the search (keep increasing it to maintain completeness).
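Steps 4 and 6 can be sketched in isolation as follows. This is not the Ilog Solver / Cplex implementation: the LP values and CSP domains are passed in as plain dictionaries, and the names `round_one_variable` and `increasing_cutoffs`, as well as the doubling schedule, are assumptions for illustration.

```python
import random


def round_one_variable(lp_values, domains, rng):
    """One randomized-rounding decision (step 4).

    lp_values: {(cell, color): LP value in [0, 1]} for the unassigned cells.
    domains:   {cell: set of colors still allowed by the CSP model}.
    Returns the (cell, color) assignment to try next.
    """
    # Rank by LP value and take the highest-ranked variable.
    (cell, color), p = max(lp_values.items(), key=lambda item: item[1])
    # With probability p follow the LP, provided the CSP still allows it.
    if color in domains[cell] and rng.random() < p:
        return cell, color
    # Otherwise pick a random color from the CSP domain of that cell.
    return cell, rng.choice(sorted(domains[cell]))


def increasing_cutoffs(initial, factor=2):
    """Restart cutoffs that keep growing (step 6), so that some restart is
    eventually long enough for the complete search to finish."""
    cutoff = initial
    while True:
        yield cutoff
        cutoff *= factor
```

In a full solver, each call to `round_one_variable` would be followed by CSP propagation, and the LP would be re-solved after every Interleave-LP assignments.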
44 Empirical Results
45 Time Performance Order 35
46 Performance. With the hybrid strategy we also solve instances of order 40 in the critically constrained area, out of reach for pure CSP. We even solved a few balanced instances of order 50 in the critically constrained area! More systematic experimentation is required to better understand the limitations and strengths of the approach.
47 Conclusions
48 Conclusions Approximations based on LP randomized rounding (variable/value setting) + constraint propagation --- very powerful. Combatting heavy-tails of backtrack search through randomization. Consequence: New ways of designing algorithms --- aim for strategies which have highly asymmetric distributions that can be exploited using restarts, portfolios of algorithms, and interleaved/parallel runs. General approach --- holds promise for a range of hard combinatorial problems.
49 Scaling up ROBOFLAG - Other Formulations for Solving the Control Optimization Problem Encodings that provide “tighter” relaxations for the LP problem. Approximate representations using abstractions (“synthesize larger movements / trajectories”). Avoid discretization based on time. Less compact representations may allow for more propagation and scale up better. Constraint Satisfaction Problem (CSP) formulations. Hybrid CSP/LP formulations. Approximations based on LP randomized rounding. Goal: At least two orders of magnitude scale-up over current state-of-the-art.