Constraint Programming and Backtracking Search Algorithms
Peter van Beek, University of Waterloo
Acknowledgements
Joint work with: Alejandro López-Ortiz, Abid Malik, Jim McInnes, Claude-Guy Quimper, John Tromp, Kent Wilken, Huayue Wu
Funding: NSERC, IBM Canada
Outline
- Introduction
  - basic-block scheduling
  - constraint programming
  - randomization and restarts
- Worst-case performance
  - bounds on expected runtime
  - bounds on tail probability
- Practical performance
  - parameterizing universal strategies
  - estimating optimal parameters
  - experiments
- Conclusions
Basic-block instruction scheduling
- A basic block is a straight-line sequence of code with a single entry and a single exit
- Multiple-issue pipelined processors: multiple instructions can begin execution each clock cycle, and there is a delay (latency) before results become available
- Goal: find a minimum-length schedule
- A classic problem that has received a lot of attention in the literature
Example: evaluate (a + b) + c
instructions:
  A: r1 ← a
  B: r2 ← b
  C: r3 ← c
  D: r1 ← r1 + r2
  E: r1 ← r1 + r3
[figure: dependency DAG, with A → D and B → D at latency 3, C → E at latency 3, D → E at latency 1]
Example: evaluate (a + b) + c
optimal schedule:
  1: A: r1 ← a
  2: B: r2 ← b
  3: C: r3 ← c
  4: nop
  5: D: r1 ← r1 + r2
  6: E: r1 ← r1 + r3
[figure: dependency DAG as before]
Constraint programming methodology
- Model the problem: specify it in terms of constraints on acceptable solutions; a constraint model consists of variables, domains, and constraints
- Solve the model: backtracking search, with many improvements such as constraint propagation, restarts, …
Constraint model
- variables: A, B, C, D, E (one per instruction in the dependency DAG)
- domains: {1, …, m}
- constraints: D ≥ A + 3, D ≥ B + 3, E ≥ C + 3, E ≥ D + 1, gcc(A, B, C, D, E, width)
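As a concrete illustration (my own sketch, not code from the talk): with issue width 1 the gcc constraint reduces to all-different, so this tiny model can be solved by brute force over permutations of issue cycles.

```python
from itertools import permutations

# Brute-force the example model: variables A..E with domains {1,...,6},
# the latency constraints from the dependency DAG, and (since width = 1)
# an all-different constraint on the issue cycles.
best = None
for a, b, c, d, e in permutations(range(1, 7), 5):
    if d >= a + 3 and d >= b + 3 and e >= c + 3 and e >= d + 1:
        if best is None or e < best[4]:
            best = (a, b, c, d, e)

print(best)  # (1, 2, 3, 5, 6): cycle 4 is a nop, matching the optimal schedule
```

A real solver replaces this enumeration with backtracking search and constraint propagation, as the next slides show.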
Solving instances of the model
[figure: dependency DAG with initial domains A, B, C, D, E = [1, 6]]
Constraint propagation: Bounds consistency

variable   initial domain   after propagation
A          [1, 6]           [1, 2]
B          [1, 6]           [1, 2]
C          [1, 6]           [3, 3]
D          [1, 6]           [4, 5]
E          [1, 6]           [6, 6]

constraints: D ≥ A + 3, D ≥ B + 3, E ≥ C + 3, E ≥ D + 1, gcc(A, B, C, D, E, 1)
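A rough sketch of how these domains could be derived (my simplification, not the talk's propagator): the precedence constraints shrink bounds directly, and the width-1 gcc is approximated by a naive Hall-interval rule; repeating both to a fixpoint reproduces the table above.

```python
# Bounds-consistency sketch for the example model. Domains are [lb, ub]
# pairs; precedence rules and a naive Hall-interval rule for the width-1
# gcc (i.e., all-different) are applied until nothing changes.
dom = {v: [1, 6] for v in "ABCDE"}
prec = [("A", "D", 3), ("B", "D", 3), ("C", "E", 3), ("D", "E", 1)]  # y >= x + lat

def propagate(dom):
    changed = True
    while changed:
        changed = False
        for x, y, lat in prec:                    # enforce y >= x + lat
            if dom[y][0] < dom[x][0] + lat:
                dom[y][0] = dom[x][0] + lat
                changed = True
            if dom[x][1] > dom[y][1] - lat:
                dom[x][1] = dom[y][1] - lat
                changed = True
        for lo in range(1, 7):                    # naive Hall-interval rule
            for hi in range(lo, 7):
                inside = [v for v in dom if lo <= dom[v][0] and dom[v][1] <= hi]
                if len(inside) == hi - lo + 1:    # interval [lo, hi] saturated
                    for v in set(dom) - set(inside):
                        if lo <= dom[v][0] <= hi:     # lift lb past interval
                            dom[v][0] = hi + 1
                            changed = True
                        if lo <= dom[v][1] <= hi:     # drop ub below interval
                            dom[v][1] = lo - 1
                            changed = True
    return dom

print(propagate(dom))
# {'A': [1, 2], 'B': [1, 2], 'C': [3, 3], 'D': [4, 5], 'E': [6, 6]}
```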
Solving instances of the model
[figure: DAG with domains after propagation: A = [1, 2], B = [1, 2], C = [3, 3], D = [4, 5], E = [6, 6]]
Solving instances of the model
[figure: search branches on A = 1; domains: A = [1, 1], B = [1, 2], C = [3, 3], D = [4, 5], E = [6, 6]]
Solving instances of the model
[figure: search branches on B = 2; propagation fixes D = 5, giving the solution A = 1, B = 2, C = 3, D = 5, E = 6]
Restart strategies
- Observation: backtracking algorithms can be brittle on some instances; small changes to a heuristic can lead to great differences in running time
- A technique called randomization and restarts has been proposed to improve performance (Luby et al., 1993; Harvey, 1995; Gomes et al., 1997, 2000)
- A restart strategy (t1, t2, t3, …) is a sequence of cutoffs: a randomized backtracking algorithm is run for t1 steps; if no solution is found within that cutoff, the algorithm is restarted and run for t2 steps, and so on
Restart strategies
- Let f(t) be the probability that a randomized backtracking algorithm A on instance x stops after taking exactly t steps; f(t) is called the runtime distribution of algorithm A on instance x
- Given the runtime distribution of an instance, the optimal restart strategy for that instance is (t*, t*, t*, …) for some fixed cutoff t* (Luby, Sinclair, Zuckerman, 1993)
- A fixed-cutoff strategy is an example of a non-universal strategy: designed to work on a particular instance
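To make the fixed-cutoff result concrete, here is a small sketch (mine, with a made-up runtime distribution): with cutoff t, each run succeeds with probability F(t), so the expected cost is t(1 - F(t))/F(t) for the failed runs plus the conditional mean of the final successful run, and t* can be found by enumeration.

```python
# Expected runtime of the fixed-cutoff strategy (t, t, t, ...) for a
# given runtime distribution f, and the optimal cutoff t* by enumeration.

def expected_cost(f, t):
    p = sum(f.get(s, 0.0) for s in range(1, t + 1))   # F(t): success within cutoff
    if p == 0.0:
        return float("inf")
    mean_success = sum(s * f.get(s, 0.0) for s in range(1, t + 1)) / p
    return t * (1 - p) / p + mean_success             # failed runs + final run

# Made-up example: 10% of runs finish in 1 step, the rest take 1000 steps.
f = {1: 0.1, 1000: 0.9}
best_t = min(range(1, 1001), key=lambda t: expected_cost(f, t))
print(best_t, expected_cost(f, best_t))   # t* = 1, expected cost 10.0
```

On this distribution, restarting after every step gives an expected 10 steps, while never restarting averages over 900: exactly the brittleness restarts are meant to exploit.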
Universal restart strategies
- In contrast to non-universal strategies, universal strategies are designed to be used on any instance
- Luby strategy (Luby, Sinclair, Zuckerman, 1993): (1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8, 1, …); cutoffs grow linearly
- Walsh strategy (Walsh, 1999): (1, r, r², r³, …) with r > 1; cutoffs grow exponentially
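Both sequences are easy to generate; a sketch follows (the recursion mirrors the published definition of the Luby sequence, with Walsh's geometric sequence alongside).

```python
def luby(i):
    """i-th term (1-indexed) of the Luby sequence 1, 1, 2, 1, 1, 2, 4, ..."""
    k = 1
    while (1 << k) - 1 < i:               # smallest k with 2^k - 1 >= i
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)               # end of a block: 2^(k-1)
    return luby(i - (1 << (k - 1)) + 1)   # recurse into the repeated prefix

def walsh(i, r=2.0):
    """i-th term of the Walsh sequence 1, r, r^2, ... (r > 1)."""
    return r ** (i - 1)

print([luby(i) for i in range(1, 16)])
# [1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8]
print([walsh(i, 2) for i in range(1, 6)])
# [1.0, 2.0, 4.0, 8.0, 16.0]
```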
Related work: Learning a good restart strategy
- Gomes et al. (2000): experiments on a sample of instances to informally choose a good strategy
- Zhan (2001): extensive experiments to evaluate the effect of the geometric parameter
- Ó Nualláin, de Rijke, van Benthem (2001): deriving good restart strategies when instances are drawn from two known runtime distributions
- Kautz et al. (2002a, 2002b): deriving good restart strategies when instances are drawn from n known runtime distributions
- Ruan, Horvitz, and Kautz (2003): clusters runtime distributions and constructs a strategy using dynamic programming
- Gagliolo and Schmidhuber (2007): online learning of a fixed-cutoff strategy interleaved with the Luby strategy
- Huang (2007): extensive experiments; no strategy was best across all benchmarks
Pitfalls of non-universal restart strategies
- Non-universal strategies are open to catastrophic failure: a strategy can provably fail on an instance because all of its cutoffs are too small
- Non-universal strategies learned by previous proposals can be unboundedly worse than performing no restarts at all; this pitfall is likely to arise whenever some instances are inherently harder to solve than others
Outline
- Introduction
  - basic-block scheduling
  - constraint programming
  - randomization and restarts
- Worst-case performance
  - bounds on expected runtime
  - bounds on tail probability
- Practical performance
  - parameterizing universal strategies
  - estimating optimal parameters
  - experiments
- Conclusions
Worst-case performance of universal strategies
- For universal strategies, two worst-case bounds are of interest: bounds on the expected runtime of a strategy, and bounds on its tail probability
- The Luby strategy has been thoroughly characterized (Luby, Sinclair, Zuckerman, 1993)
- The Walsh strategy has not been characterized
Worst-case bounds on expected runtime
- The expected runtime of the Luby strategy is within a log factor of optimal (Luby, Sinclair, Zuckerman, 1993)
- We show: the expected runtime of the Walsh strategy (1, r, r², …), r > 1, can be unboundedly worse than optimal
Worst-case bounds on tail probability (I)
- Tail probability: the probability that an algorithm or restart strategy runs for more than t steps, for some given t
- The tail probability of the Luby strategy decays superpolynomially as a function of t, no matter what the runtime distribution of the original algorithm is (Luby, Sinclair, Zuckerman, 1993)
[figure: example tail probability P(T > 4000)]
Worst-case bounds on tail probability (II)
- Pareto heavy-tailed distributions can be a good fit to the runtime distributions of randomized backtracking algorithms (Gomes et al., 1997, 2000)
- We show: if the runtime distribution of the original algorithm is Pareto heavy-tailed, the tail probability of the Walsh strategy decays superpolynomially
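A toy Monte-Carlo check of this flavor of result (my own illustration, not the proof): sample runtimes from a Pareto distribution with P(T > t) = t^(-alpha) and run the Walsh strategy; the strategy's empirical tail falls off far faster than the heavy-tailed original.

```python
import random

# Toy simulation: underlying runtimes are Pareto, P(T > t) = t^(-alpha),
# sampled by inverse transform; the Walsh strategy (1, r, r^2, ...) restarts
# the solver whenever a run exceeds the current cutoff.

def pareto_runtime(alpha, rng):
    u = 1.0 - rng.random()                 # u in (0, 1]
    return int(u ** (-1.0 / alpha))        # integer runtime >= 1

def walsh_runtime(alpha, r, rng):
    total, cutoff = 0, 1
    while True:
        t = pareto_runtime(alpha, rng)
        if t <= cutoff:                    # this run finishes in time
            return total + t
        total += cutoff                    # censored run: pay the cutoff
        cutoff *= r

rng = random.Random(0)
runs = [walsh_runtime(alpha=0.5, r=2, rng=rng) for _ in range(10000)]
for bound in (10, 100, 1000):
    print(bound, sum(t > bound for t in runs) / len(runs))  # empirical tail
```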
Outline
- Introduction
  - basic-block scheduling
  - constraint programming
  - randomization and restarts
- Worst-case performance
  - bounds on expected runtime
  - bounds on tail probability
- Practical performance
  - parameterizing universal strategies
  - estimating optimal parameters
  - experiments
- Conclusions
Practical performance of universal strategies
- Previous empirical evaluations have reported that the universal strategies can perform poorly in practice (Gomes et al., 2000; Kautz et al., 2002; Ruan et al., 2002, 2003; Zhan, 2001)
- We show: the performance of the universal strategies can be improved by (i) parameterizing the strategies and (ii) estimating the optimal settings for these parameters from a small sample of instances
Motivation
- Setting: a sequence of instances is to be solved over time
  - e.g., in staff rostering, a similar problem must be solved at regular intervals on the calendar
  - e.g., in instruction scheduling, thousands of instances arise each time a compiler is invoked on some software project
- Useful to learn a good portfolio, in an offline manner, from a training set
Parameterizing the universal strategies
- Two parameters: scale s and geometric factor r
- Parameterized Luby strategy, e.g., with s = 2, r = 3: (2, 2, 2, 6, 2, 2, 2, 6, 2, 2, 2, 6, 18, …)
- Parameterized Walsh strategy: (s, sr, sr², sr³, …)
- Advantage: improves performance while retaining the theoretical guarantees
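A sketch of both parameterized families (my reading of the parameterization; the paper's exact definition may differ in details). Generalizing Luby's sequence to base r makes each block r copies of the previous block followed by r^(k-1); with s = 2 and r = 3 this reproduces the sequence on the slide.

```python
def luby_r(i, r=2):
    """i-th term (1-indexed) of a base-r Luby sequence: block k is
    r copies of block k-1 followed by the single term r^(k-1)."""
    k, length = 1, 1                       # block k has length (r^k - 1)/(r - 1)
    while length < i:
        k += 1
        length = (r ** k - 1) // (r - 1)
    if i == length:
        return r ** (k - 1)                # the closing term of block k
    prev = (r ** (k - 1) - 1) // (r - 1)   # length of block k-1
    return luby_r((i - 1) % prev + 1, r)   # recurse into one of the copies

def param_luby(i, s=1, r=2):
    return s * luby_r(i, r)

def param_walsh(i, s=1.0, r=2.0):
    return s * r ** (i - 1)

print([param_luby(i, s=2, r=3) for i in range(1, 14)])
# [2, 2, 2, 6, 2, 2, 2, 6, 2, 2, 2, 6, 18]
```

With r = 2 and s = 1, luby_r reduces to the standard Luby sequence from the earlier sketch.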
Estimating the optimal parameter settings
- Discretize the scale s into orders of magnitude: 10⁻¹, …, 10⁵
- Discretize the geometric factor r: 2, 3, …, 10 (Luby); 1.1, 1.2, …, 2.0 (Walsh)
- Choose the values that minimize the performance measure on the training set
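A sketch of the offline estimation step (hypothetical data and helper names, not the authors' code): each training instance is represented by recorded runtime samples of the randomized solver, each restart draws a fresh sample, and a grid search keeps the (s, r) pair that minimizes total simulated time.

```python
import random
from itertools import product

def luby_r(i, r=2):                          # base-r Luby term, as sketched earlier
    k, length = 1, 1
    while length < i:
        k += 1
        length = (r ** k - 1) // (r - 1)
    if i == length:
        return r ** (k - 1)
    prev = (r ** (k - 1) - 1) // (r - 1)
    return luby_r((i - 1) % prev + 1, r)

SCALES = [10.0 ** e for e in range(-1, 6)]   # 10^-1, ..., 10^5
FACTORS = range(2, 11)                       # Luby geometric factor: 2, ..., 10

def simulate(s, r, samples, rng, max_restarts=64):
    """Simulated cost of the parameterized Luby strategy on one instance."""
    spent = 0.0
    for i in range(1, max_restarts + 1):
        cutoff, t = s * luby_r(i, r), rng.choice(samples)
        if t <= cutoff:
            return spent + t                 # run finishes within the cutoff
        spent += cutoff                      # otherwise pay the cutoff, restart
    return spent

def measure(s, r, training, rng, reps=100):
    return sum(simulate(s, r, inst, rng) for inst in training for _ in range(reps))

rng = random.Random(1)
training = [[2, 3, 400], [5, 7, 9], [1, 1, 5000]]   # made-up runtime samples
best = min(product(SCALES, FACTORS), key=lambda p: measure(*p, training, rng))
print(best)
```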
Experimental setup
- Instruction scheduling problems for multiple-issue pipelined processors
  - hard instances from the SPEC 2000 and MediaBench suites
  - gathered censored runtime distributions (10-minute time limit per instance)
  - training set: 927 instances; test set: 5,450 instances
- Solve using a backtracking search algorithm
  - randomized dynamic variable ordering heuristic
  - capable of performing three levels of constraint propagation:
    - Level 0: bounds consistency
    - Level 1: singleton consistency using bounds consistency
    - Level 2: singleton consistency using singleton consistency
Experiment 1: Time limit
- Time limit: 10 minutes per instance
- Performance measure: number of instances solved
- Learn parameter settings from the training set, evaluate on the test set
Experiment 1: Time limit
[results figure]
Experiment 2: No time limit
- No time limit: run to completion
- Performance measure: expected time to solve an instance
- In our experimental runtime data, timeouts were replaced by values sampled from the tail of a Pareto distribution
- Learn parameter settings from the training set, evaluate on the test set
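The mechanics of that replacement might look like this (a sketch; alpha is a made-up shape parameter, not the fitted value from the paper): a Pareto tail conditioned on exceeding the time limit satisfies P(X > x | X > limit) = (x/limit)^(-alpha), which inverse-transform sampling draws directly.

```python
import random

def pareto_tail_sample(limit, alpha, rng):
    """Draw x >= limit with P(X > x | X > limit) = (x / limit)^(-alpha)."""
    u = 1.0 - rng.random()                 # u in (0, 1]
    return limit * u ** (-1.0 / alpha)     # inverse-transform sampling

rng = random.Random(42)
limit = 600.0                              # 10-minute cutoff, in seconds
runtimes = [12.4, 600.0, 3.7, 600.0]       # 600.0 marks a censored (timed-out) run
uncensored = [t if t < limit else pareto_tail_sample(limit, alpha=1.5, rng=rng)
              for t in runtimes]
print(uncensored)
```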
Experiment 2: No time limit
[results figure]
Conclusions
- Restart strategies
  - Theoretical performance: worst-case analysis of the Walsh universal strategy
  - Practical performance: an approach for learning good universal restart strategies
- Bigger picture
  - Application-driven research: instruction scheduling in compilers
  - We can now solve optimally almost all instances that arise in practice