Published by Solomon Robertson. Modified over 6 years ago.
1
A Formalization of the Use of Bounds with Applications in Biology and Engineering
Sharlee Climer
Department of Computer Science and Engineering
Department of Biology
Washington University in St. Louis
This research was funded in part by NDSEG and Olin Fellowships, and by NSF grants IIS , ITR/EIA , and IIS
2
Overview
Introduction
Limit crossing
Cut-and-solve
TSP
Haplotyping
11/27/2018 Washington University in St. Louis
3
The use of bounds
Figure labels: upper bound, optimal solution, lower bound
Bounds are used in a number of search strategies as well as in a large number of algorithms for particular problems.
Many of these algorithms use bounds implicitly, and it is never stated that bounds have been used.
4
Use of bounds
Bounds have been extensively studied in both computer science and operations research
Pruning rules in branch-and-bound search
Previous efforts focused on relaxations
Vast number of ways that bounds can be produced
5
Formulation and notation
Techniques presented can be applied to a variety of optimization problems
We'll use integer linear programs (IPs) as the basic problem structure
Without loss of generality, we consider only minimization problems
6
Integer Linear Programs
Great number of research and engineering problems
CS applications:
  Traveling Salesman Problem
  Constraint Satisfaction Problem
  Robotic motion problems
  Clustering
  Multiple sequence alignment
  Haplotype inferencing
  VLSI circuit design
  Computer disk read head scheduling
  Derivation of physical structures of programs
  Delay-Tolerant Network routing
  Cellular radio network base station locations
  Minimum-energy multicast problem in wireless ad hoc networks
In addition to STRIPS-style problems, IPs have been used to model a number of additional AI problems such as…
Defend the use of IPs for the model.
Remind the audience that the general ideas presented should be applicable in other domains.
7
Integer Linear Programs
Minimize $Z = \sum_i c_i x_i$ (objective function)
Subject to: a set of linear constraints
$x_i$ integer
If the integrality constraints on $x_i$ are omitted, the problem becomes a linear program (LP)
Minimize a linear expression. Define objective function, decision variables, and solution space.
LPs are easy to solve; IPs usually are not.
8
Linear program example
Minimize Z = -11x + 4y
Subject to:
  3x + 8y <= 40
  11x - 8y <= 16
  x, y >= 0
Integrality not required.
Easily solved using the simplex method.
9
Linear program example
Minimize Z = -11x + 4y
Rearranged: y = (11/4) x + Z/4
Family of parallel lines with slope 11/4 and unknown y-intercept Z/4
10
Linear program example
Optimal solution: x = 4, y = 7/2, Z = -30
An optimal solution always lies on a vertex or edge of the feasible region
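The vertex property can be checked directly on this small example. The following is a minimal sketch, not the simplex method named on the earlier slide: it intersects every pair of constraint boundaries, keeps the feasible intersection points, and takes the best one. It assumes the feasible region is bounded (true here); the names `CONSTRAINTS`, `vertices`, and `objective` are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c.
CONSTRAINTS = [
    (F(3), F(8), F(40)),     # 3x + 8y <= 40
    (F(11), F(-8), F(16)),   # 11x - 8y <= 16
    (F(-1), F(0), F(0)),     # x >= 0
    (F(0), F(-1), F(0)),     # y >= 0
]

def objective(x, y):
    return -11 * x + 4 * y

def vertices(constraints):
    """Intersect every pair of boundary lines and keep the feasible points."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:              # parallel boundaries never meet
            continue
        x = (c1 * b2 - c2 * b1) / det   # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c for a, b, c in constraints):
            points.add((x, y))
    return points

best = min(vertices(CONSTRAINTS), key=lambda p: objective(*p))
best_value = objective(*best)
```

Exact fractions are used so that the vertex (4, 7/2) is found without rounding error.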
11
Integer linear program
Minimize Z = -11x + 4y
Subject to:
  3x + 8y <= 40
  11x - 8y <= 16
  x, y >= 0
  x, y integer
Optimal solution: x = 3, y = 3, Z = -21
Relaxing integrality yields a lower bound: the LP optimum (Z = -30) is no greater than the IP optimum (Z = -21).
The LP is easy to solve. The IP may be NP-hard.
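Because this instance is tiny, the integer optimum can be confirmed by exhaustive enumeration. This is only a sketch for the two-variable example; the enumeration ranges are derived from 3x + 8y <= 40 with x, y >= 0, and `lp_value` is copied from the LP slide rather than recomputed.

```python
def feasible(x, y):
    """Constraints of the example integer program (x, y >= 0 is enforced by the ranges)."""
    return 3 * x + 8 * y <= 40 and 11 * x - 8 * y <= 16

def z(x, y):
    return -11 * x + 4 * y

# 3x <= 40 gives x <= 13, and 8y <= 40 gives y <= 5.
candidates = [(x, y) for x in range(14) for y in range(6) if feasible(x, y)]
ip_point = min(candidates, key=lambda p: z(*p))
ip_value = z(*ip_point)

lp_value = -30  # optimum of the LP relaxation, taken from the LP example
# The relaxation can only lower the minimum, so it bounds the IP from below.
assert lp_value <= ip_value
```

Enumeration is exponential in general and is used here only to check the slide's numbers.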
12
The Traveling Salesman Problem
The Traveling Salesman Problem (TSP) is the problem of finding a minimum-cost complete tour of a set of cities
NP-hard
13
Optimal solution for 49-city TSP
14
The Traveling Salesman Problem
Minimize $Z = \sum_{i}\sum_{j} c_{ij} x_{ij}$
s.t.:
  $\sum_{i} x_{ij} = 1$ for $j = 1,\dots,n$
  $\sum_{j} x_{ij} = 1$ for $i = 1,\dots,n$
  $\sum_{i \in W}\sum_{j \in W} x_{ij} \le |W| - 1$, for all proper nonempty subsets $W$ of $V$
  $x_{ij} \in \{0,1\}$
15
Branch-and-bound search
Branching rules
  Determine structure of search tree
Relaxations
  Lower-bounding modification
Pruning
Heuristics to guide search
16
TSP: Omit subtour elimination constraints
Minimize $Z = \sum_{i}\sum_{j} c_{ij} x_{ij}$
s.t.:
  $\sum_{i} x_{ij} = 1$ for $j = 1,\dots,n$
  $\sum_{j} x_{ij} = 1$ for $i = 1,\dots,n$
  $\sum_{i \in W}\sum_{j \in W} x_{ij} \le |W| - 1$, for all proper nonempty subsets $W$ of $V$ (subtour elimination constraints, omitted in this relaxation)
  $x_{ij} \in \{0,1\}$
The result is the assignment problem
Can be solved in polynomial time
Insert picture of 49-city TSP with subtours
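To see why dropping the subtour elimination constraints gives a lower bound, the sketch below compares the assignment problem with the TSP on a hypothetical 4-city cost matrix (the values are illustrative, not from the talk). Both are solved by brute force over permutations, which is only practical for tiny instances; self-loops are excluded by assumption.

```python
from itertools import permutations

# Hypothetical cost matrix: two cheap pairs of cities that are far from each other.
COST = [
    [0, 1, 9, 9],
    [1, 0, 9, 9],
    [9, 9, 0, 1],
    [9, 9, 1, 0],
]
N = len(COST)

def assignment_cost(succ):
    """Total cost when city i is followed by city succ[i]."""
    return sum(COST[i][succ[i]] for i in range(N))

def is_single_tour(succ):
    """True if following successors from city 0 visits all cities before returning."""
    city, steps = 0, 0
    while True:
        city = succ[city]
        steps += 1
        if city == 0:
            return steps == N

# Assignment problem: every city has one successor and one predecessor,
# but the result may split into several subtours.
assignments = [p for p in permutations(range(N)) if all(p[i] != i for i in range(N))]
ap_bound = min(assignment_cost(p) for p in assignments)

# TSP: the same assignments, restricted to those forming one complete tour.
tsp_optimum = min(assignment_cost(p) for p in assignments if is_single_tour(p))
```

On this matrix the assignment solution consists of two 2-city subtours, so the bound is well below the tour cost; every tour is also a feasible assignment, which is why the bound can never exceed the optimum.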
17
TSP: Omit subtour elimination constraints
18
TSP: Relax integrality constraints
Minimize $Z = \sum_{i}\sum_{j} c_{ij} x_{ij}$
s.t.:
  $\sum_{i} x_{ij} = 1$ for $j = 1,\dots,n$
  $\sum_{j} x_{ij} = 1$ for $i = 1,\dots,n$
  $\sum_{i \in W}\sum_{j \in W} x_{ij} \le |W| - 1$, for all proper nonempty subsets $W$ of $V$
  $x_{ij} \in \{0,1\}$ is replaced by $0 \le x_{ij} \le 1$
Linear program (LP) relaxation
Can be solved in polynomial time
Insert picture of 49-city TSP with subtours
19
TSP: Relax integrality constraints
20
Branch-and-bound search
Incumbent solution
21
Limit crossing
A 2-step procedure for exploring the use of bounds
Has been implicitly used in a number of algorithms and search strategies
To our knowledge, it hasn't been formalized
Broadens the focus beyond traditional search
22
Limit crossing
2 steps:
(1) Find a simple upper or lower bound
(2) Combine upper-bounding and lower-bounding modifications and solve
If the solution of the doubly-modified problem exceeds the simple upper bound, the upper-bounding modification in step (2) is invalid
If the solution of the doubly-modified problem is less than the simple lower bound, the lower-bounding modification in step (2) is invalid
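The two steps can be illustrated on the earlier integer program (minimize -11x + 4y subject to 3x + 8y <= 40, 11x - 8y <= 16, x, y >= 0 and integer). This is a sketch under stated assumptions: the upper-bounding modification is fixing x to a value v, the lower-bounding modification is relaxing integrality of y, and an infeasible doubly-modified problem is treated as ruling out v. The function name is illustrative.

```python
from fractions import Fraction as F

def relaxed_value_with_x_fixed(v):
    """Solve the doubly-modified problem: x fixed to v, integrality of y relaxed.

    With x fixed, minimizing -11x + 4y means taking the smallest feasible y.
    Returns None when no real y satisfies the constraints.
    """
    y_low = max(F(0), F(11 * v - 16, 8))   # from y >= 0 and 11x - 8y <= 16
    y_high = F(40 - 3 * v, 8)              # from 3x + 8y <= 40
    if y_low > y_high:
        return None
    return -11 * v + 4 * y_low

# Step 1: a simple upper bound from one feasible integer point, (x, y) = (3, 3).
upper_bound = -11 * 3 + 4 * 3

# Step 2: apply the doubly-modified problem to each candidate value of x.
ruled_out = []
for v in range(14):                        # 3x <= 40 limits x to 0..13
    value = relaxed_value_with_x_fixed(v)
    if value is None or value > upper_bound:
        ruled_out.append(v)                # limits crossed: x = v cannot be optimal

still_possible = [v for v in range(14) if v not in ruled_out]
```

Only x = 3 and x = 4 survive. The test is one-sided: a surviving value is not guaranteed to appear in an optimal solution (x = 4 has no feasible integer y), but a ruled-out value cannot.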
23
Limit crossing
Find a simple upper or lower bound that is tight
Systematically apply modifications to produce doubly-modified problems
Either modification alone can be difficult to solve
Only the combination of the two modifications needs to be relatively easy
We are not limiting ourselves to setting variable values as the upper-bounding modification of the doubly-modified problem.
24
Modifications to obtain bounds
Many possibilities for obtaining bounds have been previously overlooked
Examine every aspect of the problem description
Modifications of IPs to produce bounds
  Relaxing or tightening constraints
  Modifying the objective function
  Adding or deleting decision variables
Use simple example problem to demonstrate.
25
Limit crossing strategies
Cut-and-solve [Climer and Zhang, Artificial Intelligence, to appear]
  An iterative search strategy
  Useful for general combinatorial optimization problems
Backbone and fat identifier [Climer and Zhang, AAAI-02]
  Used to identify characteristic variables
26
Cut-and-solve
For each iteration:
  Step 1: A chunk of the solution space is cut away and solved
  Step 2: A relaxed solution is found for the remaining solution space
Iterate until the relaxed solution is greater than or equal to the incumbent
The incumbent is then guaranteed to be optimal
27
Example
x >= 0
y <= 3
y + 13/6 x <= 9
y - 5/13 x >= 1/14
y + 3/5 x >= 6/5
x, y integers
28
Optimal solution
Minimize Z = y - 4/5 x
x = 2, y = 1, Z = -0.6
29
Iteration 1, first step
Cut away a chunk of the solution space: y - 17/3 x <= -14
and solve the sparse problem
30
Iteration 1, first step
x = 3, y = 2, Z = -0.4
Incumbent solution is -0.4
31
Iteration 1, second step
Add new constraint: y - 17/3 x >= -14
to cut off the solved chunk of the solution space
Relax integrality and solve
32
Iteration 1, second step
x = 2.6, y = 1.0, Z = -1.1
The relaxed value is less than the incumbent (-0.4), so another iteration is needed
33
Iteration 2, first step
Cut away a chunk of the solution space and solve the sparse problem
34
Iteration 2, first step
x = 2, y = 1, Z = -0.6
This solution is less than the incumbent, so the incumbent becomes -0.6
35
Iteration 2, second step
Add constraint to cut off the solved chunk
Relax integrality and solve
36
Iteration 2, second step
x = 1.0, y = 0.6, Z = -0.2
Incumbent value: Z = -0.6
The relaxed solution is greater than the incumbent, so the incumbent must be optimal
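The iterations above can be imitated with a short sketch on the same two-variable example (minimize Z = y - 4/5 x). Several choices here are assumptions for illustration rather than the talk's method: the chunk cut away is x >= floor(x*) where x* comes from the current relaxed solution (the talk uses a sloped cut), the relaxation is solved by enumerating vertices, and the sparse problem is solved by brute force. An initial relaxation is solved first so that a cut can be chosen.

```python
from fractions import Fraction as F
from itertools import combinations
from math import floor

# Example constraints, each (a, b, c) meaning a*x + b*y <= c.
BASE = [
    (F(-1), F(0), F(0)),           # x >= 0
    (F(0), F(1), F(3)),            # y <= 3
    (F(13, 6), F(1), F(9)),        # y + 13/6 x <= 9
    (F(5, 13), F(-1), F(-1, 14)),  # y - 5/13 x >= 1/14
    (F(-3, 5), F(-1), F(-6, 5)),   # y + 3/5 x >= 6/5
]

def z(x, y):
    return y - F(4, 5) * x

def feasible(x, y, cons):
    return all(a * x + b * y <= c for a, b, c in cons)

def solve_relaxation(cons):
    """Best vertex of the bounded region with integrality relaxed, or None if empty."""
    pts = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det != 0:
            p = ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)
            if feasible(*p, cons):
                pts.append(p)
    return min(pts, key=lambda p: z(*p)) if pts else None

def solve_sparse(cons):
    """Brute-force the integer points of a chunk; x in 0..4, y in 0..3 covers the region."""
    pts = [(x, y) for x in range(5) for y in range(4) if feasible(x, y, cons)]
    return min(pts, key=lambda p: z(*p)) if pts else None

def cut_and_solve():
    remaining, incumbent = list(BASE), None
    relaxed = solve_relaxation(remaining)
    # Stop once the relaxed value is no better than the incumbent.
    while relaxed is not None and (incumbent is None or z(*relaxed) < z(*incumbent)):
        k = floor(relaxed[0])
        chunk = remaining + [(F(-1), F(0), F(-k))]         # x >= k holds the relaxed optimum
        best = solve_sparse(chunk)                          # step 1: solve the chunk
        if best is not None and (incumbent is None or z(*best) < z(*incumbent)):
            incumbent = best
        remaining = remaining + [(F(1), F(0), F(k - 1))]    # x <= k - 1 is what is left
        relaxed = solve_relaxation(remaining)               # step 2: relax the remainder
    return incumbent

best = cut_and_solve()
```

Only the accumulated cut constraints and the incumbent are carried between iterations, which matches the small memory footprint that cut-and-solve is designed around.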
37
Cut-and-solve properties
Nominal memory requirements
  Keep new constraints and incumbent solution from one iteration to the next
No subtrees in which to get lost
Can be used as a complete anytime solver
Can use parallel processing
38
Cut-and-solve
Same as two steps of limit crossing
Small chunk is solved to provide simple upper bound
Doubly-modified problem
  Piercing cuts
  Relaxation
Unusual upper-bounding modification
39
Cut-and-solve
We used a generic algorithm for the TSP [Artificial Intelligence, to appear]
7 real-world problem classes [Cirasella, Johnson, McGeoch, Zhang, Lecture Notes in Computer Science, 2000]
500 instances solved for each class and size
Comparisons with:
  CDT [Carpaneto, Dell’Amico, and Toth, ACM Trans. on Math. Software, 1995]
  Concorde [Applegate et al.
  Cplex [ILOG
STSPs are hard if very large; our code is not designed for very large problems (arc lengths computed on the fly).
A simple implementation, yet it outperforms state-of-the-art solvers on difficult instances.
40
Shortest common superstring
41
Tilted drilling machine (additive norm)
42
Tilted drilling machine (sup norm)
43
Stacker crane
44
Computer disk read head
45
Pay phone coin collection
46
No-wait flow shop
47
Largest problem size solved by each method
48
Moving beyond traditional tree search
Cut-and-solve
Backbone & fat identifier
49
Haplotype inferencing
What are haplotypes?
Why should we care about them?
How can we infer haplotypes?
50
Haplotype inferencing
…TGGCACTTCCGAACTTTG…
…TGGTACTTCCGAACATTG…
…TGGCACTGCCGAACATTG…
…TGGCACTGCCGAACTTTG…
51
Haplotype inferencing
…TGGCACTTCCGAACTTTG…
…TGGTACTTCCGAACATTG…
…TGGCACTGCCGAACATTG…
…TGGCACTGCCGAACTTTG…
52
Haplotype inferencing
…C T T…
…T T A…
…C G A…
…C G T…
53
Haplotype inferencing
…C T T…   …0 0 1…
…T T A…   …1 0 0…
…C G A…   …0 1 0…
…C G T…   …0 1 1…
54
Haplotype inferencing
Haplotypes …0 0 1… and …1 0 0… combine to give genotype …2 0 2…
Haplotypes …0 1 0… and …0 1 1… combine to give genotype …0 1 2…
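The 0/1/2 coding shown here can be written as a one-line rule: a site where the two haplotypes agree keeps its value, and a site where they differ is coded 2. The sketch below assumes haplotypes are given as strings of 0s and 1s; the function name `genotype` is illustrative.

```python
def genotype(h1, h2):
    """Combine two 0/1 haplotype strings into a 0/1/2 genotype string."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same number of sites")
    # Agreeing sites keep their shared value; differing sites become 2.
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))

first = genotype("001", "100")
second = genotype("010", "011")
```

The result does not depend on the order of the two haplotypes, which is why the original pairing cannot be read back from the genotype alone.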
55
Haplotype inferencing
If a site on a genotype is the product of two different nucleotides, it is heterozygous
Else it is homozygous
$2^{k-1}$ feasible resolutions for k heterozygous sites
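The count of resolutions can be checked by enumeration. This sketch assumes the 0/1/2 string coding (2 marks a heterozygous site) and treats a resolution as an unordered pair of haplotypes; `resolutions` is an illustrative name. For k >= 1 the 2^k assignments of the heterozygous sites pair up into 2^(k-1) unordered resolutions; a genotype with no heterozygous sites has exactly one.

```python
from itertools import product

def resolutions(g):
    """All unordered haplotype pairs that combine to genotype g."""
    het = [i for i, site in enumerate(g) if site == "2"]
    pairs = set()
    for bits in product("01", repeat=len(het)):
        h1, h2 = list(g), list(g)
        for i, b in zip(het, bits):
            h1[i] = b                          # one haplotype takes b
            h2[i] = "1" if b == "0" else "0"   # the other takes the complement
        pairs.add(tuple(sorted(("".join(h1), "".join(h2)))))
    return sorted(pairs)

two_sites = resolutions("202")    # k = 2
three_sites = resolutions("2202") # k = 3
```

The exponential growth in k is what makes exhaustive haplotype inference expensive for genotypes with many heterozygous sites.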
56
Haplotype inferencing
Example: g1: g2: g3: g4: g5: g6: g7: g8:
57
g1: g2: 01010 , , 11001 01011 , 11010 g3: g4: 01110 , , 10001 g5: g6: 01100 , , 01100 01101 , 10100 00100 , 11101 00101 , 11100 g7: g8: 11100 , , 00111 01111 , 00011
58
59
Why do we care?
Genetic association studies use haplotypes
  Identify relationships between genes and diseases
International HapMap Consortium
  Identify genotypes, use PHASE [Stephens and Donnelly, Am. J. of Hum. Gen., 2003]
  “haplotypes of extremely high quality” [The International HapMap Consortium, Nature, 2005]
60
How can we infer haplotypes?
Consider genotypes from a population
Different objectives have been proposed
  Pure parsimony
  PHASE
61
Pure parsimony
Find the minimum number of haplotypes that will resolve the set of genotypes
Exponential time (worst case)
Gusfield cast it as an IP [CPM 2003]
  Solved some instances with 30 sites and 50 individuals
Doesn’t consider similarities of haplotypes
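The pure parsimony objective can be stated compactly as an exhaustive search. This sketch is not Gusfield's integer program; it simply tries every combination of resolutions and keeps the smallest set of distinct haplotypes, which is exponential and only usable on toy inputs. The genotypes in the example are hypothetical, and the function names are illustrative.

```python
from itertools import product

def resolutions(g):
    """All unordered haplotype pairs that combine to genotype g (2 = heterozygous site)."""
    het = [i for i, site in enumerate(g) if site == "2"]
    out = set()
    for bits in product("01", repeat=len(het)):
        h1, h2 = list(g), list(g)
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, "10"[int(b)]     # complementary values at each site
        out.add(tuple(sorted(("".join(h1), "".join(h2)))))
    return sorted(out)

def pure_parsimony(genotypes):
    """Pick one resolution per genotype so that the fewest distinct haplotypes are used."""
    best = None
    for choice in product(*(resolutions(g) for g in genotypes)):
        used = {h for pair in choice for h in pair}
        if best is None or len(used) < len(best):
            best = used
    return best

# Hypothetical genotypes: the third can reuse haplotypes forced by the first two.
haplotypes = pure_parsimony(["200", "020", "220"])
```

Here resolving "220" as (010, 100) reuses two haplotypes already needed, giving three in total, whereas resolving it as (000, 110) would need four.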
62
12 parsimonious solutions: 11 haplotypes
63
PHASE
Weights used to select haplotype pairs that have one already in the set
Weights for haplotypes that are “similar” to those in the set
Divide-and-conquer
64
PHASE solution: 11 haplotypes
65
PHASE solution: $\sum d_{ij} = 7$
66
Haplotype inferencing
Recent study by Andres, Clark, Hixson, Boerwinkle, and Sing
Computational methods including PHASE
Poor performance
Degree of uncertainty “highly error prone”
67
Three challenges
Find biologically meaningful model
Space complexity
Time complexity
68
Haplotype inferencing
PHASE
  Favors reduced cardinality
  Favors increased similarities
Our method
  Favors reduced cardinality and increased similarities
  Combinatorial approach
  Use a single parameter d
69
11 haplotypes
70
$\sum d_{ij} = 6$
71
Summary
Limit crossing
  2-step procedure for using bounds
  Explore every facet of model
Cut-and-solve
  Generic algorithm for IPs
TSP
  Outperformed other solvers for 5 out of 7 problem classes
Haplotyping
72
Future work
Haplotyping
  Customized limit crossing approach
  Accommodate multi-allelic data
  Automatically reduce trio data
  Accept phased data
  Genome-wide association testing
Combinatorial approaches to biological problems