1
Efficient Data Structures and a New Randomized Approach for Sorting Signed Permutations by Reversals
Haim Kaplan and Elad Verbin
2
Unsigned Sorting by Reversals
Given a permutation, find a shortest sequence of reversals that transforms it to (1 2 … n).
Example permutation: (3 4 2 5 1)
3
Signed Sorting by Reversals
Given a signed permutation, find a shortest sequence of reversals that transforms it to (+1 +2 … +n).
Example permutation: (3 4 -2 -5 1)
4
Example
(3 4 -2 -5 1) → (-4 -3 -2 -5 1) → (-4 -3 -2 -1 5) → (1 2 3 4 5)
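A signed reversal flips the order of a contiguous segment and negates every element in it. The sketch below replays the three reversals of the example slide. The function name `apply_reversal` and the 1-indexed, inclusive position convention are illustrative choices, not something the slides fix.

```python
def apply_reversal(perm, i, j):
    """Return a copy of perm with positions i..j (1-indexed, inclusive)
    reversed in order and negated in sign."""
    segment = [-x for x in reversed(perm[i - 1:j])]
    return perm[:i - 1] + segment + perm[j:]


p0 = [3, 4, -2, -5, 1]
p1 = apply_reversal(p0, 1, 2)   # reverse the segment (3 4)
p2 = apply_reversal(p1, 4, 5)   # reverse the segment (-5 1)
p3 = apply_reversal(p2, 1, 4)   # reverse the segment (-4 -3 -2 -1)
```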
5
Motivation
Biology: computing large-scale evolutionary distance (most parsimonious scenario)
Heuristics for TSP
6
History of Unsigned SBR
1993: Conjectured to be NP-Hard, and a 2-approximation algorithm, Kececioglu and Sankoff
1997: Proven NP-Hard, Caprara
1999: Proven MAX-SNP Hard, Berman and Karpinski
2001: 1.375-approximation, Berman, Hannenhalli and Karpinski
7
History of Signed SBR
1993: Conjectured NP-Hard, Kececioglu and Sankoff
1995: Polynomial algorithm, Pevzner and Hannenhalli
1996: O(n^2 α(n)) algorithm, Berman and Hannenhalli
1997: Simpler O(n^2) algorithm, Kaplan, Shamir and Tarjan (best to date)
2001: Very simple cubic solution, Bergeron
8
Variants
Computing just the distance (signed version): linear time (Bader et al., 2001)
Many, many more
9
Sorting Signed Permutations
10
The Breakpoint Graph
Permutation: +3 +4 -2 -5 +1
Transform permutation: +x becomes x_a, x_b and -x becomes x_b, x_a
Vertices: 0 3a 3b 4a 4b 2b 2a 5b 5a 1a 1b 6
Blue edges: between adjacent vertices
Red edges: between consecutive elements
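The construction on this slide can be sketched in a few lines. The labels `"3a"`, `"3b"` follow the slide. The function name `breakpoint_graph` is hypothetical. Two assumptions from the usual convention: blue edges pair positions (0,1), (2,3), …, and red edges join i_b to (i+1)_a, with the frame vertices 0 and n+1 standing in for 0_b and (n+1)_a.

```python
def breakpoint_graph(perm):
    """Return (vertices, blue_edges, red_edges) for a signed permutation."""
    n = len(perm)
    vertices = ["0"]
    for x in perm:
        a, b = f"{abs(x)}a", f"{abs(x)}b"
        # +x contributes x_a, x_b; -x contributes x_b, x_a
        vertices.extend((a, b) if x > 0 else (b, a))
    vertices.append(str(n + 1))

    # Blue edges join vertices that sit side by side across an element boundary.
    blue = [(vertices[k], vertices[k + 1]) for k in range(0, len(vertices), 2)]

    def right_end(i):          # the "b" side of element i (frame vertex for 0)
        return "0" if i == 0 else f"{i}b"

    def left_end(i):           # the "a" side of element i (frame vertex for n+1)
        return str(n + 1) if i == n + 1 else f"{i}a"

    # Red edges join element i to element i+1 for i = 0..n.
    red = [(right_end(i), left_end(i + 1)) for i in range(n + 1)]
    return vertices, blue, red


vertices, blue, red = breakpoint_graph([3, 4, -2, -5, 1])
```

For the slide's permutation this reproduces the vertex row 0 3a 3b 4a 4b 2b 2a 5b 5a 1a 1b 6, with n+1 = 6 edges of each colour.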
11
Calculating the distance
dist = n + 1 - c + h + f
c is the number of cycles of the breakpoint graph. Note: the graph has a single decomposition into alternating cycles.
h, f are small parameters of the breakpoint graph (usually h + f = 0); they can be eliminated in linear time.
When h + f = 0: dist = n + 1 - c
Goal: find a reversal that creates a cycle and keeps h + f = 0 (a safe reversal).
Vertices: 0 3a 3b 4a 4b 2b 2a 5b 5a 1a 1b 6
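Since every vertex has exactly one blue and one red edge, the cycles can be counted by alternately following the two colours. The sketch below assumes the common integer encoding (x_a = 2x-1, x_b = 2x, frame vertices 0 and 2n+1). The names `count_cycles` and `distance_without_hurdles` are illustrative, and the second function is only valid under the slide's assumption h + f = 0.

```python
def count_cycles(perm):
    """Number of alternating cycles c in the breakpoint graph of perm."""
    n = len(perm)
    seq = [0]
    for x in perm:
        a, b = 2 * abs(x) - 1, 2 * abs(x)
        seq.extend((a, b) if x > 0 else (b, a))
    seq.append(2 * n + 1)
    pos = {v: k for k, v in enumerate(seq)}

    seen, cycles = set(), 0
    for start in seq:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            u = seq[pos[v] ^ 1]   # blue edge: neighbour in positions (2k, 2k+1)
            seen.add(u)
            v = u ^ 1             # red edge: partner in values (2k, 2k+1)
    return cycles


def distance_without_hurdles(perm):
    """n + 1 - c; equals the reversal distance only when h + f = 0."""
    return len(perm) + 1 - count_cycles(perm)
```

On the running example (3 4 -2 -5 1) this gives c = 3 and n + 1 - c = 3, matching the three reversals of the example slide.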
12
Oriented Edges
Red edges can be:
Oriented: Right-to-Right or Left-to-Left
Unoriented: Left-to-Right or Right-to-Left
We consider only reversals that reverse between the endpoints of a red edge.
Def: Such a reversal acts on the red edge.
13
Oriented Edges
Red edges can be:
Oriented: Δc = 1
Unoriented: Δc = 0
Thm: A reversal acting on a red edge creates a new cycle if and only if the edge is oriented.
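One way to enumerate the reversals that act on oriented red edges is via "oriented pairs": two elements of the framed permutation (0, π, n+1) whose absolute values differ by one and whose signs differ. This is a sketch under that characterisation, which is assumed here rather than stated on the slides. The function name and the 1-indexed inclusive positions are illustrative.

```python
def oriented_reversals(perm):
    """Sorted list of distinct reversals (i, j), 1-indexed and inclusive,
    each acting on some oriented pair of the framed permutation."""
    framed = [0] + list(perm) + [len(perm) + 1]
    found = set()
    for i in range(len(framed)):
        for j in range(i + 1, len(framed)):
            a, b = framed[i], framed[j]
            consecutive = abs(abs(a) - abs(b)) == 1
            opposite_signs = (a < 0) != (b < 0)
            if not (consecutive and opposite_signs):
                continue
            if a + b == 1:
                found.add((i, j - 1))   # reverse from a up to just before b
            else:                       # a + b == -1
                found.add((i + 1, j))   # reverse from just after a up to b
    return sorted(found)


candidates = oriented_reversals([3, 4, -2, -5, 1])
```

Two different oriented pairs can yield the same reversal, which is why the result is collected in a set. The quadratic double loop is for clarity only and is not the efficient selection discussed later in the talk.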
14
Safe reversals
Safe oriented reversal: does not increase h+f.
Theorem (H-P): a permutation with h+f=0 always has a safe oriented reversal (or is id).
SBR algorithms iteratively find a safe reversal. KST does this in linear time (total running time: O(n^2)).
H-P characterized safe oriented reversals using the overlap graph.
15
What happens if we disregard safety?
Permutations with h+f>0 are rare.
16
RandomWalk
[Figure: tree of the oriented reversals of π = (1 -4 2 -3); nodes with h+f>0 are red. Node labels: "Whoops.. no children", "Yey! id", "Failed but don't know it yet".]
Each path is < n+2 steps.
H-P: all nodes with no children are either id or red.
Prop: oriented reversals never decrease h+f (i.e. red points only to red).
Prop: if we ended up at id, then the sequence was a minimal sorting sequence; otherwise we can retry.
17
After how many trials will RandomWalk succeed?
Worst permutation: many
Average permutation: very few
π_7 = (2 4 6 -1 -3 -5 7) (permutation of Michal Ozery and Ron Shamir): 50% chance of failure
[Figure: probability of success]
18
What is the average over all permutations of the expected number of trials until success? Theoretically – we don’t know (but we do know that red permutations are polynomially rare) Experimentally – 1.6, regardless of n
19
Behavior of RandomWalk on the average
Empirical testing of RandomWalk on random permutations (selected uniformly without unoriented components)

n        Permutations tested   % success at first trial   Average trials until success
10       300M                  66.20%                     1.642
20       100M                  63.86%                     1.607
50       30M                   63.01%                     1.594
100      10M                   62.86%                     1.593
200      3M                    62.76%                     1.594
500      600K                  62.69%                     1.596
1000     200K                  62.62%                     1.596
2000     50K                   63.35%                     1.586
5000     6000                  63.7%                      1.58
10000    2000                  63%                        1.59
20
Implementing the Random Walk
21
Basic structure
Our DS is based on the simple data structures of Fredman, Johnson, McGeoch & Ostheimer '95 (which are in turn based on those of Chrobak, Szymacha & Krawczyk '90). These structures were invented for implementing TSP heuristics.
These data structures allow us to maintain a permutation under:
Reversals
Queries [query formulas not recoverable]
All operations take logarithmic time.
22
Random Walk
Repeatedly:
Select a random oriented reversal
Perform it and update the permutation
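The loop on this slide, together with the retry rule from the RandomWalk proposition, can be sketched as follows. This is a naive illustration, not the efficient implementation the talk is about. Assumptions: the reversal is drawn uniformly from the distinct oriented reversals, the input has h + f = 0 (otherwise no trial can succeed), and the names and the `max_trials` cap are hypothetical.

```python
import random


def apply_reversal(perm, i, j):
    """Reverse and negate positions i..j (1-indexed, inclusive)."""
    return perm[:i - 1] + [-x for x in reversed(perm[i - 1:j])] + perm[j:]


def oriented_reversals(perm):
    """Distinct reversals acting on oriented pairs of (0, perm, n+1)."""
    framed = [0] + list(perm) + [len(perm) + 1]
    found = set()
    for i in range(len(framed)):
        for j in range(i + 1, len(framed)):
            a, b = framed[i], framed[j]
            if abs(abs(a) - abs(b)) == 1 and (a < 0) != (b < 0):
                found.add((i, j - 1) if a + b == 1 else (i + 1, j))
    return sorted(found)


def random_walk(perm, rng):
    """One trial: apply random oriented reversals until none is left.
    Returns the list of reversals if the walk ends at id, else None."""
    perm, steps = list(perm), []
    while True:
        choices = oriented_reversals(perm)
        if not choices:
            break
        i, j = rng.choice(choices)
        perm = apply_reversal(perm, i, j)
        steps.append((i, j))
    return steps if perm == list(range(1, len(perm) + 1)) else None


def sort_by_random_walk(perm, rng=None, max_trials=1000):
    """Retry random_walk until it reaches id; returns (steps, trials used)."""
    rng = rng or random.Random()
    for trial in range(1, max_trials + 1):
        steps = random_walk(perm, rng)
        if steps is not None:
            return steps, trial
    return None, max_trials


steps, trials = sort_by_random_walk([3, 4, -2, -5, 1], random.Random(0))
```

A failed trial is detected only at the end, when no oriented reversal remains and the permutation is not the identity.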
23
How fast can we run one RandomWalk iteration?
Repeatedly:
Select a random oriented reversal
Perform it and update the permutation
Each iteration can be done in O(n) time.
In the paper we show how to do it in [bound missing in source], so one RandomWalk takes [bound missing in source] time.
24
Further Questions Why does RandomWalk work (so well)? Are there variants of RandomWalk that work better (i.e. have good worst-case behavior – no bad cases)? Can RandomWalk be easily and efficiently implemented?
25
Further Questions
Are there variants of RandomWalk that can be easily and efficiently implemented but maintain good average-case behavior (e.g. by waiving the demand for a uniform selection of an oriented reversal)?
Can we maintain safety or hurdelity too?
26
General Further Research Is there a subquadratic algorithm for SBR? What is the structure of the space of all sorting reversal sequences? (i.e. the permutation graph we saw before)
27
Fin.
28
Fredman et al.'s DS
Splay tree with the permutation in the leaves: an inorder scan gives the permutation.
Reverse bits at the nodes. If a bit is on, the order of the subtree should be reversed, as should the signs of its elements.
29
Reverse(i,j): splay(j)
Splay operations should keep the invariant that the tree indeed represents the permutation.
30
Reverse(i,j): splay(i)
31
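Reverse(i,j):
[Figure: two before/after tree diagrams. First: i and j with subtrees T1, T2, T3 become -j and -i with subtrees T1, T2^R, T3. Second: i, k and j with subtrees T1, T2, T3, T4 become -j, -k and -i with subtrees T1, T2^R, T3^R, T4.]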
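The reverse-bit idea can be illustrated with a smaller structure than the one on these slides. The sketch below swaps in an implicit treap (split/merge) for the splay tree and stores elements in every node rather than only in the leaves. What it shares with the slides is the lazy reverse bit: a set bit means the subtree's order and all its signs are still to be flipped. All names are hypothetical.

```python
import random


class Node:
    def __init__(self, val, prio):
        self.val, self.prio = val, prio
        self.left = self.right = None
        self.size, self.rev = 1, False


def size(t):
    return t.size if t else 0


def update(t):
    t.size = 1 + size(t.left) + size(t.right)


def push(t):
    """Resolve t's reverse bit: swap children, negate t, defer to children."""
    if t.rev:
        t.left, t.right = t.right, t.left
        t.val = -t.val
        for child in (t.left, t.right):
            if child:
                child.rev = not child.rev
        t.rev = False


def split(t, k):
    """Split into (first k elements, the rest)."""
    if t is None:
        return None, None
    push(t)
    if size(t.left) >= k:
        a, b = split(t.left, k)
        t.left = b
        update(t)
        return a, t
    a, b = split(t.right, k - size(t.left) - 1)
    t.right = a
    update(t)
    return t, b


def merge(a, b):
    if a is None or b is None:
        return a or b
    if a.prio > b.prio:
        push(a)
        a.right = merge(a.right, b)
        update(a)
        return a
    push(b)
    b.left = merge(a, b.left)
    update(b)
    return b


def build(perm, seed=0):
    rng = random.Random(seed)
    root = None
    for x in perm:
        root = merge(root, Node(x, rng.random()))
    return root


def reverse(root, i, j):
    """Signed reversal of positions i..j (1-indexed, inclusive)."""
    left, rest = split(root, i - 1)
    mid, right = split(rest, j - i + 1)
    if mid:
        mid.rev = not mid.rev      # flip one bit instead of touching j-i+1 nodes
    return merge(merge(left, mid), right)


def to_list(t, out=None):
    out = [] if out is None else out
    if t:
        push(t)
        to_list(t.left, out)
        out.append(t.val)
        to_list(t.right, out)
    return out
```

Usage: `to_list(reverse(build([3, 4, -2, -5, 1]), 1, 2))` yields the second permutation of the example slide. Each reversal touches only the nodes on two split paths and two merge paths, which is logarithmic in expectation for a treap.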