Published by Guadalupe Harbinson. Modified over 9 years ago.
1
Greedy Algorithms CS 6030 by Savitha Parur Venkitachalam
2
Outline
- Greedy approach to motif searching
- Genome rearrangements
- Sorting by reversals
- Greedy algorithms for sorting by reversals
- Approximation algorithms
- Breakpoint reversal sort
3
Greedy motif searching
- Developed by Gerald Hertz and Gary Stormo in 1989
- CONSENSUS is a tool based on this greedy algorithm
- Faster than the brute-force and simple motif search algorithms
- An approximation algorithm with an unknown approximation ratio
4
Greedy motif search – Pseudocode
5
Greedy motif search – Steps
- Input: DNA sequences, t (number of sequences), n (length of each sequence), l (length of the motif to search for)
- Output: a set of starting positions of l-mers, one per sequence
- Performs an exhaustive search on the first two sequences, using Hamming distance to find the two closest l-mers
- Forms a 2 x l seed matrix from the two closest l-mers
- Scans each of the remaining t-2 sequences for the l-mer that best matches the seed and adds it as the next row of the seed matrix
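The steps above can be sketched in Python as follows. This is a minimal illustration, not the CONSENSUS implementation: the names `consensus_score` and `greedy_motif_search` are hypothetical, and the sketch scores l-mers by consensus score (the sum of the most frequent nucleotide count per column), which is an assumption. For two l-mers, maximizing that score is equivalent to minimizing their Hamming distance.

```python
from collections import Counter
from itertools import product


def consensus_score(lmers):
    """Sum, over columns, of the count of the most frequent nucleotide."""
    return sum(max(Counter(column).values()) for column in zip(*lmers))


def greedy_motif_search(dna, l):
    """Return 0-based start positions of the chosen l-mers, one per sequence."""
    first, second = dna[0], dna[1]

    # Exhaustive search over all pairs of l-mers in the first two sequences.
    best_pair, best_score = (0, 0), -1
    for s1, s2 in product(range(len(first) - l + 1), range(len(second) - l + 1)):
        score = consensus_score([first[s1:s1 + l], second[s2:s2 + l]])
        if score > best_score:
            best_pair, best_score = (s1, s2), score

    starts = list(best_pair)
    seed = [first[starts[0]:starts[0] + l], second[starts[1]:starts[1] + l]]

    # Scan each remaining sequence for the l-mer that best extends the seed.
    for seq in dna[2:]:
        best_s = max(
            range(len(seq) - l + 1),
            key=lambda s: consensus_score(seed + [seq[s:s + l]]),
        )
        starts.append(best_s)
        seed.append(seq[best_s:best_s + l])
    return starts


print(greedy_motif_search(["TTACGTCC", "GACGTAAA", "CCCCACGT"], 4))
```

Because each later sequence is matched against whatever seed was fixed first, a poor seed cannot be revised afterwards, which is why the greedy search can miss the optimal motif.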
6
Complexity
- The exhaustive search on the first two sequences requires $l(n-l+1)^2$ operations, which is $O(ln^2)$
- The sequential scan of the remaining t-2 sequences requires $l(n-l+1)(t-2)$ operations, which is $O(lnt)$
- Thus the running time of greedy motif search is $O(ln^2 + lnt)$
- If t is small compared to n, the algorithm behaves as $O(ln^2)$
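As a rough worked example of the two terms, take the illustrative values l = 8, n = 100 and t = 10 (these numbers are assumptions chosen only for the arithmetic, not taken from the slides):

```latex
\begin{align*}
l(n-l+1)^2    &= 8 \cdot 93^2        = 69{,}192 \\
l(n-l+1)(t-2) &= 8 \cdot 93 \cdot 8  = 5{,}952
\end{align*}
```

With t much smaller than n, the seed search on the first two sequences accounts for most of the work.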
7
Consensus tool
- The greedy motif algorithm may miss the optimal motif
- The Consensus tool saves a large number of seed matrices
- The Consensus tool can check sequences in random order
- The Consensus tool is therefore less likely to miss the optimal motif
8
Genome rearrangements
- Gene rearrangements result in a change of gene ordering
- A series of gene rearrangements can alter the genomic architecture of a species
- 99% similarity between cabbage and turnip genes
- Fewer than 250 genomic rearrangements since the divergence of humans and mice
11
History of Chromosome X. Source: Rat Consortium, Nature, 2004
12
Types of Rearrangements
- Reversal: 1 2 3 4 5 6 → 1 2 -5 -4 -3 6
- Translocation: 1 2 3 4 5 6 → 1 2 6 4 5 3
- Fusion and fission: 1 2 3 4 5 6 ↔ 1 2 3 4 5 6 (fusion joins two chromosomes into one; fission splits one into two)
13
Greedy algorithms in Gene Rearrangements
- Biologists are interested in finding the smallest number of reversals in an evolutionary sequence
- This number gives a lower bound on the number of rearrangements and a measure of the similarity between two species
- Two greedy algorithms are used:
  - Simple reversal sort
  - Breakpoint reversal sort
14
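Gene Order
- Gene order is represented by a permutation $\pi = \pi_1 \ldots \pi_{i-1}\, \pi_i\, \pi_{i+1} \ldots \pi_{j-1}\, \pi_j\, \pi_{j+1} \ldots \pi_n$
- Reversal $\rho(i, j)$ reverses (flips) the elements from i to j in $\pi$:
  $\pi \cdot \rho(i, j) = \pi_1 \ldots \pi_{i-1}\, \pi_j\, \pi_{j-1} \ldots \pi_{i+1}\, \pi_i\, \pi_{j+1} \ldots \pi_n$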
15
Reversal example
$\pi$ = 1 2 3 4 5 6 7 8
$\rho(3,5)$ ↓
1 2 5 4 3 6 7 8
$\rho(5,6)$ ↓
1 2 5 4 6 3 7 8
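The two reversals in this example can be reproduced with a short Python sketch. The helper name `apply_reversal` is hypothetical; positions are 1-based and inclusive, matching the slide notation.

```python
def apply_reversal(perm, i, j):
    """Return a copy of perm with positions i..j (1-based, inclusive) reversed."""
    if not 1 <= i <= j <= len(perm):
        raise ValueError("need 1 <= i <= j <= len(perm)")
    return perm[:i - 1] + perm[i - 1:j][::-1] + perm[j:]


pi = [1, 2, 3, 4, 5, 6, 7, 8]
step1 = apply_reversal(pi, 3, 5)     # reverse positions 3..5
step2 = apply_reversal(step1, 5, 6)  # then reverse positions 5..6
print(step1)
print(step2)
```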
16
Reversal distance problem
- Goal: Given two permutations, find the shortest series of reversals that transforms one into the other
- Input: Permutations $\pi$ and $\sigma$
- Output: A series of reversals $\rho_1, \ldots, \rho_t$ transforming $\pi$ into $\sigma$, such that t is minimum
- t: reversal distance between $\pi$ and $\sigma$
- $d(\pi, \sigma)$: smallest possible value of t, given $\pi$ and $\sigma$
17
Sorting by reversal
- Goal: Given a permutation, find a shortest series of reversals that transforms it into the identity permutation
- Input: Permutation $\pi$
- Output: A series of reversals $\rho_1, \ldots, \rho_t$ transforming $\pi$ into the identity permutation, such that t is minimum
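For very small permutations, the minimum t can be found by brute force. The sketch below uses a breadth-first search over all permutations reachable by reversals. It is only an illustration of the problem statement (the function name `reversal_distance` is hypothetical), and it is feasible only for tiny n because the search space grows as n!.

```python
from collections import deque


def reversal_distance(perm):
    """Minimum number of reversals that sort perm, found by breadth-first search."""
    start = tuple(perm)
    target = tuple(sorted(perm))
    if start == target:
        return 0
    n = len(start)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        # Try every reversal of a segment of length >= 2 (0-based slice i:j).
        for i in range(n - 1):
            for j in range(i + 2, n + 1):
                nxt = state[:i] + state[i:j][::-1] + state[j:]
                if nxt == target:
                    return dist + 1
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None  # not reached: every permutation can be sorted by reversals


print(reversal_distance([6, 1, 2, 3, 4, 5]))
```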
18
Sorting by reversal - Greedy algorithm
- When sorting the permutation $\pi$ = 1 2 3 6 4 5, the first three elements are already in order, so it does not make sense to break them
- The length of the already sorted prefix of $\pi$ is denoted $prefix(\pi)$; here $prefix(\pi) = 3$
- This suggests a greedy algorithm: increase $prefix(\pi)$ at every step
19
Simple Reversal sort – Pseudocode
A very general approach leads to an algorithm that sorts by moving the ith element to the ith position.
SimpleReversalSort($\pi$)
  for i ← 1 to n – 1
    j ← position of element i in $\pi$ (i.e., $\pi_j = i$)
    if j ≠ i
      $\pi \leftarrow \pi \cdot \rho(i, j)$
      output $\pi$
    if $\pi$ is the identity permutation
      return
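A runnable Python sketch of this pseudocode follows. The function name and the choice to return the list of intermediate permutations, rather than printing each one, are assumptions made for illustration.

```python
def simple_reversal_sort(perm):
    """Sort perm (a permutation of 1..n) by moving element i to position i.

    Returns the list of permutations produced after each reversal.
    """
    pi = list(perm)
    n = len(pi)
    identity = list(range(1, n + 1))
    steps = []
    for i in range(1, n):              # i = 1 .. n-1
        j = pi.index(i) + 1            # 1-based position of element i
        if j != i:
            pi[i - 1:j] = pi[i - 1:j][::-1]   # apply reversal rho(i, j)
            steps.append(list(pi))
        if pi == identity:
            break
    return steps


for step in simple_reversal_sort([6, 1, 2, 3, 4, 5]):
    print(step)
```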
20
Example – SimpleReversalSort not optimal
- Input: 612345
- SimpleReversalSort: 612345 -> 162345 -> 126345 -> 123645 -> 123465 -> 123456
- The greedy SimpleReversalSort takes 5 steps, whereas the optimal solution takes only 2 steps: 612345 -> 543216 -> 123456
- A related problem is the pancake flipping problem, in which only prefix reversals are allowed
21
Approximation Ratio
- These algorithms produce an approximate solution rather than an optimal one
- The approximation ratio of an algorithm A on input $\pi$ is $A(\pi) / OPT(\pi)$
- For an algorithm A that minimizes the objective function (minimization algorithm): $\max_{|\pi| = n} A(\pi) / OPT(\pi)$
- For a maximization algorithm: $\min_{|\pi| = n} A(\pi) / OPT(\pi)$
22
Breakpoints – A different face of greed
- In a permutation $\pi = \pi_1 \ldots \pi_n$:
  - if $\pi_i$ and $\pi_{i+1}$ are consecutive numbers, the pair is an adjacency
  - if $\pi_i$ and $\pi_{i+1}$ are not consecutive numbers, the pair is a breakpoint
- Example: $\pi$ = 1 | 9 | 3 4 | 7 8 | 2 | 6 5
  - Pairs (1,9), (9,3), (4,7), (8,2) and (2,6) form breakpoints
  - Pairs (3,4), (7,8) and (6,5) form adjacencies
- $b(\pi)$: number of breakpoints in permutation $\pi$
- Our goal is to eliminate all breakpoints, thus forming the identity permutation
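Counting breakpoints is a single pass over neighbouring pairs, as the Python sketch below shows. The function name and the `extend` flag are assumptions: with `extend=False` the count covers only the pairs listed in the example above, while `extend=True` first frames the permutation with 0 and n+1, which adds the pair (5,10) as a sixth breakpoint here.

```python
def count_breakpoints(perm, extend=False):
    """Count neighbouring pairs whose values are not consecutive.

    With extend=True the permutation is first framed by 0 and n+1.
    """
    if extend:
        seq = [0] + list(perm) + [len(perm) + 1]
    else:
        seq = list(perm)
    return sum(1 for a, b in zip(seq, seq[1:]) if abs(a - b) != 1)


example = [1, 9, 3, 4, 7, 8, 2, 6, 5]
print(count_breakpoints(example))               # pairs inside the permutation only
print(count_breakpoints(example, extend=True))  # also counts the pairs with 0 and n+1
```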
23
Breakpoint Reversal Sort – Steps
- Put two elements $\pi_0 = 0$ and $\pi_{n+1} = n+1$ at the ends of $\pi$
- Eliminate breakpoints using reversals
- Each reversal eliminates at most 2 breakpoints
- This implies reversal distance ≥ #breakpoints / 2
- Example: $\pi$ = 2 3 1 4 6 5
  0 2 3 1 4 6 5 7   $b(\pi)$ = 5
  0 1 3 2 4 6 5 7   $b(\pi)$ = 4
  0 1 2 3 4 6 5 7   $b(\pi)$ = 2
  0 1 2 3 4 5 6 7   $b(\pi)$ = 0
- As stated, the approach may run forever, because nothing yet guarantees that a breakpoint-reducing reversal exists at every step
24
Pseudocode – Breakpoint reversal Sort
BreakPointReversalSort($\pi$)
  while $b(\pi) > 0$
    Among all possible reversals, choose reversal $\rho$ minimizing $b(\pi \cdot \rho)$
    $\pi \leftarrow \pi \cdot \rho(i, j)$
    output $\pi$
  return
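A direct Python sketch of this pseudocode is given below. The function names are hypothetical, ties between equally good reversals are broken by taking the first one found, and the `max_steps` guard is an addition of this sketch: the greedy rule by itself does not guarantee termination, so the guard stops a run that fails to make progress.

```python
def breakpoints(ext):
    """Number of breakpoints in an extended permutation 0, pi_1..pi_n, n+1."""
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


def breakpoint_reversal_sort(perm, max_steps=1000):
    """Greedily apply the reversal that leaves the fewest breakpoints.

    Returns the permutations obtained after each reversal.
    """
    n = len(perm)
    ext = [0] + list(perm) + [n + 1]
    steps = []
    while breakpoints(ext) > 0:
        if len(steps) >= max_steps:
            raise RuntimeError("step limit reached without sorting")
        best = None
        # Positions 1..n hold the permutation; 0 and n+1 never move.
        for i in range(1, n):
            for j in range(i + 1, n + 1):
                cand = ext[:i] + ext[i:j + 1][::-1] + ext[j + 1:]
                if best is None or breakpoints(cand) < breakpoints(best):
                    best = cand
        ext = best
        steps.append(ext[1:-1])
    return steps


for step in breakpoint_reversal_sort([2, 3, 1, 4, 6, 5]):
    print(step)
```

On 2 3 1 4 6 5 this sketch needs three reversals, which matches the lower bound of #breakpoints / 2 = 2.5 rounded up.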
25
Using strips
- A strip is an interval between two consecutive breakpoints in a permutation
- Decreasing strip: strip of elements in decreasing order
- Increasing strip: strip of elements in increasing order
- Example: 0 1 | 9 | 4 3 | 7 8 | 2 | 5 6 | 10
- A single-element strip can be declared either increasing or decreasing. We will choose to declare them as decreasing, with the exception of the strips containing 0 and n+1
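The strip decomposition can be computed by cutting the extended permutation at every breakpoint. The following Python sketch (the function name `find_strips` and the string labels are assumptions) applies the convention above: single-element strips count as decreasing unless they contain 0 or n+1.

```python
def find_strips(perm):
    """Split 0, pi_1..pi_n, n+1 into strips and label each one.

    Returns a list of (elements, kind) with kind "increasing" or "decreasing".
    """
    n = len(perm)
    ext = [0] + list(perm) + [n + 1]
    strips = []
    start = 0
    for idx in range(1, len(ext) + 1):
        # A strip ends at the end of the list or just before a breakpoint.
        if idx == len(ext) or abs(ext[idx] - ext[idx - 1]) != 1:
            strip = ext[start:idx]
            if len(strip) > 1:
                kind = "increasing" if strip[0] < strip[1] else "decreasing"
            elif strip[0] in (0, n + 1):
                kind = "increasing"   # single-element strips holding 0 or n+1
            else:
                kind = "decreasing"   # all other single-element strips
            strips.append((strip, kind))
            start = idx
    return strips


for elements, kind in find_strips([1, 9, 4, 3, 7, 8, 2, 5, 6]):
    print(elements, kind)
```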
26
Reducing breakpoints
- Choose the decreasing strip with the smallest element k in $\pi$
- Find k-1 in the permutation
- Reverse the segment between k and k-1
- Example: $\pi$ = 1 4 6 5 7 8 3 2
  0 1 4 6 5 7 8 3 2 9   $b(\pi)$ = 5
  0 1 2 3 8 7 5 6 4 9   $b(\pi)$ = 4
  0 1 2 3 4 6 5 7 8 9   $b(\pi)$ = 2
  0 1 2 3 4 5 6 7 8 9   $b(\pi)$ = 0
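The rule above can be sketched in Python as follows. The helper names are hypothetical, and the handling of the case where k-1 lies to the right of k (reverse the segment that starts just after k and ends at k-1) is an assumption about how "the segment between k and k-1" is meant; the slide example only exercises the case where k-1 lies to the left.

```python
def decreasing_elements(ext):
    """Elements of ext that lie in decreasing strips (0 and n+1 never count)."""
    last_value = len(ext) - 1          # the framing element n+1
    found, start = [], 0
    for idx in range(1, len(ext) + 1):
        if idx == len(ext) or abs(ext[idx] - ext[idx - 1]) != 1:
            strip = ext[start:idx]
            if len(strip) == 1:
                decreasing = strip[0] not in (0, last_value)
            else:
                decreasing = strip[0] > strip[1]
            if decreasing:
                found.extend(strip)
            start = idx
    return found


def reduce_step(ext):
    """Apply one k / k-1 reversal; return None if no decreasing strip exists."""
    dec = decreasing_elements(ext)
    if not dec:
        return None
    k = min(dec)
    pos_k, pos_prev = ext.index(k), ext.index(k - 1)
    if pos_prev < pos_k:
        lo, hi = pos_prev + 1, pos_k   # k-1 is left of k
    else:
        lo, hi = pos_k + 1, pos_prev   # k-1 is right of k
    return ext[:lo] + ext[lo:hi + 1][::-1] + ext[hi + 1:]


ext = [0, 1, 4, 6, 5, 7, 8, 3, 2, 9]
while ext != sorted(ext):
    ext = reduce_step(ext)
    print(ext)
```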
27
ImprovedBreakpointReversalSort
- Sometimes a permutation may not contain any decreasing strips
- In that case an increasing strip has to be reversed so that it becomes a decreasing strip
- Taking this into consideration, we have an improved algorithm:
ImprovedBreakpointReversalSort($\pi$)
  while $b(\pi) > 0$
    if $\pi$ has a decreasing strip
      Among all possible reversals, choose reversal $\rho$ that minimizes $b(\pi \cdot \rho)$
    else
      Choose a reversal $\rho$ that flips an increasing strip in $\pi$
    $\pi \leftarrow \pi \cdot \rho$
    output $\pi$
  return
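A Python sketch of the improved algorithm is shown below. All names are hypothetical. Two choices are assumptions of this sketch, since the pseudocode leaves them open: ties among minimizing reversals go to the first one found, and when no decreasing strip exists the first increasing strip that does not touch 0 or n+1 is flipped.

```python
def breakpoints(ext):
    """Number of breakpoints in an extended permutation."""
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


def strips(ext):
    """Return (start, end, is_decreasing) index triples, end inclusive."""
    last = len(ext) - 1
    out, start = [], 0
    for idx in range(1, len(ext) + 1):
        if idx == len(ext) or abs(ext[idx] - ext[idx - 1]) != 1:
            end = idx - 1
            if end > start:
                decreasing = ext[start] > ext[end]
            else:
                # Single elements are decreasing unless they are 0 or n+1.
                decreasing = start not in (0, last)
            out.append((start, end, decreasing))
            start = idx
    return out


def reverse(ext, i, j):
    """Reverse positions i..j (0-based, inclusive) of ext."""
    return ext[:i] + ext[i:j + 1][::-1] + ext[j + 1:]


def improved_breakpoint_reversal_sort(perm):
    """Return the permutations obtained after each reversal."""
    n = len(perm)
    ext = [0] + list(perm) + [n + 1]
    steps = []
    while breakpoints(ext) > 0:
        found = strips(ext)
        if any(decreasing for _, _, decreasing in found):
            candidates = (reverse(ext, i, j)
                          for i in range(1, n) for j in range(i + 1, n + 1))
            ext = min(candidates, key=breakpoints)
        else:
            # Every strip is increasing: flip one that avoids 0 and n+1.
            i, j = next((s, e) for s, e, _ in found if s > 0 and e < n + 1)
            ext = reverse(ext, i, j)
        steps.append(ext[1:-1])
    return steps


for step in improved_breakpoint_reversal_sort([1, 2, 5, 6, 7, 3, 4]):
    print(step)
```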
28
Example – ImprovedBreakPointSort
- There are no decreasing strips in $\pi$ for:
  $\pi$ = 0 1 2 | 5 6 7 | 3 4 | 8   $b(\pi)$ = 3
  $\pi \cdot \rho(6,7)$ = 0 1 2 | 5 6 7 | 4 3 | 8   $b(\pi)$ = 3
- $\rho(6,7)$ does not change the number of breakpoints
- $\rho(6,7)$ creates a decreasing strip, thus guaranteeing that the next step will decrease the number of breakpoints
29
Approximation Ratio - ImprovedBreakpointReversalSort
- The approximation ratio is 4
- The algorithm eliminates at least one breakpoint in every two steps, so it takes at most $2b(\pi)$ steps
- Approximation ratio: $2b(\pi) / d(\pi)$
- An optimal algorithm eliminates at most 2 breakpoints in every step: $d(\pi) \geq b(\pi) / 2$
- Performance guarantee: $\dfrac{2b(\pi)}{d(\pi)} \leq \dfrac{2b(\pi)}{b(\pi)/2} = 4$
30
References
- An Introduction to Bioinformatics Algorithms, Neil C. Jones and Pavel A. Pevzner
- http://bix.ucsd.edu/bioalgorithms/slides.php# Ch5
31
Questions