Download presentation
Presentation is loading. Please wait.
1
Genome Rearrangements CSCI 7000-005: Computational Genomics Debra Goldberg debg@hms.harvard.edu
2
Turnip vs Cabbage Share a recent common ancestor Look and taste different
3
Turnip vs Cabbage Comparing mtDNA gene sequences yields no evolutionary information 99% similarity between genes These surprisingly identical gene sequences differed in gene order This study helped pave the way to analyzing genome rearrangements in molecular evolution
4
Turnip vs Cabbage: Different mtDNA Gene Order Gene order comparison:
5
Turnip vs Cabbage: Different mtDNA Gene Order Gene order comparison:
6
Turnip vs Cabbage: Different mtDNA Gene Order Gene order comparison:
7
Turnip vs Cabbage: Different mtDNA Gene Order Gene order comparison:
8
Turnip vs Cabbage: Gene Order Comparison Before After Evolution is manifested as the divergence in gene order
9
Transforming Cabbage into Turnip
10
Reversals 1 32 4 10 5 6 8 9 7 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 Blocks represent conserved genes. In the course of evolution or in a clinical context, blocks 1,…,10 could be misread as 1, 2, 3, -8, -7, -6, -5, -4, 9, 10. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 Blocks represent conserved genes.
11
Types of Mutations
12
Types of Rearrangements Reversal 1 2 3 4 5 61 2 -5 -4 -3 6
13
Types of Rearrangements Translocation 4 1 2 3 4 5 6 1 2 6 4 5 3
14
Types of Rearrangements 1 2 3 4 5 6 1 2 3 4 5 6 Fusion Fission
15
What are the similarity blocks and how to find them? What is the architecture of the ancestral genome? What is the evolutionary scenario for transforming one genome into the other? Unknown ancestor ~ 75 million years ago Mouse (X chrom.) Human (X chrom.) Genome rearrangements
16
Why do we care?
17
SKY (spectral karyotyping)
18
Robertsonian Translocation 1314
19
Robertsonian Translocation Translocation of chromosomes 13 and 14 No net gain or loss of genetic material: normal phenotype. Increased risk for an abnormal child or spontaneous pregnancy loss 1314
20
Philadelphia Chromosome 9 22
21
Philadelphia Chromosome Seen in about 90% of patients with Chronic myelogenous leukemia (CML) A translocation between chromosomes 9 and 22 (part of 22 is attached to 9) 922
22
Inversion 16
23
Inversion Chromosome on the right is inverted near the centromere 16
24
Colon cancer
25
Colon Cancer
26
Comparative Mapping: Mouse vs Human Genome Humans and mice have similar genomes, but their genes are ordered differently ~245 rearrangements –Reversals –Fusions –Fissions –Translocations
27
Waardenburg’s Syndrome: Mouse Provides Insight into Human Genetic Disorder Characterized by pigmentary dysphasia Gene implicated linked to human chromosome 2 It was not clear where exactly on chromosome 2
28
Waardenburg’s syndrome and splotch mice A breed of mice (with splotch gene) had similar symptoms caused by the same type of gene as in humans Scientists identified location of gene responsible for disorder in mice Finding the gene in mice gives clues to where same gene is located in humans
29
Reversals: Example = 1 2 3 4 5 6 7 8 1 2 5 4 3 6 7 8
30
Reversals: Example = 1 2 3 4 5 6 7 8 1 2 5 4 3 6 7 8 1 2 5 4 6 3 7 8
31
Reversals and Gene Orders Gene order represented by permutation 1 ------ i-1 i i+1 ------ j-1 j j+1 ----- n 1 ------ i-1 j j-1 ------ i+1 i j+1 ----- n Reversal ( i, j ) reverses (flips) the elements from i to j in ,j)
32
Reversal Distance Problem Goal: Given two permutations, find shortest series of reversals to transform one into another Input: Permutations and Output: A series of reversals 1,… t transforming into such that t is minimum t - reversal distance between and d( , ) = smallest possible value of t, given
33
Sorting By Reversals Problem Goal: Given a permutation, find a shortest series of reversals that transforms it into the identity permutation (1 2 … n ) Input: Permutation Output: A series of reversals 1, … t transforming into the identity permutation such that t is minimum min t =d( ) = reversal distance of
34
Sorting by reversals Example: 5 steps
35
Sorting by reversals Example: 4 steps What is the reversal distance for this permutation? Can it be sorted in 3 steps?
36
Pancake Flipping Problem Chef prepares unordered stack of pancakes of different sizes The waiter wants to sort (rearrange) them, smallest on top, largest at bottom He does it by flipping over several from the top, repeating this as many times as necessary Christos Papadimitrou and Bill Gates flip pancakes
37
Sorting By Reversals: A Greedy Algorithm Example: permutation = 1 2 3 6 4 5 First three elements are already in order prefix( ) = length of already sorted prefix – prefix( ) = 3 Idea: increase prefix( ) at every step
38
Doing so, can be sorted 1 2 3 6 4 5 1 2 3 4 6 5 1 2 3 4 5 6 Number of steps to sort permutation of length n is at most (n – 1) Greedy Algorithm: An Example
39
Greedy Algorithm: Pseudocode SimpleReversalSort( ) 1 for i 1 to n – 1 2 j position of element i in (i.e., j = i) 3 if j ≠i 4 * (i, j) 5 output 6 if is the identity permutation 7 return
40
Analyzing SimpleReversalSort SimpleReversalSort does not guarantee the smallest number of reversals and takes five steps on = 6 1 2 3 4 5 : Step 1: 1 6 2 3 4 5 Step 2: 1 2 6 3 4 5 Step 3: 1 2 3 6 4 5 Step 4: 1 2 3 4 6 5 Step 5: 1 2 3 4 5 6
41
But it can be sorted in two steps: = 6 1 2 3 4 5 –Step 1: 5 4 3 2 1 6 –Step 2: 1 2 3 4 5 6 So, SimpleReversalSort( ) is not optimal Optimal algorithms are unknown for many problems; approximation algorithms used Analyzing SimpleReversalSort (cont ’ d)
42
Approximation Algorithms These algorithms find approximate solutions rather than optimal solutions The approximation ratio of an algorithm A on input is: A( ) / OPT( ) where A( ) -solution produced by algorithm A OPT( ) - optimal solution of the problem
43
Approximation Ratio / Performance Guarantee Approximation ratio (performance guarantee) of algorithm A: max approximation ratio of all inputs of size n –For algorithm A that minimizes objective function (minimization algorithm): max | | = n A( ) / OPT( ) –For maximization algorithm: min | | = n A( ) / OPT( )
44
= 2 3 … n-1 n A pair of elements i and i + 1 are adjacent if i+1 = i + 1 For example: = 1 9 3 4 7 8 2 6 5 (3, 4) or (7, 8) and (6,5) are adjacent pairs Adjacencies and Breakpoints
45
There is a breakpoint between any adjacent element that are non-consecutive: = 1 9 3 4 7 8 2 6 5 Pairs (1,9), (9,3), (4,7), (8,2) and (2,5) form breakpoints of permutation b( ) - # breakpoints in permutation Breakpoints: An Example
46
Adjacency & Breakpoints An adjacency – consecutive A breakpoint – not consecutive π = 5 6 2 1 3 40 5 6 2 1 3 4 7 adjacencies breakpoints Extend π with π 0 = 0 and π n+1 = n+1
47
Add 0 =0 and n + 1 =n+1 at ends of Example: Extending with 0 and 10 Note: A new breakpoint was created after extending Extending Permutations = 1 9 3 4 7 8 2 6 5 = 0 1 9 3 4 7 8 2 6 5 10
48
Each reversal eliminates at most 2 breakpoints. = 2 3 1 4 6 5 0 2 3 1 4 6 5 7 b( ) = 5 0 1 3 2 4 6 5 7 b( ) = 4 0 1 2 3 4 6 5 7b( ) = 2 0 1 2 3 4 5 6 7 b( ) = 0 Reversal Distance and Breakpoints This implies: reversal distance ≥ #breakpoints / 2
49
Sorting By Reversals: A Better Greedy Algorithm BreakPointReversalSort( ) 1 while b( ) > 0 2 Among all possible reversals, choose reversal minimizing b( ) 3 (i, j) 4 output 5 return Problem: this algorithm may work forever
50
Strips Strip: an interval between two consecutive breakpoints in a permutation –Decreasing strip: elements in decreasing order (e.g. 6 5) –Increasing strip: elements in increasing order (e.g. 7 8) 0 1 9 4 3 7 8 2 5 6 10 –Consider single-element strips decreasing except strips 0 and n+1 are increasing
51
Reducing the Number of Breakpoints Theorem 1: If permutation contains at least one decreasing strip, then there exists a reversal which decreases the number of breakpoints (i.e. b( ) < b( ) )
52
Things To Consider For = 1 4 6 5 7 8 3 2 0 1 4 6 5 7 8 3 2 9 b( ) = 5 –Choose decreasing strip with the smallest element k in (k = 2 in this case)
53
Things To Consider For = 1 4 6 5 7 8 3 2 0 1 4 6 5 7 8 3 2 9 b( ) = 5 –Choose decreasing strip with the smallest element k in (k = 2 in this case)
54
Things To Consider For = 1 4 6 5 7 8 3 2 0 1 4 6 5 7 8 3 2 9 b( ) = 5 –Choose decreasing strip with the smallest element k in (k = 2 in this case) –Find k – 1 in the permutation
55
Things To Consider For = 1 4 6 5 7 8 3 2 0 1 4 6 5 7 8 3 2 9 b( ) = 5 –Choose decreasing strip with the smallest element k in (k = 2 in this case) –Find k – 1 in the permutation –Reverse segment between k and k-1: 0 1 2 3 8 7 5 6 4 9 b( ) = 4
56
Reducing the Number of Breakpoints Again –If there is no decreasing strip, there may be no reversal that reduces the number of breakpoints (i.e. b( ) ≥ b( ) for any reversal ). –By reversing an increasing strip ( # of breakpoints stay unchanged ), we will create a decreasing strip at the next step. Then the number of breakpoints will be reduced in the next step (theorem 1).
57
Things To Consider (cont’d) There are no decreasing strips in , for: = 0 1 2 5 6 7 3 4 8 b( ) = 3 (3,4)= 0 1 2 5 6 7 4 3 8 b( ) = 3 (3,4) does not change the # of breakpoints (3,4) creates a decreasing strip thus guaranteeing that the next step will decrease the # of breakpoints.
58
ImprovedBreakpointReversalSort ImprovedBreakpointReversalSort( ) 1 while b( ) > 0 2 if has a decreasing strip 3 Among all possible reversals, choose reversal that minimizes b( ) 4 else 5 Choose a reversal that flips an increasing strip in 6 7 output 8 return
59
ImprovedBreakPointReversalSort is an approximation algorithm with a performance guarantee of at most 4 –It eliminates at least one breakpoint in every two steps; at most 2b( ) steps –Approximation ratio: 2b( ) / d( ) –Optimal algorithm eliminates at most 2 breakpoints in every step: d( ) b( ) / 2 –Performance guarantee: ( 2b( ) / d( ) ) [ 2b( ) / (b( ) / 2) ] = 4 Performance Guarantee
60
Signed Permutations Up to this point, reversal sort algorithms sorted unsigned permutations But genes have directions… so we should consider signed permutations 5’5’ 3’3’ = 1 -2 - 3 4 -5
61
Signed Permutation Genes are directed fragments of DNA Genes in the same position but different orientations do not have same gene order These two permutations are not equivalent gene sequences 1 2 3 4 5 -1 2 -3 -4 -5
62
Signed permutations are easier! Polynomial time (optimal) algorithm is known
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.