1 5. Lecture WS 2003/04 Bioinformatics III Genome Rearrangements Compared to other areas in bioinformatics, we still know very little about the rearrangement events that produced the existing varieties of genomic architectures... Some material of this lecture is borrowed from: Nipun Mehra, www.stanford.edu/class/cs374/Notes/lec17.ppt; www.sna.csie.ndhu.edu.tw/~lung/seminar/20020502.ppt; Bafna V. and P.A. Pevzner, "Sorting by reversals: genome rearrangements in plant organelles and evolutionary history of X chromosome"; Hannenhalli S. and P.A. Pevzner, "Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals"; "Computational Molecular Biology", book by P.A. Pevzner, MIT Press, chapter 10.

2 Processes of Evolution - Substitution - Insertion - Deletion - Translocation - Inversion / Reversal - Duplication

3 What is a reversal (= inversion)? Break and invert: the double-stranded segment A T G C C T G T A C T A / T A C G G A C A T G A T becomes A T G T A C A G G C T A / T A C A T G T C C G A T. Purines (A, G) and pyrimidines (C, T) switch strands. Many organisms have highly similar genes but very different gene orders. Reversals are very prominent in prokaryotes, mitochondrial DNA and the mammalian X chromosome.

4 Types of Genome Rearrangements Two genomes may have many genes in common, but the genes may be arranged in a different order or be moved between chromosomes. Such differences in gene order are the result of rearrangement events that are common in molecular evolution. For example, in unichromosomal genomes, the most common rearrangement events are reversals, in which a contiguous interval of genes is put into the reverse order. For multichromosomal genomes, the most common rearrangement events are reversals, translocations, fissions, and fusions. The pairwise genome rearrangement problem is to find an optimal scenario transforming one genome into another via these rearrangement events.

5 Representation of a genome We consider a unichromosomal genome to be a sequence of n genes. The genes are represented by the numbers 1, 2, ..., n. The two orientations of gene i are represented by i and -i. A genome is represented as a signed permutation of the numbers 1, 2, ..., n. For example, a unichromosomal genome with n = 5 genes is 5 -3 4 2 -1.

6 Multichromosomal Genome A multichromosomal genome consists of n genes spread over m chromosomes. We represent it as a signed permutation of 1, 2, ..., n, with delimiters "$" or ";" inserted between the chromosomes. For example, a genome with 12 genes spread over 3 chromosomes is 7 -2 8 3 $ 5 9 -6 -1 12 $ 11 4 10 $. The order and the direction of the chromosomes do not matter in the multichromosomal algorithms. Thus, we could represent this same genome by flipping the first chromosome (reversing the order of its entries and negating them) and then moving the last chromosome to the beginning: 11 4 10 $ -3 -8 2 -7 $ 5 9 -6 -1 12 $.
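The equivalence of the two representations on this slide can be checked mechanically. The following is a minimal sketch, not part of the lecture: the helper names `parse_genome`, `flip` and `canonical` are hypothetical, and the canonical form (the lexicographically smaller orientation of each chromosome, then sorted) is just one convenient choice.

```python
def flip(chromosome):
    """Reverse the gene order and negate every gene."""
    return tuple(-g for g in reversed(chromosome))

def parse_genome(text):
    """Parse '7 -2 8 3 $ 5 9 $' into a list of chromosomes (tuples of ints)."""
    parts = [p.split() for p in text.split("$")]
    return [tuple(int(g) for g in p) for p in parts if p]

def canonical(genome):
    """Form that ignores chromosome order and chromosome orientation."""
    return sorted(min(c, flip(c)) for c in genome)

a = parse_genome("7 -2 8 3 $ 5 9 -6 -1 12 $ 11 4 10 $")
b = parse_genome("11 4 10 $ -3 -8 2 -7 $ 5 9 -6 -1 12 $")
print(canonical(a) == canonical(b))  # True
```

Two genomes are the same multichromosomal genome exactly when their canonical forms compare equal.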

7 Unichromosomal genomes: sorting by reversal A reversal in a signed permutation is an operation that takes an interval in a permutation, reverses the order of the numbers, and changes all their signs. For example, 5 1 3 2 -9 7 -4 6 8 → 5 1 -7 9 -2 -3 -4 6 8. The reversal distance between two genomes is the minimum number of reversals it takes to get from one genome to the other. For a given pair of genomes, the reversal distance is unique, but there are usually many possible reversal scenarios with this distance. However, it is possible that this mathematical notion of reversal distance underestimates the actual number of steps that occurred biologically.
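A signed reversal is short to write down in code. The sketch below is illustrative only: the function name `signed_reversal` and the 1-based, inclusive interval convention are assumptions. It reproduces the example on this slide, where the interval covers positions 3 to 6.

```python
def signed_reversal(perm, i, j):
    """Reverse perm[i..j] (1-based, inclusive) and flip the sign of each gene."""
    if not 1 <= i <= j <= len(perm):
        raise ValueError("need 1 <= i <= j <= len(perm)")
    block = [-g for g in reversed(perm[i - 1:j])]
    return perm[:i - 1] + block + perm[j:]

before = [5, 1, 3, 2, -9, 7, -4, 6, 8]
after = signed_reversal(before, 3, 6)
print(after)  # [5, 1, -7, 9, -2, -3, -4, 6, 8]
```

Applying the same reversal twice restores the original permutation, so every reversal is its own inverse.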

8 Multichromosomal genomes: rearrangement operations We treat four elementary rearrangement events in multichromosomal genomes: reversals, translocations, fusions, and fissions. Reversal: An interval within a single chromosome may be reversed in the same fashion as a reversal acts in the unichromosomal case: 7 -2 8 3 $ 5 9 -6 -1 12 $ 11 4 10 $ → 7 -2 8 3 $ 5 9 -12 1 6 $ 11 4 10 $. Note: When the programs are run in unichromosomal mode, the genomes 3 1 2 and -2 -1 -3 are considered different (one reversal apart, distance = 1), while in multichromosomal mode those same genomes are considered equivalent (distance = 0), because we have simply flipped an entire chromosome.

9 Translocation Two chromosomes "A B" and "C D" may be rearranged into "A D" and "C B". (The letters A, B, C, D stand for sequences of genes.) Because flipping chromosomes does not alter a genome (only its representation is altered), "A -C" and "-B D" is another possible translocation. (-B means to reverse the order of the genes in sequence B and negate each one.) For example, a translocation on chromosomes 1 and 3 is 7 -2 8 3 $ 5 9 -6 -1 12 $ 11 4 10 $ → 7 -2 8 -4 -11 $ 5 9 -6 -1 12 $ -3 10 $.

10 Fusion & Fission Fusion: Two chromosomes may be fused together into a single chromosome. Due to chromosome flippings, there are four distinct fusions between each pair of chromosomes. Here is one of the fusions between chromosomes 1 and 3: 7 -2 8 3 $ 5 9 -6 -1 12 $ 11 4 10 $ → 7 -2 8 3 -10 -4 -11 $ 5 9 -6 -1 12 $. Fission: A chromosome may be broken into two chromosomes between any pair of genes: 7 -2 8 3 $ 5 9 -6 -1 12 $ 11 4 10 $ → 7 -2 8 3 $ 5 9 $ -6 -1 12 $ 11 4 10 $.
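The translocation, fusion and fission examples from the slides can be replayed with a few list operations. This is a sketch under assumptions: chromosomes are plain Python lists, cut positions count genes from the left, and the function names (`translocate`, `fuse`, `fission`) are hypothetical rather than taken from any rearrangement program.

```python
def flip(chrom):
    """Reverse the gene order and negate every gene."""
    return [-g for g in reversed(chrom)]

def translocate(x, y, cut_x, cut_y, flipped=False):
    """x = A B, y = C D  ->  (A D, C B), or (A -C, -B D) if flipped."""
    a, b = x[:cut_x], x[cut_x:]
    c, d = y[:cut_y], y[cut_y:]
    if flipped:
        return a + flip(c), flip(b) + d
    return a + d, c + b

def fuse(x, y, flip_second=False):
    """Join two chromosomes, optionally flipping the second one first."""
    return x + (flip(y) if flip_second else y)

def fission(x, cut):
    """Break a chromosome after its first `cut` genes."""
    return x[:cut], x[cut:]

c1, c2, c3 = [7, -2, 8, 3], [5, 9, -6, -1, 12], [11, 4, 10]
print(translocate(c1, c3, 3, 2, flipped=True))  # ([7, -2, 8, -4, -11], [-3, 10])
print(fuse(c1, c3, flip_second=True))           # [7, -2, 8, 3, -10, -4, -11]
print(fission(c2, 2))                           # ([5, 9], [-6, -1, 12])
```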

11 Signed and unsigned genomes Most comparative mapping techniques determine the physical locations and relative order of genes in each chromosome, but do not determine which of the two orientations each gene has. Current sequencing methods do provide the orientations. It turns out that the genome rearrangement problem (uni- and multichromosomal) for unsigned permutations is NP-hard, but the same problems for signed data can be solved in polynomial time. Fortunately, with many genomes currently being sequenced, it is likely that many comparative maps (corresponding to unsigned permutations) will soon be replaced by sequencing data (corresponding to signed permutations).

12 Multichromosomal genomes: rearrangement operations For example, turning the unsigned genome 1 2 3 4 5 into the unsigned genome 1 4 3 2 5 requires one unsigned reversal. Signs can be assigned in the source and destination genomes so that the signed reversal scenario requires this same number of steps. Here, we get 1 2 3 4 5 → 1 -4 -3 -2 5, which also takes one step. Note that there may be other sign assignments taking this minimum number of steps.

13 Multichromosomal genomes: rearrangement operations It is possible that correctly signed data would have increased the number of steps: 1 2 3 4 5 → 1 -4 -3 -2 5 → 1 -4 3 -2 5. If the data collection method did not determine signs, it is impossible to know mathematically whether the one-step or the two-step scenario is more biologically accurate; the mathematical problem that the genome rearrangement programs solve is to find the signs giving the minimum possible distance.

14 X-Alignments The "X" factor was discovered by Eisen et al.: alignment of whole genomes of prokaryotes such as bacteria revealed X-like patterns in dot plots, called X-alignments. Points along the diagonal are orthologs between species. Points along the anti-diagonal are duplicates separated by inversion, within species. Implication: the reversals took place equidistant from the center of the chromosome.

15 A biological model case Cabbage: 8 7 6 5 4 3 2 1 11 10 9. Turnip: 4 3 2 8 7 1 5 6 11 10 9. Palmer and Herbon found that the mitochondrial genomes in cabbage and turnip had very similar gene sequences, but fairly different gene orders. How to design a "transformation" of cabbage into turnip? The mitochondrial DNA of cabbage and turnip is composed of five conserved blocks of genes that are shuffled in cabbage as compared to turnip. Every conserved block has a direction that is shown by a + or - sign.

16 Inversion, transposition and inverted transposition [Figure: three panels showing an inversion, a transposition and an inverted transposition.]

17 Oriented/Unoriented Blocks Remember that the unoriented case results in an NP-hard problem, whereas the oriented case can be solved in polynomial time. [Figure: unoriented blocks (NP-hard) and oriented blocks (polynomial time), illustrated with the block orders 2 1 3 7 5 4 8 6 / 1 2 3 4 5 6 7 8 and 8 7 6 5 4 3 2 1 11 10 9 / 4 3 2 8 7 1 5 6 11 10 9.]

18 Sorting by Reversals Cabbage: 8 7 6 5 4 3 2 1 11 10 9 → 8 2 3 4 5 6 7 1 11 10 9 → 8 2 3 4 5 1 7 6 11 10 9 → 4 3 2 8 5 1 7 6 11 10 9 → 4 3 2 8 7 1 5 6 11 10 9 (Turnip).

19 Permutation (π): an ordered arrangement of the set {1, 2, …, n}. Reversal (ρ): a rearrangement that inverts a block in π: {3 4 7 6 1 5 2} · ρ(3,6) = {3 4 5 1 6 7 2}. Signed permutation: a permutation whose elements are oriented; a reversal switches element orientation: {+3 -4 +7 -6 +1 -5 +2} · ρ(3,6) = {+3 -4 +5 -1 +6 -7 +2}.

20 easy to do by eye... 8 7 6 5 4 3 2 1 11 10 9 → 8 2 3 4 5 6 7 1 11 10 9 → 8 2 3 4 5 1 7 6 11 10 9 → 4 3 2 8 5 1 7 6 11 10 9 → 4 3 2 8 7 1 5 6 11 10 9, where the successive reversals are labeled ρ_1, ρ_2, ...: π · ρ_1 · ρ_2 ··· ρ_t = σ and π = σ · ρ_t ··· ρ_2 · ρ_1.

21 Formal Approach: Sorting by Reversals The order of genes in 2 organisms is represented by permutations π = π_1 π_2 ... π_n and σ = σ_1 σ_2 ... σ_n. A reversal ρ(i,j) of an interval [i,j] is the permutation that maps 1 2 ... i-1 i i+1 ... j-1 j j+1 ... n to 1 2 ... i-1 j j-1 ... i+1 i j+1 ... n. π · ρ(i,j) has the effect of reversing the order of π_i π_{i+1} ... π_j and transforming π_1 ... π_{i-1} π_i ... π_j π_{j+1} ... π_n into π · ρ(i,j) = π_1 ... π_{i-1} π_j ... π_i π_{j+1} ... π_n. Given permutations π and σ, the reversal distance problem is to find a series of reversals ρ_1 ρ_2 ... ρ_t such that π · ρ_1 · ρ_2 ··· ρ_t = σ and t is minimal. t is called the reversal distance between π and σ.
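For very small permutations, the definition of reversal distance can be checked directly by exhaustive search. The sketch below uses plain breadth-first search over all unsigned reversals. This is a brute-force illustration, not one of the algorithms discussed in the lecture: the state space grows as n!, so it is only usable for a handful of genes. The function name `reversal_distance` is an assumption.

```python
from collections import deque

def reversal_distance(source, target):
    """Exact unsigned reversal distance by breadth-first search (small n only)."""
    source, target = tuple(source), tuple(target)
    if sorted(source) != sorted(target):
        raise ValueError("both genomes must contain the same genes")
    seen = {source: 0}
    queue = deque([source])
    while queue:
        perm = queue.popleft()
        if perm == target:
            return seen[perm]
        n = len(perm)
        # Try every reversal rho(i, j) with i < j (0-based here).
        for i in range(n):
            for j in range(i + 1, n):
                nxt = perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]
                if nxt not in seen:
                    seen[nxt] = seen[perm] + 1
                    queue.append(nxt)
    raise AssertionError("unreachable: reversals generate all permutations")

print(reversal_distance([1, 2, 3, 4, 5], [1, 4, 3, 2, 5]))  # 1
print(reversal_distance([3, 1, 2], [1, 2, 3]))              # 2
```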

22 Breakpoint Graph Sorting a permutation by reversals is a hard problem. Breakpoints were introduced by Watterson et al. (1982) and by Nadeau and Taylor (1984), and correlations were noticed between the reversal distance and the number of breakpoints. Let i ~ j if |i - j| = 1. Extend a permutation π = π_1 π_2 ... π_n by adding π_0 = 0 and π_{n+1} = n + 1. We call a pair of elements (π_i, π_{i+1}), 0 ≤ i ≤ n, of π an adjacency if π_i ~ π_{i+1}, and a breakpoint if π_i ≁ π_{i+1}. For example, π = 2 3 1 4 6 5 7 is extended to 0 2 3 1 4 6 5 7 8, which has 3 adjacencies, (2,3), (6,5) and (7,8), and 5 breakpoints. As the identity permutation has no breakpoints, sorting by reversals corresponds to eliminating breakpoints. The observation that every reversal can eliminate at most 2 breakpoints implies that the reversal distance satisfies d(π) ≥ b(π) / 2, where b(π) is the number of breakpoints in π. However, this lower bound is often far from tight.
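Counting breakpoints follows directly from the definition. The sketch below (the function name `breakpoints` is an assumption) extends the permutation with 0 and n + 1, lists the breakpoints of the example 2 3 1 4 6 5 7, and prints the resulting lower bound b(π)/2, rounded up.

```python
import math

def breakpoints(perm):
    """Return the breakpoints of an unsigned permutation of 1..n as value pairs."""
    ext = [0] + list(perm) + [len(perm) + 1]
    return [(a, b) for a, b in zip(ext, ext[1:]) if abs(a - b) != 1]

pi = [2, 3, 1, 4, 6, 5, 7]
bps = breakpoints(pi)
print(bps)                      # [(0, 2), (3, 1), (1, 4), (4, 6), (5, 7)]
print(math.ceil(len(bps) / 2))  # 3, a lower bound on the reversal distance
```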

23 Breakpoint Graph The breakpoint graph of a permutation π is an edge-colored graph G(π) with n + 2 vertices {π_0, π_1, ..., π_n, π_{n+1}} = {0, 1, ..., n, n+1}. We join vertices π_i and π_{i+1} by a black edge for 0 ≤ i ≤ n. We join vertices π_i and π_j by a gray edge if π_i ~ π_j. A breakpoint graph is thus obtained by superposition of a black path traversing the vertices 0, 1, ..., n, n+1 in the order given by the permutation π and a gray path traversing the vertices in the order given by the identity permutation. [Figure: black path and gray path on the vertices 0 2 3 1 4 6 5 7, and their superposition.]

24 Cycle decomposition A cycle in an edge-colored graph G is called alternating if the colors of every two consecutive edges of this cycle are distinct. In the following, cycles will mean alternating cycles. [Figure: cycle decomposition of the breakpoint graph on the vertices 0 2 3 1 4 6 5 7.] A vertex v in a graph G is called balanced if the number of black edges incident to v equals the number of gray edges incident to v. A balanced graph is a graph in which every vertex is balanced. G(π) is a balanced graph. Therefore, there exists a cycle decomposition of G(π) into edge-disjoint alternating cycles (every edge in the graph belongs to exactly one cycle in the decomposition). Cycles in a cycle decomposition may be self-intersecting. The previous breakpoint graph can be decomposed into 4 cycles, one of which is self-intersecting.
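The black path, the gray path and the balance property can be illustrated in a few lines. This is a minimal sketch with hypothetical function names. It builds the edge lists for an unsigned permutation, keeping adjacencies as parallel black and gray edges, and checks that every vertex has as many black as gray edges. It does not attempt the maximum cycle decomposition.

```python
from collections import Counter

def breakpoint_graph(perm):
    """Black edges follow the permutation, gray edges follow the identity."""
    ext = [0] + list(perm) + [len(perm) + 1]
    black = list(zip(ext, ext[1:]))
    gray = [(k, k + 1) for k in range(len(perm) + 1)]
    return black, gray

def is_balanced(black, gray):
    """True if each vertex has equally many black and gray incident edges."""
    black_degree = Counter(v for edge in black for v in edge)
    gray_degree = Counter(v for edge in gray for v in edge)
    return black_degree == gray_degree

black, gray = breakpoint_graph([2, 3, 1, 4, 6, 5, 7])
print(black)
print(is_balanced(black, gray))  # True
```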

25 Cycle decomposition What is the decomposition of the breakpoint graph into a maximum number c(π) of edge-disjoint alternating cycles? Here, c(π) = 4. Cycle decompositions play an important role in estimating reversal distances. When a reversal is applied to a permutation, the number of cycles in a maximum decomposition can change by at most one (while the number of breakpoints can change by two). Bafna & Pevzner (1996) proved the bound d(π) ≥ n + 1 - c(π), which is much tighter than the bound in terms of breakpoints, d(π) ≥ b(π) / 2. For many biological problems, d(π) = n + 1 - c(π). Therefore, the reversal distance problem reduces to the problem of finding the maximal cycle decomposition.

26 Effects of reversals on cycles (A) For reversals acting on two cycles, Δ(b - c) = 1. (B) For reversals acting on an unoriented cycle, Δ(b - c) = 0. (C) For reversals acting on an oriented cycle, Δ(b - c) = -1. Hannenhalli, Pevzner, Journal of the ACM 46, 1 (1999)

27 Effect of reversals on gray edges (a) A proper reversal on an oriented gray edge. (b) A nonproper reversal on an unoriented gray edge. Hannenhalli, Pevzner, Journal of the ACM 46, 1 (1999)

28 Transform signed into unsigned permutation (a) Optimal sorting of a permutation (3 5 8 6 4 7 9 2 1 10 11) by 5 reversals. (b) Breakpoint graph of this permutation: black edges connect adjacent vertices that are not consecutive, gray edges connect consecutive vertices that are not adjacent. (c) Transformation of a signed permutation into an unsigned permutation π and the breakpoint graph G(π). (d) Interleaving graph H_π with two oriented components and one unoriented component. Hannenhalli, Pevzner, Journal of the ACM 46, 1 (1999)

29 The Problems Minimum Sorting by Reversals (MinSortRv): Given a permutation π, what is the shortest sequence (ρ_1 ρ_2 … ρ_t) of reversals that sorts π? (Distance: d(π)) → The complexity remained open for some time; the problem is NP-hard. Minimum Signed Sorting by Reversals (SignedSortRv): Given a signed permutation π, what is the shortest sequence (ρ_1 ρ_2 … ρ_t) of reversals that sorts π? → Solvable in polynomial time.

30 Important developments KS93 - Kececioglu and Sankoff, "Exact and approximation algorithms for the inversion distance between two chromosomes", 4th CPM: studied MinSortRv; introduced the notion of breakpoints; 2-approximation algorithm. BP93 - Bafna and Pevzner, "Genome Rearrangements and Sorting by Reversals", 34th FOCS: breakpoint graph and cycle decomposition; introduced signed sorting (SignedSortRv); 3/2-approximation algorithm for SignedSortRv; 7/4-approximation algorithm for MinSortRv. HP95 - Hannenhalli and Pevzner, "Transforming Cabbage into Turnip", 27th STOC: SignedSortRv resolved; O(n^4) algorithm; introduced hurdles and fortresses; d(π) = b(π) - c(π) + h(π) + f(π).

31 KS93 - Breakpoints Extend π to include element 0 (L) on the left and element n+1 (R) on the right. A breakpoint occurs between two adjacent elements that do not differ by 1. Example: π = { 3 5 6 7 2 1 4 8 } has 5 breakpoints (b(π) = 5), marked here by "|": L | 3 | 5 6 7 | 2 1 | 4 | 8 R. Breakpoints partition the sequence into strips that are increasing or decreasing. Reversals add or remove breakpoints. The sorted permutation has 0 breakpoints. i-reversal (i = 0, 1, 2): a reversal that decreases the number of breakpoints by i. Theorem (KS): Let π contain a decreasing strip. Then π has a 1- or 2-reversal. If every reversal that removes a breakpoint of π results in a permutation with no decreasing strips, then π has a 2-reversal.

32 Algorithm KS(π)
i ← 0
while π contains a breakpoint
    i ← i + 1
    ρ_i ← the reversal that removes the most breakpoints, resolving ties in favor of reversals that leave a decreasing strip
    π ← π · ρ_i
return (ρ_1, ..., ρ_i)
The optimal reversal distance is at least b(π)/2. KS returns a solution with at most b(π) reversals, i.e. at most 2 × optimal.
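The greedy rule of Algorithm KS can be tried out with a brute-force search over all reversals at each step. The sketch below is an illustration under assumptions, not the original implementation: it tests every interval (cubic work per step), and it treats a single-element strip as decreasing unless it contains the frame elements 0 or n+1, which is the convention assumed here for the tie-breaking rule. The function names are hypothetical.

```python
def count_breakpoints(perm):
    ext = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)

def has_decreasing_strip(perm):
    # Assumed convention: an isolated element (breakpoints on both sides)
    # counts as a decreasing strip; strips with 0 or n+1 are increasing.
    ext = [0] + list(perm) + [len(perm) + 1]
    for p in range(1, len(perm) + 1):
        in_decreasing_run = ext[p - 1] == ext[p] + 1 or ext[p + 1] == ext[p] - 1
        isolated = abs(ext[p] - ext[p - 1]) != 1 and abs(ext[p + 1] - ext[p]) != 1
        if in_decreasing_run or isolated:
            return True
    return False

def ks_sort(perm):
    """Greedy sorting by reversals; returns the reversals as 1-based (i, j)."""
    perm, steps = list(perm), []
    while count_breakpoints(perm) > 0:
        current = count_breakpoints(perm)
        best = None
        for i in range(len(perm)):
            for j in range(i + 1, len(perm)):
                cand = perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]
                # Prefer more removed breakpoints, then a remaining decreasing strip.
                key = (current - count_breakpoints(cand), has_decreasing_strip(cand))
                if best is None or key > best[0]:
                    best = (key, (i + 1, j + 1), cand)
        steps.append(best[1])
        perm = best[2]
    return steps

pi = [3, 5, 6, 7, 2, 1, 4, 8]
steps = ks_sort(pi)
print(count_breakpoints(pi), len(steps))  # 5 3
```

For this example the greedy scenario uses 3 reversals, which meets the lower bound b(π)/2 = 2.5 rounded up.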

33 BP93 – Breakpoint Graph Vertices: elements of π (plus 0 (L) and n+1 (R)). [Figure: the diagram of reality and desire.]

34 Construction of a diagram of reality and desire [Figure: the permutation 3 2 1 4 5, framed by L and R, is expanded into the vertex row L -3 +3 +2 -2 +1 -4 +4 +5 -5 R; reality edges are drawn on this row, and desire edges are drawn according to the target order 1 2 3 4 5.]

35 [Figure: the diagram of reality and desire on the vertices L -3 +3 +2 -2 +1 -4 +4 +5 -5 R.]

36 Effect of reversals on c(π) c(π) = number of cycles in a maximum cycle decomposition. Observation: reversals affect c(π). Example: {L [+1 -1] -2 +2 +3 -3 R} - removes 2 breakpoints and 1 cycle. [Figure: diagrams on the vertices L +1 -2 +2 +3 -3 R.]

37 d(π) ≥ b(π) - c(π) Cycles of length 4 are eliminated by 2-reversals. Let c_4(π) = number of 4-cycles. (c(π) - c_4(π)): cycles of length > 4 include at least three breakpoints. d(π) ≥ b(π) - c_4(π) - (c(π) - c_4(π)) / 3
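For signed permutations the cycle count is easy to compute, because every vertex of the diagram has exactly one reality edge and one desire edge, so the decomposition into alternating cycles is unique. The sketch below rests on assumptions: it uses the common integer encoding +x → (2x-1, 2x), -x → (2x, 2x-1) instead of the -x/+x labels of the slides, counts adjacencies as trivial cycles, and reports the lower bound in the form n + 1 - c, which equals b - c when only breakpoint cycles are counted. The function name `count_cycles` is hypothetical.

```python
def count_cycles(signed_perm):
    """Number of alternating cycles in the breakpoint graph of a signed permutation."""
    n = len(signed_perm)
    seq = [0]
    for g in signed_perm:
        # +x becomes the pair (2x-1, 2x); -x becomes (2x, 2x-1).
        seq += [2 * g - 1, 2 * g] if g > 0 else [-2 * g, -2 * g - 1]
    seq.append(2 * n + 1)

    # Reality edges join positions 2k and 2k+1 of the unsigned sequence.
    reality = {}
    for k in range(0, 2 * n + 2, 2):
        a, b = seq[k], seq[k + 1]
        reality[a], reality[b] = b, a

    # Desire edges join the values 2k and 2k+1.
    def desire(v):
        return v + 1 if v % 2 == 0 else v - 1

    seen, cycles = set(), 0
    for start in seq:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:          # alternate reality and desire edges
            seen.add(v)
            w = reality[v]
            seen.add(w)
            v = desire(w)
    return cycles

pi = [3, -2, -1, 4, -5]
c = count_cycles(pi)
print(c, len(pi) + 1 - c)  # 3 3: three cycles, so at least 3 reversals
```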

38 Algorithm BP(π)
while π contains a breakpoint
    if π has no decreasing strips
        if a 4-cycle C remains
            find a cycle C' that crosses C; 0-reversal on C', 2-reversal on C
        else
            regular 0-reversal
    else
        regular greedy choice
Algorithm BP produces a solution that is at most 3/2 × optimal.

39 HP95 – Hurdles and Fortresses [Figure: interleaving graph with components A, B, C, D, E, F.]

40 Hurdles A hurdle is a bad component that does not separate any other two bad components. Separation is an important concept, in that a reversal through reality edges in different components A and C will result in every component B that separates A and C being twisted. A bad component becomes good when twisted. Bad components are either non-hurdles or hurdles; hurdles are either simple hurdles or super hurdles. [Figure: components A, B, C, D, E, F.]

41 Fortress A permutation π is called a fortress (f(π) = 1) when its reality and desire diagram contains an odd number of hurdles and all of them are super hurdles. Fortresses are permutations that require one extra reversal to sort, due to their special structure. [Figure: a smallest possible fortress.]

42 Algorithm HP(π)
if there is a good component in RD(π) then
    pick two divergent edges e, f in this component, making sure the corresponding reversal does not create any bad components
    return the reversal characterized by e and f
else if h(π) is even then
    return merging of two opposite hurdles
else if h(π) is odd and there is a simple hurdle then
    return a reversal cutting this hurdle
else // fortress
    return merging of any two hurdles
d(π) = b(π) - c(π) + h(π) + f(π), where h(π) is the number of hurdles and f(π) is 1 if π is a fortress and 0 otherwise.

43 Hurdles (a) Unoriented component U separates U' and U'' by virtue of the edge (0, 1). (b) Hurdle U does not separate U' and U''. Hannenhalli, Pevzner, Journal of the ACM 46, 1 (1999)

44 Effects of reversals on cycles A reversal on a cycle C (i) deletes vertex C from the interleaving graph; (ii) changes the orientation of vertices in V(C); (iii) complements the subgraph induced by V(C). Hannenhalli, Pevzner, Journal of the ACM 46, 1 (1999)

45 Merging hurdles Hannenhalli, Pevzner, Journal of the ACM 46, 1 (1999)

46 Hannenhalli-Pevzner algorithm Hannenhalli, Pevzner, Journal of the ACM 46, 1 (1999)

47 Improvements of Hannenhalli-Pevzner algorithm Several websites offer programs to sort permutations by reversals. At their root is the Hannenhalli-Pevzner algorithm for sorting signed permutations by reversals. Successive authors improved the algorithm. With the Hannenhalli & Pevzner algorithm, the distance computation is performed in time O(n^4). Improvements in the algorithm developed by Haim Kaplan, Ron Shamir and Robert E. Tarjan bring the time to compute the distance down to O(n^2). GRAPPA is written by a multitude of authors. It reduces the distance computation time to O(n) using improvements by David A. Bader, Bernard M.E. Moret and Mi Yan. The main purpose of GRAPPA is to construct phylogenetic trees for multiple signed unichromosomal genomes; the distance computation on which we focus here is a mere subroutine in that context.

48 Algorithm by Kaplan, Shamir, Tarjan The algorithm has three main stages: 1. Pre-process the permutation. This pre-processing contains 3 sub-stages: 1a. Unsign the permutation, e.g., π will be unsigned to the permutation 0, (7,8), (4,3), (1,2), (5,6), (12,11), (9,10), 13, which corresponds to the signed permutation (+4, -2, +1, +3, -6, +5) under the usual mapping +x → (2x-1, 2x), -x → (2x, 2x-1). 1b. Define the overlap graph of the permutation. 1c. Find the connected components of the overlap graph. 2. Clear the hurdles. A hurdle is a problematic connected component of the overlap graph. In this stage each reversal merges two hurdles in distinct connected components into one non-hurdle component. 3. Generate a sequence of safe reversals. A safe reversal is defined as a reversal that reduces b - c (the number of breakpoints minus the number of cycles) without creating new hurdles.
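Stage 1a can be sketched in a few lines. The input permutation (+4, -2, +1, +3, -6, +5) is inferred here from the unsigned example on the slide and is not given explicitly in the source; the mapping +x → (2x-1, 2x), -x → (2x, 2x-1) is the common convention and is assumed. The function name `unsign` is hypothetical.

```python
def unsign(signed_perm):
    """Map +x to (2x-1, 2x) and -x to (2x, 2x-1), framed by 0 and 2n+1."""
    out = [0]
    for g in signed_perm:
        x = abs(g)
        out += [2 * x - 1, 2 * x] if g > 0 else [2 * x, 2 * x - 1]
    out.append(2 * len(signed_perm) + 1)
    return out

print(unsign([4, -2, 1, 3, -6, 5]))
# [0, 7, 8, 4, 3, 1, 2, 5, 6, 12, 11, 9, 10, 13]
```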

49 Multichromosomal genomes: more tricky Word problems and insertions/deletions: So far we did not consider "word problems", in which some genes are repeated (e.g. 1 2 -1 3 4), nor did we allow gaps in the numbering, as may arise from insertion/deletion (e.g. 1 3 -9 -7 5). We distinguish between microrearrangements (e.g. intrachromosomal rearrangements with a span < 1 Mb) and macrorearrangements (e.g. intrachromosomal rearrangements of larger span as well as interchromosomal rearrangements). The existing rearrangement algorithms do not distinguish between these two types of rearrangements. The first step is therefore to identify conserved synteny blocks (segments that can be converted into conserved segments by microrearrangements).

50 Genome Rearrangements: Synteny (a) Human and mouse synteny blocks of conserved gene order. Every block corresponds to a rectangle, with a diagonal showing whether the arrangements of anchors in human and mouse (within the synteny block) are the same or reversed. (b) Combining anchors into clusters by the GRIMM-Synteny algorithm at G = 100 kb. The edges in the anchor graph connect the closest ends of the anchors. The anchors are color-coded by the resulting clusters. At G = 1 Mb, this forms a single cluster, which in turn forms a synteny block (the lower right block in the human 18/mouse 17 rectangle in a). Pevzner, Tesler, Genome Res 13, 37 (2003)

51 From Anchors to Breakpoint Graphs X-chromosome: from local similarities, to synteny blocks, to breakpoint graph, to rearrangement scenario. (a) Dot-plot of anchors. Anchors are enlarged for visibility. (b) Clusters of anchors. (c) Rectified clusters. (d) Synteny blocks. (e) Synteny blocks (symbolic representation as genome rearrangement units). (f) 2D breakpoint graph superimposed on synteny blocks. The projections of the 2D graph onto the human and mouse axes form the conventional breakpoint graphs. (g) 2D breakpoint graph. The four cycles in the breakpoint graph are shown by different colors. (h) A most parsimonious rearrangement scenario for human and mouse X-chromosomes. Pevzner, Tesler, Genome Res 13, 37 (2003)

52 Genome Rearrangements Construction of the breakpoint graph from synteny blocks. (a) Solid path through human. (b) Dotted path through mouse. (c) Superposition of paths. (d) Remove blocks to obtain cycles. Pevzner, Tesler, Genome Res 13, 37 (2003)

53 Multichromosomal breakpoint graph Multichromosomal breakpoint graph of the whole human and mouse genomes. The conventional chromosome order and orientation are not suitable for such graphs; an optimal chromosome order and orientation were determined by the algorithm in Tesler (2002). Three "null chromosomes", N1, N2, N3, were added to mouse to equalize the number of chromosomes in the two genomes. Pevzner, Tesler, Genome Res 13, 37 (2003)

54 Summary Computational studies on genome rearrangements assume that evolution took the path of shortest reversal distance. Thanks to algorithmic improvements, the computational cost has been reduced significantly. It may be very advantageous to analyze more than 2 genomes simultaneously.

