Analysis of Sorting by Transpositions based on Algebraic Formalism
Cleber V. G. Mira, João Meidanis
RECOMB 2004
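Genomes as Permutations
● Permutation: a genome is represented as a permutation of its signed elements.
● Genome: a product of two complementary cycles, one for each complementary strand: $(1\ 2\ \ldots\ n)(-n\ \ldots\ -2\ -1)$.
[Figure: the complementary strands, labeled 1, 2, ..., n, and the corresponding complementary cycles]
● Consider the permutation that complements the sign of an element: $(-0\ 0)(1\ -1) \cdots (n\ -n)$.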
Working with Transpositions
● Since we are working with transpositions, we will consider only one of the strands: $\pi = (\pi_1\ \pi_2\ \ldots\ \pi_n)$.
● Sorting by transpositions: $\tau_1 \tau_2 \cdots \tau_k \pi = \sigma$.
● Circular order: $\pi(\pi_i) = \pi_{i+1}$ and $\pi(\pi_n) = \pi_1$.
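Product of Permutations
● Let $\alpha = (1\ 3\ 2\ 5)$ and $\beta = (6\ 4\ 2)$ act on $E = \{1, 2, 3, 4, 5, 6\}$.
● The product $\alpha\beta$ applies $\beta$ first and $\alpha$ second:
  $\beta(1) = 1$, $\alpha(1) = 3$
  $\beta(3) = 3$, $\alpha(3) = 2$
  $\beta(2) = 6$, $\alpha(6) = 6$
  $\beta(6) = 4$, $\alpha(4) = 4$
  $\beta(4) = 2$, $\alpha(2) = 5$
  $\beta(5) = 5$, $\alpha(5) = 1$
● Hence $\alpha\beta = (1\ 3\ 2\ 6\ 4\ 5)$.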
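The element-by-element computation on this slide can be checked mechanically. The sketch below is only an illustration: the function names and the labels `alpha` and `beta` are assumptions, and the first factor, whose cycle notation is missing from the slide, is read off from the listed images as (1 3 2 5). The product applies the right-hand factor first, as the listed images do.

```python
def cycles_to_map(cycles, elements):
    """Turn a list of disjoint cycles into a dict x -> image of x."""
    mapping = {x: x for x in elements}  # elements not listed are fixed points
    for cycle in cycles:
        for pos, x in enumerate(cycle):
            mapping[x] = cycle[(pos + 1) % len(cycle)]
    return mapping


def product(alpha, beta):
    """Return alpha*beta as a dict: beta is applied first, alpha second."""
    return {x: alpha[beta[x]] for x in beta}


def to_cycles(mapping):
    """Write a mapping in cycle notation, omitting fixed points."""
    seen, cycles = set(), []
    for start in sorted(mapping):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = mapping[x]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


E = range(1, 7)
alpha = cycles_to_map([(1, 3, 2, 5)], E)  # inferred from the listed images
beta = cycles_to_map([(6, 4, 2)], E)
print(to_cycles(product(alpha, beta)))  # [(1, 3, 2, 6, 4, 5)]
```

Swapping the arguments of `product` gives a different permutation in general, so the order of the factors matters.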
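Applying a Transposition
● A transposition exchanges the adjacent blocks $\pi_i \ldots \pi_{j-1}$ and $\pi_j \ldots \pi_{k-1}$:
  $(\pi_i\ \pi_j\ \pi_k)(\pi_1 \ldots \pi_i \ldots \pi_{j-1}\ \pi_j \ldots \pi_{k-1}\ \pi_k \ldots \pi_n) = (\pi_1 \ldots \pi_{i-1}\ \pi_j \ldots \pi_{k-1}\ \pi_i \ldots \pi_{j-1}\ \pi_k \ldots \pi_n)$
● In the algebraic approach, a transposition is a 3-cycle: $\tau(i, j, k) = (\pi_i\ \pi_j\ \pi_k)$, and applying it to $\pi$ is the product $\tau(i, j, k)\,\pi$.
● Example: $\tau(1, 4, 5) = (\pi_1\ \pi_4\ \pi_5) = (4\ 1\ 5)$.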
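To see that the 3-cycle product and the block exchange describe the same operation, the two can be computed side by side. This is a minimal sketch under one stated assumption: the genome operand did not survive on this slide, so π = (4 3 2 1 5) is assumed here because it is consistent with τ(1, 4, 5) = (4 1 5), that is, π_1 = 4, π_4 = 1, π_5 = 5. All function names are hypothetical, and indices are 1-based with 1 ≤ i < j < k ≤ n.

```python
def cycle_map(cycle):
    """Map each element of a cycle to its successor (circular order)."""
    return {x: cycle[(p + 1) % len(cycle)] for p, x in enumerate(cycle)}


def block_exchange(pi, i, j, k):
    """Direct form: swap the blocks pi_i..pi_{j-1} and pi_j..pi_{k-1}."""
    p = list(pi)
    return tuple(p[:i - 1] + p[j - 1:k - 1] + p[i - 1:j - 1] + p[k - 1:])


def algebraic(pi, i, j, k):
    """Algebraic form: the product (pi_i pi_j pi_k) * pi, with pi applied first."""
    tau = cycle_map((pi[i - 1], pi[j - 1], pi[k - 1]))
    succ = cycle_map(pi)
    image = {x: tau.get(succ[x], succ[x]) for x in pi}
    # Read the product back as a cycle, starting from the first element of pi.
    out, x = [], pi[0]
    while x not in out:
        out.append(x)
        x = image[x]
    return tuple(out)


def rotate_to_min(cycle):
    """Canonical rotation, so two writings of the same cycle compare equal."""
    m = cycle.index(min(cycle))
    return cycle[m:] + cycle[:m]


pi = (4, 3, 2, 1, 5)  # assumed genome, not given on the slide
print(rotate_to_min(block_exchange(pi, 1, 4, 5)))  # (1, 4, 3, 2, 5)
print(rotate_to_min(algebraic(pi, 1, 4, 5)))       # (1, 4, 3, 2, 5)
```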
2-cycle Decomposition
● Every permutation has a 2-cycle decomposition, for example $\pi = (4\ 3\ 2\ 1\ 5) = (4\ 3)(3\ 2)(2\ 1)(1\ 5)$.
● Cycles of odd length have an even number of 2-cycles in their 2-cycle decompositions.
● The norm of $\pi$ is the minimum number of 2-cycles in a 2-cycle decomposition of $\pi$.
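A 2-cycle decomposition of a single cycle can be written down directly as (a1 a2)(a2 a3)...(a_{k-1} a_k) and checked by multiplying the factors back together. The sketch below uses hypothetical function names and the slide's product (4 3)(3 2)(2 1)(1 5), with the rightmost factor applied first. It relies on the standard fact, not stated on the slide, that a cycle of length k cannot be written with fewer than k - 1 2-cycles.

```python
def two_cycle_decomposition(cycle):
    """Write (a1 a2 ... ak) as (a1 a2)(a2 a3)...(a_{k-1} a_k)."""
    return [(cycle[p], cycle[p + 1]) for p in range(len(cycle) - 1)]


def multiply(cycles):
    """Product of a list of cycles, rightmost factor applied first."""
    elements = sorted({x for c in cycles for x in c})
    image = {}
    for x in elements:
        y = x
        for c in reversed(cycles):
            if y in c:
                y = c[(c.index(y) + 1) % len(c)]
        image[x] = y
    return image


pi = (4, 3, 2, 1, 5)
factors = two_cycle_decomposition(pi)
print(factors)                              # [(4, 3), (3, 2), (2, 1), (1, 5)]
print(multiply(factors) == multiply([pi]))  # True
print(len(factors))                         # 4, the norm of this 5-cycle
```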
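3-cycle Decompositions
● Permutations whose norm is even can be written as a product of 3-cycles, so they have a minimum 3-cycle decomposition.
● The 3-norm of $\pi$, written $\|\pi\|_3$, is the minimum number of 3-cycles in a 3-cycle decomposition of $\pi$.
● Example: $\pi = (0\ 3\ 4\ 6\ 2\ 7\ 1\ 5\ 8) = (0\ 3\ 4)(4\ 6\ 2)(2\ 7\ 1)(1\ 5\ 8)$, so $\|\pi\|_3 = 4$.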
Building a 3-cycle Decomposition
● A 3-cycle decomposition of $\pi$ can be obtained from its 2-cycle decomposition by merging consecutive pairs of 2-cycles:
  $\pi = (0\ 3\ 4\ 6\ 2\ 7\ 1\ 5\ 8) = (0\ 3)(3\ 4)(4\ 6)(6\ 2)(2\ 7)(7\ 1)(1\ 5)(5\ 8)$
  $(0\ 3)(3\ 4) = (0\ 3\ 4)$
  $(4\ 6)(6\ 2) = (4\ 6\ 2)$
  $(2\ 7)(7\ 1) = (2\ 7\ 1)$
  $(1\ 5)(5\ 8) = (1\ 5\ 8)$
● Hence $\pi = (0\ 3\ 4)(4\ 6\ 2)(2\ 7\ 1)(1\ 5\ 8)$.
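The pairing step (a b)(b c) = (a b c) is easy to automate for a single cycle of odd length, which is the case shown on this slide. The sketch below is restricted to that case by assumption; permutations with even-length cycles need a more careful pairing that is not attempted here. Function names are hypothetical.

```python
def three_cycle_decomposition(cycle):
    """Pair up the 2-cycles of an odd-length cycle: (a b)(b c) = (a b c)."""
    if len(cycle) % 2 == 0:
        raise ValueError("this sketch handles a single cycle of odd length only")
    twos = [(cycle[p], cycle[p + 1]) for p in range(len(cycle) - 1)]
    return [(twos[q][0], twos[q][1], twos[q + 1][1])
            for q in range(0, len(twos), 2)]


def multiply(cycles):
    """Product of a list of cycles, rightmost factor applied first."""
    elements = sorted({x for c in cycles for x in c})
    image = {}
    for x in elements:
        y = x
        for c in reversed(cycles):
            if y in c:
                y = c[(c.index(y) + 1) % len(c)]
        image[x] = y
    return image


pi = (0, 3, 4, 6, 2, 7, 1, 5, 8)
threes = three_cycle_decomposition(pi)
print(threes)  # [(0, 3, 4), (4, 6, 2), (2, 7, 1), (1, 5, 8)]
print(multiply(threes) == multiply([pi]))  # True
```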
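Lower Bound
● The 3-norm of $\sigma\pi^{-1}$ is a lower bound for the transposition distance from $\pi$ to $\sigma$.
● Each transposition is a 3-cycle. If $\tau_1 \tau_2 \cdots \tau_k \pi = \sigma$, then $\tau_1 \tau_2 \cdots \tau_k = \sigma\pi^{-1}$, so $k \geq \|\sigma\pi^{-1}\|_3$.
● Therefore $D_t(\pi, \sigma) \geq \|\sigma\pi^{-1}\|_3$.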
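The bound can be evaluated numerically once the 3-norm is computable. The sketch below rests on an assumption that is not stated on the slides: for an even permutation on n elements, the minimum number of 3-cycles equals (n - c_odd) / 2, where c_odd counts the cycles of odd length, fixed points included. The two genomes in the example and all function names are hypothetical.

```python
def cycle_map(cycle):
    """Map each element of a cycle to its successor (circular order)."""
    return {x: cycle[(p + 1) % len(cycle)] for p, x in enumerate(cycle)}


def cycle_lengths(mapping):
    """Lengths of all cycles of a mapping, fixed points included."""
    seen, lengths = set(), []
    for start in mapping:
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            length += 1
            x = mapping[x]
        lengths.append(length)
    return lengths


def three_norm(mapping):
    """Assumed formula: (n - number of odd-length cycles) / 2, even permutations only."""
    odd = sum(1 for length in cycle_lengths(mapping) if length % 2 == 1)
    return (len(mapping) - odd) // 2


def lower_bound(pi, sigma):
    """3-norm of sigma * pi^{-1} for two genomes on the same elements."""
    pi_map, sigma_map = cycle_map(pi), cycle_map(sigma)
    pi_inv = {y: x for x, y in pi_map.items()}
    quotient = {x: sigma_map[pi_inv[x]] for x in pi_inv}  # pi^{-1} first, then sigma
    return three_norm(quotient)


# Hypothetical example genomes, chosen only for illustration.
print(lower_bound((4, 3, 2, 1, 5), (1, 2, 3, 4, 5)))  # 2
```

Here σπ^{-1} is the 5-cycle (1 3 5 2 4), so at least two transpositions are needed.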
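Splits
● A split is a transposition, written as a 3-cycle, that is not applicable to the genome $\pi$, i.e. the product of this 3-cycle and the genome is not a genome.
● Example: $(1\ 2\ 3)$ is not applicable to $(0\ 3\ 4\ 6\ 2\ 7\ 1\ 5\ 8)$, since $(1\ 2\ 3)(0\ 3\ 4\ 6\ 2\ 7\ 1\ 5\ 8) = (0\ 1\ 5\ 8)(2\ 7)(3\ 4\ 6)$. The result has three cycles, so it is not a genome.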
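Whether a 3-cycle is a split can be tested by multiplying it into the genome and counting the cycles of the result. The sketch below assumes the genome in the example is (0 3 4 6 2 7 1 5 8), the permutation decomposed on the 3-cycle slides. With that choice the product reproduces the factors (2 7)(3 4 6) shown on this slide. Function names are hypothetical.

```python
def cycle_map(cycle):
    """Map each element of a cycle to its successor (circular order)."""
    return {x: cycle[(p + 1) % len(cycle)] for p, x in enumerate(cycle)}


def apply_three_cycle(three_cycle, genome):
    """Product three_cycle * genome as a dict, with the genome applied first."""
    tau, succ = cycle_map(three_cycle), cycle_map(genome)
    return {x: tau.get(succ[x], succ[x]) for x in genome}


def cycles_of(mapping):
    """All cycles of a mapping, fixed points included, smallest start first."""
    seen, cycles = set(), []
    for start in sorted(mapping):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = mapping[x]
        cycles.append(tuple(cycle))
    return cycles


def is_split(three_cycle, genome):
    """True when the product is not a single cycle, i.e. not a genome."""
    return len(cycles_of(apply_three_cycle(three_cycle, genome))) != 1


genome = (0, 3, 4, 6, 2, 7, 1, 5, 8)  # assumed operand, see the lead-in
print(cycles_of(apply_three_cycle((1, 2, 3), genome)))
# [(0, 1, 5, 8), (2, 7), (3, 4, 6)]
print(is_split((1, 2, 3), genome))  # True
print(is_split((0, 4, 2), genome))  # False: (pi_1 pi_3 pi_5) is applicable
```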
Split+Transposition Distance
● If we consider the problem of sorting genomes by both splits and transpositions, then the split+transposition distance from a genome $\pi$ to $\sigma$ is $D_{st}(\pi, \sigma) = \|\sigma\pi^{-1}\|_3$.