Genome Rearrangements
Anne Bergeron, Comparative Genomics Laboratory
Université du Québec à Montréal

Belle marquise, vos beaux yeux me font mourir d'amour.
Vos yeux beaux d'amour me font, belle marquise, mourir.
Me font vos beaux yeux mourir, belle marquise, d'amour.
1. General introduction to genome rearrangements
   Examples of rearranged genomes
2. Measures of distance
   Rearrangement operations
   The Hannenhalli-Pevzner distance equation
3. A unifying view of genome rearrangements
   The Double-Cut-and-Join operation
   The adjacency graph and the distance equation
Example of rearranged genomes: Mitochondrial Genomes (Bombyx mori, Homo sapiens)
Mitochondria are small, oval-shaped organelles surrounded by two highly specialized membranes. Animal mitochondrial genomes are normally circular, ~16 kb in length, and encode 13 proteins, 22 tRNAs and 2 rRNAs.
Here is an alignment of the cytochrome c oxidase I of Homo sapiens (first row of each block) and Bombyx mori (second row). The third row of each block marks conserved positions.

RWLFSTNHKDIGTLYLLFGAWAGVLGTALSLLIRAELGQPGNLLGNDHIYNVIVTAHAFVMIFFMVMPIMIGGFGNWLVPLMIGAPDMAFPRMNNM
KWIYSTNHKDIGTLYFIFGIWSGMIGTSLSLLIRAELGNPGSLIGDDQIYNTIVTAHAFIMIFFMVMPIMIGGFGNWLVPLMLGAPDMAFPRMNNM
:*::***********::** *:*::**:**********:**.*:*:*:***.*******:**********************:*************

SFWLLPPSLLLLLASAMVEAGAGTGWTVYPPLAGNYSHPGASVDLTIFSLHLAGVSSILGAINFITTIINMKPPAMTQYQTPLFVWSVLITAVLLLLSLP
SFWLLPPSLMLLISSSIVENGAGTGWTVYPPLSSNIAHSGSSVDLAIFSLHLAGISSIMGAINFITTMINMRLNNMSFDQLPLFVWAVGITAFLLLLSLP
*********:**::*::** ************:.* :*.*:****:********:***:********:***: *: * *****:* ***.*******

VLAAGITMLLTDRNLNTTFFDPAGGGDPILYQHLFWFFGHPEVYILILPGFGMISHIVTYYSGKKEPFGYMGMVWAMMSIGFLGFIVWAHHMFTVGMDVD
VLAGAITMLLTDRNLNTSFFDPAGGGDPILYQHLFWFFGHPEVYILILPGFGMISHIISQESGKKETFGCLGMIYAMLAIGLLGFIVWAHHMFTVGMDID
***..************:***************************************:: *****.** :**::**::**:****************:*

TRAYFTSATMIIAIPTGVKVFSWLATLHGSNMKWSAAVLWALGFIFLFTVGGLTGIVLANSSLDIVLHDTYYVVAHFHYVLSMGAVFAIMGGFIHWFPLF
TRAYFTSATMIIAVPTGIKIFSWLATMHGTQINYNPNILWSLGFVFLFTVGGLTGVILANSSIDITLHDTYYVVAHFHYVLSMGAVFAIIGGFINWYPLF
*************:***:*:******:**:::::.. :**:***:**********::*****:**.***********************:****:*:***

SGYTLDQTYAKIHFTIMFIGVNLTFFPQHFLGLSGMPRRYSDYPDAYTTWNILSSVGSFISLTAVMLMIFMIWEAFASKRKVLMVEEPSMNLE
TGLSLNSYMLKIQFFTMFIGVNMTFFPQHFLGLAGMPRRYSDYPDSYISWNMISSLGSYISLLSVMMMLIIIWESMINQRINLFSLNLPSSIE
:* :*:. **:* ******:**********:***********:* :**::**:**:*** :**:*:::***::.:* *: :..:*
73% identity over more than 500 amino acids.
Example of rearranged genomes: Mitochondrial Genomes
[Pictures: a lowly worm; Charles Darwin]
The 37 genes of animal mitochondria are highly conserved, but the order of the genes differs from species to species.
Example of rearranged genomes: Mitochondrial Genomes
Homo sapiens mitochondrial genome (proteins and rRNAs): COX1 COX2 ATP6 ATP8 COX3 ND3 ND2 ND4L ND4 ND5 CYTB RNS RNL ND1 ND6
Bombyx mori mitochondrial genome (proteins and rRNAs): COX1 COX2 ATP6 ATP8 COX3 ND3 ND2 ND6 CYTB ND5 ND4 ND4L RNS RNL ND1
[Figure: the two gene orders, with the invariant parts highlighted]
COX1 stands for the gene cytochrome c oxidase I.
Example of rearranged genomes: Mitochondrial Genomes
The modified parts
[Figure: Homo sapiens and Bombyx mori mitochondrial gene orders (proteins and rRNAs); genes ND4, ND5, ND6, RNS, RNL and ND1 are labeled separately]
Example of rearranged genomes: Mitochondrial Genomes of 6 Arthropoda
[Figure: gene orders of fruit fly, mosquito, silkworm, locust, tick and centipede]
Identical ‘runs’ of genes have been grouped.
Example of rearranged genomes: mammal X chromosomes
Sixteen large synteny blocks are ordered differently in the X chromosomes of the human, mouse and rat. Blocks have similar gene content and order. Note that the estimated number of genes in the X chromosome is 2000.
(Art work by Guillaume Bourque, scientific work by Guillaume Bourque, Pavel Pevzner and Glenn Tesler, 2004)
Problem: Given two or more genomes, how do we measure their similarity and/or distance with respect to gene order and gene content?
Sub-problem: How do we know that two genes or blocks are the "same" in two different species?
Rearrangement operations
Rearrangement operations affect gene order and gene content. There are various types:
   Inversions
   Transpositions
   Reverse transpositions
   Translocations, fusions and fissions
   Duplications and losses
   Others...
Any set of operations yields a distance between genomes, by counting the minimum number of operations needed to transform one genome into the other.
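The definition of distance as a minimum number of operations can be illustrated by brute force. The sketch below assumes a genome is a single linear chromosome written as a tuple of signed integers (sign = orientation) and uses inversions as the only operation. The function names are illustrative, and the search is exponential, so it is only usable on very small genomes.

```python
from collections import deque


def inversions(genome):
    """Yield every genome reachable from `genome` by one inversion."""
    n = len(genome)
    for i in range(n):
        for j in range(i, n):
            # Reverse the segment genome[i..j] and flip the sign of each gene.
            middle = tuple(-g for g in reversed(genome[i:j + 1]))
            yield genome[:i] + middle + genome[j + 1:]


def rearrangement_distance(source, target, neighbors=inversions):
    """Minimum number of operations from `source` to `target` (breadth-first search)."""
    source, target = tuple(source), tuple(target)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        genome, steps = queue.popleft()
        if genome == target:
            return steps
        for nxt in neighbors(genome):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    return None  # target not reachable with this set of operations


print(rearrangement_distance((2, 1), (1, 2)))  # → 3
```

Passing a different `neighbors` generator (for example one that enumerates transpositions) changes the set of operations, and therefore the distance.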
Rearrangement operations Inversions
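As a minimal sketch, assuming a chromosome is a Python list of signed integers (the sign gives the orientation of the gene), an inversion reverses a segment and flips the orientation of every gene in it. The function name and index convention are illustrative.

```python
def invert(genome, i, j):
    """Invert genes i..j (inclusive): reverse their order and flip each orientation."""
    segment = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + segment + genome[j + 1:]


print(invert([1, 2, 3, 4, 5], 1, 3))  # → [1, -4, -3, -2, 5]
```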
Example: Mitochondrial Genomes of 6 Arthropoda
[Figure: gene orders of fruit fly, mosquito, silkworm, locust, tick and centipede]
An inversion.
Rearrangement operations Transpositions
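A transposition moves a block of genes to another position without changing orientations, which amounts to exchanging two adjacent blocks. The sketch below assumes a chromosome is a list of signed integers. The name `transpose` and the half-open index convention are illustrative choices.

```python
def transpose(genome, i, j, k):
    """Exchange the adjacent blocks genome[i:j] and genome[j:k]."""
    if not 0 <= i < j < k <= len(genome):
        raise ValueError("need 0 <= i < j < k <= len(genome)")
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


print(transpose([1, 2, 3, 4, 5, 6], 1, 3, 5))  # → [1, 4, 5, 2, 3, 6]
```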
Example: Mitochondrial Genomes of 6 Arthropoda
[Figure: gene orders of fruit fly, mosquito, silkworm, locust, tick and centipede]
A transposition.
Rearrangement operations Reverse transpositions
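A reverse transposition moves a block and inverts it at the same time. A minimal sketch, again assuming a chromosome is a list of signed integers, with an illustrative function name and half-open indices:

```python
def reverse_transpose(genome, i, j, k):
    """Move the block genome[i:j] to just after genome[j:k], inverting it on the way."""
    if not 0 <= i < j < k <= len(genome):
        raise ValueError("need 0 <= i < j < k <= len(genome)")
    moved = [-g for g in reversed(genome[i:j])]
    return genome[:i] + genome[j:k] + moved + genome[k:]


print(reverse_transpose([1, 2, 3, 4, 5, 6], 1, 3, 5))  # → [1, 4, 5, -3, -2, 6]
```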
Example: Mitochondrial Genomes of 6 Arthropoda
[Figure: gene orders of fruit fly, mosquito, silkworm, locust, tick and centipede]
A reverse transposition.
Rearrangement operations Translocations, fusions and fissions
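These three operations act on whole chromosomes rather than on segments of one chromosome. The sketch below assumes linear chromosomes stored as lists of signed integers. The function names and cut-position arguments are illustrative, and only one of the possible ways to rejoin the pieces is shown for each operation.

```python
def translocate(chrom_a, chrom_b, i, j):
    """Reciprocal translocation: cut a at position i and b at position j, swap the tails."""
    return chrom_a[:i] + chrom_b[j:], chrom_b[:j] + chrom_a[i:]


def fuse(chrom_a, chrom_b):
    """Fusion: join two chromosomes end to end into one."""
    return chrom_a + chrom_b


def fission(chrom, i):
    """Fission: split one chromosome into two at position i."""
    return chrom[:i], chrom[i:]


print(translocate([1, 2, 3], [4, 5, 6], 1, 2))  # → ([1, 6], [4, 5, 2, 3])
```

In this encoding, fission undoes fusion: `fuse(*fission(c, i))` returns `c`.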
From 24 chromosomes to 21 chromosomes
[Source: Linda Ashworth, LLNL] DOE Human Genome Program Report
The Hannenhalli-Pevzner distance equation
In 1995, Hannenhalli and Pevzner found a formula to compute the minimum number of inversions, translocations, fusions or fissions necessary to transform one multichromosomal genome into another.
Sketch of the approach:
   Cap the chromosomes
   Concatenate all the chromosomes
   Sort the resulting genome by inversions
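The first two steps of the sketch can be pictured as a simple data transformation. The code below is only an illustration under stated assumptions: genes are numbered 1..n, caps are extra markers numbered from n+1, and the chromosomes are concatenated in the order given. The full method has to choose the capping and the concatenation carefully, and that choice is not modeled here. The function name is hypothetical.

```python
def cap_and_concatenate(chromosomes):
    """Add a pair of cap markers around each linear chromosome, then join them all.

    Assumes genes are the signed integers 1..n. Caps are numbered n+1, n+2, ...
    This picks one arbitrary capping and concatenation, not an optimal one.
    """
    n = sum(len(chrom) for chrom in chromosomes)
    next_cap = n + 1
    result = []
    for chrom in chromosomes:
        result += [next_cap] + list(chrom) + [next_cap + 1]
        next_cap += 2
    return result


print(cap_and_concatenate([[1, -2], [3]]))  # → [4, 1, -2, 5, 6, 3, 7]
```

The result is a single signed sequence, which is the kind of input a sorting-by-inversions procedure works on.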
The Double-Cut-and-Join operation (Yancopoulos et al. 2005)
Acts on up to 4 gene extremities.
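One way to picture the operation is as a function of two "vertices", each holding up to two gene extremities. The sketch below is an assumed encoding, not taken from the slides: an extremity is a string such as "2t" (tail of gene 2) or "2h" (head of gene 2), an adjacency is a pair of extremities, a telomere is a pair whose second entry is None, and (None, None) stands for "nothing to cut".

```python
def dcj(u, v, cross=False):
    """Cut u = (p, q) and v = (r, s), then rejoin as (p, r), (q, s).

    With cross=True the pieces are rejoined the other way: (p, s), (q, r).
    None marks a missing extremity, so the four extremities may be fewer.
    """
    p, q = u
    r, s = v
    if cross:
        r, s = s, r
    return (p, r), (q, s)


# Inversion of gene 2 in the linear chromosome 1 2 3:
print(dcj(("1h", "2t"), ("2h", "3t")))      # → (('1h', '2h'), ('2t', '3t'))
# Fission: cutting a single adjacency creates two telomeres.
print(dcj(("1h", "2t"), (None, None)))      # → (('1h', None), ('2t', None))
# Fusion: joining two telomeres creates one adjacency.
print(dcj(("1h", None), ("2t", None)))      # → (('1h', '2t'), (None, None))
```

Treating telomeres as adjacencies with a missing extremity lets the same few lines cover inversions, fusions and fissions.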
The Double-Cut-and-Join operation
Linear chromosomes: translocation
The Double-Cut-and-Join operation
Linear and circular chromosomes: fusion, fission, inversion
The Double-Cut-and-Join operation
Circular chromosomes: fusion, fission, inversion
1. General introduction to genome rearrangements Examples of rearranged genomes 2. Measures of distance Rearrangement operations The Hannenhalli-Pevzner distance equation 3. A unifying view of genome rearrangements The Double-Cut-and-Join operation The adjacency graph and the distance equation 4. Breakpoint reuse Breakpoint reuse estimates Minimizing breakpoint reuse
The adjacency graph and the distance equation
[Figure: adjacency graph of Genome A and Genome B]
Joint work with Julia Mixtacki and Jens Stoye
The adjacency graph and the distance equation
   C = number of cycles
   I = number of odd paths
   G = number of “genes”
   D = G - (C + I/2)
For the example genomes A and B: D = 6 - (1 + 2/2) = 4
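The equation D = G - (C + I/2) can be evaluated by walking through the adjacency graph. The sketch below assumes an encoding that is not in the slides: a genome is a list of (genes, is_circular) pairs, genes are signed integers, and each gene occurs exactly once in each genome. An extremity is followed alternately through its adjacency in A and in B. A path that starts at a telomere of one genome and ends at a telomere of the other has an odd number of edges.

```python
def extremities(gene):
    """Return the two extremities of a signed gene, in reading order."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def partner_map(genome):
    """Map each extremity to its neighbour in an adjacency, or None at a telomere."""
    partner = {}
    for genes, is_circular in genome:
        ends = [e for gene in genes for e in extremities(gene)]
        if is_circular:
            ends = ends[1:] + ends[:1]  # the last gene is adjacent to the first
        else:
            partner[ends[0]] = partner[ends[-1]] = None  # two telomeres
            ends = ends[1:-1]
        for p, q in zip(ends[0::2], ends[1::2]):
            partner[p], partner[q] = q, p
    return partner


def dcj_distance(genome_a, genome_b):
    """Evaluate D = G - (C + I/2) on the adjacency graph of the two genomes."""
    sides = (partner_map(genome_a), partner_map(genome_b))
    if sides[0].keys() != sides[1].keys():
        raise ValueError("both genomes must contain the same genes, once each")
    seen = set()
    cycles = odd_paths = 0
    # Paths: start at every telomere and walk until another telomere is reached.
    for side in (0, 1):
        for start, mate in sides[side].items():
            if mate is not None or start in seen:
                continue
            current, other = start, 1 - side
            while True:
                seen.add(current)
                nxt = sides[other][current]
                if nxt is None:
                    break
                current, other = nxt, 1 - other
            if other != side:  # the two ends lie in different genomes: odd path
                odd_paths += 1
    # Cycles: every extremity not on a path lies on exactly one cycle.
    for start in sides[0]:
        if start in seen:
            continue
        cycles += 1
        current, side = start, 0
        while current not in seen:
            seen.add(current)
            current = sides[side][current]
            side = 1 - side
    genes = len(sides[0]) // 2
    return genes - (cycles + odd_paths // 2)


# One inversion separates the linear genomes 1 2 3 and 1 -2 3.
print(dcj_distance([([1, 2, 3], False)], [([1, -2, 3], False)]))  # → 1
```

Here `genes`, `cycles` and `odd_paths` play the roles of G, C and I in the equation above.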