A Unifying View of Genome Rearrangement

Presentation transcript:

A Unifying View of Genome Rearrangement Anne Bergeron, Julia Mixtacki and Jens Stoye Presentation by Colleen Maquiling January 2017

Plan Introduction; Graphs; Genome; Adjacency Graph; Sorting with DCJ; Conclusion

Introduction “Given two genomes A and B, what is the shortest sequence of rearrangement operations that transforms A into B?” [Figure: genomes A and B, with the distance between them counted in operations 1, 2, 3, 4, 5, 6, …]

Introduction Inversions; Translocations; Fissions; Fusions; Transpositions; Double cut and join

Graphs [Figure: cycles and paths; vertices are labeled by their incident edges, e.g. {p} and {p,q} for edges p and q]

Graphs: Double Cut and Join The double cut and join (DCJ) operation acts on two vertices u and v of a graph with vertices of degree one or two in the following ways:

Graphs: DCJ (Paths) Translocation u={p,q} v={r,s}

Graphs: DCJ (Paths) Translocation u={p,q} v={r}

Graphs: DCJ (Paths) Path fusion/fission u={q} w={q,r} v={r}
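The slides show the DCJ cases only as figures. A minimal Python sketch of the operation follows, assuming the replacement rules given in the Bergeron, Mixtacki and Stoye paper: two internal vertices {p,q} and {r,s} become {p,r} and {s,q} (or {p,s} and {q,r}); an internal {p,q} and an external {r} become {p,r} and {q} (or {q,r} and {p}); two external vertices {q} and {r} fuse into {q,r}; and a single internal {q,r} splits into {q} and {r}. The names `fs`, `dcj` and `alternate` are hypothetical, chosen for illustration.

```python
def fs(*labels):
    """Shorthand for a vertex: a frozenset of incident edge labels."""
    return frozenset(labels)


def dcj(u, v=None, alternate=False):
    """Return the vertices that replace u (and v) after one DCJ.

    Vertices are passed as tuples of edge labels: length 2 is an internal
    vertex, length 1 an external one. `alternate` selects the second of the
    two possible ways of rejoining, where two exist.
    """
    if v is None:                       # fission: {q,r} -> {q}, {r}
        q, r = u
        return [fs(q), fs(r)]
    if len(u) == 1 and len(v) == 1:     # fusion: {q}, {r} -> {q,r}
        return [fs(u[0], v[0])]
    if len(u) == 1:                     # make u the internal vertex
        u, v = v, u
    p, q = u
    if len(v) == 2:                     # both vertices internal
        r, s = v
        return [fs(p, s), fs(q, r)] if alternate else [fs(p, r), fs(s, q)]
    (r,) = v                            # u internal, v external
    return [fs(q, r), fs(p)] if alternate else [fs(p, r), fs(q)]
```

For example, `dcj(("p", "q"), ("r", "s"))` gives the vertices {p,r} and {s,q}, and `dcj(("q", "r"))` gives {q} and {r}.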

Graphs: DCJ (Path/Cycle) inversion, integration, excision

Graphs: DCJ (Path/Cycle) inversion, linearization, circularization

Graphs: DCJ (Cycles) inversion, integration, excision

Graphs: Lemma 1 The application of a single DCJ operation changes the number of circular or linear components by at most one.

Genomes A gene is an oriented sequence of DNA that starts with a tail and ends with a head. [Figure: genes a and b with extremities {at}, {ah}, {bt}, {bh}] Adjacencies: {ah, bt}, {ah, bh}, {at, bt}, {at, bh} Telomeres: {ah}, {at}…

Genomes: Graph Representation A = {{at}, {ah, ct}, {ch, dh}, {dt}, {bh, et}, {eh, bt}, {ft}, {fh, gt}, {gh}} 7 genes: a, b, c, d, e, f, g
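As a sketch of how this representation can be read back into chromosomes, the Python below stores a genome as a list of frozensets of extremity names and groups them into linear and circular components. The encoding of extremities as strings such as "at" and "ah" (gene name plus "t" or "h") and the function name `components` are assumptions made for illustration, not part of the slides.

```python
def fs(*extremities):
    return frozenset(extremities)


# Genome A from the slide: adjacencies have two extremities, telomeres one.
A = [fs("at"), fs("ah", "ct"), fs("ch", "dh"), fs("dt"), fs("bh", "et"),
     fs("eh", "bt"), fs("ft"), fs("fh", "gt"), fs("gh")]


def components(genome):
    """Group the genes of a genome into linear and circular chromosomes."""
    where = {ext: v for v in genome for ext in v}   # extremity -> its vertex
    seen, result = set(), []
    for start in genome:
        if start in seen:
            continue
        seen.add(start)
        stack, genes, linear = [start], set(), False
        while stack:
            v = stack.pop()
            linear = linear or len(v) == 1          # a telomere ends a path
            for ext in v:
                gene, end = ext[:-1], ext[-1]
                genes.add(gene)
                # Cross the gene: from its tail to its head or vice versa.
                w = where[gene + ("h" if end == "t" else "t")]
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        result.append(("linear" if linear else "circular", sorted(genes)))
    return result


print(components(A))
```

On genome A this groups genes a, c, d into one linear chromosome, b and e into a circular one, and f, g into a second linear chromosome.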

Genomes: Sorting & Distance Problem Given two genomes A and B that have the same genes (not necessarily in the same order), find a shortest sequence of DCJ operations that transforms A into B. The length of such a sequence is called the DCJ distance between A and B, denoted by dDCJ(A, B).

Genomes: Sorting Example
A = {{at}, {ah, ct}, {ch, dh}, {dt}, {bh, et}, {eh, bt}, {ft}, {fh, gt}, {gh}}
B = {{ah, bt}, {bh, at}, {ct}, {ch, dt}, {dh}, {et}, {eh}, {fh, gt}, {gh, ft}}

Genomes: Sorting Example
A = {{at}, {ah, ct}, {ch, dh}, {dt}, {bh, et}, {eh, bt}, {ft}, {fh, gt}, {gh}}
integration: {{at}, {ah, bt}, {ch, dh}, {dt}, {bh, et}, {eh, ct}, {ft}, {fh, gt}, {gh}}
excision: {{et}, {ah, bt}, {ch, dh}, {dt}, {bh, at}, {eh, ct}, {ft}, {fh, gt}, {gh}}
inversion: {{et}, {ah, bt}, {ch, dt}, {dh}, {bh, at}, {eh, ct}, {ft}, {fh, gt}, {gh}}
path fission: {{et}, {ah, bt}, {ch, dt}, {dh}, {bh, at}, {eh}, {ct}, {ft}, {fh, gt}, {gh}}
circularization: B = {{ah, bt}, {bh, at}, {ct}, {ch, dt}, {dh}, {et}, {eh}, {fh, gt}, {gh, ft}}
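The five steps of this example can be replayed mechanically. The sketch below encodes each step as the vertices it removes and the vertices it adds, checks that every step only regroups the same extremities, and confirms that the result is B. The string encoding of extremities and the names `steps` and `replay` are illustrative assumptions.

```python
def fs(*extremities):
    return frozenset(extremities)


A = {fs("at"), fs("ah", "ct"), fs("ch", "dh"), fs("dt"), fs("bh", "et"),
     fs("eh", "bt"), fs("ft"), fs("fh", "gt"), fs("gh")}
B = {fs("ah", "bt"), fs("bh", "at"), fs("ct"), fs("ch", "dt"), fs("dh"),
     fs("et"), fs("eh"), fs("fh", "gt"), fs("gh", "ft")}

# (name, vertices removed, vertices added) for each DCJ in the example.
steps = [
    ("integration",     [fs("ah", "ct"), fs("eh", "bt")], [fs("ah", "bt"), fs("eh", "ct")]),
    ("excision",        [fs("at"), fs("bh", "et")],       [fs("et"), fs("bh", "at")]),
    ("inversion",       [fs("ch", "dh"), fs("dt")],       [fs("ch", "dt"), fs("dh")]),
    ("path fission",    [fs("eh", "ct")],                 [fs("eh"), fs("ct")]),
    ("circularization", [fs("ft"), fs("gh")],             [fs("gh", "ft")]),
]


def replay(genome, steps):
    """Apply the listed steps in order and return the resulting genome."""
    genome = set(genome)
    for name, old, new in steps:
        # Every removed vertex must currently be present in the genome.
        assert all(v in genome for v in old), name
        # A DCJ only regroups extremities; it never creates or loses one.
        assert frozenset().union(*old) == frozenset().union(*new), name
        genome = (genome - set(old)) | set(new)
    return genome


print(replay(A, steps) == B)
```

The scenario uses five operations, which agrees with the distance of 5 computed later in the slides.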

Adjacency Graph The adjacency graph AG(A,B) is a graph whose vertices are the adjacencies and telomeres of A and B. For each u ∈ A and v ∈ B there are |u ∩ v| edges between u and v.

Adjacency Graph: Example A = {{at}, {ah, ct}, {ch, dh}, {dt}, {bh, et}, {eh, bt}, {ft}, {fh, gt}, {gh}} B = {{ah, bt}, {bh, at}, {ct}, {ch, dt}, {dh}, {et}, {eh}, {fh, gt}, {gh, ft}}
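A minimal sketch of how AG(A,B) can be traversed to count its cycles and odd paths for this example. It relies on the definition above: each extremity contributes exactly one edge, joining the vertex of A that contains it to the vertex of B that contains it. The function name `cycles_and_odd_paths` and the string encoding of extremities are assumptions for illustration.

```python
def fs(*extremities):
    return frozenset(extremities)


A = [fs("at"), fs("ah", "ct"), fs("ch", "dh"), fs("dt"), fs("bh", "et"),
     fs("eh", "bt"), fs("ft"), fs("fh", "gt"), fs("gh")]
B = [fs("ah", "bt"), fs("bh", "at"), fs("ct"), fs("ch", "dt"), fs("dh"),
     fs("et"), fs("eh"), fs("fh", "gt"), fs("gh", "ft")]


def cycles_and_odd_paths(A, B):
    """Return (C, I): the number of cycles and of odd paths in AG(A, B)."""
    in_a = {e: ("A", v) for v in A for e in v}
    in_b = {e: ("B", v) for v in B for e in v}
    seen = set()
    cycles = odd_paths = 0
    for start in [("A", v) for v in A] + [("B", v) for v in B]:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            side, v = stack.pop()
            component.append(v)
            other = in_b if side == "A" else in_a
            for e in v:                      # one edge per shared extremity
                if other[e] not in seen:
                    seen.add(other[e])
                    stack.append(other[e])
        edges = sum(len(v) for v in component) // 2   # degree sum = 2 * edges
        is_path = any(len(v) == 1 for v in component) # telomeres end paths
        if not is_path:
            cycles += 1
        elif edges % 2 == 1:
            odd_paths += 1
    return cycles, odd_paths


C, I = cycles_and_odd_paths(A, B)
print(C, I, 7 - (C + I // 2))
```

For the genomes on this slide the traversal finds one cycle (through {fh, gt}) and two odd paths, so N − (C + I/2) = 7 − (1 + 1) = 5.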

Sorting by DCJ : Lemma 2 Let A and E be two genomes defined on the same set of N genes, then we have A=E if and only if N=C+I/2 where C is the number of cycles and I the number of odd paths in AG(A,E).

Sorting by DCJ : Lemma 2
A = {{at}, {ah, ct}, {ch, dh}, {dt}, {bh, et}, {eh, bt}, {ft}, {fh, gt}, {gh}}
E = {{at}, {ah, ct}, {ch, dh}, {dt}, {bh, et}, {eh, bt}, {ft}, {fh, gt}, {gh}}
N=C+I/2
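As a worked check of the lemma on this example (a sketch, counting directly from the definition of the adjacency graph): when A = E, each of the 5 adjacencies is joined to its copy by two edges and forms a cycle, and each of the 4 telomeres is joined to its copy by one edge and forms a path of length 1, which is odd.

```latex
C = 5, \qquad I = 4, \qquad C + \frac{I}{2} = 5 + \frac{4}{2} = 7 = N
```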

Sorting by DCJ : Lemma 3 The application of a single DCJ operation changes the number of odd paths in the adjacency graph by –2, 0, or 2. [Figure: cases on odd and even paths, including those that change the number of odd paths by –2]

Sorting by DCJ : Lemma 3 [Figure: further cases on odd and even paths, including those that change the number of odd paths by +2]

Sorting by DCJ : Lemma 4 Let A and B be two genomes defined on the same set of N genes, then we have dDCJ(A,B) ≥ N−(C+I/2) where C is the number of cycles and I the number of odd paths in AG(A,B).
Lemma 1: The application of a single DCJ operation changes the number of circular or linear components by at most one.
Lemma 2: Let A and B be two genomes defined on the same set of N genes, then we have A=B if and only if N=C+I/2 where C is the number of cycles and I the number of odd paths in AG(A,B).
Lemma 3: The application of a single DCJ operation changes the number of odd paths in the adjacency graph by –2, 0, or 2.
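One way to read how the three earlier lemmas combine into the bound is sketched below. The symbol k is introduced here only for illustration, and the middle line is an assumption about how Lemmas 1 and 3 are used: that no single operation raises C + I/2 by more than one.

```latex
\begin{align*}
k(X) &:= C + I/2 \ \text{in } AG(X,B), \qquad k(B) = N \quad \text{(Lemma 2)} \\
k(A_{i+1}) &\le k(A_i) + 1 \quad \text{for each DCJ step } A_i \to A_{i+1} \quad \text{(assumed, from Lemmas 1 and 3)} \\
d_{DCJ}(A,B) &\ge k(B) - k(A) = N - (C + I/2)
\end{align*}
```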

Sorting by DCJ : Algorithm
A = {{at}, {ah, ct}, {ch, dh}, {dt}, {bh, et}, {eh, bt}, {ft}, {fh, gt}, {gh}}
B = {{ah, bt}, {bh, at}, {ct}, {ch, dt}, {dh}, {et}, {eh}, {fh, gt}, {gh, ft}}
AG(A,B): 2 odd paths, 1 cycle

Sorting by DCJ : Algorithm (Adjacencies) For the adjacency x={ah, bt} of B: the vertices u={ah, ct} and v={eh, bt} of A are replaced by {ah, bt} and {ct, eh}.

Sorting by DCJ : Algorithm (Telomeres) For the telomere p={ct} of B: the vertex v={ct, eh} of A is replaced by {ct} and {eh}.

Sorting by DCJ : Algorithm Result
A = {{ah, bt}, {bh, at}, {ct}, {ch, dt}, {dh}, {et}, {eh}, {fh, gt}, {gh, ft}}
B = {{ah, bt}, {bh, at}, {ct}, {ch, dt}, {dh}, {et}, {eh}, {fh, gt}, {gh, ft}}
dDCJ(A,B)=N−(C+I/2)=7−(1+2/2)=5
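The two algorithm slides can be sketched in Python as follows. This is an illustrative reading of the procedure, not the presenter's code: first every adjacency {p, q} of B is created by joining the vertices of A that contain p and q, then every telomere {p} of B is freed by splitting the adjacency of A that contains it. The names `dcj_sort` and `find`, and the string encoding of extremities, are assumptions.

```python
def fs(*extremities):
    return frozenset(extremities)


A = [fs("at"), fs("ah", "ct"), fs("ch", "dh"), fs("dt"), fs("bh", "et"),
     fs("eh", "bt"), fs("ft"), fs("fh", "gt"), fs("gh")]
B = [fs("ah", "bt"), fs("bh", "at"), fs("ct"), fs("ch", "dt"), fs("dh"),
     fs("et"), fs("eh"), fs("fh", "gt"), fs("gh", "ft")]


def dcj_sort(A, B):
    """Transform A into B; return the final genome and the operations used."""
    genome = set(A)
    ops = []

    def find(ext):
        # The unique vertex of the current genome containing this extremity.
        return next(v for v in genome if ext in v)

    # Step 1: create each adjacency {p, q} of B.
    for adj in [v for v in B if len(v) == 2]:
        p, q = sorted(adj)
        u, v = find(p), find(q)
        if u != v:
            rest = (u - {p}) | (v - {q})   # leftover extremities are joined
            genome -= {u, v}
            genome.add(adj)
            if rest:                       # empty when u and v were telomeres
                genome.add(rest)
            ops.append((u, v, adj, rest))

    # Step 2: free each telomere {p} of B.
    for tel in [v for v in B if len(v) == 1]:
        (p,) = tel
        u = find(p)
        if len(u) == 2:
            genome.remove(u)
            genome |= {tel, u - {p}}
            ops.append((u, None, tel, u - {p}))

    return genome, ops


final, ops = dcj_sort(A, B)
print(final == set(B), len(ops))
```

On the example genomes this sketch reaches B in five operations, matching dDCJ(A,B) = 5 on the slide. Its first operation replaces {ah, ct} and {eh, bt} by {ah, bt} and {ct, eh}, and its last splits {ct, eh} into {ct} and {eh}, as on the two algorithm slides.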

Sorting by DCJ : Theorem 1 Let A and B be two genomes defined on the same set of N genes, then we have dDCJ(A,B) = N−(C+I/2) where C is the number of cycles and I the number of odd paths in AG(A,B).

Conclusion Transposition dDCJ(A,B) Efficient algorithm