FanChang Hao, Melvin Zhang, and Hon Wai Leong Review for TCBB

Slides:



Advertisements
Similar presentations
Introduction to Algorithms Graph Algorithms
Advertisements

A Simpler 1.5-Approximation Algorithm for Sorting by Transpositions Tzvika Hartman Weizmann Institute.
Sorting by reversals Bogdan Pasaniuc Dept. of Computer Science & Engineering.
School of CSE, Georgia Tech
Greedy Algorithms CS 466 Saurabh Sinha. A greedy approach to the motif finding problem Given t sequences of length n each, to find a profile matrix of.
Lecture 24 Coping with NPC and Unsolvable problems. When a problem is unsolvable, that's generally very bad news: it means there is no general algorithm.
1 Counting Perfect Matchings of a Graph It is hard in general to count the number of perfect matchings in a graph. But for planar graphs it can be done.
Gene an d genome duplication Nadia El-Mabrouk Université de Montréal Canada.
Computability and Complexity 23-1 Computability and Complexity Andrei Bulatov Search and Optimization.
Introduction Sorting permutations with reversals in order to reconstruct evolutionary history of genome Reversal mutations occur often in chromosomes where.
Greedy Algorithms And Genome Rearrangements
Genome Rearrangements CIS 667 April 13, Genome Rearrangements We have seen how differences in genes at the sequence level can be used to infer evolutionary.
Sorting Signed Permutations By Reversals (The Hannenhalli – Pevzner Theory) Seminar in Bioinformatics – ©Shai Lubliner.
Greedy Algorithms Reading Material: Chapter 8 (Except Section 8.5)
Of Mice and Men Learning from genome reversal findings Genome Rearrangements in Mammalian Evolution: Lessons From Human and Mouse Genomes and Transforming.
Point Location Computational Geometry, WS 2007/08 Lecture 5 Prof. Dr. Thomas Ottmann Algorithmen & Datenstrukturen, Institut für Informatik Fakultät für.
Computability and Complexity 24-1 Computability and Complexity Andrei Bulatov Approximation.
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract.
1 Michal Ozery-Flato and Ron Shamir 2 The Genomic Sorting Problem HOW?
Transforming Cabbage into Turnip: Polynomial Algorithm for Sorting Signed Permutations by Reversals Journal of the ACM, vol. 46, No. 1, Jan 1999, pp
Greedy Algorithms Like dynamic programming algorithms, greedy algorithms are usually designed to solve optimization problems Unlike dynamic programming.
Genome Rearrangement SORTING BY REVERSALS Ankur Jain Hoda Mokhtar CS290I – SPRING 2003.
1 Sorting by Transpositions Based on the First Increasing Substring Concept Advisor: Professor R.C.T. Lee Speaker: Ming-Chiang Chen.
Using Homogeneous Weights for Approximating the Partial Cover Problem
Efficient Data Structures and a New Randomized Approach for Sorting Signed Permutations by Reversals Haim Kaplan and Elad Verbin.
7-1 Chapter 7 Genome Rearrangement. 7-2 Background In the late 1980‘s Jeffrey Palmer and colleagues discovered a remarkable and novel pattern of evolutionary.
GRAPH Learning Outcomes Students should be able to:
Genome Rearrangements …and YOU!! Presented by: Kevin Gaittens.
Genome Rearrangements Tseng Chiu Ting Sept. 24, 2004.
1 A Simpler 1.5- Approximation Algorithm for Sorting by Transpositions Combinatorial Pattern Matching (CPM) 2003 Authors: T. Hartman & R. Shamir Speaker:
A Simpler 1.5-Approximation Algorithm for sorting by transposition Tzvika Hartman.
Genome Rearrangements Unoriented Blocks. Quick Review Looking at evolutionary change through reversals Find the shortest possible series of reversals.
Genome Rearrangements [1] Ch Types of Rearrangements Reversal Translocation
Chap. 7 Genome Rearrangements Introduction to Computational Molecular Biology Chap ~
Sorting by Cuts, Joins and Whole Chromosome Duplications
Chap. 7 Genome Rearrangements Introduction to Computational Molecular Biology Chapter 7.1~7.2.4.
Genome Rearrangement By Ghada Badr Part I.
Interchange and Weighted-Interchange Rearrangement Distances in Strings Joint work of: Amihood Amir, Tzvika Hartman, Oren Kapah and Avivit Levy.
CSE 421 Algorithms Richard Anderson Winter 2009 Lecture 5.
Tzvika Hartman Elad Verbin Bar Ilan University Tel Aviv University
Introduction to Algorithms
Amihood Amir, Gary Benson, Avivit Levy, Ely Porat, Uzi Vishne
Main algorithm with recursion: We’ll have a function DFS that initializes, and then calls DFS-Visit, which is a recursive function and does the depth first.
Haim Kaplan and Uri Zwick
Hamiltonian cycle part
Algorithms and Complexity
Greedy (Approximation) Algorithms and Genome Rearrangements
Lecture 3: Genome Rearrangements and Duplications
Lecture 10 Algorithm Analysis
Enumerating Distances Using Spanners of Bounded Degree
Bart M. P. Jansen June 3rd 2016, Algorithms for Optimization Problems
Instructor: Shengyu Zhang
CSCI2950-C Lecture 4 Genome Rearrangements
Greedy Algorithms And Genome Rearrangements
Problem Solving 4.
Richard Anderson Autumn 2016 Lecture 5
A Unifying View of Genome Rearrangement
Richard Anderson Winter 2009 Lecture 6
Algorithms (2IL15) – Lecture 7
Lecture 14 Shortest Path (cont’d) Minimum Spanning Tree
Double Cut and Join with Insertions and Deletions
Greedy Algorithms And Genome Rearrangements
Clustering.
Distance-preserving Subgraphs of Interval Graphs
Lecture 13 Shortest Path (cont’d) Minimum Spanning Tree
Richard Anderson Winter 2019 Lecture 5
JAKUB KOVÁĆ, ROBERT WARREN, MARÍLIA D.V. BRAGA and JENS STOYE
Locality In Distributed Graph Algorithms
Treewidth meets Planarity
Presentation transcript:

FanChang Hao, Melvin Zhang, and Hon Wai Leong Review for TCBB A 2-Approximation Algorithm for Sorting Signed Permutations by Reversals, Transpositions, Trans-Reversals, and Block-Interchanges FanChang Hao, Melvin Zhang, and Hon Wai Leong Review for TCBB Ron Zeira 5.8.15

Outline Introduction Bridge structures Algorithm Motivation Genome rearrangement model Breakpoint graph Bridge structures Algorithm Discussion and summary

Genome rearrangements

Motivation: evolution Human genome project

Computational Genomics Prof. Ron Shamir & Prof. Roded Sharan School of Computer Science, Tel Aviv University

GR distance problem Distance dop(A,B) – minimal number of operations between genomes A and B. Operations: Reversals Translocations Transpositions Others…

Genomes A genome is a signed permutation of genes: Goal: sort A into the identity permutation: Add dummy genes (telomeres): A=( 1 -3 -2 4 6 -5 ) ID=( 1 2 3 4 5 6 ) A=( 0 1 -3 -2 4 6 -5 7 ) ID=( 0 1 2 3 4 5 6 7 )

Reversals A reversal (inversion) rev(i,j) (0≤i<j<n+1) reverses the segment ai+1…aj . rev(1,3):

Transpositions A transposition tp(i,j,k) (0≤i<j<k<n+1) transposes the two segments ai+1…aj and aj+1…ak . tp(4,5,6):

Transreversal A transreversal tr(i,j,k) (0≤i<j<k<n+1) reverses one of the segments ai+1…aj and aj+1…ak then transposes them. A left/right transreversal reverses the left/right segment. right-tr(4,5,6) left-tr(4,5,6)

Block-interchange A block-interchange bi(i,j,k,l) (0≤i<j<k<l<n+1) interchanges the two segments ai+1…aj and ak+1…al . bi(1,3,4,6)

Distance problem Given a genome A find d(A), the minimal number of reversals, transpositions, transreversals and block-interchanges that transform A into ID. A=( 0 1 -3 -2 4 6 -5 7 )

Previous results Hannenhalli and Pevzner (95) – reversal distance is polynomial. Transposition distance is NP-hard. Several 1.5-approximations. He and Chen – 2-approximation for sorting by reversals and block-interchanges. Hartman and Sharan – 1.5-approximation for sorting by transpositions and transreversals.

Breakpoint graph vertices Gene +x  ordered pair of vertices (xt,xh) Gene -x  ordered pair of vertices (xh,xt) Dummy genes: 00h, n+1(n+1)t

Breakpoint graph edges Gray edges (i-1h,it) (dotted round) represent adjacencies in ID. Black edges [R(ai-1),L(ai)] (solid straight) represent adjacencies in A.

Breakpoint graph properties Each vertex has degree 2: 1 black, 1 gray Disjoint cycles: short=2, long>2 An alternating path <v1, vm>=v1,v2,…,vm Gray/black paths.

Reversals Reversal rev(i,j): Cut black edges [R(ai),L(ai+1)],[R(aj),L(aj+1)]. Add black edges [R(ai), R(aj)],[L(ai),L(aj+1)].

Transpositions Transposition tp(i,j,k) : Cut [R(ai),L(ai+1)],[R(aj),L(aj+1)],[R(ak),L(ak+1)] Add [R(ai),L(aj+1)],[R(ak),L(ai+1)],[R(aj),L(ak+1)]

Right transreversals Right transreversal right-tp(i,j,k) : Cut [R(ai),L(ai+1)],[R(aj),L(aj+1)],[R(ak),L(ak+1)] Add [R(ai),R(aK)],[L(aj+1),L(ai+1)],[R(aj),L(ak+1)]

Left transreversals Left transreversal left-tp(i,j,k) : Cut [R(ai),L(ai+1)],[R(aj),L(aj+1)],[R(ak),L(ak+1)] Add [R(ai),L(aj+1)],[L(ak),R(aj)],[L(ai+1),L(ak+1)]

Block-interchanges block-interchange bi(i,j,k,l): Cut [R(ai),L(ai+1)],[R(aj),L(aj+1)],[R(ak),L(ak+1)],[R(al),L(al+1)] Add[R(ai), L(ak+1)],[R(al),L(aj+1)],[R(ak), L(ai+1)],[R(aj), L(al+1)]

Increasing cycles in BP c(A) - #cycles of BG of A. Max for ID: c(ID)=n+1. Let Δ(ρ) - increase of cycles from operation ρ. Sequence of operations ρ1,… ρd.

Idea Search for ops ρ with max Δ(ρ). An operation ρ is proper if it max Δ(ρ): Reversals: Δ(rev)=1. Transpositions: Δ(tp)=2. Transreversals: Δ(tr)=2. Block-interchanges: Δ(bi)=2 (this work). Model proper ops with bridge structures in BG.

More notations loc(u) – location of vertex u in V(GA). Even/odd locations. u<v if loc(u)<loc(v). Black edges [u,v]<[w,x] if u<w. 0 1 2 3 4 5 6 7 8 9 10 11 12 13

More notations Gray edge (u,v) is oriented if |loc(u)-loc(v)| is even and unoriented otherwise. Oriented graph has >= 1 oriented edge. O.w. unoriented. Oriented path/cycle. 0 1 2 3 4 5 6 7 8 9 10 11 12 13

More notations Gray edges (u1,v1),(u2,v2) in cycle C with u1<u2 are crossing if u1<u2<v1<v2 or v1<v2<u1<u2. Similar for crossing gray paths. 0 1 2 3 4 5 6 7 8 9 10 11 12 13

L-bridges Models proper reversals. Black edges [a,b],[c,d] in cycle C, the gray edges (a,c),(b,d) are oriented and crossing. Reversing b…c increases cycles by 1. Generalizes to gray paths.

T-bridges A black edge e=[u,v] is twisted if the two gray edges (u,*), (v,*) are crossing. T-bridge models proper tps and trs with Δ=2. Black edges e1=[a,b], e2=[c,d], e3=[e,f] in cycle C s.t. e1< e2< e3 and e2 is twisted.

tp-T Both gray edges (c,*), (d,*) are unoriented. Corresponds to proper tp of b..c and d..e.

Inaccurate definition T-bridge only for cycle of length 6? T-bridge but not proper tp: A correct definition should consider the orientation of the paths from c/d.

tr-T If the gray edge (d,*) is unoriented, gray edge (c,*) oriented – left-tr If the gray edge (d,*) is oriented, gray edge (c,*) unoriented – right-tr

Proper bi Lemma: Δ(bi)≤2 . Proof: 4 black edges on the same cycle. After bi 4 cycles

X-bridge e1<e2 black edges on C1, f1<f2 black edges on C2. C1 and C2 are interleaving if e1<f1<e2<f2 or f1<e1<f2<e2. X-bridge models proper bi. Four black edges on interleaving cycles define an x-bridge.

Inaccurate definition X-bridge but not proper bi: A correct definition should on unorienetd interleaving cycles.

Lower bound Theorem: Genome A, Proof: Δ(ρ)≤2 for every ρ. Optimal sequence ρ1,… ρd

Upper bound Show that we can always find either an L-,T- or X-bridge in every BG. Classify BG: Oriented graph  has L-bridge. Unoriented with crossing paths  has T-bridge. Unoriented without crossing paths  has x-bridge.

Claim 1 Claim 1: oriented graph  L-bridge Pick an oriented edge and close a cycle with a crossing path.

Claim 2 Claim 2: Unoriented with crossing paths  T-bridge. Pick two crossing paths p1=<u1,v1>, p2=<u2,v2> in a cycle C. Depending on the parity of the u1,u2 contract C into a tp-T-bridge.

Claim 3 Claim 3: Unoriented without crossing paths  X-bridge. Gu et al.: If cycle C has no crossing edges then there is a cycle C’ interleaving with C. Pick leftmost and rightmost black edges in C an C’ to form an X-bridge.

2-approximation Genome Sorting by Bridges (GSB) algorithm iteratively searches for L-,T- or X-bridges. Theorem: A GSB algorithm gives a 2-approximation. Proof: m(A) – #ops done by GSB. Since Δ(ρ)≥1 for every ρ: From previous theorem:

GSB-X Naïve implementation O(n6). In a different paper they show O(n3).

Discussion notes GSB actually does not use transreversals. Not all proper operations are covered by the bridges. Example of proper bi but not x-bridge:

Future directions Analyze running time. Better approximation. Different weights. Multiple gene copies.

Pros Nice result. Extensive rearrangement model. Paper is mostly self contained.

Cons Inaccurate bridge definitions. Many minor comments. A lot of grammar mistakes. Innovation is debatable.