Arc-Segment Alignment for RNA Secondary Structure. Advisor: Chang-Biao Yang (楊昌彪). Student: Yung-Hsing Peng (彭永興).

The Longest Common Subsequence (LCS) Problem
A string: S1 = "TAGTCACG"
A subsequence of S1 is obtained by deleting 0 or more symbols (not necessarily consecutive) from S1, e.g. G, AGC, TATC, AGACG.
Common subsequences of S1 = "TAGTCACG" and S2 = "AGACTGTC": GG, AGC, AGACG
Longest common subsequence (LCS):
S1: TAGTCACG
S2: AGACTGTC
LCS: AGACG

Sequence Alignment
S1 = TAGTCACG
S2 = AGACTGTC
Two possible alignments:
----TAGTCACG
AGACT-GTC---

TAGTCAC-G--
-AG--ACTGTC
Which one is better? Gap penalties can be set as parameters to suit different purposes.
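To make the "which one is better" question concrete, the following minimal sketch scores an alignment column by column. The function name `score_alignment` and the default values for `match`, `mismatch`, and `gap` are assumptions chosen for illustration; the slides do not specify a scoring scheme.

```python
def score_alignment(row1: str, row2: str,
                    match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Score two aligned rows of equal length, where '-' marks a gap."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row1, row2):
        if x == "-" or y == "-":
            total += gap          # a gap in either row costs the gap penalty
        elif x == y:
            total += match        # identical symbols
        else:
            total += mismatch     # two different symbols
    return total


# The two alignments of S1 = TAGTCACG and S2 = AGACTGTC shown above.
first = ("----TAGTCACG", "AGACT-GTC---")
second = ("TAGTCAC-G--", "-AG--ACTGTC")

for gap in (-1, 0):
    print(gap, score_alignment(*first, gap=gap), score_alignment(*second, gap=gap))
```

With a gap penalty of -1 the first alignment scores -4 and the second -1; with free gaps (penalty 0) the scores are simply the match counts, 4 and 5.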

After the dynamic-programming matrix A has been found, we can trace back through it to find the LCS.
S1: TAGTCACG
S2: AGACTGTC
LCS: AGACG
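The fill-and-trace-back procedure can be sketched as follows. This assumes the standard LCS recurrence, with `A[i][j]` holding the LCS length of the first `i` symbols of S1 and the first `j` symbols of S2. The function name `lcs` and the tie-breaking rule in the traceback are illustrative choices, not taken from the slides.

```python
def lcs(s1: str, s2: str) -> str:
    """Return one longest common subsequence of s1 and s2."""
    m, n = len(s1), len(s2)
    # A[i][j] = length of an LCS of s1[:i] and s2[:j]
    A = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                A[i][j] = A[i - 1][j - 1] + 1
            else:
                A[i][j] = max(A[i - 1][j], A[i][j - 1])

    # Trace back from the bottom-right corner of A.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            out.append(s1[i - 1])   # matched symbol belongs to the LCS
            i -= 1
            j -= 1
        elif A[i - 1][j] >= A[i][j - 1]:
            i -= 1                  # tie broken in favour of moving up
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs("TAGTCACG", "AGACTGTC"))
```

For the two example strings this recovers AGACG. A different tie-breaking rule may return a different subsequence of the same length.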

The Structure of RNA

Arc Annotation for RNA Secondary Structure

How to Compare Two RNA Secondary Structures
Longest Arc-Preserving Common Subsequence (LAPCS)
  O(n^5) for LAPCS(nested, nested)
  LAPCS(crossing, crossing) is NP-hard
Arc-Segment Alignment (ASA, our method)
  O(n^2) for ASA(nested, nested)
  ASA(crossing, crossing) may be solvable in polynomial time
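The terms "nested" and "crossing" describe how the arcs of one annotated sequence relate to each other. As a rough illustration, the sketch below labels a set of arcs, assuming each arc is an `(i, j)` pair of base positions with `i < j` and that no two arcs share an endpoint. The names `relation` and `classify` are hypothetical, and only the labels plain, nested, and crossing are distinguished.

```python
from itertools import combinations
from typing import List, Tuple

Arc = Tuple[int, int]  # (left, right) base positions, left < right


def relation(a: Arc, b: Arc) -> str:
    """Relation between two arcs: 'disjoint', 'nested', or 'crossing'."""
    (i, j), (k, l) = sorted([a, b])      # ensure i <= k
    if len({i, j, k, l}) < 4:
        raise ValueError("arcs must not share an endpoint")
    if j < k:
        return "disjoint"                # i < j < k < l
    if l < j:
        return "nested"                  # i < k < l < j
    return "crossing"                    # i < k < j < l


def classify(arcs: List[Arc]) -> str:
    """Label an arc annotation as 'plain', 'nested', or 'crossing'."""
    if not arcs:
        return "plain"                   # no arcs at all
    rels = {relation(a, b) for a, b in combinations(arcs, 2)}
    return "crossing" if "crossing" in rels else "nested"


print(classify([(0, 9), (1, 8), (10, 15)]))  # arcs only nest or sit side by side
print(classify([(0, 5), (3, 8)]))            # the two arcs interleave
```

The first call prints nested and the second prints crossing.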

Our Comparison Algorithm
(1) Given two RNA secondary structures S1 and S2 with lengths m and n, find the "sequence of arc segments" A1 from S1 and A2 from S2.
(2) Solve the alignment of A1 and A2 using arc-segment alignment.
(3) From the answer, we know how to deal with the arc parts, and then how to deal with the other parts of the RNA sequence.

Arc-Segment Alignment
ASA checks whether segments match, unlike the original LCS, which checks whether characters match. Therefore, we need a threshold to define what "match" means.
To check whether two segments match: arc size + arc location + sub-ASA (recursive).
ASA performs simple sequence alignment if one of the RNA sequences does not contain any arcs.
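One way to picture the threshold-based matching is the sketch below. It is not the authors' implementation. The `Arc` class, the weights `W_SIZE`, `W_LOC`, `W_SUB`, the value of `THRESHOLD`, and the formulas for size and location similarity are all assumptions made up for illustration. Only the overall shape (size + location + recursive sub-alignment, compared against a threshold inside an LCS-style table) follows the slide. The base-level alignment used when a sequence has no arcs is omitted.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical weights and threshold; the slides give no concrete values.
W_SIZE, W_LOC, W_SUB = 0.4, 0.3, 0.3
THRESHOLD = 0.8


@dataclass
class Arc:
    start: int                                   # left endpoint position
    end: int                                     # right endpoint position
    children: List["Arc"] = field(default_factory=list)  # arcs nested directly inside


def ratio(x: float, y: float) -> float:
    """1.0 for equal values, approaching 0 as they diverge."""
    return 1.0 if x == y else min(x, y) / max(x, y)


def similarity(a: Arc, b: Arc, len_a: int, len_b: int) -> float:
    """Assumed score in [0, 1] combining size, location, and nested arcs."""
    size = ratio(a.end - a.start, b.end - b.start)
    loc = 1.0 - abs(a.start / len_a - b.start / len_b)   # relative position
    if a.children or b.children:
        matched = asa(a.children, b.children, len_a, len_b)  # recursive sub-ASA
        sub = matched / max(len(a.children), len(b.children))
    else:
        sub = 1.0
    return W_SIZE * size + W_LOC * loc + W_SUB * sub


def asa(xs: List[Arc], ys: List[Arc], len_a: int, len_b: int) -> int:
    """LCS-style table over arc segments; returns the number of matched arcs."""
    m, n = len(xs), len(ys)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best = max(dp[i - 1][j], dp[i][j - 1])
            if similarity(xs[i - 1], ys[j - 1], len_a, len_b) >= THRESHOLD:
                best = max(best, dp[i - 1][j - 1] + 1)   # segments "match"
            dp[i][j] = best
    return dp[m][n]


s = [Arc(0, 9, [Arc(1, 8)]), Arc(10, 15)]
t = [Arc(0, 9, [Arc(1, 8)]), Arc(10, 15)]
print(asa(s, t, 16, 16))
```

Two identical structures match on both top-level arcs, so the call prints 2. Raising `THRESHOLD` or changing the weights alters which segment pairs count as matches.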

Example for ASA(nested, nested), part 1
[Figure not reproduced; base labels in the original: G T G A T AA]

Example for ASA(nested, nested), part 2
[Figure not reproduced; base labels in the original: A A T T]
Perform the original sequence alignment for the segments.

Advantages of ASA
The time complexity is only O(n^2) for nested-nested comparison.
It emphasizes the arcs, so it can reflect more structural similarity than LAPCS.
It may solve crossing-crossing comparison in polynomial time if suitably modified.
It is flexible because we can set different thresholds and different weights for the score factors.