Computational Prediction of RNA and DNA Secondary Structure Anne Condon Bioinformatics, and Empirical and Theoretical Algorithmics (BETA) Laboratory The.

Slides:



Advertisements
Similar presentations
RNA Secondary Structure Prediction
Advertisements

Design of Experiments Lecture I
Protein Threading Zhanggroup Overview Background protein structure protein folding and designability Protein threading Current limitations.
Gibbs sampling for motif finding in biological sequences Christopher Sheldahl.
1 Structure of search space, complexity of stochastic combinatorial optimization algorithms and application to biological motifs discovery Robin Gras INRIA.
6 -1 Chapter 6 The Secondary Structure Prediction of RNA.
Predicting RNA Structure and Function. Non coding DNA (98.5% human genome) Intergenic Repetitive elements Promoters Introns mRNA untranslated region (UTR)
Predicting RNA Structure and Function
Dynamics of RNA-based replicator networks Camille STEPHAN-OTTO ATTOLINI, Christof FLAMM, Peter STADLER Institut for Theoretical Chemistry and Structural.
. Class 1: Introduction. The Tree of Life Source: Alberts et al.
Introduction to Bioinformatics - Tutorial no. 9 RNA Secondary Structure Prediction.
RNA Folding Xinyu Tang Bonnie Kirkpatrick. Overview Introduction to RNA Previous Work Problem Hofacker ’ s Paper Chen and Dill ’ s Paper Modeling RNA.
Non-coding RNA William Liu CS374: Algorithms in Biology November 23, 2004.
Nature’s Algorithms David C. Uhrig Tiffany Sharrard CS 477R – Fall 2007 Dr. George Bebis.
Zhi John Lu, Jason Gloor, and David H. Mathews University of Rochester Medical Center, Rochester, New York Improved RNA Secondary Structure Prediction.
MicroRNA The Computational Challenge Bioinformatics Seminar, March 9, 2005 By Yaron Levy.
RNA Structure Prediction Rfam – RNA structures database RNAfold – RNA secondary structure prediction tRNAscan – tRNA prediction.
Improving Free Energy Functions for RNA Folding RNA Secondary Structure Prediction.
Predicting RNA Structure and Function. Nobel prize 1989Nobel prize 2009 Ribozyme Ribosome RNA has many biological functions The function of the RNA molecule.
Discovery of RNA Structural Elements Using Evolutionary Computation Authors: G. Fogel, V. Porto, D. Weekes, D. Fogel, R. Griffey, J. McNeil, E. Lesnik,
Predicting RNA Structure and Function. Following the human genome sequencing there is a high interest in RNA “Just when scientists thought they had deciphered.
Finding Common RNA Pseudoknot Structures in Polynomial Time Patricia Evans University of New Brunswick.
Structural Alignment of Pseudoknotted RNAs Banu Dost, Buhm Han, Shaojie Zhang, Vineet Bafna.
CISC667, F05, Lec19, Liao1 CISC 467/667 Intro to Bioinformatics (Fall 2005) RNA secondary structure.
RNA Folding Kinetics Bonnie Kirkpatrick Dr. Nancy Amato, Faculty Advisor Guang Song, Graduate Student Advisor.
Predicting RNA Structure and Function
Mesoscopic Modeling of RNA Structure and Dynamics Hin Hark Gan A. Fundamentals of RNA structure 1. Hierarchical folding 2. Folding timescales B. Issues.
Protein Side Chain Packing Problem: A Maximum Edge-Weight Clique Algorithmic Approach Dukka Bahadur K.C, Tatsuya Akutsu and Tomokazu Seki Proceedings of.
Predicting RNA Structure and Function. Nobel prize 1989 Nobel prize 2009 Ribozyme Ribosome.
RNA Secondary Structure Prediction Introduction RNA is a single-stranded chain of the nucleotides A, C, G, and U. The string of nucleotides specifies the.
Protein Structure Alignment by Incremental Combinatorial Extension (CE) of the Optimal Path Ilya N. Shindyalov, Philip E. Bourne.
Multiple Sequence Alignment CSC391/691 Bioinformatics Spring 2004 Fetrow/Burg/Miller (Slides by J. Burg)
Non-coding RNA gene finding problems. Outline Introduction RNA secondary structure prediction RNA sequence-structure alignment.
Escaping local optimas Accept nonimproving neighbors – Tabu search and simulated annealing Iterating with different initial solutions – Multistart local.
1 Bio + Informatics AAACTGCTGACCGGTAACTGAGGCCTGCCTGCAATTGCTTAACTTGGC An Overview پرتال پرتال بيوانفورماتيك ايرانيان.
Guiding Motif Discovery by Iterative Pattern Refinement Zhiping Wang, Mehmet Dalkilic, Sun Kim School of Informatics, Indiana University.
Verification of the Crooks fluctuation theorem and recovery of RNA folding free energies D. Collin, F. Ritort, C. Jarzynski, S. B. Smith, I. Tinoco, Jr.
Strand Design for Biomolecular Computation
Predicting Secondary Structure of All-Helical Proteins Using Hidden Markov Support Vector Machines Blaise Gassend, Charles W. O'Donnell, William Thies,
Optimization in Engineering Design Georgia Institute of Technology Systems Realization Laboratory Mixed Integer Problems Most optimization algorithms deal.
RNA Secondary Structure Prediction Spring Objectives  Can we predict the structure of an RNA?  Can we predict the structure of a protein?
ProteinShop: A Tool for Protein Structure Prediction and Modeling Silvia Crivelli Computational Research Division Lawrence Berkeley National Laboratory.
From Structure to Function. Given a protein structure can we predict the function of a protein when we do not have a known homolog in the database ?
RNA Secondary Structure Prediction. 16s rRNA RNA Secondary Structure Hairpin loop Junction (Multiloop)Bulge Single- Stranded Interior Loop Stem Image–
Slide 1 Tutorial: Optimal Learning in the Laboratory Sciences Overview December 10, 2014 Warren B. Powell Kris Reyes Si Chen Princeton University
RNA secondary structure RNA is (usually) single-stranded The nucleotides ‘want’ to pair with their Watson-Crick complements (AU, GC) They may ‘settle’
RNA Structure Prediction
2005MEE Software Engineering Lecture 11 – Optimisation Techniques.
Structural Alignment of Pseudo-knotted RNA
Ramakrishna Lecture#2 CAD for VLSI Ramakrishna
Motif Search and RNA Structure Prediction Lesson 9.
RNA Structure Prediction
CS-ROSETTA Yang Shen et al. Presented by Jonathan Jou.
Rapid ab initio RNA Folding Including Pseudoknots via Graph Tree Decomposition Jizhen Zhao, Liming Cai Russell Malmberg Computer Science Plant Biology.
Artificial Intelligence By Mr. Ejaz CIIT Sahiwal Evolutionary Computation.
Protein Structure Prediction: Threading and Rosetta BMI/CS 576 Colin Dewey Fall 2008.
A Computational Study of RNA Structure and Dynamics Rhiannon Jacobs and Harish Vashisth Department of Chemical Engineering, University of New Hampshire,
4.2 - Algorithms Sébastien Lemieux Elitra Canada Ltd.
Characterization of Transition Metal-Sensing Riboswitches
Lecture 8.2.
Stochastic Context-Free Grammars for Modeling RNA
Vienna RNA web servers
Mirela Andronescu February 22, 2005 Lab 8.3 (c) 2005 CGDN.
Predicting RNA Structure and Function
RNA Secondary Structure Prediction
Stochastic Context-Free Grammars for Modeling RNA
RNA 2D and 3D Structure Craig L. Zirbel October 7, 2010.
Paradigms for Computational Nucleic Acid Design
Bioinformatics, Vol.17 Suppl.1 (ISMB 2001)
Chem 291C Draft Sample Preliminary Seminar
Presentation transcript:

Computational Prediction of RNA and DNA Secondary Structure Anne Condon Bioinformatics, and Empirical and Theoretical Algorithmics (BETA) Laboratory The Department of Computer Science, UBC

RNA plays varied roles in the cell… “… a familiar performer has turned up in a stunning variety of guises. RNA, long upstaged by its more glamorous sibling, is turning out to have star qualities of its own” – J. Couzin, Science, 2003

… in sequence analysis, “…signature sequencing [permits] application of powerful statistical techniques for discovery of functional relationships among genes” – S. Brenner et al.... address tag cDNA to be sequenced TTAC AATC TACT ATCA ACAT TCTA CTTT CAAA

…nanotechnology, “Linking biological molecules with nanotubes and nano wires [is] a method for biological sensing and for controlling nanoscale assembly” – R. Hamers, Chemistry, U. Wisconsin A C C A G G T U R. Hamers

and beyond… “..rather than examining in detail what occurs in nature (biological organisms), we take the engineering approach of asking, "what can we build?" E. Winfree

… hold promise in therapeutics,

and hold clues to our understanding of primitive life “there are strong reasons to conclude that DNA and protein based life was preceded by a simpler life form based primarily on RNA” – G. F. Joyce

To understand the function of RNA molecules, we need to understand their structure.

DNA, RNA secondary structure 5'3'

our goals predict from base sequence the secondary structure of –a single molecule –a small group of molecules –the most stable molecule in a combinatorial set of molecules design a molecule or molecules with a certain secondary structure

overview background –how is RNA secondary structure prediction currently done? challenges –how might it be done better? projects –what is the BETA-lab doing about it?

approaches to RNA secondary structure prediction comparative sequence analysis prediction from base sequence –find minimum free energy (mfe) structure

free energy model free energy of structure (at fixed temperature, ionic concentration) = sum of loop energies standard model uses experimentally determined thermodynamic parameters where available; extrapolations for long loops

free energy model free energy of structure (at fixed temperature, ionic concentration) = sum of loop energies standard model uses experimentally determined thermodynamic parameters where available; extrapolations for long loops

on the mfe approach mfe approach ignores folding pathway, metal ions, nonstandard bonds “some species can remain kinetically trapped in nonequilibrium states… we expect that most RNA’s exist naturally in their thermodynamically most stable configurations” – Tinoco and Bustamante, J. Mol. Biol

why is mfe secondary structure prediction hard? mfe structure can be found by calculating free energy of all possible structures but, number of potential structures grows exponentially with the number, n, of bases structures can be arbitrarily complex success for restricted classes of structures

predicting mfe pseudoknot free structures dynamic programming avoids explicit enumeration of all pseudoknot free structures ( Zuker & Stiegler 1981 ) suboptimal folds, probabilities of base pairings can also be calculated software: mfold, Vienna package

dynamic programming (Zuker and Steigler) based on the “more is less” principle: by calculating more than you need, less work is needed overall construct mfe structure for whole strand from mfe structures for substrands running time is O(n 3 )

dynamic programming (Zuker and Steigler) W(i,j): mfe structure of substrand from i to j V(i,j): mfe structure of substrand from i to j, in which ith and jth bases are paired ij W(i,j) ij V(i,j)

= ijk ij W(i,j) W(i,k) ij V(i,j) k+1 W(k+1,j) recurrences min

recurrences ij ijkl i ji+1j-1kk+1 min = ij

= ijkijijk+1 recurrences min = ij ijijkli ji+1j-1kk+1 min

current approaches: pseudoknotted structures loop energies augmented to fit known data mfe algorithms have been extended to handle certain pseudoknotted structures ( Akutsu, 2000; Rivas and Eddy, 1999 ) heuristic approaches also used ( STAR software, van Batenburg et al. )

= ijk ij ijk+1 W(i,j): Rivas and Eddy algorithm i jrkl min

recurrence for gapped structures = kl j i ik l j kl j i r kl j ir kljisk l j is rs i jkl min

Rivas and Eddy algorithm running time is O(n 6 ) “ we lack a systematic a priori characterization of the class of configurations that this algorithm can solve ” ( Rivas and Eddy, 1999 )

heuristic approaches space of structures is explored, guided by random decisions as well as goal of minimizing free energy can incorporate better energy models, model folding pathways, and be substantially faster than mfe algorithms no guarantee that optimal solution is found

STAR algorithm (van Batenburg et al.) simulates RNA folding by stepwise addition and removal of stems to the structure formed at previous steps choice of stem to add/remove is random, biased by “fitness” of resulting structure process is carried out on large population of structures, not just one

prediction challenges predict structures formed from multiple strands predict which strand in a combinatorial set has minimum free energy structure predict pseudoknotted structures

BETA-Lab Projects predict structures formed from multiple strands predict which strand in a combinatorial set has minimum free energy structure predict pseudoknotted structures PairFold CombFold PseudoSpotter

PairFold ( Andronescu ) predicts minimum energy pseudoknot free secondary structure of a pair of strands energy model: loop plus initiation energies from Turner, Santa Lucia labs

CombFold ( Andronescu et al. ) predicts which strand in a combinatorial set has minimum free energy structure uses dynamic programming to avoid examining all combinatorial possibilities TTAC AATCT ACT ATCA ACAT TCTA CTTT CAAA

CombFold ( Andronescu et al. ) predicts which strand in a combinatorial set has minimum free energy structure TCTAATCATACTAATCCTTTTACTATCATTAC has a slightly stable secondary structure at 37 o C TTAC AATCT ACT ATCA ACAT TCTA CTTT CAAA

PseudoSpotter ( Ren et al. ) heuristic algorithm for predicting pseudoknotted structures

PseudoSpotter ( Ren et al. )

1

1 2

1 2 3

PseudoSpotter algorithm initial structure pool contains the empty structure repeat –for each structure in the pool, identify promising stems (hotspots) that can be added –augment each structure with each of its hotspots, thereby expanding the pool until all structures are saturated several structures are output

comparison of pseudoknot algorithms Quality of solutions found over 32 sample strands taken from RDB, Pseudobase OptimalSubopt, Close Fail R&E185 9 STAR1913 Pseudo- spotter 217 4

web page

next steps better energy parameters for model, model evaluation and refinement develop better heuristic algorithms for predicting pseudoknotted structures improved mfe algorithms for natural pseudoknotted structures suboptimal foldings, partition function

how to get better energy models? compare variations of standard free energy model in terms of prediction quality perform lab experiments to find energy parameters incorporate tertiary structural elements (nonstandard base pairs or motifs, folding pathway, metal ions)

1 2 how to improve heuristic algorithm? maybe here?

how to improve mfe algorithm? we have a simple characterization of the class of structures that the Rivas and Eddy algorithm can handle perhaps an O(n 5 ) algorithm for natural structures?

acknowledgements Dr. Holger HoosBETA-lab co-founder Rosalia Aguirrez-HernandezRNA design Mirela AndronescuPairfold, Combfold Viann ChanXIST RNA structure Jihong RenPseudospotter Sohrab Shahstructural motif finding Dan TulpanDNA word design Shelly Zhao Drs. Rob Corn, Lloyd Smith Combfold, RNA structure analysis University of Wisconsin

Bioinformatics, and Empirical and Theoretical Algorithmics (BETA) Laboratory