RNA structure prediction. RNA functions RNA functions as –mRNA –rRNA –tRNA –Nuclear export –Spliceosome –Regulatory molecules (RNAi) –Enzymes –Virus –Retrotransposons.

Presentation transcript:

RNA structure prediction

RNA functions
RNA functions as:
– mRNA
– rRNA
– tRNA
– Nuclear export
– Spliceosome
– Regulatory molecules (RNAi)
– Enzymes
– Virus
– Retrotransposons
– Medicine

Base pairs
– C-G is stronger than U-A
– Non-standard pair: G-U

Stacking
– Base pairs are usually coplanar
– Base pairs are almost always stacked
– Stems are continuous stacks
– The 3D structure of a stack is a helix
(Figure: hairpin, stacking)

Predictable structures

Hard-to-predict structures: pseudoknots, kissing hairpins, hairpin-bulge

Secondary structure notations

Tertiary structure

RNAi

Structure representation

Main approaches to RNA secondary structure prediction
Energy minimization
– dynamic programming approach
– does not require prior sequence alignment
– requires estimation of the energy terms contributing to secondary structure
Comparative sequence analysis
– uses sequence alignment to find conserved residues and covariant base pairs
– most trusted

Dotplot

Think!
Make a dot plot of an RNA molecule.
– Sequence: GGGAAAUCC
What is the secondary structure?
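The dot plot exercise can be tried with a few lines of Python. This is only a sketch: the function name `dotplot` and the choice to count G-U as a pair (alongside the Watson-Crick pairs) are assumptions made for illustration, not part of the slides. A run of marks along an anti-diagonal suggests a candidate stem.

```python
# Pairs allowed in this sketch: Watson-Crick plus the non-standard G-U pair (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def dotplot(seq):
    """Return one string per row; '*' marks a column base that can pair with the row base."""
    rows = []
    for a in seq:
        rows.append("".join("*" if (a, b) in PAIRS else "." for b in seq))
    return rows


if __name__ == "__main__":
    seq = "GGGAAAUCC"
    print("  " + " ".join(seq))          # column header
    for base, row in zip(seq, dotplot(seq)):
        print(base + " " + " ".join(row))
```

The matrix is symmetric, so only the upper triangle needs to be read when looking for anti-diagonal runs.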

Dynamic programming approach Nussinov algorithm

Dynamic programming approach
Let $E(i,j)$ = minimum energy for the subchain starting at $i$ and ending at $j$, and $\alpha(r_i, r_j)$ = energy of the pair $r_i, r_j$ ($r_j$ = base at position $j$).
a) $i, j$ is paired: $E(i,j) = E(i+1,j-1) + \alpha(r_i, r_j)$
b) $i$ is unpaired: $E(i,j) = E(i+1,j)$
c) $j$ is unpaired: $E(i,j) = E(i,j-1)$
d) bifurcation: $E(i,j) = E(i,k) + E(k+1,j)$

RNA secondary structure algorithm
Given: RNA sequence $x_1, x_2, \dots, x_L$ (with $r_i$ the base at position $i$)
Initialization:
  $E(i, i-1) = 0$ for $i = 2$ to $L$
  $E(i, i) = 0$ for $i = 1$ to $L$
Recursion:
  for $n = 2$ to $L$ (iteration over subchain length)
    for $i = 1$ to $L - n + 1$, with $j = i + n - 1$
      $E(i,j) = \min\{ E(i+1,j),\ E(i,j-1),\ E(i+1,j-1) + \alpha(r_i, r_j),\ \min_{i<k<j} [E(i,k) + E(k+1,j)] \}$
Cost: $O(L^3)$
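A minimal Python sketch of this table fill is given below. It assumes the toy energy used in the deck's example (-1 for a pair, 0 otherwise), allows Watson-Crick pairs only, and, like the slide recurrence, imposes no minimum hairpin-loop length. The names `PAIRS`, `pair_energy` and `fill_table` are illustrative, and indices are 0-based.

```python
# Watson-Crick pairs only (an assumption); add ("G", "U") and ("U", "G") to allow G-U.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pair_energy(a, b):
    """Toy pair energy: -1 if a and b can pair, 0 otherwise."""
    return -1 if (a, b) in PAIRS else 0


def fill_table(seq):
    """Return E where E[i][j] is the minimum energy of seq[i..j] (0-based, inclusive)."""
    L = len(seq)
    E = [[0] * L for _ in range(L)]          # single bases and empty spans cost 0
    for n in range(2, L + 1):                # subchain length
        for i in range(L - n + 1):
            j = i + n - 1
            inner = E[i + 1][j - 1] if i + 1 <= j - 1 else 0   # empty span inside (i, j)
            best = min(E[i + 1][j],                              # i unpaired
                       E[i][j - 1],                              # j unpaired
                       inner + pair_energy(seq[i], seq[j]))      # i pairs with j
            for k in range(i + 1, j):                            # bifurcation
                best = min(best, E[i][k] + E[k + 1][j])
            E[i][j] = best
    return E


if __name__ == "__main__":
    E = fill_table("GGAAAUCC")
    print("minimum energy:", E[0][-1])
```

The three nested loops over `n`, `i` and `k` are what give the cubic running time.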

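Example
Let $\alpha(r_i, r_j) = -1$ if $r_i, r_j$ form a base pair and 0 otherwise.
Input: GGAAAUCC
$E(i,j)$ = lowest energy of a conformation for the subchain from $i$ to $j$ (rows $i$, columns $j$). After initialization:

        G   G   A   A   A   U   C   C
    G   0
    G   0   0
    A       0   0
    A           0   0
    A               0   0
    U                   0   0
    C                       0   0
    C                           0   0

The cell at $i = 3$, $j = 7$ should hold the minimum energy for AAAUC.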

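Example (continued)
After filling the subchains of length 2:

        G   G   A   A   A   U   C   C
    G   0   0
    G   0   0   0
    A       0   0   0
    A           0   0   0
    A               0   0  -1
    U                   0   0   0
    C                       0   0   0
    C                           0   0

GA ($i = 2$, $j = 3$): $\min\{0,\ 0,\ 0 + \alpha(G,A)\} = 0$
AU ($i = 5$, $j = 6$): $\min\{0,\ 0,\ 0 + \alpha(A,U)\} = -1$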

Recovering the structure from the DP table
Complexity: $O(n^3)$ for a sequence of length $n$.
Main difference from sequence alignment: we are tracing back a tree-like structure, not a single optimal path (bifurcation introduces branch points).
Method 1: Leave pointers as you compute the table: for each element of the table, store (at most two) pointers to the subsequences used in the solution.
Method 2: Recover the history from the numerical values in the table.
– Stacking: check the value along the diagonal
– Bifurcation: find $k$ such that $E(i,k) + E(k+1,j) = E(i,j)$
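Method 2 can be sketched in Python as follows. The code assumes the same toy model as the deck's example (energy -1 per Watson-Crick pair, no minimum loop length); `fill_table`, `traceback` and `dot_bracket` are illustrative names, and an explicit stack replaces recursion so that both halves of a bifurcation are followed.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}  # Watson-Crick only (assumption)


def alpha(a, b):
    """Toy pair energy: -1 if a and b can pair, 0 otherwise."""
    return -1 if (a, b) in PAIRS else 0


def get(E, i, j):
    """E[i][j], with empty spans (i > j) treated as energy 0."""
    return E[i][j] if i <= j else 0


def fill_table(seq):
    """Fill the minimum-energy table for all subchains seq[i..j]."""
    L = len(seq)
    E = [[0] * L for _ in range(L)]
    for n in range(2, L + 1):
        for i in range(L - n + 1):
            j = i + n - 1
            cands = [E[i + 1][j], E[i][j - 1], get(E, i + 1, j - 1) + alpha(seq[i], seq[j])]
            cands += [E[i][k] + E[k + 1][j] for k in range(i + 1, j)]
            E[i][j] = min(cands)
    return E


def traceback(seq, E):
    """Recover one optimal set of pairs from the numerical values in E."""
    pairs, stack = [], [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if E[i][j] == E[i + 1][j]:                      # i unpaired
            stack.append((i + 1, j))
        elif E[i][j] == E[i][j - 1]:                    # j unpaired
            stack.append((i, j - 1))
        elif alpha(seq[i], seq[j]) < 0 and E[i][j] == get(E, i + 1, j - 1) + alpha(seq[i], seq[j]):
            pairs.append((i, j))                        # i pairs with j
            stack.append((i + 1, j - 1))
        else:                                           # bifurcation: find the split point k
            for k in range(i + 1, j):
                if E[i][j] == E[i][k] + E[k + 1][j]:
                    stack.extend([(i, k), (k + 1, j)])
                    break
    return sorted(pairs)


def dot_bracket(n, pairs):
    """Render pairs as a dot-bracket string of length n."""
    chars = ["."] * n
    for i, j in pairs:
        chars[i], chars[j] = "(", ")"
    return "".join(chars)


if __name__ == "__main__":
    seq = "GGAAAUCC"
    print(dot_bracket(len(seq), traceback(seq, fill_table(seq))))
```

Because the toy model has no minimum loop length, the recovered structure may pair adjacent bases; a realistic energy function would rule that out.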

More realistic energy function

Stacking energies

Even more realistic energy function
– Loops have a destabilizing effect: structure (d) should have lower energy than structure (b).
– The destabilizing contribution of a loop should depend on the loop length ($k$).
– Stacking has an additional stabilizing contribution.

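More realistic energy function requires a slightly more involved recurrence
$E(i,j) = \min\{ E(i+1,j),\ E(i,j-1),\ \min_{i<k<j} [E(i,k) + E(k+1,j)],\ L(i,j) \}$
where $L(i,j)$, the energy when $i$ pairs with $j$, is the minimum over the following cases:
– hairpin loop: $\alpha(r_i,r_j) + \xi(j-i-1)$
– stacking: $\alpha(r_i,r_j) + \eta + E(i+1,j-1)$
– i-bulge: $\min_k \{ \alpha(r_i,r_j) + \beta(k) + E(i+k+1,j-1) \}$
– j-bulge: $\min_k \{ \alpha(r_i,r_j) + \beta(k) + E(i+1,j-k-1) \}$
– internal loop: $\min_{k_1,k_2} \{ \alpha(r_i,r_j) + \gamma(k_1+k_2) + E(i+k_1+1,j-k_2-1) \}$
Here $\alpha(r_i,r_j)$ is the base-pair energy, $\xi(k)$, $\beta(k)$ and $\gamma(k)$ are the hairpin, bulge and internal-loop contributions for loop length $k$, and $\eta$ is the stacking contribution.
The extra "min" gives an $O(n^4)$ algorithm.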

Covariance method
In a correct multiple alignment of RNAs, conserved base pairs are often revealed by the presence of frequent correlated compensatory mutations.
The two boxed positions (columns 2 and 9 of the alignment below) covary to maintain Watson-Crick complementarity. This covariation implies a base pair, which may then be extended in both directions.
GCCUUCGGGC
GACUUCGGUC
GGCUUCGGCC

Alignment

Quantitative measure of pairwise sequence covariation
Mutual information $M_{ij}$ between two aligned columns $i$, $j$:
$M_{ij} = \sum_{x_i, x_j} f_{x_i x_j} \log_2 \frac{f_{x_i x_j}}{f_{x_i} f_{x_j}}$
where $f_{x_i x_j}$ is the observed frequency of the pair $(x_i, x_j)$ and $f_{x_i}$ is the frequency of nucleotide $x_i$ at position $i$.
Observations:
– $0 \le M_{ij} \le 2$
– if $i$ and $j$ are uncorrelated, $M_{ij} = 0$

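MI: examples
Example 1: column $i$ = A, A, C, G; column $j$ = U, U, G, C
$f_{A_i} = 0.5$, $f_{C_i} = 0.25$, $f_{G_i} = 0.25$; $f_{U_j} = 0.5$, $f_{C_j} = 0.25$, $f_{G_j} = 0.25$; $f_{AU} = 0.5$, $f_{CG} = 0.25$, $f_{GC} = 0.25$
$M_{ij} = \sum_{x_i, x_j} f_{x_i x_j} \log_2 \frac{f_{x_i x_j}}{f_{x_i} f_{x_j}} = 0.5 \log_2 \frac{0.5}{0.5 \cdot 0.5} + 2 \cdot 0.25 \log_2 \frac{0.25}{0.25 \cdot 0.25} = 0.5 \cdot 1 + 0.5 \cdot 2 = 1.5$
Example 2: column $i$ = A, A, A, A; column $j$ = U, U, U, U
$M_{ij} = 1 \cdot \log_2 1 = 0$
Example 3: column $i$ = U, A, C, G; column $j$ = A, U, G, C
$M_{ij} = 4 \cdot 0.25 \cdot \log_2 4 = 2$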
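The mutual information formula is short enough to check numerically. The sketch below takes two alignment columns as strings; the helper names `mutual_information` and `best_column_pair` are illustrative. It also scans the three-sequence alignment from the covariance slide for the column pair with the highest mutual information.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two aligned columns of equal length."""
    n = len(col_i)
    f_i, f_j = Counter(col_i), Counter(col_j)
    f_ij = Counter(zip(col_i, col_j))            # observed pair counts
    mi = 0.0
    for (a, b), count in f_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((f_i[a] / n) * (f_j[b] / n)))
    return mi


def best_column_pair(alignment):
    """Return (i, j, MI) for the column pair with the highest MI (0-based columns)."""
    columns = ["".join(col) for col in zip(*alignment)]
    scored = [(i, j, mutual_information(columns[i], columns[j]))
              for i, j in combinations(range(len(columns)), 2)]
    return max(scored, key=lambda t: t[2])


if __name__ == "__main__":
    print(mutual_information("AACG", "UUGC"))
    print(best_column_pair(["GCCUUCGGGC", "GACUUCGGUC", "GGCUUCGGCC"]))
```

With only three sequences the estimate is crude, so a real analysis would want many more aligned sequences before trusting a high score.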

Other methods: HMMs, stochastic context-free grammars

Conclusion
RNA secondary structure prediction
– Single sequence: dot plot, Nussinov dynamic programming, energy function
– Covariance analysis: mutual information, hidden Markov models, SCFGs

Finding the "most probable structure"
Let $S$ be a structure and $E(S)$ the free energy of $S$. Then
$p(S) = \exp(-E(S)/kT) / Q$, where $Q = \sum_x \exp(-E(x)/kT)$ is the partition function (the sum runs over all structures $x$).
Problem: computing $Q$.
Method to compute $Q$: dynamic programming, similar to the approach presented before, but with scores replaced by probabilities and the minimum of energies replaced by a sum of probabilities.
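As a rough illustration of the dynamic-programming idea, the sketch below computes $Q$ for a toy model in which every structure is a set of non-crossing Watson-Crick pairs and each pair contributes the same energy. The model, the default values and the names `partition_function` and `boltzmann_probability` are assumptions made for this example, not the energy model of any real folding program. The recurrence either leaves base $i$ unpaired or pairs it with some $k$, which splits the span into an inside and an outside part, so each structure is counted once.

```python
from functools import lru_cache
from math import exp

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}  # Watson-Crick only (assumption)


def partition_function(seq, kT=1.0, pair_energy=-1.0):
    """Q = sum over all non-crossing pairings of exp(-E/kT), with E = pair_energy per pair."""
    w = exp(-pair_energy / kT)                 # Boltzmann weight of a single pair

    @lru_cache(maxsize=None)
    def Q(i, j):
        if i >= j:                             # empty span or single base: one structure
            return 1.0
        total = Q(i + 1, j)                    # base i unpaired
        for k in range(i + 1, j + 1):          # base i paired with base k
            if (seq[i], seq[k]) in PAIRS:
                total += w * Q(i + 1, k - 1) * Q(k + 1, j)
        return total

    return Q(0, len(seq) - 1)


def boltzmann_probability(energy, Q, kT=1.0):
    """p(S) = exp(-E(S)/kT) / Q for a structure S with the given energy."""
    return exp(-energy / kT) / Q


if __name__ == "__main__":
    seq = "GGAAAUCC"
    Q = partition_function(seq)
    print("Q =", Q)
    print("p(structure with 3 pairs) =", boltzmann_probability(-3.0, Q))
```

Setting `pair_energy=0.0` makes every weight 1, so `partition_function` then simply counts the structures allowed by the model, which is a convenient sanity check.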

tRNA

Answer