RNA structure prediction. RNA functions as: mRNA, rRNA, tRNA, nuclear export, spliceosome, regulatory molecules (RNAi), enzymes, virus, retrotransposons.


RNA structure prediction

RNA functions
RNA functions as:
– mRNA
– rRNA
– tRNA
– Nuclear export
– Spliceosome
– Regulatory molecules (RNAi)
– Enzymes
– Virus
– Retrotransposons
– Medicine

Base pairs
C-G pairs are stronger than U-A pairs.
G-U is a non-standard (wobble) pair.

Stacking
Base pairs are usually coplanar and are almost always stacked.
Stems are continuous stacks.
The 3D structure of a stack is a helix.
(Figure: hairpin)

Predictable structures

Hard-to-predict structures
Pseudoknots, kissing hairpins, hairpin-bulge

Secondary structure notations

Structure representation

Tertiary structure

RNAi

Known RNA structures
httpp://
Rfam – database of RNA alignments and secondary structure models
Scor – database of experimentally solved RNA structures

Main approaches to RNA secondary structure prediction
Energy minimization
– dynamic programming approach
– does not require prior sequence alignment
– requires estimation of the energy terms contributing to secondary structure
Comparative sequence analysis
– uses sequence alignment to find conserved residues and covariant base pairs
– most trusted

Dotplot

Think!
Make a dotplot of an RNA molecule
– Sequence: GGGAAAUCC
What is the secondary structure?
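A minimal sketch of the dotplot exercise, assuming a cell is marked whenever the two bases can pair (Watson-Crick pairs plus the G-U pair mentioned earlier). The names `PAIRS`, `dotplot` and `render` are illustrative and not part of the slides.

```python
# Base pairs counted as "can pair": Watson-Crick plus G-U (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def dotplot(seq, pairs=PAIRS):
    """Return an n x n grid with '*' where seq[i] and seq[j] can pair."""
    n = len(seq)
    return [
        ["*" if (seq[i], seq[j]) in pairs else "." for j in range(n)]
        for i in range(n)
    ]


def render(seq, pairs=PAIRS):
    """Format the grid as text with the sequence along both axes."""
    rows = ["  " + " ".join(seq)]
    for base, row in zip(seq, dotplot(seq, pairs)):
        rows.append(base + " " + " ".join(row))
    return "\n".join(rows)


print(render("GGGAAAUCC"))
```

Runs of marks perpendicular to the main diagonal are candidate stems; reading them off the printed grid is the point of the exercise.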

Dynamic programming approach: Nussinov algorithm

Dynamic programming approach
Let $E(i,j)$ = minimum energy for the subchain starting at i and ending at j, and $\alpha(r_i, r_j)$ = energy of the pair $r_i, r_j$ ($r_j$ = base at position j).
a) i, j are paired: $E(i,j) = E(i+1,j-1) + \alpha(r_i, r_j)$
b) i is unpaired: $E(i,j) = E(i+1,j)$
c) j is unpaired: $E(i,j) = E(i,j-1)$
d) bifurcation: $E(i,j) = E(i,k) + E(k+1,j)$

RNA secondary structure algorithm
Given: RNA sequence $x_1, x_2, \ldots, x_L$
Initialization:
$E(i, i-1) = 0$ for i = 2 to L
$E(i, i) = 0$ for i = 1 to L
Recursion: for n = 2 to L (iteration over subchain length), for each i from 1 to L-n+1 with j = i+n-1:
$E(i,j) = \min\{E(i+1,j),\ E(i,j-1),\ E(i+1,j-1) + \alpha(r_i, r_j),\ \min_{i<k<j}\{E(i,k) + E(k+1,j)\}\}$
Cost: $O(L^3)$
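The recursion above can be sketched in Python as follows. This is an illustration under stated assumptions, not the presenter's code: indices are 0-based, the pair energy is -1 for Watson-Crick pairs and 0 otherwise, and the names `WATSON_CRICK`, `pair_energy` and `nussinov_table` are made up for the example.

```python
# Assumed scoring: -1 for a Watson-Crick pair, 0 otherwise.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pair_energy(a, b):
    """Energy contributed by pairing bases a and b."""
    return -1 if (a, b) in WATSON_CRICK else 0


def nussinov_table(seq):
    """Fill E[i][j], the minimum energy of the subchain seq[i..j] (0-based)."""
    L = len(seq)
    # Zero-filling covers both initial conditions: E(i, i) and E(i, i-1).
    E = [[0] * L for _ in range(L)]
    for n in range(2, L + 1):            # subchain length
        for i in range(0, L - n + 1):    # subchain start
            j = i + n - 1                # subchain end
            best = min(
                E[i + 1][j],                                      # i unpaired
                E[i][j - 1],                                      # j unpaired
                E[i + 1][j - 1] + pair_energy(seq[i], seq[j]),    # i, j paired
            )
            for k in range(i + 1, j):                             # bifurcation
                best = min(best, E[i][k] + E[k + 1][j])
            E[i][j] = best
    return E


table = nussinov_table("GGAAAUCC")
print(table[0][len("GGAAAUCC") - 1])
```

The three nested loops over n, i and k are where the cubic cost comes from. The sketch, like the recursion on the slide, places no lower bound on hairpin loop length.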

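Example
Let $\alpha(r_i, r_j) = -1$ if $r_i, r_j$ form a base pair and 0 otherwise.
Input: GGAAAUCC
$E(i,j)$ = lowest energy conformation for the subchain from i to j. Rows are indexed by i, columns by j. After initialization:
      G  G  A  A  A  U  C  C
  G   0
  G   0  0
  A      0  0
  A         0  0
  A            0  0
  U               0  0
  C                  0  0
  C                     0  0
The cell at i = 3, j = 7 should hold the minimum energy for AAAUC.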

Example, continued
After filling the subchains of length 2:
      G  G  A  A  A  U  C  C
  G   0  0
  G   0  0  0
  A      0  0  0
  A         0  0  0
  A            0  0 -1
  U               0  0  0
  C                  0  0  0
  C                     0  0
GA (i=2, j=3): $\min\{0,\ 0,\ 0 + \alpha(G,A)\} = 0$
AU (i=5, j=6): $\min\{0,\ 0,\ 0 + \alpha(A,U)\} = -1$

Recovering the structure from the DP table
Main difference from sequence alignment: we are tracing back a tree-like structure, not a single optimal path (bifurcation introduces branch points).
Method 1: Leave pointers as you compute the table: for each element of the table, store (at most two) pointers to the subsequences used in the solution.
Method 2: Recover the history from the numerical values in the table.
– Stacking: check the value along the diagonal
– Bifurcation: find k such that $E(i,k) + E(k+1,j) = E(i,j)$
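A hedged sketch of Method 2, recovering one optimal structure from the numerical values alone. It repeats a compact table fill so that it runs on its own, and it assumes the -1/0 Watson-Crick scoring of the example. The function names and the dot-bracket output are illustrative choices. An explicit stack of intervals replaces recursion, since a bifurcation pushes two intervals to explore.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def fill(seq, pairs):
    """Compact table fill: -1 per allowed pair, 0 otherwise (assumed scoring)."""
    L = len(seq)
    E = [[0] * L for _ in range(L)]
    for n in range(2, L + 1):
        for i in range(L - n + 1):
            j = i + n - 1
            e = -1 if (seq[i], seq[j]) in pairs else 0
            best = min(E[i + 1][j], E[i][j - 1], E[i + 1][j - 1] + e)
            for k in range(i + 1, j):
                best = min(best, E[i][k] + E[k + 1][j])
            E[i][j] = best
    return E


def traceback(seq, E, pairs):
    """Return one optimal structure in dot-bracket notation."""
    structure = ["."] * len(seq)
    stack = [(0, len(seq) - 1)]          # intervals still to explain
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if E[i][j] == E[i + 1][j]:       # i unpaired
            stack.append((i + 1, j))
        elif E[i][j] == E[i][j - 1]:     # j unpaired
            stack.append((i, j - 1))
        elif (seq[i], seq[j]) in pairs and E[i][j] == E[i + 1][j - 1] - 1:
            structure[i], structure[j] = "(", ")"   # i pairs with j
            stack.append((i + 1, j - 1))
        else:                            # bifurcation: find the split point k
            for k in range(i + 1, j):
                if E[i][j] == E[i][k] + E[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return "".join(structure)


seq = "GGAAAUCC"
print(traceback(seq, fill(seq, WATSON_CRICK), WATSON_CRICK))
```

When several structures share the minimum energy, the order of the checks decides which one is reported; Method 1 (stored pointers) would make the same kind of arbitrary choice at fill time.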

More realistic energy function

Stacking energies

Covariance method
In a correct multiple alignment of RNAs, conserved base pairs are often revealed by the presence of frequent correlated compensatory mutations.
GCCUUCGGGC
GACUUCGGUC
GGCUUCGGCC
The two boxed positions (columns 2 and 9, the only columns that vary) are covarying to maintain Watson-Crick complementarity. This covariation implies a base pair, which may then be extended in both directions.
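As a rough illustration of the idea, the sketch below scans the three aligned sequences for column pairs that vary across the alignment yet stay Watson-Crick complementary in every row. The criterion is a deliberate simplification chosen for this example, and `covarying_pairs` is a hypothetical helper name.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def covarying_pairs(alignment):
    """Return 1-based column pairs that vary but remain complementary in all rows."""
    ncol = len(alignment[0])
    hits = []
    for i in range(ncol):
        for j in range(i + 1, ncol):
            col_i = [row[i] for row in alignment]
            col_j = [row[j] for row in alignment]
            # Both columns must change somewhere, otherwise nothing covaries.
            varies = len(set(col_i)) > 1 and len(set(col_j)) > 1
            complementary = all((a, b) in WATSON_CRICK for a, b in zip(col_i, col_j))
            if varies and complementary:
                hits.append((i + 1, j + 1))
    return hits


alignment = ["GCCUUCGGGC", "GACUUCGGUC", "GGCUUCGGCC"]
print(covarying_pairs(alignment))
```

Requiring perfect complementarity in every row is stricter than real covariation analysis, which tolerates noise and scores column pairs statistically.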

Alignment

Measure of pairwise sequence covariation
Mutual information $M_{ij}$ between two aligned columns i, j:
$M_{ij} = \sum_{x_i, x_j} f_{x_i x_j} \log_2 \frac{f_{x_i x_j}}{f_{x_i} f_{x_j}}$
where
$f_{x_i x_j}$ = frequency of the pair (observed)
$f_{x_i}$ = frequency of nucleotide $x_i$ at position i
Observations:
$0 \le M_{ij} \le 2$
If i, j are uncorrelated, $M_{ij} = 0$

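MI: examples
Example 1. Column i: A A C G; column j: U U G C
$f_{A_i} = .5$, $f_{C_i} = .25$, $f_{G_i} = .25$
$f_{U_j} = .5$, $f_{C_j} = .25$, $f_{G_j} = .25$
$f_{AU} = .5$, $f_{CG} = .25$, $f_{GC} = .25$
$M_{ij} = \sum_{x_i, x_j} f_{x_i x_j} \log_2 \frac{f_{x_i x_j}}{f_{x_i} f_{x_j}} = .5 \log_2 \frac{.5}{.5 \cdot .5} + 2 \cdot .25 \log_2 \frac{.25}{.25 \cdot .25} = .5 \cdot 1 + .5 \cdot 2 = 1.5$
Example 2. Column i: A A A A; column j: U U U U
$M_{ij} = 1 \cdot \log_2 1 = 0$
Example 3. Column i: U A C G; column j: A U G C
$M_{ij} = 4 \cdot .25 \log_2 4 = 2$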
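The three examples can be checked with a short Python sketch of the mutual information formula. The function name `mutual_information` and the representation of columns as strings are assumptions made for this illustration.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two aligned columns of equal length."""
    n = len(col_i)
    pair_counts = Counter(zip(col_i, col_j))   # observed pair counts
    counts_i = Counter(col_i)                  # nucleotide counts at position i
    counts_j = Counter(col_j)                  # nucleotide counts at position j
    mi = 0.0
    for (a, b), count in pair_counts.items():
        f_ab = count / n
        f_a = counts_i[a] / n
        f_b = counts_j[b] / n
        mi += f_ab * log2(f_ab / (f_a * f_b))
    return mi


for col_i, col_j in [("AACG", "UUGC"), ("AAAA", "UUUU"), ("UACG", "AUGC")]:
    print(col_i, col_j, mutual_information(col_i, col_j))
```

Only pairs that actually occur enter the sum, so no logarithm of zero arises. With so few sequences the frequency estimates are coarse, which is worth keeping in mind when reading values from small alignments.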

Other methods
HMMs
Stochastic context-free grammars
Allow for modeling complex structures.
Allow incorporation of additional info:
– Phylogenetic distances
– Biochemical properties

sno-RNA HMM

Stochastic Grammars
Start with S.
i. Production rules:
S → (0.3) aT | (0.7) bS
T → (0.2) aS | (0.4) bT | (0.2) ε
Derivation: S → aT → aaS → aabS → aabaT → aaba
Probability: 0.3 * 0.2 * 0.7 * 0.3 * 0.2
ii. Production rules:
S → (0.3) aSa | (0.5) bSb | (0.1) aa | (0.1) bb
Derivation: S → aSa → abSba → abaaba
Probability: 0.3 * 0.5 * 0.1
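A small sketch of how a derivation's probability is the product of the probabilities of the rules applied, using the two grammars from this slide. The dictionary encoding, the empty string standing in for ε, and the helper name `derive` are assumptions made for illustration.

```python
# Rule probabilities as given on the slide; "" stands for the empty string.
GRAMMAR_I = {
    "S": {"aT": 0.3, "bS": 0.7},
    "T": {"aS": 0.2, "bT": 0.4, "": 0.2},
}
GRAMMAR_II = {"S": {"aSa": 0.3, "bSb": 0.5, "aa": 0.1, "bb": 0.1}}


def derive(grammar, choices, start="S"):
    """Apply the chosen right-hand sides in order to the leftmost nonterminal.

    Returns the derived string and the probability of the derivation.
    """
    string, prob = start, 1.0
    for rhs in choices:
        # Nonterminals are exactly the keys of the grammar dictionary.
        pos = next(k for k, ch in enumerate(string) if ch in grammar)
        prob *= grammar[string[pos]][rhs]
        string = string[:pos] + rhs + string[pos + 1:]
    return string, prob


word_i, p_i = derive(GRAMMAR_I, ["aT", "aS", "bS", "aT", ""])
word_ii, p_ii = derive(GRAMMAR_II, ["aSa", "bSb", "aa"])
print(word_i, p_i)
print(word_ii, p_ii)
```

Grammar ii generates palindromes, which is the property that lets context-free rules model the nested pairing of an RNA stem; grammar i is regular and behaves like an HMM.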

Conclusion
RNA secondary structure prediction
– Single sequence: dot plot, Nussinov dynamic programming, energy function
– Covariance analysis: mutual information, hidden Markov models, SCFGs