Zhi John Lu, Jason Gloor, and David H. Mathews University of Rochester Medical Center, Rochester, New York Improved RNA Secondary Structure Prediction.

Presentation transcript:

Improved RNA Secondary Structure Prediction by Maximizing Expected Pair Accuracy: Zhi John Lu, Jason Gloor, and David H. Mathews, University of Rochester Medical Center, Rochester, New York.

RNA Secondary and Tertiary Structure: AAUUGCGGGAAAGGGGUCAA CAGCCGUUCAGUACCAAGUC UCAGGGGAAACUUUGAGAUG GCCUUGCAAAGGGUAUGGUA AUAAGCUGACGGACAUGGUC CUAACCACGCAGCCAAGUCC UAAGUCAACAGAUCUUCUGU UGAUAUGGAUGCAGUUCA. Cate et al. (Cech & Doudna), Science 273: 1678 (1996). Waring & Davies, Gene 28: 277 (1984).

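Gibbs Free Energy Change:

$$K_i = \frac{[\text{Conformation } i]}{[\text{Unfolded}]} = e^{-\Delta G^\circ_i / RT}$$

$$\frac{K_i}{K_j} = \frac{[\text{Conformation } i]}{[\text{Conformation } j]} = e^{-(\Delta G^\circ_i - \Delta G^\circ_j) / RT}$$

The structure with the lowest ΔG° is the most favored at a given temperature.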
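As a rough illustration of the relation between free energy change and equilibrium populations, the sketch below evaluates K = exp(-ΔG°/RT) and the ratio K_i/K_j at 37 °C. The function names and the two free energies (-10.0 and -9.0 kcal/mol) are hypothetical values chosen for illustration, not numbers from the presentation.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
T_37C = 310.15         # 37 degrees C in kelvin

def equilibrium_constant(delta_g, temp_k=T_37C):
    """K = exp(-dG/RT) for folding from the unfolded state (dG in kcal/mol)."""
    return math.exp(-delta_g / (R_KCAL * temp_k))

def population_ratio(delta_g_i, delta_g_j, temp_k=T_37C):
    """K_i / K_j = exp(-(dG_i - dG_j)/RT): relative population of i versus j."""
    return math.exp(-(delta_g_i - delta_g_j) / (R_KCAL * temp_k))

# Hypothetical conformations: i is 1 kcal/mol more stable than j.
ratio = population_ratio(-10.0, -9.0)
print(f"conformation i is about {ratio:.1f} times more populated than j")
```

A 1 kcal/mol difference already shifts the population several-fold, which is why small errors in the thermodynamic parameters can change which structure is predicted as the minimum.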

Nearest Neighbor Model for Free Energy Change of a Sample Hairpin Loop: Mathews et al., J. Mol. Biol., 1999, 288: 911. Mathews et al., PNAS, 2004, 101: 7287.

RNA Secondary Structure Prediction Accuracy: Percentage of Known Base Pairs Correctly Predicted: Mathews, Disney, Childs, Schroeder, Zuker, & Turner PNAS 101: 7287.

Limitations to Prediction of the Minimum Free Energy Structure: A minimum free energy structure provides the single best guess for the secondary structure. Assumes that:
–RNA is at equilibrium
–RNA has a single conformation
–RNA thermodynamic parameters are without error
  –Non-nearest neighbor effects
  –Some sequence-specific stabilities are averaged

A Method that Looks at the Probability of a Structure could be more Informative: A partition function can be used to determine the probability of a structure at equilibrium.

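The Partition Function, Q:

$$Q = \sum_{s} e^{-\Delta G^\circ_s / RT}$$

where the sum runs over all possible secondary structures s.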

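So, what is Q good for? It gives the probability of the i-j base pair at equilibrium:

$$P_{i,j} = \frac{\sum_{k} e^{-\Delta G^\circ_k / RT}}{Q}$$

where the sum over k runs over all structures that contain the i-j base pair.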
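To make the role of Q concrete, here is a minimal sketch that computes Q as the sum of Boltzmann weights exp(-ΔG°/RT) over an explicitly listed ensemble, and each base pair probability as the summed weight of the structures containing that pair divided by Q. The four-structure ensemble and its free energies are hypothetical; real programs compute Q by dynamic programming over all structures rather than by enumeration.

```python
import math
from collections import defaultdict

RT = 1.987204e-3 * 310.15  # kcal/mol at 37 degrees C

def pair_probabilities(ensemble):
    """ensemble: list of (pairs, delta_g), where pairs is a set of (i, j) tuples.

    Returns (Q, {pair: probability}).
    """
    weights = [math.exp(-delta_g / RT) for _, delta_g in ensemble]
    q = sum(weights)
    probs = defaultdict(float)
    for (pairs, _), weight in zip(ensemble, weights):
        for pair in pairs:
            probs[pair] += weight / q  # each structure contributes its probability
    return q, dict(probs)

# Hypothetical toy ensemble (free energies in kcal/mol are illustrative only).
toy_ensemble = [
    (frozenset(), 0.0),                                # unfolded
    (frozenset({(1, 10), (2, 9)}), -2.0),
    (frozenset({(1, 10), (2, 9), (3, 8)}), -3.0),
    (frozenset({(4, 12)}), -0.5),
]
q, p = pair_probabilities(toy_ensemble)
for pair in sorted(p):
    print(pair, round(p[pair], 3))
```

Pairs that appear in several low free energy structures, such as (1, 10) here, accumulate probability from each of them, which is information a single minimum free energy structure cannot convey.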

Accuracy: Sensitivity – what percentage of known pairs occur in the predicted structure. Positive Predictive Value (PPV) – what percentage of predicted pairs occur in the known structure. PPV ≤ Sensitivity because the structures determined by comparative sequence analysis do not have all pairs and there is a tendency to overpredict base pairs by free energy minimization.
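The two accuracy measures can be sketched as set operations on base pairs. This version assumes exact matching of (i, j) pairs; published benchmarks may score pairs more leniently, so treat the function name and matching rule as illustrative assumptions.

```python
def sensitivity_ppv(known_pairs, predicted_pairs):
    """Return (sensitivity, PPV) as fractions, using exact pair matching."""
    known, predicted = set(known_pairs), set(predicted_pairs)
    correct = len(known & predicted)
    sensitivity = correct / len(known) if known else 0.0
    ppv = correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv

# Hypothetical example: 4 known pairs, 3 predicted pairs, 2 in common.
known = {(1, 10), (2, 9), (3, 8), (4, 7)}
predicted = {(1, 10), (2, 9), (5, 12)}
sens, ppv = sensitivity_ppv(known, predicted)
print(f"sensitivity = {sens:.2f}, PPV = {ppv:.2f}")
```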

Applying P i,j to Structure Prediction: [Figure: sensitivity and positive predictive value (PPV), with PPV shown for predicted pairs at P BP ≥ 99%, P BP ≥ 95%, P BP ≥ 90%, P BP ≥ 70%, and P BP > 50%.] Mathews. RNA. 10: (2004).

Percent of Predicted BP above Threshold: [Figure: percentage of predicted base pairs at P BP ≥ 99%, P BP ≥ 95%, P BP ≥ 90%, P BP ≥ 70%, and P BP > 50%.] Mathews. RNA. 10: (2004).

Color Annotation: E. coli 5S rRNA

Structures Constructed from Highly Probable Pairs: [Figure panels: P BP ≥ 99%, P BP ≥ 90%, P BP ≥ 70%, P BP > 50%.]
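Building a structure from highly probable pairs amounts to filtering the pair probabilities by a threshold. The sketch below uses hypothetical probabilities and helper names. One property worth noting: because the pairing probabilities of any single nucleotide sum to at most 1, pairs with probability above 50% can never share a nucleotide, so they always form a valid set of pairs.

```python
def pairs_above_threshold(pair_probs, threshold, strict=False):
    """Return sorted pairs with probability >= threshold (or > threshold if strict)."""
    if strict:
        keep = [pair for pair, prob in pair_probs.items() if prob > threshold]
    else:
        keep = [pair for pair, prob in pair_probs.items() if prob >= threshold]
    return sorted(keep)

def is_valid_pairing(pairs):
    """True if no nucleotide takes part in more than one pair."""
    used = [nucleotide for pair in pairs for nucleotide in pair]
    return len(used) == len(set(used))

# Hypothetical pair probabilities for illustration only.
probs = {
    (1, 20): 0.995, (2, 19): 0.97, (3, 18): 0.91,
    (5, 15): 0.62, (5, 12): 0.30, (3, 12): 0.06,
}
for cutoff in (0.99, 0.90, 0.70):
    print(cutoff, pairs_above_threshold(probs, cutoff))
majority = pairs_above_threshold(probs, 0.50, strict=True)
print("> 0.50", majority, "valid:", is_valid_pairing(majority))
```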

“Maximizing Expected Accuracy:”

CONTRAfold: a “statistical learning method” to predict P i,j. Generate structures: [equation not preserved in the transcript]. Bioinformatics. 22: e90-e98. (2006).

Implement Maximum Expectation: Zhi John Lu, Jason Gloor, David Mathews. Implement a dynamic programming algorithm using the partition function prediction of P i,j. Also implement suboptimal structure prediction.
–Alternative hypotheses.
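A dynamic programming algorithm of this kind can be sketched with a Nussinov-style recursion over pair probabilities. The scoring used here is an assumption: each base pair contributes 2·γ·P(i,j) and each unpaired nucleotide contributes its probability of being unpaired, with γ weighting pairs against unpaired nucleotides. The function name, the `min_loop` constraint, and the example probabilities are hypothetical; this is not the authors' implementation, and it omits suboptimal structure generation.

```python
def max_expected_accuracy(n, pair_probs, gamma=1.0, min_loop=3):
    """Find non-crossing pairs maximizing the assumed expected-accuracy score.

    pair_probs: {(i, j): probability} with 0-based i < j.
    Returns (score, sorted list of pairs).
    """
    if n == 0:
        return 0.0, []
    # Probability that each nucleotide is unpaired.
    p_unpaired = [1.0] * n
    for (i, j), prob in pair_probs.items():
        p_unpaired[i] -= prob
        p_unpaired[j] -= prob

    best = [[0.0] * n for _ in range(n)]      # best[i][j]: score of i..j inclusive
    choice = [[None] * n for _ in range(n)]   # how best[i][j] was obtained
    for i in range(n):
        best[i][i] = p_unpaired[i]

    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            score, move = float("-inf"), None
            if j - i > min_loop:               # i pairs with j
                inner = best[i + 1][j - 1] if i + 1 <= j - 1 else 0.0
                score = inner + 2.0 * gamma * pair_probs.get((i, j), 0.0)
                move = ("pair",)
            for k in range(i, j):              # split into i..k and k+1..j
                candidate = best[i][k] + best[k + 1][j]
                if candidate > score:
                    score, move = candidate, ("split", k)
            best[i][j], choice[i][j] = score, move

    # Traceback to recover the chosen pairs.
    pairs, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        move = choice[i][j]
        if move[0] == "pair":
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            stack.append((i, move[1]))
            stack.append((move[1] + 1, j))
    return best[0][n - 1], sorted(pairs)

# Hypothetical probabilities for a 10-nucleotide sequence.
example = {(0, 9): 0.9, (1, 8): 0.8, (2, 7): 0.6, (3, 9): 0.05}
score_high, pairs_high = max_expected_accuracy(10, example, gamma=1.0)
score_low, pairs_low = max_expected_accuracy(10, example, gamma=0.3)
print(pairs_high, pairs_low)
```

In this sketch, lowering γ drops the least probable pair from the prediction, which is the trade-off between sensitivity and PPV that the weighting is meant to control.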

Sensitivity and PPV vs. γ:

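Comparison:

| Type of RNA | MaxExpect Sensitivity (%) | MaxExpect PPV (%) | Free Energy Minimization Sensitivity (%) | Free Energy Minimization PPV (%) | CONTRAfold a Sensitivity (%) | CONTRAfold a PPV (%) |
|---|---|---|---|---|---|---|
| SSU rRNA b | 62.1±23.1 (56.0±14.9) | 58.0±25.0 (52.0±14.2) | 61.4±23.7 (45.5±15.2) | 54.8±25.3 (38.3±14.9) | 60.2±23.3 (46.2±14.5) | 45.7±21.6 (34.0±13.0) |
| LSU rRNA b | 74.6±11.9 (46.9±14.1) | 68.4±11.6 (43.3±14.9) | 72.4±17.2 (55.1±11.5) | 65.0±16.3 (47.1±12.2) | 71.7±17.9 (57.4±14.9) | 55.2±14.8 (43.3±11.5) |
| 5S rRNA | 72.5±… | …±… | …±… | …±… | …±… | …±23.1 |
| Group I intron | 71.2±… | …±… | …±… | …±… | …±… | …±11.6 |
| Group II intron | 87.0±… | …±… | …±… | …±… | …±… | …±20.4 |
| RNase P | 63.3±… | …±… | …±… | …±… | …±… | …±14.1 |
| SRP | 65.9±… | …±… | …±… | …±… | …±… | …±19.2 |
| tRNA | 85.8±… | …±… | …±… | …±… | …±… | …±18.3 |
| Average c | 72.8±… | …±… | …±… | …±… | …±… | …±7.2 |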

Summary: Maximizing expected accuracy can predict structures with greater sensitivity and positive predictive value than free energy minimization. Maximizing expected accuracy using an underlying thermodynamic model is more accurate than an underlying statistical model.

Methanococcus thermolithotrophicus 5S rRNA (Szymanski et al., 1998): MaxExpect Predicted Structure:

Minimum Free Energy Structure: CONTRAfold Predicted Structure: