Zhi John Lu, Jason Gloor, and David H. Mathews University of Rochester Medical Center, Rochester, New York Improved RNA Secondary Structure Prediction.

1 Zhi John Lu, Jason Gloor, and David H. Mathews University of Rochester Medical Center, Rochester, New York Improved RNA Secondary Structure Prediction by Maximizing Expected Pair Accuracy

2 AAUUGCGGGAAAGGGGUCAA CAGCCGUUCAGUACCAAGUC UCAGGGGAAACUUUGAGAUG GCCUUGCAAAGGGUAUGGUA AUAAGCUGACGGACAUGGUC CUAACCACGCAGCCAAGUCC UAAGUCAACAGAUCUUCUGU UGAUAUGGAUGCAGUUCA RNA Secondary and Tertiary Structure: Cate, et al. (Cech & Doudna). (1996) Science 273:1678. Waring & Davies. (1984) Gene 28: 277.

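3 Gibbs Free Energy Change:
$K_i = \frac{[\text{Conformation } i]}{[\text{Unfolded}]} = e^{-\Delta G^\circ_i / RT}$
$\frac{K_i}{K_j} = \frac{[\text{Conformation } i]}{[\text{Conformation } j]} = e^{-(\Delta G^\circ_i - \Delta G^\circ_j) / RT}$
The structure with the lowest ΔG° is the most favored at a given temperature.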
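
As an illustration of how a free energy difference translates into relative populations at equilibrium, here is a minimal Python sketch of the standard relation K = exp(-ΔG°/RT). The function names, the default temperature of 310.15 K (37 °C), and the example energies are assumptions made for this illustration, not values from the slides.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_constant(delta_g, temperature=310.15):
    """K = exp(-dG/RT) for folding from the unfolded state (dG in kcal/mol)."""
    return math.exp(-delta_g / (R_KCAL * temperature))


def population_ratio(delta_g_i, delta_g_j, temperature=310.15):
    """Ratio [conformation i]/[conformation j] = exp(-(dG_i - dG_j)/RT)."""
    return math.exp(-(delta_g_i - delta_g_j) / (R_KCAL * temperature))


# Hypothetical example: structure i is 1 kcal/mol more stable than structure j.
ratio = population_ratio(-11.0, -10.0)
```

With these assumed inputs, a 1 kcal/mol advantage at 37 °C corresponds to roughly a five-fold higher population, which is why small errors in the thermodynamic parameters can change which structure is predicted to be lowest in free energy.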

4 Nearest Neighbor Model for Free Energy Change of a Sample Hairpin Loop: Mathews et al., J. Mol. Biol., 1999, 288: 911. Mathews et al., PNAS, 2004, 101: 7287.

5 RNA Secondary Structure Prediction Accuracy: Percentage of Known Base Pairs Correctly Predicted: Mathews, Disney, Childs, Schroeder, Zuker, & Turner. 2004. PNAS 101: 7287.

6 Limitations to Prediction of the Minimum Free Energy Structure:
A minimum free energy structure provides the single best guess for the secondary structure.
Assumes that:
–RNA is at equilibrium
–RNA has a single conformation
–RNA thermodynamic parameters are without error
  Non-nearest neighbor effects
  Some sequence-specific stabilities are averaged

7 A Method that Looks at the Probability of a Structure could be more Informative: A partition function can be used to determine the probability of a structure at equilibrium.

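8 The Partition Function, Q:
$Q = \sum_{s} e^{-\Delta G^\circ_s / RT}$, where the sum runs over all possible secondary structures s.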

9 So, what is Q good for?
$P_{i,j} = \frac{1}{Q} \sum_{k} e^{-\Delta G^\circ_k / RT}$, where the sum over k runs over all structures that contain the i-j base pair.
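
To make the role of Q concrete, here is a minimal sketch that computes a base pair probability by brute force over a tiny, explicitly enumerated ensemble. This is only an illustration: real partition function calculations use dynamic programming over all structures rather than enumeration, and the three structures and their free energies below (`toy_ensemble`) are hypothetical.

```python
import math

RT = 1.987204e-3 * 310.15  # kcal/mol at 37 C


def partition_function(energies):
    """Q: sum of Boltzmann weights exp(-dG/RT) over all structures."""
    return sum(math.exp(-g / RT) for g in energies)


def pair_probability(structures, pair):
    """P(i,j): weight of structures containing `pair`, divided by Q.

    `structures` is a list of (set_of_pairs, delta_g) tuples.
    """
    q = partition_function(g for _, g in structures)
    with_pair = sum(math.exp(-g / RT) for pairs, g in structures if pair in pairs)
    return with_pair / q


# Hypothetical ensemble: unfolded, a two-pair helix, and a single pair.
toy_ensemble = [
    (frozenset(), 0.0),
    (frozenset({(1, 10), (2, 9)}), -2.0),
    (frozenset({(1, 10)}), -0.5),
]

p_1_10 = pair_probability(toy_ensemble, (1, 10))
p_2_9 = pair_probability(toy_ensemble, (2, 9))
```

In this toy ensemble the pair (1, 10) appears in two of the three structures, so its probability is higher than that of (2, 9), which appears in only one.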

10 Accuracy:
Sensitivity – the percentage of known pairs that occur in the predicted structure.
Positive Predictive Value (PPV) – the percentage of predicted pairs that occur in the known structure.
PPV ≤ Sensitivity because structures determined by comparative sequence analysis do not contain all pairs, and free energy minimization tends to overpredict base pairs.
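
The two accuracy measures can be sketched in a few lines of Python. This version counts exact pair matches only, which is an assumption of the sketch; published benchmarks may use a more lenient matching rule. The function name and the example pairs are hypothetical.

```python
def sensitivity_ppv(known_pairs, predicted_pairs):
    """Return (sensitivity, PPV) as fractions, using exact pair matches."""
    known = set(known_pairs)
    predicted = set(predicted_pairs)
    correct = len(known & predicted)
    sensitivity = correct / len(known) if known else 0.0
    ppv = correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Hypothetical example: 3 known pairs, 4 predicted pairs, 2 in common.
known = [(1, 10), (2, 9), (3, 8)]
predicted = [(1, 10), (2, 9), (4, 7), (5, 6)]
sens, ppv = sensitivity_ppv(known, predicted)
```

Here two of three known pairs are recovered and two of four predicted pairs are correct, so the extra predicted pairs lower PPV without affecting sensitivity.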

11 Applying P_{i,j} to Structure Prediction: [Figure: sensitivity and positive predictive value (PPV) for predicted pairs with P_BP ≥ 99%, ≥ 95%, ≥ 90%, ≥ 70%, and > 50%.] Mathews. RNA. 10: 1178. (2004).

12 Percent of Predicted BP above Threshold: [Figure: PPV for predicted pairs with P_BP ≥ 99%, ≥ 95%, ≥ 90%, ≥ 70%, and > 50%.] Mathews. RNA. 10: 1178. (2004).

13 Color Annotation: E. coli 5S rRNA

14 Structures Constructed from Highly Probable Pairs: P_BP ≥ 99%, P_BP ≥ 90%, P_BP ≥ 70%, P_BP > 50%
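
Building a structure from highly probable pairs amounts to filtering the predicted pair probabilities by a threshold. A minimal sketch, assuming the probabilities are already available as a dictionary; the names `pairs_above` and `example` and the numbers are hypothetical:

```python
def pairs_above(pair_probs, threshold, strict=False):
    """Return the sorted pairs whose probability is >= threshold (or > if strict)."""
    if strict:
        return sorted(p for p, prob in pair_probs.items() if prob > threshold)
    return sorted(p for p, prob in pair_probs.items() if prob >= threshold)


# Hypothetical pair probabilities keyed by (i, j).
example = {(1, 20): 0.995, (2, 19): 0.93, (3, 18): 0.72, (5, 15): 0.5, (6, 14): 0.31}

very_probable = pairs_above(example, 0.99)
majority = pairs_above(example, 0.5, strict=True)
```

The strict comparison matters at the lowest threshold: because the pairing probabilities of any one nucleotide sum to at most 1, it can have at most one partner with probability above 50%, so pairs selected with P_BP > 50% never conflict with each other.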

15 “Maximizing Expected Accuracy:”

16 CONTRAfold: “Statistical learning method” to predict P_{i,j}. Generate structures: Where: Bioinformatics. 22: e90-e98. (2006).

17 Implement Maximum Expectation: Zhi John Lu, Jason Gloor, David Mathews
Implement dynamic programming algorithm using partition function prediction of P_{i,j}.
Also implement suboptimal structure prediction.
–Alternative hypotheses.
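
The dynamic programming step can be sketched as a Nussinov-style recursion over pair probabilities. This is an illustrative sketch, not the authors' implementation. It assumes the score being maximized is the sum of 2γ·P(i,j) over predicted pairs plus the sum of single-stranded probabilities over unpaired nucleotides, with the single-stranded probability taken as 1 minus the total pairing probability of that nucleotide. The function name, the `min_loop` parameter, and the example probabilities are hypothetical, and suboptimal structure prediction is not covered.

```python
import math


def max_expect(pair_probs, n, gamma=1.0, min_loop=3):
    """Return (pairs, score) maximizing the assumed expected-accuracy score.

    pair_probs: dict {(i, j): probability} with 0 <= i < j < n.
    """
    # Single-stranded probability: 1 minus total pairing probability.
    p_ss = [1.0] * n
    for (i, j), p in pair_probs.items():
        p_ss[i] -= p
        p_ss[j] -= p

    # W[i][j] is the best score for nucleotides i..j inclusive.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][i] = p_ss[i]

    def inner(i, j):
        return W[i][j] if i <= j else 0.0

    def pair_score(i, j):
        return inner(i + 1, j - 1) + 2.0 * gamma * pair_probs[(i, j)]

    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Splitting at k covers "i unpaired", "j unpaired", and branching.
            best = max(W[i][k] + W[k + 1][j] for k in range(i, j))
            if j - i > min_loop and (i, j) in pair_probs:
                best = max(best, pair_score(i, j))
            W[i][j] = best

    # Traceback: recover one optimal set of pairs.
    pairs, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if (j - i > min_loop and (i, j) in pair_probs
                and math.isclose(W[i][j], pair_score(i, j))):
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
            continue
        for k in range(i, j):
            if math.isclose(W[i][j], W[i][k] + W[k + 1][j]):
                stack.extend([(i, k), (k + 1, j)])
                break
    return sorted(pairs), W[0][n - 1]


# Hypothetical probabilities for a 12-nucleotide hairpin.
probs = {(0, 11): 0.9, (1, 10): 0.95, (2, 9): 0.9, (3, 8): 0.2, (0, 5): 0.05}
structure, score = max_expect(probs, 12, gamma=1.0)
```

In this sketch γ weights pairs against unpaired nucleotides: with the example above, γ = 1 keeps the three highly probable pairs and leaves the low-probability pair (3, 8) unpaired, a larger γ admits it, and a very small γ predicts no pairs at all.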

18 Sensitivity and PPV vs. γ:

19 Comparison:

Type of RNA     | MaxExpect Sensitivity (%) | MaxExpect PPV (%)     | Free Energy Minimization Sensitivity (%) | Free Energy Minimization PPV (%) | CONTRAfold (a) Sensitivity (%) | CONTRAfold (a) PPV (%)
SSU rRNA (b)    | 62.1±23.1 (56.0±14.9)     | 58.0±25.0 (52.0±14.2) | 61.4±23.7 (45.5±15.2)                    | 54.8±25.3 (38.3±14.9)            | 60.2±23.3 (46.2±14.5)          | 45.7±21.6 (34.0±13.0)
LSU rRNA (b)    | 74.6±11.9 (46.9±14.1)     | 68.4±11.6 (43.3±14.9) | 72.4±17.2 (55.1±11.5)                    | 65.0±16.3 (47.1±12.2)            | 71.7±17.9 (57.4±14.9)          | 55.2±14.8 (43.3±11.5)
5S rRNA         | 72.5±26.4                 | 65.3±23.6             | 72.9±26.6                                | 63.9±23.8                        | 65.0±29.1                      | 49.5±23.1
Group I intron  | 71.2±13.9                 | 68.0±15.1             | 70.2±13.6                                | 63.4±13.5                        | 75.3±13.6                      | 57.6±11.6
Group II intron | 87.0±5.0                  | 84.9±9.4              | 88.1±2.2                                 | 82.7±6.8                         | 78.6±0.8                       | 53.0±20.4
RNase P         | 63.3±15.2                 | 62.7±15.3             | 64.4±12.8                                | 61.8±12.0                        | 63.1±16.5                      | 52.3±14.1
SRP             | 65.9±26.3                 | 51.3±22.1             | 68.9±25.4                                | 52.9±22.2                        | 67.5±23.7                      | 46.1±19.2
tRNA            | 85.8±17.9                 | 84.9±19.6             | 85.6±19.6                                | 83.6±22.2                        | 86.7±19.9                      | 68.1±18.3
Average (c)     | 72.8±9.5                  | 67.9±11.9             | 73.0±9.4                                 | 66.0±11.4                        | 71.0±8.9                       | 53.4±7.2

20 Summary: Maximizing expected accuracy can predict structures with greater sensitivity and positive predictive value than free energy minimization. Maximizing expected accuracy using an underlying thermodynamic model is more accurate than an underlying statistical model.

21 Methanococcus thermolithotrophicus 5S rRNA (Szymanski et al., 1998): MaxExpect Predicted Structure:

22 Minimum Free Energy Structure: CONTRAfold Predicted Structure:

