Published by Inge Hartanto. Modified over 6 years ago.
1
RNA Helical Structures Prediction and Comparison
Liron Barzilay and Tom Susel
Project Advisor: Prof. Ruth Nussinov
Faculty of Life Sciences, Computer Science Department, Bar-Ilan University
Nussinov-Wolfson Structural Bioinformatics Group, Faculty of Exact Sciences, Tel Aviv University
2
Background: ncRNAs
DNA -> mRNA -> Protein: 3%, ~30K
DNA -> ncRNA -> ?: 97%, >60K
3
Bacteriophage MS2 Capsid
Hepatitis delta virus ribozyme
E. coli initiator tRNA
receptor 3 ectodomain complex with double-stranded RNA
Haloarcula marismortui 50S ribosomal subunit
4
Structure Importance 1: Function
5
Structure Importance 2: Drugs
Antibiotic targeting:
RNA Helixes: shallow groove, deep groove
Mixture of RNA Helixes and phosphate groups
6
Background: Structuring Main Idea
2D 3D ?
7
Background: Reduction to Helixes
Why Helixes?
>50% of all ncRNA nucleotides
Most conserved (even more than loops)
Usually small (<15 BP, commonly 2 BP)
1D ≈ 2D
8
Idea
RCSB PDB: ~800 ncRNAs, ~14500 Helixes.
~ BP Helixes (Window 2).
9
Analysis: Complexity
Modified Residues (Nucleosides):
mRNAs: 4-letter language.
ncRNAs: ~110-letter language.
Intra-Chain Direction Reverse:
mRNAs: standard 3'-5'/5'-3'.
ncRNAs: ~3% reversed 3'-5'/3'-5'.
10
Analysis: Complexity
Broken Linkage:
mRNAs: complete phosphate skeleton.
ncRNAs: ~80% have one or more phosphates missing.
Non-WC BPs:
mRNAs: non-WC BPs are rare.
ncRNAs: ~40% non-WC BPs.
11
Analysis 1: Find Helixes
Pipeline diagram: manual selection of ncRNA, find_pairs, BP Manager (only valid Helixes), Initial DB.
Representations shown: 3D, 2D, 1D.
Initial DB fields: Strand A, Strand B, Positions A, Positions B.
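The helix-finding step can be illustrated with a minimal sketch. It is not the BP Manager code from the slides. It assumes a single chain, base pairs given as (i, j) residue-index tuples with i < j (for example, parsed from a pairing tool's output), and a helix defined as a run of stacked antiparallel pairs (i, j), (i+1, j-1), and so on. The function names `find_helices` and `windows` are hypothetical. Reversed-direction and multi-chain cases, which the slides report for ncRNAs, are not handled here.

```python
def find_helices(pairs, min_len=2):
    """Group base pairs (i, j), i < j, into runs of stacked pairs.

    A run continues while (i + 1, j - 1) is also paired.
    Runs shorter than min_len are dropped.
    """
    pair_set = set(pairs)
    helices = []
    for i, j in sorted(pair_set):
        # Skip pairs that are in the middle of a run: their outer
        # neighbour (i - 1, j + 1) is paired, so the run started earlier.
        if (i - 1, j + 1) in pair_set:
            continue
        run = [(i, j)]
        while (run[-1][0] + 1, run[-1][1] - 1) in pair_set:
            run.append((run[-1][0] + 1, run[-1][1] - 1))
        if len(run) >= min_len:
            helices.append(run)
    return helices


def windows(helix, size):
    """Return every contiguous sub-helix of `size` base pairs."""
    return [helix[k:k + size] for k in range(len(helix) - size + 1)]


if __name__ == "__main__":
    example_pairs = [(1, 20), (2, 19), (3, 18), (7, 12), (30, 40)]
    for helix in find_helices(example_pairs):
        print("helix:", helix)
        print("window-2 fragments:", windows(helix, 2))
```

Under these assumptions, a 3 BP helix yields two window-2 fragments, which is one way the number of fixed-size fragments can be derived from the number of helixes.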
12
Analysis 2: Clustering
Input: Strand A, Strand B, Positions A, Positions B
Distance Matrix Engine: all-against-all RMSD
Distance matrices:
Window 2: 9240 x 9240
Window 3: 6848 x 6848
Window 6: 2273 x 2273
Clustering: CLUTO
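An all-against-all distance matrix can be sketched as follows. The slides do not say which RMSD variant the Distance Matrix Engine uses, so this example swaps in distance RMSD (dRMSD), which compares internal atom-atom distances and therefore needs no superposition step. The function names and the input layout (each fragment as a list of (x, y, z) tuples, same atom count and order) are assumptions for illustration only. Unlike a superposition-based RMSD, dRMSD cannot tell a fragment from its mirror image.

```python
import math
from itertools import combinations


def internal_distances(coords):
    """All pairwise atom-atom distances within one fragment."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(coords_a, coords_b):
    """Distance RMSD between two fragments with matching atom order."""
    if len(coords_a) != len(coords_b):
        raise ValueError("fragments must have the same number of atoms")
    if len(coords_a) < 2:
        raise ValueError("need at least two atoms per fragment")
    dist_a = internal_distances(coords_a)
    dist_b = internal_distances(coords_b)
    squared = sum((x - y) ** 2 for x, y in zip(dist_a, dist_b))
    return math.sqrt(squared / len(dist_a))


def distance_matrix(fragments):
    """Symmetric all-against-all dRMSD matrix (list of lists)."""
    n = len(fragments)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Compute each pair once and mirror it across the diagonal.
            matrix[i][j] = matrix[j][i] = drmsd(fragments[i], fragments[j])
    return matrix
```

Because the matrix is symmetric with a zero diagonal, only the upper triangle has to be computed, which roughly halves the work for matrices of the sizes listed above.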
13
Analyzing Results
Input: Strand A, Strand B, Clusters DB
Finding the best-fitted cluster: Levenshtein distance (edit distance)
Finding the representing helix: inner-cluster RMSD (for closest members)
Output: Helix and cluster information
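The cluster lookup can be sketched with a standard Levenshtein implementation. The slides name the metric but not how the two strands are combined or how the Clusters DB is laid out, so the scoring below (sum of the two per-strand edit distances, minimum over cluster members) and the dictionary layout are assumptions for illustration. `best_cluster` is a hypothetical name.

```python
def levenshtein(s, t):
    """Edit distance between sequences s and t (insert, delete, substitute)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_cluster(strand_a, strand_b, clusters):
    """Return (cluster_id, score) for the closest cluster.

    `clusters` maps a cluster id to a non-empty list of
    (strand_a, strand_b) member sequences (assumed layout).
    The score is the smallest summed edit distance to any member.
    """
    best_id, best_score = None, None
    for cluster_id, members in clusters.items():
        score = min(levenshtein(strand_a, a) + levenshtein(strand_b, b)
                    for a, b in members)
        if best_score is None or score < best_score:
            best_id, best_score = cluster_id, score
    return best_id, best_score
```

Sequences are treated as generic Python sequences, so modified residues can be passed as lists of residue names instead of single-letter strings.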
14
DEMO!
15
Some Statistics
Code:
C++: ~2200 lines.
Perl: ~1200 lines.
HTML/JavaScript: ~200 lines.
Memory and multi-processing optimized.
Includes wrappers for 3 external tools.
Running Time:
Started with 1.2 GB of ncRNA PDBs.
BP Manager: 2 hours.
MB of Helix-Only PDBs.
DM Engine: 8 h x 5 = 40 hours.
Perl scripts: 30 min x 5 = 2.5 hours.
305 MB of small PDBs.
GB of matrices.
Requirements:
The C++ code targets Ubuntu (multi-processing).
Perl is OS-independent.
The DM Engine and its associated Perl scripts are very demanding: for window 2 we needed a quad-core 3.0 GHz machine with 4 GB of memory to complete the run before memory overload.
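A rough estimate of the distance-matrix sizes shows why memory matters here. The matrix dimensions come from the clustering slide. The 8-byte entry size is an assumption, since the slides do not state how the matrices are stored. `matrix_bytes` is a hypothetical helper.

```python
def matrix_bytes(n, bytes_per_entry=8, symmetric=False):
    """Bytes needed for an n x n matrix, or only its strict upper triangle."""
    entries = n * (n - 1) // 2 if symmetric else n * n
    return entries * bytes_per_entry


if __name__ == "__main__":
    # Matrix dimensions per window size, as listed on the clustering slide.
    sizes = {2: 9240, 3: 6848, 6: 2273}
    for window, n in sizes.items():
        full = matrix_bytes(n) / 2**20
        half = matrix_bytes(n, symmetric=True) / 2**20
        print(f"window {window}: full {full:.0f} MiB, upper triangle {half:.0f} MiB")
```

These figures cover only the stored matrix. Coordinates, intermediate buffers and text output add to the footprint.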