Published by Erika Blake. Modified over 6 years ago.
1
http://creativecommons.org/licenses/by-sa/2.0/ Mirela Andronescu
February 22, 2005 Lab 8.3 (c) 2005 CGDN
2
RNA LAB Mirela Andronescu (UBC)
3
Lab 8.3: RNA – Outline
RNA Databases
Secondary structure prediction
Structure visualization (tertiary)
Prediction of consensus secondary structure
Searching homologues in genomes
4
Setup
You’ll do 7 activities, each with tasks and questions.
Some of the activities are in a browser, some at the command prompt.
Open a terminal.
Check if you have the directory rnalab.
If you don’t, download it from the course web page.
5
Activity 1
Downloading RNA secondary structures from the Gutell database
What is the Gutell database?
The Gutell online database contains homologous RNA sequences (mostly ribosomal RNA).
Some have associated secondary structures, very accurately determined through comparative sequence analysis.
6
A1: Gutell DB
Open a browser and go to the Gutell DB site.
Follow the Gutell link on the course web page, or type the address directly.
Log in with cbw2005/rnalab.
8
A1: Gutell DB
At the top, click on tab 3. Sequence and structure data.
Then, on the left, click on A. INDEX of Available RNA Sequences and Structures.
Click on the 24 representing the 5S rRNA structures of Bacteria.
10
A1: Gutell DB
View a PS or PDF file:
    Select PS or PDF at the top.
    Click on the first link in the column StrDiags.
13
A1: Gutell DB
Save a bpseq file:
    Select bpseq at the top.
    Click on the first link in the column StrDiags.
    Save the file into the directory rnalab.
16
Activity 1 question
What is the length of the structure you’ve just viewed and downloaded? (Answer without counting.)
Tip: the last line of the bpseq file will give you the answer.
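The tip can be turned into a one-line command. The sketch below is only an illustration: it assumes the usual bpseq layout of three whitespace-separated columns per nucleotide (position, base, pairing partner), possibly preceded by a few header lines. The helper name bpseq_length and the demo file are made up for this example and are not part of the lab files.

```shell
# Hypothetical helper: report the length of the structure in a bpseq file.
# Assumes data lines look like "position base partner"; other lines are ignored.
bpseq_length() {
    awk 'NF == 3 && $1 ~ /^[0-9]+$/ { last = $1 } END { print last }' "$1"
}

# Tiny made-up bpseq file (not a real Gutell entry) to try the helper on.
demo_bpseq=$(mktemp)
cat > "$demo_bpseq" <<'EOF'
Filename: demo.bpseq
Organism: made-up example
1 G 5
2 A 0
3 A 0
4 A 0
5 C 1
EOF

bpseq_length "$demo_bpseq"
```

On the file downloaded in this activity, the same call would be made with that file's name instead of the demo file.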
17
Activity 2
Predicting RNA secondary structures using RNAfold from the Vienna RNA package
What is RNAfold?
RNAfold is a program that predicts RNA secondary structure from sequence, using a dynamic programming algorithm.
A very similar and popular program is mfold.
Accuracy is about 73% on average.
18
A2: RNAfold
Extract the sequence from the bpseq file you downloaded from Gutell DB.
You’ll use a simple Perl script, bpseq2seq.pl. It is in the rnalab directory, so you can read it; you could also write such a script yourself.
At the command prompt, type:
    ./bpseq2seq.pl d.5.b.A.tumefaciens.bpseq > gutell.txt
gutell.txt now contains an RNA sequence.
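To show how little such a script has to do, here is a rough shell equivalent. It is not the lab's bpseq2seq.pl, whose exact output format is not shown here. It assumes that every data line of a bpseq file has three columns (position, base, partner) and simply joins the second column into one line. The name bpseq_to_seq and the demo file are hypothetical.

```shell
# Hypothetical stand-in for a bpseq-to-sequence converter (not bpseq2seq.pl itself).
# Prints the bases from column 2 of every "position base partner" line as one string.
bpseq_to_seq() {
    awk 'NF == 3 && $1 ~ /^[0-9]+$/ { printf "%s", $2 } END { printf "\n" }' "$1"
}

# Made-up five-nucleotide example, with one header line to show it is skipped.
demo_seq_input=$(mktemp)
cat > "$demo_seq_input" <<'EOF'
Filename: demo.bpseq
1 G 5
2 A 0
3 A 0
4 A 0
5 C 1
EOF

bpseq_to_seq "$demo_seq_input"
```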
19
A2: RNAfold
Run the RNAfold software. At the command prompt, type:
    RNAfold < gutell.txt
The predicted structure is displayed in dot-parenthesis format, along with the predicted minimum free energy.
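If you want to pick the energy out of the output rather than read it by eye, a small filter is enough. This sketch assumes that the last line RNAfold prints is the dot-parenthesis string followed by the energy in parentheses. The function name rnafold_mfe is hypothetical, and the two sample lines are made up to mimic that layout; they are not a real prediction.

```shell
# Hypothetical filter: read RNAfold-style output on stdin, print the energy value.
# Assumes the last line looks like "<structure> (<energy>)".
rnafold_mfe() {
    awk '{ line = $0 }
         END {
             sub(/^[^ ]+ +\(/, "", line)   # drop the structure and the opening parenthesis
             sub(/\).*$/, "", line)        # drop the closing parenthesis
             gsub(/ /, "", line)           # drop padding spaces around the number
             print line
         }'
}

# Made-up sample in the assumed layout (sequence line, then structure and energy).
printf '%s\n' 'GGGAAACCC' '(((...))) ( -1.20)' | rnafold_mfe
```

With RNAfold installed, the same filter could be fed directly: RNAfold < gutell.txt | rnafold_mfe.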
20
Activity 2 questions
What is the predicted minimum free energy?
Visualize the predicted structure. At the command prompt, type:
    gv rna.ps &
How many multi-loops are there in this structure?
Tip: the following structure has one multi-loop.
21
Multi-loop with three branches (figure)
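In dot-parenthesis notation, a multi-loop appears as a base pair that directly encloses two or more further helices (so the loop has at least three branches, counting the closing pair). The sketch below counts such pairs. It is an illustration under that definition, assumes a balanced string without pseudoknots, and uses a hypothetical function name and a made-up example structure.

```shell
# Hypothetical helper: count multi-loops in a balanced dot-parenthesis string.
count_multiloops() {
    printf '%s\n' "$1" | awk '{
        n = length($0); top = 0; count = 0
        # First pass: match every "(" with its ")" using a stack.
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "(") { stack[++top] = i }
            else if (c == ")") { j = stack[top--]; pair[j] = i }
        }
        # Second pass: for each pair, count the helices directly inside it.
        for (i = 1; i <= n; i++) {
            if (!(i in pair)) continue
            branches = 0
            p = i + 1
            while (p < pair[i]) {
                if (p in pair) { branches++; p = pair[p] + 1 }   # skip the whole enclosed helix
                else p++
            }
            if (branches >= 2) count++
        }
        print count
    }'
}

# Made-up structure: one closing stem enclosing two hairpins.
count_multiloops '((..((...))..((...))..))'
# A plain hairpin has no multi-loop.
count_multiloops '(((...)))'
```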
22
Activity 3
Downloading a tertiary RNA structure from PDB (Protein Data Bank)
What is PDB?
PDB is a database containing tertiary structures of proteins and RNAs, determined by NMR or X-ray crystallography.
23
A3: PDB
Open a browser and go to the PDB database.
Follow the link on the course web page, or type the address directly.
Type 1C2X in the search box.
25
A3: PDB
Click Download/Display File.
Click the TEXT link for the complete PDB file.
Click Save full entry to disk and save.
29
Activity 3 questions
Search for pseudoknotted RNA structures in PDB.
Tip: type RNA pseudoknot in the search box on the first page of PDB.
Save to disk one of the PDB files in the list.
What is the PDB ID of the structure you chose?
30
Activity 4
Visualizing tertiary RNA structures using RasMol
What is RasMol?
RasMol is a visualization tool for tertiary structures (proteins or RNA).
You can see each atom.
You can rotate the figure with the mouse.
It takes a PDB file as input.
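Before opening a PDB file in RasMol, it can help to check how many atoms it describes. The sketch below relies on the PDB text format, where each atom of a standard residue is one line beginning with ATOM. The function name pdb_atom_count is hypothetical, and the demo file is a made-up, heavily simplified fragment rather than a real PDB entry.

```shell
# Hypothetical helper: count the ATOM records in a PDB-format text file.
pdb_atom_count() {
    grep -c '^ATOM' "$1"
}

# Made-up, simplified fragment in PDB-like layout (coordinates are invented).
demo_pdb=$(mktemp)
cat > "$demo_pdb" <<'EOF'
HEADER    MADE-UP EXAMPLE
ATOM      1  P     G A   1       0.000   0.000   0.000
ATOM      2  OP1   G A   1       1.000   0.000   0.000
END
EOF

pdb_atom_count "$demo_pdb"
```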
31
A4: RasMol display
32
A4: RasMol
At the command prompt, type:
    rasmol 1C2X.pdb &
View the structure.
Rotate the structure with the mouse.
33
Activity 4 questions
Can you see that the structure you are viewing is similar to the secondary structure on the next slide?
Visualize with RasMol the pseudoknotted structure you downloaded at the end of Activity 3.
34
[Figure: secondary structure for comparison with the RasMol view]
35
Activity 5
Predicting a consensus structure using Alidot
What is Alidot?
Alidot takes as input a set of homologous sequences and their minimum free energy secondary structures predicted with RNAfold.
It predicts the consensus structure.
36
A5: Alidot
This activity is performed at the command prompt, using files in your rnalab directory.
The file bact5s.seq contains 13 homologous input sequences representing 5S rRNA, all from the Gutell database.
You can read the information in this file.
37
A5: Alidot
First align these sequences. At the command prompt, type:
    clustalw bact5s.seq
This will create the file bact5s.aln.
Then fold all these sequences with RNAfold:
    RNAfold -p < bact5s.seq > bact5s.fold
(-p uses the partition function)
38
A5: Alidot
The output file bact5s.fold contains all the folded sequences, but Alidot wants them in separate files:
    split.pl bact5s.fold
(Use the lab’s split.pl here, not the standard split command!)
Finally, run Alidot, which uses the structure files just created and the alignment:
    alidot < bact5s.aln > alidot.out
Create a figure with the consensus structure:
    cons.sh
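The Activity 5 commands can also be collected into one small script that stops at the first failing step. This is a sketch, not part of the lab material: the function names missing_tools and run_alidot_lab are hypothetical, and it assumes that clustalw, RNAfold, split.pl, alidot and cons.sh can all be called by name, as they are typed on the slides.

```shell
# Hypothetical helper: print the names of programs that are not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical wrapper around the Activity 5 commands, in the order given on the slides.
# Run it from inside the rnalab directory.
run_alidot_lab() {
    missing=$(missing_tools clustalw RNAfold split.pl alidot cons.sh)
    if [ -n "$missing" ]; then
        echo "not found on PATH: $missing" >&2
        return 1
    fi
    clustalw bact5s.seq || return 1                    # writes bact5s.aln
    RNAfold -p < bact5s.seq > bact5s.fold || return 1  # folds every sequence
    split.pl bact5s.fold || return 1                   # one file per folded sequence
    alidot < bact5s.aln > alidot.out || return 1       # consensus prediction
    cons.sh                                            # draws the consensus figure
}

# Quick check of the helper with a program every system has.
missing_tools sh
```

The block only defines the wrapper; calling run_alidot_lab performs the actual run.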
39
Activity 5 questions
Visualize the consensus structure:
    gv bact5s.ps &
How many conserved stems are there?
Try to find the corresponding stems in the structure from Gutell DB that we viewed in Activity 1 (see next slide).
40
[Figure: 5S rRNA structure from Gutell DB, as viewed in Activity 1]
41
Activity 6
Searching for homologous sequences in a genome fragment, using Infernal
What is Infernal?
Infernal searches a genome for sequences that are homologous to the family described by a covariance model.
Why? To look for conserved/functional elements.
Infernal also creates structure-based multiple sequence alignments.
42
A6: Infernal diagram
alignment with secondary structure annotation -> cmbuild -> covariance model
covariance model + genome or DNA sequence -> cmsearch -> hits
covariance model + genomes or DNA sequences -> cmalign -> structural alignment
43
A6: Infernal
This activity is performed at the command prompt, using files in your rnalab directory.
The file RNAI.sto contains 10 homologous RNAI sequences from different organisms (RNAI here is RNA I, the antisense RNA family browsed in Activity 7, not RNA interference).
They are aligned and have an associated consensus structure.
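To check how many sequences an alignment file such as RNAI.sto holds, you can count the distinct sequence names. The sketch below assumes Stockholm format: markup lines start with #, the alignment ends with //, and every other non-empty line is a name followed by aligned sequence. The function name sto_seq_count and the demo alignment are made up for illustration.

```shell
# Hypothetical helper: count distinct sequence names in a Stockholm alignment.
sto_seq_count() {
    awk '/^#/ || /^\/\// || NF < 2 { next }   # skip markup, terminator and blank lines
         !seen[$1]++ { n++ }                   # count each sequence name once
         END { print n + 0 }' "$1"
}

# Made-up two-sequence alignment with a consensus structure line.
demo_sto=$(mktemp)
cat > "$demo_sto" <<'EOF'
# STOCKHOLM 1.0
seq1         GGGAAACCC
seq2         GGCAAAGCC
#=GC SS_cons (((...)))
//
EOF

sto_seq_count "$demo_sto"
```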
44
A6: Infernal
Given the file RNAI.sto, build a covariance model RNAI.cm:
    cmbuild RNAI.cm RNAI.sto
(RNAI.cm is the output, RNAI.sto is the input)
45
A6: Infernal
The file RNAI.db contains a genome fragment from another organism, which contains an RNAI gene.
Search for RNAI homologues in this fragment:
    cmsearch RNAI.cm RNAI.db
(RNAI.db is the input sequence file)
46
A6: Infernal
The file RNAI.fa contains several genome fragments from various organisms.
Create a structure-based multiple sequence alignment of these fragments:
    cmalign RNAI.cm RNAI.fa
(RNAI.fa is the input sequence file)
47
Activity 6 questions
How many hits did you find with cmsearch?
Look at the structure-based alignment returned by cmalign.
Compare it with the alignment obtained with clustalw (see next 2 slides).
Do the two alignments differ a lot?
48
[Figure: alignment obtained with clustalw]
49
Activity 7
Browsing RNAI in the Rfam database
What is Rfam?
Rfam is a database of families of homologous non-coding RNAs.
Rfam was built using Infernal: starting from a set of aligned sequences with known structure (called the “seed”), new homologous sequences were found and aligned.
50
A7: Rfam
Open a browser and go to the Rfam database.
Follow the Rfam link on the course web site, or type the address directly.
Click on the tab Browse Rfam.
Expand Gene, then antisense.
Click RNAI – RNAI.
53
A7: Rfam
This will open a page with information about RNAI, a consensus structure, and a table.
The column Alignment allows you to download the aligned sequences.
The file RNAI.sto that you used as input to Infernal was obtained from here.
The column Member sequences opens a list of sequences, with their EMBL accession numbers.
56
Activity 7 questions
Find the EMBL accession number of an RNAI sequence that was used in the seed.
How many sequences from the family U1 spliceosomal RNA does Rfam contain in total?
Tip (see the next two slides):
    Click Browse Rfam.
    Expand Gene, then snRNA, then splicing.
    Click on U1 spliceosomal RNA.
    Read the value near Full under Member sequences.
57
[Screenshot: Rfam pages for U1 spliceosomal RNA]
58
[Screenshot: Rfam pages for U1 spliceosomal RNA, continued]
59
Recap
Databases: Gutell, PDB, Rfam
Visualization (tertiary): RasMol
Secondary structure prediction: RNAfold
Consensus structure prediction: Alidot
Searching for homologous structures: Infernal