http://creativecommons.org/licenses/by-sa/2.0/ Mirela Andronescu February 22, 2005 Lab 8.3 (c) 2005 CGDN

RNA LAB Mirela Andronescu (UBC)

Lab 8.3: RNA – Outline
RNA databases
Secondary structure prediction
Structure visualization (tertiary)
Prediction of consensus secondary structure
Searching for homologues in genomes

Setup
You'll do 7 activities, each with tasks and questions.
Some of the activities are in a browser, some at the command prompt.
Open a terminal.
Check whether you have the directory rnalab.
If you don't, download it from the course web page.

Activity 1: Downloading RNA secondary structures from the Gutell database
What is the Gutell database?
The Gutell online database contains homologous RNA sequences (mostly ribosomal RNA).
Some have associated secondary structures, very accurately determined through comparative sequence analysis.

A1: Gutell DB
Open a browser and go to the Gutell DB site.
Follow the Gutell link on the course web page, or type http://www.rna.icmb.utexas.edu/
Log in with cbw2005/rnalab

A1: Gutell DB
At the top, click on tab 3, "Sequence and structure data".
Then, on the left, click on A, "INDEX of Available RNA Sequences and Structures".
Click on the 24 representing the 5S rRNA structures of Bacteria.

A1: Gutell DB: view a PS or PDF file
Select PS or PDF at the top.
Click on the first link in the column StrDiags.

A1: Gutell DB: save a bpseq file
Select bpseq at the top.
Click on the first link in the column StrDiags.
Save the file into the directory rnalab.

Activity 1 question
What is the length of the structure you've just viewed and downloaded? (Answer without counting.)
Tip: the last line of the bpseq file will give you the answer.
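
The tip can also be checked with a few lines of Python. This is a minimal sketch, assuming the bpseq layout is a few free-text header lines followed by one line per nucleotide of the form "position base partner", where partner 0 means unpaired. The function names and the sample data are hypothetical and are not part of the lab materials.

```python
def parse_bpseq(text):
    """Return (position, base, partner) tuples from bpseq text.

    Lines that are not of the form 'int base int' (for example
    header lines) are skipped.
    """
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        pos, base, partner = fields
        if pos.isdigit() and partner.isdigit():
            records.append((int(pos), base, int(partner)))
    return records


def bpseq_length(text):
    """Length of the structure: the position on the last data line."""
    records = parse_bpseq(text)
    return records[-1][0] if records else 0


# Made-up 8-nucleotide example, not a real Gutell entry.
SAMPLE = """\
Filename: example.bpseq
Organism: (made-up example)
1 G 8
2 C 7
3 A 0
4 A 0
5 A 0
6 A 0
7 G 2
8 C 1
"""

print(bpseq_length(SAMPLE))
```

For the downloaded file, the same function can be applied to the text returned by open("d.5.b.A.tumefaciens.bpseq").read().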

Activity 2: Predicting RNA secondary structures using RNAfold from the Vienna RNA package
What is RNAfold?
RNAfold is a program that predicts RNA secondary structure from sequence, using a dynamic programming algorithm.
A very similar and popular program is mfold.
Accuracy is about 73% on average.

A2: RNAfold
Extract the sequence from the bpseq file you downloaded from Gutell DB.
You'll use a simple Perl script, bpseq2seq.pl. You could write such a script yourself.
The script is in the rnalab directory, and you can read it.
At the command prompt, type:
    ./bpseq2seq.pl d.5.b.A.tumefaciens.bpseq > gutell.txt
The file gutell.txt now contains an RNA sequence.
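
The lab's bpseq2seq.pl is not reproduced in this transcript. As a rough stand-in, the sketch below does the same kind of extraction in Python, assuming each data line reads "position base partner" and that anything else is a header line. The name bpseq_to_sequence and the demo lines are hypothetical.

```python
def bpseq_to_sequence(lines):
    """Concatenate the base column of all bpseq data lines."""
    bases = []
    for line in lines:
        fields = line.split()
        # A data line has three fields: integer, base letter, integer.
        if len(fields) == 3 and fields[0].isdigit() and fields[2].isdigit():
            bases.append(fields[1])
    return "".join(bases)


# Made-up input; a real file would be read with open(path).
demo_lines = [
    "Filename: example.bpseq",
    "1 G 3",
    "2 A 0",
    "3 C 1",
]
print(bpseq_to_sequence(demo_lines))
```

Passing an open file handle instead of demo_lines works the same way, because the function only iterates over lines.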

A2: RNAfold
Run the RNAfold software. At the command prompt, type:
    RNAfold < gutell.txt
The predicted structure is displayed in dot-parenthesis format, together with the predicted minimum free energy.
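
To work with the result in a script rather than by eye, the output can be parsed. This is a minimal sketch, assuming the usual single-sequence layout: the sequence on one line, then the dot-parenthesis string followed by the energy (kcal/mol) in parentheses. The function name parse_rnafold is hypothetical, and the example text, including its energy value, is made up for illustration.

```python
import re

# Structure line: dot-parenthesis string, then the energy in parentheses.
_STRUCT_LINE = re.compile(r"^([.()]+)\s+\(\s*(-?\d+(?:\.\d+)?)\)\s*$")


def parse_rnafold(text):
    """Return (sequence, structure, energy) for one folded sequence."""
    sequence = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA-style name lines
        match = _STRUCT_LINE.match(line)
        if match:
            return sequence, match.group(1), float(match.group(2))
        sequence = line  # remember the most recent sequence line
    raise ValueError("no structure line found")


# Illustrative text only; the energy is not a real RNAfold result.
EXAMPLE = """\
GGGAAAUCCC
(((....))) ( -1.20)
"""

seq, struct, energy = parse_rnafold(EXAMPLE)
print(seq, struct, energy)
```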

Activity 2 questions
What is the predicted minimum free energy?
Visualize the predicted structure. At the command prompt, type:
    gv rna.ps &
How many multi-loops are there in this structure?
Tip: the following structure has one multi-loop.

Figure: multi-loop with three branches
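
Multi-loops can also be counted directly from a dot-parenthesis string. The sketch below assumes a common convention: a multi-loop is a loop closed by a base pair that directly encloses at least two further stems, so three or more branches meet when the closing stem is counted. The exterior (unenclosed) region is not counted. The function names and test strings are hypothetical.

```python
def pair_table(structure):
    """Map each paired position (0-based) to its partner."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            partner[i], partner[j] = j, i
    if stack:
        raise ValueError("unbalanced '('")
    return partner


def count_multiloops(structure):
    """Count base pairs that directly enclose two or more stems."""
    partner = pair_table(structure)
    count = 0
    for i, j in partner.items():
        if i > j:
            continue  # visit each pair once, from its opening side
        branches, k = 0, i + 1
        while k < j:
            if k in partner and partner[k] > k:
                branches += 1        # a stem opens directly inside (i, j)
                k = partner[k] + 1   # jump past that stem
            else:
                k += 1
        if branches >= 2:
            count += 1
    return count


print(count_multiloops("((..((...))..((...))..))"))
```

A plain hairpin such as "(((...)))" gives zero under this convention, as do two hairpins side by side with nothing enclosing them.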

Activity 3: Downloading a tertiary RNA structure from PDB (Protein Data Bank)
What is PDB?
PDB is a database containing tertiary structures of proteins and RNAs, determined by NMR or X-ray crystallography.

A3: PDB
Open a browser and go to the PDB database.
Follow the link on the course web page, or type http://www.rcsb.org/pdb/ as the address.
Type 1C2X in the search box.

A3: PDB
Click Download/Display File.
Click the TEXT link for the complete PDB file.
Click Save full entry to disk and save the file.

Activity 3 questions
Search for pseudoknotted RNA structures in PDB.
Tip: type RNA pseudoknot in the search box on the first page of PDB.
Save to disk one PDB file from the list.
What is the PDB ID of the structure you chose?

Activity 4: Visualizing tertiary RNA structures using RasMol
What is RasMol?
RasMol is a visualization tool for tertiary structures (proteins or RNA).
You can see each atom.
You can rotate the figure with the mouse.
It takes a PDB file as input.
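
To get a feel for what RasMol reads, the sketch below lists the residues found in the ATOM records of a PDB file. It relies on the fixed-column layout of ATOM lines (residue name in columns 18-20, chain in column 22, residue number in columns 23-26) and ignores insertion codes and HETATM records. The function names are hypothetical, and the demo lines are synthetic and carry no coordinates.

```python
def residues_from_pdb(lines):
    """Return an ordered list of (chain, residue_number, residue_name)."""
    seen, residues = set(), []
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20].strip()  # columns 18-20
        chain = line[21]                # column 22
        res_seq = int(line[22:26])      # columns 23-26
        key = (chain, res_seq)
        if key not in seen:             # report each residue once
            seen.add(key)
            residues.append((chain, res_seq, res_name))
    return residues


def demo_atom_line(serial, atom, res_name, chain, res_seq):
    """Build a synthetic ATOM line with fields in their fixed columns."""
    return "ATOM  {:>5} {:<4} {:>3} {}{:>4}".format(
        serial, atom, res_name, chain, res_seq)


demo = [
    "HEADER    SYNTHETIC EXAMPLE",
    demo_atom_line(1, "P", "G", "A", 1),
    demo_atom_line(2, "C4'", "G", "A", 1),
    demo_atom_line(3, "P", "C", "A", 2),
]
print(residues_from_pdb(demo))
```

For a downloaded entry, an open file handle such as open("1C2X.pdb") can be passed in place of the demo list.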

A4: RasMol display

A4: RasMol
View the structure. At the command prompt, type:
    rasmol 1C2X.pdb &
Rotate the structure with the mouse.

Activity 4 questions
Can you see that the structure you are viewing is similar to the secondary structure in the next slide?
Visualize with RasMol the pseudoknotted structure you downloaded at the end of Activity 3.

[Figure: secondary structure for comparison with the RasMol view]

Activity 5: Predicting a consensus structure using Alidot
What is Alidot?
Alidot takes as input a set of homologous sequences and their minimum free energy secondary structures predicted with RNAfold.
It predicts the consensus structure.
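
The idea of a consensus structure can be illustrated with a toy example. The sketch below is not Alidot's method. It is a deliberately simplified stand-in that maps each sequence's base pairs onto alignment columns and keeps the column pairs found in at least a chosen fraction of the sequences. All names and the three-sequence example are hypothetical.

```python
from collections import Counter


def base_pairs(structure):
    """Return (i, j) base pairs from a dot-parenthesis string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def column_pairs(aligned_seq, structure):
    """Translate base pairs from sequence positions to alignment columns."""
    columns = [col for col, ch in enumerate(aligned_seq) if ch != "-"]
    if len(columns) != len(structure):
        raise ValueError("structure must match the ungapped sequence length")
    return {(columns[i], columns[j]) for i, j in base_pairs(structure)}


def consensus_structure(aligned_seqs, structures, min_fraction=1.0):
    """Keep column pairs present in at least min_fraction of the sequences.

    With min_fraction above 0.5 the kept pairs cannot conflict, because
    any two of them must occur together in at least one nested structure.
    """
    counts = Counter()
    for seq, struct in zip(aligned_seqs, structures):
        counts.update(column_pairs(seq, struct))
    needed = min_fraction * len(aligned_seqs)
    consensus = ["."] * len(aligned_seqs[0])
    for (i, j), n in counts.items():
        if n >= needed:
            consensus[i], consensus[j] = "(", ")"
    return "".join(consensus)


# Made-up alignment; each structure refers to the ungapped sequence.
alignment = ["GGGAAAACCC", "GGG-AAACCC", "GGGAAAACCU"]
folds = ["(((....)))", "(((...)))", "((......))"]
print(consensus_structure(alignment, folds))
print(consensus_structure(alignment, folds, min_fraction=0.6))
```

Requiring every sequence to agree keeps only the two outer pairs in this example, while a 60% threshold also keeps the third pair that two of the three sequences share.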

A5: Alidot
This activity is performed at the command prompt, using files in your rnalab directory.
The file bact5s.seq contains 13 homologous input sequences representing 5S rRNA, all from the Gutell database.
You can read the information in this file.

A5: Alidot
First align these sequences. At the command prompt, type:
    clustalw bact5s.seq
This creates the file bact5s.aln.
Then fold all these sequences with RNAfold (-p uses the partition function):
    RNAfold -p < bact5s.seq > bact5s.fold

A5: Alidot
The output file bact5s.fold contains all the folded sequences, but Alidot wants them in separate files (use the lab's split.pl, not the system split command):
    split.pl bact5s.fold
Finally, run Alidot, which uses the structure files just created together with the alignment:
    alidot < bact5s.aln > alidot.out
Create a figure of the consensus structure:
    cons.sh

Activity 5 questions
Visualize the consensus structure:
    gv bact5s.ps &
How many conserved stems are there?
Try to find the corresponding stems in the structure from Gutell DB that we viewed in Activity 1 (see next slide).

[Figure: 5S rRNA structure from Gutell DB viewed in Activity 1]

Activity 6: Searching for homologous sequences in a genome fragment using Infernal
What is Infernal?
Infernal searches a genome for sequences that are homologous to a covariance model.
Why? To look for conserved or functional elements.
Infernal also creates structure-based multiple sequence alignments.

A6: Infernal diagram
alignment with secondary structure annotation -> cmbuild -> covariance model
covariance model + genome or DNA sequence -> cmsearch -> hits
covariance model + genomes or DNA sequences -> cmalign -> structural alignment

A6: Infernal
This activity is performed at the command prompt, using files in your rnalab directory.
The file RNAI.sto contains 10 homologous RNAI sequences from different organisms. (RNAI is an antisense RNA that regulates plasmid replication; it is not RNA interference.)
They are aligned and have an associated consensus structure.
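
The .sto extension indicates a Stockholm-format alignment, which keeps the consensus structure on a "#=GC SS_cons" line next to the aligned sequences. The sketch below reads that layout in Python. The function name read_stockholm and the two-sequence sample are hypothetical, and the sample is not taken from RNAI.sto.

```python
def read_stockholm(text):
    """Return (sequences, ss_cons) from Stockholm-format text.

    sequences maps each name to its aligned sequence; ss_cons is the
    consensus structure annotation, or '' if the file has none.
    Interleaved blocks are joined by concatenation.
    """
    sequences, ss_cons = {}, []
    for line in text.splitlines():
        line = line.rstrip()
        if not line or line == "//":
            continue                      # blank line or end marker
        if line.startswith("#=GC SS_cons"):
            ss_cons.append(line.split()[2])
        elif line.startswith("#"):
            continue                      # header or other annotation
        else:
            name, chunk = line.split()[:2]
            sequences[name] = sequences.get(name, "") + chunk
    return sequences, "".join(ss_cons)


# Made-up sample alignment for illustration only.
SAMPLE_STO = """\
# STOCKHOLM 1.0

seqA          GGGAAAACCC
seqB          GGG-AAACCC
#=GC SS_cons  <<<....>>>
//
"""

seqs, structure = read_stockholm(SAMPLE_STO)
print(len(seqs), structure)
```

Applied to the text of RNAI.sto, the length of the returned dictionary should match the number of sequences stated above.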

A6: Infernal
Given the file RNAI.sto, build a covariance model RNAI.cm:
    cmbuild RNAI.cm RNAI.sto
Here RNAI.cm is the output and RNAI.sto is the input.

A6: Infernal
The file RNAI.db contains a genome fragment from another organism, which contains an RNAI gene.
Search for RNAI homologues in this fragment (both files are inputs):
    cmsearch RNAI.cm RNAI.db

A6: Infernal
The file RNAI.fa contains several genome fragments from various organisms.
Create a structure-based multiple sequence alignment of these fragments (both files are inputs):
    cmalign RNAI.cm RNAI.fa
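
The three Infernal steps can be collected into one small driver script. This is a sketch only: the program names, argument order, and file names are taken from the slides, while the function names are hypothetical. It defaults to a dry run that just prints the commands. Later Infernal releases may need extra steps or options (for example, calibrating the model before searching), so treat the command lines as those of this lab rather than a general recipe.

```python
import shutil
import subprocess


def infernal_commands(alignment="RNAI.sto", model="RNAI.cm",
                      database="RNAI.db", fragments="RNAI.fa"):
    """Return the lab's three Infernal command lines as argument lists."""
    return [
        ["cmbuild", model, alignment],   # build the covariance model
        ["cmsearch", model, database],   # search the genome fragment
        ["cmalign", model, fragments],   # structure-based alignment
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in commands:
        print(" ".join(cmd))
        if dry_run:
            continue
        if shutil.which(cmd[0]) is None:
            raise RuntimeError(cmd[0] + " not found on PATH")
        subprocess.run(cmd, check=True)


run_pipeline(infernal_commands())
```

Calling run_pipeline(infernal_commands(), dry_run=False) from inside the rnalab directory would actually run the tools, provided they are installed.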

Activity 6 questions
How many hits did you find with cmsearch?
Look at the structure-based alignment returned by cmalign.
Compare it with the alignment obtained with clustalw (see next 2 slides).
Do the two alignments differ a lot?

[Figure: clustalw alignment for comparison with the cmalign output]

Activity 7: Browsing RNAI in the Rfam database
What is Rfam?
Rfam is a database of families of homologous non-coding RNAs.
Rfam was built using Infernal. Starting from a set of aligned sequences with known structure (called the “seed”), new homologous sequences were found and aligned.

A7: Rfam
Open a browser and go to the Rfam database.
Follow the Rfam link on the course web site, or type http://www.sanger.ac.uk/Software/Rfam/
Click on the tab Browse Rfam.
Expand Gene, then antisense.
Click RNAI – RNAI.

A7: Rfam
This opens a page with information about RNAI, a consensus structure, and a table.
The column Alignment lets you download aligned sequences. The file RNAI.sto that you used as input to Infernal was obtained from here.
The column Member sequences opens a list of sequences with their EMBL accession numbers.

Activity 7 questions
Find the EMBL accession number of an RNAI sequence that was used in the seed.
How many sequences from the family U1 spliceosomal RNA does Rfam contain in total?
Tip (see the next two slides):
Click Browse Rfam.
Expand Gene, then snRNA, then splicing.
Click on U1 spliceosomal RNA.
Read the value next to Full under Member sequences.

[Figures: Rfam pages for browsing to U1 spliceosomal RNA]

Recap
Databases: Gutell, PDB, Rfam
Visualization (tertiary): RasMol
Secondary structure prediction: RNAfold
Consensus structure prediction: Alidot
Searching for homologous structures: Infernal