Nucleic Acid Secondary Structure and Primer Selection


Bioinformatics 92-05: Nucleic Acid Secondary Structure and Primer Selection

http://gcg.nhri.org.tw:8003/gcg-bin/seqweb.cgi

Nucleic Acid Secondary Structure: StemLoop and Mfold
In nucleic acids, inverted repeat sequences may indicate foldback (self-pairing) structures.
Identifying inverted repeats: StemLoop
Calculating RNA folding: Mfold
Displaying folding structures: PlotFold / DotPlot

STEMLOOP
StemLoop finds stems (inverted repeats) within a sequence. You specify the minimum stem length (number of nucleotides in a paired stretch), the minimum and maximum loop sizes (length of the nucleotide sequence between the paired regions), and the minimum number of bonds per stem. In the output, vertical bars ('|') indicate the base pairs, and the associated loop is shown to the right of the stem. If either the stem or loop is too long to be displayed in its entirety on the line, then only the part that fits on the line is shown. The first and last coordinates of the stem are displayed on the left, and the length of the stem (size), the number of bonds in the stem (quality), and the loop size are shown on the right.

Annotated example (start = 217, end = 257, size = 11, quality = 25):
217 AGGCTGCAGTG AGCCGTGAT 11, 25
    |||||| |||| C
257 TCCGGCCTCAC GTCACCGCG
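
The search described above can be illustrated with a deliberately naive sketch. It is not the StemLoop algorithm: it assumes perfect Watson-Crick pairing only (no G-T pairs, no mismatches) and reports every window of exactly `min_stem` paired bases, so a longer stem shows up as several hits. The function name `find_stems` and its parameters are hypothetical.

```python
# Naive inverted-repeat search (illustrative only, not the GCG implementation).
WC_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def find_stems(seq, min_stem=6, min_loop=3, max_loop=20):
    """Return (start, end, loop_size) tuples in 1-based coordinates.

    A hit means seq[start .. start+min_stem-1] pairs perfectly with the
    min_stem bases ending at `end`, with `loop_size` unpaired bases between.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for start in range(n):
        for loop in range(min_loop, max_loop + 1):
            end = start + 2 * min_stem + loop - 1  # 0-based last base of the stem
            if end >= n:
                break
            # Base k of the 5' arm must pair with base k (counted backwards) of the 3' arm.
            if all(WC_PAIR.get(seq[start + k]) == seq[end - k] for k in range(min_stem)):
                hits.append((start + 1, end + 1, loop))
    return hits

print(find_stems("GGGAAACCC", min_stem=3, min_loop=3, max_loop=5))
```

For the toy sequence GGGAAACCC, the three G's pair with the three C's around an AAA loop, so the single hit spans positions 1 to 9 with a loop of 3.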

StemLoop searches for inverted repeats in your sequence after you choose a minimum stem length and minimum and maximum loop sizes. You must also specify a minimum number of bonds per stem with G-T, A-T/U, and G-C scored as 1, 2, and 3 bonds, respectively. The stems found can be sorted by position, size (stem length), or quality (number of bonds) and can be either filed or displayed on the screen. StemLoop tells you the number of stems found for your settings of minimum stem size, maximum loop size, minimum loop size, and minimum bonds per stem. If you feel there are too many stems, you may reset the parameters without reviewing the stems found or view only the best stems found. To view only the best stems, there must be more than 25 stems found and you must sort them by quality or size.
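
The bond scoring above (G-T = 1, A-T/U = 2, G-C = 3) can be checked by hand or with a short sketch. The function name `stem_quality` is hypothetical; the two strands are passed as they appear in the StemLoop display, that is, with the lower strand already written so that paired bases line up column by column. Treating G-U like G-T for RNA is an assumption.

```python
# Bond weights as stated on the slide: G-T = 1, A-T/U = 2, G-C = 3.
# G-U is given the same weight as G-T (assumption for RNA input).
BOND_WEIGHTS = {
    frozenset("GT"): 1, frozenset("GU"): 1,
    frozenset("AT"): 2, frozenset("AU"): 2,
    frozenset("GC"): 3,
}

def stem_quality(top, bottom):
    """Return (match_line, quality) for two aligned stem strands."""
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    marks = []
    quality = 0
    for a, b in zip(top.upper(), bottom.upper()):
        bonds = BOND_WEIGHTS.get(frozenset((a, b)), 0)  # 0 bonds for a mismatch
        quality += bonds
        marks.append("|" if bonds else " ")
    return "".join(marks), quality

# Stem at 217..257 from the slides: size 11, quality 25.
print(stem_quality("AGGCTGCAGTG", "TCCGGCCTCAC"))
```

Applied to the stem at 217..257, this reproduces the match line `|||||| ||||` and the quality of 25 shown in the StemLoop output.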

PARAMETER REFERENCE: STEMLOOP
Minimum stem length (window): sets the minimum stem length. This value cannot exceed either 50 or half the sequence length.
Minimum bonds per stem (stringency): sets the minimum bonds per stem.
Minimum loop size: sets the minimum loop size.
Maximum loop size (distance to furthest inverted repeat): sets the maximum loop size.
Sort stems by (position, quality, size): indicates how to sort the stems in the output.
Number of stems to show: sets the maximum number of stems to show (only applies when stems are sorted by quality or size).
Threshold for nibbling, match (|), and point display: the output from this program has a '|' (vertical bar) between sequence symbols that match. This match display character is added to the output whenever the symbol comparison value for the two symbols in your scoring matrix is greater than or equal to the average positive non-identical comparison value in the matrix. This parameter lets you specify a match display threshold appropriate for the scoring matrix you are using.

STEMLOOP output file
Vertical bars ('|') indicate the base pairs. The associated loop is shown to the right of the stem. The first and last coordinates of the stem are displayed on the left. The length of the stem (size), the number of bonds in the stem (quality), and the loop size are shown on the right.

217 AGGCTGCAGTG AGCCGTGAT 11, 25
    |||||| |||| C
257 TCCGGCCTCAC GTCACCGCG 19

135 TAGCCGGGCGT GG 11, 22
    ||| || ||||
160 GTCCGCGCGCG GT 4

Loop  Start  End  Size  Quality
1     217    257  11    25
2     135    160  11    22
3     139    160  8     20
4     69     95   7     20
5     4      25   9     20
6     213    247  8     19
7     221    248  7     18
8     35     54   8     18
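
The coordinate listing does not print the loop size, but it can be recovered from the other columns. The relation below is inferred from the two stems displayed on this slide (217..257 with loop 19, and 135..160 with loop 4) and is not taken from the GCG documentation; the helper name `loop_size` is hypothetical.

```python
def loop_size(start, end, stem_size):
    """Bases between the two arms: total span minus both stem arms."""
    return (end - start + 1) - 2 * stem_size

# (start, end, size) rows copied from the coordinate table above.
rows = [(217, 257, 11), (135, 160, 11), (139, 160, 8), (69, 95, 7)]
for start, end, size in rows:
    print(start, end, size, loop_size(start, end, size))
```

The first two rows give 19 and 4, which agree with the loop sizes printed next to the stem displays.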

STEMLOOP output formats
1) See the stems
2) See the stem coordinates
3) File the stems (*.fld)
4) File the stems as points for DOTPLOT
5) Choose new parameters
6) Get a different sequence

Sort stems by:
1) Position
2) Quality
3) Size

Stem display:
221 TGCAGTG AGCCGTG 7, 18
    |||||||
248 ACGTCAC CGCGCTA 14

Stem coordinates:
Loop  Start  End  Size  Quality
1     35     54   8     18

Output files: *.stem; *.pnt -> DOTPLOT

MFOLD
Mfold output file: *.mfold
Using energy minimization criteria, any predicted "optimal" secondary structure for an RNA or DNA molecule depends on the model of folding and the specific folding energies used to calculate that structure. Different optimal foldings may be calculated if the folding energies are changed even slightly. Because of uncertainties in the folding model and the folding energies, the "correct" folding may not be the "optimal" folding determined by the program. You may therefore want to view many optimal and suboptimal structures within a few percent of the minimum energy. You can use the variation among these structures to determine which regions of the secondary structure you can predict reliably. For instance, a region of the RNA molecule containing the same helix in most calculated optimal and suboptimal secondary structures may be more reliably predicted than other regions with greater variation.

MFOLD: how to read *.mfold? (PLOTFOLD)
Survey of optimal and suboptimal foldings:
A) sub-optimal energy plot
B) p-num plot
Sampling of optimal and suboptimal foldings:
C) circles
D) domes
E) mountains
F) squiggles

PLOTFOLD A) sub-optimal energy plot
The energy dotplot indicates all of the base pairs involved in all optimal and suboptimal secondary structures within the energy increment you specify. The plot takes the form of a two-dimensional graph where both axes of the graph represent the same RNA sequence. Each point drawn in the graph indicates a base pair between the ribonucleotides whose positions in the sequence are the coordinates of that point on the graph.
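
To make the coordinate convention concrete, here is a minimal text rendering of such a plot. The base pairs are made up for illustration, the function name `dotplot_rows` is hypothetical, and the sketch ignores energies entirely; it only shows how a pair (i, j) becomes the point in row i, column j.

```python
def dotplot_rows(pairs, length):
    """Render base pairs (i, j), 1-based with i < j, as rows of a square grid."""
    points = set(pairs)
    rows = []
    for i in range(1, length + 1):
        # '*' marks a base pair between positions i and j, '.' marks no pair.
        rows.append("".join("*" if (i, j) in points else "." for j in range(1, length + 1)))
    return rows

# Hypothetical hairpin: base 1 pairs with base 6, base 2 with base 5.
for row in dotplot_rows([(1, 6), (2, 5)], 6):
    print(row)
```

A helix appears as a run of points along an anti-diagonal, which is why stems are easy to spot in this kind of plot.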

PLOTFOLD B) p-num plot
This plot shows the amount of variability in pairing at each position in the sequence in all predicted foldings within the increment of the optimal folding energy you specify.
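
One way to picture this variability is to count, for each position, how many different partners it pairs with across the foldings under consideration. The sketch below assumes that definition of p-num (number of distinct base pairs a position takes part in); the foldings are hypothetical and `p_num` is an illustrative helper, not part of GCG.

```python
from collections import defaultdict

def p_num(foldings, length):
    """Count distinct pairing partners per position (1-based) over all foldings."""
    partners = defaultdict(set)
    for pairs in foldings:
        for i, j in pairs:
            partners[i].add(j)
            partners[j].add(i)
    return [len(partners[k]) for k in range(1, length + 1)]

# Two hypothetical foldings of a 10-base sequence that differ in one pair.
foldings = [
    {(1, 10), (2, 9), (3, 8)},
    {(1, 10), (2, 9), (4, 8)},
]
print(p_num(foldings, 10))
```

Position 8 pairs with base 3 in one folding and base 4 in the other, so it gets a count of 2, while positions that always pair the same way get 1 and unpaired positions get 0. Low counts in paired regions point to the parts of the structure that are predicted consistently.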

PLOTFOLD C) circles

PLOTFOLD D) domes

PLOTFOLD E) mountains The program plots representative secondary structures that satisfy the energy increment and window size criteria you specify.

PLOTFOLD F) squiggles

Exercise 07: StemLoop & Mfold link
Open the file “Exercixe91-07-1.doc” and follow the steps.

gcg2 4% fetch gb:d00063 and gb:j02061
j02061.gb_vi d00063.gb_pl
gcg2 5% mfold d00063.gb_pl -> d00063.mfold
mfold j02061.gb_vi -> j02061.mfold

$ Mfold
(Linear) MFOLD what sequence ? j02061.gb_vi
Begin (* 1 *) ?
End (* 121 *) ?
What should I call the energy matrix output file (* j02061.mfold *) ?

Primer Selection Program: Prime
Criteria: specificity, %GC, dimer, hairpin, Tm
[Flowchart: nucleotide or amino acid sequences; Pileup, Pretty, Prettybox, CONSENSUS, backtranslate; Primer Selection Program (Prime); confirm by BLAST]

Target an amplicon length of 75 to 150 bp
50 to 60% GC content
Limit secondary structure
Avoid stretches of G's or C's longer than 3 bases
No stable interaction between forward and reverse primers (primer-dimer pairs)
Place C's and G's on the ends of primers, but no more than 2 in the last 5 bases at the 3' end
Melting temperature (Tm) above 50 °C
Verify specificity
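
Several of these guidelines are simple enough to screen automatically. The sketch below is a rough pre-filter, not what Prime does: the Tm estimate uses the rule of thumb Tm = 2(A+T) + 4(G+C), which is an assumption swapped in for illustration and ignores primer and salt concentration; "stretch of G or C's" is read as four or more identical G's or C's in a row; and the function names are hypothetical. Secondary structure, primer-dimers, and specificity still need the dedicated tools.

```python
import re

def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)

def wallace_tm(primer):
    """Rule-of-thumb Tm in degrees Celsius: 2 per A/T, 4 per G/C (assumption)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))

def check_primer(primer):
    """Return a pass/fail flag for each guideline that can be checked from sequence alone."""
    p = primer.upper()
    return {
        "gc_50_to_60": 50.0 <= gc_percent(p) <= 60.0,
        "no_g_or_c_run_over_3": re.search(r"G{4,}|C{4,}", p) is None,
        "max_2_gc_in_last_5": sum(b in "GC" for b in p[-5:]) <= 2,
        "tm_above_50": wallace_tm(p) > 50,
    }

# 20-mer taken from the StemLoop example sequence (positions 217 onward).
print(check_primer("AGGCTGCAGTGAGCCGTGAT"))
```

The example 20-mer has 12 G/C bases (60% GC) and an estimated Tm of 64 °C under this rule, so it passes all four flags; a candidate that fails any flag is worth redesigning before moving on to dimer and BLAST checks.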

http://gcg.nhri.org.tw:8003/gcg-bin/seqweb.cgi

Primer Length: Minimum, Maximum
PCR Product Length
Maximum number of primers or PCR products in output (range 1 thru 2500)
Primer DNA concentration (nM) (range .1 thru 500.0)
Salt concentration (mM) (range .1 thru 500.0)
Select: forward primers only, reverse primers only, or primers on both strands for PCR
Set maximum overlap (in base pairs) between predicted PCR products
Forward strand primer extension must include position
Reverse strand primer extension must include position
Reject duplicate primer binding sites on template
Specify primer 3' clamp (using IUB ambiguity codes)
Primer % G+C: Minimum (range 0.0 thru 100.0), Maximum
Primer Melting Temperature (degrees Celsius): Minimum (range 0.0 thru 200.0)
Maximum difference between melting temperatures of two primers in PCR (degrees Celsius) (range 0.0 thru 25.0)
Product % G+C
Product Melting Temperature (degrees Celsius)

Useful bookmarks for probe and primer design:

http://www.operon.com/oligos/toolkit.php
Use free online Tm calculators to find the Tm of primer and probe sequences. We use the Operon calculator, as it also has a good tool to check for possible primer-dimer sequences. Reach it from www.operon.com, then select DNA Synthesis, then select Oligo Toolkit.

http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi
This is a link to the Primer3 software. It designs primers and also helps pick an internal oligo (a probe sequence) for them. Like most primer design algorithms, it has the disadvantage of not taking into account secondary structure issues that are paramount in primer/probe design for real-time PCR. With that caveat in mind, it is a good place to start the design process if you are not inclined to do it by eye (i.e. scanning the sequence yourself). Then you can check your sequences at the folding site (described below).

http://www.ncbi.nlm.nih.gov/BLAST/
This is the site for the Basic Local Alignment Search Tool from the National Center for Biotechnology Information. Use this site to check the specificity of probe and primer sequences.

RNA folding programs on the Web

mfold version 3.0 by Zuker and Turner at Washington University in St. Louis
http://mfold2.wustl.edu/~mfold/rna/form1.cgi

SStructView: RNA Secondary Structure Java Applet that visualizes RNA structures as calculated by mfold
http://smi-web.stanford.edu/projects/helix/pubs/gene-combis-96/eg-rnafold.html

RNA secondary structure prediction with GenBee at the Belozersky Institute, Moscow State University, Russia
http://www.genebee.msu.su/services/rna2_reduced.html

Protein Hydrophobicity Server: Bioinformatics Unit, Weizmann Institute of Science, Israel
http://bioinformatics.weizmann.ac.il/hydroph/

SAPS: statistical analysis of protein sequences
http://www.isrec.isb-sib.ch/software/SAPS_form.html

Exercise 05: Primer Selection
Use the human npm cDNA sequence to design a pair of primers that will copy the whole coding sequence when translated in frame. Then check the specificity of the primers by using BLAST.
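
As a starting point for the exercise, the simplest primer pair that spans a whole coding sequence takes the forward primer from the first bases of the CDS and the reverse primer as the reverse complement of its last bases. The sketch below shows only that step, on a made-up toy sequence rather than the npm cDNA; the function names, the CDS coordinates, and the primer length are hypothetical, and the resulting primers would still need to be adjusted against the guidelines above and checked with BLAST.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]

def flanking_primers(cdna, cds_start, cds_end, length=20):
    """Primers at both ends of the CDS (1-based, inclusive coordinates)."""
    cds = cdna[cds_start - 1:cds_end]
    if len(cds) < 2 * length:
        raise ValueError("coding sequence is too short for two primers of this length")
    forward = cds[:length]                       # same sense as the cDNA, starts at the start codon
    reverse = reverse_complement(cds[-length:])  # anneals to the sense strand at the stop codon
    return forward, reverse

# Toy cDNA: 2 bases of 5' UTR, an 18-base CDS (ATG ... TAA), 2 bases of 3' UTR.
toy_cdna = "GG" + "ATGAAACCCGGGTTTTAA" + "CC"
print(flanking_primers(toy_cdna, 3, 20, length=6))
```

Because the forward primer begins exactly at the ATG, the amplified product keeps the reading frame of the original coding sequence.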