Computational Biology, Part 5
PCR Primer Design & Basic Principles of Entrez
Robert F. Murphy
Copyright 1996, 1999, 2000. All rights reserved.
Polymerase Chain Reaction
- Method for exponential amplification of DNA or RNA sequences
- Basic requirements
  - template DNA or RNA
  - 2 oligonucleotide primers complementary to different regions of the template
  - heat-stable DNA polymerase
  - 4 nucleotides and appropriate buffer
PCR reference books
- PCR Protocols: A Guide to Methods and Applications. M. A. Innis, D. H. Gelfand, J. J. Sninsky, T. J. White (eds.) Academic Press, Inc., San Diego, 1990
- PCR: A Practical Approach. M. J. McPherson, P. Quirke, G. R. Taylor (eds.) Oxford University Press, Oxford, 1991
Basic Principles of PCR
1. Strands of template DNA (or RNA) are separated by melting
2. Forward primer binds to one strand of the template, reverse primer to the other strand
3. DNA polymerase extends the 3’ end of each primer, copying the template
4. Strands are separated by raising the temperature, allowing both original DNA and copies to act as templates
5. Repeat steps 2-4 many times
Temperature cycling
- Annealing temperature (usually C) allows primers to hybridize to template
- Extension temperature (usually 72 C) allows polymerase to extend starting at the primer
- Denaturation temperature (usually 95 C) separates strands
PCR (diagram)
- Start: 1 copy (+ and - strands of template); add primers, polymerase, dNTPs
- Heat, cool: 2 copies
- Heat, cool: 4 copies
- Heat, cool: 8 copies, etc.
- Labels: forward primer, reverse primer, amplified DNA, 5’ and 3’ ends
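The doubling shown in the diagram (1, 2, 4, 8 copies) can be written as a one-line formula. The sketch below assumes ideal amplification, where every template is copied in every cycle; real reactions fall short of this, and the function name `copies_after` is only illustrative.

```python
def copies_after(cycles: int, start_copies: int = 1) -> int:
    """Ideal copy number after a number of PCR cycles (perfect doubling assumed)."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return start_copies * 2 ** cycles


# Reproduce the counts from the diagram: 1, 2, 4, 8 copies.
for n in range(4):
    print(n, copies_after(n))
```

Under the same idealized assumption, 30 cycles starting from a single template would give 2**30 copies, roughly a billion.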
Primer Design Considerations
- Primers must be specific for desired sequence to be amplified
  - primers should be long enough to ensure specificity (usually bases)
  - primers normally screened against databases
- Primers must form stable duplex at annealing temperature
- No complementarity between forward and reverse primers or primers and product
Initial primer selection criteria
- Length (18-25 bases)
- Base composition (45-55% GC)
- Melting temperature (55-80 C)
- 3’ terminal sequence
  - strong bonding base (G or C) at end
  - no runs (3 or more) of G or C at end
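These criteria translate directly into a screening function. The following is a minimal sketch, not the method used by any particular program: the function names are hypothetical, and the melting temperature is estimated with the simple Wallace rule (2 C per A/T, 4 C per G/C), which is an assumption made here for brevity. The programs cited in this lecture use nearest-neighbor thermodynamics instead.

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * sum(base in "GC" for base in primer) / len(primer)


def wallace_tm(primer: str) -> float:
    """Rough Tm estimate (Wallace rule); an assumption, not nearest-neighbor."""
    primer = primer.upper()
    at = sum(base in "AT" for base in primer)
    gc = sum(base in "GC" for base in primer)
    return 2.0 * at + 4.0 * gc


def initial_check(primer, min_len=18, max_len=25, min_gc=45.0, max_gc=55.0,
                  min_tm=55.0, max_tm=80.0, max_gc_run=2):
    """Return the list of criteria the primer fails (empty list = accepted)."""
    primer = primer.upper()
    if not primer:
        return ["length"]
    reasons = []
    if not min_len <= len(primer) <= max_len:
        reasons.append("length")
    if not min_gc <= gc_percent(primer) <= max_gc:
        reasons.append("%GC")
    if not min_tm <= wallace_tm(primer) <= max_tm:
        reasons.append("Tm")
    if primer[-1] not in "GC":
        reasons.append("3' end base")
    # Count the uninterrupted run of G/C bases at the 3' end.
    run = 0
    for base in reversed(primer):
        if base not in "GC":
            break
        run += 1
    if run > max_gc_run:
        reasons.append("3' G/C run")
    return reasons


print(initial_check("ATGCTAGCTAGGATCCATGC"))   # made-up 20-mer, 50% GC
print(initial_check("ATATATATATATATATATAT"))   # made-up 20-mer, 0% GC
```

Returning the failed criteria rather than a plain yes/no makes it easy to count why candidates were rejected, which is useful when a search returns no primers.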
Primer complementarity criteria
- Primer vs. self & forward vs. reverse
  - maximum number of consecutive bonds
  - maximum number of consecutive G-C bonds
- Forward primer vs. Reverse primer
  - maximum number of consecutive bonds between the 3’ ends
- Primer vs. product
  - maximum number of consecutive bonds between the 3’ ends
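One way to compute the "maximum number of consecutive bonds" is to note that a run of base pairs between two antiparallel strands is the same as a common substring between one strand and the reverse complement of the other. The sketch below uses a standard longest-common-substring dynamic program under that assumption. It counts only uninterrupted Watson-Crick pairs and does not implement the separate G-C or 3’-end variants; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def max_consecutive_bonds(a: str, b: str) -> int:
    """Longest run of consecutive base pairs between a and b (antiparallel)."""
    a = a.upper()
    target = reverse_complement(b)
    best = 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if a[i - 1] == target[j - 1]:
                # Extend the run that ended at the previous pair of positions.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


# Primer vs. self: a palindromic site pairs with itself along its full length.
print(max_consecutive_bonds("GAATTC", "GAATTC"))
# Forward vs. reverse: poly-A against poly-T pairs at every position.
print(max_consecutive_bonds("AAAA", "TTTT"))
```

A primer would then be rejected when this value exceeds a user-chosen cutoff, for the primer against itself, against the other primer, or against the product.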
Optimization criteria
- Melting temperatures should be similar for both primers
- Product should be as short as allowable
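The two optimization criteria can be applied as a ranking over candidate pairs that have already passed the other checks. The slide does not say how the criteria are combined, so the sketch below makes an assumption: sort first by Tm difference, then by product length. The `PrimerPair` type and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerPair:
    name: str
    tm_forward: float      # melting temperature of forward primer, in C
    tm_reverse: float      # melting temperature of reverse primer, in C
    product_length: int    # length of amplified product, in bases


def rank_pairs(pairs):
    """Best pairs first: smallest Tm difference, then shortest product."""
    return sorted(
        pairs,
        key=lambda p: (abs(p.tm_forward - p.tm_reverse), p.product_length),
    )


candidates = [
    PrimerPair("A", 60.0, 66.0, 150),
    PrimerPair("B", 61.0, 62.0, 400),
    PrimerPair("C", 61.0, 62.0, 250),
]
for pair in rank_pairs(candidates):
    print(pair.name, abs(pair.tm_forward - pair.tm_reverse), pair.product_length)
```

A weighted sum of the two quantities would be an equally reasonable choice; the lexicographic ordering is used here only because it needs no weights.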
Automated PCR probe selection: References
W. Rychlik & R. E. Rhoads. A computer program for choosing optimal oligonucleotides for filter hybridization, sequencing and in vitro amplification of DNA. Nucleic Acids Res 17: (1989)
Abstract: A method is presented for choosing optimal oligodeoxyribonucleotides as probes for filter hybridization, primers for sequencing, or primers for DNA amplification. Three main factors that determine the quality of a probe are considered: stability of the duplex formed between the probe and target nucleic acid, specificity of the probe for the intended target sequence, and self-complementarity. DNA duplex stability calculations are based on the nearest-neighbor thermodynamic values determined by Breslauer et al. [Proc. Natl. Acad. Sci. U.S.A. (1986), 83: 3746]. Temperatures of duplex dissociation predicted by the method described here were within 0.4 degrees C of the values obtained experimentally for ten oligonucleotides. Calculations for specificity of the probe and its self-complementarity are based on a simple dynamic algorithm.
Automated PCR probe selection: References
T. Lowe, J. Sharefkin, S. Q. Yang & C. W. Dieffenbach. A computer program for selection of oligonucleotide primers for polymerase chain reactions. Nucleic Acids Res 18: (1990)
Abstract: We have designed a computer program which rapidly scans nucleic acid sequences to select all possible pairs of oligonucleotides suitable for use as primers to direct efficient DNA amplification by the polymerase chain reaction. This program is based on a set of rules which define in generic terms both the sequence composition of the primers and the amplified region of DNA. These rules (1) enhance primer-to-target sequence hybridization avidity at critical 3'-end extension initiation sites, (2) facilitate attainment of full length extension during the 72 degrees C phase, by minimizing generation of incomplete or nonspecific product and (3) limit primer losses occurring from primer-self or primer-primer homologies. Three examples of primer sets chosen by the program that correctly amplified the target regions starting from RNA are shown. This program should facilitate the rapid selection of effective and specific primers from long gene sequences while providing a flexible choice of various primers to focus study on particular regions of interest.
Automated PCR probe selection: References
L. Hillier & P. Green. OSP: a computer program for choosing PCR and DNA sequencing primers. PCR Methods Appl 1:124-8 (1991)
Abstract: OSP (Oligonucleotide Selection Program) selects oligonucleotide primers for DNA sequencing and the polymerase chain reaction (PCR). The user can specify (or use default) constraints for primer and amplified product lengths, %(G+C), (absolute or relative) melting temperatures, and primer 3' nucleotides. To help minimize nonspecific priming and primer secondary structure, OSP screens candidate primer sequences, using user-specifiable cutoffs, against potential base-pairing with a variety of sequences present in the reaction, including the primer itself, the other primer (for PCR), the amplified product, and any other sequences desired (e.g., repetitive element sequences in genomic templates, vector sequence in cloned templates, or other primer pair sequences in multiplexed PCR reactions). Base-pairing involving the primer 3' end is considered separately from base-pairing involving internal sequences. Primers meeting all constraints are ranked by a "combined score," a user-definable weighted sum of any of the above parameters. OSP is being routinely and extensively used to select sequencing primers for the Caenorhabditis elegans genome sequencing project and human genomic PCR primer pairs for the Washington University Genome Center mapping project, with success rates exceeding 96% and 81%, respectively. It is available for research purposes from the authors, at no cost, in both text output and interactive graphics (X windows) versions.
List-based rule application
- Define rules
- Select initial list of items
- Select subsets of list that pass each rule, keeping count of number of times that each reason is used to reject an item
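The list-based approach can be sketched in a few lines. This is an illustrative outline only: the `apply_rules` function, the rule names, and the toy 4-base candidates are all hypothetical, and the rules shown are simplified stand-ins for real primer criteria.

```python
from collections import Counter


def apply_rules(items, rules):
    """Filter items through each (reason, predicate) rule in order.

    Returns the surviving items and a count of rejections per reason.
    """
    rejected = Counter()
    survivors = list(items)
    for reason, passes in rules:
        kept = [item for item in survivors if passes(item)]
        rejected[reason] += len(survivors) - len(kept)
        survivors = kept
    return survivors, rejected


def gc_fraction(seq):
    return sum(base in "GC" for base in seq) / len(seq)


# Define rules: each is a reason label plus a test that accepted items pass.
rules = [
    ("3' end not G or C", lambda s: s[-1] in "GC"),
    ("%GC out of range", lambda s: 0.25 <= gc_fraction(s) <= 0.75),
]

# Select the initial list of items (toy candidates, far shorter than real primers).
candidates = ["ATGC", "AAAA", "GGGG", "ATGA", "CCGC"]

survivors, rejected = apply_rules(candidates, rules)
print(survivors)
for reason, count in rejected.items():
    print(reason, count)
```

Because rules are applied in sequence, an item is counted only under the first rule it fails. The per-reason counts show which criterion to loosen when too few items survive.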
Finding PCR primers with MacVector
- Two search types
  - specify product size within a region
    example: detection of gene in a DNA or RNA sample
  - specify two flanking regions
    example: subcloning of a plasmid fragment or insert
Illustration
Goal: Find primers for detecting presence of tubulin exon 5 in mRNA
1. Specify region to scan and use defaults for all other parameters. Conclusion: too many pairs found.
2. Be more restrictive: require strong 3’ anchor. Conclusion: still too many.
3. Reduce allowed bonding to product also. 8 is few enough. Request graphical map.
4. Can make final choice based on product size.
What if no primers are found?
- Examine display listing number of primers or pairs rejected by each criterion
- Loosen appropriate criterion
  - If few forward or reverse primers are being accepted, broaden primer length, end base restriction, Tm limits, or %GC limits
  - If no primer pairs are accepted, loosen primer vs. primer or primer vs. product criteria
- Repeat search
Last step
- Compare primer sequences against nucleic acid sequence databases
- Calculate Tm for duplex between primer and best match found
- Compare with Tm of primer itself
Web sites related to PCR
- PCR primer designer: Primer3
  bin/primer/primer3.cgi
- All you ever wanted to know about PCR
- Downloadable PCR design software catalog
  de/pcr/software.html
Designing primers for sequence families
- Most PCR primer design software considers only one template sequence
- Different software is required when
  - designing a single set of primers to amplify more than one template sequence
  - designing a set of primers that will amplify only a specific member of a set of sequences
- Examples:
  - detecting all members of a gene family
  - detecting a highly variable gene, such as a viral gene
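Both cases can be illustrated with simple set operations on fixed-length subsequences (k-mers). The sketch below is a deliberately simplified assumption: it requires exact matches, ignores the reverse strand and degenerate bases, and uses made-up toy sequences. It is not the algorithm of any program cited in this lecture.

```python
def kmers(seq: str, k: int) -> set:
    """All distinct subsequences of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_sites(sequences, k: int) -> set:
    """Candidate sites present in every sequence (family-wide primers)."""
    sets = [kmers(s, k) for s in sequences]
    return set.intersection(*sets) if sets else set()


def specific_sites(target: str, others, k: int) -> set:
    """Candidate sites in target that occur in none of the other sequences."""
    found = kmers(target, k)
    for other in others:
        found -= kmers(other, k)
    return found


# Toy family of three short, made-up sequences.
family = ["ACGTACGGTA", "TTACGGTACC", "GGACGGTATT"]

print(sorted(shared_sites(family, 6)))
print(sorted(specific_sites(family[0], family[1:], 6)))
```

Sites from `shared_sites` are starting points for primers that amplify all members; sites from `specific_sites` are starting points for primers that amplify only one member. Either set would still need to pass the usual length, %GC, Tm, and complementarity checks.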
Designing primers for families: References
K. Lucas, M. Busch, S. Mossinger & J. A. Thompson. An improved microcomputer program for finding gene- or gene family-specific oligonucleotides suitable as primers for polymerase chain reactions or as probes. Comput Appl Biosci 7:525-9 (1991)
Abstract: We present here an easy-to-use computer program which finds oligonucleotides suitable as primers in polymerase chain reactions (PCR) or as probes for hybridization. In contrast to other programs used for this purpose, the additional advantage of this one is the possibility of directly detecting gene- as well as gene family-specific oligonucleotides. For this purpose, up to 200 different DNA sequences, of maximally 65,000 nucleotides each, can be scanned in a single search to ensure either single or multiple gene binding of the PCR primers or probes. Specific oligonucleotides for genes carrying internal repetitions and for single genes belonging to a set of highly conserved genes can also be detected. Many parameters such as exclusion of simple sequences, which are known to be highly repeated throughout various genomes or regions of stable secondary structures in both primer-primer and primer-template, can be taken into consideration and avoided. Furthermore, the G + C content and the length of the oligonucleotides can be changed in a broad range by the user.
Designing primers for families: References
M. L. Montpetit, S. Cassol, T. Salas & M. V. O'Shaughnessy. OLIGSCAN: a computer program to assist in the design of PCR primers homologous to multiple DNA sequences. J Virol Methods 36: (1992)
Abstract: OLIGSCAN (oligonucleotide scanner) is a computer program for IBM-PC-compatible computers that allows the user to scan up to 200 DNA sequences for homology to oligonucleotide sequences of interest. Once a core sequence of longer than the user-defined minimum length is found, the remainder of the oligonucleotide is compared to the corresponding positions of the larger sequence to identify matches or mismatches flanking the core region. This algorithm results in identification of the longest possible homologous regions first. The program was originally designed to assist in the identification of potential annealing sites for polymerase chain reaction (PCR) primers in the genomic DNA of related strains of viruses. However, it may also be used for more general pattern-identification purposes, including scanning for various sequence motifs of functional importance. We present the analysis of homology to an oligonucleotide primer in 16 complete genomic sequences of the human and simian immunodeficiency viruses.
Designing primers for families: References
J. Dopazo & F. Sobrino. A computer program for the design of PCR primers for diagnosis of highly variable genomes. J Virol Methods 41: (1993)
Abstract: PCRDiag (Diagnosis by PCR) is a computer program which allows the localization of pairs of oligonucleotides with optimal thermodynamic requirements for use in a PCR assay. The program is designed for the selection of pairs of primers complementary to sequences present in a group, whose identification is intended, but are absent in other non-specific sequences. The program constitutes a powerful tool, specially in systems which display a high degree of sequence heterogeneity, as is the case of RNA viruses. The program runs on IBM-PC and compatible computers and has no special software requirements. It does not need the previous alignment of the sequences analyzed.
Web sites for degenerate primers
- List of various PCR primer design web sites
  primer.html
- PCR degenerate primer designer for multiple sequences
Block Diagram for PCR Primer Design
- Inputs: PCR reaction parameters; sequence from which to choose primers; primer selection rules
- Process: PCR Primer List Processor
- Output: results of search (list), including suggested annealing temperatures
Network Entrez
- a client-server system for retrieval of information related to molecular biology
- can be used
  - via web page
  - via local copy of Entrez client
  - via "embedded" client in other software (e.g., MacVector)
Entrez Databases
- Literature
  - PUBMED database contains Medline abstracts as well as links to full text articles on sites maintained by journal publishers
- Nucleic acid sequences
- Protein sequences
- 3D structures
- Genomes
- Taxonomy
Entrez literature searching
- can find papers on a given subject
- can find papers on a specific gene
- can find papers related to a given paper
- can switch between literature and sequence databases
Entrez sequence searching
- can find sequences for a given gene
- can download copy of sequence
Example Entrez Session
Goal: Find literature and sequences for cystic fibrosis genes
- Use Nucleotide database with keyword searching.
- Use MEDLINE with keyword searching.
- Use neighbor feature to find related articles.
Example Entrez Session
Goal: Find literature and sequences for obesity genes
1. Use MEDLINE with keyword searching.
2. Use neighbor feature to find related articles.
3. Switch to Nucleotide database to see sequence.
4. Change to GenBank format and save a copy of the sequence to local disk.
5. Use MeSH terms to find similar articles.
6. Search the Nucleotide database by gene name.
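The same session can also be expressed as a series of request URLs. The sketch below is an assumption that goes beyond these slides: it uses NCBI's E-utilities web interface rather than the Network Entrez client described here, and it only builds the URLs without contacting the server. `PMID_HERE`, `ACCESSION_HERE`, and `GENE_NAME_HERE` are placeholders to be replaced with real identifiers, and the helper name `eutils_url` is hypothetical.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def eutils_url(tool: str, **params) -> str:
    """Build (but do not send) an E-utilities request URL."""
    return f"{EUTILS}/{tool}.fcgi?{urlencode(params)}"


# 1. Keyword search of the literature database.
search_url = eutils_url("esearch", db="pubmed", term="obesity gene", retmax=5)

# 2. Articles related to a given article (the "neighbor" feature).
related_url = eutils_url("elink", dbfrom="pubmed", db="pubmed", id="PMID_HERE")

# 3. Switch from an article to its linked nucleotide sequences.
sequence_links_url = eutils_url("elink", dbfrom="pubmed", db="nuccore", id="PMID_HERE")

# 4. Retrieve a sequence in GenBank format as plain text, ready to save.
genbank_url = eutils_url("efetch", db="nuccore", id="ACCESSION_HERE",
                         rettype="gb", retmode="text")

# 5. Search the literature by MeSH term.
mesh_url = eutils_url("esearch", db="pubmed", term="obesity[MeSH Terms]")

# 6. Search the nucleotide database by gene name.
gene_url = eutils_url("esearch", db="nuccore", term="GENE_NAME_HERE[Gene Name]")

for url in (search_url, related_url, sequence_links_url,
            genbank_url, mesh_url, gene_url):
    print(url)
```

Each URL could be opened in a browser or fetched with a standard HTTP client once the placeholders are filled in.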
Suggested Entrez Explorations
- Find the sequence of the Arabidopsis ACT1 mRNA.
- Find the classic article on cloning of full-length cDNA by H. Okayama and P. Berg.
Block Diagram for Entrez Literature Searching
Diagram elements:
- Entrez Search Engine
- Additional Search Criterion
- Desired Output Format
- Results of Previous Search
- Displayed Item Selection
- Results of Search (List)
- Item Display