1
Real-Time Primer Design for DNA Chips
Annie Hui
CMSC 838 Presentation
2
CMSC 838T – Presentation
Use of primers in PCR and Microarrays
• PCR (polymerase chain reaction): amplifies a particular DNA fragment
  - Use: to test for the presence of nucleotide sequences
• Test of PCR products:
  - Ladder: a mixture of fragments of known length
  - Lane 1: the PCR fragment is ~1850 bases long.
  - Lanes 2 and 4: the fragments are ~800 bases long.
  - Lane 3: no product is formed, so the PCR failed.
  - Lane 5: multiple bands are formed because one of the primers binds at several places.
3
Use of primers in PCR and Microarrays
• DNA chips (microarrays): analyse a large number of genes in parallel.
• Primers:
  - 20 to 100 bases long
  - Synthetically manufactured
• Automated design of primers
  - A computational approach
  - Objective: to find primers that bind well without self-hybridizing
  - Critique: how accurate?
[Figure labels: Fixed on chip; fluorescence; Bound to primer]
4
Motivation
This group uses the automated NucliSens extraction system (bioMerieux) to develop their primers here.
5
Technique: The computational model
1. Select primers from the target sequence
  - Two primers, P (forward) and Q (reverse), for PCR; one primer for a DNA chip (microarray)
  - Using window size W, the number of possible primers with length between m and n within one window is: [formula not preserved in this transcript]
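The candidate count for step 1 can be sketched as follows. This assumes a candidate primer is any contiguous substring of the window, which the slide does not state explicitly. Under that assumption a window of W bases contains W - k + 1 substrings of length k, so the total is the sum of those terms for k from m to n. The function names are hypothetical.

```python
def count_candidates(w, m, n):
    """Number of contiguous substrings with length in [m, n] in a window of w bases."""
    # Lengths longer than the window contribute nothing, hence min(n, w).
    return sum(w - k + 1 for k in range(m, min(n, w) + 1))


def enumerate_candidates(window, m, n):
    """Yield (start, primer) for every substring of the window with length in [m, n]."""
    for k in range(m, min(n, len(window)) + 1):
        for start in range(len(window) - k + 1):
            yield start, window[start:start + k]


window = "AGCGATATA"  # window sequence reused from a later slide
primers = list(enumerate_candidates(window, 6, 8))
print(len(primers), count_candidates(len(window), 6, 8))
```

Enumerating and counting agree here, which is a quick sanity check on the closed-form sum.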
6
Technique: The computational model
2. For each primer pair, or single primer, quantify 4 hybridization conditions:
  a. Primer length
  b. Melting temperature
  c. GC content
  d. Secondary structure
     i. Self annealing
     ii. Self end annealing
     iii. Pair annealing
     iv. Pair end annealing
[Callout: We are starting here]
7
Technique: quantifying hybridization conditions
a. Primer length, len(P)
  - Affects melting temperature and hybridization
b. Melting temperature, T_m(P)
  - Temperature at which the bonds between primer and gene sequence break
c. GC content, written CG(P)
  - G-C pairs are more stable than A-T pairs (because they form more H-bonds)
  - What is this measure good for?
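A minimal sketch of measures (b) and (c). The slides do not give the melting-temperature formula, and the H and S updates on a later slide suggest the paper uses a thermodynamic model. The Wallace rule (2 °C per A/T, 4 °C per G/C) is swapped in here only as a simple stand-in for short primers. Function names are hypothetical.

```python
def gc_content(primer):
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)


def wallace_tm(primer):
    """Wallace-rule melting temperature estimate in degrees Celsius.

    Assumption: a rule of thumb for short oligos, not the model from the paper.
    """
    primer = primer.upper()
    gc = sum(base in "GC" for base in primer)
    at = sum(base in "AT" for base in primer)
    return 2 * at + 4 * gc


print(gc_content("GCGATA"), wallace_tm("GCGATA"))
```

Because each G-C pair counts twice as much as an A-T pair in this estimate, GC content and melting temperature move together for primers of equal length.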
8
Technique: quantifying hybridization conditions
d. Secondary structure
  - Study how likely a primer is to entangle with itself or with another primer
  - P = {p_1, p_2, …, p_n}, Q = {q_1, q_2, …, q_m}
  - Scoring function:
      S(p_i, q_j) = 2 if {p_i, q_j} = {A, T}
                  = 4 if {p_i, q_j} = {C, G}
                  = 0 otherwise
  - Example (Q aligned against P starting at position i of primer P):
      P: ...AGCTTTAGCCATAG
      Q: TCTTAGGATCGC...
      score S(p_i, q_1) = 2+4+2+2+4 = 14
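The scoring function can be written directly from its definition. The sketch below also sums the score over one alignment of Q against P. The `offset` parameter and the function names are hypothetical, and the test strings are made up for illustration rather than taken from the slide.

```python
def pair_score(a, b):
    """S(p_i, q_j): 2 for an A-T pair, 4 for a C-G pair, 0 otherwise."""
    pair = {a, b}
    if pair == {"A", "T"}:
        return 2
    if pair == {"C", "G"}:
        return 4
    return 0


def alignment_score(p, q, offset):
    """Sum S(p[offset + j], q[j]) over the positions where P and Q overlap."""
    total = 0
    for j, q_base in enumerate(q):
        i = offset + j
        if 0 <= i < len(p):
            total += pair_score(p[i], q_base)
    return total


print(alignment_score("AGCT", "TCGA", 0))
```

With `offset=0`, every base of `AGCT` faces its complement in `TCGA`, so the total is 2 + 4 + 4 + 2 = 12.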
9
Technique: quantifying hybridization conditions
• Four measures of secondary structure:
  i. Self annealing, SA(P, P')
     - P' = reverse of P
  ii. Self end annealing, SEA(P, P')
     - Like self annealing, with k >= 0
     - Only count the longest continuous overlaps
  iii. Pair annealing, PA(P, Q)
     - P and Q are the forward and reverse primers
  iv. Pair end annealing, PEA(P, Q)
     - Similar to self end annealing
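The exact formulas for these four measures are not preserved in the transcript, so the following is only one plausible reading. Annealing is taken as the best total pair score over all relative shifts of the two sequences. End annealing is taken as the best score of an unbroken run of complementary pairs, which is how "only count the longest continuous overlaps" is interpreted here. All names are hypothetical.

```python
def pair_score(a, b):
    """2 for an A-T pair, 4 for a C-G pair, 0 otherwise."""
    pair = {a, b}
    if pair == {"A", "T"}:
        return 2
    if pair == {"C", "G"}:
        return 4
    return 0


def overlap_scores(p, q, offset):
    """Pair scores, in order, for the positions where q shifted by offset overlaps p."""
    return [pair_score(p[offset + j], q[j])
            for j in range(len(q)) if 0 <= offset + j < len(p)]


def annealing(p, q):
    """Assumed PA-style measure: best total score over all shifts."""
    return max(sum(overlap_scores(p, q, k)) for k in range(-(len(q) - 1), len(p)))


def end_annealing(p, q):
    """Assumed PEA-style measure: best score of an unbroken complementary run."""
    best = 0
    for k in range(-(len(q) - 1), len(p)):
        run = 0
        for s in overlap_scores(p, q, k):
            run = run + s if s > 0 else 0  # a mismatch resets the run
            best = max(best, run)
    return best


def self_annealing(p):
    return annealing(p, p[::-1])  # P against its reverse P'


def self_end_annealing(p):
    return end_annealing(p, p[::-1])


print(self_annealing("ATAT"), annealing("AGA", "TAT"), end_annealing("AGA", "TAT"))
```

The two measures differ when a mismatch interrupts the overlap: for `AGA` against `TAT` the total over the best shift is 4, but no unbroken run scores more than 2.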
10
Technique: How to apply the model
• For PCR: P is the forward primer, Q is the reverse primer
  - Ideally there is no annealing, and the length, GC content and melting temperature of P equal those of Q
  - The optimization is: [formula not preserved in this transcript]
• For DNA chips (microarrays): Q doesn't exist.
  - No pair annealing to study.
  - Only 5 terms left.
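For the microarray case, the five remaining terms are taken to be length, GC content, melting temperature, SA and SEA, and the presentation describes the objective as a weighted linear sum. A hedged sketch of such an objective follows. The deviation-from-target form, the target values and the weights are all assumptions made for illustration, not values from the paper.

```python
from dataclasses import dataclass


@dataclass
class PrimerMeasures:
    length: int   # len(P)
    gc: float     # GC fraction
    tm: float     # melting temperature estimate
    sa: int       # self annealing score
    sea: int      # self end annealing score


def chip_score(m, target_len=20, target_gc=0.5, target_tm=60.0,
               weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Weighted linear sum of five terms; lower is better (hypothetical form)."""
    w_len, w_gc, w_tm, w_sa, w_sea = weights
    return (w_len * abs(m.length - target_len)
            + w_gc * abs(m.gc - target_gc)
            + w_tm * abs(m.tm - target_tm)
            + w_sa * m.sa
            + w_sea * m.sea)


# Made-up candidates, only to show how the minimum is selected.
candidates = {
    "a": PrimerMeasures(20, 0.5, 60.0, 4, 2),
    "b": PrimerMeasures(22, 0.5, 60.0, 0, 0),
}
best = min(candidates, key=lambda name: chip_score(candidates[name]))
print(best)
```

With unit weights, candidate `b` wins because its two-base length deviation costs less than the annealing scores of `a`. Changing the weights changes the winner, which is why the choice of weights matters for this kind of model.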
11
Technique: parallelize SCPCR(p,q) calculation
• Calculate Len, GC, Temp, SA and SEA in parallel
• Compute PA and PEA in parallel
12
Technique: details
• Melting temperature and GC content:
  - Simple adder + divider
  - Use pipelining
  - First one: O(m)
  - Subsequent cost: O(1)
  - Example:
      Whole window: AGCGATATA
      i-th P primer: GCGATA
      (i+1)-th P primer: CGATAT
      CG(P_{i+1}) = CG(P_i) - 1
      H(P_{i+1}) = H(P_i) - H(GC) + H(AT), similar for S
• Annealing matrix
  [Figure: annealing matrix with entries ad, bd, cd, ae, be, ce, af, bf, cf for bases a, b, c against d, e, f]
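The O(m)-then-O(1) update can be sketched in software as a rolling count. The first primer is counted in full, and each later primer is obtained by subtracting the base that leaves the window and adding the base that enters. The function name is hypothetical, and this illustrates only the arithmetic, not the pipelined hardware.

```python
def rolling_gc_counts(window, length):
    """GC count of every primer of the given length, using O(1) work per shift."""
    if length > len(window):
        return []
    count = sum(base in "GC" for base in window[:length])  # first primer: O(length)
    counts = [count]
    for start in range(1, len(window) - length + 1):
        count -= window[start - 1] in "GC"            # base leaving on the left
        count += window[start + length - 1] in "GC"   # base entering on the right
        counts.append(count)
    return counts


print(rolling_gc_counts("AGCGATATA", 6))  # → [3, 3, 2, 1]
```

The step from `GCGATA` (count 3) to `CGATAT` (count 2) drops a G and gains a T, matching CG(P_{i+1}) = CG(P_i) - 1 on the slide.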
13
• Complexity for sequential algorithm, for PCR:
  - Number of choices of P (window size = W_p): [formula not preserved]
  - Number of choices of Q (window size = W_q): [formula not preserved]
  - Each distance SCPCR(P, Q): [formula not preserved]
  - Total: [formula not preserved]
• Complexity for parallel algorithm, for PCR:
  - Distance measure SCPCR(P, Q) = O(1)
  - Total: O(S*T)
• Similar but simpler for microarray
• Note: the complexity O(S*S*T*T) is a typo in the paper
14
Evaluation
• Experimental environment: 512 primer pairs, |W_p| = |W_q| = 16
  1. 500 MHz Celeron system with integrated hardware accelerator
  2. Software implementation
• Evaluation results
  - 1920 secs for the software implementation
  - 3.41 secs using the hardware accelerator (a speedup of roughly 560x)
15
Related Work
• Previous approach: DOPRIMER
  - Same computational model
  - Differs in the way dynamic programming is done
  - Sequential in nature
• Other primer selection software
  - E.g.: Primer Premier 5, Primer3, PrimerGen, PrimerDesign
  - Similarities:
      - Criteria: length, temperature range, GC range, GC clamp, 3' end stability, uniqueness of 3' end base, dimers/hairpins, degeneracy, salt concentration, annealing oligo concentration, etc.
  - Differences:
      - Not a weighted linear sum of all criteria
      - Needs much expert supervision
      - The numerical criteria are used as a guide only
16
More Related Work
• Case study: Burpo did a critical review of PCR primer design algorithms
  - Subject: Saccharomyces cerevisiae deletion strains
  - Conclusion:
      - No suitable program exists for the task of post-design PCR analysis
      - Especially in accurately predicting non-specific hybridization events that impair PCR amplification
17
Observations
• My observations:
  - Minus side:
      - Is the computational model too simplistic?
      - Specifically, is a weighted linear sum justified?
  - Plus side:
      - The design of the parallel architecture is neat.
      - Since primers are about 18-22 bases long, current technology can certainly handle it.
• When would you need fast primer selection?
  - Primer walking to connect contigs together quickly
  - Scanning through a large number of sequences for possible primers