Primer Selection Methods for Detection of Genomic Inversions and Deletions via PAMP Bhaskar DasGupta, University of Illinois at Chicago Jin Jun, and Ion.

Primer Selection Methods for Detection of Genomic Inversions and Deletions via PAMP
Bhaskar DasGupta, University of Illinois at Chicago
Jin Jun and Ion Mandoiu, University of Connecticut

Outline
- Introduction
- Anchored Deletion Detection
- Inversion Detection
- Conclusions

Genomic Structural Variation
- Deletions
- Inversions
- Translocations, insertions, fissions, fusions, …

Primer Approximation Multiplex PCR (PAMP)
- Introduced by [Liu&Carson 2007]
- Experimental technique for detecting large-scale cancer genome lesions, such as inversions and deletions, from heterogeneous samples containing a mixture of cancer and normal cells
- Can be used for
  - Tracking how genetic breakpoints are generated during cancer development
  - Monitoring the status of cancer progression with highly sensitive assays

PAMP details
A. A large number of multiplex PCR primers are selected such that
  - There is no PCR amplification in the absence of genomic lesions
  - A genomic lesion brings one or more pairs of primers into proximity of each other with high probability, resulting in PCR amplification
B. Amplification products are hybridized to a microarray to identify the pair(s) of primers that yield amplification
[Liu&Carson 2007]

Anchored Deletion Detection
- Assume that the deletion spans a known genomic location (anchored deletions)
- [Bashir et al. 2007] proposed ILP formulations and simulated annealing algorithms for PAMP primer selection for anchored deletions

Criteria for Primer Selection
- Standard criteria for multiplex PCR primer selection
  - Melting temperature, T_m
  - Lack of hairpin secondary structure
  - No dimerization between pairs of primers
- A single pair of dimerizing primers is sufficient to negate the amplification [Bashir et al. 2007]

Optimization Objective
- Multiplex PCR primer set selection
  - Minimize the number of primers and/or multiplex PCR reactions needed to amplify a given set of discrete amplification targets
- PAMP primer set selection
  - Minimize the probability that an unknown genomic lesion fails to be detected by the assay

PCR Amplification Efficiency Model
- 0-1 step model (used in our simulations): the PCR amplification success probability is 1 when the distance between the two primers is at most L, and 0 when it is L+1 or more
- Exponential decay in amplification efficiency above a certain product length
[Figure: two plots of PCR amplification success probability (0 to 1) against the distance between two primers, one per model]
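The two efficiency models can be written as small functions. This is a minimal sketch, not the authors' code: the function names and the `rate` parameter of the decay model are assumptions introduced here, since the slide gives no formula for the exponential decay.

```python
import math


def step_success(distance, L):
    """0-1 step model: success probability 1 up to length L, 0 from L+1 on."""
    return 1.0 if distance <= L else 0.0


def decay_success(distance, L, rate):
    """Exponential-decay model (assumed form): full efficiency up to L,
    then exp(-rate * excess length). `rate` is a hypothetical parameter."""
    if distance <= L:
        return 1.0
    return math.exp(-rate * (distance - L))
```

With L = 2000, `step_success(2001, 2000)` is 0.0, while `decay_success(2001, 2000, 0.001)` is still close to 1; the step model is the one the slides say was used in the simulations.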

Probabilistic Models for Lesion Location
- p_{l,r}: probability of having a lesion with endpoints l and r
- Simple model: uniform distribution, p_{l,r} = h if r-l > D, 0 otherwise
- Function of distance: p_{l,r} = f(r-l), e.g. a peak at r-l = d
- Function of hotspots: high probability around hotspots, e.g. two (pairs of) hotspots
[Figure: three plots over the (l, r) plane between x_min and x_max, one per model, marking the threshold D and height h, the line r-l = d, and the hotspots]
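For the simple uniform model, the constant h follows from requiring the probabilities to sum to 1. The sketch below assumes integer endpoints with x_min ≤ l < r ≤ x_max; that range and the function names are assumptions made for illustration.

```python
def uniform_h(x_min, x_max, D):
    """Height h of the uniform model so that all p_{l,r} with r - l > D sum to 1.
    Assumes integer endpoints x_min <= l < r <= x_max."""
    span = x_max - x_min
    # For each gap g = r - l there are (span - g + 1) placements of (l, r).
    pairs = sum(span - g + 1 for g in range(D + 1, span + 1))
    return 1.0 / pairs if pairs else 0.0


def p_uniform(l, r, h, D):
    """Uniform model: p_{l,r} = h if r - l > D, 0 otherwise."""
    return h if r - l > D else 0.0
```

For x_min = 0, x_max = 4 and D = 1 there are six admissible endpoint pairs, so h = 1/6.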

PAMP Primer Selection Problem for Anchored Deletion Detection (PAMP-DEL)
- Given:
  - Sets of forward and reverse candidate primers, {p_1, p_2, …, p_m} and {q_1, q_2, …, q_n}
  - Set E of primer pairs that form dimers
  - Maximum multiplexing degrees N_f and N_r, and amplification length upper bound L
- Find: a subset P' of at most N_f forward and at most N_r reverse primers such that
  1. P' does not include any pair of primers in E
  2. P' minimizes the failure probability $\sum_{l,r} p_{l,r}\, f(P';l,r)$, where f(P';l,r) = 1 if P' fails to yield a PCR product when the deletion with endpoints (l,r) is present in the sample, and f(P';l,r) = 0 otherwise
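The constraints of PAMP-DEL (multiplexing degrees and no dimerizing pair) can be checked directly for any candidate selection. A minimal sketch, assuming primers are identified by hashable labels and E is given as a collection of label pairs; `is_feasible` is a hypothetical helper name.

```python
def is_feasible(fwd, rev, dimers, N_f, N_r):
    """True if the selection respects both multiplexing degrees and
    contains no pair of primers listed in the dimer set E (`dimers`)."""
    if len(fwd) > N_f or len(rev) > N_r:
        return False
    chosen = set(fwd) | set(rev)
    return not any(a in chosen and b in chosen for a, b in dimers)
```

Only feasible selections are then compared by their failure probability.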

ILP Formulation for PAMP-DEL
[Figure: forward primers p_i, p_i' (positions x_i, x_i') and reverse primers q_j, q_j' (positions y_j, y_j') on the 5'-3' strand around the deletion anchor; in the (l, r) plane, the line (l-1-x_i') + (y_j'-r-1) = L separates detected from undetected deletions]
- Failure, f(P';l,r) = 1: a deletion (l_1, r_1) with (l_1-1-x_i') + (y_j'-r_1-1) > L
- Success, f(P';l,r) = 0: a deletion (l_2, r_2) with (l_2-1-x_i') + (y_j'-r_2-1) ≤ L
- 0/1 variables
  - f_i (r_i) to indicate that p_i (respectively q_i) is selected in P'
  - f_{i,j} (r_{i,j}) to indicate that p_i and p_j (respectively q_i and q_j) are consecutive primers in P'
  - e_{i,i',j,j'} to indicate that both (p_i, p_i') and (q_j, q_j') are pairs of consecutive primers in P'
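The success condition above can be evaluated by brute force for a given primer selection, which is useful for checking an ILP solution on small instances. This is a sketch under the 0-1 step model; it assumes the relevant primers are the closest selected forward primer left of l and the closest selected reverse primer right of r, and the function names and the `lesions` dictionary are illustrative.

```python
from bisect import bisect_left, bisect_right


def deletion_detected(fwd, rev, l, r, L):
    """fwd, rev: sorted positions of selected forward / reverse primers.
    Uses the slide's condition (l-1-x) + (y-r-1) <= L for the closest
    forward primer x < l and the closest reverse primer y > r."""
    i = bisect_left(fwd, l)    # fwd[:i] are strictly left of l
    j = bisect_right(rev, r)   # rev[j:] are strictly right of r
    if i == 0 or j == len(rev):
        return False
    x, y = fwd[i - 1], rev[j]
    return (l - 1 - x) + (y - r - 1) <= L


def failure_probability(fwd, rev, lesions, L):
    """lesions: dict mapping (l, r) to p_{l,r}. Sums p over undetected deletions."""
    fwd, rev = sorted(fwd), sorted(rev)
    return sum(p for (l, r), p in lesions.items()
               if not deletion_detected(fwd, rev, l, r, L))
```

For example, with one forward primer at 100 and one reverse primer at 900, a deletion (300, 700) leaves a product of length 398, so it is detected for L = 400 but not for L = 397.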

ILP Formulation for PAMP-DEL (2)
- Failure probability (objective)
- Compatibility constraints
- Path connecting constraints
- No dimerization constraints
- Max. multiplex degree constraints
[Figure: path from p_0 to p_{m+1} through primers p_i, p_j, p_k, with edges labeled f_{0,i}, f_{i,j}, f_{j,k}, f_{i,m+1}]

PAMP-1SDEL
- One-sided version of PAMP-DEL in which one of the deletion endpoints is known in advance
  - Introduced by [Bashir et al. 2007]
- Assume we know the left deletion endpoint
  - Let x_1 < x_2 < … < x_n be the hybridization positions for the reverse candidate primers q_1, …, q_n
- C_{i,j}: probability that a deletion whose right endpoint falls between x_i and x_j does not result in PCR amplification
- r_i, r_{i,j}: 0/1 decision variables similar to those in the PAMP-DEL ILP
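To illustrate the role of the C_{i,j} terms, the sketch below solves a simplified PAMP-1SDEL by dynamic programming instead of an ILP. It is not the formulation from the slides: it ignores dimerization constraints, assumes the right endpoint is uniform on (x_min, x_max], assumes all candidate positions lie in that interval, treats L as the length budget left for the reverse side, and uses a continuous approximation of C_{i,j}. All names are hypothetical.

```python
def gap_cost(a, b, L, width):
    """Approximate C_{i,j}: mass of right endpoints in (a, b] that lie
    more than L bases before the next selected primer at b."""
    return max(0.0, (b - a) - L) / width


def best_reverse_primers(xs, x_min, x_max, L, N):
    """Pick at most N positions from xs minimizing the summed gap costs.
    Returns (failure probability, chosen positions in increasing order)."""
    xs = sorted(xs)
    n, width = len(xs), x_max - x_min
    INF = float("inf")
    best = [[INF] * n for _ in range(N + 1)]   # best[k][j]: k primers, last is xs[j]
    prev = [[None] * n for _ in range(N + 1)]
    for k in range(1, N + 1):
        for j in range(n):
            if k == 1:
                best[k][j] = gap_cost(x_min, xs[j], L, width)
                continue
            for i in range(j):
                c = best[k - 1][i] + gap_cost(xs[i], xs[j], L, width)
                if c < best[k][j]:
                    best[k][j], prev[k][j] = c, i
    total, state = 1.0, None                   # selecting nothing always fails
    for k in range(1, N + 1):
        for j in range(n):
            # endpoints to the right of the last primer are never covered
            c = best[k][j] + (x_max - xs[j]) / width
            if c < total:
                total, state = c, (k, j)
    chosen = []
    while state is not None:
        k, j = state
        chosen.append(xs[j])
        state = (k - 1, prev[k][j]) if prev[k][j] is not None else None
    return total, chosen[::-1]
```

Each selected pair of consecutive primers contributes one cost term, which mirrors the path structure of the r_{i,j} variables; adding the dimerization constraints is what makes the ILP necessary in general.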

PAMP-1SDEL ILP

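Comparison to Bashir et al. Formulation
- PAMP-DEL formulation in Bashir et al.
  - Each primer is responsible for covering L/2 bases
  - Covered area by adjacent primers u, v
[Figure: forward primers on an axis with ticks at 0, L, 2L, 2.5L and 3L, a dimerization, left endpoints l_1 and l_2, and the uncovered area under the L/2-per-primer view; the cases "L/2, forward primers + l_1" and "L/2, forward primers + l_2" are shown with failure probabilities 1/2 and 0]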

Approximation Analysis
- Lemma 1. Assuming the UNIQUE GAMES conjecture, PAMP-1SDEL (and hence PAMP-DEL) cannot be approximated to within a factor of 2-ε for any constant ε > 0.
  - Proof: by reducing the vertex cover problem to PAMP-1SDEL
- Lemma 2. There is a 2-approximation algorithm for the special case of PAMP-1SDEL in which candidate primers are spaced at least L bases apart and the deletion endpoint is distributed uniformly within a fixed interval (x_min, x_max].

PAMP-DEL Heuristics
- ITERATIVE-1SDEL
  - Iteratively solve PAMP-1SDEL with the primers from the previous PAMP-1SDEL solution held fixed
  - Fixed N_f (N_r) at each step
- INCREMENTAL-1SDEL
  - ITERATIVE-1SDEL but with incremental multiplexing degrees
  - E.g. k/(2k)·N_f, (k+1)/(2k)·N_f, …, N_f, where k is the number of steps
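The incremental schedule and the alternation between the two sides can be sketched as follows. The rounding (ceiling), the order in which the sides are solved, and the `solve_one_sided` callback are assumptions; the callback stands in for a PAMP-1SDEL solver that is not shown here.

```python
def incremental_degrees(N, k):
    """Degrees (k+t)/(2k) * N for t = 0..k, rounded up (rounding is assumed)."""
    return [((k + t) * N + 2 * k - 1) // (2 * k) for t in range(k + 1)]


def incremental_1sdel(N_f, N_r, k, solve_one_sided):
    """Alternate one-sided solves with growing multiplexing degrees.
    solve_one_sided(side, fixed_other_side, degree) is a hypothetical
    callback returning the primers selected for `side`."""
    fwd, rev = [], []
    for d_f, d_r in zip(incremental_degrees(N_f, k), incremental_degrees(N_r, k)):
        fwd = solve_one_sided("forward", rev, d_f)   # reverse primers held fixed
        rev = solve_one_sided("reverse", fwd, d_r)   # forward primers held fixed
    return fwd, rev
```

For N_f = 15 and k = 3 the schedule is 8, 10, 13, 15. ITERATIVE-1SDEL corresponds to using the full degree at every step instead of this schedule.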

Comparison of PAMP-DEL Heuristics
- m = n = N_f = N_r = 15, x_max - x_min = 5Kb, L = 2Kb, 5 random instances
- The PAMP-DEL ILP can handle only very small problem instances
- Both ITERATIVE-1SDEL and INCREMENTAL-1SDEL solutions are very close to optimal for low dimerization rates
- For larger dimerization rates, the INCREMENTAL-1SDEL detection probability is still close to optimal

INCREMENTAL-1SDEL Scalability
- L = 20Kb, 5 random instances

Inversion Detection

PAMP Primer Selection Problem for Inversion Detection (PAMP-INV)
- Given:
  - Set P of candidate primers
  - Set E of dimerizing candidate primer pairs
  - Maximum multiplexing degree N and amplification length upper bound L
- Find: a subset P' of P such that
  1. |P'| ≤ N
  2. P' does not include any pair of primers in E
  3. P' minimizes the failure probability $\sum_{l,r} p_{l,r}\, f(P';l,r)$, where f(P';l,r) = 1 if P' fails to yield a PCR product when the inversion with endpoints (l,r) is present in the sample, and f(P';l,r) = 0 otherwise

ILP Formulation for PAMP-INV
[Figure: primers p_i, p_i', p_j', p_j (positions x_i, x_i', x_j', x_j) on the 5'-3' strand, shown with an inversion with endpoints (l, r); in the (l, r) plane, the line (l-1-x_i) + (r-x_j) = L bounds the detected inversions]
- Success, f(P';l,r) = 0: (l-1-x_i) + (r-x_j) ≤ L
- Failure, f(P';l',r') = 1: an inversion (l', r') that violates this bound
- 0/1 variables
  - e_i = 1 iff p_i is selected in P'
  - e_{i,j} = 1 iff p_i and p_j are consecutive primers in P'
  - e_{i,i',j,j'} = 1 iff (p_i, p_i') and (p_j, p_j') are pairs of consecutive primers in P'
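The success condition for inversions can be checked by brute force in the same way as for deletions. This sketch assumes the reading x_i < l ≤ x_j ≤ r, that is, p_i is the closest selected primer left of the inversion and p_j is the selected primer inside the inverted region closest to r. That reading and the function names are assumptions, and only the condition stated on the slide is modeled.

```python
from bisect import bisect_left, bisect_right


def inversion_detected(selected, l, r, L):
    """selected: positions of the primers in P'. Applies
    (l-1-x_i) + (r-x_j) <= L with x_i < l <= x_j <= r (assumed reading)."""
    xs = sorted(selected)
    i = bisect_left(xs, l)     # xs[:i] lie strictly left of l
    j = bisect_right(xs, r)    # xs[:j] lie at or left of r
    if i == 0 or j == i:       # no primer left of l, or none inside [l, r]
        return False
    x_i, x_j = xs[i - 1], xs[j - 1]
    return (l - 1 - x_i) + (r - x_j) <= L


def inversion_failure_probability(selected, lesions, L):
    """lesions: dict mapping (l, r) to p_{l,r}. Sums p over undetected inversions."""
    return sum(p for (l, r), p in lesions.items()
               if not inversion_detected(selected, l, r, L))
```

With primers at 100 and 500 and an inversion (200, 600), the product length is 99 + 100 = 199, so the inversion is detected for L = 300 but not for L = 150.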

ILP Formulation for PAMP-INV (2)

Detection Probability and Runtime for PAMP-INV ILP
- The PAMP-INV ILP can be solved to optimality within a few hours
- Runtime is relatively robust to changes in dimerization rate, candidate primer density, and constraints on multiplexing degree
- x_max - x_min = 100Kb, L = 20Kb, 5 random instances

Effect of Inversion Length and Dimerization Rate
- x_max - x_min = 100Kb, L = 20Kb, n = 30, dimerization rate r between 0 and 20%, and N = 20
- Detection probability is relatively insensitive to the length of the inversion

Summary
- ILP formulations for PAMP primer selection
  - Anchored deletion detection (PAMP-DEL)
  - 1-sided anchored deletion detection (PAMP-1SDEL)
  - Inversion detection (PAMP-INV)
  - Practical runtime for mid-sized PAMP-INV ILP, highly scalable PAMP-1SDEL ILP
- Heuristics for PAMP-DEL based on the PAMP-1SDEL ILP
  - Near-optimal solutions with highly scalable runtime

Questions?