Nonunique Probe Selection and Group Testing Ding-Zhu Du.

DNA Hybridization

Polymerase Chain Reaction (PCR)
A cell-free method of DNA cloning.
Advantages: much faster than cell-based methods; needs only a very small amount of target DNA.
Disadvantages: primers must be synthesized; applies only to short DNA fragments (< 5 kb).

Preparation of a DNA Library
DNA library: a collection of cloned DNA fragments, usually from a specific organism.

DNA Library Screening

Problem
If a probe does not uniquely determine a virus, i.e., a probe determines a group of viruses, how should we select a subset of probes from a given set of probes so that up to d viruses in a blood sample can be identified?

Binary Matrix
Rows are probes p_1, ..., p_t; columns are viruses c_1, ..., c_n; every entry is 0 or 1.
The cell (i, j) contains 1 iff the i-th probe hybridizes to the j-th virus.
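As a minimal sketch of this encoding, the probe-by-virus matrix can be built from a list of hybridization sets. The function name `incidence_matrix` and the input layout (one set of virus indices per probe) are assumptions made for illustration; they are not part of the slides.

```python
def incidence_matrix(hybridizes, num_viruses):
    """Build the binary probe-by-virus matrix.

    hybridizes[i] is assumed to be the set of virus indices (0-based)
    that probe i hybridizes to; this input format is hypothetical.
    """
    return [
        [1 if j in targets else 0 for j in range(num_viruses)]
        for targets in hybridizes
    ]


# Two probes, three viruses: probe 0 hits viruses 0 and 1, probe 1 hits virus 2.
example = incidence_matrix([{0, 1}, {2}], 3)
```

Here `example[i][j]` plays the role of cell (i, j) in the slide.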

Binary Matrix of Example
[Matrix: columns are viruses c_1, c_2, c_3, ..., c_j; rows are probes; entries not recoverable.]
Observation: all columns are distinct.
To identify up to d viruses, all unions of up to d columns must be distinct.

\bar{d}-Separable Matrix
Rows are probes p_1, ..., p_t; columns are viruses c_1, ..., c_n.
All unions of up to d columns are distinct.
Decoding: O(n^d).
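The definition can be checked directly by enumerating all unions of at most d columns, which also shows where the O(n^d) cost comes from. This is a brute-force sketch only; the function names are illustrative, and counting the empty set of columns as a sample (so that a sample with no positives is distinguishable) is an assumption.

```python
from itertools import combinations


def union_of_columns(matrix, cols):
    """OR the chosen columns together; this is the test-outcome vector."""
    return tuple(int(any(row[j] for j in cols)) for row in matrix)


def is_dbar_separable(matrix, d):
    """Brute force: all unions of at most d columns must be pairwise distinct.

    Assumption: the empty union (no positives) is included as a sample.
    """
    n = len(matrix[0])
    seen = set()
    for size in range(d + 1):
        for cols in combinations(range(n), size):
            outcome = union_of_columns(matrix, cols)
            if outcome in seen:
                return False  # two different samples give the same outcome
            seen.add(outcome)
    return True
```

The number of samples enumerated grows like n^d, so this is practical only for small d.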

d-Disjunct Matrix
Rows are probes p_1, ..., p_t; columns are viruses c_1, ..., c_n.
No column is contained in the union of any d other columns.
Decoding: O(n). Remove all clones in negative pools; the remaining clones are all positive.
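A small sketch of both ideas on this slide: a brute-force disjunctness check and the decoder that removes every item appearing in a negative pool. It uses the standard containment form of the definition (no column is covered by the union of d others). The names `is_d_disjunct` and `decode` are illustrative, and the decoder is a naive pass over the whole matrix rather than an optimized routine.

```python
from itertools import combinations


def is_d_disjunct(matrix, d):
    """Brute force: no column is contained in the union of d other columns."""
    t, n = len(matrix), len(matrix[0])
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for group in combinations(others, min(d, len(others))):
            # Column j is covered if every row where it has a 1
            # also has a 1 in some column of the group.
            covered = all(
                any(matrix[i][k] for k in group)
                for i in range(t)
                if matrix[i][j]
            )
            if covered:
                return False
    return True


def decode(matrix, outcomes):
    """Remove all items in negative pools; what remains is reported positive."""
    n = len(matrix[0])
    candidates = set(range(n))
    for row, outcome in zip(matrix, outcomes):
        if not outcome:
            candidates -= {j for j in range(n) if row[j]}
    return sorted(candidates)
```

When the matrix is d-disjunct and at most d items are positive, the set returned by `decode` is exactly the set of positives.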

Nonunique Probe Selection
Given a binary matrix, find a d-separable submatrix with the same number of columns and the minimum number of rows.
Given a binary matrix, find a d-disjunct submatrix with the same number of columns and the minimum number of rows.
Given a binary matrix, find a \bar{d}-separable submatrix with the same number of columns and the minimum number of rows.

Classical Group Testing Model
Given n items, some of which are positive, identify all positive items using as few tests as possible.
Each test is on a subset of items.
The test outcome is positive iff the subset contains a positive item.
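The test model can be simulated in a few lines. The sketch below also includes binary splitting, one classical sequential strategy, under the added assumption that exactly one item is positive; it is an illustration of the model and not necessarily the example shown in the slides. All names are hypothetical.

```python
def run_test(pool, positives):
    """A test on `pool` is positive iff the pool contains a positive item."""
    return any(item in positives for item in pool)


def find_single_positive(items, positives):
    """Binary splitting; assumes exactly one positive among `items`.

    Returns the positive item and the number of tests used.
    """
    candidates = list(items)
    tests = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        tests += 1
        if run_test(half, positives):
            candidates = half
        else:
            candidates = candidates[len(half):]
    return candidates[0], tests


found, used = find_single_positive(range(8), {5})
```

With 8 items and a single positive, the loop halves the candidate list three times.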

Example 1 - Sequential

Example 2 - Non-adaptive
[Figure: pools p1, p2, p3, p4, p5, p6]
O( ) tests for n items

General Model of Nonadaptive Group Testing
Classical model: no restriction on pools.
Complex model: some restrictions on pools.
General model: given a set of pools, select pools from this set to form a d-separable (\bar{d}-separable, d-disjunct) matrix.

Minimum d-Separable Submatrix
Given a binary matrix, find a d-separable submatrix with the minimum number of rows and the same number of columns.
For any fixed d > 0, the problem is NP-hard.
In general, the problem is conjectured to be \Sigma_2^p-complete.

d-Separable Test Given a matrix M and d, is M d-separable? It is co-NP-complete.

\bar{d}-Separable Test
Given a matrix M and d, is M \bar{d}-separable? This is co-NP-complete.
(a) It is in co-NP: guess two distinct samples from the sample space S(n, \bar{d}), then check whether M gives the same test outcome on the two samples.
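The membership argument says that a "no" instance has a short certificate: two distinct samples with identical outcomes. A minimal sketch of the polynomial-time check of such a certificate follows; the function name and the representation of a sample as a collection of column indices are assumptions.

```python
def is_witness_against_separability(matrix, d, sample_a, sample_b):
    """Check a certificate that `matrix` is NOT d-bar-separable.

    A valid certificate is a pair of distinct samples, each with at most
    d positives, on which every test (row) gives the same outcome.
    """
    a, b = set(sample_a), set(sample_b)
    if a == b or len(a) > d or len(b) > d:
        return False
    return all(
        any(row[j] for j in a) == any(row[j] for j in b)
        for row in matrix
    )
```

The check reads each row once per sample, so it runs in time polynomial in the size of the matrix.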

d-Disjunct Test Given a matrix M and d, is M d-disjunct? This is co-NP-complete.

Complexity of Sequential Group Testing
Given n items, d and t, is there a group testing algorithm with at most t tests for n items with at most d positives?
The problem is in PSPACE and is conjectured to be PSPACE-complete.

Complexity of Nonadaptive Group Testing
Given n items, d and t, is there a t × n d-separable matrix?
Given n items, d and t, is there a t × n \bar{d}-separable matrix?
Given n items, d and t, is there a t × n d-disjunct matrix?

Approximation
The greedy approximation has performance ratio 1 + 2d ln n.
If NP ≠ P, then no polynomial-time approximation has performance ratio o(ln n).
If NP is not contained in DTIME(n^{log log n}), then no polynomial-time approximation has performance ratio (1 - a) ln n for any a > 0.
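One natural greedy rule, in the style of greedy set cover, is to repeatedly pick the row that separates the most still-unseparated pairs of samples. The sketch below implements that rule for the \bar{d}-separable case by explicit enumeration. It is an assumption that this matches the greedy algorithm whose ratio is quoted above, and the function name is hypothetical; the enumeration is exponential in d, so it is suitable only for tiny inputs.

```python
from itertools import combinations


def greedy_dbar_separable_rows(matrix, d):
    """Greedily choose rows until all pairs of samples are separated.

    Samples are all column subsets of size at most d (empty set included).
    Returns the chosen row indices, or None if even the full matrix
    cannot separate every pair.
    """
    t, n = len(matrix), len(matrix[0])
    samples = [c for size in range(d + 1) for c in combinations(range(n), size)]
    # outcome[i][s]: result of test (row) i on sample s
    outcome = [[any(row[j] for j in s) for s in samples] for row in matrix]
    unseparated = set(combinations(range(len(samples)), 2))
    chosen = []
    while unseparated:
        def gain(i):
            return sum(1 for a, b in unseparated if outcome[i][a] != outcome[i][b])

        best = max(range(t), key=gain)
        if gain(best) == 0:
            return None  # no row separates any remaining pair
        chosen.append(best)
        unseparated = {
            (a, b) for a, b in unseparated if outcome[best][a] == outcome[best][b]
        }
    return chosen
```

Ties are broken by the lowest row index, which is a design choice of this sketch and not something the slides specify.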

Pool Size = 2
The minimum 1-separable submatrix problem is also called the minimum test set (also: minimum test cover, minimum test collection).
The minimum test cover problem is APX-complete (the story was complicated).
The minimum 1-disjunct submatrix problem is polynomial-time solvable.

Lemma Consider a collection C of pools of size at most 2. Let G be the graph with all items as vertices and all pools of size 2 as edges. Then C gives a d-disjunct matrix if and only if every item not in a singleton pool has degree at least d+1 in G.
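The lemma turns d-disjunctness into a simple degree test. A minimal sketch, assuming items are numbered 0 to n-1 and each pool is given as a collection of one or two item indices (both representation choices are assumptions, as is the function name):

```python
from collections import defaultdict


def pools_give_d_disjunct(n, pools, d):
    """Degree test from the lemma, for pools of size at most 2.

    Every item must either lie in a singleton pool or have at least
    d + 1 distinct neighbors in the graph formed by the size-2 pools.
    """
    singletons = set()
    neighbors = defaultdict(set)
    for pool in pools:
        members = tuple(pool)
        if len(members) == 1:
            singletons.add(members[0])
        elif len(members) == 2:
            u, v = members
            neighbors[u].add(v)
            neighbors[v].add(u)
        else:
            raise ValueError("pools must have size 1 or 2")
    return all(
        item in singletons or len(neighbors[item]) >= d + 1
        for item in range(n)
    )
```

For example, a triangle on three items passes the test for d = 1 (every degree is 2) but fails for d = 2.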

Proof

Theorem Min-d-DS (the minimum d-disjunct submatrix problem) is polynomial-time solvable in the case that all given pools have size exactly 2.

Theorem 2 Min-1-DS is NP hard in the case that all given pools have size at most 2.

Proof By reduction from Vertex-Cover.

Theorem 2’ Min-1-DS is MAX SNP-complete in the case that all given pools have size at most 2.

Lemma 2 Suppose all given pools have size at most 2. Let s be the number of given singleton pools. Then any feasible solution of Min-d-DS contains at least s + (n-s)(d+1)/2 pools.

Proof
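The body of this proof did not survive in the transcript. One way to obtain the bound, sketched here from the preceding lemma and not necessarily the argument given on the slide, is a degree count. The symbols s' (singleton pools used by a feasible solution, so s' ≤ s) and m' (size-2 pools it uses) are introduced only for this sketch, and d ≥ 1 is assumed.

```latex
% s' = number of singleton pools in a feasible solution (0 <= s' <= s)
% m' = number of size-2 pools (edges) in that solution
\begin{align*}
2m' &\ge (n - s')(d+1)
  && \text{each item without a chosen singleton has degree} \ge d+1 \\
s' + m' &\ge s' + \frac{(n-s')(d+1)}{2}
  = \frac{n(d+1)}{2} - \frac{s'(d-1)}{2} \\
&\ge \frac{n(d+1)}{2} - \frac{s(d-1)}{2}
  = s + \frac{(n-s)(d+1)}{2}
  && \text{since } s' \le s \text{ and } d \ge 1
\end{align*}
```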

Step 1 Compute a minimum solution of the following polynomial-time solvable problem: Let G be the graph with all items as vertices and all given pools of size 2 as edges. Find a subgraph H, with minimum number of edges, such that every item not in a singleton pool has degree at least d+1.

Step 2 Suppose H is a minimum solution obtained in Step 1. Choose all singleton pools at vertices with degree less than d+1 in H. All edges of H and chosen singleton pools form a feasible solution of Min-d-DS.
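The two steps can be sketched as follows. Step 1 is polynomial-time solvable according to the slides, but this sketch replaces it with a brute-force search over edge subsets, so it is usable only on tiny instances. The function name, the input layout (`singleton_items` is the set of items that have a given singleton pool, `edges` lists the given pools of size 2), and the return format are all assumptions.

```python
from itertools import combinations


def approx_min_d_ds(n, singleton_items, edges, d):
    """Two-step approximation for Min-d-DS with pools of size at most 2.

    Returns (edges of H, items whose singleton pools are added),
    or None if no feasible solution exists.
    """
    singleton_items = set(singleton_items)
    must_cover = [v for v in range(n) if v not in singleton_items]

    def degrees(edge_subset):
        deg = [0] * n
        for u, v in edge_subset:
            deg[u] += 1
            deg[v] += 1
        return deg

    # Step 1 (brute-force stand-in): fewest edges such that every item
    # without a singleton pool has degree at least d + 1.
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            deg = degrees(subset)
            if all(deg[v] >= d + 1 for v in must_cover):
                # Step 2: add the singleton pool of every vertex whose
                # degree in H is below d + 1.
                extra = sorted(v for v in singleton_items if deg[v] < d + 1)
                return list(subset), extra
    return None
```

The size of the returned solution is the number of edges in H plus the number of added singleton pools.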

Theorem 3 The feasible solution obtained in the above algorithm is a polynomial-time approximation with performance ratio 1+2/(d+1).

Proof Suppose H contains m edges and k vertices of degree at least d+1. Suppose an optimal solution contains s* singletons and m* pools of size 2. Then m ≤ m* and (n-k) - s* ≤ 2m*/(d+1). Hence (n-k) + m ≤ s* + m* + 2m*/(d+1) ≤ (s* + m*)(1 + 2/(d+1)).

Pooling Design and Nonadaptive Group Testing, Ding-Zhu Du and Frank F. Hwang