Learning Hidden Graphs Hung-Lin Fu 傅 恆 霖 Department of Applied Mathematics Hsin-Chu Chiao Tung Univerity.

Slides:



Advertisements
Similar presentations
CS 336 March 19, 2012 Tandy Warnow.
Advertisements

Lecture 24 Coping with NPC and Unsolvable problems. When a problem is unsolvable, that's generally very bad news: it means there is no general algorithm.
A survey of some results on the Firefighter Problem Kah Loon Ng DIMACS Wow! I need reinforcements!
Presented by Yuval Shimron Course
1 NP-completeness Lecture 2: Jan P The class of problems that can be solved in polynomial time. e.g. gcd, shortest path, prime, etc. There are many.
1 NP-Complete Problems. 2 We discuss some hard problems:  how hard? (computational complexity)  what makes them hard?  any solutions? Definitions 
The Divide-and-Conquer Strategy
Computational problems, algorithms, runtime, hardness
NP-Complete Problems Reading Material: Chapter 10 Sections 1, 2, 3, and 4 only.
Increasing graph connectivity from 1 to 2 Guy Kortsarz Joint work with Even and Nutov.
Is the following graph Hamiltonian- connected from vertex v? a). Yes b). No c). I have absolutely no idea v.
1 Bipartite Matching Lecture 3: Jan Bipartite Matching A graph is bipartite if its vertex set can be partitioned into two subsets A and B so that.
Theta Function Lecture 24: Apr 18. Error Detection Code Given a noisy channel, and a finite alphabet V, and certain pairs that can be confounded, the.
1 On the Benefits of Adaptivity in Property Testing of Dense Graphs Joint work with Mira Gonen Dana Ron Tel-Aviv University.
1 Algorithmic Aspects in Property Testing of Dense Graphs Oded Goldreich – Weizmann Institute Dana Ron - Tel-Aviv University.
CS273a Lecture 4, Autumn 08, Batzoglou Hierarchical Sequencing.
1 On the Benefits of Adaptivity in Property Testing of Dense Graphs Joint works with Mira Gonen and Oded Goldreich Dana Ron Tel-Aviv University.
TECH Computer Science Graph Optimization Problems and Greedy Algorithms Greedy Algorithms  // Make the best choice now! Optimization Problems  Minimizing.
1 Physical Mapping --An Algorithm and An Approximation for Hybridization Mapping Shi Chen CSE497 04Mar2004.
From Haystacks to Needles AP Biology Fall Isolating Genes  Gene library: a collection of bacteria that house different cloned DNA fragments, one.
Physical Mapping of DNA Shanna Terry March 2, 2004.
Approximation Algorithms Department of Mathematics and Computer Science Drexel University.
Edge-disjoint induced subgraphs with given minimum degree Raphael Yuster 2012.
 2004 SDU Lecture 7- Minimum Spanning Tree-- Extension 1.Properties of Minimum Spanning Tree 2.Secondary Minimum Spanning Tree 3.Bottleneck.
JM - 1 Introduction to Bioinformatics: Lecture III Genome Assembly and String Matching Jarek Meller Jarek Meller Division of Biomedical.
Computational Molecular Biology Non-unique Probe Selection via Group Testing.
Pooling designs for clone library screening in the inhibitor complex model Department of Mathematics and Science National Taiwan Normal University (Lin-Kou)
Week 10Complexity of Algorithms1 Hard Computational Problems Some computational problems are hard Despite a numerous attempts we do not know any efficient.
Private Approximation of Search Problems Amos Beimel Paz Carmi Kobbi Nissim Enav Weinreb (Technion)
Lecture 10 Applications of NP-hardness. Knapsack.
Combinatorial Optimization Problems in Computational Biology Ion Mandoiu CSE Department.
Fragment assembly of DNA A typical approach to sequencing long DNA molecules is to sample and then sequence fragments from them.
Techniques for Proving NP-Completeness Show that a special case of the problem you are interested in is NP- complete. For example: The problem of finding.
Approximation Algorithms for NP-hard Combinatorial Problems Magnús M. Halldórsson Reykjavik University Local Search, Greedy and Partitioning
Nonunique Probe Selection and Group Testing Ding-Zhu Du.
Testing the independence number of hypergraphs
Graph Colouring L09: Oct 10. This Lecture Graph coloring is another important problem in graph theory. It also has many applications, including the famous.
CSE 589 Part VI. Reading Skiena, Sections 5.5 and 6.8 CLR, chapter 37.
NP-Complete Problems. Running Time v.s. Input Size Concern with problems whose complexity may be described by exponential functions. Tractable problems.
Computational Molecular Biology Non-unique Probe Selection via Group Testing.
Extensions of Graph Pebbling Carl R. Yerger*, Prof. Francis Edward Su (Advisor), Prof. Anant P. Godbole (Second Reader) Introduction Given a graph, G,
Complexity and Efficient Algorithms Group / Department of Computer Science Testing the Cluster Structure of Graphs Christian Sohler joint work with Artur.
Introduction to Graph Theory
OPERA highthroughput paired-end sequences Reconstructing optimal genomic scaffolds with.
NOTE: To change the image on this slide, select the picture and delete it. Then click the Pictures icon in the placeholder to insert your own image. Fast.
CSE280Stefano/Hossein Project: Primer design for cancer genomics.
An Algorithm for the Consecutive Ones Property Claudio Eccher.
Computational Molecular Biology
The NP class. NP-completeness Lecture2. The NP-class The NP class is a class that contains all the problems that can be decided by a Non-Deterministic.
Group Testing and Its Applications
Computational Molecular Biology Pooling Designs – Inhibitor Models.
Group Testing – An Introduction 簡介群試理論
Combinatorial Group Testing 傅恆霖應用數學系. Mathematics? You are you, you are the only one in the world just like you! Mathematics is mathematics, there is.
ICS 353: Design and Analysis of Algorithms NP-Complete Problems King Fahd University of Petroleum & Minerals Information & Computer Science Department.
Group Testing and Its Applications
Prof. Yu-Chee Tseng Department of Computer Science
Applied Discrete Mathematics Week 13: Graphs
Recent Interests and Progress Hung-Lin Fu (傅恒霖)
Lecture 2-6 Polynomial Time Hierarchy
Group Testing and Its Applications
Approximating the MST Weight in Sublinear Time
Analysis and design of algorithm
ICS 353: Design and Analysis of Algorithms
Chapter 11 Limitations of Algorithm Power
NP-Completeness Yin Tat Lee
Pooling Designs in Group Testing
Rainbow Graph Designs Hung-Lin Fu (傅 恒 霖)
Grid-Block Designs and Packings
Computational Molecular Biology
Learning a hidden graph with adaptive algorithms
Presentation transcript:

Learning Hidden Graphs Hung-Lin Fu 傅 恆 霖 Department of Applied Mathematics Hsin-Chu Chiao Tung Univerity

謹以此演講向黃光明老師致上最大 的敬意 ! 並感謝在 Group Testing 工作上給 我協助的主要伙伴 : 陳宏賓, 張惠 蘭,施智懷 

What is “ Group Testing ” ? A problem: I have a favor number in my mind and the number is in {1,2,3,…,63}. Can you find the number by asking (me) as few queries as possible? (A query asks if the number is in a certain subset of the given set.)

The answer : At most 6 (1) Is the number larger than 32? Yes or Not? (2) If the answer is “ Yes ”, then ask if the number is larger than 48 or not. On the other hand, if the answer is “ Not ”, then ask if the number is larger than 16 or not. Continued … Worst case: 6 (Information bound)

Another Idea

Group Testing Consider a set N of n items consisting of at most d positive (used to be called defective) items with the others being negative (used to be called good) items. A group test, sometimes called a pool, can be applied to an arbitrary set S of items with two possible outcomes; negative: all items in S are negative; positive: at least one positive item in S, not knowing which or how many.

Blood Testing Syphilis ( 梅毒 ) Tests (World war II)

One by one

Red is positive and Brown is negative

Keep going

More positives (72 tests)

Test a group at one time

Some group is negative!

They are done!

Save a lot of Money! Save 33-3 times What ’ s next?

Algorithms An adaptive (sequential) algorithm conducts the tests one by one and the outcomes of all previous tests can be used to set up the later test. A non-adaptive algorithm specifies a set of tests in advance so that they can be conducted simultaneously; thus forbidding using the information of previous tests.

Complex Group Testing If the positives are subsets of items instead of just individual items, then we have the complex group testing. The subsets are called complexes. The problems of complex group testing can be formulated as hidden graphs problems.

Example of Complex Group Testing In screening clone library the goal is to determine which clones in the clone library hybridize with a given probe in an efficient fashion. A clone is said to be positive if it hybridize with the probe( 探針 ), and negative otherwise.

Shotgun Sequencing Shotgun sequencing is a throughput technique resulting in the sequencing of a large number of bacterial genomes, mouse genomes and the celebrated human genomes. In all such projectss, we are left with a collection of contigs that for special reasons cannot be assembled with general assembly algorithms. Continued …

Random shotgun approach cut many times at random (Shotgun) genomic segment 6

26 Whole-genome shotgun sequencing –Short reads are obtained and covering the genome with redundancy and possible gaps. Circular genome

–Reads are assembled into contigs with unknown relative placement.

–Primers : (short) fragments of DNA characterizing ends of contigs.

–A PCR (Polymerase Chain Reaction) reaction reveals if two primers are proximate (adjacent to the same gap). –Multiplex PCR can treat multiple primers simultaneously and outputs if there is a pair of adjacent primers in the input set and even sometimes the number of such pairs.

Two primers of each contig are “ mixed together ” –Find a Hamiltonian cycle by PCRs!

Primers are treated independently. –Find a perfect matching by PCRs.

Goal Our goal is to provide an experimental protocol that identifies all pairs of adjacent primers with as few PCRs (queries) (or multiplex PCRs respectively) as possible.

Mathematical Models Hidden Graphs (Reconstructed)

Idea Edge-detecting queries: Q G (S) Q G (S) = 1 if the induced subgraph of the vertex subset S of V(G) contains at least one edge of G and Q G (S) = 0 otherwise. For convenience, we assume G is defined on [1,n] = {1, 2, …, n}.

Example (A triangle is hidden) G :

Q({1,2,3,4,5,6,7,8}) =

Q({1,2,3,4}) =

Q({1,2,3,4,5,7}) =

Q({1,2,3,4,5}) = v v = {5}, S \ {v} = {1, 2, 3, 4} Q({5,1,2}) = 0Q({5,3}) = 1

Find a matching We can delete the edge found above and then find another edge if there is any left. Algorithm 1 shows that we can find a matching by using splitting idea.

Algorithm 2 G : PARTITION_OF_VERTEX_SET(V) Add more partite sets if necessary.

Algorithm 3 It is left to find all the edges between independent sets. Now, a general graph is reconstructed.

Recent Result Theorem ( with Huilan Chand and Chie- Huai Shih, Optimization Letters, 2014 ) Let G be a graph with n vertices and m edges. Then we have an adaptive algorithm using at most m(2log 2 n + 9) queries.

Hidden hypergraphs It deals with the case that a complex may contain more than two items. The following example shows an idea of finding a hidden 3-uniform hypergraph.

Find a hyperedge Q({1,2,3,4,5})Yes Q({1,2,3})No Q({1,2,3,4})Yes {1,2,3,4,5}

Find a hyperedge Algorithm Q({1,2,3,4,5})Yes Q({1,2,3})No Q({1,2,3,4})Yes {4,1,2,3} {1,2,3,4,5} Q({4,1,2})No

Find a hyperedge Algorithm Q({1,2,3,4,5})Yes Q({1,2,3})No Q({1,2,3,4})Yes {4,1,2,3} {1,2,3,4,5} Q({4,1,2})No {4,3,1,2} Q({4,3,1})Yes

IDENTIFY-HIDDEN-r- UNIFORM-HYPERGRAPH

G=(V,E) V={1,2,3,4,5} E={{1,2,3}, {1,2,4}, {1,3,4}, {1,4,5}}

Step 1

Initial setting: S 1 = {1,2,3,4,5} 1 23 Find {1,4,5} 45 S1S1 V={1,2,3,4,5} E={{1,2,3},{1,2,4},{1,3,4},{1,4,5}}

Initial setting: S 0 = {1,2,3,4,5} 1 23 Find {1,4,5} 45 S0S0 V={1,2,3,4,5} E={{1,2,3},{1,2,4},{1,3,4},{1,4,5}}

Initial setting: S 1 = {1,2,3,4,5} 1 23 Find {1,4,5} 45 S 1 = {1,2,3} S 2 = {4} S 3 = {5} S1S1 S3S3 S2S2 V={1,2,3,4,5} E={{1,2,3},{1,2,4},{1,3,4},{1,4,5}}

V={1,2,3,4,5} E={{1,2,3},{1,2,4},{1,3,4},{1,4,5}} S1S1 S3S3 S2S2 S 1 = {1,2,3} S 2 = {4} S 3 = {5}

Find {1,3,4} 45 S1S1 S3S3 S2S2 S 1 = {1,2,3} S 2 = {4} S 3 = {5} S 1 = {1,2} S 2 = {4} S 3 = {3,5} V={1,2,3,4,5} E={{1,2,3},{1,2,4},{1,3,4},{1,4,5}}

S1S1 S3S3 S2S2 V={1,2,3,4,5} E={{1,2,3},{1,2,4},{1,3,4},{1,4,5}} Find {1,2,4} S 1 = {1,2} S 2 = {4} S 3 = {3,5} S 1 = {1} S 2 = {4} S 3 = {2,3,5}

S1S1 S3S3 S2S2 S 1 = {1} S 2 = {4} S 3 = {2,3,5} V={1,2,3,4,5} E={{1,2,3},{1,2,4},{1,3,4},{1,4,5}} Find {1,2,3} S4S4 S 1 = {1} S 2 = {4} S 3 = {2,5} S 4 = {3} 1

S1S1 S3S3 S2S2 V={1,2,3,4,5} E={{1,2,3},{1,2,4},{1,3,4},{1,4,5}} S4S4 S 1 = {1} S 2 = {4} S 3 = {2,5} S 4 = {3}

Step 2

S1S1 S2S2 S3S3 S4S4

S1S1 S2S2 S4S4

S1S1 S2S2 S4S4 Find {1,4,9}

v S1S1 S2S2 S4S4 v

v S1S1 S2S2 S4S4 vv

Conclusion We can find a hidden r-uniform hypergraph G with m  r  log 2 n + (2 r+2 ) 1/2 r r m r/2 edge-detecting queries where m is the number of edges in G. It is by adaptive algorithm.

Problems We believe that the best answer should be around (mrlog 2 n) queries, but not able to prove this fact. We have no good idea about non- adaptive algorithm for optimal result at this moment.

More Problems If there is an inhibitor, then how do we use adaptive algorithm to learn a hidden graph? What if errors occurred when adaptive algorithm is utilized?

Reference A new construction of 3-bar-separable matrices via improved decoding of Macula construction (with F. K. Hwang), Discrete Optim., 5(2008), An upper bound of the number of tests in pooling designs for the error-tolerant complex model (with Hong-Bin. Chen and F. K. Hwang), Optimization Letters, 2(2008), no. 3, The minimum number of e-vertex-cover among hypergraphs with e edges of given rank (with F. H. Chang, F. K. Hwang and B. C. Lin), Discrete Applied Math., 157(2009), Non-adaptive algorithms for threshold group testing (with Hong-Bin Chen), Discrete Applied Math., 157(2009),

Continued Identification and classification problem on pooling designs for inhibitor models (with Huilan Chang and Hong-Bin Chen), J. Computational Biology, 17(2010), No. 7, Reconstruction of hidden graphs and threshold group testing (with Huilan Chang, Hong-Bin Chen and Chi-Huai Shih), J. Combin. Optimization, 22(2011), no. 2, 270 – 281. Group testing with multiple mutually-obscuring positives (with Hong-Bin Chen), Lecture Notes Computer Science, 7777(2013), 557 – 568. Threshold group testing on inhibitor model (Huilan Chang and Chie-Huai Shih), J. Computational Biology, Vol. 20, No. 6, 2013, 1 – 7.