Learning a hidden graph with adaptive algorithms


1 Learning a hidden graph with adaptive algorithms
Hung-Lin Fu, Department of Applied Mathematics, National Chiao Tung University, Hsin Chu, Taiwan

2 Motivated by bioinformatics applications
Introduction

3 Random shotgun approach
A genomic segment is cut many times at random (shotgun).

4 Whole-genome shotgun sequencing
Short reads are obtained that cover the genome with redundancy and possible gaps. [Figure: circular genome.]

5 Reads are assembled into contigs with unknown relative placement.

6 Primers : (short) fragments of DNA characterizing ends of contigs.

7 A PCR (polymerase chain reaction) reveals whether two primers are proximate (adjacent to the same gap). Multiplex PCR can treat multiple primers simultaneously and reports whether the input set contains a pair of adjacent primers, and sometimes even the number of such pairs.

8 Two primers of each contig are “mixed together”
Find a Hamiltonian cycle by PCRs!

9 Primers are treated independently.
Find a perfect matching by PCRs.

10 Goal
Our goal is to provide an experimental protocol that identifies all pairs of adjacent primers with as few PCRs (queries), or multiplex PCRs respectively, as possible.

11 Mathematical Models: Hidden Graphs (Reconstructed)
Topology-known graphs, e.g. Hamiltonian cycle, matching, star, clique, bipartite graph, etc.
Graphs of bounded degree
Hypergraphs
Graphs with a known number of edges

12 Models
Learning a hidden graph by edge-detecting queries:
Multi-vertex model
Quantitative multi-vertex model
k-vertex model
Quantitative k-multi-vertex model
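The query models named on this slide can be pictured as an oracle over a hidden edge set. The following is a minimal Python sketch under stated assumptions: the class `HiddenGraph`, its method names, and the edge list `EDGES` are illustrative and do not come from the slides. It also assumes that the k-vertex models restrict a query to at most k vertices, and that the quantitative models return the number of induced edges instead of a yes/no answer.

```python
class HiddenGraph:
    """Query oracle over a hidden edge set (illustrative names, not from the slides)."""

    def __init__(self, edges, k=None):
        self._edges = {frozenset(e) for e in edges}
        self._k = k            # assumed meaning of "k-vertex": at most k vertices per query
        self.num_queries = 0   # every answered query is counted

    def _induced_edges(self, vertices):
        vs = set(vertices)
        if self._k is not None and len(vs) > self._k:
            raise ValueError("query exceeds the size limit k")
        self.num_queries += 1
        return [e for e in self._edges if e <= vs]

    def query(self, vertices):
        """Edge-detecting query: 1 if the set induces at least one edge, else 0."""
        return 1 if self._induced_edges(vertices) else 0

    def count_query(self, vertices):
        """Quantitative query: the number of edges induced by the set."""
        return len(self._induced_edges(vertices))


# Hypothetical edge set on vertices 1..8, chosen only for illustration.
EDGES = [(3, 5), (4, 8), (6, 7), (2, 6)]
```

A learner only ever calls `query` or `count_query`; the field `num_queries` is what the complexity bounds in the talk count.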

13 Part II: Described in mathematical terms (hidden graph)
Algorithms
Adaptive algorithms: a query can depend on the answers to previous queries.
Nonadaptive algorithms: queries are independent of one another and can be processed in parallel.

14 Example
[Figure: hidden graph G on the vertices 1 to 8.]

15 Q({1,2,3,4,5,6,7,8}) = 1

16 Q({1,2,3,4}) = 0

17 Q({1,2,3,4,5,7}) = 1

18 v = 5, S \ {v} = {1, 2, 3, 4}
Q({1,2,3,4,5}) = 1
Q({5,1,2}) = 0
Q({5,3}) = 1
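The queries on slides 15 to 18 look like a halving search: once {1, 2, 3, 4} is known to be independent and adding v = 5 produces an edge, each further query discards half of the remaining candidates. Below is a minimal Python sketch of that idea. The helper names `make_oracle` and `find_neighbor` and the edge list are assumptions made for illustration; the edge list is merely consistent with the answers shown on the slides and is not the actual graph G.

```python
def make_oracle(edges):
    """Build an edge-detecting query Q plus a log of (sorted set, answer) pairs."""
    edge_set = {frozenset(e) for e in edges}
    log = []

    def Q(vertices):
        vs = set(vertices)
        answer = int(any(e <= vs for e in edge_set))
        log.append((sorted(vs), answer))
        return answer

    return Q, log


def find_neighbor(Q, v, independent):
    """Return one neighbor of v inside an independent set, or None if there is none."""
    candidates = sorted(independent)
    if not candidates or Q([v] + candidates) == 0:
        return None
    # The candidates are independent, so any detected edge must contain v.
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if Q([v] + half) == 1:
            candidates = half
        else:
            candidates = candidates[len(half):]
    return candidates[0]


# Hypothetical edges, consistent with the query answers on slides 15 to 18.
EDGES = [(3, 5), (4, 8), (6, 7), (2, 6)]
Q, log = make_oracle(EDGES)
neighbor = find_neighbor(Q, 5, {1, 2, 3, 4})
```

With these assumed edges the search asks exactly the three queries of slide 18, Q({1,2,3,4,5}), Q({5,1,2}) and Q({5,3}), and uses about lg |S| queries in general.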

19 Known Results (Matching)
The information-theoretic lower bound for matching is [formula missing in source].
A (1+o(1)) n lg n bound can be reached by an adaptive algorithm [Bouvel et al. '05].
Nonadaptive algorithms require [bound missing in source] queries [Alon, Beigel, Kasif, Rudich, Sudakov '02].

20 Strategy: first find one vertex
Theorem [Angluin '06]: A vertex in a hidden graph on n vertices can be reconstructed with at most [bound missing in source] queries.

21 Results Example of Find-One-Vertex

22 Known Results on Other Graphs
Hamiltonian cycle: [lower bound] [upper bound]
Star

23 Hamiltonian cycle (adaptive)
An O(n lg n) bound can be reached by an adaptive algorithm [Grebinski, Kucherov 1997].
Proof. Process the vertices one by one, storing them in an independent set of chains.
Case I: no/no
Case II: yes/no
Case III: yes/yes
At most 2n lg n queries.

24 How about more general graphs?

25 Lower bound
Theorem 3. For any [condition missing in source], [bound missing in source] edge-detecting queries are required to identify a graph drawn from the class of all graphs with n vertices and m edges.
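The bound and proof of Theorem 3 did not survive in this transcript. As general background only, and not necessarily the argument used on the slide, a standard counting bound applies to any deterministic strategy with yes/no answers: q queries can distinguish at most 2^q graphs, so at least ceil(log2 N) queries are needed when N graphs are possible. The Python sketch below evaluates this for the class of graphs with n vertices and m edges; the function name `counting_lower_bound` is an assumption made for illustration.

```python
from math import comb


def counting_lower_bound(n, m):
    """ceil(log2(number of graphs with n labeled vertices and exactly m edges))."""
    num_graphs = comb(comb(n, 2), m)   # choose m of the C(n, 2) possible edges
    if num_graphs == 0:
        raise ValueError("no graph with n vertices has m edges")
    # For N >= 1, (N - 1).bit_length() equals ceil(log2(N)) in exact integer arithmetic.
    return (num_graphs - 1).bit_length()


bound_8_4 = counting_lower_bound(8, 4)   # graphs on 8 vertices with 4 edges
```

Because C(C(n,2), m) is at least (C(n,2)/m)^m, this counting bound grows at least like m lg(C(n,2)/m).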

26 Main Ideas
If there are edges between two independent sets A and B, we may find all of them by using the (a, B)-algorithm for each a ∈ A.
We start by finding a maximal matching!
Algorithm 1. MAXIMAL_MATCHING(V)
Algorithm 2. PARTITION_OF_VERTEX_SET(V)
Algorithm 3. HIDDEN_GRAPH(V)
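The slides do not spell out the (a, B)-algorithm. One plausible reading, sketched below in Python as an assumption, is a recursive halving search that finds every neighbor of a single vertex a inside an independent set B, repeated for each a in A. The names `make_oracle`, `neighbors_in`, and `edges_between` and the example edges are hypothetical.

```python
def make_oracle(edges):
    """Build an edge-detecting query Q over a hidden edge set."""
    edge_set = {frozenset(e) for e in edges}

    def Q(vertices):
        vs = set(vertices)
        return int(any(e <= vs for e in edge_set))

    return Q


def neighbors_in(Q, a, B):
    """All neighbors of a in B, assuming B is independent and a is not in B."""
    B = sorted(B)
    if not B or Q([a] + B) == 0:
        return []                # no edge from a into this part of B
    if len(B) == 1:
        return B                 # the single remaining vertex is a neighbor
    mid = len(B) // 2
    return neighbors_in(Q, a, B[:mid]) + neighbors_in(Q, a, B[mid:])


def edges_between(Q, A, B):
    """All edges with one endpoint in A and the other in the independent set B."""
    return [(a, b) for a in sorted(A) for b in neighbors_in(Q, a, B)]


# Hypothetical bipartite example: A = {1,2,3,4}, B = {5,6,7,8}.
Q = make_oracle([(1, 5), (1, 7), (2, 6), (3, 8)])
found = edges_between(Q, {1, 2, 3, 4}, {5, 6, 7, 8})
```

Each neighbor found this way costs on the order of lg |B| queries, which is in the spirit of a per-edge logarithmic query cost.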

27 Reference
Reconstructing a Hamiltonian cycle by querying the graph: Application to DNA physical mapping [Grebinski and Kucherov '98]
Learning a hidden matching [N. Alon et al. '04]
Learning a hidden graph using O(lg n) queries per edge [Angluin and Chen '04]
Learning a hidden subgraph [Alon and Asodi '05]
Combinatorial search on graphs motivated by bioinformatics applications: a brief survey [Bouvel, Grebinski and Kucherov '05]
Learning a hidden hypergraph [Angluin and Chen '06]

28 Example (Algorithm A(V): Finding an edge on V)
MAXIMAL_MATCHING(V)
[Figure: graph G on the vertices 1 to 8.]
Algorithm A({1,2,3,4,5,6,7,8}) returns the edge {1, 3}
Algorithm A({2,4,5,6,7,8}) returns the edge {2, 4}
Algorithm A({5,6,7,8}) returns the edge {5, 7}
Q({8,6}) = 0
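The trace above suggests that MAXIMAL_MATCHING repeatedly calls Algorithm A on the unmatched vertices, removes the endpoints of the edge found, and stops when the rest is independent. The slides do not show the internals of Algorithm A, so the Python sketch below swaps in one common edge-finding technique as an assumption: binary search for the shortest prefix that contains an edge, then a halving search for the last vertex's neighbor. All names and the edge list are hypothetical; the edges are chosen only so that the run matches the trace on this slide.

```python
def make_oracle(edges):
    """Build an edge-detecting query Q over a hidden edge set."""
    edge_set = {frozenset(e) for e in edges}

    def Q(vertices):
        vs = set(vertices)
        return int(any(e <= vs for e in edge_set))

    return Q


def find_edge(Q, vertices):
    """Return one edge inside `vertices`, or None if the set is independent."""
    vs = sorted(vertices)
    if len(vs) < 2 or Q(vs) == 0:
        return None
    # Invariant: vs[:lo] has no edge (a single vertex never does), vs[:hi] has one.
    lo, hi = 1, len(vs)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if Q(vs[:mid]) == 1:
            hi = mid
        else:
            lo = mid
    # vs[:hi-1] is independent, so v = vs[hi-1] has a neighbor in it.
    v, rest = vs[hi - 1], vs[: hi - 1]
    while len(rest) > 1:
        half = rest[: len(rest) // 2]
        if Q([v] + half) == 1:
            rest = half
        else:
            rest = rest[len(half):]
    return (rest[0], v)


def maximal_matching(Q, vertices):
    """Greedy maximal matching; also returns the independent set left over."""
    remaining = set(vertices)
    matching = []
    while True:
        edge = find_edge(Q, remaining)
        if edge is None:
            return matching, remaining
        matching.append(edge)
        remaining -= set(edge)


# Hypothetical edges chosen so the run reproduces the slide's trace.
Q = make_oracle([(1, 3), (2, 4), (5, 7), (3, 6), (4, 8), (7, 8)])
matching, leftover = maximal_matching(Q, range(1, 9))
```

Each call to `find_edge` uses roughly 2 lg n queries, so a matching with k edges costs on the order of k lg n queries in this sketch.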

29 Algorithm 2 PARTITION_OF_VERTEX_SET(V)
[Figure: graph G on the vertices 1 to 8 and the resulting partition of its vertex set.]

30 Algorithm 3
It remains to find all the edges between the independent sets. Now a general graph is reconstructed.

31 Don’t Stop!

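32 Complexity
The number of queries is less than 2m(log n + 9).
Algorithm 1. [Table: number of queries for lines 2 and 3, and the total; the entries are not recoverable.]

33 Algorithm 2. Algorithm 3.
[Tables: number of queries per line; the entries are not recoverable. One table has rows for lines 2, 3, and the total. The other has rows for lines 1, 7, 14+17, 15+18, 26, and the total, with a note that one entry is 0 because all of those queries are answered in line 10 of Algorithm 2.]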

34 Concluding remarks
Reduce the rounds of Algorithm 1 (i.e., obtain an efficient algorithm to find a maximal matching).
Learn a hidden graph in the quantitative k-multi-vertex model.

35 References
[1] N. Alon, R. Beigel, S. Kasif, S. Rudich, and B. Sudakov. Learning a hidden matching. The 43rd Annual IEEE Symposium on Foundations of Computer Science, 197–206, 2002.
[2] D. Angluin and J. Chen. Learning a hidden graph using O(log n) queries per edge. Manuscript, 2006.
[3] D. Angluin and J. Chen. Learning a hidden hypergraph. Journal of Machine Learning Research 7, 2007.
[4] R. Beigel, N. Alon, S. Kasif, M. S. Apaydin, and L. Fortnow. An optimal procedure for gap closing in whole genome shotgun sequencing. In RECOMB, 22–30, 2001.
[5] V. Grebinski and G. Kucherov. Optimal query bounds for reconstructing a Hamiltonian cycle in complete graphs. In Fifth Israel Symposium on the Theory of Computing Systems, 1997.
[6] V. Grebinski and G. Kucherov. Reconstructing a Hamiltonian cycle by querying the graph: Application to DNA physical mapping. Discrete Applied Math., 88(1-3): 147–165, 1998.

36 Thank you for your attention!

