Published by Allan Taylor. Modified over 9 years ago.
1
Incomplete Directed Perfect Phylogeny
Itsik Pe'er, Tal Pupko, Ron Shamir, and Roded Sharan
SIAM Journal on Computing, Volume 33, Number 3, pp. 590-607
2
Abstract
Perfect phylogeny is one of the fundamental models for studying evolution. We investigate the following variant of the model: The input is a species-characters matrix. The characters are binary and directed, i.e., a species can only gain characters. The difference from standard perfect phylogeny is that for some species the states of some characters are unknown. The question is whether one can complete the missing states in a way that admits a perfect phylogeny. The problem arises in classical phylogenetic studies, when some states are missing or undetermined.
3
Abstract (cont.)
Quite recently, studies that infer phylogenies using inserted repeat elements in DNA gave rise to the same problem. Extant solutions for it take time $O(n^2 m)$ for n species and m characters. We provide a graph theoretic formulation of the problem as a graph sandwich problem, and give near-optimal $\tilde{O}(nm)$-time algorithms for the problem. We also study the problem of finding a single, general solution tree, from which any other solution can be obtained by node splitting. We provide an algorithm to construct such a tree, or determine that none exists.
4
Problem

An incomplete matrix A:

      c1  c2  c3  c4  c5
s1    1   ?   0   0   1
s2    ?   ?   0   1   0
s3    ?   0   1   ?   ?

A completion B of A:

      c1  c2  c3  c4  c5
s1    1   0   0   0   1
s2    1   1   0   1   0
s3    1   0   1   0   1

[Figure: a phylogenetic tree that explains A via B. Edge labels: c1; c5; c3; c2, c4. Leaves: s2, s1, s3.]
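The relation between A and B on this slide can be checked mechanically. The sketch below is only an illustration: the function name `is_completion` and the encoding of each matrix as a dict from species name to a string over `0`, `1`, `?` are assumptions made here, not notation from the paper.

```python
def is_completion(incomplete, completed):
    """Return True if `completed` is fully binary and agrees with every
    known (non-'?') entry of `incomplete`."""
    for species, row in incomplete.items():
        full = completed.get(species)
        if full is None or len(full) != len(row):
            return False
        for known, filled in zip(row, full):
            if filled not in ("0", "1"):
                return False  # a completion may not contain unknown states
            if known != "?" and known != filled:
                return False  # a known state was changed
    return True


# The matrices from the slide (rows s1..s3, columns c1..c5).
A = {"s1": "1?001", "s2": "??010", "s3": "?01??"}
B = {"s1": "10001", "s2": "11010", "s3": "10101"}
print(is_completion(A, B))
```

Being a completion is only half of the problem: B must also admit a phylogenetic tree.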
5
Problem (cont.)

[Figure: the Σ subgraph, on vertices c2, c1, s3, s1, s2.]

      c1  c2
s1    1   1
s2    1   0
s3    0   1

A binary matrix B has a phylogenetic tree iff the 1-sets of every two characters are compatible. (Two sets are compatible if they are either disjoint or one of them contains the other.)
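The pairwise compatibility condition translates directly into a brute-force test. The following is a minimal sketch, assuming a complete binary matrix stored as a dict from species name to a string of `0`/`1`; the helper names are hypothetical, and the quadratic loop over character pairs is chosen for clarity, not efficiency.

```python
from itertools import combinations


def one_sets(matrix):
    """Map each character index to its 1-set (the species having state 1)."""
    n_chars = len(next(iter(matrix.values())))
    return [{s for s, row in matrix.items() if row[c] == "1"}
            for c in range(n_chars)]


def compatible(a, b):
    """Two sets are compatible if they are disjoint or one contains the other."""
    return a.isdisjoint(b) or a <= b or b <= a


def has_phylogenetic_tree(matrix):
    """Check the condition on the slide: every two 1-sets are compatible."""
    return all(compatible(a, b) for a, b in combinations(one_sets(matrix), 2))


# The 3x2 matrix shown with the Σ subgraph: the 1-sets {s1, s2} and {s1, s3}
# overlap without containment, so no tree exists.
sigma = {"s1": "11", "s2": "10", "s3": "01"}
print(has_phylogenetic_tree(sigma))
```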
6
Algorithm (Divide and Conquer)

Alg(A = ((S, C), E_0, E_?, E_1)):
1. If |S| > 1 then do:
   (a) Remove all S-semi-universal characters and all null characters from G(A).
   (b) If the resulting graph G' is connected then output False and halt.
   (c) Otherwise, let K_1 … K_r be the connected components of G', and let A_1 … A_r be the corresponding submatrices of A.
   (d) For i = 1 … r do: Alg(A_i).
2. Output S.
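The recursion above can be sketched in a few lines of Python. Several things here are assumptions, since the slides do not define them: G(A) is taken to have an edge between species s and character c exactly when A[s, c] = 1; a character is treated as S-semi-universal when it has no 0 entry among the species in S, and as null when it has no 1 entry among them. The function name, the dict-of-strings encoding, and the returned completion are illustrative choices. This is a naive version and does not attempt the near-optimal running time mentioned in the abstract.

```python
def complete_directed_phylogeny(matrix):
    """Naive sketch of Alg. `matrix` maps species -> string over '0', '1', '?'.
    Returns a completed matrix (same encoding) or None if Alg outputs False."""
    species = list(matrix)
    n_chars = len(matrix[species[0]])
    one_set = {c: set() for c in range(n_chars)}  # 1-set chosen per character

    def alg(S, C):
        # Step (a): drop semi-universal and null characters (assumed definitions).
        # Done for |S| = 1 as well, so leftover characters get their 1-set.
        kept = []
        for c in C:
            column = [matrix[s][c] for s in S]
            if "0" not in column:
                one_set[c] = set(S)      # semi-universal: all of S gets state 1
            elif "1" in column:
                kept.append(c)           # still undecided, stays in G'
            # otherwise null: its 1-set stays empty
        if len(S) > 1:
            # Steps (b)-(c): connected components of G', using 1-entries as edges.
            unseen, components = set(S), []
            while unseen:
                start = unseen.pop()
                comp_s, comp_c, stack = {start}, set(), [start]
                while stack:
                    s = stack.pop()
                    for c in kept:
                        if matrix[s][c] == "1" and c not in comp_c:
                            comp_c.add(c)
                            for t in S:
                                if matrix[t][c] == "1" and t in unseen:
                                    unseen.discard(t)
                                    comp_s.add(t)
                                    stack.append(t)
                components.append((comp_s, comp_c))
            if len(components) == 1:
                return False             # G' is connected: no solution
            # Step (d): recurse on each component's submatrix.
            for comp_s, comp_c in components:
                if not alg(sorted(comp_s), sorted(comp_c)):
                    return False
        return True                      # Step 2: S is a cluster of the tree

    if not alg(species, list(range(n_chars))):
        return None
    return {s: "".join("1" if s in one_set[c] else "0" for c in range(n_chars))
            for s in species}


# A 5x5 incomplete input (rows s1..s5, columns c1..c5).
example = {"s1": "1?00?", "s2": "11??0", "s3": "?11?0",
           "s4": "??11?", "s5": "?0?10"}
print(complete_directed_phylogeny(example))
```

Under these assumptions, each character receives as its 1-set the species set S at which it first becomes semi-universal, so the 1-sets come from nested or disjoint recursion sets and are pairwise compatible.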
7
Example

Incomplete matrix:

      c1  c2  c3  c4  c5
s1    1   ?   0   0   ?
s2    1   1   ?   ?   0
s3    ?   1   1   ?   0
s4    ?   ?   1   1   ?
s5    ?   0   ?   1   0

Completed matrix:

      c1  c2  c3  c4  c5
s1    1   0   0   0   0
s2    1   1   1   1   0
s3    1   1   1   1   0
s4    1   0   1   1   0
s5    1   0   1   1   0