Published by Cordelia Harmon. Modified over 9 years ago.
2
Constructing evolutionary trees from rooted triples
Bang Ye Wu
Dept. of Computer Science and Information Engineering, Shu-Te University
3
An evolutionary tree
A rooted tree.
Each leaf represents one species.
Internal nodes are unlabelled (inferred common ancestors).
[Figure: a rooted tree with leaves a, b, c, d, e, f]
4
A (rooted) triple (triplet)
An evolutionary tree of 3 species.
A constraint in an evolutionary tree construction problem.
(c(ab)): lca(b,c) = lca(c,a), and this node is a proper ancestor of lca(a,b), where lca denotes the lowest common ancestor.
That is, a and b should be closer to each other than a and c or b and c.
[Figure: the triple (c(ab)) with leaves a, b, c]
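The definition of a triple can be checked mechanically. The following is a minimal sketch, assuming a tree is stored as nested Python tuples with strings as leaves; the names `leaves`, `lca_cluster`, and `satisfies` are illustrative and not from the slides. It uses the fact that (x(ab)) holds exactly when x lies outside the subtree rooted at lca(a,b).

```python
def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def lca_cluster(tree, a, b):
    """Return the leaf set below lca(a, b), or None if a or b is not in the tree."""
    here = leaves(tree)
    if a not in here or b not in here:
        return None
    if not isinstance(tree, str):
        # Descend while some child still contains both a and b.
        for child in tree:
            below = lca_cluster(child, a, b)
            if below is not None:
                return below
    return here


def satisfies(tree, x, a, b):
    """True if the tree satisfies the triple (x(ab)); x, a, b assumed distinct."""
    cluster = lca_cluster(tree, a, b)
    return cluster is not None and x in leaves(tree) and x not in cluster


tree = (("a", "d"), ("b", "c"))
print(satisfies(tree, "a", "b", "c"))  # (a(bc)) holds in this tree
print(satisfies(tree, "a", "b", "d"))  # (a(bd)) does not
```

Recomputing leaf sets at every level keeps the sketch short; a real implementation would precompute them or use an lca data structure.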
5
A tree compatible with triples
Given a set of triples, construct a tree satisfying all the triples.
Deciding whether such a tree exists, and constructing one if it does, is solvable in polynomial time [Aho et al., 1981].
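As an illustration of the polynomial-time result, here is a hedged sketch in the spirit of the component-splitting procedure commonly attributed to Aho et al.: connect a and b for every triple (x(ab)) inside the current species set, split the set into connected components, and recurse. The function name `build` and the tuple encoding `(x, a, b)` for (x(ab)) are assumptions made for this example.

```python
def build(species, triples):
    """Return a nested-tuple tree satisfying all triples, or None if none exists.

    Each triple (x, a, b) encodes (x(ab)). Sketch only, not tuned for speed.
    """
    species = sorted(species)
    if len(species) == 1:
        return species[0]

    # Union-find over the current species set.
    parent = {s: s for s in species}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Only triples whose three species are all present constrain this step.
    present = set(species)
    inside = [t for t in triples if set(t) <= present]
    for _x, a, b in inside:
        parent[find(a)] = find(b)  # a and b must stay in the same child subtree

    groups = {}
    for s in species:
        groups.setdefault(find(s), []).append(s)
    if len(groups) == 1:
        return None  # no valid split at the root: the triples conflict

    children = []
    for group in groups.values():
        subtree = build(group, inside)
        if subtree is None:
            return None
        children.append(subtree)
    return tuple(children)


print(build(["a", "b", "c"], [("c", "a", "b")]))
print(build(["a", "b", "c"], [("c", "a", "b"), ("a", "b", "c")]))
```

The second call returns `None` because the two triples cannot hold in the same tree.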
6
Incompatible (conflicting) triples
[Figure panels: two conflicting triples; three conflicting triples (pairwise compatible)]
7
Two optimization problems
The maximum consensus tree:
–The tree satisfying the maximum number of triples.
–NP-hard [Jansson, 2001][Wu, to appear]
–A new heuristic algorithm [this paper]
The maximum compatible set:
–The compatible species subset of maximum cardinality.
–NP-hard [this paper]
8
Previous heuristic: Best-One-Split-First
If a species x is split from a set V, all triples (x(v1 v2)) with v1 and v2 in V will be satisfied.
Repeatedly split one species from the set.
Choose the split species greedily.
9
[Figure: Best-One-Split-First on {a,b,c,d}. c is chosen and split first, and the two triples are satisfied, leaving {a,b,d}; then b is split, leaving {a,d}.]
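The splitting loop can be sketched as follows. The slides do not state the exact greedy score, so this example assumes the simplest one: pick the species x that is the outlier of the most triples (x(v1 v2)) whose other two species are still in the set. The function name and the input triples are made up for illustration.

```python
def best_one_split_first(species, triples):
    """Greedy top-down heuristic; returns a nested-tuple caterpillar tree.

    Each triple (x, a, b) encodes (x(ab)). Assumes a non-empty species list.
    The scoring rule is an assumption, not taken from the slides.
    """
    remaining = list(species)
    order = []  # species in the order they are split off
    while len(remaining) > 2:
        pool = set(remaining)

        def gain(x):
            # Triples satisfied immediately if x is split from the pool.
            return sum(1 for t, a, b in triples
                       if t == x and a in pool and b in pool)

        chosen = max(remaining, key=gain)  # ties go to the earliest species
        order.append(chosen)
        remaining.remove(chosen)

    tree = tuple(remaining) if len(remaining) > 1 else remaining[0]
    for x in reversed(order):
        tree = (x, tree)  # the earliest split ends up nearest the root
    return tree


triples = [("c", "a", "b"), ("c", "b", "d"), ("b", "a", "d")]
print(best_one_split_first(["a", "b", "c", "d"], triples))
```

With these hypothetical triples, c is split first and b second, giving the same shape of outcome as the example figure.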
10
Previous heuristic: Min-Cut-Split-First
Construct an auxiliary graph:
–Vertex: species.
–Each edge is labeled by a set: for each triple (x(yz)), x is in the label set of edge (y,z).
11
–A bipartition corresponds to a split in the tree.
–The labels in the cut of the bipartition correspond to the triples conflicting with the split.
Repeatedly find the bipartition with minimum cut.
[Figure: a min-cut; triple (c(bd)) is conflicting]
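To make the auxiliary graph concrete, the sketch below builds the edge label sets and lists the triples that a given bipartition cuts. It does not search for the minimum cut; the function names and the sample triples are assumptions for illustration.

```python
from collections import defaultdict


def auxiliary_graph(triples):
    """Map each edge {y, z} to its label set: x for every triple (x(yz))."""
    labels = defaultdict(set)
    for x, y, z in triples:
        labels[frozenset((y, z))].add(x)
    return labels


def cut_conflicts(labels, side):
    """Return the triples (x, y, z) whose edge {y, z} crosses the bipartition.

    `side` is one part of the bipartition; the other part is everything else.
    """
    side = set(side)
    conflicts = []
    for edge, xs in labels.items():
        y, z = sorted(edge)
        if (y in side) != (z in side):  # y and z are separated by the split
            conflicts.extend((x, y, z) for x in sorted(xs))
    return sorted(conflicts)


labels = auxiliary_graph([("c", "a", "d"), ("c", "b", "d"), ("a", "b", "c")])
print(cut_conflicts(labels, {"a", "d"}))
```

Splitting {a,d} from {b,c} separates b and d, so the only triple in the cut is (c(bd)); the cut weight is the number of such triples.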
12
Previous heuristic: Best-Pair-Merge-First
Instead of top-down splitting, BPMF uses a bottom-up merging strategy.
Starting from singleton sets, we repeatedly merge pairs of sets step by step.
Scoring functions are used to evaluate which pair should be merged in each step.
13
[Figure: a merging sequence] {a}{b}{c}{d} → {a,d}{b}{c} → {a,d}{b,c} → {a,d,b,c}
14
An exact algorithm for MCTT
Dynamic programming:
F(V) = max { F(V1) + F(V2) + W(V1, V2) }, taken over all bipartitions (V1, V2) of V.
–F(V): # of satisfied triples over V.
–W(V1, V2): # of triples (x(v1 v2)) with x not in V, v1 in V1, and v2 in V2.
Computed in order of cardinality, from small to large.
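The recurrence can be sketched directly with bitmask subsets. This is a minimal illustration under stated assumptions: the name `max_satisfied` and the example triples are hypothetical, memoized recursion stands in for the bottom-up order, and the run time is exponential in the number of species.

```python
from functools import lru_cache


def max_satisfied(species, triples):
    """Maximum number of triples (x, a, b) = (x(ab)) satisfiable by one tree."""
    index = {s: i for i, s in enumerate(species)}
    coded = [(index[x], index[a], index[b]) for x, a, b in triples]

    def w(v1, v2):
        # Count triples (x(ab)) with x outside V = V1 | V2 and a, b split by (V1, V2).
        whole = v1 | v2
        count = 0
        for x, a, b in coded:
            if (whole >> x) & 1:
                continue
            a_in_1, b_in_1 = (v1 >> a) & 1, (v1 >> b) & 1
            a_in_2, b_in_2 = (v2 >> a) & 1, (v2 >> b) & 1
            if (a_in_1 and b_in_2) or (b_in_1 and a_in_2):
                count += 1
        return count

    @lru_cache(maxsize=None)
    def f(v):
        if v & (v - 1) == 0:  # a single species: nothing to satisfy
            return 0
        low = v & -v          # force V1 to hold the lowest bit, so each
        best = 0              # bipartition is visited once
        sub = (v - 1) & v
        while sub:
            if sub & low:
                rest = v ^ sub
                best = max(best, f(sub) + f(rest) + w(sub, rest))
            sub = (sub - 1) & v
        return best

    return f((1 << len(species)) - 1)


triples = [("c", "a", "b"), ("a", "b", "c"), ("d", "a", "b")]
print(max_satisfied(["a", "b", "c", "d"], triples))
```

In this made-up instance (c(ab)) and (a(bc)) conflict, so at most two of the three triples can be satisfied.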
15
[Table: values of F(V) by cardinality]
n=4: abcd 3
n=3: abc 1, abd 3, bcd 2
n=2: ab 0, ac 0, ad 2, bc 1, bd 1, cd 0
n=1: a 0, b 0, c 0, d 0
16
Our new heuristic algorithm (DPWP)
Derived from the exact algorithm.
The number of subsets kept for each cardinality is limited by a parameter K.
When K = infinity, it is just the exact algorithm.
Time-quality trade-off.
The time complexity is O(n^2 K^2 (n^3 + K)).
–Sorry, there is a mistake in the paper.
17
The experimental results (time)
21
The MCST problem
Given triples over a species set S, find a subset U of S such that all given triples over U are compatible and |U| is maximum.
We show the problem is NP-hard.
–Transformed from the Feedback Vertex Set problem.
22
The feedback vertex set problem
Feedback vertex set: a vertex subset containing at least one vertex of each cycle of the given directed graph.
–In other words, removing a feedback vertex set results in an acyclic digraph.
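The "in other words" characterization gives a simple test for a candidate set: delete its vertices and check that the remaining digraph is acyclic. The sketch below does this with Kahn's topological sort; the function name and the example graph are illustrative only, and it verifies a given set rather than finding a minimum one.

```python
from collections import deque


def is_feedback_vertex_set(vertices, edges, candidate):
    """True if removing `candidate` from the digraph leaves no directed cycle."""
    keep = set(vertices) - set(candidate)
    indegree = {v: 0 for v in keep}
    out = {v: [] for v in keep}
    for u, v in edges:
        if u in keep and v in keep:
            out[u].append(v)
            indegree[v] += 1

    # Kahn's algorithm: repeatedly remove vertices with no incoming edge.
    queue = deque(v for v in keep if indegree[v] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in out[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    # Every vertex gets removed exactly when the remaining graph is acyclic.
    return removed == len(keep)


vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
print(is_feedback_vertex_set(vertices, edges, {3}))
print(is_feedback_vertex_set(vertices, edges, {4}))
```

Removing vertex 3 breaks the cycle 1 → 2 → 3 → 1, while removing vertex 4 leaves it intact.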
23
The reduction
24
Concluding remarks
What is the approximation ratio?
–The Best-One-Split-First algorithm is a 3-approximation algorithm.
–A larger K gives a better solution, but we do not know a theoretical bound on the ratio.