Published by Erwin Geiger. Modified over 5 years ago.
1
Imputing Supertrees and Supernetworks from Quartets
By B. Holland, G. Conner, K. Huber, and V. Moulton
Presented by Razieh Nokhbeh Zaeem
2
This talk
Basic problem: constructing an estimate of a species phylogeny (in this case, a network) from a given set of gene trees
Input: a set of partial gene trees (each may lack some taxa)
Output: a supernetwork, which can display conflicting signals
The algorithm by Holland et al. combines quartet imputation with consensus network construction
Experiments compare the new method with the earlier Z-closure method and with MRP in terms of false positives (FP) and false negatives (FN)
Q-imputation provides a useful complementary tool
3
Q-imputation Some definitions: L(T), T|Z, Q(T) and
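Let … be the collection of input trees, corresponding to a collection of gene trees.
Let X be the set of all taxa that occur in at least one input tree.
For each input tree, we sequentially insert all of the taxa of X that it is missing, to get a completed tree on X.
Once all input trees have been completed, we apply the consensus network method to them to obtain a network.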
4
Polynomial-time algorithm:
For each input tree:
For each new taxon y:
Find a place to add a pendant edge labeled by y
Choose the place p that maximizes the number of quartets on which the extended tree agrees with all the other input trees
Choose randomly if more than one place gives the best score
If the maximum score is 0, we do not have enough information to place y
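The insertion step above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: trees are unrooted and stored as adjacency dictionaries, quartets are read off with unit edge lengths, and all names (`quartets`, `attach`, `insert_taxon`) and the two small example trees are hypothetical.

```python
import itertools
import random
from collections import deque

def bfs_dist(adj, src):
    # Path lengths (in edges) from src to every node of the tree.
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def quartets(adj, must_contain=None):
    # Resolved quartets ab|cd displayed by the tree, found by comparing
    # the three pairings of path lengths (assumption: unit edge lengths).
    lv = sorted(n for n, nb in adj.items() if len(nb) == 1)
    dist = {x: bfs_dist(adj, x) for x in lv}
    found = set()
    for a, b, c, d in itertools.combinations(lv, 4):
        if must_contain is not None and must_contain not in (a, b, c, d):
            continue
        options = [
            (dist[a][b] + dist[c][d], ((a, b), (c, d))),
            (dist[a][c] + dist[b][d], ((a, c), (b, d))),
            (dist[a][d] + dist[b][c], ((a, d), (b, c))),
        ]
        options.sort(key=lambda o: o[0])
        if options[0][0] < options[1][0]:  # strictly smallest sum: resolved
            p, q = options[0][1]
            found.add(frozenset([frozenset(p), frozenset(q)]))
    return found

def attach(adj, edge, y):
    # Subdivide `edge` with a new internal node and hang leaf y on it.
    u, v = edge
    new = {n: set(nb) for n, nb in adj.items()}
    w = ("node", y)  # hypothetical label for the new internal node
    new[u].discard(v)
    new[v].discard(u)
    new[u].add(w)
    new[v].add(w)
    new[w] = {u, v, y}
    new[y] = {w}
    return new

def insert_taxon(adj, y, others, rng=random):
    # `others` is a list of quartet sets, one per other input tree.
    edges = sorted({tuple(sorted((u, v), key=str)) for u in adj for v in adj[u]}, key=str)
    scored = []
    for e in edges:
        cand = attach(adj, e, y)
        new_q = quartets(cand, must_contain=y)
        scored.append((sum(len(new_q & oq) for oq in others), cand))
    best = max(score for score, _ in scored)
    if best == 0:
        return None  # not enough information to place y
    return rng.choice([cand for score, cand in scored if score == best])

# Hypothetical example: t1 displays AB|CD, t2 displays AC|BF.
t1 = {"A": {1}, "B": {1}, 1: {"A", "B", 2}, 2: {1, "C", "D"}, "C": {2}, "D": {2}}
t2 = {"A": {1}, "C": {1}, 1: {"A", "C", 2}, 2: {1, "B", "F"}, "B": {2}, "F": {2}}
completed = insert_taxon(t1, "F", [quartets(t2)])
```

In this toy case only the pendant edge of B gives a non-zero score, so F ends up next to B. Recomputing all quartets for every candidate edge is wasteful; it is done here only to keep the sketch short.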
5
An example – insert F into
FB|AD FB|AE FB|DE FA|DE FA|CE FB|AC FB|AE FB|CE FD|BC
6
The consensus network
The consensus network (a split network) displays those splits of X that appear in more than a certain proportion, t, of the trees computed by Q-imputation.
In the case t = 0 we drop the subscript t: all splits that appear at least once are displayed.
For example (t as a percentage):
If t = 100, the consensus network is the strict consensus tree
If t = 50, the consensus network is the majority-rule consensus tree
If t < 50, the consensus network may display conflicting splits
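The threshold rule can be illustrated with a short Python sketch. It assumes each completed tree is given as a set of non-trivial splits, t is a percentage, and t = 100 is read as "present in every tree" so that it yields the strict consensus; the function names and the three example trees are hypothetical.

```python
from collections import Counter

def make_split(side, taxa):
    # A split is stored as an unordered pair {side, complement}.
    side = frozenset(side)
    return frozenset([side, frozenset(taxa) - side])

def consensus_splits(trees, t):
    # Keep splits found in more than t percent of the trees.
    # Assumption: t = 100 means "in all trees" (strict consensus).
    counts = Counter(s for tree in trees for s in set(tree))
    n = len(trees)
    kept = set()
    for s, c in counts.items():
        if 100.0 * c / n > t or (t == 100 and c == n):
            kept.add(s)
    return kept

taxa = "ABCDE"
ab = make_split("AB", taxa)
de = make_split("DE", taxa)
ce = make_split("CE", taxa)
ac = make_split("AC", taxa)
trees = [{ab, de}, {ab, ce}, {ac, de}]  # three hypothetical completed trees

all_splits = consensus_splits(trees, 0)    # every split seen at least once
majority = consensus_splits(trees, 50)     # majority-rule splits
strict = consensus_splits(trees, 100)      # strict-consensus splits
```

With t = 0 the result contains both AB|CDE and AC|BDE, which cannot be displayed on one tree, so a network is needed to show them together.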
7
Simulation: three different types of input
Simulation 1: Evolution is tree-like. Gene trees are correct, but miss taxa.
Simulation 2: Evolution is tree-like. Gene trees have errors and miss taxa.
Simulation 3: Evolution is not tree-like. Random input trees.
In each simulation, three parameters were varied:
The species tree: either the completely balanced tree on 16 taxa or the completely unbalanced tree on 16 taxa
g (the number of gene trees), taking values 2, 4, 8, 16, and 32
m (the number of taxa missing), taking values 1, 2, 3, 4, 5, and 6; the missing taxa were deleted at random
One hundred repetitions were carried out for each parameter combination.
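As a quick check of the design, the parameter grid can be enumerated in Python. The variable names are hypothetical, and reading g as the number of gene trees is an assumption based on the rest of the talk; the count is consistent with the 6000 settings per simulation type quoted with the FP results.

```python
import itertools

species_trees = ["balanced", "unbalanced"]  # both on 16 taxa
g_values = [2, 4, 8, 16, 32]                # assumed: number of gene trees
m_values = [1, 2, 3, 4, 5, 6]               # number of taxa missing
repetitions = 100

# One entry per run: (species tree, g, m, repetition index).
runs = [
    (tree, g, m, rep)
    for tree, g, m in itertools.product(species_trees, g_values, m_values)
    for rep in range(repetitions)
]
n_combinations = len(species_trees) * len(g_values) * len(m_values)
```

Here `n_combinations` is 60 and `len(runs)` is 6000.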
8
Simulation
The split systems generated were:
MRP: the splits in the majority-rule consensus and in the strict consensus of the trees from MRP
Q-imputation: the split systems obtained from the Q-imputation trees at different thresholds t
Z-closure: the splits generated using Z-closure
Measuring FP and FN:
FP (false positives): splits contained in the output split system that are not in the input
FN (false negatives): splits in the input that are not in the output split system
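Read literally, FP and FN are two set differences. The Python sketch below assumes, for simplicity, that input and output splits are on the same taxon set and are stored as unordered pairs of frozensets; the names and example splits are hypothetical.

```python
def make_split(side, taxa):
    # A split is stored as an unordered pair {side, complement}.
    side = frozenset(side)
    return frozenset([side, frozenset(taxa) - side])

def fp_fn(output_splits, input_splits):
    # FP: in the output but not the input; FN: in the input but not the output.
    false_pos = output_splits - input_splits
    false_neg = input_splits - output_splits
    return false_pos, false_neg

taxa = "ABCDE"
input_splits = {make_split("AB", taxa), make_split("DE", taxa)}
output_splits = {make_split("AB", taxa), make_split("CE", taxa)}
false_pos, false_neg = fp_fn(output_splits, input_splits)
```

In this example CE|ABD is the single false positive and DE|ABC the single false negative.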
9
Weak induction property (WIP)
Definition: weak induction property (WIP): for input trees …, any split S in the output should restrict to a split in at least one of the input trees.
The WIP holds for all splits in the output when the input trees are all subtrees of one phylogenetic tree.
There are examples where the WIP does not hold, although very few were generated by Q-imputation.
Z-closure satisfies the WIP.
Any method with the WIP cannot generate FP: every split in the output comes from some tree in the input set, so no split appears in the output but not in the input.
Q-imputation with t = 0 cannot produce FN.
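The WIP check can be sketched in Python as follows. The sketch assumes that "restricts to a split" means the restriction to a tree's leaf set is a non-trivial split (at least two taxa on each side) that the tree contains; that reading, the function names, and the example are assumptions for illustration.

```python
def make_split(side, taxa):
    # A split is stored as an unordered pair {side, complement}.
    side = frozenset(side)
    return frozenset([side, frozenset(taxa) - side])

def restrict(split, leaves):
    # Restrict a split to a subset of taxa; None if it becomes trivial.
    a, b = split
    ra, rb = a & leaves, b & leaves
    if len(ra) < 2 or len(rb) < 2:
        return None
    return frozenset([ra, rb])

def satisfies_wip(output_splits, input_trees):
    # input_trees: list of (leaf set, set of splits) pairs.
    return all(
        any(restrict(s, leaves) in splits for leaves, splits in input_trees)
        for s in output_splits
    )

# Hypothetical input: one gene tree on A, B, C, D displaying AB|CD.
gene_tree = (frozenset("ABCD"), {make_split("AB", "ABCD")})
supported = make_split("ABE", "ABCDE")    # restricts to AB|CD
unsupported = make_split("AC", "ABCDE")   # restricts to AC|BD
ok = satisfies_wip({supported}, [gene_tree])
bad = satisfies_wip({supported, unsupported}, [gene_tree])
```

Here `ok` is true and `bad` is false, because AC|BDE restricts to AC|BD, which the gene tree does not display.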
10
Simulation results: FP
Z-closure cannot generate FP, so we only look at the splits from Q-imputation and MRP.
6000 different settings for each type of simulation. Normalized numbers in parentheses.
Each tree has 16 taxa and 13 internal edges.
Type / Method
Simulation 1
Simulation 2: 36 (0.006), 35 (0.006), 87 (0.015), 46 (0.008)
Simulation 3: 56 (0.009), 52 (0.009), 5252 (0.875), 4368 (0.728)
11
Simulation 1 results: FN, normalized, %
Z-closure Q-imputation20 MRP50 g m 1 2 3 4 5 6 (1b) 0.01 0.17 0.30 0.51 0.63 0.92 0.00 0.06 0.05 0.13 0.32 0.41 8 0.02 16 32 (1u) 0.23 0.44 0.78 0.07 0.15 0.26 0.03
12
Simulation 2 results: FN, normalized, %
Z-closure Q-imputation20 MRP50 g m 1 2 3 4 5 6 (2b) 0.04 0.16 0.27 0.48 0.65 0.77 0.00 0.61 0.54 0.34 0.30 0.17 0.07 0.05 0.14 0.10 1.67 1.45 1.40 1.16 1.04 0.81 8 0.01 2.89 2.59 2.49 2.06 1.81 1.42 3.30 3.01 2.81 2.47 2.23 1.91 16 6.49 6.00 5.32 4.96 4.42 3.62 6.56 6.02 5.38 5.03 4.45 3.77 32 13.13 12.16 11.15 9.83 8.66 7.61 12.19 9.84 8.67 7.62 (2u) 0.37 0.59 0.58 0.70 0.92 0.84 0.41 0.22 0.08 0.28 0.40 0.44 0.46 2.37 2.09 1.38 1.11 0.89 0.23 0.18 0.13 0.09 0.15 3.78 3.33 2.86 2.33 1.90 1.52 4.46 3.98 3.29 2.42 1.97 8.97 7.53 6.69 5.62 4.71 3.86 9.04 7.64 6.74 5.69 4.82 3.97 18.09 15.50 13.94 11.56 9.59 7.98 18.10 15.52 13.95 9.62 8.05
13
Simulation 3 results: FN, normalized, %
Z-closure Q-imputation20 MRP50 g m 1 2 3 4 5 6 (3) 0.48 0.88 0.80 0.96 0.82 0.67 0.00 2.15 1.87 1.34 0.99 0.56 0.18 0.07 0.23 0.31 0.41 0.66 5.57 4.92 4.27 3.61 2.96 2.30 8 0.01 0.08 11.38 10.09 9.02 7.64 6.53 5.21 11.95 10.76 9.72 8.34 7.26 5.95 16 25.36 22.89 20.42 18.36 16.09 13.90 24.61 22.45 20.22 17.98 15.74 13.41 32 51.85 46.86 42.29 37.80 33.50 29.21 50.05 45.77 41.52 37.16 32.74 28.31
14
Discussion on simulation results
As the number of gene trees increases:
FN produced by Z-closure decreases (good)
FN produced by Q-imputation increases (bad)
As a supertree method (simulations 1 and 2), Q-imputation tended to return fewer FP (unsupported) splits, but also fewer supported splits (more FN (?)), than MRP.
As a supernetwork method, Q-imputation tended to give rise to FP but not FN (?), whereas Z-closure gave rise to FN but no FP.
Also, in simulations where there was an underlying species tree, as the number of gene trees increased:
For Z-closure, the number of FN increased (?)
For the split system derived by applying a threshold to the trees completed by Q-imputation, the number of FN had the desirable property of decreasing (?)
For the output to be visually palatable, we need some FN, to restrict the number of splits being displayed.
Q-imputation: a natural means to filter out splits. See the case study.
15
Case study: 7 genes, 45 taxa
[Figure panels: Z-closure, Q-imputation]