Gene Tree Estimation Through Affinity Propagation


1 Gene Tree Estimation Through Affinity Propagation
Vladimir Smirnov

2 Context
The gene tree estimation problem
The Affinity Propagation clustering algorithm
Can we somehow combine the two?

3 Quick Reminder - Affinity Propagation
We have a set of data points with a notion of “similarity”
Each point chooses a representative
The algorithm (approximately) optimizes the total similarity
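
As a refresher on the message passing behind this slide, here is a minimal pure-Python sketch of Affinity Propagation following the responsibility and availability updates of Frey and Dueck (2007). The toy 1-D points, the negative squared distance similarity, the preference of -1.0 on the diagonal, and the damping and iteration settings are all illustrative assumptions, not values from the talk.

```python
def affinity_propagation(similarity, damping=0.5, iterations=200):
    """Return, for each point, the index of the representative it settles on."""
    n = len(similarity)
    resp = [[0.0] * n for _ in range(n)]    # responsibilities r(i, k)
    avail = [[0.0] * n for _ in range(n)]   # availabilities a(i, k)
    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of [a(i, k') + s(i, k')]
        for i in range(n):
            scores = [avail[i][k] + similarity[i][k] for k in range(n)]
            for k in range(n):
                rival = max(scores[j] for j in range(n) if j != k)
                new = similarity[i][k] - rival
                resp[i][k] = damping * resp[i][k] + (1 - damping) * new
        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k) for i' not in {i, k})
        # a(k, k) = sum of positive r(i', k) for i' != k
        for k in range(n):
            support = [max(0.0, resp[i][k]) for i in range(n)]
            support[k] = resp[k][k]          # self-responsibility counts in full
            total = sum(support)
            for i in range(n):
                if i == k:
                    new = total - support[k]
                else:
                    new = min(0.0, total - support[i])
                avail[i][k] = damping * avail[i][k] + (1 - damping) * new
    # Each point picks the candidate k that maximizes a(i, k) + r(i, k).
    return [max(range(n), key=lambda k: avail[i][k] + resp[i][k])
            for i in range(n)]


# Toy data (an assumption for illustration): two well-separated groups on a line.
points = [0.0, 0.2, 0.4, 5.0, 5.2, 5.4]
preference = -1.0  # diagonal entry: how willing each point is to represent itself
similarity = [[preference if i == j else -(a - b) ** 2
               for j, b in enumerate(points)]
              for i, a in enumerate(points)]
exemplars = affinity_propagation(similarity)
print(exemplars)
```

The preference on the diagonal controls how many representatives emerge: raising it makes more points willing to represent themselves, lowering it merges groups.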

4 Intuition - Adapting to the Gene Tree Problem
“Data points” → Tree nodes
“Similarity” → Branch length
“Representative” → Parent
“Optimizing total similarity” → Optimizing total branch lengths

5 Informal Algorithm
Begin with a star topology tree
While tree is not binary:
    Augment existing nodes with pool of candidate nodes
    Run Affinity Propagation over this set
    Candidate nodes chosen as representatives become new internal nodes
Cleanup and return result
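
The control flow of the informal algorithm can be sketched as below. This only illustrates the refinement loop (star tree, then repeated insertion of internal nodes until every node has at most two children), treating the tree as rooted for simplicity. The `pick_group` callback is a hypothetical stand-in for the candidate pool plus Affinity Propagation step, and the parent-map representation and all names are assumptions made for this example.

```python
from collections import defaultdict
from itertools import count


def children_of(parent):
    """Invert a child -> parent map into a parent -> children map."""
    kids = defaultdict(list)
    for child, par in parent.items():
        if par is not None:
            kids[par].append(child)
    return kids


def find_polytomy(parent):
    """Return a node with more than two children, or None if the tree is binary."""
    for node, kids in children_of(parent).items():
        if len(kids) > 2:
            return node
    return None


def refine(leaves, pick_group):
    """Start from a star tree and insert internal nodes until it is binary."""
    parent = {"root": None}
    parent.update({leaf: "root" for leaf in leaves})   # star topology
    fresh = count()
    while True:
        node = find_polytomy(parent)
        if node is None:
            return parent
        kids = children_of(parent)[node]
        # Stand-in for "run Affinity Propagation over nodes plus candidates":
        # pick_group must return at least 2 and fewer than len(kids) children.
        group = pick_group(kids)
        new_node = f"internal{next(fresh)}"
        parent[new_node] = node                        # new internal node
        for child in group:
            parent[child] = new_node                   # regroup under it


# Placeholder grouping rule (NOT the talk's method): join the first two children.
tree = refine(["a", "b", "c", "d", "e"], lambda kids: sorted(kids)[:2])
for child, par in tree.items():
    print(child, "->", par)
```

Each pass removes at least one child from an unresolved node, so the loop terminates as long as the grouping rule returns a proper subset of two or more children.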

11 The Main Design Questions
How do we label and select candidate internal nodes?
    Best solution: label with probability distribution over sequences
    Pick a probability distribution somewhere between parent and child
How do we correctly prepare the similarity matrix?
    Best solution: “intersect” the probability distributions at each site. Sum up the logs
How do we ensure that the tree becomes binary?
    Best solution: retain all candidates. Revisit unresolved polytomies as much as needed
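
The slide does not define “intersect”, so the following is only one possible reading: at each site, take the probability that two per-site distributions agree on the same state (the sum over states of the products), then sum the logs across sites. The function names, the `floor` guard against log(0), and the four-state nucleotide profiles are assumptions made for this sketch.

```python
import math


def site_overlap(p, q):
    """Probability that two per-site distributions agree on the same state."""
    return sum(a * b for a, b in zip(p, q))


def log_similarity(profile_a, profile_b, floor=1e-12):
    """Sum of per-site log overlaps; `floor` avoids log(0) for disjoint sites."""
    return sum(math.log(max(site_overlap(p, q), floor))
               for p, q in zip(profile_a, profile_b))


# Hypothetical two-site profiles over the states (A, C, G, T).
leaf = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]           # observed sequence "AC"
candidate = [[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]]  # uncertain internal node

print(log_similarity(leaf, leaf))       # identical certain profiles overlap fully
print(log_similarity(leaf, candidate))  # log(0.7) + log(0.25)
```

Under this reading the similarity is never positive, and it approaches zero as two profiles become both certain and identical.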

14 Conclusion
Didn’t work
Neighbor Joining is philosophically similar, but does it better
Why?
    Error rate starts very low, but grows nonlinearly with number of nodes inserted
    Effectively captures coarse distinctions at the outermost layers of the tree, but gets confused in the interior
    Reliance on distributions at existing internal nodes to anchor subsequent optimization causes error to compound

15 References
Desper, R., & Gascuel, O. (2002, September). Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle. In International Workshop on Algorithms in Bioinformatics. Springer, Berlin, Heidelberg.
Lefort, V., Desper, R., & Gascuel, O. (2015). FastME 2.0: a comprehensive, accurate, and fast distance-based phylogeny inference program. Molecular Biology and Evolution, 32(10).
Frey, B. J., & Dueck, D. (2007). Clustering by passing messages between data points. Science, 315(5814).
Liu, K., Raghavan, S., Nelesen, S., Linder, C. R., & Warnow, T. (2009). Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science, 324(5934).
Saitou, N., & Nei, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4(4).

