1
P-POD-PANTHER: update
Kara Dolinski (P-POD/Princeton), Paul Thomas (PANTHER/SRI)
2
Current status: OrthoMCL clusters from P-POD are incorporated in PANTHER families:
Colors indicate different OrthoMCL families, mouseover displays OrthoMCL ID (soon will hyperlink to P-POD)
3
Currently crunching away at new protein sets generated by PANTHER:
[Workflow diagram; labels: Updated protein sets, P-POD, OrthoMCL, InParanoid/MultiParanoid, PANTHER trees, Consensus clusters, compare]
Next step: incorporate TreeFam in PANTHER and Consensus clusters via sequence mapping (we cannot simply run the TreeFam analysis on our own protein sets, so this is more complicated than OrthoMCL/InParanoid)
4
Large-scale comparison of trees with OrthoMCL clusters
Algorithm to compare each OrthoMCL cluster to a tree and classify as:
  Perfect match to tree
  Consistent with tree
  Inconsistent with tree
Manually review inconsistencies with the aim to improve trees
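The slides do not spell out the classification criteria, so the following is only a minimal sketch under assumed definitions: a cluster is "perfect" if it equals the full leaf set of the subtree rooted at its most recent common ancestor (MRCA), "consistent" if it is a partial subset whose MRCA is a speciation node, and "inconsistent" if its MRCA is a duplication node. The `Node` class, the `event` labels, and the toy tree are hypothetical and are not taken from P-POD or PANTHER.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str = ""                # leaf label; empty for internal nodes
    event: str = ""               # assumed label: "speciation" or "duplication"
    children: list = field(default_factory=list)


def leaves(node):
    """Return the set of leaf names below (or at) this node."""
    if not node.children:
        return {node.name}
    out = set()
    for child in node.children:
        out |= leaves(child)
    return out


def mrca(node, cluster):
    """Return the deepest node whose leaf set contains every cluster member."""
    for child in node.children:
        if cluster <= leaves(child):
            return mrca(child, cluster)
    return node


def classify(root, cluster):
    """Classify a cluster against a tree using the assumed criteria above."""
    cluster = set(cluster)
    if not cluster <= leaves(root):
        raise ValueError("cluster contains genes that are not in the tree")
    ancestor = mrca(root, cluster)
    if leaves(ancestor) == cluster:
        return "perfect"
    if ancestor.event == "speciation":
        return "consistent"
    return "inconsistent"


# Hypothetical toy family: an outgroup gene "o" plus two paralogous groups
# (a1, b1) and (a2, b2) created by one duplication.
s1 = Node(event="speciation", children=[Node("a1"), Node("b1")])
s2 = Node(event="speciation", children=[Node("a2"), Node("b2")])
dup = Node(event="duplication", children=[s1, s2])
root = Node(event="speciation", children=[Node("o"), dup])

print(classify(root, {"a1", "b1"}))       # cluster is exactly one subtree
print(classify(root, {"o", "a1", "b1"}))  # partial subset under a speciation
print(classify(root, {"a1", "a2"}))       # cluster spans the duplication
```

Clusters landing in the "inconsistent" class would be the candidates for the manual review mentioned above.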
5
Clusters from different “orthology” methods
[Figure: protein family tree; leaf labels: E.c., A.t. MTHFR1, A.t. MTHFR2, D.d., S.p., S.c. MET13, S.p., S.c. MET12, C.e., D.m., A.g., D.r., G.g., H.s. MTHFR, R.n., M.m.]
OrthoMCL in red; PhiGs in blue; InParanoid in green
An “ortholog cluster” is made by one or more “slices” through the protein family tree
  Some combination of evolutionary rates and history of duplications
  Might miss genes that have inherited some but not all functions from the MRCA
6
Perfect agreement
[Figure: protein family tree; leaf labels: E.c., A.t. MTHFR1, A.t. MTHFR2, D.d., S.p., S.c. MET13, C.e., D.m., A.g., D.r., G.g., H.s. MTHFR, R.n., M.m.]
8
Consistent
[Figure: protein family tree; leaf labels: E.c., A.t. MTHFR1, A.t. MTHFR2, D.d., S.p., S.c. MET13, S.p., C.e., D.m., A.g., D.r., G.g., H.s. MTHFR, R.n., M.m.]
9
Inconsistent (blue)
[Figure: protein family tree; leaf labels: E.c., A.t. MTHFR1, A.t. MTHFR2, D.d., S.p., S.c. MET13, C.e., D.m., A.g., D.r., G.g., H.s. MTHFR, R.n., M.m.]
10
OrthoMCL clusters overlaid on PANTHER trees
14,695 non-singleton clusters from P-POD spanning 12 RefGenomes
4,815 trees from PANTHER
[Chart; values: 62%, 20%, 18%]
11
Validating trees by comparing with other tree methods
TreeFam
Compare tree topology
  Robinson-Foulds “symmetric difference distance” (requires exact match of all leaf nodes)
Compare ortholog and within-species paralog predictions
  Requires only a match of a subset of leaf nodes
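To make the topology comparison concrete, here is a minimal sketch of the Robinson-Foulds symmetric difference: each tree is reduced to its set of non-trivial leaf bipartitions, and the distance is the number of bipartitions found in one tree but not the other. Trees are written as nested tuples of leaf names, which is an assumed toy representation; the optional normalization (dividing by the total number of non-trivial bipartitions in both trees) is also an assumption and may differ from the scaling used in the plots.

```python
def leaf_set(tree):
    """Return all leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of the tree, treated as unrooted.

    Each bipartition is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always gets the same representation.
    """
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        side = all_leaves - below if ref in below else below
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    walk(tree)
    return found


def robinson_foulds(t1, t2, normalized=False):
    """Symmetric difference distance; requires identical leaf sets."""
    if leaf_set(t1) != leaf_set(t2):
        raise ValueError("trees must have exactly the same leaves")
    s1, s2 = splits(t1), splits(t2)
    distance = len(s1 ^ s2)
    if normalized:
        total = len(s1) + len(s2)
        return distance / total if total else 0.0
    return distance


# Two hypothetical five-leaf trees that disagree on the placement of B and C.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(tree_a, tree_b))                   # 2
print(robinson_foulds(tree_a, tree_b, normalized=True))  # 0.5
```

The `ValueError` reflects the constraint noted above: this distance is only defined when both trees have exactly the same leaves, which is why the ortholog and paralog prediction comparison is useful when only a subset of leaves can be matched.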
12
GIGA trees on “full” TreeFam alignments are more similar to “clean” TreeFam trees than “full” TreeFam trees are
[Plot; axis: Robinson-Foulds tree distance (0.2 to 1); series: GIGA-full vs. TreeFam-clean (red), TreeFam-full vs. TreeFam-clean (blue)]
13
GIGA trees are robust to addition of more sequences
[Plot; axis: Robinson-Foulds tree distance (0.2 to 1); series: GIGA-full vs. GIGA-clean (red), TreeFam-full vs. TreeFam-clean (blue)]
14
Next steps
Start annotation of trees using PAINT
  Review first trees with all GO curators to work out process
  Begin quantitatively tracking progress, e.g. number of families annotated, number of homology annotations inferred, number of homology annotations inferred per experimental annotation
Compare consistency with OrthoMCL using the same dataset
  Review and correct trees if necessary before GO annotation of tree
Compare tree algorithm with TreeFam curated seed trees (incorporate subtrees from TreeFam if they are superior)
Map additional orthology methods to trees
  InParanoid
  TreeFam
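The progress measures listed under the PAINT step could be computed along these lines. This is only a sketch: the record layout (a `family` ID and a `kind` of either "experimental" or "inferred") is hypothetical and does not reflect the actual PAINT or GO annotation formats, and a family is counted as annotated if it has at least one inferred annotation.

```python
def progress_metrics(annotations):
    """Summarize annotation progress from a list of annotation records.

    Each record is assumed to be a dict with a "family" ID and a "kind"
    that is either "experimental" or "inferred" (hypothetical layout).
    """
    inferred = [a for a in annotations if a["kind"] == "inferred"]
    experimental = [a for a in annotations if a["kind"] == "experimental"]
    return {
        "families_annotated": len({a["family"] for a in inferred}),
        "inferred_annotations": len(inferred),
        # Undefined when there are no experimental annotations to divide by.
        "inferred_per_experimental": (
            len(inferred) / len(experimental) if experimental else None
        ),
    }


# Made-up records for illustration only.
records = [
    {"family": "FAM1", "kind": "experimental"},
    {"family": "FAM1", "kind": "inferred"},
    {"family": "FAM1", "kind": "inferred"},
    {"family": "FAM1", "kind": "inferred"},
    {"family": "FAM2", "kind": "experimental"},
    {"family": "FAM2", "kind": "inferred"},
]
print(progress_metrics(records))
```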