Download presentation
Presentation is loading. Please wait.
Published byCharla Richard Modified over 9 years ago
2
Domains or not domains? ShuoYong Shi, Indraneel Majumdar and Nick V. Grishin Howard Hughes Medical Institute, Department of Biochemistry, University of Texas Southwestern Medical Center at Dallas http://prodata.swmed.edu/CASP8
3
Traditionally, CASP targets are evaluated as domains, i.e. each target structure is parsed into domains, and model quality is computed for each domain separately. This strategy makes sense for two reasons: Domains can be mobile and their relative packing can be influenced by ligand presence, crystal packing for X-ray structures, or be semi-random in NMR structures. Thus even a perfect prediction algorithm will not be able to cope with this adequately, for instance in the absence of knowledge about the ligand presence or crystal symmetry. Predictions may be better or worse for individual domains than for their assembly. This happens when domains are of a different predictability, e.g. one has a close template, but the other one does not. Even if domains of a target are of equal prediction difficulty, it is possible that the mutual domain arrangement in the target structure, while predictable in principle, differs from the template, and thus is modeled incorrectly by predictors. Comparison of the whole-chain evaluation with the domain-based evaluation dissects the problem of 'individual domain' vs. 'domain assembly' modeling and should help in development of prediction methods. Why domains?
4
Evolutionary domains: correspond to structurally compact evolutionary modules. How domains? Ago protein: T0487 consist of 5 domains http://prodata.swmed.edu/CASP8/evaluation/DomainDefinition.htm
5
Do we need domains? 122 targets, 176 evolutionary domains, do we need that many? Server predictions helps us to reduce the number of domains: if whole chain prediction quality is not much different from domain prediction quality, domain evaluation is not necessary. GDT-TS(whole chain) VS. Σ i=1 Number of domains Σ i=1 Number of domains Length(domain i) * GDT-TS(domain i) Length(domain i) http://prodata.swmed.edu/CASP8/evaluation/Domains.htm
6
Correlation between weighted by the number of residues sum of GDT-TS scores for domain-based evaluation (y, vertical axis) and whole chain GDT-TS (x, horizontal axis). T0490: correlation between whole chain and domain predictions
7
Each point represents first server model. Green, gray and black points are top 10, bottom 25% and the rest of prediction models. Blue line is the best-fit slope line (intersection 0) to the top 10 server models. Red line is the diagonal. Two parameters to describe correlation between whole chain and domain predictions 1. The root mean square (RMS) difference between the weighted sum of GDT_TS on domains and GDT_TS on the whole chain (RMS of y−x) measures absolute GDT-TS difference. 2. A slope of best-fit line with intercept set to 0 (slope) measures relative GDT-TS difference. These parameters are computed on top 10 (according to the weighted sum) predictions
8
Correlation between weighted by the number of residues sum of GDT-TS scores for domain- based evaluation (y, vertical axis) and whole chain GDT-TS (x, horizontal axis). T0504 needs domain evaluation
9
Correlation between weighted by the number of residues sum of GDT-TS scores for domain-based evaluation (y, vertical axis) and whole chain GDT-TS (x, horizontal axis). T0447 does not need domain evaluation
10
Ribbon diagram of 459: 3df8 chain A (rainbow) with its symmetry mate (white). Domain swaps! 5 out of 122 targets (4% !!!!) exhibit domain swaps, e.g.
11
Ribbon diagram of 459: 3df8 chain A with a swapped N-terminal β-hairpin from its symmetry mate chain (rainbow) and the swapped hairpin symmetry mate chain (white). Swapped domain in T0459
12
Correlation plots for the two domain definitions (swapped and swapped segment removed) of this single-domain target reveal differences whole chain 459: 3df8 chain A Domain-swapped 459: 3df8 chain B*:-2-22 plus chain A:23-106 459 with domain-swapped segment removed: 3df8 chain A:23-106
13
All targets: Correlation between RMS of the difference between GDT_TS on domains and GDT_TS on the whole chain (vertical axis) and the slope of the best-fit line (horizontal axis), both computed on top 10 server predictions.
15
Summary: Comparison of domain-based predictions with whole chain predictions revealed a natural, data-dictated cutoff (slope of the zero intercept best-fit line is above 1.3) to select targets that require domain- based evaluation. These 17 targets are: T0397, T0405, T0407, T0409, T0416, T0419, T0429, T0443, T0457, T0462, T0472, T0478, T0487, T0496, T0501, T0504, T0510. Predictions for other targets follow the general trend, are of a more similar quality for 'domain' and 'whole chain' and thus domain-based evaluation may not be necessary for them.
16
Acknowledgement Our group Collaborators HHMI, NIH, UTSW, The Welch Foundation Shuoyong Shi Jing Tong Ruslan Sadreyev Lisa Kinch Jimin Pei Ming Tang Sasha Safronova Yuan Qi Hua Cheng Jamie Wrabl Indraneel Majumdar Erik Nelson Yong Wang S. Sri Krishna Bong-Hyun Kim Dorothee Staber David Baker U. Washington Kimmen Sjölander UC Berkeley William Noble U. Washington
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.