Modelling proteomes
Ram Samudrala
University of Washington
What is a “proteome”?
All proteins of a particular system (organelle, cell, organism)
What does it mean to “model a proteome”?
For any protein, we wish to:
  figure out what it looks like (structure or form)
  understand what it does (function)
  (ANNOTATION)
Repeat for all proteins in a system
Understand the relationships between all of them (EXPRESSION + INTERACTION)
De novo prediction of protein structure
sample: conformational space such that native-like conformations are found
select: hard to design functions that are not fooled by non-native conformations (“decoys”)
astronomically large number of conformations: 5 states/residue, 100 residues = 5^100 ≈ 10^70
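The size of the search space quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: it assumes each residue independently takes one of a fixed number of discrete states, and the function name `conformation_count` is illustrative, not from the talk.

```python
import math


def conformation_count(states_per_residue: int, n_residues: int) -> int:
    """Conformations available if every residue independently
    adopts one of `states_per_residue` discrete states."""
    return states_per_residue ** n_residues


# Numbers from the slide: 5 states per residue, 100 residues.
count = conformation_count(5, 100)

# Python integers are arbitrary precision, so log10 works on the exact value.
exponent = math.log10(count)
print(f"5^100 is about 10^{exponent:.1f}")
```

The exponent comes out just under 70, which is why exhaustive enumeration is not an option and sampling is needed.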
CASP5 prediction for T138 4.6 Å Cα RMSD for 84 residues
CASP5 prediction for T146 5.6 Å Cα RMSD for 67 residues
CASP5 prediction for T170 4.8 Å Cα RMSD for all 69 residues
CASP5 prediction for T129 5.8 Å Cα RMSD for 68 residues
CASP5 prediction for T172 5.9 Å Cα RMSD for 74 residues
CASP5 prediction for T187 5.1 Å Cα RMSD for 66 residues
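The Cα RMSD figures on these slides measure how far the predicted Cα atoms lie from the experimental ones. A minimal sketch of the calculation follows. It assumes the two coordinate sets are already optimally superimposed and listed in the same residue order; the superposition step itself (for example the Kabsch algorithm) is omitted, and the coordinates are made-up illustrative values.

```python
import math


def ca_rmsd(model, native):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) C-alpha coordinates, in the units of the input (Å).

    Assumes the structures are already superimposed; no fitting is done.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, native)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Hypothetical three-residue example: the model is shifted 1 Å along y.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
rmsd = ca_rmsd(model, native)
print(f"C-alpha RMSD: {rmsd:.2f} Å")
```

Without the superposition step the value depends on the coordinate frames, so real comparisons fit the structures first.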
CASP5 independent assessor’s results http://protinfo.compbio.washington.edu
Comparative modelling of protein structure
KDHPFGFAVPTKNPDGTMNLMNWECAIP
KDPPAGIGAPQDN----QNIMLWNAVIP
** * *   *  *     * * *   **
scan
align
build initial model (minimum perturbation)
construct non-conserved side chains and main chains (graph theory, semfold)
refine (physical functions)
de novo simulation
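The "% id" values reported for comparative modelling targets are sequence identities between target and template. A small sketch of how such a value could be computed from the aligned pair shown on the slide follows. The choice of denominator (columns where neither sequence has a gap) is an assumption; the talk does not state which convention was used, and `identity_stats` is an illustrative name.

```python
def identity_stats(seq_a: str, seq_b: str, gap: str = "-"):
    """Return (matches, aligned_columns, percent_identity) for two
    sequences that are already aligned to the same length.

    Assumption: identity is taken over columns with no gap in either row.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not aligned:
        raise ValueError("no aligned (non-gap) columns")
    matches = sum(1 for a, b in aligned if a == b)
    return matches, len(aligned), 100.0 * matches / len(aligned)


# The aligned pair from the slide.
row_1 = "KDHPFGFAVPTKNPDGTMNLMNWECAIP"
row_2 = "KDPPAGIGAPQDN----QNIMLWNAVIP"
matches, columns, percent = identity_stats(row_1, row_2)
print(f"{matches} identical of {columns} aligned columns ({percent:.0f}% id)")
```

The eleven identical columns correspond to the eleven asterisks under the alignment.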
CASP5 prediction for T129 1.0 Å Cα RMSD for 133 residues (57% id)
CASP5 prediction for T182 1.0 Å Cα RMSD for 249 residues (41% id)
CASP5 prediction for T150 2.7 Å Cα RMSD for 99 residues (32% id)
CASP5 prediction for T185 6.0 Å Cα RMSD for 428 residues (24% id)
CASP5 prediction for T160 2.5 Å Cα RMSD for 125 residues (22% id)
CASP5 prediction for T133 6.0 Å Cα RMSD for 260 residues (14% id)
Livebench 7 automated assessment for 71 targets http://protinfo.compbio.washington.edu
Prediction of protein-inhibitor binding energies with dynamics
HIV protease
[Plot: correlation coefficient versus MD simulation time (ps), with MD and without MD]
Ekachai Jenwitheesuk
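The plot on this slide reports a correlation coefficient between predicted and experimental binding energies. A sketch of the Pearson correlation coefficient follows, on the assumption that this is the statistic plotted (the slide does not say which coefficient was used). The energy values are hypothetical placeholders, not data from the talk.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical predicted vs. experimental binding energies (kcal/mol)
# for five inhibitors; illustrative numbers only.
predicted = [-9.1, -7.4, -8.2, -6.0, -10.3]
experimental = [-12.0, -9.5, -11.1, -8.7, -13.2]
r = pearson_r(predicted, experimental)
print(f"correlation coefficient: {r:.2f}")
```

Repeating the calculation with energies taken at successive simulation times would give one point per time on a curve like the one described.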
Prediction of SARS CoV proteinase inhibitors Ekachai Jenwitheesuk
Prediction of inhibitor resistance/susceptibility http://protinfo.compbio.washington.edu Kai Wang / Ekachai Jenwitheesuk
Integrated structural and functional annotation of proteomes
structure based methods: microenvironment analysis (zinc binding site?), structure comparison, homology (function?)
+ sequence based methods: sequence comparison, motif searches, phylogenetic profiles, domain fusion analyses
+ experimental data: single molecule + genomic/proteomic
Bioverse: assign function to entire protein space
+ EXPRESSION + INTERACTION
Bioverse – explore relationships among molecules and systems http://bioverse.compbio.washington.edu Jason McDermott
Bioverse – explore relationships among molecules and systems Jason McDermott
Bioverse – prediction of protein interaction networks
Target proteome: protein A (85%), protein B (90%), predicted interaction
Interacting protein database: protein α, protein β, experimentally determined interaction
Assign confidence based on similarity and strength of interaction
Jason McDermott
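The transfer of interactions from a database to a target proteome can be sketched as follows. This is not the Bioverse implementation. It assumes that the 85% and 90% on the slide are the similarities of protein A to protein α and of protein B to protein β, and it assumes a simple scoring rule (product of the two similarities and the interaction strength), which the slide does not specify. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownInteraction:
    partner_a: str   # database protein, e.g. "alpha"
    partner_b: str   # database protein, e.g. "beta"
    strength: float  # experimental support scaled to 0..1 (assumed scale)


def predict_interactions(similarity, known):
    """Transfer known interactions onto target proteins.

    similarity: dict mapping (target_protein, database_protein) to a
                fractional similarity in 0..1.
    known:      iterable of KnownInteraction.
    Returns a dict mapping (target_a, target_b) to a confidence score.
    """
    # Index target proteins by the database protein they resemble.
    hits = {}
    for (target, db_protein), sim in similarity.items():
        hits.setdefault(db_protein, []).append((target, sim))

    predictions = {}
    for interaction in known:
        for target_a, sim_a in hits.get(interaction.partner_a, []):
            for target_b, sim_b in hits.get(interaction.partner_b, []):
                # Assumed scoring rule: product of similarities and strength.
                confidence = sim_a * sim_b * interaction.strength
                pair = (target_a, target_b)
                # Keep the best-supported prediction for each pair.
                predictions[pair] = max(confidence, predictions.get(pair, 0.0))
    return predictions


# The example from the slide, under the assumptions stated above.
similarity = {("A", "alpha"): 0.85, ("B", "beta"): 0.90}
known = [KnownInteraction("alpha", "beta", strength=1.0)]
predictions = predict_interactions(similarity, known)
print(predictions)
```

Any monotone combination of the two similarities and the interaction strength would fit the slide's description; the product is used here only because it is the simplest.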
Bioverse – E. coli predicted protein interaction network Jason McDermott
Bioverse – M. tuberculosis predicted protein interaction network Jason McDermott
Bioverse – C. elegans predicted protein interaction network Jason McDermott
Bioverse – H. sapiens predicted protein interaction network Jason McDermott
Bioverse – network-based annotation for C. elegans Jason McDermott
Bioverse – identifying key proteins on the anthrax predicted network Articulation point proteins Jason McDermott
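An articulation point is a node whose removal disconnects part of a network, which is one way to flag key proteins in a predicted interaction network. The sketch below uses the standard depth-first search (low-link) method on a small hypothetical network. It is not the Bioverse code, and the protein names are placeholders.

```python
def articulation_points(graph):
    """Return the set of articulation points of an undirected simple graph.

    graph: dict mapping each node to the set of its neighbours.
    Uses recursive depth-first search, so very large networks may need an
    iterative version or a higher recursion limit.
    """
    discovery = {}   # order in which each node was first visited
    low = {}         # earliest discovery order reachable from the subtree
    result = set()
    counter = 0

    def dfs(node, parent):
        nonlocal counter
        discovery[node] = low[node] = counter
        counter += 1
        children = 0
        for neighbour in graph[node]:
            if neighbour == parent:
                continue
            if neighbour in discovery:
                # Back edge: the subtree can reach an earlier node.
                low[node] = min(low[node], discovery[neighbour])
            else:
                children += 1
                dfs(neighbour, node)
                low[node] = min(low[node], low[neighbour])
                # Non-root node: the child's subtree cannot bypass it.
                if parent is not None and low[neighbour] >= discovery[node]:
                    result.add(node)
        # Root node: an articulation point only if it has several DFS children.
        if parent is None and children > 1:
            result.add(node)

    for node in graph:
        if node not in discovery:
            dfs(node, None)
    return result


# Hypothetical five-protein network: P1-P2, P2-P3, P2-P4, P3-P4, P4-P5.
network = {
    "P1": {"P2"},
    "P2": {"P1", "P3", "P4"},
    "P3": {"P2", "P4"},
    "P4": {"P2", "P3", "P5"},
    "P5": {"P4"},
}
key_proteins = articulation_points(network)
print(sorted(key_proteins))
```

In this toy network, removing P2 isolates P1 and removing P4 isolates P5, while P3 sits on a cycle and can be removed without disconnecting anything.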
Bioverse – identifying key proteins on the rice predicted network Defense-related proteins Jason McDermott
Bioverse – viewer Aaron Chang
Take home message
Prediction of protein structure, function, and networks may be used to model whole genomes to understand organismal function and evolution.
Acknowledgements
Aaron Chang, Ekachai Jenwitheesuk, Gong Cheng, Jason McDermott, Kai Wang, Ling-Hong Hung, Lynne Townsend, Marissa LaMadrid, Mike Inouye, Stewart Moughon, Shing-Chung Ngan, Tianyun Liu, Yi-Ling Cheng, Zach Frazier
National Institutes of Health, National Science Foundation, Searle Scholars Program (Kinship Foundation), UW Advanced Technology Initiative in Infectious Diseases
http://bioverse.compbio.washington.edu
http://protinfo.compbio.washington.edu