Published by Geoffrey Parsons. Modified over 9 years ago.
1
MODELLING PROTEOMES RAM SAMUDRALA ASSOCIATE PROFESSOR UNIVERSITY OF WASHINGTON How does the genome of an organism specify its behaviour and characteristics?
2
PROTEOME. ~60,000 in human; ~60,000 in rice; ~4,500 in bacteria. Several thousand distinct sequence families.
3
STRUCTURE A few thousand distinct structural folds
4
FUNCTION Tens of thousands of functions
5
EXPRESSION Different expression patterns based on time and location
6
INTERACTION Interaction and expression are interdependent with structure and function
7
PROTEIN FOLDING. Gene: …-CTA-AAA-GAA-GGT-GTT-AGC-AAG-GTT-… Protein sequence (each letter is one amino acid): …-L-K-E-G-V-S-K-D-… Unfolded protein: not unique, mobile, inactive, expanded, irregular. Spontaneous self-organisation (~1 second) leads to the native, biologically relevant state.
8
PROTEIN FOLDING. Gene: …-CTA-AAA-GAA-GGT-GTT-AGC-AAG-GTT-… Protein sequence (each letter is one amino acid): …-L-K-E-G-V-S-K-D-… Unfolded protein: not unique, mobile, inactive, expanded, irregular. Spontaneous self-organisation (~1 second) leads to the native, biologically relevant state: unique shape, precisely ordered, stable/functional, globular/compact, helices and sheets.
9
STRUCTURE: ACCURACY. Figure: Cα RMSD (axis ticks 0, 2, 4, 6) for experiment (X-ray, NMR), computation (de novo), computation (template-based), and hybrid (iterative Bayesian interpretation of noisy NMR data with structure simulations). Annotations: one distance constraint for every six residues; one distance constraint for every ten residues.
10
DE NOVO MODELLING. Astronomically large number of conformations: 5 states per residue over 100 residues gives 5^100 ≈ 10^70. SELECT: it is hard to design functions that are not fooled by non-native conformations (“decoys”). Sample conformational space such that native-like conformations are found.
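The size of the conformational space quoted on this slide can be checked with a few lines of arithmetic. This is only a sketch of that count; the variable names are illustrative, and the 5-states-per-residue discretisation is taken from the slide.

```python
import math

# Values taken from the slide: 5 discrete states per residue, 100 residues.
states_per_residue = 5
residues = 100

# Exact count (Python integers have arbitrary precision).
conformations = states_per_residue ** residues

# Order of magnitude: log10(5^100) = 100 * log10(5).
log10_count = residues * math.log10(states_per_residue)

print(f"digits in 5^100: {len(str(conformations))}")
print(f"log10(5^100) = {log10_count:.1f}")
```

The exact value is slightly below 10^70, which is why the count is best read as an order-of-magnitude estimate.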
11
DE NOVO MODELLING. GENERATE: make random moves to optimise what is observed in known structures. MINIMISE: find the most protein-like structures. FILTER: filter based on all-atom pairwise interactions, bad contacts, compactness, secondary structure, and consensus of generated conformations.
12
TEMPLATE-BASED MODELLING. SCAN, ALIGN:
KDHPFGFAVPTKNPDGTMNLMNWECAIP
KDPPAGIGAPQDN----QNIMLWNAVIP
** * *   *  *     * * *   **
Refine using constraints derived from multiple templates. Build initial models from multiple templates using minimum perturbation. Construct nonconserved side and main chains using graph theory and semfold.
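The identity percentages quoted for template-based models come from alignments like the one on this slide. The sketch below counts identical columns for the two aligned sequences shown. The function name and the choice of denominator (columns where neither sequence has a gap) are assumptions made for illustration; other conventions divide by the full alignment length or by the shorter sequence.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Return (identical, compared, percent) for two aligned sequences.

    Assumption: only columns where neither sequence has a gap are compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns without a gap in either sequence.
    columns = [(x, y) for x, y in zip(seq_a, seq_b) if x != gap and y != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(1 for x, y in columns if x == y)
    return identical, len(columns), 100.0 * identical / len(columns)


# The two aligned sequences shown on the slide.
seq_a = "KDHPFGFAVPTKNPDGTMNLMNWECAIP"
seq_b = "KDPPAGIGAPQDN----QNIMLWNAVIP"

identical, compared, pid = percent_identity(seq_a, seq_b)
print(f"{identical} identical of {compared} compared columns ({pid:.1f}%)")
```

The 11 identical columns correspond to the 11 asterisks under the alignment.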
13
STRUCTURE (Liu/Hong-Hung/Ngan). T0290, peptidyl-prolyl isomerase from H. sapiens: 0.5 Å Cα RMSD for 173 residues (60% identity). T0364, hypothetical from P. putida: 5.3 Å Cα RMSD for 153 residues (11% identity). T0332, methyltransferase from H. sapiens: 2.0 Å Cα RMSD for 159 residues (23% identity). T0288, PRKCA-binding from H. sapiens: 2.2 Å Cα RMSD for 93 residues (25% identity).
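Cα RMSD, the accuracy measure quoted for these models, is the root-mean-square distance between matched Cα atoms of a model and the experimental structure. The sketch below computes it for coordinates that are assumed to be already optimally superimposed; the superposition step itself (for example the Kabsch algorithm) is omitted. The function name and the toy coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumption: the two structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the matched atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: a three-residue model shifted by 1 Å along z.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
native = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(f"C-alpha RMSD: {ca_rmsd(model, native):.2f} Å")
```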
14
HYBRID MODELLING http://protinfo.compbio.washington.edu/protinfo_nmr http://protinfo.compbio.washington.edu/psicsi Hong-Hung
15
FUNCTION (Wang/Cheng). Ion binding energy prediction with a correlation of 0.7. Calcium ions predicted to < 0.05 Å RMSD in 130 cases. Meta-functional signature for DXS model from M. tuberculosis. Meta-functional signature accuracy.
16
INTERACTION (McDermott/Wichadakul/Staley/Horst/Manocheewa/Jenwitheesuk/Bernard). BtubA/BtubB interolog model from P. dejongeii (35% identity to eukaryotic tubulins). Transcription factor bound to DNA promoter: regulog model from S. cerevisiae. Prediction of binding energies of HIV protease mutants and inhibitors using docking with dynamics.
17
SYSTEMS (McDermott/Wichadakul). Example predicted protein interaction network from M. tuberculosis (107 proteins with 762 unique interactions). In sum, we can predict functions for more than 50% of a proteome, and approximately ten million protein-protein and protein-DNA interactions with an expected accuracy of 50%. Utility in identifying function, essential proteins, and host-pathogen interactions.
Proteins PPIs TRIs
H. sapiens      26,741   17,652   828,807   1,045,622
S. cerevisiae    5,801    5,175   192,505       2,456
O. sativa (6)  125,568   19,810   338,783     439,990
E. coli          4,208      885     1,980      54,619
18
SYSTEMS McDermott/Wichadakul Combining protein-protein and protein-DNA interaction networks to determine regulatory circuits
19
INFRASTRUCTURE (Guerquin/Frazier). http://bioverse.compbio.washington.edu http://protinfo.compbio.washington.edu ~500,000 molecules over 50+ proteomes served using a 1.2 TB PostgreSQL database, a sophisticated AJAX web application, and an XML-RPC API.
20
INFRASTRUCTURE Guerquin/Frazier
21
INFRASTRUCTURE Chang/Rashid http://bioverse.compbio.washington.edu/integrator
22
APPLICATION: DRUG DISCOVERY (Jenwitheesuk). HSV, KHSV, CMV.
23
APPLICATION: DRUG DISCOVERY (Lagunoff). HSV, KHSV, CMV. A computationally predicted broad-spectrum human herpesvirus protease inhibitor is effective in vitro against members of all three classes and is comparable to or better than anti-herpes drugs. HSV: our protease inhibitor acts synergistically with acyclovir (a nucleoside analogue that inhibits replication) and is less likely than acyclovir to lead to resistant strains.
24
APPLICATION: NANOTECHNOLOGY Oren/Sarikaya/Tamerler
25
FUTURE. Structural genomics + Functional genomics + Computational biology. MODELLING PROTEIN AND PROTEOME STRUCTURE AND FUNCTION AT THE ATOMIC LEVEL IS NECESSARY TO UNDERSTAND THE RELATIONSHIPS BETWEEN SINGLE MOLECULES, SYSTEMS, PATHWAYS, CELLS, AND ORGANISMS
26
ACKNOWLEDGEMENTS
Current group members: Baishali Chanda, Brady Bernard, Chuck Mader, David Nickle, Ersin Emre Oren, Ekachai Jenwitheesuk, Gong Cheng, Imran Rashid, Jeremy Horst, Ling-Hong Hung, Michal Guerquin, Rob Brasier, Rosalia Tungaraza, Shing-Chung Ngan, Siriphan Manocheewa, Somsak Phattarasukol, Stewart Moughon, Tianyun Liu, Vania Wang, Weerayuth Kittichotirat, Zach Frazier; Kristina Montgomery, Program Manager.
Past group members: Aaron Chang, Duncan Milburn, Jason McDermott, Kai Wang, Marissa LaMadrid.
Collaborators: James Staley, Mehmet Sarikaya/Candan Tamerler, Michael Lagunoff, Roger Bumgarner, Wesley Van Voorhis.
Funding agencies: National Institutes of Health, National Science Foundation, Searle Scholars Program, Puget Sound Partners in Global Health, UW Advanced Technology Initiative, Washington Research Foundation, UW TGIF.
28
MOTIVATION FOR DETERMINING PROTEIN STRUCTURE The functions necessary for life are undertaken by proteins. Protein function is mediated by protein three-dimensional structure. Knowing protein structure at high resolution will enable us to: Determine and understand molecular function. Understand substrate and ligand binding. Devise intelligent mutagenesis and biochemical experiments to understand biological function. Design therapeutics rationally. Design novel proteins. Knowing the structures of all proteins encoded by an organism’s genome will enable us to understand complex pathways and systems, and ultimately organismal behaviour and evolution. Applications in the area of medicine, nanotechnology, and biological computing.
29
ALL-ATOM SCORING FUNCTION. Figure: atom-atom contacts in known structures, tabulated by atom type (AO, AN, AC, …, YOH; 167 × 167 contacts) and by distance bin, give s(d_ab) for contacts; the atom-atom contacts of a candidate structure (N × N contacts) are tabulated the same way.
30
CRITICAL ASSESSMENT OF STRUCTURE PREDICTION. Pre-CASP: bias towards known structures. CASP: blind prediction.
31
STRUCTURE TO FUNCTION? TIM barrel proteins (2000+ experimental structures): oxidoreductase, transferase, hydrolase, ligase, lyase.
32
INTEROLOG MODELLING. Interacting protein database: Protein a and Protein b, with an experimentally determined interaction. Target proteome: Protein A and Protein B (85% and 90% in the figure), with a predicted interaction. Assign confidence based on similarity and strength of interaction. The paradigm is the use of homology to transfer information across organisms; it is not limited to yeast, fly, and worm. Consensus of interactions helps with confidence assignments.
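The interolog idea can be illustrated with a small sketch: an interaction known between proteins a and b is transferred to target proteins A and B that are similar to them. The slide does not give the confidence formula, so the rule used here (product of the two similarity fractions, keeping the best-supported transfer per pair) is purely an assumption for illustration, as are all function and protein names.

```python
def predict_interologs(known_interactions, similarities):
    """Transfer known interactions to a target proteome by homology.

    known_interactions: iterable of (a, b) pairs from an interaction database.
    similarities: dict mapping a database protein to a list of
        (target_protein, similarity_fraction) pairs.
    Returns a dict mapping (A, B) to a confidence in [0, 1].
    ASSUMPTION: confidence = similarity(A, a) * similarity(B, b); the actual
    scheme in the source is not specified.
    """
    predictions = {}
    for a, b in known_interactions:
        for target_a, sim_a in similarities.get(a, []):
            for target_b, sim_b in similarities.get(b, []):
                confidence = sim_a * sim_b
                key = (target_a, target_b)
                # Keep the highest confidence if several sources agree.
                if confidence > predictions.get(key, 0.0):
                    predictions[key] = confidence
    return predictions


# Hypothetical example using the 85% and 90% similarities from the figure.
known = [("a", "b")]
sims = {"a": [("A", 0.85)], "b": [("B", 0.90)]}
predicted = predict_interologs(known, sims)
print(predicted)
```

A fuller version would also fold in the strength of the experimental evidence and the consensus across several source interactions, both of which the slide mentions.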
33
E. coli INTERACTIONS McDermott
34
M. tuberculosis INTERACTIONS McDermott
35
C. elegans INTERACTIONS McDermott
36
H. sapiens INTERACTIONS McDermott
37
Network-based annotation for C. elegans McDermott
38
KEY PROTEINS IN ANTHRAX: articulation points
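An articulation point is a node whose removal disconnects part of a network, which is why such proteins are candidates for "key" proteins. The sketch below finds articulation points with the standard depth-first-search low-link method on an undirected graph. The toy network and its protein names are hypothetical, and this is not claimed to be the implementation used for the anthrax analysis.

```python
def articulation_points(graph):
    """Return the set of articulation points of an undirected simple graph.

    graph: dict mapping each node to an iterable of its neighbours.
    Uses recursion, so very large networks may need an iterative variant.
    """
    discovery = {}   # DFS discovery time of each node
    low = {}         # lowest discovery time reachable from the node's subtree
    result = set()
    counter = [0]

    def dfs(node, parent):
        discovery[node] = low[node] = counter[0]
        counter[0] += 1
        children = 0
        for neighbour in graph[node]:
            if neighbour == parent:
                continue
            if neighbour in discovery:
                # Back edge: the subtree can reach an earlier node.
                low[node] = min(low[node], discovery[neighbour])
            else:
                children += 1
                dfs(neighbour, node)
                low[node] = min(low[node], low[neighbour])
                # A non-root node is an articulation point if some child
                # subtree cannot reach above it.
                if parent is not None and low[neighbour] >= discovery[node]:
                    result.add(node)
        # The DFS root is an articulation point if it has 2+ DFS children.
        if parent is None and children > 1:
            result.add(node)

    for node in graph:
        if node not in discovery:
            dfs(node, None)
    return result


# Hypothetical network: a triangle P1-P2-P3 with a tail P3-P4-P5.
network = {
    "P1": ["P2", "P3"],
    "P2": ["P1", "P3"],
    "P3": ["P1", "P2", "P4"],
    "P4": ["P3", "P5"],
    "P5": ["P4"],
}
print(sorted(articulation_points(network)))
```

In this toy network, removing P3 or P4 splits the graph, while removing any other protein leaves it connected.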
39
HOST PATHOGEN INTERACTIONS McDermott