Download presentation
Presentation is loading. Please wait.
1
How does the genome of an organism
Modelling proteomes Conjoint 548A Ram Samudrala How does the genome of an organism specify its behaviour and characteristics?
2
Proteome – all proteins of a particular system
~60,000 in human ~60,000 in rice ~4500 in bacteria like Salmonella and E. coli Several thousand distinct sequence families
3
Modelling proteomes – understand the structure of individual proteins
A few thousand distinct structural folds
4
Modelling proteomes – understand their individual functions
Thousands of possible functions
5
Modelling proteomes – understand their expression
Different expression patterns based on time and location
6
Modelling proteomes – understand their interactions
Interactions and expression patterns are interdependent with structure and function
7
Protein folding Gene Protein sequence Unfolded protein
…-CTA-AAA-GAA-GGT-GTT-AGC-AAG-GTT-… Protein sequence …-L-K-E-G-V-S-K-D-… one amino acid not unique mobile inactive expanded irregular Unfolded protein Native biologically relevant state spontaneous self-organisation (~1 second)
8
Protein folding Gene Protein sequence Unfolded protein
…-CTA-AAA-GAA-GGT-GTT-AGC-AAG-GTT-… Protein sequence …-L-K-E-G-V-S-K-D-… one amino acid not unique mobile inactive expanded irregular Unfolded protein Native biologically relevant state spontaneous self-organisation (~1 second) unique shape precisely ordered stable/functional globular/compact helices and sheets
9
Protein primary structure
twenty types of amino acids R H C OH O N Cα two amino acids join by forming a peptide bond R Cα H C O N OH R Cα H C O N c f y each residue in the amino acid main chain has two degrees of freedom (f and y) the amino acid side chains can have up to four degrees of freedom (c1-4)
10
Protein secondary structure
b a L f 0 0 y +180 -180 many f,y combinations are not possible b sheet (anti-parallel) N C N C b sheet (parallel) a helix
11
Protein tertiary and quaternary structures
Ribonuclease inhibitor (2bnh) Haemoglobin (1hbh) Hemagglutinin (1hgd)
12
Methods for obtaining structure
Experimental X-ray crystallography NMR spectroscopy Theoretical De novo prediction Homology modelling
13
Computer representation of protein structure
Structures are stored in the protein data bank (PDB), a repository of mostly experimental models based on X-ray crystallographic and NMR studies < Atoms are defined by their Cartesian coordinates: ATOM N GLU ATOM CA GLU ATOM C GLU ATOM O GLU ATOM CB GLU ATOM CG GLU ATOM CD GLU ATOM OE1 GLU ATOM OE2 GLU ATOM N PHE ATOM CA PHE These structures provide the basis for most of theoretical work in protein folding and protein structure prediction
14
Comparison of protein structures
Need ways to determine if two protein structures are related and to compare predicted models to experimental structures Commonly used measure is the root mean square deviation (RMSD) of the Cartesian atoms between two structures after optimal superposition (McLachlan, 1979): Usually use Ca atoms 3.6 Å 2.9 Å NK-lysin (1nkl) Bacteriocin T102/as48 (1e68) T102 best model Other measures include contact maps and torsion angle RMSDs
15
De novo prediction of protein structure
sample conformational space such that native-like conformations are found select hard to design functions that are not fooled by non-native conformations (“decoys”) astronomically large number of conformations 5 states/100 residues = 5100 = 1070
16
Requirements for sampling methods and scoring functions
Sampling methods must produce good decoy sets that are comprehensive and include several native-like structures Scoring function scores must correlate well with RMSD of conformations (the better the score/energy, the lower the RMSD)
17
Semi-exhaustive segment-based folding
EFDVILKAAGANKVAVIKAVRGATGLGLKEAKDLVESAPAALKEGVSKDDAEALKKALEEAGAEVEVK minimise Find the most protein-like structures … generate Make random moves to optimise what is observed in known structures … filter all-atom pairwise interactions, bad contacts compactness, secondary structure, consensus of generated conformations
18
Critical Assessment of protein Structure Prediction methods (CASP)
Pre-CASP CASP Bias towards known structures Blind prediction
19
5.0 Å Cα RMSD for all 53 residues
CASP6 prediction (model1) for T0215 5.0 Å Cα RMSD for all 53 residues Ling-Hong Hung/Shing-Chung Ngan
20
4.3 Å Cα RMSD for all 70 residues
CASP6 prediction (model1) for T0281 4.3 Å Cα RMSD for all 70 residues Sheets –no copying- pairing and packing Ling-Hong Hung/Shing-Chung Ngan
21
Homologous proteins share similar structures
Gan et al, Biophysical Journal 83: , 2002
22
Comparative modelling of protein structure
KDHPFGFAVPTKNPDGTMNLMNWECAIP KDPPAGIGAPQDN----QNIMLWNAVIP ** * * * * * * * ** … scan align de novo simulation build initial model minimum perturbation construct non-conserved side chains and main chains graph theory, semfold refine physical functions
23
1.3 Å Cα RMSD for all 137 residues (80% ID)
CASP6 prediction (model1) for T0231 1.3 Å Cα RMSD for all 137 residues (80% ID) Tianyun Liu
24
2.4 Å Cα RMSD for all 142 residues (46% ID)
CASP6 prediction (model1) for T0271 2.4 Å Cα RMSD for all 142 residues (46% ID) Tianyun Liu
25
Protein structure from combining theory and experiment
Ling-Hong Hung
26
Similar global sequence or structure does not imply similar function
TIM barrel proteins 2246 with known structure hydrolase ligase lyase oxidoreductase transferase
27
Function prediction from structure
Kai Wang
28
Prediction of HIV-1 protease-inhibitor binding energies with MD
Can predict resistance/susceptibility to six FDA approved inhibitors with 95% accuracy in conjunction with knowledge-based methods Ekachai Jenwitheesuk
29
Prediction of protein inhibitors
Ekachai Jenwitheesuk
30
Prediction of protein interaction networks
Target proteome protein A 85% predicted interaction protein B 90% Interacting protein database protein a protein b experimentally determined interaction Assign confidence based on similarity and strength of interaction Key paradigm is the use of homology to transfer information across organisms; not limited to yeast, fly, and worm Consensus of interactions helps with confidence assignments Jason McDermott
31
E. coli predicted protein interaction network
Jason McDermott
32
M. tuberculosis predicted protein interaction network
Jason McDermott
33
C. elegans predicted protein interaction network
Jason McDermott
34
H. sapiens predicted protein interaction network
Jason McDermott
35
Bioverse – v2.0 Michal Guerquin/Zach Frazier
36
Network-based annotation for C. elegans
Jason McDermott
37
Identifying key proteins on the anthrax predicted network
Articulation point proteins Jason McDermott
38
Identification of virulence factors
Jason McDermott
39
Bioverse - Integrator Aaron Chang/Imran Rashid
40
+ + Computational Functional Structural biology genomics genomics
Where is all this going? Computational biology + Structural genomics Functional genomics + Take home message Prediction of protein structure, function, and networks may be used to model whole genomes to understand organismal function and evolution
41
Current group members: Past group members:
Acknowledgements Andrew Nichols Baishali Chanda Chuck Mader David Nickle Duangdao Wichadakul Duncan Milburn Ersin Emre Oren Ekachai Jenwitheesuk Gong Cheng Jason McDermott Jeremy Horst Kai Wang Ling-Hong Hung Michal Guerquin Shing-Chung Ngan Somsak Phattarasukol Stewart Moughon Tianyun Liu Weerayuth Kittichotirat Zach Frazier Kristina Montgomery, Program Manager Current group members: Aaron Chang Marissa LaMadrid Mike Inouye Sarunya Suebtragoon Past group members: Funding agencies: National Institutes of Health National Science Foundation Searle Scholars Program Puget Sound Partners in Global Health UW Advanced Technology Initiative
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.