Published by Sylvia Daniel. Modified over 8 years ago.
1
Protein Structures from a Statistical Perspective
Jinfeng Zhang, Department of Statistics, Florida State University
2
2 Introduction – Machines of Life
Proteins play crucial roles in virtually all biological processes.
[Figure: structures of rhodopsin, myosin, hemoglobin, and pepsin, from the Protein Data Bank (PDB)]
3
3 From Sequence to Structure to Function
The function of a protein is governed by its three-dimensional structure.
– Structure is determined by sequence.
[Figure labels: DNA sequences, amino acid sequences, protein structures]
4
4 Protein Folding Problem
5
5 Evaluation of Energy Functions
Current approach: rank the energy of the single native structure among those of decoy structures.
– This criterion is neither necessary nor sufficient.
[Figure: two energy (E) panels, A and B, marking the X-ray structure and the near-native structures]
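The ranking criterion described on this slide can be sketched in a few lines. This is a minimal illustration, not the evaluation code used in the talk; the function name and the energy values in the usage note are hypothetical.

```python
def rank_of_native(native_energy, decoy_energies):
    """Return the 1-based rank of the native structure among the decoys.

    Rank 1 means no decoy has a strictly lower energy than the native.
    """
    lower = sum(1 for energy in decoy_energies if energy < native_energy)
    return 1 + lower
```

For example, with a made-up native energy of -10.0 and decoy energies -5.0, -12.0, and -3.0, one decoy scores lower than the native, so the native is ranked second.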
6
6 X-ray Structures
7
7 Proteins are Dynamic Molecules
8
8 Y. J. Huang and G. T. Montelione, Nature, 438 (2005), 36-37.
9
9 N. Furnham, T. Blundell, M. DePristo, Nature Structural & Molecular Biology (2006), 13:184-185.
… A more suitable representation of a macro-molecular crystal structure would be an ensemble of models. The range of structures in the ensemble would be considered by any user of the structural information.
10
10 A New Criterion for Energy Function Evaluation Based on Probability of NNS
[Figure labels: all structures, near-native structures (NNS), native structure]
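The transcript does not preserve the definition of the probability of NNS. One natural reading, assumed here and not confirmed by the slides, is the Boltzmann probability mass of the near-native set: the partition function restricted to NNS divided by the partition function over all structures. The sketch below computes that ratio for an explicitly enumerated toy set; the function names and the choice of temperature units are assumptions.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def probability_of_nns(energies, near_native_flags, temperature=1.0):
    """Assumed definition: P_NNS = Z_NNS / Z_all with Boltzmann weights exp(-E/T).

    energies: one energy per enumerated structure.
    near_native_flags: True where the structure counts as near-native.
    """
    log_weights = [-energy / temperature for energy in energies]
    nns_log_weights = [
        lw for lw, flag in zip(log_weights, near_native_flags) if flag
    ]
    if not nns_log_weights:
        return 0.0
    return math.exp(log_sum_exp(nns_log_weights) - log_sum_exp(log_weights))
```

Under this reading, an energy function scores well when it concentrates probability on the whole near-native set, not only when it ranks one native structure first.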
11
11 Off-lattice Discrete State Model
J. Zhang, R. Chen, J. Liang (2006), Proteins, 63:949-960.
12
12 Sampling Near-Native Structures (NNS) by Sequential Monte Carlo (SMC)
Two constraints:
– Conformational
– Energetic
Partition functions: (equations not preserved in this transcript)
[Figure labels: all structures, NNS, native structure, SMC]
J. Zhang et al. (2007), Proteins, 66:61-68.
13
13 Comparison with Enumeration
At length 15 in the 5-state model, protein 1ail has 1.04×10^9 conformations. The estimated number is 1.039×10^9 with a sample size of 10,000.
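The idea of checking a sequential Monte Carlo count against exact enumeration can be illustrated on a much simpler system. The sketch below is not the off-lattice 5-state model from the talk: it substitutes self-avoiding walks on a 2D square lattice and uses plain sequential importance sampling (Rosenbluth-style growth without resampling). All names are hypothetical.

```python
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def count_saws_exact(n_steps):
    """Count n-step self-avoiding walks from the origin by full enumeration."""

    def extend(position, visited, steps_left):
        if steps_left == 0:
            return 1
        total = 0
        for dx, dy in MOVES:
            nxt = (position[0] + dx, position[1] + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, steps_left - 1)
                visited.remove(nxt)
        return total

    return extend((0, 0), {(0, 0)}, n_steps)


def estimate_saws_smc(n_steps, n_samples, rng):
    """Estimate the same count by growing walks one step at a time.

    Each step is drawn uniformly from the free neighbours, so the
    importance weight is multiplied by the number of free neighbours.
    A walk that gets trapped contributes weight zero.
    """
    total_weight = 0.0
    for _sample in range(n_samples):
        position = (0, 0)
        visited = {position}
        weight = 1.0
        for _step in range(n_steps):
            free = [
                (position[0] + dx, position[1] + dy)
                for dx, dy in MOVES
                if (position[0] + dx, position[1] + dy) not in visited
            ]
            if not free:
                weight = 0.0
                break
            weight *= len(free)
            position = rng.choice(free)
            visited.add(position)
        total_weight += weight
    return total_weight / n_samples
```

Calling `estimate_saws_smc(8, 20000, random.Random(0))` and comparing it with `count_saws_exact(8)` mirrors the comparison on the slide, where the mean weight is an unbiased estimate of the number of conformations.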
14
14 Energy Functions and Their Performance
J. Zhang et al. (2007), Proteins, 66:61-68.
J. Zhang, R. Chen, J. Liang (2006), Proteins, 63:949-960.
UP: Uniform Potential
CP: Contact Potential
CALSP: Contact And Local Sequence-structure Potential
15
15 Characterizing the Ensemble of Side Chain Conformations
Side chain conformations
– An ensemble of structures with the same backbone but different side chain conformations.
16
16 Ensemble of Side Chain Conformations
17
17 Ensemble of Side Chain Conformations
Number of side chain conformations: N_sc.
Side chain conformational entropy: S_sc = k_B ln(N_sc).
Protein stability: G = H - T S.
http://wishart.biology.ualberta.ca/moviemaker
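As a small worked example of S_sc = k_B ln(N_sc), the sketch below converts a conformation count into an entropy expressed in units of k_B. The function name is hypothetical; the count in the usage note is the exact enumeration figure quoted on slide 20.

```python
import math


def side_chain_entropy_in_kb(n_conformations):
    """Return S_sc / k_B = ln(N_sc) for a positive conformation count."""
    if n_conformations < 1:
        raise ValueError("the conformation count must be at least 1")
    return math.log(n_conformations)
```

For the 396,325,923,840 conformations reported for the 3ebx fragment, this gives an entropy of roughly 26.7 k_B.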
18
18 Sequential Monte Carlo (SMC)
S_n = (r_1, …, r_j, …, r_n), r_j ∈ R_j = {1, …, M_j}.
SMC: sample one side chain (r) at a time and keep the residues that are already sampled fixed.
For each sample i, there is an associated weight, w^(i).
At step t, a residue, r_t, is picked, and a rotamer, k, is sampled from a given distribution with probability p_k.
Update the weight by: (formula not preserved in this transcript)
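The sampling scheme on this slide can be sketched on a toy system. The weight update is missing from the transcript; the sketch assumes the standard importance-sampling update for counting, w ← w / p_k, with rotamers drawn uniformly from those that do not clash with residues already placed. The rotamer counts and the clash rule are invented stand-ins, not a real rotamer library or steric check.

```python
import itertools
import random

# Hypothetical toy system: number of rotamers M_j for each residue j.
ROTAMER_COUNTS = [3, 4, 2, 3, 4, 3]


def clashes(i, a, j, b):
    """Invented clash rule between rotamer a of residue i and b of residue j (i < j)."""
    if j - i == 1:
        return a == b
    if j - i == 2:
        return a + b == 3
    return False


def is_compatible(placed, j, b):
    """True if rotamer b of residue j clashes with none of the placed rotamers."""
    return not any(clashes(i, a, j, b) for i, a in enumerate(placed))


def count_exact(counts):
    """Count clash-free rotamer assignments by brute-force enumeration."""
    total = 0
    for conf in itertools.product(*(range(m) for m in counts)):
        if all(is_compatible(conf[:j], j, conf[j]) for j in range(len(counts))):
            total += 1
    return total


def estimate_count_smc(counts, n_samples, rng):
    """Estimate the same count by sampling one residue at a time."""
    total_weight = 0.0
    for _sample in range(n_samples):
        placed = []
        weight = 1.0
        for j, m in enumerate(counts):
            allowed = [b for b in range(m) if is_compatible(placed, j, b)]
            if not allowed:
                weight = 0.0  # dead end: this sample contributes nothing
                break
            p_k = 1.0 / len(allowed)  # uniform proposal over allowed rotamers
            placed.append(rng.choice(allowed))
            weight /= p_k  # assumed update: w <- w / p_k
        total_weight += weight
    return total_weight / n_samples
```

Because the proposal is uniform over the allowed rotamers, the average weight over samples is an unbiased estimate of the number of clash-free assignments, which can be compared with `count_exact(ROTAMER_COUNTS)` on a system this small.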
19
19 Sequential Sampling of Side Chains
20
20 Performance
The total number of self-avoiding side-chain conformations for the fragment of 3ebx, residues 1-17, is 396,325,923,840 ≈ 3.96×10^11. The SMC estimate is 4.01×10^11 with a sample size of 1000.
21
21 Incorporating SCE in Energy Function
G = H
– H: Residue contact potential.
G = H - T S_sc
– H: Residue contact potential.
– S_sc: Side-chain entropy.
– T = 1.
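The two scoring schemes on this slide differ only in the entropy term. The sketch below shows how adding -T S_sc can change which structure scores lowest; the structure names and the H and S_sc values are invented for illustration and are not taken from the talk.

```python
def free_energy(contact_energy, side_chain_entropy, temperature=1.0):
    """G = H - T * S_sc, with T = 1 by default as on the slide."""
    return contact_energy - temperature * side_chain_entropy


def lowest_scoring(structures, use_entropy):
    """Return the name of the structure with the lowest score.

    structures maps a name to a (H, S_sc) pair. With use_entropy=False
    the score is H alone; with use_entropy=True it is H - T * S_sc.
    """
    def score(name):
        h, s_sc = structures[name]
        return free_energy(h, s_sc) if use_entropy else h

    return min(structures, key=score)


# Hypothetical values: the decoy has a slightly better contact energy,
# but the native structure allows many more side-chain conformations.
EXAMPLE = {"native": (-10.0, 30.0), "decoy": (-12.0, 20.0)}
```

With these made-up numbers the decoy wins on H alone, while the native structure wins once the side-chain entropy term is included.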
22
22 ΔH vs. ΔH - ΔS_sc
Each row gives the protein ID with its decoy set* and the number listed with it, followed by the ΔH and ΔH - ΔS_sc entries. The boundary between the two entries was lost in extraction, so the digits are reproduced as they appear.
1ctf (A, 630): 61
1r69 (A, 675): 245
1sn3 (A, 660): 8610
2cro (A, 674): 635
3icb (A, 653): 1925
4pti (A, 687): 14383
4rxn (A, 677): 147
1fc2 (B, 500): 75
1hdd-C (B, 500): 105
2cro (B, 500): 4717
4icb (B, 500): 11
1bl0 (C, 971): 8514
1eh2 (C, 2413): 9953
1jwe (C, 1407): 2881
smd3 (C, 1200): 2661
1beo (D, 2000): 672
1ctf (D, 2000): 101
1dkt-A (D, 2000): 5885
1fca (D, 2000): 13610
1nkl (D, 2000): 2173
1pgb (D, 1572): 121
1b0n-B (E, 497): 114104
1ctf (E, 497): 134
1dtk (E, 215): 11
1fc2 (E, 500): 323
1igd (E, 500): 1596
1shf-A (E, 437): 2 2
2cro (E, 500): 11
2ovo (E, 347): 192
4pti (E, 343): 11
* A: 4state_reduced, B: fisa, C: fisa_casp3, D: lattice_ssfit, E: lmds.
23
23 Summary
Proteins are better modeled as ensembles of conformations.
– Entropy and free energy can be estimated by SMC.
P_NNS is a better criterion for evaluating energy functions.
SCE is important for protein folding and structure modeling.
24
24 Acknowledgement
Prof. Jun Liu, Department of Statistics, Harvard University
Prof. Jie Liang, Bioengineering Department, University of Illinois at Chicago
Prof. Rong Chen, Department of Information and Decision Science, University of Illinois at Chicago
Dr. Ming Lin, Department of Information and Decision Science, University of Illinois at Chicago
NIH and NSF for financial support.
25
25 Protein Interactions http://wishart.biology.ualberta.ca/moviemaker
26
26 Native & Decoy Structures of Protein Complexes
[Figure labels: 1spb, 1brc, native]
27
27 Native & Decoy Structures
S_sc of native and decoy structures can differ by more than 20 in units of k_B, which corresponds to -11.9 kcal/mol of free energy at 300 K. The stability of a protein is around -5 to -20 kcal/mol.
[Figure labels: native, 1ctf]
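The conversion quoted on this slide can be checked with a short calculation: an entropy difference of ΔS in units of k_B contributes -T ΔS to the free energy, and per mole k_B becomes the gas constant, about 1.987×10^-3 kcal/(mol K). The function and constant names below are hypothetical.

```python
# Gas constant in kcal/(mol K); this is k_B expressed per mole.
GAS_CONSTANT_KCAL = 1.987204e-3


def entropy_term_kcal_per_mol(delta_s_in_kb, temperature_kelvin):
    """Return -T * dS in kcal/mol for an entropy difference given in units of k_B."""
    return -temperature_kelvin * GAS_CONSTANT_KCAL * delta_s_in_kb
```

For ΔS = 20 k_B at 300 K this evaluates to about -11.9 kcal/mol, matching the figure on the slide and comparable in size to the quoted protein stability of -5 to -20 kcal/mol.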
28
28 Side-chain Modeling
All heavy atoms are explicitly modeled.
Side-chain flexibility
– Rotamer library by D. Richardson
Excluded volume effect
– A pair of atoms i and j is considered a hard clash if (inequality not preserved in this transcript), where r_ij is the distance between them, r_0(i) and r_0(j) are the van der Waals radii of the two atoms, and a is a scaling coefficient.
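The clash inequality itself is missing from the transcript. A common form of such a test, assumed here and not confirmed by the slide, flags a clash when the distance falls below the scaled sum of the two radii: r_ij < a (r_0(i) + r_0(j)). The sketch below implements that assumed form; the function name, radii, and scaling value in the usage note are illustrative only.

```python
import math


def is_hard_clash(position_i, position_j, radius_i, radius_j, scale):
    """Assumed clash test: r_ij < scale * (r_0(i) + r_0(j)).

    position_i, position_j: (x, y, z) coordinates of the two atoms.
    radius_i, radius_j: van der Waals radii in the same length unit.
    scale: the scaling coefficient a.
    """
    r_ij = math.dist(position_i, position_j)
    return r_ij < scale * (radius_i + radius_j)
```

For instance, two atoms with radius 1.7 placed 2.0 apart clash under a hypothetical scale of 0.7, since the threshold is 0.7 × 3.4 = 2.38, while the same atoms 3.0 apart do not.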
29
29 X-ray & NMR Structures
[Figure labels: protein in crystal, protein in solution]
30
30 SCE vs. R_g of X-ray and NMR Structures
23 proteins with both X-ray and NMR structures