Exploring landscapes... for protein folding, binding, and fitness


1 Exploring landscapes... for protein folding, binding, and fitness
[Figure: energy versus “important coordinates”, showing a 700 K replica and a 200 K replica.]

2 Exploring landscapes for protein folding
[Figure: effective potential versus important coordinates.]
- Effective potentials: the AGBNP all-atom model
- Sampling: replica exchange molecular dynamics (REMD)
- Network models for polypeptide folding pathways and kinetics

3 Solvation Models
Explicit → Implicit (timeline labels on the slide: 1990s, 2000s)
Explicit solvent:
- Most detailed (accurate)
- Thermodynamics requires averaging over solvent coordinates
- Computationally expensive (number of degrees of freedom)
Implicit solvent:
- Continuum approximation: average solvent reaction field
- Based on the solvent PMF
- Relative solvation free energies from single-point energy calculations

4 Typical Modern Implicit Solvent Model
Electrostatic component: continuum dielectric models
- Poisson-Boltzmann solvers (accurate but numerical and slow)
- Generalized Born models (faster, analytical)
Non-polar component: surface area model

5 The AGBNP Implicit Solvent Model
Analytical Generalized Born + Non-Polar
Motivation:
- Applicable to small and large molecules and arbitrary functional groups
- Suitable for MD sampling: analytical, with analytical gradients
- Computationally efficient
Features:
- Parameter-free pairwise descreening implementation
- Cavity/vdW dispersion decomposition
- Combined with OPLS-AA: OPLS/AGBNP
Gallicchio E., and R.M. Levy. J. Comp. Chem., 25, 479-499 (2004)

6 Generalized Born Models
Charging free energy in a linear dielectric medium: [equation not preserved in the transcript]
B_i is the Born radius of atom i, defined in the Coulomb field approximation by: [equation not preserved in the transcript]
GB implementations differ mostly in the way the Born radii are computed.
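The slide's own equations did not survive extraction. As a hedged illustration, the sketch below evaluates the widely used Still-type Generalized Born expression, in which the pair function is f_ij = sqrt(r_ij^2 + B_i B_j exp(-r_ij^2 / (4 B_i B_j))). It assumes the Born radii are already known; how AGBNP computes them is not shown here. The function name `gb_energy`, the dielectric constants, and the unit conventions are assumptions made for this example, not taken from the slides.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2); an assumed unit convention.
COULOMB = 332.0637


def gb_energy(coords, charges, born_radii, eps_in=1.0, eps_out=80.0):
    """Still-type Generalized Born charging free energy (kcal/mol).

    coords     : list of (x, y, z) positions in Angstrom
    charges    : list of partial charges in units of e
    born_radii : list of Born radii B_i in Angstrom (assumed given)
    """
    tau = 1.0 / eps_in - 1.0 / eps_out
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):  # the double sum includes the self terms i == j
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            bb = born_radii[i] * born_radii[j]
            f_ij = math.sqrt(r2 + bb * math.exp(-r2 / (4.0 * bb)))
            total += charges[i] * charges[j] / f_ij
    return -0.5 * COULOMB * tau * total
```

For a single ion the double sum reduces to q^2 / B, which recovers the Born formula; for two well-separated charges f_ij approaches r_ij and the cross term becomes a screened Coulomb interaction.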

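7 AGBNP: Pairwise Descreening Scheme
- Born radii: rescaled pairwise descreening approximation [equation not preserved in the transcript]
- Rescale according to the self-volume of atom j [equation not preserved in the transcript]
- Self-volume of j (Poincaré formula) [equation not preserved in the transcript]
- Gaussian functions g_i are used to estimate overlap volumes
E. Gallicchio, R. Levy, J. Comp. Chem. 25, 479 (2004)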

8 Non-Polar Hydration Free Energy Estimator
Explicit solvent hydration free energies of alkanes: cavity and van der Waals dispersion decomposition
1. SASA alone is unable to describe the hydration free energy of linear vs. cyclic alkanes.
2. The cavity component correlates well with SASA.
3. The solute-solvent van der Waals energy is not directly correlated with SASA; it depends on the number of atoms.
4. The hydration free energy results from a balance between the opposing cavity and van der Waals components.
Gallicchio, E., M. Kubo, and R.M. Levy. J. Phys. Chem., 104, 6271-6285 (2000)

9 Replica exchange molecular dynamics
Rough energy landscapes and distributed computing
[Figure: energy versus “important coordinates”, with MD replicas at 200 K, 320 K, 450 K, and 700 K.]
Y. Sugita, Y. Okamoto, Chem. Phys. Lett., 314, 261 (1999)

10 Replica exchange molecular dynamics
Rough energy landscapes and distributed computing
[Figure: energy versus “important coordinates”, with MD replicas at 200 K, 320 K, 450 K, and 700 K; walkers 1 to 4 move between the 200 K replica and the 700 K replica.]
Y. Sugita, Y. Okamoto, Chem. Phys. Lett., 314, 261 (1999)
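The slides do not spell out the exchange rule, so the following is a minimal sketch of the standard Metropolis criterion for temperature replica exchange, in which a swap between two replicas is accepted with probability min(1, exp[(β_i - β_j)(E_i - E_j)]). The function names, the Boltzmann constant in kcal/(mol K), and the energy units are assumptions chosen for illustration.

```python
import math
import random

KB = 0.0019872041  # Boltzmann constant in kcal/(mol*K); assumed units


def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis acceptance probability for swapping two replicas.

    e_i is the potential energy of the configuration currently at
    temperature t_i, and e_j the energy of the one at t_j.
    """
    beta_i = 1.0 / (KB * t_i)
    beta_j = 1.0 / (KB * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    if delta >= 0.0:
        return 1.0  # swap never raises the combined Boltzmann weight cost
    return math.exp(delta)


def attempt_swap(e_i, e_j, t_i, t_j, rng=random):
    """Return True if the proposed swap is accepted."""
    return rng.random() < exchange_probability(e_i, e_j, t_i, t_j)
```

A swap is always accepted when the colder replica holds the higher-energy configuration. Otherwise the acceptance decays exponentially with the energy gap, which is why neighbouring temperatures need overlapping energy distributions.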

11 Protein folding: REMD and kinetic network models
- Free energy surfaces of the GB1 peptide from REM and comparison with experiment
- Kinetic network model of REMD (simulations of simulations), with states F1, U1, F2, and U2
- Kinetic network models and folding pathways
Andrec M, Felts AK, Gallicchio E, Levy RM. PNAS (2005) 102:6801.
Zheng W, Andrec M, Gallicchio E, Levy RM. PNAS (2007) 104:15340.

12 The β-Hairpin of B1 Domain of Protein G
Folding nucleus of the B1 domain:
- Blanco, Serrano. Eur. J. Biochem. 1995, 230, 634.
- Kobayashi, Honda, Yoshii, Munekata. Biochemistry 2000, 39, 6564.
Features of a small protein: stabilized by 1) formation of secondary structure and 2) association of hydrophobic residues.
- Munoz, Thompson, Hofrichter, Eaton. Nature 1997, 390, 196.
Computational studies using explicit and implicit solvent models:
- Pande, PNAS, 1999
- Dinner, Lazaridis, Karplus, PNAS, 1999
- Ma & Nussinov, JMB, 2000
- Pande, et al., JMB, 2001
- Garcia & Sanbonmatsu, Proteins, 2001
- Zhou & Berne, PNAS, 2002

13 The β-Hairpin of B1 Domain of Protein G
The potential of mean force of the capped peptide.
[Figure panels: simple (surface area) nonpolar model; OPLS/AGBNP.]
A. Felts, Y. Harano, E. Gallicchio, and R. Levy, Proteins, 56, 310 (2004)

14 The β-Hairpin of B1 Domain of Protein G
The potential of mean force of the capped peptide.
[Figure panels: simple (surface area) nonpolar model; OPLS/AGBNP.]
β-hairpin > 90%, α-helix < 10%, ΔG ~ 2 kcal/mol
A. Felts, Y. Harano, E. Gallicchio, and R. Levy, Proteins, 56, 310 (2004)

15 Kinetic network models for folding
Network nodes are snapshots from multiple temperatures (T_cold to T_hot) of a replica exchange simulation: 800,000 nodes and 7.4 billion edges.
Dynamical/kinetic considerations: transition rates (edges) are motivated by Kramers theory. Transitions are allowed if there is sufficient structural similarity, and forbidden otherwise.
Equilibrium considerations: sufficiently long trajectories must reproduce WHAM results.
Simulations are performed using the Gillespie algorithm for simulating Markov processes on discrete states:
- The waiting time in state i is an exponential random variable with mean 1/(Σ_j k_ij).
- The next state j is chosen with probability proportional to k_ij.
Andrec, Felts, Gallicchio & Levy (2005) PNAS, 102, 6801
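The two Gillespie steps described on this slide (exponential waiting time with mean 1/(Σ_j k_ij), next state chosen in proportion to k_ij) can be sketched as follows. This is a toy illustration on a small dictionary of rates, not the authors' implementation; the function name `gillespie_trajectory` and the nested-dictionary rate format are assumptions.

```python
import random


def gillespie_trajectory(rates, start, t_max, rng=None):
    """Simulate a continuous-time Markov jump process on discrete states.

    rates : dict mapping state i -> dict of {state j: rate k_ij}
    start : initial state
    t_max : stop once the simulated time would exceed this value
    Returns a list of (time, state) pairs, beginning with (0.0, start).
    """
    rng = rng or random.Random()
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        out = rates.get(state, {})
        total = sum(out.values())
        if total <= 0.0:
            break  # absorbing state: no outgoing edges
        # Waiting time is exponential with mean 1 / sum_j k_ij.
        t += rng.expovariate(total)
        if t > t_max:
            break
        # Next state is drawn with probability proportional to k_ij.
        targets = list(out.keys())
        weights = [out[s] for s in targets]
        state = rng.choices(targets, weights=weights)[0]
        path.append((t, state))
    return path
```

For example, `gillespie_trajectory({"U": {"F": 2.0}, "F": {"U": 0.5}}, "U", t_max=10.0)` returns one stochastic trajectory hopping between two hypothetical states U and F.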

16 Connection between kinetic model and equilibrium populations
Equilibrium populations for temperature T_0 are preserved if, for each pair of nodes (i, j), the ratio of transition rates follows WHAM weighting: [equation not preserved in the transcript]
- node i comes from temperature T_A and has energy E_i
- node j comes from temperature T_B and has energy E_j
- f_A(β_0) and f_B(β_0) are free energy weights for the T_A and T_B simulations at reference temperature T_0
These weights are order-parameter independent and will give correct PMFs for any projection.
The T-WHAM PMF at low temperature contains information from high-temperature simulations.

17 The majority of beta-hairpin folding trajectories pass through alpha-helical intermediate states
91% of 4000 temperature-quenched stochastic trajectories begun from high-energy coil states pass through states with α-helical content.
[Figure: fraction of hairpin conformation averaged over 4000 stochastic trajectories run at 300 K and begun from an initial state ensemble equilibrated at 700 K. Time scales labeled on the figure: 2500 units ≈ 50 µs and 9 units ≈ 180 ns.]
Andrec, Felts, Gallicchio & Levy (2005) PNAS, 102, 6801

18 Exploring stability and fitness landscapes: a biophysical view of protein evolution
[Figure from DePristo, Weinreich & Hartl (2005) Nature Rev. Genetics 6:678]
Approach:
- Analyze mutational patterns in sequence databases (information theory)
- Develop biophysical models
- Relate to structures

19 Example: Evolution of Drug Resistance in HIV Protease
Study of 13,608 HIV-1 protease amino acid sequences from the Stanford HIV Drug Resistance Database (8,229 drug-naïve, 2,677 PI-monotherapy, 2,702 multi-drug therapy).
[Figure panels: drug-naïve patients; patients treated with 1 protease inhibitor; patients treated with 2 or more protease inhibitors.]

20 Pairwise and higher-order correlations among drug-resistance mutations in HIV protease
residues | type | drug contact? | distance | correlation (φ value)
82-71 | primary-accessory | yes-no | 17 Å | 0.33
46-10 | primary-accessory | no-no | 22 Å | 0.31
10-71-82 | triplet | ???
10-46-71-82 | quartet | ????
20–32–46–48–53–54–58–74–82–90 | 10-mer | ??????????
[Figure labels: exposed to 2 drugs; exposed to 6 drugs.]
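The table reports pairwise correlations as φ values. As a hedged illustration, the sketch below computes the standard φ coefficient for two binary variables (mutation present or absent at each of two positions) from a 2×2 contingency table. Whether the authors used exactly this estimator is an assumption, and the function name `phi_coefficient` is hypothetical.

```python
import math


def phi_coefficient(x, y):
    """Phi coefficient between two equal-length binary sequences.

    x[k] and y[k] are truthy when sequence k carries the mutation at the
    first and second position, respectively. Returns NaN when either
    variable is constant, because the coefficient is then undefined.
    """
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0.0:
        return float("nan")
    return (n11 * n00 - n10 * n01) / denom
```

The value ranges from -1 (mutations never co-occur) through 0 (independent) to +1 (mutations always co-occur).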

21 Sequence Correlations and Biophysics
Correlations reflect the distribution of protein stabilities.
[Figure: all possible sequences versus folded sequences along a ΔG axis; sequence correlations: I_c(2), etc.]
A narrow stability range implies sequence correlation.
Given sequence correlation, what can we say about stability and the fitness landscape?

22 Distribution of charged residues in observed sequences of HIV Protease
Goal: relate the charge patterns (~1200 distinct patterns in 13,000 sequences) to structure and stability.
Reduced model for electrostatics:
- 3D structure of HIV protease, with charges placed on the backbone and side chains of charged residues
- Scoring uses the Generalized Born model: AGBNP2
- Stability is estimated as the difference in energies of observed sequences and random sequences

23 Observed mutations increase electrostatic stability of HIV protease
[Figure: probability versus total electrostatic energy (kcal/mol) for observed sequences and random sequences. Label: "for: 5432 all mutations with net charge (+3) and total charge (19)".]

24 Conclusions
- Drug resistance mutations help HIV protease evade drugs, but at the cost of stability and activity.
- Evolution of accessory mutations compensates for this loss of fitness.
- Correlations amongst these mutations are complex and involve higher-order interactions beyond just pairs of residues (connected information).
- We are trying to develop biophysical models for these mutational patterns which link sequences with structure, and can explain, for example, the mutation patterns observed for charged residues.

25 Exploring landscapes... for protein folding, binding, and fitness
[Figure: effective potential versus important coordinates.]
- Effective potentials for folding and binding (AGBNP): Emilio Gallicchio
- Protein folding pathways, network models, and kinetics: Michael Andrec, Weihua Zheng, Tony Felts, E.G.
- Protein stability and fitness landscapes: Omar Hag, Michael Andrec, Alex Morozov

26 Exploring landscapes for protein folding and binding using replica exchange simulations
[Figure: effective potential versus important coordinates.]
- The AGBNP all-atom effective solvation potential and REMD
- Peptide free energy surfaces and folding pathways from all-atom simulations and network models
- Temperature dependence of folding: physical kinetics and replica exchange kinetics with a network model
- Replica exchange on a 2-d continuous potential with an entropic barrier to folding

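27 Drug treatment and mutational correlations
Given the observed distribution, we fit [model not preserved in the transcript] to maximize the likelihood of the observed sample and calculate their Shannon entropies.
- The total contribution from correlation is [equation not preserved in the transcript]
- The contribution from 2-body interactions is [equation not preserved in the transcript]
Results shown on the slide, in slide order:
- Higher-order interactions are 28% of total correlation
- Higher-order interactions are 41% of total correlation
Residue sets shown on the slide, in slide order:
- 10–20–33–36–46–54–55–63–71–73–74–82–84–90–93
- 20–32–46–48–53–54–58–74–82–90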
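The slide's formulas were lost in extraction. As a hedged sketch of the kind of quantity involved, the code below computes the total correlation (multi-information) of a set of aligned sequences, defined as the sum of single-position Shannon entropies minus the joint entropy. This is the standard textbook definition and is assumed, not confirmed, to match the slide's "total contribution from correlation". The split into 2-body and higher-order parts requires fitting a pairwise maximum-entropy model and is not attempted here. Function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(samples):
    """Plug-in Shannon entropy (in bits) of a list of hashable samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def total_correlation(sequences):
    """Sum of per-position entropies minus the joint entropy, in bits.

    sequences : list of equal-length strings or tuples, one per sample,
                e.g. mutation patterns such as "0110".
    """
    columns = list(zip(*sequences))
    marginal = sum(shannon_entropy(col) for col in columns)
    joint = shannon_entropy([tuple(s) for s in sequences])
    return marginal - joint
```

Note that the plug-in estimator is biased for small samples, so with many positions and a limited number of sequences the joint entropy is underestimated and the total correlation overestimated.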

28 Recovering Folding Kinetics using Replica Exchange Simulations, a Kinetic Network model and Effective Stochastic Dynamics Weihua Zheng †, Michael Andrec ‡, Emilio Gallicchio ‡ and Ronald M. Levy ‡

