Bioinformatics: Practical Application of Simulation and Data Mining Markov Modeling I Prof. Corey O’Hern Department of Mechanical Engineering Department of Physics Yale University
Protein folding kinetics
Configuration space: $e^N$ local energy minima (configurations) connected via transitions
Random walk on network from initial to native state
States and transition probabilities obtained from simulations
[Figure labels: basin of attraction, transition, initial state, native state, energy minimum]
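As a rough illustration of how transition probabilities might be obtained from a simulation, the sketch below counts jumps in a trajectory that has already been assigned to discrete states. The trajectory, the function name `estimate_transition_matrix`, and the `lag` parameter are hypothetical choices for this example, not the procedure used in the lecture.

```python
def estimate_transition_matrix(traj, n_states, lag=1):
    """Estimate T[i][j] ~ P(state i at t + lag | state j at t) by counting."""
    counts = [[0] * n_states for _ in range(n_states)]
    for start, end in zip(traj[:-lag], traj[lag:]):
        counts[end][start] += 1  # row = "to" state, column = "from" state
    T = [[0.0] * n_states for _ in range(n_states)]
    for j in range(n_states):
        col_total = sum(counts[i][j] for i in range(n_states))
        for i in range(n_states):
            # A state that is never left has an empty column; keep it as zeros.
            T[i][j] = counts[i][j] / col_total if col_total else 0.0
    return T

# Hypothetical sequence of state labels (not data from the lecture).
traj = [0, 0, 1, 1, 1, 2, 2, 1, 0, 0, 1, 2, 2, 2, 1]
T = estimate_transition_matrix(traj, n_states=3)
for row in T:
    print(row)
```

Each column of the estimate sums to 1 by construction, since a column is normalized by the number of times its "from" state was visited.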
Markov Modeling of Proteins
“Describing protein folding kinetics by Molecular Dynamics Simulations. 1. Theory,” W. C. Swope, J. W. Pitera, and F. Suits, J. Phys. Chem. B 108 (2004)
“Describing protein folding kinetics by Molecular Dynamics Simulations. 2. Example applications to Alanine Dipeptide and a β-hairpin peptide,” W. C. Swope, J. W. Pitera, et al., J. Phys. Chem. B 108 (2004) 6582.
Additional Reading
1. “Molecular simulation of ab initio protein folding for a millisecond folder NTL9(1-39),” JACS 132 (2010)
2. “Using massively parallel simulation and Markovian models to study protein folding: Examining the dynamics of the Villin headpiece,” J. Chem. Phys. 124 (2006)
3. “Progress and challenges in the automated construction of Markov state models for full protein systems,” J. Chem. Phys. 131 (2009)
4. “Using generalized ensemble simulations and Markov state models to identify conformational states,” Methods 49 (2009)
5. “Stochastic dynamics of model proteins on a directed graph,” Phys. Rev. E 79 (2009)
Markov Modeling
Describes the temporal evolution of the state of the system
No memory: transition probabilities depend only on the current state; satisfied by MD trajectories
Time domain (continuous or discrete); state space (continuous or discrete)
Statistical description: what is the probability that a member of the ensemble of systems will be in a given state at time t?
How does one choose the set of states for a Markov model of protein dynamics? Continuous degrees of freedom yield an infinite number of states.
Number of native contacts…but not specific enough
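The statistical question above (the probability of finding an ensemble member in a given state at time t) can be sketched for discrete time and discrete states. The 3-state matrix below is a hypothetical example, using the convention that `T[i][j]` is the probability of moving to state i from state j, so each step depends only on the current distribution.

```python
def propagate(T, p):
    """One Markov step: p_new[i] = sum_j T[i][j] * p[j]."""
    n = len(p)
    return [sum(T[i][j] * p[j] for j in range(n)) for i in range(n)]

# Hypothetical 3-state matrix; every column sums to 1.
T = [
    [0.90, 0.10, 0.00],
    [0.10, 0.80, 0.10],
    [0.00, 0.10, 0.90],
]

p = [1.0, 0.0, 0.0]  # the whole ensemble starts in state 0
for step in range(500):
    p = propagate(T, p)  # no memory: only the current p enters

print(p)
```

Because this particular matrix is symmetric, the distribution relaxes toward equal occupation of the three states.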
Lumping of States: From 11 to 3
Are transitions among aggregated states (A, B, C) Markovian? Yes, at sufficiently long time scales.
How does one decide on a lumping scheme?
[Figure labels: initial state, native state, A, B, C]
Mathematical Description
$T_{ij}$: transition probability to state $i$ from state $j$; $i, j = 1, \dots, N_s$
Elements non-negative
Columns sum to 1
Eigenvalues $\lambda_i \le 1$: $T\pi = \pi$ ($\lambda = 1$) gives the steady-state probability distribution
Detailed balance (no net flow)
Eigenvectors form a complete set
$N_s - 1$ eigenvalues determine relaxation rates
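The role of the eigenvalues can be made explicit with a short worked expansion. The notation is an assumption of this sketch: $\pi$ is the normalized steady-state distribution, $\phi_i$ are the remaining right eigenvectors, $c_i$ are expansion coefficients fixed by the initial distribution $p(0)$, and $\Delta t$ is the time step of the matrix. It also assumes that $\lambda_1 = 1$ is non-degenerate and that the other eigenvalues satisfy $0 < \lambda_i < 1$.

```latex
\begin{align}
  T_{ij}\,\pi_j &= T_{ji}\,\pi_i
      && \text{(detailed balance: no net flow between } i \text{ and } j\text{)} \\
  T\phi_i &= \lambda_i\,\phi_i, \qquad \lambda_1 = 1,\ \phi_1 = \pi \\
  p(n) &= T^{n}\,p(0) = \pi + \sum_{i=2}^{N_s} c_i\,\lambda_i^{\,n}\,\phi_i \\
  \tau_i &= -\frac{\Delta t}{\ln \lambda_i}, \qquad i = 2, \dots, N_s
\end{align}
```

Each term with $\lambda_i < 1$ decays as $\lambda_i^{\,n} = e^{-n\Delta t/\tau_i}$, so the $N_s - 1$ eigenvalues below 1 set the relaxation times, and the eigenvalue closest to 1 sets the slowest one.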
Toy Model
Transition Matrix $T_{ij}$ (columns sum to 1); Lumped Transition Matrix $L$ (columns sum to 1)
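One common way to define a lumped matrix, assumed here rather than taken from the slide, weights each microstate by its steady-state probability: $L_{AB} = \sum_{i \in A} \sum_{j \in B} T_{ij} w_j / \sum_{j \in B} w_j$. The sketch below applies that definition to a hypothetical 4-microstate chain; the function name `lump` and the grouping are illustrative only.

```python
def lump(T, weights, groups):
    """Coarse-grain a column-stochastic matrix T onto groups of microstates."""
    m = len(groups)
    L = [[0.0] * m for _ in range(m)]
    for b, src in enumerate(groups):        # "from" macrostate
        w_total = sum(weights[j] for j in src)
        for a, dst in enumerate(groups):    # "to" macrostate
            flow = sum(T[i][j] * weights[j] for j in src for i in dst)
            L[a][b] = flow / w_total
    return L

# Hypothetical 4-microstate chain; every column sums to 1.
T = [
    [0.8, 0.2, 0.0, 0.0],
    [0.2, 0.7, 0.1, 0.0],
    [0.0, 0.1, 0.7, 0.2],
    [0.0, 0.0, 0.2, 0.8],
]
weights = [0.25, 0.25, 0.25, 0.25]  # steady state of this symmetric T
groups = [[0, 1], [2, 3]]           # two macrostates

L = lump(T, weights, groups)
for row in L:
    print(row)
```

With this weighting the lumped matrix keeps the property that its columns sum to 1.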
State Probabilities
Results from Toy Model (9 microstates): $L(T^n)$ vs. $(L(T))^n$
Larger deviations; practical
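The comparison of $L(T^n)$ with $(L(T))^n$ can be sketched on a small example. Here $L(T^n)$ lumps the n-step microstate matrix, while $(L(T))^n$ propagates the one-step lumped matrix; the two agree only if the lumped dynamics are Markovian. The 4-microstate chain, the steady-state weighting used in `lump`, and the helper names are assumptions of this sketch, not the 9-microstate toy model of the slide.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, n):
    """A raised to an integer power n >= 1 by repeated multiplication."""
    result = A
    for _ in range(n - 1):
        result = matmul(result, A)
    return result

def lump(T, weights, groups):
    """L[a][b] = sum_{i in a, j in b} T[i][j] w[j] / sum_{j in b} w[j]."""
    m = len(groups)
    L = [[0.0] * m for _ in range(m)]
    for b, src in enumerate(groups):
        w_total = sum(weights[j] for j in src)
        for a, dst in enumerate(groups):
            L[a][b] = sum(T[i][j] * weights[j] for j in src for i in dst) / w_total
    return L

def deviation(T, weights, groups, n):
    """Largest entrywise gap between L(T^n) and (L(T))^n."""
    lumped_after = lump(matpow(T, n), weights, groups)
    lumped_first = matpow(lump(T, weights, groups), n)
    m = len(groups)
    return max(abs(lumped_after[a][b] - lumped_first[a][b])
               for a in range(m) for b in range(m))

# Hypothetical 4-microstate chain (columns sum to 1), lumped into two groups.
T = [
    [0.8, 0.2, 0.0, 0.0],
    [0.2, 0.7, 0.1, 0.0],
    [0.0, 0.1, 0.7, 0.2],
    [0.0, 0.0, 0.2, 0.8],
]
weights = [0.25, 0.25, 0.25, 0.25]  # steady state of this symmetric T
groups = [[0, 1], [2, 3]]

for n in (1, 2, 10, 50):
    print(n, deviation(T, weights, groups, n))
```

The gap is zero at n = 1 by construction; how it behaves at larger n depends on how well the chosen grouping separates fast motion within groups from slow motion between them.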
Eigenvalue Spectra: $\lambda_i$ and $\log \lambda_i$ versus index $i$
Small deviations
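To connect an eigenvalue spectrum to time scales, the sketch below uses a hypothetical two-state matrix, whose eigenvalues are known in closed form, and converts the non-unit eigenvalue into a relaxation time through $\tau = -\Delta t / \ln \lambda$. The rates `a` and `b` and the unit time step are illustrative assumptions.

```python
import math

def two_state_eigenvalues(a, b):
    """Eigenvalues of [[1 - a, b], [a, 1 - b]], whose columns sum to 1."""
    return 1.0, 1.0 - a - b

def relaxation_time(lam, dt=1.0):
    """Relaxation time -dt / ln(lam) for an eigenvalue with 0 < lam < 1."""
    return -dt / math.log(lam)

a, b = 0.05, 0.05  # hypothetical per-step hopping probabilities
lam1, lam2 = two_state_eigenvalues(a, b)

# Check that (1, -1) is the eigenvector belonging to lam2.
T = [[1 - a, b], [a, 1 - b]]
v = [1.0, -1.0]
Tv = [T[0][0] * v[0] + T[0][1] * v[1], T[1][0] * v[0] + T[1][1] * v[1]]

tau = relaxation_time(lam2)
print(lam1, lam2, tau)
```

An eigenvalue closer to 1 gives a longer relaxation time, which is why the top of the spectrum matters most when comparing a lumped model with the full one.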