Mean Field Theory and Mutually Orthogonal Latin Squares in Peptide Structure Prediction N. Gautham Department of Crystallography and Biophysics University of Madras, Guindy Campus Chennai
In this talk Introduction Mean Field Theory Mutually Orthogonal Latin Squares Applications to – Peptide structure Energy landscape Protein structure Ligand Docking
The Problem: Given the sequence of a protein, can we use information from Physics, Chemistry, …… to calculate its three dimensional structure ? Introduction
‘Native Structure’ is one with minimum potential energy Choose a model + force field (potential function) Optimize structure with respect to potential function
Introduction Optimisation techniques Monte Carlo Molecular Dynamics – simulated annealing Genetic algorithms …… Mean Field Technique
Mean field technique has been applied, e.g. to predict protein side chain structure Φ is the conformational search space This is divided into a number of subspaces φ i Each such subspace has a number of states φ ij each with a probability of occurrence ρ ij
First ρ ij = 1/n i n i is number of states of subspace i Next, the effective potential due to a state φ rs of a subspace φ r is evaluated as V eff (φ rs ) = ∑ i,j ρ ij V(φ rs, φ ij ) Next, ρ rs re-evaluated as ρ rs = exp{-V eff (φ rs )/RT} / ∑ q exp{-V eff (φ rq )/RT} Mean Field Technique
Application to peptide/protein (backbone) structure - Torsion angles as subspaces? Extension to torsion angle space is not straight- forward The interaction between a pair of subspaces (when the subspaces are the torsion angles) does not depend only on their respective states, but is a function of the states of all other subspaces. i.e. V(φ rs, φ ij ) is not meaningful - we need V(φ rs, φ ij, ……) Mean Field Technique
To avoid combinatorial explosion we use a small sample (~ n 2 ) of the possible (m n ) combinations We use mutually orthogonal Latin squares (MOLS) to identify the sample e.g. 3 torsion angles, 5 values each - Mutually Orthogonal Latin Squares φ ij, i = 1, 3; j = 1, 5 33 11 22
Mutually Orthogonal Latin Squares φ 11, φ 21 φ 31 φ 15, φ 24 φ 33 φ 14, φ 22 φ 35 φ 13, φ 25 φ 32 φ 12, φ 23 φ 34 φ 12, φ 22 φ 32 φ 11, φ 25 φ 34 φ 15, φ 23 φ 31 φ 14, φ 21 φ 33 φ 13, φ 24 φ 35 φ 13, φ 23 φ 33 φ 12, φ 21 φ 35 φ 11, φ 24 φ 32 φ 15, φ 22 φ 34 φ 14, φ 25 φ 31 φ 14, φ 24 φ 34 φ 13, φ 22 φ 31 φ 12, φ 25 φ 33 φ 11, φ 23 φ 35 φ 15, φ 21 φ 32 φ 15, φ 25 φ 35 φ 14, φ 23 φ 32 φ 13, φ 21 φ 34 φ 12, φ 24 φ 31 φ 11, φ 22 φ 33 3 MOLS of order 5
Mutually Orthogonal Latin Squares The effective energy is now V eff (φ rs ) = ∑ q w q V q (φ rs …) w q = exp{-V q (φ rs …)/RT} / ∑ q exp{-V q (φ rs …)/RT} The summation is over all the points in the MOLS grid in which φ rs occurs Note: w q is calculated from V q, not from V eff
φ 11, φ 21 φ 31 φ 15, φ 24 φ 33 φ 14, φ 22 φ 35 φ 13, φ 25 φ 32 φ 12, φ 23 φ 34 φ 12, φ 22 φ 32 φ 11, φ 25 φ 34 φ 15, φ 23 φ 31 φ 14, φ 21 φ 33 φ 13, φ 24 φ 35 φ 13, φ 23 φ 33 φ 12, φ 21 φ 35 φ 11, φ 24 φ 32 φ 15, φ 22 φ 34 φ 14, φ 25 φ 31 φ 14, φ 24 φ 34 φ 13, φ 22 φ 31 φ 12, φ 25 φ 33 φ 11, φ 23 φ 35 φ 15, φ 21 φ 32 φ 15, φ 25 φ 35 φ 14, φ 23 φ 32 φ 13, φ 21 φ 34 φ 12, φ 24 φ 31 φ 11, φ 22 φ 33 Mutually Orthogonal Latin Squares e.g. for φ 11
Mutually Orthogonal Latin Squares Since w q is calculated from V q, not from V eff Therefore w q is not ρ ij and the procedure is not iterative Parameterize the search space Use these to build a set of MOLS (chosen at random) to globally sample the space Analyze the samples to obtain a low energy conformation (This is followed by gradient minimization) Another low energy conformation?
Mutually Orthogonal Latin Squares We obtain ~1500 low energy structures By clustering, we show these may be reduced to ~ 50 mutually dissimilar structures e.g. 23 structures for Met-enkephalin
Mutually Orthogonal Latin Squares The search is exhaustive Plot of ‘new’ structure versus structure number Sample Overlap First sample Second sample
Applications – Peptide Structure Peptide Conformation search – the procedure identifies all computed and experimental structures e.g. Met-enkephalin, ECEPP/3 force field MOLS GEM of Scheraga, et al. MOLS Crystal Structure We use the MOLS procedure to create structure libraries
Applications – Peptide Structure Energy landscape, ECEPP/3 force field Minimal energy envelope for Met-enkephalin
Applications – Protein Structure MOLS libraries + genetic algorithms (ECEPP/3) (AMBER + ‘hydrophobic’) Selection Initialization Mutation Variation Diversity Fragment Libraries Fragments MOLS Minimization & Clustering Phase 1 Phase 2 Cross over
Applications – Protein Structure Avian Pancreatic Polypeptide 4.0 A Villin Head Piece 5.2 A Mellitin 4.3 c-MYB 6.1 A Tryptophan zipper 1.8 A Left Predicted; Right Experimental
Applications – Protein loops Loops in protein crystal structures classified by size. Structure predicted using ECEPP/3 …DSPEF… …VNRKSD… …INMTSQQ… …TNASTLDT… …KAPGGGCND… …RNLTKDRCKP… Blue – Predicted structure Orange – Crystal structure
Applications – Docking Ligand (drug) docking to proteins Redefine the search space as the conformational space of peptide ligand (i.e. n torsion angles φ 1 to φ n ), plus the ‘docking’ space (i.e. the rotation and translation parameters of the peptide in receptor site, r 1 to r 6 ). Composite scoring function is now f 1 {φ 1 to φ n } + f 2 {r 1 …r 6 } conformational energy + ‘docking’ energy
PDB ID: 1sua, RMSD = 1.50Å Sequence: ALAL PDB ID: 1a30, RMSD = 0.66Å Sequence: EDL PDB ID: 8gch, RMSD = 1.04Å Sequence: GAW PDB ID: 1b32, RMSD = 0.51Å Sequence: KMK PDB ID: 1dkx, RMSD = 1.35Å Sequence: NRLLLTG PDB ID: 1awq, RMSD = 1.50Å Sequence: HAGPIA Blue – predicted structure Red – Crystal structure
Acknowledgements Z A Rafi, K Vengadesan, J Arunachalam, V Kanakasabai, P Arun Prasad CSIR DST-FIST UGC-SAP Thank You