T-cell epitope prediction by molecular dynamics simulations
Irini Doytchinova
School of Pharmacy, Medical University of Sofia
Vaccines and Epitopes
Vaccines: live attenuated or killed pathogens; subunit vaccines; epitope-based vaccines.
An epitope is a continuous or non-continuous sequence of a protein that is recognized by, and interacts with, another protein.
Epitopes: linear epitope; conformational epitope.
Cells: T-lymphocyte; B-lymphocyte.
Antigen processing pathways
Intracellular pathway; extracellular pathway.
T-cell epitope prediction
Epitope-based vaccine development: in silico prediction -> in vitro and in vivo tests -> clinical tests.
100 aa -> 92 overlapping nonamer peptides -> 10 nonamer peptides
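The count of 92 follows from sliding a 9-residue window along a 100-residue chain (100 - 9 + 1 = 92). A minimal sketch, assuming a placeholder sequence; the function name `overlapping_peptides` and the sequence are illustrative and not from the talk:

```python
def overlapping_peptides(sequence: str, length: int = 9) -> list[str]:
    """Return every contiguous window of `length` residues, in sequence order."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 100-residue sequence (a placeholder, not a real protein).
protein = "ACDEFGHIKLMNPQRSTVWY" * 5

nonamers = overlapping_peptides(protein)
print(len(nonamers))  # 100 - 9 + 1 windows
```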
T-cell epitope prediction
T-cell epitope prediction is a critical step in the development of epitope-based vaccines. As the accuracy of the predictions improves, the subsequent expensive “wet lab” work becomes faster, more efficient and more successful.
Fields: biology, informatics, bioinformatics, immunology, immunoinformatics.
Immunoinformatics approaches
Sequence-based methods: Affinity = f(Chemical structure). Motif-based, QMs, ANN, SVM.
Structure-based methods: Affinity = f(Interaction energy). Molecular docking, molecular dynamics.
Table columns: peptide, pIC50exp. Peptides: ILDPFPVTV, ALDPFPPTV, VLDPFPITV, LLDPFPPPV, ILDPIPPTV, LLDDFPVTV, ILDPLPPTV, YLFPGPVTA.
Our immunoinformatics tools
MHC class II binding prediction by molecular dynamics
Peptide – HLA-DP2 protein complex (DPA1*0103 red, DPB1*0101 blue), pdb code: 3lqz, April 2010.
Stages: combinatorial library, ΔG, QM, external validation.
Combinatorial library (X is the substituted residue): PKYVKQNTLKLAT, PKXVKQNTLKLAT, PKYVKXNTLKLAT, …, PKYVKQNXLKLAT, …, PKYVKQNTXKLAT, …, PKYVKQNTLKXAT, …
QM: one row per amino acid (A, C, D, E, …), one column per substituted position; entries shown as ….
Combinatorial library
Original ligand: RK FHYLPFLPS TGGS (core positions p1 to p9 = FHYLPFLPS)
9 positions x 19 amino acids + 1 original ligand = 172 ligands
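The library size can be checked by enumerating every single-residue substitution of the nine core positions. This is a minimal sketch: the function name and the split of the ligand into flank, core and flank arguments are assumptions made for illustration, not the authors' tooling.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_substitution_library(n_flank: str, core: str, c_flank: str) -> list[str]:
    """Original ligand plus every ligand differing from it at one core position."""
    ligands = [n_flank + core + c_flank]
    for pos, original in enumerate(core):
        for aa in AMINO_ACIDS:
            if aa != original:  # 19 alternatives per position
                mutated = core[:pos] + aa + core[pos + 1:]
                ligands.append(n_flank + mutated + c_flank)
    return ligands


library = single_substitution_library("RK", "FHYLPFLPS", "TGGS")
print(len(library))  # 9 * 19 + 1
```

Passing the flanking residues as part of `core` (and empty flanks) would enumerate substitutions over all of those positions as well.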
MD simulations
Problems to solve:
1. Which energy to use for prediction?
2. How long to equilibrate the system?
GROMACS is developed by Herman Berendsen's group, University of Groningen. GROMACS 4.0.7: Hess, et al. (2008) J. Chem. Theory Comput. 4:
Protocol steps: pdb to gmx; create a box around the complex; fill the box with water molecules; neutralize the charge with counterions; energy minimization; position-restrained MD; MD with simulated annealing; record the interaction energies.
Step annotations, in the same order: Force field: GROMOS96 53a6; side: 1 nm; Na+; 20 ps; K; LJ-SR & Coul-SR.
Which energy to use for prediction?
Test set: n = 1932 known binders to HLA-DRB1*0101 originating from 122 foreign proteins.
The Lennard-Jones short-range potential gives better prediction than the Coulomb short-range potential.
Sensitivities were calculated over the top 5% of the predicted affinities of all overlapping peptides originating from one protein.
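One way to read the sensitivity measure is: rank all overlapping peptides of a protein by predicted affinity, keep the top 5%, and ask what fraction of the known binders landed in that set. The sketch below makes several assumptions not stated in the talk: higher score means higher predicted affinity, the cutoff is rounded to the nearest whole peptide with a minimum of one, and the peptide names and scores are toy placeholders.

```python
def top_fraction_sensitivity(scores: dict[str, float],
                             known_binders: list[str],
                             fraction: float = 0.05) -> float:
    """Fraction of known binders that fall in the top-scoring `fraction` of peptides."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Assumption: round the cutoff to the nearest peptide, keeping at least one.
    n_top = max(1, round(fraction * len(ranked)))
    top = set(ranked[:n_top])
    binders = set(known_binders)
    if not binders:
        raise ValueError("at least one known binder is required")
    return len(top & binders) / len(binders)


# Toy data: 40 hypothetical peptides whose score equals their index.
toy_scores = {f"pep{i:02d}": float(i) for i in range(40)}
toy_binders = ["pep39", "pep10"]  # one inside the top 5% (2 peptides), one outside

sensitivity = top_fraction_sensitivity(toy_scores, toy_binders)
print(sensitivity)
```

If raw interaction energies are used as the score, where more negative values indicate stronger binding, the sign would need to be flipped before ranking.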
How long to equilibrate the system?
Time/accuracy trade-off: 1 ns of simulation took 11 hours to compute.
MD-based Quantitative Matrices (MD-QMs)
Normalized position per position (QMnpp)
Normalized over all positions (QMnap)
Favourable amino acids have positive values; disfavoured amino acids have negative values.
External validation
Test set of 457 known binders to HLA-DP2 protein originating from 24 foreign proteins (Immune Epitope Database).
Score = X_p1 + X_p2 + X_p3 + X_p4 + X_p5 + X_p6 + X_p7 + X_p8 + X_p9

Overlapping nonamers in sequence order:
Peptide     Score
MGHRTYYKL   0.567
GHRTYYKLP   1.245
HRTYYKLPR   2.935
RTYYKLPRT
TYYKLPRTT   3.719
YYKLPRTTN   1.543
YKLPRTTNV   0.451
KLPRTTNVD   2.039

Ranked by score (top 5% taken):
Peptide     Score
TYYKLPRTT   3.719
HRTYYKLPR   2.935
KLPRTTNVD   2.039
YYKLPRTTN   1.543
GHRTYYKLP   1.245
MGHRTYYKL   0.567
YKLPRTTNV   0.451
RTYYKLPRT
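The additive score and the ranking step can be sketched as follows. The matrix values here are made up for illustration and are not the MD-derived QM from the talk, so the resulting scores differ from those on the slide; the function names and the dictionary layout (`qm[position][amino_acid]`) are likewise assumptions.

```python
def score_peptide(peptide: str, qm: dict[int, dict[str, float]]) -> float:
    """Sum the matrix entry for the residue found at each position p1..p9."""
    return sum(qm[pos].get(aa, 0.0) for pos, aa in enumerate(peptide, start=1))


def rank_nonamers(sequence: str, qm: dict[int, dict[str, float]]) -> list[tuple[str, float]]:
    """Score every overlapping nonamer and sort from highest to lowest score."""
    nonamers = [sequence[i:i + 9] for i in range(len(sequence) - 8)]
    scored = [(pep, score_peptide(pep, qm)) for pep in nonamers]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy matrix: made-up values; entries not listed count as 0.
toy_qm: dict[int, dict[str, float]] = {pos: {} for pos in range(1, 10)}
toy_qm[1]["T"] = 1.0
toy_qm[4]["K"] = 0.5
toy_qm[9]["T"] = 0.5

ranking = rank_nonamers("MGHRTYYKLPRTTNVD", toy_qm)
for peptide, score in ranking:
    print(peptide, score)
```

Applying a top 5% cutoff then amounts to keeping the first few entries of the sorted list for each protein.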
External validation
QMnap predicts better than QMnpp.
Influence of flanking residues
Original ligand: RK FHYLPFLPS TGGS (flanking positions p-2, p-1; core positions p1 to p9; flanking positions p+1, p+2)
13 positions x 19 amino acids + 1 original ligand = 248 ligands
External validation
Adding flanking-residue terms does not improve the predictive ability.
Addition of cross terms
Original ligand: RK FHYLPFLPS TGGS (core positions p1 to p9 = FHYLPFLPS)
Score = X_p1 + X_p2 + X_p3 + X_p4 + X_p5 + X_p6 + X_p7 + X_p8 + X_p9 + X_p1p2 + X_p2p3 + X_p3p4 + X_p4p5 + X_p5p6 + X_p6p7 + X_p7p8 + X_p8p9
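The extended score adds one term for each of the eight adjacent position pairs to the nine single-position terms. A minimal sketch, assuming single terms are keyed by (position, residue) and cross terms by (left position, left residue, right residue); this layout and the toy values are illustrative, and the talk does not say how the cross-term values were obtained.

```python
def score_with_cross_terms(peptide: str,
                           single: dict[tuple[int, str], float],
                           cross: dict[tuple[int, str, str], float]) -> float:
    """Nine single-position terms plus eight adjacent-pair terms; missing entries count as 0."""
    total = sum(single.get((pos, aa), 0.0)
                for pos, aa in enumerate(peptide, start=1))
    # Pair (p1, p2) is stored under position 1, (p2, p3) under position 2, and so on.
    total += sum(cross.get((pos, left, right), 0.0)
                 for pos, (left, right) in enumerate(zip(peptide, peptide[1:]), start=1))
    return total


# Toy values, made up for illustration.
toy_single = {(1, "F"): 1.0, (9, "S"): 0.5}
toy_cross = {(1, "F", "H"): 0.25}

score = score_with_cross_terms("FHYLPFLPS", toy_single, toy_cross)
print(score)
```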
External validation
Adding cross terms slightly improves the predictive ability.
Influence of anchor residues
Original ligand: RK FHYLPFLPS TGGS (anchor positions p1 = F, p4 = L, p6 = F, p7 = L, p9 = S)
5 positions x 19 amino acids + 1 original ligand = 96 ligands
External validation
The anchor-based QM is a better predictor than the all-position QM.
Anchor residues + cross terms
Original ligand: RK FHYLPFLPS TGGS (anchor positions p1, p4, p6, p7, p9)
Score = X_p1 + X_p4 + X_p6 + X_p7 + X_p9 + X_p1p4 + X_p4p6 + X_p6p7 + X_p7p9
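In this variant only the five anchor positions contribute, and the cross terms pair each anchor with the next one in sequence order. The sketch below assumes a key layout of (position, residue) for single terms and (left position, right position, left residue, right residue) for cross terms; the layout, function name and toy values are illustrative only.

```python
ANCHORS = (1, 4, 6, 7, 9)  # anchor positions named on the slide


def anchor_score(peptide: str,
                 single: dict[tuple[int, str], float],
                 cross: dict[tuple[int, int, str, str], float],
                 anchors: tuple[int, ...] = ANCHORS) -> float:
    """Single terms at anchor positions plus cross terms between consecutive anchors."""
    residue = {pos: peptide[pos - 1] for pos in anchors}
    total = sum(single.get((pos, residue[pos]), 0.0) for pos in anchors)
    for left, right in zip(anchors, anchors[1:]):  # (1,4), (4,6), (6,7), (7,9)
        total += cross.get((left, right, residue[left], residue[right]), 0.0)
    return total


# Toy values, made up for illustration.
toy_single = {(1, "F"): 1.0, (4, "L"): 0.5}
toy_cross = {(6, 7, "F", "L"): 0.25}

score = anchor_score("FHYLPFLPS", toy_single, toy_cross)
print(score)
```

Because non-anchor positions are ignored, a peptide that differs from the original only at p2, p3, p5 or p8 receives the same score.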
External validation
Combining anchor positions with cross terms improves the prediction.
Acknowledgements
Ivan Dimitrov, Mariyana Atanasova, Panaiot Garnev: Department of Chemistry, School of Pharmacy, Medical University of Sofia
Peicho Petkov: School of Physics, University of Sofia
Darren R. Flower: Aston University, Birmingham, UK
"All models are wrong but some are useful." George E. P. Box, 1987, Professor of Statistics, University of Wisconsin