Correlated mutations

The phenomenon of several mutations occurring simultaneously and dependent on each other.

According to the current hypothesis of molecular positive Darwinian selection, correlated mutations are related to changes occurring in their neighborhood, reflect protein-protein interactions, and preserve the biological activity and structural properties of the molecule.
Four unrelated protein families have been studied for the occurrence and characteristics of correlated mutations:
Eglin-like proteinase inhibitor family (25 sequences)
Bowman-Birk proteinase inhibitor family (52 sequences)
Myoglobins (74 sequences)
Lysozymes (56 sequences)
The amino acids occurring at variable positions of the eglin and Bowman-Birk families

EGLIN-LIKE PROTEINS
Position  Residues        Position  Residues
8         KR              43        -EILV
9         -ELNQRST        44        -DL
10        EFMQRST         45        -ANS
11        FW              46        DGM
12        P               47        GQSTV
13        EHQ             48        AFHINPV
14        LV              49        -W
15        CILV            50        -AFIV
16        EG              51        -T
17        ACKLMSTV        52        AEKLMQT
18        DGPRST          53        DEN
19        AGITV           54        EFILY
20        ADEKLS          55        DKLNR
21        ADEFKLQVY       56        CFILPY
22        A               57        DEKNQ
23        AEKMRV          58        R
24        AEGKQT          59        IV
25        -U1IKTVY        60        FR
26        FIV             61        ILV
27        EKLQT           62        FLWY
28        AEKLQRT         63        DNVY
29        DEHQ            64        ADHNT
30        KMNRY           65        -DEIKLPRV
31        PSV             66        -AGLNRS
32        DEKLNQRS        67        -DGNT
33        AILVY           68        -DFIKLNSTY
34        DEKQRST         69        IV
35        -AINV           70        ANTV
36        -E              71        DKNQRSZ
37        -V              72        AHIMPTV
38        -EHIQY          73        -ASV
39        -FILTV          74        -P
40        -LMSV           75        -AHKQRSTV
41        -P              76        IV
42        -EHIQRS         77        AGT

BOWMAN-BIRK INHIBITORS
Position  Residues        Position  Residues
3         -DEKQST         35        ADET
4         -RSTVY          36        C
5         -KPST           37        DEKLNS
6         EGHKPSTW        38        ADEFGHKLRST
7         AEGKP           39        C
8         C               40        AEGILMV
9         C               41        CEKP
10        DNRS            42        ANRSTV
11        EFHIKLQRST      43        EFHKLRTVY
12        ACQ             44        DGS
13        -ADFIKLMPRSTV   45        DEFIMNQSY
14        C               46        -DPS
15        C               47        -AGPS
16        AKR             48        KLMPQR
17        S               49        CHR
18        DEFIKMNQR       50        FHIQRSVY
19        P               51        -I
20        AP              52        C
21        EFIKMQT         53        ABEFGLQTVY
22        C               54        DN
23        HQRSTV          55        -IMQTV
24        C               56        DHKNQTY
25        AEHMNQRSTV      57        -DHIKNRTV
26        DNQ             58        -FGY
27        -IKLMQTV        59        -CDI
28        GLRV            60        -HPTY
29        DEFIL           61        -ADEGKP
30        DEKNQRT         62        -AKPQS
31        -S              63        -MT
32        C               64        -C
33        AHPS            65        -DEHKNR
34        ADS             66        -DENPS
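A per-position residue list of this kind can be derived from a multiple alignment by collecting the distinct characters in each column. The sketch below is only an illustration under stated assumptions: the function name `variability_patterns`, the use of "-" for gaps, and the numbering offset are hypothetical and are not taken from the authors' software.

```python
def variability_patterns(alignment, first_position=1):
    """Return {position: residues} for every column of an alignment.

    `alignment` is a list of equal-length strings (hypothetical input format);
    "-" is assumed to denote a gap. Residues are sorted, so the gap symbol,
    if present, comes first, as in the listing above.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    patterns = {}
    for i in range(length):
        residues = {seq[i] for seq in alignment}  # distinct symbols in column i
        patterns[first_position + i] = "".join(sorted(residues))
    return patterns


# Toy alignment (invented data, not from the studied families)
toy = ["KEF", "R-F", "KLW"]
print(variability_patterns(toy, first_position=8))
```

The `first_position` argument only shifts the labels, which is convenient when the listed fragment does not start at residue 1.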
The position variability patterns of myoglobins and lysozymes
The observed number and contribution of three correlation types in four different protein families
The correlation sets consist of 2 to over 20 residues.

Columns (the correlation statistics): protein family (number of correlated positions/set); total number of correlation sets observed; number of dispersed sets; number of narrow clusters; number of undirected clusters; number of sets related to active center.

Eglin-like proteins (2-13)
Bowman-Birk proteinase inhibitors (2-28)
Myoglobins (2-29): n.a.
Lysozymes (2-15)
All families: 125 (100%) correlation sets observed; 59 (47.2%) dispersed sets; 38 (30.4%) narrow clusters; 28 (22.4%) undirected clusters; - sets related to active center
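The percentages in the "All families" row follow directly from the three counts and their total. The short check below uses only the figures given in that row; the variable names are illustrative.

```python
# Counts from the "All families" row
counts = {
    "dispersed sets": 59,
    "narrow clusters": 38,
    "undirected clusters": 28,
}

total = sum(counts.values())  # should equal the 125 sets reported

# Share of each correlation type, in percent, rounded to one decimal place
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {counts[name]} of {total} ({share}%)")
```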
Program FEEDBACK – what does it do?
The program FEEDBACK is designed to analyze multiple aligned protein sequences for the occurrence of correlated mutations. For each residue occurring at each position, it returns all residues that occur at every sequence position of the aligned proteins among the sequences carrying that residue. The results are visualized with the aid of MS Excel. The application is available as freeware upon request.
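The kind of analysis described here can be sketched in a few lines. This is not the FEEDBACK source code, which is not shown on the poster. It is a minimal illustration that assumes the alignment is a list of equal-length strings. The function names and the "pairwise disjoint" criterion used to flag a candidate correlated position are hypothetical choices made for this example.

```python
from collections import defaultdict
from itertools import combinations


def conditional_residues(alignment, ref_pos):
    """Group sequences by the residue at column ref_pos (0-based) and, for
    each group, collect the residues seen at every column.

    Returns {residue: (number_of_sequences, [set_of_residues_per_column])}.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    groups = defaultdict(list)
    for seq in alignment:
        groups[seq[ref_pos]].append(seq)

    result = {}
    for residue, seqs in groups.items():
        columns = [{seq[i] for seq in seqs} for i in range(length)]
        result[residue] = (len(seqs), columns)
    return result


def partitioned_positions(alignment, ref_pos):
    """Hypothetical criterion: report columns whose residue sets do not
    overlap between any two residues of the reference column."""
    table = conditional_residues(alignment, ref_pos)
    hits = []
    for i in range(len(alignment[0])):
        if i == ref_pos:
            continue
        sets = [columns[i] for _, columns in table.values()]
        if len(sets) > 1 and all(a.isdisjoint(b) for a, b in combinations(sets, 2)):
            hits.append(i)
    return hits


# Toy alignment (invented data)
toy = ["KSA", "KDA", "NPA", "NRC"]
print(conditional_residues(toy, 0))
print(partitioned_positions(toy, 0))
```

Running `conditional_residues` once per column gives a full table of the kind shown in the correlation slides, where each reference residue is listed with its sequence count and the residues that accompany it at other positions.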
The three types of distribution of correlated positions present in eglin-like proteins
The residue location and relative distribution are shown on the tertiary structure of eglin C (P01051).

The dispersed correlation
Correlation versus position [–DGNT]: D (8), G (9)
Position no. and occurring residues:
10 [–ELNQRST] ETLNQRS
The narrow correlation cluster
Correlation versus position [DEKQRST]: T (10), Q (6)
Position no. and occurring residues:
15 [CILV] CILV
17 [DGPRST] PRST
27 [EKLQT] EQKL
28 [AEKLQRT] KTEQR
30 [KMNRY] NKM
32 [DEKLNQRS] KLSDEN
56 [CFILPY] CIP
68 [–DFIKLNSTY] DFI–KNT
The spot correlation cluster
Correlation versus position [KMNRY]: K (6), N (15)
Position no. and occurring residues:
18 [DGPRST] SDGPRT
27 [EKLQT] LEQ
29 [DEHQ] DEQ
33 [AILVY] AILV
35 [–AINV] IV–AN
68 [–DFIKLNSTY] –NSDFIKLS Y
The three types of distribution of correlated positions present in the Bowman-Birk inhibitor family
The residue location and relative distribution are shown on the tertiary structure of Bowman-Birk inhibitor from soybean (P01055).

The dispersed correlation
Correlation versus position [DEFIL]: L (37), E (12)
Position no. and occurring residues:
6 [EGHKPSTW] EGKSTW
13 [–ADFIKLMPRSTV] –AFILPRTM
40 [AEGILMV] AILMVE
48 [KLMPQR] KLMQR
The narrow correlation cluster
Correlation versus position [–ADFIKLMPRSTV]: L (11), M (10), A (8)
Position no. and occurring residues:
4 [–RSTVY] V–SS
5 [–KPST] K–SS
7 [AEGKP] APP
11 [EFHIKLQRST] TEHQS
21 [EFIKMQT] TQEQ
The spot correlation cluster
Correlation versus position [AEHMNQRSTV]: A (15), V (9)
Position no. and occurring residues:
11 [EFHIKLQRST] EFKLRSHQ
23 [HQRSV] QR
50 [FHIQRSVY] HRSFI
The three types of distribution of correlated positions present in myoglobins
The residue location and relative distribution are shown on the tertiary structure of human myoglobin (P0244, pdb1bzp).

The dispersed correlation
Correlation versus position [AGPQST]: A (6), G (49), N (9)
Position no. and occurring residues:
128 [ABEHQ] QBEHQQ
137 [ILNSV] LLILNSV
The narrow correlation cluster
Correlation versus position [AEGST]: A (7), G (55), S (10)
Position no. and occurring residues:
22 [AEGPSTV] PSTAEGP STV P
26 [EGHKLQ] LQEGH QK Q
27 [ADEFLNT] AENADEF T E
30 [ILMTV] ILIMTVI
53 [ADEGQ] DEQADEGD
54 [ADEILQ] AELDELQE
59 [ADEF] DEADEFE
128 [ABEHQ] QABEH Q Q
The spot correlation cluster
Correlation versus position [AMSTV]: A (58), S (7)
Position no. and occurring residues:
27 [ADEFLNT] ADEFNTE
31 [GKRS] GKRSR
78 [AKLQ] KALQ
109 [DEGNT] DEGTE
116 [AEHKQST] AEHKQSA
117 [AEKNQS] AEKQSE
122 [BDEN] BDEND
The three types of distribution of correlated positions present in lysozymes
The residue location and relative distribution are shown on the tertiary structure of lysozyme from rat (P00697, pdb5lyz).

The dispersed correlation
Correlation versus position [GHKNR]: G (7), H (31), N (16)
Position no. and occurring residues:
30 [ILMV] MVILMVV
40 [DFKNR] DNNFKNR
The narrow correlation cluster
Correlation versus position [FL]: F (38), L (18)
Position no. and occurring residues:
26 [–ILMV] –ILMVL
33 [AISTV] AISTVA
44 [FIMRTVY] FIMRTVYT
54 [–KRSTY] –KRSTYT
84 [AKNRS] AKNRSS
The spot correlation cluster
Correlation versus position [–DWY]: W (16), Y (36)
Position no. and occurring residues:
13 [–ILM] MIL
15 [–AEKNQRS] RAEKNQ RS
20 [DEGKN] KNDG
27 [–AEGPR] GAEP
44 [FIMRTVY] TFIRTY
46 [GHKNPRTY] RGHNRY
105 [–AGHPQRV] GRV–APQ
109 [DGKNQRST] NGKRST
121 [–HKQRT] THKQR
CONCLUSIONS
Almost 50% of the observed correlated mutations involve residues that are neither in contact nor interacting with each other.
The dispersed correlations are present in various protein families and occur independently of the mechanism of structure stabilization.
The phenomenon of correlated mutations is not limited to interacting residues or to residues that determine a known biological activity.
The current hypothesis of positive Darwinian selection does not fully explain the mechanism and occurrence of correlated mutations.
Łukasz Becella 1, Monika Sobczyk 1, Jacek Leluk 1,2
1 Institute of Biochemistry and Molecular Biology, University of Wrocław
2 Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw
Monika Grabiec 1
Correlated mutations team
Similarity estimation team