Transmembrane proteins in the Protein Data Bank: identification and classification. Gabor E. Tusnady, Zsuzsanna Dosztanyi and Istvan Simon. Bioinformatics


1 Transmembrane proteins in the Protein Data Bank: identification and classification. Gabor E. Tusnady, Zsuzsanna Dosztanyi and Istvan Simon. Bioinformatics, vol. 20, issue 17, pages 2964-2972

2 The PDB contains only a few hundred membrane proteins. Annotation of the entries is poor: the location of the lipid bilayer is not indicated in the PDB, because the proteins are crystallized without their natural lipid bilayer. There is currently no method to detect the membrane plane from atomic coordinates.

3 What they've done
1 - Located the most likely position of the lipid bilayer.
2 - Developed a geometrical approach to distinguish between transmembrane and globular proteins using only structural information (TMDET).
3 - Developed a database of all known transmembrane proteins and fragments, automatically updated (PDB_TM).

4 Approximately 300 of the more than 20,000 structure files in the PDB are membrane proteins.
- Transmembrane proteins are larger than globular proteins, so their structures are harder to determine by NMR.
- Crystallography of membrane proteins is still a challenge.
- Deposited structures carry no information about the protein in its native condition, immersed in a lipid bilayer.

5 Problems in distinguishing between globular and transmembrane proteins:
- The surface of globular proteins is not always entirely hydrophilic.
- Membrane-embedded parts of transmembrane proteins may contain polar and charged residues that play a role in ion transport or enzymatic activity.
- Many transmembrane protein structures have low resolution, which distorts the secondary structure and the hydrogen bond network.

6 Method
TMDET searches for the most probable position of the membrane planes relative to the given atomic coordinates. (Written in C; a typical run takes a few seconds on a Pentium 4 processor.)
- It measures the fitness of a membrane localization with an objective function that combines hydrophobicity and structure.
- Proteins are classified based on the best value of the objective function (the Q-value).

7 Algorithm
- Omit viral and pilus proteins and nucleotides.
- Construct the biological molecule.
- Calculate the water-accessible surface area.
- Calculate the objective function (hydrophobic factor and structure factor) to obtain the Q-value.

8 Construction of the BIOMOLECULE
- Crystallization artifacts are removed.
- Internal symmetry is analyzed, which may help find the membrane axis.
- If the Q-value obtained using the rotational axis exceeds a preset threshold, the rotational axis is taken as the membrane normal. Otherwise, non-identical chains are tested for the most likely position of the membrane.

9 Calculating the water-accessible surface area (Lee and Richards, 1971)
- Only atoms that can potentially interact with the lipid bilayer are considered (for algorithmic efficiency).
- The protein is cut into 1 Å wide slices along the axis defined in the previous step.
- Test points are placed around each slice of atoms.
- Atoms lying closest to any test point are defined as possibly membrane exposed.
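The slicing step can be illustrated with a short Python sketch. It only shows how atoms might be binned into 1 Å slices by projecting their coordinates onto the chosen axis; the function name `slice_atoms` and the toy coordinates are assumptions for illustration and are not taken from TMDET (which is written in C).

```python
import math
from collections import defaultdict

def slice_atoms(coords, axis, width=1.0):
    """Group atom indices into slices of `width` angstroms along `axis`.

    coords: list of (x, y, z) tuples; axis: any non-zero 3-vector.
    Hypothetical helper, not part of TMDET.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    unit = [a / norm for a in axis]
    slices = defaultdict(list)
    for idx, xyz in enumerate(coords):
        # Height of the atom along the axis (dot product with the unit vector).
        height = sum(c * u for c, u in zip(xyz, unit))
        slices[math.floor(height / width)].append(idx)
    return dict(slices)

# Toy coordinates (made up), sliced along the z axis.
atoms = [(0.0, 0.0, 0.2), (1.0, 0.0, 0.9), (0.0, 1.0, 1.5), (0.0, 0.0, -0.3)]
slices = slice_atoms(atoms, axis=(0.0, 0.0, 2.0))
```

The axis is normalised first, so its length does not affect which slice an atom falls into.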

10 Objective function
- Combines hydrophobicity and a structure factor.
- Measures the fitness of a given membrane position to the protein.
- Return value: the Q-value (the average over slices of the product of the hydrophobic factor and the structure factor).
Hydrophobic factor
- The relative hydrophobic membrane-exposed surface area.
- The protein is cut into 1 Å wide slices along the normal vector.
- For each slice, the membrane-exposed water-accessible surface areas of hydrophilic and hydrophobic residues are summed separately.
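As a rough sketch of how the Q-value could be computed from per-slice quantities: the code below assumes that "relative hydrophobic area" means hydrophobic area divided by the total (hydrophobic plus hydrophilic) exposed area of the slice. That reading, the function names, and the numbers are assumptions for illustration, not the published implementation.

```python
def hydrophobic_factor(hydrophobic_area, hydrophilic_area):
    """Fraction of the membrane-exposed area that is hydrophobic (assumed definition)."""
    total = hydrophobic_area + hydrophilic_area
    return hydrophobic_area / total if total > 0 else 0.0

def q_value(slices):
    """Average over slices of hydrophobic factor times structure factor.

    slices: list of (hydrophobic_area, hydrophilic_area, structure_factor).
    """
    if not slices:
        return 0.0
    products = [hydrophobic_factor(h, p) * s for h, p, s in slices]
    return sum(products) / len(products)

# Three made-up slices: areas in square angstroms, structure factor in [0, 1].
example = [(30.0, 10.0, 1.0), (20.0, 20.0, 0.5), (0.0, 40.0, 1.0)]
q = q_value(example)  # (0.75*1.0 + 0.5*0.5 + 0.0*1.0) / 3
```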

11 Structure factor
A product of three factors:
1 - Straightness factor: the relative frequency of "straight" residues in a slice. Residue i is straight if the projections of the C-alpha atoms of residues i-3, i and i+3 onto the normal vector of the membrane are in monotone decreasing or increasing order.
2 - Turn factor: one minus the relative frequency of residues in the slice that sit at the centre of a turn (the C-alpha projections of residues i-3, i and i+3 onto the normal vector are not in monotone decreasing or increasing order).
3 - End-chain factor: one minus the relative frequency of chain-end residues in a given slice.
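A literal Python reading of these three factors is sketched below. Treating residues that lack an i-3 or i+3 neighbour as neither straight nor turn is an assumption made here, as are the function names and the toy height profile; the slide does not specify these details.

```python
def is_monotone(a, b, c):
    """True if a, b, c are in strictly increasing or strictly decreasing order."""
    return a < b < c or a > b > c

def structure_factor(heights, slice_members, chain_ends):
    """Product of straightness, turn and end-chain factors for one slice.

    heights: C-alpha projection onto the membrane normal for every residue
    of the chain; slice_members: residue indices in this slice;
    chain_ends: set of residue indices that are chain termini.
    """
    n = len(slice_members)
    if n == 0:
        return 0.0
    straight = turn = ends = 0
    for i in slice_members:
        if i in chain_ends:
            ends += 1
        if i - 3 >= 0 and i + 3 < len(heights):
            if is_monotone(heights[i - 3], heights[i], heights[i + 3]):
                straight += 1
            else:
                turn += 1
    return (straight / n) * (1.0 - turn / n) * (1.0 - ends / n)

# Toy chain that rises for six residues and then turns back.
heights = [0, 1, 2, 3, 4, 5, 6, 5, 4, 3]
sf = structure_factor(heights, slice_members=[3, 4, 5, 6], chain_ends={0, 9})
```

In this toy slice, residues 3 and 4 are straight and residues 5 and 6 are turns, so the straightness and turn factors are both 0.5 and the end-chain factor is 1.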

12 Searching for the best membrane plane
Simplest case:
- Internal symmetry analysis yields a rotational axis.
- If the Q-value of a 15 Å slice along this axis exceeds a predefined threshold, the axis is accepted as the normal vector of the membrane.
Other cases:
- An exhaustive search is carried out: unit vectors are used to sample possible membrane normals.
- The best vector (highest Q-value) is found by calculating the Q-value of 15 Å slices along each unit vector.
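The exhaustive search can be sketched in Python as follows. The way unit vectors are sampled here (a golden-angle spiral over a hemisphere) is an assumption, since the slide does not say how the vectors are generated; `slice_scores` stands in for the per-slice product of hydrophobic and structure factors, and `toy_scores` is a made-up profile.

```python
import math

def hemisphere_normals(n):
    """n roughly evenly spread unit vectors with z > 0 (assumed sampling scheme)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    normals = []
    for k in range(n):
        z = (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        normals.append((r * math.cos(phi), r * math.sin(phi), z))
    return normals

def best_window(scores, width=15):
    """Highest mean score over `width` consecutive 1 Å slices, with its start index."""
    best_q, best_start = float("-inf"), None
    for start in range(len(scores) - width + 1):
        q = sum(scores[start:start + width]) / width
        if q > best_q:
            best_q, best_start = q, start
    return best_q, best_start

def search_membrane(slice_scores, normals, width=15):
    """Try every candidate normal and keep the one with the highest Q-value."""
    best = (float("-inf"), None, None)
    for normal in normals:
        q, start = best_window(slice_scores(normal), width)
        if q > best[0]:
            best = (q, normal, start)
    return best

def toy_scores(normal):
    # Made-up profile: slices 10..24 score higher the closer the normal is to z.
    return [normal[2] if 10 <= i < 25 else 0.1 for i in range(40)]

q, normal, start = search_membrane(toy_scores, hemisphere_normals(50))
```

Only a hemisphere is sampled because a normal and its negative describe the same membrane plane.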

13 Distribution of Q-values for transmembrane (solid line) and globular (dashed line) proteins

14 Results
- Classification of proteins was based on the Q-value: if the best Q-value was below a lower selection limit, the protein was classified as globular.
- Tested on 489 globular and 254 transmembrane proteins.
- Q-values were clearly separated, with only 1.8% overlap (9 proteins).
- TMDET accuracy: 98.7%.
- The entire PDB was eventually scanned with TMDET: of 22,178 proteins, 324 were classified as membrane proteins.
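The threshold rule and the accuracy figure can be illustrated with a minimal Python sketch. The selection limit of 0.40 and the scored samples are invented for illustration; the slides do not give the actual limit or individual Q-values.

```python
def classify(q, lower_limit):
    """Globular if the best Q-value is below the selection limit, else transmembrane."""
    return "transmembrane" if q >= lower_limit else "globular"

def accuracy(samples, lower_limit):
    """Fraction of (q, true_label) samples that the threshold classifies correctly."""
    correct = sum(1 for q, label in samples if classify(q, lower_limit) == label)
    return correct / len(samples)

# Made-up Q-values with true labels; two of them fall in the overlap region.
toy = [
    (0.05, "globular"), (0.12, "globular"), (0.20, "globular"), (0.41, "globular"),
    (0.38, "transmembrane"), (0.60, "transmembrane"),
    (0.75, "transmembrane"), (0.90, "transmembrane"),
]
acc = accuracy(toy, lower_limit=0.40)  # 6 of 8 correct
```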

15 Protein Data Bank of Transmembrane Proteins (PDB_TM)
- Classification of membrane proteins, localization of membrane planes, and localization of transmembrane segments in the sequence.
- Searchable and public for academic users (http://www.enzim.hu/PDB_TM).
- TM proteins are grouped into structural families (10 beta-barrel and 29 alpha-helical families).
- Helps the structural analysis of transmembrane proteins and the validation of transmembrane topology prediction algorithms.
- Automatically updated.
