1
The Anatomy and Taxonomy of Protein Structure
First few lectures: how do we look at protein structures? how do we classify and compare them? Today, a little about the protein backbone or main chain.
2
Backbone geometry in proteins
Ramachandran plot (axes: φ, ψ). Angle ω is almost always close to 180°: the peptide bond is planar and trans. φ and ψ may vary but are limited to certain combinations, as shown at right. Yellow and blue delineate sterically allowed conformations. Red shows residues in helical secondary structure, cyan in beta-sheet, and black other. Squares indicate glycines.
3
Hydrogen bond geometry
A hydrogen bond is not really a covalent “bond”: there is not much orbital overlap. Model it as an electrostatic interaction between two dipoles, the H-N bond and the O sp2 lone pair. In electrostatic theory, the optimal orientation of two such dipoles is head-to-tail. The energy of such an arrangement should decrease as the head and tail are brought together, as long as atomic van der Waals radii are not violated (then repulsive forces quickly take over). The “ideal” hydrogen bond in this model would have r~3.0 Å, p=180°, b=0° and g=±60°. Convince yourself of this. In small-molecule crystals, this is approximately what is observed, though there is a lot of variation in the angles b and g. Thus the precise C=O…H angle parameters are not critical. Main chain-main chain hydrogen bonds found in proteins will show various deviations from this geometry, partly due to the topological constraints imposed by forming secondary structures.
4
Criteria for identifying hydrogen bonds in protein structures
What is a reasonable hydrogen bond? Criteria for identifying hydrogen bonds are somewhat arbitrary and many have been used. Here are a couple of examples.
Geometric criteria: Often H-bonds are just identified by two parameters, the O…N (acceptor-donor) distance r and the O…H-N angle p. The angles describing the C=O…H geometry are ignored. Typical cutoffs: p > 120° and r < 3.5 Å (Baker & Hubbard, 1984).
Electrostatic criteria: One of the most commonly used criteria is a potential function based on a pure electrostatic model (Kabsch & Sander, 1983). Place partial positive and negative charges on the C,O (+q1,-q1) and N,H (-q2,+q2) atoms and compute a binding energy as the sum of repulsive and attractive interactions between these four atoms:
$E = q_1 q_2 \left( \frac{1}{r_{ON}} + \frac{1}{r_{CH}} - \frac{1}{r_{OH}} - \frac{1}{r_{CN}} \right) f$
where $q_1 = 0.42e$ and $q_2 = 0.20e$, $f$ is a dimensional factor (= 332) to convert E to kcal/mol, and $r_{AB}$ is the interatomic distance between atoms A and B in Å. A hydrogen bond is then identified by a binding energy less than some arbitrary cutoff, e.g. E < -0.5 kcal/mol.
Note that the criteria defined above are only applicable when hydrogen atom positions are available. Most crystal structures do not include hydrogen positions; however, these can be computed in many cases.
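Both criteria on this slide can be tried out in a few lines. The following is a minimal sketch, not code from the lecture: the function names, the coordinate tuples, and the idealized linear test geometry are assumptions made for illustration; the charges, the factor 332, and the cutoffs are the ones quoted above.

```python
import math

# Constants quoted on the slide (Kabsch & Sander, 1983)
Q1, Q2, F = 0.42, 0.20, 332.0

def ks_energy(c, o, n, h):
    """Electrostatic H-bond energy (kcal/mol) for a C=O ... H-N group pair.
    Arguments are (x, y, z) coordinates in Angstroms."""
    return Q1 * Q2 * F * (
        1.0 / math.dist(o, n) + 1.0 / math.dist(c, h)
        - 1.0 / math.dist(o, h) - 1.0 / math.dist(c, n)
    )

def is_hbond_electrostatic(c, o, n, h, cutoff=-0.5):
    """Electrostatic criterion: energy below the (arbitrary) cutoff."""
    return ks_energy(c, o, n, h) < cutoff

def is_hbond_geometric(o, n, h, max_r=3.5, min_angle=120.0):
    """Geometric criterion: O...N distance r and O...H-N angle p (measured at H)."""
    r = math.dist(o, n)
    v1 = [a - b for a, b in zip(o, h)]  # vector H -> O
    v2 = [a - b for a, b in zip(n, h)]  # vector H -> N
    cos_p = sum(a * b for a, b in zip(v1, v2)) / (math.dist(o, h) * math.dist(n, h))
    p = math.degrees(math.acos(max(-1.0, min(1.0, cos_p))))
    return r < max_r and p > min_angle

# Hypothetical, perfectly linear C=O...H-N arrangement with r(ON) = 3.0 A
C, O, H, N = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0), (3.23, 0.0, 0.0), (4.23, 0.0, 0.0)
energy = ks_energy(C, O, N, H)
```

For this idealized geometry both tests accept the pair; real main chain hydrogen bonds deviate from linearity, so the two criteria will not always agree.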
5
Secondary Structure Identification
Next week we’ll learn about predicting the locations of secondary structures along the amino acid sequence of a protein from the sequence information alone. To evaluate whether such a prediction is correct, one has to be able to identify secondary structures from an experimentally determined set of protein coordinates: i.e. how do you define where a secondary structure element begins and ends? A “trivial but difficult” problem (Richardson, 1981). There is no single and correct algorithm for assigning secondary structure type. The most commonly used criteria are backbone conformation (phi, psi) and hydrogen bonding pattern. DSSP (Kabsch & Sander, 1983) and STRIDE (Frishman & Argos, 1995) are two of the more common programs, though there are many ways of defining secondary structure boundaries.
6
DSSP: turn and helix definitions
3-turn:
    >      3      3      <      notation
 -N-C-C--N-C-C--N-C-C--N-C-C-   residues
  H   O  H   O  H   O  H   O
      >----------------<        H-bond
4-turn:
    >      4      4      4      <      notation
 -N-C-C--N-C-C--N-C-C--N-C-C--N-C-C-   residues
  H   O  H   O  H   O  H   O  H   O
      >-----------------------<        H-bond
5-turn: just an elaboration of the 3- and 4-turn.
A minimal helix is two consecutive n-turns. For a minimal 4-helix from residue i to i+3, the 4-turns >444< starting at residues i-1 and i overlap to give >>44<<, which defines a helix HHHH from i to i+3. ‘H’ is the notation for a residue in a 4-helix. Notice that the helix does not include the residues involved in the terminal H-bonds. Longer helices are overlapping minimal helices.
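The turn and minimal-helix rules can be sketched in code. This is an illustration under stated assumptions, not the DSSP program itself: hydrogen bonds are represented as a set of (i, j) pairs meaning that the C=O of residue i is bonded to the N-H of residue j, residues are numbered from 0, and the function names are hypothetical.

```python
def n_turns(hbonds, n, length):
    """Residues i that start an n-turn, i.e. have an H-bond from C=O(i) to N-H(i+n)."""
    return [i for i in range(length - n) if (i, i + n) in hbonds]

def helix_residues(hbonds, n, length):
    """Residues in n-helices: two consecutive n-turns starting at i-1 and i
    put residues i .. i+n-1 in a helix; longer helices are unions of these."""
    turns = set(n_turns(hbonds, n, length))
    in_helix = set()
    for i in range(1, length):
        if i - 1 in turns and i in turns:
            in_helix.update(range(i, i + n))
    return sorted(in_helix)

# Two consecutive 4-turns in a six-residue fragment (the >>44<< pattern)
example_hbonds = {(0, 4), (1, 5)}
minimal_helix = helix_residues(example_hbonds, 4, 6)
```

With the two hydrogen bonds in `example_hbonds`, residues 1 to 4 are assigned to the helix, while residues 0 and 5, which take part in the terminal H-bonds, are left out.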
7
DSSP: bridge, ladder and sheet definitions
parallel bridge: ‘x’ notation. [Diagram: two strands running in the same direction, -N-C-C--N-C-C--N-C-C- above -N-C-C--N-C-C--N-C-C-, joined by diagonal H-bonds drawn as \ and / (or dotted).]
antiparallel bridge: ‘X’ notation. [Diagram: two strands running in opposite directions, -N-C-C--N-C-C--N-C-C- above -C-C-N--C-C-N--C-C-N-, joined by vertical H-bonds drawn as ! (or dotted).]
ladder = set of one or more consecutive bridges of identical type
sheet = set of one or more ladders connected by shared residues
8
STRIDE (2ndary STRucture IDEntification)
Uses what is known as a “knowledge-based” potential--we as a community of scientists know intuitively how to define secondary structures, we just can’t put our finger on it! So how do we quantify what we already know? Set of qualitative criteria--most common criteria used by crystallographers are backbone conformation and hydrogen bonding. “standard of truth”--collective wisdom of crystallographers--2ndary structure assignments made by crystallographers when they submitted structures to the Protein Databank. STRIDE makes potential energy functions for H-bonding and backbone conformation but leaves floating parameters which are adjusted to best reproduce crystallographers’ assignments.
9
Boundaries of a helix
[Figure: phi/psi plot with residues 2, 10, 11 and 12 labelled.] Is 10 in the helix? How about 11? How about 2?
10
Side chain conformation
• side chains differ in their number of degrees of conformational freedom (some don’t have any)
• but side chains of very different size can have the same number of chi angles.
11
Names of canonical side chain conformations
name of conformation t=trans, g=gauche IUPAC nomenclature:
12
Rotamers
a particular combination of angles χ1, χ2, etc. for a particular residue is known as a rotamer. for example, for aspartate, if one considers only the canonical staggered forms, there are nine (3^2) possible rotamers: g+g-, g+g+, g-g-, g-g+, tg+, g+t, tg-, g-t, tt. not all rotamers are equally likely. for example, valine prefers its t rotamer.
[Figure: distribution of valine rotamers in protein structures, χ1 axis marked at 0, 180 and 360 (from Ponder & Richards, 1987)]
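The count of canonical rotamers follows from three staggered states per chi angle. A minimal sketch that enumerates them (the function name and the state ordering are illustrative choices, not part of the lecture):

```python
from itertools import product

# The three canonical staggered states of a single chi angle
STATES = ("g+", "t", "g-")

def canonical_rotamers(n_chi):
    """All combinations of staggered states for a side chain with n_chi chi angles."""
    return ["".join(combo) for combo in product(STATES, repeat=n_chi)]

aspartate_like = canonical_rotamers(2)  # two chi angles give 3**2 combinations
```

The same function gives 3 canonical rotamers for a side chain with one chi angle and 81 for one with four, which is why rotamer counts grow quickly for long side chains.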
13
Rotamer libraries
one of the problems in designing and modelling/predicting protein structures is how to construct an appropriate group of rotamers to represent the possible side chain conformations observed in proteins without using so many as to make the problem computationally intractable. such groups of rotamers are known as rotamer libraries (Ponder & Richards, 1987). the probability of finding a particular rotamer is affected by what the backbone angles for that residue are (phi, psi). For instance, the g+ conformation is very rarely found in a helix. Thus, backbone-dependent rotamer libraries are also sometimes used. We’ll delve into this in more depth in about a week when we do homology modelling.
14
side chain rotamers are not limited to the canonical staggered forms: there are many subtly different rotamers (from Xiang & Honig, 2001). an “x degree rotamer” in this figure means that at least one side chain angle differs by x degrees.
15
Surface and interior of proteins
do proteins have a lot of holes/empty space inside? how much of a protein’s molecular surface is in contact with the surrounding solvent (water in the case of globular, soluble proteins)? are certain residues more likely to be in contact with solvent than others?
16
Calculating Solvent Accessible Surface Area
Lee & Richards, 1971; Shrake & Rupley, 1973. First, represent atoms as spheres with appropriate van der Waals radii and eliminate the overlapping parts of the spheres. This gives a space-filling model similar to the picture at right.
17
• Now roll a sphere of a given radius all around the van der Waals surface
• the sphere will not make contact with the entire van der Waals surface
• its center will trace out a continuous surface as it rolls
18
Now look at a cross-section:
Inner surfaces here are van der Waals surfaces. The outer surface is that traced out by the center of the sphere as it rolls around the van der Waals surface. If any part of the arc around a given atom is traced out, that atom is accessible to solvent. The solvent accessible surface of the atom is defined as the sum of the arcs traced around that atom. There’s not much solvent accessible surface in the middle.
[Figure labels: van der Waals surface; solvent accessible surface; arc traced around atom (from Lee & Richards, 1971)]
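In three dimensions the arcs become patches on a sphere of radius (van der Waals radius + probe radius) around each atom. One common way to estimate them numerically is point sampling in the spirit of Shrake & Rupley: scatter test points on each expanded sphere and count those not inside any neighbouring expanded sphere. The sketch below is a simplified illustration; the 1.4 Å probe radius, the number of points, the golden-spiral point placement, and the input format are all assumptions made here, and the loop over all atom pairs is only practical for small examples.

```python
import math

def sphere_points(n):
    """Roughly evenly spaced points on a unit sphere (golden-spiral construction)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(golden * k), r * math.sin(golden * k), z))
    return points

def sasa(atoms, probe=1.4, n_points=500):
    """Per-atom solvent accessible area in square Angstroms.
    atoms: list of ((x, y, z), vdw_radius) tuples."""
    unit = sphere_points(n_points)
    areas = []
    for i, (ci, ri) in enumerate(atoms):
        big_r = ri + probe  # surface traced by the probe centre
        exposed = 0
        for ux, uy, uz in unit:
            p = (ci[0] + big_r * ux, ci[1] + big_r * uy, ci[2] + big_r * uz)
            # A point is buried if the probe centre there would overlap another atom
            buried = any(
                j != i and math.dist(p, cj) < rj + probe
                for j, (cj, rj) in enumerate(atoms)
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas

isolated = sasa([((0.0, 0.0, 0.0), 1.8)])
bonded_pair = sasa([((0.0, 0.0, 0.0), 1.8), ((1.5, 0.0, 0.0), 1.8)])
```

An isolated atom keeps the full area of its expanded sphere, while each atom of the overlapping pair loses the part of its surface that the neighbour shields from the probe.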
19
Fractional accessibility
calculate total solvent accessible surface of protein structure (also can calculate solvent accessible surface for individual residues/sidechains within the protein) can also model the accessible surface area in an unfolded protein using accessible surface area calculations on model tripeptides such as Ala-X-Ala or Gly-X-Gly. from these we can calculate what fraction of the surface is buried (inaccessible to solvent) by virtue of being within the folded, native structure of the protein. this is done by dividing the accessible surface area in the native protein structure by the accessible surface in the modelled unfolded protein. That’s the fractional accessibility. The residue fractional accessibility and side chain fractional accessibility refer to the same thing calculated for individual residues/sidechains within the structure.
20
Accessible surface area in protein structures
accessible surface area $A_s$ in native states of proteins is a non-linear function of molecular weight (Miller, Janin, Lesk & Chothia, 1987): $A_s = 6.3\,M_r^{0.73}$, where $M_r$ is molecular weight. this is an empirical correlation but it comes close to the expected two-thirds power law relating surface area to volume or mass. Why is the exponent a little larger?
21
How much surface area is buried when a protein folds?
estimate accessible surface area in unfolded proteins using the accessible surface areas in Gly-X-Gly or Ala-X-Ala models. This is a linear function of molecular weight: $A_t = 1.48\,M_r + 21$. the total fractional accessibility is $A_s/A_t$, and the fraction of surface area buried is $1 - A_s/A_t$. what fraction of surface area is typically buried for a protein of molecular weight 5000 daltons? 30,000 daltons?
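The question can be answered by plugging numbers into the two empirical relations of Miller et al. (1987), native area 6.3·Mr^0.73 and unfolded area 1.48·Mr + 21. A small sketch, with function names chosen here for illustration:

```python
def native_asa(mr):
    """Accessible surface area of the folded protein, A_s = 6.3 * Mr**0.73."""
    return 6.3 * mr ** 0.73

def unfolded_asa(mr):
    """Accessible surface area of the modelled unfolded chain, A_t = 1.48 * Mr + 21."""
    return 1.48 * mr + 21.0

def fraction_buried(mr):
    """Fraction of surface buried on folding, 1 - A_s / A_t."""
    return 1.0 - native_asa(mr) / unfolded_asa(mr)

buried_5k = fraction_buried(5000)
buried_30k = fraction_buried(30000)
```

These relations give roughly 57% buried at 5000 daltons and roughly 74% at 30,000 daltons, so larger proteins bury a larger share of their surface.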
22
Distribution of residue fractional accessibilities
Note that a sizable group are completely buried (hatched) or nearly completely buried. Note the broad distribution among non-buried residues, with a mean accessibility for non-buried residues of around 0.5. Note that few residues are completely exposed to solvent, but that a fractional accessibility of >1 is possible (from Miller et al., 1987).
23
Buried residues in proteins
the fraction of buried residues (defined by 0% or 5% ASA cutoffs) increases as a function of molecular weight. for your average protein, around 25% of the residues will be buried. These form the core.
[Table: mean Mr and fraction of buried residues at 0% and 5% ASA cutoffs for size classes small, medium, large, XL and all; values not preserved in this transcript]
24
Core of 434 cro 8% accessibility cutoff
25
Residue fractional accessibility correlates with free energies of transfer for amino acids between water and organic solvents
(Miller, Janin, Lesk & Chothia, 1987; Fauchere & Pliska, 1983)
the interior of a protein is akin to a nonpolar solvent in which the nonpolar sidechains are buried. Polar sidechains, on the other hand, are usually on the surface.
26
Hydropathy scales
the correlation between a residue being polar or nonpolar and its tendency to be buried is a sequence-structure relationship. a number of such relationships can be seen from examining protein structures. As we will see next week, such relationships are useful in trying to predict protein structure from amino acid sequence. many scientists have tried to develop hydrophobicity or hydropathy scales to quantify the tendency of residues to be buried. Most such scales are based on partitioning of the amino acid between water and some nonpolar solvent, or between the surface and interior of proteins.
27
Kyte-Doolittle Hydropathy
[Figure: Kyte-Doolittle hydropathy values, with residues grouped as nonpolar, “on the bubble”, and polar/charged] (Kyte & Doolittle, 1982)
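A hydropathy scale is usually applied by averaging values over a sliding window along the sequence. The sketch below uses the published Kyte-Doolittle values; the window length of 9, the function name, and the example sequence are assumptions made for illustration, not details from the slide.

```python
# Kyte-Doolittle hydropathy values (one-letter amino acid codes)
KD = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}

def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of each full window along the sequence (empty if too short)."""
    scores = [KD[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

# Hypothetical sequence: a nonpolar stretch flanked by charged residues
profile = hydropathy_profile("KKDEKRLLIVAVLIAVLLKKDERK", window=9)
```

Windows centred on the nonpolar stretch score strongly positive and those over the charged ends score negative, which is the kind of sequence-structure signal these scales are meant to capture.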
28
Buried polar residues in proteins
while most of the protein interior is made up of nonpolar side chains, the average protein will have a few buried polar residues, even ones which are capable of carrying a formal charge, e.g. Lys, Arg, Glu, Asp. charged residues are almost always paired with other charged residues to make salt bridges, or hydrogen bonded to other polar groups. in general, a key rule of protein structure anatomy is that you rarely see buried hydrogen bond donors/acceptors not paired to other acceptors/donors.
[Figure labels: Arg10; buried salt bridge; hydrogen bond to main chain; Glu35; Arg5]
29
Cavities in proteins
protein interiors generally have high packing densities such that not much void space is present. nonetheless, proteins do sometimes have interior cavities big enough to fit water molecules.