Protein Properties
Function, structure
Residue features
Targeting
Post-translational modifications
BIO520 Bioinformatics, Jim Lund
Reading: Chapter , 11.7, 12.1, Ch 14

Protein function, structure
Homology to known proteins through protein search and alignment
Functional domains
–NCBI CDD (Conserved Domain Database)
–InterPro (PROSITE, Pfam, SMART, etc.)

Simple Analyses
Composition (number of each amino acid)
Isoelectric point
Extinction coefficient
Peptide cleavage
Repetitive regions
Available at ExPASy, BCM Search Launcher, and other sites.

CDD domains in Insulin-Like Growth Factor Receptor
Composition (Peptide)
COMPOSITION of: (Human glyceraldehyde-3-phosphate dehydrogenase) *.pep
Check: 5354 from: 1 to: 334
October 6, :18

A: 33   C:  3   D: 23   E: 16   F: 15
G: 34   H: 10   I: 21   K: 27   L: 20
M:  9   N: 13   P: 11   Q:  4   R: 10
S: 20   T: 21   V: 32   W:  3   Y:  9
Other: 0   Total: 334

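A composition table like the one above can be reproduced by counting residues. The following is a minimal sketch, not the program that produced the slide output; the function name `composition` and the short test sequence are made up for illustration.

```python
from collections import Counter

STANDARD_AA = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

def composition(seq):
    """Return (counts of the 20 standard residues, count of all other symbols)."""
    counts = Counter(seq.upper())
    standard = {aa: counts.get(aa, 0) for aa in STANDARD_AA}
    other = sum(n for aa, n in counts.items() if aa not in STANDARD_AA)
    return standard, other

# Toy 10-residue sequence, used only to show the output shape.
standard, other = composition("MGKVKVGVNG")
for aa in STANDARD_AA:
    print(f"{aa}: {standard[aa]}")
print(f"Other: {other}  Total: {sum(standard.values()) + other}")
```

Keeping "Other" as a separate count mirrors the table layout and makes ambiguity codes such as X or B visible instead of silently dropping them.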
Isoelectric Point (pI)
[Table: number of hydrogen ions bound, by pH. Columns: pH, Arg, Lys, His, Tyr, Cys, Glu, Asp, NH2, COOH, Total, Net Charge]

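The pI is the pH at which the summed charge of all ionizable groups is zero. The sketch below estimates it with the Henderson-Hasselbalch equation and a bisection search. The pKa values are one assumed set chosen for illustration; different tools use different values, so results will not match any particular program exactly.

```python
# Assumed pKa values for illustration only; published sets differ between tools.
PKA_POS = {"K": 10.8, "R": 12.5, "H": 6.5}           # groups that carry +1 when protonated
PKA_NEG = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}  # groups that carry -1 when deprotonated
PKA_NTERM, PKA_CTERM = 8.6, 3.6

def net_charge(seq, ph):
    """Net charge of the peptide at a given pH (Henderson-Hasselbalch)."""
    pos = 1.0 / (1.0 + 10 ** (ph - PKA_NTERM))   # N-terminal amino group
    neg = 1.0 / (1.0 + 10 ** (PKA_CTERM - ph))   # C-terminal carboxyl group
    for aa in seq.upper():
        if aa in PKA_POS:
            pos += 1.0 / (1.0 + 10 ** (ph - PKA_POS[aa]))
        elif aa in PKA_NEG:
            neg += 1.0 / (1.0 + 10 ** (PKA_NEG[aa] - ph))
    return pos - neg

def isoelectric_point(seq, tol=1e-4):
    """Bisect on pH 0-14; net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid   # still positive, so the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2

print(round(isoelectric_point("GGGG"), 2))  # only the two termini ionize
```

Bisection works here because every term in the charge sum falls as pH rises, so there is exactly one zero crossing.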
Extinction coefficient
[Table: values for the S-S and SH forms, 1 g/ml]
Used for determining protein concentration.
Based on the number of C, Y, and W residues.

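As a worked example, the extinction coefficient at 280 nm can be estimated from residue counts. The per-residue values used here (Trp 5500, Tyr 1490, cystine 125 M^-1 cm^-1) are an assumption not stated on the slide; they are the widely used values of Pace et al. The counts are those of the GAPDH composition shown earlier (W: 3, Y: 9, C: 3).

```python
def extinction_coefficient(n_trp, n_tyr, n_cys, disulfides=True):
    """Estimated molar extinction coefficient at 280 nm, in M^-1 cm^-1.

    Per-residue values are assumed (Trp 5500, Tyr 1490, cystine 125).
    """
    eps = 5500 * n_trp + 1490 * n_tyr
    if disulfides:
        # Assumes every possible pair of Cys residues forms a cystine (S-S).
        eps += 125 * (n_cys // 2)
    return eps

print(extinction_coefficient(3, 9, 3, disulfides=True))   # S-S form -> 30035
print(extinction_coefficient(3, 9, 3, disulfides=False))  # SH form  -> 29910
```

The two calls correspond to the S-S and SH forms: reduced cysteines contribute essentially nothing at 280 nm, so only the disulfide form adds the cystine term.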
PeptideMap, PeptideSort
Cleave polypeptide with proteases, reagents
Predict HPLC properties of peptide fragments

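In-silico cleavage amounts to scanning the sequence for a protease's recognition rule and splitting there. The sketch below uses trypsin as an example, under the commonly stated rule (an assumption here, not taken from the slide) that it cuts after K or R unless the next residue is P. It is not the PeptideMap or PeptideSort algorithm, and the test sequence is invented.

```python
def trypsin_digest(seq):
    """Split seq after each K or R, except when the following residue is P."""
    fragments, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])  # C-terminal fragment with no cut site
    return fragments

# Made-up sequence; the R followed by P near the end is not cleaved.
print(trypsin_digest("MGKVKVGVNGFGRIGRPLV"))
# -> ['MGK', 'VK', 'VGVNGFGR', 'IGRPLV']
```

Other proteases or chemical reagents would only change the test inside the loop.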
Protein Analyses
Secondary Structure Prediction
Hydrophobicity/membrane insertion
Antigenicity
Surface Accessibility
Baylor Search Launcher: predict.html
ExPASy:

Secondary Structure Prediction
Amino acid preferences
Local aa interactions
Non-local interactions
Homology/Multiple alignments

Secondary Structure Prediction
Predict H, E, L (Helix, β-strand, Loop)
~75% per-residue accuracy
Programs
–APSSP
–Jpred
–PHDsec
–SAM-T99
–PredictProtein
Web sites
–BCM Search Launcher
–ExPASy Tools

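Per-residue accuracy figures such as ~75% are usually reported as the fraction of residues whose predicted state (H, E, or L) matches the observed state, often called Q3. A minimal sketch of that calculation follows; the function name and the two state strings are invented for illustration.

```python
def q3_accuracy(predicted, observed):
    """Fraction of positions where the predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("state strings must have the same length")
    if not observed:
        raise ValueError("state strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

pred = "LHHHHHLLEEEELL"  # made-up prediction
obs = "LLHHHHHLEEEELL"   # made-up observed states
print(f"{q3_accuracy(pred, obs):.1%}")  # 12 of 14 positions agree
```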
Membrane Association
Hydropathy, hydrophobicity
–ΔG transfer (water-vapor)
–% sidechains buried (100% or 95%)
Identify membrane-associated regions
Identify membrane-spanning regions (>16 aa, ~80 Å helix)

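A hydropathy profile is a sliding-window average of per-residue hydropathy values along the sequence. The sketch below uses the Kyte-Doolittle scale; the window length of 9 is an arbitrary default chosen for illustration, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=9):
    """Mean Kyte-Doolittle score for each window of the sequence.

    Returns len(seq) - window + 1 values; value i covers residues i..i+window-1.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]  # KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]

# Made-up sequence: a hydrophobic stretch flanked by charged residues.
profile = hydropathy_profile("KKDEILVLLAIVGLLVAKRDE", window=9)
print([round(x, 2) for x in profile])
```

Longer windows smooth the profile and favor long hydrophobic stretches; shorter windows pick out local surface or buried regions.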
Hydrophobicity Membrane Topology
Original paper: Kyte-Doolittle
–J. Mol. Biol. (1982) 157:
Further refinements:
Hopp-Woods
–Proc Natl Acad Sci U S A (1981) 78:3824
Gunnar von Heijne
GES-scale (Goldman, Engelman, Steitz)
–Engelman et al, 1982

SOAP (Kyte-Doolittle) Plots
Membrane Topology
Hydrophobic helices
Dipoles aligned
Charges?
What is the overall structure?
External and internal domains?

Localization
Signal sequences direct proteins
–Usually N- or C-terminal signal sequences
Targeted to: membrane, secretion, nucleus, mitochondria, chloroplast, lysosome, peroxisome, periplasm
Program criteria: N-terminal motifs, aa composition, protein domains specific to locations, homology to proteins with experimentally determined locations
Accuracy varies with type of prediction, 50-80%
Programs:
–WoLF PSORT (eukaryotic), PSORTb (bacterial), PSORT (plant)
–SUBLOC
–TargetP

Surface Accessibility
Emini et al (1985) 55:836
Can also be calculated directly from a solved PDB structure via WWW
heidelberg.de/ASC/scr1-form.html

Antigenicity
Jameson and Wolf (CABIOS 4:181)
–Sums secondary structure indices, surface accessibility, backbone flexibility
–Many good epitopes are linear, surface loops
Used when “picking” antigenic peptides

Transmembrane segment prediction
Hydrophobicity
Membrane association/topology
Programs
–PHDhtm
–TopPred
–TMHMM

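The simplest hydrophobicity-based approach is to flag stretches of a windowed hydropathy profile that stay above a cutoff. The sketch below does only that and is not how PHDhtm, TopPred, or TMHMM work. The default cutoff of 1.6 is an assumption: it is the value commonly quoted for Kyte-Doolittle profiles computed with a 19-residue window. The input profile is a made-up list of numbers.

```python
def find_segments(profile, threshold=1.6, min_len=1):
    """Return (start, end) pairs, end exclusive, for runs with score >= threshold.

    `profile` is any list of windowed hydropathy scores; indices refer to
    window positions, not directly to residue numbers.
    """
    segments, start = [], None
    for i, score in enumerate(profile):
        if score >= threshold and start is None:
            start = i                      # a run begins
        elif score < threshold and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None                   # the run ends
    if start is not None and len(profile) - start >= min_len:
        segments.append((start, len(profile)))  # run reaches the end
    return segments

toy_profile = [0.2, 1.7, 2.1, 1.9, 0.5, -1.0, 1.8, 2.4]  # invented scores
print(find_segments(toy_profile))             # -> [(1, 4), (6, 8)]
print(find_segments(toy_profile, min_len=3))  # -> [(1, 4)]
```

A cutoff alone says nothing about topology (which side of the membrane each loop faces), which is why dedicated programs add further rules or statistical models.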