Protein Structural Prediction. Protein Structure is Hierarchical.

1 Protein Structural Prediction

2 Protein Structure is Hierarchical

3 Structure Determines Function
  What determines structure?
    - Energy
    - Kinematics
  How can we determine structure?
    - Experimental methods
    - Computational predictions
  The Protein Folding Problem

4 Primary Structure: Sequence
  The primary structure of a protein is the amino acid sequence.

5 Primary Structure: Sequence
  Twenty different amino acids have distinct shapes and properties.

6 Primary Structure: Sequence
  A useful mnemonic for the hydrophobic amino acids is "FAMILY VW".

7 Secondary Structure: α, β, & loops
  α helices and β sheets are stabilized by hydrogen bonds between backbone oxygen and hydrogen atoms.

8 Secondary Structure: α helix

9 Secondary Structure: β sheet
  β sheet, β bulge

10 Second-and-a-half-ary Structure: Motifs
  beta helix, beta barrel, beta trefoil

11 Tertiary Structure: Domains

12 Mosaic Proteins

13 Tertiary Structure: A Protein Fold

14 Protein Folds
  Composed of α, β, other

15 Quaternary Structure: Multimeric Proteins or Functional Assemblies
  Multimeric Proteins
    - Hemoglobin: A tetramer
  Macromolecular Assemblies
    - Ribosome: Protein Synthesis
    - Replisome: DNA copying

16 Protein Folding
  The amino-acid sequence of a protein determines the 3D fold [Anfinsen et al., 1950s].
  Some exceptions:
    - All proteins can be denatured
    - Some proteins have multiple conformations
    - Some proteins get folding help from chaperones
  The function of a protein is determined by its 3D fold.
  Can we predict the 3D fold of a protein given its amino-acid sequence?

17 The Levinthal Paradox
  Given a small protein (100 aa), assume 3 possible conformations per peptide bond:
    3^100 ≈ 5 × 10^47 conformations
  The fastest motions take about 10^-15 sec, so sampling all conformations would take 5 × 10^32 sec.
  There are 60 × 60 × 24 × 365 = 31,536,000 seconds in a year, so sampling all conformations would take 1.6 × 10^25 years.
  Yet each protein folds quickly into a single stable native conformation: the Levinthal paradox.
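The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch using the slide's own assumptions (100 residues, 3 conformations per peptide bond, 10^-15 seconds per conformation); the variable names are illustrative.

```python
# Levinthal estimate, using the assumptions stated on the slide.
residues = 100              # a small protein
states_per_bond = 3         # assumed conformations per peptide bond
time_per_state_s = 1e-15    # fastest motions, in seconds

conformations = states_per_bond ** residues
total_seconds = conformations * time_per_state_s
seconds_per_year = 60 * 60 * 24 * 365
total_years = total_seconds / seconds_per_year

print(f"conformations ~ {conformations:.2e}")
print(f"seconds       ~ {total_seconds:.2e}")
print(f"years         ~ {total_years:.2e}")
```

The result is on the order of 10^25 years, which is the gap between exhaustive search and observed folding times that the paradox points at.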

18 Quick Overview of Energy
  Bond                          Strength (kcal/mole)
  H-bonds                       3-7
  Ionic bonds                   10
  Hydrophobic interactions      1-2
  van der Waals interactions    1
  Disulfide bridge              51

19 The Hydrophobic Effect
  Important for folding, because every amino acid participates!
  Experimentally Determined Hydrophobicity Levels
    Residue  Value     Residue  Value
    Trp       2.25     Thr       0.26
    Ile       1.80     His       0.13
    Phe       1.79     Gly       0.00
    Leu       1.70     Ser      -0.04
    Cys       1.54     Gln      -0.22
    Met       1.23     Asn      -0.60
    Val       1.22     Glu      -0.64
    Tyr       0.96     Asp      -0.77
    Pro       0.72     Lys      -0.99
    Ala       0.31     Arg      -1.01
  Fauchere and Pliska (1983). Eur. J. Med. Chem. 18, 369-75.
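One way to use a scale like this is a sliding-window average along a sequence. The sketch below is not from the slides: the function name, the window size, and the example peptide are all hypothetical. Only the numeric values come from the table above, keyed here by standard one-letter codes.

```python
# Hydrophobicity values transcribed from the slide's table,
# keyed by standard one-letter amino acid codes.
HYDROPHOBICITY = {
    "W": 2.25, "I": 1.80, "F": 1.79, "L": 1.70, "C": 1.54,
    "M": 1.23, "V": 1.22, "Y": 0.96, "P": 0.72, "A": 0.31,
    "T": 0.26, "H": 0.13, "G": 0.00, "S": -0.04, "Q": -0.22,
    "N": -0.60, "E": -0.64, "D": -0.77, "K": -0.99, "R": -1.01,
}

def window_hydrophobicity(sequence, window=5):
    """Mean hydrophobicity of each length-`window` stretch of the sequence."""
    scores = [HYDROPHOBICITY[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

# Hypothetical peptide, chosen only to illustrate the call.
profile = window_hydrophobicity("KKLLIVFWDDEE", window=4)
print([round(x, 2) for x in profile])
```

Windows with high averages mark stretches that would tend to be buried; the window length is a free parameter.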

20 Protein Structure Determination
  Experimental
    - X-ray crystallography
    - NMR spectroscopy
  Computational
    - Structure Prediction (The Holy Grail)
  Sequence implies structure, therefore in principle we can predict the structure from the sequence alone.

21 Protein Structure Prediction
  ab initio
    - Use just first principles: energy, geometry, and kinematics
  Homology
    - Find the best match to a database of sequences with known 3D structure
  Threading
  Meta-servers and other methods

22 Ab initio Prediction
  Sampling the global conformation space
    - Lattice models / Discrete-state models
    - Molecular Dynamics
    - Pre-set libraries of fragment 3D motifs
  Picking native conformations with an energy function
    - Solvation model: how protein interacts with water
    - Pair interactions between amino acids
  Predicting secondary structure
    - Local homology
    - Fragment libraries

23 Lattice String Folding
  HP model: main modeled force is hydrophobic attraction
    - NP-hard on both the 2-D square and 3-D cubic lattices
    - Constant-factor approximation algorithms exist
    - Not so relevant biologically
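To make the HP model concrete, here is a minimal sketch of its energy function on the 2-D square lattice. It assumes the usual convention that each pair of H residues that are lattice neighbours, but not consecutive in the chain, contributes -1. The function name and the example chain are hypothetical, and the hard part (searching over conformations) is not attempted.

```python
def hp_energy(sequence, coords):
    """Energy of a 2-D square-lattice HP conformation.

    `sequence` is a string of 'H' and 'P'; `coords` is a list of (x, y)
    tuples, one per residue. Each H-H pair on adjacent lattice sites
    that is not adjacent in the chain contributes -1.
    """
    if len(sequence) != len(coords):
        raise ValueError("one coordinate per residue is required")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # skip chain neighbours
            if sequence[i] == "H" and sequence[j] == "H":
                (x1, y1), (x2, y2) = coords[i], coords[j]
                if abs(x1 - x2) + abs(y1 - y2) == 1:
                    energy -= 1
    return energy

# Hypothetical 4-residue chain folded into a unit square:
# the two H residues at the ends touch, giving one contact.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_energy("HPPH", square))  # -1
```

Evaluating one conformation is cheap; the NP-hardness lies in finding the conformation that minimizes this energy.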

24 Lattice String Folding

25 ROSETTA
  http://www.bioinfo.rpi.edu/~bystrc/hmmstr/server.php
  http://depts.washington.edu/bakerpg/papers/Bonneau-ARBBS-v30-p173.pdf
  Monte Carlo based method
  Limit conformational search space by using sequence-structure motif I-Sites library (http://isites.bio.rpi.edu/Isites/)
    - 261 patterns in library
    - Certain positions in motif favor certain residues
  Remove all sequences with <25% identity
  Find structures of the 25 nearest sequence neighbors of each 9-mer
  Rationale
    - Local structures often fold independently of full protein
    - Can predict large areas of protein by matching sequence to I-Sites

26 I-Sites Examples
  Non-polar helix
    - Abundance of alanine at all positions
    - Non-polar side chains favored at positions 3, 6, 10 (methionine, leucine, isoleucine)
  Amphipathic helix
    - Non-polar side chains favored at positions 6, 9, 13, 16 (methionine, leucine, isoleucine)
    - Polar side chains favored at positions 1, 8, 11, 18 (glutamic acid, lysine)

27 ROSETTA Method
  - New structures are generated by swapping compatible fragments
  - Accepted structures are clustered based on energy and structural size
  - The best cluster is the one with the greatest number of conformations within 4 Å rms deviation of the structure at its center
  - Representative structures are taken from each of the best five clusters and returned to the user as predictions
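The cluster-selection step can be sketched as follows. This is not Rosetta's actual code: it assumes a precomputed symmetric matrix of pairwise rms deviations (in Å) between accepted structures, and uses a simple greedy scheme in which the largest cluster is repeatedly picked and removed. The function names and the toy matrix are hypothetical.

```python
def largest_cluster(rmsd, cutoff=4.0):
    """Return (center_index, member_indices) of the most populated cluster.

    `rmsd` is a symmetric list-of-lists of pairwise rms deviations in
    angstroms. The center is the conformation with the most neighbours
    (itself included) within `cutoff`.
    """
    best_center, best_members = None, []
    for i, row in enumerate(rmsd):
        members = [j for j, d in enumerate(row) if d <= cutoff]
        if len(members) > len(best_members):
            best_center, best_members = i, members
    return best_center, best_members

def top_clusters(rmsd, cutoff=4.0, n_clusters=5):
    """Greedily pick up to `n_clusters` cluster centers, largest first."""
    remaining = list(range(len(rmsd)))
    centers = []
    while remaining and len(centers) < n_clusters:
        # Restrict the matrix to structures not yet assigned to a cluster.
        sub = [[rmsd[i][j] for j in remaining] for i in remaining]
        center, members = largest_cluster(sub, cutoff)
        centers.append(remaining[center])
        assigned = set(members)
        remaining = [idx for k, idx in enumerate(remaining) if k not in assigned]
    return centers

# Toy matrix: structures 0-2 are mutually close, structure 3 is an outlier.
toy = [
    [0.0, 2.0, 3.0, 9.0],
    [2.0, 0.0, 2.5, 8.0],
    [3.0, 2.5, 0.0, 7.0],
    [9.0, 8.0, 7.0, 0.0],
]
print(top_clusters(toy))  # [0, 3]
```

Ties are broken by the lowest index here; a real pipeline would likely break ties by energy.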

28 Robetta & Rosetta

29

30 Rosetta results in CASP

31 Rosetta Results
  In CASP4, Rosetta’s best models ranged from 6–10 Å rmsd Cα
    - For comparison, good comparative models give 2-5 Å rmsd Cα
    - Most effective with small proteins (<100 residues) and structures with helices
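For reference, the Cα rmsd quoted above is the root-mean-square distance between matched Cα atoms of a model and the native structure. The sketch below assumes the two coordinate sets are already superimposed and matched residue by residue; real evaluations first find the optimal superposition (for example with the Kabsch algorithm), which is omitted here. The function name and coordinates are hypothetical.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) C-alpha coordinates, in the units of the input.

    Assumes the structures are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical two-residue example: the model is shifted 3 Å along z.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
native = [(0.0, 0.0, 3.0), (3.8, 0.0, 3.0)]
print(ca_rmsd(model, native))  # 3.0
```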

32 Only a few folds are found in nature

33 The SCOP Database
  Structural Classification Of Proteins
    - FAMILY: proteins that are >30% similar, or >15% similar and have similar known structure/function
    - SUPERFAMILY: proteins whose families have some sequence and function/structure similarity suggesting a common evolutionary origin
    - COMMON FOLD: superfamilies that have the same secondary structures in the same arrangement, probably resulting from physics and chemistry
    - CLASS: alpha, beta, alpha/beta, alpha+beta, multidomain

34 Status of Protein Databases
  SCOP: Structural Classification of Proteins. 1.67 release.
  24037 PDB Entries (15 May 2004). 65122 Domains.

  Class                                 Folds   Superfamilies   Families
  All alpha proteins                      202             342        550
  All beta proteins                       141             280        529
  Alpha and beta proteins (a/b)           130             213        593
  Alpha and beta proteins (a+b)           260             386        650
  Multi-domain proteins                    40              40         55
  Membrane and cell surface proteins       42              82         91
  Small proteins                           71             104        162
  Total                                   887            1447       2630

  (Figure labels: EMBL, PDB)

35 Evolution of Proteins – Domains
  - The number of members in different families obeys a power law
  - 429 families are common to all 14 eukaryotes; 80% of animal domains, 90% of fungi domains
  - 80% of proteins are multidomain in eukaryotes; domains usually combine pairwise in the same order. Why?
  - Evolution of proteins happens mainly through duplication, recombination, and divergence
  Chothia, Gough, Vogel, Teichmann, Science 300:1701-1703, 2003

36 Homology-based Prediction
  - Align query sequence with sequences of known structure, usually >30% similar
  - Superimpose the aligned sequence onto the structure template, according to the computed sequence alignment
  - Perform local refinement of the resulting structure in 3D
  90% of new structures submitted to PDB in the past three years have similar folds in PDB.
  The number of unique structural folds is small (possibly a few thousand).
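The ">30% similar" rule of thumb for choosing a template can be illustrated with a percent-identity check on an existing pairwise alignment. This sketch uses identity as a stand-in for similarity and counts only columns where neither sequence has a gap; both choices are assumptions, as are the function names and the example alignment. Computing the alignment itself is out of scope.

```python
def percent_identity(aligned_query, aligned_template, gap="-"):
    """Percent of aligned (gap-free) columns with identical residues."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = 0
    for q, t in zip(aligned_query.upper(), aligned_template.upper()):
        if q == gap or t == gap:
            continue  # skip columns with a gap in either sequence
        aligned += 1
        if q == t:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0

def usable_template(aligned_query, aligned_template, threshold=30.0):
    """True if the template clears the (assumed) identity threshold."""
    return percent_identity(aligned_query, aligned_template) > threshold

# Hypothetical alignment: 5 identical residues over 6 gap-free columns.
query = "ACDE-FG"
template = "ACDQKFG"
print(round(percent_identity(query, template), 1))  # 83.3
print(usable_template(query, template))             # True
```

Other denominators (alignment length, shorter sequence length) are also in common use and give different percentages.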

37 Examples of Fold Classes

38 Homology-based Prediction
  Raw model; loop modeling; side chain placement; refinement

39 Homology-based Prediction

