CS177 Lecture 7 Computational Aspects of Protein Structure II Tom Madej 10.25.04.

1 CS177 Lecture 7 Computational Aspects of Protein Structure II Tom Madej 10.25.04

2 Research news (Nature 10.21.04) Another milestone for the Human Genome Project. –Fills in approx. 99% of the “gene rich” portion of the genome (10% more than the 2001 drafts). –Only 341 remaining gaps, formerly hundreds of thousands. –New estimate of the number of genes: 20,000-25,000. Megabase deletions result in viable mice! –Researchers deleted 1.5 Mb and 0.8 Mb portions of the mouse genome, non-coding regions, and the mice seem to be fine!

3 Nature Oct. 21, 2004, 931-945

4

5 Example for last homework I searched “Structure” with the term “Leukemia”. The first structure was 1uc6A. I noticed a couple of VAST neighbors with low percent sequence identity but very similar folds: 1uemA (17.4%) and 1uenA (13.7%). I ran PSI-BLAST with query sequence 1uc6A. The CD Search got a hit to “Fibronectin type 3”. 1uemA and 1uenA are also assigned to FN3, but for some reason 1uc6 is not (???). I got lucky: 1uemA and 1uenA were found by PSI-BLAST but did not cross the significance threshold prior to convergence!

6

7 Overview of lecture Protein structure –General principles –Structure hierarchy –Supersecondary structures –Superfolds and examples: TIM barrels, OB fold Protein structure comparison algorithms –VAST (Vector Alignment Search Tool) –CE (Combinatorial Extension) Protein fold classification databases –SCOP (Structural Classification of Proteins) –CATH (Class, Architecture, Topology, Homologous superfamily)

8 General principles Most protein structures are composed of two types of regular structural elements interconnected by less well-structured regions. Regular secondary structure elements (SSEs): α-helices and β-strands. Irregular regions: loops or coil. A pair of SSEs positioned next to each other in space may be parallel or anti-parallel.

9 General principles (cont.) Helices are stabilized by “internal” hydrogen bonds. Hydrogen bonds will form between an adjacent pair of strands. Strands will form larger structures such as β-sheets or β-barrels. Due to the residue side chains, there are favored packing angles between helices/helices, helices/sheets, and sheets/sheets.

10 Examples of protein architecture β-sheet with all pairs of strands parallel β-sheet with all pairs of strands anti-parallel Architecture refers to the arrangement and orientation of SSEs, but not to the connectivity.

11 Examples of protein topology Topology refers to the manner in which the SSEs are connected. Two β-sheets (all parallel) with different topologies.
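The distinction between architecture and topology can be made concrete with a small sketch. It assumes a hypothetical encoding that is not from the lecture: a β-sheet is a list of strands in spatial order across the sheet, each strand a pair (position in the sequence, direction as +1 or -1). The sketch ignores the fact that reading a sheet from the other edge gives an equivalent description.

```python
def architecture(sheet):
    """Relative orientation of each spatially adjacent strand pair."""
    return ["parallel" if a[1] == b[1] else "anti-parallel"
            for a, b in zip(sheet, sheet[1:])]

def topology(sheet):
    """Spatial position of each strand, listed in sequence order."""
    position = {seq_idx: pos for pos, (seq_idx, _) in enumerate(sheet)}
    return [position[i] for i in sorted(position)]

# Two hypothetical all-parallel four-stranded sheets (illustration only).
sheet_a = [(1, +1), (2, +1), (3, +1), (4, +1)]  # strands sit in sequence order
sheet_b = [(2, +1), (1, +1), (3, +1), (4, +1)]  # strands 1 and 2 swapped in space

same_architecture = architecture(sheet_a) == architecture(sheet_b)
same_topology = topology(sheet_a) == topology(sheet_b)
print(same_architecture, same_topology)
```

Both sheets have the same architecture (every adjacent pair is parallel), yet the chain visits the spatial positions in a different order, so the topologies differ.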

12 Exercise Take a look at 1r7sA in Cn3D. Draw a topology diagram showing the way the strands are connected.

13 Angles between SSEs in contact The data on the next 3 slides give the cosine of the angle between a pair of SSE vectors. The SSEs were required to be “in contact”, i.e. within 10 Å of each other. Note: The SSEs are not necessarily consecutive in the sequence!
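A minimal sketch of the quantity plotted on these slides, under stated assumptions: each SSE is represented by a hypothetical (start, end) pair of 3D points, and "in contact" is approximated here by the distance between segment midpoints, since the slides do not say how the 10 Å distance was measured. The coordinates are invented for illustration.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def sse_cosine(sse1, sse2):
    """Cosine of the angle between two SSE vectors given as (start, end)."""
    v1 = sub(sse1[1], sse1[0])
    v2 = sub(sse2[1], sse2[0])
    return dot(v1, v2) / (norm(v1) * norm(v2))

def midpoint(sse):
    return tuple((s + e) / 2.0 for s, e in zip(sse[0], sse[1]))

def in_contact(sse1, sse2, cutoff=10.0):
    """Assumed contact test: segment midpoints within `cutoff` angstroms."""
    return norm(sub(midpoint(sse1), midpoint(sse2))) <= cutoff

# Hypothetical coordinates in angstroms, for illustration only.
strand_a = ((0.0, 0.0, 0.0), (0.0, 0.0, 12.0))
strand_b = ((4.8, 0.0, 12.0), (4.8, 0.0, 0.0))  # neighbor running the other way
helix_c = ((30.0, 0.0, 0.0), (42.0, 0.0, 0.0))  # distant and perpendicular

for name, other in (("strand_b", strand_b), ("helix_c", helix_c)):
    print(name, sse_cosine(strand_a, other), in_contact(strand_a, other))
```

A cosine near +1 means the two SSEs are parallel, near -1 anti-parallel, and near 0 roughly perpendicular. Only pairs that pass the contact test would be counted.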

14

15

16

17 Examples of structures formed by β-strands Triosephosphate isomerase 7timA; Retinol binding protein 1rbp; Porin 1oh2P

18 Higher level organization A single protein may consist of multiple domains. Examples: 1liy A, 1bgc A. The domains may or may not perform different functions. Proteins may form higher-level assemblies. Useful for complicated biochemical processes that require several steps, e.g. processing/synthesis of a molecule. Example: 1l1o chains A, B, C.

19 Example: Replication Protein A E. Bochkareva et al. The EMBO Journal (2002) 21 1855-1863 RPA binds to ssDNA, is involved in recombination, replication, and repair. It is a heterotrimer, consisting of three subunit proteins that bind together. See structure 1l1o.

20 Supersecondary structures β-hairpin α-hairpin βαβ-unit β4 Greek key βα Greek key

21 Supersecondary structure: simple units G.M. Salem et al. J. Mol. Biol. (1999) 287 969-981

22 Supersecondary structure: Greek key motifs G.M. Salem et al. J. Mol. Biol. (1999) 287 969-981

23 Examples of β4 Greek key motif 1hk0 Human Gamma-D Crystallin; residues 32 thru 64 in domain 1. OB fold (we’ll see this fold later).

24 Examples of βα Greek key motif 1bgw Topoisomerase; residues 487 thru 540 in domain 5. 1ris Ribosomal protein S6.

25 Protein folds There is a continuum of similarity! Fold definition: two folds are similar if they have a similar arrangement of SSEs (architecture) and connectivity (topology). Sometimes a few SSEs may be missing. Fold classification: To get an idea of the variety of different folds, one must adjust for sequence redundancy and also try to correctly assign homologs that have low sequence identity (e.g. below 25%).
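The 25% figure refers to percent sequence identity. A minimal sketch of that calculation follows, assuming a gapped pairwise alignment is already available and that identity is counted over aligned (non-gap) columns; other denominators are in use, so this is only one convention. The sequences are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        aligned += 1
        if x == y:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0

# Hypothetical alignments, for illustration only.
close_pair = ("MKT-AYIAKQR", "MRTLAY-SKEG")
distant_pair = ("ACDEFGHK", "AYWVLMNP")

for a, b in (close_pair, distant_pair):
    pid = percent_identity(a, b)
    print(round(pid, 1), "low identity" if pid < 25.0 else "above threshold")
```

Pairs that fall below the threshold are the ones where sequence comparison alone is unreliable and structural comparison becomes useful for assigning homologs.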

26 Superfolds (Orengo, Jones, Thornton) Distribution of fold types is highly non-uniform. There are about 10 types of folds, the superfolds, to which about 30% of the other folds are similar. There are many examples of “isolated” fold types. Superfolds are characterized by wide sequence diversity and span a range of dissimilar functions. The evolutionary relationships of the superfolds are an open research question, i.e. do they arise by divergent or convergent evolution?

27 Superfolds and examples
Globin: 1hlm sea cucumber hemoglobin; 1cpcA phycocyanin; 1colA colicin
α-up-down: 2hmqA hemerythrin; 256bA cytochrome B562; 1lpe apolipoprotein E3
Trefoil: 1i1b interleukin-1β; 1aaiB ricin; 1tie erythrina trypsin inhibitor
TIM barrel: 1timA triosephosphate isomerase; 1ald aldolase; 5rubA rubisco
OB fold: 1quqA replication protein A 32kDa subunit; 1mjc major cold-shock protein; 1bcpD pertussis toxin S5 subunit
α/β doubly-wound: 5p21 Ras p21; 4fxn flavodoxin; 3chy CheY
Immunoglobulin: 2rhe Bence-Jones protein; 2cd4 CD4; 1ten tenascin
UB αβ roll: 1ubq ubiquitin; 1fxiA ferredoxin; 1pgx protein G
Jelly roll: 2stv tobacco necrosis virus; 1tnfA tumor necrosis factor; 2ltnA pea lectin
Plaitfold (Split αβ sandwich): 1aps acylphosphatase; 1fxd ferredoxin; 2hpr histidine-containing phosphocarrier

28 TIM barrels Classified into 21 families in the CATH database. Mostly enzymes, but participate in a diverse collection of different biochemical reactions. There are intriguing common features across the families, e.g. the active site is always located at the C-terminal end of the barrel.

29 N. Nagano et al. J. Mol. Biol. (2002) 321 741-785

30 TIM barrel evolutionary relationships (Nagano, Orengo, Thornton) Sequence analysis with advanced programs such as PSI-BLAST and IMPALA has identified further relationships among the families. Further interesting similarities are observed from careful comparison of structures, e.g. a phosphate binding site commonly formed by loops 7, 8 and a small helix. In summary, there is evidence for evolutionary relationships between 17 of the 21 families.

31 OB (oligonucleotide/oligosaccharide-binding) fold 5-stranded β-barrel with Greek key topology. All OB folds have the same binding face that is involved in their biochemistry.

32 V. Arcus Curr. Opinion Struct. Biol. (2002) 12 794-801

33 OB evolutionary relationships SCOP lists 9 superfamilies. Bacterial enterotoxin superfamily consists of two families, almost certainly evolutionarily related. Nucleic acid-binding superfamily has 11 families; if they are evolutionarily related, the ancestral protein would come from the LUCA (Last Universal Common Ancestor). Evidence for common ancestry of all OB folds is probably weaker than for TIM barrels.

34 Protein structure comparison How to compare 3D protein structures? Analogous computational considerations to sequence comparison, e.g. accuracy, efficiency for database searches, statistical significance of results, etc. Additional complication: working with atomic coordinates in 3D space!

35 Some protein structure comparison methods VAST (Vector Alignment Search Tool, NCBI) CE (Combinatorial Extension, RCSB/PDB) DALI (EBI)

36 VAST outline
1. Parse protein structures into SSEs (helices and strands).
2. Fit vectors to SSEs.
3. To compare a pair of proteins, attempt to superpose as many vectors as possible, subject to constraints.
4. Evaluate the vector alignment for statistical significance (compute an E-value).
5. If the vector alignment is significant, then proceed to a more detailed residue-to-residue alignment (“refined alignment”).
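Step 2 of the outline, fitting a vector to an SSE, can be illustrated with a rough sketch. This is not VAST's actual fitting procedure, which the slides do not describe. It assumes the SSE is given as a list of Cα coordinates and takes the axis direction from the centroid of the first half of the atoms to the centroid of the second half. The function name and the coordinates are hypothetical.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def fit_sse_vector(ca_coords):
    """Approximate an SSE by a (start, end) segment along its axis.

    Assumption: axis direction runs from the centroid of the first half of
    the C-alpha atoms to the centroid of the second half. The segment ends
    are the projections of the first and last atoms onto that axis.
    """
    if len(ca_coords) < 2:
        raise ValueError("need at least two C-alpha positions")
    half = len(ca_coords) // 2
    c1 = centroid(ca_coords[:half])
    c2 = centroid(ca_coords[-half:])
    axis = tuple(b - a for a, b in zip(c1, c2))
    length = math.sqrt(sum(x * x for x in axis))
    if length == 0.0:
        raise ValueError("degenerate SSE: cannot determine an axis")
    unit = tuple(x / length for x in axis)
    center = centroid(ca_coords)

    def project(point):
        # Signed distance of the point along the axis, measured from center.
        t = sum((p - c) * u for p, c, u in zip(point, center, unit))
        return tuple(c + t * u for c, u in zip(center, unit))

    return project(ca_coords[0]), project(ca_coords[-1])

# Hypothetical zigzag strand running along z (angstroms), illustration only.
strand = [(0.5 if i % 2 == 0 else -0.5, 0.0, 3.3 * i) for i in range(6)]
start, end = fit_sse_vector(strand)
print(start, end)
```

Reducing each SSE to one segment is what makes step 3 cheap: a protein with a dozen SSEs becomes a dozen vectors to superpose rather than hundreds of residues.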

37 3chy and 1ipfA: two proteins with vectors assigned to SSEs

38 VAST comparison of 3chy and 1ipfA: vector superposition; refined alignment

39

40 SCOP (Structural Classification of Proteins) http://scop.mrc-lmb.cam.ac.uk/scop/ Levels of the SCOP hierarchy: –Family: clear evolutionary relationship –Superfamily: probable common evolutionary origin –Fold: major structural similarity
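The nesting of these levels can be sketched as a lookup for the most specific level two domains share. The labels below are hypothetical placeholders, not real SCOP identifiers, and the dictionary layout is an assumption made for illustration.

```python
# Levels ordered from broadest to most specific, following the SCOP hierarchy.
LEVELS = ("fold", "superfamily", "family")

def closest_shared_level(a, b):
    """Most specific level at which two classified domains agree, or None."""
    shared = None
    for level in LEVELS:
        if a[level] != b[level]:
            break
        shared = level
    return shared

# Hypothetical classifications, for illustration only.
dom1 = {"fold": "fold-X", "superfamily": "sf-X1", "family": "fam-X1a"}
dom2 = {"fold": "fold-X", "superfamily": "sf-X1", "family": "fam-X1b"}
dom3 = {"fold": "fold-X", "superfamily": "sf-X2", "family": "fam-X2a"}
dom4 = {"fold": "fold-Y", "superfamily": "sf-Y1", "family": "fam-Y1a"}

print(closest_shared_level(dom1, dom2))  # probable common evolutionary origin
print(closest_shared_level(dom1, dom3))  # major structural similarity only
print(closest_shared_level(dom1, dom4))  # no shared level
```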

41

42 CATH (Class, Architecture, Topology, Homologous superfamily) http://www.biochem.ucl.ac.uk/bsm/cath/

43

