Published by August Bridges. Modified over 9 years ago.
1
BMMB597E Protein Evolution Protein classification 1
2
Protein families
The first protein structures determined by X-ray crystallography, myoglobin and haemoglobin, were solved (in 1959-60) before their amino acid sequences were determined.
It came as a surprise that the two structures were quite similar.
Soon it became clear, on the basis of both sequences and structures, that there were families of proteins.
3
[Figure: myoglobin and haemoglobin]
4
50 years earlier, there were some hints …
E.T. Reichert & A.P. Brown. The differentiation and specificity of corresponding proteins and other vital substances in relation to biological classification and organic evolution: the crystallography of hemoglobins. (Carnegie Institution of Washington, 1909)
Crystallography 3 years before the discovery of X-ray diffraction?
5
Reichert and Brown studied interfacial angles in haemoglobin crystals
Steno's law (1669): different crystals of the same substance may have different sizes and shapes, but the angles between corresponding faces are constant for each substance.
They found that the angles differed from species to species.
Similarities in the values of the interfacial angles were consistent with the classical taxonomic tree.
They even found differences between oxy- and deoxyhaemoglobin.
6
Most premature scientific result ever?
These results implied:
– That proteins adopted (or at least could adopt) unique structures, to form a crystal
– That protein structures varied between species
– That this variation paralleled the evolution of the species
– That proteins could change structure as a result of changes in state of ligation
In 1909!
7
M.O. Dayhoff
Pioneer of bioinformatics
– Collected protein sequences
– First curated ‘database’
– Recognized that proteins form families, on the basis of amino acid sequences
– Computational sequence alignments
– First evolutionary tree
– First amino-acid substitution matrix (PAM, later largely superseded by BLOSUM)
8
Can relationships among proteins be extended beyond families?
Families = sets of proteins with such obvious similarities that we assume that they are related.
One question: how much similarity do we need to believe in a relationship?
How far can evolution go?
Convergent evolution?
Cautionary tale: chymotrypsin / subtilisin
9
Chymotrypsin-subtilisin
Both are proteolytic enzymes:
– Chymotrypsin is mammalian
– Subtilisin is from B. subtilis
Both have catalytic triads.
Same function, same mechanism.
Sequences 12% similar (near noise level).
However, the structures show them to be unrelated.
10
[Figure: chymotrypsin / subtilisin]
11
[Figure: catalytic triad in serine proteinases]
12
Chymotrypsin and subtilisin have similar catalytic triads
13
How can we classify proteins that belong to families?
Align the sequences.
Calculate a phylogenetic tree (there are various ways to do this, which depend on the sequence alignment).
Usually, the phylogenetic tree of homologous proteins from different species follows the phylogenetic tree based on classical taxonomy.
That is reassuring.
But what happens as divergence proceeds?
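As a rough illustration of the step between alignment and tree, the sketch below computes percent identity and a simple p-distance for every pair of sequences in an alignment. The function names and the toy sequences are invented for this example (they are not real globin sequences), and real tree-building methods usually apply corrections that this sketch leaves out.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def distance_matrix(alignment):
    """p-distance (fraction of differing ungapped sites) for every pair."""
    return {
        (m, n): 1.0 - percent_identity(alignment[m], alignment[n]) / 100.0
        for m, n in combinations(alignment, 2)
    }


# Toy, made-up aligned fragments (assumption: for illustration only).
toy_alignment = {
    "species_A": "MKVLSAGEW-LVLH",
    "species_B": "MKVLSPGEWQLVLH",
    "species_C": "MRVLTPADKQNVKA",
}

distances = distance_matrix(toy_alignment)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

A distance table of this kind is the usual input to distance-based tree methods. In the toy data, species_A and species_B are much closer to each other than either is to species_C, so any reasonable method would join them first.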
14
How can we classify proteins that do not obviously belong to families?
Base this on structure rather than sequence.
Structural similarities are maintained better than sequence similarities as divergence proceeds.
For closely related proteins, expect no difference between sequence-based and structure-based classification.
How far can classification be extended?
15
SCOP
Structural Classification of Proteins
Idea of A.G. Murzin, based on old work by C. Chothia and M. Levitt.
Even if two proteins are not obviously homologous, they may share structural features, to a greater or lesser degree.
For instance, the secondary structures of some proteins are only α-helices.
Others have β-sheets but no α-helices.
16
SCOP
SCOP is a database that gives a hierarchical classification of all protein domains.
Recall that a domain is a compact subunit of a protein structure that ‘looks as if’ it would have independent stability.
[Figure: fragment of fibronectin]
17
Dissection of structure into domains
It is not always quite so obvious how to divide a protein into domains.
There is some (not a lot of) room for argument.
Note that sometimes the chain passes back and forth between domains.
In these cases one or both domains do not consist entirely of a consecutive set of residues.
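One way to picture a domain that is not a single consecutive stretch of the chain is to store each domain as a list of residue ranges. The sketch below does this; the residue numbers and the names `domains` and `is_contiguous` are hypothetical and do not describe any real protein.

```python
def is_contiguous(segments):
    """True if the segments join up into one consecutive run of residues."""
    ordered = sorted(segments)
    return all(
        prev_end + 1 == start
        for (_, prev_end), (start, _) in zip(ordered, ordered[1:])
    )


def domain_size(segments):
    """Number of residues covered by a list of inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in segments)


# Hypothetical two-domain chain: the chain starts in domain_1,
# crosses into domain_2, then returns to finish domain_1.
domains = {
    "domain_1": [(1, 90), (250, 330)],
    "domain_2": [(91, 249)],
}

for name, segments in domains.items():
    print(name, domain_size(segments), is_contiguous(segments))
```

Here domain_2 is one consecutive segment, while domain_1 is built from two separate stretches of the chain, which is the situation the slide describes.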
18
[Figure: lactoferrin]
19
SCOP, CATH, DALI Database classify protein structures
– SCOP (Structural Classification of Proteins)
– CATH (Class, Architecture, Topology, Homologous superfamily)
– DALI Database
These web sites have many useful features:
– information-retrieval engines, including search by keyword or sequence
– presentation of structure pictures
– links to other related sites, including bibliographical databases
20
SCOP
http://www.scop.mrc-lmb.cam.ac.uk
SCOP organizes protein structures in a hierarchy according to evolutionary origin and structural similarity.
Domains are extracted from the Protein Data Bank entries.
Sets of domains are grouped into families: sets of domains for which similarities in structure, function and sequence imply a common evolutionary origin.
21
The SCOP hierarchy
Families that share a common structure, or even a common structure and a common function, but lack adequate sequence similarity (so that the evidence for evolutionary relationship is suggestive but not compelling) are grouped into superfamilies.
Superfamilies that share a common folding topology, for at least a large central portion of the structure, are grouped as folds.
Finally, each fold group falls into one of the general classes.
22
Major classes in SCOP
α – secondary structure all helical
β – secondary structure all sheet
α/β – helices and sheets, containing β-α-β supersecondary structure
α+β – helices and sheets, but in different parts of the structure
‘small proteins’ – which often have little secondary structure and are held together by disulphide bridges or ligands (for instance, wheat-germ agglutinin)
23
Summary of SCOP hierarchy
Class → Fold → Superfamily → Family → Domain
24
SCOP classification of flavodoxin
Protein: Flavodoxin from Clostridium beijerinckii [TaxId: 1520]
Lineage:
Root: scop
Class: Alpha and beta proteins (a/b) [51349]; mainly parallel beta sheets (beta-alpha-beta units)
Fold: Flavodoxin-like [52171]; 3 layers, a/b/a; parallel beta-sheet of 5 strands, order 21345
Superfamily: Flavoproteins [52218]
Family: Flavodoxin-related [52219]; binds FMN
Protein: Flavodoxin [52220]
Species: Clostridium beijerinckii [TaxId: 1520] [52226]
PDB Entry Domains:
5nul: complexed with fmn; mutant; chain a [31191]
2fax: complexed with fmn; mutant; chain a [31194]
… many others
25
[Figure: Clostridium beijerinckii flavodoxin (stereo pair)]
26
[Figure: flavodoxin and NADPH-cytochrome P450 reductase; same superfamily, different family]
27
[Figure: flavodoxin and CHEY; same fold, different superfamily]
28
[Figure: flavodoxin and spinach ferredoxin reductase; same class, different folds]
29
Flavodoxin in the SCOP hierarchy
To give some idea of the nature of the similarities expressed by the different levels of the hierarchy:
Flavodoxin from Clostridium beijerinckii and NADPH-cytochrome P450 reductase are in the same superfamily, but different families.
Flavodoxin and the signal transduction protein CHEY are in the same fold category, but different superfamilies.
Flavodoxin and spinach ferredoxin reductase are in the same class (α/β) but have different folds.
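The three comparisons above can be expressed as a small lookup: walk down the hierarchy from class to family and report the deepest level at which two domains still share a label. This is only a sketch. The flavodoxin labels are taken from the SCOP lineage quoted earlier; every label marked "placeholder" is invented for illustration and is not the real SCOP name.

```python
LEVELS = ("class", "fold", "superfamily", "family")


def deepest_shared_level(a, b):
    """Return the deepest hierarchy level at which a and b agree, or None."""
    shared = None
    for level in LEVELS:
        if a[level] != b[level]:
            break
        shared = level
    return shared


flavodoxin = {
    "class": "a/b",
    "fold": "Flavodoxin-like",
    "superfamily": "Flavoproteins",
    "family": "Flavodoxin-related",
}
# Labels ending in "(placeholder)" are assumptions, not real SCOP names.
p450_reductase = dict(flavodoxin, family="P450 reductase family (placeholder)")
chey = dict(
    flavodoxin,
    superfamily="CHEY superfamily (placeholder)",
    family="CHEY family (placeholder)",
)
ferredoxin_reductase = {
    "class": "a/b",
    "fold": "ferredoxin reductase fold (placeholder)",
    "superfamily": "ferredoxin reductase superfamily (placeholder)",
    "family": "ferredoxin reductase family (placeholder)",
}

for name, other in [
    ("NADPH-cytochrome P450 reductase", p450_reductase),
    ("CHEY", chey),
    ("spinach ferredoxin reductase", ferredoxin_reductase),
]:
    print(name, "->", deepest_shared_level(flavodoxin, other))
```

The loop stops at the first level that differs, because agreement at a deeper level has no meaning in a hierarchy once a shallower level disagrees.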
30
CATH presents a classification scheme similar to that of SCOP
CATH = Class, Architecture, Topology, Homologous superfamily, the levels of its hierarchy.
In CATH, proteins with very similar structures, sequences and functions are grouped into sequence families.
A homologous superfamily contains proteins for which similarity of sequence and structure gives evidence of common ancestry.
A topology or fold family comprises sets of homologous superfamilies that share the spatial arrangement and connectivity of helices and strands.
Architectures are groups of proteins with similar arrangements of helices and sheets, but with different connectivity. For instance, different four-α-helix bundles with different connectivities would share the same architecture but not the same topology in CATH.
General classes of architectures in CATH are: α, β, α-β (subsuming the α/β and α+β classes of SCOP), and domains of low secondary structure content.
31
Do different classification schemes agree?
To classify protein structures (or any other set of objects) you need to be able to measure the similarities among them.
The measure of similarity induces a tree-like representation of the relationships.
CATH, SCOP, DALI and the others agree, for the most part, on what is similar, and the tree structures of their classifications are therefore also similar.
However, even an objective measure of similarity does not specify how to define the different levels of the hierarchy.
These are interpretative decisions, and any apparent differences in the names and distinctions between the levels disguise the underlying general agreement about what is similar and what is different.
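The point that one similarity measure can support different level definitions can be illustrated with a toy clustering. The sketch below groups items by single linkage at a chosen distance threshold. The distances, the item names and the two thresholds are all made up, and this is not the procedure used by SCOP, CATH or DALI.

```python
def clusters_at(distances, items, threshold):
    """Single-linkage groups: items joined by any chain of distances <= threshold."""
    parent = {x: x for x in items}

    def find(x):
        # Follow parent links to the group representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for x in items:
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())


# Made-up structural distances between four hypothetical domains.
items = ["P1", "P2", "P3", "P4"]
distances = {
    ("P1", "P2"): 0.1, ("P1", "P3"): 0.4, ("P1", "P4"): 0.9,
    ("P2", "P3"): 0.5, ("P2", "P4"): 0.8, ("P3", "P4"): 0.9,
}

strict = clusters_at(distances, items, 0.2)
loose = clusters_at(distances, items, 0.6)
print(strict)
print(loose)
```

Both groupings come from the same distances. Whether P3 belongs with P1 and P2 depends only on where the cut-off is placed, which is the kind of interpretative decision the slide refers to.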