Published by Felicity Hutchinson. Modified over 9 years ago.
1
A Global View of the Protein Structure Universe and Protein Evolution Sung-Hou Kim University of California, Berkeley, CA U.S.A. June 27, 2006
2
Topics
I. Global view of the protein structure universe
II. Mapping of protein functions on the structural universe
III. Global view of the evolution of proteins
3
J. Hou G. Sims I.-G. Choi S.-R. Jun C. Zhang
4
I. Mapping the Protein Structure Universe: Structural Demography
5
The Protein Universe
500 – 20,000 genes per organism
>13.6 × 10^6 species
>10^10 – 10^12 protein sequences
but...
~10^5 protein sequence families
~10^4 protein structure families
~10^3 protein fold domain families
6
“Mapping” by Metric Matrix Distance Geometry (Classical Multidimensional Scaling)
Pair-wise relational distances with “errors”
Most likely (consistent) global relational “mapping”
[Figure: four points x1, x2, x3, x4 linked by the six pair-wise distances d1,2, d1,3, d1,4, d2,3, d2,4, d3,4]
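The slide names the technique but not its formulas. As a reminder, the textbook form of classical multidimensional scaling is sketched below; the symbols (n items, dissimilarities d_ij, target dimension k) are notation assumed here, not taken from the talk.

```latex
\begin{align}
D^{(2)} &= \bigl[d_{ij}^{2}\bigr]_{i,j=1}^{n}, \\
J &= I_n - \tfrac{1}{n}\,\mathbf{1}\mathbf{1}^{\mathsf T}, \\
B &= -\tfrac{1}{2}\, J\, D^{(2)} J, \\
B &= V \Lambda V^{\mathsf T}, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n, \\
X_k &= V_k\, \Lambda_k^{1/2}.
\end{align}
```

Here B is the metric (Gram) matrix of inner products about the centroid, and the rows of X_k are the coordinates of the n items along the k leading eigenvectors. When the input distances carry “errors”, some eigenvalues may be negative; keeping only the largest positive ones gives the most consistent low-dimensional map in the least-squares sense.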
7
Method
1. Take all protein structures in PDB (>35,000)
2. Construct a non-redundant set at 25% sequence identity (~2000 structures)
3. Calculate all-to-all pair-wise structural similarities, then convert to dissimilarity scores
4. Apply metric matrix distance geometry to find the global position of each structure in N-dimensional space
5. 3-D plot to capture the major features of the protein structure space
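The last two steps above can be sketched in a few lines of NumPy. This is a minimal illustration of classical multidimensional scaling, not the authors' code: the function names `classical_mds` and `pairwise_distances` are hypothetical, and the input is a small synthetic distance matrix rather than real structural dissimilarity scores (the slides do not say how similarities were converted to dissimilarities).

```python
import numpy as np


def classical_mds(dissim, n_dims=3):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix.

    Returns (coords, eigvals): coords has one row per item, eigvals holds all
    eigenvalues of the metric matrix in descending order.
    """
    d = np.asarray(dissim, dtype=float)
    n = d.shape[0]
    # Centering matrix J = I - (1/n) 1 1^T
    j = np.eye(n) - np.ones((n, n)) / n
    # Metric (Gram) matrix B = -1/2 J D^2 J
    b = -0.5 * j @ (d ** 2) @ j
    # B is symmetric, so eigh applies; it returns ascending eigenvalues
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Noisy input can yield negative eigenvalues; clip before the square root
    top = np.clip(eigvals[:n_dims], 0.0, None)
    coords = eigvecs[:, :n_dims] * np.sqrt(top)
    return coords, eigvals


def pairwise_distances(points):
    """Euclidean distance between every pair of rows in points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Synthetic stand-in for the structural dissimilarity matrix (assumption):
# six random points in 3-D, so an exact 3-D embedding exists.
rng = np.random.default_rng(0)
true_points = rng.normal(size=(6, 3))
dissim = pairwise_distances(true_points)

coords, eigvals = classical_mds(dissim, n_dims=3)
```

Because the synthetic distances are exactly Euclidean, the recovered coordinates reproduce them up to rotation and reflection. With real structural dissimilarities the embedding is only approximate, and the three leading axes capture just the major features.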
8
Protein Structure Distance Matrix (~2000 structures with <25% sequence ID)
[Figure: square matrix with rows and columns P1, P2, P3, ..., P1898; entry D3,4 is the distance between P3 and P4]
9
Eigenvalues
Positional coordinates in 1898-dimensional space
Major feature extraction in 3 dimensions
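One way to judge how much a 3-dimensional view retains is the share of the positive eigenvalue sum carried by the three largest eigenvalues. The sketch below assumes that measure; the helper name `variance_captured` and the eigenvalues are made up for illustration and are not values from the talk.

```python
import numpy as np


def variance_captured(eigvals, n_dims=3):
    """Fraction of the positive eigenvalue sum held by the n_dims largest."""
    vals = np.asarray(eigvals, dtype=float)
    # Negative eigenvalues arise from non-Euclidean input and are ignored
    positive = np.sort(vals[vals > 0])[::-1]
    return positive[:n_dims].sum() / positive.sum()


# Made-up spectrum for illustration only
example_eigvals = [5.0, 3.0, 1.0, 0.5, 0.5, -0.1]
fraction = variance_captured(example_eigvals, n_dims=3)
```

A fraction close to 1 would mean the 3-D plot loses little; a small fraction would mean the plot shows only the dominant trends.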
10
The Protein Structure Universe (2005)
11
Four demographic regions of the protein structure universe
[Figure: example structures A1 to A5 marked on the map]
A1: (2ERL:_) MATING PHEROMONE ER-1
A2: (1ELW:B) TPR1-DOMAIN OF HOP
A3: (1A6M:_) MYOGLOBIN
A4: (1E85:A) CYTOCHROME C’
A5: (1M57:C) CYTOCHROME C OXIDASE
12
Four Protein Fold Classes: α, β, α/β, α+β
13
Major Features of the Protein Structural Space
1. Protein structural space is sparsely populated
2. Four elongated regions corresponding to four protein “fold” classes
3. Small to large size distribution along three of four “feature axes”
14
II. Mapping of Functions (1) Enzymatic functions
15
Molecular functions: basic chemistry (EC, Enzyme Commission classes)
16
EC3: Hydrolases
17
EC6: Ligases
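To colour points on the map by enzymatic function, each structure's EC number has to be reduced to its top-level class. A minimal sketch follows, assuming EC numbers arrive as strings such as "3.4.21.4"; the helper `ec_top_level` is hypothetical, and only the six classes in use at the time of the talk are listed.

```python
from collections import Counter

# Top-level Enzyme Commission classes 1 to 6
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_top_level(ec_number):
    """Return (class id, class name) for an EC number string."""
    text = ec_number.strip()
    # Accept an optional "EC" prefix, e.g. "EC 6.1.1.1"
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    class_id = int(text.split(".")[0])
    return class_id, EC_CLASSES.get(class_id, "Unknown")


# Illustrative annotations (assumption: one EC number per structure)
annotations = ["3.4.21.4", "EC 6.1.1.1", "3.1.1.7"]
class_counts = Counter(ec_top_level(ec)[1] for ec in annotations)
```

The resulting class name can then serve as the colour key for each point in the 3-D plot.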
18
II. Mapping of Functions (2) Metal Binding
19
Metal Binding
[Figure legend: Ca, Co, Cu, Fe, Mn, Mo, Ni, Zn, multi-bound, not bound]
20
Zn
21
Cu
22
Major Features of Functional Mapping
Maximum diversity in architectural preference for a given molecular function: “scaffold” selection vs. design
23
III. Evolution of Proteins (a) “Ages” of Protein Families
24
Method: “Common Structural Ancestor”
25
The “age” of the “common structural ancestor” (CSA) of a protein family
[Figure: axis label “Age” of CSA]
26
Ages of the Common Structural Ancestors
Population averaged
Chain length has a similar distribution
27
III. Evolution of Proteins (b) Protein Fold Classes
28
ML Relative “age” of common structural ancestors
29
III. Evolution of Proteins (c) Protein Families
30
Hypothesis: Multiple Origins of Protein Families
31
Summary
Mapping of protein structures: sparse except four highly populated demographic regions (structural selection)
Mapping of molecular functions: opportunistic use of structural features for molecular function (selection, not design)
Mapping of CSA ages: (1) evolution of protein fold classes; (2) “multiple origin model” for the evolution of protein families
32
Organismic evolution by natural selection for environment may be founded on molecular evolution by structural selection for function.