Average: 86.5% Median: 88% Stdev: 9%


1 Average: 86.5% Median: 88% Stdev: 9%

2 Average: 89% Median: 91.5% Stdev: 8%

3 Ways to construct Protein Space
Construction of a high-dimensional sequence space, after Eigen et al. (1988). Each additional sequence position adds another dimension, doubling the diagram for the shorter sequence. Shown is the progression from a single sequence position (line) to a tetramer (hypercube). A four (or twenty) letter code can be accommodated either by allowing four (or twenty) values for each dimension (Rechenberg 1973; Casari et al. 1995), or through additional dimensions (Eigen and Winkler-Oswatitsch 1992).
Casari, G., C. Sander and A. Valencia (1995). "A method to predict functional residues in proteins." Nat Struct Biol 2(2): 171-8.
Eigen, M. and R. Winkler-Oswatitsch (1992). Steps Towards Life: A Perspective on Evolution. Oxford; New York, Oxford University Press.
Eigen, M., R. Winkler-Oswatitsch and A. Dress (1988). "Statistical geometry in sequence space: a method of quantitative comparative sequence analysis." Proc Natl Acad Sci U S A 85(16):
Rechenberg, I. (1973). Evolutionsstrategie; Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Stuttgart-Bad Cannstatt, Frommann-Holzboog.
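The construction on this slide can be sketched in a few lines: every sequence of a given length is a vertex, and two vertices are joined when they differ at exactly one position. The function names below are illustrative assumptions, not taken from the cited papers.

```python
from itertools import product

def sequence_space(alphabet, length):
    """Return every sequence of the given length as a tuple (the vertices)."""
    return list(product(alphabet, repeat=length))

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def edges(vertices):
    """Pairs of sequences that differ at exactly one position."""
    return [(a, b) for i, a in enumerate(vertices)
            for b in vertices[i + 1:] if hamming(a, b) == 1]

# Binary alphabet: line, square, cube, hypercube.
for n in range(1, 5):
    v = sequence_space("01", n)
    print(n, len(v), len(edges(v)))
```

For a binary alphabet each added position doubles the number of vertices, which is the "doubling the diagram" step described in the slide text.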

4 Diversion: From Multidimensional Sequence Space to Fractals

5 one symbol -> 1D coordinate of dimension = pattern length

6 Two symbols -> Dimension = length of pattern
length 1 = 1D:

7 Two symbols -> Dimension = length of pattern
length 2 = 2D: dimensions correspond to positions. For each dimension there are two possibilities. Note: here is a possible bifurcation: a larger alphabet could be represented as more choices along the axis of each position!

8 Two symbols -> Dimension = length of pattern
length 3 = 3D:

9 Two symbols -> Dimension = length of pattern
length 4 = 4D: aka Hypercube

10 Two symbols -> Dimension = length of pattern

11 Three Symbols (the other fork)

12 Four Symbols: i.e., with an alphabet of 4, we already have a hypercube (4D) at a pattern size of 2, provided we stick to a binary pattern in each dimension.

13 hypercubes at 2 and 4 alphabets
2 character alphabet, pattern size 4; 4 character alphabet, pattern size 2
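A minimal sketch of why both cases give the same 16 vertices, assuming each letter of a four-letter alphabet is written as two binary digits. The particular letter-to-bits assignment (`CODE`) is a hypothetical choice; any one-to-one assignment would do.

```python
from itertools import product

# Hypothetical 2-bit code per letter (an assumption for illustration).
CODE = {"A": "00", "C": "01", "G": "10", "T": "11"}

def to_binary(seq):
    """Rewrite a 4-letter-alphabet sequence as a binary string, 2 bits per letter."""
    return "".join(CODE[ch] for ch in seq)

dimers = ["".join(p) for p in product("ACGT", repeat=2)]
binary4 = {"".join(p) for p in product("01", repeat=4)}
mapped = {to_binary(d) for d in dimers}

# The 16 dimers land exactly on the 16 corners of the 4D binary hypercube.
print(mapped == binary4)
```

One consequence of this encoding: a single substitution such as A to T changes two binary digits, so neighbour relations in the binary hypercube are not identical to single-letter substitutions.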

14 Three Symbols Alphabet suggests fractal representation

15 3 fractal: enlarge, fill in; outer pattern repeats inner pattern
= self-similar = fractal

16 3 character alphabet 3 pattern fractal

17 3 character alphabet 4 pattern fractal
Conjecture: For n -> infinity, the fractal might fill a 2D triangle. Note: check Mandelbrot
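One way to experiment with the conjecture is a sketch like the following. It assumes a specific construction that the slides do not confirm: each symbol owns one corner of a triangle, and reading a sequence moves the current point a fixed fraction (`ratio`) of the way toward the corner of each symbol in turn. The alphabet `A`, `B`, `C`, the corner coordinates, and the start point are all hypothetical.

```python
from itertools import product

# Hypothetical corner assignment for a three-letter alphabet.
CORNERS = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.5, 1.0)}

def embed(seq, corners=CORNERS, ratio=0.5):
    """Map a sequence to a 2D point by stepping toward each symbol's corner."""
    x, y = 0.5, 0.5  # start inside the triangle
    for ch in seq:
        cx, cy = corners[ch]
        x += (cx - x) * ratio
        y += (cy - y) * ratio
    return (x, y)

# Every sequence of length n gets its own point: 3**n distinct points.
for n in range(1, 6):
    points = {embed(s) for s in product("ABC", repeat=n)}
    print(n, len(points))
```

Under this assumed construction with `ratio=0.5`, the points approach the Sierpinski triangle as n grows, which has fractal dimension log 3 / log 2 (about 1.585) and so does not fill the triangle. A different construction or ratio could behave differently, so the conjecture for the slides' own pattern remains open here.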

18 Same for 4 character alphabet
1 position; 2 positions; 3 positions

19 4 character alphabet continued (with cheating: I didn’t actually add beads)
4 positions

20 4 character alphabet continued (with cheating: I didn’t actually add beads)
5 positions

21 4 character alphabet continued (with cheating: I didn’t actually add beads)
6 positions

22 4 character alphabet continued (with cheating: I didn’t actually add beads)
7 positions
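For the four-letter case, a comparable sketch (again an assumed construction, not necessarily the one drawn on the slides) treats each letter as a choice of quadrant, so a sequence of n letters becomes an address on a 2^n by 2^n grid. The letter-to-quadrant table `QUADRANT` is hypothetical.

```python
from itertools import product

# Hypothetical assignment of letters to quadrants: (column bit, row bit).
QUADRANT = {"A": (0, 0), "C": (1, 0), "G": (0, 1), "T": (1, 1)}

def grid_cell(seq):
    """Return (col, row) of a sequence on a 2**n x 2**n grid, n = len(seq)."""
    col = row = 0
    for ch in seq:
        bx, by = QUADRANT[ch]
        col = (col << 1) | bx  # first letter picks the coarsest quadrant
        row = (row << 1) | by
    return col, row

# Occupied cells versus total cells for 1 to 7 positions.
for n in range(1, 8):
    cells = {grid_cell(p) for p in product("ACGT", repeat=n)}
    print(n, len(cells), (2 ** n) ** 2)
```

In this version every grid cell is occupied at every length, so the four-letter pattern fills the square rather than leaving gaps as the three-letter triangle does.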

23 Animated GIF: 1-12 positions

24 Protein Space in JalView

25 Alignment of V/F/A-ATPase ATP-binding subunits (SU; catalytic and non-catalytic SU)

26 UPGMA tree of V F A ATPase ATP binding SU with line dropped to partition (and colour) the 4 SU types (VA cat and non cat, F cat and non cat). Note that details of the tree

27 PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree
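To make the idea of projecting aligned sequences onto principal axes concrete, here is a toy sketch in plain Python. It is not Jalview's implementation: it assumes a simple one-hot encoding of alignment columns and finds only the first axis by power iteration. The four short sequences in the usage lines are invented for illustration.

```python
def one_hot(seq, alphabet):
    """Encode an aligned sequence as 0/1 values, one per (position, letter)."""
    return [1.0 if ch == a else 0.0 for ch in seq for a in alphabet]

def first_pc_scores(seqs, alphabet, iters=200):
    """Scores of each sequence on the first principal axis (up to scale and sign)."""
    rows = [one_hot(s, alphabet) for s in seqs]
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Gram matrix of the centred data; its top eigenvector gives the scores.
    G = [[sum(a * b for a, b in zip(X[i], X[k])) for k in range(n)]
         for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iters):
        w = [sum(G[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Invented toy alignment with two obvious groups.
scores = first_pc_scores(["AAAA", "AAAT", "TTTT", "TTTA"], "AT")
print(scores)
```

Sequences from the same group end up on the same side of zero along the first axis, which is the kind of separation the coloured PCA plots on these slides display for the subunit types.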

28 Same PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree, but turned slightly. (Giardia A SU selected in grey.)

29 Same PCA analysis of V F A ATPase ATP binding SU, using colours from the UPGMA tree, but replacing the 1st with the 5th axis. (Eukaryotic A SU selected in grey.)

30 Same PCA analysis of V F A ATPase ATP binding SU, using colours from the UPGMA tree, but replacing the 1st with the 6th axis. (Eukaryotic B SU selected in grey - forgot rice.)

31 Problems
Jalview’s approach requires an alignment: only homologous sequences can be depicted in the same space.
Solution: One could use pattern absence / presence as coordinates.
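The proposed solution can be sketched as follows, assuming that "pattern" means a short substring (k-mer) and that each possible k-mer contributes one 0/1 coordinate. The function name and the DNA alphabet in the usage lines are illustrative assumptions; for proteins the alphabet would have twenty letters and the vector 20^k entries.

```python
from itertools import product

def presence_vector(seq, alphabet, k):
    """One 0/1 coordinate per possible k-mer: 1 if it occurs in seq, else 0."""
    present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return [1 if "".join(p) in present else 0
            for p in product(alphabet, repeat=k)]

# Sequences of different lengths still map into the same 16-dimensional space.
short = presence_vector("ACGA", "ACGT", 2)
longer = presence_vector("ACGACGTT", "ACGT", 2)
print(len(short), len(longer), sum(short), sum(longer))
```

Because the coordinates depend only on which patterns occur, no alignment is needed and unrelated sequences can be placed in one space.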

