How to Raise the Dead: The Nuts & Bolts of Ancestral Sequence Reconstruction Jeffrey Boucher Theobald Laboratory.

1 How to Raise the Dead: The Nuts & Bolts of Ancestral Sequence Reconstruction Jeffrey Boucher Theobald Laboratory

2 Talk Outline All the Pretty Corals: Convergent or Divergent Evolution How’d They Do That?

3 Green Fluorescent Protein (GFP): 11-stranded β-barrel with the chromophore positioned in the center [Figure: GFP structure, labeled 45°]

4 GFP Chromophore - Structure & Synthesis (Wachter 2006) Chromophore residues: Ser 65, Tyr 66, Gly 67 Excitation: UV/blue Emission: green Autocatalysis begins upon folding: 1. Cyclization, followed by either 2. Dehydration and 3. Oxidation, or 2. Oxidation and 3. Dehydration

5 GFP Superfamily: GFP-like Proteins from Coral (Ugalde, Chang & Matz 2004) [Figure: phylogeny with tips dendRFP, clawGFP, mc5, G5 2, scubGFP1, mc2, R2, mc3, mc4, mc1, R1 2, G1 2, Kaede]

6 Colors of the Rainbow (Wachter 2006) 2 chemical reactions (oxidation, dehydration) vs. 3 chemical reactions (oxidation, dehydration, and a 2nd oxidation that extends the π-system) Was complexity gained or lost?

7 Convergent vs. Divergent Evolution Terrestrial Vertebrates Tree of Life (tolweb.org) - Slide courtesy of Kristine Mackin

8 [Figure: phylogeny of coral GFP-like proteins (Ugalde, Chang & Matz 2004); scale bar 0.1 substitutions/site; tips dendRFP, clawGFP, mc5, G5 2, scubGFP1, mc2, R2, mc3, mc4, mc1, R1 2, G1 2, Kaede; ancestral nodes labeled ALL, RED/GREEN, Pre-RED and RED]

9 [Figure: the same phylogeny (Ugalde, Chang & Matz 2004); scale bar 0.1 substitutions/site; ancestral nodes ALL, RED/GREEN, Pre-RED and RED annotated with wavelengths 495 nm, 505 nm, 518 nm and 578 nm]

10 [Figure: phylogeny with tips mc5, G5 2, scubGFP1, mc2, R2, mc3, mc4, mc1, R1 2, G1 2, Kaede (Ugalde, Chang & Matz 2004)]

11 How to Resurrect a Protein 1) Acquire/Align Sequences 2) Construct Phylogeny (from Chang et al. 2002) 3) Infer Ancestral Nodes

12 Acquiring Sequences BLAST (Basic Local Alignment Search Tool) How does the alignment compare to an alignment of random sequences? – An E-value of 1E-3 means about 0.001 hits this good are expected by chance, i.e. roughly a 1:1000 chance of such an alignment between random sequences
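The link between an E-value and a "1:1000 chance" can be made explicit with a short sketch. It assumes chance hits follow a Poisson distribution with mean E, so the probability of at least one chance hit is 1 - exp(-E); the function name is hypothetical and is not part of any BLAST tool.

```python
import math


def prob_at_least_one_chance_hit(e_value: float) -> float:
    """Probability of one or more chance hits, assuming Poisson-distributed hits with mean E."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


p = prob_at_least_one_chance_hit(1e-3)
print(f"E = 1e-3 -> P(chance hit) = {p:.6f}")
```

For small E the probability and the E-value are nearly identical, which is why the two are often read interchangeably.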

13 Homology vs. Identity Significant BLAST hits inform us about evolutionary relationships Homologous - share a common ancestor – Homology is a hypothesis, identity is calculated – Homology is binary, not a percentage – Homology does not ensure common function

14 Aligning Sequences [Figure: pairwise alignment of Orangutan and Chimpanzee sequences] Gap penalty of -8 (heuristically determined) Column scores: 4 - 8 + 5 + 4 + 0 + 6 + 2 + 4 + 6 + 5 + 4 + 0 + 3 + 4 - 8 + 7 + 1 + 1 = 40 Multiple sequences are aligned by a similar method: ClustalW; MUSCLE (MUltiple Sequence Comparison by Log-Expectation), faster than ClustalW but with similar methods
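As a minimal sketch of how such a score is tallied: each aligned column contributes a substitution score, and each gap contributes the gap penalty of -8. The per-column scores below are the ones on the slide; the tiny substitution table and the two short sequences are hypothetical toy values, not the slide's sequences.

```python
GAP_PENALTY = -8  # value given on the slide

# Per-column scores as listed on the slide (the two -8 entries are gaps).
slide_column_scores = [4, -8, 5, 4, 0, 6, 2, 4, 6, 5, 4, 0, 3, 4, -8, 7, 1, 1]
slide_total = sum(slide_column_scores)

# Hypothetical toy substitution scores, keyed by the sorted residue pair.
TOY_SCORES = {
    ("A", "A"): 4, ("G", "G"): 6, ("A", "G"): 0,
    ("L", "L"): 4, ("I", "I"): 4, ("I", "L"): 2,
}


def score_alignment(seq_a, seq_b, scores=TOY_SCORES, gap_penalty=GAP_PENALTY):
    """Sum substitution scores column by column; '-' marks a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            total += gap_penalty
        else:
            total += scores[tuple(sorted((a, b)))]
    return total


toy_total = score_alignment("AG-LI", "AGALL")
print(slide_total, toy_total)
```

An aligner searches over possible gap placements for the alignment with the highest total; this sketch only scores one alignment that is already given.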

15 How to Resurrect a Protein 1) Acquire/Align Sequences 2) Construct Phylogeny (from Chang et al. 2002) 3) Infer Ancestral Nodes

16 Visual Depiction of Alignment Scores Suppose alignment of 3 sequences: Orangutan (O), Chimpanzee (C), Mouse (M) [Table: pairwise alignment scores among O, C and M, with entries 40, 19 and 18] [Figure: Neighbor-Joining tree of M, O and C]
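Neighbor-joining works on pairwise distances rather than raw alignment scores, so scores like those above would first have to be converted. The sketch below shows only the first joining step, using the standard criterion Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k) and picking the pair with the smallest Q. The five taxa and their distances are hypothetical and unrelated to the slide's data.

```python
from itertools import combinations


def first_neighbor_join(names, dist):
    """Return the pair of taxa joined first by neighbor-joining, with its Q value."""
    n = len(names)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i, j in combinations(range(n), 2):
        q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        if best_q is None or q < best_q:
            best_pair, best_q = (names[i], names[j]), q
    return best_pair, best_q


# Hypothetical symmetric distance matrix for five made-up taxa.
names = ["A", "B", "C", "D", "E"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

pair, q_value = first_neighbor_join(names, dist)
print(pair, q_value)
```

Note that D and E have the smallest raw distance, yet A and B are joined first: the Q criterion corrects for how far each taxon is from everything else. A full implementation would replace the joined pair with a new node, recompute distances, and repeat.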

17 Phylogenetic Programs PHYLIP (PHYLogeny Inference Package) PAUP (Phylogenetic Analysis Using Parsimony) – Now incorporates Maximum Likelihood PhyML (Phylogenetic Maximum Likelihood) MrBayes BAli-Phy (Bayesian Alignment and Phylogeny estimation)

18 Maximum Likelihood (ML) Likelihood: – A measure of how surprised we should be by the data: the higher the likelihood, the less surprising the data – Maximizing the likelihood minimizes your surprise Example: – Roll a 20-sided die 9 times and get a 20 every time Likelihood = Probability(Data|Model)

19 Maximum Likelihood (ML) Fair Die Model: – 5% chance of rolling a 20 – Likelihood = (0.05)^9 ≈ 2E-12 Trick Die Model: – 100% chance of rolling a 20 – Likelihood = (1)^9 = 1 Likelihood = Probability(Data|Model) Assuming the trick model maximizes the likelihood
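The die comparison can be checked numerically. This sketch assumes the data are nine rolls that all came up 20, which is what the two likelihood expressions imply; the function and variable names are illustrative. Evaluating (0.05)^9 gives about 2E-12.

```python
def likelihood_all_twenties(p_twenty: float, n_rolls: int) -> float:
    """Probability(Data | Model) when every one of n_rolls independent rolls is a 20."""
    return p_twenty ** n_rolls


N_ROLLS = 9
likelihoods = {
    "fair die": likelihood_all_twenties(0.05, N_ROLLS),   # 1 in 20 chance per roll
    "trick die": likelihood_all_twenties(1.0, N_ROLLS),   # always rolls a 20
}

# The maximum-likelihood choice is the model under which the data are least surprising.
best_model = max(likelihoods, key=likelihoods.get)

for name, value in likelihoods.items():
    print(f"{name}: {value:.3e}")
print("ML model:", best_model)
```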

20 From Dice to Trees – Data - Sequences/Alignment – Model - Tree topology, Branch lengths & Model of evolution Likelihood = Probability(Data|Model) Starting Tree, then New Tree: the change either raises or lowers the likelihood Choose the model that maximizes the likelihood

21 How to Resurrect a Protein 1) Acquire/Align Sequences 2) Construct Phylogeny (from Chang et al. 2002) 3) Infer Ancestral Nodes

22 Reconstruction Methods Consensus Parsimony Maximum Likelihood

23 Consensus Advantage: Easy & fast Disadvantage: Ignores phylogenetic relationships
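A consensus reconstruction can be sketched in a few lines: take the most common residue in each alignment column. The three aligned sequences are hypothetical, and ties are broken by whichever residue appears first, which is an arbitrary choice made for this sketch.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Return the most common residue at each column of equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned_seqs):
        # most_common(1) returns the first-seen residue when counts are tied.
        residue, _count = Counter(column).most_common(1)[0]
        result.append(residue)
    return "".join(result)


# Hypothetical toy alignment.
toy_alignment = ["VIL", "VLL", "IIL"]
toy_consensus = consensus(toy_alignment)
print(toy_consensus)
```

Nothing in this procedure looks at the tree, which is the disadvantage named on the slide: a residue shared by many closely related sequences outvotes one that is spread across distant lineages.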

24 Parsimony Parsimony Principle – Best-supported evolutionary inference requires fewest changes – Assumes conservation as model Advantage over consensus: – Takes phylogenetic relationships into account

25 Parsimony [Figure: trees relating taxa A through H]

26 Parsimony (example adapted from David Hillis) [Figure: tree with tip states V, I and L; internal nodes carry candidate state sets {V}, {L}, {V, I} and {V, I, L}; Changes = 4]
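State sets like {V, I} at internal nodes are what the bottom-up pass of Fitch's small-parsimony algorithm produces: a node takes the intersection of its children's sets when that is non-empty, and otherwise takes the union and counts one change. The sketch below assumes a rooted binary tree written as nested tuples; the topology is hypothetical and is not the tree from the slide, so its change count differs.

```python
def fitch(node):
    """Bottom-up Fitch pass. Returns (candidate state set, minimum number of changes)."""
    if isinstance(node, str):          # a leaf holds its observed residue
        return {node}, 0
    left, right = node
    left_states, left_changes = fitch(left)
    right_states, right_changes = fitch(right)
    shared = left_states & right_states
    if shared:                         # children agree on at least one state: no new change
        return shared, left_changes + right_changes
    # Children disagree: keep all candidates and count one substitution.
    return left_states | right_states, left_changes + right_changes + 1


# Hypothetical topology with tip residues V, V, V, I, L, L, I, L.
toy_tree = ((("V", "V"), ("V", "I")), (("L", "L"), ("I", "L")))
root_states, n_changes = fitch(toy_tree)
print(sorted(root_states), n_changes)
```

A root set with more than one member is an ambiguous reconstruction: parsimony alone cannot say which residue the ancestor had.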

27 Parsimony - Alternate Reconstructions Resolve ambiguous reconstructions Is conservation the best model?

28 ML Improvements Over Parsimony PAML (Phylogenetic Analysis by Maximum Likelihood) Includes evolutionary process & branch lengths – Reduction in ambiguous sites Fit of model included in calculation – Removes a priori choices – Use more complex models (when applicable) Confidence in reconstruction – Posterior probabilities

29 Ugalde, Chang & Matz 2004 Ancestral GFP Reconstructions

30 Ugalde, Chang & Matz 2004 Sites with PP<0.8 considered ambiguous Alt residues considered if PP>0.2 Alternative ancestors did not affect the conclusions
Position  Residue  PP     Residue  PP     Residue  PP
168       P        0.999
169       K        0.407  R        0.236  S        0.210
170       V        0.730  I        0.270
171       I        0.742  V        0.158  R        0.034
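The two thresholds can be applied mechanically to the posterior probabilities (PP) listed for positions 168 to 171. This is a sketch of the stated rule only, not the authors' pipeline; the function and variable names are illustrative.

```python
# Posterior probabilities per position, as listed on the slide.
site_pps = {
    168: {"P": 0.999},
    169: {"K": 0.407, "R": 0.236, "S": 0.210},
    170: {"V": 0.730, "I": 0.270},
    171: {"I": 0.742, "V": 0.158, "R": 0.034},
}


def classify_site(pps, ambiguous_below=0.8, alt_above=0.2):
    """Return (best residue, is_ambiguous, alternative residues worth testing)."""
    ranked = sorted(pps.items(), key=lambda item: item[1], reverse=True)
    best_residue, best_pp = ranked[0]
    ambiguous = best_pp < ambiguous_below
    # Alternatives are only of interest at ambiguous sites.
    alternatives = [res for res, pp in ranked[1:] if pp > alt_above] if ambiguous else []
    return best_residue, ambiguous, alternatives


report = {position: classify_site(pps) for position, pps in site_pps.items()}
for position, (residue, ambiguous, alternatives) in report.items():
    print(position, residue, "ambiguous" if ambiguous else "confident", alternatives)
```

Under these thresholds position 171 is ambiguous but has no alternative residue above 0.2, so only positions 169 and 170 would yield alternative ancestors to test.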

31 Thanks for Your Attention Questions?

