Lecture 3: Molecular Evolution and Phylogeny

1 Lecture 3 Molecular Evolution and Phylogeny

2 Facts on the molecular basis of life
Every life form is genome based.
Genomes evolve.
There are large numbers of apparently homologous intra-genomic (paralog) and inter-genomic (ortholog) genes.
Some genes, especially those related to transcription and translation, are common to ALL life forms.
The closer two organisms seem to be phylogenetically, the more similar their genomes and corresponding genes are.

3 Central dogma of molecular biology
DNA -> RNA -> Protein

4 Basic assumptions of molecular evolution
More closely related organisms have more similar genomes.
Highly similar genes are homologs (have the same ancestor).
A universal ancestor exists for all life forms.
Molecular differences in homologous genes (or protein sequences) are positively correlated with evolution time.
Phylogenetic relations can be expressed by a dendrogram (a "tree").

5 The five steps in phylogenetics dancing
Modified from Hillis et al. (1993). Methods in Enzymology 224, 456-487.
[Flowchart, steps numbered 1 to 5 on the slide:]
Sequence data
Align sequences
Phylogenetic signal? Patterns -> evolutionary processes?
Choose a method:
  Distance methods: distance calculation (which model?); single tree: NJ; optimality criterion: LS, ME
  Character-based methods: MP (weighting? sites, changes?); ML (model?); MB (model?)
Calculate or estimate best-fit tree
Test phylogenetic reliability

6 Why protein phylogenies?
For historical reasons - first sequences...
Most genes encode proteins...
To study protein structure, function and evolution.
Comparing DNA and protein based phylogenies can be useful:
  Different genes, e.g. 18S rRNA versus EF-2 protein
  Protein encoding gene: codons versus amino acids

7 Proteins were the first molecular sequences to be used for phylogenetic inference.
Fitch and Margoliash (1967) Construction of phylogenetic trees. Science 155, 279-284.

8 Most of what follows is taken from:
Statistical Physics and Biological Information, Institute of Theoretical Physics, University of California at Santa Barbara, 2001 May 7

9 Understanding trees
[Figure: tree drawn against a time axis, with labels Root, 30 Mya, 22 Mya and 7 Mya; two drawings marked "same as".]

10 Understanding trees #2

11 Understanding trees #3

12 Difference in homologous sequences is a measure of evolution time
Part of a multiple sequence alignment of mitochondrial small subunit rRNA (full length ~950): 11 primate species (order Primates), with mouse as outgroup.
Change similarity matrix to distance matrix: d = 1 - S

14 From the alignment, construct pairwise distances*
*Note: Alignment is not the only way to compute distance.
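
The step from alignment to distance matrix (d = 1 - S) can be sketched as follows. This is a minimal illustration, not the procedure used for the primate data: the function names, the choice to skip gapped columns, and the toy sequences are all assumptions made for the example.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Return d = 1 - S, where S is the fraction of identical aligned sites.

    Columns in which either sequence has a gap are skipped (an assumption;
    other gap treatments are possible).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    similarity = sum(a == b for a, b in pairs) / len(pairs)
    return 1.0 - similarity


def distance_matrix(alignment):
    """Build a symmetric nested-dict distance matrix from {name: sequence}."""
    names = list(alignment)
    dist = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        dist[x][y] = dist[y][x] = p_distance(alignment[x], alignment[y])
    return dist


# Toy alignment with made-up sequences (not the rRNA data from the slides).
toy = {
    "seq1": "ACGTACGTAC",
    "seq2": "ACGTACGTTC",
    "seq3": "ACG-ACTTTC",
}
matrix = distance_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(d, 3) for other, d in row.items()})
```

A nested dictionary keeps the example short; for alignments of realistic size an array-based matrix would be the more usual choice.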

15 Models of sequence evolution

16 Jukes-Cantor (minimal) Model
All substitution rates = $\alpha$; all base frequencies = 1/4.
[Figure labels: A, C, "= 3", $P_{ij}(2t)$]

17 Derivation of Jukes-Cantor formula
Let the probability of a site being a given base at time t be $P(t)$.
After an elapsed time $\delta t$, the loss by mutation to the other three bases is $-3\alpha\,\delta t\,P(t)$.
The gain from the other bases is $\alpha\,\delta t\,(1 - P(t))$.
Hence $P(t + \delta t) = P(t) - 3\alpha\,\delta t\,P(t) + \alpha\,\delta t\,(1 - P(t))$, so $dP(t)/dt = \alpha - 4\alpha P(t)$.
Write $P(t) = a\,e^{-bt} + c$; the solution is $b = 4\alpha$, $c = 1/4$, so $P(t) = a\,e^{-4\alpha t} + 1/4$.
If $P(0) = 1$, then $a = 3/4$. If $P(0) = 0$, then $a = -1/4$.
Finally, $P_{\text{same}}(t) = 1/4 + (3/4)\,e^{-4\alpha t}$ and $P_{\text{change}}(t) = 1/4 - (1/4)\,e^{-4\alpha t}$.
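
As a rough numerical check of this derivation, the sketch below evaluates the two closed-form probabilities and compares them with a step-by-step iteration of the balance equation. It takes the per-base substitution rate as `alpha`, so the decay exponent is 4·alpha·t; the function names and the values of `alpha` and `t` are illustrative assumptions.

```python
import math


def p_same(t, alpha):
    """Probability that a site shows its original base after time t."""
    return 0.25 + 0.75 * math.exp(-4.0 * alpha * t)


def p_change(t, alpha):
    """Probability that a site shows one specific different base after time t."""
    return 0.25 - 0.25 * math.exp(-4.0 * alpha * t)


def iterate_balance(t, alpha, steps=100_000):
    """Iterate P(t+dt) = P - 3*alpha*dt*P + alpha*dt*(1 - P) from P(0) = 1."""
    dt = t / steps
    p = 1.0
    for _ in range(steps):
        p = p - 3.0 * alpha * dt * p + alpha * dt * (1.0 - p)
    return p


alpha, t = 0.1, 2.0  # illustrative values only
closed_form = p_same(t, alpha)
numeric = iterate_balance(t, alpha)
total = p_same(t, alpha) + 3.0 * p_change(t, alpha)

print(f"closed form P_same = {closed_form:.6f}")
print(f"iterated    P_same = {numeric:.6f}")
print(f"P_same + 3 * P_change = {total:.6f}")
```

The four probabilities sum to one at every t, and both formulas tend to 1/4 for large t, which matches the equal base frequencies assumed by the model.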

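18 Hasegawa-Kishino-Yano model
Transition: A <-> G or C <-> T (purine to purine, or pyrimidine to pyrimidine).
Transversion: purine <-> pyrimidine, e.g. A <-> T or C <-> G.
The model has more general substitution rates than Jukes-Cantor: transitions and transversions can occur at different rates.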
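
The transition/transversion distinction can be made concrete with a small classifier. This is only a sketch of the definition (purines A and G, pyrimidines C and T); it does not implement the Hasegawa-Kishino-Yano model itself, and the function names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(x, y):
    """Classify a base pair as 'identical', 'transition' or 'transversion'."""
    x, y = x.upper(), y.upper()
    if x not in BASES or y not in BASES:
        raise ValueError(f"not a DNA base pair: {x!r}, {y!r}")
    if x == y:
        return "identical"
    # Same chemical class (both purines or both pyrimidines) -> transition.
    same_class = (x in PURINES) == (y in PURINES)
    return "transition" if same_class else "transversion"


def count_ti_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in BASES and b in BASES and a != b:
            counts[classify_substitution(a, b)] += 1
    return counts


print(classify_substitution("A", "G"))
print(classify_substitution("A", "T"))
print(count_ti_tv("AACC", "GTTG"))
```

Of the twelve possible ordered changes between bases, four are transitions and eight are transversions.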

20 Part of the Jukes-Cantor distance matrix for the primate example (distances to the outgroup are much larger).
The matrix will be used for clustering methods.
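
The slides do not show how an observed fraction of differing sites is turned into a Jukes-Cantor distance, so the following is a sketch using the standard textbook correction d = -(3/4) ln(1 - 4p/3). It follows from the model: two sequences separated by total time 2t differ at a fraction p = (3/4)(1 - e^{-8αt}) of sites, while the expected number of substitutions per site is d = 6αt. The function name and the test values are assumptions.

```python
import math


def jukes_cantor_distance(p):
    """Convert an observed difference fraction p into substitutions per site.

    Uses d = -(3/4) * ln(1 - 4p/3), valid for 0 <= p < 3/4.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Round trip with illustrative values: rate alpha, divergence time t per lineage.
alpha, t = 0.05, 1.0
p_expected = 0.75 * (1.0 - math.exp(-8.0 * alpha * t))
d_recovered = jukes_cantor_distance(p_expected)

print(f"expected difference fraction p = {p_expected:.4f}")
print(f"recovered distance d = {d_recovered:.4f} (6 * alpha * t = {6 * alpha * t:.4f})")
```

The corrected distance is always at least as large as p, because multiple substitutions at the same site hide some of the changes that occurred.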

21 Clustering

22 UPGMA

23 Neighbor-Joining Method

24 N-J Method produces an Unrooted, Additive tree

25 Neighbor-Joining Method: An Example
What is required for the neighbor-joining method? A distance matrix.
0. Distance Matrix

26 1. First Step
PAM distance 3.3 (Human - Monkey) is the minimum, so we join Human and Monkey into Mon-Hum and calculate the new distances.
[Tree so far: Mon-Hum = (Monkey, Human); Spinach, Mosquito and Rice not yet joined]

27 2. Calculation of New Distances
After we have joined two species in a subtree, we have to compute the distances from every other node to the new subtree. We do this with a simple average of distances:
Dist[Spinach, MonHum] = (Dist[Spinach, Monkey] + Dist[Spinach, Human])/2 = (90.8 + 86.3)/2 = 88.55
[Tree so far: Mon-Hum = (Monkey, Human); Spinach]
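
The join-and-average step described here can be sketched as below, using only the three distances quoted in the example (3.3, 90.8, 86.3); the function names are hypothetical. Note that this follows the simple-average rule stated on the slide. The neighbor-joining method as usually defined (Saitou and Nei) selects the pair with a rate-corrected criterion and updates distances with a different formula, so this sketch should not be read as a full neighbor-joining implementation.

```python
from itertools import combinations


def closest_pair(dist):
    """Return the pair of clusters with the smallest distance."""
    return min(combinations(dist, 2), key=lambda pair: dist[pair[0]][pair[1]])


def join_by_average(dist, a, b, joined):
    """Merge clusters a and b; distances to the new cluster are simple averages."""
    others = [k for k in dist if k not in (a, b)]
    new = {k: {m: dist[k][m] for m in others if m != k} for k in others}
    new[joined] = {}
    for k in others:
        d = (dist[a][k] + dist[b][k]) / 2.0
        new[joined][k] = d
        new[k][joined] = d
    return new


# Only the three distances quoted in the example; the full matrix is not reproduced.
dist = {
    "Monkey": {"Human": 3.3, "Spinach": 90.8},
    "Human": {"Monkey": 3.3, "Spinach": 86.3},
    "Spinach": {"Monkey": 90.8, "Human": 86.3},
}

a, b = closest_pair(dist)
merged = join_by_average(dist, a, b, "Mon-Hum")
print("joined:", a, "+", b)
print("Dist[Spinach, Mon-Hum] =", round(merged["Spinach"]["Mon-Hum"], 2))
```

Repeating `closest_pair` and `join_by_average` until one cluster remains reproduces the cycle structure of the example, given a complete distance matrix.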

28 3. Next Cycle
Mosquito is joined to Mon-Hum, giving Mos-(Mon-Hum). Spinach and Rice remain unjoined.

29 4. Penultimate Cycle
Spinach and Rice are joined, giving Spin-Rice.

30 5. Last Joining
Spin-Rice is joined to Mos-(Mon-Hum), giving (Spin-Rice)-(Mos-(Mon-Hum)).

31 The result: Unrooted Neighbor-Joining Tree
[Tree leaves: Human, Monkey, Mosquito, Rice, Spinach]

33 Bootstrapping

34 Why are trees not exact?

35 Pairwise distances usually not tree-like

37 Searching tree space

38 Maximum likelihood criterion

39 Parsimony criterion

40 Parsimony with molecular data

41 Parsimony criterion Paul Higgs:

43 Is the best tree much better than others? L : likelihood at nodes

44 Use Maximum Likelihood to rank alternate trees
[Figure annotations: "yes"; "same topology"; "NJ tree is 2nd best"]

45 Use Parsimony to rank alternate trees
Different topology; parsimony differentiates weakly.

46 Quartet puzzling

48 MCMC: Markov chain Monte Carlo

49 Topology probabilities according to MCMC

50 Clade probabilities compared across tree methods
The NJ method is very fast and close to being the best.

51 Lecture and Book
Lecture by Paul Higgs: online.itp.ucsb.edu/online/infobio01/higgs/ (see online.itp.ucsb.edu/online/infobio01/ for many lectures)
Book by Wen-Hsiong Li 李文雄, "Molecular Evolution" (Sinauer Associates, 1997)

52 Some web sites on Molecular Evolution
CMS Molecular Biology Resource: www.unl.edu/stc-95/ResTools/cmshp.html
Phylogeny - Molecular Evolution: www.unl.edu/stc-95/ResTools/biotools/biotools2.html
The Tree of Life Web Project: tolweb.org/tree/phylogeny.html
Web Resources in Molecular Evolution and Systematics: darwin.eeb.uconn.edu/molecular-evolution.html

53 Some web sites on ClustalW
On-line service: www.ebi.ac.uk/clustalw/ and clustalw.genome.ad.jp/
Software: ftp-igbmc.u-strasbg.fr/pub/ClustalX/ and ftp-igbmc.u-strasbg.fr/pub/ClustalW/

