Measuring genetic change Level 3 Molecular Evolution and Bioinformatics Jim Provan Page and Holmes: Section 5.2.


1 Measuring genetic change Level 3 Molecular Evolution and Bioinformatics Jim Provan Page and Holmes: Section 5.2

2 Types of substitution
- Single: 1 change, 1 difference (A→C in one lineage)
- Multiple: 2 changes, 1 difference (A→C→T in one lineage)
- Coincidental: 2 changes, 1 difference (A→C in one lineage, A→G in the other)
- Parallel: 2 changes, no difference (A→C in both lineages)
- Convergent: 3 changes, no difference (A→C→T in one lineage, A→T in the other)
- Back: 2 changes, no difference (A→C→A in one lineage)

3 Types of substitution (continued)
- Multiple substitutions can greatly obscure actual evolutionary history, particularly where there have been many mutations, i.e. over long evolutionary time scales
- The final three examples (parallel, convergent and back substitution, all of which leave no observable difference) have serious implications for inference of evolutionary history:
  - Similarity inherited from an ancestor is called homology
  - Independently acquired similarity is called homoplasy
  - All tree-building methods rely on sufficient levels of homology

4 Types of substitution (continued)
- Substitutions that exchange a purine for another purine or a pyrimidine for another pyrimidine are called transitions (A↔G, C↔T)
- Substitutions that exchange a purine for a pyrimidine or vice versa are called transversions (A↔C, A↔T, G↔C, G↔T)
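The transition/transversion distinction can be expressed in a few lines of Python. This is only an illustrative sketch: the function name `classify_substitution` is hypothetical, and it assumes unambiguous DNA bases (A, C, G, T) only.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(base1: str, base2: str) -> str:
    """Return 'transition', 'transversion' or 'none' for a pair of DNA bases.

    Hypothetical helper for illustration; ambiguity codes and gaps are rejected.
    """
    base1, base2 = base1.upper(), base2.upper()
    valid = PURINES | PYRIMIDINES
    if base1 not in valid or base2 not in valid:
        raise ValueError("expected one of A, C, G, T")
    if base1 == base2:
        return "none"
    # Same chemical class (purine/purine or pyrimidine/pyrimidine) is a transition.
    same_class = (base1 in PURINES) == (base2 in PURINES)
    return "transition" if same_class else "transversion"
```

Each base has one possible transition and two possible transversions, which is why a transition/transversion bias has to be modelled explicitly rather than assumed away.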

5 Measuring evolutionary change
- Simplest measure is to count the number of different sites
- Poor measure:
  - Some sites may undergo repeated substitutions
  - As sequences diverge, the measure becomes less accurate
- Saturation occurs: most sites changing have changed before
[Figure: base pair differences plotted against time since divergence (Myr)]
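The "count the different sites" measure might be sketched as follows. The function name `p_distance` is an assumption made for this example, and the sequences are assumed to be already aligned and of equal length.

```python
def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of aligned sites at which two sequences differ.

    Illustrative sketch: assumes pre-aligned sequences with no gaps.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    if not seq1:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for x, y in zip(seq1.upper(), seq2.upper()) if x != y)
    return differences / len(seq1)
```

Because a site that has changed twice still counts as at most one difference, this value underestimates the true number of substitutions once sequences have diverged substantially.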

6 Correction of observed sequence differences
[Figure: sequence difference plotted against time, with labels "Observed difference", "Expected difference" and "Correction"]

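7 A general framework of sequence evolution models

$$
P_t = \begin{pmatrix}
p_{AA} & p_{AC} & p_{AG} & p_{AT} \\
p_{CA} & p_{CC} & p_{CG} & p_{CT} \\
p_{GA} & p_{GC} & p_{GG} & p_{GT} \\
p_{TA} & p_{TC} & p_{TG} & p_{TT}
\end{pmatrix},
\qquad
p_{ii} = 1 - \sum_{j \neq i} p_{ij},
\qquad
f = [\, f_A \;\; f_C \;\; f_G \;\; f_T \,]
$$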

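8 The Jukes-Cantor (JC) model
Assumes that all four bases have equal frequencies and that all substitutions are equally likely

$$
P_t = \begin{pmatrix}
- & \alpha & \alpha & \alpha \\
\alpha & - & \alpha & \alpha \\
\alpha & \alpha & - & \alpha \\
\alpha & \alpha & \alpha & -
\end{pmatrix},
\qquad
f = [\, \tfrac14 \;\; \tfrac14 \;\; \tfrac14 \;\; \tfrac14 \,]
$$

(rows and columns in the order A, C, G, T)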
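Under the JC model, the observed proportion of differing sites p can be converted into an estimated number of substitutions per site using the standard Jukes-Cantor correction, d = -(3/4) ln(1 - (4/3) p). The slide does not give this formula, so the sketch below is an added illustration; the function name `jc_distance` is hypothetical.

```python
import math


def jc_distance(p: float) -> float:
    """Jukes-Cantor corrected distance from an observed proportion of differences.

    Illustrative sketch. The correction is undefined for p >= 0.75, the
    saturation level expected for random sequences with equal base frequencies.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

For small p the corrected distance is close to p itself; the gap between the two widens as p approaches 0.75.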

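9 Kimura's 2 parameter model (K2P)
Takes into account different frequencies of transitions (α) vs. transversions (β)

$$
P_t = \begin{pmatrix}
- & \beta & \alpha & \beta \\
\beta & - & \beta & \alpha \\
\alpha & \beta & - & \beta \\
\beta & \alpha & \beta & -
\end{pmatrix},
\qquad
f = [\, \tfrac14 \;\; \tfrac14 \;\; \tfrac14 \;\; \tfrac14 \,]
$$

(rows and columns in the order A, C, G, T)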
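The usual K2P distance estimate treats the observed proportions of transitional differences (P) and transversional differences (Q) separately: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q). The slide does not state this formula; the following is a minimal sketch with hypothetical function names, assuming aligned sequences containing only A, C, G and T.

```python
import math

PURINES = {"A", "G"}


def ts_tv_proportions(seq1: str, seq2: str) -> tuple:
    """Return (P, Q): proportions of transitional and transversional differences.

    Illustrative sketch: assumes aligned sequences with only A, C, G, T.
    """
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    n = len(seq1)
    return transitions / n, transversions / n


def k2p_distance(P: float, Q: float) -> float:
    """Kimura 2-parameter distance from transition (P) and transversion (Q) proportions."""
    term1 = 1.0 - 2.0 * P - Q
    term2 = 1.0 - 2.0 * Q
    if term1 <= 0.0 or term2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

When transversional differences are exactly twice as common as transitional ones (Q = 2P, the pattern expected without any bias), this estimate coincides with the Jukes-Cantor correction applied to p = P + Q.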

10 Felsenstein (1981) (F81)
- Takes into account differences in base composition
- Percentage (G + C) can range from 25% to 75%
- F81 model allows the frequencies of the four nucleotides to be different
- Does not allow for variation between genes/species

$$
P_t = \begin{pmatrix}
- & \pi_C\alpha & \pi_G\alpha & \pi_T\alpha \\
\pi_A\alpha & - & \pi_G\alpha & \pi_T\alpha \\
\pi_A\alpha & \pi_C\alpha & - & \pi_T\alpha \\
\pi_A\alpha & \pi_C\alpha & \pi_G\alpha & -
\end{pmatrix},
\qquad
f = [\, \pi_A \;\; \pi_C \;\; \pi_G \;\; \pi_T \,]
$$

(rows and columns in the order A, C, G, T)

11 Hasegawa, Kishino and Yano (1985) (HKY85)
Essentially merges the K2P and F81 models to allow transitions and transversions to occur at different rates as well as allowing base frequencies to vary

$$
P_t = \begin{pmatrix}
- & \pi_C\beta & \pi_G\alpha & \pi_T\beta \\
\pi_A\beta & - & \pi_G\beta & \pi_T\alpha \\
\pi_A\alpha & \pi_C\beta & - & \pi_T\beta \\
\pi_A\beta & \pi_C\alpha & \pi_G\beta & -
\end{pmatrix},
\qquad
f = [\, \pi_A \;\; \pi_C \;\; \pi_G \;\; \pi_T \,]
$$

(rows and columns in the order A, C, G, T)

12 General reversible model (REV)
Most general model: each substitution has its own probability

$$
P_t = \begin{pmatrix}
- & \pi_C a & \pi_G b & \pi_T c \\
\pi_A a & - & \pi_G d & \pi_T e \\
\pi_A b & \pi_C d & - & \pi_T f \\
\pi_A c & \pi_C e & \pi_G f & -
\end{pmatrix},
\qquad
f = [\, \pi_A \;\; \pi_C \;\; \pi_G \;\; \pi_T \,]
$$

(rows and columns in the order A, C, G, T)

By constraining a-f it is possible to generate all the other models
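The idea that the other models are constrained special cases of REV can be sketched in plain Python. All function names below are hypothetical, and the layout assumes that entry (i, j) equals the frequency of the target base j times the parameter for that pair, with each diagonal set to 1 minus the rest of its row. With equal frequencies the off-diagonal entries come out as α/4 rather than α, which differs from a JC matrix written with bare α only by a constant scaling.

```python
BASES = "ACGT"
EQUAL_FREQS = {base: 0.25 for base in BASES}


def rev_matrix(freqs, a, b, c, d, e, f):
    """Build a 4x4 REV matrix (nested lists, order A, C, G, T).

    Illustrative sketch: a..f are the parameters for the pairs
    A-C, A-G, A-T, C-G, C-T and G-T respectively.
    """
    pair_param = {
        ("A", "C"): a, ("A", "G"): b, ("A", "T"): c,
        ("C", "G"): d, ("C", "T"): e, ("G", "T"): f,
    }
    matrix = []
    for row_index, i in enumerate(BASES):
        row = []
        for j in BASES:
            if i == j:
                row.append(0.0)  # placeholder, filled in below
            else:
                key = (i, j) if (i, j) in pair_param else (j, i)
                row.append(freqs[j] * pair_param[key])
        row[row_index] = 1.0 - sum(row)  # diagonal: 1 minus the off-diagonal sum
        matrix.append(row)
    return matrix


def jc(alpha):
    """JC: equal frequencies, one parameter for every pair."""
    return rev_matrix(EQUAL_FREQS, alpha, alpha, alpha, alpha, alpha, alpha)


def k2p(alpha, beta):
    """K2P: equal frequencies; transitions (A-G, C-T) use alpha, the rest beta."""
    return rev_matrix(EQUAL_FREQS, beta, alpha, beta, beta, alpha, beta)


def f81(freqs, alpha):
    """F81: free base frequencies, one parameter for every pair."""
    return rev_matrix(freqs, alpha, alpha, alpha, alpha, alpha, alpha)


def hky85(freqs, alpha, beta):
    """HKY85: free base frequencies plus a transition/transversion split."""
    return rev_matrix(freqs, beta, alpha, beta, beta, alpha, beta)
```

Each constrained model is a single call to `rev_matrix` with some of the six parameters tied together, which mirrors how the models nest within one another.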

13 Comparing the models
- JC: πA = πC = πG = πT; α = β
- K2P: πA = πC = πG = πT; α and β differ
- F81: πA, πC, πG, πT free to differ; α = β
- HKY85: πA, πC, πG, πT free to differ; α and β differ
- REV: πA, πC, πG, πT free to differ; separate parameters a, b, c, d, e, f
Starting from JC, allowing a transition/transversion bias gives K2P and allowing base frequencies to vary gives F81; allowing both gives HKY85.

14 Comparing the models (continued)
[Figure: four panels labelled Observed, JC, K2P and HKY85, each with both axes labelled A, C, G, T]

15 Assumptions: independence
- Assumes that change at one site has no effect on other sites
- RNA stem-loop structures are a good example of where this assumption breaks down:
  - Substitution may result in mismatched bases and decreased stem stability
  - Compensatory change may occur to restore Watson-Crick base pairing
[Figure: an RNA stem-loop shown in its original form, after a substitution that leaves a mismatched pair in the stem, and after a compensatory change]

16 Assumptions: base composition
- Assumes that base composition is at equilibrium and that it is similar across all taxa studied
- In the example shown, trees inferred using models which do not allow for this will not group Thermus and Deinococcus

Taxon         % G + C
Aquifex       64.0
Thermotoga    63.7
Thermus       63.2
Deinococcus   55.5
Others        53.9
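Base composition of the kind tabulated here can be computed directly from a sequence. The following is a minimal sketch; the function name `gc_percent` is hypothetical, and characters other than A, C, G and T (gaps, ambiguity codes) are simply ignored, which is an assumption of this example.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G + C among the unambiguous bases of a DNA sequence.

    Illustrative sketch: anything other than A, C, G, T is skipped.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)
```

Comparing this value across the taxa in an alignment is a quick way to check whether the similar-composition assumption is plausible before choosing a model.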

17 Assumptions: variation in substitution rate across sites
- Not all sites are equally likely to undergo a substitution
- Functional constraints:
  - Pseudogenes have lost all function and can evolve freely
  - Substitutions at fourfold degenerate sites do not change the amino acid composition of proteins
  - Non-degenerate sites are highly constrained
[Figure: substitution rate (substitutions / site / 10^9 years) for 5' flanking region, 5' untranslated region, non-degenerate sites, twofold degenerate sites, fourfold degenerate sites, introns, 3' untranslated region, 3' flanking region and pseudogenes]

18 Assumptions: variation in substitution rate across sites (continued)
- The more rapidly evolving sequence shows most divergence initially but soon saturates
- At longer divergence times, sequence A actually appears to be the more rapidly evolving
[Figure: DNA divergence plotted against divergence time (Myr) for sequence A (0.5% / Myr + 20% constraint) and sequence B (2% / Myr + 50% constraint)]
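The crossover between sequences A and B can be reproduced with a small numerical sketch. It rests on assumptions that the slide does not state: constrained sites never change, the remaining free sites follow the Jukes-Cantor saturation curve, and the quoted rate is substitutions per free site per Myr between the two sequences being compared. The function name `observed_divergence` is hypothetical.

```python
import math


def observed_divergence(time_myr: float, rate_per_myr: float,
                        constrained_fraction: float) -> float:
    """Expected observed proportion of differing sites after time_myr.

    Assumptions (not from the slide): constrained sites are invariant and
    free sites saturate along a Jukes-Cantor curve with ceiling 0.75.
    """
    free_fraction = 1.0 - constrained_fraction
    substitutions_per_free_site = rate_per_myr * time_myr
    saturating = 0.75 * (1.0 - math.exp(-4.0 * substitutions_per_free_site / 3.0))
    return free_fraction * saturating


# Parameters taken from the figure labels: A is slow but lightly constrained,
# B is fast but heavily constrained.
for t in (10, 50, 100, 500):
    div_a = observed_divergence(t, 0.005, 0.20)
    div_b = observed_divergence(t, 0.02, 0.50)
    print(f"{t:>4} Myr  A = {div_a:.3f}  B = {div_b:.3f}")
```

Under these assumptions B plateaus at 0.5 × 0.75 = 0.375 while A can reach 0.8 × 0.75 = 0.6, so B leads at short divergence times and A overtakes it later.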

