Published by Vivian Bryan. Modified over 9 years ago.
1
Chapter 4 Evolutionary Changes in Nucleotide Sequences
Chau-Ti Ting ctting@ntu.edu.tw
Unless noted, the course materials are licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Taiwan (CC BY-NC-SA 3.0).
2
Introduction
Calculating the distance between two sequences is the simplest phylogenetic analysis. It is important because:
1) it is the first step in distance methods for phylogeny reconstruction;
2) the Markov-process models of nucleotide substitution used in distance calculation form the basis of likelihood and Bayesian analysis.
The distance between two nucleotide sequences is defined as the expected number of nucleotide substitutions per site.
Source: Ziheng Yang 2006. Computational Molecular Evolution, p. 3. Oxford University Press Inc., New York, USA.
3
The simplest distance measure is the proportion of different sites, sometimes called the p-distance. If 10 sites differ between two sequences, each 100 bp long, then p = 10/100 = 0.1 (10%).
However, a variable site may result from more than one substitution, and even a constant site may harbor back or parallel substitutions.
Multiple hits: multiple substitutions at the same site (i.e., some changes are hidden).
Note: p is usable only for highly similar sequences, with p < 5%.
Source: Ziheng Yang 2006. Computational Molecular Evolution, p. 3. Oxford University Press Inc., New York, USA.
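The p-distance can be sketched in a few lines of Python. This is only an illustration: the function name `p_distance` is an assumption, and the two test sequences are the descendant sequences shown on the next slide.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    if not seq1:
        raise ValueError("sequences must not be empty")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


# The two descendant sequences from the next slide differ at 3 of 14 sites.
print(p_distance("ACTGGAGGAATCGC", "AATGAAAGAATCGC"))
```

Because p counts only visible differences, it underestimates the true number of substitutions whenever multiple hits have occurred.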
4
[Figure: the ancestral sequence ACTGAACGTAACGC and two descendant sequences, ACTGGAGGAATCGC and AATGAAAGAATCGC.]
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 75. Sinauer Associates, Inc., Sunderland, MA, USA.
5
Jukes and Cantor's one-parameter model
This simple model assumes that substitutions occur with equal probability among the four nucleotide types. The rate of substitution for each nucleotide is $3\alpha$ per unit time, and the rate of substitution in each of the three possible directions of change is $\alpha$. Because the model involves a single parameter, $\alpha$, it is called the one-parameter model.
[Figure: substitution scheme among A, T, G, and C.]
National Taiwan University Chau-Ti Ting
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 68. Sinauer Associates, Inc., Sunderland, MA, USA.
6
Since we start with A, the probability that this site is occupied by A at time 0 is $P_{A(0)} = 1$. At time 1, the probability of still having A at this site is given by
$P_{A(1)} = 1 - 3\alpha$
in which $3\alpha$ is the probability of A changing to T, C, or G, and $1 - 3\alpha$ is the probability that A has remained unchanged.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 68. Sinauer Associates, Inc., Sunderland, MA, USA.
7
The probability of having A at time 2 is
$P_{A(2)} = (1 - 3\alpha) P_{A(1)} + \alpha [1 - P_{A(1)}]$
To derive this equation, we consider two possible scenarios: 1) the nucleotide has remained unchanged from time 0 to time 2, and 2) the nucleotide has changed to T, C, or G at time 1, but has subsequently reverted to A at time 2.
Scenario   t=0   t=1     t=2   Probability
I          A     A       A     $(1 - 3\alpha) P_{A(1)}$
II         A     Not A   A     $\alpha [1 - P_{A(1)}]$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 68. Sinauer Associates, Inc., Sunderland, MA, USA.
8
Using the above formulation, we can show that the following recurrence equation applies to any t:
$P_{A(t+1)} = (1 - 3\alpha) P_{A(t)} + \alpha [1 - P_{A(t)}]$
We can rewrite this equation in terms of the amount of change in $P_{A(t)}$ per unit time as
$\Delta P_{A(t)} = P_{A(t+1)} - P_{A(t)}$
$= \{(1 - 3\alpha) P_{A(t)} + \alpha [1 - P_{A(t)}]\} - P_{A(t)}$
$= -3\alpha P_{A(t)} + \alpha [1 - P_{A(t)}]$
$= -4\alpha P_{A(t)} + \alpha$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 69. Sinauer Associates, Inc., Sunderland, MA, USA.
9
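Approximating the discrete-time process by a continuous-time one:
$\frac{dP_{A(t)}}{dt} = -4\alpha P_{A(t)} + \alpha$
The solution is
$P_{A(t)} = \frac{1}{4} + \left[P_{A(0)} - \frac{1}{4}\right] e^{-4\alpha t}$
When $P_{A(0)} = 1$:
$P_{A(t)} = \frac{1}{4} + \frac{3}{4} e^{-4\alpha t}$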
10
$P_{A(t)} = \frac{1}{4} + \frac{3}{4} e^{-4\alpha t}$
$P_{AA(t)} = \frac{1}{4} + \frac{3}{4} e^{-4\alpha t}$
$P_{ii(t)} = \frac{1}{4} + \frac{3}{4} e^{-4\alpha t}$
11
$P_{A(t)} = \frac{1}{4} + \left[P_{A(0)} - \frac{1}{4}\right] e^{-4\alpha t}$
When $P_{A(0)} = 0$:
$P_{A(t)} = \frac{1}{4} - \frac{1}{4} e^{-4\alpha t}$
12
$P_{A(t)} = \frac{1}{4} - \frac{1}{4} e^{-4\alpha t}$
$P_{GA(t)} = \frac{1}{4} - \frac{1}{4} e^{-4\alpha t}$
$P_{ij(t)} = \frac{1}{4} - \frac{1}{4} e^{-4\alpha t}$, where $i \neq j$
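As a rough numerical check of the one-parameter results, the sketch below evaluates the closed-form probabilities and compares them with direct iteration of the discrete recurrence. The function names and the values of alpha and t are illustrative assumptions, not part of the course material.

```python
import math


def jc69_same(alpha, t):
    """P_ii(t): probability that a site holds its starting nucleotide at time t."""
    return 0.25 + 0.75 * math.exp(-4 * alpha * t)


def jc69_diff(alpha, t):
    """P_ij(t), i != j: probability of one specific different nucleotide at time t."""
    return 0.25 - 0.25 * math.exp(-4 * alpha * t)


def iterate_recurrence(alpha, steps, p0=1.0):
    """Apply P(t+1) = (1 - 3*alpha) * P(t) + alpha * (1 - P(t)) step by step."""
    p = p0
    for _ in range(steps):
        p = (1 - 3 * alpha) * p + alpha * (1 - p)
    return p


alpha, t = 1e-4, 1000  # hypothetical rate and number of time units
print(jc69_same(alpha, t), iterate_recurrence(alpha, t))
print(jc69_same(alpha, t) + 3 * jc69_diff(alpha, t))  # the four probabilities sum to 1
```

For small alpha the two approaches agree closely, and both approach 0.25 as t grows.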
13
[Figure: probability P plotted against time, with the value 0.25 marked on the P axis.]
14
Kimura's two-parameter model
In this model, the rate of transitional substitution at each nucleotide site is $\alpha$ per unit time, whereas the rate of each type of transversional substitution is $\beta$ per unit time.
[Figure: substitution scheme among A, T, G, and C.]
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 71. Sinauer Associates, Inc., Sunderland, MA, USA.
15
Let us consider the probability that a site that has A at time 0 will have A at time t. After one time unit, the probability of A changing to G is $\alpha$, and the probability of A changing to either C or T is $2\beta$. Thus the probability of A remaining unchanged after one time unit is
$P_{A(1)} = 1 - \alpha - 2\beta$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 72. Sinauer Associates, Inc., Sunderland, MA, USA.
16
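Scenario   t=0   t=1   t=2   Change at t=1
I          A     A     A     none
II         A     G     A     transition
III        A     C     A     transversion
IV         A     T     A     transversion
$P_{A(2)} = (1 - \alpha - 2\beta) P_{A(1)} + \beta P_{T(1)} + \beta P_{C(1)} + \alpha P_{G(1)}$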
17
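By extension,
$P_{A(t+1)} = (1 - \alpha - 2\beta) P_{A(t)} + \beta P_{T(t)} + \beta P_{C(t)} + \alpha P_{G(t)}$
Similarly, we can obtain
$P_{T(t+1)} = \beta P_{A(t)} + (1 - \alpha - 2\beta) P_{T(t)} + \alpha P_{C(t)} + \beta P_{G(t)}$
$P_{C(t+1)} = \beta P_{A(t)} + \alpha P_{T(t)} + (1 - \alpha - 2\beta) P_{C(t)} + \beta P_{G(t)}$
$P_{G(t+1)} = \alpha P_{A(t)} + \beta P_{T(t)} + \beta P_{C(t)} + (1 - \alpha - 2\beta) P_{G(t)}$
The solution for the probability of A at time t, starting from A, is
$P_{AA(t)} = \frac{1}{4} + \frac{1}{4} e^{-4\beta t} + \frac{1}{2} e^{-2(\alpha+\beta) t}$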
18
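$P_{AA(t)} = \frac{1}{4} + \frac{1}{4} e^{-4\beta t} + \frac{1}{2} e^{-2(\alpha+\beta) t}$
$P_{AA(t)} = P_{GG(t)} = P_{CC(t)} = P_{TT(t)}$
$X_{(t)} = \frac{1}{4} + \frac{1}{4} e^{-4\beta t} + \frac{1}{2} e^{-2(\alpha+\beta) t}$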
19
Let $Y_{(t)}$ be the probability that the initial nucleotide and the nucleotide at time t differ from each other by a transition.
$Y_{(t)} = P_{AG(t)} = P_{GA(t)} = P_{TC(t)} = P_{CT(t)}$
$Y_{(t)} = \frac{1}{4} + \frac{1}{4} e^{-4\beta t} - \frac{1}{2} e^{-2(\alpha+\beta) t}$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 73. Sinauer Associates, Inc., Sunderland, MA, USA.
20
The probability, $Z_{(t)}$, that the initial nucleotide and the nucleotide at time t differ by a specific type of transversion is given by
$Z_{(t)} = \frac{1}{4} - \frac{1}{4} e^{-4\beta t}$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 73. Sinauer Associates, Inc., Sunderland, MA, USA.
21
$X_{(t)} = \frac{1}{4} + \frac{1}{4} e^{-4\beta t} + \frac{1}{2} e^{-2(\alpha+\beta) t}$
$Y_{(t)} = \frac{1}{4} + \frac{1}{4} e^{-4\beta t} - \frac{1}{2} e^{-2(\alpha+\beta) t}$
$Z_{(t)} = \frac{1}{4} - \frac{1}{4} e^{-4\beta t}$
Note that each nucleotide is subject to two types of transversion, but only one type of transition. Also,
$X_{(t)} + Y_{(t)} + 2 Z_{(t)} = 1$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 74. Sinauer Associates, Inc., Sunderland, MA, USA.
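The three two-parameter probabilities can be evaluated numerically to confirm that they sum to 1 and that they reduce to the one-parameter model when the two rates are equal. This is a minimal sketch; the function name and the rate values are illustrative assumptions.

```python
import math


def k2p_probabilities(alpha, beta, t):
    """Return (X, Y, Z): same nucleotide, transition, one specific transversion."""
    e_tv = math.exp(-4 * beta * t)
    e_ts = math.exp(-2 * (alpha + beta) * t)
    x = 0.25 + 0.25 * e_tv + 0.5 * e_ts
    y = 0.25 + 0.25 * e_tv - 0.5 * e_ts
    z = 0.25 - 0.25 * e_tv
    return x, y, z


x, y, z = k2p_probabilities(alpha=0.02, beta=0.005, t=10)  # hypothetical rates
print(x, y, z, x + y + 2 * z)

# With alpha == beta the transition and transversion probabilities coincide.
print(k2p_probabilities(alpha=0.01, beta=0.01, t=10))
```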
22
Number of nucleotide substitutions between two DNA sequences
If two sequences of length N differ from each other at n sites, then the proportion of differences, n/N, is referred to as the degree of divergence or Hamming distance. If the degree of divergence is substantial, then the observed number of differences is likely to be smaller than the actual number of substitutions because of multiple substitutions (multiple hits) at the same site.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 74. Sinauer Associates, Inc., Sunderland, MA, USA.
23
[Figure: two sequences descended from the ancestral sequence ACTGAACGTAACGC, illustrating a single substitution, sequential substitutions, coincidental substitutions, parallel substitutions, convergent substitutions, and a back substitution.]
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 75. Sinauer Associates, Inc., Sunderland, MA, USA.
24
Number of nucleotide substitutions between two noncoding sequences
Let us start with the one-parameter model. In this model, it is sufficient to consider only $I_{(t)}$, which is the probability that the nucleotide at a given site at time t is the same in both sequences. Suppose that the nucleotide at a given site was A at time 0. At time t, the probability that a descendant sequence will have A at this site is $P_{AA(t)}$, and consequently the probability that two descendant sequences both have A at this site is $P^2_{AA(t)}$. Similarly, the probabilities that both sequences have T, C, or G at this site are $P^2_{AT(t)}$, $P^2_{AC(t)}$, and $P^2_{AG(t)}$, respectively. Therefore,
$I_{(t)} = P^2_{AA(t)} + P^2_{AT(t)} + P^2_{AC(t)} + P^2_{AG(t)}$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 76. Sinauer Associates, Inc., Sunderland, MA, USA.
25
$I_{(t)} = P^2_{AA(t)} + P^2_{AT(t)} + P^2_{AC(t)} + P^2_{AG(t)}$
$I_{(t)} = \frac{1}{4} + \frac{3}{4} e^{-8\alpha t}$
Note that the probability that the two sequences are different at a site at time t is $p = 1 - I_{(t)}$. Thus,
$p = \frac{3}{4} (1 - e^{-8\alpha t})$
or
$8\alpha t = -\ln[1 - (4/3)p]$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 76. Sinauer Associates, Inc., Sunderland, MA, USA.
26
The time of divergence between two sequences is usually not known, and thus we cannot estimate $\alpha$. Instead, we compute K, which is the number of substitutions per site since the time of divergence between the two sequences. In the case of the one-parameter model, $K = 2(3\alpha t)$, where $3\alpha t$ is the number of substitutions per site in a single lineage.
[Figure: an ancestral sequence giving rise to sequence 1 and sequence 2, each branch of length $3\alpha t$.]
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 76. Sinauer Associates, Inc., Sunderland, MA, USA.
27
$8\alpha t = -\ln[1 - (4/3)p]$
$K = 2(3\alpha t) = 6\alpha t = (3/4)(8\alpha t)$
We can therefore calculate K as
$K = -(3/4) \ln[1 - (4/3)p]$
where p is the observed proportion of different nucleotides between the two sequences.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 76. Sinauer Associates, Inc., Sunderland, MA, USA.
28
In the case of the two-parameter model, the differences between two sequences are classified into transitions and transversions. Let P and Q be the proportions of transitional and transversional differences between the two sequences, respectively. Then the number of nucleotide substitutions per site between the two sequences, K, is estimated by
$K = (1/2) \ln[1/(1 - 2P - Q)] + (1/4) \ln[1/(1 - 2Q)]$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 76. Sinauer Associates, Inc., Sunderland, MA, USA.
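Both distance estimators are short enough to write out directly. The sketch below is illustrative only (the function names are assumptions); it evaluates the two formulas on the counts used in the two worked examples that follow.

```python
import math


def jc69_distance(p):
    """One-parameter estimate: K = -(3/4) ln[1 - (4/3) p]."""
    inner = 1 - 4 * p / 3
    if inner <= 0:
        raise ValueError("p is too large for the one-parameter correction")
    return -0.75 * math.log(inner)


def k2p_distance(P, Q):
    """Two-parameter estimate: K = (1/2) ln[1/(1-2P-Q)] + (1/4) ln[1/(1-2Q)]."""
    a = 1 - 2 * P - Q
    b = 1 - 2 * Q
    if a <= 0 or b <= 0:
        raise ValueError("P and Q are too large for the two-parameter correction")
    return 0.5 * math.log(1 / a) + 0.25 * math.log(1 / b)


# Ex. 1: 20 transitions and 4 transversions in 200 sites.
print(round(jc69_distance(24 / 200), 2), round(k2p_distance(20 / 200, 4 / 200), 2))
# Ex. 2: 50 transitions and 16 transversions in 200 sites.
print(round(jc69_distance(66 / 200), 2), round(k2p_distance(50 / 200, 16 / 200), 2))
```

The guards reflect that both corrections are undefined once the logarithm's argument reaches zero, which happens for very divergent sequences.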
29
Ex. 1: two sequences of 200 nucleotides that differ by 20 transitions and 4 transversions (L = 200)
One-parameter model: $p = 24/200 = 0.12$; $K = -(3/4) \ln[1 - (4/3)p] \approx 0.13$
Two-parameter model: $P = 20/200 = 0.10$, $Q = 4/200 = 0.02$; $K = (1/2) \ln[1/(1 - 2P - Q)] + (1/4) \ln[1/(1 - 2Q)] \approx 0.13$
30
In this example, the two models give essentially the same estimate because the degree of divergence is small enough that the corrected degree of divergence (i.e., the number of nucleotide substitutions, K) is only slightly larger than the uncorrected value (i.e., the number of nucleotide differences, p).
One-parameter model: $p = 24/200 = 0.12$. Two-parameter model: $P = 20/200 = 0.10$, $Q = 4/200 = 0.02$. In both cases $K \approx 0.13$.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 77. Sinauer Associates, Inc., Sunderland, MA, USA.
31
Ex. 2: two sequences of 200 nucleotides that differ by 50 transitions and 16 transversions (L = 200)
One-parameter model: $p = 66/200 = 0.33$; $K = -(3/4) \ln[1 - (4/3)p] \approx 0.43$
Two-parameter model: $P = 50/200 = 0.25$, $Q = 16/200 = 0.08$; $K = (1/2) \ln[1/(1 - 2P - Q)] + (1/4) \ln[1/(1 - 2Q)] \approx 0.48$
32
When the degree of divergence between two sequences is large, and especially in cases where there are prior reasons to believe that the rate of transition differs from the rate of transversion, the two-parameter model tends to be more accurate than the one-parameter model.
One-parameter model: $p = 66/200 = 0.33$, $K \approx 0.43$. Two-parameter model: $P = 50/200 = 0.25$, $Q = 16/200 = 0.08$, $K \approx 0.48$.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 77. Sinauer Associates, Inc., Sunderland, MA, USA.
33
Violation of assumptions
Several assumptions have been made that are not necessarily met by the sequences under study.
1) The rate of substitution was assumed to be the same at all sites. This assumption might not hold, as the rate may vary greatly from site to site.
2) Substitutions were assumed to occur independently of one another.
3) The substitution matrix was assumed not to change over time, so that the nucleotide frequencies are maintained at a constant equilibrium value throughout their evolution.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 79. Sinauer Associates, Inc., Sunderland, MA, USA.
34
Substitution mutations
Transition: changes between A and G, or between T and C
Transversion: changes between a purine and a pyrimidine
Synonymous (silent) mutations: nucleotide changes that do not affect the amino acid sequence.
Nonsynonymous (replacement) mutations: a change in a single nucleotide in a codon can result in an amino acid replacement.
DNA          CCG       CTG       CTC
mRNA         CCG       CUG       CUC
Amino acid   Proline   Leucine   Leucine
Source: Marjorie A. Hoy 2003. Insect Molecular Genetics: An Introduction to Principles and Applications, 2nd edition, p. 23. Academic Press, USA.
Source: A. J. F. Griffiths, J. H. Miller, D. T. Suzuki, R. C. Lewontin, and W. M. Gelbart 2000. An Introduction to Genetic Analysis, 7th edition. W. H. Freeman and Company, New York, USA.
35
The transition/transversion rate ratio
Three definitions of the 'transition/transversion rate ratio' are in use:
1. The ratio of the numbers of transitional and transversional differences between the two sequences, without correcting for multiple hits, E(S)/E(V).
2. $\kappa = \alpha/\beta$, with $\kappa = 1$ meaning no rate difference between transitions and transversions.
3. The average transition/transversion ratio (R): the same as the first one but with correction for multiple hits.
Overall, R is convenient to use for comparing estimates under different models, while $\kappa$ is more suitable for formulating the null hypothesis of no transition/transversion rate difference.
Source: Ziheng Yang 2006. Computational Molecular Evolution, pp. 17-18. Oxford University Press Inc., New York, USA.
36
Models of amino acid and codon substitution Introduction With protein coding genes, we have the advantage of being able to distinguish synonymous or silent substitutions from the nonsynonymous or replacement substitutions. Synonymous and nonsynonymous mutations are under very different selection pressures and are fixed at very different rates. Thus, comparison between synonymous and nonsynonymous substitution rates provides a means to understand the effect of natural selection on the protein. This comparison does not require estimation of absolute substitution rates or knowledge of the divergence time. Source: Ziheng Yang 2006. Computational Molecular Evolution., p. 40. Oxford University Press Inc,, New York, USA.
37
Models of amino acid replacement
Empirical models attempt to describe the relative rates of substitution between two amino acids without explicitly considering factors that influence the evolutionary process. They are often constructed by analyzing large quantities of sequence data, as compiled from databases.
Mechanistic models consider the biological processes involved in amino acid substitution, such as mutational biases in the DNA and translation of the codons into amino acids, after filtering by natural selection. Mechanistic models have more interpretative power and are particularly useful for studying the forces and mechanisms of gene sequence evolution.
Source: Ziheng Yang 2006. Computational Molecular Evolution, pp. 40-41. Oxford University Press Inc., New York, USA.
38
The first empirical amino acid substitution matrix was constructed by Dayhoff and colleagues. They compiled and analyzed the protein sequences available at the time, using a parsimony argument to reconstruct ancestral protein sequences and tabulating amino acid changes along the branches of the phylogeny. Dayhoff et al. approximated the transition-probability matrix for an expected distance of 0.01 changes per site, called 1 PAM (for point accepted mutations). PAM matrices for larger distances are derived by repeated multiplication of the PAM1 matrix by itself.
Source: Ziheng Yang 2006. Computational Molecular Evolution, p. 41. Oxford University Press Inc., New York, USA.
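The idea of deriving higher PAM matrices by multiplication can be sketched with a toy example. The 3-state matrix below is entirely hypothetical and stands in for the real 20 x 20 PAM1 transition-probability matrix; only the matrix-power mechanics are illustrated.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(m, n):
    """Raise a square matrix to the n-th power by repeated multiplication."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, m)
    return result


# Hypothetical one-step transition probabilities (each row sums to 1).
TOY_PAM1 = [
    [0.990, 0.006, 0.004],
    [0.005, 0.990, 0.005],
    [0.003, 0.007, 0.990],
]

toy_pam250 = mat_power(TOY_PAM1, 250)
for row in toy_pam250:
    print([round(v, 3) for v in row])
```

Each row of the powered matrix still sums to 1, while the probability of remaining in the same state drops as the distance grows.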
39
PAM matrix
    A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V   B   Z   X   *
A   2  -2   0   0  -2   0   0   1  -1  -1  -2  -1  -1  -3   1   1   1  -6  -3   0   0   0   0  -8
R  -2   6   0  -1  -4   1  -1  -3   2  -2  -3   3   0  -4   0   0  -1   2  -4  -2  -1   0  -1  -8
N   0   0   2   2  -4   1   1   0   2  -2  -3   1  -2  -3   0   1   0  -4  -2  -2   2   1   0  -8
D   0  -1   2   4  -5   2   3   1   1  -2  -4   0  -3  -6  -1   0   0  -7  -4  -2   3   3  -1  -8
C  -2  -4  -4  -5  12  -5  -5  -3  -3  -2  -6  -5  -5  -4  -3   0  -2  -8   0  -2  -4  -5  -3  -8
Q   0   1   1   2  -5   4   2  -1   3  -2  -2   1  -1  -5   0  -1  -1  -5  -4  -2   1   3  -1  -8
E   0  -1   1   3  -5   2   4   0   1  -2  -3   0  -2  -5  -1   0   0  -7  -4  -2   3   3  -1  -8
G   1  -3   0   1  -3  -1   0   5  -2  -3  -4  -2  -3  -5   0   1   0  -7  -5  -1   0   0  -1  -8
H  -1   2   2   1  -3   3   1  -2   6  -2  -2   0  -2  -2   0  -1  -1  -3   0  -2   1   2  -1  -8
I  -1  -2  -2  -2  -2  -2  -2  -3  -2   5   2  -2   2   1  -2  -1   0  -5  -1   4  -2  -2  -1  -8
L  -2  -3  -3  -4  -6  -2  -3  -4  -2   2   6  -3   4   2  -3  -3  -2  -2  -1   2  -3  -3  -1  -8
K  -1   3   1   0  -5   1   0  -2   0  -2  -3   5   0  -5  -1   0   0  -3  -4  -2   1   0  -1  -8
M  -1   0  -2  -3  -5  -1  -2  -3  -2   2   4   0   6   0  -2  -2  -1  -4  -2   2  -2  -2  -1  -8
F  -3  -4  -3  -6  -4  -5  -5  -5  -2   1   2  -5   0   9  -5  -3  -3   0   7  -1  -4  -5  -2  -8
P   1   0   0  -1  -3   0  -1   0   0  -2  -3  -1  -2  -5   6   1   0  -6  -5  -1  -1   0  -1  -8
S   1   0   1   0   0  -1   0   1  -1  -1  -3   0  -2  -3   1   2   1  -2  -3  -1   0   0   0  -8
T   1  -1   0   0  -2  -1   0   0  -1   0  -2   0  -1  -3   0   1   3  -5  -3   0   0  -1   0  -8
W  -6   2  -4  -7  -8  -5  -7  -7  -3  -5  -2  -3  -4   0  -6  -2  -5  17   0  -6  -5  -6  -4  -8
Y  -3  -4  -2  -4   0  -4  -4  -5   0  -1  -1  -4  -2   7  -5  -3  -3   0  10  -2  -3  -4  -2  -8
V   0  -2  -2  -2  -2  -2  -2  -1  -2   4   2  -2   2  -1  -1  -1   0  -6  -2   4  -2  -2  -1  -8
B   0  -1   2   3  -4   1   3   0   1  -2  -3   1  -2  -4  -1   0   0  -5  -3  -2   3   2  -1  -8
Z   0   0   1   3  -5   3   3   0   2  -2  -3   0  -2  -5   0   0  -1  -6  -4  -2   2   3  -1  -8
X   0  -1   0  -1  -3  -1  -1  -1  -1  -1  -1  -1  -1  -2  -1   0   0  -4  -2  -1  -1  -1  -1  -8
*  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8  -8   1
40
BLOSUM (BLOcks of Amino Acid SUbstitution Matrix) http://en.wikipedia.org/wiki/BLOSUM
41
Features of these matrices:
1. Amino acids with similar physico-chemical properties tend to interchange with each other at higher rates than dissimilar amino acids (e.g., D and E, or I and V).
2. The "mutational distance" between amino acids is determined by the structure of the genetic code. Amino acids separated by differences at two or three codon positions have lower rates than amino acids separated by a difference at one codon position (e.g., R and K in nuclear proteins or in mitochondrial proteins).
Both factors may be operating at the same time.
Source: Ziheng Yang 2006. Computational Molecular Evolution, p. 42. Oxford University Press Inc., New York, USA.
42
Estimating synonymous and nonsynonymous substitution rates
Two distances are usually calculated between protein-coding DNA sequences, for synonymous and nonsynonymous substitutions, respectively.
$d_S$ or $K_S$: the number of synonymous changes per synonymous site
$d_N$ or $K_N$: the number of nonsynonymous changes per nonsynonymous site
Two classes of methods: heuristic counting methods and the ML method
Source: Ziheng Yang 2006. Computational Molecular Evolution, p. 49. Oxford University Press Inc., New York, USA.
43
Counting Methods Three steps: 1. Count synonymous and nonsynonymous sites 2. Count synonymous and nonsynonymous differences 3. Calculate the proportion of differences and correct for multiple hits Source: Ziheng Yang 2006. Computational Molecular Evolution., p. 50. Oxford University Press Inc,, New York, USA.
45
Nei and Gojobori (1986)
1. Count synonymous and nonsynonymous sites: $S$ and $N$
2. Count synonymous and nonsynonymous differences: $S_d$ and $N_d$
3. Calculate the proportions of differences, $p_S = S_d/S$ and $p_N = N_d/N$, and apply the JC69 correction for multiple hits
Source: Ziheng Yang 2006. Computational Molecular Evolution, p. 50. Oxford University Press Inc., New York, USA.
46
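[Figure: the nine single-nucleotide neighbors of the codon TTT (Phe). First position: CTT (Leu), ATT (Ile), GTT (Val). Second position: TCT (Ser), TGT (Cys), TAT (Tyr). Third position: TTC (Phe), TTA (Leu), TTG (Leu), so that 1/3 of the changes at the third position are synonymous and 2/3 are nonsynonymous.]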
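Step 1 of the counting approach (counting sites) can be sketched as follows. This is an illustration under stated assumptions: the standard genetic code is used, the names `CODE` and `synonymous_sites` are invented here, and changes to stop codons are simply counted as nonsynonymous, which is one convention among several.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Number of synonymous sites in one codon (between 0 and 3)."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            # Each of the three possible changes at a position carries weight 1/3.
            if CODE[mutant] == CODE[codon]:
                total += 1 / 3
    return total


for codon in ("TTT", "TTA", "TCA", "ATG"):
    s = synonymous_sites(codon)
    print(codon, CODE[codon], "S =", round(s, 3), "N =", round(3 - s, 3))
```

For a whole sequence, S is the sum over its codons and N = 3 x (number of codons) - S; in a pairwise comparison the two sequences' values are usually averaged.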
47
             Ser Thr Glu Met Cys Leu
Sequence 1:  TCA ACT GAG ATG TGT TTA
Sequence 2:  TCG ACA GAG ATA TGT CTA
             Ser Thr Glu Ile Cys Leu
48
Two possible pathways between CCC (Pro) and CAA (Gln) (S = synonymous step, R = replacement step):
Path I:  CCC (Pro) → CCA (Pro) → CAA (Gln); steps S, R
Path II: CCC (Pro) → CAC (His) → CAA (Gln); steps R, R
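Step 2 (counting differences) averages over the possible pathways when two codons differ at more than one position. The sketch below assumes the standard genetic code and weights all pathways equally, without excluding pathways that pass through stop codons; the function name is an assumption.

```python
from itertools import permutations, product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_differences(codon1, codon2):
    """Return (synonymous, nonsynonymous) differences, averaged over pathways."""
    changed = [i for i in range(3) if codon1[i] != codon2[i]]
    if not changed:
        return 0.0, 0.0
    paths = list(permutations(changed))
    syn = non = 0
    for order in paths:
        current = codon1
        for pos in order:
            step = current[:pos] + codon2[pos] + current[pos + 1:]
            if CODE[step] == CODE[current]:
                syn += 1
            else:
                non += 1
            current = step
    return syn / len(paths), non / len(paths)


print(count_differences("CCC", "CAA"))  # pathways I and II from the slide

seq1 = ["TCA", "ACT", "GAG", "ATG", "TGT", "TTA"]
seq2 = ["TCG", "ACA", "GAG", "ATA", "TGT", "CTA"]
pairs = [count_differences(a, b) for a, b in zip(seq1, seq2)]
print(sum(s for s, _ in pairs), sum(n for _, n in pairs))
```

For CCC and CAA, Path I contributes one synonymous and one replacement step and Path II two replacement steps, giving 0.5 synonymous and 1.5 nonsynonymous differences on average.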
50
Number of substitutions between two protein-coding genes
Nucleotide sites are classified into three classes:
Nondegenerate ($L_0$): all the possible changes at the site are nonsynonymous.
Twofold degenerate ($L_2$): one of the three possible changes is synonymous.
Fourfold degenerate ($L_4$): all possible changes at the site are synonymous.
The nucleotide differences in each class are further classified into transitional ($S_i$) and transversional ($V_i$) differences, where i = 0, 2, and 4 denote nondegeneracy, twofold degeneracy, and fourfold degeneracy, respectively.
All substitutions at nondegenerate sites are nonsynonymous. All substitutions at fourfold degenerate sites are synonymous. At twofold degenerate sites, transitional changes are synonymous, whereas transversional changes are nonsynonymous.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 83. Sinauer Associates, Inc., Sunderland, MA, USA.
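The three degeneracy classes can be assigned mechanically from the genetic code. The sketch below assumes the standard code written with T instead of U, and it assigns the few sites with two synonymous alternatives (such as the third position of the isoleucine codons) to the twofold class, which is a simplifying assumption. The function name is illustrative.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degeneracy(codon, pos):
    """Classify position pos (0, 1, or 2) of a codon as 0-, 2-, or 4-fold degenerate."""
    synonymous = sum(
        1
        for base in BASES
        if base != codon[pos]
        and CODE[codon[:pos] + base + codon[pos + 1:]] == CODE[codon]
    )
    # 0 synonymous changes -> nondegenerate; 1 or 2 -> twofold; 3 -> fourfold.
    return {0: 0, 1: 2, 2: 2, 3: 4}[synonymous]


for codon in ("TTT", "TTA", "TCT", "ATT"):
    print(codon, CODE[codon], [degeneracy(codon, pos) for pos in range(3)])
```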
51
[Figure: excerpt of the genetic code: UUU Phe, UUC Phe, UUA Leu, UUG Leu; UCU Ser, UCC Ser, UCA Ser, UCG Ser. Labels: twofold degenerate (transition), fourfold degenerate, transversion, nondegenerate.]
52
The proportion of transitional differences at i-fold degenerate sites between two sequences is calculated as
$P_i = S_i / L_i$
Similarly, the proportion of transversional differences at i-fold degenerate sites between two sequences is calculated as
$Q_i = V_i / L_i$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 84. Sinauer Associates, Inc., Sunderland, MA, USA.
53
Kimura's two-parameter method is used to estimate the numbers of transitional ($A_i$) and transversional ($B_i$) substitutions per ith type of site:
$A_i = (1/2) \ln(a_i) - (1/4) \ln(b_i)$
$B_i = (1/2) \ln(b_i)$
where $a_i = 1/(1 - 2P_i - Q_i)$ and $b_i = 1/(1 - 2Q_i)$.
One-parameter model: $K = -(3/4) \ln[1 - (4/3)p]$. Two-parameter model: $K_i = A_i + B_i$.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 84. Sinauer Associates, Inc., Sunderland, MA, USA.
54
The total number of substitutions per ith type of degenerate site, $K_i$, is given by
$K_i = A_i + B_i$
$A_2$ and $B_2$ denote the numbers of synonymous and nonsynonymous substitutions per twofold degenerate site, respectively.
$K_4 = A_4 + B_4$ denotes the number of synonymous substitutions per fourfold degenerate site.
$K_0 = A_0 + B_0$ denotes the number of nonsynonymous substitutions per nondegenerate site.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 84. Sinauer Associates, Inc., Sunderland, MA, USA.
55
Then the number of synonymous substitutions per synonymous site ($K_S$) and the number of nonsynonymous substitutions per nonsynonymous site ($K_A$) can be obtained by
$K_S = \frac{L_2 A_2 + L_4 K_4}{(L_2/3) + L_4}$
$K_A = \frac{L_2 B_2 + L_0 K_0}{(2L_2/3) + L_0}$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 85. Sinauer Associates, Inc., Sunderland, MA, USA.
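Putting the pieces together, the sketch below computes $K_S$ and $K_A$ from site counts and difference counts by degeneracy class, following the formulas above. The function names and all counts are hypothetical and serve only to show how the quantities combine.

```python
import math


def k2p_components(transitions, transversions, sites):
    """Return (A_i, B_i) for one degeneracy class from its S_i, V_i, and L_i."""
    P = transitions / sites
    Q = transversions / sites
    a = 1 / (1 - 2 * P - Q)
    b = 1 / (1 - 2 * Q)
    A = 0.5 * math.log(a) - 0.25 * math.log(b)
    B = 0.5 * math.log(b)
    return A, B


def ks_ka(L, S, V):
    """L, S, V map the class (0, 2, 4) to sites, transitions, and transversions."""
    A0, B0 = k2p_components(S[0], V[0], L[0])
    A2, B2 = k2p_components(S[2], V[2], L[2])
    A4, B4 = k2p_components(S[4], V[4], L[4])
    K0 = A0 + B0
    K4 = A4 + B4
    K_S = (L[2] * A2 + L[4] * K4) / (L[2] / 3 + L[4])
    K_A = (L[2] * B2 + L[0] * K0) / (2 * L[2] / 3 + L[0])
    return K_S, K_A


# Hypothetical counts for a pair of aligned coding sequences.
L = {0: 600, 2: 200, 4: 150}
S = {0: 6, 2: 10, 4: 15}
V = {0: 6, 2: 2, 4: 9}
print(ks_ka(L, S, V))
```

With these made-up counts the synonymous rate comes out well above the nonsynonymous rate, the pattern typically expected under purifying selection.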
56
Li (1993) and Pamilo and Bianchi (1993) proposed calculating the number of synonymous substitutions by taking $(L_2 A_2 + L_4 A_4)/(L_2 + L_4)$ as an estimate of the transitional component of nucleotide substitution at twofold and fourfold degenerate sites:
$K_S = \frac{L_2 A_2 + L_4 A_4}{L_2 + L_4} + B_4$
$K_A = \frac{L_2 B_2 + L_0 B_0}{L_2 + L_0} + A_0$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 85. Sinauer Associates, Inc., Sunderland, MA, USA.
57
Indirect estimation of the number of nucleotide substitutions
Indirect estimates of K values are subject to much larger sampling errors than those based on direct comparisons of nucleotide sequences.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 85. Sinauer Associates, Inc., Sunderland, MA, USA.
58
Number of amino acid replacements between two proteins
From the comparison of two amino acid sequences, we can calculate the observed proportion of different amino acids between the two sequences as
$p = n/L$
where n is the number of amino acid differences between the two sequences and L is the length of the aligned sequences. A simple model that can be used to convert p into the number of amino acid replacements between two sequences is the Poisson process. The number of amino acid replacements per site, d, is estimated as
$d = -\ln(1 - p)$
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 86. Sinauer Associates, Inc., Sunderland, MA, USA.
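The Poisson correction is a one-line computation. A minimal sketch follows; the function name and the counts are assumptions chosen for illustration.

```python
import math


def poisson_distance(n_differences, length):
    """Estimate replacements per site as d = -ln(1 - p), with p = n / L."""
    p = n_differences / length
    if p >= 1:
        raise ValueError("the correction is undefined when every site differs")
    return -math.log(1 - p)


# Hypothetical comparison: 10 differences over 100 aligned residues.
print(poisson_distance(10, 100))
```

As with the nucleotide corrections, d is always at least as large as p, and the gap widens as the sequences diverge.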
59
Comparison of two homologous sequences involves the identification of the locations of deletions and insertions that might have occurred in either of the two lineages since their divergence from a common ancestor. This process is referred to as sequence alignment.
There are three types of aligned pairs:
A matched pair is one in which the same nucleotide appears in both sequences.
A mismatched pair is a pair in which different nucleotides are found in the two sequences.
A gap is a pair consisting of a base from one sequence and a null base from the other. Null bases are denoted by -. A gap indicates that a deletion has occurred in one sequence or an insertion has occurred in the other.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 86. Sinauer Associates, Inc., Sunderland, MA, USA.
60
[Figure: two sequences, ATTGTCAAAGGCTTGAGCTGATGCAT and GGCAGGCTTTACTTACAAGGGTATCG, annotated with the range of alignment, a mismatch, and a gap.]
Score: S = (identities, mismatches) - (gap penalties); the alignment sought is the one with Max(S).
61
In evolutionary terms, each pair in an alignment represents an inference concerning positional homology, i.e., a claim to the effect that the two members of the pair descended from a common ancestral nucleotide.
ATCGCATGGTTAACGACTG
ATCACATGGTTAA--ACTCACC
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 87. Sinauer Associates, Inc., Sunderland, MA, USA.
62
Manual alignment by visual inspection Advantages: 1)it uses the most powerful and trainable of all tools – the brain, 2)it allows the direct integration of additional data. The main disadvantage of this method is that it is subjective and unscalable, i.e., its results cannot be compared to those derived from other methods. Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 87. Sinauer Associates, Inc. Sunderland, MA, USA.
63
The dot matrix In a dot matrix, the two sequences to be aligned are written out as column and row headings of a two-dimensional matrix. A dot is put in the dot matrix plot at a position where the nucleotides in the two sequences are identical. The alignment is defined by a path through the matrix starting with the upper-left element and ending with the lower-right element. There are four possible types of steps in this path: 1)a diagonal step through a dot indicates a match, 2)a diagonal step through an empty element of the matrix indicates a mismatch, 3)a horizontal step indicates a null nucleotide in the sequence on the top of the matrix, 4)a vertical step indicates a null nucleotide in the sequence on the left of the matrix. Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 87. Sinauer Associates, Inc. Sunderland, MA, USA.
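A dot matrix is straightforward to build. The sketch below marks identical nucleotides with `*` and empty elements with `.`; the function names and the two short sequences are illustrative assumptions.

```python
def dot_matrix(top, left):
    """Rows follow the sequence on the left, columns the sequence on top."""
    return [[a == b for b in top] for a in left]


def render(top, left):
    """Return the dot matrix as text, with the sequences as headings."""
    lines = ["  " + " ".join(top)]
    for base, row in zip(left, dot_matrix(top, left)):
        lines.append(base + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


print(render("TCAGACGATTG", "TCGGAGCTG"))
```

Runs of dots along a diagonal correspond to stretches of consecutive matches, and a shift from one diagonal to another corresponds to a gap.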
64
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 88. Sinauer Associates, Inc. Sunderland, MA, USA.
65
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 90. Sinauer Associates, Inc. Sunderland, MA, USA.
66
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 91. Sinauer Associates, Inc. Sunderland, MA, USA.
67
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 92. Sinauer Associates, Inc. Sunderland, MA, USA.
68
Distance and similarity methods
The best possible alignment between two sequences, or the optimal alignment, is the one in which the numbers of mismatches and gaps are minimized according to certain criteria. Unfortunately, reducing the number of mismatches usually results in an increase in the number of gaps, and vice versa.
A: TCAGACGATTG   ($L_A = 11$)
B: TCGGAGCTG     ($L_B = 9$)
(I)   TCAG-ACG-ATTG    # of mismatches = 0
      TC-GGA-GC-T-G    # of gaps = 6
(II)  TCAGACGATTG      # of mismatches = 5
      TCGGAGCTG--      # of gaps = 1
(III) TCAG-ACGATTG     # of mismatches = 2
      TC-GGA-GCTG-     # of gaps = 4
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 90. Sinauer Associates, Inc., Sunderland, MA, USA.
69
As a consequence, we must find a common denominator with which to compare gaps and mismatches. The common denominator is called the gap penalty or gap cost. The gap penalty is a factor by which gap values are multiplied to make the gaps equivalent in value to the mismatches. For any given alignment, we can calculate a distance or dissimilarity index (D) between the two sequences in the alignment as
$D = \sum_i m_i y_i + \sum_k w_k z_k$
where $y_i$ is the number of mismatches of type i, $m_i$ is the mismatch penalty for an i-type mismatch, $z_k$ is the number of gaps of length k, and $w_k$ is a positive number representing the penalty for gaps of length k.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 93. Sinauer Associates, Inc., Sunderland, MA, USA.
70
Alternatively, the similarity between two sequences in an alignment may be measured by a similarity index (S). For any given alignment, the similarity between two sequences is
$S = x - \sum_k w_k z_k$
where x is the number of matches, $z_k$ is the number of gaps of length k, and $w_k$ is a positive number representing the penalty for gaps of length k.
In the most frequently used gap penalty systems, it is assumed that the gap penalty has two components, a gap-opening penalty and a gap-extension penalty.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 93. Sinauer Associates, Inc., Sunderland, MA, USA.
71
Using a linear gap penalty system in which the mismatch penalty is 1, the gap-opening penalty is 2, and the gap-extension penalty is 6:
(I)   TCAG-ACG-ATTG    # of mismatches = 0
      TC-GGA-GC-T-G    # of gaps = 6
      D = (0 x 1) + (6 x 2) + 6(1 - 1) = 12
(II)  TCAGACGATTG      # of mismatches = 5
      TCGGAGCTG--      # of gaps = 1
      D = (5 x 1) + (1 x 2) + 6(2 - 1) = 13
(III) TCAG-ACGATTG     # of mismatches = 2
      TC-GGA-GCTG-     # of gaps = 4
      D = (2 x 1) + (4 x 2) + 6(1 - 1) = 10
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 90. Sinauer Associates, Inc., Sunderland, MA, USA.
72
Using a different penalty system in which the mismatch penalty is 1, the gap-opening penalty is 3, and the gap-extension penalty is 0:
(I)   TCAG-ACG-ATTG    # of mismatches = 0
      TC-GGA-GC-T-G    # of gaps = 6
      D = (0 x 1) + (6 x 3) = 18
(II)  TCAGACGATTG      # of mismatches = 5
      TCGGAGCTG--      # of gaps = 1
      D = (5 x 1) + (1 x 3) = 8
(III) TCAG-ACGATTG     # of mismatches = 2
      TC-GGA-GCTG-     # of gaps = 4
      D = (2 x 1) + (4 x 3) = 14
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 94. Sinauer Associates, Inc., Sunderland, MA, USA.
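The D values for the three alignments can be reproduced with a short function. This sketch assumes that a gap of length k costs the gap-opening penalty plus the gap-extension penalty times (k - 1), which matches the worked calculations; the function name is an assumption.

```python
def alignment_distance(top, bottom, mismatch, gap_open, gap_ext):
    """Dissimilarity index D of an alignment written as two equal-length rows."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    mismatches = 0
    gap_lengths = []
    previous = None  # row that carried a null base in the previous column, if any
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            row = "top" if a == "-" else "bottom"
            if row == previous:
                gap_lengths[-1] += 1   # the current gap is extended
            else:
                gap_lengths.append(1)  # a new gap is opened
            previous = row
        else:
            previous = None
            if a != b:
                mismatches += 1
    gap_cost = sum(gap_open + gap_ext * (k - 1) for k in gap_lengths)
    return mismatch * mismatches + gap_cost


alignments = [
    ("TCAG-ACG-ATTG", "TC-GGA-GC-T-G"),
    ("TCAGACGATTG", "TCGGAGCTG--"),
    ("TCAG-ACGATTG", "TC-GGA-GCTG-"),
]
print([alignment_distance(a, b, 1, 2, 6) for a, b in alignments])
print([alignment_distance(a, b, 1, 3, 0) for a, b in alignments])
```

Alignment (III) has the lowest D under the first system and alignment (II) under the second, which shows how strongly the preferred alignment depends on the penalties chosen.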
73
Alignment algorithms
The Needleman-Wunsch algorithm uses dynamic programming, which is a general computational technique used in many fields of study. Dynamic programming can be applied to alignment problems because similarity indices obey the following rule:
$S_{1 \to x, 1 \to y} = \max S_{1 \to x-1, 1 \to y-1} + S_{x,y}$
in which $S_{1 \to x, 1 \to y}$ is the similarity index for the two sequences up to residue x in the first sequence and residue y in the second sequence, $\max S_{1 \to x-1, 1 \to y-1}$ is the similarity index for the best alignment up to residue x-1 in the first sequence and y-1 in the second sequence, and $S_{x,y}$ is the similarity score for aligning residues x and y.
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution, p. 94. Sinauer Associates, Inc., Sunderland, MA, USA.
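A compact version of the dynamic-programming idea is sketched below. It computes only the optimal global similarity score, not the alignment itself. It also assumes a simple scoring scheme (match +1, mismatch -1, and a flat penalty of 1 per gap position) that is not taken from the slides, and it adds the two gap steps alongside the diagonal step in the rule above.

```python
def needleman_wunsch_score(seq1, seq2, match=1, mismatch=-1, gap=-1):
    """Best global alignment score under a per-position gap penalty."""
    rows, cols = len(seq1) + 1, len(seq2) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq1[i - 1] == seq2[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align residue i with residue j
                score[i - 1][j] + gap,       # residue i faces a null base
                score[i][j - 1] + gap,       # residue j faces a null base
            )
    return score[rows - 1][cols - 1]


print(needleman_wunsch_score("GATTACA", "GCATGCG"))
print(needleman_wunsch_score("TCAGACGATTG", "TCGGAGCTG"))
```

Each cell is filled from its three neighbors, so the running time grows with the product of the two sequence lengths rather than with the number of possible alignments.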
74
Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 96. Sinauer Associates, Inc. Sunderland, MA, USA.
75
Multiple sequence alignment

Multiple sequence alignment can be viewed as an extension of pairwise sequence alignment, but the complexity of the computation grows exponentially with the number of sequences being considered, and it is therefore not feasible to search exhaustively for the optimal alignment. Most programs use some sort of incremental or progressive algorithm, in which a new sequence is added to a group of already aligned sequences in order of decreasing similarity. It is usually advisable to inspect the final multiple alignment, as such alignments can frequently be improved by visual inspection. Source: Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 97. Sinauer Associates, Inc. Sunderland, MA, USA.
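To make the exponential growth concrete, the sketch below assumes that the pairwise dynamic-programming table is extended directly to N sequences, so that an exact search fills a lattice with (L + 1)^N cells for sequences of length L. The function name `dp_cells` and the sequence length of 100 are assumptions chosen for illustration.

```python
def dp_cells(length, n_sequences):
    """Number of cells in an N-dimensional dynamic-programming lattice
    for n_sequences sequences, each of the given length."""
    return (length + 1) ** n_sequences


for n in (2, 3, 5, 10):
    print(n, dp_cells(100, n))
# 2 sequences  -> 10201 cells
# 3 sequences  -> 1030301 cells
# 10 sequences -> about 1.1e20 cells
```

Each added sequence multiplies the table size by L + 1, which is why progressive heuristics are used in place of an exhaustive search.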
76
76 WorkLicensingAuthor/SourcePage “Calculate of the distance between two sequences is the simplest phylogenetic analysis … The distance between two nucleotide sequences is defined as the expected number of nucleotide substitutions per site.” Ziheng Yang 2006. Computational Molecular Evolution., p. 3. Oxford University Press Inc,, New York, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P2 “A simplest distance measure is the proportion of different sites, … site (i.e., some changes are hidden) Note: p is usable only for high similar sequences, with p < 5%.” Ziheng Yang 2006. Computational Molecular Evolution., p. 3. Oxford University Press Inc,, New York, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P3 Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 75. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. 
Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P4 “This simple model assumes that substitutions occur with … is . Because the model involves a single parameter, it is called the one-parameter model Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 68. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P5 Copyright Declaration
77
77 WorkLicensingAuthor/SourcePage National Taiwan University Chau-Ti Ting P5, P6 “Since we start with A, the probability hat this site is occupied by A at time … probability that A has remained unchanged.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 68. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P6 “The probability of having A at time 2 is … 1) the nucleotide has remained unchanged from time 0 to time 2, and 2) the nucleotide has changed to T, C, or G at time 1, but has subsequently reverted to A at time 2.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 68. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P7 “Using the above formulation, we can show that the following recurrence equation applies to any … We can rewrite this equation in terms of the amount of change in P A(t) per unit time as” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 69. Sinauer Associates, Inc. Sunderland, MA, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P8 National Taiwan University Chau-Ti Ting P13
78
78 WorkLicensingAuthor/SourcePage “In this model, the rate of transitional substitution at each nucleotide site is per unit time, whereas the rate of each transversional substitution is per unit time. “ Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 71. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P14 National Taiwan University Chau-Ti Ting P14, P15 “Let us consider the probability that a site that has A at time 0 will have A time t. After one time unit, … of A changing to either C or T is 2 . Thus the probability of A remaining unchanged after one time unit is “ Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 72. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P15 “Let Y (t) = the probability that the initial nucleotide and the nucleotide at time t differ from each other by a transition. “ Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 73. Sinauer Associates, Inc. Sunderland, MA, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P19 “The probability, Z (t), that the initial nucleotide and the nucleotide at time t differ by a specific type of transversion is given by “ Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 73. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P20
79
79 WorkLicensingAuthor/SourcePage "Note that each nucleotide subject to two types of transversion, but only one type of transition. Also” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 74. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P21 “If two sequences of length N differ from each other at n site, then the proportion of … the actual number of substitutions due to multiple substitution or multiple hit at the same site.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 74. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P22 Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 75. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. 
Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P23 Let us start with one- parameter model. In this model, it is sufficient to consider only I (t), which is the probability that the nucleotide …, C, G at this site are P 2 AT(t) P 2 AC(t) P 2 AG(t), respectively. Therefore, Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 76. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P24
80
80 WorkLicensingAuthor/SourcePage “Note that the probability that the two sequences are different at a site at time t is p = 1 I (t). Thus,” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 76. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P25 National Taiwan University Chau-Ti Ting P26 “The time of divergence between two sequences is usually given not known, and thus we can not estimate . Instead, we compute … K= 2(3 t), where 3 t is the number of substitutions per site in a single lineage.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 76. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P26 “Where p is observed proportion of different nucleotides between two sequences.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 76. Sinauer Associates, Inc. Sunderland, MA, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P27 “In the case of two-parameter model, the differences between two sequences are classified into transitions and … substitutions per site between two sequences, K, is estimated by “ Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 76. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P28
81
81 WorkLicensingAuthor/SourcePage “In this example, the two models give essentially the same estimate because the … substitutions, K) is only only slightly larger than the uncorrected value (i.e., the number of nucleotide differences, p).” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 77. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P30 “When the degree of divergence between two sequences is large, and especially in cases where … of transversion, the two parameter model tends to be more accurate than the one- parameter model.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 77. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P32 “Several assumptions have been made that are not necessary … change in time, so that the nucleotide frequencies are maintained at a constant equilibrium value throughout their evolution.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 79. Sinauer Associates, Inc. Sunderland, MA, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P33 “Transitionchanges beween A and G, or between T and C Transversionchanges between a purine and a pyrimidine” Marjorie A. Hoy 2003. Insect molecular genetics: an introduction to principles and applications, 2 nd edition, p. 23. Academic Press. USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P34
82
82 WorkLicensingAuthor/SourcePage “Synonymous (silent mutations) Nucleotide changes do not effect amino acid sequence. A. J. F. Griffiths, J. H. Miller, D. T. Suzuki, R. C. Lewontin, and W. M. Gelbart. 2000. An Introduction to Genetic Analysis, 7th edition. W. H. Freeman and Company. New York, USA. http://www.ncbi.nlm.nih.gov/books/NBK21878/ It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P34 “Nonsynonymous (replacement mutations) A change in single nucleotide in a codon can result in an amino acid replacement.” A. J. F. Griffiths, J. H. Miller, D. T. Suzuki, R. C. Lewontin, and W. M. Gelbart. 2000. An Introduction to Genetic Analysis, 7th edition. W. H. Freeman and Company. New York, USA. http://www.ncbi.nlm.nih.gov/books/NBK21878/ It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P34 “Three definitions of the ‘transition/transversion rate ratio’ are in use … ratio (R): same as the first one but with correction” Ziheng Yang 2006. Computational Molecular Evolution., p. 17. Oxford University Press Inc,, New York, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P35 “Overall, R is convenient to use for comparing estimates under different models, while is more suitable for formulating the null hypothesis of no transition/transversion rate difference.” Ziheng Yang 2006. Computational Molecular Evolution., p. 18. Oxford University Press Inc,, New York, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P35
83
83 WorkLicensingAuthor/SourcePage “With protein coding genes, we have the advantage of being able to distinguish … comparison does not require estimation of absolute substitution rates or knowledge of the divergence time.” Ziheng Yang 2006. Computational Molecular Evolution., p. 40. Oxford University Press Inc,, New York, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P36 “Empirical models attempts to describe the relative rates of substitution between two amino acids … interpretative power and are particular useful for study the forces and mechanisms of gene sequence evolution” Ziheng Yang 2006. Computational Molecular Evolution., p. 41 Oxford University Press Inc,, New York, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P37 “The first empirical amino acid substitution matrix was constructed by Dayhoff and colleagues … multiplication of the PAM1 matrix.” Ziheng Yang 2006. Computational Molecular Evolution., p. 41 Oxford University Press Inc,, New York, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P38 National Taiwan University Chau-Ti Ting P39 Features of these matrices: … Both factors may be operating at the same time. Ziheng Yang 2006. Computational Molecular Evolution., p. 42. Oxford University Press Inc,, New York, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P41
84
84 WorkLicensingAuthor/SourcePage “Two distances are usually calculated between protein- coding DNA sequences, … methods: heuristic counting methods and the ML method” Ziheng Yang 2006. Computational Molecular Evolution., p. 49. Oxford University Press Inc,, New York, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P42 “Three steps: 1. Count synonymous and nonsynonymous sites 2. Count synonymous and nonsynonymous differences 3. Calculate the proportion of differences and correct for multiple hits” Ziheng Yang 2006. Computational Molecular Evolution., p. 50. Oxford University Press Inc,, New York, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P43 Wikipedia http://en.wikipedia.org/wiki/Codon It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 Wikipedia Fundation Terms of UseTerms of Use P44 National Taiwan University Chau-Ti Ting P44 “Nei and Gojobori (1986) 1. Count synonymous and nonsynonymous sites: S and N … (p S and p N ) as apply the JC69 correction for multiple hits “ Ziheng Yang 2006. Computational Molecular Evolution., p. 50. Oxford University Press Inc,, New York, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P45, P49
85
85 WorkLicensingAuthor/SourcePage National Taiwan University Chau-Ti Ting P46 National Taiwan University Chau-Ti Ting P47 National Taiwan University Chau-Ti Ting P48 “nondegernerate (L 0 ): all the possible changes at this site … At twofold degenerate site, transitional changes are synonymous, whereas transversitional changes are nonsynonymous.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 83. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P50, P52 National Taiwan University Chau-Ti Ting P51 “The proportion of transitional differences at i- fold degenerate … degenerate sites between two sequences is calculated as” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 84. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P53
86
86 WorkLicensingAuthor/SourcePage “Kimura’s two-parameter method is used to estimate the number of transitional (A i ) and transversional (B i ) substitutions per ith type site … Where a i =1/(1– 2 P i –Q i ), b i = 1/(1– 2Q i )” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 84. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P54 “The total number of substitutions per ith type … denote the numbers of nonsynonymous substitutions per nondegenerate site.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 84. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P55 “then, the number of synonymous substitutions per synonymous site (K S ) and the number of nonsynonymous substitutions per nonsynonymous site (K A ) can be obtained by” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 85. Sinauer Associates, Inc. Sunderland, MA, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P56 “Li (1993) and Pamilo and Bianchi (1993) proposed to calculated the number of symnonymous substitution by … of the transition component of nucleotide substitution at twofold and fourfold degenerate site.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 85. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P57
87
87 WorkLicensingAuthor/SourcePage “Indirect estimate of K values are subject to much larger sampling errors than those based on direct comparisons of nucleotide sequence” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 85. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P58 National Taiwan University Chau-Ti Ting P58 “From the comparison of two amino acid sequences, … The number of amino acid replacements per site, d, is estimated as” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 86. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P59 Comparison of two homologous sequences involves the identification if the location of deletions and insertions that might have … A gap indicates that a deletion has occurred in one sequence or an insertion has occurred in” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 86. Sinauer Associates, Inc. Sunderland, MA, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P60 National Taiwan University Chau-Ti Ting P61
88 WorkLicensingAuthor/SourcePage “In evolutionary terms, each pair in an alignment represents an inference concerning positional homology, i.e., a claim to the effect that the two members of the pair descended from a common ancestral nucleotide.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 87. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P62 National Taiwan University Chau-Ti Ting P62 “Advantages: 1)it uses the most powerful and trainable of all tools – the brain, 2)it allows the direct … its results cannot be compared to those derived from other methods.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 87. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P63 “In a dot matrix, the two sequences to be aligned are written out as column and row headings of … a vertical step indicates a null nucleotide in the sequence on the left of the matrix.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 87. Sinauer Associates, Inc. Sunderland, MA, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P64 Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 88. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P65
89 WorkLicensingAuthor/SourcePage Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 90. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P66 Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 91. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P67 Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 91. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P67 Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 92. Sinauer Associates, Inc. Sunderland, MA, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P68
90 WorkLicensingAuthor/SourcePage “The best possible alignment between two sequences, or the optimal alignment, is the one in which the numbers of mismatches and gaps are minimized according to certain criteria. … TC-GGA- GCTG-# of gaps = 4” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 90. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P69 “As a consequence, we must find a common denominator with which to compare gaps and mismatches. … z k is the number of gaps of length k, and w k is a positive number representing the penalty of gaps of length k” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 93. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P70 “Alternatively, the similarity between two sequences in an alignment may be measured by … gap penalty systems, it is assumed that the gap penalty has two components, a gap-opening penalty and a gap-extension penalty.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 93. Sinauer Associates, Inc. Sunderland, MA, USA. 
It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P71 Using a linear gap penalty system in which the mismatch penalty is 1, the gap-open penalty … D = (2 x 1)+(4 x 2)+6 (1–1)=10 Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 90. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P72
91 WorkLicensingAuthor/SourcePage “Using a different penalty system in which the mismatch penalty is 1, the gap-open penalty is 3 and the … D = (2 x 1)+(4 x 3)=14” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 94. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P73 “The Needleman-Wunsch algorithm used dynamic programming, which is a general computational technique used in many fields of study. … is the similarity score for aligning residues x and y.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 94. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P74 Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 96. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. 
Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P75 “Multiple sequence alignment can be viewed as an extension of pairwise sequences … as such alignments can be frequently improved by visual inspection.” Dan Graur and Wen-Hsiung Li 2000. Fundamentals of Molecular Evolution., p. 97. Sinauer Associates, Inc. Sunderland, MA, USA. It is used subject to the fair use doctrine of: Taiwan Copyright Act Articles 52 & 65 The "Code of Best Practices in Fair Use for OpenCourseWare 2009 (http://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf)" by A Committee of Practitioners of OpenCourseWare in the U.S. The contents are based on Section 107 of the 1976 U.S. Copyright Acthttp://www.centerforsocialmedia.org/sites/default/files/10-305-OCW- Oct29.pdf P76
Synonymous sites = 0 + 0 + 1/3
Non-synonymous sites = 1 + 1 + 2/3
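The per-position counts above can be reproduced with a short sketch. The slide does not name the codon, so TTT (Phe) is used here only as an assumed example that happens to give the same fractions under the standard genetic code. The function name `site_counts` is hypothetical. For each codon position, the fraction of the three possible single-nucleotide changes that leave the amino acid unchanged is counted as synonymous, and the remainder as non-synonymous.

```python
from fractions import Fraction
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def site_counts(codon):
    """Return (synonymous sites, non-synonymous sites) for one codon.

    Simplifying assumption: a change to a stop codon is counted as
    non-synonymous rather than being excluded.
    """
    syn = Fraction(0)
    for pos in range(3):
        unchanged = 0
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                unchanged += 1
        # Each position contributes the fraction of its 3 changes that are synonymous.
        syn += Fraction(unchanged, 3)
    return syn, 3 - syn


syn, nonsyn = site_counts("TTT")
print(f"Synonymous sites = {syn}, Non-synonymous sites = {nonsyn}")
```

For TTT, only the third-position change to TTC keeps phenylalanine, so the totals are 1/3 synonymous and 8/3 non-synonymous sites, matching 0 + 0 + 1/3 and 1 + 1 + 2/3.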