MAT 4830 Mathematical Modeling


1 MAT 4830 Mathematical Modeling
4.5 Phylogenetic Distances I

2 Preview
Phylogenetic: of or relating to the evolutionary development of organisms.
Estimate the total number of mutations (observed and hidden).

3 Example from 4.1 S0 : Ancestral sequence S1 : Descendant of S0

4 Example from 4.1 S0 : Ancestral sequence S1 : Descendant of S0
Observed mutations: 2

5 Example from 4.1 S0 : Ancestral sequence S1 : Descendant of S0
Actual mutations: 5

6 Example from 4.1 S0 : Ancestral sequence S1 : Descendant of S0
Actual mutations: 5 (some are hidden mutations)

7 Distance of Two Sequences
We want to define the “distance” between two sequences. It measures the average number of mutations per site that occurred, including the hidden ones.

8 Distance of Two Sequences
Let d(S0,S) be the distance between sequences S0 and S. What properties “should” it have? 1. 2. 3.

9 Jukes-Cantor Model Assume α is small.
Mutations per time step are “rare”.

10 Jukes-Cantor Model
q(t) = conditional probability that the base at time t is the same as the base at time 0

11 Jukes-Cantor Model
q(t) = fraction of sites with no observed mutations

12 Jukes-Cantor Model
p(t) = 1 - q(t) = fraction of sites with observed mutations
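
The definitions of q(t) and p(t) can be checked numerically. The sketch below is Python rather than the Maple used in the course, and it assumes the usual Jukes-Cantor setup, which the transcript does not spell out: in each time step a base changes with probability α, and each of the three other bases is equally likely (α/3 each). The function names are illustrative only.

```python
def jc_same_base_probability(alpha: float, t: int) -> float:
    """Return q(t), the probability that a site shows its time-0 base after t steps."""
    q = 1.0  # at time 0 the base trivially equals itself
    for _ in range(t):
        # Either the site already shows the original base and does not mutate
        # (probability 1 - alpha), or it shows another base and mutates back
        # to the original one (probability alpha / 3).
        q = q * (1.0 - alpha) + (1.0 - q) * (alpha / 3.0)
    return q


def jc_closed_form(alpha: float, t: int) -> float:
    """Closed form of the same recursion: q(t) = 1/4 + (3/4)(1 - 4*alpha/3)^t."""
    return 0.25 + 0.75 * (1.0 - 4.0 * alpha / 3.0) ** t


alpha, t = 0.01, 50  # illustrative values, not taken from the slides
q = jc_same_base_probability(alpha, t)
p = 1.0 - q  # fraction of sites expected to show an observed mutation
print(q, jc_closed_form(alpha, t), p)
```

The step-by-step value and the closed form should agree up to floating-point error, and q(t) tends to 1/4 as t grows.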

14 Jukes-Cantor Model
p can be estimated from the two sequences
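
One way to estimate p from two aligned sequences is to count the sites where they differ and divide by the sequence length. The Python sketch below uses a hypothetical function name and made-up 8-base sequences; they are not the sequences from the slides.

```python
def observed_fraction(s0: str, s1: str) -> float:
    """Estimate p as the fraction of sites at which s0 and s1 differ."""
    if len(s0) != len(s1):
        raise ValueError("sequences must be aligned and of equal length")
    if not s0:
        raise ValueError("sequences must be non-empty")
    differing = sum(a != b for a, b in zip(s0, s1))
    return differing / len(s0)


# Made-up example: the sequences differ at 2 of 8 sites.
print(observed_fraction("ACGTACGT", "ACGAACTT"))  # 0.25
```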

15 Example from 4.1 Observed mutations: 2

16 Jukes-Cantor Distance
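Given p (and t), the J-C distance between two sequences S0 and S1 is defined as $d_{JC}(S_0, S_1) = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$

17 Jukes-Cantor Distance
Given p (and t), the J-C distance between two sequences S0 and S1 is defined as $d_{JC}(S_0, S_1) = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$ Why?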
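
A sketch of the usual argument follows. The formulas on the slides did not survive in this transcript, so the derivation assumes the standard Jukes-Cantor result for q(t) and the small-α approximation ln(1 - x) ≈ -x. The distance is the expected number of mutations per site, αt.

```latex
\begin{align*}
q(t) &= \tfrac{1}{4} + \tfrac{3}{4}\left(1 - \tfrac{4}{3}\alpha\right)^{t} \\
p = 1 - q(t) &= \tfrac{3}{4}\left[1 - \left(1 - \tfrac{4}{3}\alpha\right)^{t}\right] \\
\left(1 - \tfrac{4}{3}\alpha\right)^{t} &= 1 - \tfrac{4}{3}p \\
t &= \frac{\ln\left(1 - \tfrac{4}{3}p\right)}{\ln\left(1 - \tfrac{4}{3}\alpha\right)} \\
d_{JC} = \alpha t &\approx \frac{\alpha \ln\left(1 - \tfrac{4}{3}p\right)}{-\tfrac{4}{3}\alpha}
  = -\tfrac{3}{4}\ln\left(1 - \tfrac{4}{3}p\right)
\end{align*}
```

Neither α nor t appears in the final expression, so the distance can be computed from p alone.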

18 Jukes-Cantor Distance

19 Jukes-Cantor Distance

20 Jukes-Cantor Distance

21 Example from 4.3
Suppose the 40-base ancestral and descendant DNA sequences are

23 Example from 4.3
0.275 observed substitutions per site.
About 0.34 substitutions estimated per site.

24 Example from 4.3
11 observed substitutions. 13.7 substitutions estimated.
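
The numbers in this example can be reproduced with a few lines of Python, assuming the standard Jukes-Cantor distance formula d = -(3/4) ln(1 - 4p/3). The function name is illustrative.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor distance for an observed fraction p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 3/4 for the logarithm to exist")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


sites = 40
p = 11 / sites                  # 11 observed substitutions, so p = 0.275
d = jukes_cantor_distance(p)    # estimated substitutions per site
print(round(d, 4))              # 0.3426
print(round(sites * d, 1))      # 13.7
```

The estimate exceeds the 11 observed substitutions because it also accounts for hidden mutations.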

25 Performance of JC distance (Homework Problem 4)
Write a program to simulate the mutations of a sequence over t time steps using the Jukes-Cantor model with parameter α.

26 Performance of JC distance (Homework Problem 4)
Write a program to simulate the mutations of a sequence over t time steps using the Jukes-Cantor model with parameter α. Count the number of base substitutions that occurred.

27 Performance of JC distance (Homework Problem 4)
Write a program to simulate the mutations of a sequence over t time steps using the Jukes-Cantor model with parameter α. Count the number of base substitutions that occurred. Compute the Jukes-Cantor distance between the initial and final sequences.

28 Performance of JC distance (Homework Problem 4)
Write a program to simulate the mutations of a sequence over t time steps using the Jukes-Cantor model with parameter α. Count the number of base substitutions that occurred. Compute the Jukes-Cantor distance between the initial and final sequences. Compare the actual number of base substitutions with the estimate from the Jukes-Cantor distance.
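
A minimal sketch of such an experiment follows, written in Python rather than Maple. It is one possible approach and not the assigned solution. The sequence length, α, t, the random seed, and all function names are arbitrary choices made for illustration. It assumes that each site mutates independently with probability α per time step, to one of the three other bases chosen uniformly.

```python
import math
import random

BASES = "ACGT"


def simulate_jc(seq: str, alpha: float, t: int, rng: random.Random):
    """Evolve seq for t time steps; return (final sequence, substitution count)."""
    current = list(seq)
    substitutions = 0
    for _ in range(t):
        for i, base in enumerate(current):
            if rng.random() < alpha:
                # Replace the base with one of the three other bases, chosen uniformly.
                current[i] = rng.choice([b for b in BASES if b != base])
                substitutions += 1
    return "".join(current), substitutions


def jc_distance(s0: str, s1: str) -> float:
    """Jukes-Cantor distance from the observed fraction of differing sites."""
    p = sum(a != b for a, b in zip(s0, s1)) / len(s0)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


rng = random.Random(4830)  # fixed seed so the run is repeatable
start = "".join(rng.choice(BASES) for _ in range(2000))
final, actual = simulate_jc(start, alpha=0.01, t=20, rng=rng)
estimated = jc_distance(start, final) * len(start)
print("actual substitutions:   ", actual)
print("estimated substitutions:", round(estimated, 1))
```

With these settings the expected number of substitutions is αt × 2000 = 400, and the Jukes-Cantor estimate should land reasonably close to the actual count. The number of directly observed differences will be smaller, because some mutations are hidden.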

29 Performance of JC distance (Homework Problem 4)

30 Maple: Strings Handling II
Concatenating two strings

31 Maple: Strings Handling II
However, no “re-assignment”.

32 Classwork Work on HW #1, 2

