1
Phylogeny
2
Tree construction methods
Character based: Parsimony (Fitch, Sankoff); Probabilistic (Maximum likelihood)
Distance based: UPGMA
3
Maximum Likelihood Method
Input: $n$ strings of length $m$ (multiple alignment); a substitution matrix; character frequencies.
Output: A tree topology with the input strings at the leaves.
4
Maximum Likelihood Method
Input: $n$ strings of length $m$ (multiple alignment); a substitution matrix; character frequencies.
For each possible tree topology $T$ with the input strings labeling its leaves:
    For each position $i$ from 1 to $m$:
        $L_i = P(\text{position } i \mid \text{tree topology})$
    $L(T) = L_1 \cdot L_2 \cdot \ldots \cdot L_m$
Pick the tree with the maximal likelihood.
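The outer loop above can be sketched in a few lines of Python. This is only an illustration: the per-position likelihoods are made-up numbers, and the names `tree_likelihood`, `best_tree` and `candidates` are hypothetical, not part of the presentation.

```python
import math

def tree_likelihood(position_likelihoods):
    """Return L(T) = L_1 * L_2 * ... * L_m for one candidate tree."""
    return math.prod(position_likelihoods)

def best_tree(candidates):
    """candidates maps a tree name to its list of per-position likelihoods."""
    return max(candidates, key=lambda name: tree_likelihood(candidates[name]))

# Hypothetical per-position likelihoods for two candidate topologies (m = 3).
candidates = {
    "tree_1": [0.20, 0.10, 0.05],
    "tree_2": [0.25, 0.10, 0.02],
}
print(best_tree(candidates))
```

For long alignments the product underflows quickly, so practical implementations usually sum log-likelihoods instead of multiplying raw probabilities.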
5
Maximum Likelihood Computation for a specific tree
Given a tree topology and a leaf labeling, every possible inner node labeling should be considered.
How many different trees with inner node labelings exist for the given tree (DNA alphabet)? $4^4 = 256$
[Figure: the given tree; node labels T, G, C, A]
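The count of 256 can be checked by enumerating the labelings directly. The sketch below assumes four inner nodes, as the $4^4$ on the slide implies; `inner_labelings` is a hypothetical helper name.

```python
from itertools import product

ALPHABET = "ACGT"

def inner_labelings(num_inner_nodes):
    """Yield every assignment of one nucleotide to each inner node."""
    return product(ALPHABET, repeat=num_inner_nodes)

# Four inner nodes, four letters each.
count = sum(1 for _ in inner_labelings(4))
print(count)
```

Each yielded tuple is one candidate labeling whose likelihood must be added to the total for the tree.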
8
Maximum Likelihood Computation for a specific tree
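Compute the likelihood for the given tree:
[Figure: the tree with one inner node labeling; node labels A, T, G, C, T, G, C, A; edge probabilities 0.1, 0.2, 1]
*All other options for inner node labeling were already computed, and their sum is 0.2.
$P_A = 0.3$, $P_T = 0.3$, $P_C = 0.2$, $P_G = 0.2$
$$L(T) = 0.2 + P_A \cdot P_{A \to T} \cdot P_{T \to C} \cdot P_{C \to T} \cdot P_{C \to G} \cdot P_{T \to C} \cdot P_{A \to G} \cdot P_{G \to A} \cdot P_{G \to G}$$
$$= 0.2 + 0.3 \cdot 0.1 \cdot 0.2 \cdot 0.2 \cdot 0.1 \cdot 0.2 \cdot 0.2 \cdot 0.2 \cdot 1 = 0.2 + 9.6 \cdot 10^{-7} = 0.20000096$$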
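The arithmetic on this slide can be replayed as a quick check. The sketch assumes that each edge probability pairs with its edge in the order the factors appear in the slide's product (0.1, 0.2, 0.2, 0.1, 0.2, 0.2, 0.2, 1). The variable names are illustrative.

```python
import math

root_frequency = 0.3  # P_A, the frequency of the root label A

# (parent label, child label, substitution probability), in the order of the
# factors in the slide's product; this pairing is an assumption.
edges = [
    ("A", "T", 0.1), ("T", "C", 0.2), ("C", "T", 0.2), ("C", "G", 0.1),
    ("T", "C", 0.2), ("A", "G", 0.2), ("G", "A", 0.2), ("G", "G", 1.0),
]

# Likelihood of this single inner node labeling.
labeling_likelihood = root_frequency * math.prod(p for _, _, p in edges)

# The slide states that all other labelings already sum to 0.2.
other_labelings = 0.2
total = other_labelings + labeling_likelihood
print(labeling_likelihood, total)
```

The single labeling contributes roughly $9.6 \cdot 10^{-7}$, so the total is dominated by the 0.2 from the other labelings.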
9
UPGMA
UPGMA is a greedy algorithm that constructs a phylogenetic tree, given $n$ species and a table $D[n \times n]$ of distances between every 2 species.
10
Some definitions
Additive distance matrix: A distance matrix is called additive if there exists a tree in which the distances between the leaves correspond to the matrix's distances. Another definition is the "4 point criterion", which is easier to verify.
11
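Additive matrix
The "4 points criterion": A matrix is said to be additive if every 4 objects (species) can be labeled as $x, y, z, w$ so that:
[Figure: a tree on the four leaves $x, y, z, w$ with pendant edge lengths $a, b, c, d$ and an internal edge of length $x$]
$$(a+b) + (c+d) \le (a+x+c) + (b+x+d) = (a+x+d) + (b+x+c)$$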
15
Additive matrix
$d(A,B) + d(C,D) = 12 + 6 = 18$
$d(A,C) + d(B,D) = 14 + 12 = 26$
$d(A,D) + d(B,C) = 14 + 12 = 26$
$d(A,B) + d(C,D) \le d(A,C) + d(B,D) = d(A,D) + d(B,C)$, that is, $18 \le 26 = 26$.
16
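Non-Additive matrix
$d(A,B) + d(C,D) = 2 + 2 = 4$
$d(A,C) + d(B,D) = 2 + 2 = 4$
$d(A,D) + d(B,C) = 2 + 3 = 5$
The two largest sums, 4 and 5, are not equal, so the 4 point criterion fails.
[Table: distance matrix over A, B, C, D with entries 2 and 3]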
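The two examples can be checked mechanically: for each quartet, compute the three pair sums and test whether the two largest are equal. This is a minimal sketch; the helper names are hypothetical. Assigning the individual distances assumes each sum lists its terms in the order written on the slides, for example $d(A,B) = 12$ and $d(C,D) = 6$.

```python
from itertools import combinations

def quartet_is_additive(d, w, x, y, z, tol=1e-9):
    """Four-point criterion for one quartet: the two largest pair sums are equal."""
    sums = sorted([
        d[w][x] + d[y][z],
        d[w][y] + d[x][z],
        d[w][z] + d[x][y],
    ])
    return abs(sums[1] - sums[2]) <= tol

def is_additive(d, tol=1e-9):
    """Check the criterion on every quartet of a symmetric nested dict."""
    return all(quartet_is_additive(d, *q, tol=tol)
               for q in combinations(sorted(d), 4))

def symmetric(pairs):
    """Build a nested dict d[u][v] from {(u, v): distance} entries."""
    d = {}
    for (u, v), value in pairs.items():
        d.setdefault(u, {})[v] = value
        d.setdefault(v, {})[u] = value
    return d

additive = symmetric({("A", "B"): 12, ("C", "D"): 6, ("A", "C"): 14,
                      ("B", "D"): 12, ("A", "D"): 14, ("B", "C"): 12})
non_additive = symmetric({("A", "B"): 2, ("C", "D"): 2, ("A", "C"): 2,
                          ("B", "D"): 2, ("A", "D"): 2, ("B", "C"): 3})
print(is_additive(additive), is_additive(non_additive))
```

Sorting the three sums avoids having to search for the right relabeling $x, y, z, w$: if any relabeling satisfies the criterion, the two largest sums coincide.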
17
Ultrametric Distance Matrix
A distance matrix is called ultrametric if there exists a tree corresponding to the matrix's distances, in which all leaves have equal distance from the root. Notice that by definition, ultrametric is a special case of additive.
18
Some definitions
The "3 point criterion": Like the additive case, ultrametric has another definition. A matrix is ultrametric if every 3 taxa can be relabeled as $x, y, z$ so that:
$$d(x,y) \le d(x,z) = d(y,z)$$
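The 3 point criterion is usually stated as: in every triple of taxa, the two largest of the three pairwise distances are equal. A minimal sketch of that check follows, with a hypothetical `is_ultrametric` helper. The first matrix uses the values 2 and 4 from the UPGMA example in this deck; the second is a made-up counterexample.

```python
from itertools import combinations

def is_ultrametric(d, tol=1e-9):
    """Three-point criterion: in every triple the two largest distances are equal."""
    for x, y, z in combinations(sorted(d), 3):
        dists = sorted([d[x][y], d[x][z], d[y][z]])
        if abs(dists[1] - dists[2]) > tol:
            return False
    return True

# d(A,B) = 2 and d(A,C) = d(B,C) = 4.
ultrametric = {
    "A": {"B": 2, "C": 4},
    "B": {"A": 2, "C": 4},
    "C": {"A": 4, "B": 4},
}
# Hypothetical counterexample: d(B,C) changed to 5.
not_ultrametric = {
    "A": {"B": 2, "C": 4},
    "B": {"A": 2, "C": 5},
    "C": {"A": 4, "B": 5},
}
print(is_ultrametric(ultrametric), is_ultrametric(not_ultrametric))
```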
19
Some definitions Ultrametric distance: For example, this is an ultrametric tree:
20
UPGMA algorithm
UPGMA: Unweighted Pair Group Method with Arithmetic Mean
Input: a distance matrix $D$. Each cell $[i,j]$ represents the distance $d(i,j)$ between species $i$ and species $j$.
Output: an ultrametric phylogenetic tree $T$, with leaf labeling.
21
UPGMA algorithm
Input: $D[n \times n]$, a distance matrix
Initialize: $T = \{C_1, \ldots, C_n\}$
While $|T| > 1$, cluster taxa:
    Pick the shortest distance $d(i,j)$
    $C \leftarrow C_i \cup C_j$
    Define a node at height $d(C_i, C_j) / 2$
    $T \leftarrow T \setminus \{C_i, C_j\}$
    $T \leftarrow T \cup \{C\}$
    Update $D$: $\forall C_k \in T$, $C_k \ne C_i, C_j$:
    $$d(C, C_k) = \frac{d(C_i, C_k)\,|C_i| + d(C_j, C_k)\,|C_j|}{|C_i| + |C_j|}$$
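The loop above translates fairly directly into Python. This is a sketch, not a reference implementation. Clusters are named by Newick-like strings, ties are broken by dictionary order, and the 4-taxon input matrix is a hypothetical one that reuses the values 2, 4 and 6.

```python
from itertools import combinations

def upgma(dist):
    """Run UPGMA on a symmetric nested dict; return [(cluster_name, node_height), ...]."""
    size = {leaf: 1 for leaf in dist}
    # Distances keyed by sorted pairs of cluster names.
    d = {(u, v): dist[u][v] for u, v in combinations(sorted(dist), 2)}
    merges = []
    while len(size) > 1:
        # Pick the closest pair of clusters.
        (ci, cj), dij = min(d.items(), key=lambda item: item[1])
        new = f"({ci},{cj})"
        merges.append((new, dij / 2))  # the new node sits at height d(Ci, Cj) / 2
        # Size-weighted average distance from the new cluster to every other one.
        updated = {}
        for ck in size:
            if ck in (ci, cj):
                continue
            dik = d[tuple(sorted((ci, ck)))]
            djk = d[tuple(sorted((cj, ck)))]
            updated[tuple(sorted((new, ck)))] = (
                (dik * size[ci] + djk * size[cj]) / (size[ci] + size[cj])
            )
        # Distances that do not involve Ci or Cj stay exactly the same.
        d = {pair: v for pair, v in d.items() if ci not in pair and cj not in pair}
        d.update(updated)
        size[new] = size.pop(ci) + size.pop(cj)
    return merges

# Hypothetical 4-taxon ultrametric matrix.
example = {
    "A": {"B": 2, "C": 4, "D": 6},
    "B": {"A": 2, "C": 4, "D": 6},
    "C": {"A": 4, "B": 4, "D": 6},
    "D": {"A": 6, "B": 6, "C": 6},
}
for name, height in upgma(example):
    print(name, height)
```

Scanning all pairs for the minimum in every round is the simplest choice, at cubic cost overall; faster variants keep the pairs in a priority queue.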
22
UPGMA Example
Given the distance matrix below, build a phylogenetic tree using UPGMA.
[Table: distance matrix over the species A, B, C, D, E, F; surviving entries 2, 4, 6, 8]
23
Example
We begin by choosing a minimal distance and clustering the chosen nodes.
24
UPGMA Example
Then we calculate the distances between our new cluster and all the rest of the nodes, to create an updated distance matrix D. The distances not including our cluster's nodes remain exactly the same.
25
Example
The distances to a new cluster are computed with
$$d(C, C_k) = \frac{d(C_i, C_k)\,|C_i| + d(C_j, C_k)\,|C_j|}{|C_i| + |C_j|}$$
For the new cluster AB:
$$d(AB, C_k) = \frac{d(A, C_k)\,|A| + d(B, C_k)\,|B|}{|A| + |B|}$$
$$d(AB, C) = \frac{4\,|A| + 4\,|B|}{|A| + |B|} = 4$$
The updated distance matrix:
[Table: distance matrix over AB, C, D, E, F; surviving entries 4, 6, 8]
26
Example
Now we carry on with the exact same procedure until we are left with only one cluster.
28
Example
$$d(C, C_k) = \frac{d(C_i, C_k)\,|C_i| + d(C_j, C_k)\,|C_j|}{|C_i| + |C_j|}$$
[Tables: the matrix over AB, C, D, E, F (entries 4, 6, 8) and the updated matrix over AB, C, DE, F (entries 4, 6, 8)]
$$d(DE, AB) = \frac{d(D, AB)\,|D| + d(E, AB)\,|E|}{|D| + |E|} = 6$$
31
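Example
$$d(C, C_k) = \frac{d(C_i, C_k)\,|C_i| + d(C_j, C_k)\,|C_j|}{|C_i| + |C_j|}$$
[Tables: the matrix over AB, C, DE, F (entries 4, 6, 8) and the updated matrix over (AB,C), DE, F (entries 6, 8)]
$$d(ABC, DE) = \frac{d(AB, DE)\,|AB| + d(C, DE)\,|C|}{|AB| + |C|} = \frac{6 \cdot 2 + 6 \cdot 1}{3} = 6$$
$$d(ABC, F) = \frac{d(AB, F)\,|AB| + d(C, F)\,|C|}{|AB| + |C|} = \frac{8 \cdot 2 + 8 \cdot 1}{3} = 8$$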
32
Example
[Tables: the matrix over (AB,C), DE, F (entries 6, 8) and the updated matrix over ((AB,C),DE), F (entry 8)]
$$d(ABCDE, F) = \frac{d(ABC, F)\,|ABC| + d(DE, F)\,|DE|}{|ABC| + |DE|} = \frac{8 \cdot 3 + 8 \cdot 2}{5} = 8$$
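Each update in the worked example is the same size-weighted average, so the numbers can be replayed with one small function. The name `merged_distance` is hypothetical; the distances and cluster sizes are the ones used in the example's computations.

```python
def merged_distance(d_ik, size_i, d_jk, size_j):
    """Distance from the merged cluster Ci ∪ Cj to Ck, weighted by cluster sizes."""
    return (d_ik * size_i + d_jk * size_j) / (size_i + size_j)

print(merged_distance(4, 1, 4, 1))  # d(AB, C):    (4*1 + 4*1) / 2
print(merged_distance(6, 2, 6, 1))  # d(ABC, DE):  (6*2 + 6*1) / 3
print(merged_distance(8, 2, 8, 1))  # d(ABC, F):   (8*2 + 8*1) / 3
print(merged_distance(8, 3, 8, 2))  # d(ABCDE, F): (8*3 + 8*2) / 5
```

Whenever both input distances are equal, as they are at every step here, the weights cancel and the merged distance equals that common value.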
33
UPGMA Example
Our output tree! Lovely, isn't it?
[Figure: the output tree, joining ((AB,C),DE) with F at distance 8]
Can there be more than one tree?
34
UPGMA Example
[Table: the original distance matrix over A, B, C, D, E, F; surviving entries 2, 4, 6, 8]
What can be said about the distance matrix? Is it additive? Ultrametric?
35
UPGMA downfalls
UPGMA will always return an ultrametric tree.
It assumes all species mutate at the same rate (molecular clock). What happens if we try to reconstruct a tree such as this one?
36
UPGMA downfalls
This tree corresponds to the following distance matrix:
[Table: distance matrix over A, B, C, D, E, F; surviving entries 5, 4, 7, 10, 6, 9, 8, 11]
37
UPGMA downfalls
If we run UPGMA on the matrix shown, we will get this output:
Compared to the original tree:
38
UPGMA downfalls
UPGMA returns the right tree if the distance matrix is ultrametric. Even then, we can't be certain the original tree was also ultrametric. If the distance matrix D is not additive, UPGMA will generate a heuristic solution that does not fit D.