Distance matrix methods calculate a measure of distance between each pair of species, then find a tree that predicts the observed set of distances.
Branch lengths and times
In distance matrix methods, branch lengths reflect the expected amount of evolution in the different branches of the tree:
$$\text{branch length}_i = r_i \, t_i$$
where $r_i$ is the rate of evolution and $t_i$ is the elapsed time.
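The least squares method
Observed matrix:
$$
\begin{array}{c|ccccc}
  & A & B & C & D & E \\ \hline
A & 0 & D_{ab} & D_{ac} & D_{ad} & D_{ae} \\
B & D_{ab} & 0 & D_{bc} & D_{bd} & D_{be} \\
C & D_{ac} & D_{bc} & 0 & D_{cd} & D_{ce} \\
D & D_{ad} & D_{bd} & D_{cd} & 0 & D_{de} \\
E & D_{ae} & D_{be} & D_{ce} & D_{de} & 0
\end{array}
$$
The aim is to minimise the difference between the observed matrix of distances and the matrix of distances predicted by the tree.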
Expected matrix, holding the distances $d_{ij}$ predicted by a tree with tips a, b, c, d and e:
$$
\begin{array}{c|ccccc}
  & A & B & C & D & E \\ \hline
A & 0 & d_{ab} & d_{ac} & d_{ad} & d_{ae} \\
B & d_{ab} & 0 & d_{bc} & d_{bd} & d_{be} \\
C & d_{ac} & d_{bc} & 0 & d_{cd} & d_{ce} \\
D & d_{ad} & d_{bd} & d_{cd} & 0 & d_{de} \\
E & d_{ae} & d_{be} & d_{ce} & d_{de} & 0
\end{array}
$$
The entries are filled in from the tree one at a time; in the slide example the first entry is $d_{ab} = 0.23$.
$$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, (D_{ij} - d_{ij})^2$$
Here $D_{ij}$ is the observed distance between species $i$ and $j$, and $d_{ij}$ is the expected distance between species $i$ and $j$. $Q$ is a measure of the discrepancy between the observed and the expected matrix.
The distances can be weighted or not: the weight $w_{ij}$ is $1$ (unweighted), $1/D_{ij}$ or $1/D_{ij}^2$.
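The discrepancy measure can be sketched in a few lines of Python. This is only an illustration: the function name `least_squares_q`, the `weight` argument and the two 3-species matrices are assumptions made for the example, not values from the slides.

```python
def least_squares_q(observed, expected, weight="unweighted"):
    """Return Q = sum_i sum_j w_ij * (D_ij - d_ij)**2 over all i != j."""
    n = len(observed)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the diagonal is zero in both matrices
            D, d = observed[i][j], expected[i][j]
            if weight == "unweighted":
                w = 1.0
            elif weight == "1/D":
                w = 1.0 / D
            elif weight == "1/D^2":
                w = 1.0 / D ** 2
            else:
                raise ValueError(f"unknown weighting: {weight}")
            q += w * (D - d) ** 2
    return q


# Hypothetical matrices, for illustration only.
observed = [[0, 4, 6], [4, 0, 8], [6, 8, 0]]
expected = [[0, 5, 6], [5, 0, 7], [6, 7, 0]]
print(least_squares_q(observed, expected))         # unweighted
print(least_squares_q(observed, expected, "1/D"))  # weighted by 1/D
```

Because the double sum visits every pair twice (once as i, j and once as j, i), each squared difference contributes twice to $Q$; this does not change which tree minimises it.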
A handy indicator variable $x_{ij,k}$ is defined on the tree with tips a, b, c, d, e and branches $v_1, \dots, v_7$:
$$
x_{ij,k} =
\begin{cases}
1 & \text{if branch } k \text{ is on the path between species } i \text{ and } j \\
0 & \text{if branch } k \text{ is not on the path between species } i \text{ and } j
\end{cases}
$$
For example, $x_{ab,1} = 1$, $x_{ab,7} = 1$ and $x_{ab,3} = 0$.
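A minimal sketch of the indicator variable in Python follows. The topology is an assumption inferred from the path sums worked out on the later slides: tip a hangs on branch 1, tips b and d share internal branch 7, and tips c and e share internal branch 6. The names `PATH_TO_CENTRE`, `x` and `expected_distance` are hypothetical.

```python
# Tip -> branches from that tip to the central node (assumed topology).
PATH_TO_CENTRE = {
    "a": [1],
    "b": [2, 7],
    "c": [3, 6],
    "d": [4, 7],
    "e": [5, 6],
}


def x(i, j, k):
    """Return 1 if branch k lies on the path between tips i and j, else 0."""
    # Branches shared by both tip-to-centre paths cancel out, so the path
    # between i and j is the symmetric difference of the two branch sets.
    on_path = set(PATH_TO_CENTRE[i]) ^ set(PATH_TO_CENTRE[j])
    return 1 if k in on_path else 0


def expected_distance(i, j, v):
    """d_ij = sum_k x_ij,k * v_k, with v a dict {branch number: length}."""
    return sum(x(i, j, k) * v[k] for k in v)


print(x("a", "b", 1), x("a", "b", 7), x("a", "b", 3))  # 1 1 0
```

The printed values reproduce the three slide examples $x_{ab,1} = 1$, $x_{ab,7} = 1$ and $x_{ab,3} = 0$.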
Rewrite the expected values $d_{ij}$ as sums of branch lengths:
$$d_{ij} = \sum_{k} x_{ij,k} \, v_k$$
so that
$$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \Big( D_{ij} - \sum_{k} x_{ij,k} \, v_k \Big)^2$$
Differentiate $Q$ with respect to a branch length $v_k$ and equate the derivative to zero. The inner sum is written over an index $m$ to keep it apart from the branch $k$ being differentiated:
$$\frac{dQ}{dv_k} = -2 \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, x_{ij,k} \Big( D_{ij} - \sum_{m} x_{ij,m} \, v_m \Big) = 0$$
For the unweighted case ($w_{ij} = 1$):
$$\frac{dQ}{dv_k} = -2 \sum_{i=1}^{n} \sum_{j=1}^{n} x_{ij,k} \Big( D_{ij} - \sum_{m} x_{ij,m} \, v_m \Big) = 0$$
For branch 1, setting $dQ/dv_1 = 0$, dropping the constant factor and writing each pair of species once gives, in full:
$$
\begin{aligned}
& x_{AB,1}\Big(D_{AB} - \sum_m x_{AB,m} v_m\Big) + x_{AC,1}\Big(D_{AC} - \sum_m x_{AC,m} v_m\Big) + x_{AD,1}\Big(D_{AD} - \sum_m x_{AD,m} v_m\Big) + x_{AE,1}\Big(D_{AE} - \sum_m x_{AE,m} v_m\Big) \\
& + x_{BC,1}\Big(D_{BC} - \sum_m x_{BC,m} v_m\Big) + x_{BD,1}\Big(D_{BD} - \sum_m x_{BD,m} v_m\Big) + x_{BE,1}\Big(D_{BE} - \sum_m x_{BE,m} v_m\Big) \\
& + x_{CD,1}\Big(D_{CD} - \sum_m x_{CD,m} v_m\Big) + x_{CE,1}\Big(D_{CE} - \sum_m x_{CE,m} v_m\Big) + x_{DE,1}\Big(D_{DE} - \sum_m x_{DE,m} v_m\Big) = 0
\end{aligned}
$$
The values of $x_{ij,1}$ on the tree are:
$$
\begin{array}{c|ccccc}
x_{ij,1} & A & B & C & D & E \\ \hline
A & - & 1 & 1 & 1 & 1 \\
B &   & - & 0 & 0 & 0 \\
C &   &   & - & 0 & 0 \\
D &   &   &   & - & 0 \\
E &   &   &   &   & -
\end{array}
$$
Many terms are therefore zero, leaving only the pairs that include A:
$$\Big(D_{AB} - \sum_m x_{AB,m} v_m\Big) + \Big(D_{AC} - \sum_m x_{AC,m} v_m\Big) + \Big(D_{AD} - \sum_m x_{AD,m} v_m\Big) + \Big(D_{AE} - \sum_m x_{AE,m} v_m\Big) = 0$$
The non-zero terms are expanded along the tree, for example:
$$\sum_m x_{AB,m} v_m = 1 v_1 + 1 v_2 + 0 v_3 + 0 v_4 + 0 v_5 + 0 v_6 + 1 v_7$$
$$\sum_m x_{AC,m} v_m = 1 v_1 + 0 v_2 + 1 v_3 + 0 v_4 + 0 v_5 + 1 v_6 + 0 v_7$$
Substituting all four sums gives
$$D_{AB} + D_{AC} + D_{AD} + D_{AE} - 4v_1 - v_2 - v_3 - v_4 - v_5 - 2v_6 - 2v_7 = 0$$
and rearranging yields the equation for $v_1$:
$$D_{AB} + D_{AC} + D_{AD} + D_{AE} = 4v_1 + v_2 + v_3 + v_4 + v_5 + 2v_6 + 2v_7$$
Mutatis mutandis for $v_2$ and all other branches, giving one equation per branch:
$$
\begin{aligned}
v_1:\quad & D_{AB} + D_{AC} + D_{AD} + D_{AE} = 4v_1 + v_2 + v_3 + v_4 + v_5 + 2v_6 + 2v_7 \\
v_2:\quad & D_{AB} + D_{BC} + D_{BD} + D_{BE} = v_1 + 4v_2 + v_3 + v_4 + v_5 + 2v_6 + 3v_7 \\
v_3:\quad & D_{AC} + D_{BC} + D_{CD} + D_{CE} = v_1 + v_2 + 4v_3 + v_4 + v_5 + 3v_6 + 2v_7 \\
v_4:\quad & D_{AD} + D_{BD} + D_{CD} + D_{DE} = v_1 + v_2 + v_3 + 4v_4 + v_5 + 2v_6 + 3v_7 \\
v_5:\quad & D_{AE} + D_{BE} + D_{CE} + D_{DE} = v_1 + v_2 + v_3 + v_4 + 4v_5 + 3v_6 + 2v_7 \\
v_6:\quad & D_{AC} + D_{AE} + D_{BC} + D_{BE} + D_{CD} + D_{DE} = 2v_1 + 2v_2 + 3v_3 + 2v_4 + 3v_5 + 6v_6 + 4v_7 \\
v_7:\quad & D_{AB} + D_{AD} + D_{BC} + D_{CD} + D_{BE} + D_{DE} = 2v_1 + 3v_2 + 2v_3 + 3v_4 + 2v_5 + 4v_6 + 6v_7
\end{aligned}
$$
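The seven equations can be checked numerically. The sketch below builds the indicator matrix $X$ (one row per species pair, one column per branch), forms the normal equations $X^{T}X\,v = X^{T}D$ and solves them. The topology (A on branch 1; B and D sharing branch 7; C and E sharing branch 6) is inferred from the coefficients above and should be read as an assumption, as should the branch lengths in `true_v`, which are invented purely to generate test distances.

```python
from itertools import combinations

# Tip -> set of branches from that tip to the central node (assumed topology).
PATH_TO_CENTRE = {"A": {1}, "B": {2, 7}, "C": {3, 6}, "D": {4, 7}, "E": {5, 6}}
PAIRS = list(combinations("ABCDE", 2))
BRANCHES = range(1, 8)

# Indicator matrix: X[p][k-1] = 1 if branch k is on the path for pair p.
X = [[1 if k in (PATH_TO_CENTRE[i] ^ PATH_TO_CENTRE[j]) else 0 for k in BRANCHES]
     for i, j in PAIRS]


def normal_equations(D):
    """Return (X^T X, X^T D) for observed distances D keyed by species pair."""
    rows = range(len(PAIRS))
    lhs = [[sum(X[p][k] * X[p][m] for p in rows) for m in range(7)]
           for k in range(7)]
    rhs = [sum(X[p][k] * D[PAIRS[p]] for p in rows) for k in range(7)]
    return lhs, rhs


def solve(a, b):
    """Solve a v = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [p - f * q for p, q in zip(m[r], m[col])]
    return [m[r][n] / m[r][r] for r in range(n)]


# Hypothetical branch lengths; the "observed" distances are generated from them.
true_v = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
D = {pair: sum(xk * vk for xk, vk in zip(X[p], true_v))
     for p, pair in enumerate(PAIRS)}

lhs, rhs = normal_equations(D)
print(lhs[0])           # coefficients of the equation for v1
print(solve(lhs, rhs))  # recovers true_v up to rounding
```

The rows of `lhs` are the right-hand-side coefficients of the seven equations, so `lhs[0]` corresponds to the equation for $v_1$.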
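Solving linear equations with matrices
$$
\begin{aligned}
x + 2y &= 4 \\
3x - 5y &= \dots
\end{aligned}
$$
In matrix form this is $A X = B$ with
$$A = \begin{pmatrix} 1 & 2 \\ 3 & -5 \end{pmatrix}, \qquad X = \begin{pmatrix} x \\ y \end{pmatrix}$$
The determinant is $|A| = 1 \cdot (-5) - 3 \cdot 2 = -11$, so
$$A^{-1} = \frac{1}{|A|} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix}, \qquad X = A^{-1} B$$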
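The matrix route can be sketched for a 2 x 2 system as follows. The coefficient matrix comes from the two equations; the second right-hand side (the value 1 in `B`) is an assumption chosen for illustration, and `solve_2x2` is a hypothetical helper name.

```python
def solve_2x2(a, b):
    """Solve A X = B for a 2x2 matrix A via X = A^-1 B."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("A is singular, so it has no inverse")
    # Inverse of a 2x2 matrix: swap the diagonal, negate the off-diagonal,
    # and divide everything by the determinant.
    inv = [[a22 / det, -a12 / det],
           [-a21 / det, a11 / det]]
    return [inv[0][0] * b[0] + inv[0][1] * b[1],
            inv[1][0] * b[0] + inv[1][1] * b[1]]


A = [[1, 2], [3, -5]]
B = [4, 1]  # the 1 is an assumed right-hand side, not taken from the slide
x, y = solve_2x2(A, B)
print(x, y)
```

For the seven branch-length equations the same idea applies with a 7 x 7 coefficient matrix, although in practice elimination is usually preferred over forming the inverse explicitly.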
Clustering algorithms
Clustering methods have no optimality criterion; they simply apply an algorithm to come up with a tree.
Clustering algorithms: UPGMA
UPGMA stands for Unweighted Pair Group Method with Arithmetic mean. It assumes that evolutionary rates are the same in all lineages and therefore produces an ultrametric tree.
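Whether a distance matrix is compatible with equal rates in all lineages can be tested with the three-point condition: in every triple of species, the two largest of the three pairwise distances must be equal. The sketch below is an illustration only; the function name `is_ultrametric` and both example matrices are assumptions.

```python
from itertools import combinations


def is_ultrametric(dist, tol=1e-9):
    """Three-point condition: in each triple the two largest distances match."""
    n = len(dist)
    for i, j, k in combinations(range(n), 3):
        smallest, middle, largest = sorted(
            [dist[i][j], dist[i][k], dist[j][k]]
        )
        if abs(largest - middle) > tol:
            return False
    return True


# Hypothetical matrices, for illustration only.
clock_like = [[0, 2, 6], [2, 0, 6], [6, 6, 0]]
not_clock_like = [[0, 2, 6], [2, 0, 9], [6, 9, 0]]
print(is_ultrametric(clock_like), is_ultrametric(not_clock_like))
```

When the condition fails, UPGMA still returns an ultrametric tree, but its branch lengths can no longer reproduce the observed distances exactly.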
The worked example uses a distance matrix for eight species: dog, bear, raccoon, weasel, seal, sea lion, cat and monkey. Each round applies four steps:
1. Find species i and j with the smallest distance.
2. Calculate the branch length between i and j.
3. Lump i and j into a new group.
4. Compute the distance between the new group and all other groups (weighting by the number of species in each group).
Round 1: seal and sea lion have the smallest distance and are joined with a branch length of 12. They are lumped into the group SS, and the matrix is recomputed for dog, bear, raccoon, weasel, SS, cat and monkey.
Round 2: in the reduced matrix, raccoon and bear have the smallest distance and are joined with a branch length of 13. They are lumped into the group BR, and the matrix is recomputed for dog, BR, weasel, SS, cat and monkey.
Round 3: BR and SS are now the closest pair. They are lumped into the group BRSS, and the matrix is recomputed for dog, BRSS, weasel, cat and monkey.
Round 4: BRSS and weasel are the closest pair. They are lumped into the group BRSSW, and the matrix is recomputed for dog, BRSSW, cat and monkey. The new distances are averages weighted by group size (4 species in BRSS, 1 species in weasel); for dog, whose distance to weasel is 51:
$$D_{\text{dog},BRSSW} = \frac{4 \cdot D_{\text{dog},BRSS} + 1 \cdot 51}{5}$$
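The size-weighted average of step 4 can be written as a one-line helper. This is a sketch: `merged_distance` is a hypothetical name, and the value 40.0 used for the distance to the larger group is invented for illustration; only the 51 and the group sizes 4 and 1 come from the slide.

```python
def merged_distance(n_i, d_i, n_j, d_j):
    """Distance from an outside group to the merged group (i + j).

    n_i, n_j: number of species in groups i and j
    d_i, d_j: distances from the outside group to i and to j
    """
    return (n_i * d_i + n_j * d_j) / (n_i + n_j)


# 4 species at a hypothetical distance of 40.0, 1 species at distance 51.
print(merged_distance(4, 40.0, 1, 51.0))
```

Weighting by group size makes the result equal to the plain average over all individual species in the merged group, which is why the method is called "unweighted" with respect to species.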
Round 5: dog and BRSSW are the closest pair and are joined with a branch length of 22.9. They are lumped into the group BRSSWD, and the matrix is recomputed for BRSSWD, cat and monkey (the cat to monkey distance is 148). The new distances are again weighted by group size (5 species in BRSSW, 1 species in dog); for cat, whose distance to dog is 98:
$$D_{BRSSWD,\text{cat}} = \frac{5 \cdot D_{BRSSW,\text{cat}} + 1 \cdot 98}{6}$$
Round 6: BRSSWD and cat are the closest pair and are lumped together, leaving a matrix with only the merged group and monkey. The remaining distance is weighted by group size (6 species in BRSSWD, 1 species in cat); with a cat to monkey distance of 148:
$$D_{BRSSWD+\text{cat},\text{monkey}} = \frac{6 \cdot D_{BRSSWD,\text{monkey}} + 1 \cdot 148}{7}$$
Final UPGMA tree (figure): sea lion and seal join first (branch length 12), then raccoon and bear (13); these two groups join each other, followed in turn by weasel, dog (22.9), cat and finally monkey.
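The four UPGMA steps can be put together in a short loop. The following is a minimal sketch, not the implementation behind the slides: the function name `upgma`, the representation of groups as tuples of names, and the 4-species matrix are all assumptions made for illustration.

```python
def upgma(names, dist):
    """Cluster with UPGMA; return merges as (group_i, group_j, node_height)."""
    # Each group is a tuple of tip names; d maps a pair of groups to a distance.
    clusters = [(name,) for name in names]
    d = {frozenset((clusters[a], clusters[b])): dist[a][b]
         for a in range(len(names)) for b in range(a + 1, len(names))}
    merges = []
    while len(clusters) > 1:
        # Step 1: find the pair of groups with the smallest distance.
        pair = min(d, key=d.get)
        i, j = sorted(pair)
        # Step 2: the joining node sits at half the distance between them.
        merges.append((i, j, d[pair] / 2))
        # Step 3: lump i and j into a new group.
        new = i + j
        clusters = [c for c in clusters if c not in (i, j)]
        # Step 4: distance to every other group, weighted by group size.
        for c in clusters:
            d[frozenset((new, c))] = (
                len(i) * d[frozenset((i, c))] + len(j) * d[frozenset((j, c))]
            ) / len(new)
        # Drop all entries that still refer to the old groups i and j.
        d = {k: v for k, v in d.items() if i not in k and j not in k}
        clusters.append(new)
    return merges


# Hypothetical ultrametric matrix, for illustration only.
names = ["a", "b", "c", "d"]
dist = [[0, 2, 6, 10],
        [2, 0, 6, 10],
        [6, 6, 0, 10],
        [10, 10, 10, 0]]
merges = upgma(names, dist)
for merge in merges:
    print(merge)
```

Each returned height is the distance from the joining node down to the tips; the length of an individual branch is the difference between the heights at its two ends.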
Clustering algorithms: Neighbour-joining
The same eight-species matrix (dog, bear, raccoon, weasel, seal, sea lion, cat, monkey) is used. Each round applies five steps:
1. Calculate $S_x = (\sum D_x)/(n-2)$ for every taxon $x$, where the sum runs over the distances from $x$ to all other taxa and $n$ is the number of taxa in the matrix.
2. Calculate $M_{ij} = D_{ij} - S_i - S_j$ and select the pair with the smallest $M_{ij}$.
3. Create a node that joins this pair and calculate the branch length from $i$ to the new node as $D_{ij}/2 + (S_i - S_j)/2$.
4. Join the two species and arrange all other taxa in the form of a star.
5. Create a new matrix. Calculate the distance between the new node and each other taxon $x$ as $D_{x,ij} = (D_{ix} + D_{jx} - D_{ij})/2$.
Round 1: cat and monkey have the smallest $M_{ij}$ and are joined at a new node cm. With $D_{\text{cat},\text{monkey}} = 148$, the branch lengths are
$$\text{cat to cm} = \frac{148}{2} + \frac{S_{\text{cat}} - S_{\text{monkey}}}{2} = 47.08, \qquad \text{monkey to cm} = \frac{148}{2} + \frac{S_{\text{monkey}} - S_{\text{cat}}}{2}$$
The new matrix covers dog, bear, raccoon, weasel, seal, sea lion and cm; the entry worked out on the slide evaluates to $(D_{ix} + D_{jx} - D_{ij})/2 = 49$.
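One round of the five neighbour-joining steps can be sketched as below. The function name `nj_round` and the 4-taxon matrix are assumptions for illustration; the matrix was generated from a small tree with branch lengths a = 1, b = 3, c = 2, d = 4 and an internal branch of 2, so the output can be checked by eye.

```python
def nj_round(names, dist):
    """Run one neighbour-joining round; return the pair, its two branch
    lengths, and the distances from the new node to every other taxon."""
    n = len(names)
    # Step 1: S_x = (sum of distances from x to all other taxa) / (n - 2).
    s = [sum(row) / (n - 2) for row in dist]
    # Step 2: M_ij = D_ij - S_i - S_j; pick the pair with the smallest value.
    pairs = [(a, b) for a in range(n) for b in range(a + 1, n)]
    i, j = min(pairs, key=lambda p: dist[p[0]][p[1]] - s[p[0]] - s[p[1]])
    # Step 3: branch lengths from i and from j to the new node.
    v_i = dist[i][j] / 2 + (s[i] - s[j]) / 2
    v_j = dist[i][j] / 2 + (s[j] - s[i]) / 2
    # Steps 4-5: distance from the new node to each remaining taxon x.
    new_row = {names[x]: (dist[i][x] + dist[j][x] - dist[i][j]) / 2
               for x in range(n) if x not in (i, j)}
    return (names[i], names[j]), (v_i, v_j), new_row


# Hypothetical additive matrix, for illustration only.
names = ["a", "b", "c", "d"]
dist = [[0, 4, 5, 7],
        [4, 0, 7, 9],
        [5, 7, 0, 6],
        [7, 9, 6, 0]]
result = nj_round(names, dist)
print(result)
```

The two branch lengths always add up to $D_{ij}$, and unlike UPGMA they need not be equal, so neighbour-joining does not assume equal rates in all lineages.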
Round 2: the five steps are repeated on the reduced matrix (dog, bear, raccoon, weasel, seal, sea lion, cm). Seal and sea lion are selected and joined at a new node ss. With $D_{\text{seal},\text{sea lion}} = 24$, the branch lengths are
$$\text{seal to ss} = \frac{24}{2} + \frac{S_{\text{seal}} - S_{\text{sea lion}}}{2}, \qquad \text{sea lion to ss} = \frac{24}{2} + \frac{S_{\text{sea lion}} - S_{\text{seal}}}{2} = 11.65$$
All other taxa are again arranged as a star, and the new matrix covers dog, bear, raccoon, weasel, ss and cm.
The remaining rounds join the following pairs:
Round 3: bear + raccoon (node br).
Round 4: (bear + raccoon) + dog (node brd).
Round 5: (cat + monkey) + weasel (node cmw).
Round 6: (seal + sea lion) + (bear + raccoon + dog) (node brdss).
This completes the neighbour-joining tree for cat, sea lion, seal, monkey, weasel, bear, raccoon and dog.
Figure: the neighbour-joining tree shown next to the UPGMA tree for the same eight species (sea lion, seal, raccoon, bear, weasel, dog, cat, monkey).