Tutorial 5 Phylogenetic Trees
Agenda
How to construct a tree using the Neighbor Joining algorithm
The Phylogeny.fr tool
Cool story of the day: Horizontal gene transfer
Neighbor Joining vs. UPGMA
UPGMA: Assumes that sequences diverge at a constant rate, so every leaf is at the same distance from the root.
Neighbor Joining: Constructs an unrooted guide tree from a distance matrix and does not assume a constant rate of evolution.
Unrooted guide tree
Neighbor Joining Algorithm
The algorithm works with 2 matrices: the distance matrix and the relative distance matrix.
Step 1. Calculate all pairwise distances.
Step 2. Find the 2 nodes i and j whose relative distance is minimal.
Step 3. Remove the rows and columns of i and j.
Step 4. Add a new row and column k (the parent of i and j), and compute the distance from k to every remaining node.
Step 5. Repeat until two nodes remain, then connect them with an edge.
Step 1. Calculate all pairwise distances
A, B, C, D and E are tree nodes; each letter represents a sequence.
How can we measure the distance between sequences?
      A    B    C    D
B    22
C    39   41
D    39   41   18
E    41   43   20   10
Step 1. Calculate all pairwise distances
Distance between sequences
Euclidean distance: Given a multiple sequence alignment, score every aligned position of the two sequences, sum the scores over all positions, and take the square root of the sum.
The score at a position increases as the dissimilarity between the two residues increases.
Step 1. Calculate all pairwise distances
The distance between each pair of sequences is based on a multiple sequence alignment.
Multiple sequence alignment:
a: A T G G C
b: A A G C C
c: C A G C C
d: G G G C G
e: A T G C C
Comparing a and b position by position:
a1 a2 a3 a4 a5
A  T  G  G  C
A  A  G  C  C
b1 b2 b3 b4 b5
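
A minimal sketch of Step 1 on the alignment above. The slides do not say how a single position is scored, so `position_score` below is a hypothetical 0/1 mismatch score (0 for identical residues, 1 otherwise). The resulting numbers are therefore illustrative only and are not the distances used in the worked example of the following steps.

```python
import math
from itertools import combinations

# The five aligned sequences from the slide.
alignment = {
    "a": "ATGGC",
    "b": "AAGCC",
    "c": "CAGCC",
    "d": "GGGCG",
    "e": "ATGCC",
}

def position_score(x, y):
    """Hypothetical score (an assumption): 0 if the residues match, 1 if not."""
    return 0 if x == y else 1

def euclidean_distance(s, t):
    """Square root of the summed per-position scores of two aligned sequences."""
    if len(s) != len(t):
        raise ValueError("aligned sequences must have equal length")
    return math.sqrt(sum(position_score(x, y) for x, y in zip(s, t)))

# One distance per unordered pair of sequences.
distances = {
    (p, q): euclidean_distance(alignment[p], alignment[q])
    for p, q in combinations(alignment, 2)
}

for pair, value in distances.items():
    print(pair, round(value, 3))
```

A real substitution matrix could replace `position_score` without changing the rest of the sketch.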
Step 2. Two nodes with minimal relative distance
If we assume a constant rate of evolution, we may construct the wrong tree.
The closest leaves are not necessarily neighbors: i and j are neighbors, but (dij = 13) > (djk = 12).
Step 2. Two nodes with minimal relative distance
Find a pair of leaves that are close to each other but far from the other leaves. This measure is called the “relative distance”, Mij.
Mij values can be negative.
As the average distance from the common ancestor of i and j to the rest of the nodes increases, Mij decreases.
Select the pair that produces the lowest value.
Re-evaluate M in every iteration.
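Step 2. Two nodes with minimal relative distance
Mij = Dij - (ri + rj), where ri = (sum of Dim over all other nodes m) / (N - 2)
Mij: relative distance between i and j
Dij: distance between i and j (from the distance table)
ri, rj: distance between i (or j) and all other nodes, divided by N - 2
N: number of leaves (= sequences) left in the tree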
Step 2. Two nodes with minimal relative distance
Distance matrix:
      A    B    C    D
B    22
C    39   41
D    39   41   18
E    41   43   20   10
rA = 47, rB = 49, rC = 39.3, rD = 36, rE = 38
Each r value is the sum of a node's distances to the other four nodes, divided by N - 2 = 3. For example, rA = (22 + 39 + 39 + 41) / 3 = 47.
Step 2. Two nodes with minimal relative distance
The relative distance table:
      A       B       C       D
B   -74
C   -47.3   -47.3
D   -44     -44     -57.3
E   -44     -44     -57.3   -64
A,B is the pair with the minimal Mij value.
The Mij table is used only to choose the closest pair (lowest value), not to calculate distances.
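
The r values and the relative distance table can be checked with a short sketch. It assumes the formula Mij = Dij - (ri + rj) with ri equal to the summed distances from i divided by N - 2, and a full distance matrix chosen to be consistent with the r values quoted on the slides. The names `pairwise`, `dist`, `r` and `M` are only illustrative.

```python
from itertools import combinations

nodes = ["A", "B", "C", "D", "E"]

# Pairwise distances of the worked example (assumed full matrix).
pairwise = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20,
    ("D", "E"): 10,
}

def dist(i, j):
    """Symmetric lookup into the upper-triangular table."""
    return pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)]

n = len(nodes)

# r_i: summed distance from i to every other node, divided by N - 2.
r = {i: sum(dist(i, k) for k in nodes if k != i) / (n - 2) for i in nodes}

# M_ij: relative distance, used only to pick the pair to join.
M = {(i, j): dist(i, j) - (r[i] + r[j]) for i, j in combinations(nodes, 2)}

closest = min(M, key=M.get)
print({i: round(v, 1) for i, v in r.items()})
print(closest, M[closest])
```

On this matrix the lowest entry is the pair A,B, in agreement with the table above.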
Steps 3+4. Remove i, j and add k to the matrix
The distance from k to any other leaf m can be computed as:
Dkm = (Dim + Djm - Dij)/2
Compress i and j into k, then iterate the algorithm on the rest of the tree.
Steps 3+4. Remove i, j and add k to the matrix
Now we'll calculate the distance from K (the parent of A and B) to all other nodes. For example, DKC = (DAC + DBC - DAB)/2 = (39 + 41 - 22)/2 = 29.
      K    C    D
C    29
D    29   18
E    31   20   10
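
The update Dkm = (Dim + Djm - Dij)/2 can be sketched as a small function. The helper name `merge_pair` and the choice of `frozenset` keys (so that the order of a pair does not matter) are assumptions made for this illustration, as is the full input matrix.

```python
def merge_pair(d, i, j, k):
    """Replace nodes i and j by their parent k using Dkm = (Dim + Djm - Dij) / 2."""
    remaining = sorted({x for pair in d for x in pair} - {i, j})
    # Keep every distance that involves neither i nor j.
    new_d = {
        pair: value for pair, value in d.items()
        if i not in pair and j not in pair
    }
    d_ij = d[frozenset((i, j))]
    # Add the row and column of the new parent node k.
    for m in remaining:
        d_im = d[frozenset((i, m))]
        d_jm = d[frozenset((j, m))]
        new_d[frozenset((k, m))] = (d_im + d_jm - d_ij) / 2
    return new_d

# Distance matrix of the worked example (assumed full matrix).
pairs = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20,
    ("D", "E"): 10,
}
d = {frozenset(p): v for p, v in pairs.items()}

reduced = merge_pair(d, "A", "B", "K")
for pair, value in sorted(reduced.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pair), value)
```

Joining A and B leaves a 4 x 4 matrix over K, C, D and E, which is the input to the next iteration.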
Step 5. Continue until 2 nodes remain
The final tree (K, Y and Z are internal nodes):
A - K: 10
B - K: 12
K - Z: 20
Z - C: 9
Z - Y: 5
Y - D: 4
Y - E: 6
What is missing?
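
Putting the steps together gives the following sketch of the whole loop. Two things in it are not taken from the slides and should be read as assumptions: the branch length from i to its new parent, (Dij + ri - rj)/2, which is the usual Neighbor Joining choice, and the tie-breaking rule (when several pairs share the lowest Mij, the pair with the smaller Dij is joined). The function name `neighbor_joining` and the input matrix are likewise illustrative.

```python
from itertools import combinations

def neighbor_joining(leaves, d, parent_names):
    """Return the edges (node, node, length) of the unrooted tree.

    d maps frozenset({x, y}) to a distance; parent_names labels new internal nodes.
    """
    nodes = list(leaves)
    d = dict(d)
    names = iter(parent_names)
    edges = []
    while len(nodes) > 2:
        n = len(nodes)
        # r_i: summed distance to all other current nodes, divided by N - 2.
        r = {i: sum(d[frozenset((i, m))] for m in nodes if m != i) / (n - 2)
             for i in nodes}
        # Lowest relative distance M_ij; ties broken by smaller D_ij (assumption).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (d[frozenset(p)] - r[p[0]] - r[p[1]], d[frozenset(p)]),
        )
        k = next(names)
        d_ij = d[frozenset((i, j))]
        # Branch lengths to the new parent (standard NJ formula, assumption).
        edges.append((i, k, (d_ij + r[i] - r[j]) / 2))
        edges.append((j, k, (d_ij + r[j] - r[i]) / 2))
        # Distances from the new parent k to every remaining node.
        for m in nodes:
            if m not in (i, j):
                d[frozenset((k, m))] = (
                    d[frozenset((i, m))] + d[frozenset((j, m))] - d_ij
                ) / 2
        nodes = [m for m in nodes if m not in (i, j)] + [k]
    # Two nodes remain: connect them with an edge.
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges

pairs = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20,
    ("D", "E"): 10,
}
matrix = {frozenset(p): v for p, v in pairs.items()}

edges = neighbor_joining("ABCDE", matrix, ["K", "Y", "Z"])
for a, b, length in edges:
    print(a, b, length)
```

The result is a list of edges only; nothing in it designates a root.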
Phylogeny.fr
One click mode
One click mode
Multiple sequence alignment (MSA) is a basic step in phylogenetic tree construction and is the basis for the distance matrix.
“A la carte” mode
“One click” default settings
Cool story of the day: Horizontal gene transfer
Is horizontal gene transfer possible?
Viruses
Horizontal gene transfer in Bacteria
Horizontal gene transfer is a primary mechanism for the spread of antibiotic resistance in bacteria and plays an important role in bacterial evolution.
Horizontal gene transfer is so abundant in bacteria that it is hard to talk about the genome of a single bacterium; it is more accurate to talk about the genome of a “society of bacteria”.
http://en.wikipedia.org/wiki/Horizontal_gene_transfer
Sea slug
The sea slug Elysia chlorotica incorporates chloroplasts from the algae that it eats into its body.
Photosynthesis continues for up to 12 months using genes within the chloroplast, which are directed by algal nuclear genes that were transferred to the nuclei of the slug.
http://en.wikipedia.org/wiki/Horizontal_gene_transfer
Until the full speciation…
Bioinformatics, David W. Mount, p. 244