Robustness to Noise in Distance-Based Phylogenetic Reconstruction Methods
Tutorial #12 © Ilan Gronau

Distance-Based Phylogenetic Reconstruction

The distance-based approach:
1. Estimate evolutionary distances between every two species.
2. Reconstruct the phylogenetic tree that (best) fits the dissimilarity matrix.

You saw in class:
- A phylogenetic tree is uniquely defined by its induced metric (metrics that can be realized by some tree are called additive).
- There are efficient methods for reconstructing this tree.

Problems:
- How do we estimate evolutionary distances? We don't discuss this in this course.
- How ‘close’ do these distances have to be to the ‘real’ distances? This is the topic of this tutorial.

The Phylogenetic “Error-Correction Code” [Atteson ‘99]

[Figure: taxa a-h; balls around the additive metrics of trees T1, T2, T3]

- A phylogenetic tree is uniquely defined by its induced metric.
- How dense is this “space”? Can we tolerate some small noise?
- All matrices in a ball surrounding each additive metric uniquely define the topology of the tree.
- The radius of each ball depends on its “center tree” (the weight of its minimal edge).

The Phylogenetic “Error-Correction Code” [Atteson ‘99]

$l_\infty$ norm (worst-case noise):
- A dissimilarity matrix $D$ is near-additive if there is a binary tree $T$ s.t. $\|D, D_T\|_\infty < \tfrac12 w_{\min}(T)$.
- Near-additive matrices uniquely define a tree topology.
- The balls are tangent: there are matrices $D$ with $\|D, D_{T_1}\|_\infty = \tfrac12 w_{\min}(T_1)$ and $\|D, D_{T_2}\|_\infty = \tfrac12 w_{\min}(T_2)$, where $T_1$ and $T_2$ have different topologies (you show this in HW 5).

We show how to reconstruct the correct topology given near-additive dissimilarities $D$.
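
The ball condition above can be checked numerically once a candidate tree metric is fixed. The sketch below is only an illustration: the function names `linf_distance` and `is_within_ball` and the quartet example are assumptions, not part of the tutorial. It tests membership in the ball around one given $D_T$; it does not search for a tree $T$.

```python
from itertools import combinations

def linf_distance(D, D_T):
    """Largest absolute difference over all taxon pairs: ||D, D_T||_inf."""
    n = len(D)
    return max(abs(D[i][j] - D_T[i][j]) for i, j in combinations(range(n), 2))

def is_within_ball(D, D_T, w_min):
    """True iff D lies strictly inside the ball of radius w_min / 2 around D_T."""
    return linf_distance(D, D_T) < 0.5 * w_min

# Hypothetical example: quartet ab|cd with all five edge weights equal to 1,
# so w_min = 1 and the ball radius is 1/2.
D_T = [[0, 2, 3, 3],
       [2, 0, 3, 3],
       [3, 3, 0, 2],
       [3, 3, 2, 0]]

# Noise of at most 0.4 on two entries: still inside the ball.
D_ok = [[0, 2.4, 3, 3],
        [2.4, 0, 3, 3],
        [3, 3, 0, 1.7],
        [3, 3, 1.7, 0]]

# Noise of exactly 0.5 on one entry: on the boundary, so not near-additive
# with respect to this particular tree.
D_far = [[0, 2.5, 3, 3],
         [2.5, 0, 3, 3],
         [3, 3, 0, 2],
         [3, 3, 2, 0]]

print(is_within_ball(D_ok, D_T, w_min=1))
print(is_within_ball(D_far, D_T, w_min=1))
```

The strict inequality matters: a matrix at distance exactly $\tfrac12 w_{\min}(T)$ lies on the boundary, where two balls may touch.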

Deepest LCA Neighbor Joining

Input: a dissimilarity matrix $D$ over $S$.
Output: a phylogenetic tree over $S$.

a) Choose a root $r \in S$.
b) Calculate LCA-depths from $r$: $L(i,j) = \tfrac12\bigl(D(r,i) + D(r,j) - D(i,j)\bigr)$ for all $i,j \in S \setminus \{r\}$.

Stopping condition: if $L = [w]$ (a single remaining element $x$), return the tree $T$ consisting of the single edge $(r,x)$ of weight $w$.
Otherwise:
1. Choose a ‘mutually deepest’ pair $(i,j)$: $L(i,j) = \max_{k \ne i} L(i,k) = \max_{k \ne j} L(k,j)$.
2. Replace $i,j$ with a new element $v$ and reduce $L$ (convex reduction):
   $L(v,v) = L(i,j)$
   For $k \ne v$: $L(v,k) = \alpha L(i,k) + (1-\alpha) L(j,k)$, with $0 \le \alpha \le 1$.
3. Recursively execute the algorithm on the reduced matrix.
4. Add $i,j$ as daughter nodes of $v$ with edges of weight $w(v,i) = \max\{0,\, L(i,i) - L(i,j)\}$ and $w(v,j) = \max\{0,\, L(j,j) - L(i,j)\}$.
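
The steps of this slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tutorial's implementation: the names `symmetric`, `lca_depths`, `mutually_deepest_pair` and `dlca` are hypothetical, the recursion is unrolled into a loop (the edge weights of step 4 depend only on $L$ before the reduction, so they can be recorded at join time), internal vertices are labeled by strings, and the default `alpha=0.5` is an arbitrary choice within $0 \le \alpha \le 1$.

```python
from itertools import combinations

def symmetric(pairs):
    """Build a full symmetric dict-of-dicts matrix from {(a, b): distance}."""
    D = {}
    for (a, b), d in pairs.items():
        D.setdefault(a, {a: 0})[b] = d
        D.setdefault(b, {b: 0})[a] = d
    return D

def lca_depths(D, r):
    """Stage (b): L(i,j) = (D(r,i) + D(r,j) - D(i,j)) / 2 for all i, j != r."""
    others = [x for x in D if x != r]
    return {i: {j: 0.5 * (D[r][i] + D[r][j] - D[i][j]) for j in others}
            for i in others}

def mutually_deepest_pair(L):
    """Step 1: a pair whose entry is the off-diagonal maximum of both its rows."""
    def row_max(i):
        return max(L[i][k] for k in L if k != i)
    for i, j in combinations(L, 2):
        if L[i][j] == row_max(i) == row_max(j):
            return i, j
    raise ValueError("no mutually deepest pair found (is D symmetric?)")

def dlca(D, r, alpha=0.5):
    """Return the tree as a list of (parent, child, weight) edges."""
    L = lca_depths(D, r)
    edges = []
    while len(L) > 1:
        i, j = mutually_deepest_pair(L)
        v = f"({i},{j})"  # label of the new internal vertex
        # Step 4: edge weights, truncated at zero.
        edges.append((v, i, max(0.0, L[i][i] - L[i][j])))
        edges.append((v, j, max(0.0, L[j][j] - L[i][j])))
        # Step 2: convex reduction of the rows of i and j into the row of v.
        rest = [k for k in L if k not in (i, j)]
        row_v = {k: alpha * L[i][k] + (1 - alpha) * L[j][k] for k in rest}
        row_v[v] = L[i][j]
        reduced = {k: {m: L[k][m] for m in rest} for k in rest}
        for k in rest:
            reduced[k][v] = row_v[k]
        reduced[v] = row_v
        L = reduced
    # Stopping condition: L = [w] for a single element x; add the edge (r, x).
    (x,) = L
    edges.append((r, x, L[x][x]))
    return edges

# Hypothetical additive input: quartet ab|cd with pendant edge weights
# a:1, b:2, c:3, d:4 and an internal edge of weight 5.
D = symmetric({("a", "b"): 3, ("a", "c"): 9, ("a", "d"): 10,
               ("b", "c"): 10, ("b", "d"): 11, ("c", "d"): 7})
for edge in dlca(D, r="a"):
    print(edge)
```

On this additive input the sketch joins c and d first and recovers all five edge weights. A mutually deepest pair always exists for a symmetric $L$, since the largest off-diagonal entry is the maximum of both its row and its column.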

Deepest LCA Neighbor Joining

Sketch of consistency proof (shown in class):
- If $D$ is additive, consistent with tree $T$, then $L = \mathrm{LCA}(D,r)$ contains the distances of all taxon-pair LCAs from $r$.
- A ‘mutually deepest’ taxon-pair $(i,j)$ is a neighbor-pair (a cherry).
- The reduction computes the ‘real’ LCA-depths corresponding to $v$, the parent of $(i,j)$:
  - $L(v,v) = L(i,j)$ ($v$ is the LCA of $i$ and $j$).
  - For $k \ne v$, $L(v,k) = L(i,k) = L(j,k)$.
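
The first claim of the sketch can be written out as a short calculation. The symbols $u$ (the LCA of $i$ and $j$ when $T$ is rooted at $r$) and $d(\cdot,\cdot)$ (path length in $T$) are notation introduced here for illustration only.

```latex
\begin{align*}
D_T(r,i) &= d(r,u) + d(u,i), \\
D_T(r,j) &= d(r,u) + d(u,j), \\
D_T(i,j) &= d(u,i) + d(u,j), \\
L(i,j) &= \tfrac12\bigl(D_T(r,i) + D_T(r,j) - D_T(i,j)\bigr) = d(r,u).
\end{align*}
```

In particular $L(i,i) = D_T(r,i)$, the depth of the taxon $i$ itself.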
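
Deepest LCA Neighbor Joining - Example

[Figure: tree over taxa A, B, C, D, E with the root marked and internal nodes labeled B/C and A/B/C]

$D$ is additive:

D:    A   B   C   D   E
 A    0  10   7  12   7
 B        0   7  14   9
 C            0  11   6
 D                0   7
 E                    0

$L = \mathrm{LCA}(D, E)$ (root E; the entries satisfy $D(i,j) = L(i,i) + L(j,j) - 2L(i,j)$):

L:    A   B   C   D
 A    7   3   3   1
 B    3   9   4   1
 C    3   4   6   1
 D    1   1   1   7

Row maxima (off-diagonal): A: 3, B: 4, C: 4, D: 1.

- $(B,C)$ is the only mutually deepest pair.
- We can tolerate noise smaller than ±½.
- In general we can tolerate any noise which maintains the off-diagonal maximum in every row.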
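
The example can be replayed in a few lines of Python. This is a sketch under assumptions: the root is taken to be E, and the A and B rows of `D` are filled in from the $L$ matrix through $D(i,j) = L(i,i) + L(j,j) - 2L(i,j)$, which agrees with the C, D and E rows shown on the slide. Variable names are illustrative.

```python
from itertools import combinations

taxa = ["A", "B", "C", "D", "E"]
# Full symmetric distance matrix (rows/columns in the order of `taxa`).
D = [[0, 10, 7, 12, 7],
     [10, 0, 7, 14, 9],
     [7, 7, 0, 11, 6],
     [12, 14, 11, 0, 7],
     [7, 9, 6, 7, 0]]

r = taxa.index("E")  # assumed root
n = len(taxa) - 1    # L is indexed by the non-root taxa A..D

# LCA-depths from the root: L(i,j) = (D(r,i) + D(r,j) - D(i,j)) / 2.
L = [[(D[r][i] + D[r][j] - D[i][j]) / 2 for j in range(n)] for i in range(n)]

# Off-diagonal maximum of every row.
row_max = [max(L[i][k] for k in range(n) if k != i) for i in range(n)]

# Pairs whose entry is the off-diagonal maximum of both rows.
deepest = [(taxa[i], taxa[j]) for i, j in combinations(range(n), 2)
           if L[i][j] == row_max[i] == row_max[j]]

for name, row in zip(taxa, L):
    print(name, row)
print("row maxima:", row_max)
print("mutually deepest pairs:", deepest)
```

Row A attains its maximum at both B and C, but neither (A,B) nor (A,C) is mutually deepest, because rows B and C have the larger maximum 4 at each other.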

Robustness of DLCA

Theorem: If $\|D, D_T\|_\infty < \tfrac12 w_{\min}(T)$, then the tree returned by DLCA on input $D$ has the same topology as $T$ (for any selection of root).

Let $L$ be the matrix calculated in stage (b) ($L = \mathrm{LCA}(D,r)$).
Let $L_T$ be the “true” LCA matrix ($L_T = \mathrm{LCA}(D_T,r)$).

1. We show that $L$ weakly preserves the order of each row in $L_T$: $L_T(i,j) > L_T(i,k) \Rightarrow L(i,j) > L(i,k)$.
2. We prove by induction that this implies that the recursive procedure outputs a tree with the same topology as $T$.
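
Robustness of DLCA (cont)

1. If $\|D, D_T\|_\infty < \tfrac12 w_{\min}(T)$, then $L$ weakly preserves the order of each row in $L_T$: $L_T(i,j) > L_T(i,k) \Rightarrow L(i,j) > L(i,k)$.

[Figure: the quartet of $T$ on $r, k$ and $i, j$, whose internal path has weight $w \ge w_{\min}(T)$]

\begin{align*}
& L_T(i,j) > L_T(i,k) \\
\Rightarrow\ & \tfrac12\bigl(D_T(r,i) + D_T(r,j) - D_T(i,j)\bigr) > \tfrac12\bigl(D_T(r,i) + D_T(r,k) - D_T(i,k)\bigr) \\
\Rightarrow\ & D_T(r,j) - D_T(i,j) > D_T(r,k) - D_T(i,k) \\
\Rightarrow\ & D_T(r,j) + D_T(i,k) > D_T(r,k) + D_T(i,j) \\
\Rightarrow\ & D_T(r,j) + D_T(i,k) \ge D_T(r,k) + D_T(i,j) + 2\,w_{\min}(T) && \text{(4-point condition)} \\
\Rightarrow\ & D(r,j) + D(i,k) > D(r,k) + D(i,j) && \bigl(\|D, D_T\|_\infty < \tfrac12 w_{\min}(T)\bigr) \\
\Rightarrow\ & D(r,j) - D(i,j) > D(r,k) - D(i,k) \\
\Rightarrow\ & \tfrac12\bigl(D(r,i) + D(r,j) - D(i,j)\bigr) > \tfrac12\bigl(D(r,i) + D(r,k) - D(i,k)\bigr) \\
\Rightarrow\ & L(i,j) > L(i,k)
\end{align*}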
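
The passage from the true distances $D_T$ to the noisy distances $D$ is the only step that uses the noise bound, and it may help to spell it out. Writing $\delta = \|D, D_T\|_\infty$ (a symbol introduced here for illustration), each of the four distances involved moves by at most $\delta$:

```latex
\begin{align*}
D(r,j) + D(i,k) &\ge D_T(r,j) + D_T(i,k) - 2\delta \\
                &\ge D_T(r,k) + D_T(i,j) + 2\,w_{\min}(T) - 2\delta \\
                &\ge D(r,k) + D(i,j) + 2\,w_{\min}(T) - 4\delta \\
                &>   D(r,k) + D(i,j)
                  \qquad \text{since } \delta < \tfrac12 w_{\min}(T).
\end{align*}
```

The gap of $2\,w_{\min}(T)$ given by the 4-point condition is exactly what four perturbations of size below $\tfrac12 w_{\min}(T)$ cannot close.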

Robustness of DLCA (cont)

2. If $L$ weakly preserves the order of each row in $L_T$, then the recursive procedure returns a tree with the same topology as $T$.

Assume: $L_T(i,j) > L_T(i,k) \Rightarrow L(i,j) > L(i,k)$. The proof is by induction; the base case is immediate.

a) The pair $(i',j')$ chosen in step (1) is a neighbor-pair in $T$:
   $(i',j')$ is a mutually deepest pair in $L$
   $\Rightarrow$ for every $k \ne i',j'$: $\max\{L(i',k), L(j',k)\} \le L(i',j')$
   $\Rightarrow$ for every $k \ne i',j'$: $\max\{L_T(i',k), L_T(j',k)\} \le L_T(i',j')$ (by the assumption, in contrapositive form)
   $\Rightarrow$ $i'$ and $j'$ are neighbors in $T$ (shown in class).

Robustness of DLCA (cont)

b) The reduced matrix $L'$ calculated in step (2) weakly preserves the order of each row in the reduced $L'_T$.
   Assume $L'_T(i,j) > L'_T(i,k)$.
   - If $i,j,k \ne v$ (the new vertex), then $L'(i,j) > L'(i,k)$ by the induction hypothesis.
   - If $i = v$, then $L'_T(v,j) = L_T(i',j) = L_T(j',j)$ and $L'_T(v,k) = L_T(i',k) = L_T(j',k)$, so
     $\min\{L_T(i',j), L_T(j',j)\} > \max\{L_T(i',k), L_T(j',k)\}$
     $\Rightarrow \min\{L(i',j), L(j',j)\} > \max\{L(i',k), L(j',k)\}$
     $\Rightarrow L'(v,j) > L'(v,k)$ (convex reduction).
   - The cases $j = v$ and $k = v$ can be shown similarly.

Robustness of DLCA (cont)

c) The induction hypothesis implies that the tree (over $(S \setminus \{i',j'\}) \cup \{v\}$) returned by the recursive call in step (3) has the same topology as $T$ (with $i',j'$ replaced by $v$).
d) In step (4) we add $i'$ and $j'$ as children of $v$, and the resulting tree has the same topology as $T$.

Q.E.D.

Robustness of Other Algorithms

Many other algorithms also reconstruct the correct topology given near-additive input:
- Other neighbor joining algorithms: Saitou and Nei's NJ, AddTree, …
- All quartet-based algorithms (you show this in HW 5)

Atteson defines two reconstruction radii:
- An algorithm $A$ has $l_\infty$-radius of $\varepsilon$ iff it is guaranteed to return the binary tree $T$ given $D$ s.t. $\|D, D_T\|_\infty < \varepsilon\, w_{\min}(T)$.
- An algorithm $A$ has edge $l_\infty$-radius of $\varepsilon$ iff it correctly reconstructs all edges in $T$ of weight $> (1/\varepsilon)\,\|D, D_T\|_\infty$.

edge $l_\infty$-radius $\le$ $l_\infty$-radius $\le \tfrac12$

Generalized Robustness

Recall: an algorithm $A$ has edge $l_\infty$-radius of $\varepsilon$ iff it correctly reconstructs all edges in $T$ of weight $> (1/\varepsilon)\,\|D, D_T\|_\infty$.

- DLCA has optimal edge $l_\infty$-radius of ½.
- NJ has edge $l_\infty$-radius of ¼.
- Typically, NJ reconstructs more edges than DLCA. Why is this? Worst-case noise ($\|D, D_T\|_\infty$) is typically much larger than average-case noise.
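
To make the edge-radius definition concrete, the sketch below lists which edges are covered by the worst-case guarantee for a given noise level. The function name `guaranteed_edges`, the edge labels and the weights are hypothetical. The result is only the set of edges the definition guarantees; it does not predict which edges an algorithm reconstructs in practice.

```python
def guaranteed_edges(edge_weights, noise, edge_radius):
    """Edges of weight > noise / edge_radius, i.e. those the radius guarantees."""
    threshold = noise / edge_radius
    return [e for e, w in edge_weights.items() if w > threshold]

# Hypothetical tree with three internal edges and worst-case noise 0.2.
weights = {"e1": 0.3, "e2": 0.5, "e3": 1.0}
noise = 0.2

# Edge radius 1/2 (DLCA): threshold 0.4.
print(guaranteed_edges(weights, noise, edge_radius=0.5))
# Edge radius 1/4 (NJ): threshold 0.8.
print(guaranteed_edges(weights, noise, edge_radius=0.25))
```

A larger edge radius gives a lower weight threshold, so the worst-case guarantee covers more edges. This is a statement about the guarantee only, which is why it does not contradict the typical behavior noted on the slide.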