Brandon Andrews CS6030.  What is a phylogenetic tree?  Goals in a phylogenetic tree generator  Distance based method  Fitch-Margoliash Method Example.


1 Brandon Andrews CS6030

2  What is a phylogenetic tree?  Goals in a phylogenetic tree generator  Distance based method  Fitch-Margoliash Method Example  Verification  Demo

3  Also known as an evolutionary tree
Attempts to map the genetic similarity of organisms onto a tree in which longer branches indicate greater dissimilarity
[Diagram: tree with leaves A, B, and C]
B and C are similar
A and B are more similar than A and C, which are separated by a longer distance

4  Given the sequences and their calculated or known dissimilarities, construct a tree that correctly maps this data
Naïve method: generate every possible tree and grade its quality
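To see why the naive method does not scale, it helps to count the candidate trees. The sketch below relies on the standard result that n labeled leaves admit (2n - 5)!! distinct unrooted binary trees. The function name is illustrative and does not come from the presentation.

```python
def count_unrooted_trees(n_leaves: int) -> int:
    """Number of distinct unrooted binary trees on n labeled leaves: (2n - 5)!!."""
    if n_leaves < 3:
        return 1  # one or two leaves admit only a single topology
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_leaves - 4, 2):
        count *= k
    return count


for n in (5, 10, 20):
    print(n, count_unrooted_trees(n))
```

Five leaves already give 15 topologies and ten leaves give 2,027,025, which is why exhaustive grading is only practical for very small inputs.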

5  Take a distance matrix that stores the distance from every sequence to every other sequence
Construct a tree which preserves these distances
Most methods do not preserve the distances exactly

6  Clustering algorithm that works bottom up to create an unrooted tree  Weights are used to help lower the error rate for long paths

7  Calculate a distance matrix
Hamming distance can be used, but a better dissimilarity function is advised
Only the upper triangle is filled in:
       A    B    C    D    E
  A    0   22   39   39   41
  B    0    0   41   41   43
  C    0    0    0   18   20
  D    0    0    0    0   10
  E    0    0    0    0    0
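As a minimal sketch of the Hamming-distance option, the code below builds a symmetric distance matrix from aligned sequences. The sequences are made up for illustration and are not the ones behind the slide's matrix. The function names are likewise assumptions.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(seqs: dict) -> dict:
    """Symmetric matrix of pairwise Hamming distances, keyed by sequence name."""
    matrix = {name: {name: 0} for name in seqs}
    for p, q in combinations(seqs, 2):
        matrix[p][q] = matrix[q][p] = hamming(seqs[p], seqs[q])
    return matrix


# Made-up aligned sequences, for illustration only.
example = {"A": "ACGTAC", "B": "ACGTTC", "C": "TCGTTA"}
matrix = distance_matrix(example)
print(matrix["A"]["B"], matrix["A"]["C"], matrix["B"]["C"])  # 1 3 2
```

Swapping `hamming` for a better dissimilarity function leaves the rest of the construction unchanged.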

8  Add all the sequences to an array of nodes and mark them as leaves
Select the closest pair of nodes by scanning the distance matrix
Those two nodes, D and E in our example, make up two of the branches in a 3-branch calculation to find the branch lengths; the third branch leads to the remaining nodes A, B, C
[Diagram: D and E joined at a new node by branches d and e; branch abc leads to A, B, C]
dist(ABC, D) is the average distance from A, B, and C to D
dist(ABC, E) is the average distance from A, B, and C to E
d = (dist(D, E) + (dist(ABC, D) - dist(ABC, E))) / 2;
e = dist(D, E) - d;
abc = dist(ABC, D) - d;

9  dist(ABC, D) and dist(ABC, E): calculate each by taking the distances from the elements A, B, and C and averaging them
dist(ABC, D) = (39 + 41 + 18) / 3 = 32.6…
dist(ABC, E) = (41 + 43 + 20) / 3 = 34.6…
d = (10 + (32.6… - 34.6…)) / 2 = 4
e = 10 - 4 = 6
abc = 32.6… - 4 = 28.6…
         ABC      D        E
  ABC    0      32.6…    34.6…
  D      0       0       10
  E      0       0        0
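The 3-branch calculation for D and E can be checked with a short sketch. The distances are the ones from the example matrix. The helper names (`dist`, `avg_dist`, `three_branch`) are illustrative, not the presenter's code.

```python
# Upper triangle of the example distance matrix from the slides.
PAIRS = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20,
    ("D", "E"): 10,
}


def dist(x, y):
    """Look up a distance regardless of the order of the two names."""
    if x == y:
        return 0
    return PAIRS.get((x, y), PAIRS.get((y, x)))


def avg_dist(group, node):
    """Average distance from every member of group to node."""
    return sum(dist(g, node) for g in group) / len(group)


def three_branch(x, y, rest):
    """Branch lengths for x, y, and the branch leading to the remaining nodes."""
    d_xy = dist(x, y)
    to_x = avg_dist(rest, x)
    to_y = avg_dist(rest, y)
    branch_x = (d_xy + (to_x - to_y)) / 2
    branch_y = d_xy - branch_x
    branch_rest = to_x - branch_x
    return branch_x, branch_y, branch_rest


d, e, abc = three_branch("D", "E", ["A", "B", "C"])
print(round(d, 3), round(e, 3), round(abc, 3))  # 4.0 6.0 28.667
```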

10  Now we can create a new node with distance 28.6… and set D and E to their respective distances
Since D and E are leaves, their distances are kept. However, if they were not leaves, the average of the child distances would be subtracted, as seen later
[Diagram: D (4) and E (6) joined at the new node; branch of 28.6… leads to A, B, C]

11  The final step in this iteration is to recalculate the nodes and the distance matrix
The nodes array has the new merged node DE appended to the end, and D and E are removed
The distance matrix is updated with DE merged in and D and E removed:
       A    B    C    DE
  A    0   22   39   40
  B    0    0   41   42
  C    0    0    0   19
  DE   0    0    0    0
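The matrix update can be sketched as follows, assuming the rule the example numbers imply: the distance from any other node to the merged node is the mean of its distances to the two merged nodes. The `merge` function and the dict-of-dicts layout are illustrative choices.

```python
NAMES = ["A", "B", "C", "D", "E"]
ROWS = [
    [0, 22, 39, 39, 41],
    [22, 0, 41, 41, 43],
    [39, 41, 0, 18, 20],
    [39, 41, 18, 0, 10],
    [41, 43, 20, 10, 0],
]
# Full symmetric version of the example matrix, keyed by node name.
matrix = {name: dict(zip(NAMES, row)) for name, row in zip(NAMES, ROWS)}


def merge(matrix, x, y):
    """Return a new symmetric matrix in which x and y are replaced by x + y."""
    merged = x + y
    others = [k for k in matrix if k not in (x, y)]
    # Copy the distances among the nodes that are not being merged.
    new = {k: {j: matrix[k][j] for j in others} for k in others}
    new[merged] = {merged: 0}
    for k in others:
        # Distance to the merged node is the mean of the two old distances.
        avg = (matrix[k][x] + matrix[k][y]) / 2
        new[k][merged] = new[merged][k] = avg
    return new


step1 = merge(matrix, "D", "E")
print(step1["A"]["DE"], step1["B"]["DE"], step1["C"]["DE"])  # 40.0 42.0 19.0
```

The merged node is added last, matching the slide's description of appending DE to the end of the nodes array.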

12  Look at the new distance matrix and find the closest pair, C and DE
Now there is a special step. C is a leaf, so it gets the calculated distance. DE is not a leaf, so we need to subtract from DE the average child distance
[Diagram: C and DE joined by branches c and de; branch ab leads to A, B]
dist(AB, C) is the average distance from AB to C
dist(AB, DE) is the average distance from AB to DE
c = (dist(C, DE) + (dist(AB, C) - dist(AB, DE))) / 2;
de = dist(C, DE) - c;
ab = dist(AB, C) - c;

13  Merging A and B to calculate the average distance to C and DE
dist(AB, C) = (39 + 41) / 2 = 40
dist(AB, DE) = (40 + 42) / 2 = 41
       AB    C    DE
  AB    0   40   41
  C     0    0   19
  DE    0    0    0

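14  Average child distance example
Recursively take the average over the branches of each node
((5 + ((2 + (4 + 6) / 2) + 3) / 2) + 1) / 2 = 5.5
[Diagram: tree whose root has branches 5 and 1; the branch of 5 leads to a node with branches 2 and 3; the branch of 2 leads to a node with leaf branches 4 and 6]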
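The recursion can be written as a small sketch. The tree encoding is an assumption made for illustration: a leaf is `None`, and an internal node is a list of `(branch_length, child)` pairs.

```python
def average_distance(node):
    """Mean distance from a node down to its leaves, averaged branch by branch.

    A leaf is represented by None; an internal node is a list of
    (branch_length, child) pairs.
    """
    if node is None:
        return 0.0
    totals = [length + average_distance(child) for length, child in node]
    return sum(totals) / len(totals)


# The tree implied by the slide's expression.
inner = [(4, None), (6, None)]
middle = [(2, inner), (3, None)]
root = [(5, middle), (1, None)]
print(average_distance(root))  # 5.5
```

Each level averages its own branches only, so a subtree counts once at its parent regardless of how many leaves it holds.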

15  So for DE, which has two child nodes, we need to subtract the average of the children. Since both children of DE are leaves, we compute:
AverageDistance(DE) = (4 + 6) / 2 = 5
So now we calculate c, de, and ab:
c = (dist(C, DE) + (dist(AB, C) - dist(AB, DE))) / 2 = (19 + (40 - 41)) / 2 = 9
de = dist(C, DE) - c - AverageDistance(DE) = 19 - 9 - (4 + 6) / 2 = 5
ab = dist(AB, C) - c = 40 - 9 = 31
Notice that the distance at de replaces whatever was previously stored there (the 28.6… from the first iteration)
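The correction for non-leaf nodes can be folded into the 3-branch formula, as in the sketch below. The function `join_lengths` and its parameter names are hypothetical. The average child distance defaults to 0 for a leaf.

```python
def join_lengths(d_xy, d_rest_x, d_rest_y, avg_x=0.0, avg_y=0.0):
    """Branch lengths for joining nodes x and y.

    d_xy      distance between x and y in the current matrix
    d_rest_x  average distance from the remaining nodes to x
    d_rest_y  average distance from the remaining nodes to y
    avg_x     average child distance of x (0 for a leaf)
    avg_y     average child distance of y (0 for a leaf)
    """
    raw_x = (d_xy + (d_rest_x - d_rest_y)) / 2
    raw_y = d_xy - raw_x
    rest = d_rest_x - raw_x
    # Subtract the part of each distance already accounted for inside a subtree.
    return raw_x - avg_x, raw_y - avg_y, rest


# Joining the leaf C with the internal node DE, whose children are 4 and 6.
c, de, ab = join_lengths(19, 40, 41, avg_y=(4 + 6) / 2)
print(c, de, ab)  # 9.0 5.0 31.0
```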

16  With the new node added:
[Diagram: C (9) and DE (5) joined at the new node CDE, with D (4) and E (6) below DE; branch of 31 leads to A, B]
Recalculated distance matrix:
        A     B    CDE
  A     0    22   39.5
  B     0     0   41.5
  CDE   0     0    0

17  As before, choose the next closest nodes by looking at the distance matrix: A and B are chosen
a and b can be calculated directly since A and B are leaves, but notice that we are linking two trees at cde, so we need a special step to subtract the average distance
[Diagram: A and B joined by branches a and b; branch cde leads to CDE]
dist(CDE, A) is the average distance from CDE to A
dist(CDE, B) is the average distance from CDE to B
a = (dist(A, B) + (dist(CDE, A) - dist(CDE, B))) / 2 = (22 + (39.5 - 41.5)) / 2 = 10
b = dist(A, B) - a = 22 - 10 = 12
cde = dist(CDE, A) - a = 39.5 - 10 = 29.5

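18  So the branch cde becomes 29.5 - AverageDistance(CDE):
29.5 - ((5 + (4 + 6) / 2) + 9) / 2 = 29.5 - 9.5 = 20
[Diagram: the subtree with A (10) and B (12) and the subtree with C (9), D (4), E (6) and internal branch 5 are linked by a branch of length 20 in place of 29.5]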

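19  So we have a completely defined unrooted tree. How do we root it?
Just take the last branch and divide it by two: the branch of length 20 is split into two halves of length 10, one on each side of the root
[Diagram: rooted tree with branch lengths A 10, B 12, C 9, D 4, E 6, internal branch 5 above D and E, and 10 on each side of the root]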

20  Original:
       A    B    C    D    E
  A    0   22   39   39   41
  B    0    0   41   41   43
  C    0    0    0   18   20
  D    0    0    0    0   10
  E    0    0    0    0    0
From the generated tree:
       A    B    C    D    E
  A    0   22   39   39   41
  B    0    0   41   41   43
  C    0    0    0   18   20
  D    0    0    0    0   10
  E    0    0    0    0    0
Exact match
This is rare; the result is usually off by a small amount
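The verification can be reproduced by summing branch lengths along the path between every pair of leaves in the generated tree and comparing the result with the original matrix. This is a sketch: the internal node labels `ab`, `cde`, and `de` are hypothetical names for the three join points.

```python
# Branches of the final unrooted tree: (node, node, length).
EDGES = [
    ("A", "ab", 10), ("B", "ab", 12), ("ab", "cde", 20),
    ("C", "cde", 9), ("cde", "de", 5), ("D", "de", 4), ("E", "de", 6),
]

# Upper triangle of the original distance matrix.
ORIGINAL = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20,
    ("D", "E"): 10,
}


def build_adjacency(edges):
    """Undirected adjacency list: node -> [(neighbor, branch_length), ...]."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


def path_length(adj, start, goal):
    """Sum of branch lengths on the unique path between two nodes of a tree."""
    stack = [(start, None, 0)]
    while stack:
        node, parent, total = stack.pop()
        if node == goal:
            return total
        for nxt, w in adj[node]:
            if nxt != parent:  # never walk back along the branch just used
                stack.append((nxt, node, total + w))
    raise ValueError("nodes are not connected")


adj = build_adjacency(EDGES)
tree_dist = {pair: path_length(adj, *pair) for pair in ORIGINAL}
print(tree_dist == ORIGINAL)  # True
```

When the match is not exact, the per-pair differences between `tree_dist` and `ORIGINAL` show how far the tree is off.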

21  http://sirisian.com/javascript/CS6030Project.html

22  Distance-based methods such as the Fitch-Margoliash method quickly produce very accurate trees when given an accurate distance matrix


