Phylogenetic Tree (2015-10-13)


1 Phylogenetic Tree

2 Evolution: Evolution of organisms is driven by diversity and mutations. Diversity: different individuals carry different variants of the same basic blueprint. Mutations: the DNA sequence can be changed by single-base changes, deletion or insertion of DNA segments, etc.

3 Basic Assumptions: More closely related organisms have more similar genomes. Highly similar genes are homologous (have the same ancestor). A universal ancestor exists for all life forms. Phylogenetic relationships can be expressed by a dendrogram (a “tree”).

4 Phylogenetic tree: A phylogenetic tree is a tree that describes the sequence of speciation events that led to the formation of a set of present-day species.

5 Common Phylogenetic Tree Terminology [Figure: tree over taxa A, B, C, D, E, labelled with: ancestral node or ROOT of the tree; internal nodes; branches or lineages; terminal nodes.]

6 Phylogenetic trees diagram the evolutionary relationships between the taxa. [Figure: tree with tips Taxon A, Taxon B, Taxon C, Taxon E, Taxon D.] ((A,(B,C)),(D,E)) = the above phylogeny as nested parentheses. Across the taxa: no meaning to the spacing between the taxa, or to the order in which they appear from top to bottom. Along the branches: this dimension either can have no scale, can be proportional to genetic distance or amount of change (for ‘phylograms’ or ‘additive trees’), or can be proportional to time. The tree and the nested parentheses both say that B and C are more closely related to each other than either is to A, and that A, B, and C form a clade that is a sister group to the clade composed of D and E. If the tree has a time scale, then D and E are the most closely related.
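
The nested-parentheses notation on this slide can be read mechanically. The following is a minimal Python sketch, assuming a topology-only string with no branch lengths or internal labels; the function names `parse_newick` and `clades` are illustrative and are not part of the presentation.

```python
def parse_newick(text):
    """Parse a topology-only nested-parentheses string into nested tuples."""
    text = text.strip().rstrip(";").replace(" ", "")
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
            return tuple(children)
        # Otherwise read a taxon name up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos]

    tree = node()
    if pos != len(text):
        raise ValueError("trailing characters after the tree")
    return tree


def clades(tree):
    """Return the leaf set under every internal node, children before parents."""
    found = []

    def leaves(n):
        if isinstance(n, str):
            return frozenset([n])
        members = frozenset().union(*(leaves(child) for child in n))
        found.append(members)
        return members

    leaves(tree)
    return found


tree = parse_newick("((A,(B,C)),(D,E))")
for clade in clades(tree):
    print(sorted(clade))
```

The printed clades include {B, C}, {A, B, C} and {D, E}, which are the groupings the slide describes in words.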

7 Historical Note: Until the mid-1950s, phylogenies were constructed by experts based on their opinion (subjective criteria). Since then, the focus has been on objective criteria for constructing phylogenetic trees, with thousands of articles in the last decades. Important for many aspects of biology: classification; understanding biological mechanisms.

8 Morphological vs. Molecular: Classical phylogenetic analysis uses morphological features: number of legs, lengths of legs, etc. Modern biological methods allow the use of molecular features: gene sequences; protein sequences. The analysis is based on homologous sequences in different species.

9 Morphological topology (based on McKenna and Bell, 1997): Archonta, Glires, Ungulata, Carnivora, Insectivora, Xenarthra.

10 From sequences to a phylogenetic tree
    Rat      QEPGGLVVPPTDA
    Rabbit   QEPGGMVVPPTDA
    Gorilla  QEPGGLVVPPTDA
    Cat      REPGGLVVPPTEG
There are many possible types of sequences to use.

11 Mitochondrial topology (based on Pupko et al.): Perissodactyla, Carnivora, Cetartiodactyla, Rodentia 1, Hedgehogs, Rodentia 2, Primates, Chiroptera, Moles + Shrews, Afrotheria, Xenarthra, Lagomorpha + Scandentia.

12 What can we get from phylogenetic trees? A few examples of what can be inferred from phylogenetic trees built from DNA or protein sequence data: Which species are the closest living relatives of modern humans? Did the infamous Florida Dentist infect his patients with HIV?

13 Which species are the closest living relatives of modern humans? Mitochondrial DNA, most nuclear DNA-encoded genes, and DNA/DNA hybridization all show that bonobos and chimpanzees are related more closely to humans than either is to gorillas. [Figure: tree of Chimpanzees, Bonobos, Humans, Gorillas and Orangutans, with a time axis in MYA running from 14 to 0.]

14 Did the Florida Dentist infect his patients with HIV? Phylogenetic tree of HIV sequences from the DENTIST, his patients, and local HIV-infected people. [Figure: tree with tips DENTIST, Patients A to G, and Local controls 2, 3, 9 and 35; each patient is marked Yes or No.] Yes: the HIV sequences from these patients fall within the clade of HIV sequences found in the dentist. No: the remaining patients.

15 Types of trees: An unrooted tree represents the same phylogeny without the root node.

16 Rooted versus unrooted trees [Figure: rooted trees Tree A, Tree B and Tree C, and an unrooted tree with positions a, b and c marked.] The unrooted tree represents the three rooted trees.

17 Inferring evolutionary relationships between the taxa requires rooting the tree. To root a tree mentally, imagine that the tree is made of string. Grab the string at the root and tug on it until the ends of the string (the taxa) fall opposite the root. [Figure: an unrooted tree of taxa A, B, C, D with the root position marked, and the resulting rooted tree.] Note that in this rooted tree, taxon A is no more closely related to taxon B than it is to C or D.

18 Now, try it again with the root at another position. [Figure: the same unrooted tree of taxa A, B, C, D with the root marked elsewhere, and the resulting rooted tree.] Note that in this rooted tree, taxon A is most closely related to taxon B, and together they are equally distantly related to taxa C and D.

19 An unrooted, four-taxon tree theoretically can be rooted in five different places to produce five different rooted trees. [Figure: unrooted tree 1 over taxa A, B, C, D with root positions numbered 1 to 5, and the corresponding rooted trees 1a, 1b, 1c, 1d and 1e.] These trees show five different evolutionary relationships among the taxa!

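20 Each unrooted tree theoretically can be rooted anywhere along any of its branches. [Figure: unrooted trees over four taxa (A to D), five taxa (A to E) and six taxa (A to F).] For N taxa, the number of unrooted trees is $(2N-5)! / (2^{N-3}\,(N-3)!)$ and the number of rooted trees is $(2N-3)! / (2^{N-2}\,(N-2)!)$.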
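
To get a feel for how fast these counts grow, here is a short Python sketch of the two standard formulas for strictly bifurcating trees. The function names are illustrative only.

```python
from math import factorial


def unrooted_trees(n):
    """Number of unrooted bifurcating trees on n >= 3 labelled taxa."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


def rooted_trees(n):
    """Number of rooted bifurcating trees on n >= 2 labelled taxa."""
    if n < 2:
        raise ValueError("need at least 2 taxa")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


for n in (3, 4, 5, 10):
    print(n, unrooted_trees(n), rooted_trees(n))
```

An unrooted tree on N taxa has 2N - 3 branches, and each branch is a possible root position, so the rooted count is the unrooted count times 2N - 3. For four taxa that gives 3 unrooted trees with 5 root positions each, hence 15 rooted trees.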

21 There are two major ways to root trees. By outgroup: uses taxa (the “outgroup”) that are known to fall outside of the group of interest (the “ingroup”). Requires some prior knowledge about the relationships among the taxa. By midpoint or distance: roots the tree at the midway point between the two most distant taxa in the tree, as determined by branch lengths. This assumption is built into some of the distance-based tree building methods. [Figure: unrooted tree of taxa A, B, C, D with branch lengths 10, 2, 3, 5 and 2, and the outgroup marked.] Example: d(A,D) = 10 + 3 + 5 = 18; midpoint = 18 / 2 = 9.
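
The midpoint arithmetic on this slide can be reproduced with a small Python sketch. The edge list below is an assumption: the slide confirms that the path from A to D has lengths 10, 3 and 5, but assigning the two branches of length 2 to B and C, and the internal node names X and Y, are guesses made for illustration.

```python
from itertools import combinations

# Assumed layout: A and B hang off internal node X, C and D off internal node Y.
EDGES = [("A", "X", 10), ("B", "X", 2), ("X", "Y", 3), ("C", "Y", 2), ("D", "Y", 5)]


def build_adjacency(edges):
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


def path(adj, start, goal):
    """Return [(node, cumulative distance), ...] along the unique path in a tree."""
    stack = [(start, None, [(start, 0)])]
    while stack:
        node, parent, trail = stack.pop()
        if node == goal:
            return trail
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, trail + [(nxt, trail[-1][1] + w)]))
    raise ValueError("no path between the two nodes")


def midpoint_root(edges, leaves):
    """Find the two most distant leaves and the point halfway between them."""
    adj = build_adjacency(edges)
    longest = max(
        (path(adj, a, b) for a, b in combinations(leaves, 2)),
        key=lambda trail: trail[-1][1],
    )
    total = longest[-1][1]
    half = total / 2
    for (u, du), (v, dv) in zip(longest, longest[1:]):
        if du <= half <= dv:
            # The root sits on edge (u, v), at distance half - du from u.
            return longest[0][0], longest[-1][0], total, (u, v, half - du)


print(midpoint_root(EDGES, ["A", "B", "C", "D"]))
```

Under these assumptions the most distant pair is A and D at distance 18, and the root falls on A's own branch, 9 units from A.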

22 Two Methods of Tree Construction. Distance: a tree built by recursively combining the two nodes with the smallest distance. Parsimony: a tree with the minimum total number of character changes between nodes.

23 Types of data used in phylogenetic inference:
Character-based methods: Use the aligned characters, such as DNA or protein sequences, directly during tree inference.
    Taxa       Characters
    Species A  ATGGCTATTCTTATAGTACG
    Species B  ATCGCTAGTCTTATATTACA
    Species C  TTCACTAGACCTGTGGTCCA
    Species D  TTGACCAGACCTGTGGTCCG
    Species E  TTGACCAGTTCTCTAGTTCG
Distance-based methods: Transform the sequence data into pairwise distances (dissimilarities), and then use the matrix during tree building.
               A     B     C     D     E
    Species A  ----  0.20  0.50  0.45  0.40
    Species B  0.23  ----  0.40  0.55  0.50
    Species C  0.87  0.59  ----  0.15  0.40
    Species D  0.73  1.12  0.17  ----  0.25
    Species E  0.59  0.89  0.61  0.31  ----

24 Distance-Based Method. Input: distance matrix between species. For two sequences s_i and s_j, perform a pairwise (global) alignment. Let f = the fraction of sites with different residues. Then $d_{ij} = -\frac{3}{4}\ln\left(1 - \frac{4}{3} f\right)$ (Jukes-Cantor model). Outline: cluster species together. Initially clusters are singletons. At each iteration, combine the two “closest” clusters to get a new one.
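
As a rough illustration of this step, the Python sketch below computes the fraction of differing sites f for two already-aligned sequences and applies the standard Jukes-Cantor correction d = -(3/4) ln(1 - (4/3) f). The two sequences are the ones labelled Species A and Species B on slide 23; the function names are illustrative, and the alignment step itself is assumed to have been done already.

```python
import math


def fraction_different(s1, s2):
    """Fraction of aligned sites at which the two sequences differ."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)


def jukes_cantor(f):
    """Jukes-Cantor corrected distance for an observed difference fraction f."""
    if f >= 0.75:
        raise ValueError("the correction is undefined for f >= 0.75")
    return -0.75 * math.log(1 - 4 * f / 3)


species_a = "ATGGCTATTCTTATAGTACG"
species_b = "ATCGCTAGTCTTATATTACA"

f = fraction_different(species_a, species_b)
print(f, round(jukes_cantor(f), 4))
```

The correction is always at least as large as f, because it estimates substitutions hidden by multiple changes at the same site, and it grows without bound as f approaches 0.75.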

25 Unweighted Pair Group Method using Arithmetic Averages (UPGMA). UPGMA is a type of distance-based algorithm. UPGMA steps: 1. Cluster the two species with the smallest distance, putting them into a single group. 2. Recalculate the distance matrix with the new group against the other groups. 3. With the new distance matrix, repeat step 1 until all species have been grouped.

26 Algorithm

27 UPGMA Step 1: Merge D & E.
    Species  A     B     C     D
    B        9     -     -     -
    C        8     11    -     -
    D        12    15    10    -
    E        15    18    13    5
d(DE)A = 0.5 * (dDA + dEA) = 0.5 * (12 + 15) = 13.5
d(DE)B = 0.5 * (dDB + dEB) = 0.5 * (15 + 18) = 16.5
d(DE)C = 0.5 * (dDC + dEC) = 0.5 * (10 + 13) = 11.5
Updated matrix:
    Species  A     B     C
    B        9     -     -
    C        8     11    -
    DE       13.5  16.5  11.5

28 UPGMA Step 2: Merge A & C.
    Species  A     B     C
    B        9     -     -
    C        8     11    -
    DE       13.5  16.5  11.5
Updated matrix:
    Species  B     AC
    AC       10    -
    DE       16.5  12.5

29 UPGMA Steps 3 & 4: Merge B & AC, then merge ABC & DE.
    Species  B     AC
    AC       10    -
    DE       16.5  12.5
Resulting tree: (((A,C),B),(D,E))
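
The worked example on the three slides above can be replayed with a compact Python sketch. One assumption to flag: the update below weights each merged group by its number of species, which is the usual UPGMA rule. In steps 1 and 2 the merged groups are single species, and in step 2 the remaining group DE is compared with two singletons, so the result coincides with the plain 0.5 * (x + y) average shown on the slides. The function name `upgma` and the string labels for merged groups are illustrative.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster by UPGMA; dist maps (a, b) pairs of names to distances."""
    sizes = {name: 1 for name in names}  # group label -> number of species
    d = {frozenset(pair): value for pair, value in dist.items()}
    merges = []
    while len(sizes) > 1:
        # Pick the closest pair among the current groups.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = f"({a},{b})"
        merges.append((a, b, d[frozenset((a, b))]))
        # Distance from the new group to every other group: size-weighted mean.
        for c in sizes:
            if c not in (a, b):
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))] + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return merged, merges


DIST = {
    ("A", "B"): 9, ("A", "C"): 8, ("B", "C"): 11,
    ("A", "D"): 12, ("B", "D"): 15, ("C", "D"): 10,
    ("A", "E"): 15, ("B", "E"): 18, ("C", "E"): 13, ("D", "E"): 5,
}

tree, merges = upgma(["A", "B", "C", "D", "E"], DIST)
print(tree)
for step in merges:
    print(step)
```

The merge order is D with E, then A with C, then B with (A,C), then the two remaining groups, which is the same topology as (((A,C),B),(D,E)) written with the children in a different order.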

30 Most Parsimonious Tree (MP Tree). Optimality criterion: the ‘most-parsimonious’ tree is the one that requires the fewest evolutionary events (e.g., nucleotide substitutions, amino acid replacements) to explain the sequences. Parsimony score: number of character changes (mutations) along the evolutionary tree. Example: [Figure: two trees over the sequences AGA, AAA, AAG and GGA, with inferred ancestral sequences and the number of changes on each branch; one tree has Score = 4 and the other has Score = 3.] Most parsimonious tree: the tree with the minimal parsimony score.
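
One common way to compute the parsimony score of a fixed tree is Fitch's algorithm, sketched below in Python. The slide does not say which method it uses, so treat this as one possible approach. The four sequences come from the slide, but the two groupings are chosen here for illustration and are not read off the slide's figure.

```python
def fitch_score(tree):
    """Parsimony score of a rooted binary tree given as nested 2-tuples of strings."""

    def state_sets(node):
        # A leaf contributes one singleton state set per site and no changes.
        if isinstance(node, str):
            return [{ch} for ch in node], 0
        left, right = node
        left_sets, left_cost = state_sets(left)
        right_sets, right_cost = state_sets(right)
        sets, cost = [], left_cost + right_cost
        for a, b in zip(left_sets, right_sets):
            common = a & b
            if common:
                sets.append(common)  # children agree on a state: no change
            else:
                sets.append(a | b)   # children disagree: count one change
                cost += 1
        return sets, cost

    return state_sets(tree)[1]


tree_1 = (("AAA", "AAG"), ("AGA", "GGA"))
tree_2 = (("AAA", "AGA"), ("AAG", "GGA"))
print(fitch_score(tree_1), fitch_score(tree_2))
```

With these groupings the first tree needs 3 changes and the second needs 4, so the first is the more parsimonious of the two.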

31 There are many trees... We cannot go over all the trees, so we need a way to find the best tree. There are approximate solutions. But what if we want to make sure we find the global optimum? There is a way that is more efficient than just going over all possible trees. It is called BRANCH AND BOUND, a general technique in computer science that can be applied to phylogeny.

32 BRANCH AND BOUND: To illustrate the BRANCH AND BOUND (BNB) method, we will use an example not connected to evolution. Later, when the general BNB method is understood, we will see how to apply it to finding the MP tree. We will present the traveling salesperson problem (TSP).

33 Branch and Bound for TSP: Find a minimum-cost round-trip path that visits each intermediate city exactly once. Greedy approach: A,G,E,F,B,D,C,A = 251. [Figure: weighted graph over the cities A, B, C, D, E, F, G with edge weights 93, 46, 20, 35, 68, 12, 57, 31, 15, 82, 17, 82, 59.]

34 Search all possible paths. [Figure: search tree over partial paths: A->G (20); A->G->F (88), with children A->G->F->B, A->G->F->E, A->G->F->C; A->G->E (55); A->B (46); A->C (93); A->C->B (175); A->C->B->E (257); A->C->D; A->C->F.] Best estimate: 251.
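
The idea of extending partial paths and abandoning any whose cost already reaches the best complete tour can be sketched in Python as follows. The slide's graph is not recoverable from the transcript, so the 4-city distance matrix below is entirely hypothetical, and the bound used (the cost of the partial path so far) is the simplest possible one; it is valid only when edge weights are non-negative.

```python
def tsp_branch_and_bound(dist, start=0):
    """Return (cost, tour) of a cheapest round trip using depth-first BNB."""
    n = len(dist)
    best = {"cost": float("inf"), "tour": None}

    def extend(path, cost, remaining):
        # Bound: a partial path that already costs as much as the best
        # complete tour cannot lead to an improvement.
        if cost >= best["cost"]:
            return
        if not remaining:
            total = cost + dist[path[-1]][start]  # close the round trip
            if total < best["cost"]:
                best["cost"], best["tour"] = total, path + [start]
            return
        # Branch: try the nearest unvisited city first to find a good tour early.
        for city in sorted(remaining, key=lambda c: dist[path[-1]][c]):
            extend(path + [city], cost + dist[path[-1]][city], remaining - {city})

    extend([start], 0, frozenset(range(n)) - {start})
    return best["cost"], best["tour"]


# Hypothetical symmetric distances between four cities (not the slide's graph).
DIST = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

print(tsp_branch_and_bound(DIST))
```

Trying the nearest city first plays the role of the greedy tour on the previous slide: it supplies an early upper bound, which lets later branches be cut sooner.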

35 Back to finding the MP tree: BNB helps, though it is still exponential...

36 The MP search tree [Figure: the tree over taxa 1, 2, 3 is extended to trees over taxa 1, 2, 3, 4. Labels: "4 is added to branch 1"; "5 is added to branch 2"; "There are 5 branches".]

37 The MP search tree [Figure: search tree with a parsimony score at each node: root 30; next level 43, 39 and 55; complete trees 52, 54, 52, 53, 58, 61, 56, 59, 61, 69, 53, 51, 42, 47. Label: "4 is added to branch 1".]

38 MP-BNB [Figure: the same search tree.] Best (minimum) value = 52

39-40 MP-BNB [Figure: the same search tree.] Best record = 52

41-42 MP-BNB [Figure: the search tree without the complete trees scored 61, 56, 59, 61, 69.] Best record = 52

43 MP-BNB [Figure: the pruned search tree, with 53 and 58 marked.] Best record = 52, then 51

44-46 MP-BNB [Figure: the pruned search tree.] Best record = 52, then 51, then 42

47 MP-BNB [Figure: the pruned search tree.] Best TREE. MP score = 42. Total # trees visited: 14

48 Order of Evaluation Matters [Figure: search tree with root 30, partial trees 43, 39 and 55, and complete trees 53, 51, 42, 47.] Evaluate all 3 first. The bound after searching this subtree will be 42. Total trees visited: 9
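
The effect of evaluation order can be checked with a toy Python sketch. The search tree below is a reconstruction and should be read as illustrative: the bounds 43, 55 and 39 and most leaf scores are taken from the slide figures, the grouping of leaves under each partial tree is an assumption, and the value 49 is a made-up placeholder for one leaf that is not legible in the transcript.

```python
# (score of the partial tree, scores of the complete trees built from it).
# The grouping is assumed, and 49 is a hypothetical placeholder value.
SUBTREES = [
    (43, [52, 54, 52, 53, 58]),
    (55, [61, 56, 59, 61, 69]),
    (39, [53, 51, 42, 47, 49]),
]


def search(subtrees):
    """Return (best score, trees evaluated), visiting subtrees in the given order."""
    best = float("inf")
    visited = 1 + len(subtrees)  # the root, plus every partial tree scored up front
    for bound, leaves in subtrees:
        # Adding taxa never lowers a parsimony score, so the partial tree's
        # score is a lower bound on every complete tree below it.
        if bound >= best:
            continue
        for score in leaves:
            visited += 1
            best = min(best, score)
    return best, visited


print(search(SUBTREES))          # subtrees in the order listed
print(search(sorted(SUBTREES)))  # most promising (lowest) bound first
```

Under these assumptions both orders find the score 42, but the listed order evaluates 14 trees while the lowest-bound-first order evaluates 9, matching the totals quoted on slides 47 and 48.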

