Phylogenetic Tree. 2015-10-13.

Presentation transcript:

Phylogenetic Tree

Evolution
Evolution of organisms is driven by:
Diversity: different individuals carry different variants of the same basic blueprint.
Mutations: the DNA sequence can be changed by single-base changes, deletion or insertion of DNA segments, etc.

Basic Assumptions
More closely related organisms have more similar genomes.
Highly similar genes are homologous (have the same ancestor).
A universal ancestor exists for all life forms.
Phylogenetic relations can be expressed by a dendrogram (a “tree”).

Phylogenetic tree
A phylogenetic tree is a tree that describes the sequence of speciation events that led to the formation of a set of present-day species.

Common Phylogenetic Tree Terminology
[Figure: a tree with terminal nodes A, B, C, D, E, labeled with the terms: ancestral node or root of the tree; internal nodes; branches or lineages; terminal nodes.]

Phylogenetic trees diagram the evolutionary relationships between the taxa
[Figure: a tree whose tips are Taxon A, Taxon B, Taxon C, Taxon E, Taxon D.]
((A,(B,C)),(D,E)) = the above phylogeny as nested parentheses.
The spacing between the taxa, and the order in which they appear from top to bottom, has no meaning.
The dimension along the branches either can have no scale, can be proportional to genetic distance or amount of change (for ‘phylograms’ or ‘additive trees’), or can be proportional to time.
The tree says that B and C are more closely related to each other than either is to A, and that A, B, and C form a clade that is a sister group to the clade composed of D and E. If the tree has a time scale, then D and E are the most closely related.
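The nested-parentheses notation on this slide can be read mechanically. The following is a minimal sketch, assuming a bare tree string with no branch lengths or spaces; the function names `parse_newick` and `leaf_sets` are illustrative, not taken from the slides.

```python
def parse_newick(text):
    """Parse a bare nested-parentheses tree (no branch lengths) into nested tuples."""
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # skip "("
            children = [node()]
            while text[pos] == ",":
                pos += 1                  # skip ","
                children.append(node())
            pos += 1                      # skip ")"
            return tuple(children)
        start = pos                       # a leaf name runs until the next delimiter
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos]

    return node()


def leaf_sets(tree, found):
    """Return the leaves under `tree`, appending each internal node's leaf set to `found`."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, found) for child in tree))
    found.append(leaves)
    return leaves


tree = parse_newick("((A,(B,C)),(D,E))")
clades = []
leaf_sets(tree, clades)
for clade in clades:
    print(sorted(clade))
```

Each printed set is a clade: B and C group together, then A joins them, and D and E form the sister group, as the slide describes.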

Historical Note
Until the mid-1950s, phylogenies were constructed by experts based on their opinion (subjective criteria).
Since then, the focus has been on objective criteria for constructing phylogenetic trees, with thousands of articles in the last decades.
Important for many aspects of biology: classification, and understanding biological mechanisms.

Morphological vs. Molecular
Classical phylogenetic analysis uses morphological features: number of legs, lengths of legs, etc.
Modern biological methods allow the use of molecular features: gene sequences and protein sequences.
The analysis is based on homologous sequences in different species.

Morphological topology
[Figure: tree of Archonta, Glires, Ungulata, Carnivora, Insectivora, Xenarthra. Based on McKenna and Bell, 1997.]

From sequences to a phylogenetic tree
Rat      QEPGGLVVPPTDA
Rabbit   QEPGGMVVPPTDA
Gorilla  QEPGGLVVPPTDA
Cat      REPGGLVVPPTEG
There are many possible types of sequences to use.

Mitochondrial topology
[Figure: tree of Perissodactyla, Carnivora, Cetartiodactyla, Rodentia 1, Hedgehogs, Rodentia 2, Primates, Chiroptera, Moles + Shrews, Afrotheria, Xenarthra, Lagomorpha + Scandentia. Based on Pupko et al.]

What can we get from phylogenetic trees?
A few examples of what can be inferred from phylogenetic trees built from DNA or protein sequence data:
Which species are the closest living relatives of modern humans?
Did the infamous Florida Dentist infect his patients with HIV?

Which species are the closest living relatives of modern humans?
Mitochondrial DNA, most nuclear DNA-encoded genes, and DNA/DNA hybridization all show that bonobos and chimpanzees are related more closely to humans than either is to gorillas.
[Figure: tree of Chimpanzees, Bonobos, Humans, Gorillas, and Orangutans on a time axis from 14 to 0 MYA (million years ago).]

Did the Florida Dentist infect his patients with HIV?
Phylogenetic tree of HIV sequences from the dentist, his patients, and local HIV-infected people.
[Figure: tree whose tips are DENTIST, Patients A to G, and Local controls 2, 3, 9, and 35, with some patients marked "Yes" and others "No".]
Yes: the HIV sequences from these patients fall within the clade of HIV sequences found in the dentist.

Types of trees
An unrooted tree represents the same phylogeny without the root node.

Rooted versus unrooted trees
[Figure: rooted trees A, B, and C, and an unrooted tree with branches a, b, and c that represents the three rooted trees.]

Inferring evolutionary relationships between the taxa requires rooting the tree
To root a tree mentally, imagine that the tree is made of string. Grab the string at the root and tug on it until the ends of the string (the taxa) fall opposite the root.
[Figure: an unrooted tree of taxa A, B, C, D with the root position marked, and the rooted tree that results.]
Note that in this rooted tree, taxon A is no more closely related to taxon B than it is to C or D.

Now, try it again with the root at another position
[Figure: the same unrooted tree of taxa A, B, C, D with the root at a different position, and the rooted tree that results.]
Note that in this rooted tree, taxon A is most closely related to taxon B, and together they are equally distantly related to taxa C and D.

An unrooted, four-taxon tree theoretically can be rooted in five different places to produce five different rooted trees
[Figure: the unrooted tree 1 of taxa A, B, C, D with root positions 1 to 5, and the resulting rooted trees 1a, 1b, 1c, 1d, and 1e.]
These trees show five different evolutionary relationships among the taxa!

Each unrooted tree theoretically can be rooted anywhere along any of its branches
[Figure: example unrooted trees with four, five, and six taxa.]
For N taxa:
Number of unrooted trees = $(2N-5)! / (2^{N-3}(N-3)!)$
Number of rooted trees = $(2N-3)! / (2^{N-2}(N-2)!)$
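The tree-count formulas on this slide grow very quickly with N. A small sketch that evaluates them (the function names are illustrative):

```python
from math import factorial


def n_unrooted(n):
    """Number of unrooted binary trees on n labeled taxa (n >= 3)."""
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


def n_rooted(n):
    """Number of rooted binary trees on n labeled taxa (n >= 2)."""
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


for n in (3, 4, 5, 10, 20):
    print(n, n_unrooted(n), n_rooted(n))
```

For four taxa this gives 3 unrooted and 15 rooted trees, which agrees with the previous slide: each unrooted four-taxon tree has five branches and so five possible root positions.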

There are two major ways to root trees
By outgroup: uses taxa (the “outgroup”) that are known to fall outside of the group of interest (the “ingroup”). Requires some prior knowledge about the relationships among the taxa.
By midpoint or distance: roots the tree at the midway point between the two most distant taxa in the tree, as determined by branch lengths. This assumption is built into some of the distance-based tree-building methods.
[Figure: tree of taxa A, B, C, D with an outgroup; d(A,D) = 18, midpoint = 18 / 2 = 9.]

Two Methods of Tree Construction
Distance: a tree built by recursively combining the two nodes with the smallest distance.
Parsimony: a tree with the minimum total number of character changes between nodes.

Types of data used in phylogenetic inference
Character-based methods: use the aligned characters, such as DNA or protein sequences, directly during tree inference.
Taxa       Characters
Species A  ATGGCTATTCTTATAGTACG
Species B  ATCGCTAGTCTTATATTACA
Species C  TTCACTAGACCTGTGGTCCA
Species D  TTGACCAGACCTGTGGTCCG
Species E  TTGACCAGTTCTCTAGTTCG
Distance-based methods: transform the sequence data into pairwise distances (dissimilarities), and then use the matrix during tree building.
[Table: pairwise distance matrix with rows and columns Species A to Species E; the values are missing from the transcript.]
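As an illustration of the distance-based transformation, the sketch below turns the five aligned sequences from this slide into a matrix of pairwise dissimilarities. It assumes the simplest measure, the fraction of differing sites; the slide does not say which measure its own matrix used, and the name `p_distance` is illustrative.

```python
seqs = {
    "A": "ATGGCTATTCTTATAGTACG",
    "B": "ATCGCTAGTCTTATATTACA",
    "C": "TTCACTAGACCTGTGGTCCA",
    "D": "TTGACCAGACCTGTGGTCCG",
    "E": "TTGACCAGTTCTCTAGTTCG",
}


def p_distance(s, t):
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(s, t)) / len(s)


names = sorted(seqs)
print("   " + "  ".join(f"{n:>4}" for n in names))
for row in names:
    cells = "  ".join(f"{p_distance(seqs[row], seqs[col]):4.2f}" for col in names)
    print(f"{row}  {cells}")
```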

Distance-Based Method
Input: distance matrix between species.
For two sequences s_i and s_j, perform a pairwise (global) alignment. Let f = the fraction of sites with different residues. Then (Jukes-Cantor model):
$d_{ij} = -\frac{3}{4}\ln\left(1 - \frac{4}{3}f\right)$
Outline: cluster species together. Initially, clusters are singletons. At each iteration, combine the two “closest” clusters to get a new one.
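The slide names the Jukes-Cantor model for converting the fraction of differing sites f into a distance. A minimal sketch of the standard Jukes-Cantor correction for nucleotide sequences follows; the function name is illustrative, and the correction is only defined for f below 0.75.

```python
from math import log


def jukes_cantor(f):
    """Jukes-Cantor distance for a fraction f of differing nucleotide sites."""
    if not 0 <= f < 0.75:
        raise ValueError("the correction is defined only for 0 <= f < 0.75")
    return -0.75 * log(1 - 4 * f / 3)


for f in (0.0, 0.05, 0.2, 0.5):
    print(f, round(jukes_cantor(f), 4))
```

The corrected distance is always at least f, and the gap widens as f grows, because repeated substitutions at the same site hide part of the change that actually occurred.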

Unweighted Pair Group Method using Arithmetic Averages (UPGMA)
UPGMA is a type of distance-based algorithm.
UPGMA steps:
1. Cluster the two species with the smallest distance, putting them into a single group.
2. Recalculate the distance matrix with the new group against the other groups.
3. With the new distance matrix, repeat step 1 until all species have been grouped.

Algorithm

UPGMA Step 1
Species  A     B     C     D
B        9     -     -     -
C        8     11    -     -
D        12    15    10    -
E        15    18    13    (value missing from the transcript)
Merge D & E.
d(DE)A = 0.5 * (dDA+dEA) = 0.5*(12+15) = 13.5
d(DE)B = 0.5 * (dDB+dEB) = 0.5*(15+18) = 16.5
d(DE)C = 0.5 * (dDC+dEC) = 0.5*(10+13) = 11.5
Species  A     B     C
B        9     -     -
C        8     11    -
DE       13.5  16.5  11.5

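UPGMA Step 2
Merge A & C (distance 8, the smallest entry in the matrix from step 1).
d(AC)B = 0.5 * (dAB+dCB) = 0.5*(9+11) = 10
d(AC)(DE) = 0.5 * (dA(DE)+dC(DE)) = 0.5*(13.5+11.5) = 12.5
Species  B     AC
AC       10    -
DE       16.5  12.5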

UPGMA Steps 3 & 4
Merge B & AC (distance 10).
Merge ACB & DE.
Resulting tree: (((A,C),B),(D,E))
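The worked example can be replayed in code. This is a sketch under stated assumptions: the distance between D and E is missing from the transcript, so the value 5 below is a placeholder chosen only to be the smallest entry (D and E merge first on the slide). New distances are averaged in proportion to cluster size, which gives the same numbers as the slide's 0.5 factor whenever the two merged clusters have equal size.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster leaves by UPGMA; dist maps frozenset({a, b}) to a distance.

    Returns the tree topology as a nested-parentheses string.
    """
    clusters = {name: (name, 1) for name in names}   # label -> (subtree, size)
    d = dict(dist)
    while len(clusters) > 1:
        # pick the closest pair of current clusters
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        merged = a + b
        for c in clusters:
            # size-weighted average of the distances to the two merged clusters
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = (f"({tree_a},{tree_b})", size_a + size_b)
    return next(iter(clusters.values()))[0]


pairs = {
    ("A", "B"): 9, ("A", "C"): 8, ("B", "C"): 11,
    ("A", "D"): 12, ("B", "D"): 15, ("C", "D"): 10,
    ("A", "E"): 15, ("B", "E"): 18, ("C", "E"): 13,
    ("D", "E"): 5,   # assumed placeholder: the slide's value is not recoverable
}
dist = {frozenset(k): v for k, v in pairs.items()}
print(upgma(["A", "B", "C", "D", "E"], dist))
```

The printed tree has the same topology as the slide's result, with the two top-level groups listed in the opposite order.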

Most Parsimonious Tree (MP Tree)
Optimality criterion: the ‘most-parsimonious’ tree is the one that requires the fewest evolutionary events (e.g., nucleotide substitutions, amino acid replacements) to explain the sequences.
Parsimony score: the number of character changes (mutations) along the evolutionary tree.
Most parsimonious tree: the tree with the minimal parsimony score.
[Figure: example with two candidate trees over the sequences AGA, AAA, AAG, and GGA; one has score = 4 and the other score = 3.]
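A parsimony score can be computed for a fixed tree with Fitch's algorithm, which the slides do not name; it is used here as one standard way to count the minimum number of changes. In the sketch below the leaf names t1 to t4 and their pairing with the sequences AAG, AAA, GGA, and AGA are assumptions, since the figure's layout is lost.

```python
def fitch_site(node, column):
    """Return (possible ancestral states, minimum changes) for one alignment column."""
    if isinstance(node, str):
        return {column[node]}, 0
    (left_states, left_cost) = fitch_site(node[0], column)
    (right_states, right_cost) = fitch_site(node[1], column)
    common = left_states & right_states
    if common:
        return common, left_cost + right_cost
    # no shared state: one change is needed on a branch below this node
    return left_states | right_states, left_cost + right_cost + 1


def parsimony_score(tree, seqs):
    """Sum the per-site Fitch costs of a binary tree given as nested 2-tuples."""
    n_sites = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: s[i] for name, s in seqs.items()})[1]
        for i in range(n_sites)
    )


# hypothetical leaf labels for the four sequences in the figure
seqs = {"t1": "AAG", "t2": "AAA", "t3": "GGA", "t4": "AGA"}
print(parsimony_score((("t1", "t2"), ("t3", "t4")), seqs))
print(parsimony_score((("t1", "t3"), ("t2", "t4")), seqs))
```

With this assumed labeling, the first grouping needs 3 changes and the second needs 4, the same two scores that appear in the figure.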

There are many trees...
We cannot go over all the trees, so we need a way to find the best tree. There are approximate solutions, but what if we want to make sure we find the global optimum? There is a way that is more efficient than going over all possible trees. It is called BRANCH AND BOUND; it is a general technique in computer science that can be applied to phylogeny.

BRANCH AND BOUND
To exemplify the BRANCH AND BOUND (BNB) method, we will use an example not connected to evolution. Later, when the general BNB method is understood, we will see how to apply it to finding the MP tree. We will present the traveling salesperson problem (TSP).

Branch and Bound for TSP
Find a minimum-cost round-trip path that visits each intermediate city exactly once.
Greedy approach: A, G, E, F, B, D, C, A = 251.
[Figure: graph of the cities A, B, C, D, E, F, G.]

Search all possible paths
All paths
A → G (20)
  A → G → F (88): A → G → F → B, A → G → F → E, A → G → F → C
  A → G → E (55)
A → B (46)
A → C (93)
  A → C → B (175)
    A → C → B → E (257), which already exceeds the best estimate
  A → C → D
  A → C → F
Best estimate: 251
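The search above can be sketched as a depth-first search that abandons any partial path whose cost already matches or exceeds the best complete tour found so far. The slide's edge costs are not recoverable, so the 4-city cost matrix below is a made-up example, and the function name is illustrative. The pruning rule assumes non-negative costs.

```python
def tsp_branch_and_bound(cost, start=0):
    """Return (minimum round-trip cost, tour) for a square cost matrix."""
    n = len(cost)
    best = {"cost": float("inf"), "tour": None}

    def extend(path, so_far):
        if so_far >= best["cost"]:
            return                        # bound: cannot beat the current record
        if len(path) == n:
            total = so_far + cost[path[-1]][start]
            if total < best["cost"]:
                best["cost"], best["tour"] = total, path + [start]
            return
        # branch: try nearer cities first so that a good record is found early
        unvisited = set(range(n)) - set(path)
        for city in sorted(unvisited, key=lambda c: cost[path[-1]][c]):
            extend(path + [city], so_far + cost[path[-1]][city])

    extend([start], 0)
    return best["cost"], best["tour"]


# hypothetical symmetric costs between four cities (not the slide's graph)
cost = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
print(tsp_branch_and_bound(cost))
```

Trying the nearest city first plays the role of the greedy tour on the previous slide: it supplies an early bound, so more of the later branches can be cut.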

Back to finding the MP tree
BNB helps, though it is still exponential...

The MP search tree
[Figure: the search tree, in which taxa are added one at a time to a branch of the growing tree; at the step shown there are 5 branches.]

MP-BNB
[Figure sequence: taxon 4 is added to each branch in turn while the best (minimum) score found so far is tracked as the record. The record starts at 52; the later record values are missing from the transcript.]

MP-BNB
Best tree: MP score = 42. Total number of trees visited: 14.

Order of Evaluation Matters
Evaluate all 3 first. Total trees visited: 9. The bound after searching this subtree will be 42.
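The same branch-and-bound idea can be sketched for the MP tree. The assumptions here are: taxa are added one at a time to every branch of the partial tree, partial trees are scored with Fitch's algorithm (not named in the slides), and a partial tree is abandoned once its score reaches the record, which is safe because adding a taxon can never lower the parsimony score. The sequences and the names t1 to t4 are hypothetical toy data, not the 42-score example from the slides.

```python
def fitch_site(node, column):
    """Return (possible ancestral states, minimum changes) for one column."""
    if isinstance(node, str):
        return {column[node]}, 0
    (ls, lc), (rs, rc) = fitch_site(node[0], column), fitch_site(node[1], column)
    common = ls & rs
    return (common, lc + rc) if common else (ls | rs, lc + rc + 1)


def parsimony_score(tree, seqs):
    n_sites = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: s[i] for name, s in seqs.items()})[1]
        for i in range(n_sites)
    )


def insertions(tree, leaf):
    """Yield every tree made by attaching leaf to one branch of a rooted tree."""
    yield (tree, leaf)                    # attach on the branch above this node
    if not isinstance(tree, str):
        left, right = tree
        for t in insertions(left, leaf):
            yield (t, right)
        for t in insertions(right, leaf):
            yield (left, t)


def mp_branch_and_bound(seqs):
    names = list(seqs)
    best = {"score": float("inf"), "tree": None, "visited": 0}

    def search(subtree, remaining):
        tree = (names[0], subtree)        # the first taxon anchors the tree
        score = parsimony_score(tree, seqs)
        best["visited"] += 1
        if score >= best["score"]:
            return                        # bound: completions cannot score lower
        if not remaining:
            best["score"], best["tree"] = score, tree
            return
        for t in insertions(subtree, remaining[0]):
            search(t, remaining[1:])

    search(names[1], names[2:])
    return best


seqs = {"t1": "AAG", "t2": "AAA", "t3": "GGA", "t4": "AGA"}  # hypothetical data
print(mp_branch_and_bound(seqs))
```

As on the slide, the number of trees visited depends on the order in which branches are tried: the sooner a low-scoring complete tree is found, the more partial trees the bound can discard.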