Class 9: Phylogenetic Trees


1 Class 9: Phylogenetic Trees

2 The Tree of Life

3 Evolution
- Many theories of evolution
- Basic idea:
  - Speciation events lead to the creation of different species
  - Speciation is caused by physical separation into groups in which different genetic variants become dominant
- Any two species share a (possibly distant) common ancestor

4 Phylogenies
- A phylogeny is a tree that describes the sequence of speciation events that led to the formation of a set of current-day species
- Leaves: current-day species
- Nodes: hypothetical most recent common ancestors
- Edge lengths: "time" from one speciation to the next
[Figure: tree with leaves Aardvark, Bison, Chimp, Dog, Elephant]

5
- Until the mid-1950s, phylogenies were constructed by experts based on their opinion (subjective criteria)
- The Linnaean classification scheme implicitly assumes a tree structure
- Since then, the focus has been on objective criteria for constructing phylogenetic trees
  - Thousands of articles in the last decades
- Important for many aspects of biology
  - Classification (systematics)
  - Understanding biological mechanisms

6 Morphological vs. Molecular
- Classical phylogenetic analysis: morphological features
  - Number of legs, lengths of legs, etc.
- Modern biological methods allow the use of molecular features
  - Gene sequences
  - Protein sequences
- Analysis is based on homologous sequences (e.g., globins) in different species

7 Dangers in Molecular Phylogenies
- We have to remember that gene/protein sequences can be homologous for different reasons:
  - Orthologs: sequences diverged after a speciation event
  - Paralogs: sequences diverged after a duplication event
  - Xenologs: sequences diverged after a horizontal transfer (e.g., by a virus)

8 Dangers of Paralogs
[Figure: gene tree showing speciation events and a gene duplication, with leaves 1A, 2A, 3A, 3B, 2B, 1B]

9 Dangers of Paralogs
[Figure: gene tree showing speciation events and a gene duplication, with leaves 1A, 2A, 3A, 3B, 2B, 1B]
- If we only consider 1A, 2B, and 3A...

10 Types of Trees
- A natural model to consider is that of rooted trees
[Figure: rooted tree with the root labeled "Common Ancestor"]

11 Types of Trees
- Depending on the model, data from current-day species does not distinguish between different placements of the root
[Figure: two rooted trees compared side by side]

12 Types of Trees
- An unrooted tree represents the same phylogeny without the root node

13 Positioning Roots in Unrooted Trees
- We can estimate the position of the root by introducing an outgroup:
  - A set of species that are definitely distant from all the species of interest
[Figure: tree with leaves Aardvark, Bison, Chimp, Dog, Elephant, the outgroup Falcon, and the proposed root]

14 Type of Data
- Distance-based
  - Input is a matrix of distances between species
  - A distance can be the fraction of residues on which two species disagree, or the alignment score between them, or …
- Character-based
  - Examine each character (e.g., residue) separately
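As a minimal sketch of the distance-based input, the fraction of disagreeing residues can be computed as follows. The sequences are the six-residue strings that appear on the later "Another Example" slide; the function names `p_distance` and `distance_matrix` are illustrative choices, not part of the lecture, and the sketch assumes the sequences are already aligned to equal length.

```python
from itertools import combinations

# Aligned sequences copied from the "Another Example" slide.
sequences = {
    "Aardvark": "CAGGTA",
    "Bison": "CAGACA",
    "Chimp": "CGGGTA",
    "Dog": "TGCACT",
    "Elephant": "TGCGTA",
}


def p_distance(s, t):
    """Fraction of aligned positions at which s and t differ."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(s, t)) / len(s)


def distance_matrix(seqs):
    """Symmetric distance matrix stored as a dict keyed by (name, name)."""
    d = {(name, name): 0.0 for name in seqs}
    for x, y in combinations(seqs, 2):
        d[x, y] = d[y, x] = p_distance(seqs[x], seqs[y])
    return d


d = distance_matrix(sequences)
print(d["Dog", "Elephant"])  # 3 of 6 positions differ
```

An alignment score, the other option the slide mentions, would need a scoring scheme that the slides do not specify, so it is not sketched here.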

15 Simple Distance-Based Method
Input: distance matrix between species
Outline:
- Cluster species together
- Initially, clusters are singletons
- At each iteration, combine the two "closest" clusters to get a new one

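16 UPGMA Clustering
- Let $C_i$ and $C_j$ be clusters; define the distance between them to be the average pairwise distance
  $d(C_i, C_j) = \frac{1}{|C_i|\,|C_j|} \sum_{p \in C_i} \sum_{q \in C_j} d(p, q)$
- When we combine two clusters, $C_i$ and $C_j$, to form a new cluster $C_k$, then for every other cluster $C_l$
  $d(C_k, C_l) = \frac{|C_i|\, d(C_i, C_l) + |C_j|\, d(C_j, C_l)}{|C_i| + |C_j|}$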
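The clustering loop can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the lecture's own code: the tree is returned as nested tuples `(left, right, height)`, placing each merged node at half the distance between its two clusters is the usual UPGMA convention rather than something the slide states, and the names `upgma` and `symmetric` are hypothetical.

```python
from itertools import combinations


def symmetric(pairs):
    """Expand {(x, y): v} into a dict that also contains (y, x)."""
    d = {}
    for (x, y), v in pairs.items():
        d[x, y] = d[y, x] = v
    return d


def upgma(names, d):
    """Build a rooted tree by repeatedly merging the two closest clusters."""
    # cluster id -> (subtree, number of leaves in the cluster)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {}
    for i, j in combinations(range(len(names)), 2):
        dist[i, j] = dist[j, i] = d[names[i], names[j]]
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest distance.
        i, j = min(combinations(clusters, 2), key=lambda p: dist[p])
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        height = dist[i, j] / 2  # assumed convention: node sits halfway
        # Size-weighted average distance from the new cluster to the rest.
        for other in clusters:
            avg = (n_i * dist[i, other] + n_j * dist[j, other]) / (n_i + n_j)
            dist[next_id, other] = dist[other, next_id] = avg
        clusters[next_id] = ((tree_i, tree_j, height), n_i + n_j)
        next_id += 1
    ((tree, _),) = clusters.values()
    return tree


toy = symmetric({("A", "B"): 2, ("C", "D"): 4, ("A", "C"): 6,
                 ("A", "D"): 6, ("B", "C"): 6, ("B", "D"): 6})
print(upgma(["A", "B", "C", "D"], toy))
```

The toy matrix is made up for the example; ties between equally close pairs are broken by iteration order.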

17 Molecular Clock
- UPGMA implicitly assumes that all distances measure time in the same way
[Figure: two trees over leaves 1, 2, 3, 4]

18 Additivity
- A weaker requirement is additivity
  - In the "real" tree, the distance between two species is the sum of the distances between the intermediate nodes on the path connecting them
[Figure: tree with nodes i, j, k and edge lengths a, b, c]

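19 Consequences of Additivity
- Suppose input distances are additive
- For any three leaves $i$, $j$, $m$, the paths between them meet at a node $k$, so
  $d(i,m) = d(i,k) + d(k,m)$
  $d(j,m) = d(j,k) + d(k,m)$
  $d(i,j) = d(i,k) + d(j,k)$
- Thus
  $d(k,m) = \frac{1}{2}\bigl(d(i,m) + d(j,m) - d(i,j)\bigr)$
[Figure: tree with nodes i, j, k, m and edge lengths a, b, c]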
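A small numeric check may make the consequence of additivity concrete. Assume three leaves i, j, m joined at one internal node k by edges of length a, b, c (the assignment of a, b, c to particular edges is an assumption made for this sketch). Then the three pairwise distances determine all three edge lengths:

```python
def edge_lengths_from_triple(d_ij, d_im, d_jm):
    """Recover the three edge lengths of a star on leaves i, j, m.

    Assumes additive distances: d_ij = a + b, d_im = a + c, d_jm = b + c,
    where a, b, c are the edges from the centre k to i, j, m respectively.
    """
    a = (d_ij + d_im - d_jm) / 2
    b = (d_ij + d_jm - d_im) / 2
    c = (d_im + d_jm - d_ij) / 2  # this is d(k, m)
    return a, b, c


# Made-up edge lengths a=2, b=3, c=4 give distances 5, 6, 7.
print(edge_lengths_from_triple(5, 6, 7))
```

The last line of the function is the identity that lets a distance to an unobserved internal node be computed from leaf-to-leaf distances alone.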

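20 Neighbor Joining
- Can we use this fact to construct trees?
- Let
  $D(i,j) = d(i,j) - (r_i + r_j)$
  where
  $r_i = \frac{1}{|L| - 2} \sum_{k \in L} d(i,k)$
  and $L$ is the set of leaves
Theorem: if $D(i,j)$ is minimal (among all pairs of leaves), then $i$ and $j$ are neighbors in the tree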

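21 Neighbor Joining
- Set $L$ to contain all leaves
Iteration:
- Choose $i, j$ such that $D(i,j)$ is minimal
- Create a new node $k$, and set, for every other $m \in L$,
  $d(k,m) = \frac{1}{2}\bigl(d(i,m) + d(j,m) - d(i,j)\bigr)$
- Remove $i, j$ from $L$, and add $k$
Terminate: when $|L| = 2$, connect the two remaining nodes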
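The iteration can be sketched as below. This is an illustrative implementation, not the lecture's code. It assumes the standard neighbor-joining correction D(i,j) = d(i,j) - (r_i + r_j), with r_i the sum of distances from i divided by |L| - 2, and it assigns branch lengths with the usual formula d(i,k) = (d(i,j) + r_i - r_j) / 2, which is an addition not taken from the slide. The function name, the edge-list output, and the generated labels `node0`, `node1`, ... are hypothetical.

```python
from itertools import combinations


def neighbor_joining(names, d):
    """Return the unrooted tree as a list of (node, node, length) edges.

    d maps (x, y) -> distance for every pair, in at least one order.
    Leaf names must not clash with the generated labels "node0", "node1", ...
    """
    dist = {}
    for x, y in combinations(names, 2):
        v = d[x, y] if (x, y) in d else d[y, x]
        dist[x, y] = dist[y, x] = v
    L = list(names)
    edges = []
    next_id = 0
    while len(L) > 2:
        n = len(L)
        # r[i]: total distance from i to the other nodes, scaled by 1/(n-2).
        r = {i: sum(dist[i, m] for m in L if m != i) / (n - 2) for i in L}
        # Choose the pair minimising D(i,j) = d(i,j) - (r_i + r_j).
        i, j = min(combinations(L, 2),
                   key=lambda p: dist[p] - r[p[0]] - r[p[1]])
        k = f"node{next_id}"
        next_id += 1
        d_ik = (dist[i, j] + r[i] - r[j]) / 2  # assumed branch-length rule
        edges += [(i, k, d_ik), (j, k, dist[i, j] - d_ik)]
        L = [m for m in L if m not in (i, j)]
        for m in L:
            dist[k, m] = dist[m, k] = (dist[i, m] + dist[j, m] - dist[i, j]) / 2
        L.append(k)
    i, j = L
    edges.append((i, j, dist[i, j]))
    return edges


# Additive toy distances from a made-up tree: A-x 1, B-x 2, x-y 3, C-y 4, D-y 5.
toy = {("A", "B"): 3, ("A", "C"): 8, ("A", "D"): 9,
       ("B", "C"): 9, ("B", "D"): 10, ("C", "D"): 9}
for edge in neighbor_joining(["A", "B", "C", "D"], toy):
    print(edge)
```

Because the toy distances are exactly additive, the edges returned reproduce the tree they were generated from; with real data the result is only an estimate.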

22 Distance-Based Methods
- If we make strong assumptions on distances, we can reconstruct trees
- In real life, distances are not additive
- Sometimes they are close to additive

23 Parsimony
- Character-based method
Assumptions:
- Independence of characters (no interactions)
- The best tree is the one in which the fewest changes take place

24 Simple Example
- Suppose we have five species, such that three have 'C' and two have 'T' at a specified position
- The minimal tree has one evolutionary change
[Figure: tree with leaves C, C, C, T, T and a single change between T and C marked on one edge]

25 Another Example
- What is the parsimony score of this tree?
[Figure: tree with leaves Aardvark, Bison, Chimp, Dog, Elephant]
A: CAGGTA
B: CAGACA
C: CGGGTA
D: TGCACT
E: TGCGTA

26 Evaluating Parsimony Scores
- How do we compute the parsimony score for a given tree?
- Weighted parsimony
  - Each change is weighted by the score $c(a,b)$

27 Evaluating Parsimony Scores
Dynamic programming on the tree
Initialization:
- For each leaf $i$, set $S(i,a) = 0$ if $i$ is labeled by $a$; otherwise $S(i,a) = \infty$
Iteration:
- If $k$ is a node with children $i$ and $j$, then
  $S(k,a) = \min_b \bigl(S(i,b) + c(a,b)\bigr) + \min_b \bigl(S(j,b) + c(a,b)\bigr)$
Termination:
- The cost of the tree is $\min_a S(r,a)$, where $r$ is the root
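The recurrence translates almost directly into a recursive function. The sketch below is illustrative: the tree encoding (a leaf is a name, an internal node is a pair of subtrees), the function names, and the unit cost used by default are assumptions. The sequences are those from the example slides, but the topology passed in at the end is hypothetical, since the tree drawn on the slide is not recoverable from the transcript.

```python
import math


def min_cost(tree, leaf_state, alphabet, cost):
    """Weighted parsimony cost of one character on a rooted binary tree."""
    def S(node):
        if isinstance(node, str):  # leaf: 0 for its own label, infinity otherwise
            return {a: 0 if leaf_state[node] == a else math.inf
                    for a in alphabet}
        left, right = node
        S_i, S_j = S(left), S(right)
        return {a: min(S_i[b] + cost(a, b) for b in alphabet)
                   + min(S_j[b] + cost(a, b) for b in alphabet)
                for a in alphabet}
    return min(S(tree).values())


def parsimony_score(tree, sequences, alphabet="ACGT",
                    cost=lambda a, b: 0 if a == b else 1):
    """Sum the per-character costs, treating characters as independent."""
    length = len(next(iter(sequences.values())))
    return sum(
        min_cost(tree, {name: s[pos] for name, s in sequences.items()},
                 alphabet, cost)
        for pos in range(length)
    )


sequences = {"Aardvark": "CAGGTA", "Bison": "CAGACA", "Chimp": "CGGGTA",
             "Dog": "TGCACT", "Elephant": "TGCGTA"}
# Hypothetical topology, chosen only for illustration.
tree = ((("Aardvark", "Bison"), "Chimp"), ("Dog", "Elephant"))
print(parsimony_score(tree, sequences))
```

With the default unit cost this reduces to unweighted parsimony, that is, counting changes; passing a different `cost` function gives the weighted version described on the previous slide.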

28 Example
[Figure: tree with leaves Aardvark, Bison, Chimp, Dog, Elephant]
A: CAGGTA
B: CAGACA
C: CGGGTA
D: TGCACT
E: TGCGTA

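29 Cost of Evaluating Parsimony
- If there are $n$ nodes, $m$ characters, and $k$ possible values for each character, then the complexity is $O(nmk^2)$: each of the $k$ entries $S(\cdot,a)$ at a node takes a minimum over $k$ values of $b$ for each child
- Using this procedure, we can reconstruct the most parsimonious values at each ancestor node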

