Lecture 6A – Introduction to Trees & Optimality Criteria

Presentation transcript:

Branches (edges): an unrooted, fully bifurcating tree of n taxa has 2n-3 branches. In the example tree, branches 1, 2, 4, 6, and 7 are external (they lead to leaves); branches 3 and 5 are internal.
Nodes (vertices): A–E are terminal nodes; x, y, and z are internal nodes.

If we break branch 3, we have two sub-trees: (A,B) and (C,(D,E)). In Newick format, the full tree is written ((A,B),C,(D,E)).

Rooting – The tree is an unrooted tree.

Also note that there is free rotation around nodes:

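The Scope of the Problem. [Table: number of taxa versus number of unrooted trees. Most values were lost in extraction; the surviving fragments suggest roughly 2 million unrooted trees for 10 taxa.]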
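The growth behind this table can be sketched in a few lines. The closed form used here, (2n-5)!! = 3 × 5 × ... × (2n-5) unrooted bifurcating trees for n labelled taxa, is the standard result and is not stated in the surviving transcript, so treat it as an assumption about what the slide tabulated. The function names are hypothetical.

```python
def num_branches(n_taxa: int) -> int:
    """Branches in an unrooted, fully bifurcating tree (2n - 3, from the first slide)."""
    return 2 * n_taxa - 3


def num_unrooted_trees(n_taxa: int) -> int:
    """Count unrooted bifurcating trees for n labelled taxa as (2n - 5)!!.

    Assumption: the standard double-factorial formula, not taken from the slide.
    """
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product for n <= 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 5, 8, 10, 20):
        print(n, num_branches(n), num_unrooted_trees(n))
```

Exhaustive evaluation of every tree quickly becomes impractical as taxa are added, which is why an optimality criterion is usually paired with a search strategy.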

II. Optimality Criteria
A. Parsimony
First, the score of a tree (i.e., its length) for the entire data set is given by:

$$L(\tau) = \sum_{i=1}^{N} w_i \, l_i$$

where $N$ is the number of characters, $l_i$ is the length of character $i$ when optimized on tree $\tau$, and $w_i$ is the weight we assign to character $i$.
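As a minimal sketch of this weighted sum, assuming the per-character lengths have already been computed on a given tree: the function name `tree_length` and the example numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def tree_length(char_lengths, weights=None):
    """Weighted parsimony length: sum over characters of w_i * l_i.

    char_lengths: length l_i of each character on the tree.
    weights: weight w_i per character; defaults to equal weights of 1.
    """
    if weights is None:
        weights = [1] * len(char_lengths)
    if len(weights) != len(char_lengths):
        raise ValueError("need exactly one weight per character")
    return sum(w * l for w, l in zip(weights, char_lengths))


if __name__ == "__main__":
    lengths = [2, 1, 3]  # hypothetical character lengths
    print(tree_length(lengths))             # equal weights
    print(tree_length(lengths, [1, 1, 2]))  # third character counted double
```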

The Fitch Algorithm: state sets and accumulated lengths. At each terminal node we erect a state set (the observed state) and assign an accumulated length of zero. The accumulated length of a node is the minimum number of changes required in the subtree it subtends.

The Fitch Algorithm: state sets and accumulated lengths.
1 – Form the intersection of the state sets of the two daughter nodes. If the intersection is non-empty, assign the intersection as the state set of the internal node. The accumulated length of the internal node is the sum of those of the daughter nodes.
2 – If the intersection is empty, assign the union of the two daughter state sets as the state set of the internal node. The accumulated length is the sum of those of the daughter nodes plus one.
Worked example on the tree:
Intersection empty, take the union: 0 + 0 + 1 = 1
Intersection non-empty: 0 + 0 = 0
Intersection empty, take the union: 1 + 0 + 1 = 2
So $l_i = 2$.
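The two rules can be sketched as a short recursive function. This is an illustration under stated assumptions, not the lecture's own code: the tree is represented as nested 2-tuples with one-letter strings at the tips, and the terminal states G, A, C, C are an assumption chosen to reproduce the arithmetic of the worked example.

```python
def fitch(node):
    """Return (state_set, accumulated_length) for one character.

    node is either a one-letter string (a terminal's observed state)
    or a 2-tuple of daughter nodes (hypothetical representation).
    """
    if isinstance(node, str):
        # Terminal node: state set is the observed state, length zero.
        return {node}, 0

    left, right = node
    left_set, left_len = fitch(left)
    right_set, right_len = fitch(right)

    common = left_set & right_set
    if common:
        # Rule 1: non-empty intersection, no extra change needed.
        return common, left_len + right_len
    # Rule 2: empty intersection, take the union and count one change.
    return left_set | right_set, left_len + right_len + 1


if __name__ == "__main__":
    tree = (("G", "A"), ("C", "C"))  # assumed tip states
    state_set, length = fitch(tree)
    print(sorted(state_set), length)
```

On this assumed tree, the three internal nodes accumulate lengths of 1, 0, and 2, matching the slide.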

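Sankoff Algorithm – Character-state vectors and step-matrices
Step matrix: defines the cost $c_{ij}$ of a change from state $i$ to state $j$.

      A    C    G    T
A    --    4    1    4
C     4   --    4    1
G     1    4   --    4
T     4    1    4   --

Step one: Fill in the character-state vectors for the terminal nodes. Each cell is $s_k(i)$, the cost associated with assigning state $i$ to node $k$.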

Step two: Fill in the vectors for the other nodes, descending the tree.
Node 1:
$s_1(A) = c_{AG} + c_{AA} = 1 + 0 = 1$
$s_1(C) = c_{CG} + c_{CA} = 4 + 4 = 8$
$s_1(G) = c_{GG} + c_{GA} = 0 + 1 = 1$
$s_1(T) = c_{TG} + c_{TA} = 4 + 4 = 8$
Node 2:
$s_2(A) = 4 + 4 = 8$
$s_2(C) = 0 + 0 = 0$
$s_2(G) = 4 + 4 = 8$
$s_2(T) = 1 + 1 = 2$

For nodes below, we must calculate the cost for each possible state assignment for the daughter nodes, taking the minimum over the daughter state $j$:

$$s_3(i) = \min_j \left[ s_1(j) + c_{ij} \right] + \min_j \left[ s_2(j) + c_{ij} \right]$$

In each bracket, the first term comes from the daughter node's vector and the second from the step matrix. So we fill in the character-state vector for node 3:
$s_3(A) = \min[1, 12, 2, 12] + \min[8, 4, 9, 6] = 1 + 4 = 5$
$s_3(C) = \min[5, 8, 5, 9] + \min[12, 0, 12, 3] = 5 + 0 = 5$
$s_3(G) = \min[2, 12, 1, 12] + \min[9, 4, 8, 6] = 1 + 4 = 5$
$s_3(T) = \min[5, 9, 5, 8] + \min[12, 1, 12, 2] = 5 + 1 = 6$
The length of the character on this tree is the smallest entry in the vector for node 3, which is 5.
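The recursion for nodes 1 to 3 can be sketched as follows. This is a minimal illustration rather than the lecture's implementation: the nested-tuple tree, the tip states G, A, C, C, and the names `step_cost` and `sankoff` are assumptions. The cost function encodes the step matrix shown earlier (1 within purines or within pyrimidines, 4 between the two classes, 0 for no change).

```python
import math

STATES = "ACGT"
PURINES = {"A", "G"}


def step_cost(i, j):
    """Step-matrix cost c_ij: 0 if unchanged, 1 for a transition, 4 for a transversion."""
    if i == j:
        return 0
    same_class = (i in PURINES) == (j in PURINES)
    return 1 if same_class else 4


def sankoff(node):
    """Return the character-state vector {state: cost} for the subtree at node."""
    if isinstance(node, str):
        # Terminal: zero cost for the observed state, infinite cost otherwise.
        return {s: (0 if s == node else math.inf) for s in STATES}

    daughters = [sankoff(child) for child in node]
    # For each state i, add up the cheapest way to reach each daughter.
    return {
        i: sum(min(vec[j] + step_cost(i, j) for j in STATES) for vec in daughters)
        for i in STATES
    }


if __name__ == "__main__":
    tree = (("G", "A"), ("C", "C"))  # assumed tip states
    vector = sankoff(tree)
    print(vector, "length =", min(vector.values()))
```

Representing the terminal vectors with infinite cost for unobserved states lets the same minimization formula apply at every internal node, including those whose daughters are terminals.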

Points to note:
1) Two types of weighting are possible: weighting of transformations within characters (which we demonstrated with the step matrix) and weighting among characters, which is reflected in the weighted sum of lengths across characters.
2) One can't compare tree lengths across weighting schemes. In the first example, with all transformations having the same cost, the length of the character on this tree was 2. In the second, with a 4:1 step matrix that up-weights transversions, the length was 5.