Problem Set 2 Solutions Tree Reconstruction Algorithms

Presentation transcript:

Problem Set 2 Solutions
Tree Reconstruction Algorithms
Marc A. Schaub
February 22nd, 2008
CS 262 Problem Session
Based on slides by
- Andreas Sundquist and George Asimenos (problem 1)
- Serafim Batzoglou (tree reconstruction)

Problem 1(a)

Problem 1(b) Baum-Welch: Suppose Forward: Similar for Backward

Problem 1(b) Baum-Welch:

Problem 1(b) Baum-Welch:

Problem 1(b) Baum-Welch: Given Inductive step: ⇒ After training:

Problem 1(b) Viterbi: The Viterbi parse may arbitrarily choose state k over state k' ⇒ $A_{kl} \ne A_{k'l}$ ⇒ $a'_{kl} \ne a'_{k'l}$

Problem 1(c) akl l=1 2 k=0 1 1/2 Akl l=1 2 k=0 1 ek(b) b=x y k=1 1 2 1/2 Akl l=1 2 k=0 1 ek(b) b=x y k=1 1 2 Ek(b) b=x y k=1 3 2 1

Problem 1(c) Viterbi akl l=1 2 k=0 1 1/2 x y 1 .9 .045 .3645 .1640 2 1/2 x y 1 .9 .045 .3645 .1640 2 .405 ek(b) b=x y k=1 1 2

Problem 1(c) Viterbi x y 1 .75 .1688 .1139 .0769 2 .0375 .0084 .0057 akl l=1 2 k=0 1 0.9 0.1 ek(b) b=x y k=1 0.75 0.25 2 0.5 akl l=1 2 k=0 1 ek(b) b=x y k=1 0.75 0.25 2 ?

Additive Distances
[Figure: example tree with nodes numbered 1 to 12; the path between leaves 1 and 4 is marked $d_{1,4}$]
Given a tree, a distance measure is additive if the distance between any pair of leaves is the sum of the lengths of the edges connecting them.
Given a tree T and additive distances $d_{ij}$, the edge lengths can be uniquely reconstructed:
- Find two neighboring leaves i, j with common parent k
- Place parent node k at distance $d_{km} = \frac{1}{2}(d_{im} + d_{jm} - d_{ij})$ from any node $m \ne i, j$
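The placement rule above can be checked numerically. The following is a minimal sketch; the function name and the three distances are hypothetical illustration values, not taken from the slides.

```python
def parent_to_node_distance(d_im, d_jm, d_ij):
    """Distance from the common parent k of leaves i and j to another node m,
    using d_km = 1/2 (d_im + d_jm - d_ij)."""
    return 0.5 * (d_im + d_jm - d_ij)


# Hypothetical additive distances between three leaves 1, 2, 3,
# where leaves 1 and 2 are assumed to be neighbors with parent k.
d_12, d_13, d_23 = 0.5, 0.3, 0.6

d_k3 = parent_to_node_distance(d_13, d_23, d_12)

# Once k is placed, the two leaf edges follow by subtraction along the path to 3.
d_1k = d_13 - d_k3
d_2k = d_23 - d_k3

print(d_k3, d_1k, d_2k)
```

The two leaf edges sum to d_12, which is a quick sanity check that the distances used are consistent with a tree.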

Neighbor-Joining
- Guaranteed to produce the correct tree if the distance is additive
- May produce a good tree even when the distance is not additive
Step 1: Finding neighboring leaves
Define $D_{ij} = (N - 2)\, d_{ij} - \sum_{k \ne i} d_{ik} - \sum_{k \ne j} d_{jk}$
Claim: The above "magic trick" ensures that $D_{ij}$ is minimal iff i, j are neighbors
[Figure: four-leaf example tree with leaves 1, 2, 3, 4 and edge lengths 0.1, 0.1, 0.1, 0.4, 0.4]
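A small numeric check of the claim, as a sketch only. The figure's layout is not recoverable from the transcript, so the code ASSUMES that leaves 1 and 2 share a parent (edges 0.1 and 0.4), leaves 3 and 4 share the other parent (edges 0.1 and 0.4), and the internal edge has length 0.1. The helper names are hypothetical.

```python
from itertools import combinations

# Leaf-to-leaf path lengths under the ASSUMED tree described above.
d = {
    (1, 2): 0.5, (1, 3): 0.3, (1, 4): 0.6,
    (2, 3): 0.6, (2, 4): 0.9, (3, 4): 0.5,
}
leaves = [1, 2, 3, 4]


def dist(i, j):
    """Symmetric lookup into the upper-triangular table d."""
    return d[(min(i, j), max(i, j))]


def nj_criterion(i, j):
    """D_ij = (N - 2) d_ij - sum_{k != i} d_ik - sum_{k != j} d_jk."""
    n = len(leaves)
    sum_i = sum(dist(i, k) for k in leaves if k != i)
    sum_j = sum(dist(j, k) for k in leaves if k != j)
    return (n - 2) * dist(i, j) - sum_i - sum_j


pairs = list(combinations(leaves, 2))
closest = min(pairs, key=lambda p: dist(*p))        # smallest raw distance
best = min(pairs, key=lambda p: nj_criterion(*p))   # smallest D_ij

print("closest pair by d:", closest)
print("best pair by D:", best)
```

Under these assumed edge lengths the closest pair by raw distance is not a pair of neighbors, while the pair minimizing D is; the two true neighbor pairs tie on D, so either may be reported.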

Algorithm: Neighbor-joining
Initialization:
- Define T to be the set of leaf nodes, one per sequence
- Let L = T
Iteration:
- Pick i, j such that $D_{ij}$ is minimal
- Define a new node k, and set $d_{km} = \frac{1}{2}(d_{im} + d_{jm} - d_{ij})$ for all $m \in L$
- Add k to T, with edges of lengths $d_{ik} = \frac{1}{2}(d_{ij} + r_i - r_j)$ and $d_{jk} = d_{ij} - d_{ik}$, where $r_i = (N - 2)^{-1} \sum_{k \ne i} d_{ik}$
- Remove i, j from L; add k to L
Termination:
- When L consists of two nodes i, j, add the edge between them, of length $d_{ij}$
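The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function name, the dict-of-dicts input format, and the internal node labels ("k0", "k1", ...) are assumptions, and N is taken to be the current size of L at each iteration, which the slide does not state explicitly.

```python
from itertools import combinations


def neighbor_joining(dist):
    """dist: symmetric dict-of-dicts of distances between leaf labels.
    Returns a list of tree edges (node_a, node_b, length)."""
    d = {i: dict(row) for i, row in dist.items()}  # work on a copy
    active = list(d)                               # the set L
    edges = []                                     # edges of the tree T
    new_nodes = 0
    while len(active) > 2:
        n = len(active)                            # N = |L| (assumption)
        total = {i: sum(d[i][m] for m in active if m != i) for i in active}
        # Pick i, j minimizing D_ij = (N - 2) d_ij - sum_i - sum_j.
        i, j = min(
            combinations(active, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - total[p[0]] - total[p[1]],
        )
        k = f"k{new_nodes}"                        # hypothetical label
        new_nodes += 1
        # Edge lengths from i and j to the new parent k.
        r_i, r_j = total[i] / (n - 2), total[j] / (n - 2)
        d_ik = 0.5 * (d[i][j] + r_i - r_j)
        d_jk = d[i][j] - d_ik
        edges += [(i, k, d_ik), (j, k, d_jk)]
        # Distances from k to every remaining node m.
        d[k] = {}
        for m in active:
            if m not in (i, j):
                d[k][m] = d[m][k] = 0.5 * (d[i][m] + d[j][m] - d[i][j])
        active = [m for m in active if m not in (i, j)] + [k]
    i, j = active
    edges.append((i, j, d[i][j]))                  # final remaining edge
    return edges


# Hypothetical additive distances on four leaves (illustration values only).
example = {
    "1": {"2": 0.5, "3": 0.3, "4": 0.6},
    "2": {"1": 0.5, "3": 0.6, "4": 0.9},
    "3": {"1": 0.3, "2": 0.6, "4": 0.5},
    "4": {"1": 0.6, "2": 0.9, "3": 0.5},
}
tree_edges = neighbor_joining(example)
for edge in tree_edges:
    print(edge)
```

Ties in $D_{ij}$ are broken by iteration order here, so the internal labels and the order of the edges may vary, but for additive input the recovered edge lengths should not.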

Parsimony – direct method not using distances
One of the most popular methods:
GIVEN a multiple alignment
FIND a tree and a history of substitutions explaining the alignment
Idea: Find the tree that explains the observed sequences with a minimal number of substitutions
Two computational subproblems:
- Find the parsimony cost of a given tree (easy)
- Search through all tree topologies (hard)
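To give a sense of why the search over topologies is the hard part, the sketch below counts unrooted binary tree topologies on n labeled leaves using the standard formula (2n - 5)!! = 3 · 5 · ... · (2n - 5). The formula is not stated on the slide, and the function name is hypothetical.

```python
def num_unrooted_topologies(n):
    """Number of unrooted binary tree topologies with n >= 3 labeled leaves,
    computed as the double factorial (2n - 5)!!."""
    count = 1
    for factor in range(3, 2 * n - 4, 2):  # 3, 5, ..., 2n - 5
        count *= factor
    return count


for n in (3, 4, 5, 10):
    print(n, num_unrooted_topologies(n))
```

The count grows faster than exponentially in n, which is why exhaustive search is only feasible for small numbers of sequences.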

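Example: Parsimony cost of one column
[Figure: tree for one alignment column with leaf sets {A}, {B}, {A}, {A}; internal nodes are labeled {A, B} (where the cost is incremented, C += 1) and {A}; final cost C = 1]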

Parsimony Scoring
Given a tree and an alignment column u, label internal nodes to minimize the number of required substitutions
Initialization:
- Set cost C = 0; node k = 2N − 1 (the root)
Iteration:
- If k is a leaf, set $R_k = \{ x_k[u] \}$ // $R_k$ is simply the character of the kth species
- If k is not a leaf, let i, j be the daughter nodes:
  set $R_k = R_i \cap R_j$ if the intersection is nonempty
  set $R_k = R_i \cup R_j$, and C += 1, if the intersection is empty
Termination:
- Minimal cost of the tree for column u is C
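The scoring recursion above can be sketched as follows. The tree encoding (nested 2-tuples with single-character strings at the leaves) and the function name are assumptions made for illustration, and the example topology is hypothetical.

```python
def parsimony_cost(tree):
    """Return (R, C) for one alignment column: the candidate character set R
    at the root of `tree` and the minimal number of substitutions C.
    A leaf is a single-character string; an internal node is a (left, right) tuple."""
    if isinstance(tree, str):
        return {tree}, 0                   # leaf: R_k = { x_k[u] }, no cost
    left, right = tree
    r_i, c_i = parsimony_cost(left)
    r_j, c_j = parsimony_cost(right)
    common = r_i & r_j
    if common:                             # nonempty intersection: no substitution
        return common, c_i + c_j
    return r_i | r_j, c_i + c_j + 1        # empty intersection: union, C += 1


# Hypothetical four-leaf tree for a column with characters A, B, A, A.
root_set, cost = parsimony_cost((("A", "B"), ("A", "A")))
print(root_set, cost)
```

Returning the cost alongside the set avoids a global counter; the recursion visits each node once, so scoring one column is linear in the number of nodes.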

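Example
[Figure: worked example on a larger tree with leaf characters A, A, A, A, B, B, A, B; nodes are labeled with the sets {B}, {A,B}, {A}, {B}, {A}, {A,B}, {A}, {A}, {A}, {A}]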