Problem Set 2 Solutions Tree Reconstruction Algorithms


1 Problem Set 2 Solutions Tree Reconstruction Algorithms
Marc A. Schaub, February 22nd, 2008
CS 262 Problem Session
Based on slides by
- Andreas Sundquist and George Asimenos (problem 1)
- Serafim Batzoglou (tree reconstruction)

2 Problem 1(a)

3 Problem 1(b) Baum-Welch: Suppose Forward: Similar for Backward

4 Problem 1(b) Baum-Welch:

5 Problem 1(b) Baum-Welch:

6 Problem 1(b) Baum-Welch: Given Inductive step:  After training:

7 Problem 1(b) Viterbi: The Viterbi parse may arbitrarily choose state k over state k' ⇒ A_kl ≠ A_k'l ⇒ a'_kl ≠ a'_k'l

8 Problem 1(c)
(table layout lost; values in extracted order)
a_kl: l=1 2 k=0 1 1/2
A_kl: l=1 2 k=0 1
e_k(b): b=x y k=1 1 2
E_k(b): b=x y k=1 3 2 1

9 Problem 1(c) Viterbi
(table layout lost; values in extracted order)
a_kl: l=1 2 k=0 1 1/2
Viterbi table: x y 1 .9 .045 .3645 .1640 2 .405
e_k(b): b=x y k=1 1 2

10 Problem 1(c) Viterbi
(table layout lost; values in extracted order)
Viterbi table: x y 1 .75 .1688 .1139 .0769 2 .0375 .0084 .0057
a_kl: l=1 2 k=0 1 0.9 0.1
e_k(b): b=x y k=1 0.75 0.25 2 0.5
a_kl: l=1 2 k=0 1
e_k(b): b=x y k=1 0.75 0.25 2 ?

11 Additive Distances
[Figure: tree with nodes labeled 1 to 12; the distance between leaves 1 and 4 is marked d_1,4]
Given a tree, a distance measure is additive if the distance between any pair of leaves is the sum of the lengths of the edges connecting them.
Given a tree T and additive distances d_ij, the edge lengths can be uniquely reconstructed:
- Find two neighboring leaves i, j with common parent k
- Place parent node k at distance d_km = ½ (d_im + d_jm − d_ij) from any node m ≠ i, j
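The parent-placement formula can be checked numerically. The sketch below uses a made-up five-edge tree (the node names and edge lengths are hypothetical, not taken from the slide's figure), computes leaf-to-leaf distances as path sums, and compares the formula's d_km with the true path length from the parent k.

```python
from collections import defaultdict

# Hypothetical tree (not from the slides): leaves a, b hang off parent k;
# leaves c, d hang off parent p; k and p are joined by an internal edge.
EDGES = [("a", "k", 2.0), ("b", "k", 3.0), ("k", "p", 1.0),
         ("c", "p", 4.0), ("d", "p", 5.0)]

def build_adjacency(edges):
    """Undirected adjacency map: node -> {neighbor: edge length}."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return adj

def path_length(adj, src, dst, seen=frozenset()):
    """Sum of edge lengths on the unique path from src to dst in a tree."""
    if src == dst:
        return 0.0
    for nxt, w in adj[src].items():
        if nxt not in seen:
            rest = path_length(adj, nxt, dst, seen | {src})
            if rest is not None:
                return w + rest
    return None  # dst is not reachable through this branch

def parent_distance(d_im, d_jm, d_ij):
    """d_km = 1/2 (d_im + d_jm - d_ij) for neighbors i, j with parent k."""
    return 0.5 * (d_im + d_jm - d_ij)

ADJ = build_adjacency(EDGES)
d_ab = path_length(ADJ, "a", "b")
for m in ("c", "d"):
    from_formula = parent_distance(path_length(ADJ, "a", m),
                                   path_length(ADJ, "b", m), d_ab)
    # The two printed numbers agree when the distances are additive.
    print(m, from_formula, path_length(ADJ, "k", m))
```

For this tree the formula gives 5.0 for m = c and 6.0 for m = d, matching the path lengths from k.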

12 Neighbor-Joining
Guaranteed to produce the correct tree if the distance is additive
May produce a good tree even when the distance is not additive
Step 1: Finding neighboring leaves
Define D_ij = (N − 2) d_ij − Σ_{k≠i} d_ik − Σ_{k≠j} d_jk
Claim: The above "magic trick" ensures that a pair i, j minimizing D_ij are neighbors
[Figure: four-leaf tree with leaves 1, 2, 3, 4 and edge lengths 0.1, 0.1, 0.1, 0.4, 0.4]
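A small numeric check shows why the raw distance is not enough to find neighbors. The sketch assumes one reading of the four-leaf figure: leaves 1 and 2 are neighbors, leaves 3 and 4 are neighbors, the pendant edges are 0.1 for leaves 1 and 3 and 0.4 for leaves 2 and 4, and the internal edge is 0.1. That topology is an assumption, since only the labels survive in the transcript.

```python
from itertools import combinations

# Assumed reading of the figure: leaves 1 and 2 share one internal node,
# leaves 3 and 4 share the other; pendant edges are 0.1 (leaves 1, 3) and
# 0.4 (leaves 2, 4); the internal edge is 0.1.
d = {
    (1, 2): 0.5, (1, 3): 0.3, (1, 4): 0.6,
    (2, 3): 0.6, (2, 4): 0.9, (3, 4): 0.5,
}
leaves = [1, 2, 3, 4]

def dist(i, j):
    """Symmetric lookup into the upper-triangular table d."""
    return d[(min(i, j), max(i, j))]

def nj_criterion(i, j):
    """D_ij = (N - 2) d_ij - sum_{k != i} d_ik - sum_{k != j} d_jk."""
    n = len(leaves)
    row_i = sum(dist(i, k) for k in leaves if k != i)
    row_j = sum(dist(j, k) for k in leaves if k != j)
    return (n - 2) * dist(i, j) - row_i - row_j

pairs = list(combinations(leaves, 2))
closest = min(pairs, key=lambda p: dist(*p))        # smallest raw distance
best_D = min(pairs, key=lambda p: nj_criterion(*p))  # smallest D_ij
print("closest pair by d:", closest)
print("minimizing pair by D:", best_D)
```

Under these assumed lengths the closest pair by d is (1, 3) at 0.3, which are not neighbors, while D is about −2.4 for the neighbor pairs (1, 2) and (3, 4) and about −2.2 for every other pair.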

13 Algorithm: Neighbor-joining
Initialization:
- Define T to be the set of leaf nodes, one per sequence
- Let L = T
Iteration:
- Pick i, j such that D_ij is minimal
- Define a new node k, and set d_km = ½ (d_im + d_jm − d_ij) for all m ∈ L
- Add k to T, with edges of lengths d_ik = ½ (d_ij + r_i − r_j) and d_jk = d_ij − d_ik, where r_i = (N − 2)^(−1) Σ_{k≠i} d_ik
- Remove i, j from L; add k to L
Termination:
- When L consists of two nodes i, j, add the edge between them with length d_ij
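The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference implementation: it takes N to be the number of nodes currently in L (an assumption about the slide's notation), stores distances in a dictionary keyed by `frozenset` pairs, and names new nodes `internal0`, `internal1`, ... (hypothetical labels). The example input reuses the four-leaf distances under the assumed topology where (1, 2) and (3, 4) are neighbors.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Return a list of (node, node, length) edges.

    names: leaf labels; dist: {frozenset({a, b}): distance} for all leaf pairs.
    """
    d = dict(dist)            # working copy; grows as internal nodes are added
    active = list(names)      # the set L from the slide
    edges = []
    next_id = 0
    while len(active) > 2:
        n = len(active)       # assumption: N is the current size of L
        total = {i: sum(d[frozenset((i, k))] for k in active if k != i)
                 for i in active}

        def big_d(pair):
            # D_ij = (N - 2) d_ij - sum_k d_ik - sum_k d_jk
            i, j = pair
            return (n - 2) * d[frozenset(pair)] - total[i] - total[j]

        i, j = min(combinations(active, 2), key=big_d)
        d_ij = d[frozenset((i, j))]
        r_i, r_j = total[i] / (n - 2), total[j] / (n - 2)
        k = f"internal{next_id}"   # hypothetical label for the new node
        next_id += 1
        d_ik = 0.5 * (d_ij + r_i - r_j)
        edges.append((i, k, d_ik))
        edges.append((j, k, d_ij - d_ik))
        for m in active:
            if m not in (i, j):
                d[frozenset((k, m))] = 0.5 * (
                    d[frozenset((i, m))] + d[frozenset((j, m))] - d_ij)
        active = [m for m in active if m not in (i, j)] + [k]
    i, j = active              # termination: join the last two nodes
    edges.append((i, j, d[frozenset((i, j))]))
    return edges

# Example: additive distances from the assumed four-leaf tree.
raw = {("1", "2"): 0.5, ("1", "3"): 0.3, ("1", "4"): 0.6,
       ("2", "3"): 0.6, ("2", "4"): 0.9, ("3", "4"): 0.5}
example = neighbor_joining(["1", "2", "3", "4"],
                           {frozenset(p): v for p, v in raw.items()})
for edge in example:
    print(edge)
```

Because these distances are additive, the recovered edge lengths are 0.1, 0.1, 0.1, 0.4, 0.4 up to floating-point rounding. Ties in D_ij are broken by iteration order, so the order in which edges are reported may vary with the input order.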

14 Parsimony – direct method not using distances
One of the most popular methods:
GIVEN a multiple alignment
FIND a tree and a history of substitutions explaining the alignment
Idea: Find the tree that explains the observed sequences with a minimal number of substitutions
Two computational subproblems:
- Find the parsimony cost of a given tree (easy)
- Search through all tree topologies (hard)

15 Example: Parsimony cost of one column
[Figure: tree with leaf sets {A}, {B}, {A}, {A}; internal nodes labeled {A, B} (cost C += 1) and {A}; final cost C = 1]

16 Parsimony Scoring
Given a tree and an alignment column u, label internal nodes to minimize the number of required substitutions
Initialization:
- Set cost C = 0; node k = 2N − 1 (the root)
Iteration:
- If k is a leaf, set R_k = { x_k[u] } // R_k is simply the character of the kth species
- If k is not a leaf, let i, j be the daughter nodes:
  - Set R_k = R_i ∩ R_j if the intersection is nonempty
  - Set R_k = R_i ∪ R_j, and C += 1, if the intersection is empty
Termination:
- Minimal cost of the tree for column u = C
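The scoring procedure can be sketched as a short recursion. The representation here is an assumption made for illustration: a leaf is a one-character string (the character x_k[u] for that species), and an internal node is a pair of subtrees. The function name `parsimony_cost` is hypothetical.

```python
def parsimony_cost(tree):
    """Return (R, C) for one alignment column.

    tree: a leaf character such as "A", or a pair (left, right) of subtrees.
    R is the candidate character set at the root of `tree`;
    C is the minimal number of substitutions within `tree`.
    """
    if isinstance(tree, str):          # leaf: R_k = {x_k[u]}, no cost
        return {tree}, 0
    left, right = tree
    r_i, c_i = parsimony_cost(left)
    r_j, c_j = parsimony_cost(right)
    common = r_i & r_j
    if common:                         # nonempty intersection: no new substitution
        return common, c_i + c_j
    return r_i | r_j, c_i + c_j + 1    # empty intersection: union, C += 1

# Leaves A, B, A, A, assuming the topology ((A, B), (A, A)).
root_set, cost = parsimony_cost((("A", "B"), ("A", "A")))
print(root_set, cost)  # {'A'} 1
```

For a full alignment, the tree's cost would be the sum of this value over all columns u, each scored independently.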

17 Example
[Figure: worked parsimony labeling; node sets shown: {B}, {A,B}, {A}, {B}, {A}, {A,B}, {A}, {A}, {A}, {A}; characters shown: A A A A B B A B]

