Recitation 5 2/4/09 ML in Phylogeny

Comp. Genomics Recitation 5 2/4/09 ML in Phylogeny Based on Slides by Ron Shamir and Nir Friedman

Outline
Maximum likelihood (ML)
ML in phylogeny
Ancestral sequence reconstruction using ML

Maximum likelihood
One of the methods for parameter estimation.
Likelihood: $L = P(\text{Data} \mid \text{Parameters})$
Simple example: a coin with P(head) = p is tossed 10 times, giving 6 heads and 4 tails:
$L = P(\text{Data} \mid \text{Params}) = \binom{10}{6} p^6 (1-p)^4$

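Maximum likelihood
We want to find the p that maximizes $L = \binom{10}{6} p^6 (1-p)^4$.
Infi 1, remember? Log is a monotonic function, so we can optimize log L instead:
$\log L = \log\left[\binom{10}{6} p^6 (1-p)^4\right] = \log\binom{10}{6} + 6\log p + 4\log(1-p)$
Differentiating with respect to p and setting the derivative to zero, we get: $6/p - 4/(1-p) = 0$
Estimate for p: 0.6 (makes sense?)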
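The coin example can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `log_likelihood` and the grid resolution are illustrative choices. It compares the closed-form estimate with a simple grid search over p.

```python
import math

def log_likelihood(p, heads=6, tails=4):
    """Binomial log-likelihood of observing `heads` heads and `tails` tails."""
    n = heads + tails
    return (math.log(math.comb(n, heads))
            + heads * math.log(p)
            + tails * math.log(1 - p))

# Closed-form maximizer obtained from 6/p - 4/(1-p) = 0.
p_closed = 6 / (6 + 4)

# Numerical check: scan candidate values of p strictly inside (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=log_likelihood)

print(p_closed, p_grid)
```

Both approaches agree on the fraction of heads observed, which is the general form of the ML estimate for a binomial parameter.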

Likelihood of a Tree
Input (small problem): n sequences and a tree T with labels on the leaves (X).
Find the optimal labeled tree: a labeling of the internal nodes (Y) and branch lengths (b) maximizing the likelihood P(X|T,Y,b).

Likelihood (2)
How to compute P(X|T,Y,b)? Assumptions:
Each character is independent.
The branching is a Markov process: the probability of a node having a given label is a function only of the parent node's label and the branch length t between them.
The probabilities P(x|y,t) are known.

Example
[Figure: example tree with nodes x1, x2, x3, x4, x5 and branch lengths t1, t2, t3, t5]

What if we want P(X|T,b)?
Assume that the branch lengths b are known.
Independence of sites.
Markov property: independence of each branch.
(ALGMB, December 01 © Ron Shamir, TAU)
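The formula itself did not survive on this slide. As a hedged sketch of what the two assumptions imply, the likelihood without a fixed internal labeling sums over all internal labelings Y at each site and multiplies over sites. The symbols r (the root), m (the number of sites), $(u \to v)$ (an edge from parent u to child v) and $t_{uv}$ (its length) are notation assumed here, not taken from the slides.

```latex
P(X \mid T, b)
  = \prod_{j=1}^{m} \; \sum_{Y_j} \; P(r_j)
    \prod_{(u \to v) \in T} P(v_j \mid u_j, t_{uv})
```

The outer product uses the independence of sites, and the inner product over edges uses the Markov property.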

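Properties of P
Additivity: $\sum_b P(b \mid a, t)\,P(c \mid b, s) = P(c \mid a, t+s)$
Reversibility: $P(a)\,P(b \mid a, t) = P(b)\,P(a \mid b, t)$
Reversibility allows the root to be moved freely.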
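Both properties can be checked numerically for a concrete substitution model. The sketch below assumes the Jukes-Cantor model with a uniform prior, which the slides do not specify; `jc_prob`, `additivity_gap` and `reversibility_gap` are hypothetical helper names.

```python
import math

STATES = "ACGT"

def jc_prob(a, b, t):
    """Jukes-Cantor probability that state a becomes b along a branch of
    length t (expected substitutions per site). Assumed model."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def additivity_gap(t, s):
    """Largest violation of sum_b P(a->b, t) P(b->c, s) = P(a->c, t+s)."""
    return max(
        abs(sum(jc_prob(a, b, t) * jc_prob(b, c, s) for b in STATES)
            - jc_prob(a, c, t + s))
        for a in STATES for c in STATES
    )

def reversibility_gap(t, prior=0.25):
    """Largest violation of P(a) P(a->b, t) = P(b) P(b->a, t),
    using a uniform prior over the four states."""
    return max(
        abs(prior * jc_prob(a, b, t) - prior * jc_prob(b, a, t))
        for a in STATES for b in STATES
    )

print(additivity_gap(0.1, 0.3), reversibility_gap(0.2))
```

Both gaps should be zero up to floating-point rounding.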

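Efficient Likelihood Calculation (Felsenstein ’73)
Use dynamic programming, similar to parsimony.
Need $S_j(v,a)$ = Pr(subtree rooted in v | $v_j = a$).
Initialization: for each leaf v, set $S_j(v,a) = 1$ if v is labeled by a at site j; otherwise $S_j(v,a) = 0$.
Recursion: traverse the tree in postorder; for each node v with children u and w, and for each state a:
$S_j(v,a) = \left[\sum_x S_j(u,x)\,P(x \mid a, t_{vu})\right] \left[\sum_y S_j(w,y)\,P(y \mid a, t_{vw})\right]$
where $t_{vu}$ and $t_{vw}$ are the lengths of the branches from v to u and from v to w.
Complexity: $O(nmk^2)$ for n species, m characters, k states.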
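The postorder recursion can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the slides' implementation: the Jukes-Cantor model, the uniform root prior, the nested-tuple tree encoding, and the names `conditional` and `tree_likelihood` are all chosen here for the example.

```python
import math

STATES = "ACGT"

def jc_prob(a, b, t):
    """Jukes-Cantor transition probability (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def conditional(node, site):
    """Return {a: S_j(node, a)} for site j = `site`.
    A leaf is a sequence string; an internal node is a tuple of
    (child, branch_length) pairs."""
    if isinstance(node, str):  # leaf: indicator of the observed state
        return {a: 1.0 if node[site] == a else 0.0 for a in STATES}
    table = {a: 1.0 for a in STATES}
    for child, t in node:  # postorder: children are solved first
        child_table = conditional(child, site)
        for a in STATES:
            table[a] *= sum(jc_prob(a, x, t) * child_table[x] for x in STATES)
    return table

def tree_likelihood(tree, n_sites, prior=0.25):
    """Multiply per-site likelihoods, summing over the root state."""
    likelihood = 1.0
    for site in range(n_sites):
        root_table = conditional(tree, site)
        likelihood *= sum(prior * root_table[a] for a in STATES)
    return likelihood

# Hypothetical example: three leaves with two sites each.
inner = (("AC", 0.1), ("AC", 0.2))
tree = ((inner, 0.1), ("CC", 0.3))
print(tree_likelihood(tree, n_sites=2))
```

Each internal node does k^2 work per site, which matches the O(nmk^2) bound stated on the slide.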

Ancestral sequence reconstruction
Input: a rooted tree with extant (leaf) sequences, a substitution matrix, and branch lengths.
Problem: find the assignment of states to the internal nodes that maximizes the total tree likelihood.

Solving ancestral sequence reconstruction
Simple with parsimony methods, through the Fitch/Sankoff algorithms.
Here, we’re interested in ML: maximizing P(ancestral S | contemporary S).
Joint vs. marginal:
Marginal: focus on a single node (e.g., the root) and maximize its likelihood.
Joint: infer all the sequences together.

Solutions
We can enumerate all the possible ancestral states and check their likelihood…
$c^n$ possible combinations per character (c: number of states, n: number of internal nodes).
Inapplicable when the tree is large.
Koshi and Goldstein (1996): fast algorithm for marginal reconstruction.
Pupko, Pe’er, Shamir and Graur (2000): fast algorithm for joint reconstruction.
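To make the cost of enumeration concrete, here is a minimal brute-force sketch for a single site. The tree, leaf states, branch lengths, Jukes-Cantor model and uniform root prior are all hypothetical choices for illustration; none of them come from the slides.

```python
import itertools
import math

STATES = "ACGT"

def jc_prob(a, b, t):
    """Jukes-Cantor transition probability (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

# Hypothetical rooted tree as (parent, child, branch length) edges.
EDGES = [("r", "u", 0.1), ("r", "l3", 0.3), ("u", "l1", 0.1), ("u", "l2", 0.2)]
LEAVES = {"l1": "A", "l2": "A", "l3": "C"}  # observed states at one site
INTERNAL = ["r", "u"]                        # nodes to reconstruct

def joint_likelihood(assignment, prior=0.25):
    """Likelihood of one full labeling: root prior times one term per edge."""
    labels = {**LEAVES, **assignment}
    like = prior
    for parent, child, t in EDGES:
        like *= jc_prob(labels[parent], labels[child], t)
    return like

def brute_force():
    """Try all c^n assignments of states to the internal nodes."""
    candidates = (dict(zip(INTERNAL, combo))
                  for combo in itertools.product(STATES, repeat=len(INTERNAL)))
    best = max(candidates, key=joint_likelihood)
    return best, joint_likelihood(best)

best_assignment, best_like = brute_force()
print(best_assignment, best_like)
```

With 4 states and 2 internal nodes there are only 16 candidates, but the count grows as 4^n, which is why this approach does not scale to large trees.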

Basics
We assume different sites evolve independently, so we work one site at a time.
$P_{ij}(t)$: the probability that state i is replaced by state j along a branch of length t.
We want to maximize P(v|Data) = P(Data|v)·P(v)/P(Data), where P(Data) is constant.

DP to the rescue
DP is often suitable for tree problems.
Idea: start from the leaves and climb up the tree. The subtree under every node depends only on the state of its parent!
For node x compute $L_x(i)$ and $C_x(i)$:
$L_x(i)$: the likelihood of the best reconstruction of x’s subtree, given that its parent is assigned state i.
$C_x(i)$: the state of x that gives rise to this likelihood.

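Algorithm phase I
Initialization: for a leaf y assigned state j: $C_y(i) = j$, $L_y(i) = P_{ij}(t_y)$, where $t_y$ is the length of the branch leading to y.
Progression: for an internal node z with sons x, y already visited, for each i we compute
$L_z(i) = \max_j P_{ij}(t_z)\,L_x(j)\,L_y(j)$ and $C_z(i) = \arg\max_j P_{ij}(t_z)\,L_x(j)\,L_y(j)$
Termination: for the root with sons x, y, z, choose the state k maximizing $P(k)\,L_x(k)\,L_y(k)\,L_z(k)$, where P(k) is the prior probability of state k.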

Algorithm phase II: “Traceback”
Traverse the tree from the root to the leaves.
For every internal node x whose father y has already been reconstructed with state i, reconstruct the state of x as $C_x(i)$.
Continue until all the nodes are reconstructed.
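The two phases can be sketched together for a single site. This is a minimal illustration under assumptions, not the published implementation: the Jukes-Cantor model, the uniform root prior, the dictionary-based tree encoding, the example tree, and the function name `joint_reconstruction` are all chosen here.

```python
import math

STATES = "ACGT"

def jc_prob(a, b, t):
    """Jukes-Cantor transition probability (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def joint_reconstruction(children, branch, leaf_state, root, prior=0.25):
    """Return (states of internal nodes, likelihood of the best labeling)."""
    L, C = {}, {}

    def up(x):
        """Phase I: fill L_x(i) and C_x(i) in postorder."""
        if x in leaf_state:
            j = leaf_state[x]
            L[x] = {i: jc_prob(i, j, branch[x]) for i in STATES}
            C[x] = {i: j for i in STATES}
            return
        for ch in children[x]:
            up(ch)
        if x == root:  # the root has no parent; handled at termination
            return
        L[x], C[x] = {}, {}
        for i in STATES:
            scores = {j: jc_prob(i, j, branch[x])
                         * math.prod(L[ch][j] for ch in children[x])
                      for j in STATES}
            C[x][i] = max(scores, key=scores.get)
            L[x][i] = scores[C[x][i]]

    up(root)
    # Termination: best root state k, weighted by its prior.
    root_scores = {k: prior * math.prod(L[ch][k] for ch in children[root])
                   for k in STATES}
    k = max(root_scores, key=root_scores.get)
    states = {root: k}

    def down(x):
        """Phase II: traceback from the root towards the leaves."""
        for ch in children.get(x, ()):
            if ch not in leaf_state:
                states[ch] = C[ch][states[x]]
                down(ch)

    down(root)
    return states, root_scores[k]

# Hypothetical example tree: root r with sons u and l3; u with sons l1, l2.
children = {"r": ["u", "l3"], "u": ["l1", "l2"]}
branch = {"u": 0.1, "l3": 0.3, "l1": 0.1, "l2": 0.2}  # branch above each node
leaf_state = {"l1": "A", "l2": "A", "l3": "C"}
states, likelihood = joint_reconstruction(children, branch, leaf_state, "r")
print(states, likelihood)
```

Each table entry is a maximum over c states, so filling all entries takes O(nc^2) time per site, compared with c^n labelings for exhaustive enumeration.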

Example

Complexity
For n internal nodes and c possible states we compute a DP table of O(nc) cells. As we maximize in every cell over c states, the time is $O(nc^2)$. As c is constant, this is O(n).
