1
Algebraic Statistics for Computational Biology
Lior Pachter and Bernd Sturmfels
Ch. 5: Parametric Inference, R. Mihaescu
Presentation: Angelina Vidali
Algebraic & Geometric Algorithms in Molecular Biology
Instructor: I. Emiris
2
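A convenient algebraic structure for stating dynamic programming algorithms: the tropical semiring $(\mathbb{R} \cup \{\infty\}, \oplus, \odot)$.
Tropical arithmetic: $x \oplus y = \min(x, y)$, $x \odot y = x + y$.
Its natural higher-dimensional generalization: the polytope algebra $(\mathcal{P}_d, \oplus, \odot)$, with $P \oplus Q = \operatorname{conv}(P \cup Q)$ (convex hull) and $P \odot Q = P + Q$ (Minkowski sum).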
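As a minimal sketch of tropical arithmetic in the min-plus convention (the function names `t_add` and `t_mul` are illustrative and do not come from the slides), the two operations and their neutral elements can be written as:

```python
import math

INF = math.inf   # neutral element of tropical addition (the tropical "zero")
ONE = 0.0        # neutral element of tropical multiplication (the tropical "one")

def t_add(x, y):
    """Tropical sum: the minimum of the two arguments."""
    return min(x, y)

def t_mul(x, y):
    """Tropical product: ordinary addition."""
    return x + y

# Distributivity, which is what lets dynamic programming recursions carry over:
# 2 (*) (3 (+) 5) equals (2 (*) 3) (+) (2 (*) 5)
left = t_mul(2, t_add(3, 5))
right = t_add(t_mul(2, 3), t_mul(2, 5))
```

Both `left` and `right` evaluate to 5 here, since min and + satisfy the distributive law.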
3
Inference
Observation: $\sigma_1, \dots, \sigma_n$: known biological data.
Hidden random variables $X_1, \dots, X_m$: unknown biological data, e.g. how do two sequences align?
From the observed random variables $Y_1 = \sigma_1, \dots, Y_n = \sigma_n$ we want to infer values for the hidden random variables $X_1, \dots, X_m$.
MAP estimation: given an observation $\sigma_1, \dots, \sigma_n$, which is the most probable explanation $X_1 = h_1, \dots, X_m = h_m$?
Model parameters give the transition probabilities $p_{h\sigma}$: hidden state $h$ → observed state $\sigma$.
4
Hidden Markov Model (HMM)
Observation: $\sigma_1, \dots, \sigma_n$
We want to compute an explanation for the observation: the sequence $h_1, \dots, h_m$ which yields the maximum a posteriori probability (MAP):
$\hat{h} = \arg\max_{h} \Pr(X = h, Y = \sigma)$
We can efficiently compute the marginal probabilities:
$p_\sigma = \sum_{h} \Pr(X = h, Y = \sigma)$
5
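Computation of the marginal probabilities: $p_\sigma$ has a decomposition into nested sums, which gives the “Forward algorithm”.
Its factors are of two kinds: Markov chain transitions between hidden states, and independent probabilities of the observed states given the hidden states.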
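The nested-sum computation can be sketched as follows. This is an illustration under stated assumptions, not the slide's own formula: `T[i][j]` is taken to be the hidden-to-hidden probability $p_{ij}$, `E[i][s]` the hidden-to-observed probability $p'_{is}$, and `init` an explicit initial distribution; the numeric values are made up. A brute-force enumeration over all hidden sequences is included only to check the result.

```python
from itertools import product

def forward(T, E, init, obs):
    """Marginal probability of obs, in O(n * l^2) instead of O(l^n)."""
    states = range(len(T))
    # alpha[h] = probability of the observations so far, ending in hidden state h
    alpha = [init[h] * E[h][obs[0]] for h in states]
    for s in obs[1:]:
        alpha = [sum(alpha[g] * T[g][h] for g in states) * E[h][s]
                 for h in states]
    return sum(alpha)

def brute_force(T, E, init, obs):
    """Same quantity by summing the joint probability over every hidden sequence."""
    total = 0.0
    for h in product(range(len(T)), repeat=len(obs)):
        p = init[h[0]] * E[h[0]][obs[0]]
        for k in range(1, len(obs)):
            p *= T[h[k - 1]][h[k]] * E[h[k]][obs[k]]
        total += p
    return total

# Hypothetical two-state model with a binary output alphabet.
T = [[0.9, 0.1], [0.2, 0.8]]
E = [[0.7, 0.3], [0.4, 0.6]]
init = [0.5, 0.5]
obs = [0, 1, 1, 0]
p_sigma = forward(T, E, init, obs)
```

The recursion pushes each sum as far inside the product as possible, which is what makes the computation linear in the length of the observation.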
6
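Viterbi algorithm
Tropicalization of the problem of computing $p_\sigma$: $u_{ij} = -\log(p'_{ij})$, $v_{ij} = -\log(p_{ij})$.
We can now efficiently find an explanation $h_1, \dots, h_m$ for the observation $\sigma_1, \dots, \sigma_n$ using the same recursion, with sums replaced by minima and products by sums.
It is again the Forward algorithm, now carried out in tropical arithmetic.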
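A hedged sketch of this tropicalized recursion is given below. It assumes that `T` holds the $p_{ij}$ (hidden to hidden), `E` holds the $p'_{ij}$ (hidden to observed), that all entries are strictly positive so the logarithms exist, and that one hidden state is inferred per observed symbol; the model values and the helper `joint` are illustrative only.

```python
import math
from itertools import product

def viterbi(T, E, init, obs):
    """Min-plus forward recursion with traceback; returns (min cost, hidden path)."""
    states = range(len(T))
    u = [[-math.log(x) for x in row] for row in E]   # u_ij = -log p'_ij
    v = [[-math.log(x) for x in row] for row in T]   # v_ij = -log p_ij
    cost = [-math.log(init[h]) + u[h][obs[0]] for h in states]
    back = []
    for s in obs[1:]:
        # best predecessor of each state h (tropical sum = min)
        prev = [min(states, key=lambda g: cost[g] + v[g][h]) for h in states]
        # tropical product = ordinary addition of weights
        cost = [cost[prev[h]] + v[prev[h]][h] + u[h][s] for h in states]
        back.append(prev)
    h = min(states, key=lambda g: cost[g])
    best = cost[h]
    path = [h]
    for prev in reversed(back):
        h = prev[h]
        path.append(h)
    return best, path[::-1]

def joint(T, E, init, hidden, obs):
    """Joint probability of one hidden sequence and the observation."""
    p = init[hidden[0]] * E[hidden[0]][obs[0]]
    for k in range(1, len(obs)):
        p *= T[hidden[k - 1]][hidden[k]] * E[hidden[k]][obs[k]]
    return p

# Hypothetical two-state model with a binary output alphabet.
T = [[0.9, 0.1], [0.2, 0.8]]
E = [[0.7, 0.3], [0.4, 0.6]]
init = [0.5, 0.5]
obs = [0, 1, 1, 0]
best, path = viterbi(T, E, init, obs)
```

The only change relative to the sum-product recursion is the pair of operations: `min` replaces the sum and `+` replaces the product, so `exp(-best)` is the probability of the most likely hidden path.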
7
Pair Hidden Markov Model (pHMM)
The algebraic statistical model for sequence alignment, known as the pair hidden Markov model, is the image of the map
where $A_{n,m}$ is the set of all alignments of the sequences $\sigma^1$, $\sigma^2$.
8
The Needleman-Wunsch algorithm for finding the shortest path in the alignment graph is the tropicalization of the pair hidden Markov model for sequence alignment.
Example: n=5, m=4, sequences gttta and gtgc, with the alignment
gttta-
gt--gc
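The shortest-path view can be sketched as a min-plus dynamic program over the alignment graph. The edge weights below (match 0, mismatch 1, gap 1) are hypothetical and chosen for illustration; in the tropicalized pair HMM they would instead be negative logarithms of the model parameters.

```python
def needleman_wunsch(a, b, mismatch=1.0, gap=1.0):
    """Return (minimal cost, aligned a, aligned b) for a shortest path."""
    n, m = len(a), len(b)
    # D[i][j] = weight of a shortest path from node (0, 0) to node (i, j)
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + gap
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if a[i - 1] == b[j - 1] else mismatch
            D[i][j] = min(D[i - 1][j - 1] + sub,   # diagonal edge
                          D[i - 1][j] + gap,       # gap in b
                          D[i][j - 1] + gap)       # gap in a
    # Traceback: follow any edge that attains the minimum.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0.0 if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and D[i][j] == D[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and D[i][j] == D[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append('-'); i -= 1
        else:
            top.append('-'); bottom.append(b[j - 1]); j -= 1
    return D[n][m], ''.join(reversed(top)), ''.join(reversed(bottom))

def alignment_cost(top, bottom, mismatch=1.0, gap=1.0):
    """Weight of the path corresponding to one given alignment."""
    cost = 0.0
    for x, y in zip(top, bottom):
        if x == '-' or y == '-':
            cost += gap
        elif x != y:
            cost += mismatch
    return cost

cost, top, bottom = needleman_wunsch("gttta", "gtgc")
```

Under these illustrative weights the alignment gttta- / gt--gc from the example is one path through the graph, but not necessarily a shortest one; which alignment is optimal depends on the parameters.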
9
The polytope propagation algorithm
The tropical sum-product algorithm in general fashion.
$f$ is the density function for a statistical model.
From the d monomials of $f$, find the one with maximal value.
Solution: tropicalization, $w_i = -\log p_i$, and computation in the polytope algebra.
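Computation in the polytope algebra can be sketched in the plane as follows. This is a small illustration, not the full algorithm: each parameter is replaced by a unit point, sums become convex hulls of unions and products become Minkowski sums, and the example polynomial $p_1^3 + p_1^2 p_2^2 + p_1 p_2^2 + p_1 + p_2^4$ is evaluated term by term. The names `p_add`, `p_mul` and `p_pow` are assumptions made for this sketch.

```python
def convex_hull(points):
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def p_add(P, Q):
    """Polytope 'sum': convex hull of the union."""
    return convex_hull(P + Q)

def p_mul(P, Q):
    """Polytope 'product': Minkowski sum."""
    return convex_hull([(a[0] + b[0], a[1] + b[1]) for a in P for b in Q])

def p_pow(P, k):
    """k-fold Minkowski sum of P with itself; the origin is the neutral element."""
    R = [(0, 0)]
    for _ in range(k):
        R = p_mul(R, P)
    return R

P1, P2 = [(1, 0)], [(0, 1)]   # polytopes standing in for p_1 and p_2
terms = [
    p_pow(P1, 3),
    p_mul(p_pow(P1, 2), p_pow(P2, 2)),
    p_mul(P1, p_pow(P2, 2)),
    P1,
    p_pow(P2, 4),
]
newton = terms[0]
for t in terms[1:]:
    newton = p_add(newton, t)
```

The resulting polygon is the Newton polytope of the polynomial. In a real model the same two operations would be applied inside the dynamic programming recursion rather than to an expanded list of monomials.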
10
Density function for a statistical model:
$f(p_1, p_2) = p_1^3 + p_1^2 p_2^2 + p_1 p_2^2 + p_1 + p_2^4$
Find an explanation: find the index $j$ of the monomial with maximal value.
Tropicalization: $w_i = -\log p_i$
Find the index $j$ of the monomial that minimizes the linear function $e_j \cdot w$, where $e_j$ is the exponent vector of the $j$-th monomial.
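The equivalence of the two formulations can be checked numerically. The sketch below hard-codes the exponent vectors of the five monomials of $f$ in the order written above (indices start at 0); the function names and the sample parameter values are illustrative.

```python
import math

# Exponent vectors e_j of p1^3, p1^2 p2^2, p1 p2^2, p1, p2^4
EXPONENTS = [(3, 0), (2, 2), (1, 2), (1, 0), (0, 4)]

def explanation_direct(p):
    """Index of the monomial with the largest value at the parameters p."""
    def value(e):
        return math.prod(pi ** ei for pi, ei in zip(p, e))
    return max(range(len(EXPONENTS)), key=lambda j: value(EXPONENTS[j]))

def explanation_tropical(p):
    """Index of the monomial minimizing e_j . w, where w_i = -log p_i."""
    w = [-math.log(pi) for pi in p]
    def weight(e):
        return sum(ei * wi for ei, wi in zip(e, w))
    return min(range(len(EXPONENTS)), key=lambda j: weight(EXPONENTS[j]))

samples = [(0.5, 0.5), (0.1, 0.9), (2.0, 0.5), (2.0, 1.5)]
results = [explanation_tropical(p) for p in samples]
```

Because $-\log$ is strictly decreasing, maximizing a monomial and minimizing its tropical weight pick the same index whenever the maximum is unique.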
11
Explanations are vertices of the Newton polytope of $f$.
$f(p_1, p_2) = p_1^3 + p_1^2 p_2^2 + p_1 p_2^2 + p_1 + p_2^4$
We find a point for each exponent vector of a monomial, e.g. $p_1^3 \mapsto (3, 0)$ and $p_1^1 \mapsto (1, 0)$.
12
Normal fan
The normal fan partitions the parameter space into regions such that the explanation(s) for all sets of parameters in a given region are given by the polytope vertex (or face) associated with that region.
13
Parametric MAP estimation problem
Local: given a choice of parameters, determine the set of all parameters with the same MAP estimate.
Solution: computation of the normal cone of the Newton polytope at the corresponding vertex.
Global: asks for a partition of the space of parameters such that any two parameters lie in the same part iff they yield the same MAP estimate.
Solution: computation of the normal fan of the Newton polytope.
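The local problem can be sketched for the example polynomial $p_1^3 + p_1^2 p_2^2 + p_1 p_2^2 + p_1 + p_2^4$. The code below works in the tropical coordinates $w_i = -\log p_i$ and uses the minimization convention, so the cone of a vertex $v$ is described by the inequalities $(e_j - v) \cdot w \ge 0$; the function names and the sign convention are assumptions made for this sketch.

```python
import math

EXPONENTS = [(3, 0), (2, 2), (1, 2), (1, 0), (0, 4)]

def map_vertex(w):
    """Exponent vector minimizing e . w, i.e. the explanation for w."""
    return min(EXPONENTS, key=lambda e: e[0] * w[0] + e[1] * w[1])

def normal_cone(v):
    """Rows a = e_j - v; the cone of v is the set of w with a . w >= 0 for every row."""
    return [(e[0] - v[0], e[1] - v[1]) for e in EXPONENTS if e != v]

def in_cone(w, v, tol=1e-12):
    return all(a[0] * w[0] + a[1] * w[1] >= -tol for a in normal_cone(v))

def same_map_estimate(w1, w2):
    """Local question: does w2 lie in the cone of the explanation for w1?"""
    return in_cone(w2, map_vertex(w1))

def to_w(p):
    return tuple(-math.log(pi) for pi in p)

w_a, w_b, w_c = to_w((0.5, 0.5)), to_w((0.9, 0.9)), to_w((0.1, 0.9))
```

Here `w_a` and `w_b` share the explanation $p_1$, while `w_c` falls in the cone of $p_2^4$. The exponent vector $(1, 2)$ lies in the interior of the polytope, so its cone contains only the origin and $p_1 p_2^2$ is never the unique explanation. Collecting the cones of all vertices gives the global partition.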