Odds and Ends Tutorial #13: The Noisy Transmission Model (© Ilan Gronau)


Odds and Ends Tutorial #13 © Ilan Gronau

2 The Noisy Transmission Model

3 Transitions: [transition diagram over the states 0, I0, 1, I1]. Stationary distribution: (8, 24, 1, 3)/36.
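The transition probabilities appear on the slide only as a diagram, but whatever matrix P they define, the stationary distribution quoted above is the vector π solving πP = π with Σπ = 1. A minimal sketch of that computation (the matrix itself is not reproduced here; any row-stochastic P will do):

```python
import numpy as np

def stationary(P):
    """Stationary distribution pi of a row-stochastic matrix P:
    solve pi @ P = pi together with pi.sum() == 1 (least squares)."""
    m = P.shape[0]
    A = np.vstack([P.T - np.eye(m), np.ones((1, m))])
    b = np.concatenate([np.zeros(m), [1.0]])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# With the 4x4 transition matrix of the slide, this should return (8, 24, 1, 3)/36.
```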

4 Questions
- Given an output sequence (including blanks), what is the most probable path which yields this sequence? → (1.c) Viterbi algorithm
- Given an output sequence, what is the most probable path to yield it which passes through M non-noise states (0/1)? → (1.d)
- Given an output sequence, what is the most probable path to yield it? → (bonus)
- Given an output sequence, what is the most probable transmission? → Problem: each transmission corresponds to multiple paths!

5 Answer to 1.d
Given an output sequence X_1,…,X_n and M, we calculate the following values for all states S, i=1..n and j=1..M:
v_S(i,j) – log-probability of the most probable path yielding output X_1,…,X_i, passing through j non-noise states, and ending in state S.
Initialize: v_S(0,0) – initial log-probability of S (stationary distribution).
For i,j>0 and a=0/1, apply the recursion (slide 6) and hold update-pointers.
Most values are -∞. Note: t(·,·), e(·,·) are log-probabilities.

6 Answer to 1.d (continued)
Given an output sequence X_1,…,X_n and M, we calculate the following values for all states S, i=1..n and j=1..M:
v_S(i,j) – log-probability of the most probable path yielding output X_1,…,X_i, passing through j non-noise states, and ending in state S.
Recursion formulae (for i,j>0 and a=0/1) are given on the slide as a figure; hold update-pointers while filling the table.
At the end choose the best final entry and follow the pointers to recover the path. A code sketch of this procedure follows below.
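A sketch of this dynamic program in code. The interface is illustrative: log_t[s][s2], log_e[s][symbol] and log_pi[s] stand in for the model's log transition, emission and initial probabilities, and the recursion shown on the slides as a figure is assumed to be the usual Viterbi step with an extra counter of non-noise states:

```python
def constrained_viterbi(x, M, states, non_noise, log_t, log_e, log_pi):
    """Most probable state path emitting x[0..n-1] while passing through
    exactly M non-noise states. Returns (log-probability, path)."""
    NEG = float("-inf")
    n = len(x)
    # v[i][j][s]: best log-prob of a path emitting x[:i], visiting j non-noise states, ending in s
    v = [[{s: NEG for s in states} for _ in range(M + 1)] for _ in range(n + 1)]
    ptr = [[{s: None for s in states} for _ in range(M + 1)] for _ in range(n + 1)]
    for s in states:
        v[0][0][s] = log_pi[s]              # initialization: stationary distribution
    for i in range(1, n + 1):
        for j in range(M + 1):
            for s in states:
                j_prev = j - 1 if s in non_noise else j   # a non-noise state consumes one unit of the budget
                if j_prev < 0:
                    continue
                best, arg = NEG, None
                for sp in states:
                    cand = v[i - 1][j_prev][sp] + log_t[sp][s]
                    if cand > best:
                        best, arg = cand, sp
                if arg is not None:
                    v[i][j][s] = best + log_e[s][x[i - 1]]
                    ptr[i][j][s] = arg       # hold update-pointer
    # choose the best final state with exactly M non-noise visits, then follow pointers
    end = max(states, key=lambda s: v[n][M][s])
    logp, path, i, j, s = v[n][M][end], [], n, M, end
    while i > 0:                             # traceback (assumes a valid path exists)
        path.append(s)
        prev = ptr[i][j][s]
        if s in non_noise:
            j -= 1
        i, s = i - 1, prev
    return logp, list(reversed(path))
```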

7 Bonus
Given an output sequence, what is the most probable path to yield it?
Approach 1: if we don't know M, we can fill in the tables column by column; eventually the probability of the columns starts deteriorating.
Approach 2: an a-priori bound. Note that an optimal path doesn't have 2 consecutive deletions (-): [figure comparing two path segments between S_i and S_{i+2}; the segment with two consecutive deletions has lower probability]. Conclusion: M < 2n+2.

8 2-species Evolution
Consider the following evolution model for binary-character vectors:
- Each species corresponds to a binary vector in {0,1}^n.
- Two species Y, Z evolve from a common ancestor X.
- Each bit in X is chosen uniformly at random.
- Each bit in X is flipped w.p. θ during evolution towards Y or towards Z.
[figure: a two-leaf tree with hidden root X, observed leaves Y and Z, and flip probability θ on each of the two edges]
Given binary vectors for Y, Z, calculate the most probable value for θ:
1. Define the sufficient statistics of the problem.
2. Give a formula for L(θ).
3. Formulate an EM algorithm for the problem.
4. Give an analytic solution (if it exists) for the MLE.
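A small simulation of this model, together with the sufficient statistics used on the next slide (a sketch; function names are illustrative, not part of the tutorial):

```python
import random

def simulate(n, theta, seed=0):
    """Hidden ancestor X ~ Uniform{0,1}^n; each bit is flipped
    independently with probability theta on the way to Y and to Z."""
    rng = random.Random(seed)
    X = [rng.randint(0, 1) for _ in range(n)]
    Y = [b ^ (rng.random() < theta) for b in X]
    Z = [b ^ (rng.random() < theta) for b in X]
    return X, Y, Z

def sufficient_stats(Y, Z):
    """n0 = #{i : y_i == z_i}, n1 = #{i : y_i != z_i}."""
    n1 = sum(y != z for y, z in zip(Y, Z))
    return len(Y) - n1, n1
```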

9 2-species Evolution
1. Define the sufficient statistics of the problem.
Given Y = y_1,…,y_n and Z = z_1,…,z_n, define n_0 = |{i : y_i = z_i}| and n_1 = |{i : y_i ≠ z_i}|.
2. Give a formula for L(θ).
L(θ) = Pr[Y,Z | θ] = Π_{i=1..n} Pr[Y_i, Z_i | θ].
Joint probabilities for Y_i = 0 (similarly if Y_i = 1):

Y_i  Z_i  X_i  Pr[X_i, Y_i, Z_i]
0    0    0    ½(1-θ)²
0    0    1    ½θ²
0    1    0    ½θ(1-θ)
0    1    1    ½θ(1-θ)

Summing out the hidden X_i gives Pr[Y_i = Z_i | θ] = ((1-θ)² + θ²)/2 and Pr[Y_i ≠ Z_i | θ] = θ(1-θ), so
L(θ) = ( ((1-θ)² + θ²)/2 )^{n_0} · ( θ(1-θ) )^{n_1}.
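The log of this likelihood, written as a function of the sufficient statistics (a minimal sketch):

```python
import math

def log_likelihood(theta, n0, n1):
    """log L(theta) = n0*log(((1-theta)^2 + theta^2)/2) + n1*log(theta*(1-theta)),
    valid for 0 < theta < 1."""
    return (n0 * math.log(((1 - theta) ** 2 + theta ** 2) / 2)
            + n1 * math.log(theta * (1 - theta)))
```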

10 2-species Evolution
3. Formulate an EM algorithm for the problem.
E-step – given θ, calculate the expected number of flips from X to Y and Z. #flips is a sum of indicator variables:
E[#flips] = Σ_{i=1..n} ( Pr[x_i ≠ y_i] + Pr[x_i ≠ z_i] ).
Posteriors for Y_i = 0 (similarly if Y_i = 1):

Y_i  Z_i  X_i  Pr[X_i, Y_i, Z_i]  Pr[X_i | Y_i, Z_i]
0    0    0    ½(1-θ)²            (1-θ)² / ((1-θ)² + θ²)
0    0    1    ½θ²                θ² / ((1-θ)² + θ²)
0    1    0    ½θ(1-θ)            ½
0    1    1    ½θ(1-θ)            ½

An agreeing position (y_i = z_i) therefore contributes 2θ²/((1-θ)²+θ²) expected flips, and a disagreeing position contributes exactly 1.
M-step – given the expected number of flips from X to Y and Z, calculate θ':
θ' = E[#flips] / 2n.
E+M combined:
θ' = ( n_1 + 2 n_0 θ² / ((1-θ)² + θ²) ) / (2n).
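One EM iteration in code (a sketch of the update above; names are illustrative):

```python
def em_step(theta, n0, n1):
    """E-step: expected #flips = n1 + 2*n0*theta^2/((1-theta)^2 + theta^2).
    M-step: theta' = E[#flips] / (2n), since 2n bits are transmitted in total."""
    n = n0 + n1
    post_two_flips = theta ** 2 / ((1 - theta) ** 2 + theta ** 2)
    return (n1 + 2 * n0 * post_two_flips) / (2 * n)

def em(n0, n1, theta=0.1, iters=100):
    for _ in range(iters):
        theta = em_step(theta, n0, n1)
    return theta
```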

11 2-species Evolution
4. Give an analytic solution (if it exists) for the MLE.
Find the extreme points of the log-likelihood. [figure: plot of the log-likelihood in θ, marking a local minimum and the maxima]
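Differentiating the log-likelihood from slide 9 and setting it to zero gives (a sketch of the calculus; the slide shows only the resulting plot):

```latex
\frac{d}{d\theta}\log L(\theta)
  = n_0\,\frac{4\theta-2}{(1-\theta)^2+\theta^2}
  + n_1\,\frac{1-2\theta}{\theta(1-\theta)}
  = (2\theta-1)\left[\frac{2 n_0}{(1-\theta)^2+\theta^2}
                      - \frac{n_1}{\theta(1-\theta)}\right] = 0
```

So the critical points are θ = ½ and the solutions of θ(1-θ) = n_1/(2n), i.e. θ = (1 ± √(1 - 2n_1/n))/2. When n_1 < n/2 these two symmetric points are the maxima and θ = ½ is the local minimum; when n_1 ≥ n/2 the bracketed term never vanishes and θ = ½ is the unique maximum.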

12 Generalizing The Model
- Alphabet of size k.
- Uniform transition model, or more complex transition models.
- Evolution of n species (given the phylogenetic topology): [figure: a tree with hidden internal nodes X_1,…,X_5, observed leaves Y_1,…,Y_n, and a parameter θ_i on each edge].
- θ_i correlates to the evolutionary distance along the edge.
- Solves the 'small' likelihood problem (computing the probability of the observed leaves for fixed θ_i, summing over the hidden internal nodes); see the sketch below.
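The 'small' likelihood problem for a fixed topology and fixed edge parameters is typically solved by a pruning recursion over the tree. A minimal sketch for the binary alphabet, assuming the tree is given as child lists; the data structures and names are illustrative, not from the tutorial:

```python
def tree_likelihood(children, leaf_obs, theta, root):
    """Pr[observed leaves | tree, theta], hidden internal nodes summed out.
    children: dict node -> list of child nodes (internal nodes only)
    leaf_obs: dict leaf -> observed bit (0/1)
    theta:    dict node -> flip probability on the edge above that node
    """
    def cond(node):
        # cond(node)[b] = Pr[leaf data below node | node carries value b]
        if node in leaf_obs:
            return [1.0 if leaf_obs[node] == b else 0.0 for b in (0, 1)]
        tables = [cond(c) for c in children[node]]
        out = []
        for b in (0, 1):
            p = 1.0
            for c, t in zip(children[node], tables):
                # the child keeps value b w.p. 1-theta[c], flips it w.p. theta[c]
                p *= (1 - theta[c]) * t[b] + theta[c] * t[1 - b]
            out.append(p)
        return out
    L = cond(root)
    return 0.5 * L[0] + 0.5 * L[1]   # uniform prior on the root state
```

For the two-species model above, children = {'X': ['Y', 'Z']} and theta = {'Y': θ, 'Z': θ} recover Pr[Y_i, Z_i | θ] for a single site.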