Hidden Markov Models Wassnaa AL-mawee Western Michigan University Department of Computer Science CS6800 Adv. Theory of Computation Prof. Elise De Doncker 02/17/2016

Outline
- Introduction
  - Motivation
  - Markov Models
- Hidden Markov Models (HMMs)
  - HMMs Definition and Components
  - HMMs Problems
  - HMMs Basic Algorithms
- Applications
  - HMMs in Speech Recognition
  - HMMs for Gene Finding in DNA
- Summary
- References

Introduction
- Motivation:
  - Predictions are often based on models of observed data that assume the observations are independent and identically distributed.
  - This assumption is not always the best. Examples:
    - Measurements of weather patterns.
    - Daily values of stocks.
    - The composition of DNA.
    - The composition of texts.
    - Time frames used for speech recognition.

Introduction (continued)
- Markov Models:
  - If the n-th observation in a chain of observations is influenced only by the (n-1)-th observation, the chain is called a 1st-order Markov chain:
    P(X_n | X_1, ..., X_{n-1}) = P(X_n | X_{n-1})
  - Example: weather prediction (what the weather will be tomorrow depends only on observations of the past):

Introduction (continued)
- Weather of the day (day n): X_n ∈ {sunny, rainy, cloudy}.
- If the weather yesterday was cloudy and today it is rainy, what is the probability that tomorrow will be sunny?
  P(X_3 = sunny | X_2 = rainy) = 0.6
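As a minimal illustration of the 1st-order Markov assumption, the sketch below encodes a weather transition table in Python and looks up the probability above. Only the P(sunny | rainy) = 0.6 entry comes from the slide; the remaining transition values are assumed placeholders.

```python
# Minimal sketch of a first-order Markov weather chain.
# Only P(sunny | rainy) = 0.6 appears on the slide; the other
# transition probabilities below are made-up values for illustration.
transition = {
    "sunny":  {"sunny": 0.8, "rainy": 0.05, "cloudy": 0.15},
    "rainy":  {"sunny": 0.6, "rainy": 0.3,  "cloudy": 0.1},
    "cloudy": {"sunny": 0.2, "rainy": 0.5,  "cloudy": 0.3},
}

def next_day_prob(today: str, tomorrow: str) -> float:
    """P(X_{n+1} = tomorrow | X_n = today) under the 1st-order Markov assumption."""
    return transition[today][tomorrow]

print(next_day_prob("rainy", "sunny"))  # 0.6, as on the slide
```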

Introduction (continued)
- What if the n-th observation in a chain of observations is influenced by a corresponding HIDDEN variable?

Outline
- Introduction
- Hidden Markov Models (HMMs)
  - HMMs Definition and Components
  - HMMs Problems
  - HMMs Basic Algorithms
- Applications
- Summary
- References

HMMs Definition and Components
- HMMs are powerful statistical models for modeling sequential or time-series data, and have been successfully used in many tasks such as [1]:
  - robot control,
  - speech recognition,
  - protein/DNA sequence analysis,
  - and information extraction from text data.

HMMs Definition and Components
- An HMM is a 5-tuple (S, O, Π, A, B), where:
  - S = {s_1, ..., s_N} is a finite set of N states,
  - O = {o_1, ..., o_M} is a set of M possible outputs (observations),
  - Π = {π_i} are the initial state probabilities,
  - A = {a_ij} are the state transition probabilities,
  - B = {b_i(o_k)} are the output (emission) probabilities.
- We use λ = (Π, A, B) to denote all the parameters of the HMM.
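A minimal sketch of how the 5-tuple above might be held in code; the state names, symbols, and numbers in the toy instance are placeholders, not values from the slides.

```python
# Generic container for an HMM λ = (Π, A, B) over states S and output symbols O.
from dataclasses import dataclass

@dataclass
class HMM:
    states: list   # S = {s_1, ..., s_N}
    outputs: list  # O = {o_1, ..., o_M}
    pi: dict       # Π: state -> initial probability
    A: dict        # A: (state_i, state_j) -> transition probability a_ij
    B: dict        # B: (state_i, symbol_k) -> emission probability b_i(o_k)

# Tiny two-state toy instance with placeholder numbers (not from the slides):
toy = HMM(
    states=["s1", "s2"],
    outputs=["x", "y"],
    pi={"s1": 0.6, "s2": 0.4},
    A={("s1", "s1"): 0.7, ("s1", "s2"): 0.3, ("s2", "s1"): 0.4, ("s2", "s2"): 0.6},
    B={("s1", "x"): 0.9, ("s1", "y"): 0.1, ("s2", "x"): 0.2, ("s2", "y"): 0.8},
)
```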

HMMs Definition and Components
- HMM example: Assistive Technology
  - Assume you have a little robot that tries to estimate the probability that you are happy (h) or sad (s), given that the robot has observed whether you are watching your favorite TV show (w), sleeping (s), crying (c), or Facebooking (f) [4].
  - Let the hidden state be X ∈ {h, s}.
  - Let the observation be Y, which can be w, s, c, or f.
  - We want to find probabilities such as:
    - P(X=h | Y=w)?
    - P(X=s | Y=c)?

HMMs Definition and Components
- Use Bayes' rule, which describes the probability of an event based on conditions that might be related to the event [5]:
  P(X | Y) = P(Y | X) P(X) / P(Y), where P(Y) = Σ_x P(Y | X=x) P(X=x).

HMMs Definition and Components (continued)
- Solve them with a prior and a likelihood model [4]:
  - Prior: P(X=h) = 0.8, P(X=s) = 0.2.
  - Likelihood P(Y | X):

        |  w    s    c    f
      h | 0.4  0.4  0.2  0.0
      s | 0.1  0.3  0.5  0.1

  - P(X=h | Y=w) = (0.4)(0.8) / ((0.4)(0.8) + (0.1)(0.2)) = 0.32 / 0.34 ≈ 0.94.
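The same posterior can be computed mechanically with Bayes' rule; the sketch below uses exactly the prior and likelihood values from this slide.

```python
# Posterior P(X | Y) for the robot example via Bayes' rule, using the
# prior and likelihood values given on the slide.
prior = {"h": 0.8, "s": 0.2}                       # P(X)
likelihood = {                                      # P(Y | X)
    "h": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
    "s": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1},
}

def posterior(x: str, y: str) -> float:
    """P(X = x | Y = y) = P(y|x) P(x) / sum_x' P(y|x') P(x')."""
    evidence = sum(likelihood[xp][y] * prior[xp] for xp in prior)
    return likelihood[x][y] * prior[x] / evidence

print(round(posterior("h", "w"), 2))  # 0.94, matching the slide
print(round(posterior("s", "c"), 2))  # the quantity asked for in Q4 of the Questions slide
```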

HMMs Definition and Components (continued)
- What if we have a transition prior rather than an absolute prior? Assume we observe sequences such as ccc and wcw. Then the HMM is:
  - S = {H, S},
  - O = {w, s, c, f},
  - Π = {H: 0.8, S: 0.2},
  - A = {H→H: 0.9, H→S: 0.1, S→H: 0.1, S→S: 0.9},
  - B = {H: {w: 0.4, s: 0.4, c: 0.2, f: 0.0}, S: {w: 0.1, s: 0.3, c: 0.5, f: 0.1}}.

HMMs Problems
- Three basic problems are associated with an HMM:
  1) Evaluation: evaluate the probability of an observed sequence of symbols O over all possible state sequences, given a particular HMM λ, i.e., p(O | λ).
  2) Decoding: find the best state sequence, given an observation sequence O and an HMM λ, i.e., q* = argmax_q p(q | O).
  3) Training: find the parameters of the HMM λ that maximize the probability of generating an observed sequence, i.e., λ* = argmax_λ p(O | λ).

HMMs Basic Algorithms

  Problem                                Algorithm
  Evaluation: p(O | λ)                   Forward-Backward
  Decoding:   q* = argmax_q p(q | O)     Viterbi
  Training:   λ* = argmax_λ p(O | λ)     Baum-Welch (EM)

HMMs Basic Algorithms (continued)
- The Forward algorithm, to solve the Evaluation problem:
  - Sum over all possible state paths that could generate the given observation sequence, via the recursive procedure:
    - Initialization: α_1(i) = π_i b_i(o_1), 1 ≤ i ≤ N.
    - Recursion: α_{t+1}(i) = [Σ_{j=1}^{N} α_t(j) a_ji] b_i(o_{t+1}), 1 ≤ t < T.
    - Termination: p(O | λ) = Σ_{i=1}^{N} α_T(i), since we may end at any of the N states.
- The Backward algorithm, together with the Forward algorithm, is used to solve the third (training) problem.
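A minimal sketch of the forward recursion above, applied to the robot HMM listed earlier (Π, A, B from the transition-prior slide); the observation sequence ccc is one of the example sequences mentioned there.

```python
# Forward algorithm: p(O | λ) = sum_i α_T(i), using the recursion from the slide.
# Π, A, B are the robot HMM from the earlier slide; "ccc" is one of its example sequences.
pi = {"H": 0.8, "S": 0.2}
A  = {"H": {"H": 0.9, "S": 0.1}, "S": {"H": 0.1, "S": 0.9}}
B  = {"H": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
      "S": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1}}

def forward(obs):
    states = list(pi)
    # Initialization: α_1(i) = π_i b_i(o_1)
    alpha = {i: pi[i] * B[i][obs[0]] for i in states}
    # Recursion: α_{t+1}(i) = (Σ_j α_t(j) a_ji) b_i(o_{t+1})
    for o in obs[1:]:
        alpha = {i: sum(alpha[j] * A[j][i] for j in states) * B[i][o] for i in states}
    # Termination: we may end at any of the N states
    return sum(alpha.values())

print(forward("ccc"))  # p(O = c,c,c | λ)
```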

HMMs Basic Algorithms (continued)
- The Viterbi algorithm, to solve the Decoding problem:
  - The Viterbi algorithm is a dynamic programming algorithm.
  - It computes the most likely state transition path given an observed sequence of symbols.
  - It is similar to the forward algorithm, except that it takes a max rather than a summation.
  - Viterbi recursion:
    - VP_1(i) = π_i b_i(o_1) and q_1*(i) = (i).
    - For 1 ≤ t < T: VP_{t+1}(i) = max_{1≤j≤N} VP_t(j) a_ji b_i(o_{t+1}) and q_{t+1}*(i) = q_t*(k).(i), where k = argmax_{1≤j≤N} VP_t(j) a_ji b_i(o_{t+1}).
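A matching sketch of the Viterbi recursion, again on the robot HMM from the earlier slide; the only structural change from the forward pass is a max (plus a back-pointer) in place of the sum. The observation sequence wcw is the other example sequence mentioned on that slide.

```python
# Viterbi decoding: same structure as the forward pass, but with max instead of
# sum, keeping back-pointers to recover the best state path.
# Π, A, B are the robot HMM from the earlier slide.
pi = {"H": 0.8, "S": 0.2}
A  = {"H": {"H": 0.9, "S": 0.1}, "S": {"H": 0.1, "S": 0.9}}
B  = {"H": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
      "S": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1}}

def viterbi(obs):
    states = list(pi)
    # Initialization: VP_1(i) = π_i b_i(o_1); each partial path starts as [i]
    vp = {i: pi[i] * B[i][obs[0]] for i in states}
    path = {i: [i] for i in states}
    # Recursion: take the max over predecessors instead of the sum
    for o in obs[1:]:
        new_vp, new_path = {}, {}
        for i in states:
            k = max(states, key=lambda j: vp[j] * A[j][i])
            new_vp[i] = vp[k] * A[k][i] * B[i][o]
            new_path[i] = path[k] + [i]
        vp, path = new_vp, new_path
    best = max(states, key=lambda i: vp[i])
    return path[best], vp[best]

print(viterbi("wcw"))  # most likely hidden state sequence and its probability
```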

HMMs Basic Algorithms (continued)
- The Baum-Welch algorithm (also called the Forward-Backward algorithm), to solve the Training problem:
  - So far we have assumed the HMM parameters are known, but in practice they must be re-estimated, or obtained from annotated training data, which has two drawbacks:
    1) Annotation is difficult/expensive.
    2) The training data may differ from the current data.
  - The goal of the Baum-Welch algorithm is to tune the HMM parameters using EM (Expectation-Maximization) so that they best fit the current data.
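A compact sketch of one Baum-Welch (EM) iteration for a single observation sequence, combining the forward and backward passes with the standard re-estimation formulas. The starting parameters are the robot HMM from the earlier slides; the observation sequence here is an illustrative choice, not one from the slides.

```python
# One Baum-Welch (EM) update for a single observation sequence (a sketch).
# Starting Π, A, B are the robot HMM from the earlier slides; the observation
# sequence "wcwss" is an illustrative assumption.
pi = {"H": 0.8, "S": 0.2}
A  = {"H": {"H": 0.9, "S": 0.1}, "S": {"H": 0.1, "S": 0.9}}
B  = {"H": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
      "S": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1}}
states, symbols = list(pi), ["w", "s", "c", "f"]

def baum_welch_step(obs):
    T = len(obs)
    # E-step: forward (alpha) and backward (beta) passes
    alpha = [{i: pi[i] * B[i][obs[0]] for i in states}]
    for t in range(1, T):
        alpha.append({i: sum(alpha[t-1][j] * A[j][i] for j in states) * B[i][obs[t]]
                      for i in states})
    beta = [{i: 1.0 for i in states} for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = {i: sum(A[i][j] * B[j][obs[t+1]] * beta[t+1][j] for j in states)
                   for i in states}
    p_obs = sum(alpha[T-1][i] for i in states)  # p(O | λ)
    # Expected state occupancies (gamma) and transitions (xi)
    gamma = [{i: alpha[t][i] * beta[t][i] / p_obs for i in states} for t in range(T)]
    xi = [{(i, j): alpha[t][i] * A[i][j] * B[j][obs[t+1]] * beta[t+1][j] / p_obs
           for i in states for j in states} for t in range(T - 1)]
    # M-step: re-estimate Π, A, B from the expected counts
    new_pi = {i: gamma[0][i] for i in states}
    new_A = {i: {j: sum(xi[t][(i, j)] for t in range(T - 1)) /
                    sum(gamma[t][i] for t in range(T - 1))
                 for j in states} for i in states}
    new_B = {i: {k: sum(gamma[t][i] for t in range(T) if obs[t] == k) /
                    sum(gamma[t][i] for t in range(T))
                 for k in symbols} for i in states}
    return new_pi, new_A, new_B

print(baum_welch_step("wcwss"))
```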

Outline
- Introduction
- Hidden Markov Models (HMMs)
- Applications
  - HMMs in Speech Recognition
  - HMMs for Gene Finding in DNA
- Summary
- References

HMMs in Speech Recognition
- Figure: the word "ball" spoken by two different speakers: (a) female and (b) male [3].
- Figure: the phoneme /e/ in three different contexts: (a) let's, (b) phonetic and (c) sentence [3].

HMMs in Speech Recognition (continued)
- Use an HMM to model a unit of speech at the word and sentence level from phoneme-level units. For example, a model for the word "one", with pronunciation "W AX N" [2]:
  - Speech is represented as a sequence of symbols.
  - Output probabilities: probabilities of observing a symbol in a state.
  - Transition probabilities: probabilities of staying in a state or skipping to the next state.

HMMs in Speech Recognition (continued)
- Training HMMs for continuous speech:
  - Concatenate phone models to give a word model.
  - Concatenate word models to give a sentence model.
  - Train the entire sentence model on the entire spoken sentence.

HMMs in Speech Recognition (continued)
- Figure: recognition search (source: hmm-for-asr-whw.pdf).

HMMs in Speech Recognition (continued)
- Figure: forward-backward training for continuous speech.

HMMs for Gene Finding in DNA

HMMs for Gene Finding in DNA (continued)
- Figure: basic structure of a gene [6].

HMMs for Gene Finding in DNA (continued)
- Input: a DNA sequence X = (x_1, ..., x_n), where each x_i ∈ {A, C, G, T}.
- Output: a correct labeling of each element of X as coding, non-coding, or intergenic.
- The goal of gene finding is then to annotate the genomic data with the locations of genes and of the regions within these genes [6].
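A toy sketch of this labeling task using the Viterbi decoder described earlier: the hidden states are the three region types and the emissions are nucleotides. All probabilities below are made-up illustrative values, not parameters from the slides or from [6]; a real gene finder would use trained parameters and a richer state structure (start/stop codons, splice sites).

```python
# Toy gene-region labeling: Viterbi over region-type states, nucleotide emissions.
# All numbers are assumed placeholders for illustration only.
pi = {"coding": 0.2, "non-coding": 0.3, "intergenic": 0.5}
A = {
    "coding":     {"coding": 0.90, "non-coding": 0.08, "intergenic": 0.02},
    "non-coding": {"coding": 0.10, "non-coding": 0.85, "intergenic": 0.05},
    "intergenic": {"coding": 0.05, "non-coding": 0.05, "intergenic": 0.90},
}
B = {
    "coding":     {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
    "non-coding": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "intergenic": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
}

def label(dna):
    """Viterbi labeling of each nucleotide with its most likely region type."""
    states = list(pi)
    vp = {i: pi[i] * B[i][dna[0]] for i in states}
    path = {i: [i] for i in states}
    for x in dna[1:]:
        new_vp, new_path = {}, {}
        for i in states:
            k = max(states, key=lambda j: vp[j] * A[j][i])
            new_vp[i] = vp[k] * A[k][i] * B[i][x]
            new_path[i] = path[k] + [i]
        vp, path = new_vp, new_path
    best = max(states, key=lambda i: vp[i])
    return list(zip(dna, path[best]))

print(label("ATGCGTTAA"))  # (nucleotide, predicted region) pairs
```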

HMMs for Gene Finding in DNA (continued)
- Enter: start codon or intron (3' splice site).
- Exit: 5' splice site or one of the three stop codons (taa, tag, tga) [7].

Outline
- Introduction
- Hidden Markov Models (HMMs)
- Applications
- Summary
- References

Summary
- Introduced Hidden Markov Models (HMMs).
- Defined the basic HMM problems and the corresponding algorithms for solving them.
- Presented the best-known applications of HMMs.

References
[1]
[2]
[3]
[4]
[5]
[6] K. Smith, "Hidden Markov Models in Bioinformatics with Application to Gene Finding in Human DNA."
[7] Nagiza F. Samatova, "Computational Gene Finding using HMMs" (slides).

Questions
Q1) Define an HMM and give its components.
Q2) What are the three problems associated with HMMs, and how are they solved?
Q3) What is the difference between the Forward and Viterbi algorithms?
Q4) Given the Assistive Technology example, find P(X=s | Y=c). Hint: use the absolute prior.
Q5) How can a gene be found in human DNA using HMMs? What are the inputs and outputs of the HMM?