Learning Bit by Bit: Hidden Markov Models
Weighted FSA – weather example (diagram of states and weighted transitions for "The weather outside")
Markov Chain – computing the probability of an observed sequence of events
Markov Chain – weather example: Observation = "The weather outside" (diagram of states with transition probabilities)
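As an illustration, a minimal Python sketch of how a Markov chain assigns a probability to an observed sequence of events; the weather states and numbers here are made-up stand-ins, since the original diagram is not recoverable:

    # Hypothetical start and transition probabilities for weather states
    start = {"sunny": 0.6, "rainy": 0.4}
    trans = {
        "sunny": {"sunny": 0.7, "rainy": 0.3},
        "rainy": {"sunny": 0.5, "rainy": 0.5},
    }

    def sequence_probability(states):
        # P(s1, ..., sn) = P(s1) * P(s2 | s1) * ... * P(sn | s(n-1))
        prob = start[states[0]]
        for prev, cur in zip(states, states[1:]):
            prob *= trans[prev][cur]
        return prob

    print(sequence_probability(["sunny", "sunny", "rainy"]))  # 0.6 * 0.7 * 0.3 = 0.126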
Parts of Speech – grammatical categories such as noun and verb
POS examples:
– N (noun): chair, bandwidth, pacing
– V (verb): study, debate, munch
– ADJ (adjective): purple, tall, ridiculous
– ADV (adverb): unfortunately, slowly
– P (preposition): of, by, to
– PRO (pronoun): I, me, mine
– DET (determiner): the, a, that, those
Parts of Speech – uses:
– Speech recognition
– Speech synthesis
– Data mining
– Translation
POS Tagging – words often have more than one POS, e.g. "back":
– The back door = JJ
– On my back = NN
– Win the voters back = RB
– Promised to back the bill = VB
The POS tagging problem is to determine the POS tag for a particular instance of a word.
POS Tagging – sentence = sequence of observations, e.g. "Secretariat is expected to race tomorrow"
Disambiguating “race”
Hidden Markov Model – observed words vs. hidden tags (diagram)
Hidden Markov Model – two kinds of probabilities:
– Tag transitions
– Word likelihoods
Hidden Markov Model – tag transition probability = P(tag | previous tag), e.g. P(VB | TO)
Hidden Markov Model – word likelihood probability = P(word | tag), e.g. P("race" | VB)
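A minimal sketch of how these two probability tables can be held in code (the tuple-keyed dictionaries are an assumption of this sketch, not a required representation); the entries mirror P(VB | TO) and P("race" | VB) from the slides:

    # Tag transition probabilities: P(tag | previous tag)
    tag_transition = {("TO", "VB"): 0.83}

    # Word likelihoods: P(word | tag)
    word_likelihood = {("race", "VB"): 0.00012}

    def step_score(prev_tag, tag, word):
        # One HMM step: transition probability times word likelihood
        return tag_transition[(prev_tag, tag)] * word_likelihood[(word, tag)]

    print(step_score("TO", "VB", "race"))  # 0.83 * 0.00012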
Actual probabilities:
– P(NN | TO) = .00047
– P(VB | TO) = .83
Actual probabilities:
– P(NR | VB) = .0027
– P(NR | NN) = .0012
Actual probabilities:
– P(race | NN) = .00057
– P(race | VB) = .00012
Hidden Markov Model – probability of "to race tomorrow" as "TO VB NR":
P(VB|TO) * P(NR|VB) * P(race|VB) = .83 * .0027 * .00012 = .00000027
Hidden Markov Model – probability of "to race tomorrow" as "TO NN NR":
P(NN|TO) * P(NR|NN) * P(race|NN) = .00047 * .0012 * .00057 = .00000000032
Hidden Markov Model – comparing the two tag sequences:
– Probability "to race tomorrow" = "TO NN NR" = .00000000032
– Probability "to race tomorrow" = "TO VB NR" = .00000027
The VB reading is far more probable, so "race" is tagged as a verb.
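The same comparison as a short Python sketch, using the probabilities listed on the previous slides (the dictionary layout and the score function are only illustrative):

    # P(tag | previous tag)
    trans = {("TO", "VB"): 0.83, ("TO", "NN"): 0.00047,
             ("VB", "NR"): 0.0027, ("NN", "NR"): 0.0012}

    # P(word | tag)
    emit = {("race", "VB"): 0.00012, ("race", "NN"): 0.00057}

    def score(tags, word, word_index=1):
        # Multiply the tag transitions along the sequence and the likelihood of "race"
        p = 1.0
        for prev, cur in zip(tags, tags[1:]):
            p *= trans[(prev, cur)]
        return p * emit[(word, tags[word_index])]

    print(score(["TO", "VB", "NR"], "race"))  # about .00000027
    print(score(["TO", "NN", "NR"], "race"))  # about .00000000032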
Bayesian Inference – correct answer = the hypothesis that maximizes P(hypothesis | observed)
Bayesian Inference – prior probability = probability of the hypothesis before seeing the evidence, P(hypothesis)
Bayesian Inference – likelihood = probability of the observed evidence given the hypothesis, P(observed | hypothesis)
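A minimal sketch of this decision rule, picking the hypothesis that maximizes prior * likelihood (the hypotheses reuse the "race" numbers above purely as an illustration, not as a full Bayesian treatment):

    # Posterior is proportional to prior * likelihood; choose the largest product
    prior = {"VB": 0.83, "NN": 0.00047}          # P(hypothesis)
    likelihood = {"VB": 0.00012, "NN": 0.00057}  # P(observed | hypothesis)

    best = max(prior, key=lambda h: prior[h] * likelihood[h])
    print(best)  # "VB": the verb reading wins, as in the "race" example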
Bayesian Inference – Bayesians vs. frequentists: subjectivity
Examples