Lecture 9: The GHMM Library and The Brill Tagger
CSCE 771 Natural Language Processing
Topics: GHMM library in C – integrating with Python; Brill tagger; HMM assignment
Readings: http://ghmm.org/; Section 5.6 – transformation-based tagging
February 13, 2013
Overview
Last Time: computational complexity for Problems 1 and 2; Problem 3 – learning the model; backward computation; the Forward-Backward algorithm; NLP videos on YouTube, Coursera, etc.
Today: computational complexity of the Forward-Backward algorithm; the General Hidden Markov Model library (GHMM), http://ghmm.org/; transformation-based taggers (back to Section 5.6)
Ferguson’s 3 Fundamental Problems
Problem 1 (Computing Likelihood) – Given an HMM λ = (A, B) and an observation sequence O, determine the likelihood P(O | λ).
Problem 2 (Decoding) – Given an HMM λ = (A, B) and an observation sequence O = o1, o2, …, oT, find the most probable sequence of states Q = q1, q2, …, qT.
Problem 3 (Learning the Model) – Given an observation sequence O and the set of possible states in the HMM, learn the HMM parameters A and B.
Big-Oh of Forward-Backward Alg.
[Figure from Jurafsky & Martin, Speech and Language Processing, 2nd edition, Pearson, 2009.]
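Since the textbook figure is not reproduced here, a minimal Python sketch of the forward pass (my own illustration, using the lecture's A, B, π notation, not the GHMM implementation) makes the complexity visible: each of the T observations updates N states, and each update sums over N predecessors, so one pass is O(N²T); the backward pass and each Forward-Backward (Baum-Welch) iteration have the same cost.

# Forward algorithm sketch: alpha[t][j] = P(o_1..o_t, q_t = j | lambda)
# A: N x N transition matrix, B: N x V emission matrix, pi: length-N initial distribution
def forward(A, B, pi, obs):
    N = len(pi)
    T = len(obs)
    alpha = [[0.0] * N for _ in range(T)]
    for j in range(N):                      # initialization: O(N)
        alpha[0][j] = pi[j] * B[j][obs[0]]
    for t in range(1, T):                   # T - 1 time steps
        for j in range(N):                  # N current states
            total = 0.0
            for i in range(N):              # N predecessor states -> O(N^2 T) overall
                total += alpha[t - 1][i] * A[i][j]
            alpha[t][j] = total * B[j][obs[t]]
    return sum(alpha[T - 1])                # P(O | lambda)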
A Tutorial that Taught a Generation
"Hidden Markov Models, or HMMs, … The web is full of information on HMMs. … HMMs have been around for years, but they came into their own when they started to win the speech recognition game. Rabiner and his coauthors wrote several tutorial surveys which helped greatly to speed up the process. Rabiner's 1989 survey remains an outstanding introduction."
L. R. Rabiner (1989). "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proceedings of the IEEE, vol. 77, no. 2, 257–287.
"In finance and economics, HMMs are also known as regime switching models, …"
T. Ryden, T. Terasvirta, and S. Asbrink (1998). "Stylized Facts of Daily Return Series and the Hidden Markov Model," J. Applied Econometrics, 13, 217–244.
Source: http://www-stat.wharton.upenn.edu/~steele/Courses/956/Resource/HiddenMarkovModels.htm
GHMM – General HMM – a C library
"The General Hidden Markov Model library (GHMM) is a freely available C library implementing efficient data structures and algorithms for basic and extended HMMs with discrete and continuous emissions. It comes with Python wrappers which provide a much nicer interface and added functionality. The GHMM is licensed under the LGPL."
http://ghmm.org/
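A small sketch of how the GHMM Python wrapper is typically used, adapted from the style of the examples on ghmm.org; the transition/emission numbers here are made up for illustration, and method names may vary slightly between GHMM versions.

import ghmm

# A toy 2-state HMM over a binary emission alphabet {0, 1} (illustrative numbers only).
sigma = ghmm.IntegerRange(0, 2)              # emission alphabet
A  = [[0.9, 0.1], [0.3, 0.7]]                # transition probabilities
B  = [[0.5, 0.5], [0.8, 0.2]]                # emission probabilities
pi = [0.5, 0.5]                              # initial state distribution

m = ghmm.HMMFromMatrices(sigma, ghmm.DiscreteDistribution(sigma), A, B, pi)
obs = ghmm.EmissionSequence(sigma, [0, 1, 0, 0, 1, 1, 0])

print(m.loglikelihood(obs))   # Problem 1: likelihood of the observations
print(m.viterbi(obs))         # Problem 2: most probable state sequence
m.baumWelch(obs)              # Problem 3: re-estimate A and B from the data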
Information Theory – Entropy
We need a brief discussion of entropy to provide the foundation for the upcoming models: the Maximum Entropy Model (MEM) and the Maximum Entropy Markov Model (MEMM).
Information Theory Introduction (Section 4.10)
Entropy is a measure of the information in a message. Define a random variable X ranging over whatever we are predicting (words, characters, …), with probability function p(x). Then the entropy of X is
H(X) = - Σ_{x ∈ X} p(x) log2 p(x)
measured in bits when the logarithm is base 2.
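As an illustration of the definition (not part of the original slides), a tiny Python helper that computes H(X) from a table of probabilities:

import math

def entropy(probs):
    """Entropy in bits of a discrete distribution given as a list of probabilities."""
    return -sum(p * math.log(p, 2) for p in probs if p > 0)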
Horse Race Example of Entropy
There are 8 horses: H1, H2, …, H8. We want to send messages saying which horse to bet on in each race, using as few bits as possible. We could use the fixed-length bit sequences H1 = 000, H2 = 001, …, H8 = 111: three bits per bet.
But now given a random variable B
Assume our bets over the day are modeled by a random variable B following the distribution below (Horses 5–8 are each equally unlikely, so the probabilities sum to 1):

Horse     Probability we bet on it   log2(prob)
Horse 1   1/2                        log2(1/2)  = -1
Horse 2   1/4                        log2(1/4)  = -2
Horse 3   1/8                        log2(1/8)  = -3
Horse 4   1/16                       log2(1/16) = -4
Horse 5   1/64                       log2(1/64) = -6
Horse 6   1/64                       log2(1/64) = -6
Horse 7   1/64                       log2(1/64) = -6
Horse 8   1/64                       log2(1/64) = -6
Horse Race Example of Entropy (cont.)
Then the entropy is
H(B) = -(1/2)log2(1/2) - (1/4)log2(1/4) - (1/8)log2(1/8) - (1/16)log2(1/16) - 4·(1/64)log2(1/64) = 2 bits.
A variable-length code that uses shorter bit strings for the more probable horses:

Horse     Probability   Encoding bit string
Horse 1   1/2           0
Horse 2   1/4           10
Horse 3   1/8           110
Horse 4   1/16          1110
Horse 5   1/64          11110
Horse 6   1/64          111110
Horse 7   1/64          1111110
Horse 8   1/64          11111110
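A quick numerical check of the example (the code and variable names are my own, not from the slides): the entropy of the betting distribution is exactly 2 bits, and the variable-length code above averages just over 2 bits per bet, compared with 3 bits for the fixed-length encoding.

import math

probs = [1.0/2, 1.0/4, 1.0/8, 1.0/16, 1.0/64, 1.0/64, 1.0/64, 1.0/64]
codes = ["0", "10", "110", "1110", "11110", "111110", "1111110", "11111110"]

H = -sum(p * math.log(p, 2) for p in probs)              # entropy of the betting distribution
avg_len = sum(p * len(c) for p, c in zip(probs, codes))  # expected code length per bet

print(H)         # 2.0 bits
print(avg_len)   # 2.03125 bits, vs. 3 bits for the fixed-length code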
What if horses are equally likely?
If each of the 8 horses is equally likely, p(Horse i) = 1/8, then
H(B) = - Σ_{i=1}^{8} (1/8) log2(1/8) = log2 8 = 3 bits,
so the fixed 3-bit-per-horse encoding is already optimal.
Entropy of Sequences
For sequences (e.g., of words drawn from a language L), we look at the per-word entropy of a sequence of length n,
(1/n) H(w1, w2, …, wn) = -(1/n) Σ_{W1..Wn ∈ L} p(w1, …, wn) log2 p(w1, …, wn),
and define the entropy rate of the language as the limit of this quantity as n → ∞.
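A small sketch of the per-word quantity for the simplest case (my own illustration, with a made-up toy vocabulary): under a unigram model where words are independent, the entropy of an n-word sequence is n times the single-word entropy, so the per-word entropy equals the entropy of the unigram distribution.

import math
from itertools import product

# Toy vocabulary with unigram probabilities (illustrative numbers only).
p = {"the": 0.5, "cat": 0.25, "sat": 0.25}

def seq_prob(seq):
    """Probability of a word sequence under the independence (unigram) assumption."""
    prob = 1.0
    for w in seq:
        prob *= p[w]
    return prob

n = 3
# Per-word entropy of length-n sequences: -(1/n) * sum over all sequences of p * log2(p)
H_per_word = -sum(seq_prob(s) * math.log(seq_prob(s), 2)
                  for s in product(p, repeat=n)) / n

H_unigram = -sum(q * math.log(q, 2) for q in p.values())
print(H_per_word)   # 1.5 bits
print(H_unigram)    # 1.5 bits -- equal here, because the words are independent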