Lecture 9 The GHMM Library and The Brill Tagger


1 Lecture 9 The GHMM Library and The Brill Tagger
CSCE Natural Language Processing

Topics:
- GHMM library in C – integrating with Python
- Brill tagger
- HMM assignment

Readings: Section 5.6 – transformation-based tagging

February 13, 2013

2 Overview
Last Time:
- Computational complexity for Problems 1 and 2
- Problem 3 – learning the model
- Backward computation
- Forward-Backward algorithm
- Videos on NLP on YouTube, from Coursera, etc.

Today:
- Computational complexity of the Forward-Backward algorithm
- General Hidden Markov Model library (GHMM)
- Transformational taggers (back to Section 5.6)
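Ahead of the Section 5.6 reading, the core idea of a transformational (Brill-style) tagger can be sketched in a few lines: start from a most-frequent-tag baseline, then apply an ordered list of rewrite rules of the form "change tag A to B when the previous tag is C". The words, tags, and the single rule below are illustrative, not learned from a corpus.

```python
# Toy sketch of transformation-based (Brill-style) tagging.
# Baseline: tag every word with its most frequent tag, then repair
# errors with ordered contextual rewrite rules.

baseline = {"to": "TO", "race": "NN", "the": "DT"}   # most-frequent tags

# (from_tag, to_tag, required_previous_tag), applied in order
rules = [("NN", "VB", "TO")]

def brill_tag(words):
    tags = [baseline.get(w, "NN") for w in words]    # initial tagging
    for frm, to, prev in rules:                      # apply each rule in turn
        for i in range(1, len(tags)):
            if tags[i] == frm and tags[i - 1] == prev:
                tags[i] = to
    return tags
```

With this single rule, "race" is retagged as a verb after "to" but stays a noun after "the": `brill_tag(["to", "race"])` gives `["TO", "VB"]`, while `brill_tag(["the", "race"])` gives `["DT", "NN"]`.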

3 Ferguson’s 3 Fundamental Problems
1. Computing Likelihood – Given an HMM λ = (A, B) and an observation sequence O, determine the likelihood P(O | λ).
2. The Decoding Problem – Given an HMM λ = (A, B) and an observation sequence O = o1, o2, … oT, find the most probable sequence of states Q = q1, q2, … qT.
3. Learning the Model (HMM) – Given an observation sequence and the set of possible states in the HMM, learn the parameters A and B.

4 Big-Oh of Forward-Backward Alg.
Speech and Language Processing, Second Edition Daniel Jurafsky and James H. Martin Copyright ©2009 by Pearson Education, Inc. Upper Saddle River, New Jersey 07458 All rights reserved.
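The slide's figure itself is not reproduced here, but the source of the cost is easy to see in code: the forward pass has two nested loops over the N states inside a loop over the T time steps, giving O(N²T). A minimal sketch on the same toy-model conventions as above (illustrative names, not library code):

```python
# Forward algorithm for P(O | lambda).  The sum over i (N terms) computed
# for each of N states j, at each of T-1 time steps, is the O(N^2 * T) cost.

def forward(A, B, pi, obs):
    N, T = len(pi), len(obs)
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]   # initialization
    for t in range(1, T):                              # T - 1 steps
        alpha = [sum(alpha[i] * A[i][j] for i in range(N))  # N terms ...
                 * B[j][obs[t]]
                 for j in range(N)]                    # ... per N states
    return sum(alpha)                                  # P(O | lambda)
```

The backward pass has the same shape, so the full Forward-Backward algorithm remains O(N²T) per iteration.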

5 A Tutorial that Taught a Generation
Hidden Markov Models, or HMMs, … The web is full of information on HMMs. … HMMs have been around for years, but they came into their own when they started to win the speech recognition game. Rabiner and his coauthors wrote several tutorial surveys which helped greatly to speed up the process; Rabiner's 1989 survey remains an outstanding introduction:

L. R. Rabiner (1989). "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proceedings of the IEEE, vol. 77, no. 2.

In finance and economics, HMMs are also known as regime-switching models:

T. Ryden, T. Terasvirta, and S. Asbrink (1998). "Stylized Facts of Daily Return Series and the Hidden Markov Model," J. Applied Econometrics, 13.

6 GHMM – General HMM – a C library
“The General Hidden Markov Model library (GHMM) is a freely available C library implementing efficient data structures and algorithms for basic and extended HMMs with discrete and continuous emissions. It comes with Python wrappers which provide a much nicer interface and added functionality. The GHMM is licensed under the LGPL.” (ghmm.org)

7 Information theory - Entropy
We need a discussion of entropy to provide a foundation for the next models:
- Maximum Entropy Model (MEM)
- Maximum Entropy Markov Model (MEMM)

8 Information Theory Introduction (4.10)
Entropy is a measure of the information in a message. Define a random variable X over whatever we are predicting (words, characters, or …); then the entropy of X is given by

H(X) = - Σx p(x) log2 p(x)
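The definition translates directly into code; a minimal sketch (the function name is ours):

```python
# Entropy H(X) = - sum over x of p(x) * log2 p(x), in bits.

from math import log2

def entropy(probs):
    # Skip zero-probability outcomes: p * log2(p) -> 0 as p -> 0.
    return -sum(p * log2(p) for p in probs if p > 0)
```

For instance, a uniform distribution over 8 outcomes has entropy 3 bits, matching the horse-race example that follows.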

9 Horse Race Example of Entropy
8 horses: H1, H2, … H8. We want to send messages saying which horse to bet on in a race, using as few bits as possible. We could use the fixed-length bit sequences H1 = 000, H2 = 001, … H8 = 111: three bits per bet.

10 But now given a random variable B
Assume our bets over the day are modelled by a random variable B, following the distribution:

Horse     Probability   log2(prob)
Horse 1   1/2           log2(1/2)  = -1
Horse 2   1/4           log2(1/4)  = -2
Horse 3   1/8           log2(1/8)  = -3
Horse 4   1/16          log2(1/16) = -4
Horse 5   1/64          log2(1/64) = -6
Horse 6   1/64          log2(1/64) = -6
Horse 7   1/64          log2(1/64) = -6
Horse 8   1/64          log2(1/64) = -6

11 Horse Race Example of Entropy (cont.)
Then the entropy is

H(B) = - Σi p(i) log2 p(i) = 1/2·1 + 1/4·2 + 1/8·3 + 1/16·4 + 4·(1/64·6) = 2 bits

and a variable-length prefix code achieves it:

Horse     Probability   Encoding bit string
Horse 1   1/2           0
Horse 2   1/4           10
Horse 3   1/8           110
Horse 4   1/16          1110
Horse 5   1/64          111100
Horse 6   1/64          111101
Horse 7   1/64          111110
Horse 8   1/64          111111
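The point of the example can be checked numerically: with the standard prefix code for this distribution (0, 10, 110, 1110, and four 6-bit code words for the four 1/64 horses, as in Jurafsky and Martin's version of the example), the expected code length equals the entropy, 2 bits per bet instead of 3.

```python
# Expected code length of the variable-length code vs. the entropy of B.

from math import log2

probs = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]
codes = ["0", "10", "110", "1110", "111100", "111101", "111110", "111111"]

avg_len = sum(p * len(c) for p, c in zip(probs, codes))  # expected bits/bet
H = -sum(p * log2(p) for p in probs)                     # entropy of B
```

Both `avg_len` and `H` come out to exactly 2.0 bits, one bit per bet less than the fixed-length code.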

12 What if horses are equally likely?
If all 8 horses are equally likely (probability 1/8 each), then H(B) = -8 · (1/8) · log2(1/8) = 3 bits, so the fixed-length 3-bit code is already optimal.

13 Entropy of Sequences

