Hidden Topic Markov Models
Amit Gruber, Michal Rosen-Zvi and Yair Weiss, AISTATS 2007
Discussion led by Chunping Wang, ECE, Duke University, March 2, 2009

Outline
– Motivations
– Related Topic Models
– Hidden Topic Markov Models
– Inference
– Experiments
– Conclusions

Motivations
– Feature reduction: represent extensively large text corpora with a small number of variables.
– Topical segmentation: segment a document according to its hidden topics.
– Word sense disambiguation: distinguish between different instances of the same word according to their context.

Related Topic Models
LDA (JMLR 2003)
1. For $k = 1, \dots, K$, draw $\beta_k \sim \mathrm{Dir}(\eta)$.
2. For $d = 1, \dots, D$,
   (a) draw $\theta_d \sim \mathrm{Dir}(\alpha)$;
   (b) for $n = 1, \dots, N_d$, draw $z_{dn} \sim \mathrm{Mult}(\theta_d)$;
   (c) for $n = 1, \dots, N_d$, draw $w_{dn} \sim \mathrm{Mult}(\beta_{z_{dn}})$.
Words in a document are exchangeable; documents are also exchangeable.
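To make the generative process concrete, here is a minimal sketch of sampling from the smoothed LDA model above. All dimensions, hyperparameter values and document lengths (K, D, V, alpha, eta, N_d) are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
K, D, V = 5, 10, 50           # topics, documents, vocabulary size (illustrative)
alpha, eta = 0.1, 0.01        # Dirichlet hyperparameters (illustrative)

# 1. For each topic k, draw a topic-word distribution beta_k ~ Dir(eta).
beta = rng.dirichlet(np.full(V, eta), size=K)     # shape (K, V)

docs = []
for d in range(D):
    # 2(a). Draw the document's topic proportions theta_d ~ Dir(alpha).
    theta = rng.dirichlet(np.full(K, alpha))
    N_d = 20                                      # document length (illustrative)
    # 2(b). Draw a topic z_n for every word position independently...
    z = rng.choice(K, size=N_d, p=theta)
    # 2(c). ...then draw each word w_n from its topic's word distribution.
    w = np.array([rng.choice(V, p=beta[zn]) for zn in z])
    docs.append(w)
```

Because every word's topic is drawn independently given $\theta_d$, reordering the words (or the documents) leaves the likelihood unchanged, which is exactly the exchangeability noted above.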

Related Topic Models Dynamic Topic Models (ICML 2006) Words in a document are exchangeable; documents are not exchangeable.

Related Topic Models Topic Modeling: Beyond Bag of Words (ICML 2006) Words in a document are not exchangeable; documents are exchangeable.

Related Topic Models
Integrating Topics and Syntax (NIPS 2005)
Words in a document are not exchangeable; documents are exchangeable.
[Diagram: LDA generates the semantic words; an HMM generates the non-semantic (syntactic) words.]

Hidden Topic Markov Models
No topic transition is allowed within a sentence. Whenever a new sentence starts, either the old topic is kept, or a new topic is drawn according to $\theta_d$.

Hidden Topic Markov Models
Viewed as a hidden Markov model over sentences:
– Transition matrix within a sentence, or between two sentences with no topic transition (probability $1-\epsilon$): the identity matrix $I$.
– Transition matrix between two sentences when a transition occurs (probability $\epsilon$): every row equals $\theta_d$.
– Emission matrix: $\beta$ (the topic-word distributions).
– Initial state distribution: $\theta_d$.
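Putting the two slides together, a minimal sketch of the HTMM generative process might look as follows. The sentence counts, sentence lengths and hyperparameter values are illustrative assumptions; only the per-sentence topic dynamics follow the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
K, V = 5, 50                       # topics, vocabulary size (illustrative)
epsilon = 0.3                      # prob. of a topic transition at a sentence start
beta = rng.dirichlet(np.full(V, 0.01), size=K)   # topic-word distributions

def sample_htmm_doc(n_sentences=8, words_per_sentence=6):
    theta = rng.dirichlet(np.full(K, 0.1))       # document topic proportions
    doc, z = [], None
    for s in range(n_sentences):
        # The first sentence always draws a topic; each later sentence
        # redraws a topic from theta only with probability epsilon,
        # otherwise it keeps the previous sentence's topic.
        if s == 0 or rng.random() < epsilon:
            z = rng.choice(K, p=theta)
        # No topic transition within a sentence: all words share z.
        words = rng.choice(V, size=words_per_sentence, p=beta[z])
        doc.append((z, words))
    return doc
```

Setting epsilon = 1 recovers a "bag of sentences" (every sentence redraws its topic), while epsilon = 0 gives a single topic per document.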

Inference
EM algorithm:
– E-step: compute the posterior distribution over the latent topics and transition indicators, $p(z_n, \psi_n \mid w_{1:N_d})$, using the forward-backward algorithm.
– M-step: update $\theta_d$, $\beta$ and $\epsilon$ by maximizing the expected complete-data log-likelihood.
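As an illustration of the E-step, here is a sketch of a scaled forward-backward pass over the sentences of one document. For brevity it marginalizes the transition indicator $\psi$ into a single $K \times K$ sentence-level transition matrix $A = (1-\epsilon)I + \epsilon \, \mathbf{1}\theta^{\top}$, which is a simplification of the paper's formulation; all function and variable names are mine, not the authors'.

```python
import numpy as np

def forward_backward(sentences, theta, beta, epsilon):
    """Posterior p(z_s | document) for each sentence s (E-step sketch).

    sentences : list of integer arrays of word ids
    theta     : (K,) document topic proportions
    beta      : (K, V) topic-word distributions
    """
    K = len(theta)
    # Keep the topic with prob. 1 - epsilon, otherwise redraw from theta.
    A = (1 - epsilon) * np.eye(K) + epsilon * np.outer(np.ones(K), theta)
    # Emission likelihood of each whole sentence under each topic:
    # the product of its word probabilities beta[k, w].
    B = np.array([[np.prod(beta[k][s]) for k in range(K)]
                  for s in sentences])                     # shape (S, K)
    S = len(sentences)
    fwd = np.zeros((S, K))
    bwd = np.ones((S, K))
    fwd[0] = theta * B[0]
    fwd[0] /= fwd[0].sum()                                 # scale against underflow
    for s in range(1, S):
        fwd[s] = (fwd[s - 1] @ A) * B[s]
        fwd[s] /= fwd[s].sum()
    for s in range(S - 2, -1, -1):
        bwd[s] = A @ (B[s + 1] * bwd[s + 1])
        bwd[s] /= bwd[s].sum()
    gamma = fwd * bwd
    return gamma / gamma.sum(axis=1, keepdims=True)
```

Here gamma[s, k] approximates $p(z_s = k \mid w_{1:N_d})$; the pairwise sentence statistics needed to update $\epsilon$ in the M-step would come from the standard $\xi$ quantities, omitted from this sketch.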

Experiments
NIPS dataset (1740 documents: 1557 for training, 183 for testing)
– Data preprocessing: extract the words in the vocabulary (J = 12113, stop words removed); divide the text into sentences according to the delimiters ".?!;".
– Compare LDA, HTMM and VHTMM1 in terms of perplexity. VHTMM1 is a variant of HTMM with $\epsilon = 1$, i.e., a "bag of sentences". $N_{\text{test}}$: the total length of the test document; $N$: the first $N$ words of the document are observed (average $N_{\text{test}} = 1300$). The perplexity measure is written out below.
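For reference, the per-word perplexity of the unobserved portion of a test document, written out from the description above (my formulation of the standard measure, not quoted from the slides):

```latex
\mathrm{perplexity}
  = \exp\!\left(
      -\,\frac{\log p\!\left(w_{N+1}, \dots, w_{N_{\mathrm{test}}} \mid w_1, \dots, w_N\right)}
              {N_{\mathrm{test}} - N}
    \right)
```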

Experiments
[Figures: perplexity vs. N for K = 100, and perplexity vs. K for N = 10.]
The lower the perplexity, the better the model is at predicting unseen words.

Experiments
– Topical segmentation
[Figure: topic segmentation of a sample document under HTMM vs. LDA.]

Experiments
– Top words of topics
[Figure: top words of corresponding HTMM and LDA topics, e.g., the "math", "acknowledgments" and "reference" topics.]

Experiments As more topics are available, the topics become more specific and topic transitions are more frequent.

Experiments
Two toy datasets, generated using HTMM and LDA. Goal: rule out the possibility that the perplexity of HTMM is lower than that of LDA only because HTMM has fewer degrees of freedom. With toy datasets, other criteria can be used for comparison.

Conclusions
– HTMM is another extension of LDA; it relaxes the "bag-of-words" assumption by modeling the topic dynamics with a Markov chain.
– This extension leads to a significant improvement in perplexity, and makes additional inferences possible, such as topical segmentation and word sense disambiguation.
– It requires more storage, since the entire document must be the input to the algorithm.
– It applies only to structured data in which sentences are well defined.