Online Learning for Latent Dirichlet Allocation
Matthew D. Hoffman, David M. Blei and Francis Bach
NIPS 2010
Presented by Lingbo Li
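Latent Dirichlet Allocation (LDA)
Draw each topic $\beta_k \sim \mathrm{Dirichlet}(\eta)$, for $k = 1, \dots, K$.
For each document $d$:
  Draw topic proportions $\theta_d \sim \mathrm{Dirichlet}(\alpha)$.
  For each word $i$:
    Draw a topic assignment $z_{di} \sim \mathrm{Mult}(\theta_d)$.
    Draw the word $w_{di} \sim \mathrm{Mult}(\beta_{z_{di}})$.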
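
The generative process on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the default values of `alpha` and `eta`, and the fixed document length are hypothetical, and Dirichlet draws are obtained by normalizing Gamma draws from the standard library.

```python
import random

def sample_dirichlet(concentration, rng):
    """Draw from a Dirichlet by normalizing independent Gamma(a, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(draws)
    return [x / total for x in draws]

def generate_corpus(num_docs, doc_length, num_topics, vocab_size,
                    alpha=0.5, eta=0.5, seed=0):
    """Sample topics and documents from the LDA generative process.

    alpha, eta, and the fixed doc_length are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Draw each topic: a distribution over the vocabulary.
    topics = [sample_dirichlet([eta] * vocab_size, rng)
              for _ in range(num_topics)]
    corpus = []
    for _ in range(num_docs):
        # Draw per-document topic proportions.
        theta = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(doc_length):
            # Draw a topic assignment, then a word from that topic.
            z = rng.choices(range(num_topics), weights=theta)[0]
            w = rng.choices(range(vocab_size), weights=topics[z])[0]
            words.append(w)
        corpus.append(words)
    return topics, corpus
```

Calling `generate_corpus(3, 5, 2, 4)` returns two topic distributions over a four-word vocabulary and three documents of five word ids each.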
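Batch variational Bayes for LDA
For a collection of documents, infer:
  Per-word topic assignments $z_{di}$
  Per-document topic proportions $\theta_d$
  Topic distributions $\beta_k$
The true posterior is approximated by a fully factorized distribution $q(z, \theta, \beta)$ with $q(z_{di} = k) = \phi_{d w_{di} k}$, $q(\theta_d) = \mathrm{Dirichlet}(\theta_d; \gamma_d)$, and $q(\beta_k) = \mathrm{Dirichlet}(\beta_k; \lambda_k)$.
Optimize the evidence lower bound over the variational parameters $\phi$, $\gamma$, and $\lambda$.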
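
As a rough sketch of the per-document part of this optimization, the code below alternates the standard coordinate updates for one document: the word-level responsibilities `phi` (proportional to exp of the expected log topic proportion plus the expected log word probability) and the Dirichlet parameter `gamma` (the prior `alpha` plus the expected topic counts), with the topic parameters `lam` held fixed. The function names, the dictionary representation of a document, and the pure-Python digamma approximation are assumptions made for illustration.

```python
import math

def digamma(x):
    """Approximate digamma for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return result + math.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))

def dirichlet_expected_log(params):
    """E[log x_i] for x ~ Dirichlet(params)."""
    total = digamma(sum(params))
    return [digamma(p) - total for p in params]

def e_step(word_counts, lam, alpha, max_iters=100, tol=1e-5):
    """Fit gamma and phi for one document with topics held fixed.

    word_counts: dict mapping word id -> count in the document.
    lam: list of K lists, the Dirichlet parameters of each topic.
    """
    num_topics = len(lam)
    elog_beta = [dirichlet_expected_log(row) for row in lam]
    gamma = [1.0] * num_topics
    phi = {}
    for _ in range(max_iters):
        elog_theta = dirichlet_expected_log(gamma)
        for w in word_counts:
            logits = [elog_theta[k] + elog_beta[k][w] for k in range(num_topics)]
            peak = max(logits)  # subtract the max for numerical stability
            weights = [math.exp(v - peak) for v in logits]
            norm = sum(weights)
            phi[w] = [v / norm for v in weights]
        new_gamma = [alpha + sum(n * phi[w][k] for w, n in word_counts.items())
                     for k in range(num_topics)]
        change = sum(abs(a - b) for a, b in zip(new_gamma, gamma)) / num_topics
        gamma = new_gamma
        if change < tol:
            break
    return gamma, phi
```

In a batch algorithm this step would be run for every document before the topic parameters are re-estimated from the accumulated `phi` values.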
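Online variational inference for LDA
Mini-batches: each update of $\lambda$ uses $S > 1$ documents rather than a single one, averaging their contributions.
Hyperparameter estimation: $\alpha$ and $\eta$ can be updated online with the same decreasing step size $\rho_t$.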
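
A minimal sketch of the mini-batch update of the topic parameters, assuming the form used in the paper: a step size $\rho_t = (\tau_0 + t)^{-\kappa}$ and a weighted average between the current $\lambda$ and an estimate $\tilde{\lambda}$ computed as if the mini-batch were replicated to fill the whole corpus. The function names, argument names, and default values of `tau0` and `kappa` are hypothetical.

```python
def step_size(t, tau0=1.0, kappa=0.7):
    """Decreasing step size rho_t = (tau0 + t) ** (-kappa)."""
    return (tau0 + t) ** (-kappa)

def online_lambda_update(lam, batch_stats, num_docs_total, batch_size, eta, rho):
    """Blend the current topics with the mini-batch estimate.

    lam[k][w]: current Dirichlet parameter for topic k, word w.
    batch_stats[k][w]: sum over the mini-batch documents of
        (count of word w) * (responsibility of topic k for word w).
    """
    scale = num_docs_total / batch_size
    updated = []
    for lam_k, stats_k in zip(lam, batch_stats):
        row = []
        for lam_kw, stat_kw in zip(lam_k, stats_k):
            lam_tilde = eta + scale * stat_kw          # mini-batch estimate
            row.append((1.0 - rho) * lam_kw + rho * lam_tilde)
        updated.append(row)
    return updated
```

With a batch size of 1 this reduces to the single-document update; larger batches reduce the noise in each step at the cost of fewer updates per document seen.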
Analysis of convergence
Analysis of convergence
Stochastic gradient algorithms can be sped up by multiplying the gradient by the inverse of an appropriate positive definite matrix $H$.
Here $H$ is the Fisher information matrix of the variational distribution $q$.
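
The preconditioned step can be written out as follows. This is a sketch that follows the notation of the paper rather than anything shown on the slide: $\ell$ is the per-document contribution to the variational objective, $n_t$ the word counts of the document seen at step $t$, $D$ the corpus size, and $\tau_0$, $\kappa$ the step-size parameters, all of which are assumptions here.

```latex
\begin{align}
\lambda_{t+1} &= \lambda_t + \rho_t \, H^{-1} \nabla_{\lambda}\,
                 \ell(n_t, \gamma_t, \phi_t, \lambda_t), \\
\rho_t &= (\tau_0 + t)^{-\kappa}, \qquad
\sum_{t} \rho_t = \infty, \qquad \sum_{t} \rho_t^2 < \infty .
\end{align}
% With H taken to be the Fisher information of q, and the step
% scaled by the corpus size D, the update reduces to a weighted average:
\begin{equation}
\lambda_{kw} \leftarrow (1 - \rho_t)\,\lambda_{kw}
  + \rho_t \left( \eta + D\, n_{tw}\, \phi_{twk} \right).
\end{equation}
```

The two summation conditions are the usual requirements on the step sizes for stochastic approximation; choosing $\kappa \in (0.5, 1]$ satisfies both.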
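Experiments
Use perplexity on held-out data as a measure of model fit.
The per-document variational parameters $\gamma_i$ and $\phi_i$ are fit using the E step in Algorithm 2; the topics $\lambda$ are fit on training documents and then held fixed.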
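
Perplexity is the exponentiated negative average log probability per held-out word. A small sketch, assuming the per-document log-likelihoods (or variational lower bounds on them) have already been computed; the function and argument names are hypothetical.

```python
import math

def perplexity(doc_log_likelihoods, doc_token_counts):
    """exp(-(total held-out log-likelihood) / (total held-out tokens)).

    If lower bounds on the log-likelihoods are passed in, the result is an
    upper bound on the true perplexity.
    """
    total_tokens = sum(doc_token_counts)
    if total_tokens <= 0:
        raise ValueError("held-out set must contain at least one token")
    return math.exp(-sum(doc_log_likelihoods) / total_tokens)
```

As a sanity check, a model that assigns every word probability 1/8 has perplexity 8 regardless of document lengths; lower values indicate a better fit.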
Evaluating learning parameters
Two corpora: 352,549 documents from the journal Nature, and 100,000 documents from the English version of Wikipedia.
For each corpus, set aside a 1,000-document test set and a separate 1,000-document validation set.
Run online LDA for five hours on the remaining documents from each corpus for a range of learning-parameter settings.
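
The held-out protocol can be sketched as a simple three-way split. The slide does not say how the test and validation documents were chosen, so the random shuffle, the seed, and the function name below are assumptions.

```python
import random

def split_corpus(doc_ids, test_size=1000, validation_size=1000, seed=0):
    """Return (train, validation, test) lists of disjoint document ids.

    Random selection is an assumption; only the set sizes come from the slide.
    """
    ids = list(doc_ids)
    if len(ids) < test_size + validation_size:
        raise ValueError("corpus is smaller than the held-out sets")
    random.Random(seed).shuffle(ids)
    test = ids[:test_size]
    validation = ids[test_size:test_size + validation_size]
    train = ids[test_size + validation_size:]
    return train, validation, test
```

For the 100,000-document Wikipedia corpus this leaves 98,000 documents for training.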
Compare batch and online on fixed corpora:
True online