Published by Brandon Warren. Modified over 9 years ago.
1
Online Learning for Latent Dirichlet Allocation
Matthew D. Hoffman, David M. Blei and Francis Bach
NIPS 2010
Presented by Lingbo Li
2
Latent Dirichlet Allocation (LDA)
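Draw each topic $\beta_k \sim \mathrm{Dirichlet}(\eta)$, for $k = 1, \dots, K$.
For each document $d$:
  Draw topic proportions $\theta_d \sim \mathrm{Dirichlet}(\alpha)$.
  For each word $i$:
    Draw a topic assignment $z_{di} \sim \mathrm{Multinomial}(\theta_d)$.
    Draw the word $w_{di} \sim \mathrm{Multinomial}(\beta_{z_{di}})$.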
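The generative process on this slide can be sketched as a small sampler. This is only an illustration using the Python standard library. The function names, the symmetric hyperparameters `alpha` and `eta`, and the fixed document length are assumptions made for the example, not details taken from the slides.

```python
import random


def sample_dirichlet(concentration, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(draws)
    return [x / total for x in draws]


def sample_categorical(probs, rng):
    """Draw one index with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_corpus(num_topics, vocab_size, num_docs, doc_length,
                    alpha=0.5, eta=0.5, seed=0):
    """Sample topics and documents from the LDA generative process.

    alpha and eta are symmetric Dirichlet hyperparameters (assumed values).
    Words are represented by integer ids in [0, vocab_size).
    """
    rng = random.Random(seed)
    # Draw each topic: a distribution over the vocabulary.
    topics = [sample_dirichlet([eta] * vocab_size, rng)
              for _ in range(num_topics)]
    docs = []
    for _ in range(num_docs):
        # Draw this document's topic proportions.
        theta = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(doc_length):
            z = sample_categorical(theta, rng)                # topic assignment
            words.append(sample_categorical(topics[z], rng))  # observed word
        docs.append(words)
    return topics, docs
```

Calling `generate_corpus(3, 20, 5, 12)` returns three topic distributions over a 20-word vocabulary and five documents of 12 word ids each.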
3
Batch variational Bayes for LDA
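For a collection of documents, infer:
  Per-word topic assignments $z_{di}$
  Per-document topic proportions $\theta_d$
  Topic distributions $\beta_k$
The true posterior is approximated by a fully factorized distribution $q$ with $q(z_{di} = k) = \phi_{d w_{di} k}$, $q(\theta_d) = \mathrm{Dirichlet}(\theta_d; \gamma_d)$, $q(\beta_k) = \mathrm{Dirichlet}(\beta_k; \lambda_k)$.
Optimize over the variational parameters $\phi$, $\gamma$, $\lambda$.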
6
Online variational inference for LDA
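Mini-batches: use several documents per update instead of one, to reduce noise.
Hyperparameter estimation: the Dirichlet hyperparameters $\alpha$ and $\eta$ can also be updated online.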
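A minimal sketch of one mini-batch update of the topic parameters. It assumes the usual form of the online step: a weighted average of the current value and a mini-batch estimate rescaled to the full corpus size, with step size $\rho_t = (\tau_0 + t)^{-\kappa}$. The function names, the default values, and the precomputed `batch_stats` input are assumptions for illustration. The E step that would produce those statistics is not shown.

```python
def step_size(t, tau0=1.0, kappa=0.7):
    """Step size rho_t = (tau0 + t) ** (-kappa); the defaults are assumed values."""
    return (tau0 + t) ** (-kappa)


def online_lambda_update(lam, batch_stats, num_docs_total, batch_size, t,
                         eta=0.01, tau0=1.0, kappa=0.7):
    """Blend the current topic parameters with a mini-batch estimate.

    lam[k][w]         current variational parameter for topic k, word w
    batch_stats[k][w] expected count of word w assigned to topic k,
                      summed over the documents in the mini-batch
    """
    rho = step_size(t, tau0, kappa)
    scale = num_docs_total / batch_size  # pretend the batch is the whole corpus
    updated = []
    for lam_k, stats_k in zip(lam, batch_stats):
        row = []
        for lam_kw, s_kw in zip(lam_k, stats_k):
            lam_tilde = eta + scale * s_kw  # estimate from this mini-batch alone
            row.append((1.0 - rho) * lam_kw + rho * lam_tilde)
        updated.append(row)
    return updated
```

A larger `kappa` makes the step size decay faster, so old information is retained longer. A larger `tau0` down-weights the earliest iterations.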
7
Analysis of convergence
8
Analysis of convergence
To speed up stochastic gradient algorithms, multiply the gradients by the inverse of an appropriate positive definite matrix H.
H: the Fisher information matrix of the variational distribution q, which gives the natural gradient.
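The effect of multiplying by the inverse of H can be illustrated on a toy problem. The sketch below assumes a diagonal H, so the inverse is an elementwise division, and uses a badly scaled quadratic objective. Both are simplifications chosen for the example and are not the Fisher information matrix of the LDA variational distribution.

```python
def preconditioned_ascent_step(x, grad, h_diag, rho):
    """One ascent step x + rho * H^{-1} grad, with H diagonal (assumed)."""
    return [xi + rho * gi / hi for xi, gi, hi in zip(x, grad, h_diag)]


def toy_gradient(x):
    """Gradient of the toy objective f(x) = -0.5 * (100 * x0**2 + x1**2)."""
    return [-100.0 * x[0], -1.0 * x[1]]


x = [1.0, 1.0]
h_diag = [100.0, 1.0]  # curvature of the toy objective along each coordinate
x_next = preconditioned_ascent_step(x, toy_gradient(x), h_diag, rho=1.0)
```

With this H a single step of size 1 lands on the maximizer at the origin. An unpreconditioned step would need a step size below 0.02 to stay stable along the first coordinate, which makes progress along the second coordinate slow.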
9
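Experiments
Use perplexity on held-out data as a measure of model fit:
$\mathrm{perplexity}(n^{\mathrm{test}}, \lambda, \alpha) = \exp\left\{ -\left(\sum_i \log p(n_i^{\mathrm{test}} \mid \alpha, \beta)\right) / \left(\sum_{i,w} n_{iw}^{\mathrm{test}}\right) \right\}$
The per-document variational parameters $\gamma_i$ and $\phi_i$ are fit using the E step in Algorithm 2.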
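Once a log-likelihood (or a lower bound on it) is available for each held-out document, perplexity is a short computation. A minimal sketch, assuming those per-document values have already been computed elsewhere. The function and argument names are hypothetical.

```python
import math


def perplexity(doc_log_likelihoods, doc_token_counts):
    """exp of the negative total log-likelihood per held-out token.

    doc_log_likelihoods[i] log-probability (or a lower bound) of test document i
    doc_token_counts[i]    number of word tokens in test document i
    """
    total_tokens = sum(doc_token_counts)
    if total_tokens <= 0:
        raise ValueError("need at least one held-out token")
    return math.exp(-sum(doc_log_likelihoods) / total_tokens)
```

As a sanity check, a model that assigns probability 1/8 to every token has perplexity 8 regardless of how the tokens are split across documents. Lower perplexity indicates a better fit. If a lower bound on the log-likelihood is used, the result is an upper bound on the true perplexity.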
10
Evaluating learning parameters
Two corpora: 352,549 documents from the journal Nature, and 100,000 documents from the English version of Wikipedia.
For each corpus, set aside a 1,000-document test set and a separate 1,000-document validation set.
Run online LDA for five hours on the remaining documents from each corpus for a range of learning parameter settings.
11
Compare batch and online on fixed corpora:
12
True online