
A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation. Yee W. Teh, David Newman and Max Welling. Published at NIPS 2006. Discussion led by Iulian Pruteanu.

Outline
- Introduction
- Approximate inferences for LDA
- Collapsed VB inference for LDA
- Experimental results
- Conclusions

Introduction (1/2) Latent Dirichlet Allocation is suitable for many applications, from document modeling to computer vision. Collapsed Gibbs sampling seems to be the preferred choice for large-scale problems; however, collapsed Gibbs sampling has its own problems. The CVB algorithm, which makes use of some approximations, is easy to implement and more accurate than standard VB.

Introduction (2/2) This paper
- proposes an improved VB algorithm based on integrating out the model parameters (assumption: the latent variables are mutually independent)
- uses a Gaussian approximation for computational efficiency

Approximate inferences for LDA (1/3) Notation: x: observed words; z: latent variables (topic indices); θ: mixing proportions (per-document topic distributions); φ: topic parameters (per-topic word distributions); D: number of documents; K: number of topics.
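In this notation, the joint distribution defined by LDA can be written in the standard form below (a sketch; \alpha and \beta denote the symmetric Dirichlet hyperparameters on \theta and \phi, and n_j the number of words in document j):

p(x, z, \theta, \phi \mid \alpha, \beta) = \prod_{k=1}^{K} p(\phi_k \mid \beta) \prod_{j=1}^{D} p(\theta_j \mid \alpha) \prod_{i=1}^{n_j} p(z_{ij} \mid \theta_j)\, p(x_{ij} \mid \phi_{z_{ij}}),

where p(\theta_j \mid \alpha) and p(\phi_k \mid \beta) are Dirichlet distributions.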

Approximate inferences for LDA (2/3) Given the observed words x, the task of Bayesian inference is to compute the posterior distribution over the latent variables z, the mixing proportions θ, and the topic parameters φ. 1. Variational Bayes: approximate the posterior with a fully factorized distribution q(z) q(θ) q(φ) and optimize it to maximize a lower bound on the marginal log likelihood.
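As a sketch in the notation above, the mean-field factorization and the lower bound that standard VB optimizes can be written as

\hat q(z, \theta, \phi) = \prod_{ij} \hat q(z_{ij}) \prod_{j} \hat q(\theta_j) \prod_{k} \hat q(\phi_k), \qquad \log p(x \mid \alpha, \beta) \ge \mathbb{E}_{\hat q}\!\left[\log p(x, z, \theta, \phi \mid \alpha, \beta)\right] + \mathcal{H}[\hat q],

where \mathcal{H}[\hat q] is the entropy of \hat q; the parameters of each factor are updated in turn to tighten the bound.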

Approximate inferences for LDA (3/3) 2. Collapsed Gibbs sampling: integrate out the mixing proportions θ and topic parameters φ analytically, and sample each topic assignment z_ij from its conditional distribution given all the other assignments.
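With symmetric hyperparameters \alpha and \beta, that conditional takes the standard form

p(z_{ij} = k \mid z^{\neg ij}, x) \propto \frac{\left(\alpha + n^{\neg ij}_{jk}\right)\left(\beta + n^{\neg ij}_{k x_{ij}}\right)}{W\beta + n^{\neg ij}_{k}},

where W is the vocabulary size, n_{jk} is the number of tokens in document j assigned to topic k, n_{kw} is the number of times word w is assigned to topic k, n_k = \sum_w n_{kw}, and the superscript \neg ij means the counts are computed with token ij removed.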

Collapsed VB inference for LDA and marginalization over model parameters. In the variational Bayesian approximation we assume a factorized form for the approximating posterior distribution. However, this is not a good assumption, since changes in the model parameters (θ, φ) have a considerable impact on the latent variables (z). CVB is equivalent to marginalizing out the model parameters before approximating the posterior over the latent variables: only q(z) is assumed to factorize, while the dependence of θ and φ on z is handled exactly. The exact implementation of CVB has a closed form but is computationally too expensive to be practical. Therefore, the authors propose a simple Gaussian approximation, which seems to work very accurately.
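A minimal NumPy sketch of one sweep of the resulting CVB update (the zeroth-order term multiplied by the second-order Gaussian correction) is given below; the function and variable names are illustrative and not taken from the authors' implementation.

import numpy as np

def cvb_sweep(gamma, docs, alpha, beta, W):
    # gamma : list over documents of arrays (N_j, K); gamma[j][i, k] = q(z_ij = k)
    # docs  : list over documents of int arrays (N_j,); docs[j][i] = word id x_ij
    # alpha, beta : symmetric Dirichlet hyperparameters; W : vocabulary size
    K = gamma[0].shape[1]

    # Expected counts and variances under q. Each count is a sum of independent
    # Bernoulli indicators, so E[n] = sum(gamma) and Var[n] = sum(gamma * (1 - gamma)).
    E_njk = [g.sum(axis=0) for g in gamma]               # document-topic counts
    V_njk = [(g * (1.0 - g)).sum(axis=0) for g in gamma]
    E_nkw = np.zeros((K, W)); V_nkw = np.zeros((K, W))   # topic-word counts
    for g, d in zip(gamma, docs):
        np.add.at(E_nkw.T, d, g)
        np.add.at(V_nkw.T, d, g * (1.0 - g))
    E_nk = E_nkw.sum(axis=1); V_nk = V_nkw.sum(axis=1)   # total topic counts

    for j, (g, d) in enumerate(zip(gamma, docs)):
        for i, w in enumerate(d):
            q_old = g[i].copy()
            # Remove token ij from the expected counts and variances.
            E_njk[j] -= q_old;     V_njk[j] -= q_old * (1.0 - q_old)
            E_nkw[:, w] -= q_old;  V_nkw[:, w] -= q_old * (1.0 - q_old)
            E_nk -= q_old;         V_nk -= q_old * (1.0 - q_old)

            a = alpha + E_njk[j]
            b = beta + E_nkw[:, w]
            c = W * beta + E_nk
            # Zeroth-order CVB update times the second-order (Gaussian) correction.
            q_new = (a * b / c) * np.exp(-V_njk[j] / (2.0 * a ** 2)
                                         - V_nkw[:, w] / (2.0 * b ** 2)
                                         + V_nk / (2.0 * c ** 2))
            q_new /= q_new.sum()
            g[i] = q_new

            # Add the token back with its updated responsibilities.
            E_njk[j] += q_new;     V_njk[j] += q_new * (1.0 - q_new)
            E_nkw[:, w] += q_new;  V_nkw[:, w] += q_new * (1.0 - q_new)
            E_nk += q_new;         V_nk += q_new * (1.0 - q_new)
    return gamma

In practice, gamma would be initialized with random normalized responsibilities for each token, and the sweep repeated until the variational bound (or held-out log probability) stabilizes.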

Experimental results Left: results for KOS (D=3,430 documents; W=6,909 vocabulary words; N=467,714 word tokens). Right: results for NIPS (D=1,675 documents; W=12,419; N=2,166,029). 10% of the words were held out for testing; results averaged over 50 random runs. [Figures: variational bounds and log probabilities as a function of the number of iterations.]

Conclusions Variational approximations are much more efficient computationally than Gibbs sampling, with almost no loss in accuracy. The CVB inference algorithm is easy to implement, computationally efficient (thanks to the Gaussian approximation), and more accurate than standard VB.