Stochastic Optimization Maximization for Latent Variable Models

Presentation transcript:

Stochastic Optimization Maximization for Latent Variable Models
Manzil Zaheer (1), Satwik Kottur (2)
(1) Machine Learning Department, (2) Department of Electrical and Computer Engineering

INTRODUCTION

Latent Variable Models (LVMs)
- Have become a staple of statistical modelling: clustering, topic models, subspace estimation.
- Discover the hidden thematic structure of objects in a human-understandable format.
- Share a common structure:
  - Global variables: pervade the dataset.
  - Local variables: labels for individual data points.

Inference
- Estimating the posterior distribution (or the most likely assignment) of all the latent variables.
- Exact inference is intractable.
- Even approximate methods do not scale, because the global state introduces dependencies across the data.

MOTIVATION

Expectation Maximization (variational)
- E-step: compute the distribution of the hidden variables given the data (independently per data point).
- M-step: maximize the lower bound via the expectation with respect to the distribution computed in the E-step.

Stochastic Expectation Maximization (SEM)
- Asynchronous, parallel, and stochastic in nature.

APPROACH

LVM with a complete-data likelihood assumed to lie in the exponential family. The algorithm iterates three steps (the poster's equations were images and are absent from this transcript; a standard-form sketch is given after the Conclusion):
- E step: compute the conditional distribution of each local latent variable.
- S step: sample a hard assignment from that conditional.
- M step: find the parameters that maximize the expected complete-data log-likelihood with respect to the conditional.

CONNECTIONS TO SGD

- Both sampling (e.g. Gibbs) and variational (e.g. EM) inference can be seen as SGD on the MAP objective for an LVM.
- EM: collapse the E and M steps to get a single update equation.
- SEM: with the additional S step, collapse the E, S, and M steps to get a single update equation (both updates are sketched after the Conclusion).

RESULTS

Gaussian Mixture Models (n = 100,000, K = 10)
[Results figure not preserved in the transcript; a runnable SEM-for-GMM sketch follows the Conclusion.]

Latent Dirichlet Allocation (LDA)
- Topic modelling; an LVM using Dirichlet priors.
- Vocabulary / dataset size:
  1. PubMed: 0.14 million / 700 million
  2. Wikipedia: 0.21 million / 1.1 billion

CONCLUSION

- An inference method for LVMs that is a stochastic version of EM, with empirical justification.
- Speed-up over current state-of-the-art inference algorithms for LDA, with decent accuracy.
- Future work: theoretical guarantees on convergence.
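
The update equations in the Approach section were rendered as images on the poster and did not survive into this transcript. As a stand-in, here is a standard-form sketch of the E/S/M steps for an exponential-family LVM; the notation (phi for the sufficient statistics, theta for the global parameters, A for the log-partition function) is assumed here and need not match the poster's.

\begin{align*}
\text{Model:}\quad & p(x, z \mid \theta) = h(x, z)\,\exp\!\big(\langle \phi(x, z), \theta \rangle - A(\theta)\big) \\
\text{E step:}\quad & q_i(z_i) = p\big(z_i \mid x_i, \theta^{(t)}\big) \\
\text{S step:}\quad & z_i^{(t)} \sim q_i(z_i) \\
\text{M step:}\quad & \theta^{(t+1)} = \arg\max_{\theta} \textstyle\sum_i \log p\big(x_i, z_i^{(t)} \mid \theta\big)
\end{align*}

The S step is the only difference from plain EM: a single sampled assignment replaces the full expectation, which is what makes the update cheap and easy to parallelize.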
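
The collapsed update equations referenced in the Connections-to-SGD section were likewise lost. One standard way to write them, following the online/stochastic EM literature (the step size rho_t and the sufficient-statistics parameterization are assumptions made here, not notation taken from the poster), is as a stochastic-approximation update on the expected sufficient statistics s, which makes the SGD analogy explicit:

\begin{align*}
\text{EM (E+M collapsed):}\quad & s^{(t+1)} = (1-\rho_t)\,s^{(t)} + \rho_t\,\mathbb{E}_{z \sim p(z \mid x_i,\, \theta(s^{(t)}))}\big[\phi(x_i, z)\big] \\
\text{SEM (E+S+M collapsed):}\quad & s^{(t+1)} = (1-\rho_t)\,s^{(t)} + \rho_t\,\phi(x_i, z_i), \qquad z_i \sim p\big(z \mid x_i, \theta(s^{(t)})\big)
\end{align*}

Here theta(s) denotes the M-step solution as a function of the sufficient statistics; each iteration nudges s toward a noisy estimate of its fixed point with step size rho_t, just as an SGD step moves parameters along a noisy gradient.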
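
To make the GMM experiment concrete, below is a minimal, self-contained SEM loop for a Gaussian mixture in Python. This is an illustrative sketch, not the authors' implementation: the unit-variance (spherical) components, the synthetic data, and all function and variable names are assumptions made for the example.

import numpy as np

def sem_gmm(X, K, n_iters=50, seed=0):
    """Stochastic EM for a mixture of unit-variance Gaussians."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    pi = np.full(K, 1.0 / K)                         # global: mixing weights
    mu = X[rng.choice(n, size=K, replace=False)]     # global: component means
    for _ in range(n_iters):
        # E step: posterior responsibilities under unit-variance Gaussians.
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)   # (n, K)
        logp = np.log(pi)[None, :] - 0.5 * sq
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # S step: sample one hard assignment per point -- the step
        # that distinguishes SEM from plain EM.
        z = np.array([rng.choice(K, p=ri) for ri in r])
        # M step: maximize the complete-data likelihood at the samples.
        for k in range(K):
            mask = z == k
            if mask.any():
                pi[k] = mask.mean()
                mu[k] = X[mask].mean(axis=0)
        pi /= pi.sum()
    return pi, mu, z

if __name__ == "__main__":
    # Synthetic data loosely mirroring the poster's setting (K = 10).
    rng = np.random.default_rng(1)
    centers = rng.normal(scale=5.0, size=(10, 2))
    X = np.vstack([c + rng.normal(size=(1000, 2)) for c in centers])
    pi, mu, _ = sem_gmm(X, K=10)
    print("mixing weights:", np.round(pi, 3))

Replacing the sampled assignments z with the full responsibilities r in the M step recovers plain batch EM, which is the cleanest way to see where the stochasticity enters.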