Stochastic Optimization Maximization for Latent Variable Models

Presentation transcript:

Stochastic Optimization Maximization for Latent Variable Models
Manzil Zaheer (1), Satwik Kottur (2)
(1) Machine Learning Department, (2) Department of Electrical and Computer Engineering

INTRODUCTION

Latent Variable Models (LVMs)
- Have become a staple of statistical modelling: clustering, topic models, subspace estimation.
- Discover the hidden thematic structure of objects in a human-understandable format.
- Share a common structure:
  - Global variables: pervade the dataset.
  - Local variables: labels for individual data points.

Inference
- Estimating the posterior distribution (or the most likely assignment) of all the latent variables.
- Exact inference is intractable.
- Even approximate methods do not scale, because the global state introduces dependencies across the data.

MOTIVATION

Expectation Maximization (variational)
- E-step: compute the distribution of the hidden variables given the data (independently per data point).
- M-step: maximize the lower bound via the expectation with respect to the distribution computed in the E-step.

Stochastic Expectation Maximization (SEM)
- Asynchronous, parallel, and stochastic in nature.

APPROACH

LVM with a complete-data likelihood assumed to lie in the exponential family. The algorithm iterates three steps (the poster's equations were images and are absent from this transcript; a standard-form sketch is given after the Conclusion):
- E step: compute the conditional distribution of each local latent variable.
- S step: sample a hard assignment from that conditional.
- M step: find the parameters that maximize the expected complete-data log-likelihood with respect to the conditional.

CONNECTIONS TO SGD

- Both sampling (e.g. Gibbs) and variational (e.g. EM) inference can be seen as SGD on the MAP objective for an LVM.
- EM: collapse the E and M steps to get a single update equation.
- SEM: with the additional S step, collapse the E, S, and M steps to get a single update equation (both updates are sketched after the Conclusion).

RESULTS

Gaussian Mixture Models (n = 100,000, K = 10)
[Results figure not preserved in the transcript; a runnable SEM-for-GMM sketch follows the Conclusion.]

Latent Dirichlet Allocation (LDA)
- Topic modelling; an LVM using Dirichlet priors.
- Vocabulary / dataset size:
  1. PubMed: 0.14 million / 700 million
  2. Wikipedia: 0.21 million / 1.1 billion

CONCLUSION

- An inference method for LVMs that is a stochastic version of EM, with empirical justification.
- Speed-up over current state-of-the-art inference algorithms for LDA, with decent accuracy.
- Future work: theoretical guarantees on convergence.
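
The update equations in the Approach section were rendered as images on the poster and did not survive into this transcript. As a stand-in, here is a standard-form sketch of the E/S/M steps for an exponential-family LVM; the notation (phi for the sufficient statistics, theta for the global parameters, A for the log-partition function) is assumed here and need not match the poster's.

\begin{align*}
\text{Model:}\quad & p(x, z \mid \theta) = h(x, z)\,\exp\!\big(\langle \phi(x, z), \theta \rangle - A(\theta)\big) \\
\text{E step:}\quad & q_i(z_i) = p\big(z_i \mid x_i, \theta^{(t)}\big) \\
\text{S step:}\quad & z_i^{(t)} \sim q_i(z_i) \\
\text{M step:}\quad & \theta^{(t+1)} = \arg\max_{\theta} \textstyle\sum_i \log p\big(x_i, z_i^{(t)} \mid \theta\big)
\end{align*}

The S step is the only difference from plain EM: a single sampled assignment replaces the full expectation, which is what makes the update cheap and easy to parallelize.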
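
The collapsed update equations referenced in the Connections-to-SGD section were likewise lost. One standard way to write them, following the online/stochastic EM literature (the step size rho_t and the sufficient-statistics parameterization are assumptions made here, not notation taken from the poster), is as a stochastic-approximation update on the expected sufficient statistics s, which makes the SGD analogy explicit:

\begin{align*}
\text{EM (E+M collapsed):}\quad & s^{(t+1)} = (1-\rho_t)\,s^{(t)} + \rho_t\,\mathbb{E}_{z \sim p(z \mid x_i,\, \theta(s^{(t)}))}\big[\phi(x_i, z)\big] \\
\text{SEM (E+S+M collapsed):}\quad & s^{(t+1)} = (1-\rho_t)\,s^{(t)} + \rho_t\,\phi(x_i, z_i), \qquad z_i \sim p\big(z \mid x_i, \theta(s^{(t)})\big)
\end{align*}

Here theta(s) denotes the M-step solution as a function of the sufficient statistics; each iteration nudges s toward a noisy estimate of its fixed point with step size rho_t, just as an SGD step moves parameters along a noisy gradient.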
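
To make the GMM experiment concrete, below is a minimal, self-contained SEM loop for a Gaussian mixture in Python. This is an illustrative sketch, not the authors' implementation: the unit-variance (spherical) components, the synthetic data, and all function and variable names are assumptions made for the example.

import numpy as np

def sem_gmm(X, K, n_iters=50, seed=0):
    """Stochastic EM for a mixture of unit-variance Gaussians."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    pi = np.full(K, 1.0 / K)                         # global: mixing weights
    mu = X[rng.choice(n, size=K, replace=False)]     # global: component means
    for _ in range(n_iters):
        # E step: posterior responsibilities under unit-variance Gaussians.
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)   # (n, K)
        logp = np.log(pi)[None, :] - 0.5 * sq
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # S step: sample one hard assignment per point -- the step
        # that distinguishes SEM from plain EM.
        z = np.array([rng.choice(K, p=ri) for ri in r])
        # M step: maximize the complete-data likelihood at the samples.
        for k in range(K):
            mask = z == k
            if mask.any():
                pi[k] = mask.mean()
                mu[k] = X[mask].mean(axis=0)
        pi /= pi.sum()
    return pi, mu, z

if __name__ == "__main__":
    # Synthetic data loosely mirroring the poster's setting (K = 10).
    rng = np.random.default_rng(1)
    centers = rng.normal(scale=5.0, size=(10, 2))
    X = np.vstack([c + rng.normal(size=(1000, 2)) for c in centers])
    pi, mu, _ = sem_gmm(X, K=10)
    print("mixing weights:", np.round(pi, 3))

Replacing the sampled assignments z with the full responsibilities r in the M step recovers plain batch EM, which is the cleanest way to see where the stochasticity enters.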