Topic models
Source: "Topic Models," David Blei, MLSS 2009.

Topic modeling - Motivation

Discover topics from a corpus

Model connections between topics

Model the evolution of topics over time

Image annotation

Extensions*
- Malleable: can be quickly extended to data with tags (side information), class labels, etc.
- The (approximate) inference methods can be readily translated in many cases.
- Most datasets can be converted to bag-of-words format using a codebook representation, after which LDA-style models can be readily applied (they can also work with continuous observations directly); see the sketch below.
*YMMV
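
As a concrete illustration of the codebook idea, here is a minimal sketch (assuming scikit-learn and numpy; the data, cluster count, and helper `to_bow` are illustrative choices, not from the slides):

```python
# Hypothetical sketch: quantize continuous feature vectors into a
# bag-of-"visual words" so LDA-style models can be applied.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
docs = [rng.normal(size=(50, 16)) for _ in range(20)]  # 20 "documents" of 16-d features

codebook = KMeans(n_clusters=100, n_init=10, random_state=0)
codebook.fit(np.vstack(docs))                  # learn 100 codewords from all features

def to_bow(feats, k=100):
    """Map each feature vector to its nearest codeword and count occurrences."""
    words = codebook.predict(feats)
    return np.bincount(words, minlength=k)

X = np.stack([to_bow(d) for d in docs])        # document-term matrix, ready for LDA
```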

Connection to ML research

Latent Dirichlet Allocation

LDA

Probabilistic modeling

Intuition behind LDA

Generative model
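
LDA's generative process can be summarized in a short sketch (numpy only; the sizes and hyperparameters are illustrative choices, not from the slides):

```python
# A minimal sketch of LDA's generative process.
import numpy as np

rng = np.random.default_rng(1)
K, V, D, N = 5, 1000, 10, 80          # topics, vocab size, docs, words per doc
alpha, eta = 0.1, 0.01                 # symmetric Dirichlet hyperparameters

beta = rng.dirichlet(np.full(V, eta), size=K)   # topic-word distributions
docs = []
for _ in range(D):
    theta = rng.dirichlet(np.full(K, alpha))    # per-document topic proportions
    z = rng.choice(K, size=N, p=theta)          # topic assignment for each word
    w = np.array([rng.choice(V, p=beta[k]) for k in z])  # draw each word
    docs.append(w)
```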

The posterior distribution

Graphical models (Aside)

LDA model

Dirichlet distribution

Dirichlet examples (simplex density plots; darker implies lower magnitude). \alpha < 1 leads to sparser topics.
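
The sparsity effect of \alpha < 1 is easy to see by sampling (numpy only; the dimension and \alpha values are illustrative):

```python
# Small alpha concentrates Dirichlet draws on a few components.
import numpy as np

rng = np.random.default_rng(0)
for alpha in (10.0, 1.0, 0.1):
    theta = rng.dirichlet(np.full(5, alpha))
    print(alpha, np.round(theta, 3))
# alpha = 0.1 typically yields draws like [0.97, 0.02, ...]: nearly all
# mass on one or two components, i.e. sparser topic proportions.
```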

LDA

Inference in LDA

Example inference

Topics vs words

Explore and browse document collections

Why does LDA work?

LDA is modular, general, useful

Approximate inference. An excellent reference is "On Smoothing and Inference for Topic Models" (Asuncion et al., 2009).

Posterior distribution for LDA. The only parameters we need to estimate are \alpha and \beta; the joint and posterior are written out below.
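
For a single document, the quantities the slide refers to can be written in standard LDA notation (a reconstruction, not verbatim from the slides):

```latex
% Joint over latent variables and words, and the (intractable) posterior:
p(\theta, z, w \mid \alpha, \beta)
  = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta),
\qquad
p(\theta, z \mid w, \alpha, \beta)
  = \frac{p(\theta, z, w \mid \alpha, \beta)}{p(w \mid \alpha, \beta)}
```

The denominator p(w | \alpha, \beta) involves an integral over \theta and a sum over all topic assignments, which is what makes exact inference intractable.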

Posterior distribution

Posterior distribution for LDA
- Can integrate out either \theta or z, but not both.
- Marginalizing \theta gives z ~ Polya(\alpha) (written out below).
- The Polya distribution is also known as the Dirichlet compound multinomial (it models word burstiness).
- Most algorithms marginalize out \theta.
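
Written out (a standard reconstruction; n_k is the number of words in the document assigned to topic k, N the document length):

```latex
p(z \mid \alpha)
  = \int p(z \mid \theta)\, p(\theta \mid \alpha)\, d\theta
  = \frac{\Gamma\!\left(\sum_k \alpha_k\right)}{\Gamma\!\left(N + \sum_k \alpha_k\right)}
    \prod_{k=1}^{K} \frac{\Gamma(n_k + \alpha_k)}{\Gamma(\alpha_k)}
```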

MAP inference
- Integrate out z; treat \theta as a random variable.
- Can use the EM algorithm.
- Updates are very similar to those of PLSA (except for additional regularization terms from the priors); see the sketch below.
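
A sketch of those EM updates in standard notation (an assumption on my part that this matches the slide; symmetric priors \alpha, \beta > 1, \gamma_{dwk} the responsibility of topic k for word w in document d, n_{dk} and n_{kw} the resulting expected counts):

```latex
\text{E-step:}\quad \gamma_{dwk} \propto \theta_{dk}\, \phi_{kw}
\qquad
\text{M-step:}\quad \theta_{dk} \propto n_{dk} + \alpha - 1,
\quad \phi_{kw} \propto n_{kw} + \beta - 1
```

With \alpha = \beta = 1 the prior terms vanish and the updates reduce to those of PLSA.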

Collapsed Gibbs sampling
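
A minimal numpy sketch of the collapsed Gibbs sampler for LDA (the standard Griffiths-Steyvers update, not code from the talk; `docs` is a list of word-id arrays, with symmetric priors):

```python
import numpy as np

def gibbs_lda(docs, K, V, alpha=0.1, beta=0.01, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    z = [rng.integers(K, size=len(d)) for d in docs]        # random init
    ndk = np.zeros((len(docs), K))                          # doc-topic counts
    nkw = np.zeros((K, V))                                  # topic-word counts
    nk = np.zeros(K)                                        # topic totals
    for d, (ws, zs) in enumerate(zip(docs, z)):
        for w, k in zip(ws, zs):
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, (ws, zs) in enumerate(zip(docs, z)):
            for i, w in enumerate(ws):
                k = zs[i]                                   # remove current token
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # conditional p(z_i = k | everything else)
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())            # resample assignment
                zs[i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return z, ndk, nkw
```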

Variational inference. Can be thought of as an extension of EM in which we compute expectations w.r.t. a variational distribution instead of the true posterior.
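
The objective being maximized is the evidence lower bound (standard form, not verbatim from the slides):

```latex
\log p(w \mid \alpha, \beta)
  \;\ge\; \mathbb{E}_{q}\!\left[\log p(\theta, z, w \mid \alpha, \beta)\right]
        - \mathbb{E}_{q}\!\left[\log q(\theta, z)\right]
```

The gap between the two sides is KL(q || p), so maximizing the bound drives q toward the true posterior.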

Mean field variational inference

MFVI and conditional exponential families

Variational inference

Variational inference for LDA
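
For LDA specifically, the mean-field family and its coordinate-ascent updates are (following Blei, Ng & Jordan, 2003; \Psi is the digamma function):

```latex
q(\theta, z) = q(\theta \mid \gamma) \prod_{n=1}^{N} q(z_n \mid \phi_n),
\qquad
\phi_{nk} \propto \beta_{k, w_n} \exp\!\left(\Psi(\gamma_k)\right),
\qquad
\gamma_k = \alpha_k + \sum_{n=1}^{N} \phi_{nk}
```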

Collapsed variational inference
- MFVI assumes \theta and z are independent, but \theta can be marginalized out exactly.
- A variational inference algorithm operating on the same collapsed space as CGS.
- Yields a strictly better lower bound than standard VB.
- Can be thought of as a "soft" CGS in which uncertainty is propagated using probabilities rather than samples.

Estimating the topics

Inference comparison

Comparison of updates (from "On Smoothing and Inference for Topic Models," Asuncion et al., 2009): MAP, VB, CVB0, CGS. The four updates are reconstructed below.
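
A reconstruction of the slide's side-by-side updates, following (to the best of my reading) the notation of Asuncion et al. (2009): N_{wk}, N_{kj}, N_k are word-topic, topic-document, and topic totals; \neg ij excludes the current token; W is the vocabulary size; \Psi is the digamma function:

```latex
\begin{align*}
\text{MAP:}  \quad \gamma_{wjk} &\propto \frac{(N_{wk} + \eta - 1)\,(N_{kj} + \alpha - 1)}{N_k + W\eta - W} \\
\text{VB:}   \quad \gamma_{wjk} &\propto \frac{\exp\Psi(N_{wk} + \eta)\,\exp\Psi(N_{kj} + \alpha)}{\exp\Psi(N_k + W\eta)} \\
\text{CVB0:} \quad \gamma_{wjk} &\propto \frac{(N_{wk}^{\neg ij} + \eta)\,(N_{kj}^{\neg ij} + \alpha)}{N_k^{\neg ij} + W\eta} \\
\text{CGS:}  \quad p(z_{ij} = k \mid z^{\neg ij}, w) &\propto \frac{(N_{wk}^{\neg ij} + \eta)\,(N_{kj}^{\neg ij} + \alpha)}{N_k^{\neg ij} + W\eta}
\end{align*}
```

Note how similar the forms are: MAP subtracts 1 from each count, VB replaces counts with exponentiated digammas, and CVB0 and CGS share the same expression, with CVB0 using soft (expected) counts and CGS hard sampled ones.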

Choice of inference algorithm
- Depends on the vocabulary size (V) and the number of words per document (say N_i).
- Collapsed algorithms: not parallelizable.
- CGS: needs to draw separate topic assignments for multiple occurrences of the same word (slow when N_i >> V).
- MAP: fast, but performs poorly when N_i << V.
- CVB0: a good tradeoff between computational complexity and perplexity.

Supervised and relational topic models

Supervised LDA
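
In sLDA (Blei & McAuliffe), the response is generated downstream from the document's empirical topic frequencies; for a Gaussian response, in standard notation (a reconstruction, not verbatim from the slides):

```latex
\bar{z} = \frac{1}{N} \sum_{n=1}^{N} z_n,
\qquad
y \mid z_{1:N}, \eta, \sigma^2 \;\sim\; \mathcal{N}\!\left(\eta^{\top} \bar{z},\, \sigma^2\right)
```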

Variational inference in sLDA

ML estimation

Prediction

Example: Movie reviews

Diverse response types with GLMs

Example: Multi class classification

Supervised topic models

Upstream vs. downstream models. Upstream: conditional models. Downstream: the response variable is generated from the actually observed topic assignments z, rather than from \theta, which is the expectation of the z's.

Relational topic models

Predictive performance of one type given the other

Predicting links from documents

Things we didn't address
- Model selection: nonparametric Bayesian approaches.
- Hyperparameter tuning.
- Evaluation can be a bit tricky for LDA (it involves comparing approximate bounds), but traditional metrics can be used in the supervised versions.

Thank you!