Topic models for corpora and for graphs
Motivation: Social graphs seem to have
–some aspects of randomness: small diameter, giant connected components, …
–some structure: homophily, scale-free degree distribution?
More terms: “Stochastic block model”, aka “block-stochastic matrix”:
–Draw n_i nodes in block i
–With probability p_ij, connect pairs (u,v) where u is in block i and v is in block j
–Special, simple case: p_ii = q_i, and p_ij = s for all i ≠ j
Question: can you fit this model to a graph?
–i.e., find each p_ij and the latent node-to-block mapping (see the generative sketch below)
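To make the generative story concrete, here is a minimal Python sketch (the function name, the numpy dependency, and the two-block example are illustrative, not from the slides):

```python
import numpy as np

def sample_sbm(block_sizes, P, rng=None):
    """Sample an undirected graph from a stochastic block model.

    block_sizes[i] is n_i, the number of nodes in block i;
    P[i, j] is p_ij, the probability of connecting (u, v) with
    u in block i and v in block j (symmetric for undirected graphs).
    """
    rng = np.random.default_rng(rng)
    blocks = np.repeat(np.arange(len(block_sizes)), block_sizes)  # node -> block
    n = len(blocks)
    A = np.zeros((n, n), dtype=int)
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < P[blocks[u], blocks[v]]:
                A[u, v] = A[v, u] = 1
    return A, blocks

# Special, simple case from the slide: p_ii = q_i, p_ij = s for i != j
q, s = [0.5, 0.4], 0.05
P = np.full((2, 2), s)
np.fill_diagonal(P, q)
A, blocks = sample_sbm([30, 20], P, rng=0)
```

The inference question is the reverse direction: given only A, recover `blocks` and P.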
Not? [figure: football network]
Not? [figure: books network]
Outline
–Stochastic block models & the inference question
–Review of text models
  –Mixture of multinomials & EM
  –LDA and Gibbs (or variational EM)
–Block models and inference
–Mixed-membership block models
–Multinomial block models and inference w/ Gibbs
Review – supervised Naïve Bayes
Naïve Bayes model, and its compact representation:
[plate diagrams: class C with children W_1, W_2, W_3, …, W_N; compactly, C pointing to W inside a plate of size N, all inside a document plate of size M]
Review – supervised Naïve Bayes
Multinomial Naïve Bayes [plate diagram: C → W, N words per doc, M docs]
For each document d = 1, …, M
–Generate C_d ~ Mult(· | π)
–For each position n = 1, …, N_d
  –Generate w_n ~ Mult(· | β_{C_d})
Review – supervised Naïve Bayes
Multinomial naïve Bayes: Learning
–Maximize the log-likelihood of observed variables w.r.t. the parameters:
  L = Σ_d [ log Pr(C_d) + Σ_n log Pr(w_n | C_d) ]
–Convex function: global optimum
–Solution: relative counts, i.e. Pr(C=c) = (#docs with class c) / M and Pr(w|c) = (#occurrences of w in class-c docs) / (#word occurrences in class-c docs)
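Since the optimum is just relative counts, the learning step is a few lines of code. A sketch, assuming documents are lists of tokens and labels is a parallel list; the Laplace smoothing term `alpha` is an add-on beyond the pure MLE on the slide:

```python
from collections import Counter

def fit_multinomial_nb(docs, labels, vocab, alpha=1.0):
    """Count-based MLE for multinomial naive Bayes (alpha > 0 adds smoothing).

    pi[c]      = fraction of documents with class c
    beta[c][w] = fraction of word occurrences in class-c documents that are w
    """
    M = len(docs)
    pi = {c: labels.count(c) / M for c in set(labels)}
    beta = {}
    for c in pi:
        counts = Counter(w for d, y in zip(docs, labels) if y == c for w in d)
        total = sum(counts.values()) + alpha * len(vocab)
        beta[c] = {w: (counts[w] + alpha) / total for w in vocab}
    return pi, beta
```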
Review – unsupervised Naïve Bayes
Mixture model: unsupervised naïve Bayes model [plate diagram: hidden class Z → words W; N words per doc, M docs]
Joint probability of words and classes: Pr(Z=z, w_1, …, w_N) = Pr(z) Π_n Pr(w_n | z)
But classes are not visible: Z is latent, so we must sum it out, Pr(w_1, …, w_N) = Σ_z Pr(z) Π_n Pr(w_n | z), and fit with EM (sketched below)
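With Z hidden there is no closed-form solution, so the classic fit is EM: the E-step computes soft class memberships, the M-step re-estimates parameters from the expected counts. A minimal sketch, assuming X is a document-term count matrix; the function name and the tiny smoothing constant are mine:

```python
import numpy as np

def em_mixture_of_multinomials(X, K, n_iter=50, rng=0):
    """EM for a mixture of K multinomials over a (M, V) count matrix X.

    Returns mixing weights pi (K,) and class-word parameters beta (K, V).
    """
    rng = np.random.default_rng(rng)
    M, V = X.shape
    pi = np.full(K, 1.0 / K)
    beta = rng.dirichlet(np.ones(V), size=K)       # random initialization
    for _ in range(n_iter):
        # E-step: responsibilities r[d, k] = Pr(z_d = k | w_d), in log space
        log_r = np.log(pi) + X @ np.log(beta).T
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from expected counts
        pi = r.mean(axis=0)
        beta = r.T @ X + 1e-6                      # tiny smoothing avoids log(0)
        beta /= beta.sum(axis=1, keepdims=True)
    return pi, beta
```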
LDA
Review - LDA
Motivation [plate diagram: w, N words per doc, M docs]
Assumptions: 1) documents are i.i.d.; 2) within a document, words are i.i.d. (bag of words)
For each document d = 1, …, M
–Generate θ_d ~ D1(…)
–For each word n = 1, …, N_d
  –generate w_n ~ D2(· | θ_d)
Now pick your favorite distributions for D1, D2
Latent Dirichlet Allocation [plate diagram: z → w, N words per doc, M docs, K topics]
For each document d = 1, …, M
–Generate θ_d ~ Dir(· | α)
–For each position n = 1, …, N_d
  –generate z_n ~ Mult(· | θ_d)
  –generate w_n ~ Mult(· | β_{z_n})
“Mixed membership”
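A short sketch of exactly this generative process, assuming a given Dirichlet parameter alpha and topic-word matrix beta, with fixed-length documents for simplicity:

```python
import numpy as np

def generate_lda_corpus(M, N, alpha, beta, rng=0):
    """Sample M documents of N words each from the LDA generative process.

    alpha: length-K Dirichlet parameter; beta: (K, V) topic-word multinomials.
    """
    rng = np.random.default_rng(rng)
    K, V = beta.shape
    docs = []
    for _ in range(M):
        theta = rng.dirichlet(alpha)                # theta_d ~ Dir(alpha)
        z = rng.choice(K, size=N, p=theta)          # z_n ~ Mult(theta_d)
        w = [rng.choice(V, p=beta[k]) for k in z]   # w_n ~ Mult(beta_{z_n})
        docs.append(w)
    return docs
```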
vs Naïve Bayes… [plate diagram: LDA draws a topic z per word position; naïve Bayes draws one class per document; K topic plate]
LDA’s view of a document
LDA topics
Review - LDA
Latent Dirichlet Allocation – parameter learning:
–Variational EM
  –Numerical approximation using lower bounds
  –Results in biased solutions
  –Convergence has numerical guarantees
–Gibbs sampling
  –Stochastic simulation
  –Unbiased solutions
  –Stochastic convergence
Review - LDA
Gibbs sampling
–Applicable when the joint distribution is hard to evaluate but the conditional distributions are known
–The sequence of samples comprises a Markov chain
–The stationary distribution of the chain is the joint distribution
Key capability: estimate the distribution of one latent variable given the other latent variables and the observed variables.
Why does Gibbs sampling work?
–What’s the fixed point? The stationary distribution of the chain is the joint distribution.
–When will it converge (in the limit)? When the graph defined by the chain is connected.
–How long will it take to converge? Depends on the second eigenvalue of the chain’s transition matrix (the spectral gap).
Called “collapsed Gibbs sampling” since you’ve marginalized away some variables (here, the document-topic and topic-word multinomials), leaving an update of the form
Pr(z_i = k | z_-i, w) ∝ (n_{k,w_i} + β) / (n_k + Vβ) · (n_{d_i,k} + α)
where all counts exclude position i. From: Parameter estimation for text analysis, Gregor Heinrich.
Review - LDA
Latent Dirichlet Allocation [plate diagram: z → w, M docs, N words]
–Randomly initialize each z_{m,n}
–Repeat for t = 1, ….
  –For each doc m, word n
    –Find Pr(z_{mn} = k | other z’s)
    –Sample z_{mn} according to that distribution (sketch below)
“Mixed membership”
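Putting the pieces together, a compact collapsed Gibbs sampler might look like the following (documents as lists of integer word ids; all variable names are mine):

```python
import numpy as np

def lda_gibbs(docs, K, V, alpha=0.1, beta=0.01, n_iter=200, rng=0):
    """Collapsed Gibbs sampler for LDA (multinomials marginalized out)."""
    rng = np.random.default_rng(rng)
    ndk = np.zeros((len(docs), K))      # topic counts per document
    nkw = np.zeros((K, V))              # word counts per topic
    nk = np.zeros(K)                    # total counts per topic
    z = []
    for m, doc in enumerate(docs):      # random initialization
        zm = rng.integers(K, size=len(doc))
        z.append(zm)
        for w, k in zip(doc, zm):
            ndk[m, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(n_iter):
        for m, doc in enumerate(docs):
            for n, w in enumerate(doc):
                k = z[m][n]             # remove this word's assignment
                ndk[m, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # Pr(z_mn = k | other z's, words), up to a constant
                p = (ndk[m] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[m][n] = k             # add it back under the new topic
                ndk[m, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return z, ndk, nkw
```

Each sweep costs O(total tokens × K); in practice, early “burn-in” sweeps are discarded before reading off topic estimates.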
Outline
–Stochastic block models & the inference question
–Review of text models
  –Mixture of multinomials & EM
  –LDA and Gibbs (or variational EM)
–Block models and inference
–Mixed-membership block models
–Multinomial block models and inference w/ Gibbs
–Bestiary of other probabilistic graph models
  –Latent-space models, exchangeable graphs, p1, ERGM
Review - LDA
Motivation [plate diagram: w, N words per doc, M docs]
Assumptions: 1) documents are i.i.d.; 2) within a document, words are i.i.d. (bag of words)
For each document d = 1, …, M
–Generate θ_d ~ D1(…)
–For each word n = 1, …, N_d
  –generate w_n ~ D2(· | θ_d)
Docs and words are exchangeable.
Stochastic block models: assume 1) nodes within a block z and 2) edges between blocks z_p, z_q are exchangeable
[plate diagram: block assignment z_p for each of N nodes; edge indicator a_pq for each of the N^2 node pairs]
Stochastic block models: assume 1) nodes within a block z and 2) edges between blocks z_p, z_q are exchangeable
[plate diagram: z_p per node, a_pq per node pair]
Gibbs sampling (sketch below):
–Randomly initialize z_p for each node p
–For t = 1, …
  –For each node p
    –Compute Pr(z_p | other z’s)
    –Sample z_p
See: Snijders & Nowicki, 1997, Estimation and Prediction for Stochastic Blockmodels for Graphs with Latent Block Structure
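One way to implement this sampler for a directed graph, alternating between the block matrix P (sampled from Beta posteriors, a conjugate-prior assumption of this sketch, not something the slide specifies) and the node assignments z_p:

```python
import numpy as np

def sbm_gibbs(A, K, n_iter=100, a=1.0, b=1.0, rng=0):
    """Gibbs sampling for a directed stochastic block model with K blocks.

    Alternates: P[i, j] | z  ~  Beta posterior from edge counts between
    blocks i, j;  then each z_p | z_-p, P from its full conditional.
    """
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    z = rng.integers(K, size=n)
    for _ in range(n_iter):
        P = np.empty((K, K))
        for i in range(K):
            for j in range(K):
                mask = np.outer(z == i, z == j) & ~np.eye(n, dtype=bool)
                edges, pairs = A[mask].sum(), mask.sum()
                P[i, j] = rng.beta(a + edges, b + pairs - edges)
        for p in range(n):
            others = np.arange(n) != p
            logp = np.zeros(K)
            for k in range(K):
                out_pr = P[k, z[others]]            # Pr(edge p -> q | z_p = k)
                in_pr = P[z[others], k]             # Pr(edge q -> p | z_p = k)
                a_out, a_in = A[p, others], A[others, p]
                logp[k] = (a_out * np.log(out_pr) + (1 - a_out) * np.log1p(-out_pr)).sum() \
                        + (a_in * np.log(in_pr) + (1 - a_in) * np.log1p(-in_pr)).sum()
            probs = np.exp(logp - logp.max())
            z[p] = rng.choice(K, p=probs / probs.sum())
    return z
```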
Mixed Membership Stochastic Block models
[plate diagram: membership vector θ_p per node; per-pair sender/receiver block indicators z_{p→}, z_{←q}; edge indicator a_pq for each of the N^2 pairs]
Airoldi et al, JMLR 2008
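A sketch of the MMSB generative side (per-pair sender and receiver block indicators drawn from the endpoints' membership vectors, then a Bernoulli edge); the function name and parameterization details are illustrative:

```python
import numpy as np

def generate_mmsb(N, alpha, B, rng=0):
    """Sample a directed graph from the MMSB generative process.

    alpha: length-K Dirichlet parameter; B: (K, K) block-block link probabilities.
    """
    rng = np.random.default_rng(rng)
    K = B.shape[0]
    theta = rng.dirichlet(alpha, size=N)        # mixed membership per node
    A = np.zeros((N, N), dtype=int)
    for p in range(N):
        for q in range(N):
            if p == q:
                continue
            zp = rng.choice(K, p=theta[p])      # sender block for this pair
            zq = rng.choice(K, p=theta[q])      # receiver block for this pair
            A[p, q] = int(rng.random() < B[zp, zq])
    return A, theta
```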
Parkkinen et al paper
Another mixed membership block model
Notation:
–z = (z_i, z_j) is a pair of block ids
–n_z = #pairs assigned z
–q_{z1,i} = #links to i from block z1
–q_{z1,·} = #outlinks in block z1
–δ = indicator for the diagonal
–M = #nodes
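Using these counts, a collapsed Gibbs sweep resamples each link's block pair from a product of count ratios. A rough sketch, assuming a Parkkinen-style model in which each endpoint of a link is drawn from a block-specific multinomial over the M nodes; the exact conditional in the paper may differ:

```python
import numpy as np
from collections import defaultdict

def edge_block_gibbs(edges, M, K, alpha=1.0, beta=0.1, n_iter=100, rng=0):
    """Collapsed Gibbs over block-pair assignments z_e = (k, l), one per link.

    n[(k, l)] = #pairs currently assigned (k, l)           (n_z above)
    q[k, i]   = #link endpoints in block k that are node i (q_{k,i})
    qtot[k]   = total endpoints in block k                 (q_{k,.})
    """
    rng = np.random.default_rng(rng)
    n = defaultdict(int)
    q = np.zeros((K, M))
    qtot = np.zeros(K)
    z = []
    for (i, j) in edges:                       # random initialization
        k, l = int(rng.integers(K)), int(rng.integers(K))
        z.append((k, l))
        n[(k, l)] += 1; q[k, i] += 1; q[l, j] += 1; qtot[k] += 1; qtot[l] += 1
    for _ in range(n_iter):
        for e, (i, j) in enumerate(edges):
            k, l = z[e]                        # remove this link's assignment
            n[(k, l)] -= 1; q[k, i] -= 1; q[l, j] -= 1; qtot[k] -= 1; qtot[l] -= 1
            p = np.empty((K, K))               # Pr(z_e = (k', l') | rest)
            for k2 in range(K):
                for l2 in range(K):
                    p[k2, l2] = ((n[(k2, l2)] + alpha)
                                 * (q[k2, i] + beta) / (qtot[k2] + M * beta)
                                 * (q[l2, j] + beta) / (qtot[l2] + M * beta))
            idx = int(rng.choice(K * K, p=p.ravel() / p.sum()))
            k, l = divmod(idx, K)
            z[e] = (k, l)                      # add it back under the new pair
            n[(k, l)] += 1; q[k, i] += 1; q[l, j] += 1; qtot[k] += 1; qtot[l] += 1
    return z
```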
Another mixed membership block model
Experiments (Balasubramanyan, Lin, Cohen, NIPS workshop 2010): lots of synthetic data
Experiments (Balasubramanyan, Lin, Cohen, NIPS workshop 2010) [figures]