
1 Topic models for corpora and for graphs

2 Motivation
Social graphs seem to have
- some aspects of randomness: small diameter, giant connected components, ...
- some structure: homophily, scale-free degree distribution?

3 More terms
"Stochastic block model", aka "block-stochastic matrix":
- Draw n_i nodes in block i
- With probability p_ij, connect pairs (u,v) where u is in block i, v is in block j
- Special, simple case: p_ii = q_i, and p_ij = s for all i ≠ j
Question: can you fit this model to a graph?
- i.e., find each p_ij and the latent node -> block mapping
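To make the generative story concrete, here is a minimal Python sketch of sampling a graph from a stochastic block model; the function name, block sizes, and the probability values in the usage lines are illustrative assumptions, not from the slides:

```python
import numpy as np

def sample_sbm(block_sizes, P, rng=None):
    """Sample an undirected graph from a stochastic block model:
    block_sizes[i] = n_i nodes in block i, P[i, j] = p_ij."""
    rng = np.random.default_rng() if rng is None else rng
    z = np.repeat(np.arange(len(block_sizes)), block_sizes)  # node -> block
    n = len(z)
    A = np.zeros((n, n), dtype=int)
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < P[z[u], z[v]]:  # connect (u,v) w.p. p_ij
                A[u, v] = A[v, u] = 1
    return A, z

# The special, simple case from the slide: p_ii = q_i, p_ij = s for i != j.
q, s = [0.3, 0.5], 0.05
P = np.full((2, 2), s)
np.fill_diagonal(P, q)
A, z = sample_sbm([50, 50], P)
```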

4 Not? football

5 Not? books

6 Outline
- Stochastic block models & inference question
- Review of text models
  - Mixture of multinomials & EM
  - LDA and Gibbs (or variational EM)
- Block models and inference
- Mixed-membership block models
- Multinomial block models and inference w/ Gibbs

7 Review – supervised Naïve Bayes
Naïve Bayes model, and its compact plate representation:
[plate diagram: class C generates words W_1, W_2, W_3, ..., W_N; compactly, a plate over the N words inside a plate over the M documents]

8 Review – supervised Naïve Bayes
Multinomial naïve Bayes, generative story:
For each document d = 1, ..., M
  Generate C_d ~ Mult(· | π)
  For each position n = 1, ..., N_d
    Generate w_n ~ Mult(· | β, C_d)
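A minimal sketch of this generative story; the Greek symbols lost in extraction are assumed to be the usual class prior π and per-class word distributions β, and the sizes K, V, M and Poisson document lengths are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
K, V, M = 3, 1000, 100                    # classes, vocab size, docs (assumed)
pi = rng.dirichlet(np.ones(K))            # class prior (assumed symbol: pi)
beta = rng.dirichlet(np.ones(V), size=K)  # per-class word dists (assumed: beta)

docs, labels = [], []
for d in range(M):
    c = rng.choice(K, p=pi)               # C_d ~ Mult(. | pi)
    N_d = rng.poisson(50) + 1             # document length (assumed model)
    docs.append(rng.choice(V, size=N_d, p=beta[c]))  # w_n ~ Mult(. | beta, C_d)
    labels.append(c)
```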

9 Review – supervised Naïve Bayes
Multinomial naïve Bayes: learning
- Maximize the log-likelihood of the observed variables w.r.t. the parameters
- Convex function: global optimum
- Solution: relative-frequency (normalized count) estimates of the class prior and the per-class word distributions
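A sketch of that closed-form solution: the optimum is just normalized counts. The add-alpha smoothing is an added assumption, and `fit_multinomial_nb` is a name introduced for illustration:

```python
from collections import Counter
import numpy as np

def fit_multinomial_nb(docs, labels, K, V, alpha=1.0):
    """Closed-form fit for multinomial naive Bayes: the global optimum
    is normalized counts (add-alpha smoothing is an added assumption)."""
    labels = np.asarray(labels)
    pi = np.bincount(labels, minlength=K) + alpha      # class counts
    pi = pi / pi.sum()                                 # -> class prior
    beta = np.full((K, V), alpha)
    for words, c in zip(docs, labels):
        for w, cnt in Counter(words).items():
            beta[c, w] += cnt                          # per-class word counts
    beta = beta / beta.sum(axis=1, keepdims=True)      # -> word distributions
    return pi, beta

# e.g., with the docs/labels sampled in the previous sketch:
# pi_hat, beta_hat = fit_multinomial_nb(docs, labels, K, V)
```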

10 Review – unsupervised Naïve Bayes
Mixture model: unsupervised naïve Bayes model (same structure, but the class node is now a latent variable Z)
- Joint probability of words and classes is as before
- But classes are not visible: Z must be summed out, and the parameters fit with EM
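A sketch of the EM fit the outline refers to ("mixture of multinomials & EM"); the smoothing constant and the count-matrix input format are assumptions:

```python
import numpy as np

def em_mixture_of_multinomials(counts, K, iters=50, smooth=1e-2, rng=None):
    """EM for a mixture of multinomials. counts is an (M, V) matrix of
    word counts per document; Z is never observed, only soft-assigned."""
    rng = np.random.default_rng() if rng is None else rng
    M, V = counts.shape
    pi = np.full(K, 1.0 / K)                     # mixing weights
    beta = rng.dirichlet(np.ones(V), size=K)     # component word distributions
    for _ in range(iters):
        # E-step: responsibilities r[d, k] = Pr(Z_d = k | words, params)
        log_r = np.log(pi) + counts @ np.log(beta).T
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the soft counts
        pi = r.mean(axis=0)
        beta = r.T @ counts + smooth
        beta /= beta.sum(axis=1, keepdims=True)
    return pi, beta, r
```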

11 LDA

12 Review – LDA
Motivation
Assumptions: (1) documents are i.i.d.; (2) within a document, words are i.i.d. (bag of words)
For each document d = 1, ..., M
  Generate θ_d ~ D1(...)
  For each word n = 1, ..., N_d
    Generate w_n ~ D2(· | θ_d)
Now pick your favorite distributions for D1, D2.

13 Latent Dirichlet Allocation
For each document d = 1, ..., M
  Generate θ_d ~ Dir(· | α)
  For each position n = 1, ..., N_d
    Generate z_n ~ Mult(· | θ_d)
    Generate w_n ~ Mult(· | β_{z_n})
"Mixed membership" over the K topics.
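A minimal sketch of this generative process; the sizes M, K, V and the hyperparameter values alpha, eta are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, V = 100, 10, 5000        # docs, topics, vocab size (assumed)
alpha, eta = 0.1, 0.01         # symmetric Dirichlet hyperparameters (assumed)
beta = rng.dirichlet(np.full(V, eta), size=K)   # K topic-word distributions

def generate_doc(n_words):
    theta = rng.dirichlet(np.full(K, alpha))    # theta_d ~ Dir(alpha)
    z = rng.choice(K, size=n_words, p=theta)    # z_n ~ Mult(theta_d)
    w = np.array([rng.choice(V, p=beta[k]) for k in z])  # w_n ~ Mult(beta_{z_n})
    return theta, z, w

theta, z, w = generate_doc(200)
```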

14 vs Naïve Bayes…
[plate diagram: same structure, but in LDA the topic indicator z is drawn per word, while in naïve Bayes one class is drawn per document]

15 LDA’s view of a document

16 LDA topics

17 Review – LDA
Parameter learning for Latent Dirichlet Allocation:
- Variational EM
  - Numerical approximation using lower bounds
  - Results in biased solutions
  - Convergence has numerical guarantees
- Gibbs sampling
  - Stochastic simulation
  - Unbiased solutions
  - Stochastic convergence

18 Review – LDA
Gibbs sampling
- Applicable when the joint distribution is hard to evaluate but the conditional distributions are known
- The sequence of samples comprises a Markov chain
- The stationary distribution of the chain is the joint distribution
Key capability: estimate the distribution of one latent variable given the other latent variables and the observed variables.
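A toy illustration of the idea, on a distribution where the conditionals are easy: Gibbs sampling a correlated bivariate normal by alternating draws from x | y and y | x (the example itself is not from the slides):

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_samples=10000, rng=None):
    """Gibbs sampling for a 2D standard normal with correlation rho.
    Each conditional x | y and y | x is a 1D normal, easy to sample."""
    rng = np.random.default_rng() if rng is None else rng
    x = y = 0.0
    s = np.sqrt(1.0 - rho**2)
    samples = np.empty((n_samples, 2))
    for t in range(n_samples):
        x = rng.normal(rho * y, s)   # draw x | y
        y = rng.normal(rho * x, s)   # draw y | x
        samples[t] = x, y            # the sequence forms a Markov chain
    return samples

samples = gibbs_bivariate_normal(0.9)   # empirical corr approaches 0.9
```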

19 Why does Gibbs sampling work?
- What's the fixed point?
  - The stationary distribution of the chain is the joint distribution
- When will it converge (in the limit)?
  - When the graph defined by the chain is connected
- How long will it take to converge?
  - Depends on the second eigenvalue of the chain's transition matrix

20 [equation slide: the collapsed Gibbs sampling update for LDA]

21 Called "collapsed Gibbs sampling" since you've marginalized away some variables (here, the multinomial parameters θ and β).
From: Gregor Heinrich, "Parameter estimation for text analysis"

22 Review – LDA
Collapsed Gibbs sampling for Latent Dirichlet Allocation:
Randomly initialize each z_{m,n}
Repeat for t = 1, ...
  For each doc m, word n
    Compute Pr(z_{mn} = k | other z's)
    Sample z_{mn} according to that distribution
"Mixed membership"
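A sketch of that loop as a collapsed Gibbs sampler for LDA, using the standard count-ratio form of Pr(z_mn = k | other z's); hyperparameter values and variable names are assumptions:

```python
import numpy as np

def lda_collapsed_gibbs(docs, K, V, alpha=0.1, eta=0.01, iters=200, rng=None):
    """Collapsed Gibbs for LDA: theta and beta are marginalized away,
    so we only resample the topic assignments z from count statistics."""
    rng = np.random.default_rng() if rng is None else rng
    M = len(docs)
    n_dk = np.zeros((M, K))        # topic counts per document
    n_kw = np.zeros((K, V))        # word counts per topic
    n_k = np.zeros(K)              # total words per topic
    z = [rng.integers(K, size=len(d)) for d in docs]   # random init
    for m, doc in enumerate(docs):
        for n, w in enumerate(doc):
            k = z[m][n]
            n_dk[m, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    for _ in range(iters):
        for m, doc in enumerate(docs):
            for n, w in enumerate(doc):
                k = z[m][n]        # remove this word's current assignment
                n_dk[m, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # Pr(z_mn = k | other z's), up to normalization
                p = (n_dk[m] + alpha) * (n_kw[:, w] + eta) / (n_k + V * eta)
                k = rng.choice(K, p=p / p.sum())
                z[m][n] = k        # put it back under the new assignment
                n_dk[m, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    return z, n_dk, n_kw
```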

23 Outline
- Stochastic block models & inference question
- Review of text models
  - Mixture of multinomials & EM
  - LDA and Gibbs (or variational EM)
- Block models and inference
- Mixed-membership block models
- Multinomial block models and inference w/ Gibbs
- Bestiary of other probabilistic graph models
  - Latent-space models, exchangeable graphs, p1, ERGM

24 Review – LDA
Motivation
Assumptions: (1) documents are i.i.d.; (2) within a document, words are i.i.d. (bag of words)
For each document d = 1, ..., M
  Generate θ_d ~ D1(...)
  For each word n = 1, ..., N_d
    Generate w_n ~ D2(· | θ_d)
Docs and words are exchangeable.

25 Stochastic Block models
Assume (1) nodes within a block z and (2) edges between blocks z_p, z_q are exchangeable.
[plate diagram: block assignment z_p for each of N nodes; edge indicator a_pq for each of the N^2 node pairs]

26 Stochastic Block models
Assume (1) nodes within a block z and (2) edges between blocks z_p, z_q are exchangeable.
Gibbs sampling:
  Randomly initialize z_p for each node p
  For t = 1, ...
    For each node p
      Compute Pr(z_p | other z's)
      Sample z_p
See: Snijders & Nowicki, 1997, "Estimation and Prediction for Stochastic Blockmodels for Graphs with Latent Block Structure"
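A sketch of Gibbs sampling for the block model. Rather than the collapsed sampler of Snijders & Nowicki, this simpler variant alternates between sampling the block-pair probabilities P from conjugate Beta posteriors and resampling each node's block; the Beta(a, b) prior and a uniform prior over block assignments are assumptions:

```python
import numpy as np

def sbm_gibbs(A, K, iters=100, a=1.0, b=1.0, rng=None):
    """Gibbs sampling for a stochastic block model on a symmetric 0/1
    adjacency matrix A (no self-loops): alternate sampling the block-pair
    probabilities P and each node's block assignment z_p."""
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[0]
    z = rng.integers(K, size=n)
    for _ in range(iters):
        # Sample P | z, A: conjugate Beta(a, b) update per block pair
        P = np.empty((K, K))
        for i in range(K):
            for j in range(K):
                mi, mj = z == i, z == j
                edges = A[np.ix_(mi, mj)].sum()
                pairs = mi.sum() * mj.sum()
                if i == j:   # undirected: each within-block edge counted twice
                    edges, pairs = edges / 2, mi.sum() * (mi.sum() - 1) / 2
                P[i, j] = rng.beta(a + edges, b + max(pairs - edges, 0))
        # Sample z_p | P, A, other z's for each node p
        for p in range(n):
            logp = np.zeros(K)
            for k in range(K):
                pr = np.delete(P[k, z], p)   # edge prob to every other node
                ap = np.delete(A[p], p)      # observed edges of node p
                logp[k] = (ap * np.log(pr) + (1 - ap) * np.log(1 - pr)).sum()
            w = np.exp(logp - logp.max())
            z[p] = rng.choice(K, p=w / w.sum())
    return z, P
```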

27 Mixed Membership Stochastic Block models
[plate diagram: each node p has a membership vector π_p; for each pair (p,q), a sender role z_{p→·} ~ π_p and a receiver role z_{·→q} ~ π_q determine the edge indicator a_pq, over N^2 pairs]
Airoldi et al., JMLR 2008

28 Parkkinen et al. paper

29 Another mixed membership block model

30 Notation for the sampler:
- z = (z_i, z_j) is a pair of block ids
- n_z = #pairs with label z
- q_{z1,i} = #links to i from block z1
- q_{z1,·} = #outlinks in block z1
- δ = indicator for the diagonal
- M = #nodes

31 Another mixed membership block model

32 Experiments
Balasubramanyan, Lin, Cohen, NIPS workshop 2010, plus lots of synthetic data

33 [results figure]

34 [results figure]

35 Experiments
Balasubramanyan, Lin, Cohen, NIPS workshop 2010

36 Experiments
Balasubramanyan, Lin, Cohen, NIPS workshop 2010

