Online Multiscale Dynamic Topic Models

1 Online Multiscale Dynamic Topic Models
Best Research Paper Award Honorable Mention
Tomoharu Iwata, Yasushi Sakurai, Takeshi Yamada, Naonori Ueda
NTT Communication Science Laboratories, Japan

2 Introduction
Topic models for analyzing document dynamics
Models
  Dynamic topic model [Blei+06]
  Topics over time [Wang+06]
  Dynamic mixture model [Wei+07]
  Topic tracking model [Iwata+09]
Data
  scientific papers
  news articles
  blogs

3 Multiscale dynamics
Topics naturally evolve with multiple timescales
Example: Politics topic in news articles
  Long timescale (many years): constitution, congress, president
  Middle timescale (tens of years): names of members in Congress
  Short timescale (a few days): names of bills under discussion

4 Proposed model
Multiscale Dynamic Topic Model (MDTM)
Topic model for analyzing dynamics with multiple timescales
Robust
  Information loss is reduced by considering both short- and long-timescale dynamics
Efficient online inference
  The model is updated using only newly obtained data
  Past data need not be stored

5 Standard topic model
Latent Dirichlet Allocation (LDA) [Blei+03]
  Basis of the proposed model
  A document is modeled as a mixture of topics
  Word distribution is generated from a symmetric Dirichlet
  No dynamics
[Figure: graphical model. Nodes: topic proportions (Dirichlet), topic (Multinomial), word (Multinomial), word distribution (Dirichlet). Plates: #words, #docs, #topics. ○: latent variable, ●: observed variable, □: repetition]
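
The LDA generative process summarized on this slide can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' code: the function names, the hyperparameter values `alpha` and `beta`, and the fixed document length are all assumptions made for the example.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(n_docs, n_words, n_topics, vocab_size,
                    alpha=0.5, beta=0.1, seed=0):
    """Generate documents (lists of word ids) following the LDA process.

    alpha and beta are illustrative symmetric Dirichlet parameters.
    """
    rng = random.Random(seed)
    # One word distribution per topic, drawn from a symmetric Dirichlet.
    phi = [sample_dirichlet([beta] * vocab_size, rng) for _ in range(n_topics)]
    docs = []
    for _ in range(n_docs):
        # Per-document topic proportions: the document is a mixture of topics.
        theta = sample_dirichlet([alpha] * n_topics, rng)
        doc = []
        for _ in range(n_words):
            z = rng.choices(range(n_topics), weights=theta)[0]    # latent topic
            w = rng.choices(range(vocab_size), weights=phi[z])[0]  # observed word
            doc.append(w)
        docs.append(doc)
    return docs, phi
```

Note that nothing in this process depends on time, which is the "no dynamics" limitation the slide points out.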

6 Multiscale word distribution
The word distribution at t is generated depending on a weighted sum of the multiscale word distributions at t-1, from the short-scale to the long-scale distribution
Word distribution at scale s: covers epochs from t-2^(s-1) to t-1

7 Generative process of MDTM
Topic proportions' prior (of documents at epoch t): Gamma, generated depending on the previous value
Topic proportions: Dirichlet
Topic: Multinomial
Word: Multinomial
Word distribution: Dirichlet, generated depending on the weighted sum of multiscale word distributions
[Figure: graphical model with weight and multiscale word dist. nodes, repeated over #scales]
* ξ_{t-1}: hyper-parameter
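
The two time-dependent priors on this slide can be illustrated with a short sketch. It is written under stated assumptions rather than taken from the paper: the weights are treated as unnormalized positive values that scale each multiscale distribution, and the Gamma draw is parameterized (shape `gamma * prev_alpha`, scale `1 / gamma`) so that its mean equals the previous value. The function names and the `gamma` parameter are hypothetical.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_word_distribution(multiscale_dists, weights, rng):
    """Draw a word distribution whose Dirichlet prior is the weighted sum
    of the multiscale word distributions from the previous epoch.

    Every entry of the resulting prior must be strictly positive.
    """
    vocab_size = len(multiscale_dists[0])
    prior = [
        sum(lam * dist[w] for lam, dist in zip(weights, multiscale_dists))
        for w in range(vocab_size)
    ]
    return sample_dirichlet(prior, rng), prior


def sample_topic_prior(prev_alpha, gamma, rng):
    """Gamma draw centered on the previous value (assumed parameterization):
    shape = gamma * prev_alpha, scale = 1 / gamma, so the mean is prev_alpha."""
    return rng.gammavariate(gamma * prev_alpha, 1.0 / gamma)
```

A larger weight on a scale pulls the new word distribution more strongly toward that scale's distribution, which is how the weighted sum lets the model mix short-scale and long-scale information.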

8 Online inference
Update the model at each epoch using
  the newly obtained data
  the previous model
Stochastic EM
  [E-step] collapsed Gibbs sampling of latent topics
  [M-step] maximum joint likelihood of parameters
[Figure: models and data at epochs t and t+1]

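9 Estimation of multiscale distribution
Maximum likelihood estimate
  word probability of scale s = word count of scale s, normalized over words
  word count of scale s = sum of the word counts at epoch t' over the 2^(s-1) epochs covered by scale s
[Figure: epoch timeline with labels t-2^(s-1)+1, t-1, t; word count at epoch t]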

10 Estimation of multiscale distribution
Maximum likelihood estimate
  word probability of scale s = word count of scale s, normalized over words
  word count of scale s = sum of the word counts at epoch t' over the 2^(s-1) epochs covered by scale s
Online update
  current value = previous value + current word count - first word count in the scale
Required memory
[Figure: epoch timeline with labels t-2^(s-1)+1, t-1, t; word count at epoch t]
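
The exact online update on this slide amounts to a sliding-window sum. The sketch below is one illustrative way to write it, assuming counts are plain lists indexed by word id and that the window of scale s holds 2^(s-1) epochs; the class name `ScaleCount` and its methods are hypothetical, and no smoothing is applied to the maximum likelihood estimate.

```python
from collections import deque


class ScaleCount:
    """Running word count over the most recent 2**(scale-1) epochs."""

    def __init__(self, scale, vocab_size):
        self.window = 2 ** (scale - 1)
        self.history = deque()          # per-epoch counts inside the window
        self.total = [0] * vocab_size   # word count of this scale

    def update(self, epoch_count):
        """current value = previous value + current count - first count."""
        self.history.append(list(epoch_count))
        for w, c in enumerate(epoch_count):
            self.total[w] += c
        if len(self.history) > self.window:
            oldest = self.history.popleft()  # first word count in the scale
            for w, c in enumerate(oldest):
                self.total[w] -= c

    def probabilities(self):
        """Maximum likelihood estimate: counts normalized over words."""
        n = sum(self.total)
        return [c / n for c in self.total] if n else None
```

Because the oldest count has to be subtracted, this version keeps every per-epoch count inside the window, so the stored history grows with the window length of the longest scale.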

11 Approximated efficient estimation of multiscale word distribution
Decrease update frequency for long-scale dist. Store only the previous epoch count

12 Approximated efficient estimation of multiscale word distribution
Decrease update frequency for long-scale distributions
Store only the previous epoch count
Required memory
Epochs whose counts are included at each scale (e.g., 3 = count at t=3; the scale=3 row is the count of s=3):
                        t=4      t=5      t=6      t=7      t=8
  scale=3               1 2 3 4  1 2 3 4  1 2 3 4  1 2 3 4  5 6 7 8
  scale=2               3 4      3 4      5 6      5 6      7 8
  scale=1               4        5        6        7        8
  newly obtained count           5        6        7        8
  update                         [s=1]    [s=1,2]  [s=1]    [s=1,2,3]
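
The update schedule on this slide can be reproduced with a short routine. The following is a sketch consistent with the slide's example, not necessarily the authors' exact procedure: it assumes scale s is refreshed whenever the epoch index t is divisible by 2^(s-1), and that the new value of each scale is built from the value the next shorter scale held just before it was refreshed. The function name `approx_update` is hypothetical, and `collections.Counter` stands in for a word-count vector.

```python
from collections import Counter


def approx_update(scale_counts, new_count, t):
    """Update per-scale counts in place when the count for epoch t arrives.

    scale_counts[s - 1] holds the aggregated count for scale s.
    Scale s is refreshed only every 2**(s - 1) epochs (assumed schedule).
    """
    carry = new_count  # count over the window of length 1 ending at t
    for s in range(1, len(scale_counts) + 1):
        if t % (2 ** (s - 1)) != 0:
            break  # longer scales are not due for an update at this epoch
        previous = scale_counts[s - 1]
        scale_counts[s - 1] = carry
        # previous window + new window = window of length 2**s ending at t
        carry = previous + carry
    return scale_counts
```

Starting from the t=4 column, feeding in the counts for t=5 to t=8 refreshes scale 1 at every epoch, scale 2 at t=6 and t=8, and scale 3 only at t=8. Only one aggregated count per scale is kept, at the cost of long-scale distributions that lag behind the newest data between refreshes.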

13 Experiments
Data sets
  NIPS: papers in NIPS from 1987 to 1999
  PNAS: titles in PNAS from 1915 to 2005
  Digg: blog posts in social news site Digg from 1/29 to 2/20 in 2000
  Addresses: the State of the Union addresses from 1790 to 2002
Methods
  MDTM: Online Multiscale Dynamic Topic Model
  DTM: Online Dynamic Topic Model (MDTM with #scales=1)
  LDAall: LDA that uses all past data for inference
  LDAone: LDA that uses just the current data for inference
  LDAonline: LDA with online inference

14 Average perplexity (standard deviation)
MDTM can appropriately model the dynamics through its use of multiscale properties
DTM does not model the long-timescale dependencies
LDAall and LDAonline do not model the dynamics
LDAone ignores the past information
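
For readers unfamiliar with the metric, perplexity is the exponentiated negative average log-likelihood per held-out word, so lower values mean better prediction. The sketch below uses the standard topic-model definition; the slides do not spell out the authors' exact evaluation protocol, so the function name `perplexity` and the inputs `theta` (per-document topic proportions) and `phi` (per-topic word distributions) are assumptions for illustration.

```python
import math


def perplexity(docs, theta, phi):
    """exp(-(sum of log p(w | d)) / number of tokens), where
    p(w | d) = sum over topics z of theta[d][z] * phi[z][w]."""
    log_lik = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for w in doc:
            p = sum(theta[d][z] * phi[z][w] for z in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)
```

As a sanity check, a model that predicts every word uniformly over a vocabulary of V words has perplexity V.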

15 Perplexity with different #scales
[Figure: perplexity versus #scales for Digg and Addresses]
Perplexities decreased as #scales increased, which indicates the importance of considering multiscale dynamics

16 Estimated weights for each scale
[Figure: weight (λ) versus scale for Addresses and Digg]
Weights decreased as the timescale lengthened: recent short-scale distributions are more informative for estimating the current distribution

17 Topic extraction 1 (NIPS)
Speech recognition topic
speech recognition word speaker training set tdnn time test speakers system data letter state letters neural utterance words phoneme classification state hmm system probabilities model words context hmms markov probability level phonetic segmentation language segment accuracy duration continuous unit male spectral feature false acoustic independent models normalization rate trained gradient log likelihood models sequence sequences hidden hybrid states frame transition hidden states models feature continuous modeling features adaption human acoustic

18 Topic extraction 2 (NIPS)
Reinforcement learning topic
learning state control action time policy reinforcement optimal actions recognition dynamic space model exploration states programming barto sutton goal task function state algorithm model agent decision step reward markov space robot based controller system forward level memory real jordan world skills policies singh adaptive iteration stochastic transition values expected based grid based memory controller continuous cost system temporal iteration interpolation rl machine policies environment iteration mdp singh finite update search

19 Conclusion
Topic model with multiscale dynamics
  Efficient online inference
  Experimentally confirmed the high predictive performance
Future work
  Automatic determination of length of scale, and #topics
  Evaluation on other data, such as web access log, blog,
