1
Topic Modeling using Latent Dirichlet Allocation
2
Topic Modeling
A process of analyzing large collections of documents in order to discover the latent topics they contain.
Helps organize and structure the documents
Discovers the different topics that a document covers
Measures how similar certain documents are
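As a rough illustration of the similarity use case: once every document has been summarized as a vector of topic proportions, two documents can be compared by a distance between those vectors. The sketch below uses the Hellinger distance, which is one common choice and not something the slides prescribe. The array `theta` and its values are hypothetical, made up for the example.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

# Hypothetical topic proportions for three documents over K = 3 topics.
theta = np.array([
    [0.8, 0.1, 0.1],   # document 0: mostly topic 0
    [0.7, 0.2, 0.1],   # document 1: also mostly topic 0
    [0.1, 0.1, 0.8],   # document 2: mostly topic 2
])

d01 = hellinger(theta[0], theta[1])  # small: similar topic mix
d02 = hellinger(theta[0], theta[2])  # larger: different topic mix
```

Documents 0 and 1 share a dominant topic, so their distance comes out smaller than the distance between documents 0 and 2.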
4
Latent Dirichlet Allocation (LDA)
It is an unsupervised learning method
It defines a generative model of the documents
5
Terminology
Word: w ∈ {1,…,V}, an index into a vocabulary of size V
Document: a sequence of N words
Corpus: a set of M documents
Topic: z ∈ {1,…,K}
6
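Topic
A topic is a set of co-occurring terms; formally, it is a probability distribution over the vocabulary in which terms that tend to co-occur receive high probability.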
7
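Generative Process
For each document:
Choose N based on a Poisson distribution (N is the number of words in the document)
Choose θ based on a Dirichlet distribution (θ is the topic weight vector of the document)
For each of the N words:
  Choose a topic z from the multinomial distribution with weights θ
  Choose a word w from the word distribution of topic z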
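The generative process can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions: the names `generate_document`, `alpha` (Dirichlet parameter), `beta` (a K × V matrix whose row k is the word distribution of topic k) and `xi` (Poisson mean) are chosen for the example, and the parameter values are arbitrary.

```python
import numpy as np

def generate_document(alpha, beta, xi, rng):
    """Sample one document by following the generative steps.

    alpha: (K,) Dirichlet parameter (assumed name)
    beta:  (K, V) matrix, row k = word distribution of topic k (assumed name)
    xi:    mean of the Poisson distribution over document length (assumed name)
    """
    n_topics, vocab_size = beta.shape
    n_words = rng.poisson(xi)                              # choose N
    theta = rng.dirichlet(alpha)                           # choose topic weights theta
    topics = rng.choice(n_topics, size=n_words, p=theta)   # choose z for each word
    words = np.array(
        [rng.choice(vocab_size, p=beta[z]) for z in topics],  # choose w given z
        dtype=int,
    )
    return theta, topics, words

rng = np.random.default_rng(0)
K, V = 3, 8                                     # arbitrary toy sizes
alpha = np.full(K, 0.5)
beta = rng.dirichlet(np.full(V, 0.5), size=K)   # random topics, for illustration only
theta, topics, words = generate_document(alpha, beta, xi=20, rng=rng)
```

Each call returns the topic weights, the per-word topic assignments, and the word indices of one synthetic document. Learning runs this story in reverse: only `words` is observed, and the hidden `theta` and `topics` have to be inferred.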
8
Learning
Variational Bayes
Gibbs Sampling
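To make the Gibbs sampling option concrete, here is a minimal sketch of a collapsed Gibbs sampler, one common way to apply Gibbs sampling to LDA. It assumes the smoothed variant of the model, in which each topic's word distribution also has a symmetric Dirichlet prior. The function name `gibbs_lda`, the hyperparameters `alpha` and `eta`, and the toy corpus are all assumptions made for the example, not part of the slides.

```python
import numpy as np

def gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, eta=0.01, n_iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on documents given as lists of word indices."""
    rng = np.random.default_rng(seed)
    doc_topic = np.zeros((len(docs), n_topics))    # words in doc d assigned to topic k
    topic_word = np.zeros((n_topics, vocab_size))  # times word w is assigned to topic k
    topic_total = np.zeros(n_topics)               # total words assigned to topic k

    # Random initial topic assignment for every word position.
    assignments = [rng.integers(n_topics, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):
        for w, z in zip(doc, assignments[d]):
            doc_topic[d, z] += 1
            topic_word[z, w] += 1
            topic_total[z] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d, z] -= 1
                topic_word[z, w] -= 1
                topic_total[z] -= 1
                # Conditional distribution of this word's topic given all others.
                p = (doc_topic[d] + alpha) * (topic_word[:, w] + eta)
                p = p / (topic_total + vocab_size * eta)
                z = rng.choice(n_topics, p=p / p.sum())
                # Record the new assignment.
                assignments[d][i] = z
                doc_topic[d, z] += 1
                topic_word[z, w] += 1
                topic_total[z] += 1

    # Point estimates from the final counts.
    theta = (doc_topic + alpha) / (doc_topic + alpha).sum(axis=1, keepdims=True)
    phi = (topic_word + eta) / (topic_word + eta).sum(axis=1, keepdims=True)
    return theta, phi

# Toy corpus (hypothetical): words 0-2 and words 3-5 never appear together.
docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 1, 1], [5, 3, 4, 4]]
theta, phi = gibbs_lda(docs, n_topics=2, vocab_size=6, n_iters=50)
```

Here `theta` holds the estimated topic proportions of each document and `phi` the estimated word distribution of each topic. A practical sampler would also discard early iterations and average over several samples; the sketch keeps only the final state for brevity.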
9
Applications of LDA
Collaborative Filtering
Spam Detection
Music
Image
10
References
D. M. Blei, A. Y. Ng, M. I. Jordan. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993-1022.
D. J. Hu. (2009). Latent Dirichlet Allocation for text, images, and music.