Published by Penelope Pitts. Modified over 9 years ago.
Project 2: Latent Dirichlet Allocation
2014/4/29
Beom-Jin Lee
Data Selection
Enron Email Dataset
– http://www.cs.cmu.edu/~enron/
NIPS 1-17 data
– http://ai.stanford.edu/~gal/data.html
– http://www.cs.nyu.edu/~roweis/data.html
Datahub (Wikipedia data, Wikinews, etc.)
– http://datahub.io/en/dataset
Reuters Corpora (RCV1, RCV2, TRC2)
– http://trec.nist.gov/data/reuters/reuters.html
Newsgroup data
– http://www.infochimps.com/datasets/20-newsgroups-dataset-de-duped-version
Company Datasets
– http://endb-consolidated.aihit.com/datasets.htm
Twitter Data
– http://snap.stanford.edu/data/twitter7.html
Methodology
Original Paper
– Latent Dirichlet Allocation, David M. Blei, Andrew Y. Ng, Michael I. Jordan, Journal of Machine Learning Research 3, 993–1022, 2003
Toolbox
– http://psiexp.ss.uci.edu/research/programs_data/toolbox.htm
Help
– http://www.4four.us/article/2010/11/latent-dirichlet-allocation-simply
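To make the methodology concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling. This is a swapped-in technique: the original paper uses variational inference, and Gibbs sampling is shown here only because it is short to write down. The function name lda_gibbs, the hyperparameter defaults, and the toy corpus are all assumptions made for illustration and are not part of the project materials.

```python
import random

def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids.
    Returns (theta, phi): per-document topic proportions and
    per-topic word distributions, estimated from the final sample.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # tokens in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # times word w is assigned to topic k
    topic_total = [0] * num_topics                               # total tokens assigned to topic k
    assign = []                                                  # topic of each token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return theta, phi

# Toy corpus (hypothetical): word ids 0-2 and 3-5 tend to co-occur.
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3], [4, 5, 3, 5, 4]]
theta, phi = lda_gibbs(docs, num_topics=2, vocab_size=6)
```

Each row of theta is a document's topic mixture and each row of phi is a topic's distribution over the vocabulary. For the datasets listed earlier, the word ids would come from a tokenization and vocabulary-building step that this sketch leaves out.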
Evaluation Criteria
Baseline
– Data selection, data inspection, methodology report, results from using LDA
Plus points
1. Big data processing method (Wikipedia, Wall Street Journal, etc.)
2. Comparison of different kinds of models
3. Improvements to LDA
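For the model comparison item, one commonly used measure is perplexity, which the Blei et al. paper uses to compare LDA against other models. The slides do not say which metric the project expects, so treating perplexity as the comparison metric is an assumption. The sketch below computes it from document-topic proportions theta and topic-word distributions phi that are assumed to be already available; the function name perplexity and the toy inputs are hypothetical.

```python
import math

def perplexity(docs, theta, phi):
    """Per-word perplexity: exp(-(total log-likelihood) / (number of tokens)).

    docs is a list of documents, each a list of integer word ids.
    theta[d][k] is the proportion of topic k in document d.
    phi[k][w] is the probability of word w under topic k.
    """
    log_lik = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for w in doc:
            # Probability of the word under the document's topic mixture.
            p = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)

# Sanity check on hypothetical inputs: with uniform distributions over a
# 4-word vocabulary, every word has probability 0.25.
docs = [[0, 1, 2], [2, 3]]
theta = [[0.5, 0.5], [0.5, 0.5]]
phi = [[0.25] * 4, [0.25] * 4]
uniform_ppl = perplexity(docs, theta, phi)
```

Lower perplexity means the model assigns higher probability to the text. A fair comparison would evaluate on held-out documents, which also requires inferring theta for those documents; that step is not shown here.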