Dynamic Multi-Faceted Topic Discovery in Twitter Date : 2013/11/27 Source : CIKM’13 Advisor : Dr.Jia-ling, Koh Speaker : Wei, Chang 1.

Slides:



Advertisements
Similar presentations
Sinead Williamson, Chong Wang, Katherine A. Heller, David M. Blei
Advertisements

Topic models Source: Topic models, David Blei, MLSS 09.
Entity-Centric Topic-Oriented Opinion Summarization in Twitter Date : 2013/09/03 Author : Xinfan Meng, Furu Wei, Xiaohua, Liu, Ming Zhou, Sujian Li and.
Finding Topic-sensitive Influential Twitterers Presenter 吴伟涛 TwitterRank:
Linking Named Entity in Tweets with Knowledge Base via User Interest Modeling Date : 2014/01/22 Author : Wei Shen, Jianyong Wang, Ping Luo, Min Wang Source.
Title: The Author-Topic Model for Authors and Documents
Text-Based Measures of Document Diversity Date : 2014/02/12 Source : KDD’13 Authors : Kevin Bache, David Newman, and Padhraic Smyth Advisor : Dr. Jia-Ling,
Date : 2013/09/17 Source : SIGIR’13 Authors : Zhu, Xingwei
Probabilistic Clustering-Projection Model for Discrete Data
Statistical Topic Modeling part 1
Joint Sentiment/Topic Model for Sentiment Analysis Chenghua Lin & Yulan He CIKM09.
MIS 696A Final Presentation Victor Benjamin, Joey Buckman, Xiaobo Cao, Weifeng Li, Zirun Qi, Lee Spitzley, Yun Wang, Rich Yueh.
Parallelized variational EM for Latent Dirichlet Allocation: An experimental evaluation of speed and scalability Ramesh Nallapati, William Cohen and John.
Generative Topic Models for Community Analysis
Caimei Lu et al. (KDD 2010) Presented by Anson Liang.
Personalized Search Result Diversification via Structured Learning
Multiscale Topic Tomography Ramesh Nallapati, William Cohen, Susan Ditmore, John Lafferty & Kin Ung (Johnson and Johnson Group)
1. Social-Network Analysis Using Topic Models 2. Web Event Topic Analysis by Topic Feature Clustering and Extended LDA Model RMBI4310/COMP4332 Big Data.
Query session guided multi- document summarization THESIS PRESENTATION BY TAL BAUMEL ADVISOR: PROF. MICHAEL ELHADAD.
Modeling Scientific Impact with Topical Influence Regression James Foulds Padhraic Smyth Department of Computer Science University of California, Irvine.
Dongyeop Kang1, Youngja Park2, Suresh Chari2
Topic Models in Text Processing IR Group Meeting Presented by Qiaozhu Mei.
KDD 2012, Beijing, China Community Discovery and Profiling with Social Messages Wenjun Zhou Hongxia Jin Yan Liu.
Online Learning for Latent Dirichlet Allocation
Popularity-Aware Topic Model for Social Graphs Junghoo “John” Cho UCLA.
Hierarchical Topic Models and the Nested Chinese Restaurant Process Blei, Griffiths, Jordan, Tenenbaum presented by Rodrigo de Salvo Braz.
Topic Modelling: Beyond Bag of Words By Hanna M. Wallach ICML 2006 Presented by Eric Wang, April 25 th 2008.
Transfer Learning Task. Problem Identification Dataset : A Year: 2000 Features: 48 Training Model ‘M’ Testing 98.6% Training Model ‘M’ Testing 97% Dataset.
Date : 2013/03/18 Author : Jeffrey Pound, Alexander K. Hudek, Ihab F. Ilyas, Grant Weddell Source : CIKM’12 Speaker : Er-Gang Liu Advisor : Prof. Jia-Ling.
Probabilistic Models for Discovering E-Communities Ding Zhou, Eren Manavoglu, Jia Li, C. Lee Giles, Hongyuan Zha The Pennsylvania State University WWW.
LOGO Identifying Opinion Leaders in the Blogosphere Xiaodan Song, Yun Chi, Koji Hino, Belle L. Tseng CIKM 2007 Advisor : Dr. Koh Jia-Ling Speaker : Tu.
Topic Modeling using Latent Dirichlet Allocation
Latent Dirichlet Allocation
Abdul Wahid, Xiaoying Gao, Peter Andreae
Topic Models Discovering Annotating Comparing Referring Sampling Illustrating Representing John Unsworth, “Scholarly Primitives” “Scholarly Primitives”
LINDEN : Linking Named Entities with Knowledge Base via Semantic Knowledge Date : 2013/03/25 Resource : WWW 2012 Advisor : Dr. Jia-Ling Koh Speaker : Wei.
Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework N 工科所 錢雅馨 2011/01/16 Li-Jia Li, Richard.
Link Distribution on Wikipedia [0407]KwangHee Park.
Web-Mining Agents Topic Analysis: pLSI and LDA
Advisor: Hsin-Hsi Chen Reporter: Chi-Hsin Yu Date: From Word Representations:... ACL2010, From Frequency... JAIR 2010 Representing Word... Psychological.
Visualization Lab By: Thomas Kraft.  What is being talked about and where?  Twitter has massive amounts of data  Tweets are unstructured  Goal: Quickly.
{ Adaptive Relevance Feedback in Information Retrieval Yuanhua Lv and ChengXiang Zhai (CIKM ‘09) Date: 2010/10/12 Advisor: Dr. Koh, Jia-Ling Speaker: Lin,
Scalable Learning of Collective Behavior Based on Sparse Social Dimensions Lei Tang, Huan Liu CIKM ’ 09 Speaker: Hsin-Lan, Wang Date: 2010/02/01.
Text-classification using Latent Dirichlet Allocation - intro graphical model Lei Li
Concept-Based Analysis of Scientific Literature Chen-Tse Tsai, Gourab Kundu, Dan Roth UIUC.
Inferring User Interest Familiarity and Topic Similarity with Social Neighbors in Facebook INSTRUCTOR: DONGCHUL KIM ANUSHA BOOTHPUR
Recent Paper of Md. Akmal Haidar Meeting before ICASSP 2013 報告者:郝柏翰 2013/05/23.
A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation Yee W. Teh, David Newman and Max Welling Published on NIPS 2006 Discussion.
Understanding unstructured texts via Latent Dirichlet Allocation Raphael Cohen DSaaS, EMC IT June 2015.
Multi-Modal Bayesian Embeddings for Learning Social Knowledge Graphs Zhilin Yang 12, Jie Tang 1, William W. Cohen 2 1 Tsinghua University 2 Carnegie Mellon.
Extracting Mobile Behavioral Patterns with the Distant N-Gram Topic Model Lingzi Hong Feb 10th.
Customized of Social Media Contents using Focused Topic Hierarchy
Kuifei Yu, Baoxian Zhang, Hengshu Zhu,Huanhuan Cao, and Jilei Tian
Online Multiscale Dynamic Topic Models
Measuring Sustainability Reporting using Web Scraping and Natural Language Processing Alessandra Sozzi
On the Generative Discovery of Structured Medical Knowledge
J. Zhu, A. Ahmed and E.P. Xing Carnegie Mellon University ICML 2009
Precursor pattern analysis for civil unrest events
Community-based User Recommendation in Uni-Directional Social Networks
Speaker: Jim-an tsai advisor: professor jia-lin koh
Understanding Connections: Amazon Customer Reviews
People-LDA using Face Recognition
Topic Modeling Nick Jordan.
Stochastic Optimization Maximization for Latent Variable Models
Topic models for corpora and for graphs
Michal Rosen-Zvi University of California, Irvine
Topic Models in Text Processing
TOPTRAC: Topical Trajectory Pattern Mining
Jinwen Guo, Shengliang Xu, Shenghua Bao, and Yong Yu
Preference Based Evaluation Measures for Novelty and Diversity
Presentation transcript:

Dynamic Multi-Faceted Topic Discovery in Twitter Date : 2013/11/27 Source : CIKM’13 Advisor : Dr.Jia-ling, Koh Speaker : Wei, Chang 1

Outline Introduction Approach Experiment Conclusion 2

Twitter 3

What are they talking about? Entity-centric High dynamic 4

Multiple facets of a topic discussed in Twitter 5

Goal 6

Outline Introduction Approach Framework Pre-processing LDA MfTM Experiment Conclusion 7

Framework 8 Training document Model (hyper parameter) Twitter Per document Document Vector Twitter Pre-processing

Convert to lower-case Remove punctuation and numbers “Goooood” to “good” Remove stop words Named entity recognition Entity types : person, organization, location, general terms Linked Web : Tweet : All user’s posts published during the same day are grouped as a document 9

Latent Dirichlet Allocation Each document may be viewed as a mixture of various topics. The topic distribution is assumed to have a Dirichlet prior. Unsupervised learning Need to initialize the topic number K Not Linear discriminant analysis (LDA) 10

Example I like to eat broccoli and bananas. I ate a banana and spinach smoothie for breakfast. Chinchillas and kittens are cute. My sister adopted a kitten yesterday. Look at this cute hamster munching on a piece of broccoli. Topic 1 Topic 2 : food : cute animals 11

How LDA write a document? broccoli munching breakfast bananas kittens chinchillas cute hamster 12

Real World Example 13

LDA Plate Annotation 14

LDA 15

EM algorithm Gibbs sampling Stochastic Variational Inference (SVI) 16

Multi-Faceted Topic Model 17

Outline Introduction Approach Experiment Conclusion 18

Perplexity Evaluation 19

Perplexity Evaluation 20

KL-divergence P={1/6, 1/6, 1/6, 1/6, 1/6, 1/6} Q={1/10, 1/10, 1/10, 1/10, 1/10, 1/2} KL is a non-symmetric measure 21

KL-divergence 22

Scalability A standard PC with a dual-core CPU, 4GB RAM and a 600GB hard-drive 23

Outline Introduction Approach Experiment Conclusion 24

Conclusion We propose a novel Multi-Faceted Topic Model. The model extracts semantically-rich latent topics, including general terms mentioned in the topic, named entities and a temporal distribution 25