Published by Gregory Strickland. Modified over 9 years ago.
Analysis of Social Media
MLD 10-802, LTI 11-772
William Cohen
1-27-010
Some projects from 2010’s course
Q&A sites and their community basis:
– Two-student study of the AskMetaFilter Q&A site, a spin-off from MetaFilter (a long-standing community site)
– Modeled after similar analyses of Yahoo! Answers and other Q&A sites, but AskMetaFilter is possibly different, since it’s built around an existing community
– Research question: How do Q&A sites built around communities differ from those that accrue communities?
– Data available: "Pete[r] Landwehr"
Some projects from 2010’s course
TED: Comments worth understanding
– Two-student project to analyze the 2600+ person community of commenters on 667 TED talks
– Explored:
  – (Sub)community detection algorithms
  – Tag assignment (for folksonomy tags on talks)
  – Recommending commenters from ASR transcripts of talks
– Data available (aasish@cs.cmu.edu)
Some projects from 2010’s course
Topic Models for Twitter
– One-student project (w/ help from outside)
– Approximately: LDA with each user a document
– Used for link prediction
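To make the "each user a document" idea concrete, here is a minimal sketch of the pooling step followed by a simple link-scoring step. It is not the project's code: it swaps LDA out for plain word-count vectors and ranks candidate links by cosine similarity, purely as a stand-in for comparing per-user topic distributions. The function names and the toy tweets are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

def pool_by_user(tweets):
    """Merge each user's tweets into one bag-of-words 'document'."""
    docs = defaultdict(Counter)
    for user, text in tweets:
        docs[user].update(text.lower().split())
    return dict(docs)

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_links(docs, user):
    """Rank all other users as candidate links, most similar first."""
    others = [u for u in docs if u != user]
    return sorted(others, key=lambda u: cosine(docs[user], docs[u]), reverse=True)

# Toy data (hypothetical): (user, tweet text) pairs.
tweets = [
    ("alice", "topic models for twitter"),
    ("alice", "lda topic inference"),
    ("bob", "lda topic models rock"),
    ("carol", "pizza for lunch"),
]
docs = pool_by_user(tweets)
ranking = rank_links(docs, "alice")
```

In a fuller version, the count vectors would presumably be replaced by per-user topic proportions inferred by an LDA implementation, with the same ranking step applied on top.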
Some projects from 2010’s course
Topic Models for Politics
– Hierarchical topic model for text in classes
– Also, community analysis of a political dataset
– Data available (dong.p.ng@gmail.com)
Some projects from 2010’s course
Structure, Tie Persistence and Event Detection in Large Phone and SMS Networks
– Three-student project that used a single large dataset and considered several types of analysis:
  – Can you detect unusual calling patterns (e.g., holidays)?
  – Can you predict properties like reciprocity, average call duration, and node degree from other easily measurable properties?
  – Can you predict whether social interactions will persist (vs. be transient) over time?
– Data not available (except through Heinz School)
– Paper at KDD workshop
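The quantities named on this slide can be computed from simple call records. The sketch below is illustrative only and is not the project's method: it assumes each record is a hypothetical `(caller, callee, timestamp)` tuple, measures reciprocity and node degree within one time window, and measures tie persistence as the fraction of ties from an early window that recur in a later one.

```python
from collections import defaultdict

def directed_edges(calls, start, end):
    """Set of (caller, callee) pairs with at least one call in [start, end)."""
    return {(a, b) for a, b, t in calls if start <= t < end}

def reciprocity(edges):
    """Fraction of directed edges whose reverse edge is also present."""
    if not edges:
        return 0.0
    return sum((b, a) in edges for a, b in edges) / len(edges)

def degree(edges):
    """Number of distinct contacts per node, ignoring direction."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(s) for node, s in neighbors.items()}

def undirected(edges):
    """Collapse directed pairs into undirected ties."""
    return {frozenset(e) for e in edges}

def persistence(early, late):
    """Fraction of undirected ties in the early window that recur in the late one."""
    e, l = undirected(early), undirected(late)
    return len(e & l) / len(e) if e else 0.0

# Toy call log (hypothetical): (caller, callee, timestamp).
calls = [("a", "b", 1), ("b", "a", 2), ("a", "c", 3), ("a", "b", 11), ("c", "d", 12)]
early = directed_edges(calls, 0, 10)
late = directed_edges(calls, 10, 20)
```

With features like these computed per node or per tie, the prediction questions on the slide become ordinary supervised learning problems, for example predicting whether a tie seen in the early window reappears in the late one.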
Some projects from 2010’s course
Language and Geography in Twitter
– One-student project
– Course project was exploratory:
  – Predicting location from various signals (e.g., unambiguous location names)
  – Identifying events from tweets
– Follow-up project published at EMNLP 2010
  – Later cited on Slashdot, Ars Technica, …, NPR
– Data available at http://www.ark.cs.cmu.edu/GeoText/
Additional datasets to know about
– Tae Yano and Noah Smith: political blogs and comments
  http://www.ark.cs.cmu.edu/blog-data/
– Tae Yano and Noah Smith: biased vs. unbiased sentences in political blogs
  http://sites.google.com/site/amtworkshop2010/data-1
– Tae is interested in talking to/working with people doing a project with either of these