Published by Sheryl Smith. Modified over 8 years ago.
1
Twitter-Based Research. Benny Bornfeld. Mentors: Professor Sheizaf Rafaeli, Dr. Daphne Raban
2
Where Research meets Bigbird: Research, Twitter, Big Data, My Research & Tools
3
Research Big Data Twitter
4
About Twitter
Facts
– Established in 2006
– ~140 million active users
– ~340 million messages per day
Superlatives
– “the stream of the world’s collective consciousness”
– “the first rough draft of history”
5
How does it work?
6
Retweet [diagram labels: Tweet, ReTweet, Tweet]
7
Reply
8
Twitter is used for many different purposes
9
Power Law distribution
11
Research Big Data Twitter Research
12
What is Twitter? Social network! Social Network? Mass Media?
13
Replace surveys?
14
Twitter-based predictions: “I Wanted to Predict Elections with Twitter and all I got was this Lousy Paper”
15
Twitter as a social learning platform
16
Technological determinism: why the revolution will not be tweeted? Influence: what is the influence of Twitter on society? (Clay Shirky, Malcolm Gladwell)
17
Influence in Twitter
How do we measure influence?
– Number of followers?
– Centrality?
– Creating action/reaction?
– Viral spreading?
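One of the candidate measures above, centrality, can be illustrated with a small sketch. It assumes a retweet graph given as (retweeter, original author) pairs and computes in-degree centrality, which is only one of several possible centrality definitions; the slides do not say which one was used. The function name and the input format are hypothetical.

```python
from collections import defaultdict

def in_degree_centrality(retweet_edges):
    """Share of other users who retweeted each user at least once.

    retweet_edges: iterable of (retweeter, original_author) pairs.
    This edge format is an assumption made for illustration.
    """
    nodes = set()
    incoming = defaultdict(set)
    for retweeter, author in retweet_edges:
        nodes.update((retweeter, author))
        if retweeter != author:  # ignore self-retweets
            incoming[author].add(retweeter)
    if len(nodes) < 2:
        return {node: 0.0 for node in nodes}
    # Normalize by the number of other users in the graph.
    return {node: len(incoming[node]) / (len(nodes) - 1) for node in nodes}

if __name__ == "__main__":
    edges = [("b", "a"), ("c", "a"), ("a", "d")]
    for user, score in sorted(in_degree_centrality(edges).items()):
        print(user, round(score, 2))
```

A follower count needs only the user record, whereas a centrality score like this one needs the retweet (or follow) graph to be collected as well.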
18
The Message vs the Carrier approaches
19
Research Twitter Big Data
20
Online social networks research fields: Computers, Networks, Sociology
21
Big Data in SN Research
Pros:
– Exploratory research (vs confirmatory research)
– Avoid the sampling reliability issue (power law)
– Collect what people are actually saying
– Non-intrusive
– Allow analysis of many dimensions
– Catch irregular events
22
Big Data in SN Research
Cons:
– Lots of noise
– It is sometimes hard to map the data to your research question
– Cost of collecting the data
– Lack of tools/knowledge on how to store and analyze the data
– May come at the expense of theory
23
Where Research meets Bigbird: Research, Twitter, Big Data, My Research & Tools
24
Influence: the capacity or power of persons or things to be a compelling force on or produce effects on the actions, behavior, opinions, etc., of others
25
Influence in online social networks [diagram labels: Sentiment Valence, Tweet, ReTweet]
26
The research question: Which is more viral, that is, more likely to spread in a social network (Twitter): messages of negative or of positive sentiment valence?
27
The Data
Collected ~2 million tweets about new movies
Why movies:
– People have opinions about movies
– People share their opinions about movies
– Results can be compared with other research (benchmarks)
28
Collecting the Tweets
– Twitter provides an API for collecting tweets
– Until mid-2010, full data streams were available for free; currently the rate is very limited (~150/hour)
– Full data streams (the "firehose") are available via a company called GNIP
29
Tweet-collecting architecture [diagram components: HTTP Streaming (JSON), Rules Filter, Collect App, DB Files, JSON parser, My App]
30
Data Fields
User Data:
– #followers, #following, #number of tweets, klout, tweet rate, creation date, language, name, description, location
Message Data:
– sender, content, type (original/RT), post time, device
Computed fields:
– # of RT, Total Exposure, Sentiment
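The slides do not define how the computed fields are derived. As a rough sketch, "# of RT" can be taken as the number of retweets collected for a message, and "Total Exposure" as the follower count of the original sender plus the follower counts of all retweeters. That exposure formula is an assumption made here for illustration, and the record layout (`followers` key) is hypothetical.

```python
def computed_fields(original, retweets):
    """Derive per-message fields from an original tweet and its retweets.

    original: dict with a 'followers' count for the sender.
    retweets: list of dicts, each with the retweeter's 'followers' count.
    ASSUMPTION: total exposure = sender followers + retweeter followers.
    Overlapping audiences are counted more than once.
    """
    num_rt = len(retweets)
    total_exposure = original["followers"] + sum(rt["followers"] for rt in retweets)
    return {"num_rt": num_rt, "total_exposure": total_exposure}

if __name__ == "__main__":
    tweet = {"sender": "alice", "followers": 100}
    rts = [{"sender": "bob", "followers": 50}, {"sender": "carol", "followers": 250}]
    print(computed_fields(tweet, rts))
```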
31
Reading Tasks
– Handle partial messages
– Handle broken messages
– Handle duplicate messages
– Handle special characters
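The four reading tasks can be sketched as a single defensive reader. This is a minimal illustration, not the parser used in the research: it assumes one JSON object per line and hypothetical `id` and `text` field names, and it treats "special characters" as HTML entities and Unicode normalization.

```python
import html
import json
import unicodedata

def read_tweets(lines):
    """Yield clean tweet dicts from an iterable of JSON lines."""
    seen_ids = set()
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank line, e.g. a keep-alive in a stream
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # broken or truncated message
        if not isinstance(tweet, dict):
            continue
        tweet_id = tweet.get("id")
        text = tweet.get("text")
        if tweet_id is None or not isinstance(text, str):
            continue  # partial message: required fields are missing
        if tweet_id in seen_ids:
            continue  # duplicate message
        seen_ids.add(tweet_id)
        # Special characters: decode HTML entities, normalize Unicode.
        tweet["text"] = unicodedata.normalize("NFC", html.unescape(text))
        yield tweet

if __name__ == "__main__":
    sample = [
        '{"id": 1, "text": "Loved it &amp; more"}',
        '{"id": 1, "text": "Loved it &amp; more"}',
        '{"id": 2, "text": "trunc',
        '{"id": 3}',
    ]
    for t in read_tweets(sample):
        print(t["id"], t["text"])
```

Keeping the set of seen ids in memory is fine for a sketch; for millions of tweets a database uniqueness constraint may be a better fit.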
32
Clean the data
– Unrelated messages (e.g. "build your dream house")
– Spammers
– Gibberish messages
– Normalize the data (e.g. Tweets/Time)
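A rough sketch of some of these cleaning steps follows. The exclusion phrase comes from the slide's own example; the gibberish heuristic (share of letters and spaces in the text) and its threshold are assumptions made for illustration, and spammer detection is left out because the slides give no criterion for it.

```python
def is_related(text, exclude_phrases):
    """False if the text contains a phrase known to mark off-topic tweets."""
    lowered = text.lower()
    return not any(phrase in lowered for phrase in exclude_phrases)

def is_gibberish(text, min_letter_ratio=0.5):
    """Heuristic (assumed): too few letters or spaces means gibberish."""
    if not text:
        return True
    readable = sum(ch.isalpha() or ch.isspace() for ch in text)
    return readable / len(text) < min_letter_ratio

def tweets_per_hour(tweet_count, hours):
    """Normalize a raw count by the length of the collection window."""
    return tweet_count / hours

if __name__ == "__main__":
    texts = [
        "Dream House was creepy, loved it",
        "Build your dream house today!!!",
        "@@@###$$$ 1234 !!!",
    ]
    exclude = ["build your dream house"]
    kept = [t for t in texts if is_related(t, exclude) and not is_gibberish(t)]
    print(kept)
    print(tweets_per_hour(30, 12))
```

Normalizing by time makes movies with different collection windows comparable.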
33
Tools for data analysis
– Sorting
– Filtering
– Counting
– Histograms
– Sentiment analysis
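Sorting, counting and histograms need little more than the Python standard library. The sketch below uses hypothetical record fields (`name`, `followers`) and is not the tooling used in the research.

```python
from collections import Counter

def top_by(records, field, n=3):
    """Sort records by a numeric field, largest first, and keep the top n."""
    return sorted(records, key=lambda r: r[field], reverse=True)[:n]

def histogram(values, bin_width):
    """Count values per fixed-width bin, keyed by the bin's lower bound."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))

if __name__ == "__main__":
    users = [
        {"name": "a", "followers": 5},
        {"name": "b", "followers": 120},
        {"name": "c", "followers": 18},
    ]
    print(top_by(users, "followers", n=2))
    print(histogram([u["followers"] for u in users], bin_width=10))
    print(Counter(["POS", "NEG", "POS"]))  # counting labels
```

Because follower counts tend to be heavily skewed, logarithmic bins may be more informative than fixed-width ones.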
38
Classifying users
39
Classifying users with cluster analysis
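The slides do not say which clustering method or which user features were used. As one possible illustration, here is a minimal k-means sketch over hypothetical (followers, tweets per day) pairs. k-means is a stand-in chosen here, and the first k points are used as initial centroids only to keep the example deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Plain k-means; returns (cluster index per point, centroids)."""
    centroids = [list(p) for p in points[:k]]  # deterministic init (assumed)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids

if __name__ == "__main__":
    # Hypothetical users: (followers, tweets per day)
    users = [(10, 1.0), (5000, 40.0), (12, 2.0), (5200, 35.0)]
    labels, centers = kmeans(users, k=2)
    print(labels, centers)
```

In practice the features would usually be rescaled first (for example by taking logarithms), since raw follower counts would otherwise dominate the distance.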
40
Sentiment Analysis
Classify each message as positive/neutral/negative
Classification methods
– Manual (~10 sec per tweet)
– Automatic
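A quick back-of-the-envelope calculation shows why manual classification does not scale. It combines the ~10 seconds per tweet quoted above with the ~2 million collected tweets, and assumes, purely for illustration, that the whole corpus would be labeled by hand.

```python
def manual_labeling_hours(n_tweets, seconds_per_tweet):
    """Total labeling effort in hours for a given corpus size."""
    return n_tweets * seconds_per_tweet / 3600

if __name__ == "__main__":
    hours = manual_labeling_hours(2_000_000, 10)
    print(f"{hours:,.0f} hours")
```

This comes to roughly 5,556 hours of labeling, which is why manual labels are better reserved for a small training or validation sample.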
41
Sentiment Analysis: some challenging tweet examples
– Just saw #Footloose with my sisters. The movie fab, and I even spotted my karaoke machine! Did you dolls catch it?
– Paranormal Activity 3 seems almost as scary as a level 9 magikarp
– My kids want to see Jack and Jill. Its making it hard to love them.
42
Automatic classifications
43
Naïve Bayes classifier
Machine learning – supervised learning
[Diagram: training messages labeled POS, NEG and NEU]
44
Naïve Bayes classifier
Machine learning – supervised learning
[Diagram: training messages labeled POS, NEG and NEU, with a table of n-gram counts per class (columns: NGRAM, POS, NEG, NEU; counts shown: 2 1 0 and 1 1 0)]
45
Naïve Bayes classifier, NGRAM = 2
[Diagram: messages labeled POS, NEG and NEU]
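The idea behind these slides, counting n-grams per class on labeled tweets and then scoring new tweets, can be sketched as a small multinomial Naïve Bayes classifier. This is an illustrative sketch, not the classifier used in the research: the whitespace tokenizer, the add-one (Laplace) smoothing, the use of both unigrams and bigrams, and the toy training set are all assumptions.

```python
import math
from collections import Counter, defaultdict

def ngrams(text, n):
    """Return the word n-grams of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

class NaiveBayes:
    def __init__(self, max_n=2):
        self.max_n = max_n                         # NGRAM = 2: unigrams + bigrams
        self.class_counts = Counter()              # labeled messages per class
        self.feature_counts = defaultdict(Counter) # n-gram counts per class
        self.vocab = set()

    def features(self, text):
        feats = []
        for n in range(1, self.max_n + 1):
            feats.extend(ngrams(text, n))
        return feats

    def train(self, labeled_texts):
        for text, label in labeled_texts:
            self.class_counts[label] += 1
            for feat in self.features(text):
                self.feature_counts[label][feat] += 1
                self.vocab.add(feat)

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # class prior
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in self.features(text):
                if feat in self.vocab:  # n-grams never seen in training are skipped
                    score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

if __name__ == "__main__":
    training = [  # toy examples, invented for illustration
        ("great movie loved it", "POS"),
        ("loved the cast great fun", "POS"),
        ("boring movie hated it", "NEG"),
        ("hated the plot so boring", "NEG"),
        ("saw the movie today", "NEU"),
        ("going to the cinema today", "NEU"),
    ]
    clf = NaiveBayes(max_n=2)
    clf.train(training)
    print(clf.predict("loved it great"))
```

Scores are summed in log space to avoid numeric underflow on longer messages. Sarcastic tweets like the challenging examples earlier would still be hard for a bag-of-n-grams model of this kind.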
48
References
– Why the revolution will not be tweeted?
– Clay Shirky: How social media can make history [TED]
– Looking At The World Through Twitter Data
– Twitter mood predicts the stock market
– Six Provocations for Big Data
– Susan Blackmore on memes and "temes" [TED]