
Twitter Based Research. Benny Bornfeld. Mentors: Professor Sheizaf Rafaeli, Dr. Daphne Raban.


1 Twitter Based Research. Benny Bornfeld. Mentors: Professor Sheizaf Rafaeli, Dr. Daphne Raban

2 Where Research meets Bigbird: Research, Twitter, Big Data, My Research & Tools

3 Research Big Data Twitter

4 About Twitter
Facts
– Established in 2006
– ~140 million active users
– ~340 million messages per day
Superlatives
– “the stream of the world’s collective consciousness”
– “the first rough draft of history”

5 How does it work?

6 Retweet (diagram: Tweet, ReTweet)

7 Reply

8 Twitter is used for many different purposes

9 Power Law distribution


11 Research Big Data Twitter Research

12 What is Twitter? Social network! Social Network? Mass Media?

13 Replace surveys?

14 Twitter-based predictions: “I Wanted to Predict Elections with Twitter and all I got was this Lousy Paper”

15 Twitter as a social learning platform

16 Influence: What’s the influence of Twitter on society? Technological determinism. Why the revolution will not be tweeted? Clay Shirky, Malcolm Gladwell

17 Influence in Twitter
How do we measure influence?
– Number of followers?
– Centrality?
– Creating action/reaction?
– Viral spreading?
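Two of the measures listed above, being retweeted and centrality, can be sketched on a small retweet graph. This is only an illustration: the edge list, the user names, and the choice of in-degree centrality are assumptions, not the measures used in the talk, and each (retweeter, author) pair is assumed to appear once.

```python
from collections import Counter

# Hypothetical retweet edges: (retweeter, original_author).
retweets = [("bob", "alice"), ("carol", "alice"), ("dave", "bob"), ("erin", "alice")]

def retweet_counts(edges):
    """Count how many times each author was retweeted (in-degree)."""
    return Counter(author for _, author in edges)

def degree_centrality(edges):
    """In-degree divided by (number of users - 1), a simple centrality score."""
    users = {user for edge in edges for user in edge}
    counts = retweet_counts(edges)
    return {user: counts[user] / (len(users) - 1) for user in users}

centrality = degree_centrality(retweets)
most_central = max(centrality, key=centrality.get)
```

Follower count would simply be read from the user record; the point of the sketch is that "influence" gives different rankings depending on which definition is picked.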

18 The Message vs the Carrier approaches

19 Research Twitter Big Data

20 Online social networks research fields: Computers, Networks, Sociology

21 Big Data in SN Research
Pros:
– Exploratory research (vs confirmatory research)
– Avoids the sampling reliability issue (power law)
– Collects what people are actually saying
– Non-intrusive
– Allows analysis of many dimensions
– Catches irregular events

22 Big Data in SN Research
Cons:
– Lots of noise
– It is sometimes hard to map the data to your research question
– Cost of collecting the data
– Lack of tools/knowledge on how to store and analyze the data
– May come at the expense of theory

23 Where Research meets Bigbird: Research, Twitter, Big Data, My Research & Tools

24 Influence: the capacity or power of persons or things to be a compelling force on or produce effects on the actions, behavior, opinions, etc., of others

25 Influence in online social networks: Sentiment Valence, Tweet, ReTweet

26 The research question: Which is more viral? Which is more likely to spread in a social network (Twitter): messages of negative or positive sentiment valence?

27 The Data
Collected ~2 million tweets about new movies
Why movies:
– People have opinions about movies
– People share their opinions about movies
– Can compare to other studies (benchmarks)

28 Collecting the Tweets. Twitter provides an API for collecting tweets. Until mid-2010, full data streams were available for free; currently the rate is very limited (~150/hour). Full data streams (the firehose) are available via a company called GNIP.

29 Tweets collecting architecture (diagram): HTTP Streaming, JSON, Rules, Filter, Collect App, DB, Files, JSON parser, My App
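One possible reading of this architecture is sketched below: JSON messages arrive as lines, a rules filter keeps the relevant ones, and the matches are stored in a database. The rule keywords, the `id` and `text` field names, the table layout, and the use of SQLite are all assumptions made for illustration; an in-memory list stands in for the HTTP stream.

```python
import json
import sqlite3

# Hypothetical filter rules: keep tweets mentioning any tracked movie title.
RULES = ("footloose", "paranormal activity", "jack and jill")

def matches_rules(tweet, rules=RULES):
    """Return True if the tweet text mentions any tracked keyword."""
    text = tweet.get("text", "").lower()
    return any(rule in text for rule in rules)

def collect(lines, conn):
    """Parse each JSON line, apply the rules filter, and store the matches."""
    conn.execute("CREATE TABLE IF NOT EXISTS tweets (id TEXT PRIMARY KEY, text TEXT)")
    matched = 0
    for line in lines:
        tweet = json.loads(line)
        if matches_rules(tweet):
            # INSERT OR IGNORE keeps the first copy if the same id arrives twice.
            conn.execute("INSERT OR IGNORE INTO tweets VALUES (?, ?)",
                         (str(tweet["id"]), tweet["text"]))
            matched += 1
    conn.commit()
    return matched

# Stand-in for the HTTP stream: an in-memory list of JSON lines.
sample_stream = [
    '{"id": 1, "text": "Just saw Footloose with my sisters"}',
    '{"id": 2, "text": "Learning to cook pasta tonight"}',
]
conn = sqlite3.connect(":memory:")
matched_count = collect(sample_stream, conn)
```

Keeping the filter separate from the storage step means the rules can change without touching the collector.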

30 Data Fields
User Data: #followers, #following, number of tweets, Klout, tweet rate, creation date, language, name, description, location
Message Data: sender, content, type (original/RT), post time, device
Computed fields: # of RT, Total Exposure, Sentiment

31 Reading Tasks
– Handle partial messages
– Handle broken messages
– Handle duplicate messages
– Handle special characters
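The four reading tasks can be sketched in one small reader. This assumes the stream delivers newline-delimited JSON in arbitrary chunks and that each message carries `id` and `text` fields; both are assumptions, and `read_tweets` is a hypothetical helper, not the code used in the research.

```python
import json
import unicodedata

def read_tweets(chunks):
    """Yield unique, well-formed tweets from an iterable of raw text chunks."""
    buffer = ""
    seen_ids = set()
    for chunk in chunks:
        buffer += chunk
        # Partial messages: process complete lines only, keep the tail buffered.
        *complete, buffer = buffer.split("\n")
        for line in complete:
            line = line.strip()
            if not line:
                continue
            try:
                tweet = json.loads(line)
            except json.JSONDecodeError:
                continue  # Broken message: skip it.
            if not isinstance(tweet, dict) or "id" not in tweet:
                continue  # Not a usable tweet record.
            if tweet["id"] in seen_ids:
                continue  # Duplicate message: already delivered.
            seen_ids.add(tweet["id"])
            # Special characters: normalize Unicode to one canonical form.
            tweet["text"] = unicodedata.normalize("NFC", tweet.get("text", ""))
            yield tweet
```

A message split across two chunks is reassembled by the buffer, so the caller never sees half a tweet.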

32 Clean the data
– Unrelated messages [build your dream house]
– Spammers
– Gibberish messages
– Normalize the data (e.g. Tweets/Time)
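A minimal sketch of these cleaning steps follows. The heuristics are assumptions chosen for illustration only: the unrelated-phrase list reuses the slide's example, "spammer" is taken to mean a sender repeating the same text more than three times, and "gibberish" means fewer than half the characters are letters or spaces. The `sender` and `text` field names are also assumed.

```python
from collections import Counter

# Phrases that match a tracked title but are not about the movie (assumed list).
UNRELATED_PHRASES = ("build your dream house",)

def is_related(text):
    """Return False when the text contains a known unrelated phrase."""
    lowered = text.lower()
    return not any(phrase in lowered for phrase in UNRELATED_PHRASES)

def is_gibberish(text, min_readable_ratio=0.5):
    """Treat text as gibberish when too few characters are letters or spaces."""
    if not text:
        return True
    readable = sum(ch.isalpha() or ch.isspace() for ch in text)
    return readable / len(text) < min_readable_ratio

def find_spammers(tweets, max_repeats=3):
    """Flag senders who post the same text more than max_repeats times."""
    counts = Counter((t["sender"], t["text"]) for t in tweets)
    return {sender for (sender, _), n in counts.items() if n > max_repeats}

def clean(tweets):
    """Drop tweets from spammers, unrelated tweets, and gibberish."""
    spammers = find_spammers(tweets)
    return [t for t in tweets
            if t["sender"] not in spammers
            and is_related(t["text"])
            and not is_gibberish(t["text"])]

def tweets_per_hour(count, hours):
    """Normalize a raw tweet count by the length of the collection window."""
    return count / hours
```

Normalizing by time makes movies tracked over windows of different length comparable.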

33 Tools for data analysis
– Sorting
– Filtering
– Counting
– Histograms
– Sentiment analysis
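The first four operations need nothing beyond the standard library, as the sketch below suggests. The user records are made-up toy values, and bucketing followers by number of digits is an assumption that suits heavy-tailed (power law) counts; it is not taken from the talk.

```python
from collections import Counter

# Toy user records; names and follower counts are invented for illustration.
users = [
    {"name": "a", "followers": 5},
    {"name": "b", "followers": 42},
    {"name": "c", "followers": 87},
    {"name": "d", "followers": 15000},
]

# Sorting: most-followed users first.
by_followers = sorted(users, key=lambda u: u["followers"], reverse=True)

# Filtering: keep users with at least 50 followers.
popular = [u for u in users if u["followers"] >= 50]

# Counting / histogram: bucket users by the number of digits in their
# follower count (1 digit = 0-9, 2 digits = 10-99, and so on).
histogram = Counter(len(str(u["followers"])) for u in users)
```

Order-of-magnitude buckets keep the long tail visible where fixed-width bins would put almost every user in the first bar.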


38 Classifying users

39 Classifying users with cluster analysis
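The slide does not say which clustering method was used. As one common option, a plain k-means sketch is shown below; the choice of k-means, the two features (imagined here as log-scaled followers and log-scaled tweet count), and the sample points are all assumptions.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of floats, seeded with the first k points."""
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids

# Hypothetical users described by (log10 followers, log10 tweets).
points = [(1.0, 1.2), (4.8, 3.9), (1.1, 0.9), (5.0, 4.1)]
assignments, centroids = kmeans(points, k=2)
```

Seeding with the first k points keeps the sketch deterministic; a real analysis would normally use randomized or smarter initialization and compare several values of k.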

40 Sentiment Analysis
Classify each message as positive/neutral/negative
Classification methods
– Manual (~10 sec per tweet)
– Automatic

41 Sentiment Analysis: some challenging tweet examples
– Just saw #Footloose with my sisters. The movie fab, and I even spotted my karaoke machine! Did you dolls catch it?
– Paranormal Activity 3 seems almost as scary as a level 9 magikarp
– My kids want to see Jack and Jill. Its making it hard to love them.

42 Automatic classifications

43 Naïve Bayes classifier. Machine learning – supervised learning (diagram: training tweets labeled POS, NEG, NEU)

44 Naïve Bayes classifier. Machine learning – supervised learning (diagram: labeled training tweets and a count table with columns NGRAM, POS, NEG, NEU)

45 Naïve Bayes classifier. NGRAM = 2 (diagram: tweets classified as POS, NEG, NEU)
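The idea on these slides, counting n-grams per sentiment label from hand-labeled tweets and then scoring new tweets, can be sketched as a small multinomial Naïve Bayes classifier. This is a minimal sketch under stated assumptions: features are all word n-grams up to length 2, add-one (Laplace) smoothing is used, and the six training sentences are invented for illustration. None of this is confirmed to match the classifier used in the research.

```python
import math
from collections import Counter, defaultdict

def features(text, n=2):
    """Return all word n-grams of length 1..n, lowercased."""
    words = text.lower().split()
    return [" ".join(words[i:i + size])
            for size in range(1, n + 1)
            for i in range(len(words) - size + 1)]

class NaiveBayes:
    def __init__(self, n=2):
        self.n = n
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocabulary = set()

    def train(self, examples):
        """Count n-grams per label from (text, label) pairs."""
        for text, label in examples:
            self.label_counts[label] += 1
            for feat in features(text, self.n):
                self.feature_counts[label][feat] += 1
                self.vocabulary.add(feat)

    def classify(self, text):
        """Return the label with the highest log-probability score."""
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocabulary)
            for feat in features(text, self.n):
                # Add-one smoothing so unseen n-grams do not zero out the score.
                score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training examples, hand-labeled POS / NEG / NEU.
training = [
    ("loved this movie", "POS"), ("great fun movie", "POS"),
    ("hated this movie", "NEG"), ("boring awful movie", "NEG"),
    ("watching a movie tonight", "NEU"), ("movie starts at nine", "NEU"),
]
classifier = NaiveBayes(n=2)
classifier.train(training)
```

Calling `classifier.classify("boring awful")` scores the text against each label. Tweets like the challenging examples on slide 41 show why such word-count models struggle with sarcasm.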


48 References
– Why the revolution will not be tweeted?
– Clay Shirky: How social media can make history [TED]
– Looking At The World Through Twitter Data
– Twitter mood predicts the stock market
– Six Provocations for Big Data
– Susan Blackmore on memes and “temes” [TED]


