CS 765 – Fall 2014 Paulo Alexandre Regis Reddit analysis.

1 CS 765 – Fall 2014 Paulo Alexandre Regis Reddit analysis

2 Outline: About reddit; Why reddit; Previous works; Initial proposal; Q&A

3 What is reddit? Reddit is an open-source platform that supports interaction among communities. It has been used as a news hub, a Q&A platform, and a channel for propagating internet hoaxes and memes.

4 Features: Subreddits; Voting; Karma; Public API

5 Why reddit? Growing communities; Diverse usage; Open-source platform; Unexplored opportunities

6 Why reddit?

7 The API: Easy to parse, returns JSON objects; 30 requests per minute limit; 60 requests per minute if using OAuth. Useful links: Dev community (http://www.reddit.com/r/redditdev), API documentation (http://www.reddit.com/dev/api)
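To stay under the request limits quoted on this slide, a crawler can space its calls evenly. The following is a minimal sketch, not reddit's or PRAW's own mechanism: the `RateLimiter` class and its parameters are hypothetical names introduced here, and the only figures taken from the slide are the 30 and 60 requests per minute.

```python
import time


class RateLimiter:
    """Keeps consecutive calls at least 60 / per_minute seconds apart.

    Hypothetical helper for illustration; `clock` and `sleep` are
    injectable so the behavior can be checked without real waiting.
    """

    def __init__(self, per_minute=30, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous request was released

    def wait(self):
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

A crawler would call `limiter.wait()` immediately before each HTTP request, using `RateLimiter(per_minute=60)` when authenticated with OAuth.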

8 Previous works: PRAW; Information and social analysis; Identifying social roles; Backbone networks

9 PRAW: Python Reddit API Wrapper; Open-source; Respects Reddit’s guidelines; Easy integration; Well documented. Project website: https://praw.readthedocs.org

10 Information and social analysis of reddit: Insights on the comments section. Generated 3 social graphs: – Loose: an edge is established when user A comments on user B – Tight: user A comments on user B and user B comments on user A – Strict: user A comments 4 times on user B and vice versa
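The three graph definitions can be sketched in a few lines. This is an illustration under stated assumptions, not the original study's code: `build_graphs` is a hypothetical function, edges are treated as undirected, self-replies are ignored, and "4 times" is read as "at least 4 times".

```python
from collections import Counter


def build_graphs(replies, strict_threshold=4):
    """Build the loose, tight and strict edge sets.

    replies: iterable of (commenter, target) pairs, one pair per comment.
    Returns three sets of undirected edges, each edge a frozenset of two users.
    """
    # Count how many times each user commented on each other user.
    counts = Counter((a, b) for a, b in replies if a != b)
    loose, tight, strict = set(), set(), set()
    for (a, b), n in counts.items():
        edge = frozenset((a, b))
        loose.add(edge)                      # any comment from A on B
        back = counts.get((b, a), 0)
        if back > 0:
            tight.add(edge)                  # comments in both directions
        if n >= strict_threshold and back >= strict_threshold:
            strict.add(edge)                 # assumed: at least 4 each way
    return loose, tight, strict
```

By construction every strict edge is also a tight edge, and every tight edge is also a loose edge.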

11 Information and social analysis of reddit

12 Limited data collection: – Time constraints – 1% (250) of the top subcommunities crawled Results:

13 Identifying social roles in reddit: Identify a specific role in reddit (the answer-person, who responds to questions but only in a few different discussions, e.g. Q&A); Sampled top users from top submissions and targeted communities; Used PRAW; Crawler script is open-source: https://github.com/cbuntain/redditResponseExtractor

15 (a) Mark Shuttleworth (Ubuntu) IAmA Q&A (b) Regular user from another subreddit

17 Using backbone networks to map user interests in social media: Focus on communities (subreddits); Communities linked by users (bipartite graph); Small-world (shortest path ~= 3.71); Roughly 1/3 of users crawled; Anonymized data available: http://figshare.com/articles/reddit_user_posting_behavior/874101
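The "communities linked by users" idea amounts to projecting the user–subreddit bipartite graph onto subreddits. The sketch below shows only that projection, weighting each subreddit pair by the number of users active in both; `project_subreddits` is a hypothetical name, and the backbone-extraction step that prunes weak edges in the cited work is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def project_subreddits(posts):
    """Project a user-subreddit bipartite graph onto subreddits.

    posts: iterable of (user, subreddit) pairs.
    Returns {frozenset({sub1, sub2}): number of users active in both}.
    """
    # Collect the set of subreddits each user posted in.
    by_user = defaultdict(set)
    for user, sub in posts:
        by_user[user].add(sub)
    # Every pair of subreddits sharing a user gains one unit of weight.
    weights = defaultdict(int)
    for subs in by_user.values():
        for s1, s2 in combinations(sorted(subs), 2):
            weights[frozenset((s1, s2))] += 1
    return dict(weights)
```

Repeated posts by the same user in the same subreddit count once, since only co-membership matters for the projection.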

20 Initial proposal: Analyze the influence of social hubs in reddit’s network. See if high-degree nodes attract more attention from lower-degree nodes. An edge would be formed when both nodes comment on the same post. The degree of the nodes would be their predefined “karma”, and it could be compared with other ranking algorithms (e.g. PageRank).
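One possible way to set up this comparison is sketched below, assuming comment data arrives as (post, user) pairs. Both function names are hypothetical, and the PageRank shown is a plain power iteration with the commonly used damping factor of 0.85, not a method specified in the proposal.

```python
from itertools import combinations


def co_comment_graph(comments):
    """comments: iterable of (post_id, user). Returns {user: set of neighbors}.

    Two users are linked when both commented on the same post.
    """
    by_post = {}
    for post_id, user in comments:
        by_post.setdefault(post_id, set()).add(user)
    graph = {}
    for users in by_post.values():
        for u in users:
            graph.setdefault(u, set())  # keep lone commenters as nodes
        for u, v in combinations(users, 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected neighbor-set graph."""
    nodes = list(graph)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by nodes without neighbors is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not graph[u])
        new = {u: (1 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            for v in graph[u]:
                new[v] += damping * rank[u] / len(graph[u])
        rank = new
    return rank
```

Sorting users by the returned PageRank scores and by their karma would give two orderings that can then be compared, for example with a rank correlation.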

21 Questions?

