Kristina Lerman Aram Galstyan USC Information Sciences Institute Analysis of Social Voting Patterns on Digg.

Slides:



Advertisements
Similar presentations
Jacob Goldenberg, Barak Libai, and Eitan Muller
Advertisements

1 ©2009 MeeMix MeeMix – A personalized Experience.
E-Business and e-Commerce. e-commerce and e-business e-commerce refers to aspects of online business involving exchanges among customers, business partners.
Fawaz Ghali Web 2.0 for the Adaptive Web.
CONSUMER RESEARCH Chapter 2.
Self-introduction Name:  鲍鹏 (Peng Bao) Research Interests:  Popularity Prediction, Information Diffusion, Social Network , etc… Grade:  In the third.
Overarching Goal: Understand that computer models require the merging of mathematics and science. 1.Understand how computational reasoning can be infused.
Enabling the Social Web Krishna P. Gummadi Networked Systems Group Max Planck Institute for Software Systems.
Flickr Information propagation in the Flickr social network Meeyoung Cha Max Planck Institute for Software Systems With Alan Mislove.
The Social Web or how flickr changed my life Kristina Lerman USC Information Sciences Institute
A Comprehensive Overview of Social Media Marketing Essentials.
Cascading Behavior in Large Blog Graphs Patterns and a Model Leskovec et al. (SDM 2007)
CS246 Search Engine Bias. Junghoo "John" Cho (UCLA Computer Science)2 Motivation “If you are not indexed by Google, you do not exist on the Web” --- news.com.
Recommender systems Ram Akella February 23, 2011 Lecture 6b, i290 & 280I University of California at Berkeley Silicon Valley Center/SC.
CS 599: Social Media Analysis University of Southern California1 Influence in Social Media Kristina Lerman University of Southern California.
The Social Web: A laboratory for studying s ocial networks, tagging and beyond Kristina Lerman USC Information Sciences Institute.
Personalized Ontologies for Web Search and Caching Susan Gauch Information and Telecommunications Technology Center Electrical Engineering and Computer.
A Measurement-driven Analysis of Information Propagation in the Flickr Social Network WWW09 报告人: 徐波.
1 SOCIAL BOOKMARKING 101. HIBA KHALID BILAL SAEED KHAN FARID ALIANI ASKARI HASAN SOCIAL BOOKMARKING.
WELCOME TO A NATIONAL DISABILITY INSTITUTE 30 SECOND TRAINING.
Designing Ranking Systems for Hotels on Travel Search Engines by Mining User-Generated and Crowd sourced Content Author - Anindya Ghose, Panagiotis G.
You Tube YouTube Profile: Website: Alexa Ranking: 3 (as of 5/1/08) Description: A video sharing website Similar Websites:
SOCIAL NETWORKS AND THEIR IMPACTS ON BRANDS Edwin Dionel Molina Vásquez.
Web 2.0: Concepts and Applications 11 The Web Becomes 2.0.
Web 2.0: Concepts and Applications 11 The Web Becomes 2.0.
1. The Future of News  “The Web audience is growing at a fast clip, while print circulation is not. Online revenues are growing faster, too, although.
Version 1.0 Requirements.  PROstructor ◦ PROstructor is a community and service to finding, scheduling and paying professional for private, group lessons.
Research paper: Web Mining Research: A survey SIGKDD Explorations, June Volume 2, Issue 1 Author: R. Kosala and H. Blockeel.
Copyright © 2014 Pearson Education, Inc. 1 It's what you learn after you know it all that counts. John Wooden Key Terms and Review (Chapter 5)
Authors: Xu Cheng, Haitao Li, Jiangchuan Liu School of Computing Science, Simon Fraser University, British Columbia, Canada. Speaker : 童耀民 MA1G0222.
Recommender systems Drew Culbert IST /12/02.
Using Transactional Information to Predict Link Strength in Online Social Networks Indika Kahanda and Jennifer Neville Purdue University.
Introduction to Data Mining Group Members: Karim C. El-Khazen Pascal Suria Lin Gui Philsou Lee Xiaoting Niu.
UMBC iConnect Audumbar Chormale, Dr. A. Joshi, Dr. T. Finin, Dr. Z. Segall.
How do I decide whom to follow on Twitter ? IARank: Ranking Users on Twitter in Near Real-time, Based on their Information Amplification Potential.
Understanding Cross-site Linking in Online Social Networks Yang Chen 1, Chenfan Zhuang 2, Qiang Cao 1, Pan Hui 3 1 Duke University 2 Tsinghua University.
Tie Strength, Embeddedness & Social Influence: Evidence from a Large Scale Networked Experiment Sinan Aral, Dylan Walker Presented by: Mengqi Qiu(Mendy)
Information Spread and Information Maximization in Social Networks Xie Yiran 5.28.
Collusion-Resistance Misbehaving User Detection Schemes Speaker: Jing-Kai Lou 2015/10/131.
Designing Ranking Systems for Consumer Reviews: The Economic Impact of Customer Sentiment in Electronic Markets Anindya Ghose Panagiotis Ipeirotis Stern.
Evaluating What’s Been Learned. Cross-Validation Foundation is a simple idea – “ holdout ” – holds out a certain amount for testing and uses rest for.
Wang-Chien Lee i Pervasive Data Access ( i PDA) Group Pennsylvania State University Mining Social Network Big Data Intelligent.
Using a Model of Social Dynamics to Predict Popularity of News Kristina Lerman Tad Hogg USC Information Sciences Institute HP Labs WWW 2010.
The Matrix: Using Intermediate Features to Classify and Predict Friends in a Social Network Michael Matczynski Status Report April 14, 2006.
--He Xiangnan PhD student Importance Estimation of User-generated Data.
The Tube Over Time: Characterizing Popularity Growth of YouTube Videos ` Abstract In this work, we characterize the growth patterns of video popularity.
Dynamics of Collaborative Document Rating Systems (SNA-KDD’07) Advisor : Dr. Koh Jia-Ling Speaker : Chou-Bin Fan Date :
Evaluating Results of Learning Blaž Zupan
A measurement-driven Analysis of Information Propagation in the Flickr Social Network Meeyoung Cha Alan Mislove Krisnna P. Gummadi.
With each device or application that expands the bandwidth of available information, the computer ’ s understanding of us remains unchanged.
Jane Shuttleworth Consultancy and Training THE SCIENCE AND ART OF COMMISSIONING COMMISSIONING AS A DESIGN PROCESS JUNE 2006.
Social Software. Enables people to connect or collaborate through computer- mediated communication and to form online communities People form online communities.
User Modeling and Recommender Systems: Introduction to recommender systems Adolfo Ruiz Calleja 06/09/2014.
Web 2.0: Concepts and Applications 11 The Web Becomes 2.0.
Mining information from social media
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Social Media and Networking: Academic Kristina Lerman USC Information Sciences Institute
Social Information Processing March 26-28, 2008 AAAI Spring Symposium Stanford University
Portrait of an Online Shopper: Understanding and Predicting Consumer Behavior Farshad Kooti* Kristina Lerman* Luca Aiello † Mihajlo Grbovic † Nemanja Djuric.
A Connectivity-Based Popularity Prediction Approach for Social Networks Huangmao Quan, Ana Milicic, Slobodan Vucetic, and Jie Wu Department of Computer.
Presented By: Madiha Saleem Sunniya Rizvi.  Collaborative filtering is a technique used by recommender systems to combine different users' opinions and.
Scenario process Next steps. New scenarios (Share Socio-economic Pathways)
Chapter 1 Introduction to Social Commerce. Learning Objectives 1.Define social computing and the Social Web. 2.Describe the Social Web revolution. 3.Describe.
Using a model of Social Dynamics to Predict Popularity of News Kristina Lerman, Tad Hogg USC Information Sciences Institute, Institute for Molecular Manufacturing.
Stochastic Models of User-Contributory Web Sites
Effects of User Similarity in Social Media Ashton Anderson Jure Leskovec Daniel Huttenlocher Jon Kleinberg Stanford University Cornell University Avia.
What Stops Social Epidemics?
Center for Complexity in Business, R. Smith School of Business
Overview The World Wide Web has changed the way that people
Yingze Wang and Shi-Kuo Chang University of Pittsburgh
Presentation transcript:

Kristina Lerman Aram Galstyan USC Information Sciences Institute Analysis of Social Voting Patterns on Digg

Content, content everywhere and not a drop to read Explosion of user-generated content 2G/day of “authored” content 10-15G/day of user generated content How do producers promote their content to potential consumers? How do users/consumers find relevant content?

Social networks for promoting content Viral or word-of-mouth marketing Exploit social interactions between users to promote content But, does it really work? Previous empirical studies have conflicting results Study showed popularity of albums did affect user’s choice of what music to listen to [Salganik et al., 2006] Study showed recommendation might not lead to new purchases on Amazon [Leskovec, Adamic & Huberman, 2006] Showed sensitivity to type and price of products

In this work Do those results apply to free content? Empirical study on social news aggregator Digg How do social networks affect spread of free content?

Social news aggregator Digg Users submit and moderate news stories Digg automatically promotes stories for the front page Digg allows social networking Users can add other users as Friends This results in a directed social network Friends of user A are everyone A is watching Fans of A are all users who are watching A

Lifecycle of a story 1.User submits a story to the Upcoming Stories queue 2.Other users vote on (digg) the story 3.When the story accumulates enough votes (diggs>50), it is promoted to the Front page 4.The Friends Interface lets users can see 1.Stories friends submitted 2.Stories friends voted on, …

How the Friends Interface works ‘see stories my friends submitted’ ‘see stories my friends dugg’

What are the patterns of “vote diffusion” on the Digg network? Can these patterns in early dynamics predict story’s eventual popularity? Research questions

Digg datasets Stories Collected by scraping Digg … now available through the API ~200 stories promoted to the Front page on 6/30/2006 ~900 newly submitted stories (not yet promoted) on 6/30/2006 For each story Submitter’s id Time-ordered votes the story received Ids of the users who voted on the story Social networks Friends: outgoing links A  B := B is a friend of A Fans: incoming links A  B := A is a fan of B Enables to reconstruct the diffusion process

Dynamics of votes story “interestingness” Shape of the curves (votes vs time) is qualitatively similar Large spread in the final number of votes Implicitly defines the “interestingness”, or popularity, of a story

Distribution of votes Wu & Huberman, 2007 ~30,000 front page stories submitted in 2006 ~200 front page stories submitted in June 29-30, 2006 Interesting (popular) not interesting

Dynamics of voting on Digg Two main mechanisms for voting Voting is influenced by intrinsic attributes of a story E.g., some stories are more interesting and have more popular appeal than others Voting is also impacted by social interactions (e.g, through the Friends Interface) Diffusive spread on a network We can not measure “interestingness”, but we can analyze the patterns of “social voting” Can we use those patterns to predict the eventual popularity of a story?

Patterns of network spread Definition: In-network votes are votes coming from fans of the previous voters (including the submitter)

Patterns of network spread Definition: In-network votes are votes coming from fans of the previous voters (including the submitter)

Main Findings Large number of early in-network votes is negatively correlated with the eventual popularityLarge number of early in-network votes is negatively correlated with the eventual popularity of the story of the story Stories receiving more in-network votes will turn out to be less popular More interesting story receive fewer in-network votes

Stories submitted by the same user <500 final votes >500 final votes <500 final votes >500 final votes

Popularity vs in-network votes The stories that become popular initially receive fewer in- network votes Popularity vs the number of in-network votes out of first 6 in-network votes

The trend continues

Classification: Training Predict how popular the story will become based on how many in-network votes it receives within the first 10 votes Decision tree classifier Features v10: Number of in-network votes within the first 10 votes fans1: Number of fans of submitter Story popularity –Yes if > 500 votes –No if < 500 votes v10 fans1 yes(130/5) no(18/0) yes(30/8)no(29/13) <=4 >4 >8 <=8 <=85>85

Classification: Testing Use the classifier to predict how popular stories will be based on the first 10 votes it received Dataset 48 new stories submitted by top users Of these, 14 were promoted by Digg Predictions Correctly classified 36 stories (TP=4, TN=32) 12 errors (FP=11, FN=1) Compared to Digg’s prediction Digg predicted that 14 are interesting (by promoting them) Digg prediction: 5 of 14 received more than 500 votes –Digg prediction: Pr=0.36 Our prediction: 4 of 7 received more than 520 votes (Pr=0.57) Prediction was made after 10 votes, as opposed to Digg’s 40+ votes yes(130/5) no(18/0) v10 fans1 yes(30/8)no(29/13) <=4 >4 >8 <=8 <=85>85

Summary Social Web sites like Digg provide data for empirical study of collective user behavior How do social networks impact the spread of content, ideas, products? Findings for Digg Patterns of voting spread on networks indicative of content quality Those patterns enable early prediction of eventual popularity Future work More systematic and larger scale empirical studies Agent-based computational and mathematical models of social voting on Diggs