Using a Model of Social Dynamics to Predict Popularity of News
Kristina Lerman (USC Information Sciences Institute), Tad Hogg (HP Labs)
WWW 2010

Outline
Introduction
Social News Portal Digg
Social Dynamics of Digg
Model-Based Prediction
Conclusions

Introduction
Popularity of content in social media is unequally distributed.
  – 16,000 new stories are submitted to Digg every day, but only a handful become popular
Importance of popularity prediction
  – Provides users with tools to identify interesting items
  – Enables social media companies to maximize revenue
Findings of past research
  – Content quality correlates only weakly with eventual popularity
  – Social influence is responsible for the unpredictability of popularity

User Interface of Digg
Popular list
  – The front page
  – Promoted news
Upcoming list
Friends' Activity

Inequality of Popularity
Figure: Dynamics of social voting. (a) Evolution of the number of votes received by two front page stories in June. (b) Distribution of popularity of 201 front page stories submitted in June 2006.

Story Data Sets
May
  – Submitted to Digg between May 25-27, 2006
  – 2152 stories, 1212 distinct users
  – 510 stories by 239 users were promoted to the front page
June
  – Promoted (popular) subset
      201 stories promoted between June 27-30, 2006
      User name and time stamp of the first 216 votes for each story
  – Upcoming subset
      Submitted between June 30, 2006 and July 1, 2006
      159 stories received at least 10 votes

Snapshot of Social Network in Digg
June
  – 1020 top-ranked users with their friends and fans
  – Network augmented in February 2008
  – Users who joined Digg after June 30, 2006 are eliminated
May
  – Retain only the top 1020 users and their fans
  – Assume other users had zero fans

Stochastic Model of Social Dynamics in Digg
Hogg and Lerman (ICWSM '09)
  – The stochastic processes framework relates users' individual choices to their aggregate behavior.
  – User behavior in Digg is represented as transitions between a small number of states.
Explanatory power
  – Why do some stories accumulate many more votes than others?
Predictive power

Dynamical Model of Social Voting
Rate equation for the number of users who vote for a story: (vote_rate = interest * visibility)
Initial conditions: s(0) = S (the number of fans of the story's submitter), N_vote(0) = 1
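
The relation vote_rate = interest * visibility can be illustrated with a small numerical sketch. The code below is not the rate equation from the slide, whose full form is not reproduced in this transcript. It assumes a hypothetical visibility function (`toy_visibility`, with made-up exposure rates and a made-up promotion threshold) and integrates the vote count forward with simple Euler steps, starting from N_vote(0) = 1.

```python
def simulate_votes(interest, visibility, hours, dt=0.1, initial_votes=1.0):
    """Euler-integrate dN/dt = interest * visibility(t, N).

    `visibility` is any callable returning how many users see the story
    per hour; its form here is an assumption, not the model on the slide.
    """
    votes = initial_votes          # N_vote(0) = 1, as stated on the slide
    t = 0.0
    trajectory = [(t, votes)]
    steps = int(round(hours / dt))
    for _ in range(steps):
        votes += interest * visibility(t, votes) * dt
        t += dt
        trajectory.append((t, votes))
    return trajectory


def toy_visibility(t, votes, promotion_threshold=40.0):
    """Hypothetical visibility: few viewers on the upcoming list,
    many more once the story has enough votes to reach the front page.
    All three numbers are invented for illustration."""
    return 200.0 if votes >= promotion_threshold else 5.0


if __name__ == "__main__":
    for r in (0.1, 0.5):
        final_time, final_votes = simulate_votes(r, toy_visibility, hours=24)[-1]
        print(f"interest={r}: about {final_votes:.0f} votes after {final_time:.0f} hours")
```

Even with this crude visibility function, the trajectory grows slowly before the threshold and quickly after it, which is the qualitative shape the deck describes for stories moving from the upcoming list to the front page.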

Model Parameters
Some parameters are measured directly from the May data set.
Story-specific parameters
  – r: estimated as the value that minimizes the root-mean-square (RMS) difference between the observed votes and the model predictions
  – S: the number of fans of the story's submitter
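
Estimating r by minimizing the RMS difference can be sketched as a one-dimensional search. The example below is a minimal illustration under stated assumptions: `predicted_votes` is a hypothetical stand-in with constant visibility rather than the model from the deck, the observed vote counts are made up, and a plain grid search is used because the slide does not say which optimizer was applied.

```python
import math


def predicted_votes(interest, times, views_per_hour=20.0, initial_votes=1.0):
    """Hypothetical stand-in model with constant visibility:
    N(t) = N(0) + interest * views_per_hour * t."""
    return [initial_votes + interest * views_per_hour * t for t in times]


def rms_difference(observed, predicted):
    """Root-mean-square difference between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def fit_interest(times, observed, candidates):
    """Return the candidate r whose predictions are closest to the data."""
    return min(
        candidates,
        key=lambda r: rms_difference(observed, predicted_votes(r, times)),
    )


if __name__ == "__main__":
    times = [0, 1, 2, 3, 4]              # hours since submission (made up)
    observed = [1, 7, 13, 19, 25]        # vote counts (made up)
    candidates = [i / 100 for i in range(1, 101)]
    print("estimated r:", fit_interest(times, observed, candidates))
```

A grid search is easy to verify by hand. For a real fit, a bounded scalar minimizer would likely be a better choice than a fixed grid.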

Observations on the Model
The correlation between S and r =
General observations reproduced by the model
  – Slow initial growth in votes while the story is on the upcoming list
  – More interesting stories are promoted faster and receive more votes
  – A story submitted by a poorly connected user tends to need high interest to be promoted (Lerman, 2007)

Applications of the Model
Estimating inherent story quality from the evolution of its observed popularity
Predicting a story's eventual popularity based on the early reaction of users to the story

Story Quality Estimation
Stories span a wide range of interestingness to users
Interestingness values are well fit by a lognormal distribution
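
A lognormal fit of this kind can be sketched by taking logarithms of the estimated values and computing their mean and standard deviation, which are the maximum-likelihood estimates of the lognormal parameters. The values in `interest_values` below are made up for illustration. The slide does not state which fitting procedure was used, so this is one standard option rather than the authors' method.

```python
import math


def fit_lognormal(values):
    """Maximum-likelihood lognormal fit: mean and standard deviation of log(values)."""
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return mu, sigma


def lognormal_pdf(x, mu, sigma):
    """Density of a lognormal distribution with parameters mu and sigma at x > 0."""
    exponent = -((math.log(x) - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (x * sigma * math.sqrt(2 * math.pi))


if __name__ == "__main__":
    interest_values = [0.05, 0.1, 0.1, 0.2, 0.4, 0.8]   # made-up r estimates
    mu, sigma = fit_lognormal(interest_values)
    print(f"mu={mu:.3f}, sigma={sigma:.3f}")
    print(f"density at r=0.2: {lognormal_pdf(0.2, mu, sigma):.3f}")
```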

Examples

Predicting Final Popularity of Stories
Correlations are 0.87 and 0.49, respectively.
Strong prediction in popularity rating
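
A correlation between predicted and observed popularity can be computed as sketched below. The slide does not say which coefficient was reported, so the use of the Pearson coefficient here is an assumption, and the vote counts are made up.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    predicted = [120, 340, 560, 800, 1500]   # made-up predicted final votes
    observed = [150, 300, 610, 700, 1650]    # made-up observed final votes
    print(f"correlation: {pearson_correlation(predicted, observed):.2f}")
```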

Comparison with Social Influence only Prediction
Decision tree classifier based on social influence
  – Two features: (1) number of fan votes received within the first 10 votes; (2) number of the submitter's fans
Model-based prediction outperforms the decision tree classifier
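
The two features fed to the decision tree can be sketched as a small extraction routine over a vote log. The function name `extract_features` and the `fans_of` mapping are hypothetical. The code also assumes one reading of "fan vote": a vote cast by a fan of the submitter or of any earlier voter. It further assumes that the voter list excludes the submitter's own vote. The slide does not spell out either point.

```python
def extract_features(submitter, voters, fans_of, first_n=10):
    """Return (fan_votes_in_first_n, submitter_fan_count).

    voters  : user names in chronological order, submitter excluded (assumption)
    fans_of : dict mapping a user name to the set of that user's fans
    """
    already_voted = {submitter}
    fan_votes = 0
    for voter in voters[:first_n]:
        # Assumed definition: the voter is a fan of someone who voted earlier.
        if any(voter in fans_of.get(user, set()) for user in already_voted):
            fan_votes += 1
        already_voted.add(voter)
    return fan_votes, len(fans_of.get(submitter, set()))


if __name__ == "__main__":
    fans_of = {"alice": {"bob", "carol"}, "bob": {"dave"}}   # made-up network
    voters = ["bob", "dave", "erin"]                          # made-up vote log
    print(extract_features("alice", voters, fans_of))
```

The resulting pair could then be passed to any decision tree learner; no thresholds are suggested here because the deck does not report them.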

Conclusions
Research has shown that popularity is only weakly related to inherent content quality, and that social influence leads to an uneven distribution of popularity and makes it difficult to predict.
We claim that the model of social dynamics, developed in earlier work, can quantitatively characterize the evolution of popularity of items in Digg.
How interesting a story is and how well connected its submitter is fully determine the evolution of the number of votes it receives.