Modeling Topic Diffusion in Scientific Collaboration Networks

Which Topic Will You Follow?
Deqing Yang, Yanghua Xiao, Bo Xu, Hanghang Tong, Wei Wang and Sheng Huang
Graph Data Management Lab, School of Computer Science, Fudan University, Shanghai, China

Introduction

Motivations
Who are the most appropriate candidates to receive a call for papers or a call for participation? What session topics should we propose for next year's conference? To address these questions, we study authors' topic-following behavior in a Scientific Collaboration Network (SCN), i.e., an author follows others in publishing papers on a given topic.

Scientific Collaboration Network
The SCN is represented as a graph in which vertices represent authors and edges represent coauthor relationships extracted from the DBLP dataset. It is a temporal graph G_t, in which the sets of vertices and edges grow as time t elapses.

Basic Idea
An author's topic-following behavior is a process of topic diffusion in a social network, driven by two typical ingredients: social influence and homophily. We look for variables that precisely capture social influence and homophily in our scenario and use them to predict an author's future topic-following behavior.

Challenges
How to distinguish social influence from homophily?
Topic definition and identification
Sample sparseness

Contributions
Uncover the effects of social influence and homophily on topic diffusion.
Propose a Multiple Logistic Regression (MLR) model to predict an author's topic-following behavior.
Show through extensive experiments that the MLR model outperforms the baseline.

Empirical Study

Driving Forces of Topic-Following
U1: users affected by both social influence and homophily
U2: users affected only by social influence
U3: users affected only by homophily
U4: users affected by neither
Results: The two forces act together on topic-following. Their impacts are time-sensitive and decay exponentially.

Social Influence
An author is more likely to adopt a topic when more of his neighbors have already followed it. Here x is the number (or proportion) of affected neighbors, and p(x) is the probability that an author follows the topic.
An author is also more likely to follow topics adopted by neighbors (direct propagation) who have coauthored more papers with him.

Model
Predicting whether an author will follow a given topic is a two-class classification problem. A Multiple Logistic Regression (MLR) model suits this scenario. The probability of topic-following is formalized as:

π(x) = exp(α + Σ_i β_i x_i) / (1 + exp(α + Σ_i β_i x_i))

where x_i are the explanatory variables, and α and β_i are parameters estimated by training.

Baseline model
[formula not preserved in the transcript]
where a is the number of neighbors who have followed the topic.

Explanatory Variables

Social Influence
An author u's tendency to follow topic s in year t is composed of the tendencies of all his neighbors v toward this topic, taking their coauthor strengths into account.

Homophily
We use topic similarity to capture homophily among users in the context of topic-following. A 25-dimensional vector u represents an author's topic history; each dimension is the number of his papers on a given topic. The topic similarity between users u and v is then defined as:
[formula not preserved in the transcript]
With respect to the users who have followed topic s before t, we measure u's homophily as:
[formula not preserved in the transcript]

The whole MLR model is then:

π(x) = exp(α + β1 F_SI + β2 F_TS) / (1 + exp(α + β1 F_SI + β2 F_TS))

where F_SI is the social influence variable, F_TS is the topic similarity (homophily) variable, and the response Y = 1 if u follows s or its related topics.

Parameter Estimation
Parameters are estimated by maximum likelihood on the training set. β2 has a larger Wald value than β1, indicating that F_TS (homophily) has more impact on topic-following behavior than F_SI (social influence).

Model Evaluation
Metrics: recall (sensitivity), specificity, precision, accuracy, AUC, and F_β. We set β = 1.1 to favor recall slightly.
For the topic XML, the area under the ROC curve (AUC) is [value missing] vs. [value missing].
For the other 4 representative topics, MLR outperforms the baseline in both accuracy and F_β.

yangdeqing, ECML/PKDD 2012, Bristol, UK
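The pieces of the MLR model described above can be illustrated with a minimal Python sketch. Only the logistic form and the F_β metric with β = 1.1 follow directly from the poster. The exact formulas for the social influence and homophily variables are not preserved in this transcript, so the versions below (share of coauthor strength held by earlier followers, cosine similarity of topic vectors, and mean similarity to earlier followers) are assumptions chosen for illustration, as are all function names and the example parameter values.

```python
import math


def follow_probability(features, alpha, betas):
    """Logistic probability pi(x) = 1 / (1 + exp(-(alpha + sum_i beta_i * x_i)))."""
    z = alpha + sum(b * x for b, x in zip(betas, features))
    return 1.0 / (1.0 + math.exp(-z))


def social_influence(neighbor_strength, followers):
    """ASSUMED form of F_SI: fraction of u's total coauthor strength
    that belongs to neighbors who already follow the topic.

    neighbor_strength maps neighbor id -> number of coauthored papers.
    followers is the set of authors who followed the topic before year t.
    """
    total = sum(neighbor_strength.values())
    if total == 0:
        return 0.0
    followed = sum(w for v, w in neighbor_strength.items() if v in followers)
    return followed / total


def topic_similarity(u_vec, v_vec):
    """ASSUMED similarity: cosine of two topic-history vectors
    (the poster uses 25 dimensions, one paper count per topic)."""
    dot = sum(a * b for a, b in zip(u_vec, v_vec))
    norm_u = math.sqrt(sum(a * a for a in u_vec))
    norm_v = math.sqrt(sum(b * b for b in v_vec))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def homophily(u_vec, follower_vecs):
    """ASSUMED form of F_TS: mean similarity between u and earlier followers."""
    if not follower_vecs:
        return 0.0
    return sum(topic_similarity(u_vec, v) for v in follower_vecs) / len(follower_vecs)


def f_beta(precision, recall, beta=1.1):
    """F_beta score; beta > 1 weights recall more than precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


if __name__ == "__main__":
    # Hypothetical author with two coauthors, one of whom already follows the topic.
    f_si = social_influence({"v1": 3, "v2": 1}, followers={"v1"})
    f_ts = homophily([2, 0, 1], [[1, 0, 1], [0, 3, 0]])
    # alpha and betas are made-up values, not the fitted parameters from the poster.
    print(follow_probability([f_si, f_ts], alpha=-2.0, betas=[1.0, 3.0]))
```

In practice the parameters α, β1 and β2 would be fitted by maximum likelihood on labeled (author, topic, year) samples rather than set by hand as in this example.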

