Published by Opal Hunt. Modified over 9 years ago.
1
Graph Data Management Lab (GDM@FUDAN), School of Computer Science
www.gdm.fudan.edu.cn Email: yangdeqing@fudan.edu.cn
ECML-PKDD 2012, Bristol, UK
Which Topic will You Follow?
Deqing Yang, Yanghua Xiao, Bo Xu, Hanghang Tong, Wei Wang, Sheng Huang
School of Computer Science, Fudan University, China
2
Outline
  Introduction
  Preliminaries
  Empirical Study
  Modeling
3
Who are the most appropriate candidates to receive a call for papers or a call for participation?
How can you deliver call-for-papers emails to the authors who are interested in the proposed topic instead of sending them out blindly?
4
What session topics should we propose for next year's conference?
Furthermore, how many sessions does a given topic need?
5
Can we predict the topic of an author's next paper?
6
Basic Idea
  Use features of authors in the Scientific Collaboration Network (SCN) to model an author's topic-following behavior.
  Two candidate features
    Social influence: an individual tends to adopt the behaviors of his or her neighbors or friends.
    Homophily: the tendency of individuals to choose friends with similar characteristics.
7
Contributions
  Verify that social influence and homophily are the two factors determining topic diffusion in the SCN.
  Propose a Multiple Logistic Regression (MLR) model to predict an author's topic-following behavior.
  Conduct extensive experiments to show that our model has good prediction performance.
9
Scientific Collaboration Network (SCN)
  A temporal, undirected and edge-weighted graph
    Vertex: author
    Edge: coauthoring relationship
    Edge weight: number of papers coauthored by the two endpoints of the edge
  Settings
    DBLP dataset
    25 representative topics
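The graph definition above can be sketched in a few lines of Python. This is only an illustration: the input format (a list of `(year, authors)` pairs), the function names `build_scn` and `edge_weight`, and the toy author names are assumptions, not the authors' code or the DBLP schema.

```python
from collections import defaultdict
from itertools import combinations

def build_scn(papers):
    """Build per-year coauthor edge weights.

    `papers` is an iterable of (year, [author, ...]) pairs (hypothetical format).
    Returns {year: {frozenset({a, b}): number_of_joint_papers_that_year}}.
    """
    scn = defaultdict(lambda: defaultdict(int))
    for year, authors in papers:
        # Each unordered author pair on a paper adds 1 to that edge's weight.
        for a, b in combinations(sorted(set(authors)), 2):
            scn[year][frozenset((a, b))] += 1
    return scn

def edge_weight(scn, a, b, up_to_year):
    """Cumulative weight of the undirected edge {a, b} through `up_to_year`."""
    key = frozenset((a, b))
    return sum(w.get(key, 0) for year, w in scn.items() if year <= up_to_year)

# Toy data, invented for illustration only.
papers = [
    (2004, ["ann", "bob"]),
    (2005, ["ann", "bob", "cy"]),
    (2006, ["bob", "cy"]),
]
scn = build_scn(papers)
```

Keeping the weights per year preserves the temporal aspect: the same edge can be queried as it stood in any given year, and `frozenset` keys make the edge undirected.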
10
Homophily
  We use topic similarity to characterize homophily.
  A 25-dim vector u represents an author's topic history.
  Topic similarity between two authors u and v: [equation missing from transcript]
  Topic similarity between an author u and a group of authors U: [equation missing from transcript]
  The group U is also represented by a 25-dim vector whose i-th dimension is the number of papers on the i-th topic published by all users in U.
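The similarity formulas are not preserved in this transcript. As a rough sketch, the code below assumes cosine similarity between topic-count vectors, which is a common choice but is not confirmed by the slide. The function names are hypothetical, and the examples use 3 topics instead of 25 for brevity.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two topic-count vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an author with no papers is treated as similar to nobody
    return dot / (norm_u * norm_v)

def group_vector(vectors):
    """Per-topic paper counts summed over all authors in a group."""
    return [sum(column) for column in zip(*vectors)]

def group_similarity(u, vectors):
    """Similarity between one author and a group, via the group's summed vector."""
    return cosine_similarity(u, group_vector(vectors))
```

For example, `group_vector([[1, 0, 2], [0, 3, 1]])` gives `[1, 3, 3]`, the i-th entry being the group's paper count on topic i, as described on the slide.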
12
Driving Forces of Topic-Following
  U = U_0 ∪ V_0, U_0 ∩ V_0 = ∅
  U_0: the users who have published papers on a given topic before a certain year
  V_0: divided into U_1 to U_4
  N(u) is the neighbor set of u
  U_1: affected by social influence and homophily
  U_2: affected merely by social influence
  U_3: affected merely by homophily
  U_4: not affected by these two forces
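The exact membership tests for U_1 to U_4 are not recoverable from this transcript. The sketch below shows one plausible reading, under two loudly stated assumptions: a user counts as socially influenced when N(u) contains someone from U_0, and as homophilous when a caller-supplied predicate says so (for instance, topic similarity to U_0 above some threshold). All names are hypothetical.

```python
def partition_v0(v0, u0, neighbors, is_similar_to_u0):
    """Split V_0 into U_1..U_4 under the assumptions stated above.

    v0, u0: sets of user ids (disjoint).
    neighbors: dict mapping a user to the set N(u) of coauthors.
    is_similar_to_u0: hypothetical predicate, user -> bool.
    """
    groups = {"U1": set(), "U2": set(), "U3": set(), "U4": set()}
    for u in v0:
        # Assumed test for social influence: at least one neighbor is in U_0.
        influenced = bool(neighbors.get(u, set()) & u0)
        # Assumed test for homophily: delegated to the caller's predicate.
        homophilous = is_similar_to_u0(u)
        if influenced and homophilous:
            groups["U1"].add(u)
        elif influenced:
            groups["U2"].add(u)
        elif homophilous:
            groups["U3"].add(u)
        else:
            groups["U4"].add(u)
    return groups
```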
13
Driving Forces of Topic-Following (cont.)
  The two forces are mixed together in their impact on topic-following.
  The impacts are time-sensitive and decrease exponentially.
14
Social Influence
  An author is more likely to adopt a topic when more of his or her neighbors have already followed it.
  An author is more likely to follow topics adopted by neighbors (direct propagation) who have coauthored more papers with him or her.
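These two observations, together with the exponential time decay noted on the previous slide, suggest a score of the following shape. This is a hypothetical feature, not the authors' F_SI: it sums, over neighbors who adopted the topic earlier, the coauthorship weight multiplied by an exponential decay in the years since adoption. The `decay` rate and all names are assumptions.

```python
import math

def social_influence(u, adoption_year, coauthor_weight, year, decay=1.0):
    """Hypothetical social-influence score for author `u` in `year`.

    adoption_year: dict neighbor -> year that neighbor first followed the topic.
    coauthor_weight: dict author -> {neighbor: number of coauthored papers}.
    """
    score = 0.0
    for v, weight in coauthor_weight.get(u, {}).items():
        adopted = adoption_year.get(v)
        if adopted is not None and adopted < year:
            # Stronger ties count more; older adoptions count exponentially less.
            score += weight * math.exp(-decay * (year - adopted))
    return score
```

With this form the score grows with the number of adopting neighbors and with the strength of each tie, matching both statements on the slide.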
16
Model Variables
  Model selection
    Two-category classification
    Multiple Logistic Regression (MLR) model
  Explanatory variables
    Social influence
    An author u's tendency to follow topic s in year t: [equation missing from transcript]
17
Model Variables (cont.)
  Explanatory variables
    Homophily
    With respect to those users who have followed topic s before t, we measure u's homophily as: [equation missing from transcript]
  Then, the whole MLR model is: [equation missing from transcript]
  Baseline (Anagnostopoulos et al., 2008)
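For orientation, a logistic regression over the two explanatory variables, written here as F_SI (social influence) and F_TS (topic similarity), generally takes the form below. The intercept \beta_0 and the outcome notation y_{u,s,t} are assumptions, and the authors' exact formulation may differ.

```latex
P\left(y_{u,s,t} = 1\right)
  = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \beta_1 F_{SI} + \beta_2 F_{TS}\right)\right)}
```

Here y_{u,s,t} = 1 would mean that author u follows topic s in year t, which matches the two-category classification setting.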
18
Parameter Estimation
  By maximum likelihood (training set in [2004, 2008])
  β_2 has a larger Wald value than β_1, indicating that F_TS (homophily) has a stronger impact on topic-following behavior than F_SI (social influence).
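To make the estimation step concrete, here is a minimal standard-library sketch of maximum-likelihood fitting by Newton-Raphson, followed by Wald statistics computed as the squared coefficient divided by its estimated variance. It is a generic textbook procedure rather than the authors' code, and the toy data are invented purely for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def score_and_information(xs, labels, beta):
    """Gradient of the log-likelihood and the Fisher information matrix."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x, y in zip(xs, labels):
        p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
        for i in range(k):
            grad[i] += (y - p) * x[i]
            for j in range(k):
                info[i][j] += p * (1.0 - p) * x[i] * x[j]
    return grad, info

def fit_logistic(rows, labels, iterations=25):
    """Newton-Raphson MLE; an intercept column is prepended to each row."""
    xs = [[1.0] + [float(v) for v in r] for r in rows]
    beta = [0.0] * len(xs[0])
    for _ in range(iterations):
        grad, info = score_and_information(xs, labels, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    _, info = score_and_information(xs, labels, beta)
    return beta, info

def wald_statistics(beta, info):
    """Wald statistic beta_i^2 / Var(beta_i), with Var from the inverse information."""
    k = len(beta)
    stats = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        variance = solve(info, unit)[i]  # i-th diagonal entry of the inverse
        stats.append(beta[i] ** 2 / variance)
    return stats

# Toy data (invented): positives out of 10 for each (F_SI, F_TS) combination.
counts = {(0, 0): 1, (1, 0): 3, (0, 1): 6, (1, 1): 8}
rows, labels = [], []
for (f_si, f_ts), positives in counts.items():
    for i in range(10):
        rows.append([f_si, f_ts])
        labels.append(1 if i < positives else 0)

beta, info = fit_logistic(rows, labels)
wald = wald_statistics(beta, info)
```

On this toy data the second feature has both the larger coefficient and the larger Wald statistic, which is the kind of comparison the slide draws between β_2 and β_1. The sketch assumes the classes are not perfectly separable; otherwise the iteration would diverge.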
19
Evaluation Results
  Model evaluation
    Metrics (testing set in 2009): recall/sensitivity, specificity, precision, accuracy, AUC (area under the ROC curve)
  Results for topic XML
    AUC: 0.743 vs. 0.638
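The listed metrics follow their standard definitions, sketched below for binary labels. AUC is computed as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The function names and the toy inputs are illustrative assumptions, and the sketch assumes both classes and both predicted values occur at least once.

```python
def classification_metrics(labels, predictions):
    """Recall, specificity, precision and accuracy from 0/1 labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return {
        "recall": tp / (tp + fn),        # also called sensitivity
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / len(pairs),
    }

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Toy example, invented for illustration.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]
```

Unlike the other four metrics, AUC does not depend on the 0.5 threshold used to turn scores into predictions, which makes it convenient for comparing two models.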
20
Evaluation Results (cont.)
  For the other 4 representative topics, MLR outperforms the baseline in both accuracy and F_β.
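For readers unfamiliar with the F-measure, the standard definition combines precision and recall with a weight β that favors recall when β > 1. The slide does not state which β was used, so the values below are only examples.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With `beta=1` this reduces to the familiar F1 score, the harmonic mean of precision and recall.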
21
Thank you! Questions are welcome.