1
Advisor-advisee Relationship Mining from Research Publication Network
Chi Wang 1, Jiawei Han 1, Yuntao Jia 1, Jie Tang 2, Duo Zhang 1, Yintao Yu 1, Jingyi Guo 2
1 University of Illinois at Urbana-Champaign, {chiwang1, hanj, yjia3, dzhang22, yintao}@illinois.edu
2 Tsinghua University, jietang@tsinghua.edu.cn, guojy07@mails.tsinghua.edu.cn
2
Motivation
Latent knowledge in an information network:
– Relationships: friends / relatives / colleagues / enemies?
If these relationships can be mined from the links, it will benefit our study of:
– Community structure: clustering & classification
– Searching: search & ranking
– Evolution patterns: prediction & recommendation
3
Overall Framework
4
Notation
– a_i: author i
– p_j: paper j
– py: paper year
– pn: number of papers
– st_{i,y_i}: starting time
– ed_{i,y_i}: ending time
– r_{i,y_i}: ranking score
5
Heuristics
ASSUMPTION 1: At each time t during the publication history of a node x, x is either being advised or not being advised. Once x starts to advise another node, it will never be advised again.
ASSUMPTION 2: For a given pair of advisor and advisee, the advisor always has a longer publication history than the advisee.
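As a rough illustration, the two assumptions can be read as simple filters over candidate (advisee, advisor) pairs. A minimal Python sketch, assuming each author maps to a sorted list of publication years (all names and data here are illustrative, not the authors' code):

```python
# Hypothetical sketch: the two heuristic assumptions as filters over
# candidate (advisee, advisor) pairs. pub_years maps an author to a
# sorted list of publication years; names and data are made up.

def history_length(years):
    """Length of an author's publication history, in years."""
    return years[-1] - years[0] + 1

def satisfies_assumption_2(advisee_years, advisor_years):
    """Assumption 2: the advisor has a longer publication history."""
    return history_length(advisor_years) > history_length(advisee_years)

def satisfies_assumption_1(advised_end, advising_start):
    """Assumption 1: any interval in which an author was advised must end
    before (or when) that author starts advising someone else."""
    return advised_end <= advising_start

pub_years = {"a1": [2004, 2006, 2009], "a2": [1995, 2001, 2009]}
print(satisfies_assumption_2(pub_years["a1"], pub_years["a2"]))  # True
```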
6
Stage 1: Preprocessing
From the author-paper bipartite network, build a homogeneous coauthorship (collaboration) network. A filtering process is then performed to remove unlikely advisor-advisee relations.
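For concreteness, a small sketch (illustrative names, not the paper's code) of collapsing an author-paper bipartite edge list into a weighted coauthorship network, which the filtering step would then prune:

```python
from collections import defaultdict
from itertools import combinations

def coauthor_network(paper_authors):
    """Collapse an author-paper bipartite network into a weighted
    coauthorship network: edge weight = number of jointly written papers."""
    weights = defaultdict(int)
    for authors in paper_authors.values():
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return dict(weights)

# Toy example: two papers, one repeated author pair.
papers = {"p1": ["a1", "a2"], "p2": ["a1", "a2", "a3"]}
print(coauthor_network(papers))
# {('a1', 'a2'): 2, ('a1', 'a3'): 1, ('a2', 'a3'): 1}
```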
7
Stage 1: Preprocessing
Author a_j is not considered to be a_i's advisor if one of the following conditions holds:
8
Stage 1: Preprocessing
In addition, estimate:
– the starting time st_ij: the time a_i and a_j started to collaborate;
– the ending time ed_ij: the time point when the Kulczynski measure starts to decrease;
– the local likelihood l_ij of a_j being a_i's advisor.
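The Kulczynski measure for a pair of authors can be taken as the average of their two conditional co-publication ratios. A rough sketch of the estimation, assuming cumulative per-year paper counts as input (illustrative names, not the paper's code):

```python
def kulczynski(n_ij, n_i, n_j):
    """Kulczynski measure: average of the two conditional support ratios."""
    if n_i == 0 or n_j == 0:
        return 0.0
    return 0.5 * (n_ij / n_i + n_ij / n_j)

def estimate_interval(years, cum_i, cum_j, cum_ij):
    """st_ij: first year with a joint paper; ed_ij: first year in which the
    cumulative Kulczynski measure decreases (else the last observed year)."""
    st = next(y for y, c in zip(years, cum_ij) if c > 0)
    k = [kulczynski(cij, ci, cj) for cij, ci, cj in zip(cum_ij, cum_i, cum_j)]
    ed = years[-1]
    for t in range(1, len(k)):
        if k[t] < k[t - 1]:
            ed = years[t]
            break
    return st, ed

years = [2001, 2002, 2003, 2004]
print(estimate_interval(years, [1, 3, 6, 10], [2, 3, 4, 5], [0, 2, 3, 3]))
# (2002, 2003)
```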
9
Stage 2: Graph Factor Model
For each node a_i, there are three variables to decide: y_i, st_i, and ed_i. Suppose we already have a local feature function g(y_i, st_i, ed_i) defined on these three variables for any given node.
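As a hedged illustration of the factorization idea only: if the joint distribution decomposes into the local feature functions g(y_i, st_i, ed_i), the unnormalized log-likelihood of a candidate assignment is the sum of the local log-scores. The paper's actual model is a time-constrained probabilistic factor graph solved by message passing; the toy g below is made up.

```python
import math

def joint_log_likelihood(assignment, g):
    """Unnormalized log-likelihood of an assignment {i: (y_i, st_i, ed_i)},
    assuming the joint factorizes into local feature functions g(i, y, st, ed)."""
    return sum(math.log(g(i, y, st, ed)) for i, (y, st, ed) in assignment.items())

# Made-up local score: prefer short advising intervals, slight bonus for advisor 0.
def toy_g(i, y, st, ed):
    return (2.0 if y == 0 else 1.0) / (1 + ed - st)

print(joint_log_likelihood({1: (0, 2001, 2004), 2: (1, 2003, 2006)}, toy_g))
```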
10
Experiment Results
DBLP data: 654,628 authors, 1,076,946 publications, years provided.

Dataset   RULE     SVM      IndMAX (emp.)   IndMAX (opt.)   TPFG (emp.)   TPFG (opt.)
TEST1     69.9%    73.4%    75.2%           78.9%           80.2%         84.4%
TEST2     69.8%    74.6%                    79.0%           81.5%         84.3%
TEST3     80.6%    86.7%    83.1%           90.9%           88.8%         91.3%

RULE: heuristics; SVM: supervised learning; IndMAX and TPFG are reported with empirical and optimized parameters.
11
Case Study

Advisee          Top Ranked Advisor      Time    Note
David M. Blei    1. Michael I. Jordan    01-03   PhD advisor, 2004 grad
                 2. John D. Lafferty     05-06   Postdoc, 2006
Hong Cheng       1. Qiang Yang           02-03   MS advisor, 2003
                 2. Jiawei Han           04-08   PhD advisor, 2008
Sergey Brin      1. Rajeev Motwani       97-98   "Unofficial advisor"
12
Effect of rules (ROC curve): filtering rules in TPFG.
13
THANK YOU