Mining Advisor-Advisee Relationships from Research Publication Networks, KDD 2010. Presenter: 徐晓旻.

1 Mining Advisor-Advisee Relationships from Research Publication Networks, KDD 2010. Presenter: 徐晓旻

2 INTRODUCTION
- Conduct a systematic investigation of mining advisor-advisee relationships between authors in a research publication network.
- Gain better insight into the research community.
- Provide additional semantic information on the links.

3

4 INTRODUCTION (cont.)
- The left figure shows the input: a temporal collaboration network consisting of authors and papers.
- The middle figure shows the output of our analysis: an author network in which solid arrows indicate advising relationships.
- The right figure gives an example of a visualized chronological hierarchy.

5 PROBLEM FORMULATION
- The input is a network $G = (V = V_p \cup V_a, E)$, where $V_p = \{p_1, \ldots, p_{n_p}\}$ is the set of publications, with $p_i$ published at time $t_i$; $V_a = \{a_1, \ldots, a_{n_a}\}$ is the set of authors; and $E$ is the set of edges. Each edge $e_{ij} \in E$ links paper $p_i$ to author $a_j$, meaning that $a_j$ is an author of $p_i$.

6 original network transformed
- The original network can be transformed into a network containing only authors.
- Let $G' = (V', E', \{py_{ij}\}_{e_{ij} \in E'}, \{pn_{ij}\}_{e_{ij} \in E'})$, where $V' = \{a_0, \ldots, a_{n_a}\}$ is the set of authors (including a virtual node $a_0$). Each edge $e'_{ij} = (i, j) \in E'$ connects authors $a_i$ and $a_j$ if they have published together.
- Two vectors are associated with each edge: Pub_Year_vector $py_{ij}$ and Pub_Num_vector $pn_{ij}$.

7 network transformed cont.
- Each author $a_i$ is associated with two vectors, $py_i$ and $pn_i$, which record respectively the publication years of $a_i$ and the number of papers $a_i$ published in each of those years. Both can be derived from $py_{ij}$ and $pn_{ij}$.
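The transformation from the paper-author network to an author-only network can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `(year, authors)` input format, and the toy papers are assumptions, and the per-author vectors are counted directly from the papers rather than derived from the edge vectors.

```python
from collections import defaultdict
from itertools import combinations

def build_coauthor_network(papers):
    """papers: iterable of (year, [author, ...]) pairs (hypothetical format).

    Returns (edge_counts, author_counts):
      edge_counts[(a, b)][year] -> papers a and b co-wrote that year (a < b)
      author_counts[a][year]    -> papers a wrote that year
    """
    edge_counts = defaultdict(lambda: defaultdict(int))
    author_counts = defaultdict(lambda: defaultdict(int))
    for year, authors in papers:
        unique = sorted(set(authors))
        for a in unique:
            author_counts[a][year] += 1
        for a, b in combinations(unique, 2):
            edge_counts[(a, b)][year] += 1
    return edge_counts, author_counts

def as_vectors(year_counts):
    """Split a {year: count} mapping into parallel (py, pn) lists."""
    years = sorted(year_counts)
    return years, [year_counts[y] for y in years]

# Toy data, invented for illustration.
papers = [
    (2001, ["ada", "bob"]),
    (2001, ["ada", "bob", "cy"]),
    (2003, ["ada", "cy"]),
]
edges, authors = build_coauthor_network(papers)
py_ab, pn_ab = as_vectors(edges[("ada", "bob")])  # edge vectors py_ij, pn_ij
py_a, pn_a = as_vectors(authors["ada"])           # author vectors py_i, pn_i
```

Here `py_ab` and `pn_ab` play the role of Pub_Year_vector and Pub_Num_vector for one edge, stored as two parallel lists.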

8 this problem is more complicated
- (i) One could have multiple advisors, such as a master's advisor and PhD co-advisors.
- (ii) Some mentors from industry behave like academic advisors when judged only by the collaboration history.
- (iii) One's advisor could be missing from the data set.

9 construct subgraph H′
- Formally, we denote by $r_{ij}$ the probability of $a_j$ being the advisor of $a_i$.
- Construct a subgraph $H' \subset G'$ by removing some edges from $G'$ and directing the remaining edges from advisee to potential advisor.

10 construct subgraph H′ cont.
A simple way to predict is:
- Fetch the top $k$ potential advisors of $a_i$ and check whether $a_j$ is one of them while $r_{ij} > r_{i0}$ or $r_{ij} > \theta$, where $\theta$ is a threshold such as 0.5. We use P@(k, θ) to denote this method.
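The prediction rule on this slide can be read as a small filter over one author's ranking scores. The sketch below is an assumption-laden illustration: the function name, the dictionary layout, and the key `"a0"` for the virtual node are invented here, and the scores are toy values.

```python
def predict_advisors(r_i, k, theta, virtual="a0"):
    """r_i maps each candidate advisor j to the score r_ij for one author a_i.

    r_i[virtual] is r_i0, the score of the virtual node a_0.
    Returns the top-k candidates whose score beats r_i0 or the threshold.
    """
    r_i0 = r_i.get(virtual, 0.0)
    candidates = {j: s for j, s in r_i.items() if j != virtual}
    top_k = sorted(candidates, key=candidates.get, reverse=True)[:k]
    return [j for j in top_k if candidates[j] > r_i0 or candidates[j] > theta]

# Toy scores, invented for illustration.
scores = {"a0": 0.3, "x": 0.5, "y": 0.15, "z": 0.05}
predicted = predict_advisors(scores, k=2, theta=0.5)
```

With these scores only `x` is kept: `y` is in the top 2 but beats neither `r_i0` nor the threshold.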

11

12 4. APPROACH
- The main idea is to leverage a time-constrained probabilistic factor graph model to decompose the joint probability of the unknown advisor of every author.
- By maximizing the joint probability of the factor graph, we can infer the relationships and compute a ranking score for each relation edge on the candidate graph.

13 4.1 Assumptions and Framework

14 two-stage framework solution
- In stage 1, we preprocess the heterogeneous collaboration network to generate the candidate graph $H'$. This includes the transformation from $G$ to a homogeneous network $G'$, the construction of $H'$ from $G'$, and the estimation of the local likelihood on each edge of $H'$.
- In stage 2, these potential relations are further modeled with a probabilistic model. Local likelihood and time constraints are combined in the global joint probability of all the hidden variables. The joint probability is maximized and the ranking scores of all the potential relations are computed together. The construction of $H$ is finished in this stage.

15 4.2 Stage 1: Preprocessing

16 Rule to detect advisor
- The Kulczynski measure reflects the correlation of the two authors' publications.
- IR measures the imbalance between the occurrence of $a_j$ given $a_i$ and the occurrence of $a_i$ given $a_j$.
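For reference, both measures can be computed from three paper counts. The sketch below uses the usual pattern-mining definition of the Kulczynski measure (the average of the two conditional probabilities) and a signed imbalance ratio; the sign convention, the function names, and the toy counts are assumptions, and the slides do not show the exact formulas used.

```python
def kulczynski(n_i, n_j, n_ij):
    """Average of P(a_j | a_i) and P(a_i | a_j), estimated from paper counts.

    n_i, n_j: papers by a_i and a_j; n_ij: papers they co-authored.
    """
    return 0.5 * (n_ij / n_i + n_ij / n_j)

def imbalance_ratio(n_i, n_j, n_ij):
    """Signed imbalance (assumed convention): positive when a_j has more papers."""
    return (n_j - n_i) / (n_i + n_j - n_ij)

# Toy counts: a_i wrote 4 papers, all with a_j, who wrote 20 in total.
kulc = kulczynski(4, 20, 4)
ir = imbalance_ratio(4, 20, 4)
```

A high Kulczynski value together with a strongly positive imbalance is the pattern one would expect for a student who publishes almost only with a much more prolific advisor.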

17 Rule to detect advisor

18
- When a pair of authors passes the test of the selected rules, we construct a directed edge from $a_i$ to $a_j$ in $H'$.
- We estimate the starting time and ending time of the advising, as well as $l_{ij}$, the local likelihood of $a_j$ being $a_i$'s advisor.
- The starting time $st_{ij}$ is estimated as the time they started to collaborate.

19
- The ending time $ed_{ij}$ can be estimated as either the time point when the Kulczynski measure starts to decrease, or the year that makes the largest difference between the Kulczynski measure before and after it.
- $l_{ij}$: the local likelihood of $a_j$ being $a_i$'s advisor.
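The second way of estimating the ending time (the year with the largest before/after difference) might look like the sketch below. It is only one possible reading: pooling the counts on each side of the split, the function names, and the toy counts are all assumptions made for illustration.

```python
def kulc(n_i, n_j, n_ij):
    """Kulczynski measure from paper counts; 0.0 when either author has none."""
    return 0.5 * (n_ij / n_i + n_ij / n_j) if n_i and n_j else 0.0

def estimate_end_year(yearly):
    """yearly: {year: (n_i, n_j, n_ij)} paper counts per year for one pair.

    Returns the year after which the pooled Kulczynski measure drops the most.
    """
    years = sorted(yearly)

    def total(ys):
        # Pool the three counts over a range of years.
        return tuple(sum(yearly[y][k] for y in ys) for k in range(3))

    best_year, best_gap = years[-1], float("-inf")
    for cut in range(1, len(years)):
        gap = kulc(*total(years[:cut])) - kulc(*total(years[cut:]))
        if gap > best_gap:
            best_year, best_gap = years[cut - 1], gap
    return best_year

# Toy counts: the pair stops co-authoring after 2001.
yearly = {2000: (2, 5, 2), 2001: (3, 6, 3), 2002: (2, 5, 0), 2003: (3, 6, 0)}
end_year = estimate_end_year(yearly)
```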

20 Stage 2: TPFG Model
- Define the TPFG model.
- For each node $a_i$, there are three variables to decide: $y_i$, $st_i$, and $ed_i$.
- Local feature function: $g(y_i, st_i, ed_i)$.
- Joint probability of all the variables in the network.

21 Stage 2: TPFG Model
- To find the most probable values of all the hidden variables, we need to maximize their joint probability.
- Exhaustive search is intractable.
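To see why exhaustive search does not scale, consider a brute-force maximizer over a toy candidate graph. This is an illustrative sketch, not the TPFG algorithm: the data layout, the product-of-likelihoods score, and the form of the time constraint (an author must finish being advised before starting to advise) are simplifying assumptions.

```python
from itertools import product

def best_assignment(candidates):
    """candidates[i] = list of (advisor j, start st_ij, end ed_ij, likelihood l_ij).

    Advisor 0 stands for the virtual 'no advisor' node.
    Returns (assignment {author: advisor}, score) with the highest score.
    """
    authors = sorted(candidates)
    best, best_score = None, 0.0
    # One loop iteration per joint assignment: the count is the product
    # of all candidate-list sizes, which explodes on real networks.
    for choice in product(*(candidates[i] for i in authors)):
        picked = dict(zip(authors, choice))
        score = 1.0
        for x, (i, st_xi, _, l_xi) in picked.items():
            score *= l_xi
            # Assumed time constraint: if i advises x, then i's own advising
            # period (if i has an advisor) must end before st_xi.
            if i in picked and picked[i][0] != 0 and not picked[i][2] < st_xi:
                score = 0.0
        if score > best_score:
            best = {a: c[0] for a, c in picked.items()}
            best_score = score
    return best, best_score

# Toy candidate graph, invented for illustration.
candidates = {
    1: [(0, 0, 0, 1.0)],
    2: [(1, 2000, 2004, 0.7), (0, 0, 0, 0.3)],
    3: [(2, 2002, 2005, 0.6), (1, 2003, 2006, 0.4)],
}
assignment, score = best_assignment(candidates)
```

In this toy graph the locally best choice for author 3 (advisor 2, likelihood 0.6) is ruled out because author 2 would still be a student in 2002, so the joint optimum assigns advisor 1 to both.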

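22 Decomposition of variables dependency
- Eliminate the variables $st_i$ and $ed_i$.
- Compute the likelihood that $a_j$ is the advisor of $a_i$, together with the conditions that must be satisfied (given by the indicator function $I$).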

23 Decomposition of variables dependency

24
- In this figure, the nodes connected to $f_1(\cdot)$ are $y_1$ and all the possible advisee nodes of node 1; as the figure shows, these are nodes 2 and 3.

25 4.4 Model Learning

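26 Sum-product
- The sum-product algorithm inherits the message-passing mechanism, but by introducing a factor graph it decomposes the global probability density function into a product of several local probability density functions.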
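A tiny example may help make the factorization concrete. The sketch below computes one marginal on a three-variable chain, once by passing messages from the two local factors and once by brute-force enumeration. The factor tables `fA` and `fB` are invented toy values and are unrelated to TPFG.

```python
from itertools import product

# Binary variables x1, x2, x3 with p(x1, x2, x3) proportional to
# fA(x1, x2) * fB(x2, x3). Both tables are toy values.
fA = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 1.0}
fB = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 4.0}

def marginal_x2_by_messages():
    # Message fA -> x2 sums x1 out of fA; message fB -> x2 sums x3 out of fB.
    msg_a = [sum(fA[(x1, x2)] for x1 in (0, 1)) for x2 in (0, 1)]
    msg_b = [sum(fB[(x2, x3)] for x3 in (0, 1)) for x2 in (0, 1)]
    # The marginal is the normalized product of the incoming messages.
    unnorm = [msg_a[v] * msg_b[v] for v in (0, 1)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def marginal_x2_by_enumeration():
    # Reference answer: sum the full joint over every configuration.
    unnorm = [0.0, 0.0]
    for x1, x2, x3 in product((0, 1), repeat=3):
        unnorm[x2] += fA[(x1, x2)] * fB[(x2, x3)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

by_messages = marginal_x2_by_messages()
by_enumeration = marginal_x2_by_enumeration()
```

Both routes give the same marginal, but the message-passing version only ever sums over one local factor at a time.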

27 single-i sum-product algorithm

28 Sum-product algorithm
- Note that $g_i(x_i)$ is a function of $x_i$ alone, i.e. $g_i(x_i) = \mu_{x \to g_i}(x_i)$; $g_i(x_i)$ then follows from Equation (5).

29 single-i sum-product algorithm

30 New TPFG Inference Algorithm
- The original sum-product algorithm runs into difficulty because it requires each node to wait until all but one of its messages have arrived. In TPFG, some nodes would therefore wait forever because of cycles.
- We arrange the message passing in a mode based on the strict order determined by $H'$. Each node $a_i$ has a descendant set $Y_i^{-1}$ and an ascendant set $Y_i$.

31 Message Passing two-phase schema
- In the first phase, messages are passed from advisees to possible advisors; in the second, messages are passed back from advisors to possible advisees.
- The first phase:
  - The message from $f_i(\cdot)$ to $y_i$ is generated and sent only when all the messages from its descendants have arrived. $y_i$ then immediately sends it to all its ascendants $f_j(\cdot)$, $j \in Y_i$.

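32 two-phase schema cont.
- The second phase:
  - Messages travel along each edge in the direction opposite to phase 1.
- Why compute $r_{ij}$ when we already have $l_{ij}$? Because $l_{ij}$ is only the local support for $a_j$ being the advisor of $a_i$, whereas $r_{ij}$ is by definition a global support: it takes the other dependencies in the graph into account, and this propagation model is how they are taken into account.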

33 two-phase schema cont.
- After the two phases of message propagation, we can collect the two messages on any edge and obtain the marginal function.
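The two-phase order can be sketched as a scheduling problem on the candidate graph: a node sends upward only once every potential advisee has sent, and the second phase replays that order in reverse. The code below computes only the schedule, not the messages themselves; the function name and input layout are assumptions, and it assumes the candidate graph has no directed cycle.

```python
from collections import deque

def two_phase_schedule(advisors_of):
    """advisors_of[i] = candidate advisors of a_i (advisee -> advisor edges).

    Returns (phase1, phase2): the order in which nodes send in each phase.
    """
    nodes = set(advisors_of) | {j for js in advisors_of.values() for j in js}
    # A node waits for one message from each potential advisee (descendant).
    waiting = {n: 0 for n in nodes}
    for js in advisors_of.values():
        for j in js:
            waiting[j] += 1
    ready = deque(sorted(n for n in nodes if waiting[n] == 0))
    phase1 = []
    while ready:
        i = ready.popleft()
        phase1.append(i)
        for j in advisors_of.get(i, []):
            waiting[j] -= 1
            if waiting[j] == 0:
                ready.append(j)
    if len(phase1) != len(nodes):
        raise ValueError("candidate graph contains a directed cycle")
    # Phase 2 sends back from advisors to advisees, in reverse order.
    return phase1, phase1[::-1]

# Toy candidate graph: c may be advised by b or a; b may be advised by a.
phase1, phase2 = two_phase_schedule({"c": ["b", "a"], "b": ["a"]})
```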

34 simplify the message propagation
- Eliminate the function nodes and the internal messages between a function node and a variable node.
- The improved message propagation is still separated into two phases:
  - In the first phase, the messages $sent_i$, passed from each node to its ascendants, are generated in a similar order as before.
  - In the second phase, the messages $recv_i$ returned from the ascendants are stored in each node.

35 simplify the message propagation

36

37

38 5. EXPERIMENTAL RESULTS
- Data set: the DBLP Computer Science Bibliography Database.
- Test the accuracy of the discovered advisor-advisee relationships.
- Adopt three data sets: one is manually labeled by looking at the home pages of the advisors, and the other two are crawled from the Mathematics Genealogy Project and the AI Genealogy Project.

39 compare TPFG with baseline methods
- Evaluation aspects
- Two performance measurements: accuracy and scalability.

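40 5.2 Accuracy
- Effect of rules in TPFG
  - From Figure 5(a) we can see that R2/R3 has the highest suitability on the tested data.
- ROC curve: compare the known advisor-advisee pairs in the test data with the pairs computed by the algorithm. Sort the computed pairs by rank score in descending order. The x-axis is the top a% of the computed pairs; the y-axis is the size of the intersection between that top a% and the test-data pairs, divided by the size of the test data.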
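The curve described on this slide can be computed in a few lines. The sketch below follows that description directly; the function name, the rounding of the top a% cut-off, and the toy pairs and scores are assumptions made for illustration.

```python
def coverage_curve(scored_pairs, true_pairs, percents):
    """scored_pairs: {(advisee, advisor): rank score}; true_pairs: known pairs.

    Returns [(a, fraction of true pairs found in the top a% of predictions)].
    """
    ranked = sorted(scored_pairs, key=scored_pairs.get, reverse=True)
    curve = []
    for a in percents:
        top = set(ranked[: round(len(ranked) * a / 100)])
        curve.append((a, len(top & true_pairs) / len(true_pairs)))
    return curve

# Toy predictions and ground truth, invented for illustration.
scored = {
    ("s1", "p1"): 0.9,
    ("s2", "p1"): 0.8,
    ("s2", "p2"): 0.4,
    ("s3", "p2"): 0.2,
}
truth = {("s1", "p1"), ("s2", "p2")}
curve = coverage_curve(scored, truth, [25, 50, 75, 100])
```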

41 Effect of network structure
- From Figure 5(c) we see that, for closures of different depths, TPFG achieves better accuracy as the depth increases.
- TPFG is compared with the exact maximal joint probability and with other approximate algorithms, JuncT and LBP.

42 Effect of training data
- Support Vector Machines (SVMs) are accurate supervised learning approaches.
- Reduce advisor mining to a classification problem.
- We combined the Kulczynski and IR measures as features.
- TPFG can achieve comparable or even better accuracy than a supervised method.

43 Effect of training data

44 5.3 Scalability Performance

45 5.4 Applications
- Visualization of genealogy
- The visualized hierarchies of the research community, built from the discovered relationships, can help us gain better insight into the community.

46 5.4 Applications
- Expert finding and Bole search
- Bole search is a specific expert-finding task that aims to identify the best supervisors.

47

