Intelligent Database Systems Lab, 國立雲林科技大學 (National Yunlin University of Science and Technology)
Finding a Team of Experts in Social Networks
Theodoros Lappas, Kun Liu, and Evimaria Terzi. KDD, 2009.
Reported by Wen-Chung Liao, 2009/12/22
Outline
Motivation, Objectives, Preliminaries, Problems, Algorithms, Experiments, Conclusion, Comments
Motivation
The success of a project depends not only on the expertise of the people involved, but also on how effectively they collaborate, communicate, and work together as a team.
Figure 1: Network of connections between the individuals {a, b, c, d, e}.
X_a = {algorithms}
X_b = {web programming}
X_c = {software engineering, distributed systems}
X_d = {software engineering}
X_e = {software engineering, distributed systems, web programming}
T = {algorithms, software engineering, distributed systems, web programming}
Both X' = {a, b, c} and X'' = {a, e} cover T.
Objectives
Given a task T that requires a set of skills, the goal is to find a set of individuals X' ⊆ X such that every skill required by T is exhibited by at least one individual in X'.
Additionally, the members of team X' should induce a subgraph of G with low communication cost.
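The coverage requirement can be checked mechanically. The following is a minimal sketch, not code from the paper: the function name `covers` is hypothetical, and the skill sets are the ones listed for Figure 1 on the Motivation slide.

```python
def covers(team, skills, task):
    """Return True if every skill in `task` is held by at least one member of `team`."""
    held = set().union(*(skills[i] for i in team))
    return task <= held


# Skill sets X_a ... X_e and task T as listed for Figure 1.
skills = {
    "a": {"algorithms"},
    "b": {"web programming"},
    "c": {"software engineering", "distributed systems"},
    "d": {"software engineering"},
    "e": {"software engineering", "distributed systems", "web programming"},
}
task = {"algorithms", "software engineering", "distributed systems", "web programming"}

print(covers({"a", "b", "c"}, skills, task))  # candidate team X'
print(covers({"a", "e"}, skills, task))       # candidate team X''
print(covers({"a", "b"}, skills, task))       # misses two required skills
```

Coverage alone does not distinguish X' from X''; the communication cost in G is what separates them.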
Preliminaries
X = {1, ..., n}: a set of n individuals.
A = {a_1, ..., a_m}: a universe of m skills.
X_i ⊆ A: the set of skills of individual i.
T ⊆ A: a task, i.e., the subset of skills required to perform a job.
S(a) = {i | i ∈ X and a ∈ X_i}: the support set of skill a, i.e., the individuals in X that have skill a.
G(X, E): an undirected, weighted graph over the individuals.
d(i, i'): the weight of the shortest path between i and i' in G; Path(i, i'): the nodes along that path.
d(i, X') = min_{i' ∈ X'} d(i, i'); Path(i, X'): the nodes along the shortest path from i to its closest node in X'.
G[X']: the subgraph of G that contains only the nodes in X'.
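To make the notation concrete, here is a small sketch of S(a) and d(i, X'). It assumes the graph is stored as a nested dictionary `{node: {neighbor: weight}}` and uses Dijkstra's algorithm for shortest paths; the function names and the toy graph and skills are hypothetical and do not come from the slides.

```python
import heapq


def support(skills, a):
    """S(a): the individuals whose skill set contains skill a."""
    return {i for i, held in skills.items() if a in held}


def shortest_distances(graph, source):
    """Dijkstra's algorithm; returns {node: d(source, node)} for reachable nodes."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def distance_to_set(graph, i, team):
    """d(i, X') = min over i' in X' of d(i, i'); infinity if nothing is reachable."""
    dist = shortest_distances(graph, i)
    return min(dist.get(j, float("inf")) for j in team)


# Hypothetical toy data (undirected graph, so every edge is listed in both directions).
graph = {1: {2: 1, 3: 4}, 2: {1: 1, 3: 1}, 3: {1: 4, 2: 1, 4: 2}, 4: {3: 2}}
skills = {1: {"x"}, 2: {"y"}, 3: {"x", "y"}, 4: {"z"}}

print(support(skills, "x"))
print(distance_to_set(graph, 1, {3, 4}))
```

In this toy graph the direct edge between 1 and 3 has weight 4, but the path through node 2 has weight 2, so d(1, {3, 4}) is determined by the indirect route.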
Problems
Problem [Team Formation]: Given the set of n individuals X = {1, ..., n}, a graph G(X, E), and a task T, find X' ⊆ X such that C(X', T) = T and the communication cost Cc(X') is minimized. Here C(X', T) denotes the skills of T that are covered by the members of X'.
Communication cost functions:
Diameter (R): Cc-R(X'), the diameter of the subgraph G[X'].
Minimum Spanning Tree (Mst): Cc-Mst(X'), the cost of a minimum spanning tree of G[X'].
Proposition 1. The Diameter-TF problem is NP-complete.
Proposition 2. The Mst-TF problem is NP-complete.
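The two cost functions can be evaluated directly on the subgraph G[X']. The sketch below is illustrative only: it assumes a nested-dictionary graph `{node: {neighbor: weight}}`, uses Floyd-Warshall for the diameter and Prim's algorithm for the spanning tree, and returns infinity when G[X'] is disconnected (a convention chosen here, not stated on the slide). All names and the toy graph are hypothetical.

```python
import heapq

INF = float("inf")


def induced(graph, team):
    """G[X']: keep only the nodes in `team` and the edges between them."""
    return {u: {v: w for v, w in graph[u].items() if v in team} for u in team}


def diameter_cost(graph, team):
    """Cc-R(X'): largest shortest-path distance between two nodes of G[X']."""
    sub = induced(graph, team)
    nodes = list(sub)
    dist = {u: {v: (0 if u == v else sub[u].get(v, INF)) for v in nodes} for u in nodes}
    for k in nodes:  # Floyd-Warshall restricted to the subgraph
        for u in nodes:
            for v in nodes:
                if dist[u][k] + dist[k][v] < dist[u][v]:
                    dist[u][v] = dist[u][k] + dist[k][v]
    return max((dist[u][v] for u in nodes for v in nodes), default=0)


def mst_cost(graph, team):
    """Cc-Mst(X'): total edge weight of a minimum spanning tree of G[X'] (Prim)."""
    sub = induced(graph, team)
    if not sub:
        return 0
    start = next(iter(sub))
    seen, total = {start}, 0
    heap = [(w, v) for v, w in sub[start].items()]
    heapq.heapify(heap)
    while heap and len(seen) < len(sub):
        w, v = heapq.heappop(heap)
        if v in seen:
            continue
        seen.add(v)
        total += w
        for x, wx in sub[v].items():
            if x not in seen:
                heapq.heappush(heap, (wx, x))
    return total if len(seen) == len(sub) else INF


# Hypothetical toy graph (undirected, edges listed in both directions).
graph = {1: {2: 1, 3: 4}, 2: {1: 1, 3: 1}, 3: {1: 4, 2: 1, 4: 2}, 4: {3: 2}}

print(diameter_cost(graph, {1, 2, 3}), mst_cost(graph, {1, 2, 3}))
print(diameter_cost(graph, {1, 3}), mst_cost(graph, {1, 3}))
```

Note that both costs are measured inside G[X']: for the team {1, 3} the only available edge has weight 4, even though nodes 1 and 3 are at distance 2 in the full graph.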
Algorithms
Worked example (from the figure), with support sets S(a_0), S(a_1), S(a_2), S(a_3):
S(a_0) = {1, 7}
R_1 = max{1, 2, 2} = 2
R_7 = max{1, 0, 1} = 1
X' = {7} ∪ {2, 8}
Proposition 3. Cc-R(X') ≤ 2 · Cc-R(X*), where X* is an optimal team.
Running time: O(|S(a_rare)| × n)
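The worked example follows a rarest-skill-first pattern: start from the holders of the rarest skill a_rare, compute for each holder i the largest distance R_i to the nearest holder of every other required skill, and build the team around the holder with the smallest R_i. The sketch below is one possible reading of that procedure, not the authors' code. It assumes a nested-dictionary graph, that every required skill has at least one holder, and it breaks ties arbitrarily; the function names and toy data are hypothetical, since the graph from the slide's figure is not recoverable.

```python
import heapq

INF = float("inf")


def dijkstra(graph, source):
    """Return (dist, prev) for shortest paths from `source`."""
    dist, prev = {source: 0}, {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, INF):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return dist, prev


def rarest_first(graph, skills, task):
    """Return (team, radius), or (None, inf) if no connected team is found."""
    support = {a: {i for i, held in skills.items() if a in held} for a in task}
    rare = min(task, key=lambda a: len(support[a]))  # skill with the smallest support
    best = None
    for i in support[rare]:
        dist, prev = dijkstra(graph, i)
        # Closest holder of every other required skill, as seen from i.
        closest = {a: min(support[a], key=lambda j: dist.get(j, INF))
                   for a in task if a != rare}
        radius = max((dist.get(j, INF) for j in closest.values()), default=0)  # R_i
        if best is None or radius < best[0]:
            best = (radius, i, closest, prev)
    radius, i_star, closest, prev = best
    if radius == INF:
        return None, INF
    team = {i_star}
    for j in closest.values():  # add every node on Path(i*, S(a))
        while j != i_star:
            team.add(j)
            j = prev[j]
    return team, radius


# Hypothetical toy instance (undirected graph, edges listed in both directions).
graph = {
    1: {2: 1, 3: 2},
    2: {1: 1, 4: 1},
    3: {1: 2, 5: 1},
    4: {2: 1, 5: 3},
    5: {3: 1, 4: 3},
}
skills = {1: {"a0"}, 2: {"a1"}, 3: {"a1", "a2"}, 4: {"a1", "a2"}, 5: {"a0", "a2"}}

team, radius = rarest_first(graph, skills, {"a0", "a1", "a2"})
print(team, radius)
```

In this instance a0 is the rarest skill with holders 1 and 5. Node 1 has R_1 = max{1, 2} = 2 and node 5 has R_5 = max{1, 0} = 1, so the team is built around node 5. The returned radius is R_i for the chosen holder, not the diameter cost Cc-R(X') itself.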
[Figure: node set X_0, nodes v and v*, and the resulting team X'.]
Running times shown: O(|X_0| × |E|) and O(|T| × |X|)
[Figure: nodes Y_1, Y_2, Y_3, nodes v and v*, and the resulting team X'.]
Running time shown: O(k × |E|)
Experiments
GreedyDiameter (GreedyMST)
DBLP dataset: papers in DB, DM, AI, and T conferences.
X_dblp (individuals): authors that have at least three papers.
X_i (skills of author i): terms that appear in at least two titles of papers that author i has co-authored.
Authors i and i' are connected in G_dblp(X_dblp, E) if they appear as co-authors in at least two papers.
A task T(t, s) is generated as follows: (1) select a subset S of {DB, DM, AI, T} with |S| = s; (2) randomly pick t required skills.
For every pair (s, t), 100 random tasks are generated, with t = 2, 4, ..., 20 and s = 1, ..., 4.
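The dataset construction rules above can be sketched as follows. This is an illustration under stated assumptions, not the preprocessing used in the paper: the function name `build_network`, its parameters, the whitespace tokenization of titles, and the sample records are all hypothetical, and edges are left unweighted because the slide does not say how edge weights are assigned.

```python
from collections import Counter
from itertools import combinations


def build_network(papers, min_papers=3, min_term_titles=2, min_joint_papers=2):
    """papers: iterable of (author_list, title). Returns (individuals, skills, edges)."""
    paper_count = Counter()  # author -> number of papers
    term_count = {}          # author -> Counter(term -> number of titles containing it)
    joint = Counter()        # (author, author) -> number of co-authored papers
    for authors, title in papers:
        authors = sorted(set(authors))
        terms = set(title.lower().split())  # assumption: naive whitespace tokenization
        for a in authors:
            paper_count[a] += 1
            term_count.setdefault(a, Counter()).update(terms)
        for pair in combinations(authors, 2):
            joint[pair] += 1
    individuals = {a for a, n in paper_count.items() if n >= min_papers}
    skills = {a: {t for t, n in term_count[a].items() if n >= min_term_titles}
              for a in individuals}
    edges = {pair for pair, n in joint.items()
             if n >= min_joint_papers and set(pair) <= individuals}
    return individuals, skills, edges


# Hypothetical sample records.
papers = [
    (["ann", "bob"], "graph mining"),
    (["ann", "bob"], "graph clustering"),
    (["ann", "cy"], "stream mining"),
    (["bob"], "graph search"),
    (["cy"], "query optimization"),
]
individuals, skills, edges = build_network(papers)
print(individuals, skills, edges)
```

With these records, `cy` has only two papers and is dropped, so the single edge that remains is the one between `ann` and `bob`, who share two papers.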
Experiments (continued)
[Figure: experimental results.]
Conclusion
The paper addresses the problem of forming a team of skilled individuals to perform a given task while minimizing the communication cost among the team members.
It proves that the Team Formation problem is NP-hard.
It proposes approximation algorithms for the problem.
Comments
Advantages
Shortcomings
Applications: team formation