Finding Experts Using Social Network Analysis
2007 IEEE/WIC/ACM International Conference on Web Intelligence
Yupeng Fu, Rongjing Xiang, Yong Wang, Min Zhang, Shaoping Ma
Advisor: Dr. Koh Jia-Ling
Reporter: Che-Wei, Liang
Date: 2009/03/23
Outline
– Introduction
– A two-stage ranking method
– Building associations among Candidates
– Experiment
– Conclusion
Introduction
When locating some desired information,
– One can usually be satisfied by finding an expert in the topic of interest
Finding information on the expertise of people quickly
– Can play a critical role in facilitating better solutions and in fostering the formation of virtual organizations and expertise networks
Introduction
Discovering who knows what is challenging
– A relationship between a query and an expert can be built via documents
– Social networks provide another opportunity for finding experts, and web pages can be mined for the relationships among people
A two-stage ranking method
Expert finding problem
– Given the query topic q, estimate the probability that a candidate c is an expert
To reveal the relationship between query and expert
– How can the abstract, high-level concept of a “person” be associated with concrete documents?
Two kinds of association are used to tackle this problem
A two-stage ranking method
First kind of association: a(d, c)
– Between the candidates and the content of the documents
Assume:
– Each expert’s knowledge can be represented by a list of terms
– A document d in which the candidate c appears has a non-zero association a(d, c)
A two-stage ranking method
Second kind of association: a(c_x, c_y)
– Among the candidates themselves
– The connections among candidates can be identified through document analysis
Given an expert c_x,
– A candidate c_y who has a strong association a(c_x, c_y) with c_x is also quite likely to be an expert
A two-stage ranking method
Two-stage method
1. Expertise evaluation process
– Estimate how likely a candidate c is to be an expert by summing the similarities of the documents to the given topic q
– Select the top-ranked candidates as seeds
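The evaluation stage can be sketched roughly as follows. This is only an illustration, not the authors' implementation: it assumes each document's similarity to q is weighted by the association a(d, c) before summing, that scores are normalised to sum to one, and that the number of seeds is a free parameter. The names `evaluate_expertise`, `doc_scores`, `associations`, and `num_seeds` are hypothetical.

```python
from collections import defaultdict


def evaluate_expertise(doc_scores, associations, num_seeds=2):
    """Stage one: score candidates and pick the top-ranked ones as seeds.

    doc_scores:   {doc_id: similarity of the document to the query q}
    associations: {(doc_id, candidate): a(d, c)}; missing pairs count as 0
    num_seeds:    assumed cut-off for how many top candidates become seeds
    """
    scores = defaultdict(float)
    for (doc_id, candidate), a_dc in associations.items():
        # Assumption: similarity is weighted by a(d, c) before summing.
        scores[candidate] += doc_scores.get(doc_id, 0.0) * a_dc

    total = sum(scores.values())
    # Assumption: normalise so the scores read as probabilities.
    if total > 0:
        probs = {c: s / total for c, s in scores.items()}
    else:
        probs = dict(scores)

    ranked = sorted(probs, key=probs.get, reverse=True)
    return probs, ranked[:num_seeds]
```

With binary associations, a candidate who appears in the most query-relevant documents ends up at the top of the ranking and is selected as a seed.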
A two-stage ranking method
2. Expertise propagation process
– Use the associations among candidates to propagate the likelihood from the highly probable experts to other candidates
– Viewed as estimating p(c_y|c_x), the probability that candidate c_y is an expert given the expert c_x
A two-stage ranking method
If an expert c_x has an expertise probability P(c_x) and w associated candidates, each of the w candidates c_y, with association a(c_x, c_y), receives a fraction of the score of c_x
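A minimal sketch of one propagation step is given below. The exact update rule is not recoverable from this slide, so the sketch assumes that each seed's score is split among its neighbours in proportion to a(c_x, c_y) and added to their own scores with a mixing weight. The function name `propagate_expertise` and the parameter `alpha` are hypothetical.

```python
def propagate_expertise(probs, seeds, assoc, alpha=0.5):
    """Stage two: pass a share of each seed's score to associated candidates.

    probs: {candidate: stage-one expertise score P(c)}
    seeds: candidates selected as likely experts
    assoc: {c_x: {c_y: a(c_x, c_y)}}
    alpha: assumed mixing weight for the propagated score
    """
    updated = dict(probs)
    for c_x in seeds:
        neighbours = assoc.get(c_x, {})
        total = sum(neighbours.values())
        if total <= 0:
            continue  # no associated candidates, nothing to propagate
        for c_y, a_xy in neighbours.items():
            # Assumption: c_y receives a fraction of P(c_x)
            # proportional to its association strength with c_x.
            share = probs.get(c_x, 0.0) * a_xy / total
            updated[c_y] = updated.get(c_y, 0.0) + alpha * share
    return updated
```

Under these assumptions a candidate with a stronger association to a seed gains more than one with a weak association, which matches the intuition stated on the slide.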
Building associations among Candidates
Task: Building associations
Represent the organization as a graph
– Nodes correspond to candidates; edges carry the strength of the associations
– A higher strength of association indicates that the two people have more interests in common and communicate more frequently
Building associations among Candidates
Web page-based social network
– People who co-occur within a range of local context may share similar interests
– An intuitive approach: count the co-occurrences of two candidates c_x and c_y in a document d
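Co-occurrence counting can be illustrated with a short sketch. It assumes that candidate mentions have already been extracted per document and that a pair is counted once per document; the name `cooccurrence_associations` and the input layout are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_associations(doc_candidates):
    """Count, for each candidate pair, the documents in which both appear.

    doc_candidates: {doc_id: iterable of candidates mentioned in the document}
    Returns a Counter keyed by the alphabetically ordered pair (c_x, c_y).
    """
    counts = Counter()
    for candidates in doc_candidates.values():
        # Deduplicate so a pair is counted once per document.
        for pair in combinations(sorted(set(candidates)), 2):
            counts[pair] += 1
    return counts
```

The resulting counts can serve directly as edge weights of the candidate graph.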
Building associations among Candidates
Communication-based social network
– Two candidates c_x and c_y are associated if they appear together in the from, to, or cc fields of an e-mail
Building associations among Candidates
Communication-based social network (cont.)
– The connection matrix built from single messages alone is sparse
– Consider also associating candidates who appear in the same thread
Building associations among Candidates
Communication-based social network (cont.)
– Combine the single-message and thread associations
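The message-level and thread-level associations can be combined as in the sketch below. The slides do not say how the two are combined, so a weighted sum is assumed here. The message layout (a dict with `thread`, `from`, `to`, and `cc` keys) and the parameter `thread_weight` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def message_participants(msg):
    """Everyone named in the from, to, or cc fields of one message."""
    return {msg["from"]} | set(msg.get("to", [])) | set(msg.get("cc", []))


def communication_associations(messages, thread_weight=0.5):
    """Combine single-message and thread-level co-occurrence.

    messages:      list of dicts with keys 'thread', 'from', 'to', 'cc'
    thread_weight: assumed weight of a thread-level link relative to a
                   single-message link (which counts as 1.0)
    """
    assoc = Counter()
    threads = {}
    for msg in messages:
        people = message_participants(msg)
        threads.setdefault(msg["thread"], set()).update(people)
        # Single-message association: pairs sharing one message header.
        for pair in combinations(sorted(people), 2):
            assoc[pair] += 1.0
    # Thread association: pairs taking part in the same thread.
    for people in threads.values():
        for pair in combinations(sorted(people), 2):
            assoc[pair] += thread_weight
    return assoc
```

Adding the thread-level links fills in pairs that never share a single message header, which is the sparsity problem noted above.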
Building associations among Candidates
Query-dependent social network
– Calculate the associations only from the web pages and e-mails that are relevant to the query topic
– Focus only on the associations that are related to the desired topic
Building associations among Candidates
Query-dependent social network (cont.)
– Evaluate the strength of an association using the similarity of the documents to the query
  – The more relevant to the query the document that joins the candidates is, the stronger the association between the candidates
– Replace the binary function with a numerical function
  – A potential expert propagates more expertise probability to the candidates with stronger associations
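A query-dependent variant of co-occurrence counting might look like the sketch below. It assumes that each joining document contributes its query similarity instead of a binary 1, and that documents with zero similarity are skipped. The names `query_dependent_associations`, `doc_candidates`, and `doc_scores` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def query_dependent_associations(doc_candidates, doc_scores):
    """Weight each co-occurrence by the document's similarity to the query.

    doc_candidates: {doc_id: iterable of candidates appearing in the document}
    doc_scores:     {doc_id: similarity of the document to the query q}
    """
    assoc = defaultdict(float)
    for doc_id, candidates in doc_candidates.items():
        sim = doc_scores.get(doc_id, 0.0)
        if sim <= 0:
            continue  # documents irrelevant to the query add nothing
        for pair in combinations(sorted(set(candidates)), 2):
            # Numerical weight replaces the binary co-occurrence indicator.
            assoc[pair] += sim
    return dict(assoc)
```

Pairs joined only by documents that are irrelevant to the query drop out of the graph entirely, so propagation stays focused on the desired topic.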
Experiment
Experimental settings
– Text Retrieval Conference (TREC)
  – Provides a common platform, the Enterprise Search Track, for assessing methods empirically
  – The corpus is a crawl of the public W3C sites and comprises 331,037 documents in six different sub-collections, including mailing lists, web pages, and personal homepages
  – 198,394 messages in total, forming 79,521 trees
Experiment
Experimental results
Experiment
The role of seed
Conclusion
– A two-stage method for expert finding is proposed
– The performance improvement in the experiments demonstrates its effectiveness