A Taxonomy-based Model for Expertise Extrapolation
Delroy Cameron, Amit P. Sheth: Ohio Center for Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University, Dayton, OH
Boanerges Aleman-Meza: Department of Biochemistry and Cell Biology, Rice University, Houston, TX
I. Budak Arpinar, Sheron L. Decker: LSDIS Lab, Department of Computer Science, University of Georgia, Athens, GA
48th ACM Southeast Conference (ACMSE), Oxford, Mississippi, April 15-17, 2010
BACKGROUND
Realm of Finding Experts
o Propagation Method
o Human-Centered Information Diffusion
o prima facie
o Issues
o Inconsistent Human Perceptions
o Strong vs. Weak Ties
Artifacts
o Curricula Vitae
o Version Control Systems, Patents & Research Grants
o Citation Linkage
Citation Sentiment Detection
Pied Piper Effect
Expertise Granularity
Adage: The publications of a researcher are indicative of her expertise.
CONTRIBUTIONS
Structured Data
o Taxonomy of Topics
o Extrapolation
o Bibliographic Data
o Collaboration Networks
Co-authorship Graph
o Prevent Collaboration Stagnation
Search Algorithms; PageRank; subtopic_of; DFS, BFS; Semantic Associations; Topic Hierarchy
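The slide names DFS and BFS over subtopic_of links in the topic hierarchy. As a minimal sketch of that idea, the following collects every topic below a given topic with an iterative depth-first search. The edge list is hypothetical (the slides do not show the actual taxonomy structure), and the function names are illustrative only.

```python
from collections import defaultdict

# Hypothetical subtopic_of edges as (child, parent) pairs.
# The real taxonomy layout is not given in the slides.
SUBTOPIC_OF = [
    ("OWL", "Semantic_Web"),
    ("RDF", "Semantic_Web"),
    ("Reasoning", "OWL"),
    ("XML", "RDF"),
]


def build_children(edges):
    """Index the subtopic_of edges as parent -> list of direct children."""
    children = defaultdict(list)
    for child, parent in edges:
        children[parent].append(child)
    return children


def descendants(topic, children):
    """Return every topic below `topic`, in depth-first order."""
    found = []
    seen = set()
    stack = list(reversed(children.get(topic, [])))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        found.append(node)
        # Push children in reverse so the first child is visited first.
        stack.extend(reversed(children.get(node, [])))
    return found


children_index = build_children(SUBTOPIC_OF)
subtopics = descendants("Semantic_Web", children_index)
```

Swapping the stack for a queue would give the breadth-first variant; either order yields the same set of subtopics.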
EXPERTISE MODEL
$B = \{b_1, b_2, \ldots, b_n\}$, $P = \{p_1, p_2, \ldots, p_n\}$, $T = \{t_1, t_2, \ldots, t_m\}$
[Figure: author $a_i$; nodes $b_1 \ldots b_n$ with weights $\lambda_1 \ldots \lambda_n$; publications $p_1 \ldots p_n$; topics $t_1 \ldots t_m$; Expertise Profile]
EXPERTISE PROFILES
Author $a_i$: 81 publications, 12 on Semantic Web
[Figure: topic #Semantic_Web with publications p49, p73, p70, p17, p40, p37, p68, p13, p36, p9, p20, p29; topics #A.I., #Reasoning, #OWL, #Know. Acq, #Know. Man., #XML, #Semantics, #Languages, #Content, #Web, #RDF with publications p5, p50, p8, p42, p53]
EXPERTISE PROFILES
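COMPUTING EXPERTISE
[Figure: publication p5 with topics #A.I., #Reasoning, #OWL]
$e(\#\text{Semantic\_Web}) = \big(p_5(\text{OWL}) \lor p_5(\text{Reasoning}) \lor p_5(\text{A.I.})\big) \cdot \lambda_{ecai}$
$e(p_5) = (1 \lor 0 \lor 0) \cdot 0.69 = 0.69$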
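The per-publication score on this slide can be sketched as a logical OR over topic membership, multiplied by the impact factor. This is an illustrative reading of the formula, not the authors' implementation; the function name and the representation of a paper as a set of topics are assumptions.

```python
def publication_expertise(paper_topics, query_topics, impact_factor):
    """Score one paper against a set of topics.

    The OR over topics is 1 if the paper is tagged with at least one
    of the query topics and 0 otherwise; the result is scaled by the
    impact factor (lambda) attached to the paper.
    """
    matches = any(topic in paper_topics for topic in query_topics)
    return impact_factor if matches else 0.0


# Values from the slide: p5 matches OWL only, and lambda_ecai = 0.69.
p5_topics = {"OWL"}
semantic_web_subtopics = ["OWL", "Reasoning", "A.I."]
lambda_ecai = 0.69

e_p5 = publication_expertise(p5_topics, semantic_web_subtopics, lambda_ecai)
```

Because the OR saturates at 1, a paper tagged with several matching subtopics counts once, so its impact factor is never added twice.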
COMPUTING EXPERTISE
[Figure: topic #Semantic_Web with publications p49, p73, p70, p17, p40, p37, p68, p13, p36, p9, p20, p29; topics #A.I., #Reasoning, #OWL, #Know. Acq, #Know. Man., #XML, #Semantics, #Languages, #Content, #Web, #RDF with publications p5, p50, p8, p42, p53]
$e(p_5) = \lambda_{ecai} = 0.69$
$e(p_8) = \lambda_{ekaw} = 0.55$
$e(p_{42}) = \lambda_{www} = 1.54$
$e(p_{50}) = \lambda_{ewimt} = 0.1$
$e(p_{53}) = \lambda_{ekaw} = 0.55$
$e'' = 0.69 + 0.55 + 1.54 + 0.1 + 0.55 = 3.43$
$e' = \ldots$
$e = \ldots = 13.43$
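The extrapolated score $e''$ is the sum of the five per-publication scores listed on this slide. A short sketch of that arithmetic follows; it assumes each subscript on λ names the venue of the paper, and the dictionary layout and function name are illustrative only.

```python
# Assumption: the subscript on each lambda identifies the paper's venue.
PAPER_VENUE = {
    "p5": "ecai",
    "p8": "ekaw",
    "p42": "www",
    "p50": "ewimt",
    "p53": "ekaw",
}

# Impact factors as given on the slide.
IMPACT = {"ecai": 0.69, "ekaw": 0.55, "www": 1.54, "ewimt": 0.1}


def extrapolated_expertise(paper_venue, impact):
    """Sum the impact factors of all papers found under subtopics."""
    return sum(impact[venue] for venue in paper_venue.values())


e_extrapolated = extrapolated_expertise(PAPER_VENUE, IMPACT)
print(round(e_extrapolated, 2))  # 3.43
```

The result is rounded before display because summing floating-point impact factors can leave a small representation error.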
DATASET
Papers-to-Topics Dataset
o 476,299 papers
o 676,569 relationships to topics
o Focused crawl of DBLP
Taxonomy of CS Topics
o Manually (320 Topics)
o Conference Names (60)
o Session Names (216)
o Index Terms & Yahoo! Term Extractor (128)
o O'CoMMA Taxonomy (50)
Publication Impact Factors
o Citeseer (>1200 Proceedings)
DEMO
EVALUATION
GEODESIC
Geodesic: Shortest path between two vertices in a directed graph

Geodesic Level   Description w.r.t. PC Chair(s)     Degree of Separation
STRONG           co-authors                         One
MEDIUM           common coauthors                   Two
WEAK             published in same proceedings      Unspecified
                 coauthors w/ common coauthors      Two
                 coauthor related to editor         Three
EXTREMELY WEAK   coauthors in same proceedings      Three
UNKNOWN          no relationship in dataset         Unknown
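As an illustration of the geodesic definition above, the following sketch uses breadth-first search to find the degree of separation between a PC chair and a candidate expert in a co-authorship graph. The graph, the node names, and the `UNCLASSIFIED` label are hypothetical; only the STRONG, MEDIUM, and UNKNOWN levels are mapped, since the weaker levels in the table depend on proceedings and editor relations that this sketch does not model.

```python
from collections import deque


def geodesic_length(graph, source, target):
    """Length of the shortest path from source to target, or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def geodesic_level(distance):
    """Map a co-authorship distance to a level from the table."""
    if distance is None:
        return "UNKNOWN"
    if distance == 1:
        return "STRONG"
    if distance == 2:
        return "MEDIUM"
    return "UNCLASSIFIED"  # placeholder; not a level from the slides


# Hypothetical co-authorship graph, stored with edges in both directions.
COAUTHORS = {
    "chair": ["x"],
    "x": ["chair", "y"],
    "y": ["x"],
    "z": [],
}

level_x = geodesic_level(geodesic_length(COAUTHORS, "chair", "x"))
level_y = geodesic_level(geodesic_length(COAUTHORS, "chair", "y"))
level_z = geodesic_level(geodesic_length(COAUTHORS, "chair", "z"))
```

Breadth-first search is sufficient here because every co-authorship edge counts as one step; a weighted graph would need a different shortest-path method.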
EVALUATION
C-Net
C-Net: Measure of collaboration strength within expert subgroups
[Figure: $v_m = 14.80$, $v_1 = 0.73$, $v_2 = 0.73$, $v_3 = 0.73$, $v_4 = \ldots$]
M. E. J. Newman, "Coauthorship networks and patterns of scientific collaboration," Proceedings of the National Academy of Sciences, 2004.
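The slide does not spell out how C-Net values are computed. As a hedged illustration of one collaboration-strength measure from Newman's co-authorship work, the sketch below weights each pair of coauthors by summing 1/(n - 1) over the papers they share, where n is the number of authors on the paper. Whether C-Net uses exactly this weighting is an assumption, and the function name and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def newman_weights(papers):
    """Pairwise collaboration weights from a list of author lists.

    Each paper with n >= 2 authors adds 1 / (n - 1) to the weight of
    every pair of its authors; single-author papers add nothing.
    """
    weights = defaultdict(float)
    for authors in papers:
        unique = sorted(set(authors))
        n = len(unique)
        if n < 2:
            continue
        for a, b in combinations(unique, 2):
            weights[(a, b)] += 1.0 / (n - 1)
    return dict(weights)


# Hypothetical data: one two-author paper and one three-author paper.
sample_papers = [["A", "B"], ["A", "B", "C"], ["C"]]
pair_weights = newman_weights(sample_papers)
```

Under this weighting, a pair who wrote a paper alone together counts for more than the same pair on a large multi-author paper.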
LIMITATIONS
Taxonomy of Topics
Semantic Association in Large RDF Graphs
Entity Disambiguation
Paper-to-Topics Mappings
CONCLUSION
Semantic Expert Finder
o Taxonomy of Topics
o Publication Impact Factors
o Expertise Profiles
Collaboration Network Analysis
o Co-Authorship Graph
o Semantic Associations
ACKNOWLEDGEMENTS
People: Wenbo Wang, Ajith Ranabahu, Boanerges Aleman-Meza
National Science Foundation Award SemDis (Discovering Complex Relationships in the Semantic Web) No Wright State University No. IIS to University of Georgia