
SEMEF: A Taxonomy-Based Discovery of Experts, Expertise and Collaboration Networks. Delroy Cameron, Master's Thesis, Computer Science, University of Georgia.


1 SEMEF: A Taxonomy-Based Discovery of Experts, Expertise and Collaboration Networks. Delroy Cameron, Master's Thesis, Computer Science, University of Georgia, 11/27/2007. Advisor: I. Budak Arpinar. Committee: Prashant Doshi, Robert J. Woods.

2 OUTLINE: Background; Expertise Profiles; Ranking Experts; Collaboration Networks Expansion; Results and Evaluation; Conclusion; Demo.

3 BACKGROUND: Semantic Web. What? Extension of the current Web; Attach Meaning to Data. Why? Underutilization of the Current Web; HTML Limitations. Goal: Enhance Information Exchange; Automatic Information Discovery; Interoperability of Services.

4 BACKGROUND: Semantic Web. Technologies: XML; RDF/RDFS/OWL; URI; Ontology. Example: the statement “David Billington is a Professor of Mathematics”, encoded on the slide as markup relating “David Billington” to “Mathematics” (the tags themselves did not survive in this transcript).
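The example statement on this slide can be sketched as subject-predicate-object triples, which is the data model RDF is built on. This is only an illustration in plain Python: the `http://example.org/` namespace, the predicate names `position` and `field`, and the `objects` helper are all hypothetical, since the slide's original markup is not available.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace and identifiers, for illustration only.
EX = "http://example.org/"

triples = [
    Triple(EX + "DavidBillington", EX + "position", EX + "Professor"),
    Triple(EX + "DavidBillington", EX + "field", EX + "Mathematics"),
]


def objects(store, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [t.obj for t in store
            if t.subject == subject and t.predicate == predicate]


print(objects(triples, EX + "DavidBillington", EX + "field"))
```

Because every resource is named by a URI, two datasets that use the same URI for "Mathematics" can be merged by simply concatenating their triple lists.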

5 BACKGROUND: Semantic Web. Common Challenges: Entity Disambiguation; Ontology Mapping/Alignment; Trust/Provenance; Semantic Association Discovery. Applications: Social Networks; Bioinformatics; National Security; GPS Data Mining.

6 BACKGROUND: Social Networks. What? Connected through Social Relationships. Characteristics: Clustering Coefficient (connectedness to neighbors); Centrality (average shortest path length); Geodesic (shortest path length).
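Two of the characteristics listed on this slide can be sketched on a toy undirected graph: the geodesic as a breadth-first shortest path length, and the local clustering coefficient as the fraction of a node's neighbor pairs that are themselves linked. The graph, node names and function names below are assumptions made for illustration and are not taken from the thesis.

```python
from collections import deque
from itertools import combinations


def geodesic(graph, source, target):
    """Shortest path length between two nodes, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def clustering_coefficient(graph, node):
    """Fraction of pairs of the node's neighbors that are directly linked."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


# Toy coauthorship graph: each key maps to the set of its neighbors.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

print(geodesic(graph, "A", "E"))
print(clustering_coefficient(graph, "C"))
```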

7 BACKGROUND: Peer-Review Process. What? Review scholarly manuscripts. Challenges: Slow; Conflict of Interest; Finding Suitable Reviewers; Arbitrary Knowledge Approach; Research Diversification; Emerging Fields.

8 CONTRIBUTIONS: Applicability of Semantics; Finding Expertise; Fine Levels of Granularity; Finding Experts; Taxonomy; Collaboration Networks; Discovery of Unknown Experts.

9 SEMEF: SEMantic Expert Finder. Finding Expertise (Expertise Profiles): Collecting Expertise; Quantifying Expertise. Finding (Ranking) Experts: w/ and w/o taxonomy. Collaboration Networks: Geodesic; C-Nets.

10 EXPERTISE PROFILES: Collecting Expertise. Collect All Publications; Map papers to topics; Quantify all papers. Publications Dataset: DBLP, 473,296 papers (conference/session names, Nov. 2007); ACM, IEEE, Science Direct, 29,454 papers (abstracts/index terms); Combined, 476,299 papers.

11 EXPERTISE PROFILES: Collecting Expertise. Papers-to-Topics Dataset: Combined (476,299); Topics (320); Relationships (676,569); Expertise Profiles (560,792).

12 EXPERTISE PROFILES: Quantifying Expertise. Mapping each paper to a distinct value. Publication Impact: Hector Garcia-Molina (248 papers, 2003); E. F. Codd (49 papers, 2003). Citeseer Impact Statistics (1221 venues); DBLP URIs.

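13 EXPERTISE PROFILES. Figure 1: Expertise Profile. author_A is linked to topic 1 (4.50), topic 2 (1.86) and topic 3 (3.08), which are in turn linked to paper 1 through paper 6. The paper weights shown are 1.54 (three papers), 1.86 (two papers) and 1.10 (one paper); the topic scores are consistent with sums of paper weights (1.54 + 1.10 + 1.86 = 4.50 and 1.54 + 1.54 = 3.08).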

14 RANKING EXPERTS: Taxonomy of Topics. Session Names; Conference Names; O’CoMMA; Paper Abstracts; Index Terms. Figure 2: Taxonomy of Topics (values shown in the figure: 192, 128, 320, 216, 60, 50).

15 RANKING EXPERTS. Case 1: Single Topic without Taxonomy. Traverse all Expertise Profiles; Sum impact (papers to topics). Case 2: Single Topic with Taxonomy. Traverse all Expertise Profiles; Sum impact (papers to topics, subtopics). Prevent Expertise Overestimation: map papers to leaf nodes only.

16 RANKING EXPERTS. Case 3: Array of Topics without Taxonomy. Same as Case 2. Case 4: Array of Topics with Taxonomy. Filter input topics; Sum impact (papers to topics, subtopics).
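The difference between ranking with and without the taxonomy can be sketched as follows. This is a minimal illustration, not the thesis implementation: the taxonomy is assumed to be a parent-to-children dictionary, an expertise profile is assumed to be a topic-to-weight dictionary holding leaf topics only, and "Filter input topics" is interpreted here as collapsing overlapping input topics into one set so that no subtopic is counted twice. The topic names and weights are made up.

```python
def with_subtopics(taxonomy, topic):
    """Return the topic together with all of its subtopics."""
    found = {topic}
    stack = [topic]
    while stack:
        for child in taxonomy.get(stack.pop(), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def expand_topics(taxonomy, topics):
    """Union of every input topic and its subtopics (each counted once)."""
    covered = set()
    for topic in topics:
        covered |= with_subtopics(taxonomy, topic)
    return covered


def score(profile, topics):
    """Sum the profile weights of the topics that were asked for."""
    return sum(weight for topic, weight in profile.items() if topic in topics)


# Hypothetical taxonomy and profile; papers are mapped to leaf topics only.
taxonomy = {
    "Semantic Web": ["Ontologies", "RDF"],
    "Ontologies": ["Ontology Alignment"],
}
profile = {"RDF": 1.54, "Ontology Alignment": 1.86, "Databases": 1.10}

without_taxonomy = score(profile, {"Semantic Web"})
single_topic = score(profile, with_subtopics(taxonomy, "Semantic Web"))
topic_array = score(profile, expand_topics(taxonomy, ["Semantic Web", "Ontologies"]))
print(without_taxonomy, round(single_topic, 2), round(topic_array, 2))
```

Without the taxonomy, an author who published only in subtopics scores nothing for the parent topic; with it, the subtopic weights roll up, and asking for both a topic and one of its subtopics does not inflate the score.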

17 COLLABORATION NETWORKS EXPANSION: Geodesic. Figure 3: Geodesic Relationships. The figure shows four cases for the relationship between author_A and author_B, labeled STRONG, MEDIUM, WEAK and UNKNOWN, drawn as opus:author edges from opus:Article_in_Proceedings nodes to authors (including intermediate authors author_1 and author_2) and opus:isIncludedIn edges from articles to an opus:Proceedings node.

18 COLLABORATION NETWORKS EXPANSION: C-Net. Ordering Cluster of Experts; Collaboration Strength*. Figure (captioned “Figure 3: Geodesic Relationships” on the slide): Super Node {14.80} connected to coauthor_1 {0.73, 0.5}, coauthor_2 {1.81, 1.0}, coauthor_3 {0.73, 0.5}, coauthor_4 {0.73, 0.5}, coauthor_5 {1.54, 1.0} and coauthor_n {1.1, 0.8}. * Newman, M. E. J.: Coauthorship Networks and Patterns of Scientific Collaboration. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl. 1): 5200-5205 (2004).
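Assuming the slide's Collaboration Strength refers to the coauthorship weighting used in Newman's work, each paper with n authors contributes 1/(n - 1) to the strength between every pair of its authors, so papers with few authors count for more than large group papers. The sketch below implements that weighting on made-up data; the function name and the paper lists are illustrative, and the pairs of values shown in braces on the slide are not interpreted here.

```python
from collections import defaultdict
from itertools import combinations


def collaboration_strength(papers):
    """Map each coauthor pair to the sum of 1 / (n - 1) over shared papers.

    `papers` is an iterable of author lists, one list per paper.
    """
    strength = defaultdict(float)
    for authors in papers:
        unique_authors = sorted(set(authors))
        n = len(unique_authors)
        if n < 2:
            continue  # single-author papers create no collaboration
        for a, b in combinations(unique_authors, 2):
            strength[frozenset((a, b))] += 1.0 / (n - 1)
    return dict(strength)


# Hypothetical papers, each given as its list of authors.
papers = [
    ["A", "B"],
    ["A", "B", "C"],
    ["A", "C", "D"],
    ["E"],
]

strengths = collaboration_strength(papers)
for pair, weight in sorted(strengths.items(), key=lambda item: -item[1]):
    print(sorted(pair), weight)
```

Sorting coauthors by this weight gives one way to order the cluster of experts around a given author.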

19 RESULTS AND EVALUATION. Evaluation: WWW Search Track (2005/6/7); Input Topics: Call For Papers; SWETO-DBLP Subset (67,366 authors); DBLP (560,792). Validation. Collaboration Networks Expansion.

20 RESULTS AND EVALUATION: Validation. Table 1: Past PC Lists comparison with SEMEF. The Search Track columns give the number of PC members found in each band of the SEMEF list.

Percentage in SEMEF List   Search 2005   Search 2006   Search 2007   Average       Cumulative Percentage in PC List
(top) 0-10%                10            13            13            12            35%
10-20%                     5             8             6             6             52%
20-30%                     6             0             0             2             58%
30-40%                     4             1             1             2             65%
40-50%                     6             2             0             3             73%
50-60%                     3             1             1             2             79%
60-70%                     4             0             0             1             82%
70-80%                     1             1             0             1             85%
80-90%                     1             0             0             0             85%
90-100%                    0             0             0             0             85%
Total                      40/48 (83%)   26/29 (89%)   21/25 (84%)   29/34 (85%)
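The Cumulative Percentage column of Table 1 appears to be a running total of the Average column divided by the average PC size of 34. The short check below recomputes it from the Average values (12, 6, 2, 2, 3, 2, 1, 1, 0, 0); the results agree with the slide's figures to within one percentage point, with the small differences presumably due to rounding on the slide. The variable names are illustrative.

```python
from itertools import accumulate

# Average number of PC members per 10% band of the SEMEF list (Table 1).
average_per_band = [12, 6, 2, 2, 3, 2, 1, 1, 0, 0]
average_pc_size = 34

# Cumulative percentages as printed on the slide, for comparison.
slide_cumulative = [35, 52, 58, 65, 73, 79, 82, 85, 85, 85]

running_totals = list(accumulate(average_per_band))
cumulative_pct = [100 * total / average_pc_size for total in running_totals]

for band, (computed, shown) in enumerate(zip(cumulative_pct, slide_cumulative)):
    print(f"{band * 10}-{band * 10 + 10}%: computed {computed:.1f}, slide {shown}")
```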

21 RESULTS AND EVALUATION: Validation. Figure 4: Average Number of PC in SEMEF List.

22 RESULTS AND EVALUATION: Validation. Figure 5: Average PC Distribution in SEMEF List.

23 RESULTS AND EVALUATION: Collaboration Networks Expansion. Table 3: PC Chair – PC Member Geodesic Relationships (PC List, Number of Expert Relationships). Table 4: PC Chair – SEMEF List Geodesic Relationships (SEMEF, Number of Expert Relationships). Both tables count STRONG, MEDIUM, WEAK and EXTREMELY WEAK relationships, together with Above Average Expertise (in PC), for Chair1 and Chair2 of the Search tracks of 2005, 2006 and 2007. The cell values are fused in this transcript and cannot be assigned to cells reliably.

24 CONCLUSION. Expertise Profiles: Publication Data; Publication Impact Statistics; Papers-to-Topics Relationships. Ranking Experts: w/ and w/o Taxonomy; Single and Array of Topics. Collaboration Networks Expansion: Semantic Association Discovery; Geodesic; C-Nets.

25 DEMO: Web Application; Apache Tomcat 6.0; Java Server Pages; Ubuntu 7.10.

26 RELATED WORK: Particle Swarm Algorithm; ExpertiseNets; Expertise Browser; Experience Atoms; Expertise Recommender; Change history; Tech Support Heuristics; Profiling, Identification, Supervisor.

27 RELATED WORK: Web-Based Communities; Expert Rank; Formal Probabilistic Models; Candidate Models; Document Models; RDF-Matcher.

28 EXPERTISE PROFILE ALGORITHM

Algorithm findExpertiseProfile(researcherURI, list of publications)
  create empty ‘expertise profile’
  foreach paper of researcher do
    get ‘topics’ list of paper (using papers-to-topics dataset)
    get ‘publication impact’
    if ‘publication impact’ is null do
      ‘publication impact’ ← default weight
    foreach ‘topic’ in ‘topics’ do
      if ‘expertise profile’ contains ‘topic’ do
        ‘weight’ ← ‘publication impact’ + existing ‘weight’ from expertise profile
        update ‘expertise profile’ with <topic, weight>
      else
        add <topic, publication impact> pair to ‘expertise profile’
    end
  end
  return ‘expertise profile’
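One possible Python reading of this algorithm is sketched below. It is not the thesis code: the dictionary-based inputs, the function name and the default weight of 1.0 are assumptions, and the researcher URI is left out because the list of publications is passed in directly. The sample data is chosen to reproduce the topic totals of Figure 1 (4.50, 1.86, 3.08); which paper belongs to which topic is an assumption.

```python
DEFAULT_WEIGHT = 1.0  # assumed value; the slide only says "default weight"


def find_expertise_profile(publications, papers_to_topics, publication_impact,
                           default_weight=DEFAULT_WEIGHT):
    """Build a {topic: accumulated weight} profile from a researcher's papers."""
    profile = {}
    for paper in publications:
        impact = publication_impact.get(paper)
        if impact is None:
            impact = default_weight  # venue has no known impact statistic
        for topic in papers_to_topics.get(paper, ()):
            # Add the impact to the existing weight, or start a new entry.
            profile[topic] = profile.get(topic, 0.0) + impact
    return profile


# Hypothetical data arranged to match the totals shown in Figure 1.
papers_to_topics = {
    "paper 1": ["topic 1"], "paper 2": ["topic 1"], "paper 3": ["topic 1"],
    "paper 4": ["topic 2"],
    "paper 5": ["topic 3"], "paper 6": ["topic 3"],
}
publication_impact = {
    "paper 1": 1.54, "paper 2": 1.10, "paper 3": 1.86,
    "paper 4": 1.86, "paper 5": 1.54, "paper 6": 1.54,
}

profile = find_expertise_profile(list(papers_to_topics), papers_to_topics,
                                 publication_impact)
print({topic: round(weight, 2) for topic, weight in profile.items()})
```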

29 RANKING EXPERTS ALGORITHM

Algorithm rankValue(researcherURI, list of topics)
  set ‘expertRankValue’ to zero
  create temporary ‘expertise profile’
  filter topics
  foreach topic in filtered topics list do
    get ‘papers’ for this topic (using papers-to-topics dataset)
    foreach paper in papers list do
      if researcher is author do
        get ‘publication impact’ as ‘weight’
        ‘expertRankValue’ ← ‘expertRankValue’ + ‘weight’
        add <topic, weight> pair to temporary ‘expertise profile’
      end if
    end
  end
  return ‘expertRankValue’
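The ranking algorithm could be written in Python roughly as follows. This is a sketch under stated assumptions: the lookup dictionaries, the function name and the default weight are hypothetical, the "filter topics" step is reduced to removing duplicate input topics, and the function returns the temporary profile alongside the rank value for inspection. As on the slide, a paper mapped to several requested topics contributes once per topic.

```python
def rank_value(researcher, topics, papers_by_topic, authors_of,
               publication_impact, default_weight=1.0):
    """Return (rank value, temporary profile) for one researcher."""
    expert_rank_value = 0.0
    temp_profile = {}
    # Assumed filter step: drop repeated topics, keeping first-seen order.
    for topic in dict.fromkeys(topics):
        for paper in papers_by_topic.get(topic, ()):
            if researcher not in authors_of.get(paper, ()):
                continue  # only the researcher's own papers count
            weight = publication_impact.get(paper)
            if weight is None:
                weight = default_weight
            expert_rank_value += weight
            temp_profile[topic] = temp_profile.get(topic, 0.0) + weight
    return expert_rank_value, temp_profile


# Hypothetical lookup tables for illustration.
papers_by_topic = {"topic 1": ["p1", "p2", "p7"], "topic 3": ["p5"]}
authors_of = {
    "p1": {"author_A"},
    "p2": {"author_A", "author_B"},
    "p7": {"author_B"},
    "p5": {"author_A"},
}
publication_impact = {"p1": 1.54, "p2": 1.10, "p5": 1.54}

value, temp_profile = rank_value("author_A", ["topic 1", "topic 3", "topic 1"],
                                 papers_by_topic, authors_of, publication_impact)
print(round(value, 2), temp_profile)
```

Ranking all experts for a query then amounts to computing this value for every researcher and sorting in descending order.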

