Expertise Networks in Online Communities: Structure and Algorithms Jun Zhang Mark S. Ackerman Lada Adamic University of Michigan WWW 2007, May 8–12, 2007,

Presentation transcript:

Expertise Networks in Online Communities: Structure and Algorithms Jun Zhang Mark S. Ackerman Lada Adamic University of Michigan WWW 2007, May 8–12, 2007, Banff, Alberta, Canada.

Outline
- Automatically identify users with expertise.
- Analyze the Java Forum.
- Test various network-based ranking algorithms such as HITS and PageRank.
- Use simulation rules to evaluate how other algorithms perform on the Java Forum.
- Evaluate performance in communities with different characteristics.

Introduction
- Expertise finder: a system that helps find others with the appropriate expertise to answer a question.
- Current expertise finders rely on modern information retrieval (IR) techniques: represent a person's expertise as a term vector and match expertise queries against it using standard IR techniques.
- Problem: this reflects whether a person knows about a topic but does not distinguish people's relative expertise levels.
- Solution: use a network-based ranking algorithm together with content analysis.

Expertise Network
- Online communities usually have a discussion thread structure.
- This is not a network focused on social relationships: a user replies because of interest in the content.
- CEN (Community Expertise Network): the distribution of expertise along with the network of responses.
- Structural prestige is a closely related concept: receiving more positive choices is prestigious.

Empirical Study: Java Forum
- People come to ask questions.
- 87 sub-forums with a large diversity of users.
- 333,314 messages in 49,888 threads.
- 13,739 nodes and 55,761 edges.
- Human raters rated 135 selected users; users who posted fewer than 10 times were omitted.

Characterizing the Network
- Bow-tie structure analysis.
- Degree distribution: captures the level of interaction.
  - Scale-free: participation is highly unevenly distributed.
- Degree correlations:
  - Indegree: how many people a given user helps.
  - Indegree alone does not show a user's own tendency in providing help, e.g. replying only to newbies or talking only to people of a similar expertise level.
  - For each asker-replier pair, compare the indegree of the replier with the indegree of the asker.
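The degree measures above can be sketched as follows. This is a minimal illustration, assuming each edge is an (asker, replier) pair so that a user's indegree counts the distinct users they helped; the function names and the toy edge list are hypothetical and not taken from the slides.

```python
from collections import defaultdict


def build_degrees(edges):
    """Return (indegree, outdegree) dicts for a list of (asker, replier) edges.

    Assumption: an edge points from the asker to the replier, so
    indegree  = number of distinct users this user helped,
    outdegree = number of distinct users who helped this user.
    """
    indegree = defaultdict(int)
    outdegree = defaultdict(int)
    for asker, replier in set(edges):  # repeated pairs count once
        outdegree[asker] += 1
        indegree[replier] += 1
    return dict(indegree), dict(outdegree)


def degree_correlation_pairs(edges):
    """For each distinct asker-replier pair, return (asker indegree, replier indegree)."""
    indegree, _ = build_degrees(edges)
    return [(indegree.get(a, 0), indegree.get(r, 0)) for a, r in sorted(set(edges))]


# Toy data (hypothetical): bob helps ann and cat, dan helps bob.
edges = [("ann", "bob"), ("cat", "bob"), ("bob", "dan"), ("ann", "bob")]
indegree, outdegree = build_degrees(edges)
pairs = degree_correlation_pairs(edges)
```

Plotting the pairs (asker indegree against replier indegree) is one way to see whether high-indegree users mostly answer newcomers or mostly answer each other.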

Expertise Ranking Algorithms
- Simple statistical measure: a user who answers a lot probably knows the topic well.
- Spammers: inflammatory or disruptive posts.
  - Handling the problem: users' relevance feedback.
- AnswerNum: the number of questions a user answered.
- Also count the number of users a user helped, which shows broader or greater expertise.
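The two counting measures above can be sketched as follows. The input layout, a list of (replier, asker, thread_id) tuples, and the function names are assumptions made for illustration; the slides do not specify a data format.

```python
from collections import defaultdict


def answer_num(replies):
    """AnswerNum: number of distinct threads (questions) each user replied to."""
    threads = defaultdict(set)
    for replier, _asker, thread_id in replies:
        threads[replier].add(thread_id)
    return {user: len(ids) for user, ids in threads.items()}


def users_helped(replies):
    """Number of distinct askers each user replied to."""
    askers = defaultdict(set)
    for replier, asker, _thread_id in replies:
        askers[replier].add(asker)
    return {user: len(names) for user, names in askers.items()}


# Toy data (hypothetical): bob answers three threads from two different askers.
replies = [("bob", "ann", 1), ("bob", "ann", 2), ("bob", "cat", 3), ("dan", "bob", 4)]
counts = answer_num(replies)
helped = users_helped(replies)
```

Here bob has an AnswerNum of 3 but helped only 2 distinct users, which is the distinction the last bullet draws.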

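Z-Score Measures
- Replying to many questions suggests high expertise.
- Asking many questions suggests a lack of expertise on some topics.
- The z-score combines both the number of questions asked (q) and the number of answers posted (a).
- It measures how different a user is from a random user.
  - A random user posts answers with probability p = 0.5, so out of n = q + a posts we expect n*p = n/2 replies.
  - The standard deviation is sqrt(n*p*(1-p)) = sqrt(n)/2.
- A user who asks and answers about equally often has a z-score close to 0; a user who answers more than they ask has a positive z-score.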
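As a worked example of the z-score above: with n = a + q posts, subtracting the expected n/2 replies and dividing by the standard deviation sqrt(n)/2 gives z = (a - n/2) / (sqrt(n)/2), which simplifies to (a - q) / sqrt(a + q). The sketch below assumes that form; returning 0.0 for a user with no posts is an assumption added here, not something the slides state.

```python
import math


def z_score(answers, questions):
    """z = (a - n/2) / (sqrt(n)/2) with n = a + q, assuming a random user answers with p = 0.5."""
    n = answers + questions
    if n == 0:
        return 0.0  # assumption: a user with no posts is treated as neutral
    return (answers - n / 2) / (math.sqrt(n) / 2)


balanced = z_score(5, 5)        # asks and answers equally often
mostly_answers = z_score(3, 1)  # answers more than asks
mostly_asks = z_score(1, 3)     # asks more than answers
```

The three toy users come out at 0, +1 and -1 respectively, matching the last bullet: balanced behaviour sits near zero and answering more pushes the score positive.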

Expertise Rank Algorithm
- Problem with counting posts: a user who answered 100 newbie questions is ranked as equally expert as a user who answered 100 advanced users' questions.
- Adopt a method similar to PageRank.
- Intuition: if B can answer A's question and C can answer B's question, C's expertise should be boosted.
- C(Ui): total number of users helping Ui.
- d: damping factor, set to 0.85.
- Links could also be weighted with w_iA, the number of times i was helped by A.
- In this study, weighting does not improve the accuracy.
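The transcript defines C(Ui) and d but not the update rule itself. A minimal sketch, assuming the PageRank-style form ER(A) = (1 - d) + d * sum(ER(Ui) / C(Ui)) over the users Ui that A has helped, might look like this; the function name, the fixed iteration count, and the (asker, replier) edge layout are illustrative assumptions.

```python
from collections import defaultdict


def expertise_rank(edges, d=0.85, iterations=50):
    """Iterate an assumed PageRank-style update over (asker, replier) edges.

    helpers[u] is the set of users who helped u, so len(helpers[u]) plays the
    role of C(u); helped[a] is the set of users that a has helped.
    """
    helpers = defaultdict(set)
    helped = defaultdict(set)
    users = set()
    for asker, replier in edges:
        if asker == replier:
            continue  # assumption: self-replies carry no expertise signal
        helpers[asker].add(replier)
        helped[replier].add(asker)
        users.update((asker, replier))

    rank = {u: 1.0 for u in users}
    for _ in range(iterations):
        rank = {
            a: (1 - d) + d * sum(rank[u] / len(helpers[u]) for u in helped[a])
            for a in users
        }
    return rank


# Toy chain (hypothetical): b answers a's question, c answers b's question.
rank = expertise_rank([("a", "b"), ("b", "c")])
```

In the toy chain, c ends up ranked above b, and b above a, which is the boosting effect described in the intuition bullet. The unweighted form is used because the slide reports that weighting did not improve accuracy.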

Evaluations
- 2 raters, both Java programming experts.
- Five levels of expertise rating.

Statistical Metrics
- Frequently used correlation measures:
  - Spearman's rho: does not handle weak orderings (i.e. multiple items in a ranking such that neither item is preferred over the other).
  - Kendall's tau: gives equal weight to any interchange of equal distance, no matter where it occurs, e.g. between ranks 1 and 2 or between ranks 101 and 102.
  - TopK: calculates Kendall's tau only for the highest 20 ranks.
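Kendall's tau and the TopK variant can be sketched in plain Python as below. This is an illustration under stated assumptions: it uses the simple tau-a form (tied pairs count as neither concordant nor discordant), and it takes "highest 20 ranks" to mean the k users ranked highest by the algorithm. Both choices, and the function names, are assumptions rather than details confirmed by the slides.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def top_k_tau(algo_scores, human_ratings, k=20):
    """Kendall's tau restricted to the k users the algorithm ranks highest."""
    order = sorted(range(len(algo_scores)), key=lambda i: algo_scores[i], reverse=True)[:k]
    return kendall_tau([algo_scores[i] for i in order], [human_ratings[i] for i in order])


# Toy scores (hypothetical): algorithm output vs. human expertise ratings.
algo = [0.9, 0.1, 0.5, 0.7]
human = [5, 1, 4, 3]
full_tau = kendall_tau(algo, human)
top3_tau = top_k_tau(algo, human, k=3)
```

Every swapped pair costs the same amount regardless of its position in the ranking, which is the property the Kendall's tau bullet describes and the reason a separate TopK score is useful.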

Performance of Various Algorithms in different statistical metrics.

Simulations
- Why they are needed:
  - To understand the human dynamics that shape an online community.
  - This helps select an appropriate algorithm for communities whose dynamics differ from those of the Java Forum.
- 2 models: the Best Preferred network and the Just Better network.

Best Preferred Network
- Many experts answer others' questions and seldom ask questions.
- Very similar to the Java Forum.
- The probability of replying increases exponentially with the expertise level difference between the 2 users.
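The Best Preferred rule can be sketched as a weighted random choice of replier. The transcript gives only the qualitative rule, so the exact weight used here, exp(beta * (replier level - asker level)), the beta parameter, and the function names are all assumptions made for illustration.

```python
import math
import random


def best_preferred_weights(asker_level, candidate_levels, beta=1.0):
    """Assumed weight: grows exponentially with (replier level - asker level)."""
    return [math.exp(beta * (level - asker_level)) for level in candidate_levels]


def pick_best_preferred_replier(asker_level, candidates, rng, beta=1.0):
    """candidates: list of (user, expertise_level); returns one user name."""
    weights = best_preferred_weights(asker_level, [lvl for _, lvl in candidates], beta)
    return rng.choices([user for user, _ in candidates], weights=weights, k=1)[0]


# Toy community (hypothetical levels) answering a level-2 asker.
candidates = [("newbie", 1), ("regular", 3), ("expert", 5)]
weights = best_preferred_weights(2, [lvl for _, lvl in candidates])
rng = random.Random(0)  # seeded so the run is repeatable
counts = {user: 0 for user, _ in candidates}
for _ in range(1000):
    counts[pick_best_preferred_replier(2, candidates, rng)] += 1
```

Under this assumed weighting the expert answers most of the questions, which reproduces the pattern in the first bullet: experts reply a lot and rarely need to ask.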

Just Better Network
- E.g. within an organisation, experts may be under time constraints and choose to answer only the questions that make best use of their expertise.
- Users with a slightly better level of expertise answer.
- U's probability of answering a's question
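The Just Better rule can be sketched in the same way. The formula for u's probability of answering a's question is not in the transcript, so this sketch assumes one plausible form: a weight of exp(-gamma * (replier level - asker level)) when the replier is strictly more expert than the asker, and 0 otherwise. The gamma parameter and the function names are hypothetical.

```python
import math
import random


def just_better_weights(asker_level, candidate_levels, gamma=1.0):
    """Assumed weight: highest for repliers only slightly above the asker, 0 if not above."""
    weights = []
    for level in candidate_levels:
        gap = level - asker_level
        weights.append(math.exp(-gamma * gap) if gap > 0 else 0.0)
    return weights


def pick_just_better_replier(asker_level, candidates, rng, gamma=1.0):
    """candidates: list of (user, expertise_level); returns a user name, or None if nobody qualifies."""
    weights = just_better_weights(asker_level, [lvl for _, lvl in candidates], gamma)
    if sum(weights) == 0:
        return None  # no candidate is more expert than the asker
    return rng.choices([user for user, _ in candidates], weights=weights, k=1)[0]


# Toy community (hypothetical levels) answering a level-2 asker.
candidates = [("newbie", 1), ("regular", 3), ("expert", 5)]
weights = just_better_weights(2, [lvl for _, lvl in candidates])
rng = random.Random(0)  # seeded so the run is repeatable
counts = {user: 0 for user, _ in candidates}
for _ in range(1000):
    counts[pick_just_better_replier(2, candidates, rng)] += 1
```

With these assumed weights the regular user, not the expert, answers most level-2 questions, so expertise has to be inferred from who answers whom rather than from raw answer counts.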

Contd…
- Users make best use of their time.
- They are more selective in answering.
- ExpertiseRank propagates expertise scores from newbies to the intermediate users who answer their questions, and from them to experts.
- In general, ExpertiseRank outperforms the other algorithms.

Networks generated from both models.

Summary & Future Work
- Structural information can be used to evaluate an expertise network in an online setting.
- Relative expertise can be found using social network-based algorithms.
- These algorithms did nearly as well as human raters.
- Future work: combine content information (to differentiate specific knowledge) with structural information.

THANK YOU !!!

Human raters Vs Algorithms