Expertise networks in online communities: structure and algorithms. Jun Zhang, Mark Ackerman, Lada Adamic, School of Information, University of Michigan.

Presentation transcript:

Expertise networks in online communities: structure and algorithms
Jun Zhang, Mark Ackerman, Lada Adamic
School of Information, University of Michigan
International Symposium on Self-Organizing Online Communities, March 31st, 2007

Motivation
- lots of people are turning to question-answer forums for help
- automatically infer the expertise of participants
- expertise could be used to rank answers, or to recommend posts one could reply to
Methods
- empirical evaluation of ranking algorithms
- social network analysis
- simulation
  - understand underlying dynamics
  - predict performance of ranking algorithms in communities with yet-unobserved dynamics

Related work
- Netscan (Marc Smith & co.)
- Robert Kraut: commitment & online community
- Virtual communities (Barry Wellman)
- using link-based ranking algorithms to evaluate expertise in networks (Dom et al.)
Image credit: Danyel Fisher

Can we automatically infer expertise?
- We use PageRank, HITS, ask/reply ratios, etc. to try to automatically infer the expertise of the users
- Human raters read the posts made by users
- In the online JavaForum, the ask/reply ratio outperforms PageRank…
- Develop simulations:
  - distribution of expertise (skewed)
  - who asks questions most often? (novices)
  - who answers questions?
    1. best expert most likely
    2. someone a bit more expert

Constructing a community expertise network
Thread 1: "Large Data, binary search or hashtable?" posted by user A; replies ("Re: Large...") from user B and user C
Thread 2: "Binary file with ASCII data" posted by user A; reply ("Re: File with...") from user C
[Figure: four versions of the resulting network on nodes A, B, C]
- unweighted: A–B 1, A–C 1
- weighted by # threads: A–B 1, A–C 2
- weighted by shared credit: A–B 1/2, A–C 1 + 1/2
- weighted with backflow
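The first three weighting schemes on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `build_network`, the `(asker, repliers)` thread format, and the choice to point edges from asker to replier are assumptions made for the example. The backflow variant is left out because the slide does not specify it.

```python
from collections import defaultdict

# Toy data from the slide: each thread is (asker, [repliers]).
threads = [
    ("A", ["B", "C"]),  # Thread 1
    ("A", ["C"]),       # Thread 2
]

def build_network(threads, weighting="unweighted"):
    """Return {(asker, replier): weight} under one of three schemes.

    Assumption: edges point from the asker to each user who replied.
    """
    edges = defaultdict(float)
    for asker, repliers in threads:
        helpers = sorted(set(repliers) - {asker})  # ignore self-replies
        for helper in helpers:
            if weighting == "unweighted":
                edges[(asker, helper)] = 1.0
            elif weighting == "threads":
                # one unit per thread in which helper replied to asker
                edges[(asker, helper)] += 1.0
            elif weighting == "shared":
                # each thread's single unit of credit is split among its helpers
                edges[(asker, helper)] += 1.0 / len(helpers)
            else:
                raise ValueError(f"unknown weighting: {weighting}")
    return dict(edges)

for scheme in ("unweighted", "threads", "shared"):
    print(scheme, build_network(threads, scheme))
```

With the two threads from the slide, the "threads" scheme gives A to C a weight of 2, and the "shared" scheme gives A to B 1/2 and A to C 1 + 1/2, matching the figure labels.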

JavaForum
- 87 sub-forums
- 1,438,053 messages
- community expertise network constructed: 196,191 users, 796,270 edges
Observations
- More than 55% of users usually only ask questions, while about 25% of users answer questions.
- Many questions are answered by a few advanced users, while the majority of users answer only a few.
- Top repliers answer questions for everyone. However, less expert users tend to answer questions of others with a lower expertise level.

Uneven participation
[Figure: distribution of the number of people one replied to]
- 'answer people' may reply to thousands of others
- 'question people' are also uneven in the number of repliers to their posts, but to a lesser extent

Not Everyone Asks/Replies
- Core: a strongly connected component, in which everyone asks and answers
- IN: mostly askers
- OUT: mostly helpers
[Figure: "The Web is a bow tie" vs. "The Java Forum network is an uneven bow tie"]
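A bow-tie split like the one on this slide can be computed from reachability alone. The sketch below is illustrative only: `bow_tie` and the toy edge list are hypothetical, edges are assumed to point from asker to helper, and the core is found by brute force, which is fine for small graphs but not for a network of JavaForum's size.

```python
from collections import deque

def reachable(start, adj):
    """All nodes reachable from start (inclusive) by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def bow_tie(edges):
    """Split nodes into (core, in_part, out_part, other).

    core     = largest strongly connected component
    in_part  = nodes that can reach the core but are not in it (mostly askers)
    out_part = nodes reachable from the core but not in it (mostly helpers)
    """
    fwd, rev, nodes = {}, {}, set()
    for u, v in edges:
        fwd.setdefault(u, set()).add(v)
        rev.setdefault(v, set()).add(u)
        nodes.update((u, v))
    if not nodes:
        return set(), set(), set(), set()
    core = set()
    for n in nodes:
        if n in core:
            continue
        # The SCC of n is everything that n reaches and that reaches n.
        scc = reachable(n, fwd) & reachable(n, rev)
        if len(scc) > len(core):
            core = scc
    seed = next(iter(core))
    out_part = reachable(seed, fwd) - core
    in_part = reachable(seed, rev) - core
    other = nodes - core - in_part - out_part
    return core, in_part, out_part, other

# Hypothetical toy network: n1 only asks, c1 and c2 ask and answer each other,
# e1 only answers, and x -> y is disconnected from the rest.
toy_edges = [("n1", "c1"), ("c1", "c2"), ("c2", "c1"), ("c2", "e1"), ("x", "y")]
print(bow_tie(toy_edges))
```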

Relating network structure to Java expertise
Human-rated expertise levels
- 2 raters
- 135 JavaForum users with >= 10 posts
- inter-rater agreement (τ = 0.74, ρ = 0.83)
- for evaluation of algorithms, omit users where raters disagreed by more than 1 level (τ = 0.80, ρ = 0.83)
Level | Category | Description
5 | Top Java expert | Knows the core Java theory and related advanced topics deeply.
4 | Java professional | Can answer all or most Java concept questions. Also knows one or more subtopics very well.
3 | Java user | Knows advanced Java concepts. Can program relatively well.
2 | Java learner | Knows basic concepts and can program, but is not good at advanced topics of Java.
1 | Newbie | Just starting to learn Java.

Structural Info Based Expertise Ranking Metrics
- # replies posted (# answers): experts can answer many questions
- # people replied to (indegree): experts can answer questions from many different people
- z-score for the 2 above, (observed − μ)/σ: experts are above the mean in the above two metrics
- PageRank (replying to people who reply to people): higher-level experts can answer mid-level experts
- HITS: experts answer questions by people whose questions other experts have answered; hubs point to good authorities
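To make the metrics concrete, here is a small Python sketch of the answer count, the indegree, a z-score, and a plain power-iteration PageRank. It is a sketch under stated assumptions rather than the authors' implementation: the reply log is invented, the z-score standardizes each metric across users using the population mean and standard deviation (one reading of the slide's formula), and the damping factor of 0.85 is the conventional default, not a value taken from the talk.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical reply log: one (asker, replier) pair per answer posted.
replies = [("A", "B"), ("A", "C"), ("A", "C"), ("B", "C"), ("D", "C"), ("D", "B")]
users = sorted({u for pair in replies for u in pair})

def answer_counts(replies):
    """# answers: total number of replies posted by each user."""
    return Counter(replier for _, replier in replies)

def indegree(replies):
    """# people replied to: distinct askers each user has answered."""
    helped = {}
    for asker, replier in replies:
        helped.setdefault(replier, set()).add(asker)
    return {user: len(askers) for user, askers in helped.items()}

def z_scores(metric, users):
    """(observed - mean) / standard deviation, taken across all users."""
    values = [metric.get(u, 0) for u in users]
    mu, sigma = mean(values), pstdev(values)
    return {u: ((metric.get(u, 0) - mu) / sigma if sigma else 0.0) for u in users}

def pagerank(edges, damping=0.85, iterations=50):
    """Power iteration; edges point from asker to replier."""
    nodes = sorted({n for e in edges for n in e})
    out = {n: [] for n in nodes}
    for u, v in edges:
        out[u].append(v)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by users with no outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        new = {n: (1 - damping + damping * dangling) / len(nodes) for n in nodes}
        for u in nodes:
            for v in out[u]:
                new[v] += damping * rank[u] / len(out[u])
        rank = new
    return rank

print(answer_counts(replies))
print(indegree(replies))
print(z_scores(answer_counts(replies), users))
print(pagerank(replies))
```

In this toy log, user C answers the most questions from the most people, so C comes out on top under every metric. The interesting cases in the talk are those where the local counts and the network-wide measures disagree.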

Automated vs. human ratings
[Figure: human rating vs. automated ranking, one panel each for # answers, z-score # answers, indegree, z-score indegree, PageRank, and HITS authority]

JavaForum: empirical evaluation of ranking algorithms
- simple local measures do as well as (and sometimes better than) measures incorporating the wider network topology
[Figure: Top K, Kendall's τ, and Spearman's ρ for # answers, z-score # answers, indegree, z-score indegree, PageRank, and HITS authority]
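The two rank correlations named on this slide can be computed with the standard library alone. The sketch below uses the tie-aware tau-b form of Kendall's τ, and Spearman's ρ as a Pearson correlation on average ranks. The ratings and scores are invented for illustration, and which τ variant the authors used is not stated on the slide.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b: concordant minus discordant pairs, corrected for ties."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1
        if dy == 0:
            ties_y += 1
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denom = sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    return (concordant - discordant) / denom if denom else 0.0

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y) if var_x and var_y else 0.0

# Hypothetical data: human expertise levels (1-5) and an automated score.
human = [5, 4, 4, 3, 2, 1]
automated = [120, 80, 95, 30, 40, 5]
print(kendall_tau_b(human, automated), spearman_rho(human, automated))
```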

Modeling community structure to explain algorithm performance

Simulating probability of expertise pairing
Suppose:
- expertise is uniformly distributed
- probability of posing a question is inversely proportional to expertise
- p_ij = probability that a user with expertise j replies to a user with expertise i
2 models: 'best' preferred; 'just better' preferred (j > i)
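A toy version of this simulation might look like the following. The functional forms are assumptions, since the transcript does not preserve the slide's formulas for p_ij: here the 'best' model weights a helper in proportion to the expertise gap j − i, the 'just better' model weights a helper by 1/(j − i), and both models allow only helpers with j > i. The names `helper_weight` and `simulate`, the five expertise levels, and the population sizes are all hypothetical.

```python
import random

LEVELS = [1, 2, 3, 4, 5]  # hypothetical discrete expertise levels

def helper_weight(asker_level, helper_level, model):
    """Unnormalized p_ij for a helper at level j answering an asker at level i."""
    gap = helper_level - asker_level
    if gap <= 0:
        return 0.0              # assumption: only more expert users reply (j > i)
    if model == "best":
        return float(gap)       # assumed form: larger gap is more likely
    if model == "just_better":
        return 1.0 / gap        # assumed form: smaller gap is more likely
    raise ValueError(f"unknown model: {model}")

def simulate(num_users=200, num_questions=2000, model="best", seed=0):
    """Return (expertise list, list of (asker, helper) edges)."""
    rng = random.Random(seed)
    # Expertise is uniformly distributed over the levels.
    expertise = [rng.choice(LEVELS) for _ in range(num_users)]
    # Probability of posing a question is inversely proportional to expertise.
    ask_weights = [1.0 / e for e in expertise]
    edges = []
    for _ in range(num_questions):
        asker = rng.choices(range(num_users), weights=ask_weights)[0]
        weights = [helper_weight(expertise[asker], e, model) for e in expertise]
        if sum(weights) == 0:
            continue  # nobody is more expert than this asker; question goes unanswered
        helper = rng.choices(range(num_users), weights=weights)[0]
        edges.append((asker, helper))
    return expertise, edges

def mean_gap(expertise, edges):
    """Average expertise difference between helper and asker."""
    return sum(expertise[h] - expertise[a] for a, h in edges) / len(edges)

for model in ("best", "just_better"):
    expertise, edges = simulate(model=model)
    print(model, len(edges), round(mean_gap(expertise, edges), 2))
```

Under these assumed forms, the average expertise gap between helper and asker is noticeably larger in the 'best' model than in the 'just better' model. That is the kind of structural difference the ranking algorithms are later tested against.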

Visualization
[Figure: simulated networks for 'best' preferred and 'just better' preferred]

Degree correlation profiles
[Figure: degree-degree correlations between asker and helper, plotted against asker indegree; panels: 'best' preferred (simulation), 'just better' preferred (simulation)]

It can tell us when to use which algorithms
[Figure panels: preferred helper 'just better'; preferred helper 'best available']

Different ranking algorithms perform differently
In the 'just better' model, a node is correctly ranked by PageRank but not by HITS

simplest models do not capture all ‘local’ interactions

Summary
- Expertise networks have interesting characteristics
- A set of useful metrics
- Simulation as an analysis tool
There are rich design opportunities
- Find experts with the help of structural information (and content analysis)
- Predict good answers
- Re-order questions/answers to match expertise
  - questions posed by experts wait an average of 9 hours for the first reply
  - novice questions are answered in 40 minutes
  - working paper: "Expertise-Level based Interface Personalization for Online Help-seeking Communities"

Future Work
- Looking at diverse sets of question-answer forums (Yahoo Answers)
- Expertise across different topics
- Using explicit ratings for evaluation of automated expertise identification & incorporation into algorithms (battling spam)
- Users' expertise change over time
- Developing applications, e.g. recommender engines for questions
[Figure labels: cars & transportation, maintenance & repairs, beauty & style, hair]

For more info
ExpertiseRank algorithms and evaluations:
  Zhang, J., Ackerman, M.S., Adamic, L., Expertise Networks in Online Communities: Structure and Algorithms, WWW'07
Simulations of expertise networks:
  Zhang, J., Ackerman, M.S., Adamic, L., CommunityNetSimulator: Using Simulations to Study Online Community Network Formation and Implications, C&T 2007
Jun Zhang: personal.si.umich.edu/~junzh
Mark Ackerman
Lada Adamic: personal.umich.edu/~ladamic