1
Analyzing and Predicting Question Quality in Community Question Answering Services
Baichuan Li, Tan Jin, Michael R. Lyu, Irwin King, and Barley Mak
CQA2012, Lyon, France
2
Introduction
22/2/2016
3
Question Quality
Number of tag-of-interests
Number of answers
4
Motivation
Question quality affects answer quality
– Low quality questions hinder the CQA services
– High quality questions promote the development of the community
Identifying question quality facilitates question search and recommendation
5
Outline
Problem Definition
Data
Two Studies
– Factors Affecting Question Quality
– Prediction of Question Quality
Discussion and Conclusion
6
Problem Definition
Figure 1. Construct of question quality in CQA
7
Data Description
Table 1. Summary of data in Entertainment & Music category and its subcategories
8
Ground Truth
NTA: number of tag-of-interests + number of answers
RM: reciprocal of the minutes for getting the best answer

Table 2. Rule base for the ground truth setting (rows: RM, columns: NTA)
RM \ NTA   4   3   2   1
4          4   4   3   2
3          4   3   3   2
2          3   3   2   1
1          2   2   1   1

Table 3. Summary of questions in four levels
Level   1        2        3        4
Count   53,806   62,192   69,836   52,715
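The rule base in Table 2 can be read as a lookup from an (RM, NTA) pair to a quality level. The sketch below is only an illustration: it assumes RM and NTA have already been discretized to the levels 1 to 4 (the slide does not say how), and the names `RULE_BASE`, `quality_level`, and `LEVEL_COUNTS` are hypothetical. The table is symmetric, so the result does not depend on which axis is taken as RM.

```python
# Rule base from Table 2, keyed as RULE_BASE[rm_level][nta_level].
# Assumption: rows are RM levels and columns are NTA levels, both in 1..4.
RULE_BASE = {
    4: {4: 4, 3: 4, 2: 3, 1: 2},
    3: {4: 4, 3: 3, 2: 3, 1: 2},
    2: {4: 3, 3: 3, 2: 2, 1: 1},
    1: {4: 2, 3: 2, 2: 1, 1: 1},
}

# Question counts per quality level, copied from Table 3.
LEVEL_COUNTS = {1: 53806, 2: 62192, 3: 69836, 4: 52715}


def quality_level(rm_level: int, nta_level: int) -> int:
    """Look up the ground-truth quality level for discretized RM and NTA levels."""
    return RULE_BASE[rm_level][nta_level]


if __name__ == "__main__":
    print(quality_level(3, 1))
    print(sum(LEVEL_COUNTS.values()))
```

The four counts in Table 3 add up to the 238,549 resolved questions reported in the data description, which is a quick consistency check on the reconstructed table.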
9
Study One: Factors Affecting Question Quality
Possible Factors
– Askers
– Topics
Process
– Select the two most popular subcategories (say, Music and Movies) and check their distributions of question quality
– Track askers with at least five questions in both of these subcategories
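The asker-tracking step can be sketched as a simple filter. This is a minimal illustration, not the authors' procedure: it assumes each question is available as an `(asker_id, subcategory)` pair, and it reads "at least five questions in both" as at least five in each subcategory. The function name `askers_in_both` and the record layout are hypothetical.

```python
from collections import Counter


def askers_in_both(questions, cat_a="Music", cat_b="Movies", min_questions=5):
    """Return askers with at least min_questions questions in each subcategory.

    questions: iterable of (asker_id, subcategory) pairs (assumed layout).
    """
    # Count how many questions each asker posted in each subcategory.
    counts = Counter(questions)
    askers = {asker for asker, _ in counts}
    # Missing (asker, subcategory) keys count as zero in a Counter.
    return sorted(
        a for a in askers
        if counts[(a, cat_a)] >= min_questions and counts[(a, cat_b)] >= min_questions
    )


if __name__ == "__main__":
    sample = [("u1", "Music")] * 5 + [("u1", "Movies")] * 6 + [("u2", "Music")] * 9
    print(askers_in_both(sample))
```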
10
Observations
Table 4. Summary of question quality for different askers
11
Observations
Question Quality
12
Study Two: Prediction of Question Quality
Modeling the relationships among questions, topics and askers as a bipartite graph
Asking Expertise
Question Quality
13
Mutual Reinforcement Label Propagation for Predicting Question Quality
14
MRLP
Figure labels: similar users’ asking expertise, question quality, asking expertise, similar questions’ quality
15
Data for Study Two
16
Methods Comparison
Logistic Regression
– LG_Q and LG_QA
Stochastic Gradient Boosted Tree (Friedman, J. H., 1999)
– SGBT_Q and SGBT_QA
Harmonic Function (Zhou et al., 2007)
– HF_Q and HF_QA
17
Experimental Results: Accuracy
18
Sensitivity & Specificity
Sensitivity measures the algorithm’s ability to identify high quality questions
Sensitivity = TP/(TP+FN)
Specificity measures the algorithm’s ability to identify low quality questions
Specificity = TN/(TN+FP)
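As a small worked example of the two formulas, the sketch below computes both measures from paired true and predicted labels. It assumes a binary setting in which high quality is the positive class (label 1) and low quality is the negative class (label 0); the function name `sensitivity_specificity` is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP).
    A measure is NaN when its denominator is zero.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # high quality question correctly identified
            else:
                fn += 1  # high quality question missed
        else:
            if pred == positive:
                fp += 1  # low quality question flagged as high quality
            else:
                tn += 1  # low quality question correctly identified
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    print(sensitivity_specificity([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```

In this example two of the three high quality questions are found (sensitivity 2/3) and one of the two low quality questions is found (specificity 1/2).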
19
Experimental Results: Music
20
Experimental Results: Movies
21
Discussion
MRLP is more effective than state-of-the-art methods at distinguishing high quality questions from low quality ones
At present, neither MRLP nor the other methods achieve satisfactory performance, due to the influence of features
22
Discussion
Salient features?
– User study via crowdsourcing systems
23
Conclusion
Define Question Quality in CQA
Conduct two studies to investigate question quality in CQA services
– Analyze the factors influencing question quality
– Propose a mutual reinforcement-based label propagation algorithm to predict question quality
Future Work
– Explore more salient features
– Utilize question quality to improve question search and question recommendation
24
Thank You! Q&A
25
Data Description
238,549 resolved questions under the Entertainment & Music category of Yahoo! Answers
Question Features
– Text, post time, etc.
Asker Features
– Total points, No. of questions asked, No. of questions resolved, etc.
26
MRLP
For the question part of the bipartite graph, we create edges between any two questions within the same topic: n × n probabilistic transition matrix
For the asker part of the bipartite graph, we generate the probabilistic transition matrix M similarly.
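A minimal sketch of how such a question-side transition matrix could be built is given below. The slide's edge-weight formula did not survive extraction, so this sketch assumes a uniform weight of 1 between any two distinct questions that share a topic and then normalizes each row to sum to 1; the actual weights in MRLP may differ. The function name `transition_matrix` and the input layout are hypothetical.

```python
def transition_matrix(topics):
    """Build an n x n row-normalized transition matrix over questions.

    topics[i] is the topic of question i (assumed input layout).
    Assumption: uniform edge weight 1 between distinct questions in the
    same topic; the slide's own weighting is not preserved.
    """
    n = len(topics)
    # Edge weights: 1.0 for two different questions in the same topic.
    weights = [
        [1.0 if i != j and topics[i] == topics[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    matrix = []
    for row in weights:
        total = sum(row)
        # A question with no neighbours keeps an all-zero row.
        matrix.append([w / total for w in row] if total else row[:])
    return matrix


if __name__ == "__main__":
    for row in transition_matrix(["Music", "Music", "Movies", "Music"]):
        print(row)
```

Under the same assumption, the asker-side matrix M could be built by passing a grouping key for askers instead of question topics.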