Question-Answering on Yahoo!Answers: Preliminary Results Rong Tang Sheila Denn OCLC/ALISE LIS Research Grant Presentation ALISE 2009 January 23, 2009.

1 Question-Answering on Yahoo!Answers: Preliminary Results Rong Tang Sheila Denn OCLC/ALISE LIS Research Grant Presentation ALISE 2009 January 23, 2009

2 Background Yahoo!Answers: Social Q&A; 25+ pre-defined categories; users post questions, answer questions, rate answers, and provide comments; one best answer is chosen by the asker or through a vote.



5 Rating/Voting/Commenting

6 Our Research Project Funded by the OCLC/ALISE Grant Program and the Simmons College President’s Fund for Research. Project staff: Rong Tang (PI), Sheila Denn (Co-PI), Sam Kalat (technology consultant, programmer), Laura Saunders (Research Assistant). The project wiki page documents the relevant literature and project progression, with extensive meeting notes on coding decisions.

7 Research Questions Are existing question taxonomies (such as those in Graesser et al. (1994) and Freed (1994)) valid in a social Q&A environment? What are the relationships between the linguistic characteristics, functional properties, and subject content of the questions and the kinds of responses that they receive? What are the characteristics of answers that are chosen as “best” answers? What is the role of the social function vs. the information function in social Q&A? What are the implications of the above for provision of library and information services?

8 Previous Research Question classification: Wh- questions (Robinson & Rackstraw, 1972); conceptual question categories (Lehnert, 1978); content-based question categories (Graesser et al., 1994); reference question classification (Pomerantz, 2005); questions in dynamic semantics (Aloni, Butler, & Dekker, 2007). Answer classification: much less research here than on question classification; answer selection rules (Lehnert, 1978); criteria based on Yahoo!Answers comments (Kim et al., 2007).

9 Previous Research (cont.) Formal studies of online Q&A: answerers, “specialists” vs. “synthesists” (Gazan, 2006); questioners, “seekers” vs. “sloths” (Gazan, 2007). Question purpose (Graesser et al., 1994): filling knowledge gaps; establishing and monitoring common ground; coordinating social action; directing the conversation and controlling attention.

10 Research Plan Data collection and sampling: gathered a stratified random sample of 3,000 question-answer sets, including any comments; stratified by the 25 top-level categories assigned by Yahoo!Answers. Data coding: content analysis at multiple levels (syntactic, semantic, pragmatic).
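The sampling step described on this slide can be sketched as follows. This is a minimal illustration and not the project's actual procedure: the slide does not say how the 3,000 sets were allocated across the 25 categories, so equal allocation (120 per category) is an assumption here, and the record layout (`id`, `category`) and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, per_stratum, seed=0):
    """Draw up to `per_stratum` records at random from each stratum.

    `key` names the field that defines the stratum (here, the top-level
    category). Equal allocation per stratum is an assumption.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_stratum = defaultdict(list)
    for rec in records:
        by_stratum[rec[key]].append(rec)

    sample = []
    for stratum in sorted(by_stratum):
        pool = by_stratum[stratum]
        k = min(per_stratum, len(pool))  # small strata contribute all they have
        sample.extend(rng.sample(pool, k))  # sampling without replacement
    return sample

# Hypothetical toy population: 25 categories with 500 question-answer sets each.
population = [
    {"id": f"{c}-{i}", "category": f"cat{c:02d}"}
    for c in range(25)
    for i in range(500)
]

# 3,000 sets over 25 categories gives 120 per category under equal allocation.
sample = stratified_sample(population, "category", per_stratum=120)
```

A proportional allocation (sampling each category in proportion to its size) would be an equally plausible reading of the slide; only the per-stratum count would change.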

11 Research Plan (cont.) Data analysis. Descriptive statistics will be produced for: frequency of answers provided per question; average length of time to first answer; distribution of subject categories; distribution of question and answer types; distribution of chosen answer types. Correlation analysis will be performed for: linguistic characteristics of questions and answers; functional categories of questions and answers; subject categories of questions and answers.
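A few of the descriptive statistics listed on this slide could be computed along these lines. This is only a sketch under stated assumptions: the three records, their field names (`category`, `q_type`, `asked`, `answer_times`), and their values are invented for illustration and are not project data. The cross-tabulation at the end is one simple starting point for looking at associations between categorical codes; it is not claimed to be the analysis the authors planned.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical records; field names and values are illustrative only.
questions = [
    {"category": "Pets", "q_type": "what",
     "asked": datetime(2008, 6, 1, 12, 0),
     "answer_times": [datetime(2008, 6, 1, 12, 5), datetime(2008, 6, 1, 13, 0)]},
    {"category": "Pets", "q_type": "how",
     "asked": datetime(2008, 6, 2, 9, 0),
     "answer_times": [datetime(2008, 6, 2, 9, 30)]},
    {"category": "Travel", "q_type": "what",
     "asked": datetime(2008, 6, 3, 8, 0),
     "answer_times": []},  # a question that received no answers
]

# Frequency of answers provided per question.
answers_per_question = [len(q["answer_times"]) for q in questions]
mean_answers = mean(answers_per_question)

# Average length of time to first answer, in minutes.
# Unanswered questions are skipped because they have no first answer.
minutes_to_first = [
    (min(q["answer_times"]) - q["asked"]).total_seconds() / 60
    for q in questions
    if q["answer_times"]
]
mean_minutes_to_first = mean(minutes_to_first)

# Distribution of subject categories.
category_counts = Counter(q["category"] for q in questions)

# Cross-tabulation of subject category by question type.
crosstab = Counter((q["category"], q["q_type"]) for q in questions)
```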

12 Progress to Date Sample has been collected. Preliminary coding has begun. Syntactic coding of questions is complete: Wh- questions, inversion questions, other questions, multiparts, double coding. Syntactic coding of question descriptions is complete: number of questions included in description text, type of questions.

13 Data Coding Two coders code each question individually, then go over the coding together to reach consensus on the final coding of each question. Use of informal language presents a challenge for coding: Is it a question if it doesn’t include a question mark? Is it a question simply because it has a question mark at the end? Should “WTF” be coded as a “what” question or as an other question? Or not at all? Coding multiparts of a question, e.g., “Why do husbands feel they have to lie to other women about being married, and when the other woman finds out?” Double coding questions such as "Is there anywhere you can listen to citizen band radio online?"
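The coding in this project was done by human coders. For comparison, a naive first-word heuristic for the three syntactic categories (Wh-, inversion, other) might look like the sketch below. The word lists and the function name `code_question` are assumptions made for illustration, and the heuristic deliberately does not attempt multipart or double coding, which is exactly where the slide reports that human judgment was needed.

```python
import re

# Illustrative word lists; a real coding scheme would define these explicitly.
WH_WORDS = {"what", "why", "how", "who", "whom", "whose",
            "which", "when", "where"}
AUXILIARIES = {"is", "are", "was", "were", "am", "do", "does", "did",
               "can", "could", "will", "would", "shall", "should",
               "may", "might", "must", "has", "have", "had"}

def code_question(text):
    """Assign one syntactic code based only on the first word of the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return "other"
    first = tokens[0]
    if first in WH_WORDS:
        return "wh-" + first
    if first in AUXILIARIES:
        # Subject-auxiliary inversion, e.g. "Is there ...?", "Can you ...?"
        return "inversion"
    return "other"

examples = [
    "Why do husbands feel they have to lie to other women about being "
    "married, and when the other woman finds out?",
    "Is there anywhere you can listen to citizen band radio online?",
    "WTF",
]
codes = [code_question(e) for e in examples]
```

On the slide's own examples this yields `wh-why`, `inversion`, and `other`. It misses the second part of the multipart question and gives the citizen band radio question a single code, so it cannot substitute for the two-coder consensus process.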

14 Preliminary Results

15 Number of Answers Per Question

16 Length to Receive 1st Answer

17 Wh-question frequency “What” Questions

18 Wh-question frequency “Why” Questions

19 Wh-question frequency “How” Questions

20 Wh-question frequency “Inversion” Questions

21 Next Steps Start semantic and pragmatic analysis of questions; start answer analysis; start comment coding; explore the associations and features of questions (Q), answers (A), and comments (C); develop a conceptual and analytical model for social Q&A.

22 Questions?
