Learning to Suggest Questions in Online Forums. Tom Chao Zhou, Chin-Yew Lin, Irwin King, Michael R. Lyu, Young-In Song, Yunbo Cao

Presentation transcript:

Slide 1: Learning to Suggest Questions in Online Forums
Tom Chao Zhou, Chin-Yew Lin, Irwin King, Michael R. Lyu, Young-In Song, Yunbo Cao
Chinese University of Hong Kong; Microsoft Research Asia; AT&T Labs Research
August 11, San Francisco, USA

Slide 2: Outline
– Background
– Motivation
– Related Work
– Our Approach
– Experiments
– Conclusions and Future Work

Slide 3: Background
Online forum
– Web application
– Interactive, domain-specific
– E.g. travel, sports, programming

Slide 4: Background
Threads
– Each thread contains a discussion topic

Slide 5: Background
Questions are the focus
– [Shrestha and McKeown 2004]
Mining knowledge, question-answer pairs
– [Cong et al. 2008] [Bian et al. 2008]
Question search
– How is Orange Beach in Alabama?
– Any idea about Orange Beach in Alabama?
Limitation
– Users may be unaware that a query captures only one aspect of a topic

Slide 7: Motivation
Suggest semantically related questions
– How is Orange Beach in Alabama?
– Is the water pretty clear this time of year on Orange Beach?
– Do they have chair and umbrella rentals on Orange Beach?
– Topic: “Travel in Orange Beach”
– beach, water, chair, umbrella, rental…

Slide 8: Motivation
Benefits
– Explore information needs from different aspects
    “Travel”: beach, water, chair, umbrella
– Increase page views
    Enticing users’ clicks on suggested questions
– Relevance feedback mechanism
    Mining users’ click-through logs on suggested questions

Slide 10: Related Work
Question search
– Translation model [Jeon, Croft and Lee 2005] [Duan et al. 2008]
– Translation-based language model [Xue, Jeon and Croft 2008]
Question recommendation
– MDL-based tree cut model [Cao et al. 2008]
Differences
– Fuses both lexical and latent semantic information
– Utilizes the interactive nature of online forums

Slide 12: Our Approach
Document representation
– Bag-of-words
    Words are treated as independent
    Fine-grained representation
    Captures lexically similar questions
– Topic model
    Assigns a set of latent topic distributions to each word
    Captures important relationships between words
    Coarse-grained representation
    Captures semantically related questions

Slide 13: Our Approach
TopicTRLM
– Topic-enhanced Translation-based Language Model

Slide 14: Our Approach
TopicTRLM
– q: a query, D: a candidate question
– w: a word in the query
– A parameter balances the weights of the BoW and topic-model scores
– Jelinek-Mercer smoothing combines the two scores
– TRLM score: BoW
– LDA score: topic model
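
The formula on this slide did not survive in the transcript. A hedged reconstruction from the symbols the slide defines, assuming the query likelihood factorizes over query words and writing the balancing parameter as \gamma (the original symbol is not recoverable here):

```latex
P(q \mid D) = \prod_{w \in q} P(w \mid D), \qquad
P(w \mid D) = \gamma \, P_{\mathrm{trlm}}(w \mid D) + (1 - \gamma) \, P_{\mathrm{lda}}(w \mid D)
```

With \gamma = 1 the score reduces to the bag-of-words TRLM score alone; with \gamma = 0 it reduces to the LDA score alone.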

Slide 15: Our Approach
TRLM
– C: question corpus; Dirichlet smoothing with a smoothing parameter
– T(w|t): word-to-word translation probabilities
Use of LDA
– K: number of topics, z: a topic
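
The two component scores can be sketched in code. This is a minimal illustration, not the authors' implementation: the parameter names `lam` (Dirichlet smoothing), `beta` (mix of direct and translated matches) and `gamma` (BoW versus topic weight), their default values, and all toy tables below are assumptions introduced for the example.

```python
import math
from collections import Counter

def p_ml(word, tokens):
    """Maximum-likelihood probability of a word in a token list."""
    return Counter(tokens)[word] / len(tokens) if tokens else 0.0

def p_trlm(word, doc, corpus, trans, lam=10.0, beta=0.5):
    """Translation-based LM score with Dirichlet smoothing against corpus C."""
    # Translated evidence: sum over words t in D of T(w|t) * P_ml(t|D).
    translated = sum(trans.get((word, t), 0.0) * p_ml(t, doc) for t in set(doc))
    mixed = beta * p_ml(word, doc) + (1.0 - beta) * translated
    n = len(doc)
    return n / (n + lam) * mixed + lam / (n + lam) * p_ml(word, corpus)

def p_lda(word, word_given_topic, topic_given_doc):
    """LDA score: sum over the K topics z of P(w|z) * P(z|D)."""
    return sum(pw.get(word, 0.0) * pz
               for pw, pz in zip(word_given_topic, topic_given_doc))

def topic_trlm_log_score(query, doc, corpus, trans,
                         word_given_topic, topic_given_doc, gamma=0.7):
    """Log-likelihood of the query under the interpolated model."""
    score = 0.0
    for w in query:
        p = (gamma * p_trlm(w, doc, corpus, trans)
             + (1.0 - gamma) * p_lda(w, word_given_topic, topic_given_doc))
        score += math.log(p) if p > 0 else float("-inf")
    return score

# Toy data (hypothetical): two candidate questions and a two-topic model.
doc_a = ["chair", "rental", "beach"]
doc_b = ["fishing", "rod", "bait"]
corpus = doc_a + doc_b + ["beach", "water", "umbrella"]
trans = {("umbrella", "chair"): 0.3, ("umbrella", "rental"): 0.2}
word_given_topic = [
    {"beach": 0.3, "umbrella": 0.2, "chair": 0.2, "rental": 0.2, "water": 0.1},
    {"fishing": 0.4, "rod": 0.3, "bait": 0.3},
]
query = ["beach", "umbrella"]
score_a = topic_trlm_log_score(query, doc_a, corpus, trans, word_given_topic, [0.9, 0.1])
score_b = topic_trlm_log_score(query, doc_b, corpus, trans, word_given_topic, [0.1, 0.9])
```

In this toy setup the beach-related candidate scores higher than the fishing-related one, even though it never contains the word "umbrella": the translation table and the topic model both supply evidence for it.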

Slide 16: Our Approach
Estimate T(w|t)
– IBM model 1, monolingual parallel corpus
– Questions are the focus of forum discussions; questions posted by a thread starter (TS) during the discussion are very likely to explore different aspects of a topic
Build parallel corpus
– Extract the questions posted by the TS into a question pool Q
– Form question-question pairs by enumerating combinations in Q
– Aggregate all q-q pairs from each forum thread
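
The corpus-building steps can be sketched as follows. This is an illustrative sketch only: the function name `build_parallel_corpus`, the input layout (one list of thread-starter questions per thread), and the grouping of the example questions into threads are assumptions. Whether pairs should also be emitted in the reverse order for translation training is not stated on the slide.

```python
from itertools import combinations

def build_parallel_corpus(threads):
    """Pair up the thread starter's questions within each thread.

    threads: iterable of lists; each list holds the questions that one
    thread starter posted in one thread (the question pool Q).
    """
    pairs = []
    for pool in threads:
        unique = list(dict.fromkeys(pool))  # drop repeats, keep posting order
        pairs.extend(combinations(unique, 2))  # every unordered pair in Q
    return pairs

# Hypothetical threads assembled from questions quoted in the slides.
threads = [
    ["How is Orange Beach in Alabama?",
     "Is the water pretty clear this time of year on Orange Beach?",
     "Do they have chair and umbrella rentals on Orange Beach?"],
    ["Can you go fishing right from the shore on Orange Beach?",
     "What kinds of rods, and bait is needed for fishing down there?"],
    ["How is Orange Beach in Alabama?"],  # a single question yields no pair
]
pairs = build_parallel_corpus(threads)
```

A thread with n distinct thread-starter questions contributes n(n-1)/2 pairs, so threads with fewer than two questions contribute nothing.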

Slide 18: Experiments
Data set
– Crawled from TripAdvisor
– TST_LABEL: labeled data for 268 questions
– TST_UNLABEL: 10,000 threads with at least 2 questions posted by thread starters
– TRAIN_SET: 1,976,522 questions, 971,859 threads
    Parallel corpus to learn T(w|t)
    LDA training data
    Question repository
Question detector
– Labeled sequential pattern mining [Cong et al. 2008]

Slide 19: Experiments
Data analysis: post level
– Forum discussions are quite interactive
– Power law

# Threads | # Threads that have replied posts from TS | Average # replied posts from TS
1,412,141 | 566,256 | 1.9

Slide 20: Experiments
Data analysis: question level
– 68.8% of thread starters asked questions
– On average, 2 questions are asked by thread starters in each thread
– Questions are a focus of forum discussions

# Threads | # Threads where TSs’ posts contain questions | Average # questions in TSs’ posts
1,412,141 | 971,859 | 2.0

Slide 21: Experiments
Word translation
– IBM 1: semantic relationships of words from semantically related questions
– LDA: co-occurrence relations in a question

Slide 22: Experiments
Labeled questions
– LDA performs the worst; it is coarse-grained
– TRLM > TR > QL
– TopicTRLM outperforms the other approaches

Slide 23: Experiments
Topics’ joint probability distribution
– For each q, consider its first subsequent question q’ posted by the TS as relevant
– For 10,000 q, use LDA to infer the most probable topic of each question and aggregate the counts of topic transitions
– The K × K topic transition matrix serves as ground truth
– KL divergence: the smaller, the better
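
The evaluation described on this slide can be sketched as below. This is a hedged illustration: the helper names, the `eps` smoothing that keeps empty cells positive, and the direction of the divergence (ground truth against prediction) are assumptions, since the slide does not specify them.

```python
import math

def transition_distribution(topic_pairs, k, eps=1e-9):
    """Turn (topic of q, topic of q') pairs into a K x K joint distribution."""
    counts = [[0.0] * k for _ in range(k)]
    for src, dst in topic_pairs:
        counts[src][dst] += 1.0
    total = sum(sum(row) for row in counts)
    # eps keeps every cell positive so the KL divergence stays finite.
    return [[(c + eps) / (total + eps * k * k) for c in row] for row in counts]

def kl_divergence(p, q):
    """KL(p || q) summed over all cells of two K x K distributions."""
    return sum(pi * math.log(pi / qi)
               for p_row, q_row in zip(p, q)
               for pi, qi in zip(p_row, q_row))

# Toy example with K = 2 topics (hypothetical topic assignments).
truth = transition_distribution([(0, 1), (0, 1), (1, 0), (1, 1)], k=2)
closer = transition_distribution([(0, 1), (0, 1), (1, 0), (0, 0)], k=2)
farther = transition_distribution([(0, 0), (0, 0), (1, 1), (0, 0)], k=2)
kl_closer = kl_divergence(truth, closer)
kl_farther = kl_divergence(truth, farther)
```

Here a method's suggestions would supply the predicted topic pairs, and the method whose transition distribution has the smaller divergence from the ground truth would rank as better.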

Slide 25: Conclusions and Future Work
Summary
– Propose a question suggestion application in forums
– Propose a method to build a parallel corpus of related questions
– Propose TopicTRLM, which fuses lexical knowledge with latent semantic knowledge
Future work
– How to measure and diversify the suggested questions?
– How could question suggestion help long-query suggestion?

Slide 26: Thanks! Q & A

Slide 27: FAQ
Q: Which tools do you use?
A:
– GIZA++ [Och and Ney 2003] to train IBM model 1
– GibbsLDA++ [Phan, Nguyen and Horiguchi 2008] to conduct LDA training and inference
– Porter Stemmer to stem question words
– Stop-word list from the SMART system, with the 5W1H words removed from the list

Slide 28: FAQ
Q: Which metrics do you use?
A:
– Precision at Rank R
– MAP: Mean average precision
– MRR: Mean reciprocal rank
– KL-divergence: Kullback-Leibler divergence
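
The three ranking metrics follow their standard definitions and can be sketched for a single query as below; MAP and MRR are then the means of average precision and reciprocal rank over all queries. The function names and the toy ranking are illustrative assumptions.

```python
def precision_at(ranked, relevant, r):
    """Fraction of the top-r suggestions that are relevant."""
    return sum(1 for item in ranked[:r] if item in relevant) / r

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

# Toy ranking for one query (hypothetical question ids).
ranked = ["q3", "q1", "q7", "q2"]
relevant = {"q1", "q2"}
p_at_2 = precision_at(ranked, relevant, 2)
ap = average_precision(ranked, relevant)
rr = reciprocal_rank(ranked, relevant)
```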

Slide 29: FAQ
Q: How do you tune parameters?
A: We used 20 queries from TST_LABEL and employed MAP as the tuning criterion

Slide 30: FAQ
Q: Aligned monolingual questions
A:
– Has anyone had an experiences with the Eden Condos in Perdido Key?
– Does anyone know how the beaches are there in Perdido key?
– Can you go fishing right from the shore on Orange Beach?
– What kinds of rods, and bait is needed for fishing down there?

Slide 31: FAQ
Query likelihood language model using Dirichlet smoothing (QL)
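
The slide's formula is not present in the transcript. For reference, the standard form of a Dirichlet-smoothed query likelihood model is given below; writing the smoothing parameter as \lambda is an assumption, and P_{ml} denotes a maximum-likelihood estimate.

```latex
P(q \mid D) = \prod_{w \in q} P(w \mid D), \qquad
P(w \mid D) = \frac{|D|}{|D| + \lambda} \, P_{ml}(w \mid D)
            + \frac{\lambda}{|D| + \lambda} \, P_{ml}(w \mid C)
```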

Slide 32: FAQ
Translation model using Dirichlet smoothing (TR)
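
The slide's formula is not present in the transcript. One common way to write a Dirichlet-smoothed translation model, offered here as an assumption rather than the authors' exact definition, replaces the direct match with translated matches through T(w|t); the symbol \lambda for the smoothing parameter is likewise assumed.

```latex
P(w \mid D) = \frac{|D|}{|D| + \lambda} \sum_{t \in D} T(w \mid t) \, P_{ml}(t \mid D)
            + \frac{\lambda}{|D| + \lambda} \, P_{ml}(w \mid C)
```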