Personalized, Interactive Question Answering on the Web

Slides:



Advertisements
Similar presentations
Answering Approximate Queries over Autonomous Web Databases Xiangfu Meng, Z. M. Ma, and Li Yan College of Information Science and Engineering, Northeastern.
Advertisements

Date: 2013/1/17 Author: Yang Liu, Ruihua Song, Yu Chen, Jian-Yun Nie and Ji-Rong Wen Source: SIGIR12 Advisor: Jia-ling Koh Speaker: Chen-Yu Huang Adaptive.
Learning to Suggest: A Machine Learning Framework for Ranking Query Suggestions Date: 2013/02/18 Author: Umut Ozertem, Olivier Chapelle, Pinar Donmez,
Chapter 5: Introduction to Information Retrieval
Diversity Maximization Under Matroid Constraints Date : 2013/11/06 Source : KDD’13 Authors : Zeinab Abbassi, Vahab S. Mirrokni, Mayur Thakur Advisor :
Crawling, Ranking and Indexing. Organizing the Web The Web is big. Really big. –Over 3 billion pages, just in the indexable Web The Web is dynamic Problems:
WWW 2014 Seoul, April 8 th SNOW 2014 Data Challenge Two-level message clustering for topic detection in Twitter Georgios Petkos, Symeon Papadopoulos, Yiannis.
SEARCHING QUESTION AND ANSWER ARCHIVES Dr. Jiwoon Jeon Presented by CHARANYA VENKATESH KUMAR.
Dialogue – Driven Intranet Search Suma Adindla School of Computer Science & Electronic Engineering 8th LANGUAGE & COMPUTATION DAY 2009.
Search Engines and Information Retrieval
Computer comunication B Information retrieval. Information retrieval: introduction 1 This topic addresses the question on how it is possible to find relevant.
Topic-Sensitive PageRank Taher H. Haveliwala. PageRank Importance is propagated A global ranking vector is pre-computed.
Chapter 5: Information Retrieval and Web Search
Finding Advertising Keywords on Web Pages Scott Wen-tau YihJoshua Goodman Microsoft Research Vitor R. Carvalho Carnegie Mellon University.
Search Engines and Information Retrieval Chapter 1.
Probabilistic Model for Definitional Question Answering Kyoung-Soo Han, Young-In Song, and Hae-Chang Rim Korea University SIGIR 2006.
1 Wikification CSE 6339 (Section 002) Abhijit Tendulkar.
AnswerBus Question Answering System Zhiping Zheng School of Information, University of Michigan HLT 2002.
Web Searching Basics Dr. Dania Bilal IS 530 Fall 2009.
UOS 1 Ontology Based Personalized Search Zhang Tao The University of Seoul.
When Experts Agree: Using Non-Affiliated Experts To Rank Popular Topics Meital Aizen.
CIKM’09 Date:2010/8/24 Advisor: Dr. Koh, Jia-Ling Speaker: Lin, Yi-Jhen 1.
Exploring Online Social Activities for Adaptive Search Personalization CIKM’10 Advisor : Jia Ling, Koh Speaker : SHENG HONG, CHUNG.
1 Efficient Search Ranking in Social Network ACM CIKM2007 Monique V. Vieira, Bruno M. Fonseca, Rodrigo Damazio, Paulo B. Golgher, Davi de Castro Reis,
A Probabilistic Graphical Model for Joint Answer Ranking in Question Answering Jeongwoo Ko, Luo Si, Eric Nyberg (SIGIR ’ 07) Speaker: Cho, Chin Wei Advisor:
11 A Hybrid Phish Detection Approach by Identity Discovery and Keywords Retrieval Reporter: 林佳宜 /10/17.
Retrieval Models for Question and Answer Archives Xiaobing Xue, Jiwoon Jeon, W. Bruce Croft Computer Science Department University of Massachusetts, Google,
Chapter 6: Information Retrieval and Web Search
Search Engines. Search Strategies Define the search topic(s) and break it down into its component parts What terms, words or phrases do you use to describe.
Detecting Dominant Locations from Search Queries Lee Wang, Chuang Wang, Xing Xie, Josh Forman, Yansheng Lu, Wei-Ying Ma, Ying Li SIGIR 2005.
Binxing Jiao et. al (SIGIR ’10) Presenter : Lin, Yi-Jhen Advisor: Dr. Koh. Jia-ling Date: 2011/4/25 VISUAL SUMMARIZATION OF WEB PAGES.
Date : 2012/10/25 Author : Yosi Mass, Yehoshua Sagiv Source : WSDM’12 Speaker : Er-Gang Liu Advisor : Dr. Jia-ling Koh 1.
Enhancing Cluster Labeling Using Wikipedia David Carmel, Haggai Roitman, Naama Zwerdling IBM Research Lab (SIGIR’09) Date: 11/09/2009 Speaker: Cho, Chin.
A Novel Pattern Learning Method for Open Domain Question Answering IJCNLP 2004 Yongping Du, Xuanjing Huang, Xin Li, Lide Wu.
1 Opinion Retrieval from Blogs Wei Zhang, Clement Yu, and Weiyi Meng (2007 CIKM)
Personalizing Web Search using Long Term Browsing History Nicolaas Matthijs, Cambridge Filip Radlinski, Microsoft In Proceedings of WSDM
21/11/20151Gianluca Demartini Ranking Clusters for Web Search Gianluca Demartini Paul–Alexandru Chirita Ingo Brunkhorst Wolfgang Nejdl L3S Info Lunch Hannover,
A Word Clustering Approach for Language Model-based Sentence Retrieval in Question Answering Systems Saeedeh Momtazi, Dietrich Klakow University of Saarland,Germany.
A Classification-based Approach to Question Answering in Discussion Boards Liangjie Hong, Brian D. Davison Lehigh University (SIGIR ’ 09) Speaker: Cho,
Date: 2012/08/21 Source: Zhong Zeng, Zhifeng Bao, Tok Wang Ling, Mong Li Lee (KEYS’12) Speaker: Er-Gang Liu Advisor: Dr. Jia-ling Koh 1.
Ranking of Database Query Results Nitesh Maan, Arujn Saraswat, Nishant Kapoor.
Query Suggestions in the Absence of Query Logs Sumit Bhatia, Debapriyo Majumdar,Prasenjit Mitra SIGIR’11, July 24–28, 2011, Beijing, China.
26/01/20161Gianluca Demartini Ranking Categories for Faceted Search Gianluca Demartini L3S Research Seminars Hannover, 09 June 2006.
Date: 2013/4/1 Author: Jaime I. Lopez-Veyna, Victor J. Sosa-Sosa, Ivan Lopez-Arevalo Source: KEYS’12 Advisor: Jia-ling Koh Speaker: Chen-Yu Huang KESOSD.
PERSONALIZED DIVERSIFICATION OF SEARCH RESULTS Date: 2013/04/15 Author: David Vallet, Pablo Castells Source: SIGIR’12 Advisor: Dr.Jia-ling, Koh Speaker:
TO Each His Own: Personalized Content Selection Based on Text Comprehensibility Date: 2013/01/24 Author: Chenhao Tan, Evgeniy Gabrilovich, Bo Pang Source:
Predicting User Interests from Contextual Information R. W. White, P. Bailey, L. Chen Microsoft (SIGIR 2009) Presenter : Jae-won Lee.
Bringing Order to the Web : Automatically Categorizing Search Results Advisor : Dr. Hsu Graduate : Keng-Wei Chang Author : Hao Chen Susan Dumais.
CS791 - Technologies of Google Spring A Web­based Kernel Function for Measuring the Similarity of Short Text Snippets By Mehran Sahami, Timothy.
Personalized Ontology for Web Search Personalization S. Sendhilkumar, T.V. Geetha Anna University, Chennai India 1st ACM Bangalore annual Compute conference,
An Effective Statistical Approach to Blog Post Opinion Retrieval Ben He, Craig Macdonald, Jiyin He, Iadh Ounis (CIKM 2008)
Linguistic Graph Similarity for News Sentence Searching
Designing Cross-Language Information Retrieval System using various Techniques of Query Expansion and Indexing for Improved Performance  Hello everyone,
Web News Sentence Searching Using Linguistic Graph Similarity
IST 516 Fall 2011 Dongwon Lee, Ph.D.
Getting High on Search Engines with WebPositionGold
Information Retrieval
IR Theory: Evaluation Methods
Data Integration for Relational Web
Chapter 5: Information Retrieval and Web Search
Combining Keyword and Semantic Search for Best Effort Information Retrieval  Andrew Zitzelberger 1.
Date : 2013/1/10 Author : Lanbo Zhang, Yi Zhang, Yunfei Chen
CS246: Information Retrieval
Date: 2012/11/15 Author: Jin Young Kim, Kevyn Collins-Thompson,
Information Retrieval and Web Design
Information Retrieval and Web Design
WSExpress: A QoS-Aware Search Engine for Web Services
Introduction to Search Engines
Journal of Web Semantics 55 (2019)
Connecting the Dots Between News Article
Presentation transcript:

Personalized, Interactive Question Answering on the Web Silvia Quarteroni (COLING’08) Speaker: Cho, Chin Wei Advisor: Dr. Koh, Jia-Ling Date: 06/07/2010

Outline Introduction Baseline System Architecture User Modeling for Personalization Personalized QA Algorithm Interactivity Conclusion & Future Works

Introduction Two of the current issues of Question Answering (QA) systems: The lack of personalization to the individual users’ needs. The lack of interactivity by which at the end of each QA session the context of interaction is lost. user often issue queries not as standalone questions but in the context of a wider information need

Baseline System Architecture Question Processing The query is classified and the two top expected answer types are estimated. It is then submitted to the underlying search engine. Document Retrieval The top n documents are retrieved from the search engine (Google) and split into sentences.

Baseline System Architecture Answer Extraction A sentence-level similarity metric combining lexical, syntactic and semantic criteria is applied to the query and to each retrieved document sentence to identify candidate answer sentences. Candidate answers are ordered by relevance to the query; the Google rank of the answer source document is used as a tie-breaking criterion. The list of top ranked answers is then returned to the user in an HTML page.

User Modeling for Personalization Age range 7 − 10, 11 − 16, adult Reading level basic, medium, advanced Profile a set of textual documents, bookmarks and Web pages of interest.

Reading Level Estimation 180 HTML documents from a collection of Web portals where pages are explicitly annotated by the publishers according to the three reading levels above. We use unigram language modeling to model the reading level of subjects in primary and secondary school.

Reading Level Estimation For each unclassified document D, a unigram language model is built. The estimated reading level of D is the language model lmi maximizing the likelihood that D has been generated by lmi ,where i is {basic,medium, advanced}

Profile Estimation Profile estimation is based on key-phrase extraction We use Kea, which splits documents into phrases and chooses some of the phrases as be key-phrases based on two criteria: the first index of their occurrence in the source documents their TF × IDF score with respect to the current document collection Kea outputs for each document in the set a ranked list where the candidate key-phrases are in decreasing order We chose to use the top 6 as key-phrases for each document

Personalized QA Algorithm 1.The retrieved documents’ reading levels are estimated 2.Documents having a different reading level from the user are discarded; if the remaining documents are insufficient, part of the incompatible documents having a close reading level are kept 3.From the documents remaining from step 2, key phrases are extracted using Kea 4.The remaining documents are split into sentences 5.Document topics are matched with the topics in the UM that represent the user’s interests 6.Candidate answers are extracted from the documents and ordered by relevance to the query

Personalized QA Algorithm 7. As an additional answer relevance criterion, the degree of match between the candidate answer document topics and the user’s topics of interest is used and a new ranking is computed on the initial list of candidate answer

Personalized QA Algorithm kij, the j-th key-phrase extracted from Retri Retri, the document represented in the i-th row in Retr Pn, the n-th document in P Retri the document represented in the i-th row in Retr, and Pn the one represented in the n-th row of P 4. Given kij , i.e. the j-th key-phrase extracted from Retri, and Pn, i.e. the n-th document in P, we call w(kij , Pn) the relevance of kij with respect to Pn

Interactivity 1.An optional reciprocal greeting, followed by a question q from the user 2.q is analyzed to detect whether it is related to previous questions or not 3.(a) If q is unrelated to the preceding questions, it is submitted to the QA component (b) If q is related to the preceding questions (follow-up question), it is interpreted by the system in the context of previous queries; New q’ is either directly submitted to the QA component or a request for confirmation (grounding) is issued to the user; If he/she does not agree, the system asks the user to reformulate the question until it can be interpreted by the QA component

Interactivity 4.As soon as the QA component results are available, an answer a is provided. 5.The system enquires whether the user is interested in submitting new queries. 6.Whenever the user wants to terminate the interaction, a final greeting is exchanged.

Handling follow-up questions S: stack of previous user questions If q is elliptic (i.e. contains no verbs), its keywords are completed with the keywords extracted by the QA component from the previous question in S for which there exists an answer. The completed query is submitted to the QA component; If q contains pronoun/adjective anaphora, a chunker is used to find the most recent compatible antecedent in S. This must be a NP compatible in number with the referent. If q contains NP anaphora, the first NP in S containing all the words in the referent is used to replace the latter in q. When no antecedent can be found, a clarification request is issued by the system until a resolved query can be submitted to the QA component.

Evaluation - Reading Level 20 subjects aged between 16 and 52 They evaluated the answers returned by the system to 24 questions into 3 groups (basic, medium and advanced reading levels) Result 94% for Agreement of Advanced Level 85% for Agreement of Medium Level 72% for Agreement of Basic Level

Evaluation - Profile For each user, we created the following 3 questions so that he/she would submit them to the QA system: Qper related to the user’s profile, for answering which the personalized version of YourQA would be used; Qbas related to the user’s profile, for which the baseline version of the system would be used Qunr unrelated to the user’s profile, hence not affected by personalization.

Evaluation For each of the five results separately TEST1: This result is useful to me: 5) Yes, 4) Mostly yes, 3) Maybe, 2) Mostly not, 1) Not at all TEST2: This result is related to my profile: For the five results taken as a whole TEST3: Finding the info I wanted in the result page took: 1) Too long, 2) Quite long, 3) Not too long, 4) Quite little, 5) Very little TEST4: For this query, the system results were sensitive to my profile:

Evaluation TEST1: This result is useful to me TEST2: This result is related to my profile TEST3: Finding the info I wanted in the result page took TEST4: For this query, the system results were sensitive to my profile

Evaluation Interactive QA evaluation

Conclusion An efficient and light-weight method to personalize the results of a Web-based QA system based on a User Model representing individual users’ reading level, age range and interests. A dialogue management model for interactive QA based on a chat interface.

Future Works Facebook

Future Works Open Graph API <html xmlns:og="http://opengraphprotocol.org/schema/"> <head> <title>The Rock (1996)</title> <meta property="og:title" content="The Rock" /> <meta property="og:type" content="movie" /> <meta property="og:url" content="http://www.imdb.com/title/tt0117500/" /> <meta property="og:image" content="http://ia.media- imdb.com/images/rock.jpg" /> … people, place, business, …more