IR4QA: An Unhappy Marriage
Mark A. Greenwood
Natural Language Processing Group, Department of Computer Science, University of Sheffield, UK

Outline of Talk: Background, ‘Ancient’ History, Recent Past, An Uncertain Future, Possible New Directions

Background
“Although QA is not new, the language processing community has yet to develop a clearly articulated and commonly accepted guiding framework and research methodology, parallel to that of IR, MT, or text summarization. As a result, despite ten years of system evaluations in the TREC QA track for specific kinds of questions and answers, the community does not have a clear idea how much progress was made during that period for QA in general.”
(OAQA09 Call for Papers)

Background
We will focus here on the selection of promising documents which can be subjected to further processing in order to extract exact answers to questions. The common approach to this problem has been to employ an IR engine to retrieve a small set of relevant documents, a field known as IR4QA.
The rest of this talk will explain:
 How we got to this point
 Why it is fundamentally flawed
 Where we might go from here

Outline of Talk: Background, ‘Ancient’ History, Recent Past, An Uncertain Future, Possible New Directions

‘Ancient’ History
Traditionally IR and QA were separate research areas:
 They had different users and goals
 The inputs and outputs to both systems were radically different
 Both had their own strengths and weaknesses

‘Ancient’ History
Early QA systems were usually just interfaces to structured data:
 LUNAR (Woods, 1973)
 BASEBALL (Green et al., 1961)
Those systems which worked over text were usually based around reading comprehension exercises and used scenario templates:
 SAM (Schank and Abelson, 1977)
Questions varied in length but asked for information which wasn’t known to the user.
Systems were not open-domain, e.g. LUNAR only knew about moon rocks.

‘Ancient’ History
In comparison to QA systems, early IR systems could be applied to any document collection:
 Performance varied from collection to collection, but in principle any collection could be searched
Queries were usually quite long and described the documents the user was looking for:
 The CACM collection is a good example
Systems returned full documents, not exact answers:
 As the user already knew what they were looking for this was OK
 Full documents don’t help when you don’t know what you are looking for, as you then have to read all the returned documents

Outline of Talk: Background, ‘Ancient’ History, Recent Past, An Uncertain Future, Possible New Directions

Recent Past
Recent QA research has been guided by the TREC evaluations.
The TREC QA track was originally conceived as a task that would interest both the IR and IE communities:
 Focused IR
 Open-Domain IE
It was hoped that over time the two communities would work together to develop new combined approaches.
Unfortunately it would seem that the IR community is not, on the whole, interested in the QA task.

Recent Past
Most, if not all, modern QA systems have adopted a (roughly) three stage architecture: question analysis, document retrieval, and answer extraction.
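To make the division of labour concrete, here is a minimal sketch of that three stage pipeline in Python. The component behaviours are illustrative assumptions, not a description of any particular system: the toy question analyser, the index.search call, and the entity annotations on retrieved documents are all hypothetical.

from dataclasses import dataclass, field

@dataclass
class AnalysedQuestion:
    text: str
    expected_type: str              # e.g. PERSON, DATE, LOCATION
    query_terms: list = field(default_factory=list)

def analyse_question(question: str) -> AnalysedQuestion:
    # Stage 1: decide the expected answer type and build an IR query (toy heuristics).
    expected = "PERSON" if question.lower().startswith("who") else "OTHER"
    stop = {"who", "what", "when", "where", "is", "was", "the", "a"}
    terms = [t for t in question.rstrip("?").split() if t.lower() not in stop]
    return AnalysedQuestion(question, expected, terms)

def retrieve_documents(analysed: AnalysedQuestion, index, top_n: int = 20):
    # Stage 2: hand the query to an off-the-shelf IR engine (hypothetical interface).
    return index.search(" ".join(analysed.query_terms), limit=top_n)

def extract_answers(analysed: AnalysedQuestion, documents):
    # Stage 3: pull candidates of the expected type from the retrieved documents.
    candidates = []
    for doc in documents:
        for entity in doc.entities:  # assumes documents carry named entity annotations
            if entity.type == analysed.expected_type:
                candidates.append((entity.text, doc.score))
    return sorted(candidates, key=lambda c: c[1], reverse=True)

def answer(question: str, index):
    analysed = analyse_question(question)
    documents = retrieve_documents(analysed, index)
    return extract_answers(analysed, documents)

The point of the sketch is simply that stage two is a conventional IR call: whatever it fails to retrieve, stage three can never recover.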

Recent Past
IR4QA has not been aggressively researched by the community, yet we know that:
 IR performance places an upper bound on end-to-end performance; a commonly quoted figure is 60% (Tellex et al., 2003)
 Even if we look at the top 1000 documents, no relevant documents are returned for 8% of the questions (Hovy et al., 2000)
 Most systems use off-the-shelf IR components with little or no tuning to the task, e.g. Lucene, Okapi...
 Complex multi-query strategies have been tried in an effort to solve the problem, but they only serve to highlight how bad performance at this step actually is

Recent Past
IR4QA has focused on the development and evaluation of the document retrieval component in such systems. The main problems are:
 QA researchers are not IR researchers
 We don’t fully understand the intricate details of IR engines
 QA and IR are fundamentally different tasks

Recent Past
The commonly accepted evaluation framework consists of (Roberts and Gaizauskas, 2004):
 Coverage: the proportion of questions for which at least one answer bearing document is retrieved
 Redundancy: the average number of answer bearing documents retrieved per question
A small worked example of both measures follows below.
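Both measures are easy to compute once you know, for each question, which documents in the collection bear an answer. The sketch below assumes two hypothetical dictionaries keyed by question id; it is an illustration of the definitions above, not code from Roberts and Gaizauskas (2004).

def coverage(retrieved, answer_bearing):
    # Proportion of questions for which at least one answer bearing document was retrieved.
    hits = sum(1 for qid, docs in retrieved.items()
               if answer_bearing.get(qid, set()) & set(docs))
    return hits / len(retrieved)

def redundancy(retrieved, answer_bearing):
    # Average number of answer bearing documents retrieved per question.
    counts = [len(answer_bearing.get(qid, set()) & set(docs))
              for qid, docs in retrieved.items()]
    return sum(counts) / len(counts)

# Example: one question answered by one retrieved document, one question missed entirely.
retrieved = {"q1": ["d1", "d2", "d3"], "q2": ["d4"]}
answer_bearing = {"q1": {"d2", "d7"}, "q2": {"d9"}}
print(coverage(retrieved, answer_bearing))    # 0.5
print(redundancy(retrieved, answer_bearing))  # 0.5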

Recent Past
There have been two workshops focused on the problem of IR4QA:
 Sheffield, SIGIR 2004
 Manchester, Coling 2008
The main conclusions of both were that:
 IR4QA is very hard
 Approaches that lead to increased IR performance do not necessarily lead to appreciable increases in end-to-end performance
 Selection of documents shouldn’t be performed in isolation from the rest of the system

Outline of Talk: Background, ‘Ancient’ History, Recent Past, An Uncertain Future, Possible New Directions

An Uncertain Future
It seems clear that, on the whole, the IR community is not interested in QA.
Using off-the-shelf IR components has been shown to introduce unacceptable caps on performance.
The IR4QA community needs to consider radically different approaches to the problem of selecting relevant documents from large corpora.

Outline of Talk: Background, ‘Ancient’ History, Recent Past, An Uncertain Future, Possible New Directions

Possible New Directions
Answer extraction requires complex text processing:
 Answer extraction techniques don’t scale well
 Some form of text selection component is required
There are two orthogonal directions we could take:
 Continue to use traditional IR techniques but discard the traditional view of what makes a document (and/or query)
 Continue to work with traditional documents but use a radically different selection approach
We need approaches that scale: working on AQUAINT-sized collections is nice for self-contained experiments but shouldn’t be the end goal!

What Is A Document?
Topic Indexing and Retrieval (Ahn and Webber, 2008) throws away the common idea of documents while using a standard IR engine to directly retrieve answers rather than text.
Topics are entities that answer questions:
 People, companies, locations etc.
Topic documents are built by simply joining together all sentences from a corpus that contain the topic (or variations of it, e.g. Bill Clinton and William Clinton).
QA is then a matter of retrieving the most relevant topic document using an IR engine and returning the associated topic as the answer.
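A rough sketch of the topic document construction, under the assumption that we already have a list of surface variants for each topic. The variant matching is simplified for illustration; this is not Ahn and Webber’s implementation.

from collections import defaultdict

def build_topic_documents(sentences, topic_variants):
    # sentences: iterable of sentence strings from the corpus.
    # topic_variants: canonical topic -> list of surface forms,
    #   e.g. {"Bill Clinton": ["Bill Clinton", "William Clinton"]}.
    topic_docs = defaultdict(list)
    for sentence in sentences:
        for topic, variants in topic_variants.items():
            if any(variant in sentence for variant in variants):
                topic_docs[topic].append(sentence)
    # Each topic becomes one pseudo-document made of every sentence mentioning it.
    return {topic: " ".join(sents) for topic, sents in topic_docs.items()}

These pseudo-documents are then indexed with a standard IR engine; answering a factoid question amounts to retrieving the best matching topic document and returning its topic.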

What Is A Document?

Let The Data Guide You
A decade of recent QA research has yielded a lot of useful data.
We have lots of example questions (at least a few thousand just from TREC) each of which:
 Has a known correct answer
 Is associated with at least one answer bearing document
We should use this data to guide new selection approaches:
 A simple approach would be to perform query expansion by looking for terms which are often associated with correct answers to certain question types (Derczynski et al., 2008); a sketch of this follows below
 Look for patterns in the answer bearing documents and index collections based on these patterns rather than words
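As a concrete illustration of the first point, here is a hedged sketch of data-driven query expansion: count which terms co-occur with correct answers for each question type in past answer bearing documents, then add the strongest of them to new queries of the same type. The training data format and the top-k cut-off are assumptions for illustration, not the exact method of Derczynski et al. (2008).

from collections import Counter, defaultdict

def learn_expansion_terms(training_data, top_k=5):
    # training_data: iterable of (question_type, answer_bearing_text) pairs
    # drawn from past questions whose correct answers are already known.
    counts = defaultdict(Counter)
    for q_type, text in training_data:
        counts[q_type].update(text.lower().split())
    return {q_type: [term for term, _ in counter.most_common(top_k)]
            for q_type, counter in counts.items()}

def expand_query(query_terms, question_type, expansion_terms):
    # Append the learned terms for this question type to the original query.
    return query_terms + expansion_terms.get(question_type, [])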

Answer By Understanding
I’ve always been of the opinion that QA is intelligent IR:
 Where intelligence equates to some level of understanding
This suggests we should index meaning, not just textual content:
 Take co-reference into account when selecting text passages
 Indexing relations should allow for more focused selection
 ‘Hybrid’ search that uses both annotations and text (Bhagdev et al., 2008); a rough sketch follows below
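To make the hybrid idea a little more concrete, here is a rough sketch of an index that stores relation annotations alongside ordinary terms, so a query can constrain both. The data structures and the relation triples are assumptions for illustration; this is not the Bhagdev et al. (2008) system.

from collections import defaultdict

class HybridIndex:
    def __init__(self):
        self.term_index = defaultdict(set)      # term -> passage ids
        self.relation_index = defaultdict(set)  # (subject_type, relation, object_type) -> passage ids

    def add(self, passage_id, text, relations):
        # Index the raw terms and any relation annotations produced by an IE pipeline.
        for term in text.lower().split():
            self.term_index[term].add(passage_id)
        for triple in relations:
            self.relation_index[triple].add(passage_id)

    def search(self, terms, relation=None):
        # Passages must contain all query terms and, if given, the required relation.
        if not terms:
            return set(self.relation_index.get(relation, set()))
        matches = set.intersection(*(self.term_index[t.lower()] for t in terms))
        if relation is not None:
            matches &= self.relation_index[relation]
        return matches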

DISCUSSION

References
Kisuh Ahn and Bonnie Webber. 2008. Topic Indexing and Retrieval for Factoid QA. In Proceedings of the 2nd Workshop on Information Retrieval for Question Answering (IR4QA).
Ravish Bhagdev, Sam Chapman, Fabio Ciravegna, Vitaveska Lanfranchi and Daniela Petrelli. 2008. Hybrid Search: Effectively Combining Keywords and Semantic Searches. In Proceedings of the 5th European Semantic Web Conference (ESWC 08), Tenerife.
Leon Derczynski, Jun Wang, Robert Gaizauskas and Mark A. Greenwood. 2008. A Data Driven Approach to Query Expansion in Question Answering. In Proceedings of the 2nd Workshop on Information Retrieval for Question Answering (IR4QA).
Bert F. Green, Alice K. Wolf, Carol Chomsky, and Kenneth Laughery. 1961. BASEBALL: An Automatic Question Answerer. In Proceedings of the Western Joint Computer Conference, volume 19.
Eduard Hovy, Laurie Gerber, Ulf Hermjakob, Michael Junk, and Chin-Yew Lin. 2000. Question Answering in Webclopedia. In Proceedings of the 9th Text REtrieval Conference.
Ian Roberts and Robert Gaizauskas. 2004. Evaluating Passage Retrieval Approaches for Question Answering. In Proceedings of the 26th European Conference on Information Retrieval (ECIR’04), University of Sunderland, UK.
Roger C. Schank and Robert Abelson. 1977. Scripts, Plans, Goals and Understanding. Hillsdale.
Stefanie Tellex, Boris Katz, Jimmy Lin, Aaron Fernandes, and Gregory Marton. 2003. Quantitative Evaluation of Passage Retrieval Algorithms for Question Answering. In Proceedings of the Twenty-Sixth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Toronto, Canada, July.
William Woods. 1973. Progress in Natural Language Understanding: An Application to Lunar Geology. In AFIPS Conference Proceedings, volume 42.