A Technical Seminar on Question Answering
SHRI RAMDEOBABA COLLEGE OF ENGINEERING & MANAGEMENT
Presented By: Rohini Kamdi
Guided By: Dr. A. J. Agrawal


Contents:
- Introduction
- Why Question Answering?
- The Architecture of a Generic QA System
- Issues with Traditional QA Systems
- The Web Solution: AskMSR
- Current Research Work
- Conclusion

Introduction
Question answering (QA), in information retrieval, is the task of automatically answering a question posed in natural language (NL), using either a pre-structured database or a collection of natural language documents.
- Goal: to retrieve answers to questions rather than full documents or best-matching passages.
- QA = Information Retrieval + Information Extraction, used to find short answers to fact-based questions.

Why Question Answering?
- Google: query-driven search, where the answers to a query are documents.
- Question Answering: answer-driven search, where the answers to a query are phrases.

The Architecture of a Generic QA System
question -> Question Processing -> query -> Document Retrieval -> Passage Retrieval -> Answer Extraction -> answers
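To make the flow concrete, here is a toy, self-contained Python sketch of the pipeline (document retrieval is folded into the passage step for brevity). The stop-word list, regular expressions, and two-sentence corpus are illustrative assumptions, not any real system's rules.

```python
import re

def process_question(question):
    """Very naive question processing: derive keywords and an expected answer type."""
    stop = {"who", "what", "when", "where", "is", "the", "a", "of", "did"}
    keywords = [w for w in re.findall(r"\w+", question.lower()) if w not in stop]
    answer_type = "PERSON" if question.lower().startswith("who") else "ANY"
    return keywords, answer_type

def retrieve_passages(corpus, keywords):
    """Passage retrieval: keep passages that contain all selected keywords."""
    return [p for p in corpus if all(k in p.lower() for k in keywords)]

def extract_answers(passages, answer_type):
    """Toy answer extraction: capitalized word sequences as candidates, ranked
    by frequency (a real system would also filter candidates by answer_type)."""
    counts = {}
    for p in passages:
        for m in re.findall(r"(?:[A-Z][a-z]+ )+[A-Z][a-z]+", p):
            counts[m] = counts.get(m, 0) + 1
    return sorted(counts, key=counts.get, reverse=True)

corpus = [
    "John Wilkes Booth shot President Abraham Lincoln in 1865.",
    "Abraham Lincoln was the 16th President of the United States.",
]
keywords, atype = process_question("Who shot President Abraham Lincoln?")
print(extract_answers(retrieve_passages(corpus, keywords), atype))
# ['John Wilkes Booth', 'President Abraham Lincoln']
```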

Question Processing
Captures the semantics of the question. Tasks:
- Determine the question type
- Determine the answer type
- Extract keywords from the question and formulate a query

Question Types
- Class 1. A: a single datum or list of items. C: who, when, where, how (old, much, large)
- Class 2. A: multi-sentence. C: extracted from multiple sentences
- Class 3. A: spans several texts. C: comparative/contrastive
- Class 4. A: an analysis of retrieved information. C: synthesized coherently from several retrieved fragments
- Class 5. A: the result of reasoning. C: world/domain knowledge and common-sense reasoning

Types of QA
- Closed-domain QA systems are built for a very specific domain and exploit expert knowledge within it. They achieve very high accuracy, but require extensive language processing and are limited to one domain.
- Open-domain QA systems can answer questions from any collection. They can potentially answer any question, but with very low accuracy.

Keyword Selection
- A list of keywords drawn from the question helps in finding relevant texts.
- Some systems expand the keywords with lexical/semantic alternations for better matching, e.g.:
  inventor -> invent
  have been sold -> sell
  dog -> animal
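As a minimal sketch of such expansion, the snippet below uses NLTK's WordNet interface (assuming nltk is installed and its wordnet corpus downloaded). Using morphy for morphological roots plus one step of hypernyms is an illustrative reading of the examples above; note that reaching "animal" from "dog" takes more than one step up the hypernym chain.

```python
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def expand_keyword(word, pos):
    """Collect the morphological root, synonyms, and direct hypernyms of a word."""
    alternations = set()
    root = wn.morphy(word, pos)                  # e.g. "sold" -> "sell"
    if root:
        alternations.add(root)
    for synset in wn.synsets(word, pos):
        alternations.update(l.replace("_", " ") for l in synset.lemma_names())
        for hyper in synset.hypernyms():         # one step up the hypernym chain
            alternations.update(l.replace("_", " ") for l in hyper.lemma_names())
    alternations.discard(word)
    return alternations

print(expand_keyword("sold", wn.VERB))  # includes "sell"
print(expand_keyword("dog", wn.NOUN))   # includes "canine", "domestic animal", ...
```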

Passage Retrieval
- Extracts passages that contain all selected keywords.
- Passage quality is controlled by a feedback loop (see the sketch below):
  - In the first iteration, use the first six keyword-selection heuristics.
  - If the number of passages < a threshold, the query is too strict: drop a keyword.
  - If the number of passages > a threshold, the query is too relaxed: add a keyword.
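A sketch of that loop, under assumed thresholds; retrieve() is a naive all-keywords match standing in for a real passage retriever, and the keywords are assumed ordered from most to least important.

```python
MIN_PASSAGES, MAX_PASSAGES, MAX_ITERATIONS = 5, 50, 10  # illustrative thresholds

def retrieve(keywords, corpus):
    """Passages that contain all of the given keywords."""
    return [p for p in corpus if all(k in p.lower() for k in keywords)]

def passage_retrieval(keywords, corpus):
    active = list(keywords)   # drop from the end (least important keyword first)
    dropped = []
    passages = []
    for _ in range(MAX_ITERATIONS):
        passages = retrieve(active, corpus)
        if len(passages) < MIN_PASSAGES and len(active) > 1:
            dropped.append(active.pop())   # too strict: drop a keyword
        elif len(passages) > MAX_PASSAGES and dropped:
            active.append(dropped.pop())   # too relaxed: add one back
        else:
            break
    return passages
```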

Answer Extraction
- Pattern matching between the question and the representation of candidate answer-bearing texts.
- A set of candidate answers is produced.
- Candidates are ranked according to their likelihood of correctness.

Example of Answer Processing

QA System            Output
AnswerBus            Sentences
AskJeeves (ask.com)  Documents/direct answers
IONAUT               Passages
LCC                  Sentences
Mulder               Extracted answers
QuASM                Document blocks
START                Mixture
Webclopedia          Sentences

Issues with Traditional QA Systems
- Retrieval is performed against a small set of documents.
- Extensive use of linguistic resources: POS tagging, named-entity tagging, WordNet, etc.
- It is difficult to recognize answers that do not match the question's syntax, e.g.:
  Q: Who shot President Abraham Lincoln?
  A: John Wilkes Booth is perhaps America's most infamous assassin, having fired the bullet that killed Abraham Lincoln.

The Web Can Help
- The Web is a gigantic data repository with extensive data redundancy.
- Factoids are likely to be expressed in hundreds of different ways, so at least a few will match the way the question was asked, e.g.:
  Q: Who shot President Abraham Lincoln?
  A: John Wilkes Booth shot President Abraham Lincoln.

AskMSR: Details

Step 1: Rewrite Queries
- Intuition: the user's question is often syntactically quite close to sentences that contain the answer.
  E.g., Q: Where is the Louvre Museum located? A: The Louvre Museum is located in Paris.
- Classify the question into specific categories.
- Apply category-specific transformation rules.
- Determine the expected answer data type (e.g. Date, Person, Location, ...).

Step 2: Query the Search Engine
- Send all rewrites to a Web search engine.
- Retrieve the top N results.
- For speed, the search engine's "snippets" are used instead of the full text of the actual documents.
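A minimal sketch of the rewrite step; the two patterns, templates, and weights below are invented for illustration and are not AskMSR's actual rules (note the naive assumption that the past-tense verb doubles as a past participle).

```python
import re

REWRITES = [
    # (question pattern, rewrite template, weight of the rewrite)
    (r"^where is (the )?(?P<x>.+?)\s+located\??$", '"{x} is located in"', 5),
    (r"^who (?P<verb>\w+) (?P<x>.+?)\??$",         '"{x} was {verb} by"',  5),
]

def rewrite_query(question):
    """Turn a natural-language question into weighted search-engine query strings."""
    question = question.strip().lower()
    rewrites = []
    for pattern, template, weight in REWRITES:
        m = re.match(pattern, question)
        if m:
            rewrites.append((template.format(**m.groupdict()), weight))
    # Low-weight fallback: the question words as a plain bag-of-words query
    rewrites.append((question.rstrip("?"), 1))
    return rewrites

print(rewrite_query("Where is the Louvre Museum located?"))
# [('"louvre museum is located in"', 5), ('where is the louvre museum located', 1)]
```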

Step 3: Mining N-Grams
- Enumerate all N-grams (N = 1, 2, 3, say) in all retrieved snippets.
- Use a hash table to make this efficient.
- Weight of an n-gram: its occurrence count, with each occurrence weighted by the "reliability" (weight) of the rewrite that fetched the document.

Step 4: Filtering N-Grams
- Each question type (When, Where, What, Who, ...) is associated with one or more "data-type filters", i.e. regular expressions: e.g. When -> Date, Where -> Location, Who -> Person.
- Boost the score of n-grams that match the regular expression.
- Lower the score of n-grams that do not match.
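A combined sketch of Steps 3 and 4; the two regular expressions are crude stand-ins for real data-type filters, and the snippets and weights in the usage example are made up.

```python
import re
from collections import defaultdict

# Crude stand-ins for data-type filters (real ones are more elaborate).
FILTERS = {
    "When": re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b"),    # year-like
    "Who":  re.compile(r"^[A-Z][a-z]+( [A-Z][a-z]+)+$"),  # multi-word name-like
}

def mine_ngrams(snippets, weights, max_n=3):
    """Step 3: count 1..max_n grams, weighted by the rewrite that fetched each snippet."""
    scores = defaultdict(float)
    for snippet, weight in zip(snippets, weights):
        tokens = snippet.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                scores[" ".join(tokens[i:i + n])] += weight
    return scores

def filter_ngrams(scores, question_type):
    """Step 4: boost n-grams matching the expected data type, penalize the rest."""
    f = FILTERS.get(question_type)
    if f is None:
        return dict(scores)
    return {g: (s * 2 if f.search(g) else s * 0.5) for g, s in scores.items()}

snippets = ["Charles Dickens created Scrooge", "Scrooge by Charles Dickens"]
scores = filter_ngrams(mine_ngrams(snippets, [5, 2]), "Who")
print(max(scores, key=scores.get))  # Charles Dickens
```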

Step 5: Tiling the Answers
- Tile the highest-scoring n-gram with overlapping n-grams: merge their scores and discard the old n-grams.
- Repeat until no more overlap.
- Example: "Who created the character of Scrooge?"
  "Dickens", "Charles Dickens", "Mr Charles" -> "Mr Charles Dickens" (score 45)
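A sketch of the tiling step; the overlap test (a suffix of one n-gram equals a prefix of the other, including full containment) and the component scores 10/20/15 are assumptions chosen to reproduce the slide's merged score of 45.

```python
def tile(a, b):
    """Merge b into a if a's tail overlaps b's head; return None otherwise."""
    wa, wb = a.split(), b.split()
    for k in range(min(len(wa), len(wb)), 0, -1):
        if wa[-k:] == wb[:k]:
            return " ".join(wa + wb[k:])
    return None

def tile_answers(scored):
    """Repeatedly tile the highest-scoring n-gram with an overlapping one,
    merging scores and discarding the old n-grams, until nothing overlaps."""
    scored = dict(scored)
    while True:
        best = max(scored, key=scored.get)
        merged = False
        for other in list(scored):
            if other == best:
                continue
            t = tile(best, other) or tile(other, best)
            if t:
                scored[t] = scored.pop(best) + scored.pop(other)
                merged = True
                break
        if not merged:
            return best, scored[best]

print(tile_answers({"Dickens": 10, "Charles Dickens": 20, "Mr Charles": 15}))
# ('Mr Charles Dickens', 45)
```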

Current Research Work
- Human Question Answering Performance Using an Interactive Document Retrieval System
  Combines document retrieval with QA: users answer the questions on their own using an interactive document retrieval system, and their results are compared with and evaluated against QA systems.
- Towards Automatic Question Answering over Social Media by Learning Question Equivalence Patterns
  Collaborative Question Answering (CQA) systems work over an existing archive in which users answer each other's questions; many questions to be asked have already been asked and answered, and equivalence patterns are generated to group questions with syntactic similarities.

An Automatic Answering System with Template Matching for Natural Language Questions
A closed-domain system that applies template matching to provide a service for cell phones via SMS; Frequently Asked Questions (FAQ) are used as sample data. It works in three stages (sketched below):
- Preprocessing
- Question Template Matching
- Answering
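A toy sketch of those three stages for an SMS FAQ service; the template, FAQ entries, and canned answers are entirely hypothetical.

```python
import re

# One hypothetical question template mapped to canned FAQ answers.
FAQ = [
    (r"(how|where) (do|can) i (?P<x>.+)", {
        "activate my sim": "Dial 123 from your handset to activate the SIM.",
        "check my balance": "Send BAL to 121 to check your balance.",
    }),
]

def answer_sms(question):
    q = re.sub(r"[^\w\s]", "", question.lower()).strip()  # 1. preprocessing
    for pattern, answers in FAQ:                          # 2. template matching
        m = re.match(pattern, q)
        if m and m.group("x") in answers:
            return answers[m.group("x")]                  # 3. answering
    return "Sorry, no matching FAQ entry."

print(answer_sms("How do I check my balance?"))
# Send BAL to 121 to check your balance.
```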

Conclusion
- Question Answering requires more complex NLP techniques than other forms of Information Retrieval.
- Complex automatic QA systems may eventually replace simple web search systems, but automatic QA remains a non-trivial research field; like Document and Information Retrieval, it is a huge area with many different approaches, not all of which are fully developed yet.

References
1. Eric Brill, Susan Dumais, and Michelle Banko. An Analysis of the AskMSR Question-Answering System. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Philadelphia, July 2002. Association for Computational Linguistics.
2. Christian Gutl. New Trends in Automatic Question Answering.

Thank You.