CS 626-460 Course Seminar, Group 2: Question Answering. Group Members: Satadru Biswas (05005021), Tanmay Khirwadkar (05005016), Arun Karthikeyan Karra (05d05020).



Outline: Introduction; Why Question Answering?; AskMSR; FALCON; Conclusion.

Introduction Question Answering (QA) is the task of automatically answering a question posed in natural language. To find the answer to a question, a QA computer program may use either a pre-structured database or a collection of natural language documents (a text corpus such as the World Wide Web or some local collection).

A few sample questions
Q: Who shot President Abraham Lincoln? A: John Wilkes Booth
Q: How many lives were lost in the Pan Am crash in Lockerbie? A: 270
Q: How long does it take to travel from London to Paris through the Channel? A: three hours 45 minutes
Q: Which Atlantic hurricane had the highest recorded wind speed? A: Gilbert (200 mph)

Why Question Answering?
Google: query-driven search; the answers to a query are documents.
Question Answering: answer-driven search; the answers to a query are phrases.

Approaches: question classification, finding the entailed answer type, use of WordNet, and high-quality document search.

Question Classes
Class 1. A: single datum or list of items; C: who, when, where, how (old, much, large). Example: Who shot President Abraham Lincoln? Answer: John Wilkes Booth
Class 2. A: multi-sentence; C: extract from multiple sentences. Example: Who was Picasso? Answer: Picasso was a great Spanish painter
Class 3. A: across several texts; C: comparative/contrastive. Example: What are the Valdez Principles?

Question Classes (contd.)
Class 4. A: an analysis of retrieved information; C: synthesized coherently from several retrieved fragments. Example: Which Atlantic hurricane had the highest recorded wind speed? Answer: Gilbert (200 mph)
Class 5. A: result of reasoning; C: world/domain knowledge and common-sense reasoning. Example: What did Richard Feynman say upon hearing he would receive the Nobel Prize in Physics?

Types of QA
Closed-domain question answering deals with questions under a specific domain, and can be seen as an easier task because NLP systems can exploit domain-specific knowledge frequently formalized in ontologies.
Open-domain question answering deals with questions about nearly everything, and can only rely on general ontologies and world knowledge. On the other hand, these systems usually have much more data available from which to extract the answer.

QA Concepts
Question Classes: Different types of questions require the use of different strategies to find the answer.
Question Processing: A semantic model of question understanding and processing is needed, one that would recognize equivalent questions, regardless of the speech act or of the words, syntactic inter-relations or idiomatic forms.
Context and QA: Questions are usually asked within a context and answers are provided within that specific context.
Data sources for QA: Before a question can be answered, it must be known what knowledge sources are available.

QA Concepts (contd.)
Answer Extraction: Answer extraction depends on the complexity of the question, on the answer type provided by question processing, on the actual data where the answer is searched, on the search method, and on the question focus and context.
Answer Formulation: The result of a QA system should be presented in as natural a way as possible.
Real-time question answering: There is a need for QA systems that can extract answers from large data sets in several seconds, regardless of the complexity of the question, the size and multitude of the data sources, or the ambiguity of the question.
Multilingual QA: The ability to answer a question posed in one language using an answer corpus in another language (or even several).

QA Concepts (contd.)
Interactive QA: Often the questioner might want not only to reformulate the question, but to have a dialogue with the system.
Advanced reasoning for QA: More sophisticated questioners expect answers which are outside the scope of written texts or structured databases.
User profiling for QA: The user profile captures data about the questioner, comprising context data, domain of interest, reasoning schemes frequently used by the questioner, common ground established within different dialogues between the system and the user, etc.

Question Answering with the Help of the Web (Tanmay Khirwadkar)

Issues with traditional QA systems
Retrieval is performed against a small set of documents.
Extensive use of linguistic resources: POS tagging, named-entity tagging, WordNet, etc.
It is difficult to recognize answers that do not match the question's syntax, e.g. Q: Who shot President Abraham Lincoln? A: John Wilkes Booth is perhaps America's most infamous assassin, having fired the bullet that killed Abraham Lincoln.

The Web can help!
The Web is a gigantic data repository with extensive data redundancy: factoids are likely to be expressed in hundreds of different ways, and at least a few will match the way the question was asked.
E.g. Q: Who shot President Abraham Lincoln? A: John Wilkes Booth shot President Abraham Lincoln.

AskMSR
Based on the data redundancy of the Web. Steps (sketched below):
Process the question and form web search-engine queries
Recognize the answer type
Rank answers on the basis of frequency
Project the answers onto the TREC corpus
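The slide lists the stages but not how they fit together; below is a minimal sketch, assuming the individual step functions are passed in by the caller. All names and signatures are illustrative, not those of the actual AskMSR implementation.

```python
# Minimal sketch of an AskMSR-style pipeline. Every step function is supplied
# by the caller; none of the names here come from the actual Microsoft system.

def answer_question(question, reformulate, search, harvest, filter_answers, tile, project):
    rewrites = reformulate(question)                    # 1. form weighted search-engine queries
    snippets = []
    for query, weight in rewrites:                      # 2. collect snippets for each rewrite
        snippets += [(snippet, weight) for snippet in search(query)]
    candidates = harvest(question, snippets)            # 3. enumerate and score n-grams
    candidates = filter_answers(question, candidates)   # 4. answer-type filtering
    candidates = tile(candidates)                       # 5. merge overlapping n-grams
    return project(question, candidates)                # 6. find supporting TREC documents
```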

1. Query Reformulation
The question is often syntactically close to its answer.
E.g. Where is the Louvre Museum located? The Louvre Museum is located in Paris.
Who created the character of Scrooge? Charles Dickens created the character of Scrooge.

1. Query Reformulation
Classify the query into 7 categories (Who, When, Where, …).
Apply hand-crafted, category-specific rewrite rules of the form [String, L/R/-, Weight].
Weight: the preference for a query, e.g. "Abraham Lincoln born on" is preferred to "Abraham" "Lincoln" "born".
String: produced by simple string manipulations.
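As an illustration of the categorization step, a toy classifier might key on the question's opening wh-phrase. The slide names only "Who, When, Where, …", so the full category list below is an assumption.

```python
# Toy question categorizer: bucket a question by its leading wh-phrase so that
# category-specific rewrite rules can be chosen. The category list is a guess.

CATEGORIES = ["how many", "how much", "who", "what", "when", "where", "why"]

def classify(question):
    q = question.lower()
    for category in CATEGORIES:          # longer phrases are listed first
        if q.startswith(category):
            return category
    return "other"

print(classify("Who shot President Abraham Lincoln?"))                         # -> who
print(classify("How many lives were lost in the Pan Am crash in Lockerbie?"))  # -> how many
```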

1. Query Reformulation
E.g. for a "What is X?" question, move "is" to all possible locations. Q: What is relative humidity?
["is relative humidity", LEFT, 5]
["relative is humidity", RIGHT, 5]
["relative humidity is", RIGHT, 5]
["relative humidity", NULL, 2]
["relative" AND "humidity", NULL, 1]
Some rewrites may be nonsensical.
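A runnable sketch of how such rewrites could be generated for a "What is X?" question; the weights mirror the slide, and the LEFT/RIGHT placement hints are omitted. This is an illustration, not the system's actual rewrite code.

```python
# Toy generator for AskMSR-style rewrites of a "What is X?" question.
# Weights follow the slide (5 for exact phrases, 2 and 1 for backoffs).

def rewrite_what_is(question):
    words = question.rstrip("?").split()
    assert words[0].lower() == "what" and words[1].lower() == "is"
    rest = words[2:]                                   # e.g. ['relative', 'humidity']
    rewrites = []
    for i in range(len(rest) + 1):                     # move 'is' to every position
        moved = rest[:i] + ["is"] + rest[i:]
        rewrites.append(('"%s"' % " ".join(moved), 5))
    rewrites.append(('"%s"' % " ".join(rest), 2))      # phrase without the verb
    rewrites.append((" AND ".join(rest), 1))           # plain bag-of-words query
    return rewrites

for query, weight in rewrite_what_is("What is relative humidity?"):
    print(weight, query)
# 5 "is relative humidity"
# 5 "relative is humidity"
# 5 "relative humidity is"
# 2 "relative humidity"
# 1 relative AND humidity
```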

2. Querying the Search Engine
Send all rewrites to a Web search engine and retrieve the top N results.
For speed, rely only on the search engine's "snippets", not the full text of the actual documents.

3. N-gram Harvesting
Process each snippet to retrieve the strings to the left/right of the query.
Enumerate all n-grams (unigrams, bigrams and trigrams).
Score of an n-gram: its occurrence frequency, weighted by the weight of the rewrite rule that fetched the snippet (formula below).
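The scoring formula on the original slide did not come through; based on the description above, it is presumably of roughly this form, where s ranges over retrieved snippets containing the n-gram g, r(s) is the rewrite rule whose query fetched s, and w(.) is that rule's weight:

```latex
\mathrm{score}(g) \;=\; \sum_{s \,:\, g \in s} w\bigl(r(s)\bigr)
```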

4. Filtering Answers
Apply filters based on the question type, using regular expressions and natural-language analysis (e.g. recognizing names such as "Genghis Khan" or "Benedict XVI").
Boost the score of an answer when it matches the expected answer type.
Remove answers from the candidate list when the set of valid answers is closed ("Which country …", "How many times …").
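A small illustrative sketch of this step using regular-expression filters; the patterns and the 2x boost are invented for the example, not taken from AskMSR.

```python
import re

# Toy answer-type filter: boost n-grams that match a pattern expected for the
# question type. Patterns and the 2x boost are illustrative only.

TYPE_PATTERNS = {
    "how many": re.compile(r"^\d[\d,.]*$"),              # expect a number
    "when":     re.compile(r"\b\d{4}\b"),                 # expect a year
    "who":      re.compile(r"^(?:[A-Z]\w+ )+[A-Z]\w+$"),  # expect a capitalized name
}

def filter_answers(question, candidates):
    """candidates: dict n-gram -> score; returns a rescored copy."""
    q = question.lower()
    for qtype, pattern in TYPE_PATTERNS.items():
        if qtype in q:
            return {c: (s * 2 if pattern.search(c) else s) for c, s in candidates.items()}
    return dict(candidates)

print(filter_answers("Who shot President Abraham Lincoln?",
                     {"John Wilkes Booth": 12, "the theater": 7}))
# {'John Wilkes Booth': 24, 'the theater': 7}
```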

5. Answer Tiling
Shorter n-grams tend to receive higher scores, so perform tiling: combine overlapping shorter n-grams into longer n-grams, with score = maximum over the constituent n-grams.
E.g. "Pierre Baron" (5), "Baron de Coubertin" (20) and "de Coubertin" (10) tile into "Pierre Baron de Coubertin" (20).
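A toy implementation of tiling on the slide's example (illustrative code, not the system's):

```python
# Toy answer tiling: merge two candidates when the tail of one overlaps the
# head of the other; the merged n-gram keeps the higher of the two scores.

def tile_pair(a, b):
    """If a's tail overlaps b's head, return the merged n-gram, else None."""
    wa, wb = a.split(), b.split()
    for k in range(min(len(wa), len(wb)), 0, -1):
        if wa[-k:] == wb[:k]:
            return " ".join(wa + wb[k:])
    return None

def tile(candidates):
    """candidates: dict n-gram -> score. Greedily merge overlapping candidates."""
    items = dict(candidates)
    merged = True
    while merged:
        merged = False
        for a in list(items):
            for b in list(items):
                if a == b or a not in items or b not in items:
                    continue
                t = tile_pair(a, b)
                if t:
                    score = max(items[a], items[b])     # score of the tile = max of its parts
                    del items[a], items[b]
                    items[t] = max(score, items.get(t, 0))
                    merged = True
    return items

print(tile({"Pierre Baron": 5, "Baron de Coubertin": 20, "de Coubertin": 10}))
# {'Pierre Baron de Coubertin': 20}
```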

6. Answer Projection
For each answer, retrieve supporting documents from the (TREC) document collection using a standard IR system.
IR query = Web query + candidate answer.
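A minimal sketch of projection, assuming some ir_search callable (e.g. a BM25 index over the TREC collection) is available; the name and signature are placeholders.

```python
# Toy answer projection: pair each candidate with supporting documents from a
# local collection. `ir_search(query) -> list of doc ids` is a placeholder for
# whatever standard IR system is used.

def project(web_query, candidates, ir_search, top_k=5):
    supported = []
    for answer, score in candidates.items():
        docs = ir_search(web_query + " " + answer)   # IR query = web query + candidate answer
        if docs:                                     # keep only answers with support
            supported.append((answer, score, docs[0]))
    supported.sort(key=lambda item: item[1], reverse=True)
    return supported[:top_k]
```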

Results: the evaluation table reports MRR and the number of questions with no answer, for AskMSR and AskMSR2, under both strict and lenient scoring.

FALCON: Boosting Knowledge for QA Systems (Arun Karthikeyan Karra, 05d05020)

FALCON: Introduction
Another QA system, which integrates syntactic, semantic and pragmatic knowledge to achieve better performance.
It handles question reformulations, incorporates the WordNet semantic network, and performs unifications on semantic forms to extract answers.

Architecture of FALCON. Source: FALCON: Boosting Knowledge for Answer Engines, Harabagiu et al.

Working of FALCON: a gist. Source: FALCON: Boosting Knowledge for Answer Engines, Harabagiu et al.

Question Reformulations (1). Source: FALCON: Boosting Knowledge for Answer Engines, Harabagiu et al.

Question Reformulations (2). Source: FALCON: Boosting Knowledge for Answer Engines, Harabagiu et al.

Expected Answer Type (1)

Expected Answer Type (2). Source: FALCON: Boosting Knowledge for Answer Engines, Harabagiu et al.

Semantic Knowledge. Source: FALCON: Boosting Knowledge for Answer Engines, Harabagiu et al.

Keywords and Alternations
Morphological alternations, lexical alternations, and semantic alternations.
Example questions: Who killed Martin Luther King? How far is the moon?
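As a small illustration of lexical alternation (not FALCON's actual mechanism, which also exploits WordNet glosses and hierarchies), synonyms of a question keyword can be collected with NLTK's WordNet interface:

```python
# Toy lexical-alternation generator using WordNet via NLTK.
# Requires the WordNet data: nltk.download('wordnet')

from nltk.corpus import wordnet as wn

def lexical_alternations(word, pos=wn.VERB):
    """Collect synonym lemmas of `word` across its WordNet synsets."""
    alternations = set()
    for synset in wn.synsets(word, pos=pos):
        for lemma in synset.lemmas():
            alternations.add(lemma.name().replace("_", " "))
    alternations.discard(word)
    return sorted(alternations)

print(lexical_alternations("kill"))   # synonyms of 'kill' across its verb senses
```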

Results Reported
Evaluated on TREC-9 (Text Retrieval Conference): 692 questions; keyword alternations were used for 89 of them.
Source: FALCON: Boosting Knowledge for Answer Engines, Harabagiu et al.

Conclusion
Question Answering requires more complex NLP techniques than other forms of Information Retrieval.
Two main approaches: data redundancy (AskMSR) and boosting the knowledge base (FALCON).
Ultimate goal: a system which we can 'talk' to. There is a long way to go... and a lot more money to come.

References
Data Intensive Question Answering, Eric Brill et al., TREC-10, 2001.
An Analysis of the AskMSR Question-Answering System, Eric Brill et al., Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Philadelphia, July 2002.
FALCON: Boosting Knowledge for Answer Engines, Sanda Harabagiu, Dan Moldovan et al., Southern Methodist University, TREC-9, 2000.
Wikipedia.

EXTRA SLIDES

Abductive Knowledge