Traditional Question Answering System: an Overview

汤顺雷

Category
- NLIDB (Natural Language Interface over Databases)
- QA over text
  - Document-based: TREC, given a corpus
  - Web-based: START (AI Lab MIT, 1993)
- Semantic ontology-based
- …
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2): 125-155.

NLI over Database
Early:
- Domain-specific
- Pattern matching
- e.g. Baseball (1961), Lunar (1971)
Current: …
Androutsopoulos I, Ritchie G D, Thanisch P. Natural language interfaces to databases – an introduction[J]. Natural Language Engineering, 1995, 1(1): 29-81.

NLI over Database – Early: example

COUNTRY   CAPITAL   LANGUAGE
France    Paris     French
Italy     Rome      Italian
…

pattern1: ... "capital" ... <country>
pattern2: ... "capital" ... "country"

"What is the capital of Italy?" → pattern1 → Rome
"List the capital of every country" → pattern2 → <Paris, France>, <Rome, Italy>
Androutsopoulos I, Ritchie G D, Thanisch P. Natural language interfaces to databases – an introduction[J]. Natural Language Engineering, 1995, 1(1): 29-81.
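The early pattern-matching approach can be sketched in a few lines of Python (a toy illustration of the slide's table and patterns; the `answer` function and its regexes are our assumptions, not the original systems' code):

```python
import re

# Toy database table from the slide.
COUNTRIES = [
    {"COUNTRY": "France", "CAPITAL": "Paris", "LANGUAGE": "French"},
    {"COUNTRY": "Italy", "CAPITAL": "Rome", "LANGUAGE": "Italian"},
]

def answer(question):
    """Match the question against hand-written patterns and read the table."""
    # pattern1: ... "capital" ... <country>  -> the capital of that country
    for row in COUNTRIES:
        if re.search(r"capital.*" + row["COUNTRY"], question, re.IGNORECASE):
            return row["CAPITAL"]
    # pattern2: ... "capital" ... "country"  -> all <capital, country> pairs
    if re.search(r"capital.*country", question, re.IGNORECASE):
        return [(r["CAPITAL"], r["COUNTRY"]) for r in COUNTRIES]
    return None
```

Pattern order matters: the country-specific pattern is tried before the generic one, mirroring the slide's pattern1/pattern2.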

NLI over Database – Current:
- Intermediate representation
- Front end / back end
- e.g. PRECISE, 2003 (http://www.cs.washington.edu/research/nli)
Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003: 149-157.

PRECISE (UW, 2003): ambiguity resolution
Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003: 149-157.

PRECISE (UW, 2003)
Question: "What are the HP jobs on a Unix system?"
SQL:
  SELECT DISTINCT Description
  FROM JOB
  WHERE Platform = 'Unix' AND Company = 'HP';
Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003: 149-157.
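The token-to-SQL mapping can be sketched as follows (a much-simplified illustration: the lexicon, schema, and `to_sql` function are assumptions; the real PRECISE reduces the mapping of tokens to schema elements to a graph-matching problem):

```python
# Toy lexicon mapping question tokens to database constraints.
LEXICON = {
    "hp": ("Company", "HP"),
    "unix": ("Platform", "Unix"),
}

def to_sql(question):
    """Collect constraints for every lexicon token found in the question."""
    constraints = []
    for token in question.lower().replace("?", "").split():
        if token in LEXICON:
            col, val = LEXICON[token]
            constraints.append(f"{col} = '{val}'")
    where = " AND ".join(sorted(constraints))   # deterministic order
    return f"SELECT DISTINCT Description FROM Job WHERE {where}"
```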

QA over text
Input: NL queries (natural-language questions)
Resources:
- Plain text: large, uncontrolled text / Web pages
- Weak knowledge: e.g. WordNet, gazetteers
Problems:
- Semantic meaning of NL questions and text sentences
- Logical reasoning
- Errors, redundancy, and ambiguity in text
- …
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2): 125-155.

QA over text – Question types:
- Factoid: "How many people live in Israel?"
- List questions: "What countries speak Spanish?"
- Definition questions: "What is a question answering system?"
- Complex interactive QA (ciQA): "What [familial ties] exist between [dinosaurs] and [birds]?"
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2): 125-155.

QA over text – START
START: SynTactic Analysis using Reversible Transformations
- The world's first Web-based QA system (1993)
- Developed by the AI Lab, MIT
- http://start.csail.mit.edu/
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2): 125-155.

QA over text – START
(Screenshots: Step 1 → Step 2 → Answer)
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2): 125-155.

QA over text – Architecture
- Query analysis: preprocessing and light (or deep) semantic analysis; NL query → Query object
- Retrieval engine: IR engine (over documents) / search engine (over the Web); Query object → passage list (containing candidate answers)
- Answer generator: filtering and merging of candidate passages; passage list → answer
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2): 125-155.
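The three-stage pipeline can be sketched as follows (all class and function names are illustrative assumptions, not any particular system's API):

```python
from dataclasses import dataclass, field

@dataclass
class QueryObject:
    question: str
    keywords: list = field(default_factory=list)

def query_analysis(question):
    """Query analysis: NL query -> Query object (naive keyword extraction)."""
    stop = {"what", "is", "the", "of", "in", "a", "an"}
    words = [w.strip("?.").lower() for w in question.split()]
    return QueryObject(question, [w for w in words if w not in stop])

def retrieval_engine(qobj, corpus):
    """Retrieval engine: Query object -> passages containing candidates."""
    return [p for p in corpus if any(k in p.lower() for k in qobj.keywords)]

def answer_generator(qobj, passages):
    """Answer generator: passage list -> answer (highest keyword overlap)."""
    return max(passages,
               key=lambda p: sum(k in p.lower() for k in qobj.keywords),
               default=None)
```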

QA over text – Architecture
(Figures: CMU JAVELIN in TREC11, 2002 (left); IBM PIQUANT II in TREC13, 2004 (right))
Nyberg E, Mitamura T, Carbonell J G, et al. The javelin question-answering system at trec 2002[J]. Computer Science Department, 2002: 322. (left) Ittycheriah A, Franz M, Zhu W J, et al. IBM's Statistical Question Answering System[C]//TREC. 2000. (right)

QA over text – Question analysis
Goal: generate the Query object (constraints on answer finding)
Query object:
- Keywords → extended keywords
- Question type (from a question type hierarchy)
- Expected answer type
Methods: tokenization, POS tagging, parsing, named-entity (NE) recognition, relation extraction, …
Examples: LASSO, FALCON
Nyberg E, Mitamura T, Carbonell J G, et al. The javelin question-answering system at trec 2002[J]. Computer Science Department, 2002: 322.

LASSO – Question analysis
SMU, in TREC8 (1999): 55% short answer, 64.5% long answer; keyword drop
Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: A Tool for Surfing the Answer Net[C]//TREC. 1999, 8: 65-73.

LASSO – Question analysis
"What is the largest city in Germany?"
- Question type: WHAT
- Question focus: "largest city"
- Answer type: LOCATION
- Keywords: …
Flow: NL question → wh-term → question type
- Unambiguous wh-terms: who/whom, where, when, why
- Ambiguous wh-terms: what, how, which, name (v.)
The question type combines with the question focus to give the answer type; keywords are then extracted.
Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: A Tool for Surfing the Answer Net[C]//TREC. 1999, 8: 65-73.

LASSO – Question analysis
"What is the largest city in Germany?"
- Question type: WHAT; question focus: "largest city"; answer type: LOCATION
Keyword selection: modifier nouns (candidates for keyword drop), named entities, quotations
Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: A Tool for Surfing the Answer Net[C]//TREC. 1999, 8: 65-73.
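The LASSO-style analysis above can be rendered as a toy sketch (the `analyze` function and the tiny focus → answer-type table are assumptions for illustration, not LASSO's actual rules):

```python
# Unambiguous wh-terms directly determine the answer type; ambiguous ones
# (what/how/which/name) fall back to the question focus.
UNAMBIGUOUS = {"who": "PERSON", "whom": "PERSON", "where": "LOCATION",
               "when": "DATE", "why": "REASON"}
FOCUS_TO_TYPE = {"city": "LOCATION", "person": "PERSON", "year": "DATE"}

def analyze(question):
    words = question.strip("?").lower().split()
    wh = words[0]
    if wh in UNAMBIGUOUS:                       # who/whom/where/when/why
        return {"question_type": wh.upper(), "answer_type": UNAMBIGUOUS[wh]}
    # Ambiguous wh-term: use the question focus to pick the answer type.
    focus = next((w for w in words if w in FOCUS_TO_TYPE), None)
    return {"question_type": wh.upper(), "focus": focus,
            "answer_type": FOCUS_TO_TYPE.get(focus, "UNKNOWN")}
```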

FALCON – Question analysis
Best of TREC9 (2000): 58% short answer, 76% long answer
Flow: question caching (hit → cached answers; miss → question reformulation) → parse + NE recognition → answer type (via WordNet) → keywords → question semantic form
Harabagiu S M, Moldovan D I, Pasca M, et al. FALCON: Boosting Knowledge for Answer Engines[C]//TREC. 2000, 9: 479-488.

FALCON – Question analysis
Question similarity: similarity matrix with transitive closure
Harabagiu S M, Moldovan D I, Pasca M, et al. FALCON: Boosting Knowledge for Answer Engines[C]//TREC. 2000, 9: 479-488.
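The caching step can be sketched like this (Jaccard word overlap stands in for FALCON's similarity matrix; the threshold and all names are illustrative assumptions):

```python
def similarity(q1, q2):
    """Jaccard overlap between the word sets of two questions."""
    a = set(q1.lower().strip("?").split())
    b = set(q2.lower().strip("?").split())
    return len(a & b) / len(a | b)

class QuestionCache:
    def __init__(self, threshold=0.6):
        self.entries = {}              # question -> cached answer
        self.threshold = threshold

    def lookup(self, question):
        for cached_q, ans in self.entries.items():
            if similarity(question, cached_q) >= self.threshold:
                return ans             # cache hit: reuse the cached answer
        return None                    # cache miss: run full question analysis

    def store(self, question, answer):
        self.entries[question] = answer
```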

QA over text – Retrieval Engine
Input: Query object
Output: passage list (containing candidate answers)
Example: JAVELIN
Nyberg E, Mitamura T, Carbonell J G, et al. The javelin question-answering system at trec 2002[J]. Computer Science Department, 2002: 322.

JAVELIN – Retrieval Engine
Input: Query object (question type + answer type + keyword set)
Relax algorithm:
  POS-tag each passage → get word data types
  index pairs <passage word, data type>
  while time && space constraints not exceeded:
    retrieve with the keyword set   // compare words and data types between passage and question
    add matches to the candidate passage list
    relax the keyword set           // add synonym keywords
  return candidate passage list
Nyberg E, Mitamura T, Carbonell J G, et al. The javelin question-answering system at trec 2002[J]. Computer Science Department, 2002: 322.
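A runnable sketch of the relax loop (the synonym table and round limit are assumptions; JAVELIN uses WordNet expansions under time/space constraints):

```python
# Toy synonym table; JAVELIN derives expansions from WordNet.
SYNONYMS = {"city": ["town", "metropolis"]}

def relax_retrieve(keywords, passages, want=1, max_rounds=3):
    """Retrieve passages matching the keyword set; if too few are found,
    relax the set by adding synonym keywords and retry."""
    kwset = {k.lower() for k in keywords}
    candidates = []
    for _ in range(max_rounds):        # stands in for time/space constraints
        candidates = [p for p in passages if any(k in p.lower() for k in kwset)]
        if len(candidates) >= want:
            break
        for kw in list(kwset):         # relax: add synonym keywords
            kwset.update(SYNONYMS.get(kw, []))
    return candidates
```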

FDUQA – Retrieval Engine
Dealing with list questions: "Name cities that have an Amtrak terminal."
- Treat it as a factoid question first; this yields "New York"
- Use "New York" as a seed appearing at <A> in the patterns:
  P1. (including|include|included) (<A>)+ and <A>
  P2. such as (<A>)+ and <A>
  P3. between <A> and <A>
  P4. (<A>)+ as well as <A>
- Matched sentence: "Preliminary plans by Amtrak that were released yesterday call for stops of its high-speed express service in Boston, Providence, R.I., New Haven, Conn., New York, Philadelphia, Baltimore and Washington."
- Type validation yields "Boston", "Philadelphia", "Baltimore", "Washington"
Wu L, Huang X, You L, et al. FDUQA on TREC2004 QA Track[C]//Proceedings of the Thirteenth Text REtrieval Conference (TREC 2004). 2004.
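The seed-pattern idea can be sketched with a regular expression (a loose rendering of the slide's enumeration patterns; the "in"-anchored regex and the `harvest` function are assumptions, not FDUQA's implementation). Non-city items such as "R.I." and "Conn." would then be removed by the type-validation step:

```python
import re

def harvest(seed, sentence):
    """If the sentence contains an enumeration 'A, B, ... and Z' that
    includes the seed answer, return the other items as new candidates."""
    m = re.search(r"\bin ((?:[^,]+, )+[^,]+ and [^.]+)", sentence)
    if not m or seed not in m.group(1):
        return []
    items = re.split(r", | and ", m.group(1))
    return [i.strip() for i in items if i.strip() and i.strip() != seed]
```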

QA over text – Answer Generator
Input: Query object (from question analysis) + passage list
Generate the answer by:
- Merge: combine candidates that contain different parts of the answer
- Filter: drop redundant and wrong answers
- Rank: order the remaining candidates
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2): 125-155.

PIQUANT II – Answer Generator
Input passage lists come from two sources:
- Statistical-method candidates: wide coverage, robust to redundancy
- Template-method candidates: precise, but sometimes fail
Strategy: take the top 5 statistical-method candidates and merge them into the template candidates
Chu-Carroll J, Czuba K, Prager J M, et al. IBM's PIQUANT II in TREC 2004[C]//TREC. 2004.

JAVELIN – Answer Generator
Input: candidate documents containing candidate passages; passages within one document cover the same topic
Strategy: merge the top-ranked passage of each document
Nyberg E, Mitamura T, Carbonell J G, et al. The javelin question-answering system at trec 2002[J]. Computer Science Department, 2002: 322.
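The merge strategy is a one-liner (the data shape — one rank-sorted passage list per document — is an illustrative assumption):

```python
# JAVELIN merge strategy from the slide: passages within one document cover
# the same topic, so keep only the top-ranked passage of each document.
def merge_top_passages(documents):
    """documents: one list of passages per document, best-ranked first."""
    return [passages[0] for passages in documents if passages]
```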

Reference:
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2): 125-155.
Androutsopoulos I, Ritchie G D, Thanisch P. Natural language interfaces to databases – an introduction[J]. Natural Language Engineering, 1995, 1(1): 29-81. (NLIDB)
Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003: 149-157. (PRECISE)

Reference:
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322. (CMU JAVELIN)
Ittycheriah A, Franz M, Zhu W J, et al. IBM's Statistical Question Answering System[C]//TREC. 2000. (IBM PIQUANT)
Chu-Carroll J, Czuba K, Prager J M, et al. IBM's PIQUANT II in TREC 2004[C]//TREC. 2004. (IBM PIQUANT II)
Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: A Tool for Surfing the Answer Net[C]//TREC. 1999, 8: 65-73. (SMU LASSO)

Reference:
Harabagiu S M, Moldovan D I, Pasca M, et al. FALCON: Boosting Knowledge for Answer Engines[C]//TREC. 2000, 9: 479-488. (FALCON)
Wu L, Huang X, You L, et al. FDUQA on TREC2004 QA Track[C]//Proceedings of the Thirteenth Text REtrieval Conference (TREC 2004). 2004. (FDU QA)

Thank you