1
Traditional Question Answering Systems: An Overview
汤顺雷
2
Category
- NLIDB (Natural Language Interface to Databases)
- QA over text
  - Document-based: TREC, given a corpus
  - Web-based: START (AI Lab, MIT, 1993)
- Semantic ontology-based
- …
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
3
NLI over Database
Early:
- Domain-specific
- Pattern matching
- e.g. BASEBALL (1961), LUNAR (1971)
Current: …
Androutsopoulos I, Ritchie G D, Thanisch P. Natural language interfaces to databases – an introduction[J]. Natural Language Engineering, 1995, 1(1).
4
NLI over Database
Early approach, example:

  COUNTRY  CAPITAL  LANGUAGE
  France   Paris    French
  Italy    Rome     Italian
  …

Patterns:
  pattern1: ... "capital" ... <country>
  pattern2: ... "capital" ... "country"

"What is the capital of Italy?"     → pattern1 → Rome
"List the capital of every country" → pattern2 → <Paris, France>, <Rome, Italy>

Androutsopoulos I, Ritchie G D, Thanisch P. Natural language interfaces to databases – an introduction[J]. Natural Language Engineering, 1995, 1(1).
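To make the early pattern-matching approach concrete, here is a minimal Python sketch over the toy table above (my illustration; the patterns and lookup are assumptions, not code from the paper):

import re

# Toy database table from the slide.
CAPITALS = {"France": "Paris", "Italy": "Rome"}

def answer(question: str):
    # pattern1: ... "capital" ... <country>  -> look up a single row
    for country, capital in CAPITALS.items():
        if re.search(rf'capital.*{country}', question, re.IGNORECASE):
            return capital
    # pattern2: ... "capital" ... "country" (the literal word) -> enumerate rows
    if re.search(r'capital.*country', question, re.IGNORECASE):
        return [(capital, country) for country, capital in CAPITALS.items()]
    return None

print(answer("What is the capital of Italy?"))       # Rome
print(answer("List the capital of every country"))   # [('Paris', 'France'), ('Rome', 'Italy')]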
5
NLI over Database
Current:
- Intermediate representation
- Front end / back end
- e.g. PRECISE (2003)
Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003.
6
PRECISE (UW, 2003)
- Ambiguity resolution
Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003.
7
PRECISE (UW, 2003)
Question: What are the HP jobs on a Unix system?
SQL:
  SELECT DISTINCT Description
  FROM JOB
  WHERE Platform = 'Unix' AND Company = 'HP';
Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003.
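PRECISE itself resolves token-to-database-element ambiguity with a graph-matching (max-flow) formulation; the much simpler Python sketch below only illustrates the general token-to-SQL idea. The lexicon and to_sql function are my assumptions, not PRECISE's API:

# Hypothetical lexicon mapping question tokens to database elements.
LEXICON = {
    "jobs": ("relation", "JOB"),
    "hp": ("value", ("Company", "HP")),
    "unix": ("value", ("Platform", "Unix")),
}

def to_sql(question: str) -> str:
    tokens = question.lower().replace("?", "").split()
    relation, conditions = None, []
    for tok in tokens:
        kind, elem = LEXICON.get(tok, (None, None))
        if kind == "relation":
            relation = elem            # target table
        elif kind == "value":
            attr, val = elem           # attribute/value constraint
            conditions.append(f"{attr} = '{val}'")
    return f"SELECT DISTINCT Description FROM {relation} WHERE " + " AND ".join(conditions) + ";"

print(to_sql("What are the HP jobs on a Unix system?"))
# SELECT DISTINCT Description FROM JOB WHERE Company = 'HP' AND Platform = 'Unix';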
8
QA over text
Input: NL queries (natural language questions)
Resources:
- Plain text: large, uncontrolled text / Web pages
- Weak knowledge, e.g. WordNet, gazetteers
Problems:
- Capturing the semantic meaning of NL questions and text sentences
- Logical reasoning
- Errors, redundancies, and ambiguity in text
- …
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
9
QA over text
Question types:
- Factoid: "How many people live in Israel?"
- List: "What countries speak Spanish?"
- Definition: "What is a question answering system?"
- Complex interactive QA (ciQA): "What [familial ties] exist between [dinosaurs] and [birds]?"
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
10
QA over text – START
START: SynTactic Analysis using Reversible Transformations
- The world's first Web-based QA system (1993)
- Developed by the AI Lab, MIT
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
11
QA over text – START
[Figure: START's two-step pipeline from question to answer]
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
12
QA over text – Architecture
- Query analysis: preprocessing plus light (or deep) semantic analysis.
  NL query → Query object
- Retrieval engine: IR engine (over documents) / search engine (over the Web).
  Query object → passage list (containing candidate answers)
- Answer generator: filtering and merging of candidate passages.
  Passage list → answer
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
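A hedged Python sketch of this three-stage data flow (all stage implementations are placeholder stubs; only the interfaces follow the slide):

def analyze_query(question: str) -> dict:
    """Query analysis: NL query -> Query object."""
    return {"keywords": [w.strip("?").lower() for w in question.split()]}

def retrieve(query_obj: dict, corpus: list[str]) -> list[str]:
    """Retrieval engine: Query object -> passages with keyword overlap."""
    kws = set(query_obj["keywords"])
    return [p for p in corpus if kws & set(p.lower().strip(".").split())]

def generate_answer(query_obj: dict, passages: list[str]) -> str | None:
    """Answer generator: passage list -> answer (here: first candidate)."""
    return passages[0] if passages else None

corpus = ["Berlin is the largest city in Germany."]
question = "What is the largest city in Germany?"
print(generate_answer(analyze_query(question), retrieve(analyze_query(question), corpus)))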
13
QA over text – Architecture
Examples:
- CMU JAVELIN, TREC-11 (2002)
- IBM PIQUANT II, TREC-13 (2004)
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322.
Ittycheriah A, Franz M, Zhu W J, et al. IBM's statistical question answering system[C]//TREC.
14
QA over text – Question analysis
Goal: generate a Query object that holds the constraints used in answer finding.
Query object:
- Keywords / extended keywords
- Question type (a node in a question-type hierarchy)
- Expected answer type
Methods: tokenization, POS tagging, parsing, named-entity (NE) recognition, relation extraction, …
Examples: LASSO, FALCON
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322.
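As a sketch, the Query object can be written as a small dataclass (field names are my assumptions; each system defines its own):

from dataclasses import dataclass, field

@dataclass
class QueryObject:
    keywords: list[str]                                          # from the question
    extended_keywords: list[str] = field(default_factory=list)   # e.g. WordNet synonyms
    question_type: str = "UNKNOWN"                               # node in the question-type hierarchy
    expected_answer_type: str = "UNKNOWN"                        # e.g. PERSON, LOCATION, DATE

q = QueryObject(keywords=["largest", "city", "Germany"],
                question_type="WHAT",
                expected_answer_type="LOCATION")
print(q)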
15
LASSO – Question analysis
- SMU, TREC-8 (1999)
- 55% on short answers, 64.5% on long answers
- Keyword drop
Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: a tool for surfing the answer net[C]//TREC. 1999, 8.
16
LASSO – Question analysis
Example: "What is the largest city in Germany?"
- Question type: WHAT
- Question focus: "largest city"
- Answer type: LOCATION
- Keywords: …
The wh-term of the NL question determines the question type.
- Unambiguous wh-terms (map directly to an answer type): who/whom, where, when, why
- Ambiguous wh-terms (need disambiguation): what, how, which, name (v.)
For ambiguous wh-terms, the question type is combined with the question focus to derive the answer type; see the sketch below.
Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: a tool for surfing the answer net[C]//TREC. 1999, 8.
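A hedged sketch of this answer-type assignment; the two lookup tables are illustrative stand-ins for LASSO's actual resources:

DIRECT = {"who": "PERSON", "whom": "PERSON", "where": "LOCATION",
          "when": "DATE", "why": "REASON"}
FOCUS_HEAD_TO_TYPE = {"city": "LOCATION", "country": "LOCATION",
                      "president": "PERSON", "year": "DATE"}

def answer_type(wh_term: str, focus: str) -> str:
    if wh_term in DIRECT:                    # who/whom/where/when/why: unambiguous
        return DIRECT[wh_term]
    head = focus.split()[-1]                 # what/how/which/name: use the focus head
    return FOCUS_HEAD_TO_TYPE.get(head, "UNKNOWN")

print(answer_type("what", "largest city"))  # LOCATION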
17
LASSO – Question analysis
Same example: "What is the largest city in Germany?" (question type WHAT, focus "largest city", answer type LOCATION)
Keyword selection draws on several cues: quotations, named entities, and modifier nouns; keyword drop removes keywords when the set over-constrains retrieval. A sketch follows below.
Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: a tool for surfing the answer net[C]//TREC. 1999, 8.
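A minimal sketch of prioritized keyword selection with keyword drop; the priority order (quotations over named entities over modifier nouns) is my assumption:

def select_keywords(quotations, named_entities, modifier_nouns, other_nouns):
    # Higher-priority keywords are dropped last during keyword drop.
    ranked = ([(3, k) for k in quotations] +
              [(2, k) for k in named_entities] +
              [(1, k) for k in modifier_nouns] +
              [(0, k) for k in other_nouns])
    return [k for _, k in sorted(ranked, reverse=True)]

keywords = select_keywords([], ["Germany"], ["largest"], ["city"])
print(keywords)        # ['Germany', 'largest', 'city']
print(keywords[:-1])   # after one keyword drop: ['Germany', 'largest']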
18
FALCON – Question analysis
Best of TREC-9 (2000): 58% on short answers, 76% on long answers.
Flow:
1. Question caching: if a sufficiently similar question is cached (yes), return the cached answers.
2. Otherwise (no): parse + NE recognition → answer type; WordNet expansion → keywords; build the question semantic form.
3. Question reformulation feeds unanswered questions back into the loop.
Harabagiu S M, Moldovan D I, Pasca M, et al. FALCON: boosting knowledge for answer engines[C]//TREC. 2000, 9.
19
FALCON – Question analysis
Question caching: question similarity is recorded in a similarity matrix, and its transitive closure groups equivalent questions so cached answers can be reused (sketch below).
Harabagiu S M, Moldovan D I, Pasca M, et al. FALCON: boosting knowledge for answer engines[C]//TREC. 2000, 9.
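A hedged sketch of question caching via a similarity matrix and its transitive closure: if q1~q2 and q2~q3, all three share one cache entry. The keyword-overlap similarity test is a stand-in for FALCON's actual measure:

def similar(q1: str, q2: str, threshold: float = 0.6) -> bool:
    a, b = set(q1.lower().split()), set(q2.lower().split())
    return len(a & b) / len(a | b) >= threshold

def equivalence_classes(questions: list[str]) -> list[set[int]]:
    # Build the pairwise similarity relation, then take its transitive
    # closure by merging overlapping groups (connected components).
    groups = [{i} for i in range(len(questions))]
    for i in range(len(questions)):
        for j in range(i + 1, len(questions)):
            if similar(questions[i], questions[j]):
                gi = next(g for g in groups if i in g)
                gj = next(g for g in groups if j in g)
                if gi is not gj:
                    gi |= gj
                    groups.remove(gj)
    return groups

qs = ["who invented the telephone", "who invented telephone", "when was Bell born"]
print(equivalence_classes(qs))   # [{0, 1}, {2}]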
22
QA over text – Retrieval Engine
Input: Query object
Output: passage list (containing candidate answers)
Example: JAVELIN
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322.
23
JAVELIN – Retrieval Engine
Input: Query object (question type + answer type + keyword set)
Relaxation algorithm:
  POS-tag the passages to obtain each word's data type
  index the pairs <passage word, data type>
  while time and space constraints are not exceeded:
      retrieve passages matching the keyword set
          // comparing words and data types between passage and question
      add matches to the candidate passage list
      relax the keyword set        // add synonym keywords
  return the candidate passage list
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322.
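A runnable Python sketch of the relaxation loop (the synonym table and the all-keywords match are placeholders for JAVELIN's actual index and scoring):

SYNONYMS = {"city": {"town", "metropolis"}, "largest": {"biggest"}}

def relax_retrieve(keywords, passages, max_rounds=3):
    # Each keyword starts as a singleton set of acceptable surface forms.
    alternatives = {k: {k} for k in keywords}
    candidates = []
    for _ in range(max_rounds):            # stand-in for the time/space constraints
        for p in passages:
            words = set(p.lower().split())
            # A passage matches if every keyword (or one of its synonyms) occurs.
            if all(alts & words for alts in alternatives.values()) and p not in candidates:
                candidates.append(p)
        if candidates:
            break
        for k in alternatives:              # keyword set relax: add synonym keywords
            alternatives[k] |= SYNONYMS.get(k, set())
    return candidates

print(relax_retrieve({"largest", "city"}, ["berlin is the biggest town in germany"]))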
24
FDUQA – Retrieval Engine
Dealing with list questions, e.g. "Name cities that have an Amtrak terminal."
1. Treat it as a factoid question; obtain a seed answer: "New York".
2. Use "New York" as a seed appearing at an <A> slot in the patterns:
   P1. (including|include|included) (<A>)+ and <A>
   P2. such as (<A>)+ and <A>
   P3. between <A> and <A>
   P4. (<A>)+ as well as <A>
3. Validate matched sentences, e.g.: "Preliminary plans by Amtrak that were released yesterday call for stops of its high-speed express service in Boston, Providence, R.I., New Haven, Conn., New York, Philadelphia, Baltimore and Washington."
4. Type-validate the other fillers: "Boston", "Philadelphia", "Baltimore", "Washington", …
Wu L, Huang X, You L, Zhang Z, Li X, Zhou Y. FDUQA on TREC 2004 QA Track[C]//Proceedings of the Thirteenth Text REtrieval Conference (TREC 2004).
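A hedged sketch of the seed-and-pattern expansion: anchor on the seed answer, match a coordination pattern, and harvest the other conjuncts. The regex is a simplified stand-in for patterns P1-P4:

import re

sentence = ("call for stops of its high-speed express service in Boston, "
            "Providence, New Haven, New York, Philadelphia, Baltimore and Washington.")
seed = "New York"

# P1/P2-style coordination: a comma-separated list ending in "X and Y".
m = re.search(r'in ((?:[A-Z][\w ]*?, )+)([A-Z][\w ]*?) and ([A-Z][\w ]*?)\.', sentence)
if m and seed in m.group(0):
    items = [x.strip() for x in m.group(1).split(",") if x.strip()]
    items += [m.group(2), m.group(3)]
    candidates = [c for c in items if c != seed]
    print(candidates)   # answer-type validation (LOCATION) would follow
# ['Boston', 'Providence', 'New Haven', 'Philadelphia', 'Baltimore', 'Washington']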
25
QA over text – Answer Generator
Input: Query object (from question analysis) + passage list
Generate the answer by:
- Merging: combine candidates that contain different parts of the answer
- Filtering: drop redundant and wrong answers
- Ranking: order the remaining candidates
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
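A minimal sketch of merge/filter/rank (the frequency-based ranking and the type check are illustrative):

from collections import Counter

def generate_answer(candidates, type_check):
    counts = Counter(candidates)                                    # merge: identical candidates collapse
    filtered = {c: n for c, n in counts.items() if type_check(c)}   # filter wrong types
    return sorted(filtered, key=filtered.get, reverse=True)         # rank by frequency

candidates = ["Berlin", "Berlin", "Munich", "1990"]
print(generate_answer(candidates, type_check=lambda c: not c.isdigit()))
# ['Berlin', 'Munich']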
26
PIQUANT II – Answer Generator
Input: candidate lists from two kinds of agents:
- Statistics-based candidates: wide coverage, robust to redundancy
- Template-based candidates: precise, but sometimes fail
Strategy: take the top 5 statistics-based candidates and merge them into the template-based candidates.
Chu-Carroll J, Czuba K, Prager J M, et al. IBM's PIQUANT II in TREC 2004[C]//TREC.
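A sketch of this merge strategy, assuming both candidate lists arrive pre-ranked (the function is my illustration of the slide's description):

def merge_candidates(template_based, statistics_based, k=5):
    merged = list(template_based)           # precise candidates are kept first
    for c in statistics_based[:k]:          # top-5 statistical candidates
        if c not in merged:
            merged.append(c)
    return merged

print(merge_candidates(["Alexander Graham Bell"],
                       ["Bell", "Alexander Graham Bell", "Edison"]))
# ['Alexander Graham Bell', 'Bell', 'Edison']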
27
JAVELIN – Answer Generator
Input: candidate documents containing candidate passages, all on the same topic.
Strategy: merge the top-1 ranked passage of each document.
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322.
28
References:
- Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
- Androutsopoulos I, Ritchie G D, Thanisch P. Natural language interfaces to databases – an introduction[J]. Natural Language Engineering, 1995, 1(1). (NLIDB)
- Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003. (PRECISE)
29
References (cont.):
- Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322. (CMU JAVELIN)
- Ittycheriah A, Franz M, Zhu W J, et al. IBM's statistical question answering system[C]//TREC. (IBM PIQUANT)
- Chu-Carroll J, Czuba K, Prager J M, et al. IBM's PIQUANT II in TREC 2004[C]//TREC. (IBM PIQUANT II)
- Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: a tool for surfing the answer net[C]//TREC. 1999, 8. (SMU LASSO)
30
References (cont.):
- Harabagiu S M, Moldovan D I, Pasca M, et al. FALCON: boosting knowledge for answer engines[C]//TREC. 2000, 9. (FALCON)
- Wu L, Huang X, You L, Zhang Z, Li X, Zhou Y. FDUQA on TREC 2004 QA Track[C]//Proceedings of the Thirteenth Text REtrieval Conference (TREC 2004). (FDUQA)
31
Thank you