1
Traditional Question Answering Systems: An Overview
汤顺雷
2
Category
- NLIDB (Natural Language Interface to Databases)
- QA over text
  - Document-based: TREC, given a corpus
  - Web-based: START (AI Lab, MIT, 1993)
- Semantic ontology-based
- …
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
3
NLI over Database
Early:
- Domain-specific
- Pattern matching
- e.g. BASEBALL (1961), LUNAR (1971)
Current: …
Androutsopoulos I, Ritchie G D, Thanisch P. Natural language interfaces to databases – an introduction[J]. Natural Language Engineering, 1995, 1(1).
4
NLI over Database
Early approach, example:

  COUNTRY  CAPITAL  LANGUAGE
  France   Paris    French
  Italy    Rome     Italian
  …

Patterns:
  pattern1: ... "capital" ... <country>
  pattern2: ... "capital" ... "country"

"What is the capital of Italy?"     → pattern1 → Rome
"List the capital of every country" → pattern2 → <Paris, France>, <Rome, Italy>

Androutsopoulos I, Ritchie G D, Thanisch P. Natural language interfaces to databases – an introduction[J]. Natural Language Engineering, 1995, 1(1).
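To make the early pattern-matching approach concrete, here is a minimal Python sketch over the toy table above (my illustration; the patterns and lookup are assumptions, not code from the paper):

import re

# Toy database table from the slide.
CAPITALS = {"France": "Paris", "Italy": "Rome"}

def answer(question: str):
    # pattern1: ... "capital" ... <country>  -> look up a single row
    for country, capital in CAPITALS.items():
        if re.search(rf'capital.*{country}', question, re.IGNORECASE):
            return capital
    # pattern2: ... "capital" ... "country" (the literal word) -> enumerate rows
    if re.search(r'capital.*country', question, re.IGNORECASE):
        return [(capital, country) for country, capital in CAPITALS.items()]
    return None

print(answer("What is the capital of Italy?"))       # Rome
print(answer("List the capital of every country"))   # [('Paris', 'France'), ('Rome', 'Italy')]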
5
NLI over Database
Current:
- Intermediate representation
- Front end / back end
- e.g. PRECISE (2003)
Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003.
6
PRECISE (UW, 2003)
- Ambiguity resolution
Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003.
7
PRECISE (UW, 2003)
Question: What are the HP jobs on a Unix system?
SQL:
  SELECT DISTINCT Description
  FROM JOB
  WHERE Platform = 'Unix' AND Company = 'HP';
Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003.
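PRECISE itself resolves token-to-database-element ambiguity with a graph-matching (max-flow) formulation; the much simpler Python sketch below only illustrates the general token-to-SQL idea. The lexicon and to_sql function are my assumptions, not PRECISE's API:

# Hypothetical lexicon mapping question tokens to database elements.
LEXICON = {
    "jobs": ("relation", "JOB"),
    "hp": ("value", ("Company", "HP")),
    "unix": ("value", ("Platform", "Unix")),
}

def to_sql(question: str) -> str:
    tokens = question.lower().replace("?", "").split()
    relation, conditions = None, []
    for tok in tokens:
        kind, elem = LEXICON.get(tok, (None, None))
        if kind == "relation":
            relation = elem            # target table
        elif kind == "value":
            attr, val = elem           # attribute/value constraint
            conditions.append(f"{attr} = '{val}'")
    return f"SELECT DISTINCT Description FROM {relation} WHERE " + " AND ".join(conditions) + ";"

print(to_sql("What are the HP jobs on a Unix system?"))
# SELECT DISTINCT Description FROM JOB WHERE Company = 'HP' AND Platform = 'Unix';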
8
QA over text
Input: NL queries (natural language questions)
Resources:
- Plain text: large, uncontrolled text / Web pages
- Weak knowledge, e.g. WordNet, gazetteers
Problems:
- Capturing the semantic meaning of NL questions and text sentences
- Logical reasoning
- Errors, redundancies, and ambiguity in text
- …
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
9
QA over text
Question types:
- Factoid: "How many people live in Israel?"
- List: "What countries speak Spanish?"
- Definition: "What is a question answering system?"
- Complex interactive QA (ciQA): "What [familial ties] exist between [dinosaurs] and [birds]?"
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
10
QA over text – START
START: SynTactic Analysis using Reversible Transformations
- The world's first Web-based QA system (1993)
- Developed by the AI Lab, MIT
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
11
QA over text – START
[Figure: START's two-step pipeline from question to answer]
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
12
QA over text – Architecture
- Query analysis: preprocessing plus light (or deep) semantic analysis.
  NL query → Query object
- Retrieval engine: IR engine (over documents) / search engine (over the Web).
  Query object → passage list (containing candidate answers)
- Answer generator: filtering and merging of candidate passages.
  Passage list → answer
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
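A hedged Python sketch of this three-stage data flow (all stage implementations are placeholder stubs; only the interfaces follow the slide):

def analyze_query(question: str) -> dict:
    """Query analysis: NL query -> Query object."""
    return {"keywords": [w.strip("?").lower() for w in question.split()]}

def retrieve(query_obj: dict, corpus: list[str]) -> list[str]:
    """Retrieval engine: Query object -> passages with keyword overlap."""
    kws = set(query_obj["keywords"])
    return [p for p in corpus if kws & set(p.lower().strip(".").split())]

def generate_answer(query_obj: dict, passages: list[str]) -> str | None:
    """Answer generator: passage list -> answer (here: first candidate)."""
    return passages[0] if passages else None

corpus = ["Berlin is the largest city in Germany."]
question = "What is the largest city in Germany?"
print(generate_answer(analyze_query(question), retrieve(analyze_query(question), corpus)))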
13
QA over text – Architecture
Examples:
- CMU JAVELIN, TREC-11 (2002)
- IBM PIQUANT II, TREC-13 (2004)
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322.
Ittycheriah A, Franz M, Zhu W J, et al. IBM's statistical question answering system[C]//TREC.
14
QA over text – Question analysis
Goal: generate a Query object that holds the constraints used in answer finding.
Query object:
- Keywords / extended keywords
- Question type (a node in a question-type hierarchy)
- Expected answer type
Methods: tokenization, POS tagging, parsing, named-entity (NE) recognition, relation extraction, …
Examples: LASSO, FALCON
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322.
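As a sketch, the Query object can be written as a small dataclass (field names are my assumptions; each system defines its own):

from dataclasses import dataclass, field

@dataclass
class QueryObject:
    keywords: list[str]                                          # from the question
    extended_keywords: list[str] = field(default_factory=list)   # e.g. WordNet synonyms
    question_type: str = "UNKNOWN"                               # node in the question-type hierarchy
    expected_answer_type: str = "UNKNOWN"                        # e.g. PERSON, LOCATION, DATE

q = QueryObject(keywords=["largest", "city", "Germany"],
                question_type="WHAT",
                expected_answer_type="LOCATION")
print(q)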
15
LASSO – Question analysis
- SMU, TREC-8 (1999)
- 55% on short answers, 64.5% on long answers
- Keyword drop
Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: a tool for surfing the answer net[C]//TREC. 1999, 8.
16
LASSO – Question analysis
Example: "What is the largest city in Germany?"
- Question type: WHAT
- Question focus: "largest city"
- Answer type: LOCATION
- Keywords: …
The wh-term of the NL question determines the question type.
- Unambiguous wh-terms (map directly to an answer type): who/whom, where, when, why
- Ambiguous wh-terms (need disambiguation): what, how, which, name (v.)
For ambiguous wh-terms, the question type is combined with the question focus to derive the answer type; see the sketch below.
Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: a tool for surfing the answer net[C]//TREC. 1999, 8.
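A hedged sketch of this answer-type assignment; the two lookup tables are illustrative stand-ins for LASSO's actual resources:

DIRECT = {"who": "PERSON", "whom": "PERSON", "where": "LOCATION",
          "when": "DATE", "why": "REASON"}
FOCUS_HEAD_TO_TYPE = {"city": "LOCATION", "country": "LOCATION",
                      "president": "PERSON", "year": "DATE"}

def answer_type(wh_term: str, focus: str) -> str:
    if wh_term in DIRECT:                    # who/whom/where/when/why: unambiguous
        return DIRECT[wh_term]
    head = focus.split()[-1]                 # what/how/which/name: use the focus head
    return FOCUS_HEAD_TO_TYPE.get(head, "UNKNOWN")

print(answer_type("what", "largest city"))  # LOCATION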
17
LASSO – Question analysis
Same example: "What is the largest city in Germany?" (question type WHAT, focus "largest city", answer type LOCATION)
Keyword selection draws on several cues: quotations, named entities, and modifier nouns; keyword drop removes keywords when the set over-constrains retrieval. A sketch follows below.
Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: a tool for surfing the answer net[C]//TREC. 1999, 8.
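A minimal sketch of prioritized keyword selection with keyword drop; the priority order (quotations over named entities over modifier nouns) is my assumption:

def select_keywords(quotations, named_entities, modifier_nouns, other_nouns):
    # Higher-priority keywords are dropped last during keyword drop.
    ranked = ([(3, k) for k in quotations] +
              [(2, k) for k in named_entities] +
              [(1, k) for k in modifier_nouns] +
              [(0, k) for k in other_nouns])
    return [k for _, k in sorted(ranked, reverse=True)]

keywords = select_keywords([], ["Germany"], ["largest"], ["city"])
print(keywords)        # ['Germany', 'largest', 'city']
print(keywords[:-1])   # after one keyword drop: ['Germany', 'largest']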
18
FALCON – Question analysis
Best of TREC-9 (2000): 58% on short answers, 76% on long answers.
Flow:
1. Question caching: if a sufficiently similar question is cached (yes), return the cached answers.
2. Otherwise (no): parse + NE recognition → answer type; WordNet expansion → keywords; build the question semantic form.
3. Question reformulation feeds unanswered questions back into the loop.
Harabagiu S M, Moldovan D I, Pasca M, et al. FALCON: boosting knowledge for answer engines[C]//TREC. 2000, 9.
19
FALCON – Question analysis
Question caching: question similarity is recorded in a similarity matrix, and its transitive closure groups equivalent questions so cached answers can be reused (sketch below).
Harabagiu S M, Moldovan D I, Pasca M, et al. FALCON: boosting knowledge for answer engines[C]//TREC. 2000, 9.
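A hedged sketch of question caching via a similarity matrix and its transitive closure: if q1~q2 and q2~q3, all three share one cache entry. The keyword-overlap similarity test is a stand-in for FALCON's actual measure:

def similar(q1: str, q2: str, threshold: float = 0.6) -> bool:
    a, b = set(q1.lower().split()), set(q2.lower().split())
    return len(a & b) / len(a | b) >= threshold

def equivalence_classes(questions: list[str]) -> list[set[int]]:
    # Build the pairwise similarity relation, then take its transitive
    # closure by merging overlapping groups (connected components).
    groups = [{i} for i in range(len(questions))]
    for i in range(len(questions)):
        for j in range(i + 1, len(questions)):
            if similar(questions[i], questions[j]):
                gi = next(g for g in groups if i in g)
                gj = next(g for g in groups if j in g)
                if gi is not gj:
                    gi |= gj
                    groups.remove(gj)
    return groups

qs = ["who invented the telephone", "who invented telephone", "when was Bell born"]
print(equivalence_classes(qs))   # [{0, 1}, {2}]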
22
QA over text – Retrieval Engine
Input: Query object
Output: passage list (containing candidate answers)
Example: JAVELIN
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322.
23
JAVELIN – Retrieval Engine
Input: Query object (question type + answer type + keyword set)
Relaxation algorithm:
  POS-tag the passages to obtain each word's data type
  index the pairs <passage word, data type>
  while time and space constraints are not exceeded:
      retrieve passages matching the keyword set
          // comparing words and data types between passage and question
      add matches to the candidate passage list
      relax the keyword set        // add synonym keywords
  return the candidate passage list
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322.
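A runnable Python sketch of the relaxation loop (the synonym table and the all-keywords match are placeholders for JAVELIN's actual index and scoring):

SYNONYMS = {"city": {"town", "metropolis"}, "largest": {"biggest"}}

def relax_retrieve(keywords, passages, max_rounds=3):
    # Each keyword starts as a singleton set of acceptable surface forms.
    alternatives = {k: {k} for k in keywords}
    candidates = []
    for _ in range(max_rounds):            # stand-in for the time/space constraints
        for p in passages:
            words = set(p.lower().split())
            # A passage matches if every keyword (or one of its synonyms) occurs.
            if all(alts & words for alts in alternatives.values()) and p not in candidates:
                candidates.append(p)
        if candidates:
            break
        for k in alternatives:              # keyword set relax: add synonym keywords
            alternatives[k] |= SYNONYMS.get(k, set())
    return candidates

print(relax_retrieve({"largest", "city"}, ["berlin is the biggest town in germany"]))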
24
FDUQA – Retrieval Engine
Dealing with list questions, e.g. "Name cities that have an Amtrak terminal."
1. Treat it as a factoid question; obtain a seed answer: "New York".
2. Use "New York" as a seed appearing at an <A> slot in the patterns:
   P1. (including|include|included) (<A>)+ and <A>
   P2. such as (<A>)+ and <A>
   P3. between <A> and <A>
   P4. (<A>)+ as well as <A>
3. Validate matched sentences, e.g.: "Preliminary plans by Amtrak that were released yesterday call for stops of its high-speed express service in Boston, Providence, R.I., New Haven, Conn., New York, Philadelphia, Baltimore and Washington."
4. Type-validate the other fillers: "Boston", "Philadelphia", "Baltimore", "Washington", …
Wu L, Huang X, You L, Zhang Z, Li X, Zhou Y. FDUQA on TREC 2004 QA Track[C]//Proceedings of the Thirteenth Text REtrieval Conference (TREC 2004).
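A hedged sketch of the seed-and-pattern expansion: anchor on the seed answer, match a coordination pattern, and harvest the other conjuncts. The regex is a simplified stand-in for patterns P1-P4:

import re

sentence = ("call for stops of its high-speed express service in Boston, "
            "Providence, New Haven, New York, Philadelphia, Baltimore and Washington.")
seed = "New York"

# P1/P2-style coordination: a comma-separated list ending in "X and Y".
m = re.search(r'in ((?:[A-Z][\w ]*?, )+)([A-Z][\w ]*?) and ([A-Z][\w ]*?)\.', sentence)
if m and seed in m.group(0):
    items = [x.strip() for x in m.group(1).split(",") if x.strip()]
    items += [m.group(2), m.group(3)]
    candidates = [c for c in items if c != seed]
    print(candidates)   # answer-type validation (LOCATION) would follow
# ['Boston', 'Providence', 'New Haven', 'Philadelphia', 'Baltimore', 'Washington']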
25
QA over text – Answer Generator
Input: Query object (from question analysis) + passage list
Generate the answer by:
- Merging: combine candidates that contain different parts of the answer
- Filtering: drop redundant and wrong answers
- Ranking: order the remaining candidates
Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
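A minimal sketch of merge/filter/rank (the frequency-based ranking and the type check are illustrative):

from collections import Counter

def generate_answer(candidates, type_check):
    counts = Counter(candidates)                                    # merge: identical candidates collapse
    filtered = {c: n for c, n in counts.items() if type_check(c)}   # filter wrong types
    return sorted(filtered, key=filtered.get, reverse=True)         # rank by frequency

candidates = ["Berlin", "Berlin", "Munich", "1990"]
print(generate_answer(candidates, type_check=lambda c: not c.isdigit()))
# ['Berlin', 'Munich']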
26
PIQUANT II – Answer Generator
Input: candidate lists from two kinds of agents:
- Statistics-based candidates: wide coverage, robust to redundancy
- Template-based candidates: precise, but sometimes fail
Strategy: take the top 5 statistics-based candidates and merge them into the template-based candidates.
Chu-Carroll J, Czuba K, Prager J M, et al. IBM's PIQUANT II in TREC 2004[C]//TREC.
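A sketch of this merge strategy, assuming both candidate lists arrive pre-ranked (the function is my illustration of the slide's description):

def merge_candidates(template_based, statistics_based, k=5):
    merged = list(template_based)           # precise candidates are kept first
    for c in statistics_based[:k]:          # top-5 statistical candidates
        if c not in merged:
            merged.append(c)
    return merged

print(merge_candidates(["Alexander Graham Bell"],
                       ["Bell", "Alexander Graham Bell", "Edison"]))
# ['Alexander Graham Bell', 'Bell', 'Edison']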
27
JAVELIN – Answer Generator
Input: candidate documents containing candidate passages, all on the same topic.
Strategy: merge the top-1 ranked passage of each document.
Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322.
28
References:
- Lopez V, Uren V, Sabou M, et al. Is question answering fit for the semantic web?: a survey[J]. Semantic Web, 2011, 2(2).
- Androutsopoulos I, Ritchie G D, Thanisch P. Natural language interfaces to databases – an introduction[J]. Natural Language Engineering, 1995, 1(1). (NLIDB)
- Popescu A M, Etzioni O, Kautz H. Towards a theory of natural language interfaces to databases[C]//Proceedings of the 8th International Conference on Intelligent User Interfaces. ACM, 2003. (PRECISE)
29
References (cont.):
- Nyberg E, Mitamura T, Carbonell J G, et al. The JAVELIN question-answering system at TREC 2002[J]. Computer Science Department, 2002: 322. (CMU JAVELIN)
- Ittycheriah A, Franz M, Zhu W J, et al. IBM's statistical question answering system[C]//TREC. (IBM PIQUANT)
- Chu-Carroll J, Czuba K, Prager J M, et al. IBM's PIQUANT II in TREC 2004[C]//TREC. (IBM PIQUANT II)
- Moldovan D I, Harabagiu S M, Pasca M, et al. LASSO: a tool for surfing the answer net[C]//TREC. 1999, 8. (SMU LASSO)
30
References (cont.):
- Harabagiu S M, Moldovan D I, Pasca M, et al. FALCON: boosting knowledge for answer engines[C]//TREC. 2000, 9. (FALCON)
- Wu L, Huang X, You L, Zhang Z, Li X, Zhou Y. FDUQA on TREC 2004 QA Track[C]//Proceedings of the Thirteenth Text REtrieval Conference (TREC 2004). (FDUQA)
31
Thank you