1
A Basic Q/A System: Passage Retrieval
2
Outline
  Query Expansion
  Document Ranking
  Passage Retrieval
  Passage Re-ranking
3
Query Expansion
  Two different methods:
  Target Concatenation
    ○ Add the target for each question to the end of the question.
  Deletion/Addition
    ○ Deletion of wh-words + function words
    ○ Addition of synonyms and hypernyms (via WordNet)
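The two reformulation methods can be sketched roughly as follows. This is a minimal illustration, not the presenters' code: the word lists below are small placeholders (the slides do not give the actual lists), and the function names are hypothetical.

```python
import re

# Placeholder word lists for illustration; the real lists are not given in the slides.
WH_WORDS = {"who", "whom", "whose", "what", "which", "when", "where", "why", "how"}
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "at", "to", "for", "by",
                  "is", "was", "are", "were", "did", "does", "do"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs (this also drops '?')."""
    return re.findall(r"[a-z0-9]+", text.lower())


def target_concatenation(question, target):
    """Method 1: append the target to the end of the question."""
    return f"{question} {target}"


def delete_low_content(question):
    """Method 2 (deletion step): remove wh-words and function words, keeping order."""
    drop = WH_WORDS | FUNCTION_WORDS
    return [tok for tok in tokenize(question) if tok not in drop]
```

For example, with a made-up question and target, `target_concatenation("When was the band formed?", "Limp Bizkit")` simply yields the question followed by the target, while `delete_low_content("When was the band formed?")` keeps only the content terms.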
4
Query Expansion
  Deletion
  Item                 Freq.
  Function words       144
  Q-words              7
  Low content verbs    30
  Question Mark        1
                       181
5
Query Expansion
  Addition
  Synonyms
  Hypernyms
    ○ First Ancestor
  Morphological variants
    ○ WordNet as thesaurus: wordnet.morphy
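The addition step might look something like the sketch below. To keep it self-contained, a tiny hand-written lexicon stands in for WordNet; its entries, and the names `TOY_LEXICON`, `expand_term`, and `expand_query`, are hypothetical. In a real run the base form, synonyms, and hypernyms would presumably come from WordNet lookups (the slide mentions `wordnet.morphy` for morphological variants).

```python
# Hypothetical toy lexicon standing in for WordNet; entries are illustrative only.
TOY_LEXICON = {
    "formed": {"base": "form", "synonyms": ["constitute"],
               "hypernyms": ["make", "create"]},
    "band": {"synonyms": ["ensemble"],
             "hypernyms": ["musical_organization", "organization"]},
}


def expand_term(term, lexicon=TOY_LEXICON):
    """Return the term plus its base form, synonyms, and first hypernym only."""
    entry = lexicon.get(term, {})
    candidates = [term]
    if "base" in entry:
        candidates.append(entry["base"])            # morphological variant
    candidates.extend(entry.get("synonyms", []))    # synonyms
    candidates.extend(entry.get("hypernyms", [])[:1])  # first ancestor only
    return list(dict.fromkeys(candidates))          # dedupe, keep order


def expand_query(terms, lexicon=TOY_LEXICON):
    """Expand every term and merge the results without duplicates."""
    expanded = []
    for term in terms:
        expanded.extend(expand_term(term, lexicon))
    return list(dict.fromkeys(expanded))
```

Restricting hypernyms to the first ancestor keeps the expanded query from growing too quickly; terms missing from the lexicon pass through unchanged.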
6
Document Retrieval
  Using Indri/Lemur
  Ran both query reformulation/expansion approaches through the software.
  Took the top 50 documents per query.
7
Passage Retrieval
  Used Indri/Lemur
  Took the top passage from each of the top 50 documents for each query.
  Query grammar
    #combine[passageWIDTH:INC]
  Default for system: 120 terms, 1000 terms window
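A small helper for building such a passage query string could look like this. It only formats text in the `#combine[passageWIDTH:INC]( ... )` form shown on the slide; the function name is hypothetical, and the width and increment in the usage note are example values, not settings confirmed by the slides.

```python
def passage_query(terms, width, increment):
    """Format an Indri-style passage query: #combine[passageWIDTH:INC](terms)."""
    if width <= 0 or increment <= 0:
        raise ValueError("width and increment must be positive")
    if not terms:
        raise ValueError("at least one query term is required")
    return f"#combine[passage{width}:{increment}]({' '.join(terms)})"
```

Calling `passage_query(["band", "formed"], width=1000, increment=500)` produces `#combine[passage1000:500](band formed)`, which would then be passed to the retrieval engine.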
8
Passage Re-ranking
  Modified the window size: 500, 1000 terms
  Modified the number of top passages taken from the top 50 documents: 1, 5, 10, 20, 25 passages
9
Evaluation
  Document ranking
  Note: All results based on TREC-2004
  QE Approach              MAP
  Target Concatenation     0.3223
  Subtraction + WordNet    0.2381
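For reference, mean average precision (MAP) can be computed as in the sketch below. This is the standard textbook definition, not the presenters' evaluation script; the function names and document IDs are made up for illustration.

```python
def average_precision(ranked_doc_ids, relevant):
    """Average of precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Dividing by the number of relevant documents penalizes ones never retrieved.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_doc_ids, relevant_set) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

With relevant documents at ranks 1 and 3 of a four-document ranking, average precision is (1/1 + 2/3) / 2; MAP averages that value over all queries.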
10
Evaluation
  Passage Retrieval
  QE Approach              Type       MRR
  Target Concatenation     Strict     0.195439095783
                           Lenient    0.392501775644
  Subtraction + WordNet    Strict     0.180698539579
                           Lenient    0.341009813194
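The strict and lenient scores can be illustrated with a short sketch. The slides do not define the two modes, so this assumes the usual TREC-style convention: lenient counts a passage as correct if it matches the answer pattern, while strict additionally requires the passage to come from a document judged as supporting the answer. Function names, document IDs, and passages are made up.

```python
import re


def reciprocal_rank(passages, pattern, supporting_docs=None):
    """passages: ranked list of (doc_id, text) pairs.

    Lenient scoring when supporting_docs is None; strict scoring (assumed
    convention) when a set of supporting document IDs is given.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    for rank, (doc_id, text) in enumerate(passages, start=1):
        if not regex.search(text):
            continue
        if supporting_docs is not None and doc_id not in supporting_docs:
            continue
        return 1.0 / rank
    return 0.0  # no correct passage retrieved


def mean_reciprocal_rank(rr_values):
    """Average the per-question reciprocal ranks."""
    return sum(rr_values) / len(rr_values) if rr_values else 0.0
```

Because every strict hit is also a lenient hit, the lenient score for a question can never be lower than the strict one, which matches the pattern in the table above.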
11
Evaluation
  Passage re-ranking: Top N passages
  N     Type       MRR
  1     Strict     0.117647058824
        Lenient    0.313725490196
  5     Strict     0.183088235294
        Lenient    0.375408496732
  10    Strict     0.190662931839
        Lenient    0.386188920012
  20    Strict     0.193482690336
        Lenient    0.390536326633
  25    Strict     0.194581567305
        Lenient    0.391579287296
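One plausible reading of why the scores rise with N is sketched below: if only the top N passages are kept, a question whose first correct passage sits below rank N contributes zero. This is an illustration under that assumption, with made-up ranks, not a reconstruction of the actual experiment.

```python
def rr_at_cutoff(first_correct_rank, n):
    """Reciprocal rank when only the top n passages are kept.

    first_correct_rank is the 1-based rank of the first correct passage,
    or None if no correct passage was retrieved at all.
    """
    if first_correct_rank is None or first_correct_rank > n:
        return 0.0
    return 1.0 / first_correct_rank


def mrr_at_cutoff(first_correct_ranks, n):
    """Mean reciprocal rank over all questions at cutoff n."""
    if not first_correct_ranks:
        return 0.0
    return sum(rr_at_cutoff(r, n) for r in first_correct_ranks) / len(first_correct_ranks)
```

Under this reading, raising N can only add terms of size 1/rank, so the gains shrink quickly at larger N, consistent with the small differences between N = 10, 20, and 25 in the table.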
12
Evaluation
  Passage Re-ranking: Window Size
  Window Size    Type       MRR
  1000           Strict     0.195439095783
                 Lenient    0.392501775644
  500            Strict     0.209317276517
                 Lenient    0.383193743722
  100            Strict     0.340829170969
                 Lenient    0.48166863823
13
Conclusions
  “Less is Better”… for the most part.
  Query Expansion was not beneficial in improving passage retrieval.
  Smaller window size contributed to higher scores.
  Not the case for the top N passages though
    ○ Fewer passages resulted in lower scores
    ○ Mainly because of fewer passages to work with
14
Issues and Future Improvements
  Run times
  Poor performance times for the “addition/subtraction” query expansion approach
  Too broad of a query
    ○ Reduce the number of hypernyms/synonyms
  Limited documents
  Only did 50, could have done more
  Same with passages
15
Issues and Future Improvements
  Query Grammar
  Change it to assist in passage re-ranking
  Examples
    ○ #score
    ○ passage length
    ○ different weights for different terms
16
Readings
  Query Expansion/Reformulation
    Kwok, Etzioni, and Weld, 2001
    Lin, 2007
    Fang, 2008
    Aktolga et al., 2011
  Passage Retrieval
    Tiedemann et al., 2008
    Indri/Lemur documentation
17
Explorations
  CELEX
    English, Dutch, German lexical resource
    Beneficial for adding derivational variants
  Sepia
    MIT-developed semantic system
    Semantic parsing for named entities
  Both not available online
  Query Expansion Techniques for Question Answering, by Matthew W. Bilotti