A Basic Q/A System: Passage Retrieval

Outline
Query Expansion
Document Ranking
Passage Retrieval
Passage Re-ranking

Query Expansion
Two different methods:
Target Concatenation
  ○ Add the target for each question to the end of the question.
Deletion/Addition
  ○ Deletion of wh-words and function words
  ○ Addition of synonyms and hypernyms (via WordNet)
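
As a rough illustration of target concatenation, the sketch below appends a target string to a question. The function name and the sample question and target are hypothetical and are not taken from the system described here.

```python
def concatenate_target(question: str, target: str) -> str:
    """Append the target to the end of the question (hypothetical helper)."""
    question = question.strip()
    target = target.strip()
    # Nothing to append when no target is given
    if not target:
        return question
    return f"{question} {target}"


# Illustrative input only; not a query from the actual runs
expanded = concatenate_target("When was the comet discovered?", "Hale Bopp comet")
```

Here `expanded` holds the original question followed by the target, which is then sent to the retrieval engine as the query.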

Query Expansion
Deletion
Item                 Freq.
Function words       144
Q-words              7
Low content verbs    30
Question mark        1
                     181
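
The deletion step can be sketched as a simple token filter. The word lists below are tiny illustrative stand-ins, not the lists counted in the table above, and the function name is hypothetical.

```python
import re
from typing import List

# Small illustrative lists (assumptions); the real lists are not reproduced here.
FUNCTION_WORDS = {"the", "of", "in", "a", "an", "to", "for", "on"}
Q_WORDS = {"who", "what", "when", "where", "why", "which", "how"}
LOW_CONTENT_VERBS = {"is", "was", "are", "were", "do", "does", "did", "be"}
STOP_ITEMS = FUNCTION_WORDS | Q_WORDS | LOW_CONTENT_VERBS


def delete_low_content(question: str) -> List[str]:
    """Drop question marks, wh-words, function words, and low-content verbs."""
    # The regex keeps word-like tokens only, so '?' is discarded here
    tokens = re.findall(r"[A-Za-z0-9']+", question)
    return [t for t in tokens if t.lower() not in STOP_ITEMS]


remaining = delete_low_content("When was the comet discovered?")
```

With these toy lists, only the content-bearing tokens of the question survive and are passed on to the addition step.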

Query Expansion
Addition
Synonyms
Hypernyms
  ○ First ancestor
Morphological variants
  ○ WordNet as thesaurus: wordnet.morphy
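
A minimal sketch of the addition step follows, assuming a lookup table that returns synonyms and hypernyms for a term. The toy lexicon stands in for WordNet so that the example runs without external data; the function name and the lexicon layout are assumptions, and morphological variants are not covered.

```python
from typing import Dict, List

# Toy stand-in for a WordNet lookup (assumption): term -> synonyms and hypernyms
TOY_LEXICON: Dict[str, Dict[str, List[str]]] = {
    "car": {"synonyms": ["auto", "automobile"], "hypernyms": ["motor_vehicle", "vehicle"]},
}


def expand_terms(terms: List[str], lexicon: Dict[str, Dict[str, List[str]]] = TOY_LEXICON) -> List[str]:
    """Add synonyms and the first-ancestor hypernym for each term, without duplicates."""
    expanded: List[str] = []
    for term in terms:
        entry = lexicon.get(term.lower(), {})
        # Only the first hypernym is kept, mirroring the "first ancestor" choice
        candidates = [term] + entry.get("synonyms", []) + entry.get("hypernyms", [])[:1]
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


query_terms = expand_terms(["car", "invented"])
```

Terms missing from the lexicon pass through unchanged. Limiting hypernyms to the first ancestor keeps the query from growing too broad.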

Document Retrieval
Using Indri/Lemur
Ran both query reformulation/expansion approaches through the software.
Took the top 50 documents per query.

Passage Retrieval
Used Indri/Lemur
Took the top passage from each of the top 50 documents for each query.
Query grammar
  ○ #combine[passageWIDTH:INC]
Default for system: 120 terms, 1000 terms window
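
To make the query grammar concrete, the sketch below builds an Indri passage query string of the form `#combine[passageWIDTH:INC](...)`. The helper name is hypothetical, and the width and increment values in the example are illustrative rather than the settings used in these runs.

```python
from typing import List


def passage_query(terms: List[str], width: int, increment: int) -> str:
    """Build an Indri #combine query restricted to fixed-width passages.

    width is the passage length in terms; increment is how far the
    passage window advances between candidate passages.
    """
    if width <= 0 or increment <= 0:
        raise ValueError("width and increment must be positive")
    if not terms:
        raise ValueError("at least one query term is required")
    return "#combine[passage{}:{}]({})".format(width, increment, " ".join(terms))


# Illustrative values only
query = passage_query(["comet", "discovered"], width=120, increment=60)
```

The resulting string can be placed in an Indri query parameter file or passed to the query runner in place of a plain `#combine(...)` query.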

Passage Re-ranking
Modified the window size: 500, 1000 terms
Modified the number of top passages taken from the top 50 documents: 1, 5, 10, 20, 25 passages
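
One possible reading of "top N passages per document" is sketched below: keep the N best-scoring passages from each document, then sort everything that was kept by score. The tuple layout and function name are assumptions, and the actual re-ranking procedure may have differed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Passage = Tuple[str, str, float]  # (doc_id, passage_text, score)


def top_passages_per_document(passages: Iterable[Passage], n: int) -> List[Passage]:
    """Keep the n highest-scoring passages per document, ranked by score overall."""
    by_doc: Dict[str, List[Passage]] = defaultdict(list)
    for passage in passages:
        by_doc[passage[0]].append(passage)

    kept: List[Passage] = []
    for doc_passages in by_doc.values():
        # Best passages first within each document
        doc_passages.sort(key=lambda p: p[2], reverse=True)
        kept.extend(doc_passages[:n])

    # Final ranking across all documents
    kept.sort(key=lambda p: p[2], reverse=True)
    return kept


sample = [("d1", "a", 0.9), ("d1", "b", 0.5), ("d2", "c", 0.7), ("d1", "e", 0.8)]
ranked = top_passages_per_document(sample, n=2)
```

Raising `n` gives the ranker more candidate passages, which is the lever varied in the experiments (1, 5, 10, 20, 25).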

Evaluation
Document ranking
Note: All results based on TREC-2004
QE Approach               MAP
Target Concatenation
Subtraction + WordNet     0.2381
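
For reference, mean average precision (MAP) can be computed as in the sketch below. This is the standard definition, not the evaluation script used for the numbers above, and the document IDs are made up.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def average_precision(ranked_ids: Sequence[str], relevant_ids: Iterable[str]) -> float:
    """Average of precision values at each rank where a relevant document appears."""
    relevant: Set[str] = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Relevant documents that were never retrieved contribute zero
    return total / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[str], Iterable[str]]]) -> float:
    """Mean of average precision over all queries."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


score = mean_average_precision([
    (["d1", "d2", "d3"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d2", "d1"], {"d1"}),              # AP = 1/2
])
```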

Evaluation
Passage Retrieval
QE Approach               Type       MRR
Target Concatenation      Strict
                          Lenient
Subtraction + WordNet     Strict
                          Lenient
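
Mean reciprocal rank (MRR) can be sketched as follows. The slides do not define "strict" and "lenient"; the example assumes the common TREC-style convention in which a strict judgment requires a matching answer from a supporting document, while a lenient judgment requires only the matching answer. All names and data are illustrative.

```python
from typing import List, Sequence, Tuple

# (answer_matches, document_supports) for each ranked passage; assumed layout
Judged = Tuple[bool, bool]


def is_correct(judged: Judged, strict: bool) -> bool:
    """Strict needs a match in a supporting document; lenient needs only a match."""
    matches, supports = judged
    return matches and (supports or not strict)


def reciprocal_rank(passages: Sequence[Judged], strict: bool) -> float:
    """1 / rank of the first correct passage, or 0 if none is correct."""
    for rank, judged in enumerate(passages, start=1):
        if is_correct(judged, strict):
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(questions: List[Sequence[Judged]], strict: bool) -> float:
    """Average reciprocal rank over all questions."""
    if not questions:
        return 0.0
    return sum(reciprocal_rank(q, strict) for q in questions) / len(questions)


questions = [
    [(True, False), (True, True)],    # lenient hit at rank 1, strict hit at rank 2
    [(False, False), (False, True)],  # no correct passage
]
strict_mrr = mean_reciprocal_rank(questions, strict=True)
lenient_mrr = mean_reciprocal_rank(questions, strict=False)
```

Under this convention the lenient score can never be lower than the strict score for the same ranking.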

Evaluation
Passage re-ranking: Top N passages
N     Type       MRR
1     Strict
      Lenient
      Strict
      Lenient
      Strict
      Lenient
      Strict
      Lenient
      Strict
      Lenient

Evaluation
Passage Re-ranking: Window Size
Window Size     Type       MRR
1000            Strict
                Lenient
                Strict
                Lenient
                Strict
                Lenient

Conclusions
“Less is Better”… for the most part.
Query expansion did not improve passage retrieval.
Smaller window sizes contributed to higher scores.
Not the case for the top N passages, though
  ○ Fewer passages resulted in lower scores
  ○ Mainly because there were fewer passages to work with

Issues and Future Improvements
Run times
Slow run times for the “addition/subtraction” query expansion approach
Too broad a query
  ○ Reduce the number of hypernyms/synonyms
Limited documents
Only used the top 50 documents; could have used more
Same with passages

Issues and Future Improvements
Query Grammar
Change it to assist in passage re-ranking
Examples
  ○ #score
  ○ passage length
  ○ different weights for different terms
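
As one way to explore "different weights for different terms", the sketch below builds a query using Indri's `#weight` operator, which takes alternating weights and terms. The helper name and the weights are hypothetical; how to choose good weights is left open.

```python
from typing import List, Tuple


def weighted_query(weighted_terms: List[Tuple[float, str]]) -> str:
    """Build an Indri #weight query from (weight, term) pairs."""
    if not weighted_terms:
        raise ValueError("at least one weighted term is required")
    parts = []
    for weight, term in weighted_terms:
        if weight < 0:
            raise ValueError("weights must be non-negative")
        parts.append("{} {}".format(float(weight), term))
    return "#weight({})".format(" ".join(parts))


# Illustrative weights: original question terms count more than added synonyms
query = weighted_query([(2.0, "comet"), (2.0, "discovered"), (1.0, "found")])
```

Down-weighting expansion terms relative to the original question terms is one possible response to the "too broad a query" issue noted earlier.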

Readings
Query Expansion/Reformulation
  Kwok, Etzioni, and Weld, 2001
  Lin, 2007
  Fang, 2008
  Aktolga et al., 2011
Passage Retrieval
  Tiedemann et al., 2008
  Indri/Lemur documentation

Explorations
CELEX
  English, Dutch, German lexical resource
  Beneficial for adding derivational variants
Sepia
  MIT-developed semantic system
  Semantic parsing for named entities
Neither is available online
Query Expansion Techniques for Question Answering, by Matthew W. Bilotti