
A Basic Q/A System: Passage Retrieval

Outline
• Query Expansion
• Document Ranking
• Passage Retrieval
• Passage Re-ranking

Query Expansion
• Two different methods:
  - Target Concatenation
    ○ Add the target for each question to the end of the question.
  - Deletion/Addition
    ○ Deletion of wh-words + function words
    ○ Addition of synonyms and hypernyms (via WordNet)
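The two reformulation methods above can be sketched as follows. This is a minimal illustration, not the presenters' code: the stop lists `WH_WORDS` and `FUNCTION_WORDS` and the function names are hypothetical, since the slides do not give the actual word lists.

```python
import re

# Hypothetical stop lists for illustration only; the slides do not list the real ones.
WH_WORDS = {"who", "what", "when", "where", "which", "why", "how"}
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "to", "is"}


def tokenize(question):
    """Lowercase the question and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"[a-z0-9']+", question.lower())


def target_concatenation(question, target):
    """Method 1: append the target to the end of the question."""
    return question.strip() + " " + target.strip()


def delete_low_content(question):
    """Method 2 (deletion step): drop wh-words and function words, keep term order."""
    dropped = WH_WORDS | FUNCTION_WORDS
    return [tok for tok in tokenize(question) if tok not in dropped]


concatenated = target_concatenation("When was it discovered?", "Hale Bopp comet")
reduced = delete_low_content("Who is the author of the book?")
```

Here `concatenated` keeps the full question and simply adds the target terms, while `reduced` keeps only the content words that would then be passed on to the addition step.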

Query Expansion
• Deletion

  Item                 Freq.
  Function words       144
  Q-words              7
  Low content verbs    30
  Question Mark        1
                       181

Query Expansion
• Addition
  - Synonyms
  - Hypernyms
    ○ First Ancestor
  - Morphological variants
    ○ WordNet as thesaurus: wordnet.morphy
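The addition step can be sketched roughly as below. To keep the example self-contained, a tiny hand-made dictionary (`TOY_THESAURUS`, a hypothetical stand-in) replaces WordNet; its entries are illustrative and are not taken from the slides. Only the first hypernym ancestor is added, matching the "First Ancestor" choice above.

```python
# Toy stand-in for WordNet (an assumption for illustration):
# term -> (synonyms, hypernym chain ordered nearest ancestor first).
TOY_THESAURUS = {
    "author": (["writer"], ["communicator", "person"]),
    "book": (["volume"], ["publication", "work"]),
}


def expand_term(term, thesaurus=TOY_THESAURUS):
    """Return the term, its synonyms, and only its first hypernym ancestor."""
    synonyms, hypernyms = thesaurus.get(term, ([], []))
    expansion = [term]
    for candidate in synonyms + hypernyms[:1]:
        if candidate not in expansion:
            expansion.append(candidate)
    return expansion


def expand_query(terms, thesaurus=TOY_THESAURUS):
    """Expand every query term, preserving order and skipping duplicates."""
    expanded = []
    for term in terms:
        for candidate in expand_term(term, thesaurus):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


expanded_query = expand_query(["author", "comet"])
```

Terms missing from the thesaurus (here `comet`) pass through unchanged. Limiting hypernyms to the first ancestor is one way to keep the expanded query from growing too broad.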

Document Retrieval
• Using Indri/Lemur
• Ran both query reformulation/expansion approaches through the software.
• Took the top 50 documents per query.

Passage Retrieval
• Used Indri/Lemur
• Took the top passage from each of the top 50 documents for each query.
• Query grammar
  - #combine[passageWIDTH:INC]
  - Default for system: 120 terms, 1000 terms window
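A passage query in the `#combine[passageWIDTH:INC]` form above could be assembled with a small helper like this one. The function name `passage_query` and the width and increment values in the example are assumptions for illustration, not the settings used in the presentation.

```python
def passage_query(terms, width, increment):
    """Build an Indri passage query string of the form #combine[passageWIDTH:INC](terms)."""
    if width <= 0 or increment <= 0:
        raise ValueError("width and increment must be positive")
    if not terms:
        raise ValueError("at least one query term is required")
    joined = " ".join(terms)
    return "#combine[passage{}:{}]({})".format(width, increment, joined)


# Illustrative values only: passages of 100 terms, advancing 50 terms at a time.
query = passage_query(["comet", "discovered"], 100, 50)
```

WIDTH is the passage length in terms and INC is how far the window advances between passages, so a smaller increment produces more overlapping candidate passages.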

Passage Re-ranking
• Modified the window size
  - 500, 1000 terms
• Modified the number of top passages taken from the top 50 documents:
  - 1, 5, 10, 20, 25 passages

Evaluation
• Document ranking
• Note: All results based on TREC-2004

  QE Approach              MAP
  Target Concatenation     0.3223
  Subtraction + WordNet    0.2381
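For reference, MAP (mean average precision) for a document ranking can be computed as in the sketch below. It follows the standard definition; the document IDs and relevance judgments are made up for illustration and are not from the TREC-2004 data.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision@k over the ranks k where a relevant document appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Divide by all relevant documents, so unretrieved ones lower the score.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one pair per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Made-up example: two queries with hypothetical document IDs.
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d2", "d1"], {"d1"}),                    # AP = 1/2
]
map_score = mean_average_precision(runs)
```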

Evaluation
• Passage Retrieval

  QE Approach              Type       MRR
  Target Concatenation     Strict     0.195439095783
  Target Concatenation     Lenient    0.392501775644
  Subtraction + WordNet    Strict     0.180698539579
  Subtraction + WordNet    Lenient    0.341009813194
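MRR (mean reciprocal rank) scores like those above can be computed as sketched here. The slides do not define Strict and Lenient, so this sketch assumes the common TREC QA convention: lenient counts any passage matching the answer pattern, while strict also requires the passage to come from a judged supporting document. The passages, pattern, and document IDs are made up.

```python
import re


def reciprocal_rank(passages, answer_pattern, supporting_docs=None):
    """1/rank of the first correct passage, or 0.0 if none is correct.

    passages: ranked list of (doc_id, text) pairs.
    supporting_docs: None for lenient scoring; a set of doc IDs for strict scoring
    (assumed convention, see the paragraph above).
    """
    regex = re.compile(answer_pattern, re.IGNORECASE)
    for rank, (doc_id, text) in enumerate(passages, start=1):
        if not regex.search(text):
            continue
        if supporting_docs is None or doc_id in supporting_docs:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(reciprocal_ranks):
    """Average the per-question reciprocal ranks."""
    if not reciprocal_ranks:
        return 0.0
    return sum(reciprocal_ranks) / len(reciprocal_ranks)


# Made-up ranked passages for one question whose answer is "1995".
passages = [
    ("D1", "The comet is very bright."),
    ("D2", "It was first seen in 1995."),
    ("D3", "Hale and Bopp discovered it in 1995."),
]
lenient_rr = reciprocal_rank(passages, r"\b1995\b")
strict_rr = reciprocal_rank(passages, r"\b1995\b", supporting_docs={"D3"})
```

Under this assumption the strict score can never exceed the lenient one for the same ranking, which agrees with the ordering of the two columns in the table.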

Evaluation
• Passage re-ranking: Top N passages

  N     Type       MRR
  1     Strict     0.117647058824
  1     Lenient    0.313725490196
  5     Strict     0.183088235294
  5     Lenient    0.375408496732
  10    Strict     0.190662931839
  10    Lenient    0.386188920012
  20    Strict     0.193482690336
  20    Lenient    0.390536326633
  25    Strict     0.194581567305
  25    Lenient    0.391579287296

Evaluation
• Passage Re-ranking: Window Size

  Window Size    Type       MRR
  1000           Strict     0.195439095783
  1000           Lenient    0.392501775644
  500            Strict     0.209317276517
  500            Lenient    0.383193743722
  100            Strict     0.340829170969
  100            Lenient    0.48166863823

Conclusions
• “Less is Better”… for the most part.
  - Query expansion did not improve passage retrieval.
  - Smaller window sizes contributed to higher scores.
  - Not the case for the top N passages, though
    ○ Fewer passages resulted in lower scores
    ○ Mainly because there were fewer passages to work with

Issues and Future Improvements
• Run times
  - Poor run times for the “addition/subtraction” query expansion approach
  - Query was too broad
    ○ Reduce the number of hypernyms/synonyms
• Limited documents
  - Only used the top 50; could have used more
  - Same with passages

Issues and Future Improvements
• Query Grammar
  - Change it to assist in passage re-ranking
  - Examples
    ○ #score
    ○ passage length
    ○ different weights for different terms

Readings
• Query Expansion/Reformulation
  - Kwok, Etzioni, and Weld, 2001
  - Lin, 2007
  - Fang, 2008
  - Aktolga et al., 2011
• Passage Retrieval
  - Tiedemann et al., 2008
  - Indri/Lemur documentation

Explorations
• CELEX
  - English, Dutch, German
  - Lexical resource
  - Beneficial for adding derivational variants
• Sepia
  - MIT-developed semantic system
  - Semantic Parsing for Named Entities
• Neither is available online
• Query Expansion Techniques for Question Answering, by Matthew W. Bilotti
