Q/A SYSTEM
Project by: Abdullah Alotayq, Dong Wang, Ed Pham
COMPONENTS
- Query Processing
- Passage Retrieval
- Answer Extraction
QUERY PROCESSING
Classification package: Mallet
Classifiers: MaxEnt, DecisionTree, NaiveBayes, BalancedWinnow
QUERY PROCESSING
Features:
- Semantic
- Morphological
- Neighboring (syntactic)
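The classifiers above were trained with Mallet; as a rough illustration of the feature-to-label setup for question classification, here is a minimal sketch using NLTK's NaiveBayesClassifier instead, with made-up features, questions, and labels.

    import nltk

    WH_WORDS = {"who", "what", "when", "where", "why", "which", "how"}

    def question_features(question):
        # Toy stand-ins for the morphological / neighboring-word features;
        # the real feature set (semantic, morphological, syntactic) was richer.
        tokens = question.lower().rstrip("?").split()
        return {
            "wh_word": next((t for t in tokens if t in WH_WORDS), "none"),
            "first_two_words": " ".join(tokens[:2]),
            "contains_digit": any(any(c.isdigit() for c in t) for t in tokens),
            "last_word": tokens[-1] if tokens else "",
        }

    # Hypothetical labeled examples (question -> expected answer type).
    train = [
        (question_features("Who founded the company?"), "PERSON"),
        (question_features("When was the treaty signed?"), "DATE"),
        (question_features("Where is the river located?"), "LOCATION"),
    ]

    classifier = nltk.NaiveBayesClassifier.train(train)
    print(classifier.classify(question_features("Who wrote the novel?")))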
QUERY PROCESSING
Stemming: NLTK stemmer
Trigrams: poor classification results
Named Entity Recognition: NLTK NER, pre-trained model for this task; 6 NE types
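A minimal sketch of the NLTK stemming and pre-trained NER steps above; the example sentence is ours, and the usual NLTK data packages (punkt, a POS tagger model, maxent_ne_chunker, words) must be downloaded first.

    import nltk
    from nltk.stem import PorterStemmer

    stemmer = PorterStemmer()
    question = "Where were the 1998 Winter Olympics held?"

    tokens = nltk.word_tokenize(question)
    print([stemmer.stem(t) for t in tokens])

    # NLTK's pre-trained named-entity chunker runs over POS-tagged tokens.
    tree = nltk.ne_chunk(nltk.pos_tag(tokens))
    for subtree in tree.subtrees():
        if subtree.label() != "S":
            print(subtree.label(), " ".join(word for word, tag in subtree.leaves()))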
QUERY PROCESSING
Our results:
- Binary features: BalancedWinnow testing accuracy = 0.804; MaxEnt testing accuracy = 0.78
- Real-valued features: BalancedWinnow testing accuracy = 0.784; MaxEnt testing accuracy = 0.758
- Named Entity Recognition testing accuracy = 0.802
QUERY EXPANSION
Two different methods:
- Target Concatenation: append each question's target to the end of the question.
- Deletion/Addition: delete wh-words and function words; add synonyms and hypernyms (via WordNet).
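A rough sketch of the two reformulation steps; the function-word list below is a toy placeholder, since the actual list is not given on the slides.

    WH_WORDS = {"who", "what", "when", "where", "why", "which", "how"}
    FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "is", "are", "was", "were", "do", "does", "did"}

    def concatenate_target(question, target):
        # Target Concatenation: append the question's target to the query.
        return question.rstrip(" ?") + " " + target

    def delete_wh_and_function_words(question):
        # Deletion step: drop wh-words and (an abbreviated, hypothetical set of) function words.
        return [t for t in question.lower().rstrip("?").split()
                if t not in WH_WORDS and t not in FUNCTION_WORDS]

    print(concatenate_target("When were the Winter Olympics held", "1998 Winter Olympics"))
    print(delete_wh_and_function_words("When were the Winter Olympics held?"))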
QUERY EXPANSION
Addition of:
- Synonyms
- Hypernyms (first ancestor)
- Morphological variants (WordNet as thesaurus: wordnet.morphy)
Poor results.
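A minimal sketch of the WordNet-based addition step (synonyms, first-ancestor hypernyms, and the wordnet.morphy base form), using NLTK's WordNet interface; the helper name expand_term is ours. Requires the NLTK 'wordnet' data package.

    from nltk.corpus import wordnet as wn

    def expand_term(word):
        # Collect synonyms, first-ancestor hypernyms, and the base
        # morphological form for a single query term.
        expansions = set()
        base = wn.morphy(word)                     # morphological variant (base form)
        if base:
            expansions.add(base)
        for synset in wn.synsets(word):
            expansions.update(l.replace("_", " ") for l in synset.lemma_names())
            for hyper in synset.hypernyms():       # first ancestor only
                expansions.update(l.replace("_", " ") for l in hyper.lemma_names())
        expansions.discard(word)
        return expansions

    print(expand_term("founded"))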
PASSAGE RETRIEVAL
- Used Indri/Lemur
- Ran both query reformulation/expansion approaches through the software
- Took the top 50 documents per query
PASSAGE RETRIEVAL
- Used Indri/Lemur
- Took the top passage from each of the top 50 documents for each query
- Query grammar: #combine[passageWIDTH:INC]
- System defaults: 120 terms, 1000-term window
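A sketch of how such a passage-retrieval query might be formatted with Indri's #combine[passageWIDTH:INC] operator; the helper is hypothetical, and the width/increment values simply echo the defaults mentioned above rather than the exact run configuration.

    def indri_passage_query(terms, width=1000, inc=120):
        # WIDTH is the passage window size in terms and INC the increment
        # between windows; 1000/120 here are assumptions based on the slide.
        return "#combine[passage{0}:{1}]( {2} )".format(width, inc, " ".join(terms))

    print(indri_passage_query(["winter", "olympics", "1998", "nagano"]))
    # -> #combine[passage1000:120]( winter olympics 1998 nagano )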
PASSAGE RETRIEVAL
Passage re-ranking:
- Modified the window size: 500, 1000 terms
- Modified the number of top passages taken from the top 50 documents: 1, 5, 10, 20, 25 passages
ANSWER EXTRACTION
Stemming: applied to queries
Stopwords: applied during indexing
- Removed all stopwords
- Removed all but the wh-words
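A sketch of the two stopword-filtering variants compared above (remove everything vs. keep the wh-words), using NLTK's English stopword list; exactly where this hooks into the indexing pipeline is not shown on the slides. Requires the NLTK 'stopwords' data package.

    from nltk.corpus import stopwords

    WH_WORDS = {"who", "what", "when", "where", "why", "which", "how", "whom", "whose"}
    ALL_STOPWORDS = set(stopwords.words("english"))

    def filter_tokens(tokens, keep_wh=True):
        # Drop stopwords; optionally keep the wh-words.
        drop = ALL_STOPWORDS - WH_WORDS if keep_wh else ALL_STOPWORDS
        return [t for t in tokens if t.lower() not in drop]

    print(filter_tokens("where were the 1998 winter olympics held".split()))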
ANSWER EXTRACTION
Term weighting:
- Applied to the queries
- Changed the weights of the target terms and query terms
- Implemented via the Indri query grammar
Snippet extraction:
- Used Indri's API
- Encountered problems with the fixed snippet size (due to hardcoding)
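A sketch of weighting query terms against target terms with Indri's #weight operator; the helper name is ours, and the 0.66/0.33 split mirrors one of the settings evaluated later.

    def weighted_indri_query(query_terms, target_terms, q_weight=0.66, t_weight=0.33):
        # Weight the question terms and the target terms separately.
        return "#weight( {q} #combine( {qt} ) {t} #combine( {tt} ) )".format(
            q=q_weight, qt=" ".join(query_terms),
            t=t_weight, tt=" ".join(target_terms))

    print(weighted_indri_query(["when", "founded"], ["1998", "winter", "olympics"]))
    # -> #weight( 0.66 #combine( when founded ) 0.33 #combine( 1998 winter olympics ) )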
EVALUATION
Document ranking (Note: all results are based on TREC-2004)
QE Approach              MAP
Target Concatenation     0.3223
Subtraction + WordNet    0.2381
EVALUATION (CONT.)
Stopwords in indexing    MAP
No stopwords removed     0.3223
Stopwords removed        0.3262
Keeping wh-words         0.3407
EVALUATION (CONT.)
Passage retrieval
QE Approach              Strict MRR          Lenient MRR
Target Concatenation     0.195439095783      0.392501775644
Subtraction + WordNet    0.180698539579      0.341009813194
EVALUATION (CONT.)
Passage re-ranking: window size
Window Size    Strict MRR          Lenient MRR
1000           0.195439095783      0.392501775644
500            0.209317276517      0.383193743722
100            0.340829170969      0.48166863823
EVALUATION (CONT.)
Stemming on query terms (using the Porter stemmer)
               Strict MRR          Lenient MRR
Non-stemmed    0.340829170969      0.48166863823
Stemmed        0.372366362694      0.500396143568
EVALUATION (CONT.)
Snippet extraction (using Indri/Lemur with different window sizes)
Window Size    Strict MRR          Lenient MRR
100            0.290138907213      0.407219304738
500            0.212304221474      0.352469418397
1000           0.201755977721      0.332300412861
EVALUATION (CONT.)
Term weighting on queries
Weighting                    Strict MRR          Lenient MRR
Balanced (no weights)        0.372366362694      0.500396143568
Query = .33, Target = .66    0.27309189993       0.415572848272
Query = .66, Target = .33    0.34215110982       0.466614280652
Query = .80, Target = .20    0.302979241834      0.420147005052
FINAL RESULTS
TREC 2004 (Training Data)
         Strict              Lenient
100      0.078516123253      0.132782385953
250      0.260734831756      0.36403939276
1000     0.385047304625      0.518316248372
FINAL RESULTS
TREC 2004 (Training Data)
         Strict              Lenient
100      0.0617858062617     0.150900599492
250      0.131509659052      0.237742464648
1000     0.294352184431      0.477678260382
CONCLUSIONS
Some things were helpful:
- Stemming
- Stopwords
- Window size / query grammar changes
While others weren't:
- Our attempt at query expansion
- Term weighting
We saw improvement over the previous deliverable, but nothing dramatic. There is still a lot left to be desired for future work (e.g., applying other answer extraction methods).
FUTURE WORK
- Work more with the text snippet feature from Indri: change the code to enable different snippet sizes
- Apply the work from query classification to our answer extraction or passage re-ranking
- Semantic Role Labeling
- Finding bad candidates
- Using redundancy-based QA (ARANEA)
- Structure-based extraction (FrameNet)
SOFTWARE PACKAGES USED
- Mallet
- Indri/Lemur
- NLTK
- Porter Stemmer
- Self-written code
- Stanford Parser, Berkeley Parser
READINGS
- Employing Two Question Answering Systems in TREC-2005, Sanda Harabagiu et al.
- Query expansion/reformulation: Kwok, Etzioni, and Weld (2001); Lin (2007); Fang (2008); Aktolga et al. (2011)
- Passage retrieval: Tiedemann et al. (2008)
- Indri/Lemur documentation