Deliverable #2: Question Classification
Group 5
Caleb Barr
Maria Alexandropoulou
Software used
Java was used to perform feature extraction
Illinois Chunker was applied to extract chunks
Python
– Automating classification tasks
– Preprocessing of data when necessary
Mallet was used for the classification task
System Properties
Classification Algorithms
– MaxEnt
– NaiveBayes
Training data
– Sum of:
  Li and Roth Training set 5 (5500 questions)
  TREC-2004
Test data
– Li and Roth test data set
– TREC-2005.xml
System Properties (cont.)
Features extracted
Focused on syntactic features since we targeted coarse classification (following the conclusion in Li and Roth)
– Unigrams
– Bigrams
– Trigrams
– Chunks with POS tags, e.g. [NP (DT) (JJ) (NN)]
– Head NP/VP chunks as in Li and Roth, e.g. [NP (DT the) (JJS oldest) ] in “What is the oldest profession ?”
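The n-gram templates above can be sketched in a few lines of Python. This is only an illustration: the actual feature extraction was done in Java, and the function names `ngrams` and `extract_features`, the whitespace tokenization, and the `1gram=`/`2gram=` feature-name prefixes are all assumptions made for this example.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of `tokens`, each joined with '_'."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(question, orders=(1, 2)):
    """Build a flat feature list for one question.

    `orders` selects the n-gram templates, e.g. (1, 2) for
    Unigrams + Bigrams. Tokenization is plain whitespace splitting,
    which is an assumption of this sketch.
    """
    tokens = question.split()
    features = []
    for n in orders:
        # Prefix each feature with its order so that unigram and
        # bigram features never collide.
        features.extend(f"{n}gram={gram}" for gram in ngrams(tokens, n))
    return features


if __name__ == "__main__":
    for feature in extract_features("What is the oldest profession ?"):
        print(feature)
```

Joining tokens with an underscore keeps each feature a single whitespace-free string, which is convenient when the features are later written out one instance per line.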
Runs performed
Runs were performed for all combinations of classification algorithms and feature templates, e.g.
– MaxEnt, Unigrams
– NaiveBayes, Unigrams, Bigrams, Chunks
– etc.
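A minimal sketch of how such a run grid could be enumerated in Python. It assumes that "all combinations" means every non-empty subset of the five feature templates paired with each algorithm; the template names and the `all_runs` helper are hypothetical labels for this example, not the group's actual script.

```python
from itertools import combinations

ALGORITHMS = ["MaxEnt", "NaiveBayes"]
# Template names are illustrative labels for the five feature types.
TEMPLATES = ["Unigrams", "Bigrams", "Trigrams", "Chunks", "Heads"]


def all_runs(algorithms=ALGORITHMS, templates=TEMPLATES):
    """Pair every algorithm with every non-empty subset of templates."""
    runs = []
    for size in range(1, len(templates) + 1):
        for subset in combinations(templates, size):
            for algorithm in algorithms:
                runs.append((algorithm, subset))
    return runs


if __name__ == "__main__":
    for algorithm, subset in all_runs():
        # Build a run name such as "MaxEnt_UnigramsBigrams".
        print(f"{algorithm}_{''.join(subset)}")
```

Under that assumption there are 2^5 - 1 = 31 template subsets, so 62 runs across the two algorithms.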
Charts
Conclusions
Maximum test accuracy
– TREC10: UnigramsBigramsHeads MaxEnt
– TREC2005: UnigramsBigramsHeads NaiveBayes (MaxEnt was very close)
Trigrams affect accuracy negatively
– bad feature
Sample confusion matrix for our best accuracy
TREC_10_MaxEnt_UnigramBigramHeads:
Columns: label, 0, 1, 2, 3, 4, 5, total
Rows: 0 DESC, ENTY, ABBR, HUM, NUM, LOC
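For reference, a confusion matrix of this shape (rows are true labels, columns are predicted labels) can be computed as in the sketch below. The helper names are hypothetical, the row/column orientation is an assumption, and the gold and predicted labels in the usage part are made-up toy data, not results from our runs.

```python
LABELS = ["DESC", "ENTY", "ABBR", "HUM", "NUM", "LOC"]


def confusion_matrix(gold, predicted, labels=LABELS):
    """Count (true label, predicted label) pairs.

    matrix[i][j] is the number of questions whose true class is
    labels[i] and whose predicted class is labels[j].
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for g, p in zip(gold, predicted):
        matrix[index[g]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of instances on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only.
    gold = ["DESC", "HUM", "HUM", "LOC"]
    predicted = ["DESC", "HUM", "ENTY", "LOC"]
    matrix = confusion_matrix(gold, predicted)
    for label, row in zip(LABELS, matrix):
        print(label, row, sum(row))
    print("accuracy:", accuracy(matrix))
```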