Academia Sinica, Taiwan 1/10
Argument Score Combination for Constituents
Tzong-Han Tsai, Chia-Wei Wu, Yu-Chun Lin, and Wen-Lian Hsu
Institute of Information Science, Academia Sinica

Argument Score Combination for Constituents
[Diagram: constituent i with ME and SVM scores over the argument classes A0, A1, A2, …; score vectors shown: [0.2, 0.4, 0.1, …] and [0.2, 0.4+1, 0.1, …]]

Integer Program
Global optimization with an integer program
Objective: $\sum \ldots + 0.2\,Z_{i,0} + (0.4+1)\,Z_{i,1} + \ldots$
Constraints, e.g.:
–No overlapping or embedding
–No duplicate argument classes
–No illegal arguments
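
To make the role of the indicator variables concrete, here is a minimal sketch that assumes Z[i][j] = 1 means constituent i is labeled with class j, and that each constituent takes exactly one class, with a hypothetical NULL class for non-arguments. It replaces a real integer-program solver with exhaustive search, which is only practical for toy inputs, and it models only the "no duplicate argument classes" constraint. The first score row reuses the numbers from the slide; every other number, and the function name `best_assignment`, are made up for illustration.

```python
from itertools import product

# Hypothetical combined scores: rows are constituents, columns are classes.
# Row 0 reuses the slide's numbers [0.2, 0.4+1, 0.1]; the rest are invented.
CLASSES = ["A0", "A1", "A2", "NULL"]
scores = [
    [0.2, 0.4 + 1, 0.1, 0.3],
    [0.1, 0.9, 0.2, 0.5],
    [0.6, 0.1, 0.2, 0.4],
]


def best_assignment(scores, classes, null_label="NULL"):
    """Search all labelings and return the highest-scoring feasible one.

    Choosing class j for constituent i corresponds to setting Z[i][j] = 1.
    A labeling is feasible if no non-NULL class is used more than once.
    """
    best, best_total = None, float("-inf")
    for labels in product(range(len(classes)), repeat=len(scores)):
        used = [classes[j] for j in labels if classes[j] != null_label]
        if len(used) != len(set(used)):
            continue  # violates "no duplicate argument classes"
        total = sum(scores[i][j] for i, j in enumerate(labels))
        if total > best_total:
            best, best_total = labels, total
    return [classes[j] for j in best], best_total


labels, total = best_assignment(scores, CLASSES)
print(labels, round(total, 2))
```

Without the constraint, both the first and second constituents would take A1; with it, the second falls back to NULL. The overlap and illegal-argument constraints would need span positions and verb-specific information, which this sketch leaves out.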

Ordinary Classifier Combination
Combining the ranks or scores of different classifiers to improve performance
Data fusion: Rank vs. score combination
–A theoretical justification based on Cayley graphs
–D. Frank Hsu, Jacob Shapiro and Isak Taksa
–ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/ doc.gz
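
The difference between combining scores and combining ranks can be shown on a toy example. This is a generic sketch using simple averaging, not the method analyzed by Hsu, Shapiro and Taksa; the two score vectors and all function names are hypothetical.

```python
def ranks(scores):
    """Return the rank of each item (1 = highest score; ties go to the earlier item)."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    result = [0] * len(scores)
    for position, j in enumerate(order, start=1):
        result[j] = position
    return result


def score_combination(a, b):
    """Average the raw scores of two classifiers, class by class."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def rank_combination(a, b):
    """Average the ranks of two classifiers, class by class (lower is better)."""
    return [(x + y) / 2 for x, y in zip(ranks(a), ranks(b))]


# Hypothetical scores from two classifiers over four classes.
clf_a = [0.40, 0.35, 0.20, 0.05]
clf_b = [0.05, 0.15, 0.10, 0.70]

by_score = score_combination(clf_a, clf_b)
by_rank = rank_combination(clf_a, clf_b)

best_by_score = max(range(len(by_score)), key=lambda j: by_score[j])
best_by_rank = min(range(len(by_rank)), key=lambda j: by_rank[j])
print(best_by_score, best_by_rank)
```

Here score combination picks the class that one classifier favors strongly, while rank combination picks the class both classifiers place second, so the two schemes can disagree on the same inputs.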

Complementary Rate
Component classifiers should
–complement each other's mistakes
–be accurate and diverse
Complementary Measurement (Brill and Wu, 1998)
–complementary rate of classifiers A and B: Comp(A, B)
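
The formula itself did not survive on this slide. The sketch below assumes the definition as it is usually quoted from Brill and Wu (1998): Comp(A, B) = (1 − (number of errors common to A and B) / (number of errors made by A)) × 100, that is, the percentage of A's errors on which B is correct. The function name and the toy labels are hypothetical.

```python
def complementary_rate(gold, pred_a, pred_b):
    """Percentage of A's errors on which B is correct (assumed Brill-Wu definition)."""
    errors_a = [i for i, (g, a) in enumerate(zip(gold, pred_a)) if a != g]
    if not errors_a:
        raise ValueError("classifier A makes no errors; Comp(A, B) is undefined")
    common = sum(1 for i in errors_a if pred_b[i] != gold[i])
    return (1 - common / len(errors_a)) * 100


# Toy labels, invented for illustration only.
gold = ["A0", "A1", "A2", "A0", "A1"]
pred_a = ["A0", "A2", "A2", "A1", "A0"]  # wrong at positions 1, 3, 4
pred_b = ["A1", "A1", "A2", "A1", "A1"]  # wrong at positions 0, 3

comp_ab = complementary_rate(gold, pred_a, pred_b)
comp_ba = complementary_rate(gold, pred_b, pred_a)
print(round(comp_ab, 1), round(comp_ba, 1))
```

Under this definition the measure is not symmetric: Comp(A, B) and Comp(B, A) share the same number of common errors but divide by different error counts.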

A Comparison of PROSP and PSIPRED Protein Secondary Structure Prediction

Combine SVM and ME in SRL
Accuracy
–ME: 92.8%
–SVM: 93.1%
Complementary Rate
–Comp(ME, SVM) = 39.1
–Comp(SVM, ME) = 31.8
Since both SVM and ME are accurate and diverse enough, combining SVM and ME is feasible.

Rank-Score Combination
SVM's F-score (75.76) is higher than ME's (74.46), and when SVM and ME give different answers, the correct answer is more likely to be SVM's than ME's. We therefore devised a pro-SVM score combination method.
Emphasize the arguments that SVM considers most probable (ranked #1 by SVM).
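
One plausible reading of this scheme, suggested by the "0.4+1" term on the earlier slides, is that the class ranked first by SVM receives a bonus of 1 on top of its ME score. The sketch below implements that reading only; the function name, the bonus parameter, and the SVM scores are assumptions, and the authors' exact scheme may differ.

```python
def pro_svm_combine(me_scores, svm_scores, bonus=1.0):
    """Add a bonus to the ME score of the class that SVM ranks first.

    Assumed reading of the slides' "0.4+1" term, not a confirmed formula.
    """
    top = max(range(len(svm_scores)), key=lambda j: svm_scores[j])
    combined = list(me_scores)
    combined[top] += bonus
    return combined


me_scores = [0.2, 0.4, 0.1]     # ME scores from the slides (truncated there with "…")
svm_scores = [-0.3, 1.2, -0.8]  # hypothetical SVM decision values

combined = pro_svm_combine(me_scores, svm_scores)
print(combined)
```

Only the rank-1 class from SVM is used, so the combination does not depend on the scale of the SVM decision values, which are generally not comparable to ME probabilities.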

Experiment Results
–Pure ME
–Pure SVM
–ME + IP: 75.1
–Comb + IP: 76.53

Conclusions
We combine ME and SVM scores at an intermediate stage rather than at the final system level.
–ME+SVM improves the F-measure by 0.7%
–Integer Program improves the result by 0.7%
Our system has the best precision in almost every category, but somewhat worse recall
–We use only one parse result for each test sentence. In retrospect, it is probably worthwhile to consider multiple parse results and combine them to raise the recall.