Exemplar-SVM for Action Recognition
Week 12
Presented by Christina Peterson
Recognition Accuracies on UCF Sports data set
Columns: Method | Accuracy (%) | Diving | Golfing | Kicking | Lifting | Riding | Running | Skating | Swing-bench | High-swing | Walking
Rodriguez et al. [1]: 69.2 68 61 66 75 74 73 -
Yeffet and Wolf [2]: 79.3 100 65 67 69 92 86
Le et al. [4]: 86.5 77.8 80 66.7 83.3 90.9
Wu et al. [6]: 91.3 88 93 84 95 91
Action Bank [7]: 95.0 83 89
Standard Multiclass-SVMs: 90.2 50 71.4
Combined Exemplar-SVMs: 94.4 96.6 87.5 91.7 98.1 97.9 82.5
Confusion Matrix: Combined Exemplar-SVM
Columns: Di Go Ki Li Ho Ru Sk Sb Ss Wa
Rows: Diving, Golf, Kick, Lift, Horse-Ride, Run, Skateboard, Swing-bench, Swing-side, Walk
Non-zero entries (%): 96.6 1.9 1.5 87.5 8.3 4.2 91.7 100.0 83.3 16.7 98.1 97.9 2.1 2.8 94.4 3.3 10.9 82.5
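A confusion matrix in percent, like the one on this slide, can be computed by row-normalizing the prediction counts. The following is a minimal sketch, not the code used for these results. The function names and the toy labels in the usage note are assumptions made for illustration.

```python
def confusion_matrix_percent(true_labels, predicted_labels, classes):
    """Row-normalized confusion matrix: entry [i][j] is the percentage of
    samples of class i that were predicted as class j."""
    index = {name: i for i, name in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for truth, prediction in zip(true_labels, predicted_labels):
        counts[index[truth]][index[prediction]] += 1
    percent = []
    for row in counts:
        total = sum(row)
        # A class with no test samples gets an all-zero row.
        percent.append([100.0 * n / total if total else 0.0 for n in row])
    return percent


def per_class_accuracy(percent_matrix):
    """The diagonal of the row-normalized matrix is the per-class accuracy."""
    return [percent_matrix[i][i] for i in range(len(percent_matrix))]
```

Usage with toy labels (hypothetical, not UCF Sports results): `confusion_matrix_percent(["Diving", "Diving", "Golf"], ["Diving", "Golf", "Golf"], ["Diving", "Golf"])` gives one row per true class, and each non-empty row sums to 100.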
Modifications
  Ran STIP for the kicking action class
    Lowered the threshold for weak interest points to obtain more interest points
  The number of interest points collected should be approximately equal across videos
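One way to keep the number of interest points roughly equal across videos is to choose the threshold per video from the detector response strengths. The following is a minimal sketch under that assumption. It does not call the STIP detector, and the function names and the idea of a per-video threshold are hypothetical, not taken from the slides.

```python
def threshold_for_target_count(responses, target_count):
    """Return the threshold that keeps about target_count of the strongest
    points. If the video has fewer points, return the weakest response so
    that every point is kept."""
    if not responses or target_count <= 0:
        raise ValueError("need at least one response and a positive target")
    ranked = sorted(responses, reverse=True)
    k = min(target_count, len(ranked))
    return ranked[k - 1]


def keep_points(responses, threshold):
    """Keep the responses at or above the threshold, in their original order."""
    return [r for r in responses if r >= threshold]
```

For a video whose points have response strengths `[0.9, 0.1, 0.5, 0.3]`, a target of 2 yields a threshold of 0.5, which keeps the two strongest points. Ties at the threshold can keep slightly more than the target, which matches the "approximately equal" goal.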
Constraints
  The Exemplar Set
    Each exemplar should be a good representation of the action class
    Needs a good variety
  The Validation Set/Test Set
    The validation set and test set should be similar to each other
      Motion
      Color
  Confusing videos were omitted from all sets
    For example: accelerating a skateboard by foot closely resembles walking
References
[1] M. D. Rodriguez, J. Ahmed, and M. Shah. Action MACH: A spatio-temporal maximum average correlation height filter for action recognition. In CVPR, 2008.
[2] L. Yeffet and L. Wolf. Local trinary patterns for human action recognition. In ICCV, 2009.
[3] H. Wang, M. Ullah, A. Kläser, I. Laptev, and C. Schmid. Evaluation of local spatio-temporal features for action recognition. In BMVC, 2009.
[4] Q. Le, W. Zou, S. Yeung, and A. Ng. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In CVPR, 2011.
[5] A. Kovashka and K. Grauman. Learning a hierarchy of discriminative space-time neighborhood features for human action recognition. In CVPR, 2010.
[6] X. Wu, D. Xu, L. Duan, and J. Luo. Action recognition using context and appearance distribution features. In CVPR, 2011.
[7] S. Sadanand and J. J. Corso. Action bank: A high-level representation of activity in video. In CVPR, 2012.