Object Classification through Deconvolutional Neural Networks
Student: Carlos Rubiano
Mentor: Oliver Nina

ChaLearn Looking at People
- Action/interaction spotting on RGB data
- Recognize actions using 235 performances of 11 action classes, recorded and manually labeled in continuous RGB sequences of people performing natural isolated and collaborative behaviors

Video Classification with CNN
- Use a CNN structure to classify actions in videos
- Use a network similar to the one that classifies CIFAR, which takes inputs of 32 x 32 x 3
- Also do this with an ImageNet network, which takes inputs of 256 x 256 x 3
- Extract the frames from the videos
- Apply optical flow to the frames of the videos
- Resize the data for faster performance and to match the inputs of the network
- Optical flow gives temporal information across two frames

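
The slides do not say which optical flow method was applied to the frames. As a minimal sketch of how flow captures motion across two frames, the block below uses a single-window Lucas-Kanade estimate, which is an assumption and not necessarily the method used in this project. The function names `make_frame` and `lucas_kanade_global` are hypothetical, and the synthetic frames stand in for real video frames.

```python
def make_frame(shift_x, shift_y, size=16):
    """Synthetic grayscale frame: a smooth bowl centered at (shift_x, shift_y)."""
    return [[(x - shift_x) ** 2 + (y - shift_y) ** 2 for x in range(size)]
            for y in range(size)]


def lucas_kanade_global(prev, curr):
    """Estimate one (u, v) displacement for a whole grayscale frame pair.

    prev, curr: equally sized 2-D lists of numbers.
    Solves the least-squares system built from Ix*u + Iy*v + It = 0.
    """
    h, w = len(prev), len(prev[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # spatial gradient in x
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # spatial gradient in y
            it = curr[y][x] - prev[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # Flat or purely linear image: motion cannot be recovered.
        return 0.0, 0.0
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


prev = make_frame(0.0, 0.0)
curr = make_frame(0.2, -0.1)
print(lucas_kanade_global(prev, curr))  # close to (0.2, -0.1)
```

A dense flow field would repeat this estimate over small windows around each pixel. The resulting (u, v) maps can then be resized to the network input size like any other image.
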
ChaLearn Looking at People Dataset
[Figures: Optical Flow]

Model
Results from ChaLearn Looking at People 2014
With automatic segmentation

Results on Validation Set
With manual segmentation

Spatial DCNN Stacks (Randomly Initialized / Pretrained with Fine-tuning)
- 1 frame: 0.53 / 0.57
- 3 frames: 0.55
- 5 frames: 0.6

Temporal DCNN Stacks (Randomly Initialized / Pretrained with Fine-tuning)
- 1 frame: 0.58 / 0.5
- 3 frames: 0.54 / -
- 5 frames: (no values)

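
The 1, 3, and 5 frame rows refer to stacks of consecutive frames fed to the network. The slides do not say how the stacks were built. One common option, assumed here, is to concatenate the frames along the channel axis, so a stack of k RGB frames becomes one input with 3k channels. The function name `stack_frames` is hypothetical.

```python
def stack_frames(frames, k):
    """Group consecutive frames into sliding stacks of k frames.

    frames: list of H x W x C nested lists (all the same shape).
    Returns a list of H x W x (k*C) nested lists, one per window position,
    with the channels of the k frames concatenated in temporal order.
    """
    if k < 1 or k > len(frames):
        raise ValueError("k must be between 1 and the number of frames")
    stacks = []
    for start in range(len(frames) - k + 1):
        window = frames[start:start + k]
        h, w = len(window[0]), len(window[0][0])
        stacked = [[[value for frame in window for value in frame[y][x]]
                    for x in range(w)]
                   for y in range(h)]
        stacks.append(stacked)
    return stacks
```

For example, five 32 x 32 x 3 frames with k = 3 would give three stacks of shape 32 x 32 x 9. The first layer of the network would need to accept 9 input channels in that case.
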
Also…
Add:
- motion boundary histograms
- Dense SIFT
- HOG
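
As a rough illustration of the kind of feature HOG provides, the sketch below builds a gradient-orientation histogram for a single grayscale cell. It is a simplified version under stated assumptions: it uses hard binning of unsigned orientations into 9 bins and skips the bin interpolation and block normalization of full HOG. The function name `orientation_histogram` is hypothetical.

```python
import math


def orientation_histogram(cell, bins=9):
    """Unsigned gradient-orientation histogram for one grayscale cell.

    cell: 2-D list of numbers. Each interior pixel votes with its gradient
    magnitude into the bin covering its orientation in [0, 180) degrees.
    """
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist
```

A full descriptor would concatenate the histograms of many cells after normalizing them over overlapping blocks. Motion boundary histograms apply a similar histogram to the gradients of the optical flow components instead of the image intensities.
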