Action Recognition.

Datasets: UCF101, HMDB51, Kinetics

HMDB51: 51 classes, about 7,000 clips

Kinetics: 400 classes, about 300,000 clips

Architectures: 3D convnet, 2D convnet → LSTM

3D convnet:
Uses a 3D kernel to interpret temporal data
Is slower to train as it
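
To illustrate what "a 3D kernel" means, here is a minimal pure-Python sketch, not taken from the slides. It slides one kernel over time as well as height and width, so a single filter can respond to motion. The function `conv3d_valid`, the toy clip, and the frame-difference kernel are all hypothetical illustrations. A real network would use a library layer with many learned kernels.

```python
def conv3d_valid(clip, kernel):
    """Naive single-channel 3D cross-correlation, no padding, stride 1.

    clip   is a nested list indexed [t][y][x]
    kernel is a nested list indexed [dt][dy][dx]
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):              # slide along time
        plane = []
        for y in range(H - kh + 1):          # slide along height
            row = []
            for x in range(W - kw + 1):      # slide along width
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical 3-frame, 2x2 clip: a bright pixel moves right, then stays put.
moving_clip = [
    [[1, 0], [0, 0]],
    [[0, 1], [0, 0]],
    [[0, 1], [0, 0]],
]

# Hand-picked kernel spanning 2 frames: next frame minus current frame.
temporal_diff = [[[-1.0]], [[1.0]]]

response = conv3d_valid(moving_clip, temporal_diff)
```

The response is nonzero only between the first two frames, where the pixel moves. The six nested loops also hint at the extra work a temporal dimension adds compared with a 2D kernel.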

2D convnet → LSTM:
Can use an image-recognition 2D convnet as a starting point to speed up training
Can be very deep due to using an LSTM
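
The 2D convnet → LSTM pipeline can be sketched in plain Python as follows. This is an illustration under assumptions, not the presenter's code. `frame_features` is a hypothetical stand-in for a pretrained 2D convnet, the LSTM weights are random rather than trained, and bias terms are omitted for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def frame_features(frame):
    """Placeholder for a pretrained 2D convnet: one feature per row (the row mean)."""
    return [sum(row) / len(row) for row in frame]


def make_lstm(input_size, hidden_size, seed=0):
    """Random weights for the input, forget, candidate, and output gates (no biases)."""
    rng = random.Random(seed)
    n = input_size + hidden_size
    return {
        gate: [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(hidden_size)]
        for gate in "ifgo"
    }


def lstm_step(weights, x, h, c):
    """One LSTM time step: returns the new hidden state and cell state."""
    z = x + h  # concatenate the frame features with the previous hidden state

    def gate(name, activation):
        return [activation(sum(w * v for w, v in zip(row, z))) for row in weights[name]]

    i = gate("i", sigmoid)     # input gate
    f = gate("f", sigmoid)     # forget gate
    g = gate("g", math.tanh)   # candidate cell values
    o = gate("o", sigmoid)     # output gate
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def encode_clip(clip, weights, hidden_size):
    """Run per-frame features through the LSTM and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for frame in clip:
        h, c = lstm_step(weights, frame_features(frame), h, c)
    return h


# Hypothetical 3-frame clip of 2x2 frames: 2 features per frame, 3 hidden units.
clip = [
    [[1, 1], [0, 0]],
    [[0, 0], [1, 1]],
    [[1, 0], [0, 1]],
]
weights = make_lstm(input_size=2, hidden_size=3)
clip_code = encode_clip(clip, weights, hidden_size=3)
```

In a full model, the final hidden state would feed a classifier over the action classes, and the per-frame extractor could be initialized from an image-recognition network, which is the training speed-up this slide refers to.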

Python, TensorFlow, Caffe, Examples:
Comfortable with Python
Have used TensorFlow
Need to finish installing Caffe
Coding examples. ( HMDB51