Query-Focused Video Summarization – Week 1
Jacob Laurel
Mentors: Aidean Sharghi and Dr. Boqing Gong
UCF CRCV REU 2016
Project Proposal
Goal: Summarize a video with a diverse subset of frames (i.e., one with no redundant frames). Frames will also be annotated so that the algorithm produces a semantic summary as well.
Annotate a video dataset to use for training and testing.
Possible applications of this project: egocentric video summarization (e.g., Google Glass), surveillance footage summarization, video search engine recommendations, pose estimation
Determinantal Point Process
Probabilistic process
Motivation: We wish to randomly select a diverse subset A of some set S.
We want to formulate the selection so that subsets whose elements have diverse features are the most likely (intuitively, elements with similar features “repel” each other).
Such probabilities can be computed via determinants of a matrix (hence the name).
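To make the determinant statement concrete, here is a minimal sketch. It assumes the common L-ensemble form of a DPP, in which a subset A has probability det(L_A) / det(L + I) for a positive semidefinite kernel matrix L; the slide does not name this form, so treat it as an assumption. The helper names `det`, `submatrix`, and `dpp_prob` and the toy 3x3 kernel are hypothetical and chosen only for illustration. Plain Python is used to keep the example self-contained.

```python
from itertools import combinations

def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]          # work on a copy
    n = len(a)
    result = 1.0                       # det of the empty matrix is 1
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0                 # singular matrix
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result           # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result

def submatrix(L, idx):
    """Rows and columns of L restricted to the indices in idx."""
    return [[L[i][j] for j in idx] for i in idx]

def dpp_prob(L, subset):
    """P(A) = det(L_A) / det(L + I) under the assumed L-ensemble form."""
    n = len(L)
    L_plus_I = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    return det(submatrix(L, subset)) / det(L_plus_I)

# Toy kernel (illustrative values): items 0 and 1 are very similar,
# item 2 is unrelated to both.
L = [[1.0, 0.9, 0.0],
     [0.9, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

p_similar = dpp_prob(L, [0, 1])   # pair of near-duplicates
p_diverse = dpp_prob(L, [0, 2])   # pair of dissimilar items

# The probabilities of all 2^3 subsets add up to 1.
total = sum(dpp_prob(L, list(c))
            for k in range(len(L) + 1)
            for c in combinations(range(len(L)), k))
```

On this toy kernel the near-duplicate pair {0, 1} comes out roughly five times less likely than the dissimilar pair {0, 2}, which is the "repulsion" described above.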
Illustration of DPPs
Fig. 1. MATLAB demo of an SDPP
Fig. 2. Graphical illustration of points sampled uniformly (left) and distributed according to a DPP (right)
Fig. 3. A diverse subset of frames from a video
Step 1) Data Preparation
Annotate and prepare videos taken from the UT Egocentric dataset.
This will be done in conjunction with a GUI that can specify semantic information about the frames (used when querying).
Step 2) Design of the DPP
Our design will follow a sequential structure.
Different neural network configurations can be used to learn the best kernel matrix.
  The existing structure incorporates a single hidden layer.
Different image features will be experimented with to determine the best model. Potential directions (as of 5/26):
  SIFT features
  GIST global image descriptor
  Visual bag-of-words model
We also propose to incorporate semantic features into the feature vector.
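One way a single-hidden-layer network could produce a kernel matrix is sketched below. It assumes each frame's feature vector f is mapped to an embedding z = W tanh(U f) and that the kernel is the Gram matrix L_ij = z_i · z_j, which is positive semidefinite by construction. This parametrization, the function names, the weights `U` and `W`, and the three toy feature vectors are all assumptions for illustration; in the actual project the weights would be learned and the features would come from descriptors such as SIFT or GIST.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def embed(feature, U, W):
    """Single hidden layer: tanh(U f), followed by a linear map W."""
    hidden = [math.tanh(h) for h in matvec(U, feature)]
    return matvec(W, hidden)

def kernel_matrix(features, U, W):
    """Gram matrix of the embeddings: L[i][j] = z_i . z_j."""
    Z = [embed(f, U, W) for f in features]
    return [[sum(a * b for a, b in zip(zi, zj)) for zj in Z] for zi in Z]

# Illustrative (not learned) weights: 3-d input, 2 hidden units, 2-d embedding.
U = [[0.5, -0.2, 0.1],
     [0.3, 0.8, -0.5]]
W = [[1.0, 0.5],
     [-0.5, 1.0]]

# Toy per-frame feature vectors standing in for real image descriptors.
frames = [[1.0, 0.0, 0.2],
          [0.9, 0.1, 0.2],
          [0.0, 1.0, 0.8]]

L = kernel_matrix(frames, U, W)
```

Building the kernel as a Gram matrix means every principal minor is nonnegative, so the resulting L can be used directly as a DPP kernel regardless of the weight values.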
Step 3) Experimental Procedure
The DPP will be tested using various feature vectors in OpenCV and MATLAB.
The neural network that learns the kernel matrix will be constructed in Keras or Caffe.
The model will need to be trained from scratch, since no existing weights for this application are readily available.
The scope will focus on feed-forward neural networks, as opposed to CNNs, to reduce computational requirements and allow more degrees of freedom in other areas.
Checklist for this week:
Downloaded all necessary software
Familiarized myself with Keras and reproduced simple NNs
Ran and edited the MATLAB DPP code
Familiarized myself with the existing literature and downloaded the dataset from UT’s website
Reproduced a simple version of Dr. Gong’s paper
Results from implementing a simple DPP
Fig. 4. Subset of images generated by the DPP kernel
Fig. 5. Original video
Next Week To-do
Annotate the dataset with short descriptions
Generate ground-truth summaries
Compare the method with other state-of-the-art approaches