CVPR19
Motivation
  From image to video
    Collection of images? 1.5 fps
    Fails to realize the potential offered by the preceding frames
  Feature reuse and warping
    Constrained by video dynamics
Framework (Accel)
  Reference branch
  Update branch
  Correction
  Anchoring
Network design
  Feature subnetwork N_feat
    Remove conv5 (stride 32 to 16)
  Task subnetwork N_task
    Feature projection: 1x1 conv
    Scoring layer: 1x1 conv
    Up-sampling block: x16
    Output block: softmax and argmax
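The four blocks of the task subnetwork can be sketched in plain Python. This is a toy illustration only: the function names are hypothetical, feature maps are nested lists (H x W x C), and the up-sampling is nearest-neighbour replication, which is an assumption because the slide gives only the factor, not the method.

```python
import math

def conv1x1(fmap, weight):
    """1x1 convolution: the same linear map (C_out x C_in) applied at every pixel."""
    return [[[sum(w * x for w, x in zip(w_row, px)) for w_row in weight]
             for px in row] for row in fmap]

def upsample_nearest(fmap, factor):
    """Repeat every pixel `factor` times along both axes (assumed method)."""
    out = []
    for row in fmap:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]

def task_head(fmap, w_proj, w_score, factor=16):
    """Projection -> scoring -> up-sampling -> softmax -> argmax."""
    projected = conv1x1(fmap, w_proj)        # feature projection
    scores = conv1x1(projected, w_score)     # one score per class
    up = upsample_nearest(scores, factor)    # back towards input resolution
    probs = [[softmax(px) for px in row] for row in up]
    return [[max(range(len(p)), key=p.__getitem__) for p in row] for row in probs]
```

Calling `task_head` on a 1 x 2 feature map with `factor=2` returns a 2 x 4 label map. Since softmax is monotonic, the argmax could equally be taken on the raw scores.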
Accel
  Reference N^R_feat: ResNet-101
  Update N^U_feat: ResNet-18 to ResNet-101
Algorithm
  If is_keyframe:
    Execute
    Save
  Else:
    W: FlowNet
    SF: 1x1 conv
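The control flow on this slide can be sketched as a keyframe-scheduled loop. This is a minimal sketch under stated assumptions, not the authors' implementation: every callable (`ref_branch`, `upd_branch`, `flow`, `warp`, `fuse`, `predict`) is a hypothetical stand-in, keyframes are assumed to fall every `n` frames, the cached reference output is assumed to be warped frame to frame, and SF is read as a step that fuses the two branches.

```python
def run_keyframe_schedule(frames, n, ref_branch, upd_branch, flow, warp, fuse, predict):
    """Run the expensive branch on keyframes and a cheap warp-and-fuse step elsewhere."""
    outputs = []
    cached = None       # last reference-branch representation (saved on keyframes)
    prev_frame = None
    for i, frame in enumerate(frames):
        if i % n == 0:
            # Keyframe: execute the reference branch and save its result.
            cached = ref_branch(frame)
            rep = cached
        else:
            # Intermediate frame: warp the cached result with estimated flow (W),
            # then fuse it with the update branch (SF).
            cached = warp(cached, flow(prev_frame, frame))
            rep = fuse(cached, upd_branch(frame))
        outputs.append(predict(rep))
        prev_frame = frame
    return outputs
```

With numeric stand-ins the loop can be traced by hand, which makes the schedule easy to check before swapping in real networks.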
Training
  Pretraining: reference network and update network
  Fine-tuning: reference network and update network
  Training Accel
    Keyframe interval n
    I_{j-(n-1)} as keyframe
    CE loss
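A small sketch of the last training stage: which frames form one training clip, and the cross-entropy (CE) loss. It assumes that the loss is evaluated on the labelled frame I_j at the end of the clip and that CE is averaged over pixels. Both function names are hypothetical.

```python
import math

def training_clip(j, n):
    """Frame indices for one sample: keyframe j-(n-1) first, labelled frame j last."""
    if n < 1:
        raise ValueError("keyframe interval n must be at least 1")
    start = j - (n - 1)
    if start < 0:
        raise ValueError("not enough preceding frames for this interval")
    return list(range(start, j + 1))

def pixel_cross_entropy(scores, labels):
    """Mean cross-entropy over pixels.

    scores: list of per-pixel class score lists (logits)
    labels: list of ground-truth class indices, one per pixel
    """
    total = 0.0
    for logits, y in zip(scores, labels):
        m = max(logits)
        log_z = m + math.log(sum(math.exp(s - m) for s in logits))
        total += log_z - logits[y]   # equals -log softmax(logits)[y]
    return total / len(labels)
```

For example, `training_clip(10, 5)` covers frames 6 to 10, with frame 6 as the keyframe.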
Experiments
CVPR19
Motivation
  “However, we find that segmentation performance across the entire video varies dramatically when selecting an alternative frame for annotation.”
Motivation
  How to select the best frame for annotation?
  Given m videos (n frames for each video)
    Whole video (input: video, output: frame index; LSTM or 3D conv): m training samples
    Performance of images: m*n training samples
    Relative performance of images: m*C(n, 2) training samples; m*C(n, k+2) with reference frames
  (C(n, r) is the binomial coefficient "n choose r")
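The sample counts on this slide can be checked with a few lines of Python. This is only an arithmetic sketch. The function name and dictionary keys are hypothetical, and `k` is assumed to be the number of reference frames, as suggested by the k+2 term.

```python
from math import comb

def training_sample_counts(m, n, k):
    """Number of training samples for each formulation, given m videos of n frames."""
    return {
        "whole_video": m,                          # one sample per video
        "per_frame": m * n,                        # one sample per frame
        "frame_pairs": m * comb(n, 2),             # one sample per pair of frames
        "pairs_with_refs": m * comb(n, k + 2),     # pair plus k reference frames
    }
```

Even for small m and n the pairwise formulations produce far more samples than the whole-video one, which is the point of the comparison.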
BubbleNet
  Loss function:
  Frame indices:
  Generating performance labels
BubbleNet
  How many passes?
  Reference frames
  Bubble sort: 1
  BubbleNet: 1 (n forward passes)
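Why one pass is enough can be seen from a single bubble-sort sweep: the preferred frame is carried forward at every comparison, so it ends up as the selection. The sketch below is illustrative only. `prefers_second` is a hypothetical stand-in for the learned pairwise predictor, and reference frames are left out.

```python
def select_frame(n_frames, prefers_second):
    """Single bubble-sort pass: keep whichever frame the comparator prefers.

    prefers_second(a, b) returns True when frame b is predicted to be the
    better annotation frame than frame a.
    Returns (selected frame index, number of comparisons made).
    """
    best = 0
    comparisons = 0
    for candidate in range(1, n_frames):
        comparisons += 1
        if prefers_second(best, candidate):
            best = candidate
    return best, comparisons
```

With a comparator backed by known per-frame scores, for example `lambda a, b: scores[b] > scores[a]`, the pass returns the index of the highest score, and the number of comparisons grows linearly with the number of frames.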
Experiments