CVPR19.

CVPR 2019 Oral Samvit Jain; Xin Wang; Joseph E. Gonzalez
Presentation transcript:

CVPR19

Motivation: from image to video
- Collection of images? 1.5 fps; fails to realize the potential offered by the preceding frames
- Feature reuse and warping: constrained by video dynamics

Framework (Accel)
- Reference branch
- Update branch
- Correction
- Anchoring

Network design
- Feature subnetwork N_feat
- Task subnetwork N_task
- Remove conv5 (stride 32 to 16)
- Task subnetwork N_task
  - Feature projection: Conv 1*1
  - Scoring layer: Conv 1*1
  - Up-sampling block: x16
  - Output block: softmax and argmax
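
The task subnetwork blocks listed above can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation: the ReLU after the projection, the nearest-neighbor up-sampling, the toy channel counts, and all function names are assumptions, since the slide only names the blocks.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution: mixes channels independently at every pixel.
    x: (C_in, H, W), weight: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def upsample_nearest(x, factor):
    """Nearest-neighbor up-sampling (assumed; the slide only says 'x16')."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def softmax(x, axis=0):
    shifted = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def task_subnetwork(features, w_proj, w_score, factor=16):
    """Feature projection -> scoring -> up-sampling -> softmax and argmax."""
    projected = np.maximum(conv1x1(features, w_proj), 0.0)  # ReLU is an assumption
    scores = conv1x1(projected, w_score)                    # one channel per class
    scores = upsample_nearest(scores, factor)               # stride-16 grid to pixels
    probs = softmax(scores, axis=0)
    return probs.argmax(axis=0)                             # per-pixel class id

rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 6))  # toy: 8 channels on a 4x6 stride-16 grid
w_proj = rng.standard_normal((5, 8))       # toy projection to 5 channels
w_score = rng.standard_normal((3, 5))      # toy scoring over 3 classes
labels = task_subnetwork(features, w_proj, w_score)
print(labels.shape)  # (64, 96)
```

With a stride-16 feature map, the x16 up-sampling restores the input resolution, which is why the 4x6 toy grid becomes a 64x96 label map.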

Accel
- Reference: N^R_feat, ResNet-101
- Update: N^U_feat, ResNet-18 ~ ResNet-101

Algorithm
If is_keyframe:
    Execute
    Save
Else:
    W: FlowNet
    SF: Conv 1*1
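
One way to read the algorithm above is the loop sketched below. It is a hedged illustration only: the slide does not say what is executed or saved on keyframes, so the sketch assumes the reference network is run and its output cached, that W warps the cached output with optical flow, and that SF is a 1x1 convolution fusing the warped reference output with the update network output. `reference_net`, `update_net`, and `estimate_flow` are hypothetical stand-ins, not FlowNet or ResNet.

```python
import numpy as np

def warp(scores, flow):
    """Warp a (C, H, W) map with a (2, H, W) flow field (nearest-neighbor lookup)."""
    _, h, w = scores.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.rint(ys + flow[0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + flow[1]).astype(int), 0, w - 1)
    return scores[:, src_y, src_x]

def fuse(reference_scores, update_scores, weight):
    """SF step: 1x1 convolution over the channel-wise concatenation of both branches."""
    stacked = np.concatenate([reference_scores, update_scores], axis=0)
    return np.einsum("oc,chw->ohw", weight, stacked)

def accel_inference(frames, keyframe_interval, reference_net, update_net,
                    estimate_flow, fusion_weight):
    cached, prev, outputs = None, None, []
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            cached = reference_net(frame)            # execute and save (assumed)
            scores = cached
        else:
            flow = estimate_flow(prev, frame)        # W: optical flow stand-in
            cached = warp(cached, flow)              # propagate the reference output
            scores = fuse(cached, update_net(frame), fusion_weight)  # SF: Conv 1*1
        prev = frame
        outputs.append(scores.argmax(axis=0))
    return outputs

# Toy stand-ins: two classes derived from a grayscale frame, and zero motion.
reference_net = lambda f: np.stack([f, 1.0 - f])
update_net = lambda f: 0.5 * np.stack([f, 1.0 - f])
estimate_flow = lambda prev, cur: np.zeros((2,) + cur.shape)
fusion_weight = np.hstack([np.eye(2), np.eye(2)])  # sums the two branches per class

rng = np.random.default_rng(0)
frames = [rng.random((4, 5)) for _ in range(5)]
outputs = accel_inference(frames, 3, reference_net, update_net,
                          estimate_flow, fusion_weight)
print(len(outputs), outputs[0].shape)  # 5 (4, 5)
```

The expensive reference network runs only once per keyframe interval; every other frame pays for flow estimation, the cheaper update network, and the fusion step.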

Training
- Pretraining the reference network and the update network
- Fine-tuning the reference network and the update network
- Training Accel: keyframe interval n, I_{j-(n-1)} as keyframe, CE loss
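
The keyframe choice and the loss in the last step can be illustrated with a short sketch. It assumes that I_j is the annotated frame and that the cross-entropy (CE) loss is averaged over pixels; neither detail is stated on the slide, and the function names are hypothetical.

```python
import numpy as np

def training_clip(j, n):
    """Frame indices from the keyframe I_{j-(n-1)} through the frame I_j."""
    keyframe = j - (n - 1)
    return list(range(keyframe, j + 1))

def cross_entropy(probs, labels):
    """Mean pixel-wise cross-entropy.
    probs: (C, H, W) softmax outputs, labels: (H, W) integer class ids."""
    h, w = labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    picked = probs[labels, ys, xs]  # probability assigned to the true class
    return float(-np.log(picked).mean())

clip = training_clip(j=19, n=5)
print(clip)  # [15, 16, 17, 18, 19]

probs = np.full((2, 2, 2), 0.5)        # uniform prediction over two classes
labels = np.array([[0, 1], [1, 0]])
loss = cross_entropy(probs, labels)    # equals ln 2 for a uniform two-class guess
```

With keyframe interval n, the clip always contains n frames, so the keyframe sits n-1 frames before I_j.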

Experiments

CVPR19

Motivation
“However, we find that segmentation performance across the entire video varies dramatically when selecting an alternative frame for annotation.”

Motivation: how to select the best frame for annotation?
- Given $m$ videos ($n$ frames for each video)
- Whole video: input is the video, output is a frame index (LSTM or 3D conv); $m$ training samples
- Performance of images: $m \cdot n$ training samples
- Relative performance of images: $m \binom{n}{2}$ training samples
- With reference frames: $m \binom{n}{k+2}$ training samples
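
The sample counts above can be checked with a few lines of Python. This sketch assumes that k is the number of reference frames added to each compared pair (hence k+2 frames per sample); the toy values of m, n, and k are illustrative only.

```python
from math import comb

def training_sample_counts(m, n, k):
    """Number of training samples under each formulation, for m videos of n frames."""
    return {
        "whole video": m,
        "performance of images": m * n,
        "relative performance of images": m * comb(n, 2),
        "with reference frames": m * comb(n, k + 2),
    }

counts = training_sample_counts(m=2, n=6, k=1)
for name, value in counts.items():
    print(f"{name}: {value}")
# whole video: 2
# performance of images: 12
# relative performance of images: 30
# with reference frames: 40
```

Even in this tiny case the comparison-based formulations produce more samples from the same videos, and the gap widens quickly as n grows.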

BubbleNet
- Loss function:
- Frame indices:
- Generating performance labels

BubbleNet: how many passes?
- Reference frames
- Bubble sort: 1
- BubbleNet: 1 (n forward passes)
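
The single-pass idea can be sketched as one bubble-sort pass that carries the preferred frame forward. This is an illustration under stated assumptions, not the authors' code: `prefers_second` is a hypothetical stand-in for the network's pairwise prediction, and the toy scores are made up. One pass over n frames makes n-1 comparisons, which is on the order of the n forward passes noted on the slide.

```python
def select_frame(num_frames, prefers_second):
    """One bubble-sort pass: compare the current best frame with each later frame
    and keep whichever the comparator prefers."""
    best = 0
    for candidate in range(1, num_frames):
        if prefers_second(best, candidate):
            best = candidate
    return best

# Toy comparator: in practice the comparison would come from a learned model;
# here it reads hypothetical per-frame scores and counts its own calls.
toy_scores = [0.61, 0.72, 0.58, 0.80, 0.75]
calls = []

def prefers_second(a, b):
    calls.append((a, b))
    return toy_scores[b] > toy_scores[a]

best_frame = select_frame(len(toy_scores), prefers_second)
print(best_frame, len(calls))  # 3 4
```

A single pass is enough because only the top frame is needed for annotation, so a full sort of all frames would be wasted work.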

Experiments
