CVPR19.

Presentation transcript:

CVPR19

Motivation
From image to video
Collection of images? 1.5 fps; fails to realize the potential offered by the preceding frames
Feature reuse and warping: constrained by video dynamics

Framework (Accel): reference branch, update branch, correction, anchoring

Network design
Feature subnetwork Nfeat: remove conv5 (stride 32 to 16)
Task subnetwork Ntask:
Feature projection: Conv 1×1
Scoring label: Conv 1×1
Up-sampling block: x16
Output block: softmax and argmax
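The task subnetwork listed above can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function names (`conv1x1`, `task_head`), the weight shapes, and the nearest-neighbour up-sampling are hypothetical choices made for the sketch; the slide does not say how the x16 up-sampling block is implemented.

```python
import numpy as np

def conv1x1(x, weight, bias):
    """1x1 convolution: mixes channels independently at each pixel.
    x: (C_in, H, W), weight: (C_out, C_in), bias: (C_out,)."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

def task_head(features, w_proj, b_proj, w_score, b_score, scale=16):
    """Hypothetical task subnetwork: projection, scoring, up-sampling, output."""
    projected = conv1x1(features, w_proj, b_proj)   # feature projection
    scores = conv1x1(projected, w_score, b_score)   # one score map per class
    # Assumption: nearest-neighbour up-sampling stands in for the x16 block.
    upsampled = scores.repeat(scale, axis=1).repeat(scale, axis=2)
    # Numerically stable softmax over the class axis, then argmax.
    shifted = upsampled - upsampled.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=0, keepdims=True)
    return probs, probs.argmax(axis=0)

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 2, 3))                  # toy stride-16 feature map
probs, labels = task_head(
    feats,
    rng.normal(size=(4, 8)), np.zeros(4),           # projection weights
    rng.normal(size=(3, 4)), np.zeros(3),           # scoring weights, 3 classes
)
```

With a stride-16 feature map, the x16 up-sampling returns the prediction to input resolution, which is why the label map here is 16 times larger than the feature map in each spatial dimension.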

Accel
Reference N^R_feat: ResNet-101
Update N^U_feat: ResNet-18 to ResNet-101

Algorithm
If is_keyframe: Execute, Save
Else: W (warp): FlowNet; SF (score fusion): Conv 1×1
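The keyframe logic above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `ref_net`, `upd_net`, and `flow_net` are hypothetical callables standing in for the reference network, the update network, and FlowNet, and the nearest-neighbour warp and the way the two branches are stacked before the 1×1 convolution are assumptions made to keep the example self-contained.

```python
import numpy as np

def warp(features, flow):
    """Backward-warp a (C, H, W) map with a (2, H, W) flow field (x, y offsets).
    Assumption: nearest-neighbour sampling with border clamping."""
    _, h, w = features.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(np.rint(xs + flow[0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[1]).astype(int), 0, h - 1)
    return features[:, src_y, src_x]

def fuse_1x1(ref, upd, weight, bias):
    """Fuse two (C, H, W) maps with a 1x1 convolution over the stacked channels."""
    stacked = np.concatenate([ref, upd], axis=0)            # (2C, H, W)
    return np.einsum("oc,chw->ohw", weight, stacked) + bias[:, None, None]

def accel_step(frame, prev_frame, is_keyframe, cache,
               ref_net, upd_net, flow_net, weight, bias):
    if is_keyframe:
        cache["ref"] = ref_net(frame)                       # execute and save
        return cache["ref"]
    flow = flow_net(frame, prev_frame)                      # W: estimate flow
    cache["ref"] = warp(cache["ref"], flow)                 # carry reference forward
    return fuse_1x1(cache["ref"], upd_net(frame), weight, bias)  # SF: 1x1 conv

# Toy stand-ins for the three networks (hypothetical).
ref_net = lambda f: 2.0 * f
upd_net = lambda f: f
flow_net = lambda cur, prev: np.zeros((2,) + cur.shape[1:])
cache = {}
w, b = np.array([[0.5, 0.5]]), np.zeros(1)
f0, f1 = np.ones((1, 2, 2)), 3.0 * np.ones((1, 2, 2))
out0 = accel_step(f0, None, True, cache, ref_net, upd_net, flow_net, w, b)
out1 = accel_step(f1, f0, False, cache, ref_net, upd_net, flow_net, w, b)
```

On non-keyframes only the cheaper update network and the flow network run on the current frame; the expensive reference features are reused through warping.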

Training
Pretraining: reference network and update network
Fine-tuning: reference network and update network
Training Accel: keyframe interval n, I_{j-(n-1)} as keyframe, CE loss

Experiments

CVPR19

Motivation: “However, we find that segmentation performance across the entire video varies dramatically when selecting an alternative frame for annotation.”

Motivation
How to select the best frame for annotation?
Given m videos (n frames for each video)
Whole video: input: video, output: frame index; LSTM or 3D conv; m training samples
Performance of images: $m \cdot n$ training samples
Relative performance of images: $m \cdot \binom{n}{2}$ training samples; with k reference frames: $m \cdot \binom{n}{k+2}$
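To make the difference between these formulations concrete, the sample counts can be computed directly. The function name and the toy values of m, n, and k below are illustrative assumptions, not figures from the slides.

```python
from math import comb

def training_sample_counts(m, n, k):
    """Number of training samples for m videos of n frames each,
    under the four formulations listed above (k = reference frames)."""
    return {
        "whole_video": m,                            # one sample per video
        "per_frame": m * n,                          # one sample per frame
        "frame_pairs": m * comb(n, 2),               # unordered pairs of frames
        "pairs_with_references": m * comb(n, k + 2), # pair plus k reference frames
    }

counts = training_sample_counts(m=2, n=6, k=1)
```

Even in this tiny setting the pairwise formulations yield several times more training samples than per-frame labels, and the gap grows quickly with n.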

BubbleNet Loss function: Frame indices: Generating Performance Labels

BubbleNet
How many passes?
Reference frames
Bubble sort: 1
BubbleNet: 1 (n forward passes)
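A single bubble-sort pass is enough to carry the best element to the end of a sequence, which is the idea the slide alludes to. The sketch below assumes a hypothetical `compare(i, j)` callable that returns a positive value when frame j is predicted to be the better annotation frame; in the paper that role is played by the network, while here a fixed score list stands in for it.

```python
def bubble_select(num_frames, compare):
    """One bubble-sort pass: keep the current best frame and compare it
    with each following frame, using num_frames - 1 comparisons."""
    best = 0
    for j in range(1, num_frames):
        if compare(best, j) > 0:   # frame j predicted better than current best
            best = j
    return best

# Toy stand-in for the learned comparator (hypothetical scores).
scores = [0.2, 0.9, 0.5, 0.7]
calls = []

def compare(i, j):
    calls.append((i, j))
    return scores[j] - scores[i]

best_frame = bubble_select(len(scores), compare)
```

Each comparison corresponds to one forward pass of the comparator, so the cost grows linearly with the number of frames rather than quadratically as it would for a full sort.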

Experiments
