CVPR 2019 Oral Samvit Jain; Xin Wang; Joseph E. Gonzalez

Presentation transcript:

Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video. CVPR 2019 Oral. Samvit Jain, Xin Wang, Joseph E. Gonzalez. University of California, Berkeley.

Semantic Segmentation on Video: Problem Definition. Input: a video clip, with no ground truth. Output: a segmentation of each frame.

Semantic Segmentation on Video: Classical Approach. Segment each frame independently.

Accel, an efficient approach. "Cheap" feature extraction net → ResNet-18. "Expensive" feature extraction net → ResNet-101.

Network Architecture: optical flow, warp operation.
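The warp operation named on this slide can be illustrated with a minimal sketch. It assumes a backward-flow convention (each current-frame position stores the offset to the keyframe location it reads from), bilinear interpolation, and border clamping; the function names `bilinear_sample` and `warp` are hypothetical and are not taken from the slides or the authors' code. It handles one feature channel stored as a list of rows.

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2-D grid (list of rows) at a real-valued (y, x), clamped to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - wx) * feat[y0][x0] + wx * feat[y0][x1]
    bottom = (1 - wx) * feat[y1][x0] + wx * feat[y1][x1]
    return (1 - wy) * top + wy * bottom


def warp(feat, flow):
    """Warp one keyframe feature channel to the current frame.

    Assumption: flow[y][x] = (dy, dx) is the offset from current-frame
    position (y, x) to the keyframe location whose feature it should take.
    """
    h, w = len(feat), len(feat[0])
    return [
        [bilinear_sample(feat, y + flow[y][x][0], x + flow[y][x][1]) for x in range(w)]
        for y in range(h)
    ]


# Usage: shift a tiny 2x3 feature map by half a pixel along x.
keyframe_feat = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
half_pixel_flow = [[(0.0, 0.5)] * 3 for _ in range(2)]
warped = warp(keyframe_feat, half_pixel_flow)
```

In a real network the same sampling would run over every channel of the feature tensor, typically with a framework's built-in grid-sampling primitive rather than Python loops.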

Experiment: Ablation Study. N_R is always ResNet-101.

Experiment: Accuracy vs. inference time, on the Cityscapes dataset and on the CamVid dataset.
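The slide does not name the accuracy metric. Mean intersection-over-union (mIoU) is the usual one for semantic segmentation on these datasets, so the sketch below assumes it. The function name `mean_iou` and the toy label maps are hypothetical, and the numbers are not results from the presentation.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in the prediction or the target.

    pred and target are label maps of the same shape (lists of rows of ints).
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # A correct pixel counts once in both intersection and union.
                inter[p] += 1
                union[p] += 1
            else:
                # A wrong pixel enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Usage on a toy 2x2 label map with two classes.
pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
score = mean_iou(pred, target, num_classes=2)  # class IoUs are 1/2 and 2/3
```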

Experiment: Comparison with other methods, on the Cityscapes dataset and on the CamVid dataset.

r1: input frames. r2: Accel N_R branch. r3: Accel N_U branch. r4: N_R + N_U, ResNet-18.

Conclusions: The structure is very simple, and it appears to be faster while reaching higher accuracy. But the comparison is unfair: the baseline is not SOTA. Where are BiSeNet and ICNet?