CVPR 2019 Oral Samvit Jain; Xin Wang; Joseph E. Gonzalez

Presentation transcript:

Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video CVPR 2019 Oral Samvit Jain; Xin Wang; Joseph E. Gonzalez University of California, Berkeley

Semantic Segmentation on Video: Problem Definition. Input: a video clip, with no ground truth. Output: a segmentation of each frame.

Semantic Segmentation on Video: Classical Approach. Segment each frame independently.

Accel, an efficient approach. “Cheap” feature extraction net → ResNet-18. “Expensive” feature extraction net → ResNet-101.

Network Architecture: Optical Flow, Warp Operation
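The slide names the warp operation without spelling it out. As a rough illustration only, here is a minimal NumPy sketch of flow-based feature warping with bilinear sampling. The function name `warp_features`, the array layout, and the flow convention (each output pixel samples the earlier frame's features at its own position plus the flow vector) are assumptions made for this sketch, not details taken from the slides.

```python
import numpy as np

def warp_features(feat, flow):
    """Warp a feature map along a flow field using bilinear sampling.

    feat: (C, H, W) features computed on an earlier frame.
    flow: (2, H, W) displacements; flow[0] is dx and flow[1] is dy, so
          output pixel (y, x) samples feat at (y + dy, x + dx).
          (Hypothetical convention chosen for this sketch.)
    """
    C, H, W = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")

    # Source coordinates, clamped so samples stay inside the map.
    sx = np.clip(xs + flow[0], 0, W - 1)
    sy = np.clip(ys + flow[1], 0, H - 1)

    # Integer corners surrounding each source coordinate.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)

    # Fractional offsets act as bilinear interpolation weights.
    wx = sx - x0
    wy = sy - y0

    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bottom = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy

if __name__ == "__main__":
    # Toy example: one channel, 3x4 map, zero flow leaves features unchanged.
    feat = np.arange(12, dtype=float).reshape(1, 3, 4)
    flow = np.zeros((2, 3, 4))
    print(warp_features(feat, flow).shape)
```

In a pipeline of this kind, the flow would come from an optical flow network and the warped features would stand in for features that were too expensive to recompute on the current frame.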

Experiment: Ablation Study. N_R is always ResNet-101.

Experiment: Accuracy vs. inference time, on the Cityscapes dataset and on the CamVid dataset

Experiment: Comparison with others, on the Cityscapes dataset and on the CamVid dataset

r1. input frames; r2. Accel N_R branch; r3. Accel N_U branch; r4. N_R + N_U, ResNet-18

Conclusions: The structure is very simple, and it appears to be faster while reaching higher accuracy. But the comparison is unfair: the baseline is not SOTA. Where are BiSeNet and ICNet?