Motivation
State-of-the-art two-stage instance segmentation methods depend heavily on feature localization to produce masks.



Method
Generate a dictionary of non-local prototype masks over the entire image.
Predict a set of linear combination coefficients per instance.
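A minimal sketch of how the two outputs could be combined into one instance mask: each pixel is a linear combination of the prototype values at that pixel, weighted by the instance's coefficients. The function name `assemble_mask`, the nested-list layout, and the final sigmoid are assumptions made for illustration; the slide only states the linear combination.

```python
import math


def assemble_mask(prototypes, coefficients):
    """Combine k prototype masks (each H x W) using k per-instance coefficients."""
    k = len(prototypes)
    assert len(coefficients) == k
    height, width = len(prototypes[0]), len(prototypes[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Linear combination of the prototype values at this pixel.
            logit = sum(coefficients[i] * prototypes[i][y][x] for i in range(k))
            # Assumption: squash to (0, 1) with a sigmoid; not stated on the slide.
            row.append(1.0 / (1.0 + math.exp(-logit)))
        mask.append(row)
    return mask


# Two toy 2x2 prototypes: one fires top-left, the other bottom-right.
prototypes = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
# A positive coefficient adds a prototype, a negative one subtracts it.
mask = assemble_mask(prototypes, [4.0, -4.0])
```

Because the prototypes are shared by all instances, only the short coefficient vector has to be predicted per instance.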

Framework

Protonet

Head Architecture

Other Improvements: Fast NMS
In Fast NMS, we simply allow already-removed detections to suppress other detections.
First, compute a c × n × n pairwise IoU matrix X for the top n detections of each of the c classes, sorted in descending order by score.
Then find which detections to remove by checking whether any higher-scoring detection has a corresponding IoU greater than some threshold t.
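The Fast NMS steps can be sketched as follows for a single class (so the matrix is n × n rather than c × n × n). This is an illustrative pure-Python version under assumed conventions: boxes are `(x1, y1, x2, y2)` tuples, and the helper names `iou` and `fast_nms` and the default threshold are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fast_nms(boxes, scores, t=0.5):
    """Return the original indices of the detections kept for one class."""
    # Sort detections by score, highest first.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    # Pairwise IoU matrix X over the score-sorted detections.
    X = [[iou(boxes[order[i]], boxes[order[j]]) for j in range(n)] for i in range(n)]
    keep = []
    for j in range(n):
        # Largest IoU with ANY higher-scoring detection (rows i < j),
        # including detections that were themselves removed.
        max_iou = max((X[i][j] for i in range(j)), default=0.0)
        if max_iou <= t:
            keep.append(order[j])
    return keep


boxes = [(0, 0, 10, 10), (3, 0, 13, 10), (6, 0, 16, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7, 0.6]
kept = fast_nms(boxes, scores, t=0.5)
```

In this example the third box overlaps only the second, which is itself suppressed by the first. Sequential NMS would keep the third box, whereas this variant removes it, which is the relaxation that lets every decision be read off the matrix in one pass.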

Other Improvements: Semantic Segmentation Loss
We simply attach a 1x1 conv layer with c output channels directly to the largest feature map (P3) in our backbone.
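As a rough illustration of why this layer is cheap, a 1x1 convolution is just the same small linear map applied independently at every pixel. The sketch below uses nested lists and a hypothetical `conv1x1` helper; the toy shapes and weights are made up and do not come from the slides.

```python
def conv1x1(feature_map, weights, bias):
    """Apply a 1x1 convolution.

    feature_map: [C_in][H][W], weights: [C_out][C_in], bias: [C_out].
    Returns a [C_out][H][W] nested list.
    """
    c_in = len(feature_map)
    h, w = len(feature_map[0]), len(feature_map[0][0])
    out = []
    for o in range(len(weights)):
        channel = []
        for y in range(h):
            row = []
            for x in range(w):
                # Mix the input channels at this single pixel; no spatial neighbours are used.
                value = bias[o] + sum(weights[o][i] * feature_map[i][y][x] for i in range(c_in))
                row.append(value)
            channel.append(row)
        out.append(channel)
    return out


# Toy input: 2 channels, 1x2 spatial size; c = 3 output channels (one per class).
features = [[[1, 2]], [[3, 4]]]
weights = [[1, 0], [0, 1], [1, 1]]
bias = [0, 0, 1]
logits = conv1x1(features, weights, bias)
```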

Experiments

Experiments

Experiments

Experiments

Motivation
How and where should supervision be added, both from the detection ground truth and from a different network?

feature mimic

Two-stage Mimic
The prediction of the detector in Faster R-CNN or R-FCN can be regarded as a classification task, so the category classification information learned by the large model can be passed to the small network.

Result

Result

Motivation
Humans can recognize the "gist" of a scene, and they do so by relying on relevant prior knowledge.

Contributions
A memory-guided interleaving framework in which multiple feature extractors are interleaved.
An adaptive interleaving policy.
An on-device demonstration of the fastest mobile video detection model.

Framework

Interleaved Models
SSD-style [24] detection.
f0 is optimized for accuracy; f1 is optimized for speed.
The feature extractors share a memory.

Interleaved Models: Memory Module
Bottlenecking.
Divide the LSTM state into groups and use grouped convolutions.

Result

Adaptive Interleaving Policy
The state includes an action history η, a binary vector of length 20. For all k, the k-th entry of η is 1 if f1 was run k steps ago and 0 otherwise.
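The action history vector can be built as in the sketch below. It assumes the past choices are stored as a list of 0/1 values with the most recent choice last, that k counts from 1 (so the first entry refers to the previous step), and that steps before the start of the video count as 0. The function name `action_history` is hypothetical.

```python
def action_history(actions, length=20):
    """Build the binary history vector eta.

    actions: past extractor choices, most recent last; 1 means f1 was run.
    """
    eta = []
    for k in range(1, length + 1):
        # Entry k looks k steps back; steps before the start count as 0.
        ran_f1 = k <= len(actions) and actions[-k] == 1
        eta.append(1 if ran_f1 else 0)
    return eta


# f0 was run three steps ago, then f1 twice.
eta = action_history([0, 1, 1])
```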

Adaptive Interleaving Policy
We define the reward as the sum of a speed reward and an accuracy reward. For the speed reward, we simply define a positive constant γ and give a reward of γ when f1 is run.

Adaptive Interleaving Policy
For the accuracy reward, we compute the detection losses after running each feature extractor and take the loss difference between the minimum-loss feature extractor and the selected feature extractor.
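Putting the speed and accuracy terms together, the reward for one step might be computed as in this sketch. The function name `reward`, the list layout of the losses, and the default value of γ are illustrative assumptions; the slides only say that γ is a positive constant.

```python
def reward(action, losses, gamma=0.5):
    """Reward for running one feature extractor.

    action: index of the extractor that was run (0 for f0, 1 for f1).
    losses: detection loss obtained with each extractor, indexed the same way.
    gamma:  positive speed bonus (hypothetical default).
    """
    # Speed reward: gamma only when the fast extractor f1 is run.
    speed = gamma if action == 1 else 0.0
    # Accuracy reward: minimum loss minus the selected extractor's loss,
    # so it is 0 for the best choice and negative otherwise.
    accuracy = min(losses) - losses[action]
    return speed + accuracy


r_slow = reward(0, [0.2, 0.5], gamma=0.1)  # f0 has the lowest loss here
r_fast = reward(1, [0.2, 0.5], gamma=0.1)  # f1 earns gamma but pays the loss gap
```

With this shape, running f1 is only worthwhile when its loss gap to the best extractor is smaller than γ.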

Adaptive Interleaving Policy

Inference Optimizations
Asynchronous inference.
Quantization.

Experiments

Experiments