Published by Harjanti Yuliana Tedja; modified over 5 years ago.
Slide 2: Motivation
State-of-the-art two-stage instance segmentation methods depend heavily on feature localization to produce masks.
Slide 3: Method
The method breaks instance segmentation into two parallel tasks:
- generating a dictionary of non-local prototype masks over the entire image
- predicting a set of linear-combination coefficients per instance
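The two outputs above are combined per instance: each instance's coefficients take a linear combination of the prototype masks, squashed through a sigmoid to produce a soft mask. A minimal NumPy sketch; the shapes, function name, and the sigmoid step are illustrative assumptions, not taken verbatim from the slide:

```python
import numpy as np

def assemble_masks(prototypes, coefficients):
    """Combine prototype masks with per-instance coefficients:
    M = sigmoid(P @ C^T).

    prototypes:   (h, w, k) dictionary of k prototype masks
    coefficients: (n, k)    one k-vector per detected instance
    returns:      (h, w, n) one soft mask per instance
    """
    # linear combination over the prototype axis
    logits = np.tensordot(prototypes, coefficients, axes=([2], [1]))
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid -> values in (0, 1)
```

The linear combination makes mask assembly a single matrix multiply, which is what keeps this one-stage design fast.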
Slide 4: Framework
Slide 5: Protonet
Slide 6: Head Architecture
Slide 7: Other Improvements: Fast NMS
In Fast NMS, already-removed detections are allowed to suppress other detections:
- compute a c × n × n pairwise IoU matrix X for the top n detections, sorted descending by score
- remove a detection if any higher-scoring detection has an IoU with it greater than some threshold t
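Because suppressed detections are still allowed to suppress others, the whole step collapses into one matrix operation (an upper-triangular IoU matrix and a column-wise max), with no sequential loop. A single-class NumPy sketch; the function name and box format are illustrative:

```python
import numpy as np

def fast_nms(boxes, scores, iou_threshold=0.5):
    """Fast NMS sketch for one class.

    boxes:  (n, 4) as (x1, y1, x2, y2)
    scores: (n,)
    returns indices of kept detections (into the original arrays)
    """
    order = np.argsort(-scores)  # sort descending by score
    boxes = boxes[order]

    # pairwise IoU matrix X for the top-n detections
    x1 = np.maximum(boxes[:, None, 0], boxes[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], boxes[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], boxes[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area[:, None] + area[None, :] - inter)

    # keep only the upper triangle: a detection can only be
    # suppressed by a higher-scoring (earlier-row) detection
    iou = np.triu(iou, k=1)

    # keep a detection iff its max IoU with any higher-scoring one <= t
    keep = iou.max(axis=0) <= iou_threshold
    return order[keep]
```

Unlike standard NMS, a box that was itself suppressed still contributes to the column max, which slightly over-suppresses but makes the operation fully vectorizable.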
Slide 8: Other Improvements: Semantic Segmentation Loss
Attach a 1x1 conv layer with c output channels directly to the largest feature map (P3) in the backbone.
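A 1x1 convolution is just a per-pixel linear map over channels, so the extra head is essentially one matrix multiply on the flattened P3 map. A NumPy sketch with illustrative shapes (not the paper's actual channel counts):

```python
import numpy as np

def semantic_head(p3, weight, bias):
    """1x1 conv head producing c class logits per pixel.

    p3:     (c_in, h, w) backbone feature map
    weight: (c_out, c_in) 1x1 conv kernel
    bias:   (c_out,)
    returns (c_out, h, w) per-pixel class logits
    """
    c_in, h, w = p3.shape
    flat = p3.reshape(c_in, h * w)       # each column is one pixel
    out = weight @ flat + bias[:, None]  # per-pixel linear map
    return out.reshape(-1, h, w)
```

During training these logits would be supervised with a per-pixel classification loss; the head is discarded at inference time, so it adds accuracy at no speed cost.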
Slides 9–12: Experiments
Slide 14: Motivation
How and where to add supervision both from the detection ground truth and from a different (teacher) network.
Slide 15: Feature Mimic
Slide 16: Two-stage Mimic
The prediction head of a Faster R-CNN or R-FCN detector can be regarded as a classification task, so the category information learned by the large model can be passed to the small network.
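One standard way to pass a large model's category information to a small one is soft-target distillation: train the student's classifier against the teacher's temperature-softened class distribution. This is a generic knowledge-distillation sketch, not necessarily this paper's exact mimic loss; the temperature value is illustrative:

```python
import numpy as np

def softmax(z, t=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / t
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's softened class distribution
    and the student's softened predictions, averaged over examples."""
    p_t = softmax(teacher_logits, temperature)
    log_p_s = np.log(softmax(student_logits, temperature))
    return -(p_t * log_p_s).sum(axis=-1).mean()
```

The loss is minimized when the student reproduces the teacher's distribution, so relative probabilities among wrong classes (the "dark knowledge") are transferred too.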
Slides 17–18: Results
Slide 20: Motivation
Humans can recognize the "gist" of a scene, and this is accomplished by relying on relevant prior knowledge.
Slide 21: Contributions
- a memory-guided interleaving framework in which multiple feature extractors share a common memory
- an adaptive interleaving policy
- an on-device demonstration of the fastest mobile video detection model
Slide 22: Framework
Slide 23: Interleaved Models
SSD-style [24] detection with two feature extractors:
- f0, optimized for accuracy
- f1, optimized for speed
- a shared memory module
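A fixed-rate version of the interleaving idea can be sketched in a few lines: run the accurate extractor f0 every tau frames and the fast extractor f1 otherwise, with both reading and updating the shared memory. The function names, the `tau` parameter, and the fixed schedule are illustrative (the paper's adaptive policy replaces the fixed schedule):

```python
def interleaved_inference(frames, f0, f1, memory, tau=10):
    """Fixed-interval interleaving sketch.

    f0, f1: callables (frame, memory) -> (detections, new_memory)
    tau:    run f0 every tau frames, f1 in between
    """
    outputs = []
    for i, frame in enumerate(frames):
        extractor = f0 if i % tau == 0 else f1
        detections, memory = extractor(frame, memory)  # shared memory
        outputs.append(detections)
    return outputs
```

Because the memory carries context between frames, f1 can stay tiny: it only needs to update the memory, not re-extract everything from scratch.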
Slide 24: Interleaved Models: Memory Module Bottlenecking
Divide the LSTM state into groups and use grouped convolutions.
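Splitting the state's channels into G groups and applying an independent pointwise (1x1) convolution per group cuts the parameter and compute cost by a factor of G versus a full convolution. A NumPy sketch of a grouped 1x1 conv over an LSTM state; shapes and the function name are illustrative:

```python
import numpy as np

def grouped_pointwise_conv(state, weights):
    """Grouped 1x1 convolution over channel groups.

    state:   (c, h, w) LSTM state
    weights: (G, c//G, c//G) one small kernel per group
    returns  (c, h, w)
    """
    g, c_out_g, c_in_g = weights.shape
    c, h, w = state.shape
    assert c == g * c_in_g, "channels must split evenly into groups"
    groups = state.reshape(g, c_in_g, h * w)
    # independent per-group linear map over channels
    out = np.einsum("goc,gcp->gop", weights, groups)
    return out.reshape(c, h, w)
```

A full 1x1 conv would need c x c weights; the grouped version needs only G x (c/G)^2 = c^2/G, which is where the bottlenecking saving comes from.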
Slide 25: Results
Slide 26: Adaptive Interleaving Policy
The state includes an action history η, a binary vector of length 20: for all k, the k-th entry of η is 1 if f1 was run k steps ago and 0 otherwise.
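The action history can be maintained as a simple shift register: at every step, push a new bit at position 0 (was f1 just run?) and drop the oldest entry. A plain-Python sketch; the function name is illustrative:

```python
def update_action_history(history, ran_f1):
    """Shift the binary action history one step.

    history: list of 20 ints; entry k is 1 iff f1 was run k steps ago
    ran_f1:  whether f1 was run at the current step
    """
    return [1 if ran_f1 else 0] + history[:-1]
```

Keeping this vector in the state lets the learned policy see how stale the accurate features are before deciding which extractor to run next.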
Slide 27: Adaptive Interleaving Policy
The reward is defined as the sum of a speed reward and an accuracy reward. For the speed reward, define a positive constant γ and grant γ whenever f1 is run.
Slide 28: Adaptive Interleaving Policy
For the accuracy reward, compute the detection loss after running each feature extractor, and take the loss difference between the minimum-loss feature extractor and the selected feature extractor.
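The two terms from the last two slides combine into one scalar reward per step. A sketch with illustrative names; the value of gamma and any scaling between the two terms are assumptions:

```python
def interleaving_reward(action, losses, gamma=0.4):
    """Reward = speed term + accuracy term.

    action: index of the extractor actually run (1 = fast f1)
    losses: detection loss that each extractor would have incurred
    gamma:  positive speed bonus granted when f1 is run
    """
    speed = gamma if action == 1 else 0.0
    # non-positive gap between the best extractor and the chosen one
    accuracy = min(losses) - losses[action]
    return speed + accuracy
```

The accuracy term is zero when the chosen extractor was already the best one and negative otherwise, so gamma directly sets how much accuracy the policy may trade away for speed.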
Slide 29: Adaptive Interleaving Policy
Slide 30: Inference Optimizations
- asynchronous inference
- quantization
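Quantization for on-device inference typically maps float tensors to low-bit integers with a uniform affine scheme. A minimal round-trip sketch of that idea; the exact scheme (per-tensor, asymmetric, 8-bit) is an illustrative assumption, not necessarily what this paper's deployment used:

```python
import numpy as np

def quantize_dequantize(x, num_bits=8):
    """Uniform affine fake-quantization round trip.

    Maps x to num_bits-wide integers and back, introducing at most
    half a quantization step of error per element.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid 0 for constant input
    q = np.clip(np.round((x - lo) / scale), qmin, qmax).astype(np.uint8)
    return q * scale + lo
```

Running the integer side of this map on-device shrinks the model and speeds up inference at the cost of the bounded rounding error shown here.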
Slides 31–32: Experiments