R-CNN region By Ilia Iofedov 11/11/2018 BGU, DNN course 2016.

Slides:



Advertisements
Similar presentations
Rich feature Hierarchies for Accurate object detection and semantic segmentation Ross Girshick, Jeff Donahue, Trevor Darrell, Jitandra Malik (UC Berkeley)
Advertisements

Lecture 6: Classification & Localization
Classification spotlights
1 Accurate Object Detection with Joint Classification- Regression Random Forests Presenter ByungIn Yoo CS688/WST665.
R-CNN By Zhang Liliang.
Spatial Pyramid Pooling in Deep Convolutional
On the Object Proposal Presented by Yao Lu
From R-CNN to Fast R-CNN
Generic object detection with deformable part-based models
Presented by: Kamakhaya Argulewar Guided by: Prof. Shweta V. Jain
“Secret” of Object Detection Zheng Wu (Summer intern in MSRNE) Sep. 3, 2010 Joint work with Ce Liu (MSRNE) William T. Freeman (MIT) Adam Kalai (MSRNE)
Detection, Segmentation and Fine-grained Localization
Object Detection with Discriminatively Trained Part Based Models
BING: Binarized Normed Gradients for Objectness Estimation at 300fps
Object detection, deep learning, and R-CNNs
Fully Convolutional Networks for Semantic Segmentation
CS 1699: Intro to Computer Vision Detection II: Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 12, 2015.
Object Detection Overview Viola-Jones Dalal-Triggs Deformable models Deep learning.
Learning Features and Parts for Fine-Grained Recognition Authors: Jonathan Krause, Timnit Gebru, Jia Deng, Li-Jia Li, Li Fei-Fei ICPR, 2014 Presented by:
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation.
Object Recognition as Ranking Holistic Figure-Ground Hypotheses Fuxin Li and Joao Carreira and Cristian Sminchisescu 1.
Cascade Region Regression for Robust Object Detection
Lecture 4a: Imagenet: Classification with Localization
Rich feature hierarchies for accurate object detection and semantic segmentation 2014 IEEE Conference on Computer Vision and Pattern Recognition Ross Girshick,
Spatial Localization and Detection
Week 4: 6/6 – 6/10 Jeffrey Loppert. This week.. Coded a Histogram of Oriented Gradients (HOG) Feature Extractor Extracted features from positive and negative.
Deep Learning Overview Sources: workshop-tutorial-final.pdf
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition arXiv: v4 [cs.CV(CVPR)] 23 Apr 2015 Kaiming He, Xiangyu Zhang, Shaoqing.
When deep learning meets object detection: Introduction to two technologies: SSD and YOLO Wenchi Ma.
Recent developments in object detection
CS 4501: Introduction to Computer Vision Object Localization, Detection, Semantic Segmentation Connelly Barnes Some slides from Fei-Fei Li / Andrej Karpathy.
Faster R-CNN – Concepts
Convolutional Neural Network
Object Detection based on Segment Masks
Compact Bilinear Pooling
Object detection with deformable part-based models
Krishna Kumar Singh, Yong Jae Lee University of California, Davis
Understanding and Predicting Image Memorability at a Large Scale
Lecture 24: Convolutional neural networks
Combining CNN with RNN for scene labeling (segmentation)
Object detection, deep learning, and R-CNNs
Training Techniques for Deep Neural Networks
CS6890 Deep Learning Weizhen Cai
Object detection as supervised classification
Dynamic Routing Using Inter Capsule Routing Protocol Between Capsules
Object detection.
Fully Convolutional Networks for Semantic Segmentation
Bird-species Recognition Using Convolutional Neural Network
Computer Vision James Hays
Recognition IV: Object Detection through Deep Learning and R-CNNs
Introduction to Neural Networks
Image Classification.
RGB-D Image for Scene Recognition by Jiaqi Guo
“The Truth About Cats And Dogs”
Object Detection + Deep Learning
On-going research on Object Detection *Some modification after seminar
Object Classes Most recent work is at the object level We perceive the world in terms of objects, belonging to different classes. What are the differences.
Faster R-CNN By Anthony Martinez.
Lecture: Deep Convolutional Neural Networks
Outline Background Motivation Proposed Model Experimental Results
Object Tracking: Comparison of
RCNN, Fast-RCNN, Faster-RCNN
Heterogeneous convolutional neural networks for visual recognition
Convolutional Neural Network
Department of Computer Science Ben-Gurion University of the Negev
Deep Object Co-Segmentation
VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION
Semantic Segmentation
Object Detection Implementations
Jiahe Li
Presentation transcript:

R-CNN region By Ilia Iofedov 11/11/2018 BGU, DNN course 2016

Sources Main paper Additional sources: “Rich feature hierarchies for accurate object detection and semantic segmentation”, Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik; UC Berkeley 2014 Additional sources: Stanford course: Convolutional Neural Networks for Visual Recognition Ross Girshick papers and slides 11/11/2018

Introduction Computer vision problems: Classification Localization Detection Segmentation 11/11/2018

Classification problem The problem is decide what is shown on the picture : a dog or a cat Solution ConvNet AlexNet 11/11/2018

Localization problem The localization problem: Is it a cat (or cats) on a picture The number of cats is known Draw a box around the cat 11/11/2018

Detection problem The detection problem: Find all cats and dogs on the picture The number of cats and dogs isn’t known Draw a box around each cat 11/11/2018

Segmentation problem The segmentation problem: Find all cats on the picture The number of cats isn’t known Mark all pixels belong to the cats (draw a couture) 11/11/2018

Localization Method Localization regression network Classification + Localization The output : Class Box is defined by (x, y, w, h) Classification + Localization regression network class Softmax loss (x,y,w,h) L2 loss 11/11/2018

Localization Method Sliding Window: Overfeat Run classification + regression network at multiple locations on a high resolution image Combine classifier and regressor predictions across all locations for final prediction 11/11/2018

Localization Method Sliding Window: Overfeat Run classification + regression network at multiple locations on a high resolution image Combine classifier and regressor predictions across all locations for final prediction 11/11/2018

Localization Method Sliding Window: Overfeat Run classification + regression network at multiple locations on a high resolution image Combine classifier and regressor predictions across all locations for final prediction Overfeat method reduces the complexity by joint calculations 11/11/2018

Detection Sliding Window Apply window all possible positions and scales Very complex Pre-processing for feature extraction Histogram of oriented gradients (HOG) Next run linear classifier on all possible positions 11/11/2018

Detection We would like to run CNN on many positions and scales Problem: It is too complex Solution: Run CNN on sub-set of the positions, which are most likely representing the desired object 11/11/2018

Region proposals Selective search Grouping Edge boxes Window scoring Good comparison: “What makes for effective detection proposals?”, Jan Hosang at el 11/11/2018

Selective search Start form a set of pixels Merge similar color pixels to the regions 11/11/2018

R-CNN Idea 11/11/2018

R-CNN Idea Region proposals (selective search) ~2000 per image Wrap each region to predefined size sub-image Run CNN on each sub-image for 4096 future extraction Classification (per class SVM) Regression of localization 11/11/2018

R-CNN training Step1: Start with trained CNN (AlexNet) trained on ImegeNet (1000 classes) Step 2: Modify the last fully connected layer to output 21 classes scores (20 +background) Train CNN with PASCAL VOC data set (20 classes) Step 3: Run region propose algorithm on the images set Wrap the regions to sub-images Run forward the sub-images through the CNN and save the feature pool layer outputs Step 4: Train the SVMs using save before features. Separate binary SVM for each class Step 5 For each class train box regression. Linear regression model is used. The goal is a final tuning of the proposed region 11/11/2018

Performance evaluation Metric : mean average precision (mAP) 0-100% A detection is succeeded, if the class is correct and Intersection of Unions (IoU) with the true object box < threshold ( e.g. 0.5) Data sets PASCAL VOC: 20 classes, ~20K images ILSVRC 2014: 200 classes, ~470K images 11/11/2018

Results VOC 2010 test 11/11/2018

Results 11/11/2018

Problem of R-CNN Slow at test time: needs to run full forward pass for each region proposal Complicated and far from optimal training SVM is CNN are trained as separated systems 11/11/2018

Fast R-CNN First applies CNN on large scale image, the output is feature map Shared computation for the proposed regions Poling of the regions of interest the feature map The rest is the same as in R-CNN 11/11/2018

Fast R-CNN Simple training Less computation at the test time 11/11/2018

Faster R-CNN In fast R-CNN, the bottle neck is region proposal method A new region proposal network Simplified training Speed up X200 as compared to R-CNN 11/11/2018

Questions ? Thank you 11/11/2018