Faster R-CNN By Anthony Martinez.

Slides:

Advertisements

Similar presentations

Rich feature Hierarchies for Accurate object detection and semantic segmentation Ross Girshick, Jeff Donahue, Trevor Darrell, Jitandra Malik (UC Berkeley)

Advertisements

Regionlets for Generic Object Detection

Lecture 6: Classification & Localization

Recognition: A machine learning approach

Statistical Recognition Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Kristen Grauman.

Object Recognition: Conceptual Issues Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and K. Grauman.

Object Recognition: Conceptual Issues Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and K. Grauman.

R-CNN By Zhang Liliang.

DeeperVision and DeepInsight Solutions

From R-CNN to Fast R-CNN

Generic object detection with deformable part-based models

Computer Vision CS 776 Spring 2014 Recognition Machine Learning Prof. Alex Berg.

Last part: datasets and object collections. CMU/MIT frontal facesvasc.ri.cmu.edu/idb/html/face/frontal_images cbcl.mit.edu/software-datasets/FaceData2.html.

Detection, Segmentation and Fine-grained Localization

Object detection, deep learning, and R-CNNs

Fully Convolutional Networks for Semantic Segmentation

Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation.

Cascade Region Regression for Robust Object Detection

Lecture 4a: Imagenet: Classification with Localization

Rich feature hierarchies for accurate object detection and semantic segmentation 2014 IEEE Conference on Computer Vision and Pattern Recognition Ross Girshick,

Spatial Localization and Detection

Radboud University Medical Center, Nijmegen, Netherlands

When deep learning meets object detection: Introduction to two technologies: SSD and YOLO Wenchi Ma.

Recent developments in object detection

LSUN Semantic Segmentation Extended PSPNet

CS 4501: Introduction to Computer Vision Object Localization, Detection, Semantic Segmentation Connelly Barnes Some slides from Fei-Fei Li / Andrej Karpathy.

Faster R-CNN – Concepts

CS 6501: 3D Reconstruction and Understanding Convolutional Neural Networks Connelly Barnes.

Machine Learning for Big Data

Object Detection based on Segment Masks

Object detection with deformable part-based models

Data Mining, Neural Network and Genetic Programming

Object detection, deep learning, and R-CNNs

Lecture 5 Smaller Network: CNN

Training Techniques for Deep Neural Networks

R-CNN region By Ilia Iofedov 11/11/2018 BGU, DNN course 2016.

Non-linear classifiers Neural networks

Object detection.

HOGgles Visualizing Object Detection Features

A Convolutional Neural Network Cascade For Face Detection

Speaker: Lingxi Xie Authors: Lingxi Xie, Qi Tian, Bo Zhang

Bird-species Recognition Using Convolutional Neural Network

Computer Vision James Hays

Aoxiao Zhong Quanzheng Li Team HMS-MGH-CCDS

Recognition IV: Object Detection through Deep Learning and R-CNNs

Image Classification.

Counting in Dense Crowds using Deep Learning

Object Detection + Deep Learning

On-going research on Object Detection *Some modification after seminar

Very Deep Convolutional Networks for Large-Scale Image Recognition

Smart Robots, Drones, IoT

Object Detection Creation from Scratch Samsung R&D Institute Ukraine

Outline Background Motivation Proposed Model Experimental Results

Object Tracking: Comparison of

Analysis of Trained CNN (Receptive Field & Weights of Network)

RCNN, Fast-RCNN, Faster-RCNN

Course Recap and What’s Next?

Department of Computer Science Ben-Gurion University of the Negev

Human-object interaction

Deep Object Co-Segmentation

Semantic Segmentation

Object Detection Implementations

Learning Deconvolution Network for Semantic Segmentation

End-to-End Facial Alignment and Recognition

CSC 578 Neural Networks and Deep Learning

Point Set Representation for Object Detection and Beyond

Introduction Face detection and alignment are essential to many applications such as face recognition, facial expression recognition, age identification,

Van-Thanh Hoang May 11, 2019 Improving Object Localization with Fitness NMS and Bounded IoU Loss Lachlan Tychsen-Smith, Lars.

Presentation transcript:

Faster R-CNN By Anthony Martinez

VS Faster R-CNN (Object localization/ detection) Dog (Image classification) VS

Faster R-CNN Self driving cars? Pedestrian detection Surveillance Data analysis Extract information from images …... Faster R-CNN Why do we need this?

Regions with Convolutional Neural Networks (R-CNN) Fast R-CNN { Selective Search Faster R-CNN Fast R-CNN Faster R-CNN

Bounding Box proposals Selective Search

Selective Search Faster R-CNN replaces bounding box proposals with a fully convolutional method. Why? Because convolutions, that’s why! Actually, SS was really slow.

Two main parts Region Proposal Network Fast R-CNN (also this) Pre-trained network

VGG-16 Nothing to See here. image 13 sharable convolutions

VGG- 16 Feature Map RPN Fast R-CNN

Regional Proposal Network (RPN) Foreground vs Background Bounding Box regression Feed bounding boxes into Fast RCNN Regional Proposal Network (RPN)

Mapping the center of the receptive fields (0,0) (1,3) (0,0) (16,48)

Anchor Boxes k anchor boxes 3 scales (8,16, 32) 3 aspect ratios (.5, 1, 2) Stride 16 WHk anchors W Feature Map Nothing to See here. H Center (x,y)

Anchor Boxes

Anchor Boxes

RPN Object vs Not an Object Object = 1 to: Anchor RPN Object = 1 to: Anchors with the highest Intersection-over-Union(IoU) IoU > 0.7 with any ground truth box. Not object = -1 If IoU <0.3

RPN 512-d (x,y) Mapping the center of the receptive fields (Sx,Sy)

RPN (512 × (2 + 4) × 9) parameters for VGG-16) SigmoidCrossEntropyLoss SmoothL1Loss RPN 512

RPN Multi-task loss: Mini batch size =256 Hyper parameter =10 Only if p*= 1 Number of Anchor locations

Fast R-CNN RoI Pooling NMS For each proposal (Non-max suppression)

Region of Interest (RoI): Fast R-CNN Region of Interest (RoI):

Region of Interest (RoI): Fast R-CNN Region of Interest (RoI): 3X3 RoI pooling .74 | .39 | .34 .2 | .16 | .73 .83 | .97 | .88

Region of Interest (RoI): Fast R-CNN 7X7 RoI pooling Per proposal Region of Interest (RoI): .74 | .39 | .34 .2 | .16 | .73 .83 | .97 | .88 Only a problem for segmentation

Fast R-CNN label softmax RoI Pooling Fully connected layers NMS For each proposal Bbox regression bbox

Pascal 2007 1 GPU ~ 6 hrs. ResNet50 with FPN 100000 iterations aeroplane 0.6956 bicycle 0.7861 bird 0.7006 boat 0.5905 bottle 0.5609 bus 0.7418 car 0.7928 cat 0.8337 chair 0.4722 cow 0.7728 Dining table 0.619 dog 0.8537 horse 0.8002 motorbike 0.7759 person 0.7748 Potted plant 0.4103 sheep 0.6723 sofa 0.672 train 0.7412 Tv monitor 0.656 mAP 0.69612 1 GPU ~ 6 hrs. ResNet50 with FPN 100000 iterations Base LR: .0025 Steps: 30K, 40K -> *.1 RoIAlign

1 GPU ~ 5 hrs. 1 GPU ~ 8 hrs. ResNet50 ResNet50 60K iterations Base LR: .0025 Steps: 30K, 40K -> *.1 RoIAlign 1 GPU ~ 8 hrs. ResNet50 100K iterations Base LR: .001 Steps: 30K, 40K -> *.1 RoIPooling mAP 0.6425 mAP 0.6667 1 GPU ~ 7 hrs. ResNet50 100K iterations Base LR: .002 Steps: 30K, 40K, 80K -> *.1 RoIAlign 1 GPU ~ 8 hrs. ResNet50 200K iterations Base LR: .001 Steps: 30K, 40K, 130K, 140K -> *.1 RoIPooling mAP 0.7023 mAP 0.6031

References https://arxiv.org/pdf/1506.01497.pdf (Faster R-CNN) https://arxiv.org/pdf/1504.08083.pdf (Fast R-CNN) https://arxiv.org/pdf/1506.06981.pdf (R-CNN minus R) https://koen.me/research/pub/uijlings-ijcv2013-draft.pdf (Selective Search for Object Detection) https://arxiv.org/pdf/1703.06870.pdf (Mask R-CNN) http://host.robots.ox.ac.uk/pascal/VOC/ https://www.dropbox.com/s/xtr4yd4i5e0vw8g/iccv15_tutorial_training_rbg.pdf?dl=0 http://kaiminghe.com/iccv15tutorial/iccv2015_tutorial_convolutional_feature_maps_kaiminghe.pdf https://lovesnowbest.site/2018/02/27/Intro-to-Object-Detection/ https://blog.deepsense.ai/region-of-interest-pooling-explained/ https://tryolabs.com/blog/2018/01/18/faster-r-cnn-down-the-rabbit-hole-of-modern-object-detection/