Faster R-CNN – Concepts
Student presentation by: Assaf Livne
Based on the work of: Ross Girshick, Shaoqing Ren, Kaiming He
Ben-Gurion University of the Negev, Faculty of Engineering Sciences, Department of Electrical Engineering
Introduction R-CNN Concepts Fast R-CNN Concepts Faster R-CNN Concepts Conclusion
17/11/2016 Introduction
ImageNet https://www.youtube.com/watch?v=n5uP_LP9SmM
KITTI https://www.youtube.com/watch?v=QtgIyJlIn44 https://www.youtube.com/watch?v=Vj17JK3J1JU https://www.youtube.com/watch?v=s4gY6pw_mPQ
R-CNN
R-CNN Concepts
Let's combine a localization network and a classification network in the simplest way.
http://www.robots.ox.ac.uk/~tvg/publications/talks/fast-rcnn-slides.pdf
https://webcourse.cs.technion.ac.il/236815/Spring2016/ho/WCFiles/RCNN_X3_6pp.pdf
Training Process
Take a pre-trained classification network.
Re-train the last fully connected layer with the objects that need to be detected + a "no-object" class.
Extract all proposals (~2,000 per image), resize them to match the CNN input, then save them to disk.
Train an SVM to classify between object and background.
BB regression: train a linear regression model that outputs correction factors for the proposal box.
https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/object_localization_and_detection.html
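The bounding-box regression step above learns "correction factors" that map a proposal box onto the ground-truth box. A minimal numpy sketch of that parameterization (function names are mine; boxes are given as center/size, as in the R-CNN paper):

```python
import numpy as np

def bbox_to_targets(proposal, gt):
    """Regression targets (t_x, t_y, t_w, t_h) mapping a proposal
    box onto a ground-truth box. Boxes are (cx, cy, w, h)."""
    px, py, pw, ph = proposal
    gx, gy, gw, gh = gt
    return np.array([(gx - px) / pw,      # shift, relative to proposal size
                     (gy - py) / ph,
                     np.log(gw / pw),     # log-scale size correction
                     np.log(gh / ph)])

def apply_targets(proposal, t):
    """Inverse transform: apply predicted corrections to a proposal."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = t
    return np.array([px + pw * tx,
                     py + ph * ty,
                     pw * np.exp(tw),
                     ph * np.exp(th)])

proposal = np.array([50.0, 60.0, 100.0, 80.0])
gt       = np.array([55.0, 58.0, 120.0, 90.0])
t = bbox_to_targets(proposal, gt)
recovered = apply_targets(proposal, t)
print(np.allclose(recovered, gt))  # True -- the transform round-trips
```

The regressor is trained to predict `t` from the proposal's CNN features; at test time `apply_targets` refines the raw proposal.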
R-CNN Drawbacks
Numerous candidate object locations (~2,000 per image) must be processed, so both training and test time are slow.
Ad hoc, multi-stage training objective: the SVM classifier and bounding-box regressor are trained separately, after the CNN features are fixed.
Fast R-CNN
Fast R-CNN Concepts
Instead of running the CNN on every proposal, let's try to save some resources: compute the convolutional features once per image.
Train all the layers in a single stage.
No disk storage is needed for feature caching.
Inspired by VGG16, a very deep CNN.
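A toy numpy illustration of why computing the convolutional features once is enough (my own sketch, not the paper's code): for a padding-free convolution, the features of a cropped proposal equal the corresponding slice of the full-image feature map, so per-proposal forward passes are redundant.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Plain 'valid' 2-D cross-correlation (no padding)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i+kh, j:j+kw] * kernel).sum()
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))
kernel = rng.standard_normal((3, 3))

full = conv2d_valid(img, kernel)          # run the conv ONCE per image

# R-CNN style: re-run the conv on the cropped proposal.
r0, r1, c0, c1 = 5, 20, 8, 25
per_proposal = conv2d_valid(img[r0:r1, c0:c1], kernel)

# Fast R-CNN style: just slice the shared feature map.
shared = full[r0:r1-2, c0:c1-2]           # 3x3 kernel shrinks each side by 2

print(np.allclose(per_proposal, shared))  # True
```

With ~2,000 proposals per image, sharing the feature map turns ~2,000 CNN passes into one.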
http://www.robots.ox.ac.uk/~tvg/publications/talks/fast-rcnn-slides.pdf, 33
http://www.robots.ox.ac.uk/~tvg/publications/talks/fast-rcnn-slides.pdf
ROI Pooling
A type of max pooling with a pool size that depends on the input, so that the output always has the same size. This is done because a fully connected layer always expects the same input size.
https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/object_localization_and_detection.html
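The idea above can be sketched in a few lines of numpy (a simplified single-channel version, not the library implementation): divide an arbitrary HxW region into a fixed grid and max over each cell.

```python
import numpy as np

def roi_max_pool(region, out_h, out_w):
    """RoI max pooling: split an arbitrary HxW region into an
    out_h x out_w grid and take the max of each cell, so the
    output size is fixed regardless of the region size."""
    h, w = region.shape
    # Cell boundaries, rounded so the cells tile the whole region.
    ys = np.linspace(0, h, out_h + 1).astype(int)
    xs = np.linspace(0, w, out_w + 1).astype(int)
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = region[ys[i]:ys[i+1], xs[j]:xs[j+1]].max()
    return out

# Two regions of different sizes both pool to a fixed 2x2 output,
# which the fully connected layers can then consume.
a = roi_max_pool(np.arange(30.).reshape(5, 6), 2, 2)
b = roi_max_pool(np.arange(63.).reshape(7, 9), 2, 2)
print(a.shape, b.shape)  # (2, 2) (2, 2)
```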
R-CNN vs Fast R-CNN
Testing time: 49s (R-CNN) vs 2.32s (Fast R-CNN)
mAP (VOC 2007): 66% (R-CNN) vs 66.9% (Fast R-CNN)
Fast R-CNN Drawbacks
Still depends on an external object-proposal system, which is now the major bottleneck from a computing-resources point of view.
Faster R-CNN
Faster R-CNN Concepts
Use the CNN that is already running to infer the region proposals as well.
Faster R-CNN Pipeline
Get feature maps from the deep convolutional layers.
Train a Region Proposal Network (RPN).
Give the proposals to the ROI pooling layer.
Send them to a fully connected layer to finish the classification.
https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/object_localization_and_detection.html
Region Proposal Network (RPN)
The RPN is essentially a small network slid over the feature map; at each window position it outputs the locations of proposal windows together with objectness scores.
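At each sliding-window position the RPN does not regress boxes from scratch; in the Faster R-CNN paper it predicts offsets relative to k fixed "anchor" boxes (3 scales x 3 aspect ratios = 9). A small sketch of how those anchors are generated (function name is mine):

```python
import numpy as np

def make_anchors(scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """The k = len(scales) * len(ratios) anchor shapes (w, h) used at
    every sliding-window position. Each anchor keeps area scale**2
    while varying the aspect ratio h/w."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / np.sqrt(r)   # shrink width ...
            h = s * np.sqrt(r)   # ... grow height, preserving area
            anchors.append((w, h))
    return np.array(anchors)

anchors = make_anchors()
print(len(anchors))  # 9 anchors per sliding-window position
```

The RPN then scores each anchor at each position as object/background and regresses a box correction, exactly as in the R-CNN bounding-box regression step.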
Fast R-CNN vs Faster R-CNN
Testing time: 2.32s (Fast R-CNN) vs 0.2s (Faster R-CNN)
mAP (VOC 2007): 66.9% vs 66.9%
Conclusion
“Using the recently popular terminology of neural networks with ’attention’ mechanisms, the RPN module tells the Fast R-CNN module where to look.”
’Attention’ Mechanisms
Rather than using all available information, we need to select the most pertinent piece of information.
https://www.youtube.com/watch?v=IGQmdoK_ZfY
RNN – Encoder-Decoder Model
https://talbaumel.github.io/attention/
RNN – Attention Model
https://talbaumel.github.io/attention/
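The attention model can be sketched in a few lines (a minimal dot-product variant of my own, not the specific model in the linked post): the decoder scores each encoder state against its current state, normalizes the scores into a distribution, and uses the weighted sum as context — selecting "where to look".

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())   # subtract max for numerical stability
    return e / e.sum()

def attend(query, encoder_states):
    """Minimal dot-product attention: score each encoder state
    against the query, normalize with softmax, and return the
    weighted sum (the 'context' the decoder looks at)."""
    scores = encoder_states @ query     # one relevance score per state
    weights = softmax(scores)           # distribution: 'where to look'
    context = weights @ encoder_states  # weighted combination of states
    return context, weights

rng = np.random.default_rng(1)
states = rng.standard_normal((5, 4))   # 5 encoder states, dim 4
query = rng.standard_normal(4)         # current decoder state

context, weights = attend(query, states)
print(np.isclose(weights.sum(), 1.0))  # True -- a distribution over inputs
```

The RPN plays the same role for Fast R-CNN: its proposals are the "weights" that tell the detector which parts of the image to attend to.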
Natural Language Processing (NLP)
Show, Attend and Tell – Kelvin Xu et al., 2015
Thank you for your attention!