Training Techniques for Deep Neural Networks

Presentation transcript:

Training Techniques for Deep Neural Networks
Deep Learning Seminar, School of Electrical Engineering – Tel Aviv University
Yuval Yaacoby & Tal Shapira

Presentation based on the paper “Return of the Devil in the Details: Delving Deep into Convolutional Nets”, K. Chatfield, K. Simonyan, A. Vedaldi, A. Zisserman, BMVC 2014

Outline
Background: Fisher vector; structure of convolutional neural networks
Paper scenarios
Training and target datasets
Paper results

Generic Visual Categorization Using Shallow Methods
Patch detection: interest points, segments, regular patches…
Feature extraction: SIFT, color statistics, moments…
Visual dictionary: k-means, GMM, random forest…
Image representation: BOV, FV…
Classification: SVM, softmax…

SIFT: Scale-Invariant Feature Transform
Distinctive Image Features from Scale-Invariant Keypoints, D. Lowe, University of British Columbia, 2004
Extracts keypoints from an image and computes their descriptors
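
As a concrete illustration of keypoint extraction and description, here is a minimal sketch using OpenCV (cv2.SIFT_create() is available in OpenCV 4.4+); "image.jpg" is a placeholder path.

```python
import cv2

# Load an image and convert it to grayscale, since SIFT operates on intensities.
img = cv2.imread("image.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect keypoints and compute their 128-D SIFT descriptors.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(gray, None)
print(len(keypoints), descriptors.shape)  # (num_keypoints, 128)
```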

Fisher Vector (1)
The feature vector is the gradient (derivative) of the log-likelihood of a probabilistic model with respect to its parameters
Similarity between samples is measured with the Fisher kernel, which normalizes the gradients by the Fisher information matrix
Learning a kernel classifier on the Fisher kernel is equivalent to learning a linear classifier on the normalized gradient vectors
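
The slide's formulas did not survive the transcript; the following is the standard formulation of the Fisher kernel (as in Jaakkola & Haussler and Perronnin et al.), assumed to be what the slide showed.

```latex
% Score (gradient of the log-likelihood) of a sample X under the model u_\lambda
G^X_\lambda = \nabla_\lambda \log u_\lambda(X)

% Fisher information matrix of the model
F_\lambda = \mathbb{E}_{x \sim u_\lambda}\!\left[\nabla_\lambda \log u_\lambda(x)\,
            \nabla_\lambda \log u_\lambda(x)^{\top}\right]

% Fisher kernel between samples X and Y
K(X, Y) = {G^X_\lambda}^{\top}\, F_\lambda^{-1}\, G^Y_\lambda

% With the decomposition F_\lambda^{-1} = L_\lambda^{\top} L_\lambda, a kernel classifier
% on K is equivalent to a linear classifier on the normalized Fisher vector
\mathcal{G}^X_\lambda = L_\lambda\, G^X_\lambda
```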

Fisher Vector (2)
Choose u_λ to be a Gaussian mixture model (GMM) with K components
Let X = {x_1, …, x_T} be a set of D-dimensional local descriptors extracted from an image
The Fisher vector is the concatenation of the per-Gaussian gradients with respect to the mean and variance, for i = 1…K, and is therefore 2KD-dimensional
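
The per-Gaussian gradients, again in the standard form from Perronnin et al. (assumed to match the slide's missing formulas), with γ_t(i) the soft assignment of descriptor x_t to Gaussian i:

```latex
\gamma_t(i) = \frac{w_i\, u_i(x_t)}{\sum_{j=1}^{K} w_j\, u_j(x_t)}

\mathcal{G}^X_{\mu,i} = \frac{1}{T\sqrt{w_i}} \sum_{t=1}^{T} \gamma_t(i)
    \left(\frac{x_t - \mu_i}{\sigma_i}\right),
\qquad
\mathcal{G}^X_{\sigma,i} = \frac{1}{T\sqrt{2 w_i}} \sum_{t=1}^{T} \gamma_t(i)
    \left[\frac{(x_t - \mu_i)^2}{\sigma_i^2} - 1\right]

% Each term is D-dimensional (operations are element-wise), so concatenating over
% i = 1, ..., K gives the 2KD-dimensional Fisher vector.
```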

Improved Fisher Vector
Improving the Fisher Kernel for Large-Scale Image Classification, Florent Perronnin, Jorge Sánchez, and Thomas Mensink, 2010
L2 normalization: removes the FV's dependence on the amount of image-specific vs. background information
Power normalization: “unsparsifies” the FV representation
Spatial pyramids

IFV 1: L2 Normalization
By construction, the Fisher vector discards descriptors that are likely to occur in any image, so it focuses on image-specific features
However, the FV still depends on the amount of image-specific vs. background information
L2 normalization removes this dependence

IFV 2: Power Normalization
As the number of Gaussians increases, the FV becomes sparser: fewer descriptors are assigned with significant probability to each Gaussian
Power normalization (an element-wise signed power, typically the signed square root) is applied to “unsparsify” the representation
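
A minimal sketch of both normalization steps applied to a Fisher vector; alpha = 0.5 corresponds to the signed square root used in the improved FV.

```python
import numpy as np

def improved_fv_normalize(fv, alpha=0.5):
    """Power ("signed square root") normalization followed by L2 normalization."""
    fv = np.sign(fv) * np.abs(fv) ** alpha  # power normalization ("unsparsify")
    norm = np.linalg.norm(fv)
    if norm > 0:
        fv = fv / norm                      # L2 normalization
    return fv

# Example: normalize a random 2*K*D-dimensional Fisher vector (K=256, D=64).
fv = np.random.randn(2 * 256 * 64)
fv_normalized = improved_fv_normalize(fv)
```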

IFV 3: Spatial Pyramids
Multi-level recursive image decomposition that takes rough geometry into account
The image is repeatedly subdivided and descriptor-level statistics are pooled at increasingly fine resolutions

Structure of a CNN
General CNN architecture

Structure of AlexNet
ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky et al., 2012
1.2M images in 1K categories
5 convolutional layers and 3 fully connected layers

Convolutional Layer (1)
Accepts a volume of size W1×H1×D1
Requires four hyper-parameters: the number of filters K, their spatial extent F (receptive field size), the stride S, and the amount of zero padding P
Produces a volume of size W2×H2×D2, where:
W2 = (W1 − F + 2P)/S + 1
H2 = (H1 − F + 2P)/S + 1
D2 = K

Convolutional Layer (2)
Example (AlexNet conv1): receptive field size 11, stride 4, zero padding 0, input depth 3
Conv layer output: 55×55×96
55·55·96 = 290,400 neurons, each with 11·11·3 = 363 weights (plus a bias)
Parameter sharing: neurons in the same depth slice share their weights
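
A quick sanity check of the output-size formula with these numbers, assuming the 227×227×3 input resolution commonly used to make the AlexNet conv1 arithmetic work out:

```python
def conv_output_size(w1, h1, k, f, s=1, p=0):
    """Output volume of a conv layer: W2 = (W1 - F + 2P)/S + 1 (and likewise for H)."""
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    return w2, h2, k

# AlexNet conv1: 96 filters of size 11x11, stride 4, no padding, 227x227x3 input.
print(conv_output_size(227, 227, k=96, f=11, s=4, p=0))  # -> (55, 55, 96)
```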

ReLU Nonlinearity
Non-saturating nonlinearity: f(x) = max(0, x)
Quick to learn: networks with ReLUs train considerably faster than with saturating units

Pooling Layers
Max pooling; overlapping pooling (stride smaller than the pooling window)
Krizhevsky et al.: “We generally observe during training that models with overlapping pooling find it slightly more difficult to overfit”
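
A minimal PyTorch sketch of the overlapping max pooling used in AlexNet (3×3 windows with stride 2, so neighbouring pooling regions overlap):

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=3, stride=2)  # overlapping: stride < window size

x = torch.randn(1, 96, 55, 55)  # e.g. the conv1 activation volume
print(pool(x).shape)            # -> torch.Size([1, 96, 27, 27])
```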

Local Response Normalization (LRN)
A form of “lateral inhibition”: each activation is normalized across neighbouring channels (“brightness normalization”)
Reduces AlexNet's top-1 and top-5 error rates by 1.4% and 1.2%, respectively
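
A rough PyTorch sketch with hyper-parameters taken from the AlexNet paper (n=5, k=2, α=1e-4, β=0.75); note that PyTorch scales α by 1/n internally, so this only approximates the paper's exact formulation.

```python
import torch
import torch.nn as nn

lrn = nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0)  # across-channel LRN

x = torch.randn(1, 96, 55, 55)
y = lrn(x)      # same shape; each activation is divided by a sum over nearby channels
print(y.shape)  # -> torch.Size([1, 96, 55, 55])
```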

Fully Connected Layers
Penultimate layer
Dropout layer
Softmax layer
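
A minimal sketch of an AlexNet-style fully connected head (layer sizes assumed, not taken from the slide): dropout before each hidden FC layer, and a final linear layer whose scores are turned into class probabilities by the softmax folded into the cross-entropy loss.

```python
import torch
import torch.nn as nn

head = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),
    nn.Linear(4096, 4096), nn.ReLU(inplace=True),  # penultimate layer
    nn.Linear(4096, 1000),                         # class scores (logits)
)

x = torch.randn(8, 256 * 6 * 6)  # flattened conv features for a batch of 8 images
logits = head(x)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 1000, (8,)))  # softmax + NLL
```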

One-vs-the-Rest SVM Classifier
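
The slide's body did not survive the transcript; as an illustration, here is a minimal scikit-learn sketch of one-vs-rest linear SVMs trained on image descriptors (e.g. normalized Fisher vectors or CNN penultimate-layer features); the data is random and only demonstrates the API.

```python
import numpy as np
from sklearn.svm import LinearSVC

features = np.random.randn(200, 4096)        # 200 images, 4096-D descriptors
labels = np.random.randint(0, 20, size=200)  # 20 classes

clf = LinearSVC(C=1.0)  # multi-class is handled as one-vs-rest internally
clf.fit(features, labels)
predictions = clf.predict(features)
```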

Paper Scenarios
Scenario 1: shallow representation (IFV)
Scenario 2: deep representation (CNN) with pre-training
Scenario 3: deep representation (CNN) with pre-training and fine-tuning

Training Set
ILSVRC-2012: 1000 object categories from ImageNet, 1.2M training images
Training uses gradient descent with momentum
Data augmentation: RGB color jittering

Data Augmentation
Generates additional examples of each class
Three strategies: no augmentation (only cropping if needed); flip augmentation – mirroring images; C+F augmentation – cropping and flipping (see the sketch below)
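
A rough torchvision sketch of the C+F strategy combined with RGB jittering; the crop size and jitter strengths are assumptions, not the paper's exact settings.

```python
import torchvision.transforms as T

train_transform = T.Compose([
    T.Resize(256),
    T.RandomCrop(224),         # cropping
    T.RandomHorizontalFlip(),  # flipping (mirroring)
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),  # RGB color jittering
    T.ToTensor(),
])
```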

Target Datasets
ILSVRC-2012 – top-5 classification error
PASCAL VOC (2007 and 2012) – mAP
Caltech-101 and Caltech-256 – mean class accuracy

Scenario 1: IFV
Modifications for IFV:
Intra-normalization of descriptor blocks
Spatially extended local descriptors
Use of color features alongside SIFT descriptors

Scenario 2: CNN with Pre-Training
Three CNN architectures with different accuracy/speed trade-offs: Fast (CNN-F), Medium (CNN-M), and Slow (CNN-S)
CNN-M is also evaluated with lower-dimensional image representations (full7)
Color vs. grayscale input images

Paper CNN Structures

Scenario 3: CNN with Pre-Training and Fine-Tuning
Fine-tuning on the target dataset (CNN-S) – the last layer's output dimensionality equals the number of classes
VOC-2007 and VOC-2012 – multi-label datasets: one-vs-rest classification loss or ranking hinge loss
Caltech-101 – single-label dataset: softmax regression
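
A minimal PyTorch fine-tuning sketch (not the paper's exact recipe, and assuming torchvision >= 0.13): an ImageNet-pretrained network has its last fully connected layer replaced to match the number of target classes, and the whole network is trained with a small learning rate; AlexNet stands in for CNN-S, and BCE-with-logits serves as the one-vs-rest loss for multi-label VOC.

```python
import torch
import torch.nn as nn
import torchvision.models as models

num_classes = 20                                    # e.g. PASCAL VOC
model = models.alexnet(weights="IMAGENET1K_V1")     # pre-trained on ILSVRC-2012
model.classifier[6] = nn.Linear(4096, num_classes)  # new, randomly initialized last layer

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.BCEWithLogitsLoss()                  # one-vs-rest loss for multi-label data

images = torch.randn(4, 3, 224, 224)                # dummy batch
targets = torch.randint(0, 2, (4, num_classes)).float()
loss = criterion(model(images), targets)
loss.backward()
optimizer.step()
```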

Results (1)

Results (2)
Data augmentation improves performance by ~3% for both IFV and CNN
Color descriptors alone yield worse performance; combining SIFT and color descriptors improves performance by ~1% for IFV
For CNN, grayscale input drops performance by ~3%
CNN-based methods outperform the shallow encodings by ~10%, while also producing lower-dimensional output features

Results (3)
Intra-normalization improves performance by ~1% for IFV
Both CNN-M and CNN-S outperform CNN-F by a 2-3% margin; CNN-M is simpler and marginally faster
Output dimensionality can be reduced from 2048 to 128 with a drop of only ~2%
Fine-tuning on VOC-2007 with the ranking hinge loss improves performance by 2.7%

Results (4)

Conclusions
Rigorous empirical evaluation of CNN-based methods for image classification, compared with shallow feature-encoding methods
The performance of shallow representations can be improved by adopting data augmentation
Deep architectures outperform the shallow methods
Fine-tuning can further improve results

Thank you for your attention. Questions?