Very Deep Convolutional Networks for Large-Scale Image Recognition

Very Deep Convolutional Networks for Large-Scale Image Recognition By Vincent Cheong

VGG Net An attempt to improve on the architecture of Krizhevsky et al. (2012) by increasing depth while keeping the other architectural parameters fixed.

Architecture
- 8 – 16 convolution layers (3 x 3 and 1 x 1)
  - Stride 1; padding 1 for the 3 x 3 layers
  - Width (number of channels) starts at 64 and doubles after each max-pooling layer until it reaches 512
  - ReLU follows every hidden layer
- 5 max-pooling layers (2 x 2, stride 2), placed after every few convolution layers
- 3 fully connected layers, with dropout applied to the first two
- Local response normalization removed from all but one configuration
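To make the layer pattern concrete, here is a minimal PyTorch sketch of configuration D (13 conv-3 layers plus 3 FC layers); the helper name conv_block and the plain nn.Sequential layout are illustrative choices, not from the slides:

    import torch
    import torch.nn as nn

    def conv_block(in_ch, out_ch, n_convs):
        # n_convs 3x3 convolutions (stride 1, padding 1), each followed by ReLU, then a 2x2 max-pool
        layers = []
        for i in range(n_convs):
            layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, stride=1, padding=1),
                       nn.ReLU(inplace=True)]
        layers.append(nn.MaxPool2d(2, stride=2))
        return layers

    # Width starts at 64 and doubles after each pooling stage, capped at 512.
    features = nn.Sequential(
        *conv_block(3, 64, 2),
        *conv_block(64, 128, 2),
        *conv_block(128, 256, 3),
        *conv_block(256, 512, 3),
        *conv_block(512, 512, 3),
    )

    classifier = nn.Sequential(
        nn.Flatten(),
        nn.Linear(512 * 7 * 7, 4096), nn.ReLU(inplace=True), nn.Dropout(0.5),
        nn.Linear(4096, 4096), nn.ReLU(inplace=True), nn.Dropout(0.5),
        nn.Linear(4096, 1000),  # 1000 ImageNet classes
    )

    x = torch.randn(1, 3, 224, 224)   # one 224x224 RGB crop
    logits = classifier(features(x))  # shape: (1, 1000)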

3 x 3 vs 7 x 7 Convolution Layers
- A stack of three 3 x 3 conv. layers has an effective receptive field of 7 x 7.
- Why three 3 x 3 layers instead of a single 7 x 7?
  - Three non-linear rectification layers instead of one
  - Fewer parameters: three 3 x 3 layers with C channels in and out use 3(3²C²) = 27C² weights, versus 7²C² = 49C² for one 7 x 7 layer
- 3 x 3 is the smallest size that can still capture the notion of left/right, up/down, and center.
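The parameter comparison is easy to verify numerically; C = 512 below is only an example width, and the ratio 27/49 is the same for any C:

    C = 512  # example channel width

    stacked_3x3 = 3 * (3 * 3 * C * C)  # three 3x3 layers: 27 * C^2 weights
    single_7x7 = 7 * 7 * C * C         # one 7x7 layer:    49 * C^2 weights

    print(stacked_3x3, single_7x7)     # 7077888 vs 12845056, i.e. ~45% fewer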

1 x 1 Convolution Layers
- A linear projection onto a space of the same dimensionality (the channel count is preserved)
- Followed by a ReLU, so it adds non-linearity without enlarging the receptive field
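A minimal sketch of this idea in PyTorch, with C = 256 as an example channel count: a 1 x 1 convolution acts as the same per-pixel linear map at every spatial location, and the ReLU supplies the extra non-linearity.

    import torch
    import torch.nn as nn

    C = 256  # example channel count
    proj = nn.Sequential(
        nn.Conv2d(C, C, kernel_size=1),  # per-pixel linear projection, same dimensionality
        nn.ReLU(inplace=True),           # the extra non-linearity
    )

    x = torch.randn(1, C, 28, 28)
    y = proj(x)
    assert y.shape == x.shape  # spatial size (and receptive field) unchanged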

Training
- Weight initialization
  - Configuration A (the shallowest) is trained from random initialization
  - Its layers are then used to initialize the deeper architectures
- Objective: multinomial logistic regression, optimized with mini-batch gradient descent with momentum
- Image cropping
  - Training scale S, where S is the shortest side of the rescaled image (S ≥ 224)
  - Multi-scale training: S is randomly sampled from [Smin, Smax] with Smin = 256 and Smax = 512
  - This scale jittering serves as data augmentation
- Random horizontal flipping and RGB color shifting of the crops
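A sketch of the scale-jittered cropping pipeline, assuming PIL images and torchvision; the function name vgg_train_crop is my own, and the paper's PCA-based RGB color shift is omitted for brevity:

    import random
    import torchvision.transforms.functional as TF

    def vgg_train_crop(img, s_min=256, s_max=512, crop=224):
        # Scale jittering: rescale the shortest side to a random S in [Smin, Smax].
        s = random.randint(s_min, s_max)
        img = TF.resize(img, s)  # aspect ratio preserved
        # Random 224x224 crop from the rescaled image.
        w, h = img.size
        top = random.randint(0, h - crop)
        left = random.randint(0, w - crop)
        img = TF.crop(img, top, left, crop, crop)
        # Random horizontal flip.
        if random.random() < 0.5:
            img = TF.hflip(img)
        return img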

Testing
- Multi-scale evaluation
  - A large mismatch between training and testing scales hurts performance
  - The smallest image side is rescaled to the test scale Q, with Q ∈ {S − 32, S, S + 32}
- Horizontal flipping of the test image, with the class posteriors of the original and flipped versions averaged
- Dense evaluation
  - The FC layers are converted to convolutional layers (7 x 7, 1 x 1, 1 x 1)
  - The resulting class score map is spatially averaged into a single vector of class scores
  - Applied to the whole, uncropped image, which avoids recomputing the network for multiple crops
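A minimal sketch of the FC-to-conv conversion behind dense evaluation; the feature-map size (12 x 15) is a hypothetical example, and in practice the trained FC weights are reshaped and copied into these convolution layers:

    import torch
    import torch.nn as nn

    fc_as_conv = nn.Sequential(
        nn.Conv2d(512, 4096, kernel_size=7),   # first FC layer -> 7x7 conv
        nn.ReLU(inplace=True),
        nn.Conv2d(4096, 4096, kernel_size=1),  # second FC layer -> 1x1 conv
        nn.ReLU(inplace=True),
        nn.Conv2d(4096, 1000, kernel_size=1),  # classifier -> 1x1 conv
    )

    feat = torch.randn(1, 512, 12, 15)   # feature map of a larger, non-square image
    score_map = fc_as_conv(feat)         # (1, 1000, 6, 9): a spatial map of class scores
    scores = score_map.mean(dim=(2, 3))  # spatial average -> one 1000-way score vector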

Results The configurations compared (each also has 3 FC layers):
- A: 8 conv-3 layers (11 weight layers in total)
- B: 10 conv-3 (13 weight layers)
- C: 10 conv-3 + 3 conv-1 (16 weight layers)
- D: 13 conv-3 (16 weight layers)
- E: 16 conv-3 (19 weight layers)

Dense and Multi-Crop Evaluation
- Multi-crop evaluation is slightly more effective than dense evaluation
- Averaging the soft-max outputs of the two produces better results than either alone
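The combination step is just an average of class posteriors; a sketch with hypothetical score tensors for a single image:

    import torch
    import torch.nn.functional as F

    # Hypothetical class scores from the two evaluation schemes:
    dense_logits = torch.randn(1, 1000)
    multicrop_logits = torch.randn(1, 1000)

    # Average the soft-max posteriors, not the raw logits:
    posterior = 0.5 * (F.softmax(dense_logits, dim=1) + F.softmax(multicrop_logits, dim=1))
    prediction = posterior.argmax(dim=1)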

Results

Summary
- LRN does not improve performance
- More depth with smaller filters is better than a shallower network with larger filters (three 3 x 3 vs one 7 x 7)
- Increased depth reduces classification error; the error saturates at 19 layers
- Spatial context is important (3 x 3 vs 1 x 1)
- Scale jittering and dense evaluation work in conjunction to improve accuracy