Convolutional Neural Networks at Constrained Time Cost (CVPR 2015)
Authors: Kaiming He, Jian Sun (MSR)
Presenter: Hyunjun Ju
Motivation
Most recent advanced CNNs are time-consuming: they take a high-end GPU, or multiple GPUs, one to several weeks to train, which can be too demanding for a rapidly changing industry. This paper investigates the accuracy of CNN architectures under a constrained time cost.
Factors: depth, width (the number of filters), filter size, stride.
Goal: find an efficient and still relatively accurate CNN model.
Time Complexity of Convolutions
The time cost of fc layers and pooling layers is not included in this formulation (given below); these layers often take only 5-10% of the computational time.
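From the paper (its Eq. 1), the total time complexity of all convolutional layers is

\[ O\!\left( \sum_{l=1}^{d} n_{l-1} \cdot s_l^{2} \cdot n_l \cdot m_l^{2} \right) \]

where d is the depth (number of conv layers), n_l is the width (number of filters) of the l-th layer, n_{l-1} is its number of input channels, s_l is the filter size, and m_l is the spatial size of the output feature map.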
Baseline Model
After the conv stages, the network applies SPP (spatial pyramid pooling) and then three fully connected layers: full6 (4096-d), full7 (4096-d), and full8 (softmax over 1000 classes).
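A rough PyTorch sketch of this classifier head (the pyramid levels and the 256 input channels are assumptions for illustration, not the paper's exact configuration):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SPP(nn.Module):
        """Spatial pyramid pooling: pool the conv feature map over several
        fixed grids and concatenate, giving a fixed-length vector for any
        input resolution."""
        def __init__(self, levels=(1, 2, 3, 6)):   # pyramid levels assumed
            super().__init__()
            self.levels = levels

        def forward(self, x):                       # x: (N, C, H, W)
            pooled = [F.adaptive_max_pool2d(x, g).flatten(1) for g in self.levels]
            return torch.cat(pooled, dim=1)         # (N, C * sum(g*g))

    channels = 256                                  # conv output channels (assumed)
    head = nn.Sequential(
        SPP(),
        nn.Linear(channels * sum(g * g for g in (1, 2, 3, 6)), 4096),  # full6
        nn.ReLU(inplace=True),
        nn.Linear(4096, 4096),                      # full7
        nn.ReLU(inplace=True),
        nn.Linear(4096, 1000),                      # full8; softmax is applied in the loss
    )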
3-stage Design
A stage is the set of layers between two nearby pooling layers; the network is organized as three such stages, each ending in a pooling layer (see the sketch below).
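A minimal PyTorch sketch of the stage notion (all channel counts and filter sizes here are illustrative, not any specific model from the paper):

    import torch.nn as nn

    # Each stage = the conv layers between two nearby pooling layers.
    stage1 = nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=7, stride=2), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),     # pooling ends stage 1
    )
    stage2 = nn.Sequential(
        nn.Conv2d(64, 128, kernel_size=3), nn.ReLU(inplace=True),
        nn.Conv2d(128, 128, kernel_size=3), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),     # pooling ends stage 2
    )
    stage3 = nn.Sequential(
        nn.Conv2d(128, 256, kernel_size=3), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),     # pooling ends stage 3
    )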
Model Designs by Layer Replacement
Model A is the baseline; the other models are variations of model A in which a few layers are replaced with other layers that preserve the time cost. Each replacement exposes a trade-off between the design factors.
Trade-offs between Depth and Filter Sizes
Depth is more important than filter size: when the time complexity is roughly the same, deeper networks with smaller filters show better results than shallower networks with larger filters.
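To see why such swaps are cost-neutral, consider a worked example (illustrative, not one of the paper's exact models). By the complexity formula, one 5x5 conv layer with input and output width n, at output map size m, costs on the order of

\[ n \cdot 5^{2} \cdot n \cdot m^{2} = 25\, n^{2} m^{2}, \]

while replacing it with a stack of two 3x3 layers of the same width costs

\[ 2 \cdot n \cdot 3^{2} \cdot n \cdot m^{2} = 18\, n^{2} m^{2}, \]

i.e., the deeper, smaller-filter stack is no more expensive.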
Trade-offs between Depth and Width
Increasing the depth leads to considerable gains, even though the width must be properly reduced to keep the time cost fixed. However, model G is only marginally better than model F.
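The required width reduction follows from the complexity formula (a back-of-envelope sketch, assuming all layers have roughly the same width n): total cost scales as d * n^2, so doubling the depth at fixed cost means scaling the width by 1/sqrt(2), since

\[ (2d) \cdot \left( \frac{n}{\sqrt{2}} \right)^{2} = d \cdot n^{2}. \]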
Trade-offs between Width and Filter Size
Unlike depth, which has a clear priority, width and filter size do not show an apparent priority over each other.
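The corresponding cost-neutral trade (again a back-of-envelope sketch, not the paper's exact models): per-layer cost scales as n^2 * s^2, so growing the filter size from 3 to 5 at fixed cost requires scaling the width by 3/5, since

\[ \left( \tfrac{3}{5} n \right)^{2} \cdot 5^{2} = n^{2} \cdot 3^{2}. \]

Neither direction of this trade wins consistently in the experiments.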
But Is Deeper Always Better?
In the experiments, they find that accuracy stagnates or is even reduced in some of their very deep attempts. Two possible explanations:
1. The width/filter sizes are reduced too much, which may harm the accuracy.
2. Overly increasing the depth degrades the accuracy even if the other factors are not traded off.
To find the main reason, they remove the time-complexity constraint and simply add conv layers. Overly increasing depth can harm the accuracy even when the width/filter sizes are unchanged: the errors not only saturate at some point but get worse with further depth. The degradation is not due to over-fitting, since the training errors also get worse.
Adding a Pooling Layer (Feature Map Size vs. Width)
Model J (smaller feature maps, larger width) achieves error rates better than those of model E.
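The budget arithmetic behind this trade (illustrative): conv cost scales as n^2 * m^2, so an extra early pooling layer that halves the feature map size m frees a 4x budget, which allows the width n to be doubled at the same total cost:

\[ (2n)^{2} \cdot \left( \frac{m}{2} \right)^{2} = n^{2} m^{2}. \]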
Delayed Subsampling of Pooling Layers
A max pooling layer plays two different roles:
1. Lateral suppression that increases invariance to small local translations.
2. Reducing the spatial size of the feature maps by subsampling.
Usually a max pooling layer plays both roles simultaneously by using a stride > 1. The two roles can be separated into two different layers:
- the pooling layer, by setting its stride to 1;
- the following convolutional layer, by giving it the pooling layer's original stride (> 1).
This operation does not change the complexity of any convolutional layer, and the delayed model has lower top-5 error rates than the original model (see the sketch below).
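A minimal PyTorch sketch of the idea (the 256 channels and 3x3 kernels are placeholders):

    import torch.nn as nn

    # Original: the pooling layer suppresses locally AND subsamples (stride 2).
    original = nn.Sequential(
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
    )

    # Delayed: pooling keeps stride 1 (lateral suppression only); the next
    # conv layer takes over subsampling with the pooling layer's old stride.
    delayed = nn.Sequential(
        nn.MaxPool2d(kernel_size=3, stride=1),
        nn.Conv2d(256, 256, kernel_size=3, stride=2, padding=1),
    )

The conv layer's output map has (roughly) the same size in both variants, and conv cost scales with the output map size m^2, so the convolutional complexity is unchanged; only the pooling layer, which is excluded from the complexity formula, sees a larger input.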
Comparisons (Fast Models)
Model J' achieves the best performance among the fast models while its complexity stays low.
Comparisons (Accurate Models)
VGG-16 and GoogLeNet are trained with more additional data augmentation than the others. Model J' has low complexity and relatively good performance.
Conclusion
Constrained time cost is a practical issue in industrial and commercial settings. The authors propose models that are fast enough for practical applications yet more accurate than existing fast models; they are neither the most accurate models nor the fastest ones.