1 Using Neural Networks
Hanock Kwak
Biointelligence Lab
Computer Science and Engineering, Seoul National University

2 Training, Validation, and Test Data
Training set: used to adjust the weights of the neural network.
Validation set: used to reduce overfitting (e.g., for model selection and early stopping).
Test set: used only to evaluate the final model, to confirm its actual predictive power. A minimal way to split the data is sketched below.
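A minimal sketch of such a split with NumPy; the 80/10/10 ratio, the function name, and the array names are illustrative assumptions, not from the slides:

    import numpy as np

    def split_data(X, y, train=0.8, valid=0.1, seed=0):
        # Shuffle the data, then split it into train / validation / test sets.
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(X))
        n_train = int(train * len(X))
        n_valid = int(valid * len(X))
        tr = idx[:n_train]
        va = idx[n_train:n_train + n_valid]
        te = idx[n_train + n_valid:]    # whatever remains is the test set
        return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])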

3 Early Stopping
To prevent overfitting, stop training once validation performance stops improving.
for each epoch
    for each training data instance
        propagate error through the network
        adjust the weights
    calculate the accuracy over the validation data
    if validation accuracy decreased by some threshold
        exit training
    else
        continue training
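A minimal Python sketch of this loop; train_one_epoch and validation_accuracy are assumed helper functions, not part of the slides:

    def train_with_early_stopping(model, train_data, valid_data,
                                  max_epochs=100, threshold=0.01):
        # Stop training when validation accuracy drops by more than `threshold`
        # below the best accuracy seen so far.
        best_acc = -float("inf")
        for epoch in range(max_epochs):
            train_one_epoch(model, train_data)             # propagate errors, adjust weights (assumed helper)
            acc = validation_accuracy(model, valid_data)   # accuracy on the validation set (assumed helper)
            if acc < best_acc - threshold:
                break                                      # validation accuracy decreased: exit training
            best_acc = max(best_acc, acc)
        return model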

4 Adjusting the Learning Rate
Decreasing the learning rate over the course of training usually improves final performance.
for each epoch
    for each training data instance
        propagate error through the network
        adjust the weights
    calculate the accuracy over the validation data
    if validation accuracy decreased by some threshold
        if learning rate is too low
            exit training
        else
            decrease learning rate
    continue training
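A sketch of the same loop extended with learning-rate decay; the decay factor and the minimum learning rate are illustrative assumptions:

    def train_with_lr_decay(model, train_data, valid_data, lr=0.1,
                            max_epochs=100, threshold=0.01,
                            decay=0.5, min_lr=1e-5):
        # Decrease the learning rate whenever validation accuracy degrades;
        # stop once the learning rate is already very small.
        best_acc = -float("inf")
        for epoch in range(max_epochs):
            train_one_epoch(model, train_data, lr)         # assumed helper
            acc = validation_accuracy(model, valid_data)   # assumed helper
            if acc < best_acc - threshold:
                if lr <= min_lr:
                    break            # learning rate is too low: exit training
                lr *= decay          # otherwise decrease the learning rate
            best_acc = max(best_acc, acc)
        return model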

5 Cross Validation
Evaluate the model rigorously with k-fold cross validation: split the data into k folds, train on k-1 of them, validate on the remaining fold, and average the scores over all k choices of validation fold.
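A minimal k-fold cross-validation sketch in NumPy; train_and_score is an assumed helper that fits a fresh model and returns a validation score:

    import numpy as np

    def cross_validate(X, y, k=5, seed=0):
        # Average the validation score over k folds.
        rng = np.random.default_rng(seed)
        folds = np.array_split(rng.permutation(len(X)), k)
        scores = []
        for i in range(k):
            valid_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            scores.append(train_and_score(X[train_idx], y[train_idx],
                                          X[valid_idx], y[valid_idx]))
        return float(np.mean(scores))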

6 Optimizers
Instead of plain (naive) gradient descent, use an improved optimizer:
Adagrad
Adadelta
Momentum
RMSProp
Adam (recommended)
...
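As a sketch of what such an optimizer computes, here is one Adam update step in NumPy; the hyperparameters are the commonly used defaults, not values from the slides. In practice you would simply pick the built-in optimizer of your framework, e.g. torch.optim.Adam in PyTorch:

    import numpy as np

    def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        # Keep running averages of the gradient (m) and of its square (v),
        # correct their initialization bias, then take a scaled step.
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
        return w, m, v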

7 Regularizer
Another way to prevent overfitting is to regularize the parameters, e.g., by adding an L1 (or L2) penalty to the loss.
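A sketch of an L2 penalty added to the loss and to its gradient (the regularization strength lam is an illustrative assumption); for L1, the penalty would be lam * np.sum(np.abs(w)) with (sub)gradient lam * np.sign(w):

    import numpy as np

    def l2_regularized(loss, grad, w, lam=1e-4):
        # Add 0.5 * lam * ||w||^2 to the loss and lam * w to its gradient.
        reg_loss = loss + 0.5 * lam * np.sum(w ** 2)
        reg_grad = grad + lam * w
        return reg_loss, reg_grad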

8 Data Preprocessing
Do not make the model do everything. Preprocess the data so that it becomes easier for the model to learn from.

9 Data Normalization
Normalizing the data makes the neural network easier to train:
Zero-centered (subtract the mean)
Similar scale for each dimension (e.g., divide by the standard deviation)
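A minimal sketch of zero-centering and rescaling each dimension, using statistics computed on the training set only:

    import numpy as np

    def normalize(X_train, X_test):
        # Zero-center each feature and scale it to unit standard deviation.
        mean = X_train.mean(axis=0)
        std = X_train.std(axis=0) + 1e-8   # avoid division by zero
        return (X_train - mean) / std, (X_test - mean) / std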

10 Weight Initialization
All-zero initialization: bad. If every neuron in the network computes the same output, then they all compute the same gradients during backpropagation and undergo exactly the same parameter updates, so they never learn different features.
Small random numbers, scaled by the fan-in: w = np.random.randn(n) / sqrt(n)
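The same scaled-random idea applied to a full weight matrix, as a sketch (fan-in scaling as in the snippet above; the zero bias is a common choice, not from the slides):

    import numpy as np

    def init_layer(fan_in, fan_out):
        # Small random weights scaled by 1/sqrt(fan_in), zero biases.
        W = np.random.randn(fan_in, fan_out) / np.sqrt(fan_in)
        b = np.zeros(fan_out)
        return W, b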

11 Batch Normalization
Normalizes activations within each training mini-batch. Reduces internal covariate shift. Accelerates the learning process.
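A sketch of the batch-normalization forward pass at training time, with a learnable scale gamma and shift beta (epsilon for numerical stability):

    import numpy as np

    def batch_norm_forward(x, gamma, beta, eps=1e-5):
        # Normalize each feature over the mini-batch, then scale and shift.
        mu = x.mean(axis=0)     # per-feature mean over the batch
        var = x.var(axis=0)     # per-feature variance over the batch
        x_hat = (x - mu) / np.sqrt(var + eps)
        return gamma * x_hat + beta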

12 See More Information! http://cs231n.github.io/
These notes accompany the Stanford CS class CS231n: Convolutional Neural Networks for Visual Recognition.

13 Supplementary Notes on Convolutional Neural Networks

14 Why Convolutional Layers?
A filter slides over the input image to produce a feature map, extracting the same type of feature at every location in the image. Each filter is defined by its weights.
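A sketch of a single filter sliding over a single-channel image (no padding, stride 1; like most deep-learning libraries this actually computes a cross-correlation):

    import numpy as np

    def conv2d(image, kernel):
        # Slide `kernel` over `image` and return the resulting feature map.
        H, W = image.shape
        kH, kW = kernel.shape
        out = np.zeros((H - kH + 1, W - kW + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
        return out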

15 Hierarchical Features
In general, the more convolutional layers we stack, the more complex the features the network can learn to recognize.

16 Max Pooling Spatial Pooling (also called subsampling or downsampling) reduces the dimensionality of each feature map but retains the most important information.
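A sketch of 2x2 max pooling with stride 2 on a single feature map:

    import numpy as np

    def max_pool_2x2(fmap):
        # Keep the maximum of every non-overlapping 2x2 window.
        H, W = fmap.shape
        fmap = fmap[:H - H % 2, :W - W % 2]        # crop to an even size
        return fmap.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))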

17 Zero Padding
Pad the borders of the input with zeros so that the filter window can slide over every position, from the very start of the input to its end; this also lets the output keep the same spatial size as the input.
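A sketch of "same" zero padding with NumPy: pad so that a k x k filter with stride 1 produces an output of the same spatial size as the input (assumes an odd kernel size such as 3 or 5):

    import numpy as np

    def pad_same(image, kernel_size):
        # Zero-pad the image so a stride-1 convolution keeps its spatial size.
        p = kernel_size // 2
        return np.pad(image, p, mode="constant", constant_values=0)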

18 All Convolutional Nets
Remove fully connected layers.
Replace max-pooling with stride-2 convolutions.
Similar or better performance with lower computation.

19 Transposed Convolution (Deconvolution)
Roughly the inverse operation of a convolution: it upsamples a feature map by reversing the spatial downsampling of a strided convolution (it is not a true mathematical inverse).
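A sketch of a stride-2 transposed convolution on a single channel: each input value is multiplied by the kernel and scatter-added into a larger output map:

    import numpy as np

    def transposed_conv2d(x, kernel, stride=2):
        # Upsample `x` by scatter-adding the kernel, scaled by each input value.
        H, W = x.shape
        kH, kW = kernel.shape
        out = np.zeros(((H - 1) * stride + kH, (W - 1) * stride + kW))
        for i in range(H):
            for j in range(W):
                out[i * stride:i * stride + kH,
                    j * stride:j * stride + kW] += x[i, j] * kernel
        return out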

20 Residual Network
Adding identity skip (shortcut) connections helps very deep neural networks (50+ layers) train well: each block learns a residual F(x) and outputs F(x) + x.
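A sketch of the core idea with two simple fully connected layers: the block computes a residual F(x) and adds the identity shortcut x back (a real ResNet block uses convolutions and batch normalization; the weight names here are assumptions):

    import numpy as np

    def residual_block(x, W1, b1, W2, b2):
        # y = ReLU(x + F(x)), where F is two layers with a ReLU in between.
        h = np.maximum(0, x @ W1 + b1)   # first layer + ReLU
        f = h @ W2 + b2                  # residual F(x)
        return np.maximum(0, x + f)      # identity shortcut, then ReLU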

