Neural Networks
Vladimir Pleskonjić, 3188/2015
2/20 General
- Feedforward neural networks
- Inputs are numeric features
- Outputs are in the range (0, 1)
[Figure: neural network]
3/20 Perceptron
- Building blocks of the network
- Numeric inputs
- Output in (0, 1)
- Bias input (always equal to +1)
[Figure: perceptron]
4/20 Perceptron
- Inputs (bias included) are summed, each weighted by its perceived importance Θ
- Output is the sigmoid function of this weighted sum
[Figure: sigmoid function]
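The two bullets above can be sketched in a few lines of C++. This is an illustrative stand-in, not the project's own code: `perceptronOutput` and its weight layout (bias weight first, bias input fixed to +1) are assumptions for the example.

```cpp
#include <cmath>
#include <vector>

// Sigmoid activation: maps any real-valued sum into (0, 1).
double sigmoid(double z) {
    return 1.0 / (1.0 + std::exp(-z));
}

// Output of a single perceptron: inputs weighted by Theta, summed
// together with the bias (input fixed to +1), then squashed by the
// sigmoid. weights[0] is the bias weight (a layout assumed here).
double perceptronOutput(const std::vector<double>& weights,
                        const std::vector<double>& inputs) {
    double sum = weights[0];  // bias contribution: +1 * weights[0]
    for (std::size_t i = 0; i < inputs.size(); ++i)
        sum += weights[i + 1] * inputs[i];
    return sigmoid(sum);
}
```

With all weights zero the weighted sum is 0 and the output is sigmoid(0) = 0.5, i.e. maximal uncertainty.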
5/20 Architecture
- Parallel perceptrons form one neural layer
- Serial layers form the neural network
- Multiple hidden layers are allowed
[Figure: neural layers]
6/20 Feedforward
- Calculating the output is called feedforward
- Computed layer by layer
- Outputs are confidences for their classes
- Interpreted by the user
7/20 Training
- Weights Θ are the soul of our network
- Training examples
- Weights are configured to minimize the error function
8/20 Error function
- Error per output
- Error per training example
- Total error
[Figure: error for a single output]
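The formulas on this slide are not preserved in the transcript. A common squared-error formulation matching the bullet structure (per output, per training example, total), with $y_k$ the network's confidence and $t_k$ the target for output $k$, is an assumption here:

```latex
% Error for a single output k (squared error, an assumed choice):
e_k = \tfrac{1}{2}\,(y_k - t_k)^2

% Error for one training example i, summed over its K outputs:
E^{(i)} = \sum_{k=1}^{K} \tfrac{1}{2}\,\bigl(y_k^{(i)} - t_k^{(i)}\bigr)^2

% Total error, summed over all N training examples:
E = \sum_{i=1}^{N} E^{(i)}
```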
9/20 Backpropagation
- Partial derivatives of the error function with respect to each weight in Θ
- Derivative rules and formulae
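The derivation itself is not preserved. Assuming the squared-error function sketched earlier and sigmoid units, the key identities backpropagation relies on would be:

```latex
% The sigmoid's derivative has a convenient closed form:
\sigma'(z) = \sigma(z)\,\bigl(1 - \sigma(z)\bigr)

% Gradient for an output-layer weight \theta_{jk}, where x_j is the
% activation feeding it and y_k the corresponding output:
\frac{\partial E}{\partial \theta_{jk}} = (y_k - t_k)\; y_k\,(1 - y_k)\; x_j

% Hidden-layer gradients chain the same rule backwards, layer by layer.
```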
10/20 Optimization
- Gradient descent
- Learning rate
- Weights should be initialized with small random values
[Figure: gradient descent]
11/20 Regularization
- Overfitting
- An addition to the error function: the sum of squares of the weights
- Regularization coefficient
- Not applied to links connecting to biases
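The bullets describe a standard L2 penalty; written out (with $\lambda$ the regularization coefficient, and the exact scaling an assumption), it would be:

```latex
% Regularized error: L2 penalty over all weights except bias links.
E_{\text{reg}} = E + \frac{\lambda}{2} \sum_{\theta \,\notin\, \text{bias links}} \theta^2
```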
12/20 Stochastic gradient descent
- Train cycles
- Batch (all examples at once)
- Stochastic (one by one)
- Mini-batch (in groups)
13/20 Putting it together
- Design the architecture
- Randomize the weights
- Train cycle: backpropagation + gradient descent
- Repeat train cycles a number of times
- Feedforward to use the network
- Interpret the output confidences
14/20 Implementation
- C++
- Matrix class
- CUDA used for some of the operations
- NeuralNetwork class
15/20 Code: feedforward
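The original listing (which used the project's Matrix class) is not preserved in this transcript. A minimal sketch of what a feedforward pass might look like, substituting plain `std::vector` for the Matrix class (an assumption), is:

```cpp
#include <cmath>
#include <utility>
#include <vector>

// A layer is a list of perceptrons; each perceptron is a weight vector
// whose element 0 is the bias weight (its input is fixed to +1).
// This layout is assumed for illustration.
using Layer = std::vector<std::vector<double>>;

double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Feedforward: propagate the input layer by layer; the result holds one
// confidence in (0, 1) per output-layer perceptron.
std::vector<double> feedforward(const std::vector<Layer>& network,
                                std::vector<double> activations) {
    for (const Layer& layer : network) {
        std::vector<double> next;
        next.reserve(layer.size());
        for (const std::vector<double>& weights : layer) {
            double sum = weights[0];  // bias contribution (+1 * weight)
            for (std::size_t i = 0; i + 1 < weights.size(); ++i)
                sum += weights[i + 1] * activations[i];
            next.push_back(sigmoid(sum));
        }
        activations = std::move(next);  // outputs feed the next layer
    }
    return activations;
}
```

The actual implementation expressed each layer as a matrix-vector product, some of which ran on the GPU via CUDA.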
16/20 Code: gradient descent
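This listing is also not preserved. A minimal sketch of one gradient-descent update, assuming backpropagation has already produced the gradient for each weight (regularization omitted for brevity), might be:

```cpp
#include <vector>

// One gradient-descent step: move each weight against its gradient,
// scaled by the learning rate. The flat-vector weight layout is an
// assumption for this sketch.
void gradientDescentStep(std::vector<double>& weights,
                         const std::vector<double>& gradients,
                         double learningRate) {
    for (std::size_t i = 0; i < weights.size(); ++i)
        weights[i] -= learningRate * gradients[i];
}
```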
17/20 Pseudo code: training
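The pseudo code itself is not preserved; reconstructing it from the "Putting it together" slide, the training loop would read roughly as:

```
randomize weights with small random values
repeat for a number of train cycles:
    for each batch / mini-batch / single example:
        feedforward to compute the outputs
        backpropagation to compute the error gradients
        gradient descent: weight -= learning_rate * gradient
to use the network: feedforward and interpret the output confidences
```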
18/20 Results
- MNIST database
- My configuration
- Train accuracy: 95.3%
- Test accuracy: 95.1%
19/20 Performance

Execution time comparison:

              C++ [ms]    C++ and CUDA [ms]
  1% TRAIN    19078       32796
  1% TEST     624         671
  10% TRAIN   181092      322312
  10% TEST    5731        6515
  100% TRAIN  1879421     ??
  100% TEST   57421       64124

The aim of this project was not to achieve optimal performance, but to reach a better understanding of the algorithm and its implementation.
20/20 Questions? Thank you!