1
Handwritten Digit Recognition Using Stacked Autoencoders
Yahia Saeed, Jiwoong Kim, Lewis Westfall, and Ning Yang
Seidenberg School of CSIS, Pace University, New York
2
Optical Character Recognition
Convert text into machine-processable data
1910s: telegraph codes, a reading device for the blind
Circa 1930: searching microfilm archives; fixed-font characters; accuracy affected by noise
Handwritten or printed characters: characters are not consistent and likely to contain noise
3
We Explored
Digit recognition
Neural networks
Autoencoders
Softmax regression
The MNIST handwritten digit database
MATLAB
4
MNIST Database
A large database of handwritten digits
A re-mix of the NIST digit databases
Training images from American Census Bureau employees
Testing images from American high school students
NIST images were normalized to 20x20; antialiasing introduces greyscale
MNIST images are 28x28 and centered on the digit's center of mass
5
MNIST Database (cont.)
60k training images, 10k testing images
Our training set: ½ from the MNIST training images, ½ from the MNIST testing images
Our testing set was assembled similarly
6
Neural Network
A learning algorithm
Each feature has an input
Layers: an input layer, one or more hidden layers, and an output layer
The input to each layer is the combination of the products of the previous layer's outputs and the weights (see the sketch below)
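A minimal MATLAB sketch of that per-layer computation (the sizes, weight matrix W, bias b, and input x are illustrative placeholders, not values from the slides):

  x = rand(784, 1);        % one input vector, e.g. a 28x28 image unrolled
  W = randn(100, 784);     % weights from the 784 inputs to a 100-unit hidden layer
  b = zeros(100, 1);       % hidden-layer biases
  a = logsig(W * x + b);   % layer output: sigmoid of the weighted sum of the previous layer's outputs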
7
Neural Network
Layer 1 – input layer
Layer 2 – undercomplete hidden layer
Layer 3 – output layer
8
Auto-Encoders
A type of unsupervised learning that tries to discover generic features of the data
Learns the identity function by learning important sub-features (not by simply passing the data through)
Useful for compression, etc.
Can use just the new features as the new training set, or concatenate them with the original features
(Speaker notes: mention the Zipser autoencoder and reverse engineering, then Cottrell's compression work, where reverse engineering was not possible. Point out that the next layer does not have to have fewer hidden nodes, but be careful: if trained too long, the network will just learn to pass the data through; more on that in a bit.)
9
Autoencoder Neural Network
Unsupervised; trained with backpropagation
The reconstruction approximates the input but is not expected to match it exactly
The features captured may not be intuitive
An undercomplete constraint is used
10
Stacked Auto-Encoders
Stack many (sparse) auto-encoders in succession and train them using greedy layer-wise training
11
Stacked Auto-Encoders
Supervised training on the last layer using the final features
Then supervised training on the entire network to fine-tune all weights (see the sketch below)
The figure shows softmax, but backpropagation or any other variation could be used
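A minimal MATLAB sketch of the fine-tuning step, assuming autoenc1, autoenc2, and softnet have already been trained and xTrain / tTrain hold the training images and one-hot labels (hypothetical variable names):

  deepnet = stack(autoenc1, autoenc2, softnet);   % stack the pretrained layers into one network
  deepnet = train(deepnet, xTrain, tTrain);       % supervised fine-tuning of all weights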
12
784 features -> 100 features
Undercomplete: 784 features -> 100 features, then 100 features -> 50 features
Sparse network: 'SparsityProportion', 0.15, ... (a full training call is sketched below)
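A minimal sketch of the corresponding MATLAB call, assuming xTrain is a 784-by-N matrix of unrolled training images; the variable name and the two regularization strengths are illustrative assumptions, while the hidden size, epoch count, and sparsity proportion come from the slides:

  autoenc1 = trainAutoencoder(xTrain, 100, ...   % undercomplete: 784 -> 100 features
      'MaxEpochs', 400, ...
      'L2WeightRegularization', 0.004, ...       % keeps the weights small (assumed value)
      'SparsityRegularization', 4, ...           % weight of the sparsity penalty (assumed value)
      'SparsityProportion', 0.15);               % desired average activation of the hidden units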
13
1st Autoencoder
14
Sparse Autoencoder
L2WeightRegularization controls the weights of the network (should be small)
SparsityProportion controls the sparsity of the output from the hidden layer (must be between 0 and 1)
15
2nd Autoencoder
16
SoftMax Classifier
Supervised learning
Classifies the results of the autoencoder processing of the original inputs
The goal is to match the predicted class to the label of the original input
17
Stacked Autoencoder
The output of the hidden layer of one autoencoder is the input to the next autoencoder (see the sketch below)
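A minimal MATLAB sketch of that chaining, using the encode function on the hypothetical xTrain / autoenc1 from above:

  feat1 = encode(autoenc1, xTrain);            % 100 features / image from the 1st autoencoder
  autoenc2 = trainAutoencoder(feat1, 50, ...   % the 2nd autoencoder is trained on those features
      'MaxEpochs', 100, 'SparsityProportion', 0.10);
  feat2 = encode(autoenc2, feat1);             % 50 features / image, later fed to the softmax classifier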
18
Constraints
Undercomplete: 784 features -> 100 features, then 100 features -> 50 features
Sparse network
19
MATLAB
We used MATLAB's autoencoder and softmax functions, which hide the mathematical processing
A GPU enhances speed (see the sketch below)
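A hedged sketch of enabling GPU training through trainAutoencoder's 'UseGPU' option, assuming a supported GPU and the same hypothetical xTrain as above:

  autoenc1 = trainAutoencoder(xTrain, 100, ...
      'MaxEpochs', 400, ...
      'UseGPU', true);   % offload training to the GPU when one is available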
20
Previous Work
LeCun, Cortes, and Burges, “The MNIST Database of Handwritten Digits”

  Method                  Accuracy
  Linear Classifier       92.4%
  K Nearest Neighbor      99.3%
  Boosted Stumps          99.1%
  Non-Linear Classifier   96.4%
  SVM                     99.4%
  Neural Net              99.6%
  Convolutional Net       99.7%
21
Training
10k MNIST images
1st autoencoder: 784 features / image
Encode (undercomplete) to 100 features / image
Decode to 784 features / image
400 epochs, sparsity parameter of 0.15
22
Training (cont.)
2nd autoencoder: 100 features / image
Encode (undercomplete) to 50 features / image
Decode to 100 features / image
100 epochs, sparsity parameter of 0.10
Softmax classifier: 50 features / image to 1 of 10 classes / image
400 epochs (see the sketch below)
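A minimal MATLAB sketch of the classifier step, assuming feat2 holds the 50-feature encodings and tTrain the one-hot class labels (hypothetical names):

  softnet = trainSoftmaxLayer(feat2, tTrain, 'MaxEpochs', 400);   % 50 features -> 1 of 10 classes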
23
Testing
First results: 79.7% accuracy
Conducted retraining
Final results: 99.7% accuracy (evaluation sketched below)
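A minimal MATLAB sketch of the evaluation, assuming deepnet is the fine-tuned network and xTest / tTest are the held-out images and one-hot labels (hypothetical names):

  yPred = deepnet(xTest);                % forward pass: class scores for each test image
  plotconfusion(tTest, yPred);           % confusion matrix of targets vs. predictions
  [~, predicted] = max(yPred);           % predicted class index per image
  [~, actual] = max(tTest);
  accuracy = mean(predicted == actual)   % fraction of correctly classified digits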
24
Output of 1st Autoencoder
25
Output of 2nd Autoencoder
26
1st Confusion Matrix
27
Final Confusion Matrix
28
Conclusion
Two stacked autoencoder layers and a softmax classifier layer
Ultimately achieving 99.7% accuracy
MATLAB really needs a GPU