Analysis of Classification Algorithms in Handwritten Digit Recognition
Logan Helms, Jon Daniele
Classification Algorithms
Template Matching
Naïve Bayes Classifier
Neural Network
Benchmarks
1. Gradient-Based Learning Applied to Document Recognition, by LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.
2. Comparison of Machine Learning Classifiers for Recognition of Online and Offline Handwritten Digits, by Omidiora, E., Adeyanju, I., Fenwa, O.
MNIST
Training set: 60,000 samples
Test set: 10,000 samples
Accuracy: number of correctly classified test samples / 10,000
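The accuracy measure on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code; the function name `accuracy` and the toy label lists are assumptions made for the example.

```python
def accuracy(predicted, actual):
    """Fraction of test samples whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Toy example with 5 samples instead of the 10,000 in the MNIST test set.
toy_accuracy = accuracy([7, 2, 1, 0, 4], [7, 2, 1, 0, 9])
```

On the full MNIST test set the denominator would be 10,000, as stated on the slide.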
NAÏVE BAYES CLASSIFIER
Naïve Bayes Classifier
Each pixel value (on/off) is assumed to be independent of every other pixel value, given the digit class
Each pixel has a probability of being on or off in each digit class
The per-pixel probabilities are combined to estimate the probability that an unknown digit belongs to each of the known classes; the most probable class is chosen
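The idea above can be sketched as a small Bernoulli-style naïve Bayes classifier over binary pixels. This is an illustrative sketch, not the presenters' implementation: the function names, the Laplace smoothing constant `alpha`, and the use of log-probabilities (to avoid numerical underflow when multiplying hundreds of small values) are all assumptions.

```python
import math


def train_bernoulli_nb(images, labels, num_classes=10, alpha=1.0):
    """Estimate log class priors and per-pixel 'on' probabilities.

    images: list of flat lists of 0/1 pixel values
    labels: list of integer class labels in range(num_classes)
    alpha:  Laplace smoothing constant (an assumption, not from the slides)
    """
    num_pixels = len(images[0])
    class_counts = [0] * num_classes
    on_counts = [[0] * num_pixels for _ in range(num_classes)]
    for pixels, label in zip(images, labels):
        class_counts[label] += 1
        row = on_counts[label]
        for i, value in enumerate(pixels):
            row[i] += value
    total = len(images)
    log_priors = [
        math.log((class_counts[c] + alpha) / (total + alpha * num_classes))
        for c in range(num_classes)
    ]
    # Smoothing keeps every probability strictly between 0 and 1.
    p_on = [
        [(on_counts[c][i] + alpha) / (class_counts[c] + 2 * alpha)
         for i in range(num_pixels)]
        for c in range(num_classes)
    ]
    return log_priors, p_on


def classify(pixels, log_priors, p_on):
    """Return the class with the highest log posterior (up to a constant)."""
    best_class, best_score = None, -math.inf
    for c, prior in enumerate(log_priors):
        score = prior
        for value, p in zip(pixels, p_on[c]):
            score += math.log(p if value else 1.0 - p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Toy data: two classes, four binary "pixels" per image.
toy_images = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
toy_labels = [0, 0, 1, 1]
toy_priors, toy_p_on = train_bernoulli_nb(toy_images, toy_labels, num_classes=2)
```

For MNIST, each image would first be flattened to a list of 784 binary values.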
Naïve Bayes Classifier: Results
Training set: digits
Test set: digits
Success rate: abysmal, 8.13% correct classification rate
Benchmark: WEKA Multinomial Naive Bayes, 83.65%
8.13% << 83.65%
Naïve Bayes Classifier: Challenges
Pixel probabilities change according to the shape of the digit
Are pixels the best feature set by which to compare different digits?
Input size: a 28x28 digit image results in 784 pixels
Requirements for matrix manipulation
Naïve Bayes Classifier: Improvements
Discarding extraneous pixel data: pixel values are mainly contained in a 20x20 matrix
Using a binary pixel value vs. a range of pixel values (0-255)
Edge detection
Incorporating feature extractor(s) and evaluating images based on those features
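The first two improvements can be sketched as a small preprocessing step. This is a hedged illustration only: it assumes the digit sits in the central 20x20 region of the 28x28 image, and the threshold of 128 and the helper names `crop_center` and `binarize` are assumptions, not values from the slides.

```python
def crop_center(image, size=20):
    """Keep the central size x size window of a 2D list of pixel values."""
    rows, cols = len(image), len(image[0])
    top = (rows - size) // 2
    left = (cols - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def binarize(image, threshold=128):
    """Map grayscale values (0-255) to 0/1; the threshold is an assumed value."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


# Toy 28x28 image: one bright pixel near the center, one in the border.
toy_image = [[0] * 28 for _ in range(28)]
toy_image[14][14] = 255   # inside the central 20x20 window
toy_image[0][0] = 200     # border pixel, discarded by the crop
toy_features = binarize(crop_center(toy_image))
```

Cropping reduces the input from 784 to 400 values per image before any probabilities are estimated.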
NEURAL NETWORK
Neural Network
Type: Feed forward
Training: Back-propagation algorithm
Response function:
Architectures:
Name    Input Layer    Hidden Layer    Output Layer
NN
NN
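A feed-forward network with one hidden layer trained by back-propagation can be sketched as follows. This is a minimal illustration under stated assumptions, not the presenters' network: the slide's response function is not recoverable, so a sigmoid is assumed here, along with a squared-error loss, uniform random initial weights, and the hypothetical class name `FeedForwardNet`. The layer sizes in the toy example are deliberately tiny.

```python
import math
import random


def sigmoid(x):
    # Assumed response function; the one used on the slide is not shown.
    return 1.0 / (1.0 + math.exp(-x))


class FeedForwardNet:
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds one neuron's input weights plus a trailing bias weight.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        inputs = list(x) + [1.0]  # constant 1.0 feeds the bias weight
        hidden = [sigmoid(sum(w * v for w, v in zip(row, inputs)))
                  for row in self.w_hidden]
        hidden_in = hidden + [1.0]
        output = [sigmoid(sum(w * v for w, v in zip(row, hidden_in)))
                  for row in self.w_out]
        return hidden, output

    def train_step(self, x, target, rate):
        """One back-propagation update; returns the loss before the update."""
        hidden, output = self.forward(x)
        # Output deltas for E = 0.5 * sum((o - t)^2) with sigmoid units.
        delta_out = [(o - t) * o * (1.0 - o) for o, t in zip(output, target)]
        # Hidden deltas, computed with the output weights before they change.
        delta_hidden = [
            h * (1.0 - h) * sum(d * self.w_out[k][j]
                                for k, d in enumerate(delta_out))
            for j, h in enumerate(hidden)
        ]
        for k, d in enumerate(delta_out):
            for j, v in enumerate(hidden + [1.0]):
                self.w_out[k][j] -= rate * d * v
        for j, d in enumerate(delta_hidden):
            for i, v in enumerate(list(x) + [1.0]):
                self.w_hidden[j][i] -= rate * d * v
        return 0.5 * sum((o - t) ** 2 for o, t in zip(output, target))


# Toy run: 4 inputs, 3 hidden neurons, 2 outputs, one training sample.
toy_net = FeedForwardNet(4, 3, 2)
toy_sample, toy_target = [1.0, 0.0, 1.0, 0.0], [1.0, 0.0]
toy_losses = [toy_net.train_step(toy_sample, toy_target, rate=0.5)
              for _ in range(200)]
toy_output = toy_net.forward(toy_sample)[1]
```

For digit recognition, the same structure would take one input per pixel and produce one output per digit class, with the predicted class taken as the output neuron with the largest activation.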
Training
NN300
Training time: ~17 hours (~52 min/epoch)
Learning rate: stepped schedule by epoch (Epoch/Rate table; rate values not preserved)
NN1000
Training time: ~2.5 days (~3 hrs/epoch)
Learning rate: stepped schedule by epoch (Epoch/Rate table; rate values not preserved)
Results (after 20 epochs)
Network      Accuracy
Benchmark    %
NN           %
Benchmark    %

Network      Accuracy
Benchmark    %
NN1000       -
Benchmark
%: on the MNIST test set as is
96.4%: with additional training data generated by artificial distortions
98.4%: with deslanted images
Future Work
Further training of NN300 with the MNIST test set has increased accuracy to 84.01%
Experiment with hidden neuron count and multiple hidden layers
Research other types of neural networks