Outline
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
Invariant Object Recognition
The central goal of computer vision research is to detect and recognize objects invariant to changes in scale, viewpoint, illumination, and other factors
(Invariant) Object Recognition
Generalization Performance
Many classifiers are available: maximum likelihood estimation, Bayesian estimation, Parzen windows, k_n-nearest neighbor, discriminant functions, support vector machines, neural networks, decision trees, ...
Which method is best for classifying unseen test data?
The performance is often determined by the features
In addition, we are interested in systems that solve a particular problem well
Error Rate on Handwritten Digit Recognition
No Free Lunch Theorem
No Free Lunch Theorem – cont.
Ugly Duckling Theorem
In the absence of prior information, there is no principled reason to prefer one representation over another.
Bias and Variance Dilemma
Regression: find an estimate of a true but unknown function F(x) based on n samples generated from F(x)
Bias – the difference between the expected value of the estimate and the true value; a low bias means that, on average, we accurately estimate F from the training set D
Variance – the variability of the estimate; a low variance means that the estimate does not change much as the training set D varies
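Stated in the standard decomposition form (a sketch; g(x; D) here denotes the estimator learned from training set D, a symbol introduced for illustration):

```latex
% Mean squared error of the estimate g(x; D) at a fixed point x,
% averaged over training sets D, splits into bias^2 plus variance.
\mathbb{E}_D\!\left[\big(g(x;D) - F(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}_D[g(x;D)] - F(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\big(g(x;D) - \mathbb{E}_D[g(x;D)]\big)^2\right]}_{\text{variance}}
```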
Bias-Variance Dilemma
When the training data are finite, there is an intrinsic trade-off for any family of classifier functions
If the family is very generic, i.e., a non-parametric family, it suffers from high variance
If the family is very specific, i.e., a restricted parametric family, it suffers from high bias
The central problem is to design a family of classifiers a priori such that both the variance and the bias are low (see the simulation sketch below)
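As a concrete illustration (not from the paper), the following sketch estimates bias and variance empirically by fitting polynomials of two different degrees to many noisy training sets drawn from a known F(x); all function and variable names here are chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
F = np.sin                           # the "true but unknown" function F(x)
x_eval = np.linspace(0, np.pi, 50)   # points where the estimate is evaluated

def fit_and_predict(degree, n_samples=20, n_datasets=200, noise=0.2):
    """Fit a degree-d polynomial to many independent training sets D
    drawn from F(x) + noise, and return the predictions at x_eval."""
    preds = []
    for _ in range(n_datasets):
        x = rng.uniform(0, np.pi, n_samples)
        y = F(x) + rng.normal(0, noise, n_samples)
        coef = np.polyfit(x, y, degree)
        preds.append(np.polyval(coef, x_eval))
    return np.array(preds)

for degree in (1, 9):                # a rigid family vs. a flexible family
    preds = fit_and_predict(degree)
    bias2 = np.mean((preds.mean(axis=0) - F(x_eval)) ** 2)
    variance = np.mean(preds.var(axis=0))
    print(f"degree {degree}: bias^2 = {bias2:.4f}, variance = {variance:.4f}")
```

The rigid (degree-1) family shows the larger bias term, while the flexible (degree-9) family shows the larger variance term, which is the dilemma in miniature.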
Bias and Variance vs. Model Complexity
Gap Between Training and Test Error
Typically, the error rate of a classifier on a disjoint test set will be larger than its error rate on the training set; the gap decreases with the number of training samples approximately as
$E_{\text{test}} - E_{\text{train}} = k\,(h/P)^{\alpha}$
where $P$ is the number of training examples, $h$ a measure of capacity (model complexity), $\alpha$ a constant between 0.5 and 1, and $k$ a constant
Check Reading System
End-to-End Training
Graph Transformer Networks
Training Using Gradient-Based Learning
A multi-module system can be trained with a gradient-based method, provided each module is differentiable with respect to its parameters and inputs
Similar to backpropagation used for multilayer perceptrons (see the recurrence sketched below)
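In the paper this takes the form of a backward recurrence through the cascade of modules; a sketch in that notation, where module n computes X_n = F_n(W_n, X_{n-1}) and E^p is the loss on pattern p:

```latex
% Backward recurrence through a cascade of differentiable modules:
% X_n = F_n(W_n, X_{n-1}), with E^p the loss on pattern p.
\frac{\partial E^p}{\partial W_n}
  = \frac{\partial F_n}{\partial W}(W_n, X_{n-1})\,\frac{\partial E^p}{\partial X_n},
\qquad
\frac{\partial E^p}{\partial X_{n-1}}
  = \frac{\partial F_n}{\partial X}(W_n, X_{n-1})\,\frac{\partial E^p}{\partial X_n}
```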
Convolutional Networks
Handwritten Digit Recognition Using a Convolutional Network
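For concreteness, a minimal sketch of a LeNet-5-style architecture in a modern framework (PyTorch is assumed here); the original used tanh units, trainable subsampling layers, and RBF outputs, which are simplified below.

```python
import torch
import torch.nn as nn

class LeNet5Like(nn.Module):
    """Simplified LeNet-5-style network for 32x32 single-channel digit images.
    Subsampling is approximated by average pooling and the RBF output layer
    by a linear layer; this is a sketch, not a faithful reimplementation."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),    # C1: 6 feature maps, 28x28
            nn.Tanh(),
            nn.AvgPool2d(2),                   # S2: subsample to 14x14
            nn.Conv2d(6, 16, kernel_size=5),   # C3: 16 feature maps, 10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                   # S4: subsample to 5x5
            nn.Conv2d(16, 120, kernel_size=5), # C5: 120 feature maps, 1x1
            nn.Tanh(),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(120, 84),                # F6: 84 units
            nn.Tanh(),
            nn.Linear(84, num_classes),        # output layer (RBF units in the paper)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Sanity check on a dummy batch of 32x32 images.
logits = LeNet5Like()(torch.zeros(8, 1, 32, 32))
print(logits.shape)  # torch.Size([8, 10])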
Training a Convolutional Network
The loss function used is the mean squared error
$E(W) = \frac{1}{P}\sum_{p=1}^{P} y_{D^p}(Z^p, W)$
where $y_{D^p}(Z^p, W)$ is the output of the RBF unit for the correct class $D^p$ given input pattern $Z^p$
Each RBF output is given by $y_i = \sum_j (x_j - w_{ij})^2$
The training algorithm is stochastic diagonal Levenberg-Marquardt (update rule sketched below)
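For reference, the stochastic diagonal Levenberg-Marquardt rule scales each weight's learning rate by an estimate of the corresponding diagonal second derivative; a sketch of the update, with $h_{kk}$ the running diagonal Hessian estimate and $\eta$, $\mu$ global constants:

```latex
% Per-weight learning rate from the diagonal Hessian estimate h_kk,
% followed by the stochastic gradient step on weight w_k for pattern p.
\epsilon_k = \frac{\eta}{\mu + h_{kk}},
\qquad
w_k \leftarrow w_k - \epsilon_k\,\frac{\partial E^p}{\partial w_k}
```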
MNIST Dataset
60,000 training images and 10,000 test images
There are several different versions of the dataset (a loading sketch follows)
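A minimal loading sketch, assuming torchvision is available; the original experiments used the authors' own preprocessing, which this does not reproduce.

```python
import torch
from torchvision import datasets, transforms

# Pad the 28x28 MNIST images to 32x32, matching the LeNet-5 input size.
transform = transforms.Compose([
    transforms.Pad(2),
    transforms.ToTensor(),
])

train_set = datasets.MNIST("./data", train=True, download=True, transform=transform)
test_set = datasets.MNIST("./data", train=False, download=True, transform=transform)

train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
print(len(train_set), len(test_set))  # 60000 10000
```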
Experimental Results
Experimental Results
Distorted Patterns
By augmenting the training set with artificially distorted patterns, the test error dropped from 0.95% (no deformation) to 0.8%
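A sketch of a comparable augmentation using random planar affine distortions (rotation, translation, scaling, shear), assuming torchvision; the paper's exact distortion model, including squeezing, is not reproduced, and the parameters below are chosen arbitrarily.

```python
from torchvision import transforms

# Random affine distortions applied to each training image on the fly,
# roughly in the spirit of the paper's artificial deformations.
distort = transforms.Compose([
    transforms.RandomAffine(degrees=8, translate=(0.08, 0.08),
                            scale=(0.9, 1.1), shear=8),
    transforms.Pad(2),
    transforms.ToTensor(),
])
```

Passing this as the `transform` argument when constructing the training dataset applies a fresh distortion each epoch, while the test set is left undistorted.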
Misclassified Examples
Comparison
Rejection Performance
Number of Operations
Unit: thousands of operations
Memory Requirements
Robustness
Convolutional Network for Object Recognition
NORB Dataset
Convolutional Network for Object Recognition
Experimental Results
Jittered-Cluttered Dataset
Experimental Results
Face Detection
Face Detection
Multiple Object Recognition
Based on heuristic over-segmentation
It avoids making hard segmentation decisions by generating a large number of candidate segmentations and letting the recognizer and post-processor choose among them (a toy sketch of candidate generation follows)
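A toy illustration, not the paper's algorithm: given candidate cut positions along a word image, enumerate the segmentations they induce, each a sequence of adjacent pieces; in the graph-transformer setting these pieces become arcs of a segmentation graph. All names here are illustrative.

```python
from itertools import combinations

def candidate_segmentations(width, cuts, max_cuts=3):
    """Enumerate segmentations of an image of the given width.

    `cuts` is a list of candidate cut x-positions (e.g. found by a heuristic
    such as vertical projection minima). Each chosen subset of cuts splits
    the image into adjacent (start, end) pieces; every piece would be sent
    to the character recognizer, and the best-scoring path kept.
    """
    segmentations = []
    for k in range(min(max_cuts, len(cuts)) + 1):
        for chosen in combinations(sorted(cuts), k):
            bounds = [0, *chosen, width]
            pieces = list(zip(bounds[:-1], bounds[1:]))
            segmentations.append(pieces)
    return segmentations

# Example: a 100-pixel-wide word image with heuristic cuts at x = 30, 55, 80.
for seg in candidate_segmentations(100, [30, 55, 80]):
    print(seg)
```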
Graph Transformer Network for Character Recognition
Recognition Transformer and Interpretation Graph
Viterbi Training
Discriminative Viterbi Training
Discriminative Forward Training
Space Displacement Neural Networks
By considering all possible locations, one can avoid explicit segmentation
Similar to detection and recognition
Space Displacement Neural Networks
A convolutional network can be replicated at all possible horizontal locations; because of weight sharing, the replicas amount to a single larger convolutional network swept over the input (see the sketch below)
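A sketch of the idea in PyTorch (assumed, as above): a network built only from convolutions and pooling, applied to a wider input strip, produces one output column per horizontal displacement instead of being re-run on overlapping windows.

```python
import torch
import torch.nn as nn

# A fully convolutional stack: on a 32x32 input it yields a single 1x1 output,
# so on a wider 32xW strip it yields one class-score column per displacement.
sdnn = nn.Sequential(
    nn.Conv2d(1, 6, 5), nn.Tanh(), nn.AvgPool2d(2),
    nn.Conv2d(6, 16, 5), nn.Tanh(), nn.AvgPool2d(2),
    nn.Conv2d(16, 120, 5), nn.Tanh(),
    nn.Conv2d(120, 10, 1),            # per-location class scores
)

single = sdnn(torch.zeros(1, 1, 32, 32))
strip = sdnn(torch.zeros(1, 1, 32, 128))
print(single.shape)  # torch.Size([1, 10, 1, 1])   one location
print(strip.shape)   # torch.Size([1, 10, 1, 25])  one score column per shift
```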
Space Displacement Neural Networks
Space Displacement Neural Networks
Space Displacement Neural Networks
SDNN/HMM System
Graph Transformer Networks and Transducers
On-line Handwriting Recognition System
On-line Handwriting Recognition System
Comparative Results
Check Reading System
Confidence Estimation
Summary
By carefully designing systems with the desired invariance properties, one can often achieve better generalization performance by limiting the system's capacity
Multi-module systems can often be trained effectively using gradient-based learning methods
Even though local gradient-based methods are in theory subject to local minima, in practice this does not seem to be a serious problem
Incorporating contextual information into recognition systems is often critical for real-world applications
End-to-end training is often more effective than training each module separately