Thesis title: “Studies in Pattern Classification – Biological Modeling, Uncertainty Reasoning, and Statistical Learning”
Three parts:
(1) Handwritten Digit Recognition with a Vision-Based Model (part in CVPR-2000)
(2) An Uncertainty Framework for Classification (UAI-2000)
(3) Selection of Support Vector Kernel Parameters (ICML-2000)
Handwritten Digit Recognition with a Vision-Based Model Loo-Nin Teow & Kia-Fock Loe School of Computing National University of Singapore
OBJECTIVE
To develop a vision-based system that extracts features for handwritten digit recognition based on the following principles:
- Biological Basis;
- Linear Separability;
- Clear Semantics.
Developing the Model
Two main modules:
- Feature extractor: generates a feature vector from the raw pixel map.
- Trainable classifier: outputs the class based on the feature vector.
General System Structure
Handwritten Digit Recognizer: Raw Pixel Map → Feature Extractor → Feature Vector → Feature Classifier → Digit Class
The Biological Visual System
Diagram of the visual pathway: eye → optic nerve → optic chiasm → optic tract → lateral geniculate nucleus → optic radiation → primary visual cortex.
Receptive Fields
Diagram: a visual cell receives input from its receptive field, a local region of the visual map, and produces output activations.
Simple Cell Receptive Fields
Simple Cell Responses: example stimuli with activation and without activation.
Hypercomplex Receptive Fields
Hypercomplex Cell Responses: example stimuli without activation and with activation.
Biological Vision
- Local spatial features;
- Edge and corner orientations;
- Dual channel (bright/dark; on/off);
- Non-hierarchical feature extraction.
The Feature Extraction Process
Input I (2 maps of 36x36) → Selective Convolution → Q (32 maps of 32x32) → Feature Aggregation → F (32 maps of 9x9)
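The size bookkeeping above can be sketched as plain "valid" correlation in numpy. The 5x5 mask size is an assumption inferred from the 36x36 → 32x32 map sizes, and random masks stand in for the oriented templates described later:

```python
import numpy as np

def valid_convolve(image, mask):
    """'Valid' correlation: slide the mask over the image without padding,
    so a 36x36 input and a 5x5 mask give a 32x32 response map."""
    H, W = image.shape
    h, w = mask.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i+h, j:j+w] * mask).sum()
    return out

channels = np.random.rand(2, 36, 36)   # on/off input channels
masks = np.random.rand(16, 5, 5)       # stand-ins for the 16 mask templates
maps = np.array([valid_convolve(c, m) for c in channels for m in masks])
# 2 channels x 16 masks = 32 feature maps of 32x32
```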
Dual Channel
On-Channel = intensity-normalize(Image)
Off-Channel = complement(On-Channel)
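A minimal sketch of the dual-channel split, assuming intensity-normalize means min-max scaling to [0, 1] and complement means 1 − x (interpretations, not the paper's exact definitions):

```python
import numpy as np

def dual_channel(image):
    """Split a grayscale image into on/off channels.
    On-channel: min-max normalized intensities (assumed normalization).
    Off-channel: complement of the on-channel, so dark strokes on a
    bright background become bright features, and vice versa."""
    img = image.astype(float)
    lo, hi = img.min(), img.max()
    on = (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)
    off = 1.0 - on
    return on, off

on, off = dual_channel(np.array([[0, 128], [255, 64]]))
```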
Selective Convolution
- Local receptive fields: same spatial features at different locations.
- Truncated linear halfwave rectification: strength of a feature's presence.
- "Soft" selection based on the central pixel: reduces false edges and corners.
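The rectification and selection steps can be sketched as follows. The paper's exact selection function is not reproduced here; gating the rectified response by the window's central pixel is an assumption consistent with the slide's description:

```python
import numpy as np

def halfwave_rectify(x, ceiling=1.0):
    # truncated linear halfwave rectification: negative responses are
    # clipped to 0, and strong responses saturate at `ceiling`
    return np.clip(x, 0.0, ceiling)

def selective_response(window, mask, ceiling=1.0):
    """One selective-convolution step at one location (a sketch).
    The raw mask response is rectified, then gated ("soft" selection)
    by the central pixel, suppressing responses where no ink is present."""
    raw = float((window * mask).sum())
    center = window[window.shape[0] // 2, window.shape[1] // 2]
    return center * halfwave_rectify(raw, ceiling)
```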
Selective Convolution (formulae)
Convolution Mask Templates
- Simplified models of the simple and hypercomplex receptive fields.
- Detect edges and end-stops of various orientations.
- Corners are more robust than edges:
  - On-channel end-stops: convex corners;
  - Off-channel end-stops: concave corners.
Some representatives of the 16 mask templates used in the feature extraction
Feature Aggregation
Similar to subsampling:
- reduces the number of features;
- reduces dependency on feature positions;
- local invariance to distortions and translations.
Different from subsampling:
- magnitude-weighted averaging;
- detects the presence of a feature in a window;
- large window overlap.
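A sketch of the aggregation step. The window size and stride (8x8, stride 3) are assumptions chosen so that (32 − 8)/3 + 1 = 9, matching the slide's 32x32 → 9x9 map sizes; the magnitude-weighted average here is one natural reading of the term, not necessarily the paper's exact formula:

```python
import numpy as np

def magnitude_weighted_average(window):
    # weight each value by its own magnitude, so a strong feature
    # anywhere in the window dominates (unlike plain subsampling)
    w = np.abs(window)
    return float((w * window).sum() / w.sum()) if w.sum() > 0 else 0.0

def aggregate(feature_map, win=8, stride=3):
    """Aggregate a 32x32 feature map into 9x9 with heavily
    overlapping windows (stride smaller than window size)."""
    n = (feature_map.shape[0] - win) // stride + 1
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = magnitude_weighted_average(
                feature_map[i*stride:i*stride+win, j*stride:j*stride+win])
    return out
```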
Feature Aggregation (formulae): Magnitude-Weighted Average
Classification
Linear discrimination systems:
- Single-layer perceptron network: minimizes a cross-entropy cost function.
- Linear support vector machines: maximize the interclass margin width.
k-nearest neighbor:
- Euclidean distance;
- Cosine similarity.
Multiclass Classification Schemes for linear discrimination systems:
- One-per-class (1 vs 9)
- Pairwise (1 vs 1)
- Triowise (1 vs 2)
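The number of discriminant units each scheme requires for 10 digit classes can be computed directly, and the results reproduce the unit counts quoted later for the perceptron network (10, 90, 360). The factors of 2 and 3 assume one unit per ordered (target vs rest-of-group) assignment within each pair or trio:

```python
from math import comb

n_classes = 10
one_per_class = n_classes              # 1 vs 9: one unit per digit
pairwise = 2 * comb(n_classes, 2)      # 1 vs 1: 2 units per unordered pair
triowise = 3 * comb(n_classes, 3)      # 1 vs 2: 3 units per trio (each
                                       # member takes a turn as the "1")
print(one_per_class, pairwise, triowise)
```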
Experiments
- MNIST database of handwritten digits.
- 60000 training samples, 10000 testing samples.
- 36x36 input image.
- 32 feature maps of 9x9.
Preliminary Experiments

Feature Classifier    Scheme        Voting   Train Error (%)   Test Error (%)
                                             (60000 samples)   (10000 samples)
Perceptron Network    1-per-class   -        0.00              2.14
                      Pairwise      Hard     0.00              0.88
                      Pairwise      Soft     0.00              0.87
                      Triowise      Hard     0.00              0.72
                      Triowise      Soft     0.00              0.72
Linear SVMs           Pairwise      Hard     0.00              0.98
                      Pairwise      Soft     0.00              0.82
                      Triowise      Hard     0.00              0.74
                      Triowise      Soft     0.00              0.72
k-Nearest Neighbor    Euclidean     -        0.00              1.39 (k = 3)
                      Cosine        -        0.00              1.09 (k = 3)
Experiments on Deslanted Images

Feature Classifier    Scheme     Voting   Train Error (%)   Test Error (%)
                                          (60000 samples)   (10000 samples)
Perceptron Network    Pairwise   Hard     0.00              0.81
                      Pairwise   Soft     0.00              0.73
                      Triowise   Hard     0.00              0.63
                      Triowise   Soft     0.00              0.62
Linear SVMs           Pairwise   Hard     0.00              0.69
                      Pairwise   Soft     0.00              0.68
                      Triowise   Hard     0.00              0.65
                      Triowise   Soft     0.00              0.59 *
Misclassified Characters
Comparison with Other Models

Classifier Model              Test Error (%)
LeNet-4                       1.10
LeNet-4, boosted [distort]    0.70
LeNet-5                       0.95
LeNet-5 [distort]             0.80
Tangent distance              1.10
Virtual SVM                   0.80
Our model [deslant]           0.59 *
Conclusion
Our model extracts features that are:
- biologically plausible;
- linearly separable;
- semantically clear.
It needs only a linear classifier:
- relatively simple structure;
- trains fast;
- gives excellent classification performance.
Hierarchy of Features?
- The idea originated with Hubel & Wiesel: LGN → simple → complex → hypercomplex cells.
- Later studies show these pathways to be parallel.
- A hierarchy yields too many feature combinations.
- It is simpler to have only one convolution layer.
Linear Discrimination
Output: y = g(f(x)), where f defines a hyperplane: f(x) = w · x + b, and g is the activation function: g(z) = 1 / (1 + e^(-z)) or g(z) = tanh(z).
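A minimal sketch of a single linear discriminant unit, assuming the two activation choices are the logistic function and tanh (the slide's formulas are not preserved, so the exact activations are an assumption):

```python
import numpy as np

def f(x, w, b):
    # hyperplane: f(x) = w . x + b
    return float(w @ x + b)

def logistic(z):
    # squashes the hyperplane response into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def output(x, w, b, activation=np.tanh):
    # y = g(f(x)): one linear discriminant unit
    return float(activation(f(x, w, b)))
```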
One-per-class Classification
The unit with the largest output value indicates the class of the character: class(x) = argmax_c y_c(x).
Pairwise Classification
Soft Voting: each pairwise discriminant contributes its real-valued output to the scores of the two classes it separates; the class with the highest total score wins.
Hard Voting: each pairwise discriminant casts a single binary vote for the class it predicts; the class with the most votes wins.
Triowise Classification
Soft Voting: each 1-vs-2 discriminant contributes its real-valued output to the scores of the classes in its trio.
Hard Voting: each 1-vs-2 discriminant casts a binary vote; the class with the most votes wins.
k-Nearest Neighbor
Euclidean Distance: d(x, y) = sqrt(Σ_i (x_i − y_i)²)
Cosine Similarity: cos(x, y) = (x · y) / (‖x‖ ‖y‖)
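Both metrics plug into the same majority-vote rule; the sketch below is a standard k-NN implementation (k = 3 matches the experiment tables), not the thesis code:

```python
import numpy as np

def euclidean(x, y):
    return float(np.sqrt(((x - y) ** 2).sum()))

def cosine_similarity(x, y):
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def knn_predict(train_x, train_y, q, k=3, metric="euclidean"):
    """Classify q by majority vote among its k nearest training vectors.
    For Euclidean distance smaller is closer; for cosine similarity
    larger is closer, so its sign is flipped before sorting."""
    if metric == "euclidean":
        scores = [euclidean(x, q) for x in train_x]
    else:
        scores = [-cosine_similarity(x, q) for x in train_x]
    order = np.argsort(scores)
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)
```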
Confusion Matrix (triowise SVMs / soft voting / deslanted)
Number of iterations to convergence for the perceptron network

Scheme        # units   # epochs
1-per-class   10        281
Pairwise      90        57
Triowise      360       147