1
March 15-17, 2002. Work with student Jong Oh. Davi Geiger, Courant Institute, NYU
On-Line Handwriting Recognition
–Transducer device (digitizer)
–Input: sequence of point coordinates with pen-down/up signals from the digitizer
–Stroke: sequence of points from a pen-down signal to the next pen-up signal
–Word: sequence of one or more strokes
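The stroke definition above can be illustrated with a minimal sketch. It assumes the digitizer delivers samples as `(x, y, pen_down)` tuples; that sample format and the name `split_into_strokes` are hypothetical and are not taken from the slides.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Sample = Tuple[float, float, bool]  # (x, y, pen_down): assumed digitizer format


def split_into_strokes(samples: List[Sample]) -> List[List[Point]]:
    """Group consecutive pen-down samples into strokes."""
    strokes: List[List[Point]] = []
    current: List[Point] = []
    for x, y, pen_down in samples:
        if pen_down:
            current.append((x, y))
        elif current:
            # A pen-up sample closes the stroke in progress.
            strokes.append(current)
            current = []
    if current:
        # The input ended while the pen was still down.
        strokes.append(current)
    return strokes


# A "word" is then simply the list of strokes.
samples = [
    (0, 0, True), (1, 1, True), (0, 0, False),
    (2, 0, True), (3, 1, True), (4, 0, True),
]
word = split_into_strokes(samples)
```

Here `word` holds two strokes: the first ends at the pen-up sample and the second ends with the input.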
2
System Overview
–Input
–Pre-processing (high curvature points)
–Segmentation
–Recognition Engine
–Dictionary
–Character Recognizer
–Context Models
–Word Candidates
3
Segmentation Hypotheses
High-curvature points and segmentation points:
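The slides do not preserve how high-curvature points are detected. As a rough sketch only, one common approach is to threshold the turning angle between the incoming and outgoing directions at each point. The functions, the neighbor offset `k`, and the threshold below are all assumptions, not the authors' method.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def turning_angles(points: List[Point], k: int = 1) -> List[float]:
    """Absolute turning angle (radians) at each point, using neighbors k samples away."""
    angles = [0.0] * len(points)
    for i in range(k, len(points) - k):
        ax, ay = points[i][0] - points[i - k][0], points[i][1] - points[i - k][1]
        bx, by = points[i + k][0] - points[i][0], points[i + k][1] - points[i][1]
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        angles[i] = abs(math.atan2(cross, dot))
    return angles


def high_curvature_points(points: List[Point],
                          threshold: float = math.pi / 4,
                          k: int = 1) -> List[int]:
    """Indices whose turning angle exceeds the (assumed) threshold."""
    return [i for i, a in enumerate(turning_angles(points, k)) if a > threshold]


# An L-shaped stroke: the only sharp turn is at index 2.
stroke = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
corners = high_curvature_points(stroke)
```

The detected indices would then serve as candidate segmentation points; real pen data would likely need smoothing first.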
4
Character Recognition I
Fisher Discriminant Analysis (FDA): improves over PCA (Principal Component Analysis).
Linear projection from the original space to the projection space: $p = W^T x$
Training set: 1040 lowercase letters
Test set: 520 lowercase letters
Test results: 91.5% correct
5
Fisher Discriminant Analysis
Between-class scatter matrix: $S_B = \sum_{i=1}^{C} N_i (\mu_i - \mu)(\mu_i - \mu)^T$
–$C$: number of classes
–$N_i$: number of data vectors in class $i$
–$\mu_i$: mean vector of class $i$, and $\mu$: overall mean vector
Within-class scatter matrix: $S_W = \sum_{i=1}^{C} \sum_{j=1}^{N_i} (v_j^i - \mu_i)(v_j^i - \mu_i)^T$
–$v_j^i$: $j$-th data vector of class $i$
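The two scatter matrices can be computed directly from their standard definitions. The following is a minimal NumPy sketch; the function name `scatter_matrices` and the toy data are illustrative assumptions.

```python
import numpy as np


def scatter_matrices(X: np.ndarray, y: np.ndarray):
    """Return (S_B, S_W) for data rows X of shape (N, n) and class labels y of shape (N,)."""
    mu = X.mean(axis=0)                      # overall mean vector
    n = X.shape[1]
    S_B = np.zeros((n, n))
    S_W = np.zeros((n, n))
    for c in np.unique(y):
        Xc = X[y == c]                       # data vectors of class c
        mu_c = Xc.mean(axis=0)               # class mean vector
        d = (mu_c - mu)[:, None]
        S_B += len(Xc) * (d @ d.T)           # N_i (mu_i - mu)(mu_i - mu)^T
        centered = Xc - mu_c
        S_W += centered.T @ centered         # sum_j (v_j - mu_i)(v_j - mu_i)^T
    return S_B, S_W


# Toy data: two classes of two points each in 2-D.
X = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 4.0], [6.0, 4.0]])
y = np.array([0, 0, 1, 1])
S_B, S_W = scatter_matrices(X, y)
```

A useful sanity check is that `S_B + S_W` equals the total scatter of the data about the overall mean.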
6
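Given a projection matrix $W$ (of size $n$ by $m$) and its linear transformation $p = W^T x$, the between-class scatter in the projection space is $\tilde{S}_B = W^T S_B W$. Similarly, the within-class scatter in the projection space is $\tilde{S}_W = W^T S_W W$. Here $S_B$ and $S_W$ are the between-class and within-class scatter matrices in the original space.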
7
Fisher Discriminant Analysis (cont.)
Optimization formulation of the Fisher projection solution ($\tilde{S}_B$, $\tilde{S}_W$ are the scatter matrices in the projection space; $S_B$, $S_W$ are those in the original space):
$$W^* = \arg\max_W \frac{|\tilde{S}_B|}{|\tilde{S}_W|} = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|}$$
8
FDA (continued)
Construction of the Fisher projection matrix:
–Compute the $n$ eigenvalues and eigenvectors of the generalized eigenvalue problem $S_B w_i = \lambda_i S_W w_i$, where $S_B$ and $S_W$ are the between-class and within-class scatter matrices.
–Retain the $m$ eigenvectors having the largest eigenvalues. They form the columns of the target projection matrix.
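The construction above can be sketched in NumPy. This version assumes the within-class scatter matrix is positive definite, so that a Cholesky factor can turn the generalized problem into an ordinary symmetric one. The function name `fisher_projection` and the synthetic data are illustrative and are not the authors' implementation.

```python
import numpy as np


def fisher_projection(X: np.ndarray, y: np.ndarray, m: int):
    """Return (W, eigvals, S_B, S_W); W is n-by-m with the top-m generalized eigenvectors."""
    mu = X.mean(axis=0)
    n = X.shape[1]
    S_B = np.zeros((n, n))
    S_W = np.zeros((n, n))
    for c in np.unique(y):
        Xc = X[y == c]
        d = (Xc.mean(axis=0) - mu)[:, None]
        S_B += len(Xc) * (d @ d.T)
        centered = Xc - Xc.mean(axis=0)
        S_W += centered.T @ centered
    # Assumption: S_W is positive definite, so S_W = L L^T exists.
    L = np.linalg.cholesky(S_W)
    # A = L^{-1} S_B L^{-T} is symmetric and has the same eigenvalues.
    A = np.linalg.solve(L, np.linalg.solve(L, S_B).T)
    A = (A + A.T) / 2                       # remove round-off asymmetry
    eigvals, U = np.linalg.eigh(A)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:m]   # indices of the m largest
    W = np.linalg.solve(L.T, U[:, order])   # map back: w = L^{-T} u
    return W, eigvals[order], S_B, S_W


# Synthetic data: 3 classes, 30 points each, in 4 dimensions.
rng = np.random.default_rng(0)
means = np.array([[0, 0, 0, 0], [3, 0, 1, 0], [0, 3, 0, 1]], dtype=float)
X = np.vstack([mean + rng.normal(size=(30, 4)) for mean in means])
y = np.repeat(np.arange(3), 30)

W, eigvals, S_B, S_W = fisher_projection(X, y, m=2)
P = X @ W   # each row is p = W^T x for one data vector
```

With $C$ classes, at most $C - 1$ eigenvalues are nonzero, so choosing `m` larger than that adds no discriminative directions.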
9
Character Recognition Results
Training set: 1040 lowercase letters
Test set: 520 lowercase letters
Test results:
10
Challenge I
The problem with the previous approach is that non-characters are classified as characters. When applied to cursive words, it creates too many nonsense word hypotheses by extracting characters where none seem to exist.
More generally, one wants to be able to generate shapes and their deformations.
11
Challenge II
–How to extract reliable local geometric features of images (corners, contour tangents, contour curvature, …)? How to group them?
–How to quickly match one input against a large database?
–Hierarchical clustering of the database, possibly over a tree structure or some general graph: how to do it? Which criteria to cluster by? Which methods to use?
12
Recognition Engine
–Integrates all available information; generates and grows the word-level hypotheses.
–Most general form: a graph and its search.
–Hypothesis Propagation Network
13
Hypothesis Propagation Network
[Figure: network of hypothesis lists H(t, m), one per time t (up to T) and character class m ("a", "b", …, "y", "z"). Labels: class m's legal predecessors; list length (1, 2, 3); look-back window range.]
Recognition rate of 85% on 100 words (not good)
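The slide gives only the figure labels, so the following is a speculative sketch of how such a network might be propagated, not the authors' algorithm. It assumes that `H[t][m]` holds a short list of scored word prefixes ending in class `m` at segmentation time `t`, that a predecessor is "legal" when the extended prefix is still a prefix of some dictionary word, and that a hypothetical `char_score(s, t, m)` returns a log-score for reading the segment between times `s` and `t` as class `m`.

```python
import heapq
from typing import Callable, Dict, List, Set, Tuple

Hyp = Tuple[float, str]  # (accumulated log-score, word prefix)


def propagate(T: int, classes: List[str],
              char_score: Callable[[int, int, str], float],
              dictionary: Set[str],
              window: int = 3, list_length: int = 3) -> List[Hyp]:
    """Grow word hypotheses over times 0..T; return complete words, best first."""
    prefixes = {w[:i] for w in dictionary for i in range(len(w) + 1)}
    H: List[Dict[str, List[Hyp]]] = [dict() for _ in range(T + 1)]
    H[0][""] = [(0.0, "")]                       # empty hypothesis at the start
    for t in range(1, T + 1):
        for m in classes:
            best: Dict[str, float] = {}
            # Look back at most `window` segmentation points.
            for s in range(max(0, t - window), t):
                seg = char_score(s, t, m)
                for hyps in H[s].values():
                    for score, prefix in hyps:
                        word = prefix + m
                        # Legality check (assumed): stay inside the dictionary.
                        if word in prefixes and score + seg > best.get(word, float("-inf")):
                            best[word] = score + seg
            if best:
                # Keep only the `list_length` best hypotheses for H(t, m).
                H[t][m] = heapq.nlargest(list_length, ((sc, w) for w, sc in best.items()))
    finals = [h for hyps in H[T].values() for h in hyps if h[1] in dictionary]
    return sorted(finals, reverse=True)


# Toy scorer: a few segments fit well, everything else gets a poor score.
GOOD = {(0, 2, "g"): -0.1, (2, 4, "o"): -0.2, (0, 1, "a"): -1.0, (1, 2, "g"): -2.0}


def toy_score(s: int, t: int, m: str) -> float:
    return GOOD.get((s, t, m), -5.0)


results = propagate(T=4, classes=["g", "o", "a"], char_score=toy_score,
                    dictionary={"go", "ago"})
```

In this toy run the segmentation "g" over (0, 2] followed by "o" over (2, 4] wins, so "go" ranks ahead of "ago". Pruning each list to `list_length` entries is what keeps the search tractable, at the cost of possibly discarding the correct hypothesis early.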
14
Challenge III
How to search more efficiently in this network, and more generally in Bayesian networks?
15
Visual Bigram Models (VBM)
Some characters can be very ambiguous when isolated (“9” and “g”; “e” and “l”; “o” and “0”; etc.) but more obvious when put in context.
–Character heights
–Relative height ratio and positioning
Example: “go” vs. “90”
16
VBM: Parameters
[Figure: two adjacent characters with heights $h_1$ and $h_2$, tops $top_1$ and $top_2$, bottoms $bot_1$ and $bot_2$, and a height $h$.]
–Height Diff. Ratio: $\mathrm{HDR} = (h_1 - h_2) / h$
–Top Diff. Ratio: $\mathrm{TDR} = (top_1 - top_2) / h$
–Bottom Diff. Ratio: $\mathrm{BDR} = (bot_1 - bot_2) / h$
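The three ratios are simple to compute once each character has a vertical extent. The sketch below assumes a y-axis that grows upward and, because the slide does not preserve how $h$ is defined, takes $h$ to be the joint vertical extent of the two characters. `CharBox` and `vbm_parameters` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CharBox:
    top: float      # y-coordinate of the character's top (y grows upward: assumption)
    bottom: float   # y-coordinate of the character's bottom

    @property
    def height(self) -> float:
        return self.top - self.bottom


def vbm_parameters(c1: CharBox, c2: CharBox, h: float) -> Dict[str, float]:
    """Height, top, and bottom difference ratios for a character pair."""
    return {
        "HDR": (c1.height - c2.height) / h,
        "TDR": (c1.top - c2.top) / h,
        "BDR": (c1.bottom - c2.bottom) / h,
    }


def joint_height(c1: CharBox, c2: CharBox) -> float:
    """Assumed definition of h: vertical extent covering both characters."""
    return max(c1.top, c2.top) - min(c1.bottom, c2.bottom)


# "go": the g descends below the baseline, the o sits on it.
g, o = CharBox(top=1.0, bottom=-1.0), CharBox(top=1.0, bottom=0.0)
go = vbm_parameters(g, o, joint_height(g, o))

# "90": both digits share the same top and bottom.
nine, zero = CharBox(top=2.0, bottom=0.0), CharBox(top=2.0, bottom=0.0)
ninety = vbm_parameters(nine, zero, joint_height(nine, zero))
```

Under these assumed boxes, "go" yields a nonzero HDR and BDR while "90" yields zeros for all three, which is the kind of contrast the visual bigram model exploits.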
17
VBM: Ascendancy Categories
Total of 9 visual bigram categories (instead of 26 × 26 = 676).
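The slide does not preserve how the 9 categories are defined. One reading that is consistent with the count is three ascendancy classes per character (ascender, descender, neither), giving 3 × 3 = 9 ordered pairs. The letter assignments below are an assumption for illustration only.

```python
import string
from typing import Tuple

# Assumed class membership for lowercase letters (not taken from the slides).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def ascendancy(ch: str) -> str:
    """Map a lowercase letter to one of three assumed ascendancy classes."""
    if ch in ASCENDERS:
        return "ascender"
    if ch in DESCENDERS:
        return "descender"
    return "middle"


def bigram_category(a: str, b: str) -> Tuple[str, str]:
    """Visual bigram category: the ordered pair of ascendancy classes."""
    return (ascendancy(a), ascendancy(b))


letters = string.ascii_lowercase
categories = {bigram_category(a, b) for a in letters for b in letters}
```

All 676 letter pairs collapse into the 9 elements of `categories`, so height and position statistics only need to be estimated per category rather than per letter pair.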
18
VBM: Test Results