March 15-17, 2002. Work with student Jong Oh. Davi Geiger, Courant Institute, NYU

On-Line Handwriting Recognition
Transducer device (digitizer)
Input: sequence of point coordinates with pen-down/up signals from the digitizer
Stroke: sequence of points from a pen-down signal to the following pen-up signal
Word: sequence of one or more strokes
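The input, stroke, and word definitions above can be illustrated with a minimal sketch. It assumes each digitizer sample arrives as an (x, y, pen_down) triple; that sample layout and the name split_into_strokes are illustrative assumptions, not part of the original slides.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]
Sample = Tuple[float, float, bool]  # (x, y, pen_down): assumed sample layout


def split_into_strokes(samples: Iterable[Sample]) -> List[List[Point]]:
    """Group digitizer samples into strokes (pen-down to pen-up runs)."""
    strokes: List[List[Point]] = []
    current: List[Point] = []
    for x, y, pen_down in samples:
        if pen_down:
            # Pen is touching the surface: the point belongs to the current stroke.
            current.append((x, y))
        elif current:
            # Pen-up signal closes the stroke that was being collected.
            strokes.append(current)
            current = []
    if current:
        # Input ended while the pen was still down.
        strokes.append(current)
    return strokes


# A "word" is then simply the list of strokes returned for one writing unit.
word = split_into_strokes([
    (0, 0, True), (1, 1, True), (1, 1, False),
    (2, 0, True), (3, 0, True),
])
```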

System Overview
Diagram blocks: Input; Pre-processing (high curvature points); Segmentation; Recognition Engine; Dictionary; Character Recognizer; Context Models; Word Candidates

Segmentation Hypotheses
High-curvature points and segmentation points (figure not preserved in the transcript)
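The slides do not say how high-curvature points are detected. One common approach, shown here only as a hedged sketch, is to flag points where the pen direction turns sharply. The turning-angle criterion, the 60-degree threshold, and the function names are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def turning_angle(p_prev: Point, p: Point, p_next: Point) -> float:
    """Absolute change of direction (radians) at p along the pen trajectory."""
    a_in = math.atan2(p[1] - p_prev[1], p[0] - p_prev[0])
    a_out = math.atan2(p_next[1] - p[1], p_next[0] - p[0])
    # Wrap the difference into [-pi, pi) before taking its magnitude.
    diff = (a_out - a_in + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)


def high_curvature_points(stroke: Sequence[Point],
                          threshold: float = math.radians(60)) -> List[int]:
    """Indices of interior stroke points whose turning angle reaches the threshold."""
    return [
        i for i in range(1, len(stroke) - 1)
        if turning_angle(stroke[i - 1], stroke[i], stroke[i + 1]) >= threshold
    ]


# An L-shaped stroke: the only sharp turn is at the corner (index 2).
corner_indices = high_curvature_points([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)])
```

Such points would serve as candidate segmentation points; a real system would likely smooth the trajectory first.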

Character Recognition I
Fisher Discriminant Analysis (FDA): improves over PCA (Principal Component Analysis).
Linear projection from the original space to the projection space: $p = W^T x$
Training set: 1040 lowercase letters
Test set: 520 lowercase letters
Test results: 91.5% correct

Fisher Discriminant Analysis
Between-class scatter matrix:
$$S_B = \sum_{i=1}^{C} N_i (\mu_i - \mu)(\mu_i - \mu)^T$$
– $C$: number of classes
– $N_i$: number of data vectors in class $i$
– $\mu_i$: mean vector of class $i$, and $\mu$: overall mean vector
Within-class scatter matrix:
$$S_W = \sum_{i=1}^{C} \sum_{j=1}^{N_i} (v_j^i - \mu_i)(v_j^i - \mu_i)^T$$
– $v_j^i$: $j$-th data vector of class $i$

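Given a projection matrix $W$ (of size $n$ by $m$) and its linear transformation $p = W^T x$, the between-class scatter in the projection space is
$$\tilde{S}_B = W^T S_B W$$
where $S_B$ is the between-class scatter matrix in the original space. Similarly, for the within-class scatter matrix $S_W$,
$$\tilde{S}_W = W^T S_W W$$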

Fisher Discriminant Analysis (cont.)
Optimization formulation of the Fisher projection solution ($\tilde{S}_B$, $\tilde{S}_W$ are the scatter matrices in projection space):
$$W^* = \arg\max_W \frac{|\tilde{S}_B|}{|\tilde{S}_W|} = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|}$$

FDA (continued)
Construction of the Fisher projection matrix:
– Compute the $n$ eigenvalues and eigenvectors of the generalized eigenvalue problem $S_B w_k = \lambda_k S_W w_k$, where $S_B$ and $S_W$ are the between-class and within-class scatter matrices.
– Retain the $m$ eigenvectors having the largest eigenvalues. They form the columns of the target projection matrix.
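The construction above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: it assumes the within-class scatter matrix is invertible, so the generalized problem can be solved as an ordinary eigenproblem of $S_W^{-1} S_B$, and the function name fisher_projection is hypothetical.

```python
import numpy as np


def fisher_projection(X, y, m):
    """Return an n-by-m Fisher projection matrix W for data X (rows) and labels y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[1]
    mu = X.mean(axis=0)                      # overall mean vector
    S_B = np.zeros((n, n))                   # between-class scatter
    S_W = np.zeros((n, n))                   # within-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)               # mean vector of class c
        d = (mu_c - mu)[:, None]
        S_B += len(Xc) * (d @ d.T)
        centered = Xc - mu_c
        S_W += centered.T @ centered
    # Assumption: S_W is invertible, so S_B w = lambda S_W w becomes
    # (S_W^{-1} S_B) w = lambda w.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    # Keep the m eigenvectors with the largest eigenvalues as columns of W.
    order = np.argsort(eigvals.real)[::-1][:m]
    return eigvecs[:, order].real


# Two classes separated along the first axis, with large spread along the second.
X = np.array([[0.0, -5.0], [0.0, 5.0], [0.1, 0.0], [-0.1, 0.0],
              [1.0, -5.0], [1.0, 5.0], [1.1, 0.0], [0.9, 0.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
W = fisher_projection(X, y, m=1)
p = X @ W                                    # projection p = W^T x for every row x
```

In this toy example the projection keeps the first axis, which separates the classes, and discards the high-variance second axis that PCA would have preferred.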

Character Recognition Results
Training set: 1040 lowercase letters
Test set: 520 lowercase letters
Test results: (results table not preserved in the transcript)

Challenge I
The problem with the previous approach is that non-characters are classified as characters. When applied to cursive words, it creates too many nonsense word hypotheses by extracting characters where none seem to exist.
More generally, one wants to be able to generate shapes and their deformations.

Challenge II
How to extract reliable local geometric features of images (corners, contour tangents, contour curvature, ...)? How to group them?
With a large database to match against one input, how to do it fast?
Hierarchical clustering of the database, possibly over a tree structure or some general graph. How to do it? Which criteria to use for clustering? Which methods to use?

Recognition Engine
Integrates all available information; generates and grows the word-level hypotheses.
Most general form: a graph and its search.
Hypothesis Propagation Network

Hypothesis Propagation Network
Figure labels: H(t, m); class m's legal predecessors; character classes "a", "b", "z", "y"; list length 1, 2, 3; time axis t up to T; look-back window range.
Recognition rate: 85% on 100 words (not good)
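The slide gives only the figure labels, so the following is a loose sketch of how such a network might be propagated, not the authors' algorithm. It assumes that H(t, m) stores a short list of the best partial hypotheses ending at time t with last character m, that a character may span at most `window` time steps, and that "legal predecessors" means the extended string must remain a prefix of some dictionary word. The scoring callback char_score and all names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Hypothesis = Tuple[float, str]  # (accumulated score, text so far)


def propagate(T: int,
              classes: Sequence[str],
              char_score: Callable[[int, int, str], float],
              dictionary: Sequence[str],
              window: int = 3,
              list_len: int = 3) -> List[Hypothesis]:
    """Grow word hypotheses over time steps 1..T and return complete words, best first."""
    words = set(dictionary)
    prefixes = {w[:k] for w in words for k in range(len(w) + 1)}
    # H[t][m]: up to list_len best hypotheses ending at time t with last character m.
    H: List[Dict[str, List[Hypothesis]]] = [dict() for _ in range(T + 1)]
    for t in range(1, T + 1):
        for m in classes:
            candidates: List[Hypothesis] = []
            # Look-back window: the character m may cover the span (s, t].
            for s in range(max(0, t - window), t):
                score = char_score(s, t, m)
                if s == 0:
                    previous = [(0.0, "")]          # start of the word
                else:
                    previous = [h for hyps in H[s].values() for h in hyps]
                for prev_score, text in previous:
                    # Only extend hypotheses that stay inside the dictionary.
                    if text + m in prefixes:
                        candidates.append((prev_score + score, text + m))
            if candidates:
                candidates.sort(reverse=True)
                H[t][m] = candidates[:list_len]
    finals = [h for hyps in H[T].values() for h in hyps if h[1] in words]
    return sorted(finals, reverse=True)


# Toy log-scores for two time steps; anything not listed scores -5.0.
toy_scores = {(0, 1, "g"): -0.1, (1, 2, "o"): -0.2, (0, 2, "a"): -1.0}
result = propagate(
    T=2,
    classes=["g", "o", "a"],
    char_score=lambda s, t, m: toy_scores.get((s, t, m), -5.0),
    dictionary=["go", "a"],
)
```

In a fuller system the score would also combine the character recognizer output with context models such as visual bigrams.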

Challenge III
How to search more efficiently in this network and, more generally, in Bayesian networks?

Visual Bigram Models (VBM)
Some characters can be very ambiguous when isolated ("9" and "g"; "e" and "l"; "o" and "0"; etc.), but become more obvious when put in context.
Cues: character heights, relative height ratio, and positioning.
Example: "go" versus "90"

VBM: Parameters
Figure labels: h_1, h_2, h, top_1, top_2, bot_1, bot_2
Height Diff. Ratio: HDR = (h_1 - h_2) / h
Top Diff. Ratio: TDR = (top_1 - top_2) / h
Bottom Diff. Ratio: BDR = (bot_1 - bot_2) / h
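The three ratios can be computed as in the sketch below. The slide text does not define h or the coordinate convention, so two assumptions are made here: the y axis points upward, and h is the height of the joint bounding box of the two characters. The function name is illustrative.

```python
from typing import Dict


def visual_bigram_ratios(top1: float, bot1: float,
                         top2: float, bot2: float) -> Dict[str, float]:
    """Compute HDR, TDR and BDR for two adjacent characters (y axis pointing up)."""
    h1 = top1 - bot1                          # height of the first character
    h2 = top2 - bot2                          # height of the second character
    # Assumption: h is the height of the box enclosing both characters.
    h = max(top1, top2) - min(bot1, bot2)
    if h <= 0:
        raise ValueError("characters must have positive combined height")
    return {
        "HDR": (h1 - h2) / h,
        "TDR": (top1 - top2) / h,
        "BDR": (bot1 - bot2) / h,
    }


# "go": the "g" has a descender (bottom at -1), the "o" sits on the baseline.
ratios_go = visual_bigram_ratios(top1=1.0, bot1=-1.0, top2=1.0, bot2=0.0)
```

With top and bottom measured on an upward axis, HDR equals TDR minus BDR, so a pair such as "go" (tops aligned, bottoms differing) is separated from "90" mainly by the bottom and top ratios.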

VBM: Ascendancy Categories
A total of 9 visual bigram categories (instead of 26 x 26 = 676).

VBM: Test Results
