Online Handwriting Recognition. Charles C. Tappert, School of Computer Science and Information Systems, Pace University
Handwriting Recognition. Offline: scanned images; static information. Online: electronic tablet or digitizer; real-time, dynamic information.
Online Handwriting Recognition. Invention of electronic tablets (late 1950s): tablet and display were separate. Pen computing (1980s): combined tablet and display; brought input and output onto the same surface; immediate feedback via electronic ink; created the paper-like interface.
A person using a Rand Tablet
Pen Computing. Pen computer: a notebook or handheld computer with a stylus-based user interface. Paper-like interface: executes job-specific applications by emulating paper-based work methods. Wireless communication: a portable extension of the corporate information system, reaching not only through the corporate plant but also into its customers' offices. The computer goes where the work is.
Pen Computing
Tablet Digitizer: Dynamic Information. Pen down: indication of inking. X-Y coordinates. Resolution: 200 points/inch. Sampling rate: 100 points/second.
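The digitizer output described above can be pictured as a stream of records. The sketch below is only an illustration: the `PenSample` type and the `to_physical` helper are hypothetical names, and the constants simply restate the resolution and sampling rate quoted on the slide.

```python
from dataclasses import dataclass

RESOLUTION_PPI = 200    # resolution from the slide: points per inch
SAMPLING_RATE_HZ = 100  # sampling rate from the slide: points per second


@dataclass
class PenSample:
    """One digitizer report (hypothetical layout)."""
    x: int          # horizontal position in tablet points
    y: int          # vertical position in tablet points
    pen_down: bool  # True while the pen is inking


def to_physical(index, sample):
    """Convert the index-th sample to (seconds, x in inches, y in inches)."""
    t = index / SAMPLING_RATE_HZ
    return t, sample.x / RESOLUTION_PPI, sample.y / RESOLUTION_PPI


samples = [
    PenSample(0, 0, True),
    PenSample(100, 50, True),
    PenSample(200, 100, False),
]
print(to_physical(2, samples[2]))  # (0.02, 1.0, 0.5)
```

At these rates a one-second stroke yields about 100 coordinate pairs, and one inch of travel spans 200 tablet points.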
Dynamic Handwriting Information. Number of strokes (a stroke is the ink trace from pen down to pen up). Order of strokes. Stroke direction. Stroke velocity and acceleration.
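As a rough sketch of how these dynamic quantities could be derived from a sampled pen trace, the code below splits samples into strokes at pen-up events and then measures direction, speed, and acceleration. The function names, the tuple layout `(x, y, pen_down)`, and the fixed 100 points/second rate are assumptions made for illustration, not a description of any particular system.

```python
import math

SAMPLING_RATE_HZ = 100  # assumed fixed sampling rate, points per second


def split_strokes(samples):
    """Group (x, y, pen_down) samples into strokes: pen down to pen up."""
    strokes, current = [], []
    for x, y, pen_down in samples:
        if pen_down:
            current.append((x, y))
        elif current:
            strokes.append(current)
            current = []
    if current:  # trace ended while the pen was still down
        strokes.append(current)
    return strokes


def stroke_direction(stroke):
    """Angle in degrees of the vector from the first to the last point."""
    (x0, y0), (x1, y1) = stroke[0], stroke[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


def speeds(stroke, rate_hz=SAMPLING_RATE_HZ):
    """Speed between consecutive samples, in tablet points per second."""
    return [math.hypot(x1 - x0, y1 - y0) * rate_hz
            for (x0, y0), (x1, y1) in zip(stroke, stroke[1:])]


def accelerations(speed_list, rate_hz=SAMPLING_RATE_HZ):
    """Change in speed between consecutive intervals, per second."""
    return [(b - a) * rate_hz for a, b in zip(speed_list, speed_list[1:])]


# Hypothetical trace of a printed "T": horizontal bar, pen lift, vertical stem.
samples = [
    (0, 10, True), (5, 10, True), (10, 10, True),
    (7, 10, False),
    (5, 10, True), (5, 5, True), (5, 0, True),
    (5, 0, False),
]
strokes = split_strokes(samples)
print(len(strokes))                            # number of strokes
print([stroke_direction(s) for s in strokes])  # stroke order and direction
print(speeds(strokes[0]), accelerations(speeds(strokes[0])))
```

The order of the returned list preserves stroke order, which is exactly the information an offline scan of the same "T" would not contain.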
Written Language and Handwriting Properties. Alphabet: letters, digits, punctuation, special symbols. Writing is a time sequence of strokes. The writer completes one character before beginning the next, except for delayed strokes. Spatial order: for example, left to right.
Fundamental Property of Writing. Differences between different characters are more significant than differences between drawings of the same character. This is what makes written communication possible. Possible exceptions.
Written English Writing Styles. Handwriting: uppercase, about 2 strokes per letter; lowercase, about 1 stroke per letter. Cursive script: less than one stroke per letter; delayed crossing and dotting of strokes.
Many Computer Recognition Problems: various language alphabets; shorthand (for example, Pitman); spreadsheets; flowcharts; line drawings; editing symbols for text editing.
Computer Problems in English. Constrained handprint: printing on lines (symbols can touch or overlap); printing one symbol per box (form filling). Unconstrained handprint: no lines, and symbols can touch or overlap. Cursive script. Mixed printing and cursive.
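The one-symbol-per-box case is the easiest of these because segmentation reduces to geometry. The sketch below shows one possible way to do it, assuming equal-width boxes laid out left to right; `assign_to_boxes`, `box_width`, and `origin_x` are hypothetical names introduced for this example.

```python
def assign_to_boxes(strokes, box_width, origin_x=0):
    """Group strokes by the box that contains each stroke's mean x position.

    strokes: list of strokes, each a list of (x, y) points.
    Returns a dict mapping box index to the strokes written in that box.
    """
    boxes = {}
    for stroke in strokes:
        center_x = sum(x for x, _ in stroke) / len(stroke)
        index = int((center_x - origin_x) // box_width)
        boxes.setdefault(index, []).append(stroke)
    return boxes


# Hypothetical form field with boxes 10 points wide.
strokes = [
    [(1, 0), (3, 0)],    # first stroke in box 0
    [(12, 0), (14, 4)],  # a stroke in box 1
    [(2, 1), (2, 5)],    # delayed stroke, written later but back in box 0
]
boxes = assign_to_boxes(strokes, box_width=10)
print(sorted(boxes))   # [0, 1]
print(len(boxes[0]))   # 2
```

Because grouping depends only on position, a delayed stroke is still assigned to the right character. Unconstrained handprint and cursive script offer no such boxes, which is why segmentation becomes part of the recognition problem there.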
Handprint Recognition Difficulties: digitizer problems; writing variation not handled by the system; uppercase versus lowercase versus digits; segmentation (the character-within-character problem).
History of Computer Systems for English Printing. System handled only specified variations: a small number of variations per symbol; all common variations. System trained to the user: usually with built-in prototypes covering common variations.
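A minimal sketch of the "built-in prototypes plus user training" idea follows, assuming each character has already been reduced to a fixed-length feature vector. The nearest-prototype rule, the `PrototypeRecognizer` class, and the two toy features (stroke count and width-to-height ratio) are illustrative assumptions, not the design of any system named in this talk.

```python
import math


class PrototypeRecognizer:
    """Nearest-prototype matcher that a user can extend with new samples."""

    def __init__(self, builtin_prototypes):
        # Each prototype is a (label, feature_vector) pair.
        self.prototypes = [(label, tuple(vec)) for label, vec in builtin_prototypes]

    def train(self, label, features):
        """Add one of the user's own writing variations as a new prototype."""
        self.prototypes.append((label, tuple(features)))

    def recognize(self, features):
        """Return the label of the prototype closest in Euclidean distance."""
        label, _ = min(self.prototypes, key=lambda p: math.dist(p[1], features))
        return label


# Hypothetical features: (number of strokes, width / height).
recognizer = PrototypeRecognizer([("I", (1, 0.1)), ("T", (2, 0.8))])

sample = (3, 0.9)                    # a three-stroke character the system has not seen
print(recognizer.recognize(sample))  # T (closest built-in variation)

recognizer.train("H", sample)        # the user labels it as "H"
print(recognizer.recognize((3, 0.85)))  # H
```

The built-in list covers common variations out of the box, while `train` lets the system absorb a writer's uncommon ones.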
Survey Journal Article in 1990: 44 systems, 300 references. 11 experimental systems for handprint; 4 experimental systems for cursive script; 16 commercial systems for opaque tablets; 5 commercial systems for pen computers; 8 experimental application systems (spreadsheets, flowcharts, etc.).
Example Systems. Rand system, 1966 (Groner). Pencept commercial product, 1980s. ATT system, 1983 (Don Burr). IBM Runon system, 1984 (Chuck Tappert). Linus commercial product, 1987 (Ralph Sklarew).
A person using a Rand Tablet
Pencept Product: Pairwise Discrimination
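The slide names pairwise discrimination without giving details, so the following is only a generic sketch of the idea: each pair of candidate characters gets its own discriminator, and the character winning the most pairwise decisions is reported. The features, the rules, and the function `pairwise_classify` are hypothetical and are not taken from the Pencept product.

```python
from collections import Counter
from itertools import combinations


def pairwise_classify(features, labels, discriminators):
    """Let every pair of labels vote; return the label with the most wins.

    discriminators maps each (a, b) pair to a function that returns a or b.
    """
    votes = Counter()
    for a, b in combinations(labels, 2):
        votes[discriminators[(a, b)](features)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical features and pair-specific rules for three confusable shapes.
labels = ["U", "V", "O"]
discriminators = {
    ("U", "V"): lambda f: "U" if f["bottom_curvature"] > 0.5 else "V",
    ("U", "O"): lambda f: "O" if f["closed"] else "U",
    ("V", "O"): lambda f: "O" if f["closed"] else "V",
}

features = {"closed": False, "bottom_curvature": 0.9}
print(pairwise_classify(features, labels, discriminators))  # U
```

The appeal of this arrangement is that each discriminator can use the one feature that best separates its own pair, such as bottom curvature for U versus V.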
Tappert System
Combined Characters Segmentation and Recognition
Categories of Systems. University project systems: least robust. Industrial project systems: more robust. Commercial products: most robust. Compare Fred Brooks' The Mythical Man-Month: program, programming system, programming systems product.
Conclusion and Future Work. The Graffiti recognizer greatly simplified the recognition problem. The handprint problem is not completely solved, even with IBM's ThinkWrite and CIC's Jot products. Cursive script is not solved.
Context Essential. Other Low-Performance Characters.