Handwriting Vector Quantizer


Handwriting Vector Quantizer
Capturing Printed and Cursive Text
Using Domain Knowledge to Efficiently Compress Data While Maintaining Critical Information

Goals
- Reversibility: enhances application flexibility and testability.
  - Flexibility comes from being able to use the subsystem in either recognition or synthesis mode.
  - Testability: running a domain representation subsystem in synthesis mode (give it a representative feature sequence and have it output speech or writing) lets us see or hear what the representation captures about the domain.
- Produce an early result (preferably by mid-term) so that the Recognition Engine teams have good test data. The team can then enhance its subsystem's performance as the semester continues.
- The output format should be a sequence of named objects (e.g. integers 0-255) plus a closeness matrix that a recognition engine can use as a metric to help in “matching” dissimilar object sequences.

Vector Quantizer
- Handwriting: both printing and cursive
- Strokes
  - Pieces of characters
  - Model each as a small number of vector moves
- Normalization: reduce variability
- Closeness
- Word-Level Heuristic: deal with smoothness violations (dotting i’s, crossing t’s and q’s)
- Reversibility: works in both directions

Handwriting
- Input: a sequence of {x, y, z} coordinates from using a stylus to write on the screen, where z is either binary black/white or perhaps pressure.
  - msdn.microsoft.com: Pen programming in Windows
- Output: a sequence of named vectors from a limited set, perhaps 16 angles and 8 lengths (1, 2, 4, …, 128) plus black/white, which fits in one byte.
- Each written character is then accurately represented by 10 to 30 bytes versus about 10^4 pixels, a data compression ratio of better than 40:1.
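As a rough illustration of the one-byte output format, the sketch below packs one pen move into a byte and unpacks it again. The bit layout (1 ink bit, 4 angle bits, 3 length bits), the nearest-value rounding, and the names `encode_move` and `decode_move` are assumptions made for this example; the slides only say that 16 angles, 8 lengths and black/white fit in one byte. The compression figure at the end also assumes the 10^4 pixels are stored at one bit each.

```python
import math

N_ANGLES = 16                             # direction bins, 22.5 degrees apart
LENGTHS = [1, 2, 4, 8, 16, 32, 64, 128]   # the eight power-of-two lengths

def encode_move(dx, dy, ink):
    """Quantize one pen move into a byte (assumed layout: ink | angle | length)."""
    angle = math.atan2(dy, dx) % (2 * math.pi)
    angle_idx = round(angle / (2 * math.pi / N_ANGLES)) % N_ANGLES
    length = math.hypot(dx, dy)
    # Pick the codebook length closest to the measured length.
    length_idx = min(range(len(LENGTHS)), key=lambda i: abs(LENGTHS[i] - length))
    return (int(bool(ink)) << 7) | (angle_idx << 3) | length_idx

def decode_move(code):
    """Recover the quantized (dx, dy, ink) move from a byte."""
    ink = bool(code >> 7)
    angle_idx = (code >> 3) & 0x0F
    length = LENGTHS[code & 0x07]
    theta = angle_idx * 2 * math.pi / N_ANGLES
    return length * math.cos(theta), length * math.sin(theta), ink

# Compression estimate, assuming a 1-bit-per-pixel bitmap of 10^4 pixels.
BITMAP_BYTES = 10**4 // 8        # 1250 bytes
RATIO = BITMAP_BYTES / 30        # worst case of 30 bytes per character
```

With these assumptions a 30-byte character gives a ratio of about 41.7:1, which is consistent with the "better than 40:1" figure on the slide.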

Strokes
- Sequence of strokes
  - Black: leaves ink behind
  - White: an off-the-page move
- Critical points, reliably determined by several methods, allow the generated vector sequence within each stroke to have more consistency.
  - Newton: sudden change in velocity
  - Maxima/minima: along the direction of the writer’s “tilt”
  - Pressure changes: especially on/off the paper
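The three critical-point cues can be sketched as a single pass over the sample sequence. This is a minimal sketch under stated assumptions: samples arrive at a uniform rate (so the distance between neighbours stands in for speed), the tilt direction is taken to be the y axis, and the function name `critical_points` and the `speed_jump` threshold are hypothetical.

```python
import math

def critical_points(samples, speed_jump=2.0):
    """Return sorted indices of candidate critical points in (x, y, z) samples."""
    if not samples:
        return []
    found = {0, len(samples) - 1}          # stroke endpoints always count
    for i in range(1, len(samples) - 1):
        (x0, y0, z0), (x1, y1, z1), (x2, y2, _) = samples[i - 1 : i + 2]
        # Pressure change: the pen goes on or off the paper.
        if bool(z1) != bool(z0):
            found.add(i)
        # Local maximum or minimum along the assumed tilt axis (y).
        if (y1 - y0) * (y2 - y1) < 0:
            found.add(i)
        # Sudden change in velocity between the incoming and outgoing move.
        v_in = math.hypot(x1 - x0, y1 - y0)
        v_out = math.hypot(x2 - x1, y2 - y1)
        if max(v_in, v_out) > speed_jump * max(min(v_in, v_out), 1e-9):
            found.add(i)
    return sorted(found)

samples = [(0, 0, 1), (1, 1, 1), (2, 2, 1), (3, 1, 1), (4, 0, 1), (5, 0, 0), (6, 0, 0)]
print(critical_points(samples))  # [0, 2, 5, 6]
```

In the example, index 2 is the peak of the stroke and index 5 is where the pen lifts.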

Normalization
- Character size
- Stylus speed
- Writer “tilt/slant” (very different for right-handed vs left-handed writers)
- Each of these makes the problem larger for the recognition engine.
- A Vector Quantizer should attempt to “normalize” these variables, but retain a set of expansion parameters for use when generating the user’s handwriting.
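One way to normalize while keeping the expansion parameters is to return them alongside the normalized points, so the transform can be undone in synthesis mode. The sketch below handles size and slant only; stylus speed would need resampling and is left out. The slant is passed in rather than estimated, and the names `normalize`, `denormalize` and the parameter dictionary are assumptions for illustration.

```python
def normalize(points, slant=0.0):
    """Scale (x, y) points to unit height and remove a shear of `slant` (dx per unit dy)."""
    ys = [y for _, y in points]
    y0 = min(ys)
    height = (max(ys) - y0) or 1.0          # avoid dividing by zero for flat strokes
    deslanted = [(x - slant * (y - y0), y) for x, y in points]
    x0 = min(x for x, _ in deslanted)
    norm = [((x - x0) / height, (y - y0) / height) for x, y in deslanted]
    # Expansion parameters: everything needed to regenerate the writer's style.
    params = {"x0": x0, "y0": y0, "height": height, "slant": slant}
    return norm, params

def denormalize(norm, params):
    """Re-introduce the writer's size and slant (the inverse of normalize)."""
    out = []
    for u, v in norm:
        y = v * params["height"] + params["y0"]
        x = u * params["height"] + params["x0"] + params["slant"] * (y - params["y0"])
        out.append((x, y))
    return out

pts = [(10, 20), (15, 40), (20, 60)]
norm, params = normalize(pts, slant=0.25)
print(norm)  # [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0)]
```

The slanted line becomes a vertical unit-height line, and `denormalize(norm, params)` restores the original points.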

Closeness
- “Euclidean distance”: a metric that determines the similarity between features (our 2-D vector set in this case).
- Closeness matrix: a data structure containing the pairwise distances between the vectors.
- The Recognition Engine can use this to generalize the input to get a match, while attaching “penalties” when generalizing.
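A closeness matrix for the quantizer's vector set can be built directly from pairwise Euclidean distances. The sketch below assumes the 16-angle, 8-length codebook mentioned earlier in the deck; the function names and the ordering of the codebook (angle-major) are hypothetical choices for this example.

```python
import math

N_ANGLES = 16
LENGTHS = [1, 2, 4, 8, 16, 32, 64, 128]

def codebook():
    """All 128 quantized 2-D vectors, ordered by angle and then by length."""
    return [
        (length * math.cos(2 * math.pi * a / N_ANGLES),
         length * math.sin(2 * math.pi * a / N_ANGLES))
        for a in range(N_ANGLES)
        for length in LENGTHS
    ]

def closeness_matrix(vectors):
    """Pairwise Euclidean distances; entry [i][j] is the penalty for swapping i with j."""
    return [[math.dist(p, q) for q in vectors] for p in vectors]

matrix = closeness_matrix(codebook())
print(len(matrix), matrix[0][0], matrix[0][1])  # 128 0.0 1.0
```

A recognition engine could then treat `matrix[i][j]` as the penalty for accepting vector `j` where the reference sequence has vector `i`.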

Word-Level Heuristic
- Smoothness violations: dotting i’s, crossing t’s and q’s. Writers tend to do these at the word level instead of per character.
- A long white stroke to the left, a move backwards in time, is detected by the Quantizer.
- Determine where in the stroke/vector sequence the black stroke belongs; a suitable triplet of strokes (white-black-white) can then be inserted into the stroke sequence of the appropriate letter.
- This allows normal writing habits to continue.
- The writer could get improved results by modifying his/her behavior to do these strokes in sequence at the letter level.
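The reordering step can be sketched as follows. This is only an illustration under simplifying assumptions: each stroke is reduced to its horizontal extent, the host letter is chosen as the earlier black stroke whose centre is nearest the delayed mark, and the `Stroke` class, the `back_jump` threshold and `reorder_delayed` are hypothetical names. The slides do not specify how the appropriate letter is found.

```python
from dataclasses import dataclass

@dataclass
class Stroke:
    ink: bool          # True = black (leaves ink), False = white (pen-up move)
    x0: float          # horizontal start
    x1: float          # horizontal end
    label: str = ""

def reorder_delayed(strokes, back_jump=15.0):
    """Move a delayed dot or cross next to the letter it belongs to."""
    out = []
    i = 0
    while i < len(strokes):
        s = strokes[i]
        long_white_left = (not s.ink) and (s.x0 - s.x1) > back_jump
        has_mark = i + 1 < len(strokes) and strokes[i + 1].ink
        blacks = [k for k, t in enumerate(out) if t.ink]
        if long_white_left and has_mark and blacks:
            mark = strokes[i + 1]
            centre = (mark.x0 + mark.x1) / 2
            # Host = earlier black stroke closest to the delayed mark.
            host = min(blacks, key=lambda k: abs((out[k].x0 + out[k].x1) / 2 - centre))
            end = out[host].x1
            # Insert the white-black-white triplet right after the host stroke.
            out[host + 1 : host + 1] = [
                Stroke(False, end, mark.x0), mark, Stroke(False, mark.x1, end)
            ]
            i += 2
            continue
        out.append(s)
        i += 1
    return out

word = [
    Stroke(True, 10, 10, "i-stem"), Stroke(False, 10, 15),
    Stroke(True, 15, 30, "n"), Stroke(False, 30, 10), Stroke(True, 10, 10, "dot"),
]
print([s.label for s in reorder_delayed(word) if s.ink])  # ['i-stem', 'dot', 'n']
```

The long white move back to the left is dropped and replaced by two short white moves around the dot, so the sequence reads as if the writer had dotted the "i" immediately.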

Reversibility
- Smoothing: smooth out the quantized vectors when operating in synthesis mode to produce clean strokes.
- Remembering the writer’s parameters and re-introducing them is useful in replicating the writer’s handwriting:
  - Tilt
  - Size
  - Pen nib, for cursive/calligraphic output
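The slides do not name a smoothing method. As one possible stand-in, Chaikin's corner-cutting scheme rounds off the sharp joints that a 16-angle quantizer produces while keeping the stroke endpoints fixed. The function name and the `rounds` parameter are assumptions for this sketch.

```python
def chaikin(points, rounds=2):
    """Smooth a polyline by repeatedly cutting each corner at 1/4 and 3/4 of every segment."""
    for _ in range(rounds):
        if len(points) < 3:
            break                          # nothing to smooth on a single segment
        new = [points[0]]                  # keep the stroke's start point
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            new.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            new.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        new.append(points[-1])             # keep the stroke's end point
        points = new
    return points

corner = [(0, 0), (4, 0), (4, 4)]
print(chaikin(corner, rounds=1))
# [(0, 0), (1.0, 0.0), (3.0, 0.0), (4.0, 1.0), (4.0, 3.0), (4, 4)]
```

After smoothing, the stored tilt and size parameters would be re-applied to the points before rendering with the chosen pen nib.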