Handwriting Recognition for Genealogical Records Luke Hutchison FHT 2003.

Slides:



Advertisements
Similar presentations
Patient information extraction in digitized X-ray imagery Hsien-Huang P. Wu Department of Electrical Engineering, National Yunlin University of Science.
Advertisements

Complex Networks for Representation and Characterization of Images For CS790g Project Bingdong Li 9/23/2009.
QR Code Recognition Based On Image Processing
Handwritten Mathematical Symbol Recognition for Computer Algebra Applications Xiaofang Xie, Stephen M. Watt Dept. of Computer Science, University of Western.
ONLINE ARABIC HANDWRITING RECOGNITION By George Kour Supervised by Dr. Raid Saabne.
One-Shot Learning Gesture Recognition Students:Itay Hubara Amit Nishry Supervisor:Maayan Harel Gal-On.
1 Probabilistic Artificial Neural Network For Recognizing the Arabic Hand Written Characters Khalaf khatatneh, Ibrahiem El Emary,and Basem Al- Rifai Journal.
Computer Science Research for Family History and Genealogy David W. Embley Heath Nielson, Mike Rimer, Luke Hutchison, Ken Tubbs, Doug Kennard, Tom Finnigan.
Computer Vision Optical Flow
EE663 Image Processing Edge Detection 5 Dr. Samir H. Abdul-Jauwad Electrical Engineering Department King Fahd University of Petroleum & Minerals.
A new face detection method based on shape information Pattern Recognition Letters, 21 (2000) Speaker: M.Q. Jing.
March 15-17, 2002Work with student Jong Oh Davi Geiger, Courant Institute, NYU On-Line Handwriting Recognition Transducer device (digitizer) Input: sequence.
Forged Handwriting Detection Hung-Chun Chen M.S. Thesis in Computer Science Advisors: Drs. Cha and Tappert.
PLT 2007 CSIS Shorthand Handwriting Recognition for Pen-Centric Interfaces Charles C. Tappert 1 and Jean R. Ward 2 1 School of CSIS, Pace University, New.
Face Detection: a Survey Speaker: Mine-Quan Jing National Chiao Tung University.
Introducing of handwritten icons Ralph Niels, Don Willems and Louis Vuurpijl.
1Ellen L. Walker Matching Find a smaller image in a larger image Applications Find object / pattern of interest in a larger picture Identify moving objects.
Smart Traveller with Visual Translator for OCR and Face Recognition LYU0203 FYP.
California Car License Plate Recognition System ZhengHui Hu Advisor: Dr. Kang.
1 of 6 This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT. © 2007 Microsoft Corporation.
Handwritten Character Recognition using Hidden Markov Models Quantifying the marginal benefit of exploiting correlations between adjacent characters and.
כמה מהתעשייה? מבנה הקורס השתנה Computer vision.
Facial Feature Detection
Performance Evaluation of Grouping Algorithms Vida Movahedi Elder Lab - Centre for Vision Research York University Spring 2009.
West Virginia University
ONLINE HANDWRITTEN GURMUKHI SCRIPT RECOGNITION AND ITS CHALLENGES R. K. SHARMA THAPAR UNIVERSITY, PATIALA.
Handwriting Copybook Style Analysis Of Pseudo-Online Data Student and Faculty Research Day Mary L. Manfredi, Dr. Sung-Hyuk Cha, Dr. Charles Tappert, Dr.
ENDA MOLLOY, ELECTRONIC ENG. FINAL PRESENTATION, 31/03/09. Automated Image Analysis Techniques for Screening of Mammography Images.
1 7-Speech Recognition (Cont’d) HMM Calculating Approaches Neural Components Three Basic HMM Problems Viterbi Algorithm State Duration Modeling Training.
What is a line? The edge of something A moving dot . . .
EADS DS / SDC LTIS Page 1 7 th CNES/DLR Workshop on Information Extraction and Scene Understanding for Meter Resolution Image – 29/03/07 - Oberpfaffenhofen.
Input Devices. What is Input?  Everything we tell the computer is Input.
Loop Investigation for Cursive Handwriting Processing and Recognition By Tal Steinherz Advanced Seminar (Spring 05)
S EGMENTATION FOR H ANDWRITTEN D OCUMENTS Omar Alaql Fab. 20, 2014.
Avoiding Segmentation in Multi-digit Numeral String Recognition by Combining Single and Two-digit Classifiers Trained without Negative Examples Dan Ciresan.
Visual Inspection Product reliability is of maximum importance in most mass-production facilities.  100% inspection of all parts, subassemblies, and.
March 10, Iris Recognition Instructor: Natalia Schmid BIOM 426: Biometrics Systems.
HCI For Pen Based Computing Cont. Richard Anderson CSE 481 B Winter 2007.
22CS 338: Graphical User Interfaces. Dario Salvucci, Drexel University. Lecture 10: Advanced Input.
Learning to perceive how hand-written digits were drawn Geoffrey Hinton Canadian Institute for Advanced Research and University of Toronto.
Handwritten Recognition with Neural Network Chatklaw Jareanpon, Olarik Surinta Mahasarakham University.
Handwritten Signature Verification
Autonomous Robots Vision © Manfred Huber 2014.
UC Berkeley CS294-9 Fall Document Image Analysis Lecture 11: Word Recognition and Segmentation Richard J. Fateman Henry S. Baird University of.
Introduction to Related Papers of Vessel Segmentation Methods Advisor : Ku-Yaw Chang Student : Wei-Lu Lin 2015/1/7.
1 Machine Vision. 2 VISION the most powerful sense.
APECE-505 Intelligent System Engineering Basics of Digital Image Processing! Md. Atiqur Rahman Ahad Reference books: – Digital Image Processing, Gonzalez.
Reference books: – Digital Image Processing, Gonzalez & Woods. - Digital Image Processing, M. Joshi - Computer Vision – a modern approach, Forsyth & Ponce.
CSC321 Lecture 5 Applying backpropagation to shape recognition Geoffrey Hinton.
Scanned Documents INST 734 Module 10 Doug Oard. Agenda Document image retrieval  Representation Retrieval Thanks for David Doermann for most of these.
Handwriting Recognition for Genealogical Records
Handwriting Recognition
1 7-Speech Recognition Speech Recognition Concepts Speech Recognition Approaches Recognition Theories Bayse Rule Simple Language Model P(A|W) Network Types.
Arabic Handwriting Recognition Thomas Taylor. Roadmap  Introduction to Handwriting Recognition  Introduction to Arabic Language  Challenges of Recognition.
Neural Networks Lecture 4 out of 4. Practical Considerations Input Architecture Output.
1 A Statistical Matching Method in Wavelet Domain for Handwritten Character Recognition Presented by Te-Wei Chiang July, 2005.
By: Shane Serafin.  What is handwriting recognition  History  Different types  Uses  Advantages  Disadvantages  Conclusion  Questions  Sources.
Face Detection 蔡宇軒.
Motion tracking TEAM D, Project 11: Laura Gui - Timisoara Calin Garboni - Timisoara Peter Horvath - Szeged Peter Kovacs - Debrecen.
OCR Reading.
ROBUST FACE NAME GRAPH MATCHING FOR MOVIE CHARACTER IDENTIFICATION
Supervised Time Series Pattern Discovery through Local Importance
Tracking Objects with Dynamics
Online Handwriting Recognition
© 2016 by W. W. Norton & Company Recognizing Objects Chapter 4 Lecture Outline.
Forged Handwriting Detection
Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science
UN Workshop on Data Capture, Bangkok Session 7 Data Capture
Thomas L. Packer BYU CS DEG
UN Workshop on Data Capture, Dar es Salaam Session 7 Data Capture
Presentation transcript:

Handwriting Recognition for Genealogical Records Luke Hutchison FHT 2003

Church Extraction Effort Nov 2002: Church released US 1880 and Canadian 1881 CensusNov 2002: Church released US 1880 and Canadian 1881 Census 55 million names55 million names 11 million man-hours11 million man-hours Granite Vault: contains 2.3 million rolls of microfilm ( = about 6 million 300-page volumes )Granite Vault: contains 2.3 million rolls of microfilm ( = about 6 million 300-page volumes ) Approximate extraction time for one person (based on the above census): 280 years, 24/7Approximate extraction time for one person (based on the above census): 280 years, 24/7 We don't have that sort of timeWe don't have that sort of time Need automated extraction: handwriting recognitionNeed automated extraction: handwriting recognition

Example Microfilm Images

Handwriting Recognition Two different fields:Two different fields: Online Handwriting Recognition Online Handwriting Recognition Writer's pen movements captured Velocity, acceleration, stroke order etc. Style can be constrained (e.g. Graffitti gestures) Offline Handwriting Recognition Offline Handwriting Recognition Only pixels Cannot constrain style (documents already written) Offline is harder (less information)Offline is harder (less information) Genealogical records are all offlineGenealogical records are all offline Mary

Online Handwriting Recognition Modern systems are moderately successful,Modern systems are moderately successful, e.g. Microsoft Research's new Tablet PC:e.g. Microsoft Research's new Tablet PC: Polynomial coefficients e.g. [0.94, 0.05, 0.29,...]

Offline Handwriting Recognition A difficult problemA difficult problem Almost as many approaches as there are researchersAlmost as many approaches as there are researchers e.g.e.g. Pattern Recognition Pattern Recognition Statistical analysis Statistical analysis Mathematical modelling Mathematical modelling Physics-based modelling Physics-based modelling Subgraph matching / graph search Subgraph matching / graph search Neural networks / machine learning Neural networks / machine learning Fractal image compression Fractal image compression... (too many to list) (too many to list)...

Previous Work: Offline  Online Conversion Finding contour Finding contour Finding midline Finding midline Stroke ordering – difficult problem Stroke ordering – difficult problem

Offline  Online Conversion ctd. Especially difficult with genealogical records:Especially difficult with genealogical records: Stroke ordering: difficult Stroke ordering: difficult Broken lines / blobs? Broken lines / blobs? Not practicalNot practical

Previous Work: Holistic Matching Whole word is stretched to match known wordsWhole word is stretched to match known words Sources of variation compound across wordSources of variation compound across word

Previous Work: Sliding Window Narrow vertical window slides across wordNarrow vertical window slides across word A state machine recognizes sequencesA state machine recognizes sequences Results good, but sensitive to noiseResults good, but sensitive to noise

Previous Work: Parascript Features detected & put in sequenceFeatures detected & put in sequence Letters warped to best match sequence of featuresLetters warped to best match sequence of features Complex; sensitive to noiseComplex; sensitive to noise

Handwriting Recognition Some aspects of Handwriting Recognition:Some aspects of Handwriting Recognition: Segmentation problem (can't read word until it is segmented; can't segment word until it is read) Segmentation problem (can't read word until it is segmented; can't segment word until it is read) Different handwriting styles Different handwriting styles Use of dictionary to correct for errors in reading Use of dictionary to correct for errors in reading nr? m? Srnitb --> Smith

Thesis Approach: Preprocessing Outlines of word are traced and smoothed: Handwriting slope is corrected for automatically:

Segmentation Goal: robustly cut letters into segmentsGoal: robustly cut letters into segments Match multiple segments to detect lettersMatch multiple segments to detect letters Easier than matching whole letterEasier than matching whole letter

Dynamic Global Search Assemble word spelling from possible letter readingsAssemble word spelling from possible letter readings Best path: “Williarw Suwkino” (65% confidence)

Results (1)

Results (2)

Results (3)

Results (4) In general: results even worse – system only worked well on words it was specifically trained on

The Human Brain's Visual System Retina

The Human Brain's Visual System Angular edge detectors Retina

The Human Brain's Visual System Angular edge detectors Retina Line / curve detectors

The Human Brain's Visual System Angular edge detectors Retina Line / curve detectors Feature detectors

The Human Brain's Visual System Angular edge detectors Retina Line / curve detectors Feature detectors Lateral inhibition Feedback

The Human Brain's Visual System Angular edge detectors Retina Line / curve detectors Feature detectors Letter / word shape recognizers Lateral inhibition Feedback J

The Human Brain's Visual System Angular edge detectors Retina Line / curve detectors Feature detectors Letter / word shape recognizers Lateral inhibition Feedback J Joseph

Conclusions Handwriting recognition is important for genealogy......but it is hardHandwriting recognition is important for genealogy......but it is hard Current methods don't work very well......and they don't operate much like the human brainCurrent methods don't work very well......and they don't operate much like the human brain Future work should focus on understanding the brain, and emulating it as much as possible, e.g. With:Future work should focus on understanding the brain, and emulating it as much as possible, e.g. With: Hierarchical reasoning Hierarchical reasoning Feedback Feedback Lateral inhibition Lateral inhibition

Questions? Luke Hutchison