Comparison of Handwritings Miroslava Božeková Thesis supervisor: Doc. RNDr. Milan Ftáčnik, CSc.

Contents
– Goal
– Benefits
– Previous work
– Application
– Experiments
– Conclusion
– Future work

Goal
– Input: one or more scanned images of handwritten text
– Question: same writer or different writers?
– Implement the method and run experiments
– Writer verification with TWO input images
Image source: M. Bulacu and L. Schomaker, Text-Independent Writer Identification and Verification Using Textural and Allographic Features, 2007

Benefits
User-friendly program
– no special knowledge expected
– simple control
– minimal user interaction
Experiments for two input images
– 100 samples from 40 writers
– comparison of 3 approaches
– best result: 96.5% accuracy with the second approach

Two input images

Previous work 1
Srihari et al.
– a large number of features
– 2 categories: macrofeatures and microfeatures
– multilayer perceptron or parametric distributions
– 96% accuracy
Bensefia et al.
– extract the set of graphemes
– cluster the graphemes of all documents in the data set
– mutual information

Previous work 2
Schlapbach and Bunke
– Hidden Markov Model based recognizers
– an individual recognizer
– text lines as input
– 2.5% error
Bulacu and Schomaker
– probability distribution functions (PDFs) extracted from the handwriting images
– two levels: the texture level and the character-shape level

Input data
IAM Handwriting Database
– forms of handwritten English text
– intended for handwritten text recognizers and for writer identification and verification experiments
– scanned at a resolution of 300 dpi and saved as PNG images with 256 gray levels
– 1539 images from 657 authors
– http://iamwww.unibe.ch/~fkiwww/iamDB/

Application
Preprocessing
Extraction of features
Grapheme clustering
– modified hierarchical clustering
– Kohonen’s Self-Organizing Map (SOM)
Three approaches:
– first: feature vector
– second: feature vector + modified hierarchical clustering
– third: feature vector + SOM

Preprocessing
– Thresholding
– Line segmentation
– Slant correction
– Word segmentation
– Grapheme segmentation & normalization

Thresholding - Otsu 1979
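The slide only names Otsu's method, so the following is a minimal sketch of the standard technique rather than the thesis implementation. It assumes the input is a gray-level histogram (256 bins for the IAM scans); the class and method names are hypothetical.

```java
/** Sketch of Otsu's threshold selection on a gray-level histogram (names are illustrative). */
final class OtsuSketch {
    /**
     * Returns the threshold t that maximizes the between-class variance when
     * pixels with gray level <= t form one class and all other pixels the second.
     */
    static int otsuThreshold(int[] histogram) {
        long total = 0;
        double weightedSum = 0.0;
        for (int level = 0; level < histogram.length; level++) {
            total += histogram[level];
            weightedSum += (double) level * histogram[level];
        }
        long countLow = 0;     // number of pixels with level <= t
        double sumLow = 0.0;   // sum of gray levels of those pixels
        double bestVariance = -1.0;
        int bestThreshold = 0;
        for (int t = 0; t < histogram.length; t++) {
            countLow += histogram[t];
            sumLow += (double) t * histogram[t];
            if (countLow == 0) continue;          // low class still empty
            long countHigh = total - countLow;
            if (countHigh == 0) break;            // high class empty from here on
            double meanLow = sumLow / countLow;
            double meanHigh = (weightedSum - sumLow) / countHigh;
            double diff = meanLow - meanHigh;
            // Proportional to the between-class variance (constant factor dropped).
            double variance = (double) countLow * countHigh * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }
}
```

With the threshold in hand, a pixel would be treated as ink when its gray level is at most `otsuThreshold(histogram)`, assuming dark ink on light paper.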

Line segmentation - Arivazhagan, Srinivasan and Srihari 2007

Slant detection and correction 1
What is slant?

Slant detection and correction 2

Word segmentation 1
Image source: S. Srihari, Handwriting recognition, Automatic, 2006

Word segmentation 2

Grapheme segmentation
What is a grapheme?
Image source: Marius Bulacu and Lambert Schomaker, A Comparison of Clustering Methods for Writer Identification and Verification, 2005

Extraction of features
– Slant
– Density of handwriting
– Proportion
– Height of handwriting
– Distance between lines
– Block letters

Proportion, height and distance between lines
Figure labels: height, distance between lines
Proportion: (Upp - Asc) : (Low - Upp) : (Des - Low)
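The proportion ratio can be illustrated with a small sketch. It assumes that Asc, Upp, Low and Des are the row coordinates (growing downward) of the ascender line, upper baseline, lower baseline and descender line; the slide does not define them, so this reading, the normalization to fractions, and all names are assumptions.

```java
/** Sketch of the proportion feature; zone boundaries are assumed to be row coordinates. */
final class ProportionSketch {
    /**
     * Returns the heights of the upper, middle and lower zones as fractions of
     * their sum, i.e. (Upp - Asc) : (Low - Upp) : (Des - Low) scaled to add up to 1.
     */
    static double[] proportion(double asc, double upp, double low, double des) {
        double upper = upp - asc;
        double middle = low - upp;
        double lower = des - low;
        double total = upper + middle + lower;
        if (total <= 0) {
            throw new IllegalArgumentException("zone boundaries must satisfy asc < des");
        }
        return new double[] { upper / total, middle / total, lower / total };
    }
}
```

Normalizing makes the feature independent of the absolute writing size, which the separate height feature already captures.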

Block letters

First approach - feature vector

Self-Organizing Map

Modified hierarchical clustering
2 steps
First step
– find the 'closest' grapheme for each grapheme, using Euclidean distance
– the result is a set of grapheme pairs (clusters)
Second step
– find the closest cluster for each cluster (analogous to the first step)
Result
– percentage of clusters whose graphemes come from the same image
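The first step can be sketched as follows. This is a simplified illustration, not the thesis code: it assumes each grapheme is already a fixed-length numeric vector, forms one nearest-neighbour pair per grapheme (so a pair may be counted from both sides), and skips the second step. All names are hypothetical.

```java
/** Simplified sketch of the first clustering step: nearest-neighbour grapheme pairs. */
final class NearestGraphemeSketch {
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * For every grapheme, finds its closest other grapheme and returns the
     * fraction of these pairs whose two members come from the same image.
     * imageOf[i] is the index of the input image that grapheme i was cut from.
     */
    static double sameImagePairShare(double[][] graphemes, int[] imageOf) {
        if (graphemes.length < 2) return 0.0;
        int same = 0;
        for (int i = 0; i < graphemes.length; i++) {
            int nearest = -1;
            double best = Double.POSITIVE_INFINITY;
            for (int j = 0; j < graphemes.length; j++) {
                if (j == i) continue;
                double d = euclidean(graphemes[i], graphemes[j]);
                if (d < best) {
                    best = d;
                    nearest = j;
                }
            }
            if (nearest >= 0 && imageOf[nearest] == imageOf[i]) same++;
        }
        return (double) same / graphemes.length;
    }
}
```

Intuitively, when both images come from one writer their graphemes should mix, so fewer pairs would stay within a single image; that interpretation is an assumption about how the percentage is used.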

Second and third approach
Two thresholds for SOM
Four thresholds for modified hierarchical clustering
The thresholds are unique for each approach
Comparison with thresholds
– similarity
– dissimilarity
– uncertain result: use the feature vector
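A three-way decision of this kind can be sketched for the two-threshold case. The slide gives neither the threshold values nor the direction of the comparison, so the sketch assumes that the score is the share of same-image clusters and that a low share indicates the same writer; both are assumptions, and the names are hypothetical.

```java
/** Sketch of a two-threshold decision with an uncertain band in between. */
final class ThresholdDecisionSketch {
    enum Verdict { SAME_WRITER, DIFFERENT_WRITER, UNCERTAIN }

    /**
     * Assumed convention: a score at or below lower means similarity, a score
     * at or above upper means dissimilarity, anything in between is uncertain.
     */
    static Verdict decide(double sameImageShare, double lower, double upper) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower must not exceed upper");
        }
        if (sameImageShare <= lower) return Verdict.SAME_WRITER;
        if (sameImageShare >= upper) return Verdict.DIFFERENT_WRITER;
        return Verdict.UNCERTAIN;
    }
}
```

An `UNCERTAIN` verdict is where the slide's fallback would apply: the decision would then be taken from the feature-vector comparison of the first approach.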

Experiments
Approach  Writer     NOE  NOCR  NOIR  Accuracy  Average
1         Same       100    85    15     85%     91%
1         Different  100    97     3     97%
2         Same       100    93     7     93%     96.5%
2         Different  100   100     0    100%
3         Same       100    86    14     86%     92%
3         Different  100    98     2     98%
NOE = number of experiments
NOCR = number of correct results
NOIR = number of incorrect results
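The accuracy and average figures above are consistent with accuracy = NOCR / NOE per case and average = mean of the same-writer and different-writer accuracies. A small check of that arithmetic, with hypothetical method names and assuming 100 experiments per case:

```java
/** Sketch of the arithmetic behind the accuracy and average columns. */
final class AccuracySketch {
    /** Accuracy in percent for one case: correct results over experiments. */
    static double accuracyPercent(int correct, int experiments) {
        return 100.0 * correct / experiments;
    }

    /** Mean of the same-writer and different-writer accuracies, in percent. */
    static double averagePercent(int correctSame, int correctDifferent, int experimentsPerCase) {
        return (accuracyPercent(correctSame, experimentsPerCase)
                + accuracyPercent(correctDifferent, experimentsPerCase)) / 2.0;
    }
}
```

For the second approach, `averagePercent(93, 100, 100)` reproduces the reported 96.5%.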

Three or more input images
boolean verifyN(int n) {
    // Compare each consecutive pair of the n images (indices 0 .. n-1).
    for (int k = 0; k < n - 1; k++) {
        if (!verify2(k, k + 1)) return false;  // one mismatch rejects the whole set
    }
    return true;
}

One input image
– Extracted handwriting features
– Each feature is represented by a vector of numbers

Conclusion
– 3 approaches
– Experiments on 100 images from 40 different writers
– The IAM Handwriting Database
– 96.5% accuracy

Future work
Preprocessing step
– deskew document
– noise reduction
– deskew lines
– rule lines
Handwriting features
Clustering methods
More experiments with different threshold numbers
Another handwriting database

The End
Thank you for your attention
June 12, 2008
