Handwritten Character Recognition Based on an HMM Model
Work Based on Articles
[1] B. Gosselin & A. Paggiaro, "A Study of Hidden Markov Models for Off-line Recognition of Handwritten Characters"
[2] Lawrence R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition"
Project Goal
Application of the principles described in article [1], using theory from article [2].
Elements of an HMM
- N, the number of states in the model. The individual states are denoted $S = \{S_1, S_2, \ldots, S_N\}$, and the state at time t is $q_t$.
- M, the number of distinct observation symbols per state, denoted $V = \{v_1, v_2, \ldots, v_M\}$.
- The state transition probability distribution, denoted $A = \{a_{ij}\}$, where $a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i)$, $1 \le i, j \le N$.
Elements of an HMM (Cont.)
- The observation symbol probability distribution in state j, denoted $B = \{b_j(k)\}$, where $b_j(k) = P(v_k \text{ at } t \mid q_t = S_j)$, $1 \le j \le N$, $1 \le k \le M$.
- The initial state distribution, denoted $\pi = \{\pi_i\}$, where $\pi_i = P(q_1 = S_i)$, $1 \le i \le N$.
- The compact model notation is $\lambda = (A, B, \pi)$.
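To make the notation concrete, the sketches in this write-up use Python/NumPy as a stand-in for the project's MATLAB implementation. The class below is a minimal, illustrative container for $\lambda = (A, B, \pi)$; the uniform initialization is an assumption, not from the slides.

```python
import numpy as np

class DiscreteHMM:
    """Minimal container for a discrete HMM, lambda = (A, B, pi)."""
    def __init__(self, n_states, n_symbols):
        self.N = n_states    # N: number of states
        self.M = n_symbols   # M: number of distinct observation symbols
        # A[i, j] = P(q_{t+1} = S_j | q_t = S_i); each row sums to 1
        self.A = np.full((self.N, self.N), 1.0 / self.N)
        # B[j, k] = P(v_k at t | q_t = S_j); each row sums to 1
        self.B = np.full((self.N, self.M), 1.0 / self.M)
        # pi[i] = P(q_1 = S_i)
        self.pi = np.full(self.N, 1.0 / self.N)
```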
3 Basic Problems of HMM
Problem 1 (Evaluation): given an observation sequence $O = O_1 O_2 \cdots O_T$ and a model $\lambda = (A, B, \pi)$, how do we efficiently compute $P(O \mid \lambda)$? A solution allows us to choose the model that best matches the observation sequence O.
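The standard answer is the forward procedure from Rabiner's tutorial [2], sketched here; it reuses the DiscreteHMM container above, and the observation sequence is assumed to be a list of symbol indices into the columns of B.

```python
import numpy as np

def forward(hmm, O):
    """Forward procedure: returns P(O | lambda) in O(N^2 * T) operations."""
    T = len(O)
    alpha = np.zeros((T, hmm.N))
    alpha[0] = hmm.pi * hmm.B[:, O[0]]            # initialization
    for t in range(1, T):                         # induction
        alpha[t] = (alpha[t - 1] @ hmm.A) * hmm.B[:, O[t]]
    return alpha[-1].sum()                        # termination
```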
3 Basic Problems of HMM (Cont.)
Problem 2 (Decoding): given an observation sequence O and a model $\lambda$, how do we optimally choose the state sequence Q that best matches the observations O? A solution lets us learn about the physical meaning of the hidden part of the model (e.g., the states).
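The usual solution is the Viterbi algorithm, again from [2]; a sketch:

```python
import numpy as np

def viterbi(hmm, O):
    """Viterbi algorithm: most likely state sequence Q for observations O."""
    T = len(O)
    delta = np.zeros((T, hmm.N))           # best score of a path ending in each state
    psi = np.zeros((T, hmm.N), dtype=int)  # backpointers
    delta[0] = hmm.pi * hmm.B[:, O[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * hmm.A  # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * hmm.B[:, O[t]]
    q = np.zeros(T, dtype=int)             # backtrack the optimal path
    q[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        q[t] = psi[t + 1, q[t + 1]]
    return q
```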
3 Basic Problems of HMM (Cont.)
Problem 3 (Learning): how do we adjust the model parameters $A$, $B$, $\pi$ in order to maximize $P(O \mid \lambda)$ for each class of characters? A solution lets us optimize the model parameters so that they best describe how a given observation sequence comes about.
Creating My Database
1. Creating a digital image of the handwritten characters.
2. Isolation of the characters.
3. Binarization (using a local threshold).
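The slides only name a "local threshold"; a common mean-based variant is sketched below, with the window size and offset as illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def binarize_local(gray, window=15, offset=10):
    """Local-threshold binarization: a pixel is taken as ink when it is
    darker than the mean of its window by at least `offset` gray levels.
    (Window and offset are illustrative, not from the slides.)"""
    local_mean = uniform_filter(gray.astype(float), size=window)
    return gray < (local_mean - offset)   # True = character (ink) pixel
```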
My Database Specifications
- Samples from 8 different people.
- 918 characters (~8 x 5 x 27).
- Character size ~0.6 x 0.7 cm (0.24 x 0.27 inch).
- Scanner: 300 DPI, 256 gray levels.
Applying HMM Procedure
Top level: Binary Images → Pre-Processing → Feature Extraction → Designing Discrete HMMs.
Pre-Processing
Goal: provide a skeleton image of each character.
Pipeline: Binary Images → Invert Colors → Create Skeleton Algorithm.
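The slides do not name a specific thinning algorithm; scikit-image's skeletonize is one readily available choice, used here as a sketch. The ink-polarity heuristic is an assumption.

```python
from skimage.morphology import skeletonize

def preprocess(binary_img):
    """Pre-processing sketch: invert colors if needed so that ink pixels
    are True (ink is assumed to be the minority of pixels), then thin the
    character to a one-pixel-wide skeleton."""
    ink = binary_img if binary_img.mean() < 0.5 else ~binary_img
    return skeletonize(ink)
```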
Examples of Skeletons
(Figure: sample characters, original vs. skeleton.)
Feature Extraction
Goals: reduce the amount of data held in the characters and provide an attractive representation (using the oriented search principle method). Example: from 95 (X,Y) pairs down to 14 (X,Y) pairs.
Oriented Search Principle
Goals: a "cheaper" representation of the characters; trying to retrieve the writing dynamics.
Algorithm steps (a code sketch follows this list):
1. Find the first pixel (top-to-bottom, left-to-right scan).
2. Specify the next likely pixel (iteratively).
3. Conclude the line segment (by a distortion threshold or a number-of-pixels limit).
4. Normalize the ordinates (according to the initial width and height of the character).
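A rough sketch of these steps under stated assumptions: the article's actual rules for choosing the next pixel and measuring distortion are more involved, and the thresholds here are illustrative.

```python
import numpy as np

def line_distortion(path):
    """Max perpendicular distance of the path's pixels from the straight
    line joining its endpoints (used as the bending criterion)."""
    p = np.asarray(path, dtype=float)
    a, b = p[0], p[-1]
    d = b - a
    n = np.linalg.norm(d)
    if n == 0:
        return 0.0
    return np.abs(d[0] * (p[:, 1] - a[1]) - d[1] * (p[:, 0] - a[0])).max() / n

def oriented_search(skel, max_pixels=12, distortion_th=1.5):
    """Walk the skeleton from the first pixel found in a top-to-bottom,
    left-to-right scan, closing a segment when it grows too long or too
    bent; returns segment extremities (x1, y1, x2, y2) normalized by the
    character's width and height."""
    h, w = skel.shape
    remaining = set(zip(*np.nonzero(skel)))
    segments = []
    while remaining:
        start = min(remaining)                 # TBLR order on (row, col)
        path = [start]
        remaining.remove(start)
        while len(path) < max_pixels:
            r, c = path[-1]
            nbrs = [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (r + dr, c + dc) in remaining]
            if not nbrs or line_distortion(path + [nbrs[0]]) > distortion_th:
                break                          # conclude the line segment
            path.append(nbrs[0])
            remaining.remove(nbrs[0])
        (r1, c1), (r2, c2) = path[0], path[-1]
        segments.append((c1 / w, r1 / h, c2 / w, r2 / h))
    return segments
```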
Oriented Search Principle (Cont.)
(Figure: illustration of the oriented search on a sample character.)
Feature Extraction - Example
(Figure: a sample character and its resulting observation sequence; each line segment contributes one column of extremity ordinates (X1, Y1, X2, Y2).)
Example of Extremities Ordinates Distribution
(Figure: distribution of the segment-extremity ordinates over the training set.)
Feature Extraction - Summary
Advantages:
- Preserves the shape of the characters.
- Reduces the amount of data.
- Retrieves the dynamic level of the writing.
Disadvantages:
- Over-sensitivity to the first pixel.
- Significant variance in the number of segments.
- Significant variance in the order of segments.
The disadvantages are why an HMM design is required!
Designing Discrete HMM (Per Each Class)
Flow: Observation Sequences → Create ACM → Initialize HMM Structure → Train HMM.
Designing Discrete HMM (Cont.)
Average Character Model (ACM): a representative model for each character, defined by all of its observation sequences.
Goals:
- Defines the number of states in the HMM structure.
- Initializes the state transition distributions.
- Gives physical meaning to the states of the HMM (each state corresponds to a segment).
Designing Discrete HMM (Cont.)
Average Character Model (Cont.)
Creation procedure (see the sketch after the distance-measure definition):
1. Define the number of segments and a threshold (TH).
2. Apply a VQ process (local, mean + std).
3. Associate the segments to the ACM segments (using a distance measure).
4. Update the ACM (each ACM segment becomes the mean of all segments grouped with it).
5. Go to step 3 (unless TH is achieved).
Problem: ACM segments can become too short; this is avoided by taking two consecutive segments as a single one.
Improvement: at the end of each iteration, the least-referenced segment of the ACM is removed and the most-referenced one is split into two different segments.
Designing Discrete HMM (Cont.)
Average Character Model (Cont.)
Distance measure: the distance between a training segment and an ACM segment, used in the association step (step 3 above).
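A sketch of the distance measure and the ACM loop. The slide's exact formula and initialization are not given here, so a plain Euclidean distance between extremity ordinates, a first-sample initialization, and a fixed iteration count (standing in for the TH test) are all assumptions.

```python
import numpy as np

def segment_distance(s, t):
    """Assumed distance between two segments (x1, y1, x2, y2): plain
    Euclidean distance between corresponding extremity ordinates."""
    return np.linalg.norm(np.asarray(s, float) - np.asarray(t, float))

def build_acm(char_segments, n_acm_segments, n_iter=20):
    """ACM loop sketch: repeatedly (a) associate every training segment
    with its closest ACM segment, then (b) update each ACM segment to the
    mean of its group. char_segments is a list (one entry per training
    character) of lists of (x1, y1, x2, y2) segments."""
    acm = list(char_segments[0][:n_acm_segments])  # assumed initialization
    for _ in range(n_iter):                        # stands in for the TH test
        groups = [[] for _ in acm]
        for segments in char_segments:             # association step
            for seg in segments:
                i = min(range(len(acm)),
                        key=lambda k: segment_distance(seg, acm[k]))
                groups[i].append(seg)
        for i, g in enumerate(groups):             # update step: group means
            if g:
                acm[i] = np.mean(np.asarray(g, float), axis=0)
    return acm
```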
Designing Discrete HMM (Cont.)
Initialize HMM Structure:
- N = the number of segments in the ACM.
- State $S_i$ corresponds to the i-th ACM segment.
- The remaining parameters are filled by random selection.
(Figure: example of an initialized structure.)
Designing Discrete HMM (Cont.)
Segments Alignment Process
Goal: matching each segment of each training character to the segments of the ACM.
(Figure: a training character aligned against the ACM.)
Designing Discrete HMM (Cont.)
Alignment Process Results
Each segment of an observation sequence is indexed according to the closest ACM segment. Example: the observation matrix O [4K x T] and the alignment matrix S [K x N].
Designing Discrete HMM (Cont.)
Initialize the HMM parameters according to the matrix S: initial {π}, initial {A}, initial {B} (a counting sketch follows).
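One plausible reading of this step, sketched as counting over the aligned training data; the smoothing constant is an assumption, not from the slides.

```python
import numpy as np

def init_hmm_from_alignment(state_seqs, obs_seqs, N, M, eps=1e-3):
    """Estimate initial pi, A, B by counting. state_seqs[k][t] is the ACM
    segment (state) aligned to segment t of training character k;
    obs_seqs[k][t] is that segment's codebook symbol."""
    pi = np.full(N, eps)          # eps smoothing avoids zero probabilities
    A = np.full((N, N), eps)
    B = np.full((N, M), eps)
    for states, obs in zip(state_seqs, obs_seqs):
        pi[states[0]] += 1                      # which state starts a char
        for t in range(len(states) - 1):
            A[states[t], states[t + 1]] += 1    # state-to-state transitions
        for s, o in zip(states, obs):
            B[s, o] += 1                        # symbols emitted per state
    # normalize each distribution so its entries sum to 1
    return (pi / pi.sum(),
            A / A.sum(axis=1, keepdims=True),
            B / B.sum(axis=1, keepdims=True))
```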
Designing Discrete HMM (Cont.)
Training HMM Procedure
Loop: take the training data {O}, calculate $P(O \mid \lambda)$ along the Viterbi path, adjust the HMM parameters, and repeat until $P(O \mid \lambda)$ is maximized.
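Read as Viterbi-style (segmental) training, the loop might look like the sketch below; it reuses viterbi(), forward(), and init_hmm_from_alignment() from the earlier sketches, and the stopping tolerance is an illustrative assumption.

```python
import numpy as np

def train_hmm(hmm, train_obs, max_iter=50, tol=1e-4):
    """Viterbi-style training sketch: re-align each training sequence with
    its Viterbi path, re-estimate (pi, A, B) from the alignments, and stop
    once the total log-score stops improving."""
    prev_score = -np.inf
    for _ in range(max_iter):
        paths = [viterbi(hmm, O) for O in train_obs]   # adjust parameters
        hmm.pi, hmm.A, hmm.B = init_hmm_from_alignment(
            paths, train_obs, hmm.N, hmm.M)
        score = sum(np.log(forward(hmm, O) + 1e-300) for O in train_obs)
        if score - prev_score < tol:                   # Max{P(O|lambda)}
            break
        prev_score = score
    return hmm
```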
Designing Discrete HMM (Cont.)
Training the HMM (Cont.)
(Figure: flow chart of the training loop with its yes/no convergence test.)
Classification System
Skeleton character image → Feature Extraction → a bank of per-class HMMs (HMM א, HMM ב, …, HMM ת) → MLP.
Improve Classification Performance
Classification decision calculation using the Forward-Backward procedure: $N^2 T$ calculations vs. $2T \cdot N^T$ calculations!
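To get a feel for that gap (with illustrative values, not from the slides): for $N = 30$ states and $T = 14$ observations, the forward procedure costs about $N^2 T = 30^2 \cdot 14 = 12{,}600$ operations, while direct enumeration of all state sequences costs about $2T \cdot N^T = 28 \cdot 30^{14} \approx 1.3 \times 10^{22}$.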
Improve Classification (Cont.)
Improvement 1: add the probability that an observation sequence ends in a given state $S_i$.
Improve Classification (Cont.)
Improvement 2: add the probability that the observation sequence length is T.
Improve Classification (Cont.)
Improvement 3: add a "punishment" probability for each state $S_i$ of the model that does not take place in the Viterbi optimal state sequence (improves discrimination between similar classes, e.g. [ב, ר]).
Improve Classification (Cont.)
Improvement 4: add the probability distribution of the "discrete" aspect ratio (width/height). This avoids a discrimination problem caused by the normalization of the character dimensions.
Classification (DB Properties)
- Performed on the NIST3 database: English unconstrained uppercase handwritten characters, 1579 samples of each class.
- Per class: 1324 in the training set, 235 in the testing set.
- Total number of characters: 41,054 (1579 x 26).
Classification (Total Results)
Improve number:    None   1      2      3      4
Recog. rate (%):   78.6   79.6   80.3   83.2   84.4
Classification (Results Per Char.)
(Figure: recognition rate per character class.)
My Goals
- Implementation of these principles (in MATLAB).
- Using my "small" database.
- Achieving the same recognition rate.