Document Analysis: Text Recognition. Prof. Rolf Ingold, University of Fribourg. Master course, spring semester 2008.



© Prof. Rolf Ingold

Outline
- Objectives
- Typology
- Processing chain
- Methodology
- Character recognition
- Word recognition
- An OCR experiment
- Conclusion

Objectives
- Text recognition is the most advanced domain of document analysis
- It aims at analyzing images to extract text, i.e. sequences of character codes (ASCII/Unicode)

Character recognition typology
- Machine-printed vs. handwritten text
  - on-line vs. off-line handwriting
- Isolated characters, connected characters, cursive text
- Various alphabets
  - Western languages (Roman, Greek, ...)
  - Asian (Chinese, Japanese, Korean, Thai, ...)
  - Arabic alphabets
- Limited vocabulary
  - only numbers, only uppercase text
  - with/without diacritics/punctuation
  - restricted vocabulary (city names, street names, ...)
  - language
  - contextual knowledge

Factors influencing performance
- Variability in style
  - single-writer, omni-writer
  - mono-font, multi-font, omni-font
- Geometrical variability
  - in size
  - in orientation (rotated)
  - in transformations (slanted, perspective view, ...)
- Image resolution
  - binary images, starting at 200 dpi
  - grey-level images, starting at 150 dpi
- Image quality
  - degraded support (historical documents)
  - acquisition conditions (bad illumination, optical aberration, noise, ...)

European languages
- European text is characterized by
  - a limited set of characters (26 to ~100 classes)
  - diacritics and punctuation
  - isolated characters and cursive scripts
  - left-to-right and top-to-bottom writing
  - a large variety of fonts
  - different handwriting styles

Arabic
- Arabic text is characterized by
  - right-to-left writing
  - a limited set of characters
  - context-dependent glyphs
  - connected characters
  - diacritics
  - justification by word stretching

Asian scripts
- Asian text is characterized by
  - numerous scripts (hanzi, kanji, hanja, han tu, hiragana, katakana, ...)
  - horizontal and vertical writing
  - very large alphabets
  - structured characters
  - grid-based layout

OCR Methodologies
- OCR systems are very complex and combine several steps
  - image segmentation into characters or isolated shapes
  - preprocessing of hypothetical segmented characters
    - size normalization, morphological filtering, thinning, ...
  - feature extraction of isolated shapes
    - global measures (width-to-height ratio, density, center of gravity, moments, ...)
    - local properties (stroke thickness, ...)
  - shape identification using
    - a single classifier (neural network, support vector machine, ...)
    - multiple classifiers and fusion methods
  - word validation using contextual information
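
The global measures named on this slide can be illustrated with a minimal Python sketch. It assumes a binary glyph stored as a list of rows of 0/1 values; the function name `global_features` and the sample glyph are hypothetical and not taken from the course material.

```python
def global_features(img):
    """Compute a few global shape measures of a binary glyph image.

    img is a list of rows; each row is a list of 0 (background) / 1 (ink).
    This representation is an assumption made for the example.
    """
    ink = [(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v]
    if not ink:
        raise ValueError("empty glyph")
    rows = [r for r, _ in ink]
    cols = [c for _, c in ink]
    # Bounding box of the ink pixels
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {
        "aspect_ratio": width / height,                 # width-to-height ratio
        "density": len(ink) / (width * height),         # ink fraction of the box
        "center_of_gravity": (sum(rows) / len(ink), sum(cols) / len(ink)),
    }


# A tiny letter "L", 4 rows by 3 columns
glyph = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
features = global_features(glyph)
print(features)
```

Such measures are cheap to compute but coarse; in practice they would be combined with local properties before classification.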

Character vs. word recognition
- There is a paradox with cursive text or connected characters:
  - character recognition supposes prior character segmentation
  - character segmentation requires prior character recognition
- Several approaches to bypass this paradox
  - entire word modeling and recognition
  - multiple hypothesis generation and testing
  - combined character segmentation and recognition (HMM)

Processing chain
[Diagram. Box labels: line segmentation, word segmentation, character segmentation, normalization, isolated character recognition (feature/primitive extraction, identification), word recognition (feature/primitive extraction, identification), post-analysis, recognized word]

Isolated character recognition
- Isolated character recognition is applicable
  - on high-quality printed text
  - on constrained handwriting (forms)
- The challenge is to take into account the variability of the class
- Performance depends on
  - size of the alphabet (number of classes)
  - image quality

Several classification strategies
- Direct comparison with class model
- Statistical pattern recognition
  - using features
- Structural pattern recognition
  - using primitives
- Hybrid approaches combining statistical and structural approaches
- Use of multiple classifiers and fusion of their results

Comparison with class model or class samples
- The unknown pattern is compared with one representative of each class (model)
  - a similarity measure is returned
  - the decision is determined by the most similar sample
  - rejection may occur if the best similarity is below a threshold
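
A minimal Python sketch of this decision rule follows. The similarity function (fraction of agreeing pixels), the two class models, and the threshold values are all assumptions chosen for illustration; any similarity measure with "higher is more similar" semantics could be plugged in.

```python
def classify(pattern, models, similarity, reject_below):
    """Return the label of the most similar class model, or None to reject."""
    best_label, best_score = None, float("-inf")
    for label, model in models.items():
        score = similarity(pattern, model)
        if score > best_score:
            best_label, best_score = label, score
    # Reject when even the best match is not similar enough
    return best_label if best_score >= reject_below else None


def pixel_agreement(a, b):
    """Fraction of pixels on which two equal-size binary images agree."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    return sum(x == y for x, y in zip(flat_a, flat_b)) / len(flat_a)


# Hypothetical 3x3 class models
models = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
noisy_l = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]  # an "L" with one pixel missing

accepted = classify(noisy_l, models, pixel_agreement, reject_below=0.8)
rejected = classify(noisy_l, models, pixel_agreement, reject_below=0.95)
print(accepted, rejected)
```

Raising the threshold trades recognition errors for rejections, which can then be handled by manual correction.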

Similarity measures
- Hamming distance
- Warping distance
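
Both measures can be sketched in a few lines of Python. The slide does not say which warping distance is meant; the sketch below assumes classic dynamic time warping (DTW) over one-dimensional sequences with an absolute-difference local cost, which is a common choice but only an assumption here.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def dtw_distance(s, t):
    """Dynamic time warping distance between two numeric sequences.

    Local cost is |s[i] - t[j]|; the sequences may differ in length.
    """
    inf = float("inf")
    n, m = len(s), len(t)
    # d[i][j] = cost of the best alignment of s[:i] with t[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Binary images are compared pixel by pixel after flattening
print(hamming_distance([1, 0, 1, 1], [1, 1, 1, 0]))   # 2
# A stretched copy of a profile has zero warping distance
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))          # 0.0
```

Hamming distance requires images of identical size, whereas a warping distance tolerates local stretching, for example of projection profiles taken from words of different widths.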

Features for statistical approaches
- Sometimes preprocessing at image level is required
  - smoothing
  - size normalization
  - stroke normalization
  - skeletonization
  - ...
- Features are extracted
  - horizontal and vertical projection profile
  - central moments
  - intersections with lines
  - global transforms (Hough, Fourier, ...)
  - local features (densities, moments, ...)
  - ...
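
Projection profiles are among the simplest of these features. The Python sketch below assumes a binary image stored as a list of rows of 0/1 values; the names `row_profile` and `column_profile` are chosen here because authors differ on which of the two is called "horizontal".

```python
def projection_profiles(img):
    """Return (row_profile, column_profile) of a binary image.

    row_profile[r] is the ink count in row r;
    column_profile[c] is the ink count in column c.
    """
    row_profile = [sum(row) for row in img]
    column_profile = [sum(col) for col in zip(*img)]
    return row_profile, column_profile


glyph = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
row_profile, column_profile = projection_profiles(glyph)
print(row_profile)     # [1, 1, 1, 3]
print(column_profile)  # [4, 1, 1]
```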

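Central moments
- Central moments are shift-invariant properties defined as $\mu_{pq} = \sum_x \sum_y (x - \bar{x})^p (y - \bar{y})^q f(x, y)$ with $\bar{x} = m_{10} / m_{00}$ and $\bar{y} = m_{01} / m_{00}$, where $m_{pq} = \sum_x \sum_y x^p y^q f(x, y)$ are the raw moments of the image $f$
- They can be computed from the raw moments, e.g. $\mu_{20} = m_{20} - \bar{x}\, m_{10}$, $\mu_{11} = m_{11} - \bar{x}\, m_{01}$, $\mu_{02} = m_{02} - \bar{y}\, m_{01}$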
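
The shift invariance can be checked numerically with a short Python sketch. It uses the standard textbook definition of raw and central image moments, which is assumed to match the formulas shown on the original slide; the function names and test images are hypothetical.

```python
def raw_moment(img, p, q):
    """Raw moment m_pq = sum over pixels of x^p * y^q * f(x, y)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


def central_moment(img, p, q):
    """Central moment mu_pq, taken about the center of gravity."""
    m00 = raw_moment(img, 0, 0)
    xbar = raw_moment(img, 1, 0) / m00
    ybar = raw_moment(img, 0, 1) / m00
    return sum(((x - xbar) ** p) * ((y - ybar) ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


# The same 2x2 square at two different positions
square = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
shifted = [[0, 0, 0], [0, 1, 1], [0, 1, 1]]

print(central_moment(square, 2, 0), central_moment(shifted, 2, 0))
```

Both calls give the same value, while the raw moment m_20 differs between the two images.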

Primitives for structural approaches
- Shapes are decomposed into strokes and several properties are extracted
  - number of connected components
  - number of holes
  - number and relative position of singular points
    - extremities
    - connections
    - crossings
  - concavities, convexities
  - ...
- These primitives are represented as
  - strings
  - trees
  - graphs
  and used for comparison
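
Two of these primitives, the number of connected components and the number of holes, can be computed with a flood fill. The Python sketch below assumes a binary image as a list of rows of 0/1 values, 8-connectivity for ink and 4-connectivity for background (a usual pairing, but an assumption here); all names are hypothetical.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_regions(img, value, neighbors):
    """Count connected regions of pixels equal to `value` (breadth-first fill)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if img[r][c] != value or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in neighbors:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < h and 0 <= nc < w
                            and not seen[nr][nc] and img[nr][nc] == value):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return count


def components_and_holes(img):
    """Return (number of ink components, number of holes)."""
    w = len(img[0])
    # Pad with background so the outer background forms a single region
    padded = [[0] * (w + 2)] + [[0] + list(row) + [0] for row in img] + [[0] * (w + 2)]
    components = count_regions(padded, 1, N8)
    holes = count_regions(padded, 0, N4) - 1  # subtract the outer background
    return components, holes


letter_o = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
digit_8 = [[1, 1, 1], [1, 0, 1], [1, 1, 1], [1, 0, 1], [1, 1, 1]]
letter_i = [[1], [0], [1]]
print(components_and_holes(letter_o), components_and_holes(digit_8), components_and_holes(letter_i))
```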

Identification
- For statistical approaches, different classifiers are used
  - discriminant functions
  - kNN classifier
  - multi-layer perceptron
  - support vector machines
  - ...
- For structural approaches, use
  - hierarchical classification
  - string distances
  - graph matching
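
As an example of a statistical classifier, here is a minimal k-nearest-neighbor sketch in Python. The two-dimensional feature vectors (imagined as aspect ratio and density) and the training samples are invented for illustration; `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter


def knn_classify(x, training_set, k=3):
    """Label x by majority vote among its k nearest training samples.

    training_set is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(training_set, key=lambda sample: math.dist(x, sample[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (aspect ratio, density) -> character
training_set = [
    ((0.30, 0.90), "l"),
    ((0.35, 0.80), "l"),
    ((1.00, 0.60), "o"),
    ((0.95, 0.65), "o"),
    ((0.90, 0.70), "o"),
]

print(knn_classify((0.32, 0.85), training_set))
print(knn_classify((0.97, 0.62), training_set))
```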

Multilayer perceptron
- Information is propagated through a layered network of "neurons"
  - the decision is given by the highest activation on the output layer
  - the weights of the connections are computed in a training phase
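
The propagation and decision steps can be sketched in Python as follows. The sigmoid activation and the hand-set weights are assumptions for illustration only; in a real system the weights would come from the training phase, which is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, layers):
    """Propagate input x through a list of (weights, biases) layers."""
    activation = list(x)
    for weights, biases in layers:
        activation = [
            sigmoid(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation


def decide(x, layers, labels):
    """Return the label of the output neuron with the highest activation."""
    outputs = forward(x, layers)
    winner = max(range(len(outputs)), key=outputs.__getitem__)
    return labels[winner]


# Hypothetical 2-2-2 network with hand-set (not trained) weights
layers = [
    ([[4.0, -4.0], [-4.0, 4.0]], [0.0, 0.0]),  # hidden layer
    ([[3.0, -3.0], [-3.0, 3.0]], [0.0, 0.0]),  # output layer
]
labels = ["A", "B"]

print(decide((1.0, 0.0), layers, labels))
print(decide((0.0, 1.0), layers, labels))
```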

Hierarchical classification
- Structural pattern recognition can be performed hierarchically

OCR Difficulties
- The main sources of errors are
  - variability of character shapes (special fonts, handwriting)
  - image defects: noise and distortions
  - broken or touching characters
  - shape similarity ("0" and "O", "1", "I" and "l", "5" and "S", ...)
  - small shapes: punctuation, accents, superscripts ("er", "ème")
  - special characters ("©", "½", "±", ...) or bullets

Word recognition
- Text recognition at word level makes sense
  - in case of restricted vocabulary
  - for language-driven approaches
  - for knowledge-based approaches
  - for keyword spotting
- Word recognition is typically used for
  - cursive scripts
  - handwriting
  - noisy text that is difficult to segment
  - low-resolution text

Word recognition specificities
- Word recognition is more complex than character recognition
  - usually the number of classes is much higher
  - more features are needed
- Word recognition can take into account external information
  - language-based knowledge
  - dictionary
  - character frequencies, bigrams, trigrams, etc.
  - structural constraints (security number, dates, ...)
  - restricted vocabulary (e.g. city names, street names, ...)
  - redundancy (e.g. zip codes and city names)
- Hidden Markov Models (HMM) are suited for word recognition

Hidden Markov models (1)
- Each class is modeled by a two-stage stochastic process using hidden and visible states
- A model λ = (A, B, π) is composed of
  - A, the matrix of transition probabilities
  - B, the matrix of observation probabilities
  - π, the vector of the initial state probabilities

Hidden Markov models (2)
- The probability of an observation sequence given a model can be computed efficiently using the forward algorithm
- A pattern is assigned to the model with the highest posterior probability (i.e., the model that best explains the pattern)
- The parameters of the model (probabilities) are determined in a training phase using training samples
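
A minimal Python sketch of scoring an observation sequence against a model (A, B, π) is given below. It assumes the standard forward algorithm with discrete observation symbols, and it assumes equal prior probabilities for all models, so that picking the highest likelihood also picks the highest posterior. The toy models are invented; training is not shown.

```python
def forward_probability(observations, A, B, pi):
    """Probability of an observation sequence under an HMM (forward algorithm).

    A[i][j]: probability of moving from hidden state i to state j
    B[i][o]: probability of emitting symbol o in state i
    pi[i]:   probability of starting in state i
    """
    n = len(pi)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n)]
    for o in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return sum(alpha)


def best_model(observations, models):
    """Name of the model that gives the observations the highest probability."""
    return max(models,
               key=lambda name: forward_probability(observations, *models[name]))


# Hypothetical two-state model over the symbols {0, 1}
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]
print(forward_probability([0, 1, 0], A, B, pi))

# Two hypothetical single-state class models, each favoring one symbol
models = {
    "a": ([[1.0]], [[0.9, 0.1]], [1.0]),
    "b": ([[1.0]], [[0.1, 0.9]], [1.0]),
}
print(best_model([0, 0, 1], models))
```

For long sequences the products underflow, so practical implementations work with scaled or logarithmic probabilities.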

Post-analysis
- Post-analysis is performed with the aim of validating / correcting character recognition
  - based on dictionaries
  - bigrams, trigrams
  - confidence of character recognition (if available)
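
Dictionary-based correction can be sketched in Python with an edit distance. The choice of Levenshtein distance, the `max_distance` cutoff, and the sample dictionary are assumptions for illustration; a real system would also weigh character confidences and n-gram statistics.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimal number of insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def correct(word, dictionary, max_distance=1):
    """Return the closest dictionary word, or None if none is close enough."""
    if word in dictionary:
        return word
    best = min(dictionary, key=lambda entry: edit_distance(word, entry))
    return best if edit_distance(word, best) <= max_distance else None


dictionary = ["Fribourg", "Bern", "Lausanne"]
print(correct("Fr1bourg", dictionary))  # "1" was recognized instead of "i"
print(correct("Zurich", dictionary))    # no close entry: rejected (None)
```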

Character Recognition Performance
- OCR (Optical Character Recognition) is the most mature technique of document analysis
- For most applications, very high accuracy is required
  - a 99% recognition rate still means one error per 100 characters, i.e. many errors per page
  - 99.9% to 99.99% is often requested
- OCR systems may be designed for
  - standard OCR-A, OCR-B fonts, specially designed for OCR
  - mono-font recognition, specialized for typewriting
  - omni-font or multi-font text recognition (the most popular)
  - trainable systems, being tuned for specific fonts and styles
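
The effect of the recognition rate on a page can be made concrete with a small calculation in Python. The figure of 2000 characters per page is an assumption made for this example, not a value from the slides.

```python
def expected_errors_per_page(recognition_rate, chars_per_page=2000):
    """Expected number of misrecognized characters on one page.

    chars_per_page=2000 is an assumed typical page length.
    """
    return (1.0 - recognition_rate) * chars_per_page


for rate in (0.99, 0.999, 0.9999):
    print(f"{rate:.2%} -> {expected_errors_per_page(rate):.1f} errors per page")
```

Under this assumption, 99% leaves about 20 errors on every page, whereas 99.99% leaves about one error every five pages.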

OCR experiment
- One magazine page [Hebdo 18, 2007, Editorial]
  - good quality printed text
  - medium layout complexity

First Experiment (Standard MS-Office OCR)
- Layout not understood
- Many OCR errors
- => Unusable!

Second Experiment (ReadIris)
- Page layout correctly recognized
- Italic correctly detected
  - including one false detection
- A few word segmentation errors
- Correct de-hyphenation
- A few OCR errors
  - often as a consequence of segmentation errors!

Conclusion on OCR technologies
- Imperfect results in printed character recognition
- Recognition of uncontrolled handwriting not mature
- Practical problems with
  - mathematics (symbols and formulas)
  - special fonts or scripts
  - logos
- => Perfect document recognition is not achievable!
- Some applications can deal with approximate results
- Recognition algorithms should be tuned to prefer rejections to errors
- Include manual correction in the processing step