OCR a survey Csink László 2009

2 Problems to Solve
Recognize good quality printed text
Recognize neatly written handprinted text
Recognize omnifont machine-printed text
Deal with degraded, bad quality documents
Recognize unconstrained handwritten text
Lower substitution error rates
Lower rejection rates

3 OCR according to Nature of Input

4 Feature Extraction
A large number of feature extraction methods are available in the literature for OCR.
Which method suits which application?

5 A Typical OCR System
1. Gray-level scanning ( dpi)
2. Preprocessing
–Binarization using a global or locally adaptive method
–Segmentation to isolate individual characters
–(optional) Conversion to another character representation (e.g. skeleton or contour curve)
3. Feature extraction
4. Recognition using classifiers
5. Contextual verification or post-processing
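The binarization step of the pipeline can be sketched with a global threshold. The following NumPy sketch uses Otsu's classical between-class-variance criterion as one common global method; it is an illustration, not necessarily the method the slides have in mind:

```python
import numpy as np

def otsu_threshold(gray):
    """Pick the global threshold maximizing between-class variance (Otsu, 1979)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    levels = np.arange(256, dtype=np.float64)
    csum = np.cumsum(hist)                 # class-0 pixel count per threshold
    cmoment = np.cumsum(hist * levels)     # class-0 intensity sum per threshold
    w0 = csum / total
    w1 = 1.0 - w0
    with np.errstate(divide="ignore", invalid="ignore"):
        mu0 = cmoment / csum                              # class-0 mean
        mu1 = (cmoment[-1] - cmoment) / (total - csum)    # class-1 mean
        sigma_b = w0 * w1 * (mu0 - mu1) ** 2
    sigma_b = np.nan_to_num(sigma_b)       # empty classes contribute nothing
    return int(np.argmax(sigma_b))

def binarize(gray):
    """1 = foreground (here: brighter than the threshold), 0 = background."""
    return (gray > otsu_threshold(gray)).astype(np.uint8)
```

A locally adaptive method would instead compute a separate threshold per neighbourhood, which is preferable for unevenly illuminated scans.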

6 Feature Extraction (Devijver and Kittler)
Feature extraction = the problem of extracting from the raw data the information which is most relevant for classification purposes, in the sense of minimizing the within-class variability while enhancing the between-class pattern variability.
Extracted features must be invariant to the expected distortions and variations.
Curse of dimensionality: if the training set is small, the number of features cannot be high either.
Rule of thumb: number of training patterns = 10×(dim of feature vector)

7 Some Issues
Do the characters have known orientation and size?
Are they handwritten, machine-printed or typed?
Degree of degradation?
If a character may be written in two ways (e.g. ‘a’ or ‘α’), it might be represented by two patterns.

8 Variations of the Same Character
Size invariance can be achieved by normalization, but norming can cause discontinuities in the character.
Rotation invariance is important if characters may appear in any orientation (P or d?).
Skew invariance is important for hand-printed text or multifont machine-printed text.

9 Features Extracted from Grayscale Images
Goal: locate candidate characters. If the image is binarized, one may find the connected components of expected character size by a flood-fill type algorithm (4-way recursive method, 8-way recursive method, non-recursive scanline method, etc.). Then the bounding box is found.
A grayscale method is typically used when recognition based on the binary representation fails. Then the localization may be difficult. Assuming that there is a standard size for a character, one may simply try all possible locations. In a good case, after localization one has a subimage containing one character and no other objects.
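As an illustration of the non-recursive variants, here is a sketch of connected-component labelling by breadth-first flood fill, followed by bounding-box extraction (function names and the BFS choice are mine, not from the slides):

```python
import numpy as np
from collections import deque

def connected_components(binary, connectivity=4):
    """Label 4- or 8-connected foreground components with a BFS flood fill.
    Non-recursive, so large components cannot overflow the call stack."""
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and not labels[sy, sx]:
                next_label += 1
                labels[sy, sx] = next_label
                queue = deque([(sy, sx)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in offsets:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] and not labels[ny, nx]):
                            labels[ny, nx] = next_label
                            queue.append((ny, nx))
    return labels, next_label

def bounding_box(labels, label):
    """(top, left, bottom, right) of one labelled component, inclusive."""
    ys, xs = np.nonzero(labels == label)
    return ys.min(), xs.min(), ys.max(), xs.max()
```

Components whose bounding box matches the expected character size become the candidate characters.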

10 Template Matching (not often used in OCR systems for grayscale characters)
No feature extraction is used; the template character image itself is compared to the input character image:

D_j = Σ_m |Z(m) − T_j(m)|

where the character Z and the template T_j are of the same size and the summation is taken over all the M pixels of Z. The problem is to find the j for which D_j is minimal; then Z is identified with T_j.
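Assuming D_j is the sum of absolute pixel differences, as the surrounding text suggests, the matching rule is a few lines of NumPy:

```python
import numpy as np

def match_template(z, templates):
    """Return the index j minimizing D_j = sum over all pixels of |Z - T_j|.
    z and every template must have the same shape."""
    dists = [np.abs(z.astype(int) - t.astype(int)).sum() for t in templates]
    return int(np.argmin(dists))
```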

11 Limitations of Template Matching
Characters and templates must be of the same size.
The method is not invariant to changes in illumination.
Very vulnerable to noise.
In template matching, all pixels are used as templates. It is a better idea to apply unitary (distance-preserving) transforms to character images, obtaining a reduction of features while preserving most of the information of the character shape.

12 The Radon Transform The Radon transform computes projections of an image matrix along specified directions. A projection of a two-dimensional function f(x,y) is a set of line integrals. The Radon function computes the line integrals from multiple sources along parallel paths, or beams, in a certain direction. The beams are spaced 1 pixel unit apart. To represent an image, the radon function takes multiple, parallel-beam projections of the image from different angles by rotating the source around the center of the image. The following figure shows a single projection at a specified rotation angle.
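A minimal discrete approximation of such a parallel-beam projection can be sketched in NumPy. The binning scheme below (rounding each pixel's signed distance from the central beam) is an illustrative choice, not the exact implementation the slide describes:

```python
import numpy as np

def project(image, theta_deg):
    """Parallel-beam projection of a grayscale image at a given angle:
    each pixel's intensity is accumulated into the beam bin given by its
    rounded coordinate along the projection axis through the image centre."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(theta_deg)
    ys, xs = np.indices(image.shape)
    # signed distance of each pixel from the central beam
    s = (xs - cx) * np.cos(theta) + (ys - cy) * np.sin(theta)
    nbins = int(np.ceil(np.hypot(h, w))) + 1   # enough bins for any angle
    bins = np.round(s).astype(int) + nbins // 2
    proj = np.zeros(nbins)
    np.add.at(proj, bins.ravel(), image.ravel().astype(float))
    return proj
```

At 0° this reduces to column sums and at 90° to row sums, i.e. the ordinary projection histograms used later in these slides.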

13 Projections to Various Axes


15 Zoning
Consider a candidate area (connected set) surrounded by a bounding box. Divide it into 5×5 equal parts and compute the average gray level in each part, yielding a 25-length feature vector.
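A sketch of this zoning step, assuming the subimage is already cropped to the bounding box (how to split dimensions not divisible by 5 is an implementation choice):

```python
import numpy as np

def zoning_features(subimage, n=5):
    """Split the bounding box into n x n zones; one feature per zone =
    the mean gray level of that zone."""
    h, w = subimage.shape
    ys = np.linspace(0, h, n + 1).astype(int)   # zone row boundaries
    xs = np.linspace(0, w, n + 1).astype(int)   # zone column boundaries
    feats = [subimage[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
             for i in range(n) for j in range(n)]
    return np.array(feats)
```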

16 Thinning
Thinning is possible both for grayscale and for binary images.
Thinning = skeletonization of characters.
Advantage: few features, easy to extract.
The informal definition of a skeleton is a line representation of an object that is: i) one pixel thick, ii) through the "middle" of the object, and iii) preserves the topology of the object.

17 When No Skeleton Exists
a) Impossible to generate a one-pixel-width skeleton that lies in the middle
b) No pixel can be left out while preserving the connectedness

18 Possible Defects
Specific defects of the data may cause misrecognition:
Small holes → loops in skeleton
Single-element irregularities → false tails
Acute angles → false tails

19 How Thinning Works
Most thinning algorithms rely on the erosion of the boundary while maintaining connectivity; see Morpholo.html for mathematical morphology.
To avoid defects, preprocessing is desirable.
As an example, in a black-and-white application:
–They remove very small holes
–They remove black elements having fewer than 3 black neighbours and having connectivity 1
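One classical boundary-erosion scheme of this kind is Zhang–Suen thinning (1984). The slides do not name a specific algorithm, so the sketch below illustrates the general idea (iteratively deleting boundary pixels that do not break connectivity) rather than the referenced method:

```python
import numpy as np

def zhang_suen(img):
    """Zhang-Suen thinning on a 0/1 image: repeatedly peel deletable
    boundary pixels in two sub-iterations until nothing changes."""
    skel = img.astype(np.uint8).copy()
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for y in range(1, skel.shape[0] - 1):
                for x in range(1, skel.shape[1] - 1):
                    if not skel[y, x]:
                        continue
                    # neighbours P2..P9, clockwise starting from north
                    p = [skel[y-1, x], skel[y-1, x+1], skel[y, x+1],
                         skel[y+1, x+1], skel[y+1, x], skel[y+1, x-1],
                         skel[y, x-1], skel[y-1, x-1]]
                    b = sum(p)                                   # black neighbours
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1    # 0->1 transitions
                            for i in range(8))
                    if step == 0:
                        cond = p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0
                    else:
                        cond = p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_delete.append((y, x))
            for y, x in to_delete:   # batch deletion keeps the passes symmetric
                skel[y, x] = 0
            changed = changed or bool(to_delete)
    return skel
```

The transition count `a == 1` is what preserves connectivity, and `2 <= b <= 6` is what protects endpoints and interior pixels.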

20 An Example of Noise Removal This pixel will be removed (N=1; has 1 black neighbour)

21 Generation of Feature Vectors Using Invariant Moments
Given a grayscale subimage Z containing a character candidate, the moments of order p+q are defined by

m_pq = Σ x^p y^q Z(x,y)

where the sum is taken over all M pixels of the subimage. The translation-invariant central moments of order p+q are obtained by shifting the origin to the center of gravity:

μ_pq = Σ (x − x̄)^p (y − ȳ)^q Z(x,y)

where x̄ = m_10/m_00 and ȳ = m_01/m_00.
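The two definitions above translate directly into NumPy (a sketch; x is taken as the column index and y as the row index):

```python
import numpy as np

def moment(z, p, q):
    """m_pq = sum over all pixels of x^p * y^q * Z(x, y)."""
    ys, xs = np.indices(z.shape)
    return float((xs**p * ys**q * z).sum())

def central_moment(z, p, q):
    """mu_pq: the same sum with the origin shifted to the centre of gravity,
    which makes the result translation invariant."""
    m00 = moment(z, 0, 0)
    xbar = moment(z, 1, 0) / m00
    ybar = moment(z, 0, 1) / m00
    ys, xs = np.indices(z.shape)
    return float(((xs - xbar)**p * (ys - ybar)**q * z).sum())
```

Shifting the same blob around the image leaves every central moment unchanged, which is exactly the invariance the slide claims.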

22 Hu’s (1962) Central Moments
The η_pq are scale invariant; the M_i are rotation invariant.

23 K-Nearest Neighbor Classification
Example of k-NN classification. The test sample (green circle) should be classified either to the first class of blue squares or to the second class of red triangles. If k = 3, it is classified to the second class because there are 2 triangles and only 1 square inside the inner circle. If k = 5, it is classified to the first class (3 squares vs. 2 triangles inside the outer circle).
Disadvantage in practice: the distances from the green circle to all blue squares and to all red triangles have to be computed, and this may take much time.
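A brute-force k-NN sketch that mirrors the example (and its practical cost: every training distance is computed per query):

```python
import numpy as np
from collections import Counter

def knn_classify(train_x, train_y, query, k=3):
    """Label the query with the majority class among its k nearest
    training samples (Euclidean distance, brute force)."""
    dists = np.linalg.norm(train_x - query, axis=1)  # distance to EVERY sample
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

For large training sets this linear scan is the bottleneck; space-partitioning structures (e.g. k-d trees) are the usual remedy.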

24 From now on we will deal with binary (black and white) images only.

25 Projection Histograms
These methods are typically used for
–segmenting characters, words and text lines
–detecting if a scanned text page is rotated
But they can also provide features for recognition!
Using the same number of bins on each axis – and dividing by the total number of pixels – the features can be made scale independent.
Projection to the y-axis is slant invariant, but projection to the x-axis is not.
Histograms are very sensitive to rotation.

26 Comparison of Histograms
It seems plausible to compare two histograms y_1 and y_2 (where n is the number of bins) in the following way:

D(y_1, y_2) = Σ_{i=1..n} |y_1(i) − y_2(i)|

However, a dissimilarity using cumulative histograms is less sensitive to errors. Define the cumulative histogram Y as follows:

Y(j) = Σ_{i=1..j} y(i)

For the cumulative histograms Y_1 and Y_2 define D as:

D(Y_1, Y_2) = Σ_{j=1..n} |Y_1(j) − Y_2(j)|
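Both dissimilarities are one-liners in NumPy. The sketch below also shows why the cumulative version is gentler: a histogram shifted by one bin is penalised in only one cumulative bin instead of two raw bins:

```python
import numpy as np

def hist_dissimilarity(y1, y2):
    """Bin-by-bin L1 distance between two histograms."""
    return np.abs(np.asarray(y1) - np.asarray(y2)).sum()

def cumulative_dissimilarity(y1, y2):
    """L1 distance between the cumulative histograms: small shifts
    of mass between neighbouring bins cost far less."""
    return np.abs(np.cumsum(y1) - np.cumsum(y2)).sum()
```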

27 Zoning for Binary Characters 1
Contour extraction or thinning may be unusable for self-touching characters. This kind of error often occurs in degraded machine-printed texts (generations of photocopying).
The self-touching problem may be healed by morphological opening.
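Morphological opening is erosion followed by dilation; with a 3×3 structuring element it removes one-pixel bridges between touching characters while leaving solid strokes intact. A self-contained NumPy sketch (the 3×3 square element is my choice; real systems pick the element to match the stroke width):

```python
import numpy as np

def erode(img):
    """3x3 binary erosion: a pixel survives only if its whole 3x3
    neighbourhood is foreground (the image is zero-padded)."""
    p = np.pad(img, 1)
    out = np.ones_like(img)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out &= p[1 + dy:1 + dy + img.shape[0], 1 + dx:1 + dx + img.shape[1]]
    return out

def dilate(img):
    """3x3 binary dilation: a pixel fires if any 3x3 neighbour is foreground."""
    p = np.pad(img, 1)
    out = np.zeros_like(img)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= p[1 + dy:1 + dy + img.shape[0], 1 + dx:1 + dx + img.shape[1]]
    return out

def opening(img):
    """Erosion then dilation: thin bridges vanish, blobs are restored."""
    return dilate(erode(img))
```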

28 Zoning for Binary Characters 2
Similarly to the grayscale case, we consider a candidate area (connected set) surrounded by a bounding box. Divide it into 5×5 equal parts and compute the number of black pixels in each part, yielding a 25-length feature vector.

29 Generation of Moments in the Binary Case
Given a binary subimage Z containing a character candidate, the moments of order p+q are defined by

m_pq = Σ x^p y^q

where the sum is taken over all black pixels of the subimage. The translation-invariant central moments of order p+q are obtained by shifting the origin to the center of gravity:

μ_pq = Σ (x − x̄)^p (y − ȳ)^q

where x̄ = m_10/m_00 and ȳ = m_01/m_00.

30 The Central Moments Can Be Used Similarly to the Grayscale Case
The η_pq are scale invariant; the M_i are rotation invariant.

31 Contour Profiles The profiles may be outer profiles or inner profiles. To construct profiles, find the uppermost and lowermost pixels on the contour. The contour is split at these points. To obtain the outer profiles, for each y select the outermost x on each contour half. Profiles to the other axis can be constructed similarly.

32 Features Generated by Contour Profiles
First differences of profiles: x′_L(y) = x_L(y+1) − x_L(y)
Width: w(y) = x_R(y) − x_L(y)
Height / max_y w(y)
Location of minima and maxima of the profiles
Location of peaks in the first differences (which may indicate discontinuities)
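The profile features above can be sketched directly from the binary image, taking the leftmost and rightmost foreground column of each row as the outer profiles (a simplification of tracing the actual contour):

```python
import numpy as np

def profiles(binary):
    """Left and right outer profiles: for each row containing foreground,
    the leftmost and rightmost foreground column index."""
    rows = [y for y in range(binary.shape[0]) if binary[y].any()]
    x_left = np.array([np.flatnonzero(binary[y]).min() for y in rows])
    x_right = np.array([np.flatnonzero(binary[y]).max() for y in rows])
    return x_left, x_right

def profile_features(binary):
    x_left, x_right = profiles(binary)
    width = x_right - x_left      # w(y) = x_R(y) - x_L(y)
    d_left = np.diff(x_left)      # x'_L(y) = x_L(y+1) - x_L(y)
    return width, d_left
```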

33 Zoning on Contour Curves 1 (Kimura & Sridhar)
Enlarged zone
A feature vector of size (4×4)×4 is generated.

34 Zoning on Contour Curves 2 (Takahashi)
Contour codes were extracted from inner contours (if any) as well as outer contours; the feature vector had dimension (4×6×6×6)×4×(2) (size × four directions × (inner and outer)).

35 Zoning on Contour Curves 3 (Cao)
When the contour curve is close to a zone border, small variations in the curve may lead to large variations in the feature vector.
Solution: fuzzy border

36 Zoning of Skeletons
Features: length of the character graph in each zone (9 or 3). By dividing the length by the total length of the graph, size independence can be achieved.
Additional features: the presence or absence of junctions or endpoints.

37 The Neural Network Approach for Digit Recognition
Le Cun et al.:
Each character is scaled to a 16×16 grid
Three intermediate hidden layers
Training on a large set
Advantage: feature extraction is automatic
Disadvantage: we do not know how it works
The output set (here 0-19) is small