Slide 1: Document Analysis: Document Image Processing
Prof. Rolf Ingold, University of Fribourg
Master course, spring semester 2008
Slide 2: Outline
- Image acquisition
- Image enhancement
- Foreground / background separation
  - Binarization
  - Color clustering
- Skew detection and correction
  - Skew estimation
  - Deskewing
- Text normalization
Slide 3: Image acquisition
Document images are acquired by
- drum scanners
- flatbed scanners
- high-resolution digital cameras
- specialized book scanners
or extracted from
- 3D scene images
- video sequences
Slide 4: Image quality
Various types of document images:
- binary images (fax)
- gray level images (256 levels)
- RGB images (24 bits or more)
at different resolutions:
- 200 dpi (low fax quality)
- 300-400 dpi (standard resolution for office automation), i.e. 8-15 Mpixels for A4 format
- 600 dpi or higher for special applications
Images may be degraded:
- distorted, non-planar
- noisy, with artifacts (e.g. JPEG compression)
Slide 5: Document image examples
Example images at 200 dpi and 400 dpi.
Slide 6: Overview of document image processing
Image preprocessing is an initial step of document analysis; it aims at preparing the image for further processing.
The most important initial steps are:
- image enhancement
- binarization, i.e. foreground / background separation
- skew correction
More specialized techniques are used locally:
- text size normalization
- slant correction
- ...
Slide 7: Image enhancement
Classical image filtering algorithms are applied
- to reduce or remove color information
- to enhance the contrast between foreground and background
- to correct irregular illumination
- to strengthen contours
- to smooth contours
- to remove salt-and-pepper noise
- to thin or thicken strokes
- ...
Image enhancement is often combined with segmentation or shape analysis (a small filtering sketch follows below).
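To make two of these filtering steps concrete, here is a minimal sketch (not from the course slides) that removes salt-and-pepper noise with a median filter and then stretches the contrast; the 3x3 window and the 1%/99% percentiles are assumed defaults.

```python
import numpy as np
from scipy.ndimage import median_filter

def enhance_gray(gray, noise_window=3, low_pct=1, high_pct=99):
    """Two typical enhancement steps: a median filter to remove salt-and-pepper
    noise, followed by a simple contrast stretch between two percentiles."""
    smoothed = median_filter(gray, size=noise_window)
    lo, hi = np.percentile(smoothed, [low_pct, high_pct])
    stretched = np.clip((smoothed - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
    return (stretched * 255).astype(np.uint8)
```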
Slide 8: Foreground / background separation
Document image analysis requires the separation between foreground (ink) and background (paper).
Foreground / background separation is trivial for simple document classes: binarization with an appropriate global threshold.
Problems arise in the following situations:
- non-uniform background (mixed colors and "reverse video")
- textured backgrounds
- halftoning artifacts
- non-uniformly illuminated documents
- degraded documents (bad inking, old paper, holes, ...)
- paper transparency, ink bleeding through from the other side
Slide 9: Binarization in presence of dithering
In case of dithering, a low-pass filter should first be used to smooth the background before thresholding (see the sketch below).
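As an illustration (not from the slides), a minimal sketch of this idea: a Gaussian low-pass filter followed by a global threshold. The sigma value and the fixed threshold of 128 are assumptions; in practice a data-driven threshold such as Otsu's would be preferable.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def binarize_dithered(gray, sigma=1.5, threshold=128):
    """Smooth the dithered background with a Gaussian low-pass filter,
    then apply a global threshold (foreground = darker than threshold)."""
    smoothed = gaussian_filter(gray.astype(np.float64), sigma=sigma)
    return smoothed < threshold
```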
Slide 10: Niblack's method
Niblack's method uses a local threshold

    T(x,y) = μ(x,y) - k · σ(x,y)

where
- μ(x,y) and σ(x,y) represent respectively the mean and standard deviation of the gray levels in an N x N neighborhood around pixel (x,y)
- k is a constant between 0 and 1 (suggested value 0.2)
- R is the range of gray levels
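A minimal NumPy/SciPy sketch of this kind of local thresholding (not taken from the course); the 15x15 window and the convention that pixels darker than T(x,y) are foreground are assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def niblack_threshold(gray, window=15, k=0.2):
    """Local threshold T = mean - k * std over a window x window neighborhood."""
    gray = gray.astype(np.float64)
    mean = uniform_filter(gray, size=window)
    mean_sq = uniform_filter(gray * gray, size=window)
    std = np.sqrt(np.maximum(mean_sq - mean * mean, 0.0))
    return mean - k * std

def binarize_niblack(gray, window=15, k=0.2):
    """Foreground (ink) = pixels darker than the local threshold."""
    return gray < niblack_threshold(gray, window, k)
```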
Slide 11: Sauvola's method
Sauvola et al. have proposed a variant which assumes that the text is dark on a bright background:

    T(x,y) = μ(x,y) · (1 + k · (σ(x,y) / R - 1))

where R = 128 and k = 0.5.
Problems remain when this hypothesis does not hold (even after reversing).
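The corresponding sketch for Sauvola's formula, reusing the same local statistics (again, the 15x15 window is an assumed default):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sauvola_threshold(gray, window=15, k=0.5, R=128.0):
    """Local threshold T = mean * (1 + k * (std / R - 1))."""
    gray = gray.astype(np.float64)
    mean = uniform_filter(gray, size=window)
    mean_sq = uniform_filter(gray * gray, size=window)
    std = np.sqrt(np.maximum(mean_sq - mean * mean, 0.0))
    return mean * (1.0 + k * (std / R - 1.0))

def binarize_sauvola(gray, window=15, k=0.5, R=128.0):
    """Foreground (ink) = pixels darker than the local threshold."""
    return gray < sauvola_threshold(gray, window, k, R)
```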
Slide 12: Binarization in case of colored background
Example comparing binarization by global thresholding with Sauvola's method.
Slide 13: Comparison of binarization techniques
Example panels: original image, Fisher, Fisher (windowed), Yanowitz-Bruckstein, Niblack, Sauvola et al. (images from F. Lebourgeois, INSA, Lyon).
Slide 14: Color clustering
For richly colored documents such as
- checks, forms, ...
- geographic maps
- historical documents
- advertising
foreground / background separation is performed by color clustering.
Color clustering may be achieved automatically by
- k-means
- Gaussian mixtures
- ...
(a k-means sketch follows below)
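A minimal k-means sketch (not from the course) using scikit-learn; the choice of 3 clusters and the "darkest cluster = ink" heuristic are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_colors(rgb_image, n_clusters=3, seed=0):
    """Cluster pixel colors with k-means; returns a label image (one cluster id per pixel)."""
    h, w, _ = rgb_image.shape
    pixels = rgb_image.reshape(-1, 3).astype(np.float64)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(pixels)
    return labels.reshape(h, w)

def darkest_cluster_as_foreground(rgb_image, labels):
    """Heuristic: treat the cluster with the lowest mean luminance as ink."""
    luminance = rgb_image.mean(axis=2)
    means = [luminance[labels == c].mean() for c in range(labels.max() + 1)]
    return labels == int(np.argmin(means))
```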
Slide 15: Skew detection and correction
Most document image recognition algorithms need perfectly horizontally and vertically aligned text, but very often acquisition systems are not accurate enough.
Skew correction requires two steps:
- skew estimation (with a precision < 1 degree)
- image deskewing (rotation by a small angle)
For book reading systems, more sophisticated image correction algorithms are required because of page curvature.
Slide 16: Skew estimation
Many different methods have been proposed for skew estimation on printed documents:
- margin detection
  - by white stream analysis
  - by projection profile analysis
- Hough transforms
  - at pixel level
  - on centers of connected components
- linear regression on centers of connected components
Most methods can be applied on down-sampled images.
Skew detection for handwriting is more difficult, but less useful.
Slide 17: Projection profiles
Projection profiles are simple histograms accumulating foreground pixels along each row or each column (see the sketch below).
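A two-line NumPy sketch of the idea, assuming a binary image with foreground pixels set to True:

```python
import numpy as np

def projection_profiles(binary):
    """binary: 2D array, True/1 for foreground (ink) pixels."""
    horizontal = binary.sum(axis=1)  # one count per row    -> text line structure
    vertical = binary.sum(axis=0)    # one count per column -> column / margin structure
    return horizontal, vertical
```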
Slide 18: Hough transform
The Hough transform is a global transformation mapping the spatial domain (x, y) to a parameter space (ρ, θ); each pixel is accumulated on a beam of lines defined in polar coordinates, i.e.

    ρ = x · cos θ + y · sin θ
Slide 19: Skew estimation by Hough transform
The Hough transform allows estimating the skew angle: the angle θ whose accumulator contains the strongest peaks corresponds to the orientation of the text lines (see the sketch below).
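A small, self-contained sketch of this (not from the slides): foreground pixels vote in a (ρ, θ) accumulator restricted to near-horizontal angles, and the angle with the strongest peak is returned. The ±5° search range, the 0.1° step and the single-peak criterion are assumptions.

```python
import numpy as np

def estimate_skew_hough(binary, max_skew=5.0, step=0.1):
    """Estimate the skew angle (in degrees) of a binary document image by voting
    foreground pixels into a rho histogram for each candidate angle."""
    ys, xs = np.nonzero(binary)                    # foreground pixel coordinates
    diag = int(np.hypot(*binary.shape)) + 1        # offset so rho bins are non-negative
    best_skew, best_score = 0.0, -1.0
    for skew in np.arange(-max_skew, max_skew + step, step):
        theta = np.deg2rad(90.0 + skew)            # text lines are near-horizontal -> theta near 90 deg
        rho = np.round(xs * np.cos(theta) + ys * np.sin(theta)).astype(int) + diag
        counts = np.bincount(rho, minlength=2 * diag)
        score = counts.max()                       # strength of the strongest line at this angle
        if score > best_score:
            best_score, best_skew = score, skew
    return best_skew
```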
Slide 20: Deskewing of document images
Deskewing requires an image rotation:
- rotation of color or gray level images needs re-sampling (interpolation)
- rotation of binary images has several pitfalls:
  - it introduces distortions and noise
  - it is not reversible (except for Pythagorean angles)
Deskewing can also be approximated by combining two affine transforms, e.g. a horizontal and a vertical shear (see the sketch below).
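A minimal deskewing sketch for gray level images, using SciPy's rotation with bilinear re-sampling; the background padding value and the sign convention for the angle are assumptions.

```python
from scipy.ndimage import rotate

def deskew_gray(gray, skew_degrees, background=255):
    """Rotate a gray level image by -skew to undo the estimated skew.
    Bilinear interpolation (order=1) re-samples the image; the border exposed
    by the rotation is padded with the background value."""
    return rotate(gray, angle=-skew_degrees, reshape=False, order=1,
                  mode="constant", cval=background)
```

For binary documents, this supports the point made on the next slides: rotate the gray level image first, then binarize, rather than rotating the binary image pixel by pixel.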
Slide 21: Rotation of binary images
Pixel-based rotation of binary images introduces distortions; this artifact can be avoided by connected component replacement (repositioning each connected component as a whole instead of rotating individual pixels).
Slide 22: Rotation of binary images (2)
Better results are obtained by rotating the original gray level image (before binarization).
Slide 23: Normalization of character size
For text recognition, normalization of character sizes is often required.
Size normalization can be achieved
- by bounding boxes of isolated characters (see the sketch below)
- by baseline, ascenders and descenders
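A minimal bounding-box normalization sketch (not from the course); the 32x32 target size is an arbitrary assumption.

```python
import numpy as np
from PIL import Image

def normalize_character(binary_char, out_size=(32, 32)):
    """Crop a binary character image to its foreground bounding box and rescale it
    to a fixed size (bounding-box normalization). Assumes at least one foreground pixel."""
    ys, xs = np.nonzero(binary_char)
    crop = binary_char[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    img = Image.fromarray((crop * 255).astype(np.uint8))
    img = img.resize(out_size)
    return np.asarray(img) > 127
```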
Slide 24: Normalization techniques for handwriting
In case of handwriting, additional normalization may be applied:
- size normalization of ascenders and descenders
- slant correction
Slant estimation is performed by averaging the directions of the medians of straight, near-vertical segments.
Slide 25: Run Length Smearing Algorithm (RLSA)
The Run Length Smearing Algorithm (RLSA) consists in replacing white runs by black runs if their length is smaller than a given threshold; it can be applied horizontally or vertically.
RLSA is often useful for segmentation (see the sketch below).
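A straightforward sketch of horizontal RLSA (vertical RLSA is the same operation on the transposed image), assuming a boolean image with black/foreground pixels set to True:

```python
import numpy as np

def rlsa_horizontal(binary, threshold):
    """Horizontal RLSA: fill white (background) runs shorter than `threshold`
    between two black (foreground) pixels on the same row."""
    out = binary.copy()
    for row in out:                               # each row is a view, so out is modified in place
        cols = np.nonzero(row)[0]                 # column indices of black pixels
        for left, right in zip(cols[:-1], cols[1:]):
            if 1 < right - left <= threshold:     # white gap shorter than the threshold
                row[left:right] = True
    return out

def rlsa_vertical(binary, threshold):
    """Vertical RLSA applied via the transposed image."""
    return rlsa_horizontal(binary.T, threshold).T
```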