1
Language-Independent Text Line Extraction from Historical Document Images
Presented by: Abed Asi
First International Workshop on Historical Document Imaging and Processing 2011, Beijing, China
2
Motivation
Historical handwritten manuscripts are valuable cultural heritage.
They provide insights into both tangible and intangible cultural aspects of the past.
There are ongoing efforts to understand, manipulate and archive historical manuscripts.
Digitization increases accessibility and allows automatic processing.
*Courtesy: wadod.com, Genizah Project
3
Outline
Background
Challenges
Seam Carving
Text line representation by seams
Energy Map
Seam Generation
Experimental Results
Summary
4
Image representation N x M (Matrix)
5
Binarization
[Figure: intensity histogram; axes: intensity vs. number of pixels]
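The slide only shows an intensity histogram, so the thresholding rule itself is not given. As a minimal sketch, assuming a single global threshold (the function name `binarize` and the default value 128 are illustrative assumptions, not taken from the presentation), binarization can look like this:

```python
def binarize(gray, threshold=128):
    """Mark dark (ink) pixels as 1 and bright (background) pixels as 0.

    `gray` is a list of rows of intensities in 0..255. The fixed
    threshold is an assumption; in practice it would be chosen from
    the intensity histogram.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


page = [[250, 30, 245],
        [240, 20, 235]]
binary = binarize(page)
print(binary)  # [[0, 1, 0], [0, 1, 0]]
```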
6
Connectivity & Components
We can define 4-paths or 8-paths, depending on the type of connectivity specified.
A set of pixels S is a connected component if, for each pair of pixels (x1,y1) ∈ S and (x2,y2) ∈ S, there is a path between them such that every pixel in the path is in S and every two successive pixels are X-neighbors (X = 4 or 8).
[Figure: 4-neighborhood and 8-neighborhood]
7
Connected Component
One word, but 3 connected components
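To make the role of the neighborhood concrete, here is a minimal sketch that counts connected components with a breadth-first search. The names `N4`, `N8` and `count_components` are illustrative assumptions; the slides do not prescribe a labelling algorithm.

```python
from collections import deque

# Neighbor offsets (row, column) for the two connectivity types.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(binary, neighbors):
    """Count connected components of non-zero pixels under the given neighborhood."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            count += 1  # a new, unvisited component starts here
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


diagonal = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]
print(count_components(diagonal, N4))  # 3
print(count_components(diagonal, N8))  # 1
```

The diagonal stroke is one component under 8-connectivity but three under 4-connectivity, which is why the connectivity has to be fixed before components are counted.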
8
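Distances
Given two points P = (u,v) and Q = (x,y):
Euclidean distance: $D_e(P,Q) = \sqrt{(u-x)^2 + (v-y)^2}$
City block distance: $D_4(P,Q) = |u-x| + |v-y|$
Chessboard distance: $D_8(P,Q) = \max(|u-x|, |v-y|)$
In the example, P = (1,8) and Q = (4,1): $D_e = \sqrt{58} \approx 7.62$, $D_4 = 10$, $D_8 = 7$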
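The three metrics can be checked numerically on the example points. This is a small sketch; the function names are illustrative and do not come from the slides.

```python
import math


def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def city_block(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def chessboard(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


P, Q = (1, 8), (4, 1)
print(round(euclidean(P, Q), 3))  # 7.616
print(city_block(P, Q))           # 10
print(chessboard(P, Q))           # 7
```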
9
Distance transform
Given a set of pixels S, calculate the distance of every other pixel to S.
The pixels in the set S are considered the reference pixels.
We scan the image using a pre-defined connectivity.
First pass: consider the green pixels (N1).
10
Distance transform
In the reverse scan, consider the blue pixels (N2).
[Figure: result after the first scan; final distance transform]
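The two scans can be sketched as follows for the chessboard metric. This is an illustration under stated assumptions: N1 is taken to be the already-visited neighbors in a forward raster scan (upper-left, upper, upper-right, left) and N2 the mirrored set for the reverse scan; the slides show these masks only as colored pixels, and the function name is hypothetical.

```python
INF = float("inf")


def chessboard_distance_transform(reference):
    """Two-pass chessboard distance of every pixel to the nearest reference pixel.

    `reference` is a list of rows; truthy entries are reference pixels.
    """
    rows, cols = len(reference), len(reference[0])
    dist = [[0 if v else INF for v in row] for row in reference]
    n1 = [(-1, -1), (-1, 0), (-1, 1), (0, -1)]  # assumed forward mask
    n2 = [(1, 1), (1, 0), (1, -1), (0, 1)]      # assumed reverse mask

    def relax(r, c, offsets):
        for dr, dc in offsets:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                dist[r][c] = min(dist[r][c], dist[nr][nc] + 1)

    # First pass: top-left to bottom-right.
    for r in range(rows):
        for c in range(cols):
            relax(r, c, n1)
    # Reverse pass: bottom-right to top-left.
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            relax(r, c, n2)
    return dist


seed = [[0] * 5 for _ in range(5)]
seed[2][2] = 1  # a single reference pixel in the center
dt = chessboard_distance_transform(seed)
print(dt[0])  # [2, 2, 2, 2, 2]
print(dt[2])  # [2, 1, 0, 1, 2]
```

With unit weights on all eight neighbors the two passes yield the exact chessboard distance; other metrics would need different weights.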
11
Distance transform – (cont’d)
[Figure: the Arabic letter Alef, printed and handwritten; binary representation and its distance transform under the chessboard metric, with the reference pixels marked]
12
Signed Distance transform
[Figure: the letter Alef, printed and handwritten; signed distance transform under the chessboard metric, with negative values inside the stroke]
13
Signed Distance transform – (cont’d)
The brighter the color, the larger the distance from the reference pixels.
[Figure: original document image; signed distance transform (SDT)]
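A signed distance transform can be sketched by combining two unsigned transforms: one measured toward the ink and one toward the background. The sign convention below (negative inside components, positive outside) follows the Energy Map slide later in the deck; treating every ink pixel as at least distance 1 from the background is an assumption of this sketch, and the function names are hypothetical.

```python
INF = float("inf")


def chessboard_dt(reference):
    """Two-pass chessboard distance to the nearest truthy pixel."""
    rows, cols = len(reference), len(reference[0])
    dist = [[0 if v else INF for v in row] for row in reference]
    passes = [
        (range(rows), range(cols), [(-1, -1), (-1, 0), (-1, 1), (0, -1)]),
        (range(rows - 1, -1, -1), range(cols - 1, -1, -1),
         [(1, 1), (1, 0), (1, -1), (0, 1)]),
    ]
    for row_order, col_order, offsets in passes:
        for r in row_order:
            for c in col_order:
                for dr, dc in offsets:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        dist[r][c] = min(dist[r][c], dist[nr][nc] + 1)
    return dist


def signed_distance_transform(ink):
    """Negative distance inside ink components, positive distance outside."""
    to_ink = chessboard_dt(ink)
    to_background = chessboard_dt([[not v for v in row] for row in ink])
    return [[-to_background[r][c] if ink[r][c] else to_ink[r][c]
             for c in range(len(ink[0]))]
            for r in range(len(ink))]


# A 3x3 block of ink centered in a 5x5 image.
block = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
         for r in range(5)]
sdt = signed_distance_transform(block)
print(sdt[2])  # [1, -1, -2, -1, 1]
```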
14
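Gradient
A gray-scale image I is defined as a two-dimensional function I(x,y) = gray level.
The gradient of the image is given by the formula: $\nabla I = \left( \frac{\partial I}{\partial x}, \frac{\partial I}{\partial y} \right)$
where $\frac{\partial I}{\partial x}$ is the derivative of the image in the horizontal direction and $\frac{\partial I}{\partial y}$ is the derivative of the image in the vertical direction.
The magnitude of the gradient is defined by: $|\nabla I| = \sqrt{\left( \frac{\partial I}{\partial x} \right)^2 + \left( \frac{\partial I}{\partial y} \right)^2}$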
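On a discrete image the partial derivatives have to be approximated. The sketch below uses central differences in the interior and one-sided differences at the borders; this discretization is an assumption, since the slides only give the continuous definition, and `gradient_magnitude` is an illustrative name.

```python
import math


def gradient_magnitude(image):
    """Per-pixel gradient magnitude using finite differences.

    `image[y][x]` is the gray level at column x, row y.
    """
    rows, cols = len(image), len(image[0])
    magnitude = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            x0, x1 = max(x - 1, 0), min(x + 1, cols - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, rows - 1)
            # Horizontal and vertical derivatives (0 if the image is one pixel wide/tall).
            gx = (image[y][x1] - image[y][x0]) / (x1 - x0) if x1 > x0 else 0.0
            gy = (image[y1][x] - image[y0][x]) / (y1 - y0) if y1 > y0 else 0.0
            magnitude[y][x] = math.hypot(gx, gy)
    return magnitude


# A linear ramp I(x, y) = 3x + 4y has gradient (3, 4) and magnitude 5 everywhere.
ramp = [[3 * x + 4 * y for x in range(4)] for y in range(4)]
mag = gradient_magnitude(ramp)
print(mag[1][2])  # 5.0
```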
15
Gradient
16
Background
Pre-Processing
De-noising
Binarization
Page Layout Analysis
Text-line and Word Segmentation
Indexation and Recognition
[Figure: original page and its segmentation]
*Courtesy: Islamic manuscript, Leipzig University Library, Germany
17
Text-line Extraction
Assigning the same color to each text line
[Figure: original manuscript and processed manuscript]
*Courtesy: Juma Al-majid Center for Culture and Heritage, Dubai.
19
Challenges
Historical handwritten documents pose different challenges than machine-printed documents:
Looser layout format
Line proximity
Multi-oriented lines
Touching components
Different slope (within the same line)
Delayed strokes
Overlapping components
[Figure: a 19th-century master's thesis, Saab Medical Library, American University of Beirut]
21
Seam Carving
Content-aware image resizing
An energy function defines an energy value for each pixel.
A seam is an optimal 8-connected path of low-energy pixels.
[Figure: original image; gradient image; calculated seams; resized image]
22
Seam Carving – (cont’d)
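Let I be an n x m image. Define a vertical seam to be:
$s^x = \{s_i^x\}_{i=1}^{n} = \{(x(i), i)\}_{i=1}^{n}$, such that $\forall i,\ |x(i) - x(i-1)| \le 1$,
where x is a mapping x: [1, ..., n] → [1, ..., m].
A seam contains one, and only one, pixel in each row of the image; otherwise a distorted image might be obtained.
The pixels of the path of a seam s will therefore be: $I_s = \{I(s_i)\}_{i=1}^{n} = \{I(x(i), i)\}_{i=1}^{n}$
One can replace the bound 1 in the constraint with a value k, $|x(i) - x(i-1)| \le k$, and get either a simple column for k = 0 or, for large k, even a completely disconnected set of pixels.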
23
Seam Carving – (cont’d)
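Given an energy function e, the cost of a seam is: $E(s) = E(I_s) = \sum_{i=1}^{n} e(I(s_i))$, where $s_i$ is the seam pixel in row i.
We look for the optimal seam s* that minimizes this cost: $s^* = \arg\min_s E(s) = \arg\min_s \sum_{i=1}^{n} e(I(s_i))$
The optimal seam can be found using dynamic programming.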
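The dynamic-programming step is not spelled out on the slide. A minimal sketch, assuming the usual cumulative-cost recurrence M(i,j) = e(i,j) + min(M(i-1,j-1), M(i-1,j), M(i-1,j+1)) followed by backtracking from the cheapest entry of the last row, could look like this (the function name is illustrative):

```python
def find_vertical_seam(energy):
    """Return (seam, cost): one column index per row, and the seam's total energy."""
    n, m = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # Forward pass: cheapest way to reach each pixel from the top row.
    for i in range(1, n):
        for j in range(m):
            lo, hi = max(j - 1, 0), min(j + 1, m - 1)
            cost[i][j] += min(cost[i - 1][lo:hi + 1])
    # Backtrack from the minimal entry of the last row.
    j = min(range(m), key=lambda c: cost[n - 1][c])
    total = cost[n - 1][j]
    seam = [j]
    for i in range(n - 1, 0, -1):
        lo, hi = max(j - 1, 0), min(j + 1, m - 1)
        j = min(range(lo, hi + 1), key=lambda c: cost[i - 1][c])
        seam.append(j)
    seam.reverse()
    return seam, total


energy = [[5, 1, 5],
          [5, 5, 1],
          [5, 1, 5]]
seam, total = find_vertical_seam(energy)
print(seam, total)  # [1, 2, 1] 3
```

Restricting each step to the three neighbors in the previous row is what enforces the constraint |x(i) - x(i-1)| ≤ 1.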
25
Text line representation by seams
Human perception of text lines: a reader tracks text lines by ink concentration and by the spaces between lines.
Two types of seams have been defined.
*Courtesy: Wadod Center for Manuscripts.
26
Text line representation by seams – (cont’d)
The medial seam crosses the text area of a text line.
A separating seam is a path that passes between two consecutive text lines.
[Figure: original document image; seam seed, medial seam and separating seam on the processed image]
*Courtesy: Wadod Center for Manuscripts.
28
Energy Map
We use the signed distance transform (SDT) as an energy map.
In the SDT, pixel values are assigned according to their distance from the nearest reference pixel.
Recall that distance values are negative inside connected components and positive in between.
Intuition: local minima and maxima determine the medial and separating seams, respectively.
[Figure: original document image; signed distance transform (SDT)]
*Courtesy: Wadod Center for Manuscripts
30
Seam Generation – (cont’d)
The SDT is traversed horizontally to compute a cumulative energy map, the Seam Map, for all possible connected seams for each entry (i,j).
The SDT is traversed in two passes, left-to-right and right-to-left, to enhance text line patterns.
The resulting two maps are bi-linearly interpolated.
[Figure: signed distance transform; left-to-right pass; right-to-left pass; interpolated map]
31
Seam Generation – (cont’d)
The minimal entry of the last column is detected.
Backtrack from the minimal entry to find the medial seam.
[Figure: original document image; seam map after one pass; seam map after two passes]
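The following sketch illustrates a single left-to-right pass and the backtracking step. It assumes the standard seam-carving recurrence M(i,j) = e(i,j) + min(M(i-1,j-1), M(i,j-1), M(i+1,j-1)); the exact formula used in the presentation and the interpolation of the two passes are not given in the text, so they are not reproduced here. Function names and the toy energy map are illustrative.

```python
def seam_map_left_to_right(energy):
    """Cumulative energy of the cheapest connected path reaching each entry from the left."""
    rows, cols = len(energy), len(energy[0])
    seam_map = [row[:] for row in energy]
    for j in range(1, cols):
        for i in range(rows):
            lo, hi = max(i - 1, 0), min(i + 1, rows - 1)
            seam_map[i][j] += min(seam_map[k][j - 1] for k in range(lo, hi + 1))
    return seam_map


def medial_seam(energy):
    """Row index per column of the minimal-cost horizontal seam."""
    rows, cols = len(energy), len(energy[0])
    seam_map = seam_map_left_to_right(energy)
    # Detect the minimal entry of the last column, then backtrack.
    i = min(range(rows), key=lambda r: seam_map[r][cols - 1])
    seam = [i]
    for j in range(cols - 1, 0, -1):
        lo, hi = max(i - 1, 0), min(i + 1, rows - 1)
        i = min(range(lo, hi + 1), key=lambda r: seam_map[r][j - 1])
        seam.append(i)
    seam.reverse()
    return seam


# Toy SDT-like map: negative values along the middle row stand in for a text line.
toy_sdt = [[2, 2, 2, 2],
           [1, 1, 1, 1],
           [-1, -1, -1, -1],
           [1, 1, 1, 1],
           [2, 2, 2, 2]]
print(medial_seam(toy_sdt))  # [2, 2, 2, 2]
```

Because SDT values are negative inside components, the minimal-cost path is drawn through the ink, which matches the intuition that local minima determine the medial seam.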
32
Seam Generation – (cont’d)
Iteratively, all text lines will be extracted
33
Seam Generation – (cont’d)
Then why are separating seams needed?
They avoid recalculating the energy and seam maps after each line extraction.
They avoid additional stroke classification (post-processing).
34
Seam Generation – (cont’d)
Separating seams define the boundaries of text lines.
They are generated with respect to the medial seam of the corresponding text line.
They are grown from seam seeds toward the two sides of the image, guided by the SDT.
35
Seam Generation – (cont’d)
A seam fragment is a connected group of pixels defined as the closest local maxima along the vertical direction.
Seam fragments with low priority are discarded.
A candidate set of seeds is constructed.
The seed that generates the optimal (maximal-cost) seam is chosen.
[Figure: medial seam; seam map; signed distance transform]
36
Seam Generation – (cont’d)
The separating seams may diverge from the medial seam where ridges fork.
A spring force anchored at the medial seam guides the separating seams.
[Figure: before and after]
37
Touching/Overlapping Components
Usually, crossing overlapping components is avoided gracefully.
Touching components are split too, but not necessarily at the optimal position.
[Figure: two processed examples]
39
Experimental Results
Dataset | Description | Lines | Overlapping Components | Language
Wadod | Wadod Center for Manuscripts | 1050 | 516 | Arabic and Spanish
Al-Majid | Al-Majid Center for Culture and Heritage, Dubai | 900 | 258 | Arabic
AUB | American University of Beirut | 420 | 485 | English
Congress Library | | 150 | 317 |
Total | | 2520 | 1576 |
40
Experimental Results – (cont’d)
Table 1: correctness of text line extraction (%); columns: Line, Lower, Upper, Medial
Wadod: 98, 97, 99
Al-Majid: 96
AUB: 95, 94
Congress library: 94.25, 93
Table 2: crossed components
Dataset | Overlapping Components | Stroke Crossing (%)
Wadod | 516 | 9
Al-Majid | 258 | 2
AUB | 485 |
Congress library | 317 | 10
41
Experimental Results – (cont’d)
43
Summary
Language-independent approach
Dynamic programming was used to find text lines.
Re-computing the energy map after each text line extraction is avoided.
Post-processing steps are avoided.
Crossing overlapping components was avoided in most cases.
More research is still needed to split touching components optimally.
44
Thank you