1
Document Analysis Systems
Bedola Roberto, Bordoni Davide, Franc Vojtech
Supervised by Luca Lombardi
Overview: Introduction; Our implementation of the Wahl et al. framework; Multiresolution approach for page segmentation; Conclusions
2
www.ip2001.5u.com Id ip2001.5u.com Pass 2001ip
3
Introduction
Document understanding: extraction of relevant information from documents (letters, forms, engineering drawings, etc.)
Diagram labels: Graphics Processing; Digitalized Image; Page Segmentation; Textual Processing (OCR)
4
Wahl & Our implementation
Diagram labels: Region merging; Segmentation; Connected Component Extraction; Block Characterization; Text extraction; Binarization; Morph. opening; Connected Component Extraction; Classification by SVM; Histograms
5
Example 1
Figure panels: After binarization; After segmentation; Original doc.
6
Example 2
Figure panels: Labelled image before merging; Labelled image after merging; Classified image
7
Multiresolution
Principles: Applying various filters makes it possible to find the different zones.
Techniques: Setting a few parameters makes it possible to discard some pixels.
8
Multiresolution
9
Mean & Variance Mean Variance
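The slide only names the two statistics, so the following is a minimal sketch of how a mean image and a variance image could be computed. It assumes non-overlapping square tiles and a grayscale image stored as a list of rows. The function name `block_stats` and the tile-based layout are illustrative assumptions, not details taken from the presentation.

```python
def block_stats(image, block):
    """Return (means, variances) over non-overlapping block x block tiles.

    Assumption: tiles do not overlap and partial tiles at the right and
    bottom edges are ignored. The slides do not say how windows are laid out.
    """
    rows, cols = len(image), len(image[0])
    means, variances = [], []
    for r in range(0, rows - block + 1, block):
        mean_row, var_row = [], []
        for c in range(0, cols - block + 1, block):
            pixels = [image[y][x]
                      for y in range(r, r + block)
                      for x in range(c, c + block)]
            m = sum(pixels) / len(pixels)
            # Population variance of the tile's gray levels.
            v = sum((p - m) ** 2 for p in pixels) / len(pixels)
            mean_row.append(m)
            var_row.append(v)
        means.append(mean_row)
        variances.append(var_row)
    return means, variances


# A 2x4 image: one uniform light tile and one high-contrast tile.
sample = [[250, 250, 0, 255],
          [250, 250, 0, 255]]
means, variances = block_stats(sample, 2)
```

A larger `block` gives a coarser view of the page and a smaller one gives a more detailed view, which is one plausible reading of "multiresolution" here.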
10
Background Condition
We can find the background when:
240 < Mean < 255 (225 < Mean < 255 for an image with noise)
0 < Variance < 15 (0 < Variance < 25 for an image with noise)
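The background condition above can be sketched as a small predicate. The threshold values come from the slide. The function name `is_background` and the `noisy` flag are illustrative assumptions.

```python
def is_background(mean, variance, noisy=False):
    """Apply the slide's background condition to one region's statistics.

    Bounds are strict, exactly as written on the slide. The `noisy` flag
    (hypothetical name) switches to the relaxed bounds for noisy images.
    """
    if noisy:
        return 225 < mean < 255 and 0 < variance < 25
    return 240 < mean < 255 and 0 < variance < 15


print(is_background(250, 5))               # light, nearly uniform region
print(is_background(230, 20))              # fails the clean-image bounds
print(is_background(230, 20, noisy=True))  # passes the relaxed bounds
```

Because the bounds are copied as strict inequalities, a perfectly uniform white region (mean 255, variance 0) falls outside them. Inclusive bounds may have been intended, but the slide does not say.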
11
Median & Threshold Median Threshold
12
Image and Graphics: 4 steps
Middletone Segmentation:
1. Median filter
2. Threshold filter
Pixel counting:
3. If threshold 170 counter++
Classification:
4. If counter/area 0.7 Graphic else Pictures
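The four steps above can be sketched as follows. The comparison operators did not survive in the slide text, so this sketch assumes that pixels brighter than 170 are counted and that a ratio above 0.7 means "Graphic". Both directions are assumptions, as are the 3x3 window size and the function names.

```python
from statistics import median


def median_filter3(image):
    """3x3 median filter; windows are clipped at the image border (assumption)."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [image[y][x]
                      for y in range(max(0, r - 1), min(rows, r + 2))
                      for x in range(max(0, c - 1), min(cols, c + 2))]
            out_row.append(median(window))
        out.append(out_row)
    return out


def classify_graphic_or_picture(region, level=170, ratio=0.7):
    """Steps 1-4: median filter, threshold, count, classify.

    ASSUMPTION: 'value > level' and 'counter / area > ratio' are guesses;
    the slide gives only the numbers 170 and 0.7, not the operators.
    """
    filtered = median_filter3(region)
    area = len(filtered) * len(filtered[0])
    counter = sum(1 for row in filtered for value in row if value > level)
    return "Graphic" if counter / area > ratio else "Picture"


# Mostly white region with one isolated dark pixel versus a flat mid-gray region.
drawing = [[255] * 5 for _ in range(5)]
drawing[2][2] = 0
photo = [[100] * 5 for _ in range(5)]
print(classify_graphic_or_picture(drawing), classify_graphic_or_picture(photo))
```

If the original comparisons ran the other way, only the two marked expressions need to be flipped.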
13
Text: 3 steps
Segmentation of mean image:
1. Mean filter
Pixel text counting:
2. If 240 mean 210 & variance 10 counter++
Classification:
3. If counter/area 0.7 Text else if counter/area 0.3 analyse a more detailed image else Unclassifiable
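A minimal sketch of the three text steps, taking per-window mean and variance values as input. The comparison operators are missing from the slide text, so the sketch assumes `210 < mean < 240`, `variance > 10`, and "greater than" for both ratio tests. These operators, the function name, and the return labels are assumptions.

```python
def classify_text(means, variances, hi_ratio=0.7, lo_ratio=0.3):
    """Count text-like windows and classify the region.

    ASSUMPTION: a window is text-like when 210 < mean < 240 and
    variance > 10. The slide lists only the numbers, not the operators.
    """
    area = len(means)
    counter = sum(1 for m, v in zip(means, variances)
                  if 210 < m < 240 and v > 10)
    share = counter / area
    if share > hi_ratio:
        return "Text"
    if share > lo_ratio:
        # The slide says to analyse a more detailed image in this case.
        return "Analyse more detailed image"
    return "Unclassifiable"


# Ten windows, eight of which look like text under the assumed test.
print(classify_text([220] * 8 + [250] * 2, [20] * 10))
```

The middle outcome is where the multiresolution idea shows up: an ambiguous region is re-examined at a finer resolution rather than being labelled straight away.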
14
Experimental Results
15
Conclusions
Wahl et al. method: simple and accurate, but slower. We have finished implementing this method.
Multiresolution: complex and faster, but needs many parameters. Only partially implemented.
16
Meat or fish
Goodbye