
1 Document Analysis Systems Bedola Roberto, Bordoni Davide, Franc Vojtech Supervised by Luca Lombardi Overview: Introduction Our implementation of the Wahl et al. framework Multiresolution Approach for page segmentation Conclusions

2 www.ip2001.5u.com Id ip2001.5u.com Pass 2001ip

3 Introduction Document understanding: extraction of relevant information from documents (letters, forms, engineering drawings, etc.) Graphics Processing Digitized Image Page Segmentation Textual Processing (OCR)

4 Wahl & Our implementation Region merging Segmentation Connected Component Extraction Block Characterization Text extraction Binarization Morph. opening Connected Component Extraction Classification by SVM Histograms
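The connected component extraction step named on this slide can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `label_components`, the use of 4-connectivity, and the convention that 1 marks a foreground pixel are all assumptions made for the example.

```python
from collections import deque


def label_components(binary):
    """Label 4-connected foreground regions (value 1) of a binary image.

    Returns (labels, boxes): labels is a grid of ints (0 = background),
    boxes maps each label to its bounding box (top, left, bottom, right).
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    boxes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue  # background, or already part of a component
            next_label += 1
            labels[r][c] = next_label
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            boxes[next_label] = (top, left, bottom, right)
    return labels, boxes
```

The bounding boxes are the kind of per-block data that a later characterization or classification stage could consume; how the original system represented blocks is not stated in the slides.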

5 Example 1: After binarization, After segmentation, Original doc.

6 Example 2: Labelled image before merging, Labelled image after merging, Classified image

7 Multiresolution Principles: Using various filters makes it possible to find the different zones. Techniques: Using some parameters makes it possible to cut out some pixels.

8 Multiresolution

9 Mean & Variance Mean Variance

10 Background Condition We can find the background: 240 < Mean < 255 (225 < Mean < 255 for an image with noise) and 0 < Variance < 15 (0 < Variance < 25 for an image with noise)
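The background condition can be sketched in a few lines of Python. The thresholds are the ones on the slide; the function names, the `noisy` flag, the choice to compute the statistics over a whole block of grey levels (0..255), and the requirement that both the mean and the variance condition hold together are assumptions made for this illustration.

```python
def block_mean_variance(block):
    """Return (mean, variance) of the grey levels in a 2-D block."""
    pixels = [p for row in block for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n  # population variance
    return mean, variance


def is_background(block, noisy=False):
    """Apply the slide's background test; looser bounds for noisy images."""
    mean_low = 225 if noisy else 240   # lower bound on the mean
    var_high = 25 if noisy else 15     # upper bound on the variance
    mean, variance = block_mean_variance(block)
    return mean_low < mean < 255 and 0 < variance < var_high
```

Taken literally, the strict inequalities reject a perfectly uniform white block (mean 255, variance 0). Whether the original implementation treated the bounds as inclusive is not stated.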

11 Median & Threshold Median Threshold

12 Image and Graphics 4 steps Middletone Segmentation: 1. Median filter 2. Threshold filter Pixel counting: 3. If threshold  170 → counter++ Classification: 4. If counter/area  0.7 → Graphic else → Pictures
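A rough sketch of these four steps follows. The comparison symbols did not survive in the transcript, so the sketch assumes that a pixel is counted when its median-filtered value is at least 170 and that a block is a graphic when the counted share is at least 0.7. Both directions are guesses, as are the 3x3 window, the border handling, and all function names.

```python
from statistics import median


def median_filter(image, radius=1):
    """Median-filter a 2-D grey image; windows are clipped at the borders."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[y][x]
                      for y in range(max(0, r - radius), min(rows, r + radius + 1))
                      for x in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = median(window)
    return out


def classify_graphic_or_picture(block, level=170, ratio=0.7):
    """Count bright pixels after median filtering and compare with the area.

    ASSUMPTION: '>=' is used for both comparisons; the slide's operators
    are missing from the transcript.
    """
    filtered = median_filter(block)
    area = len(block) * len(block[0])
    counter = sum(1 for row in filtered for p in row if p >= level)
    return "graphic" if counter / area >= ratio else "picture"
```

Under these assumptions a white block crossed by a one-pixel line is labelled a graphic, because the median filter removes the thin line before counting, while a uniformly mid-grey block is labelled a picture.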

13 Text 3 steps Segmentation of mean image: 1. Mean filter Pixel text counting: 2. If 240 ≥ mean ≥ 210 & variance  10 → counter++ Classification: 3. If counter/area ≥ 0.7 → Text else if counter/area ≥ 0.3 → analyse a more detailed image else → Unclassifiable
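The counting and three-way decision can be sketched as below. The sketch takes per-pixel mean and variance images as inputs and is only an illustration: the comparison symbols are missing from the transcript, so the inclusive bounds on the mean and on the ratios are assumptions, and the variance test is left as a caller-supplied function whose default (`v >= 10`) is purely a guess. The names `classify_text_block`, `variance_test`, and the return label `"refine"` are hypothetical.

```python
def classify_text_block(mean_img, var_img, variance_test=lambda v: v >= 10):
    """Classify a block from its per-pixel mean and variance images.

    ASSUMPTIONS: inclusive comparisons throughout; the direction of the
    variance test is unknown, so it is supplied as a parameter.
    """
    area = len(mean_img) * len(mean_img[0])
    counter = 0
    for mean_row, var_row in zip(mean_img, var_img):
        for m, v in zip(mean_row, var_row):
            if 210 <= m <= 240 and variance_test(v):
                counter += 1  # pixel looks like text
    share = counter / area
    if share >= 0.7:
        return "text"
    if share >= 0.3:
        return "refine"  # analyse a more detailed image
    return "unclassifiable"
```

For example, `classify_text_block(mean_img, var_img, variance_test=lambda v: v <= 10)` would try the opposite reading of the variance condition without touching the rest of the logic.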

14 Experimental Results

15 Conclusions Wahl et al. method: simple and accurate, but slower. We finished implementing this method. Multiresolution: complex and faster, but needs many parameters. Only partially implemented.

16 Meat or fish Goodbye

