
Bijay Dahal {2008/BCT/509} Kabindra Shrestha {2008/BCT/516} Raj Kumar Shrestha {2008/BCT/527}


2
- To convert alphanumeric characters from an image into plain text.
- To get a general idea of image processing.

3
S.N.  Tools                      Description
1     JDK 6                      Development kit for Java programming
2     NetBeans 7.0               IDE for Java application development
3     Microsoft Windows & Linux  OS platforms for the application
4     Tortoise SVN               Version control software for project management
5     Sourceforge                Project management and configuration
6     Microsoft Office           Documentation

4
- Takes an image as input.
- Converts it into plain text.
- Recognizes alphanumeric characters only.
- Recognized text can be edited and saved.
[Figure: loaded image and the converted, editable text]

5
Get Image -> Binarization -> Thinning (Bold -> Thin) -> Line Segment -> Character Segment -> Feature Extraction -> Matrix Matching -> Save Text

6
- Otsu Binarization Algorithm
- Hilditch Skeletonization Algorithm (Thinning)
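As an illustration of the binarization step, here is a minimal Java sketch of Otsu's method: it picks the gray level that maximizes the between-class variance of the histogram. The class and method names (`OtsuSketch`, `threshold`, `binarize`) are hypothetical, and the input is assumed to be a flat array of 8-bit gray values where dark pixels are ink. It is not the project's actual code.

```java
/** Sketch of Otsu's global threshold on 8-bit gray values (names are illustrative). */
final class OtsuSketch {

    /** Returns the threshold t in [0, 255] that maximizes between-class variance. */
    static int threshold(int[] gray) {
        int[] hist = new int[256];
        for (int g : gray) {
            hist[g & 0xFF]++;
        }
        long total = gray.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (long) i * hist[i];
        }

        long weightLow = 0;   // number of pixels with value <= t
        long sumLow = 0;      // sum of gray values of those pixels
        double bestVar = -1.0;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            weightLow += hist[t];
            if (weightLow == 0) continue;          // no pixels in the low class yet
            long weightHigh = total - weightLow;
            if (weightHigh == 0) break;            // all pixels already in the low class
            sumLow += (long) t * hist[t];
            double meanLow = (double) sumLow / weightLow;
            double meanHigh = (double) (sumAll - sumLow) / weightHigh;
            double diff = meanLow - meanHigh;
            double between = (double) weightLow * weightHigh * diff * diff;
            if (between > bestVar) {
                bestVar = between;
                best = t;
            }
        }
        return best;
    }

    /** Assumption: dark pixels are ink, so true means value <= threshold. */
    static boolean[] binarize(int[] gray) {
        int t = threshold(gray);
        boolean[] ink = new boolean[gray.length];
        for (int i = 0; i < gray.length; i++) {
            ink[i] = (gray[i] & 0xFF) <= t;
        }
        return ink;
    }
}
```

Calling `OtsuSketch.binarize(pixels)` on a scanned page would give the black-and-white mask that the thinning step then works on.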

7
- Generic Segmentation
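The slide does not say how segmentation is done. One common approach, assumed here purely for illustration, is projection-profile segmentation: rows that contain ink form text lines, and within a line, columns that contain ink form characters. The class `ProjectionSegmenter` and its methods are hypothetical names, and the input is assumed to be a rectangular binarized image where `true` marks ink.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of projection-profile segmentation on a binary image (true = ink). */
final class ProjectionSegmenter {

    /** Row ranges [start, end) that contain at least one ink pixel. */
    static List<int[]> lines(boolean[][] ink) {
        boolean[] occupied = new boolean[ink.length];
        for (int y = 0; y < ink.length; y++) {
            for (boolean p : ink[y]) {
                if (p) {
                    occupied[y] = true;
                    break;
                }
            }
        }
        return runs(occupied);
    }

    /** Column ranges [start, end) that contain ink within rows [top, bottom). */
    static List<int[]> characters(boolean[][] ink, int top, int bottom) {
        int cols = ink.length == 0 ? 0 : ink[0].length;
        boolean[] occupied = new boolean[cols];
        for (int y = top; y < bottom; y++) {
            for (int x = 0; x < cols; x++) {
                if (ink[y][x]) occupied[x] = true;
            }
        }
        return runs(occupied);
    }

    /** Collects maximal runs of consecutive true entries as [start, end) pairs. */
    private static List<int[]> runs(boolean[] occupied) {
        List<int[]> out = new ArrayList<int[]>();
        int start = -1;
        for (int i = 0; i <= occupied.length; i++) {
            boolean on = i < occupied.length && occupied[i];
            if (on && start < 0) {
                start = i;
            } else if (!on && start >= 0) {
                out.add(new int[] {start, i});
                start = -1;
            }
        }
        return out;
    }
}
```

This simple scheme splits only on fully blank rows and columns, so touching characters or inclined text would need something more elaborate.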

8
Feature Extraction (zoning)
- Based on zones: 5 horizontal and 5 vertical zones => 25 features
- Based on upper and lower profiles: 10 vertical zones => 20 features
- Based on left and right profiles: 10 horizontal zones => 20 features
- Total number of features: 25 + 20 + 20 = 65
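The feature counts above can be sketched in Java as follows. The slide gives only the zone counts, so the feature definitions here are assumptions: each of the 5 x 5 zones contributes its ink density, and each profile feature is the normalized distance from one edge of the glyph box to the nearest ink pixel in that strip (1.0 when the strip is empty). `ZoneFeatureSketch` is a hypothetical name, and the glyph is assumed to be a non-empty rectangular binary image, ideally at least 10 x 10 pixels.

```java
/** Sketch of a 65-element zoning feature vector for one glyph (true = ink). */
final class ZoneFeatureSketch {

    static double[] extract(boolean[][] glyph) {
        int h = glyph.length;
        int w = glyph[0].length;
        double[] f = new double[65];
        int k = 0;

        // 5 x 5 zones: fraction of ink pixels in each zone (25 features).
        for (int zy = 0; zy < 5; zy++) {
            for (int zx = 0; zx < 5; zx++) {
                int y0 = zy * h / 5, y1 = (zy + 1) * h / 5;
                int x0 = zx * w / 5, x1 = (zx + 1) * w / 5;
                int ink = 0;
                for (int y = y0; y < y1; y++) {
                    for (int x = x0; x < x1; x++) {
                        if (glyph[y][x]) ink++;
                    }
                }
                int area = (y1 - y0) * (x1 - x0);
                f[k++] = area == 0 ? 0.0 : (double) ink / area;
            }
        }

        // 10 vertical strips: upper and lower profile distances (20 features).
        for (int s = 0; s < 10; s++) {
            int upper = h, lower = h;
            for (int x = s * w / 10; x < (s + 1) * w / 10; x++) {
                for (int y = 0; y < h; y++) {
                    if (glyph[y][x]) {
                        upper = Math.min(upper, y);
                        lower = Math.min(lower, h - 1 - y);
                    }
                }
            }
            f[k++] = (double) upper / h;
            f[k++] = (double) lower / h;
        }

        // 10 horizontal strips: left and right profile distances (20 features).
        for (int s = 0; s < 10; s++) {
            int left = w, right = w;
            for (int y = s * h / 10; y < (s + 1) * h / 10; y++) {
                for (int x = 0; x < w; x++) {
                    if (glyph[y][x]) {
                        left = Math.min(left, x);
                        right = Math.min(right, w - 1 - x);
                    }
                }
            }
            f[k++] = (double) left / w;
            f[k++] = (double) right / w;
        }
        return f;
    }
}
```

All 65 values fall in the range 0 to 1, so glyphs of different pixel sizes produce comparable vectors.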

9
Off days:
- Exam time: 25 days
- Dashain holidays: 15 days
- Tihar holidays: 3 days

10
- Choosing the correct algorithm.
- Algorithms were hard to implement.
- Some algorithms were implemented, but their output was not accurate.
- Accuracy of matrix matching.

11
- Text from an image gets converted to a text file.
- Simplest algorithm; accuracy is about 40%-60%.

12
- Can't recognize text in a noisy image.
- Can't detect inclined text in an image.
- Matrix matching is slow.
- Bad thinning and noise make some text unrecognizable.
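One likely reason matrix matching is slow is that every glyph has to be compared against every stored template. The sketch below assumes, for illustration only, that matching means picking the template whose feature vector has the smallest squared Euclidean distance to the glyph's features. `TemplateMatcherSketch` and its parameters are hypothetical, and the presentation does not specify the distance measure the project used.

```java
/** Sketch of nearest-template matching over feature vectors (names are illustrative). */
final class TemplateMatcherSketch {

    /**
     * Returns the label of the template closest to the given features.
     * Cost is proportional to (number of templates) x (vector length) per glyph.
     */
    static char classify(double[] features, char[] labels, double[][] templates) {
        if (templates.length == 0 || labels.length != templates.length) {
            throw new IllegalArgumentException("need one label per template");
        }
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < templates.length; i++) {
            double dist = 0.0;
            for (int j = 0; j < features.length; j++) {
                double d = features[j] - templates[i][j];
                dist += d * d;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return labels[best];
    }
}
```

With several templates per character and fonts, this linear scan grows quickly, which is one motivation for a trained classifier.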

13
- Scanner image input.
- Recognize PDF and other image formats.
- Nepali / Devanagari font support.
- Different fonts.
- Output in PDF or Word file format.
- Skew correction and noise reduction.
- Handwriting.
- Neural network.

14
- Bates, K. S. (2010). Head First Java. O'Reilly.
- Improving Optical Character Recognition: http://www.csc.villanova.edu/~mdamian/csc3990/csrs2008/07-csrs2008-AJPalkovic.PDF
- Evaluation of OCR Algorithms for Images: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.89.9539&rep=rep1&type=PDF
- Otsu Thresholding - The Lab Book Pages: http://www.labbookpages.co.uk/software/imgProc/otsuThreshold.html
- Image Segmentation: http://people.cs.uchicago.edu/~pff/segment/
- Hilditch Algorithm: http://cis.k.hosei.ac.jp/~wakahara/Hilditch.c
- Skeletonization: http://cgm.cs.mcgill.ca/~godfried/teaching/projects97/azar/skeleton.html
- Java OCR | Ron Cemer's Blog: http://www.roncemer.com/software-development/java-ocr
