Development of an OCR System Nathan Harmata TJHSST Computer Systems Lab 2007-2008.


1 Development of an OCR System Nathan Harmata TJHSST Computer Systems Lab 2007-2008

2 What is OCR?
Optical Character Recognition
Font and handwriting based

3 Goals of My Project
- Generic recognition for Latin-based fonts
- Proper handling of most formatting
- System built from scratch

4 Overview of Idocrase System

5 Image Processing

6 Transformations Attribute Character Model

7 Transformations
Sector Vector - the image is parsed into parts that pass the vertical line test; each part is then transformed into a collection of line segments
Gap Vector - gaps, if any, are found on the four sides of the image
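The vertical line test mentioned for the Sector Vector can be sketched as follows. This is only an illustration, not the Idocrase code: it assumes a part is a grid of 0/1 pixels (1 = ink) and reads "passes the vertical line test" as "every column crosses the ink at most once". The names Image, runs_in_column and passes_vertical_line_test are hypothetical.

```python
from typing import List

# Assumed representation: rows of 0/1 pixels, where 1 means ink.
Image = List[List[int]]


def runs_in_column(image: Image, x: int) -> int:
    """Count the contiguous runs of ink pixels in column x."""
    runs = 0
    previous = 0
    for row in image:
        if row[x] == 1 and previous == 0:
            runs += 1  # a new run starts where ink follows background
        previous = row[x]
    return runs


def passes_vertical_line_test(image: Image) -> bool:
    """True if no column crosses the ink more than once (assumed reading)."""
    if not image:
        return True
    width = len(image[0])
    return all(runs_in_column(image, x) <= 1 for x in range(width))


if __name__ == "__main__":
    bar = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
    ring = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
    print(passes_vertical_line_test(bar))
    print(passes_vertical_line_test(ring))
```

Under this reading a horizontal stroke passes, while a closed ring fails because its middle column meets the ink twice, so a ring would have to be split into several parts first.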

8 Transformations Pixel Concentration Vector – which sides, if any, have a higher concentration of pixels
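One possible way to compute such a vector is sketched below. It is an assumption-laden illustration, not the method used in the project: the image is taken to be a grid of 0/1 pixels, the vector is assumed to have a horizontal and a vertical component, and the tolerance parameter (the share of all ink pixels by which two halves may differ and still count as balanced) is invented for the example.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed: rows of 0/1 pixels, 1 = ink


def pixel_concentration_vector(image: Image, tolerance: float = 0.1) -> Tuple[str, str]:
    """Return (horizontal, vertical) labels describing where ink concentrates."""
    height = len(image)
    width = len(image[0]) if height else 0
    total = sum(sum(row) for row in image)
    if total == 0:
        return ("balanced", "balanced")

    # For odd sizes the middle column or row is left out of both halves.
    left = sum(sum(row[: width // 2]) for row in image)
    right = sum(sum(row[(width + 1) // 2:]) for row in image)
    top = sum(sum(row) for row in image[: height // 2])
    bottom = sum(sum(row) for row in image[(height + 1) // 2:])

    def compare(a: int, b: int, a_name: str, b_name: str) -> str:
        # Differences within the tolerance are treated as balanced.
        if abs(a - b) <= tolerance * total:
            return "balanced"
        return a_name if a > b else b_name

    return (compare(left, right, "left", "right"),
            compare(top, bottom, "top", "bottom"))


if __name__ == "__main__":
    print(pixel_concentration_vector([[1, 1, 0, 0], [1, 1, 0, 0]]))
```

A symmetric glyph yields two "balanced" labels under this scheme; the choice of halves rather than, say, quadrants is arbitrary here.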

9 Character Recognition
GCDD - Generic Character Definition Database
Averages of Character Models for every character from many different fonts
[Figure: example GCDD entry labeled "0" with fields PixelConcentrationVector (balanced, balanced), SectorVector, and GapVector; the values 4 and 3 also appear in the figure]
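The idea of averaging Character Models across fonts can be sketched roughly as below. This is not the project's data structure: it assumes each Character Model has already been reduced to a fixed-length list of numbers, that every character has at least one sample, and it uses Euclidean distance for lookup, which the slides do not specify. The names build_gcdd and closest_character are hypothetical.

```python
import math
from typing import Dict, List


def build_gcdd(samples: Dict[str, List[List[float]]]) -> Dict[str, List[float]]:
    """Average the per-font models of each character, component by component."""
    gcdd: Dict[str, List[float]] = {}
    for char, models in samples.items():
        count = len(models)
        # zip(*models) groups the i-th component of every font's model.
        gcdd[char] = [sum(values) / count for values in zip(*models)]
    return gcdd


def closest_character(model: List[float], gcdd: Dict[str, List[float]]) -> str:
    """Pick the character whose averaged model is nearest (assumed metric)."""
    return min(gcdd, key=lambda char: math.dist(model, gcdd[char]))


if __name__ == "__main__":
    # Made-up two-component models from two imaginary fonts.
    samples = {
        "0": [[4.0, 2.0], [2.0, 4.0]],
        "1": [[0.0, 1.0], [0.0, 3.0]],
    }
    gcdd = build_gcdd(samples)
    print(gcdd)
    print(closest_character([2.5, 3.0], gcdd))
```

The sample numbers are invented for illustration and are not taken from the presentation.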

10 Character Recognition
For a single character: [diagram not preserved in the transcript]
For words: dictionary and grammar references are used.
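The dictionary reference for words might look something like the sketch below. It is a stand-in, not the approach taken in Idocrase: it uses Python's standard difflib.get_close_matches to snap a raw recognized string to the most similar dictionary word, and it ignores the grammar step entirely. The function name correct_word and the cutoff value are assumptions.

```python
import difflib
from typing import Iterable


def correct_word(raw: str, dictionary: Iterable[str], cutoff: float = 0.6) -> str:
    """Return the most similar dictionary word, or the raw text if none is close."""
    matches = difflib.get_close_matches(
        raw.lower(), list(dictionary), n=1, cutoff=cutoff
    )
    return matches[0] if matches else raw


if __name__ == "__main__":
    words = ["hello", "world", "help"]
    # "1" misread in place of "l" is a typical single-character error.
    print(correct_word("he1lo", words))
```

Falling back to the raw string keeps unknown words (names, abbreviations) from being forced onto an unrelated dictionary entry.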

11 Idocrase Application

12 Results
- Mediocre word recognition
- Doesn't handle formatting well
- Doesn't handle small letters well
- Fairly accurate single character recognition (93.7%)

