Development of an OCR System


1 Development of an OCR System
Second Quarter Nathan Harmata Period 5

2 Recap of 1st Quarter
- Cache system based on quadrant counts
- Font dependent, since it is based on a cache
- Written completely from scratch
- Framework for the rest of the year is basically done

3 Goals of 2nd Quarter
- Generic letter recognition
- Transformation of letters
- Same letter of different font should have similar form
- Unique forms

4 SlopeField
- Idea I proposed at the end of 1st Quarter
- Transformation of a letter into a collection of line segments of different slopes

5 SlopeField
Steps:
- get rid of non-black pixels
- average horizontal clumps of pixels
- starting with the lower left pixel, form a line segment with its adjacent pixel
- continue adding more pixels to the line segment if the slope doesn't change too much
- stop when a different slope is encountered
- repeat with the offending pixel
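The steps above can be sketched in Python as follows. This is only an illustration, not the code from the project: the image encoding (1 = black), the names `thin_rows`, `direction`, and `slope_field`, the left-to-right point ordering, and the angle `tolerance` are all assumptions made for the example.

```python
from math import atan2


def thin_rows(image):
    """Keep only black pixels (1) and replace each horizontal run by its mean x."""
    points = []
    height = len(image)
    for row_index, row in enumerate(image):
        y = height - 1 - row_index  # y grows upward, so "lower left" means small x, small y
        x = 0
        while x < len(row):
            if row[x] == 1:
                start = x
                while x < len(row) and row[x] == 1:
                    x += 1
                points.append(((start + x - 1) / 2, y))  # centre of the clump
            else:
                x += 1
    return points


def direction(p, q):
    """Angle of the step from p to q, in radians."""
    return atan2(q[1] - p[1], q[0] - p[0])


def slope_field(points, tolerance=0.3):
    """Greedily group points into segments whose direction stays roughly constant."""
    ordered = sorted(points)  # assumed ordering: left to right, then bottom to top
    segments = []
    i = 0
    while i < len(ordered) - 1:
        start, end = ordered[i], ordered[i + 1]
        angle = direction(start, end)
        j = i + 2
        # Extend the segment while the next step keeps a similar direction.
        while j < len(ordered) and abs(direction(end, ordered[j]) - angle) <= tolerance:
            end = ordered[j]
            j += 1
        segments.append((start, end))
        i = j  # restart from the offending pixel; a lone trailing point is dropped
    return segments
```

For a small "V" shape, `slope_field(thin_rows([[1,0,0,0,1],[0,1,0,1,0],[0,0,1,0,0]]))` gives one falling and one rising segment.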

6 SlopeField

7 SectorParsing
- Deals with the major flaw with SlopeField
- Parses the image into portions that pass the vertical line test
- Each portion is then transformed into a SlopeField
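One possible way to split a set of points into portions that pass the vertical line test is sketched below. The slides do not say how the parsing is actually done, so the layering rule used here (the k-th point from the bottom of each column goes into sector k) and the names `passes_vertical_line_test` and `parse_sectors` are assumptions for illustration.

```python
from collections import defaultdict


def passes_vertical_line_test(points):
    """True if no two points share the same x coordinate."""
    xs = [x for x, _ in points]
    return len(xs) == len(set(xs))


def parse_sectors(points):
    """Split points into sectors, each with at most one point per x."""
    columns = defaultdict(list)
    for x, y in points:
        columns[x].append(y)

    sectors = []
    for x in sorted(columns):
        # Assumed rule: the k-th point from the bottom of a column joins sector k.
        for layer, y in enumerate(sorted(columns[x])):
            if layer == len(sectors):
                sectors.append([])
            sectors[layer].append((x, y))
    return sectors
```

With this rule, a ring-like outline such as `[(0, 1), (1, 0), (1, 2), (2, 1)]` fails the test as a whole but splits into two sectors that each pass it.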

8 New Caching System
- 5 very different fonts
- SectorParsing and SlopeField done to each letter of each font
- Output was analyzed
- Goal is to use these results to create a new way to compare letters

9 Cache Program and Results - SlopeField

10 Cache Program and Results - SectorParsing

11

12 SectorVector
From the results, the following were deemed important:
- number of sectors
- approximate number of segments
- sign of the slope of the first segment
Using the data from testing, a "SectorVector" for each letter was formed
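A SectorVector built from the three features listed above might look like the sketch below. The tuple layout, the use of the raw segment count as the "approximate" count, the choice of the first non-empty sector for the slope sign, and the names `slope_sign` and `sector_vector` are assumptions, not details taken from the slides.

```python
def slope_sign(segment):
    """Sign of the slope of a segment: 1 rising, -1 falling, 0 flat or vertical."""
    (x0, y0), (x1, y1) = segment
    product = (y1 - y0) * (x1 - x0)  # same sign as dy/dx whenever dx != 0
    return (product > 0) - (product < 0)


def sector_vector(sectors):
    """Summarise a letter as (number of sectors, number of segments, first slope sign).

    `sectors` is a list of sectors, each a list of ((x0, y0), (x1, y1)) segments.
    """
    segment_count = sum(len(sector) for sector in sectors)
    first = next((sector[0] for sector in sectors if sector), None)
    sign = slope_sign(first) if first is not None else 0
    return (len(sectors), segment_count, sign)
```

For example, a single sector holding a falling segment followed by a rising one yields `(1, 2, -1)`.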

13 Results from SectorVector Analysis

14 OCRManager
- Parses text just as it was done first quarter; it uses the same method
- Individual letters are parsed using SectorParsing into SlopeFields into a SectorVector
- This SectorVector is compared to the cache by computing the scaled distance between them
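The comparison against the cache could be sketched as below. The slides do not define "scaled distance", so this example assumes a Euclidean distance in which each component is divided by a per-feature scale; the cache contents, the scale values, and the names `scaled_distance` and `best_matches` are made up for illustration.

```python
from math import sqrt


def scaled_distance(a, b, scales):
    """Euclidean distance after dividing each component difference by its scale."""
    return sqrt(sum(((p - q) / s) ** 2 for p, q, s in zip(a, b, scales)))


def best_matches(vector, cache, scales):
    """Return every cached letter whose SectorVector is closest to `vector`."""
    distances = {
        letter: scaled_distance(vector, cached, scales)
        for letter, cached in cache.items()
    }
    best = min(distances.values())
    return sorted(letter for letter, d in distances.items() if d == best)


# Hypothetical cache: letter -> (sectors, segments, sign of first slope)
cache = {"V": (1, 2, -1), "L": (1, 2, -1), "O": (2, 4, 1)}
scales = (1, 2, 1)
matches = best_matches((1, 2, -1), cache, scales)
```

Here `matches` contains both "L" and "V", because their hypothetical vectors are identical; several letters can tie for the closest match when their vectors coincide.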

15 Goals for 3rd Quarter
- Make the matching letters for SectorVectors fewer in number and more spread out
- Develop other heuristics

