PROJECT PROPOSAL
DIGITAL IMAGE PROCESSING

TITLE: Automatic Machine Written Document Reader

Project Partners: Manohar Kuse (Y08UC073), Sunil Prasad Jaiswal (Y08UC124)

AIM

To build a system that can read aloud the text in a book by recognizing characters in images of its pages, captured with a simple overhead web-cam.

Our Approach

First we will acquire an image of a page of the book with an overhead web-cam. This is our input image. We will then convert the text in the image to ASCII text by separating the individual characters in the image and identifying each of them with a machine learning approach. The resulting ASCII text will be sent to a program that converts it to sound.

SEGMENTATION OF CHARACTERS:

We plan to achieve this by first thresholding the acquired image to obtain a binary mask of the individual characters. Then, by means of CCA (connected component analysis), we will label the individual characters and normalize the size of each connected component.

CHARACTER-FEATURES:

One suggested method is to use statistical zonal features: each character is divided into a few zones, and statistical features such as the mean, variance, area and perimeter of each zone serve as the features. These features are later learned by a classifier such as a neural network, a Bayes classifier or a Support Vector Machine (SVM).
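
As a rough illustration of the segmentation step described above (thresholding, connected component labelling, size normalization), a minimal sketch in plain Python might look like the following. The function names, the threshold of 128, the 8-connectivity, the nearest-neighbour resampling and the assumption of dark text on a light page are all illustrative choices, not part of the proposal.

```python
from collections import deque

def threshold(gray, t=128):
    """Binary mask: 1 for ink (pixel darker than t), 0 for background.
    Assumes dark text on a light page; t=128 is an arbitrary choice."""
    return [[1 if px < t else 0 for px in row] for row in gray]

def connected_components(mask):
    """Bounding boxes (top, left, bottom, right) of 8-connected ink blobs,
    in raster-scan order of their first pixel."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def normalize(mask, box, size=16):
    """Crop the bounding box and resample it to size x size (nearest neighbour).
    The crop may include pixels of other components that overlap the box."""
    top, left, bottom, right = box
    bh, bw = bottom - top + 1, right - left + 1
    return [[mask[top + (i * bh) // size][left + (j * bw) // size]
             for j in range(size)] for i in range(size)]

# Tiny synthetic "page": a 2x2 blob and a 3x1 vertical stroke.
gray = [
    [255, 255, 255, 255, 255, 255, 255],
    [255,   0,   0, 255, 255,   0, 255],
    [255,   0,   0, 255, 255,   0, 255],
    [255, 255, 255, 255, 255,   0, 255],
]
mask = threshold(gray)
boxes = connected_components(mask)
glyphs = [normalize(mask, b, size=4) for b in boxes]
```

Note that a character made of several disconnected parts (the dot and stem of "i", for example) comes out as several components in this sketch and would need to be merged in a later step.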
We plan to divide each normalized character into 16 zones. For each zone we will evaluate the mean, variance, area and perimeter. We also plan to use global features such as compactness, area, perimeter and the fraction of filled area.

Block diagram of our approach:

Image acquisition -> Segmentation of characters and size normalization -> Character feature extraction -> Identification -> Sound-producing device
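
The zonal and global features described above could be computed along the following lines. This is only a sketch under stated assumptions: the glyph is a square binary image whose side is divisible by 4, the 16 zones form a 4 x 4 grid, perimeter is counted as the number of ink pixels with a background (or off-image) 4-neighbour, and compactness is taken as perimeter^2 / (4 * pi * area). None of these definitions is fixed by the proposal, and the function names are hypothetical.

```python
import math

def is_boundary(glyph, r, c):
    """True if any 4-neighbour of (r, c) is background or outside the glyph."""
    n = len(glyph)
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        y, x = r + dy, c + dx
        if not (0 <= y < n and 0 <= x < n) or glyph[y][x] == 0:
            return True
    return False

def zonal_features(glyph, zones_per_side=4):
    """Return [mean, variance, area, perimeter] for each zone, row by row.
    Assumes a square binary glyph whose side is divisible by zones_per_side."""
    n = len(glyph)
    z = n // zones_per_side  # side length of one zone in pixels
    feats = []
    for zr in range(zones_per_side):
        for zc in range(zones_per_side):
            cells = [(r, c)
                     for r in range(zr * z, (zr + 1) * z)
                     for c in range(zc * z, (zc + 1) * z)]
            area = sum(glyph[r][c] for r, c in cells)
            mean = area / len(cells)
            variance = mean * (1 - mean)  # variance of 0/1 pixel values
            perimeter = sum(1 for r, c in cells
                            if glyph[r][c] and is_boundary(glyph, r, c))
            feats.extend([mean, variance, area, perimeter])
    return feats

def global_features(glyph):
    """Return [compactness, area, perimeter, fraction of filled area]."""
    n = len(glyph)
    area = sum(map(sum, glyph))
    perimeter = sum(1 for r in range(n) for c in range(n)
                    if glyph[r][c] and is_boundary(glyph, r, c))
    # Assumed definition of compactness; other conventions exist.
    compactness = perimeter ** 2 / (4 * math.pi * area) if area else 0.0
    fill = area / (n * n)
    return [compactness, area, perimeter, fill]

# Example: a 16x16 glyph with a filled 8x8 square in the top-left corner.
glyph = [[1 if r < 8 and c < 8 else 0 for c in range(16)] for r in range(16)]
zonal = zonal_features(glyph)     # 16 zones x 4 features = 64 values
overall = global_features(glyph)  # 4 values
feature_vector = zonal + overall  # input to the classifier
```

On a binary glyph the zone mean and zone area carry the same information (area = mean x zone size), so one of them may be dropped, or the mean and variance could instead be taken from the grey-level image if that is what is intended.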

Challenges

1. CCA does not identify the spaces between words, which makes it difficult to group characters into words. We plan to devise a new approach to character segmentation to address this.
2. Since we are using zonal features, a slight rotation of the characters may reduce the accuracy of the classifier.
3. A large data-set covering various font styles and symbols is required. We have not yet found such a data-set on the internet, so we plan to build one ourselves and ask our friends to help with manual annotation.

Course of Progress

We plan to meet the following deadlines:

28th Oct  Finalization of character features
2nd Nov   Building the data-set with annotations
4th Nov   Preliminary results on the data-set
10th Nov  Testing with real-time data
12th Nov  Testing results
15th Nov  Demonstration of the system
16th Nov  Final report