OCRdroid: A Framework to Digitize Text Using Mobile Phones


1 OCRdroid: A Framework to Digitize Text Using Mobile Phones ➢ Authors • Mi Zhang, Anand Joshi, Ritesh Kadmawala, Karthik Dantu, Sameera Poduri, and Gaurav Sukhatme • University of Southern California ➢ Presenter • Mi Zhang

2 Outline ➢ What is OCRdroid? ➢ Related Work ➢ Design Considerations ➢ System Architecture ➢ Experimental Results ➢ Summary

3 What is OCRdroid? ➢ Why? • Huge demand for recognizing text in camera-captured pictures • Mobile phones are ubiquitous and powerful ➢ What? • OCRdroid = OCR + Mobile Phone ➢ Two Applications • PocketPal: Personal Receipt Management Tool • PocketReader: Personal Mobile Screen Reader

4 Related Work ➢ Design and implementation of a Card Reader based on build-in camera. X.P. Luo, J. Li, and L.X. Zhen ➢ Automatic detection and recognition of signs from natural scenes. X. Chen, J. Yang, and A. Waibel ➢ A morphological image preprocessing suite for OCR on natural scene images. M. Elmore and M. Martonosi

5 Design Considerations ➢ Real-Time Processing ➢ Lighting Conditions ➢ Text Skew ➢ Perception Distortion (Tilt) ➢ Text Misalignment ➢ Blur (Out-of-Focus)

6 Real-Time Processing ➢ Issues: • Limited memory • Relatively low processing power • Quick response required ➢ Our Solutions: • Multi-Threaded System Architecture • Image Compression • Computationally Efficient Algorithms

7 Lighting Conditions ➢ Issues: • Uneven Lighting (Shadows, Reflection, Flooding, etc.)

8 Lighting Conditions ➢ Our Solution: • Local Binarization: Fast Sauvola’s Algorithm
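
The slide names Sauvola's local binarization but does not show it. The following is a minimal sketch of the standard Sauvola threshold, T = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation of the window around each pixel. Integral images keep the cost per pixel constant, which is one common way to make the method fast; whether OCRdroid's "fast" variant works this way is an assumption. The function names, the window size, and the nested-list image format are illustrative only; k = 0.5 and R = 128 are the values usually quoted for 8-bit images.

```python
import math


def integral_image(img, power=1):
    """Summed-area table of img**power, padded with one zero row and column."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x] ** power
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def window_sum(table, y0, x0, y1, x1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 using four lookups."""
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


def sauvola_binarize(img, window=15, k=0.5, r=128.0):
    """Return a 0/1 image in which 1 marks dark (text) pixels.

    img is a list of rows of 0..255 grey values. window, k and r are
    assumed defaults, not values taken from the slides.
    """
    h, w = len(img), len(img[0])
    sums = integral_image(img, 1)
    squares = integral_image(img, 2)
    half = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            n = (y1 - y0) * (x1 - x0)
            mean = window_sum(sums, y0, x0, y1, x1) / n
            var = window_sum(squares, y0, x0, y1, x1) / n - mean * mean
            std = math.sqrt(max(var, 0.0))
            threshold = mean * (1.0 + k * (std / r - 1.0))
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out
```

Because the threshold follows the local mean, a shadowed region and a brightly lit region of the same page each get their own cut-off, which is the property the uneven-lighting slide calls for.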

9 Text Skew ➢ Issues: • When the camera perspective is not fixed, text lines may be skewed from their original orientation

10 Text Skew ➢ Our Solution: • Branch-and-Bound text line finding algorithm + Auto-rotation
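
The slides do not describe the branch-and-bound line finder in enough detail to reproduce it, so the sketch below swaps in a simpler, well-known technique: a brute-force projection-profile search. It tries candidate rotation angles, scores each by how sharply the text pixels collapse onto a few rows, and returns the angle that straightens the lines; applying that angle is the auto-rotation step. All names are hypothetical, the 0.5-degree step is an assumption, and the default search range of ±30 degrees simply mirrors the skew tolerance reported later in the deck.

```python
import math


def rotate(points, angle_deg):
    """Rotate (x, y) points about the origin by angle_deg degrees."""
    a = math.radians(angle_deg)
    sin_a, cos_a = math.sin(a), math.cos(a)
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in points]


def profile_score(points, angle_deg):
    """Sum of squared row counts after rotation; peaks when lines are level."""
    rows = {}
    for _, y in rotate(points, angle_deg):
        key = round(y)
        rows[key] = rows.get(key, 0) + 1
    return sum(count * count for count in rows.values())


def estimate_correction(points, max_angle=30.0, step=0.5):
    """Return the rotation (degrees) that best levels the text lines.

    points are coordinates of foreground (text) pixels. This is a
    stand-in for the branch-and-bound method named on the slide.
    """
    steps = int(round(2 * max_angle / step))
    candidates = [-max_angle + i * step for i in range(steps + 1)]
    return max(candidates, key=lambda angle: profile_score(points, angle))
```

A caller would pass the foreground pixel coordinates of the binarized image to estimate_correction and then rotate the image by the returned angle before OCR.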

11 Perception Distortion (Tilt) ➢ Issues: • Occurs when the text plane is not parallel to the imaging plane • Mobile phones are susceptible to tilts • Even a small perception distortion causes OCR to fail

12 Perception Distortion (Tilt) ➢ Our Solution: • Use Embedded Orientation Sensor (Pitch and Roll) • Calibration
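
One way to read this slide is as a gate on capture: calibrate a baseline pitch and roll, then only allow a photo while the phone stays close to that baseline. The sketch below illustrates that idea in plain Python rather than Android code. The function names are hypothetical, and using 10 degrees as the tolerance is an assumption borrowed from the tolerance figure in the experimental results, not a documented setting.

```python
def calibrate(samples):
    """Average (pitch, roll) readings taken while the phone is held level.

    samples is a non-empty list of (pitch_deg, roll_deg) tuples.
    """
    n = len(samples)
    pitch = sum(p for p, _ in samples) / n
    roll = sum(r for _, r in samples) / n
    return (pitch, roll)


def tilt_ok(pitch, roll, baseline=(0.0, 0.0), tolerance_deg=10.0):
    """True when both angles are within tolerance of the calibrated baseline."""
    return (abs(pitch - baseline[0]) <= tolerance_deg
            and abs(roll - baseline[1]) <= tolerance_deg)
```

For example, with a baseline of (2.0, -1.0), a reading of pitch 5 and roll 5 passes, while pitch 13 and roll 0 is rejected because it is 11 degrees off the baseline pitch.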

13 Text Misalignment ➢ Issues: • Camera screen covers only part of the text region • Irregular shapes of text characters

14 Text Misalignment ➢ Our Solution: • Step #1: Modified version of Sauvola’s algorithm [Figure labels: Top Border, Right Border, Left Border, Bottom Border]

15 Text Misalignment ➢ Our Solution: • Step #1 (cont.): Routes to perform Sauvola’s algorithm

16 Text Misalignment ➢ Our Solution: • Step #2: Noise Reduction [Figure labels: Top Border, Right Border, Left Border, Bottom Border, W]
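
The two steps above suggest a border test: binarize strips of width W along the four image borders, discard noise, and flag a border if text pixels remain there, since that implies text is cut off. The sketch below is one possible reading of that idea, not the authors' algorithm. It takes an already binarized 0/1 image, treats isolated foreground pixels as noise, and reports which borders still contain at least min_pixels text pixels. The function names, the isolated-pixel noise rule, and both default parameters are assumptions.

```python
def border_strips(h, w, width):
    """Map each border name to the (row, col) cells in its strip."""
    return {
        "top": [(y, x) for y in range(min(width, h)) for x in range(w)],
        "bottom": [(y, x) for y in range(max(h - width, 0), h) for x in range(w)],
        "left": [(y, x) for y in range(h) for x in range(min(width, w))],
        "right": [(y, x) for y in range(h) for x in range(max(w - width, 0), w)],
    }


def has_neighbor(binary, y, x):
    """True if any of the 8 surrounding pixels is foreground."""
    h, w = len(binary), len(binary[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy or dx) and 0 <= y + dy < h and 0 <= x + dx < w:
                if binary[y + dy][x + dx]:
                    return True
    return False


def misaligned_borders(binary, width=2, min_pixels=3):
    """Return the names of borders whose strip still holds text pixels.

    Isolated foreground pixels are ignored as noise (assumed rule).
    """
    h, w = len(binary), len(binary[0])
    flagged = []
    for name, cells in border_strips(h, w, width).items():
        count = sum(1 for y, x in cells
                    if binary[y][x] and has_neighbor(binary, y, x))
        if count >= min_pixels:
            flagged.append(name)
    return flagged
```

An empty result would correspond to the "Proper Alignment Detected" path in the front-end diagram; any flagged border would prompt the user to re-aim the camera.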

17 Blur (Out-of-Focus) ➢ Issues: • OCR needs sharp edge response

18 Blur (Out-of-Focus) ➢ Our Solution: • Android autofocus mechanism

19 System Architecture ➢ Android Phone: 1. Photo of a receipt 2. Front-end processing 3. Upload image (over the Internet) ➢ Web Server (OCR Engine: Tesseract): 4. Perform back-end processing & OCR 5. Return OCR results ➢ Android Phone: 6. Results returned 7. Information extraction

20 Front-End Architecture ➢ Diagram components: Camera Preview, Orientation Handler, Alignment Checker, Capture, Image Upload, Internet, OCR Data Receiver, Information Extraction, Mobile Database ➢ Diagram transitions: Improper Alignment Detected, Proper Alignment Detected

21 Back-End Architecture ➢ Diagram components: Internet, Store Image, Binarization, Skew Detection & Auto-rotation, Tesseract OCR Engine, OCR Text Output, Sends Results back to Mobile Device, Internet

22 Experimental Results ➢ Test Corpus • Ten distinct black & white images • Three distinct lighting conditions: Normal (adequate light), Poor (dim), Flooding (light source focused on a particular portion of the image) ➢ Performance Metrics • Character Accuracy • Word Accuracy • Timing
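
The slide lists character accuracy and word accuracy without defining them. A common definition, assumed here, is 1 minus the edit distance between the reference text and the OCR output, divided by the reference length, computed over characters or over whitespace-separated words. The sketch below implements that definition; the function names are illustrative, and the authors' exact formula may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def accuracy(ref, hyp):
    """1 - distance / len(ref), clamped at 0 (assumed metric definition)."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def character_accuracy(ref_text, ocr_text):
    return accuracy(ref_text, ocr_text)


def word_accuracy(ref_text, ocr_text):
    return accuracy(ref_text.split(), ocr_text.split())
```

Under this definition, an OCR output of "T0TAL 12.50" against the reference "TOTAL 12.50" has one wrong character out of eleven and one wrong word out of two, which shows why word accuracy tends to fall faster than character accuracy.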

23 Experimental Results ➢ Binarization (measured by Character Accuracy): • Normal: Around 97% • Poor: Around 60% • Flooding: Around 60% ➢ Skew tolerance: Up to 30 degrees ➢ Perception Distortion: Up to 10 degrees

24 Experimental Results ➢ Misalignment Detection: ➢ Timing Performance: • Misalignment Detection: Less than 6 seconds • Overall Process: Less than 11 seconds

25 More Information ➢ Project Website: http://www-scf.usc.edu/~ananddjo/ocrdroid/index.php • Test Cases & Results • Demo Video • Paper • Presentation Slides • Tools Information (Mobile Phone + Software)

26 Summary ➢ OCRdroid: A Generic Framework for Developing OCR-based Applications on Mobile Phones ➢ Six Design Considerations & Our Solutions • In particular, we propose a new real-time, computationally efficient algorithm for text misalignment detection ➢ Experimental Results

27 Questions?

28 Thank You

