OCRdroid : A Framework to Digitize Text Using Mobile Phones  Authors  Mi Zhang, Anand Joshi, Ritesh Kadmawala, Karthik Dantu, Sameera Poduri, and Gaurav.

Presentation transcript:

OCRdroid : A Framework to Digitize Text Using Mobile Phones  Authors  Mi Zhang, Anand Joshi, Ritesh Kadmawala, Karthik Dantu, Sameera Poduri, and Gaurav Sukhatme  University of Southern California  Presenter  Mi Zhang

Outline  What is OCRdroid ?  Related Work  Design Considerations  System Architecture  Experimental Results  Summary

What is OCRdroid?
  - Why?
    - Huge demand for recognizing text in camera-captured pictures
    - Mobile phones are ubiquitous and powerful
  - What?
    - OCRdroid = OCR + Mobile Phone
    - Two applications
      - PocketPal: Personal Receipt Management Tool
      - PocketReader: Personal Mobile Screen Reader

Related Work
  - Design and implementation of a Card Reader based on build-in camera. X.P. Luo, J. Li, and L.X. Zhen
  - Automatic detection and recognition of signs from natural scenes. X. Chen, J. Yang, and A. Waibel
  - A morphological image preprocessing suite for OCR on natural scene images. M. Elmore and M. Martonosi

Design Considerations
  - Real-Time Processing
  - Lighting Conditions
  - Text Skew
  - Perception Distortion (Tilt)
  - Text Misalignment
  - Blur (Out-of-Focus)

Real-Time Processing
  - Issues:
    - Limited memory
    - Relatively low processing power
    - Quick response required
  - Our Solutions:
    - Multi-threaded system architecture
    - Image compression
    - Computationally efficient algorithms

Lighting Conditions  Issues :  Uneven Lighting (Shadows, Reflection, Flooding, etc.)

Lighting Conditions  Our Solution :  Local Binarization : Fast Sauvola’s Algorithm
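
As a rough illustration of local binarization, the sketch below applies Sauvola's threshold, T = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation of the grey levels in a window around each pixel. The use of integral images (summed-area tables) to get m and s in constant time per pixel is one common way to make Sauvola fast; whether OCRdroid's "Fast Sauvola" works exactly this way is an assumption. The function names and the defaults (window=15, k=0.5, R=128) are illustrative choices, not values taken from the slides.

```python
import math

def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for img (a list of rows)."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def window_sum(sat, x0, y0, x1, y1):
    """Sum over the half-open rectangle [x0, x1) x [y0, y1)."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]

def sauvola_binarize(gray, window=15, k=0.5, R=128.0):
    """Binarize a greyscale image (0-255). Output: 1 = background, 0 = text."""
    h, w = len(gray), len(gray[0])
    sat = integral_image(gray)
    sat_sq = integral_image([[v * v for v in row] for row in gray])
    half = window // 2
    out = []
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        row = []
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            n = (x1 - x0) * (y1 - y0)
            mean = window_sum(sat, x0, y0, x1, y1) / n
            var = window_sum(sat_sq, x0, y0, x1, y1) / n - mean * mean
            std = math.sqrt(max(var, 0.0))  # guard against rounding below zero
            threshold = mean * (1.0 + k * (std / R - 1.0))
            row.append(1 if gray[y][x] > threshold else 0)
        out.append(row)
    return out
```

Because the threshold follows the local mean, a dark stroke on a shadowed background can still be separated from that background, which a single global threshold would not manage.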

Text Skew  Issues :  When perspective is not fixed, text lines may get skewed from their original orientation

Text Skew  Our Solution :  Branch-and-Bound text line finding algorithm + Auto-rotation
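
The slides do not describe the branch-and-bound line finder in detail, so the sketch below deliberately swaps in a much simpler stand-in: it fits a least-squares line through points sampled along one text baseline to estimate the skew angle, then rotates by the opposite angle. It only illustrates the "estimate angle, then auto-rotate" idea; the function names and the input format (a list of (x, y) baseline points) are hypothetical.

```python
import math

def estimate_skew_degrees(baseline_points):
    """Angle (degrees) of the least-squares line through (x, y) points."""
    n = len(baseline_points)
    mean_x = sum(x for x, _ in baseline_points) / n
    mean_y = sum(y for _, y in baseline_points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in baseline_points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in baseline_points)
    return math.degrees(math.atan2(sxy, sxx))

def rotate_points(points, angle_degrees, center=(0.0, 0.0)):
    """Rotate points by -angle_degrees about center, undoing the skew."""
    a = math.radians(-angle_degrees)
    cos_a, sin_a = math.cos(a), math.sin(a)
    cx, cy = center
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return rotated
```

After rotation, points that lay on a skewed baseline share (almost) the same y coordinate, which is the condition a line-oriented OCR engine expects.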

Perception Distortion (Tilt)
  - Issues:
    - Occurs when the text plane is not parallel to the imaging plane
    - Mobile phones are susceptible to tilt
    - Even a small perception distortion can cause OCR to fail

Perception Distortion (Tilt)  Our Solution :  Use Embedded Orientation Sensor (Pitch and Roll)  Calibration
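
A minimal sketch of how pitch and roll readings might gate image capture, assuming the sensor reports both angles in degrees. The calibration step here simply averages readings taken while the phone is held in the reference pose; the 10-degree default mirrors the tolerance reported in the experimental results, but using it as a capture threshold is an assumption, as are the function names.

```python
def calibrate(samples):
    """Average (pitch, roll) readings taken in the reference pose."""
    n = len(samples)
    pitch_offset = sum(p for p, _ in samples) / n
    roll_offset = sum(r for _, r in samples) / n
    return pitch_offset, roll_offset

def is_tilt_acceptable(pitch, roll, pitch_offset=0.0, roll_offset=0.0,
                       max_tilt=10.0):
    """True if both calibrated angles are within max_tilt degrees."""
    return (abs(pitch - pitch_offset) <= max_tilt
            and abs(roll - roll_offset) <= max_tilt)
```

A capture loop could call `is_tilt_acceptable` on every sensor update and only enable the shutter while it returns True.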

Text Misalignment  Issues :  Camera screen covers a partial text region  Irregular shapes of text characters

Text Misalignment
  - Our Solution:
    - Step #1: Modified version of Sauvola's algorithm
    [Figure: camera frame with Top Border, Right Border, Left Border, and Bottom Border labeled]

Text Misalignment  Our Solution :  Step#1(Cont) : Routes to perform Sauvola’s algorithm

Text Misalignment
  - Our Solution:
    - Step #2: Noise Reduction
    [Figure: Top Border, Right Border, Left Border, and Bottom Border, with window width W]
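
The two steps can be pictured with the simplified sketch below. It assumes the frame has already been binarized (here 1 = text pixel, 0 = background) and flags a border when enough text pixels fall inside a thin strip along it, which suggests that text is cut off on that side. The count threshold stands in for the noise-reduction step, whose exact rule is not given in the slides; the strip width, the threshold, and the function name are all hypothetical.

```python
def misaligned_borders(binary, border=2, min_text_pixels=3):
    """Return the names of borders whose strip holds enough text pixels.

    binary: list of rows of 0/1 values, with 1 marking a text pixel.
    border: strip thickness in pixels along each edge.
    min_text_pixels: counts below this are treated as noise.
    """
    h, w = len(binary), len(binary[0])
    strips = {
        "top": [binary[y][x] for y in range(min(border, h))
                for x in range(w)],
        "bottom": [binary[y][x] for y in range(max(0, h - border), h)
                   for x in range(w)],
        "left": [binary[y][x] for y in range(h)
                 for x in range(min(border, w))],
        "right": [binary[y][x] for y in range(h)
                  for x in range(max(0, w - border), w)],
    }
    return [name for name, pixels in strips.items()
            if sum(pixels) >= min_text_pixels]
```

An empty result means no border looks crowded, so the frame can be treated as properly aligned. Only the border strips are inspected, which keeps the check cheap.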

Blur (Out Of Focus)  Issues :  OCR needs sharp edge response

Blur (Out Of Focus)  Our Solution :  Android autofocus mechanism

System Architecture
  Components: Android Phone, Internet, Web Server, OCR Engine (Tesseract)
  1. Photo of a receipt
  2. Front-end processing
  3. Upload image
  4. Perform back-end processing & OCR
  5. Return OCR results
  6. Results returned
  7. Information extraction

Front-End Architecture
  [Diagram] Components: Camera Preview, Orientation Handler, Alignment Checker, Capture, Image Upload, Internet, OCR Data Receiver, Information Extraction, Mobile Database
  [Diagram] Transitions: Improper Alignment Detected, Proper Alignment Detected

Back-End Architecture
  [Diagram] Components: Internet, Store Image, Binarization, Skew Detection & Auto-rotation, Tesseract OCR Engine, OCR Text Output
  [Diagram] Final step: Sends results back to mobile device (via Internet)

Experimental Results
  - Test Corpus
    - Ten distinct black & white images
    - Three distinct lighting conditions
      - Normal: adequate light
      - Poor: dim
      - Flooding: light source focused on a particular portion of the image
  - Performance Metrics
    - Character accuracy
    - Word accuracy
    - Timing
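
The slides name the metrics without defining them. One common definition, assumed here, scores accuracy as 1 minus the edit distance between the OCR output and the ground truth, divided by the ground-truth length, counted in characters for character accuracy and in whitespace-separated words for word accuracy. The function names are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

def _accuracy(truth, ocr):
    """Share of ground-truth units recovered, clamped at zero."""
    if not truth:
        raise ValueError("ground truth must not be empty")
    return max(0.0, 1.0 - edit_distance(truth, ocr) / len(truth))

def character_accuracy(truth_text, ocr_text):
    return _accuracy(truth_text, ocr_text)

def word_accuracy(truth_text, ocr_text):
    return _accuracy(truth_text.split(), ocr_text.split())
```

For example, reading "TOTAL 12.50" as "T0TAL 12.50" costs one of eleven characters but one of only two words, so word accuracy drops much faster than character accuracy.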

Experimental Results
  - Binarization (measured by character accuracy):
    - Normal: around 97%
    - Poor: around 60%
    - Flooding: around 60%
  - Skew tolerance: up to 30 degrees
  - Perception distortion tolerance: up to 10 degrees

Experimental Results  Misalignment Detection:  Timing Performance:  Misalignment Detection: Less Than 6 seconds  Overall Process: Less Than 11 seconds

More Information
  - Project website: http://www-scf.usc.edu/~ananddjo/ocrdroid/index.php
  - Test Cases & Results
  - Demo Video
  - Paper
  - Presentation Slides
  - Tools Information (Mobile Phone + Software)

Summary
  - OCRdroid: a generic framework for developing OCR-based applications on mobile phones
  - Six design considerations and our solutions
  - In particular, we propose a new, computationally efficient, real-time algorithm for text misalignment detection
  - Experimental results

Questions ?

Thank You