Optical Character Recognition
Mobile App Development Project
Aakanksha Gupta

Optical Character Recognition
  A technology that extracts text from printed documents and captured images
  Targets typewritten text, one character at a time
  An "offline" process that analyses a static document
  Pre-processes images to improve recognition
  Android does not come bundled with OCR libraries, so external libraries are needed
Current apps on the Google Play Store
  Google Keep
  Text Fairy
  CamScanner
Applications in real life
  Data entry for business documents, e.g. checks, passports
  Extracting business card information into a contact list
  Converting handwriting in real time to control a computer

Text Recognition
Functionality: The app lets the user capture an image and then view the text found in it. Images are not stored unless the user chooses to save them to external media.
Requirement: Android 5 and above
Features: camera capture, conversion of English words in the image to text
Issues:
  Sometimes returns garbage values and fails to recognise text
  Camera sometimes produces blurred images
  Horizontal and vertical camera orientation
  Autofocusing

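One way to approach the orientation issue listed above is to work out how far a captured frame must be rotated before it is handed to the OCR engine. The sketch below is an illustration, not the app's actual code: the class and method names are hypothetical, and it assumes a back-facing camera whose sensor orientation and current display rotation (both in degrees) have already been read from the device.

```java
// Hypothetical helper, not taken from the app: computes the clockwise rotation
// that makes a back-camera frame upright for the current display rotation.
final class FrameRotation {
    private FrameRotation() {}

    /**
     * @param sensorOrientation camera sensor orientation in degrees (commonly 90 on phones)
     * @param displayRotation   current display rotation in degrees: 0, 90, 180 or 270
     * @return clockwise rotation in degrees to apply to the frame, in the range [0, 360)
     */
    static int backCameraRotation(int sensorOrientation, int displayRotation) {
        // The inner % can be negative in Java, so add 360 and reduce again.
        return ((sensorOrientation - displayRotation) % 360 + 360) % 360;
    }
}
```

For a sensor mounted at 90 degrees, a portrait display (rotation 0) gives 90 and a landscape display (rotation 90) gives 0, so only the portrait frame needs rotating before recognition.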
Demo of Text Recognition
Implementation
Preparing Tesseract
  Tesseract has Unicode (UTF-8) support and can recognize more than 100 languages
  It provides an API for converting images to text
  The application uses the English language data
  Install the Native Development Kit (NDK), a set of tools that lets Android apps call C and C++ code from Java
Methodology
  Add the tess-two library as a dependency
  Create a class to manage Tesseract calls
  Initialize the object and call methods on it

Android Side
  Using OpenCV 3.1.0
  Still have to take a photo from the camera
  Loading data from image files
  Gradle

Initialize the TessBaseAPI object with the data path (the directory that contains tessdata/ with the traineddata file) and the proper page segmentation mode
Pass the image as a Bitmap to the TessBaseAPI object
Call the getUTF8Text method, which returns the recognized text as a String

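The three steps above can be sketched as a small manager class. To keep the example compilable without the Android SDK, OcrEngine is a hypothetical stand-in interface; in the app its role would be played by tess-two's TessBaseAPI, whose init, setPageSegMode, setImage, getUTF8Text and end methods it mirrors. The generic image type I stands in for android.graphics.Bitmap, and the class names are assumptions made for illustration.

```java
// Hypothetical stand-in for the OCR engine so the sketch compiles off-device.
// In the app, an adapter around tess-two's TessBaseAPI would implement it.
interface OcrEngine<I> {
    boolean init(String dataPath, String language); // true when the language data loaded
    void setPageSegMode(int mode);
    void setImage(I image);
    String getUTF8Text();
    void end();                                     // releases native resources
}

// Manages the engine lifecycle: initialize once, recognize many images, then close.
final class OcrManager<I> {
    private final OcrEngine<I> engine;
    private boolean ready;

    OcrManager(OcrEngine<I> engine, String dataPath, int pageSegMode) {
        this.engine = engine;
        // dataPath is the directory that contains tessdata/eng.traineddata.
        this.ready = engine.init(dataPath, "eng");
        if (ready) {
            engine.setPageSegMode(pageSegMode);
        }
    }

    // Runs recognition on one image and returns the trimmed text ("" if none).
    String recognize(I image) {
        if (!ready) {
            throw new IllegalStateException("OCR engine is not initialized");
        }
        engine.setImage(image);
        String text = engine.getUTF8Text();
        return text == null ? "" : text.trim();
    }

    // Frees the engine; safe to call more than once.
    void close() {
        if (ready) {
            engine.end();
            ready = false;
        }
    }
}
```

Keeping the engine behind an interface is a design choice of this sketch: it lets the manager be exercised with a fake engine in a plain JVM test, while the real recognition still runs through Tesseract on the device.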
CameraBridgeViewBase
  Implemented to control when the camera can be enabled, process the frame, and call an external listener to make adjustments to the frame
Bitmap
  Graphic image format used to store digital images, with 32-bit color
FloatingActionButton
  Circular icon button floating above the UI

Enable the Tesseract Toast notifications, and all the other notifications
Layout and Resource
AndroidManifest.xml
  Uses external storage
  Changes the icon of the application
  Changes the name of the application
  Uses the UTF-8 character encoding, which can encode every character defined by Unicode

Challenges and Improvements
Challenges
  Choosing the dataset was the biggest challenge
  Tried MATLAB, but the results were not good
  Working with Tesseract
  Installing the NDK and making it work
Improvements
  The Tesseract dataset is not great; it sometimes gives garbage values instead of the detected text
  Camera orientation causes issues
  The image stored for processing appears blurred
  Not 100% accurate

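A possible mitigation for the garbage values mentioned above is to check the recognized string before showing it to the user. The sketch below is a simple heuristic and not part of the app: the class name, method names and the 0.8 threshold in the usage note are assumptions, and it only measures what fraction of the characters are letters, digits or whitespace.

```java
// Hypothetical post-processing helper, not taken from the app.
final class OcrTextFilter {
    private OcrTextFilter() {}

    // Fraction of characters that are letters, digits or whitespace; 0.0 for null or empty input.
    static double plausibleRatio(String text) {
        if (text == null || text.isEmpty()) {
            return 0.0;
        }
        int plausible = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c) || Character.isWhitespace(c)) {
                plausible++;
            }
        }
        return (double) plausible / text.length();
    }

    // True when the text is non-blank and at least minRatio of it looks like ordinary text.
    static boolean looksLikeText(String text, double minRatio) {
        return text != null && !text.trim().isEmpty() && plausibleRatio(text) >= minRatio;
    }
}
```

With a threshold of 0.8, a result such as "Hello World 42" passes, while a run of stray symbols is rejected and the app could ask the user to retake the photo. The threshold would need tuning, since text with heavy punctuation also lowers the ratio.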
Learnings
  Learned a lot about OpenCV and Tesseract
  Why datasets are important
  Android layouts can be fun
  The internet does not always help
  And yes, finally I made an app of my own! Phew.

Questions??