Image Text & Audio hacks

Introduction
Image processing is one of the fastest-growing technologies in computer science. It converts an image into digital form and performs operations on it, either to obtain an enhanced image or to extract useful information from it. Image processing is usually done for:
1. Visualization: observe objects that are not visible.
2. Image sharpening and restoration: create a better image.
3. Image retrieval: search for the image of interest.
4. Measurement of pattern: measure various objects in an image.
5. Image recognition: distinguish the objects in an image.
In this hackfest, using the OpenCV library in C++ and Tesseract OCR, we aim to build the following three devices:

A Friend of the Blind
India has the world's largest blind population. More than 1.43 million people are visually impaired. Only 5% of the blind receive any kind of education. Braille books, braille printers and scanners are not readily available (they are expensive and non-portable). The device helps these people read a normal textbook. It has a right-angle guide that helps a blind person move a finger along the text in a straight line, and it gives audio feedback to help them move to the next line.

A Teaching Assistant
India ranks 185th among countries by average literacy rate. Around 70% of people in India still live in villages, where quality education is often unavailable. In these cases the device acts as a teaching assistant, enabling a child to learn more interactively and efficiently by visualizing.

A Tourist Companion
Imagine yourself in a foreign country where you do not know the spoken language. You could easily land in serious trouble. This device comes to the rescue: it translates the text in an image into the language the user requires, so that the user can understand the foreign language.

List of Libraries and APIs used
OpenCV (C++) – preprocesses the image before it is fed into Tesseract
Tesseract – converts text in images to text
PyEnchant – dictionary
gTTS – Google Text-to-Speech (online mode), output in MP3 format
pico2wave – text to speech (offline mode)
Pygame – plays the MP3 file

Steps followed to read the word pointed at by the finger
1. Find the centroid of the finger part and save it.
2. Crop the part of the image above the centroid.
3. Invert the image (white and black become black and white).
4. Apply a morphological operation (dilation) to the inverted image.
5. Draw the bounding rectangles and find the centroids of all contours.
6. Crop out the contour whose X coordinates include the centroid of the finger part.
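The last step above can be sketched in plain Python. This is only an illustration under stated assumptions, not the team's OpenCV code: boxes are hypothetical `(x, y, width, height)` tuples in image coordinates, and the tie-break (prefer the box closest above the finger) is an assumption the slides do not state.

```python
def pick_word_box(boxes, finger_x, finger_y):
    """Return the box above the finger whose x-range covers finger_x.

    boxes are (x, y, width, height) tuples with y growing downwards.
    If several boxes qualify, the one closest to the finger wins
    (an assumed tie-break). Returns None when nothing matches.
    """
    candidates = [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        if x <= finger_x <= x + w and y + h <= finger_y
    ]
    if not candidates:
        return None
    # The largest bottom edge is the box nearest to the fingertip.
    return max(candidates, key=lambda box: box[1] + box[3])


# Three word boxes: two on the line just above the finger, one higher up.
word_boxes = [(10, 40, 50, 12), (70, 40, 30, 12), (10, 10, 50, 12)]
pointed = pick_word_box(word_boxes, finger_x=85, finger_y=60)
```

Here `pointed` is the second box, because it is the only one whose horizontal extent contains x = 85.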

Identify finger part
1. Given a colour range in RGB, pixels whose values fall inside the range become white and all others become black.
2. Find the contours.
3. Find the centroid of the largest contour.
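A library-free sketch of this idea follows. It assumes a frame stored as nested lists of RGB tuples, uses a flood fill over 4-connected pixels in place of OpenCV's contour finder, and measures "largest" by pixel count; the colour bounds are made-up example values.

```python
from collections import deque


def in_range_mask(image, lower, upper):
    """255 where every channel lies within [lower, upper], else 0."""
    return [
        [255 if all(lo <= c <= hi for c, lo, hi in zip(px, lower, upper)) else 0
         for px in row]
        for row in image
    ]


def largest_blob_centroid(mask):
    """Centroid (x, y) of the largest 4-connected white region, or None."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] == 0 or seen[y][x]:
                continue
            # Flood-fill one connected region starting from (x, y).
            blob, pending = [], deque([(x, y)])
            seen[y][x] = True
            while pending:
                cx, cy = pending.popleft()
                blob.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        pending.append((nx, ny))
            if len(blob) > len(best):
                best = blob
    if not best:
        return None
    n = len(best)
    return (sum(p[0] for p in best) / n, sum(p[1] for p in best) / n)


RED, WHITE = (200, 30, 30), (255, 255, 255)
frame = [
    [WHITE, WHITE, WHITE, WHITE, RED],
    [WHITE, RED, RED, WHITE, WHITE],
    [WHITE, RED, RED, WHITE, WHITE],
]
mask = in_range_mask(frame, lower=(150, 0, 0), upper=(255, 80, 80))
centroid = largest_blob_centroid(mask)
```

The stray red pixel in the top-right corner forms its own small region and is ignored, which is the point of keeping only the largest one.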

Morphological operations on contours
Dilation
Erosion
Contour properties
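To make the two operations concrete, here is a minimal sketch on a binary mask stored as nested lists. It assumes a square structuring element and simply clips the window at the image border, which may differ from the border handling of a real library.

```python
def _morph(mask, size, pick):
    """Apply pick (max or min) over a size x size window clipped to the image."""
    h, w = len(mask), len(mask[0])
    r = size // 2
    return [
        [pick(mask[ny][nx]
              for ny in range(max(0, y - r), min(h, y + r + 1))
              for nx in range(max(0, x - r), min(w, x + r + 1)))
         for x in range(w)]
        for y in range(h)
    ]


def dilate(mask, size=3):
    """A pixel turns white if any pixel in its window is white."""
    return _morph(mask, size, max)


def erode(mask, size=3):
    """A pixel stays white only if every pixel in its window is white."""
    return _morph(mask, size, min)


# Two "letters" one pixel apart, then a wider gap before the next word.
line = [[255, 0, 255, 0, 0, 0, 255]]
merged = dilate(line)

dot = [[0] * 5 for _ in range(5)]
dot[2][2] = 255
restored = erode(dilate(dot))
```

Dilation closes the one-pixel gap between the first two letters but leaves the wider gap open, which is how separate letters can be merged into one word-sized contour.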

Captured Frame by camera

Finger Part Cropped out

Invert Image

Find Contours

Identify the word we are pointing at

Adaptive thresholding
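As a rough illustration of adaptive thresholding, the sketch below compares each pixel with the mean of its own neighbourhood instead of one global value. The block size and the constant `c` are hypothetical example values, and the window is clipped at the image border.

```python
def adaptive_threshold(gray, block=3, c=5):
    """Mean adaptive threshold: 255 if pixel > local mean - c, else 0."""
    h, w = len(gray), len(gray[0])
    r = block // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                gray[ny][nx]
                for ny in range(max(0, y - r), min(h, y + r + 1))
                for nx in range(max(0, x - r), min(w, x + r + 1))
            ]
            local_mean = sum(window) / len(window)
            out[y][x] = 255 if gray[y][x] > local_mean - c else 0
    return out


# One scan line: background gets brighter left to right, with two dark strokes.
scan_line = [[100, 60, 120, 130, 90, 150, 160]]
binary = adaptive_threshold(scan_line)
```

Both dark strokes (60 and 90) come out black even though the second one is brighter than the first, because each is judged against its own surroundings.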

Crop out the word pointed at by the red finger part

The cropped-out word is not very clear, and Tesseract is not robust enough to identify characters when they are unclear. Problem solved by smoothing the edges.

Smoothing by Averaging and Gaussian Filter
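The two filters differ only in their kernel weights. The sketch below assumes 3x3 kernels with integer weights and a replicated border; these are common textbook choices, not values taken from the slides.

```python
BOX_3X3 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]    # plain averaging
GAUSS_3X3 = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # centre-weighted approximation


def smooth(gray, kernel):
    """Filter gray with an odd-sized kernel, replicating edge pixels."""
    h, w = len(gray), len(gray[0])
    k = len(kernel) // 2
    total = sum(sum(row) for row in kernel)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    ny = min(max(y + dy, 0), h - 1)  # clamp to the image
                    nx = min(max(x + dx, 0), w - 1)
                    acc += gray[ny][nx] * kernel[dy + k][dx + k]
            out[y][x] = acc / total
    return out


# A single bright pixel in the middle of a dark 3x3 patch.
speck = [[0, 0, 0], [0, 144, 0], [0, 0, 0]]
averaged = smooth(speck, BOX_3X3)
gaussian = smooth(speck, GAUSS_3X3)
```

Averaging spreads the speck evenly over the patch, while the Gaussian kernel keeps more of the energy at the centre, so it tends to blur edges less for the same kernel size.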

What if the line of text in the image is tilted? Is Tesseract good enough to identify characters even if the line of text is tilted? The answer is: No.

Solution to the tilted-line problem
Draw an ellipse around the contour of the last line. Once the ellipse is drawn, find the angle between its axis and the X-axis. This is the angle of tilt, so rotate the complete image by that angle to straighten the text.
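The sketch below estimates the tilt from second-order central moments, which give the orientation of the major axis of the ellipse that best matches a point set. This is a stand-in for fitting an ellipse with a vision library, and the point list is a hypothetical contour. Note that in image coordinates the y axis points down, so a positive angle here appears as a clockwise tilt on screen.

```python
import math


def tilt_angle_degrees(points):
    """Angle between the major axis of the point set and the x axis."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    mu20 = sum((x - mean_x) ** 2 for x, _ in points) / n
    mu02 = sum((y - mean_y) ** 2 for _, y in points) / n
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))


def rotate_points(points, degrees, center):
    """Rotate points about center by the given angle."""
    t = math.radians(degrees)
    cx, cy = center
    return [
        ((x - cx) * math.cos(t) - (y - cy) * math.sin(t) + cx,
         (x - cx) * math.sin(t) + (y - cy) * math.cos(t) + cy)
        for x, y in points
    ]


# A "text line" running along the diagonal y = x.
tilted_line = [(i, i) for i in range(10)]
angle = tilt_angle_degrees(tilted_line)
straightened = rotate_points(tilted_line, -angle, center=(4.5, 4.5))
```

Rotating by the negative of the measured angle brings every point to the same height, which is the straightening step described above applied to points instead of a full image.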

How to find the font size in the image?
We need the font size because the dilation operation is chosen according to it. It is found by:
1. Inverting the normal image.
2. Treating each letter as a separate contour.
3. Drawing a rectangle around each contour and finding the height of each contour.
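One way to turn the measured heights into a single number is sketched below. Using the median height and scaling the dilation kernel by a fixed ratio are assumptions made for illustration; the slides do not say how the heights are combined.

```python
from statistics import median


def estimate_font_height(boxes):
    """Median height of (x, y, width, height) letter boxes.

    The median is less affected than the mean by small marks
    such as dots and commas.
    """
    return median([h for (_, _, _, h) in boxes])


def dilation_kernel_width(font_height, ratio=0.5):
    """Kernel width proportional to font height (ratio is a made-up value)."""
    return max(1, round(font_height * ratio))


# Letter boxes from one line; the 3-pixel-high box is the dot of an "i".
letter_boxes = [(0, 0, 8, 12), (10, 0, 8, 14), (20, 0, 8, 12), (30, 0, 2, 3), (34, 0, 8, 13)]
font_height = estimate_font_height(letter_boxes)
kernel_width = dilation_kernel_width(font_height)
```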

Overall View

We slide our finger continuously and the words are read out dynamically.
Any problems faced?
- Blurred images should be removed. Solution: Laplacian edge detector.
- Once a word has been read out, that image should not be processed again. Solution: Scale-Invariant Feature Transform (SIFT), to tell whether the present image is the same as the previously processed image.
Two parallel threads: one for processing images, the second for taking input images from the camera.

Laplacian Edge Detector
If the edges are crisp, the maximum value of the Laplacian output matrix will be 255; otherwise it will be somewhere around 100.
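A minimal sketch of this blur check follows. It assumes the standard 4-neighbour Laplacian kernel with the absolute response clipped to 255, and the cut-off of 200 in `is_sharp` is a hypothetical value chosen between the two figures quoted above.

```python
def laplacian_max(gray):
    """Largest absolute 4-neighbour Laplacian response, clipped to 255."""
    h, w = len(gray), len(gray[0])
    best = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            response = (gray[y - 1][x] + gray[y + 1][x]
                        + gray[y][x - 1] + gray[y][x + 1]
                        - 4 * gray[y][x])
            best = max(best, min(255, abs(response)))
    return best


def is_sharp(gray, cutoff=200):
    """Treat a frame as sharp when its strongest edge response is high."""
    return laplacian_max(gray) >= cutoff


crisp = [[0, 0, 255, 255]] * 3     # abrupt black-to-white edge
blurry = [[0, 50, 150, 200]] * 3   # the same edge smeared over several pixels
```

With these toy inputs, `laplacian_max(crisp)` reaches 255 while `laplacian_max(blurry)` is only 50, so only the crisp frame would be passed on to OCR.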

Scale Invariant feature transform

Problem: Incomplete images (premature images)
Solution: Using two parallel threads (programs)
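The two-thread arrangement can be sketched with a bounded queue between a capture thread and a processing thread. Everything here is illustrative: the frames are plain strings standing in for camera images, `None` is an assumed end-of-stream marker, and the `process` callback stands in for the OCR step.

```python
import queue
import threading


def capture_frames(frames, out_queue):
    """Producer: push each captured frame, then a None end marker."""
    for frame in frames:
        out_queue.put(frame)  # blocks while the queue is full
    out_queue.put(None)


def process_frames(in_queue, results, process):
    """Consumer: process frames in arrival order until the end marker."""
    while True:
        frame = in_queue.get()
        if frame is None:
            break
        results.append(process(frame))


def run_pipeline(frames, process):
    """Run capture and processing in parallel and return the results."""
    frame_queue = queue.Queue(maxsize=2)
    results = []
    producer = threading.Thread(target=capture_frames, args=(frames, frame_queue))
    consumer = threading.Thread(target=process_frames, args=(frame_queue, results, process))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


spoken = run_pipeline(["frame-1", "frame-2", "frame-3"], str.upper)
```

Because only complete frames are ever placed on the queue, the processing thread never sees a half-captured image, and the small queue size keeps it from falling far behind the camera.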

Removal Of Noise NOISE PART

Removal of Noise

Conclusions
What we built is a working prototype for:
- Reading text in a book
- Extracting text from images of object wrappers and reading it out
- Translating words into the required language using Google APIs (helpful for tourists)
- Displaying on screen the image of the word that a child points at (teaching assistant, interactive learning)

After the hackathon, we aim to make the following key improvements:
- The present algorithm needs to be faster and should include machine learning algorithms.
- It should be a portable device.