A Mobile Application for the Blind and Dyslexic

Slides:

Advertisements

Similar presentations

Microsoft ® Office OneNote ® 2007 Training Using your Notebook to its fullest potential Kent School District presents:

Advertisements

QWERTY Keyboards.

Chapter 1: Voilà! Meet the Android

Do you want this in a different format? Paper Versions available upon request. Provide me with a USB key for an electronic copy. Give me your address.

® Copyright 2008 Adobe Systems Incorporated. All rights reserved. ADOBE® ACCESSIBILITY AT Access to Flash and PDF Matt May 25 Mar 2010 Featuring.

Android 4.0 ICS An Unified UI framework for Tablets and Cell Phones Ashwin. G. Balani, Founder Member, GTUG, Napur.

Our Footprints CLIENTS What We Do Technology Expertise.

1 Accessibility CSSE 376, Software Quality Assurance Rose-Hulman Institute of Technology April 16, 2007.

LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition Supervised by Prof. LYU, Rung Tsong Michael Prepared by: Wong Chi Hang Tsang.

R EAD & W RITE G OLD : T EXT H ELP S YSTEMS I NC.: T EXT TO S PEECH S OFTWARE By: Ashley, Kathryn, Rine, and Samantha.

Chapter 1: Voilà! Meet the Android. Smartphones –Can browse the Web –Allow you to play games –Use business applications –Check –Play music –Record.

ASSISTIVE TECHNOLOGY PRESENTED BY ABDUL BARI KP. CONTENTS WHAT IS ASSISTIVE TECHNOLOGY? OUT PUT: Screen magnifier Speech to Recogonizing system Text to.

Computers Mrs. Doss.

CapturaTalk4Android Demonstration Abi James

Assistive technology and the dyslexic student. FINAL PRESENTATION REGINA JACKSON TECHNOLOGY IN TODAY’S CLASSROOM.

Adobe Accessibility By Margaret Hartman. Who Benefits: Individuals who have motor impairments, low vision, or blindness Creators of PDF documents and.

COMPUTER SOFTWARE FORM 1. Learning Area Introduction to computer software Operating System (OS) Application Software Word Processing Software Presentation.

Claro Software Dr. Alasdair King Claro Software. Assistive Software Development at Claro “Assistive software development and publishing is what we do.

Assistive and Adaptive Technologies in Educational Settings

Assistive Technology Margaret Carlson ED 505. What is assistive technology?  Assistive technology is a broad term that encompasses many different tools.

 By: Ann Carey, M.A., CCC-SLP Speech-Language Pathologist Assistive Technology Building Coach.

2 |2 | Overview of the presentation What is disability? What is the global situation for persons with disabilities? What is accessibility? What is ICT.

CHAPTER 7 Operating System Copyright © Cengage Learning. All rights reserved.

Designing for Quality Justin Shewell TESOL 2016 Baltimore, MD, USA.

How can audio help you and your learners? Adam Pearce Education Director

How can speech technology be used to help people with disabilities?

Assistive Technology : an overview

DISCOVERING COMPUTERS 2018 Digital Technology, Data, and Devices

An introduction to Amazon AI

Housekeeping On-line presentation trouble shooting

Unit 4 – Technology literacy

Hand Gestures Based Applications

Creating Accessible PDF’s for the Web

Face Detection and Notification System Conclusion and Reference

Web-based structures, links and testing

Intelligent IVI with AI

Powerpoint available at

Topics Question answering at Bing

11.10 Human Computer Interface

Chapter 5: Using System Software

PPT By:Gaurav Jaiswal Singsys Pte. Ltd.

Materials and Methods (Continued)

Introduction to Azure Bot Framework

Google translate app demo

Google, AI and Home Automation

Changing how people interact with computers

Splashtop Classroom Assist Gives Teachers the Power to Annotate and Share Their Windows 10 Screens with Any Student, Anytime, Anywhere WINDOWS APP BUILDER.

Curry School of Education

Application Software Productivity Tools for Educators

Leverage the Intelligent Cloud

You Get an App. And You Get an App. And You Get an App!

PhoNET Voice based web access ASWIN.P S3 EC ROLL : 24.

Communication Disability

Application Software Productivity Tools for Educators

Aims: explore innovative assessment tools

Disability Resource Center

Lesson 9: GUI HTML Editors and Mobile Web Sites

AI Stick Easy to learn and use, accelerate the industrialization of artificial intelligence, and let the public become an expert in AI.

Pilar Orero, Spain Yoshikazu SEKI, Japan 2018

Building your class website

Android App Development Course

Accessibility in Microsoft

Presented by Accessibility Services, Johnson & Wales University

Software - Operating Systems

Agenda Need of Cloud Computing What is Cloud Computing

x.ai makes an AI personal assistant who schedules meetings for you

Artificial intelligence for everyone

To change the image on this slide, select the picture and delete it

Exploring Cognitive Services

THE ASSISTIVE SYSTEM SHIFALI KUMAR BISHWO GURUNG JAMES CHOU

Presentation transcript:

A Mobile Application for the Blind and Dyslexic Joseph Lanzone LILY Lab, Yale University, New Haven, CT LILY Lab Introduction Basic Pipeline The proliferation of new AI technologies based on deep learning and powered by massive industrial data sets has enabled the widespread dissemination of easy-to-use, cloud-based, artificial intelligence APIs for application development. Instead of expanding upon a specific facet of foundational research, the goal of this project is to explore the utility of some of these APIs, and to develop new and valuable usage scenarios for these AI-based utilities. Specifically, this project focused on the development of a cross-platform text-to-speech (TTS) system that leveraged cutting-edge optical character recognition (OCR) and human-like speech synthesis endpoints to extract text from captured images and read that text aloud to users. In time, hopefully this application and applications like it will serve as valuable learning tools for the visually impaired and dyslexic A user imports a set of images in the main view These images are processed and sent to the Azure Computer Vision OCR endpoint. The Azure Computer Vision endpoint returns a list of parsed strings These strings are sent to a new view, along with flags signaling which TTS system to use, and whether to clean up the strings via spellcheck. All strings are combined and re-split on sentence delimiters. Should the spellchecking flag be set, each line is sent to the Azure Spell Check API for correction. The specifics of the iterative spellchecking process are further unpacked in the paper. Each processed fragment of text is then fed to a TTS engine, sound bytes are generated for each line (and saved in a temp folder on the device), and each sound byte is linked to a user interaction event (click) within the app. Upon beginning to play audio from the app, all files are queued into the OS media player for OS-level playback control. Figure 1. The basic screens of the app (Android) Figure 2. The key industrial APIs used by the app (Amazon Polly, Bing Computer Vision, Bing Spell Check) Background Figure 3. Screens from the Windows version of the application, showing the main page landing page, and OS-specific photo import In the United States today, somewhere around 5-10% of school children suffer from dyslexia1, and roughly 3% are either completely blind or in some other way severely visually impaired2. To put that in numbers, between 26 and 42 million Americans are faced with reading based impairments caused by these common disabilities alone. At the same time, nearly 77% of the American population owns a smartphone equipped with a camera3, and even simple OCR systems are performant enough to succeed at basic reading tasks. At the same time, the development of virtual assistants such as Amazon’s Alexa and the Google Assistant have sparked increased interest and development of human-like generative speech systems, and the recent movement toward cloud computing has made it such that highly intensive computation can be carried out even on mobile devices These trends inspired the creation of the cross-platform, computer vision based text-to-speech application detailed in this paper. Moving Forward In the future, it would be really valuable to further streamline the parsing and processing of a multi-page PDF document (approaches for this detailed in paper). This would be extremely valuable for the visually impaired, as it would lower the number of total user-interaction events asked of them. It would also be valuable to make the parsed text searchable. Given the large comorbidity of dyslexia and attention deficit disorder, building out features to features that would help distracted users get back on track are highly valuable. Acknowledgement This work was only made possible through the support of Professor Dragomir Radev. Thanks so much for allowing me to pursue this Drago.