Design of a Speech Recognition System to Assist Hearing Impaired Students

Richard Kheir and Thomas P. Way
Department of Computing Sciences, Villanova University

Abstract

Speech recognition software is maturing and has the potential to provide real-time note-taking assistance in the classroom, particularly for deaf and hard of hearing students. This research surveys speech recognition in general and reports on a practical, portable and readily deployed application that provides a cost-effective, automatic transcription system, with the goal of making computer science lectures inclusive of deaf and hard of hearing students. The design of the system is described, some specific technology choices and implementation approaches are discussed, and results of two phases of an in-class evaluation of the system are analyzed. Ideas for student research projects that could extend and enhance the system are also proposed.

Background

The quest for automatic speech recognition (ASR) started in 1939 with AT&T's introduction of the VODER speech synthesizer. Many enhancements in specific domains followed over the years, such as the introduction of the Hidden Markov Model (HMM). At the beginning of the 21st century, commercial speech recognition systems finally became practical and affordable, with many products on the market, the most popular vendors being IBM and Dragon.

With ASR software now widely available, the technology is emerging as an application area in assistive technology. For people who are deaf and hard of hearing, the accessibility and freedom that can be afforded by using a computer to recognize speech are finally beginning to be realized. The design of a truly usable ASR system requires an understanding of the approaches, user requirements, and available technology.

Applications

Four general application categories for ASR are:
    Command Recognition
    Dictation
    Interactive Voice Response (IVR)
    Assistive Technologies

Motivation

    28 million deaf and hard of hearing individuals in the US (around 500 million worldwide)
    Limited benefit from hearing aids and cochlear implants, as these are most useful in face-to-face conversations
    Note takers and sign language interpreters are expensive to hire and provide limited assistance due to the need to paraphrase during a lecture
    Developing countries provide no assistance
    Commercial ASR systems are expensive to acquire

System Design Part 1 - DiBS

A low recognition rate for domain-specific jargon is one of the key weaknesses of ASR. DiBS was developed to solve this problem.

We tested the ASR system in five scenarios: untrained; minimal training; moderate training; moderate training with some jargon added using DiBS; and moderate training with added jargon and custom pronunciations for the added jargon.

Table: Summary of the accuracy results for five scenarios.

Description                                                  Accuracy  Range    Usability
-----------------------------------------------------------  --------  -------  ------------
Untrained                                                    75%       64%-83%  Poor to fair
Minimal Training                                             88%       78%-93%  Sufficient
Moderate Training                                            90%       81%-96%  Good
Moderate Training and Customized Dictionary                  91%       83%-96%  Good
Moderate Training, Customized Dictionary and Pronunciations  94%       86%-98%  Very good

System Design Part 2 - VUST

    Consists of three major components: the speech recognition software, a dictionary enhancement tool, and a transcription distribution application
    Uses an ASR system designed to be affordable, accurate and easy to set up and use
    About one hour of speech training is enough to get good accuracy
    Training through the Windows Control Panel or through the VUST instructor's console
    Simple setup and configuration
    User-friendly interface
    Instructor initiates transcription
    Students connect via web applet
    Accurate results even without added jargon (see the five-scenario accuracy table in Part 1)

Instructor steps (server):
    1. Connect the wireless microphone receiver to the computer and wear the headset and transmitter.
    2. Run the VUST program and select a speech profile.
    3. …click 'Connect and Start Recognition' to start the VUST server.

Headset: Nady UHF-3 wireless headset system

Student steps:
    1. Connect to the VUST transcription server URL using a web browser.
    2. Select an available connection and click "Connect".
    3. Transcription is received once the lecture begins.

Table: Recognition accuracy for 4 classifications of classroom speech.

Classification  Words Correct  Total Words  Percent Recognized
--------------  -------------  -----------  ------------------
Planning                                    %
Lecture                                     %
Roll-call                                   %
Discussion                                  %
TOTAL                                       %

Contributions & Future Work

Contributions
    Proved to be an affordable and beneficial assistive system
    Provides easy-to-use software
    Improves recognition accuracy
    Distributed and portable application

Future work
    Commercial quality
    Post speech profiles and jargon in a central repository
    Evaluate other speech engines
    Deploy in classrooms
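
The VUST recognition table reports "Words Correct", "Total Words" and "Percent Recognized" for each classification of classroom speech. A minimal sketch of how such a table could be computed is given here, assuming that percent recognized is simply words correct divided by total words, and that the TOTAL row pools the word counts across classifications rather than averaging the percentages. The names `SpeechSample`, `percent_recognized` and `summarize` are hypothetical, and the counts are placeholders for illustration only, not results from this work.

```python
from dataclasses import dataclass


@dataclass
class SpeechSample:
    """Word counts for one classification of classroom speech."""
    classification: str
    words_correct: int
    total_words: int


def percent_recognized(words_correct: int, total_words: int) -> float:
    """Return recognition accuracy as a percentage (assumed: correct / total)."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    if not 0 <= words_correct <= total_words:
        raise ValueError("words_correct must be between 0 and total_words")
    return 100.0 * words_correct / total_words


def summarize(samples: list[SpeechSample]) -> dict[str, float]:
    """Compute per-classification accuracy plus a pooled TOTAL row."""
    rows = {
        s.classification: percent_recognized(s.words_correct, s.total_words)
        for s in samples
    }
    # Assumption: TOTAL pools the raw counts, so longer samples weigh more.
    pooled_correct = sum(s.words_correct for s in samples)
    pooled_total = sum(s.total_words for s in samples)
    rows["TOTAL"] = percent_recognized(pooled_correct, pooled_total)
    return rows


# Placeholder counts for illustration only; NOT measurements from the poster.
samples = [
    SpeechSample("Planning", 45, 50),
    SpeechSample("Lecture", 180, 200),
    SpeechSample("Roll-call", 15, 25),
    SpeechSample("Discussion", 60, 125),
]

for name, pct in summarize(samples).items():
    print(f"{name:<12}{pct:6.1f}%")
```

Pooling the counts for the TOTAL row means a long lecture segment influences the overall figure more than a short roll-call; averaging the four percentages instead would weight every classification equally.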