Spik v1.0: Voice Commands Execution in a Windows Environment. Dekel Abelson, Eliran Dahan. Instructor: Ari Todtfeld.



Objectives
- Analyze and explore voice-recognition systems, their capabilities and their limitations.
- Understand the Windows architecture and its programming concepts.
- Develop and implement a tool that lets users execute voice commands in a Windows environment, including building the tool's graphical user interface (GUI).
- Learn the Microsoft Speech SDK 5.1 (Software Development Kit) and its speech engine.

Project skills
- C++ programming
- XML (Extensible Markup Language)
- Programming in a Windows environment, including API (Application Programming Interface) calls

Brief history
- Dragon Systems releases "DragonDictate" for Windows 1.0, which uses discrete speech recognition technology.
- IBM introduces "MedSpeak", the first continuous speech recognition software.
- Dragon Systems releases "NaturallySpeaking", the first general-purpose continuous speech software program.
- Two months later, IBM releases its "ViaVoice".
- 2005: thanks to improvements in PC processing speed and in the algorithms used, several speech recognition programs are available on the market.

Voice recognition
Voice recognition follows these steps:
1. Spoken words enter a microphone.
2. The audio is processed by the computer's sound card.
3. The software discriminates between lower-frequency vowels and higher-frequency consonants and compares the results with phonemes, the smallest building blocks of speech.
4. The software then compares the results to groups of phonemes, and then to actual words, determining the most likely match.
5. The sentence is transferred to a word processing application.
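
The "most likely match" step can be illustrated with a toy sketch. It assumes that the heard audio has already been turned into a sequence of phoneme symbols, and it picks the lexicon word with the smallest edit distance. The names `Phonemes`, `editDistance` and `mostLikelyWord` are hypothetical, and edit distance is only a stand-in here: a real speech engine uses statistical models, not this comparison.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A phoneme sequence such as {"OW", "P", "AH", "N"}; the symbols are illustrative.
using Phonemes = std::vector<std::string>;

// Levenshtein edit distance between two phoneme sequences.
std::size_t editDistance(const Phonemes& a, const Phonemes& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t substitution = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, substitution});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Returns the lexicon word whose phonemes are closest to what was heard,
// or an empty string if the lexicon is empty. Ties keep the earlier entry.
std::string mostLikelyWord(
    const Phonemes& heard,
    const std::vector<std::pair<std::string, Phonemes>>& lexicon) {
    std::string best;
    std::size_t bestDistance = 0;
    bool found = false;
    for (const auto& entry : lexicon) {
        std::size_t d = editDistance(heard, entry.second);
        if (!found || d < bestDistance) {
            best = entry.first;
            bestDistance = d;
            found = true;
        }
    }
    return best;
}
```

With a lexicon holding "open" and "equal", a slightly misheard sequence such as OW B AH N would still resolve to "open", because only one phoneme differs.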

Architecture
1. Voice command by the user
2. SAPI 5.1 (Speech Application Programming Interface)
3. Processing of the recognized commands by C++/XML code
4. Command execution using API functions

GUI
- Executable file: spik.exe
- The GUI is a window that receives the voice commands from the user. It was built in C++ using the basic "Window" class.

SAPI 5.1
- SAPI provides a high-level interface between the application and the speech engine.
- The TTS (Text-To-Speech) system synthesizes text strings and files into spoken audio.
- Speech recognizers convert human spoken audio into readable text strings.

Processing
- Main function: contains the infinite loop that waits for messages to process.
- Main window procedure: handles the messages sent to the window.
- Command execution: runs the commands that have been identified by the speech engine.
- Microsoft Speech Engine API functions.
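
The step that executes identified commands can be sketched as a lookup table from recognized phrases to actions. This is a minimal, portable illustration, not the Spik source: the class name `CommandDispatcher` and its methods are hypothetical, and it assumes the speech engine hands the application the recognized phrase as plain text.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Maps a recognized phrase to the action that should run for it.
class CommandDispatcher {
public:
    using Action = std::function<void()>;

    // Registers (or replaces) the action for a phrase.
    void registerCommand(const std::string& phrase, Action action) {
        actions_[phrase] = std::move(action);
    }

    // Runs the action registered for the phrase.
    // Returns false if the phrase is unknown, so the caller can ignore it.
    bool dispatch(const std::string& phrase) const {
        auto it = actions_.find(phrase);
        if (it == actions_.end()) return false;
        it->second();
        return true;
    }

private:
    std::unordered_map<std::string, Action> actions_;
};
```

In a design like this, the window procedure would call `dispatch` whenever a recognition event arrives, and each registered action would wrap one of the Windows API calls used to carry out the command.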

Commands Execution
- The Windows API is a set of Application Programming Interfaces available in the Microsoft Windows operating systems that developers use to create software.
- The API consists of C functions implemented in dynamic-link libraries (DLLs), mainly the core DLLs kernel32.dll, user32.dll and gdi32.dll.
Main API functions we have used:
- CreateProcess(): runs executable files
- WinExec(): runs a specified application (an older call; CreateProcess() is the recommended replacement)
- ShellExecute(): performs an operation on a file, such as opening a URL
- ShowWindow(): sets the specified window's show state
- SendMessage(): sends the specified message to a window or windows
- keybd_event(): synthesizes a keystroke
- PostMessage(): places (posts) a message in the message queue associated with the thread that created the specified window
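
As a hedged sketch of how two of these calls might be wrapped, the helpers below start a program with CreateProcessW and open a URL with ShellExecuteW. The helper names `launchProgram` and `openUrl` are hypothetical and are not taken from the Spik source. The Windows calls are guarded by `_WIN32`, so on other platforms the helpers compile but simply return false; on Windows, ShellExecuteW requires linking against Shell32.lib.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h>
#endif

// Starts a program from a command line such as L"calc.exe".
// Returns true if the process was created.
bool launchProgram(const std::wstring& commandLine) {
#ifdef _WIN32
    STARTUPINFOW startupInfo{};
    startupInfo.cb = sizeof(startupInfo);
    PROCESS_INFORMATION processInfo{};
    // CreateProcessW may modify the command-line buffer, so pass a writable copy.
    std::wstring writableCommand = commandLine;
    BOOL created = CreateProcessW(nullptr, &writableCommand[0], nullptr, nullptr,
                                  FALSE, 0, nullptr, nullptr,
                                  &startupInfo, &processInfo);
    if (!created) return false;
    // The handles are not needed once the process is running.
    CloseHandle(processInfo.hThread);
    CloseHandle(processInfo.hProcess);
    return true;
#else
    (void)commandLine;  // No-op outside Windows.
    return false;
#endif
}

// Opens a URL (or document) with its default handler.
// ShellExecuteW signals success with a return value greater than 32.
bool openUrl(const std::wstring& url) {
#ifdef _WIN32
    HINSTANCE result = ShellExecuteW(nullptr, L"open", url.c_str(),
                                     nullptr, nullptr, SW_SHOWNORMAL);
    return reinterpret_cast<INT_PTR>(result) > 32;
#else
    (void)url;  // No-op outside Windows.
    return false;
#endif
}
```

A voice command such as "Open Calculator" could then map to `launchProgram(L"calc.exe")`, and an Internet command to `openUrl` with the target address.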

The Code
- C++ source code files
- The program's header files
- The program's executable file
- A text file in XML format, used by the speech recognition engine
- A text file containing strings used by the program
- A compiled file used by the speech recognition engine
- The speech recognition engine's header files

Adaptation & Training
- The speech recognition engine adapts itself to the user's voice, vocabulary and speech style in order to improve recognition accuracy.
- After adaptation, recognition errors drop to about ¼ of their previous number, so accuracy rises.
- As more training is done, accuracy rises to around 95%.

Voice command example
Calculator usage:
1. Say the voice command "Open Calculator" to run the calc.exe program.
2. Say a simple exercise.
3. Then say "Equal" or "Result" to show the solution.

Voice command example
- Run programs: notepad, command line
- Internet usage: search google
- Windows navigation: my documents, system properties, start menu, screen saver

Added value of the project
Advanced versions based on Spik v1.0 will be a helpful tool for people with physical disabilities when using the computer and the web.

Future Development
- Advanced OS navigation, in order to eliminate the use of the keyboard
- Adding Speech-to-Text capabilities
- An improved GUI that lets users enter their own voice commands

Q&A