Reducing uncertainty in speech recognition: Controlling mobile devices through voice-activated commands. Neil Gow (GWXNEI001), Stephen Breyer-Menke (BRYSTE003).

Presentation transcript:

Reducing uncertainty in speech recognition: Controlling mobile devices through voice-activated commands. Neil Gow (GWXNEI001), Stephen Breyer-Menke (BRYSTE003). Supervisor: Audrey Mbogho.

Introduction: Variety of applications (word processing, in-car voice activation, over-the-phone automated business systems, mobile phone interactions, biometric identification).

Introduction: AT&T Bell Labs. Processing power was the initial barrier. Speeds of up to 160 wpm are possible, with accuracy of 95%.

Introduction: Why use command-based interfaces on cell phones? Small keypads; hands-free operation; no visual feedback required; quick access to common functions.

How it works: Analogue sound waves are converted to digital format. The acoustic model breaks the digitized input into phonemes.
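The conversion step described on this slide can be sketched as sampling followed by quantization. This is only an illustration, not the front end of Sphinx or NICO: the 8 kHz sample rate, 16-bit depth, 25 ms frames, and the helper names `digitize`, `frame` and `tone` are assumptions chosen for the example.

```python
import math

def digitize(signal, duration_s, sample_rate=8000, bits=16):
    """Sample a continuous signal (a function of time in seconds that returns
    a value in [-1.0, 1.0]) and quantize each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1               # 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate)
    samples = []
    for n in range(n_samples):
        value = signal(n / sample_rate)           # sampling: read the wave at t = n / rate
        value = max(-1.0, min(1.0, value))        # clip to the representable range
        samples.append(round(value * max_level))  # quantization: map to an integer level
    return samples

def frame(samples, frame_len, hop):
    """Split the samples into overlapping frames, the unit an acoustic model scores."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def tone(t):
    """Hypothetical stand-in for a microphone signal: a 440 Hz sine at half amplitude."""
    return 0.5 * math.sin(2 * math.pi * 440 * t)

pcm = digitize(tone, duration_s=0.1)              # 0.1 s of audio
frames = frame(pcm, frame_len=200, hop=80)        # 25 ms frames every 10 ms at 8 kHz
```

A real recognizer would compute features from each frame before the acoustic model assigns phoneme scores; that stage is omitted here.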

How it works: Phonemes are analysed in the context of the phonemes around them. This is done according to a statistical model to identify the word most likely to have been spoken.
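One common way to use neighbouring phonemes when decoding is the Viterbi algorithm over a hidden Markov model. The sketch below is a toy illustration under stated assumptions: the three-phoneme model for "cat", the frame labels and every probability are invented for the example and are not taken from the presentation or from any toolkit.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations and its probability."""
    # best[t][s] = (probability of the best path ending in state s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]

# Hypothetical left-to-right model for the word "cat".
states = ["k", "ae", "t"]
start_p = {"k": 1.0, "ae": 0.0, "t": 0.0}
trans_p = {
    "k":  {"k": 0.5, "ae": 0.5, "t": 0.0},
    "ae": {"k": 0.0, "ae": 0.5, "t": 0.5},
    "t":  {"k": 0.0, "ae": 0.0, "t": 1.0},
}
emit_p = {
    "k":  {"K": 0.7, "AE": 0.1, "T": 0.2},
    "ae": {"K": 0.1, "AE": 0.8, "T": 0.1},
    "t":  {"K": 0.3, "AE": 0.1, "T": 0.6},
}

# The third frame sounds like "K", but the model does not allow a return to "k" after "ae".
path, prob = viterbi(["K", "AE", "K", "T"], states, start_p, trans_p, emit_p)
# Collapse repeated states to read off the phoneme sequence.
phonemes = [p for i, p in enumerate(path) if i == 0 or p != path[i - 1]]
```

Here the ambiguous third frame is resolved by context: the transition probabilities rule out returning to "k", so the decoder settles on the sequence k, ae, t.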

Available models: Neural networks; dynamic time warping; knowledge-based speech recognition; hidden Markov models.
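Of the models listed, dynamic time warping is the simplest to show in a few lines. It compares an utterance with stored templates while tolerating differences in speaking speed. The sketch below assumes one-dimensional feature sequences and two made-up command templates ("call" and "stop"); real systems compare vectors of acoustic features per frame.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i items of a with the first j items of b
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[len(a)][len(b)]

# Hypothetical templates, one feature value per frame.
templates = {
    "call": [1, 2, 3, 2, 1],
    "stop": [3, 1, 3, 1, 3],
}

def recognize(utterance):
    """Return the command whose template is closest to the utterance."""
    return min(templates, key=lambda name: dtw_distance(templates[name], utterance))

# The same shape as "call", spoken at half speed.
slow_call = [1, 1, 2, 2, 3, 3, 2, 2, 1, 1]
command = recognize(slow_call)
```

Because warping lets one template frame match several utterance frames, the slow utterance aligns with the "call" template at zero cost.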

The toolkits we will be using: the Sphinx Project (hidden Markov model); the NICO Toolkit (artificial neural network).

Our problem domain: Evaluating the two models' performance; assessing the applicability of the models in mobile environments.
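The presentation does not say how performance will be measured. One widely used metric for comparing recognizers is word error rate (WER), the word-level edit distance between the reference transcript and the recognizer output, divided by the reference length. The sketch below is an assumption about how such a comparison could be scored, not the authors' evaluation criteria; the function name and the sample commands are invented.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two strings, divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # reference word was dropped
                            curr[j - 1] + 1,          # extra word was inserted
                            prev[j - 1] + (r != h)))  # substitution, or a match at no cost
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical voice commands: (what was said, what the recognizer returned).
results = [
    ("call home now", "call home now"),
    ("call home now", "call phone now"),
    ("open the calendar", "open calendar"),
]
rates = [word_error_rate(said, heard) for said, heard in results]
```

Word accuracy is often reported as 1 minus WER, so a figure such as 95% accuracy corresponds to roughly one word error in twenty.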

Our approach: We will be implementing and comparing two software packages; scaling the packages for mobile devices; testing them in a simulated mobile environment. If feasible, we will implement the preferred package on a mobile device.

The Sphinx Project: Carnegie Mellon University, funded by DARPA; open source (BSD-style license); latest version written in Java; based on hidden Markov models.

The NICO Toolkit: Neural Inference COmputation; developed during [dates missing]; open source (BSD); written in C; written for UNIX; focused on speech recognition; general neural network software.

Division of work: Both: designing evaluation criteria. Neil: research hidden Markov models; implement and scale Sphinx; evaluate Sphinx. Steve: research neural networks; implement and scale NICO; evaluate NICO. Both: mobile implementation.

Timeline

Risks: Failure to implement and scale the packages; lack of sufficient documentation for the packages; failure to understand how they work; falling behind schedule.

Goals: Further the research on speech recognition; determine the effectiveness of these algorithms in mobile environments; produce a working prototype that can run on mobile devices.