5526 Speech Recognition Application of Sphinx-4 Yuan Hao.

SPHINX-4
– Provides a more flexible framework for research in speech recognition
– Written entirely in the Java programming language

ZipCity
A simple application for Sphinx-4. ZipCity listens for a spoken zip code and shows the location associated with that zip code.

ZipCity
What should we do if we do not know the zip code but do know the name of the city? Modify ZipCity!

Things we should modify
ZipCity.configer.xml
– This file specifies the acoustic model and dictionary we use. Currently, it can only recognize digits.
ZipCity.gram
ZipRecognizer.java
ZipDatabase.java

ZipCity.configer.xml
Change the dictionary path.
– <property name="dictionaryPath" value="resource:/WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz/dict/cmudict.0.6d"/>
Change the filler path.
– <property name="fillerPath" value="resource:/WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz/dict/fillerdict"/>

ZipCity.configer.xml
Change the acoustic model.
– <property name="location" value="resource:/WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz"/>
– <property name="modelDefinition" value="etc/WSJ_clean_13dCep_16k_40mel_130Hz_6800Hz.4000.mdef"/>
– <property name="dataLocation" value="cd_continuous_8gau/"/>

ZipCity.gram
Adjust the grammar.
– public = ;
– = New-york | Cocoa | San-Francisco | Chicago | Houston | San-diego | Tallahassee | Titusville | Orlando | Miami | computer ;
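As a rough illustration of what a city grammar of this shape could look like, the Java sketch below generates JSGF text from a list of city tokens. The rule names `<query>` and `<city>`, the grammar name passed in, and the class name `CityGrammarSketch` are hypothetical placeholders chosen for this example; they are not taken from the slides.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: builds JSGF grammar text for a fixed list of city tokens.
// The rule names <query> and <city> are hypothetical, not from the slides.
final class CityGrammarSketch {

    // Returns a grammar whose public rule accepts exactly one city token.
    static String build(String grammarName, List<String> cities) {
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n\n");
        sb.append("grammar ").append(grammarName).append(";\n\n");
        sb.append("public <query> = <city>;\n\n");
        // Alternatives are separated by "|" and the rule ends with ";".
        sb.append("<city> = ").append(String.join(" | ", cities)).append(";\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> cities = Arrays.asList("New-york", "Cocoa", "Chicago");
        System.out.print(build("zipcity", cities));
    }
}
```

Generating the file from a list is only one option; for ten cities, editing the grammar by hand works just as well.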

ZipRecognizer.java
Add the city names listed in ZipCity.gram.
– The recognizer returns the value stored with digitMap.put(), so each city name must be added to the digitMap code.
– digitMap.put("chicago", "chicago");
– digitMap.put("san-francisco", "san-francisco");
– digitMap.put("new-york", "new-york");
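The mapping step above might be sketched in isolation as follows. This is a minimal stand-alone example, not the actual ZipRecognizer code: the class name `CityTokenMapSketch` and the method `toKey` are hypothetical, and lower-casing the recognized text before the lookup is an assumption made here so that a grammar token such as New-york matches the key "new-york".

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Sketch only: maps recognized text to the key used for the database lookup.
// Class and method names are hypothetical; only the put() calls mirror the slide.
final class CityTokenMapSketch {
    private final Map<String, String> cityMap = new HashMap<>();

    CityTokenMapSketch() {
        cityMap.put("chicago", "chicago");
        cityMap.put("san-francisco", "san-francisco");
        cityMap.put("new-york", "new-york");
    }

    // Returns the mapped city key, or null when the text is not a known city.
    // Assumption: keys are lower case, so the input is lower-cased first.
    String toKey(String recognizedText) {
        if (recognizedText == null) {
            return null;
        }
        return cityMap.get(recognizedText.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        CityTokenMapSketch map = new CityTokenMapSketch();
        System.out.println(map.toKey("New-york"));
        System.out.println(map.toKey("computer"));
    }
}
```

Returning null for an unmapped word gives the caller a simple way to tell a recognized city apart from any other word the grammar allows.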

ZipDatabase.java
Look up the city's information by city name instead of by zip code.
– zipDB.put(city, new ZipInfo(zip, city, state, latitude, longitude));
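A city-keyed lookup of this kind could be sketched as below. The classes `CityInfoSketch` and `CityDatabaseSketch` are hypothetical stand-ins for ZipInfo and ZipDatabase, whose real fields and constructors are not shown in the slides, and the zip code and coordinates in `main` are placeholder values rather than real data.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for ZipInfo; the field types are assumptions.
final class CityInfoSketch {
    final String zip;
    final String city;
    final String state;
    final float latitude;
    final float longitude;

    CityInfoSketch(String zip, String city, String state,
                   float latitude, float longitude) {
        this.zip = zip;
        this.city = city;
        this.state = state;
        this.latitude = latitude;
        this.longitude = longitude;
    }
}

// Sketch only: a database keyed by city name instead of zip code.
final class CityDatabaseSketch {
    private final Map<String, CityInfoSketch> byCity = new HashMap<>();

    // Keys are lower-cased so lookups do not depend on capitalization.
    private static String key(String city) {
        return city.trim().toLowerCase(Locale.ROOT);
    }

    void add(CityInfoSketch info) {
        byCity.put(key(info.city), info);
    }

    // Returns the stored entry, or null when the city is unknown.
    CityInfoSketch lookup(String city) {
        return city == null ? null : byCity.get(key(city));
    }

    public static void main(String[] args) {
        CityDatabaseSketch db = new CityDatabaseSketch();
        // Placeholder zip code and coordinates, not real data.
        db.add(new CityInfoSketch("00000", "Orlando", "FL", 0.0f, 0.0f));
        System.out.println(db.lookup("orlando").state);
    }
}
```

One consequence of keying by city: a city usually has several zip codes, and with a plain map a later put() for the same city replaces the earlier entry, so only one zip code per city survives.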

Done! Let’s see what happens!

Supplement
– The grammar should be complete, and so should the dictionary. Currently, the grammar contains only 10 city names.
– The application can also recognize a word that is not a city name (for example, computer, which is also listed in ZipCity.gram).
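One way to check that the grammar and the dictionary agree is to list the grammar tokens that have no dictionary entry. The sketch below assumes a CMU-dictionary-style layout (one entry per line, the word first, then its phones, with comment lines starting with ";;;") and compares words case-insensitively; the class name `DictionaryCoverageSketch` and the sample entries in `main` are illustrative only and are not taken from cmudict.0.6d.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Sketch only: reports grammar tokens that have no dictionary entry.
final class DictionaryCoverageSketch {

    // Collects the first whitespace-separated field of every entry line.
    // Assumption: blank lines and lines starting with ";;;" carry no entry.
    static Set<String> dictionaryWords(List<String> lines) {
        Set<String> words = new HashSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith(";;;")) {
                continue;
            }
            words.add(trimmed.split("\\s+")[0].toLowerCase(Locale.ROOT));
        }
        return words;
    }

    // Returns the grammar tokens missing from the dictionary, in input order.
    static List<String> missing(List<String> grammarTokens, Set<String> words) {
        List<String> result = new ArrayList<>();
        for (String token : grammarTokens) {
            if (!words.contains(token.toLowerCase(Locale.ROOT))) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative entries only; the real file should be read from disk.
        List<String> dict = Arrays.asList(
                "chicago  SH AH K AA G OW",
                "miami  M AY AE M IY");
        List<String> tokens = Arrays.asList("Chicago", "Miami", "New-york");
        System.out.println(missing(tokens, dictionaryWords(dict)));
    }
}
```

Hyphenated tokens such as New-york are the ones most likely to show up as missing, since they are written as a single word in the grammar.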