USA AREA CODES APPLICATION
by Koffi Eddy Ihou
May 6, 2011
Florida Institute of Technology
Problem Statement
The zip code algorithm is fairly straightforward: the program maps each city to its corresponding zip code [13]. This is done with the Java HashMap class, which associates keys with values.
Unfortunately, each key (zip city) can refer to only one value. Therefore one city corresponds to one zip code and one zip code only.
The new application reverses this situation: the area code application maps a single area code to all the corresponding cities.

Why is the New Approach Better?
Area code application as an extension of the zip code application
In the new algorithm, each key can point to many cities (values). In other words, a key can have more than one value.
To fix this limitation of the zip code application, the single-value data are replaced by arrays that contain the corresponding cities.
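The one-key-to-many-values idea can be sketched in Java as follows. This is a minimal illustration, not the application's actual code: the class name `AreaCodeIndex` and its methods are hypothetical, and a `List` stands in for the arrays mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class: the original application's classes are not shown in the slides.
class AreaCodeIndex {
    // Each key (area code) maps to a list of cities instead of a single value.
    private final Map<String, List<String>> citiesByAreaCode = new HashMap<>();

    // Adds a city under an area code, creating the list on first use.
    void add(String areaCode, String city) {
        citiesByAreaCode.computeIfAbsent(areaCode, k -> new ArrayList<>()).add(city);
    }

    // Returns every city stored for the area code, or an empty list if none.
    List<String> lookup(String areaCode) {
        return citiesByAreaCode.getOrDefault(areaCode, Collections.emptyList());
    }

    public static void main(String[] args) {
        AreaCodeIndex index = new AreaCodeIndex();
        index.add("321", "Melbourne");
        index.add("321", "Apopka");
        System.out.println(index.lookup("321"));
    }
}
```

A plain `HashMap<String, String>` would overwrite "Melbourne" when "Apopka" is added under the same key; storing a list per key avoids that.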
II-The Area Codes application
II-1-The Database
The database is a text file that lists all cities in the USA with their associated area codes and GPS coordinates (longitude and latitude). The GPS coordinates are converted internally from degrees/minutes/seconds format to decimal degrees.
A grammar file defines the digits the recognizer accepts. The area code application is a 3-digit recognizer, so the grammar file associates each digit with its corresponding word. The words range from “zero” to “nine”, which means every possible 3-digit number can be formed from these words.
For the record, no area code in the USA starts with zero (some zip codes, however, do).
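The degrees/minutes/seconds conversion mentioned above follows the standard formula decimal = degrees + minutes/60 + seconds/3600. A minimal sketch is given here; the class and method names are hypothetical, and the sign convention (negative for south and west) is an assumption, since the slides do not say how hemispheres are stored in the text file.

```java
// Hypothetical helper: not taken from the original application.
class DmsConverter {
    // Converts degrees/minutes/seconds to decimal degrees.
    // Assumption: 'S' and 'W' hemispheres are represented as negative values.
    static double toDecimalDegrees(int degrees, int minutes, double seconds, char hemisphere) {
        double value = degrees + minutes / 60.0 + seconds / 3600.0;
        return (hemisphere == 'S' || hemisphere == 'W') ? -value : value;
    }

    public static void main(String[] args) {
        // 28 degrees 30 minutes 0 seconds north is 28.5 decimal degrees.
        System.out.println(toDecimalDegrees(28, 30, 0.0, 'N'));
    }
}
```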
II-2-The Recognizer
II-2-1-General Information
The recognizer used is Sphinx-4, a powerful speech recognition system written entirely in Java.
The recognizer was designed in collaboration between the Sphinx group at Carnegie Mellon University, Sun Microsystems Laboratories, Mitsubishi Electric Research Labs, and Hewlett Packard, with contributions from MIT and the University of California at Santa Cruz.
In terms of performance, Sphinx-4 is capable of performing many different types of recognition tasks.
II-2-2-Performance
Sphinx-4 can perform many different recognition tasks. Its flexibility and capabilities allow it to recognize both discrete and continuous speech.
Sphinx-4 includes pluggable implementations of pre-emphasis, Hamming window, FFT, Mel frequency filter bank, discrete cosine transform, cepstral mean normalization, and feature extraction of cepstra, delta cepstra, and double delta cepstra features.
It also includes an acoustic model architecture, language model support for ASCII and binary versions of unigram, bigram, and trigram models, and a generalized pluggable front end [1].
Finally, Sphinx-4 provides pluggable search management, with pluggable support for breadth-first and word-pruning searches.
II-2-2-Performance
Table columns: Test | S3.3 WER | S4 WER | S3.3 RT | S4 RT(1) | S4 RT(2) | Vocabulary size | Language model
Table rows (numeric results not recoverable):
- TI: isolated digits recognition
- TIDIGITS: continuous digits
- AN4: trigram
- RM1: 1,000 words, trigram
- WSJ5K: 5,000 words, trigram
- HUB4: ~64,000 words, trigram
II-2-2-Performance
- WER: word error rate (%); lower is better
- RT: real-time ratio, the ratio of processing time to audio time; lower is better
- S3.3 RT: results for a single- or dual-CPU configuration
- S4 RT(1): results for a single-CPU configuration
- S4 RT(2): results for a dual-CPU configuration
- Small vocabulary (AN4): extends the vocabulary to around 100 words, with input such as spoken words and words spelled out letter by letter
- Medium vocabulary (RM1): extends the vocabulary to approximately 1,000 words
- Medium vocabulary (WSJ5K): extends the vocabulary to approximately 5,000 words
- Large vocabulary (HUB4): extends the vocabulary to approximately 64,000 words
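To make the two metrics concrete, the sketch below computes WER in the usual way, as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and the real-time ratio as processing time over audio time. The class and method names are hypothetical, and this is not how the Sphinx-4 test suite itself is implemented.

```java
// Hypothetical helper for illustration only.
class RecognitionMetrics {
    // Word error rate in percent: word-level edit distance / reference length * 100.
    static double wordErrorRate(String[] reference, String[] hypothesis) {
        if (reference.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }
        int[][] d = new int[reference.length + 1][hypothesis.length + 1];
        for (int i = 0; i <= reference.length; i++) d[i][0] = i;   // deletions only
        for (int j = 0; j <= hypothesis.length; j++) d[0][j] = j;  // insertions only
        for (int i = 1; i <= reference.length; i++) {
            for (int j = 1; j <= hypothesis.length; j++) {
                int cost = reference[i - 1].equals(hypothesis[j - 1]) ? 0 : 1;
                int substitution = d[i - 1][j - 1] + cost;
                int deletion = d[i - 1][j] + 1;
                int insertion = d[i][j - 1] + 1;
                d[i][j] = Math.min(substitution, Math.min(deletion, insertion));
            }
        }
        return 100.0 * d[reference.length][hypothesis.length] / reference.length;
    }

    // Real-time ratio: processing time divided by audio duration.
    static double realTimeRatio(double processingSeconds, double audioSeconds) {
        return processingSeconds / audioSeconds;
    }

    public static void main(String[] args) {
        String[] spoken = {"three", "two", "one"};
        String[] recognized = {"three", "two", "nine"};
        System.out.println(wordErrorRate(spoken, recognized)); // one error in three words
        System.out.println(realTimeRatio(2.0, 4.0));
    }
}
```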
II-2-3-HMM-Based speech recognition system
Figure 1 [1] shows how the words are processed from the entry node to the exit node.
II-2-3-HMM-Based speech recognition system
Figure 2 [1] shows the full process that leads to the highest score.
II-3-The Mapping system
The mapping function takes the 3-digit number obtained from the recognizer (the hypothesis with the highest score) and searches the area code database for all the cities and areas that share that area code.
For example, if the words “three two one” are spoken into the microphone, the recognizer, if the scoring process is successful, returns the number 321. This number is then used to load from the database all the cities that have 321 as their area code.
The GPS coordinates (longitude and latitude) then allow those cities to be plotted on a map of the USA.
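The step from recognized words to a 3-digit number can be sketched as below. This assumes the recognizer's best hypothesis is available as a plain string of digit words separated by spaces; the class `SpokenDigits` and its method are hypothetical, and the Sphinx-4 result API itself is not shown here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: converts digit words to an area code string.
class SpokenDigits {
    // The index of each word in this list is the digit it represents.
    private static final List<String> WORDS = Arrays.asList(
            "zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine");

    // Converts an utterance such as "three two one" to "321".
    // Throws IllegalArgumentException if it is not exactly three digit words.
    static String toAreaCode(String utterance) {
        String[] tokens = utterance.trim().toLowerCase().split("\\s+");
        if (tokens.length != 3) {
            throw new IllegalArgumentException("expected exactly three digit words");
        }
        StringBuilder code = new StringBuilder();
        for (String token : tokens) {
            int digit = WORDS.indexOf(token);
            if (digit < 0) {
                throw new IllegalArgumentException("not a digit word: " + token);
            }
            code.append(digit);
        }
        return code.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAreaCode("three two one")); // prints 321
    }
}
```

The resulting string can then serve as the key for the area code lookup in the database.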
III-Results
The area code application simulation is a success: it was able to provide all the cities for a given area code. For example, 321 matches Melbourne, Altamonte Springs, Apopka, and Casselberry, all of which are located in the state of Florida.
However, we observed that the area code returned by the recognizer sometimes differs from the one the speaker intended. It is therefore important to recognize the difficulty and complexity of the search [2]. For small vocabularies, it is often possible to perform an optimal search; for large vocabularies, however, pruning is necessary. Pruning may introduce search errors that affect recognition accuracy [2].
IV-Conclusion and Future Work
The area code application is more flexible than the zip code algorithm: we were able to visualize on the map all the cities associated with a given USA area code. The zip code algorithm should not be seen as bad; rather, the new application can be considered an extension of the old one.
Future work could combine the zip city and area code applications into software that could be used by tourists visiting the USA.
Overall, the performance of speech recognition components has improved significantly. Within only ten years, we have moved from systems able to recognize isolated words uttered by a single speaker using a limited lexicon of around 50 words to systems able to recognize continuous speech with an unlimited vocabulary uttered by any speaker, or to systems able to carry on a spontaneous dialog over the telephone with a vocabulary of a few thousand words (e.g., information on train or airplane schedules) [3].
References
[1]
[2] Kai-Fu Lee, Automatic Speech Recognition: The Development of the SPHINX System, foreword by Raj Reddy, Kluwer Academic Publishers.
[3] Wolfgang Minker, Samir Bennacef, Speech and Human-Machine Dialog, Kluwer Academic Publishers.
Thank You
Questions …