1
Speech Recognition Application
Voice Enabled Phone Directory - Yousef Rabah
2
Process of Speech Recognition
Speaker dependent vs. speaker independent
Vocabulary
Isolated vs. continuous
Frequency changes
Pronunciation
Speech processing
HMM: probabilities, parameters, training
Phonemes to words
3
Problem
Automated, speech-driven phone directory assistance that needs no human operator.
4
Automatic Speech Recognition - Sphinx
Acoustic modeling
Language model
Unigrams: <s> & </s>
Bigrams: P(word2 | word1)
Trigrams: P(word3 | word1, word2)
Lexicon structure
ZERO Z IH R OW
ONE W AH N
TWO T UW
<sil>
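To make the bigram term concrete, here is a minimal Python sketch that estimates P(word2 | word1) by plain relative frequency, padding each sentence with the <s> and </s> markers. The toy corpus and the function name `bigram_probabilities` are illustrative assumptions, not part of the Sphinx toolchain, and a production language model would normally add smoothing and backoff.

```python
from collections import Counter


def bigram_probabilities(sentences):
    """Estimate P(word2 | word1) by relative frequency (no smoothing)."""
    history_counts = Counter()  # how often each word appears as word1
    bigram_counts = Counter()   # how often each (word1, word2) pair appears
    for words in sentences:
        # Pad with sentence-start and sentence-end markers.
        tokens = ["<s>"] + list(words) + ["</s>"]
        for w1, w2 in zip(tokens, tokens[1:]):
            history_counts[w1] += 1
            bigram_counts[(w1, w2)] += 1
    return {
        pair: count / history_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


# Hypothetical toy corpus built from the lexicon words on this slide.
corpus = [["ONE", "TWO"], ["ONE", "ZERO"], ["TWO", "ZERO"]]
probs = bigram_probabilities(corpus)
print(probs[("ONE", "TWO")])
```

The trigram case follows the same pattern with a two-word history (word1, word2) as the dictionary key.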
5
Input / Output
FWDVIT: H E L L (null)
24003 samples in file /usr/local/share/sphinx3/model/lm/an4/hell.raw
INFO: live.c(239): live_nfeatvec: 13
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil>
INFO: live.c(239): live_nfeatvec: 12
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> A(2)
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> EIGHTH
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E L
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E L OH
Backtrace(null)
LatID SFrm EFrm AScr LScr Type
<sil> H E L L </s> (Total)
FWDVIT: H E L L (null)
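An application that wraps the decoder has to pull the recognized letters out of this log. The following Python sketch shows one way to do that, assuming each log record arrives on its own line and that the final result is the last line starting with `FWDVIT:`. The helper names and the shortened sample log are hypothetical; only the line formats are taken from the output above.

```python
import re

# Matches e.g. "FWDVIT: H E L L (null)" and captures the hypothesis text.
FWDVIT_PATTERN = re.compile(r"^FWDVIT:\s*(.*?)\s*\(([^)]*)\)\s*$")
PARTIAL_MARKER = "PARTIAL HYP:"


def final_hypothesis(log_text):
    """Return the tokens of the last FWDVIT line, or None if there is none."""
    result = None
    for line in log_text.splitlines():
        match = FWDVIT_PATTERN.match(line.strip())
        if match:
            result = match.group(1).split()
    return result


def partial_hypotheses(log_text):
    """Return each partial hypothesis as a token list, dropping <sil>."""
    partials = []
    for line in log_text.splitlines():
        if PARTIAL_MARKER in line:
            words = line.split(PARTIAL_MARKER, 1)[1].split()
            partials.append([w for w in words if w != "<sil>"])
    return partials


# Shortened sample in the format shown on the slide.
sample_log = """\
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil>
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E L OH
FWDVIT: H E L L (null)
"""

print(final_hypothesis(sample_log))
print(partial_hypotheses(sample_log)[-1])
```

Note that the last partial hypothesis (H E L OH) differs from the final result (H E L L), so a caller should rely on the FWDVIT line rather than on the partials.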
6
Difficulties
Hardware issues
ASR software issues
Letter phonemes: the "e-set" (B, C, D, E, G, P, T, V, Z sound alike)
Time
7
Solution
Database (PostgreSQL)
Names
Numbers
Phone number
Fast access
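The name-to-number lookup could look roughly like the sketch below. It deliberately uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a server. The table name `directory`, its columns, and the sample entries are all hypothetical. The index on `name` is one plausible reading of "fast access".

```python
import sqlite3


def build_directory(entries):
    """Create an in-memory directory table (SQLite stand-in for PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE directory (name TEXT NOT NULL, phone TEXT NOT NULL)"
    )
    # Index the lookup column so name searches avoid a full table scan.
    conn.execute("CREATE INDEX idx_directory_name ON directory (name)")
    conn.executemany(
        "INSERT INTO directory (name, phone) VALUES (?, ?)", entries
    )
    conn.commit()
    return conn


def lookup_phone(conn, name):
    """Return all phone numbers stored for a name (names kept in upper case)."""
    rows = conn.execute(
        "SELECT phone FROM directory WHERE name = ? ORDER BY phone",
        (name.upper(),),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical sample data.
conn = build_directory([("SAM", "555-0100"), ("SARA", "555-0101")])
print(lookup_phone(conn, "sam"))
```

The parameterized query (`?` placeholder) keeps recognizer output from being interpolated directly into SQL; a PostgreSQL driver would use its own placeholder style.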
8
Solution
Architecture of application
User interaction
Connects to database
Communicates with Sphinx
Uses C, Perl, shell scripts
Example (general idea):
…
PC: Say the letters of the first name; press the space bar before and after you speak:
User: S AA EM
PC: Did you say SAM?
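The confirmation step in this dialogue can be sketched in a few lines of Python. This is only an illustration of the idea, assuming the recognizer has already returned one letter per token. The function names are hypothetical, and the original application is described as using C, Perl, and shell scripts rather than Python.

```python
def confirmation_prompt(letters):
    """Join recognized letters into a name and build the confirmation question."""
    name = "".join(letters).upper()
    return name, f"Did you say {name}?"


def is_confirmed(reply):
    """Treat 'y' or 'yes' (any case, surrounding spaces ignored) as confirmation."""
    return reply.strip().lower() in {"y", "yes"}


def resolve_name(letters, reply):
    """Return the spelled name if the user confirms it, otherwise None."""
    name, _prompt = confirmation_prompt(letters)
    return name if is_confirmed(reply) else None


# Hypothetical recognizer output for the dialogue on this slide.
name, prompt = confirmation_prompt(["S", "A", "M"])
print(prompt)
print(resolve_name(["S", "A", "M"], "yes"))
```

When the user rejects the guess, `resolve_name` returns None, and the caller can ask the user to spell the name again.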
9
Solution
10
Check List
Reading
ASR system
Database: PSQL
Applications in C, Perl, PHP, VXML, shell
11
Timeline