1
Speech Recognition Application
Voice Enabled Phone Directory - Yousef Rabah رباح يوسف -
2
Why Speech Enabled Phone Directory
Growing Technology
Easy Access
Mainly used for:
  Educational purposes
  People with certain disabilities
  Mobile use
3
Problem
Automatic, speech-interactive phone directory assistance
4
Automatic Speech Recognition - Sphinx
Speaker Dependent vs. Independent
Acoustic modeling
Isolated vs. Continuous
HMM: Probabilities, Parameters, Training
Language Model
  Unigrams: <s> & </s>
  Bigrams: P(word2 | word1)
Phonemes
Lexicon Structure
  ZERO Z IH R OW
  TWO T UW
  H A HEIGH H
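The bigram entry above, P(word2 | word1), can be illustrated with a small maximum-likelihood estimate. This is a minimal sketch and not how Sphinx builds its language model: the toy sentences, the helper name `bigram_probs`, and the lack of smoothing or backoff are all assumptions made for illustration. Each sentence is wrapped in the <s> and </s> markers listed on the slide.

```python
from collections import Counter


def bigram_probs(sentences):
    """Maximum-likelihood P(word2 | word1) from tokenized sentences.

    Hypothetical helper for illustration only; no smoothing or backoff.
    """
    history_counts = Counter()
    pair_counts = Counter()
    for words in sentences:
        # Add the sentence-start and sentence-end markers.
        tokens = ["<s>"] + list(words) + ["</s>"]
        for w1, w2 in zip(tokens, tokens[1:]):
            history_counts[w1] += 1
            pair_counts[(w1, w2)] += 1
    # P(w2 | w1) = count(w1, w2) / count(w1 used as a history)
    return {pair: n / history_counts[pair[0]] for pair, n in pair_counts.items()}


# Toy corpus: names spelled letter by letter (made-up data).
corpus = [["S", "A", "M"], ["S", "U", "E"], ["A", "M", "Y"]]
probs = bigram_probs(corpus)
print(probs[("<s>", "S")])  # two of the three sentences start with S
print(probs[("S", "A")])    # S is followed by A in one of its two uses
```

In a letter-spelling task such as this one, the bigram table captures which letters tend to follow which, which is what lets the decoder prefer plausible spellings.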
5
Input / Output
FWDVIT: H E L L (null)
24003 samples in file /usr/local/share/sphinx3/model/lm/an4/hell.raw
INFO: live.c(239): live_nfeatvec: 13
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil>
INFO: live.c(239): live_nfeatvec: 12
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> A(2)
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> EIGHTH
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E L
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E L OH
Backtrace (null)
LatID SFrm EFrm AScr LScr Type
<sil> H E L L </s>
(Total)
FWDVIT: H E L L (null)
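A wrapper script has to pull the recognized letters out of decoder output like the log above. The following is a minimal sketch of one way to do that, assuming only the two line shapes visible on this slide ("PARTIAL HYP: ..." and "FWDVIT: ... (null)"). The function name `parse_decoder_log` and the regular expressions are hypothetical and are not taken from the presented application.

```python
import re

# Patterns inferred from the log lines shown on the slide (an assumption).
PARTIAL_RE = re.compile(r"PARTIAL HYP:\s*(.*)")
FINAL_RE = re.compile(r"^FWDVIT:\s*(.*?)\s*\(([^)]*)\)\s*$")


def parse_decoder_log(text):
    """Return (partial hypotheses, final hypothesis) as lists of tokens."""
    partials = []
    final = None
    for line in text.splitlines():
        line = line.strip()
        match = FINAL_RE.match(line)
        if match:
            # The last FWDVIT line wins if the log repeats it.
            final = match.group(1).split()
            continue
        match = PARTIAL_RE.search(line)
        if match:
            partials.append(match.group(1).split())
    return partials, final


sample = """INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E L
FWDVIT: H E L L (null)"""

partials, final = parse_decoder_log(sample)
print(final)  # ['H', 'E', 'L', 'L']
```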
6
Difficulties
Hardware issues
ASR software issues
Letter phonemes
Time
7
Solution
4-Stage Process:
8
Solution
Database (PostgreSQL)
  Names
  Phone numbers
  Fast access
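The directory storage described above can be sketched as a single table of names and phone numbers with an index on the search column. This is only an illustration: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so the example is self-contained, and the table name, column names, helper functions, and phone numbers are all hypothetical.

```python
import sqlite3


def open_directory():
    """Create an in-memory directory (SQLite stand-in for PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people ("
        " id INTEGER PRIMARY KEY,"
        " first_name TEXT NOT NULL,"
        " last_name TEXT NOT NULL,"
        " phone TEXT NOT NULL)"
    )
    # An index on the lookup column keeps name searches fast.
    conn.execute("CREATE INDEX idx_people_first ON people (first_name)")
    return conn


def add_person(conn, first, last, phone):
    """Store names in upper case to match the decoder's letter output."""
    conn.execute(
        "INSERT INTO people (first_name, last_name, phone) VALUES (?, ?, ?)",
        (first.upper(), last.upper(), phone),
    )


def find_by_first_name(conn, first):
    """Return (first_name, last_name, phone) rows for an exact match."""
    cur = conn.execute(
        "SELECT first_name, last_name, phone FROM people"
        " WHERE first_name = ? ORDER BY last_name",
        (first.upper(),),
    )
    return cur.fetchall()


conn = open_directory()
add_person(conn, "Sam", "Smith", "555-0100")  # made-up entries
add_person(conn, "Sue", "Jones", "555-0101")
print(find_by_first_name(conn, "sam"))  # [('SAM', 'SMITH', '555-0100')]
```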
9
Solution
Architecture of application
Example:
  db.pm
  people.pm
  people.pl
  record.pl
  wav_to_raw.pl
  get_speech.pl
  display_speech.pm
  display_speech.pl
  VEPD.pm
  VEPD.pl
Example:
…
PC: press space bar before and after you speak:
User: S AH EM
PC: Decoded as, SAM ?
Results | 1
1. SAM |SMITH |
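In the example dialogue above, the spoken letters are turned into the name SAM before the directory is searched. A minimal sketch of that step follows. It assumes the decoder returns space-separated tokens, that fillers appear in angle brackets (such as <sil>), and that alternate pronunciations carry a numeric suffix (such as A(2)), as seen in the decoder log earlier in the deck. The function name `letters_to_name` and the rule of skipping any token longer than one letter are assumptions, not the presented application's logic.

```python
import re

# Alternate-pronunciation marker, e.g. the "(2)" in "A(2)".
ALT_SUFFIX = re.compile(r"\(\d+\)$")


def letters_to_name(hypothesis):
    """Join decoded single-letter tokens into one upper-case name."""
    letters = []
    for token in hypothesis.split():
        if token.startswith("<") and token.endswith(">"):
            continue  # filler such as <sil>, <s>, </s>
        token = ALT_SUFFIX.sub("", token)
        # Assumption: anything that is not a single letter is ignored.
        if len(token) == 1 and token.isalpha():
            letters.append(token.upper())
    return "".join(letters)


print(letters_to_name("<sil> S A(2) M </s>"))  # SAM
```

The resulting string can then be shown to the user for confirmation ("Decoded as, SAM ?") before it is used as the search key.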
10
Solution
11
Results
A first step towards a hands-free, speech-enabled phone directory
Speaker Independent
Application's Features:
  Adding user
  Retrieving user (via speech)
  Manual search
  Viewing current phone directory
12
Possible Future Enhancements
ASR enabled for:
  Adding users
  Phone # search
Word Recognition (instead of letters)
More accurate ASR (as the technology grows)
Graphical interface (via Perl/Tk)
Communication through VoiceXML
13
Special Thanks
To friends and family
Jim Rogers
Hassan Halta
Skylar Thompson
Kushboo Goel
Rabah family
El-Shabab el-taybeh
14
Questions/Comments