Speech Recognition Application


1 Speech Recognition Application
Voice Enabled Phone Directory - Yousef Rabah رباح يوسف -

2 Why Speech Enabled Phone Directory
Growing technology
Easy access
Mainly used for:
  Educational purposes
  People with certain disabilities
  Mobile use

3 Problem
Automatic, speech-interactive phone directory assistance

4 Automatic Speech Recognition - Sphinx
Speaker dependent vs. independent
Acoustic modeling
  Isolated vs. continuous
  HMM: probabilities, parameters, training
Language model
  Unigrams: <s> & </s>
  Bigrams: P(word2 | word1)
Phonemes
Lexicon structure
  ZERO  Z IH R OW
  TWO   T UW
  H A HEIGH H
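The bigram term P(word2 | word1) can be illustrated with a maximum-likelihood estimate over a toy corpus. This is only a minimal sketch, not the Sphinx language-model format or its training tools; the function name `bigram_probs` and the spelled-name corpus are assumptions made up for illustration.

```python
from collections import Counter


def bigram_probs(sentences):
    """Maximum-likelihood estimates of P(w2 | w1), with <s> and </s> markers."""
    history_counts = Counter()
    bigram_counts = Counter()
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]
        # Every token except the final </s> serves as a history (w1).
        history_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))
    return {
        (w1, w2): count / history_counts[w1]
        for (w1, w2), count in bigram_counts.items()
    }


# Hypothetical corpus: names spelled letter by letter.
corpus = [["S", "A", "M"], ["S", "U", "E"], ["A", "M", "Y"]]
probs = bigram_probs(corpus)
```

Here `probs[("<s>", "S")]` is the estimated chance that an utterance starts with the letter S, and `probs[("M", "</s>")]` the chance that it ends after M. A real model would also smooth these counts so unseen letter pairs do not get probability zero.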

5 Input / Output
FWDVIT: H E L L (null)
24003 samples in file /usr/local/share/sphinx3/model/lm/an4/hell.raw
INFO: live.c(239): live_nfeatvec: 13
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil>
INFO: live.c(239): live_nfeatvec: 12
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> A(2)
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> EIGHTH
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E L
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E L OH
Backtrace (null)
LatID SFrm EFrm AScr LScr Type
<sil> H E L L </s> (Total)
FWDVIT: H E L L (null)
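The decoder output above could be post-processed along these lines to pull out the partial and final hypotheses. This is a hedged sketch: it assumes each message sits on its own line, matches only the `PARTIAL HYP:` and `FWDVIT:` prefixes visible in the log, and discards the trailing parenthesized field. The function name `parse_decoder_log` is hypothetical and is not part of Sphinx.

```python
import re


def parse_decoder_log(log_text):
    """Return (partial_hypotheses, final_hypothesis) as lists of tokens."""
    partials = []
    final = None
    for line in log_text.splitlines():
        match = re.search(r"PARTIAL HYP:\s*(.*)$", line)
        if match:
            partials.append(match.group(1).split())
            continue
        # Final result line, e.g. "FWDVIT: H E L L (null)".
        match = re.match(r"\s*FWDVIT:\s*(.*?)\s*\([^)]*\)\s*$", line)
        if match:
            final = match.group(1).split()
    return partials, final


# Lines copied from the log shown on the slide.
sample = """\
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E
INFO: main_live_pretend.c(92): PARTIAL HYP: <sil> H E L
FWDVIT: H E L L (null)
"""
partials, final = parse_decoder_log(sample)
```

The partial hypotheses show how the decoder revises its guess as more frames arrive; only the `FWDVIT` line is treated as the answer.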

6 Difficulties
Hardware issues
ASR software issues
Letter phonemes
Time

7 Solution
4-stage process:

8 Solution
Database (PostgreSQL)
  Names
  Phone numbers
  Fast access
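A names-and-numbers table with an index on the name column is one way to get the fast access mentioned above. The sketch below uses Python's built-in `sqlite3` as a stand-in so that it runs without a server; the slides use PostgreSQL, and the table layout, column names, and sample rows here are all assumptions, not the application's actual schema.

```python
import sqlite3


def build_directory(rows):
    """Create an in-memory directory table (assumed schema) and load rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people ("
        " id INTEGER PRIMARY KEY,"
        " first_name TEXT NOT NULL,"
        " last_name TEXT NOT NULL,"
        " phone TEXT NOT NULL)"
    )
    # Index the column used for lookups so searches avoid a full table scan.
    conn.execute("CREATE INDEX people_first_name ON people (first_name)")
    conn.executemany(
        "INSERT INTO people (first_name, last_name, phone) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def find_by_first_name(conn, name):
    """Return (first_name, last_name, phone) rows matching an upper-cased name."""
    cursor = conn.execute(
        "SELECT first_name, last_name, phone FROM people"
        " WHERE first_name = ? ORDER BY last_name",
        (name.upper(),),
    )
    return cursor.fetchall()


# Placeholder entries with made-up phone numbers.
conn = build_directory([("SAM", "SMITH", "555-0100"), ("SUE", "JONES", "555-0101")])
matches = find_by_first_name(conn, "sam")
```

The parameterized `?` placeholders keep the decoded name out of the SQL string itself, which matters when the query text comes from a recognizer rather than a trusted source.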

9 Solution
Architecture of application
db.pm
people.pm
people.pl
record.pl
wav_to_raw.pl
get_speech.pl
display_speech.pm
display_speech.pl
VEPD.pm
VEPD.pl
Example:
PC: Press space bar before and after you speak:
User: S AH EM
PC: Decoded as, SAM ?
Results | 1
1. SAM |SMITH |
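The step from a decoded letter sequence to the results listing might look roughly like the following. This is an illustrative sketch in Python, not the application's Perl modules: the helper names, the in-memory directory, and the second entry are hypothetical. It assumes each decoded token is a single letter, possibly carrying an alternate-pronunciation suffix such as `A(2)`.

```python
import re


def letters_to_name(tokens):
    """Join decoded letter tokens into a name, skipping markers like <sil>."""
    letters = []
    for token in tokens:
        if token.startswith("<"):
            continue  # <sil>, <s>, </s> carry no letter
        # Strip an alternate-pronunciation suffix: "A(2)" becomes "A".
        letters.append(re.sub(r"\(\d+\)$", "", token))
    return "".join(letters)


def format_results(name, directory):
    """Render matches in the layout of the example dialog."""
    hits = [(first, last) for first, last in directory if first == name]
    lines = [f"Results | {len(hits)}"]
    for number, (first, last) in enumerate(hits, start=1):
        lines.append(f"{number}. {first} |{last} |")
    return "\n".join(lines)


# Hypothetical directory contents.
directory = [("SAM", "SMITH"), ("SUE", "JONES")]
name = letters_to_name(["<sil>", "S", "A(2)", "M"])
report = format_results(name, directory)
```

Printing `report` for this input yields a one-match listing in the same shape as the dialog above.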

10 Solution

11 Results
A first step towards a hands-free, speech-enabled phone directory
Speaker independent
Application's features:
  Adding user
  Retrieving user (via speech)
  Manual search
  Viewing current phone directory

12 Possible Future Enhancements
ASR enabled for:
  Adding users
  Phone number search
Word recognition (instead of letters)
More accurate ASR (as the technology grows)
Graphical interface (via Perl/Tk)
Communication through VoiceXML

13 Special Thanks
To friends and family
Jim Rogers
Hassan Halta
Skylar Thompson
Kushboo Goel
Rabah family
El-Shabab el-taybeh

14 Questions/Comments

