Published by Amelia Osborne. Modified over 9 years ago.
1 7-Speech Recognition (Cont’d)
HMM Calculating Approaches
Neural Components
Three Basic HMM Problems
Viterbi Algorithm
State Duration Modeling
Training In HMM
2 Speech Recognition Concepts
[Diagram labels: Text, NLP, Phone Sequence, Speech Processing, Speech; Speech Understanding, Speech Synthesis, Speech Recognition]
Speech recognition is the inverse of speech synthesis.
3 Speech Recognition Approaches
Bottom-Up Approach
Top-Down Approach
Blackboard Approach
4 Bottom-Up Approach
[Diagram labels: Signal Processing, Feature Extraction, Segmentation, Voiced/Unvoiced/Silence, Knowledge Sources (Sound Classification Rules, Phonotactic Rules, Lexical Access, Language Model), Recognized Utterance]
5 Top-Down Approach
[Diagram labels: Feature Analysis, Unit Matching System, Lexical Hypothesis, Syntactic Hypothesis, Semantic Hypothesis, Utterance Verifier/Matcher, Inventory of speech recognition units, Word Dictionary, Grammar, Task Model, Recognized Utterance]
6 Blackboard Approach
[Diagram labels: Environmental Processes, Acoustic Processes, Lexical Processes, Syntactic Processes, Semantic Processes, Blackboard]
7 An overall view of a speech recognition system
[Figure labels: bottom up, top down. From Ladefoged 2001]
8 Recognition Theories
Articulatory Based Recognition
–Uses the articulatory system for recognition
–The most successful theory so far
Auditory Based Recognition
–Uses the auditory system for recognition
Hybrid Based Recognition
–A hybrid of the above theories
Motor Theory
–Models the intended gesture of the speaker
9 Recognition Problem
We have a sequence of acoustic symbols and want to find the words spoken by the speaker.
Solution: find the most probable word sequence given the acoustic symbols.
10 Recognition Problem
A : Acoustic Symbols
W : Word Sequence
We should find the word sequence W so that P(W | A) is maximized.
11 Bayes Rule
12 Bayes Rule (Cont’d)
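With A and W as defined on the Recognition Problem slide, the decision rule and its Bayes expansion are usually written as below. This is the standard textbook form, offered as a sketch rather than a copy of the slide; the symbol \hat{W} for the chosen word sequence is an assumption introduced here.

```latex
\hat{W} = \arg\max_{W} P(W \mid A)
        = \arg\max_{W} \frac{P(A \mid W)\, P(W)}{P(A)}
        = \arg\max_{W} P(A \mid W)\, P(W)
```

The last step drops P(A) because it does not depend on W. P(W) is the quantity a language model supplies, and P(A | W) is the quantity an acoustic model supplies.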
13 Simple Language Model
Computing the probability of a full word sequence directly is very difficult and requires a very large database, so we use trigram and bigram models instead.
14 Simple Language Model (Cont’d)
Trigram :
Bigram :
Monogram (unigram) :
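The three models are conventionally defined by how much history each word is conditioned on. The following is the standard formulation, given as a sketch under the assumption that the word sequence is W = w_1 ... w_n (lowercase w_i is notation introduced here).

```latex
P(W) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})                 % exact chain rule
P(W) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-2}, w_{i-1})              % trigram
P(W) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})                       % bigram
P(W) \approx \prod_{i=1}^{n} P(w_i)                                    % monogram (unigram)
```

Shorter histories need far less training data to estimate, at the cost of ignoring longer-range context.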
15 Simple Language Model (Cont’d)
Computing method : P(W3 | W1 W2) = (number of occurrences of W3 after W1 W2) / (total number of occurrences of W1 W2)
Ad hoc method :
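The counting method can be sketched in a few lines of Python. This is a minimal illustration, assuming the training text is already split into a list of tokens; the function name `trigram_probability` and the toy corpus are hypothetical, and unseen histories simply return 0.0 with no smoothing.

```python
from collections import Counter


def trigram_probability(tokens, w1, w2, w3):
    """Count-based estimate of P(w3 | w1 w2) from a list of tokens."""
    trigrams = list(zip(tokens, tokens[1:], tokens[2:]))
    trigram_counts = Counter(trigrams)
    # Count a (w1, w2) history only when some word follows it,
    # so the estimates for a given history sum to 1.
    history_counts = Counter((a, b) for a, b, _ in trigrams)
    if history_counts[(w1, w2)] == 0:
        return 0.0  # unseen history: no estimate available without smoothing
    return trigram_counts[(w1, w2, w3)] / history_counts[(w1, w2)]


# Toy corpus (hypothetical data, for illustration only).
corpus = "the cat sat on the mat the cat ran on the mat".split()
print(trigram_probability(corpus, "the", "cat", "sat"))
```

Real systems combine such counts with smoothing or back-off, because most trigrams never occur in any finite training set.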
16 7-Speech Recognition
Speech Recognition Concepts
Speech Recognition Approaches
Recognition Theories
Bayes Rule
Simple Language Model
P(A|W) Network Types
17 [Figure from Ladefoged 2001]
18 P(A|W) Computing Approaches
Dynamic Time Warping (DTW)
Hidden Markov Model (HMM)
Artificial Neural Network (ANN)
Hybrid Systems
19 Dynamic Time Warping Method (DTW)
To obtain a global distance between two speech patterns, a time alignment must be performed.
Example: a time alignment path between a template pattern “SPEECH” and a noisy input “SsPEEhH”.
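The alignment idea can be illustrated with a small dynamic-programming sketch in Python. It treats each character as a stand-in for one speech frame, which is an assumption made for illustration; real systems compare feature vectors. The names `dtw_distance` and `symbol_cost`, and the cost values 0.0 / 0.5 / 1.0, are hypothetical choices.

```python
def dtw_distance(template, observed, local_cost):
    """Global DTW distance: minimal accumulated local cost over all
    monotonic alignments of the two sequences."""
    n, m = len(template), len(observed)
    inf = float("inf")
    # d[i][j] = best accumulated cost aligning template[:i] with observed[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = local_cost(template[i - 1], observed[j - 1])
            # Allowed moves: repeat a template symbol, repeat an input
            # symbol, or advance both.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def symbol_cost(a, b):
    """Toy local distance: 0 for a match, 0.5 for a case mismatch
    (a 'noisy' version of the same sound), 1 otherwise."""
    if a == b:
        return 0.0
    if a.lower() == b.lower():
        return 0.5
    return 1.0


distance = dtw_distance("SPEECH", "SsPEEhH", symbol_cost)
print(distance)
```

Because repeated symbols may be absorbed by either sequence, the inserted "s" is matched against the template "S" instead of shifting every later symbol out of alignment.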
20 Recognition Tasks
Isolated Word Recognition (IWR) and Continuous Speech Recognition (CSR)
Speaker Dependent and Speaker Independent
Vocabulary Size
–Small: <20
–Medium: >100, <1000
–Large: >1000, <10000
–Very Large: >10000
21 Error-Producing Factors
Prosody (recognition should be prosody independent)
Noise (noise should be prevented)
Spontaneous Speech
22 Artificial Neural Network
[Figure: Simple Computation Element of a Neural Network]
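A simple computation element is commonly modeled as a weighted sum of its inputs passed through a nonlinearity. The sketch below assumes a sigmoid nonlinearity, which the slide does not specify; the function name `neuron_output` and the example values are hypothetical.

```python
import math


def neuron_output(inputs, weights, bias):
    """One computation element: weighted sum of inputs plus bias,
    squashed by a sigmoid (assumed nonlinearity)."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-activation))


# Example with made-up values: the weighted sum is 0.5 - 0.5 = 0.
print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.0))
```

Other nonlinearities, such as a hard threshold, fit the same structure; only the last line of the function changes.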
23 Artificial Neural Network (Cont’d)
Neural Network Types
–Perceptron
–Time Delay
–Time Delay Neural Network Computational Element (TDNN)
24 Artificial Neural Network (Cont’d)
[Figure: Single Layer Perceptron]
25 Artificial Neural Network (Cont’d)
[Figure: Three Layer Perceptron]
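A multi-layer perceptron feeds the outputs of one layer of units into the next. The following forward pass is a minimal sketch, assuming sigmoid units and taking "three layer" to mean three layers of weights; both are assumptions, and the names `layer_forward` and `perceptron_forward` and the all-zero weights are hypothetical placeholders.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One layer: each row of `weights` holds the input weights of one unit."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def perceptron_forward(inputs, layers):
    """Feed the input through each (weights, biases) layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Placeholder network: 2 inputs -> 3 units -> 2 units -> 1 output unit.
layers = [
    ([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]], [0.0, 0.0, 0.0]),
    ([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [0.0, 0.0]),
    ([[0.0, 0.0]], [0.0]),
]
print(perceptron_forward([1.0, -1.0], layers))
```

A single layer perceptron is the special case where `layers` holds one entry. Training, which sets the weights, is not shown here.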
26 Hybrid Methods
Hybrid Neural Network and Matched Filter for Recognition
[Diagram labels: Speech, Acoustic Features, Delays, PATTERN CLASSIFIER, Output Units]
27 Neural Network Properties
The system is simple, but highly iterative
Does not determine a specific structure
Despite its simplicity, the results are good
The training set is large, so training should be done offline
Accuracy is relatively good
28 Hidden Markov Model
Observation : O1, O2, ...
States in time : q1, q2, ...
All states : s1, s2, ...
[Diagram: transition from state Si to state Sj]
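To make the notation concrete, the sketch below computes the joint probability of one state path q1, q2, ... and one observation sequence O1, O2, ... in a small discrete HMM. The parameter names `initial`, `transition`, and `emission`, and all the numbers, are assumptions introduced for illustration; the slide defines only the symbols for observations and states.

```python
def path_probability(states, observations, initial, transition, emission):
    """P(q1..qT, O1..OT) for a discrete HMM: start in q1, emit O1,
    then alternate state transitions and emissions."""
    prob = initial[states[0]] * emission[states[0]][observations[0]]
    for t in range(1, len(states)):
        prev, cur = states[t - 1], states[t]
        prob *= transition[prev][cur] * emission[cur][observations[t]]
    return prob


# Toy two-state model over the symbols "a" and "b" (made-up numbers).
initial = {"s1": 1.0, "s2": 0.0}
transition = {
    "s1": {"s1": 0.5, "s2": 0.5},  # probability of moving from Si to Sj
    "s2": {"s1": 0.0, "s2": 1.0},
}
emission = {
    "s1": {"a": 0.5, "b": 0.5},
    "s2": {"a": 0.25, "b": 0.75},
}

p = path_probability(["s1", "s1", "s2"], ["a", "b", "b"],
                     initial, transition, emission)
print(p)
```

In recognition the state path is hidden, so this quantity is either summed over all paths or maximized over them, which is what the Viterbi algorithm listed in the outline does.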