
1 A Hybrid Model of HMM and RBFN Model of Speech Recognition 길이만, 김수연, 김성호, 원윤정, 윤아림 Department of Applied Mathematics, KAIST

2 Automatic Speech Recognition Message Encoding/Decoding

3 Hidden Markov Models The Markov Generation Model

4 Hidden Markov Models An HMM is defined by: 1. A set S of Q states, S = {s_1, ..., s_Q}, of a first-order discrete-time Markov chain 2. An initial probability distribution over the states: pi_i = P(q_1 = s_i) 3. A transition probability distribution between states: a_ij = P(q_t = s_j | q_{t-1} = s_i) 4. An emission probability distribution of the acoustic observations X within each state: b_i(x) = p(x_t | q_t = s_i)

5 Hidden Markov Models Major problems of HMMs – Training (estimating the model parameters) – Decoding (finding the most likely state sequence) Solutions: – Baum-Welch algorithm – Viterbi algorithm
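The decoding step can be illustrated with a minimal Viterbi sketch for a discrete-emission HMM. The 2-state toy parameters below are invented for illustration; this is not code from the paper.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely state sequence for a discrete-emission HMM.

    pi:  (Q,)   initial state probabilities
    A:   (Q, Q) transition probabilities, A[i][j] = P(s_j | s_i)
    B:   (Q, M) emission probabilities,  B[i][k] = P(x_k | s_i)
    obs: sequence of observation symbol indices
    """
    with np.errstate(divide="ignore"):          # allow log(0) = -inf
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    T, Q = len(obs), len(pi)
    delta = log_pi + log_B[:, obs[0]]           # best log-score per state
    psi = np.zeros((T, Q), dtype=int)           # back-pointers
    for t in range(1, T):
        scores = delta[:, None] + log_A         # (Q, Q): prev -> next
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta.argmax())]                # trace back the best path
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

# a 2-state toy model: state 0 favors symbol 0, state 1 favors symbol 1
print(viterbi([0.5, 0.5],
              [[0.8, 0.2], [0.2, 0.8]],
              [[0.9, 0.1], [0.2, 0.8]],
              [0, 0, 1, 1]))                    # -> [0, 0, 1, 1]
```

Working in log-probabilities avoids the numerical underflow that multiplying many small probabilities would cause on long observation sequences.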

6 Hidden Markov Models Advantages of standard HMMs – Provide a natural and highly reliable way of recognizing speech for a wide range of applications – Integrate well into systems incorporating both task syntax and semantics Limitations of standard HMMs – Non-discriminative training/decoding criterion – Arbitrary assumptions on the parametric form of probability distributions – High sensitivity to environmental conditions

7 Artificial Neural Networks Nice properties of ANNs * Learning capability from examples * Generalization ability * Non-parametric estimation Limitations of ANNs * Restricted to local decisions – generally used for classification of static input with no sequential processing * Not well suited for dealing with time-varying input patterns and segmentation of sequential inputs

8 Hybrid Models of HMM/ANN – ANNs that emulate HMMs – Connectionist probability estimation for continuous HMMs – Hybrids with "global optimization" – Connectionist vector quantizers for discrete HMMs – ANNs as acoustic front-ends for continuous HMMs

9 Hybrid Models of HMM/ANN: 1. Initialization – Initial segmentation of the training set – Labeling of the acoustic vectors with "0" or "1", according to the segmentation – ANN training via Back-Propagation (BP) or other algorithms 2. Iteration – New segmentation of the training set according to the Viterbi algorithm computed over the ANN outputs – Labeling of the acoustic vectors with "0" or "1" – ANN retraining by BP
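The initialization/iteration scheme above can be sketched as follows. A single-layer softmax net stands in for the BP-trained ANN, and the frames, labels, and transition matrix are synthetic, invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)   # numerical stability
    P = np.exp(Z)
    return P / P.sum(axis=1, keepdims=True)

def train_ann(X, labels, epochs=200, lr=0.5):
    """Single-layer softmax net trained by gradient descent
    (a stand-in for the BP-trained ANN in the scheme above)."""
    Q = int(labels.max()) + 1
    W = np.zeros((X.shape[1], Q))
    Y = np.eye(Q)[labels]                  # "0"/"1" target coding
    for _ in range(epochs):
        W -= lr * X.T @ (softmax(X @ W) - Y) / len(X)
    return W

def viterbi_segment(scores, log_A):
    """New segmentation: best state path over the ANN outputs."""
    T, Q = scores.shape
    delta = np.log(scores[0] + 1e-12)
    psi = np.zeros((T, Q), dtype=int)
    for t in range(1, T):
        m = delta[:, None] + log_A
        psi[t] = m.argmax(axis=0)
        delta = m.max(axis=0) + np.log(scores[t] + 1e-12)
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return np.array(path[::-1])

# toy acoustic frames: two well-separated "phoneme" segments in 1-D
X = np.concatenate([rng.normal(-2, 0.5, 50), rng.normal(2, 0.5, 50)])[:, None]
X = np.hstack([X, np.ones((100, 1))])      # bias term
labels = np.array([0] * 50 + [1] * 50)     # 1. initial segmentation
log_A = np.log(np.array([[0.95, 0.05], [0.05, 0.95]]))

for _ in range(3):                          # 2. iterate
    W = train_ann(X, labels)                #    ANN (re)training
    labels = viterbi_segment(softmax(X @ W), log_A)  # new segmentation
```

Each pass alternates the two sub-steps from the slide: the net is fit to the current "0"/"1" labels, then the Viterbi path over its outputs produces the next segmentation.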

10 Proposed HMM/RBFN Model

11 1. First Training LBG clustering – Setting centers and variances of the radial basis functions RLS algorithm – Training the output weights – Target: the "0"/"1" state labels from the segmentation
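A rough sketch of this first training stage, under two simplifying assumptions: plain k-means stands in for LBG clustering, and a single shared width sigma replaces the per-cluster variances. The data and names are invented for illustration, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf_features(X, centers, sigma):
    """Gaussian radial basis activations for each sample/center pair."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kmeans(X, k, iters=20):
    """Plain k-means, used here as a stand-in for LBG clustering."""
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        assign = ((X[:, None] - C[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (assign == j).any():
                C[j] = X[assign == j].mean(0)
    return C

def rls_fit(Phi, T, lam=1.0, delta=1e3):
    """Recursive least squares (RLS) for the RBF output weights."""
    W = np.zeros((Phi.shape[1], T.shape[1]))
    P = np.eye(Phi.shape[1]) * delta       # inverse-correlation estimate
    for phi, t in zip(Phi, T):
        Pphi = P @ phi
        gain = Pphi / (lam + phi @ Pphi)
        W += np.outer(gain, t - phi @ W)   # correct toward the target
        P = (P - np.outer(gain, Pphi)) / lam
    return W

# toy usage: two Gaussian classes, one-hot "0"/"1" targets
X = np.vstack([rng.normal(0, 0.5, (40, 2)), rng.normal(3, 0.5, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
Phi = rbf_features(X, kmeans(X, 4), sigma=1.0)
W = rls_fit(Phi, np.eye(2)[y])
accuracy = (np.argmax(Phi @ W, axis=1) == y).mean()
```

Because the RBF output is linear in the weights once the centers are fixed, RLS can fit W sample by sample without ever forming or inverting the full normal equations.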

12 2. Second Training: MCE/GPD
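Assuming the standard MCE/GPD formulation (a sigmoid loss over a misclassification measure, minimized by generalized probabilistic descent), one update step might look like the sketch below. The linear discriminants and toy numbers are hypothetical, not the paper's model:

```python
import numpy as np

def mce_gpd_step(W, phi, y, gamma=1.0, lr=0.1):
    """One GPD update of linear discriminants g_k = phi @ W[:, k] under the
    MCE loss l = sigmoid(gamma * d), with the misclassification measure
    d = g_rival - g_correct (the eta -> infinity approximation)."""
    g = phi @ W
    rival = int(np.argmax(np.where(np.arange(g.size) == y, -np.inf, g)))
    d = g[rival] - g[y]                 # > 0 means a misclassification
    l = 1.0 / (1.0 + np.exp(-gamma * d))
    coef = lr * gamma * l * (1.0 - l)   # dl/dd, scaled by the step size
    W = W.copy()
    W[:, y] += coef * phi               # raise the correct class score
    W[:, rival] -= coef * phi           # lower the best rival's score
    return W

# toy usage: one 2-D feature vector, 3 classes, true class 0
phi, y = np.array([1.0, 0.5]), 0
W = np.zeros((2, 3))
for _ in range(50):
    W = mce_gpd_step(W, phi, y)
```

Unlike the squared-error target of the first stage, this criterion is discriminative: each step directly widens the margin between the correct class and its strongest competitor.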

13 Simulation 1. Database – TIMIT1 Five-class phonemes (C, L, N, S, V) Acoustic features: 26-dimensional MFCC features – TIMIT2 Digits (0, 1, 2, ..., 9) Acoustic features: 16-dimensional ZCPA features

14 Simulation 2. Results – TIMIT1

Table 1: result of 5-class recognition (accuracy, %; HMM baseline: 86.73)
RBF nodes | RLS   | MCE-GPD
241       | 90.81 | 92.87
310       | 92.86 | 92.87

15 – TIMIT2

Table 2: result of digit recognition (accuracy, %)
Noise           SNR (dB) | HMM   | Hybrid(414) | Hybrid(522)
clean           -        | 90.0  | 95.5        | 94.5
White Gaussian  15       | 86.25 | 88.50       | 90.00
                10       | 87.00 | 87.50       | 89.75
                5        | 82.25 | 72.00       | 77.75
OP Room         15       | 85.88 | 88.50       | 89.75
                10       | 85.25 | 86.25       | 89.50
                5        | 83.75 | 73.50       | 72.75
F16             15       | 85.00 | 88.25       | 89.50
                10       | 85.62 |             |
                5        | 81.50 | 70.25       | 73.25

16 Conclusion 1. Results – Non-parametric estimates: no a priori assumptions on the form of the distributions – Better initialization than other hybrid systems – Discriminative training – Improved performance over the standard HMM 2. Further Work – Performance degradation in noisy environments – Clustering/parameter training – GPD is not stable
