SPEECH RECOGNITION
Presented to Dr. V. Kepuska
Presented by Lisa & Za
ECE 5526
How does Sphinx3 work?
Sphinx3 uses HMMs with continuous probability density functions.
Flat initialization of the model parameters:
- Mixture weights: the weights given to every Gaussian in the Gaussian mixture corresponding to a state
- Transition matrices: the matrices of state transition probabilities
- Means: the means of all Gaussians
- Variances: the variances of all Gaussians
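The four parameter sets listed above can be sketched as a flat start. This is a minimal illustration, not Sphinx3's own code. It assumes a left-to-right topology with one extra non-emitting exit state, equal mixture weights, equal probabilities on the allowed transitions, and every Gaussian set to the global mean and variance of the training features. The function name `flat_initialize` and its arguments are hypothetical.

```python
def flat_initialize(features, n_states=3, n_mix=1):
    """Flat-start parameters for one HMM (illustrative sketch only).

    features: list of feature vectors (lists of floats) pooled over all data.
    """
    n = len(features)
    dim = len(features[0])

    # Global mean and (population) variance per feature dimension.
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n for d in range(dim)]

    # Mixture weights: every Gaussian in a state gets the same weight.
    mixture_weights = [[1.0 / n_mix] * n_mix for _ in range(n_states)]

    # Transition matrix: each emitting state either stays or moves on,
    # with equal probability. The last column is the non-emitting exit.
    transitions = []
    for s in range(n_states):
        row = [0.0] * (n_states + 1)
        row[s] = 0.5
        row[s + 1] = 0.5
        transitions.append(row)

    # Means and variances: every Gaussian starts from the global values.
    means = [[list(mean) for _ in range(n_mix)] for _ in range(n_states)]
    variances = [[list(var) for _ in range(n_mix)] for _ in range(n_states)]

    return {
        "mixture_weights": mixture_weights,
        "transitions": transitions,
        "means": means,
        "variances": variances,
    }


model = flat_initialize([[0.0, 2.0], [2.0, 4.0]], n_states=3, n_mix=2)
```

Because every state starts from identical Gaussians, the states only become distinct once re-estimation begins.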
How does Sphinx3 work?
Forward-backward re-estimation algorithm (Baum-Welch algorithm)
- Used to re-estimate the model parameters iteratively until the training likelihood converges
Untied modeling
- Training for all context-dependent phones (usually triphones) that are seen in the training corpus
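One forward-backward pass can be sketched for a toy model. The sketch below assumes one-dimensional observations, a single Gaussian per state, no probability scaling (so it only suits short sequences), and it re-estimates only the means. A full Baum-Welch pass also updates the mixture weights, transition matrices, and variances. All names here are hypothetical.

```python
import math


def gauss(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def forward_backward(obs, init, trans, means, variances):
    """Return (likelihood, gamma); gamma[t][s] is the state occupancy."""
    T, S = len(obs), len(init)
    b = [[gauss(obs[t], means[s], variances[s]) for s in range(S)] for t in range(T)]

    # Forward probabilities.
    alpha = [[0.0] * S for _ in range(T)]
    for s in range(S):
        alpha[0][s] = init[s] * b[0][s]
    for t in range(1, T):
        for s in range(S):
            alpha[t][s] = b[t][s] * sum(alpha[t - 1][r] * trans[r][s] for r in range(S))

    # Backward probabilities.
    beta = [[1.0] * S for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for s in range(S):
            beta[t][s] = sum(trans[s][r] * b[t + 1][r] * beta[t + 1][r] for r in range(S))

    likelihood = sum(alpha[T - 1])
    gamma = [[alpha[t][s] * beta[t][s] / likelihood for s in range(S)] for t in range(T)]
    return likelihood, gamma


def reestimate_means(obs, gamma):
    """Occupancy-weighted mean of the observations for each state."""
    S = len(gamma[0])
    return [
        sum(g[s] * x for g, x in zip(gamma, obs)) / sum(g[s] for g in gamma)
        for s in range(S)
    ]


obs = [0.1, -0.2, 0.0, 2.9, 3.2, 3.0]
init = [1.0, 0.0]
trans = [[0.7, 0.3], [0.0, 1.0]]  # left-to-right, two states
means, variances = [1.0, 2.0], [1.0, 1.0]

before, gamma = forward_backward(obs, init, trans, means, variances)
means = reestimate_means(obs, gamma)
after, _ = forward_backward(obs, init, trans, means, variances)
```

Repeating the two calls until `after` stops growing is the convergence loop named on the slide.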
How does Sphinx3 work?
Building decision trees
- Used to decide which of the HMM states of all the triphones (seen and unseen) are similar to each other
Pruning the decision trees
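A common way to grow such a tree is to pick, at each node, the phonetic question whose yes/no split gives the largest gain in log-likelihood. The sketch below illustrates that idea and is not the Sphinx3 tree builder. It assumes one-dimensional statistics (count, mean, variance) per triphone state and questions about the left context only. The phone labels, question names, and numbers are made up for illustration.

```python
import math


def pooled_loglik(stats):
    """Log-likelihood of the data behind `stats` under one pooled Gaussian.

    stats: list of (count, mean, variance) tuples, one per triphone state.
    """
    n = sum(c for c, _, _ in stats)
    mean = sum(c * m for c, m, _ in stats) / n
    var = sum(c * (v + m * m) for c, m, v in stats) / n - mean * mean
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def split_gain(states, question):
    """Gain from splitting `states` by whether the left context is in `question`.

    states: dict mapping (left, base, right) -> (count, mean, variance).
    """
    yes = [s for (left, _, _), s in states.items() if left in question]
    no = [s for (left, _, _), s in states.items() if left not in question]
    if not yes or not no:
        return 0.0  # the question does not divide the states
    parent = list(states.values())
    return pooled_loglik(yes) + pooled_loglik(no) - pooled_loglik(parent)


def best_question(states, questions):
    """Name of the question with the largest likelihood gain."""
    return max(questions, key=lambda name: split_gain(states, questions[name]))


# Hypothetical statistics for one state of four triphones of the same base phone.
states = {
    ("m", "a", "y"): (10, 0.0, 1.0),
    ("n", "a", "y"): (10, 0.2, 1.0),
    ("s", "a", "y"): (10, 5.0, 1.0),
    ("z", "a", "y"): (10, 5.1, 1.0),
}
questions = {"left_nasal": {"m", "n"}, "left_voiced": {"m", "n", "z"}}
chosen = best_question(states, questions)
```

States that land in the same leaf share one set of parameters, which is how unseen triphones also get a model. Pruning then merges leaves whose split gain is small.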
Our project: Spelling Bees
- Use Sphinx3 to train acoustic models on the recorded data
- Evaluate the trained models on the test data
Result:
- We used 224 training utterances and 73 test utterances.
- The dictionary has 46 words, and 33 phones are used.
- 32.7% word error rate and 49.3% sentence error rate
The result:
id: (fash-cen2-fash-b)
Scores: (#C #S #D #I)
REF:  a m y
HYP:  a m y

Speaker sentences 1: moe #utts: 8
id: (moe-m_oses1)
Scores: (#C #S #D #I)
REF:  * m o s e S
HYP:  E m o s e *
Eval: I         D

id: (moe-m_oses2)
Scores: (#C #S #D #I)
REF:  m o s e s
HYP:  m o s e s
Eval:
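The #C, #S, #D and #I columns count correct, substituted, deleted and inserted words in a minimum-edit alignment of REF and HYP. The sketch below shows one way to compute them with dynamic programming. It is not the scoring tool that produced the output above, and the function names are hypothetical. Word error rate is taken as (S + D + I) divided by the number of reference words.

```python
def align_counts(ref, hyp):
    """Return (correct, substitutions, deletions, insertions) for ref vs hyp."""
    n, m = len(ref), len(hyp)

    # cost[i][j] = minimum number of edits turning ref[:i] into hyp[:j].
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the end, counting each kind of step.
    correct = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            if ref[i - 1] == hyp[j - 1]:
                correct += 1
            else:
                subs += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1  # reference word missing from the hypothesis
            i -= 1
        else:
            ins += 1  # extra word in the hypothesis
            j -= 1
    return correct, subs, dels, ins


def word_error_rate(ref, hyp):
    """(S + D + I) / number of reference words."""
    _, subs, dels, ins = align_counts(ref, hyp)
    return (subs + dels + ins) / len(ref)


# The moe-m_oses1 utterance from the output above.
ref = "m o s e s".split()
hyp = "e m o s e".split()
print(align_counts(ref, hyp))     # (4, 0, 1, 1)
print(word_error_rate(ref, hyp))  # 0.4
```

An utterance counts toward the sentence error rate when any of S, D or I is nonzero, so this utterance is a sentence error while the other two shown are not.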
Reference:
- Lecture notes from Speech recognition class
- 85/
- 85/
- makeraw.m
- record.m