SPEECH RECOGNITION Presented to Dr. V. Kepuska Presented by Lisa & Za ECE 5526.


How does Sphinx3 work?
- Sphinx3 uses HMMs with continuous probability density functions.
- Flat initialization sets starting values for four sets of parameters:
  - Mixture weights: the weight given to each Gaussian in the Gaussian mixture corresponding to a state
  - Transition matrices: the matrices of state transition probabilities
  - Means: the means of all Gaussians
  - Variances: the variances of all Gaussians
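A minimal Python sketch of what a flat start could look like for a single HMM, not Sphinx3's actual code. The function name `flat_initialize`, the three-state left-to-right topology, the two Gaussians per state, and the toy frames are all assumptions made for illustration. Every Gaussian starts from the global mean and variance of the training frames, and all weights and allowed transitions start out equally likely.

```python
def flat_initialize(features, n_states=3, n_mix=2):
    """Return flat starting parameters for one left-to-right HMM.

    features: list of feature vectors (lists of floats, all the same length).
    n_states, n_mix: hypothetical sizes chosen only for illustration.
    """
    n = len(features)
    dim = len(features[0])

    # Global statistics over all training frames.
    global_mean = [sum(f[d] for f in features) / n for d in range(dim)]
    global_var = [
        sum((f[d] - global_mean[d]) ** 2 for f in features) / n
        for d in range(dim)
    ]

    # Mixture weights: every Gaussian in a state's mixture gets equal weight.
    mixture_weights = [[1.0 / n_mix] * n_mix for _ in range(n_states)]

    # Transition matrix: each state either stays put or moves one step right
    # (assumed topology); the extra last column is a non-emitting exit state.
    transitions = []
    for i in range(n_states):
        row = [0.0] * (n_states + 1)
        row[i] = 0.5
        row[i + 1] = 0.5
        transitions.append(row)

    # Means and variances: every Gaussian starts from the global statistics.
    means = [[list(global_mean) for _ in range(n_mix)] for _ in range(n_states)]
    variances = [[list(global_var) for _ in range(n_mix)] for _ in range(n_states)]

    return {
        "mixture_weights": mixture_weights,
        "transitions": transitions,
        "means": means,
        "variances": variances,
    }


# Toy two-dimensional frames, invented for this example.
frames = [[0.0, 2.0], [2.0, 4.0]]
params = flat_initialize(frames)
print(params["mixture_weights"][0])
print(params["means"][0][0])
```

Because all states start out identical, the first re-estimation passes are what make the states differ from one another.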

How does Sphinx3 work?
- Forward-backward re-estimation (the Baum-Welch algorithm)
  - Used to re-estimate the model parameters iteratively until the training likelihood converges
- Untied modeling
  - Trains models for all context-dependent phones (usually triphones) that are seen in the training corpus
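The re-estimation loop can be sketched in Python as follows. This is a simplified illustration, not Sphinx3's trainer: it uses a discrete-output HMM and a single short observation sequence, whereas Sphinx3 works with continuous densities. The function names, starting parameters, and observation sequence are all assumptions. The point it shows is that each forward-backward pass yields new parameters whose likelihood is never lower than the previous one.

```python
def forward(obs, pi, A, B):
    """alpha[t][i] = P(obs[0..t], state at time t is i)."""
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, len(obs)):
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * A[i][j] for i in range(N)) * B[j][obs[t]]
            for j in range(N)
        ])
    return alpha


def backward(obs, A, B):
    """beta[t][i] = P(obs[t+1..] | state at time t is i)."""
    N, T = len(A), len(obs)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N)
            )
    return beta


def baum_welch_step(obs, pi, A, B):
    """One re-estimation pass; returns new parameters and the old likelihood."""
    N, M, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    likelihood = sum(alpha[-1])

    # gamma[t][i]: probability of being in state i at time t.
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(N)]
             for t in range(T)]
    # xi[t][i][j]: probability of moving from state i to j between t and t+1.
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(N)] for i in range(N)] for t in range(T - 1)]

    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) /
              sum(gamma[t][i] for t in range(T))
              for k in range(M)] for i in range(N)]
    return new_pi, new_A, new_B, likelihood


# Toy model and data, invented for this example.
obs = [0, 1, 1, 0, 1, 1, 1, 0]
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]

likelihoods = []
for _ in range(5):
    pi, A, B, lik = baum_welch_step(obs, pi, A, B)
    likelihoods.append(lik)
print(likelihoods)
```

In practice training stops once the likelihood gain between passes falls below a chosen threshold; the fixed five passes here only keep the sketch short.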

How does Sphinx3 work?
- Building decision trees
  - Used to decide which of the HMM states of all the triphones (seen and unseen) are similar to each other
- Pruning the decision trees
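A small Python sketch of how a finished decision tree could assign a tied state to any triphone, including one never seen in training. The phone classes, the tree shape, and the labels `tied_state_0` to `tied_state_2` are hypothetical; a real tree is grown from the training data rather than written by hand.

```python
# Hypothetical phone classes used as yes/no questions about the context.
VOWELS = {"AA", "AE", "AH", "EH", "IY", "OW", "UW"}
NASALS = {"M", "N", "NG"}

# Hypothetical tree for one HMM state of one base phone.
# Internal nodes ask about the left or right context; leaves name a tied state.
TREE = {
    "question": ("left", VOWELS),
    "yes": {
        "question": ("right", NASALS),
        "yes": "tied_state_0",
        "no": "tied_state_1",
    },
    "no": "tied_state_2",
}


def tied_state(tree, left, right):
    """Walk the tree by answering each context question until a leaf is reached."""
    node = tree
    while isinstance(node, dict):
        side, phone_set = node["question"]
        context = left if side == "left" else right
        node = node["yes"] if context in phone_set else node["no"]
    return node


print(tied_state(TREE, "AE", "N"))   # vowel on the left, nasal on the right
print(tied_state(TREE, "IY", "T"))   # vowel on the left, non-nasal on the right
print(tied_state(TREE, "S", "IY"))   # non-vowel on the left
```

Since the questions depend only on the phonetic context, any triphone reaches some leaf, which is how unseen triphones end up sharing parameters with seen ones.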

Our project: Spelling Bees
- Use Sphinx3 to train on the recorded data
- Evaluate the trained models against the test data
Result:
- We used 224 training utterances and 73 test utterances.
- The dictionary has 46 words, and 33 phones are used.
- 32.7% word error rate and 49.3% sentence error rate
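For reference, the two error rates are usually defined as below. That the reported figures follow these standard definitions is an assumption; here S, D, and I are the numbers of substituted, deleted, and inserted words, and N is the number of words in the reference transcripts.

```latex
\mathrm{WER} = \frac{S + D + I}{N},
\qquad
\mathrm{SER} = \frac{\text{sentences with at least one error}}{\text{total test sentences}}
```

Under that definition, a 49.3% sentence error rate on 73 test items is consistent with 36 of them containing at least one error, since 36/73 ≈ 0.493.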

The result:

id: (fash-cen2-fash-b)
Scores: (#C #S #D #I)
REF: a m y
HYP: a m y

Speaker sentences 1: moe #utts: 8

id: (moe-m_oses1)
Scores: (#C #S #D #I)
REF: * m o s e S
HYP: E m o s e *
Eval: I D

id: (moe-m_oses2)
Scores: (#C #S #D #I)
REF: m o s e s
HYP: m o s e s
Eval:
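The counts in a "Scores" line (#C correct, #S substitutions, #D deletions, #I insertions) come from aligning the hypothesis to the reference with minimum edit distance. The Python sketch below shows one way to compute them; it is not the scoring tool used in the project, and the function name `align_counts` is an assumption. It is run on the two letter sequences from the moe-m_oses1 utterance above, lowercased.

```python
def align_counts(ref, hyp):
    """Return (errors, correct, substitutions, deletions, insertions)
    for a minimum-edit-distance alignment of hyp against ref."""
    R, H = len(ref), len(hyp)
    # dp[i][j] holds the counts for aligning ref[:i] with hyp[:j].
    dp = [[None] * (H + 1) for _ in range(R + 1)]
    dp[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, R + 1):          # only deletions can consume ref tokens
        e, c, s, d, n = dp[i - 1][0]
        dp[i][0] = (e + 1, c, s, d + 1, n)
    for j in range(1, H + 1):          # only insertions can consume hyp tokens
        e, c, s, d, n = dp[0][j - 1]
        dp[0][j] = (e + 1, c, s, d, n + 1)
    for i in range(1, R + 1):
        for j in range(1, H + 1):
            e, c, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (e, c + 1, s, d, n)          # match
            else:
                diag = (e + 1, c, s + 1, d, n)      # substitution
            e, c, s, d, n = dp[i - 1][j]
            deletion = (e + 1, c, s, d + 1, n)
            e, c, s, d, n = dp[i][j - 1]
            insertion = (e + 1, c, s, d, n + 1)
            # Keep the cheapest option; ties prefer the diagonal move.
            dp[i][j] = min(diag, deletion, insertion, key=lambda x: x[0])
    return dp[R][H]


ref = "m o s e s".split()
hyp = "e m o s e".split()
errors, correct, subs, dels, ins = align_counts(ref, hyp)
print(correct, subs, dels, ins)      # 4 0 1 1
print(errors / len(ref))             # 0.4
```

The one insertion and one deletion match the "Eval: I D" line for that utterance: an extra letter at the start and a missing letter at the end.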

Reference:
- Lecture notes from Speech recognition class
- 85/
- 85/
- makeraw.m
- record.m