The Application of Hidden Markov Models in Speech Recognition
Authors: Mark Gales and Steve Young
Published: 21 Feb 2008
Subjects: Speech/audio/image/video compression
Outline: Introduction; Architecture of an HMM-Based Recogniser; HMM Structure Refinements; References
CHAPTER 1: Introduction
There is no data like more data. Recognition word error rate vs. the amount of training hours (for illustrative purposes only). This figure illustrates how modern speech recognition systems can benefit from increased training data.
Automatic continuous speech recognition (CSR) has many potential applications, including command and control, dictation, transcription of recorded speech, searching audio documents and interactive spoken dialogues. Since speech has temporal structure and can be encoded as a sequence of spectral vectors spanning the audio frequency range, the hidden Markov model (HMM) provides a natural framework for constructing models of speech [13]. In the last decade or more, the detailed modelling techniques developed within this framework have evolved to a state of considerable sophistication.
CHAPTER 2: Architecture of an HMM-Based Recogniser
Architecture of an HMM-based Recogniser: Bayes’ Rule, Acoustic Models, Language Model
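As a hedged sketch of how the three items on this slide relate, the usual decoding rule can be written as below. The symbols are assumptions introduced here for illustration and do not appear in the slide: Y denotes the sequence of acoustic feature vectors and w a candidate word sequence. The likelihood p(Y | w) is supplied by the acoustic models and the prior P(w) by the language model.

```latex
\hat{\boldsymbol{w}}
  = \arg\max_{\boldsymbol{w}} P(\boldsymbol{w} \mid \boldsymbol{Y})
  = \arg\max_{\boldsymbol{w}} \frac{p(\boldsymbol{Y} \mid \boldsymbol{w})\, P(\boldsymbol{w})}{p(\boldsymbol{Y})}
  = \arg\max_{\boldsymbol{w}} p(\boldsymbol{Y} \mid \boldsymbol{w})\, P(\boldsymbol{w})
```

The denominator p(Y) can be dropped in the last step because it does not depend on w.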
HMM Acoustic Models: HMM-based phone model
Fig. 2.4 Formation of tied-state phone models.
N-gram Language Models
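The idea behind an N-gram language model can be illustrated with a minimal sketch for N = 2, assuming plain maximum-likelihood estimates from counts. The function names, the sentence boundary tokens `<s>` and `</s>`, and the toy corpus are all hypothetical and chosen only for illustration; they are not taken from the slides.

```python
from collections import Counter


def train_bigram(sentences):
    """Estimate P(w_k | w_{k-1}) as count(w_{k-1}, w_k) / count(w_{k-1})."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # Pad with (assumed) sentence start and end markers.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return {
        pair: count / history_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


def sentence_probability(model, sentence):
    """Multiply the bigram probabilities along the sentence.

    Unseen bigrams get probability 0.0 because no smoothing is applied.
    """
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        prob *= model.get((prev, cur), 0.0)
    return prob


# Toy corpus, purely illustrative.
corpus = ["the cat sat", "the dog sat", "the cat ran"]
model = train_bigram(corpus)
```

Because this sketch applies no smoothing, any sentence containing an unseen word pair receives probability zero; practical systems typically combine the counts with discounting and back-off to avoid this.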
CHAPTER 3: HMM Structure Refinements
Dynamic Bayesian Networks: In "Architecture of an HMM-Based Recogniser", the HMM was described as a generative model which for a typical phone has three emitting
Gaussian Mixture Models
Covariance Modelling: Structured Covariance Matrices, Structured Precision Matrices
References
The Application of Hidden Markov Models in Speech Recognition
A Historical Perspective of Speech Recognition
Automatic Speech Recognition – A Brief History of the Technology Development
Bayesian Networks