Part 6 HMM in Practice CSE717, SPRING 2008 CUBS, Univ at Buffalo.


Practical Problems in the HMM
- Computation with Probabilities
- Configuration of the HMM
- Robust Parameter Estimation (Feature Optimization, Tying)
- Efficient Model Evaluation (Beam Search, Pruning)

Computation with Probabilities
- Logarithmic Probability Representation
- Lower Bounds for Probabilities
- Codebook for Semi-Continuous HMMs

Probability of a State Sequence s for a Given Model λ
P(s | λ) is a product of one transition probability per time step. If every factor is small (say, at most 0.1), then for a sequence of length T > 100 the product quickly falls below the smallest representable positive floating-point number and underflows to zero.
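The underflow problem can be demonstrated directly; this is a generic illustration, not code from the slides:

```python
import math

# Multiplying many probabilities < 1 quickly underflows double precision.
# With per-step probabilities of 0.1, a sequence of length 400 already
# yields exactly 0.0 in IEEE-754 doubles (smallest positive ~4.9e-324).
prob = 1.0
for _ in range(400):
    prob *= 0.1
print(prob)  # 0.0 -- the true value 1e-400 is not representable

# In the log domain the same quantity is perfectly well behaved:
log_prob = sum(math.log(0.1) for _ in range(400))
print(log_prob)  # about -921.03
```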

Logarithm Transformation
Represent probabilities by their logarithms: products of probabilities become sums of log probabilities (log(p · q) = log p + log q), which do not underflow.

Kingsbury-Rayner Formula
Addition in the log domain: log(p + q) = log p + log(1 + exp(log q - log p)), which evaluates a sum of probabilities using only their logarithms.

Mixture Density Model
The output density is a weighted sum of component densities. Applying the Kingsbury-Rayner formula to every such sum is not advisable here (too many exps and logs). Approximation: replace the log of the sum by the maximum of the components' log terms.
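Log-domain addition and the maximum approximation for mixtures can be sketched as follows; the function names are illustrative, not from the slides:

```python
import math

def log_add(log_p, log_q):
    """Kingsbury-Rayner formula: log(p + q) from log p and log q.
    Factoring out the larger term keeps the exp argument <= 0,
    so the exp can never overflow."""
    if log_p < log_q:
        log_p, log_q = log_q, log_p
    return log_p + math.log1p(math.exp(log_q - log_p))

def log_mixture_max(log_weights, log_densities):
    """Maximum approximation for a mixture density in the log domain:
    log sum_k c_k * p_k(x)  ~=  max_k (log c_k + log p_k(x)).
    Avoids the many exp/log calls a chain of log_add's would need."""
    return max(lw + ld for lw, ld in zip(log_weights, log_densities))

# Example: log(0.3 + 0.2) computed entirely in the log domain
print(log_add(math.log(0.3), math.log(0.2)))  # ~ log(0.5) = -0.6931
```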

Lower Bounds for Probabilities
Choose a minimal probability and use it as a floor for all estimated and evaluated probabilities.
- In training, this avoids certain states being excluded from parameter estimation altogether.
- In decoding, this avoids paths through states with vanishing output probabilities being discarded immediately.
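Flooring amounts to clipping every probability at a chosen minimum before taking logs; the floor value below is an illustrative choice, not one from the slides:

```python
import math

P_MIN = 1e-4  # illustrative floor; the appropriate value is task-dependent

def floored_log(p, p_min=P_MIN):
    """Return log(max(p, p_min)). Without the floor, a single zero
    probability maps to -inf, which permanently kills a path in
    decoding or a state's statistics in training."""
    return math.log(max(p, p_min))

print(floored_log(0.0))  # log(1e-4), not -inf
print(floored_log(0.5))  # log(0.5), unaffected by the floor
```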

Codebook Evaluation for Semi-Continuous HMMs
In a semi-continuous HMM, all states share a single codebook of component densities; each state combines them with its own mixture weights.

Codebook Evaluation for Semi-Continuous HMMs
By Bayes' law, each component density p(x | ω_k) can be rewritten in terms of the posterior P(ω_k | x). Assuming the prior P(ω_k) can be approximated by a uniform distribution, the mixture can be evaluated using the posteriors P(ω_k | x), which always lie between 0 and 1.

Codebook Evaluation for Semi-Continuous HMMs
This reduces the dynamic range of all quantities involved.
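The effect on dynamic range can be illustrated numerically; the density values below are made up purely for demonstration:

```python
# Component density values p(x | w_k) can differ by many orders of
# magnitude, while the posteriors P(w_k | x) = p(x|w_k) / sum_j p(x|w_j)
# (uniform priors assumed, as on the slide) always lie in [0, 1].
densities = [1e-30, 1e-12, 1e-10]      # illustrative p(x | w_k) values
total = sum(densities)
posteriors = [d / total for d in densities]

print(max(densities) / min(densities))  # ~1e20: huge dynamic range
print(posteriors)                       # all in [0, 1], summing to 1
```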

Configuration of the HMM
- Model Topology
- Modularization
- Compound Models
- Modeling Emissions

Model Topology
Input data in speech and handwriting recognition exhibit a chronological, essentially linear structure, so a fully connected (ergodic) model is not necessary.

Linear Model
The simplest model for describing chronological sequences: only transitions to the next state and self-transitions to the current state are allowed.

Bakis Model
Skipping of single states is additionally allowed, giving greater flexibility in the modeling of duration. Widely used in speech and handwriting recognition.

Left-to-Right Model
An arbitrary number of states may be skipped in the forward direction, but jumping back to "past" states is not allowed. This can describe larger variations in the temporal structure, where longer parts of the data may be missing.
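The three topologies differ only in which transitions i -> j are permitted; a minimal sketch of the allowed-transition structure (the helper below is illustrative, not from the slides):

```python
def topology(n_states, kind):
    """Boolean matrix of allowed transitions i -> j for an n-state model.
    'linear':        self-loop and next state (j in {i, i+1})
    'bakis':         additionally allows skipping one state (j == i+2)
    'left_to_right': any forward jump (j >= i), never backwards
    """
    allowed = [[False] * n_states for _ in range(n_states)]
    for i in range(n_states):
        for j in range(n_states):
            if kind == "linear":
                allowed[i][j] = j in (i, i + 1)
            elif kind == "bakis":
                allowed[i][j] = j in (i, i + 1, i + 2)
            elif kind == "left_to_right":
                allowed[i][j] = j >= i
    return allowed

for kind in ("linear", "bakis", "left_to_right"):
    a = topology(4, kind)
    print(kind, sum(map(sum, a)))  # number of allowed transitions
```

The count of allowed transitions grows from the linear model to the left-to-right model, matching the increasing temporal flexibility described above.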

Modularization
English word recognition: thousands of words would require thousands of word models and a correspondingly large amount of training data, whereas the 26 letters require only a limited number of character models. Modularization divides a complex model into smaller models of segmentation units: word -> subword -> character.
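Modularization can be sketched as concatenating shared subunit models into word models; a hypothetical sketch in which each character model is just a list of state names:

```python
# Hypothetical sketch: each character has one shared sub-model, and a
# word model is the concatenation of its characters' sub-models.
# With 26 trained character models, any word model comes for free.
char_models = {c: [f"{c}_{k}" for k in range(3)]  # 3 states per letter
               for c in "abcdefghijklmnopqrstuvwxyz"}

def word_model(word):
    """Concatenate the (shared) character models for `word`."""
    states = []
    for c in word:
        states.extend(char_models[c])
    return states

print(word_model("speech"))  # 6 letters x 3 states = 18 states
print(len(char_models))      # only 26 sub-models need training data
```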

Variation of Segmentation Units in Different Contexts
The phonetic transcription of the word "speech" is /spitS/, but the realization of its phones cannot easily be distinguished from their realizations in words such as "achieve", "cheese" (/tSiz/), or "reality".
Triphone [Schwartz, 1984]: three immediately neighboring phone units are taken together as one segmentation unit, e.g., p/i/t (the phone /i/ with left context /p/ and right context /t/). This eliminates the dependence of the variability of the segmentation units on their context.

Compound Models
- HMM structure for isolated word recognition: parallel connection of all individual word models (circles: model states; squares: non-emitting states).
- HMM structure for connected word recognition.

Grammar Coded into HMM

Modeling Emissions
- Continuous feature vectors in speech and handwriting recognition are described by mixture models.
- The size of the codebook and the number of component densities per mixture density need to be decided. There is no general rule; it is a compromise between the precision of the model, its generalization capabilities, and the computation time.
- Semi-continuous models: codebook sizes from a few hundred up to a few thousand densities.
- Mixture models: 8 to 64 component densities.

References
[1] R. Schwartz, Y. Chow, S. Roucos, M. Krasner, and J. Makhoul, "Improved hidden Markov modelling of phonemes for continuous speech recognition," in Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1984.

Robust Parameter Estimation
- Feature Optimization
- Tying

Feature Optimization Techniques