Dynamic Time Warping Applications and Derivation


Charles Tappert, Seidenberg School of CSIS, Pace University

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

Many applications:
- Speech recognition
- Speech sound alignment
- Speech sound generation
- Online handwriting recognition

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

Derivation of a DTW algorithm variation (speech recognition):
- A speech utterance is represented as a time sequence of feature vectors
- Example: see the sketch below
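The slide's example image is not reproduced here; as a stand-in, this minimal Python sketch (function and parameter names are my own, not from the slides) shows one common way a waveform becomes a time sequence of spectral feature vectors:

```python
# Minimal sketch: waveform -> time sequence of spectral feature vectors.
import numpy as np

def feature_sequence(waveform, frame_len=256, hop=128, n_features=13):
    """Split a 1-D waveform into overlapping frames; return one
    log-magnitude spectral feature vector per frame."""
    window = np.hanning(frame_len)
    feats = []
    for start in range(0, len(waveform) - frame_len + 1, hop):
        frame = waveform[start:start + frame_len] * window
        spectrum = np.abs(np.fft.rfft(frame))
        feats.append(np.log(spectrum[:n_features] + 1e-8))
    return np.array(feats)             # shape: (num_frames, n_features)

utterance = np.random.randn(16000)     # stand-in for 1 s of 16 kHz speech
V = feature_sequence(utterance)        # V[i] is feature vector i of the utterance
```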

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

Consider a finite-state-machine model of a speech utterance prototype in which the observable output of each transition between states is an acoustic feature vector, a probabilistic function of the origin state of the transition.
Note: some transitions stretch and others compress the sequence of feature vectors produced; a self-loop repeats a state's output (stretching), while a skip transition omits a state's output (compression).
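As an illustration only (the encoding is my own, not from the slides), such a prototype can be written as a left-to-right transition matrix whose self-loops stretch the output sequence and whose skips compress it:

```python
# Left-to-right prototype model: self-loop (stretch), advance, skip (compress).
import numpy as np

n_states = 5
A = np.zeros((n_states, n_states))   # A[j, j2] = P(transition to j2 | state j)
for j in range(n_states):
    A[j, j] = 0.3                    # self-loop: repeats state j's output
    if j + 1 < n_states:
        A[j, j + 1] = 0.5            # advance to the next state
    if j + 2 < n_states:
        A[j, j + 2] = 0.2            # skip a state: compresses the output
A /= A.sum(axis=1, keepdims=True)    # renormalize rows truncated at the boundary
```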

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

Background information: the univariate (one-dimensional) normal density function

$$p(x) \;=\; \frac{1}{\sqrt{2\pi}\,\sigma}\,\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

Background information: the multivariate normal density function

$$p(\mathbf{x}) \;=\; \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\,\exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\mathsf{T}}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right),$$

where d is the dimensionality, μ the mean vector, and Σ the covariance matrix.
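For concreteness, both densities can be evaluated with SciPy (the numbers below are illustrative, not from the slides):

```python
# Evaluate the univariate and multivariate normal densities with SciPy.
import numpy as np
from scipy.stats import norm, multivariate_normal

mu, sigma = 0.0, 1.0
print(norm(mu, sigma).pdf(0.5))               # univariate density at x = 0.5

M = np.zeros(2)                                # mean vector
Sigma = np.array([[1.0, 0.2],
                  [0.2, 1.0]])                 # covariance matrix
print(multivariate_normal(M, Sigma).pdf([0.5, -0.5]))
```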

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

In traversing each arc of this model, a feature vector is produced with an assumed underlying normal distribution,

$$\mathrm{Prob}(i,j) \;=\; N(V_i;\, M_j, \Sigma_j),$$

where i indexes the unknown and j the prototype, the V_i are the feature vectors of the unknown, and the M_j and Σ_j are the mean feature vectors and covariance matrices of the prototype.
This statistical characterization of the prototypes would require multiple repetitions of the vocabulary to be recognized (to estimate the means and covariances).

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

To find the optimal overall probability of the model (prototype) generating the candidate, we estimate the maximum value of the cumulative probability over the possible paths through the model.
Assuming statistical independence of the feature vectors, the best path to any point (i, j) and its probability P(i, j) can be computed, starting with P(0, 0) = Prob(0, 0) and P(i, j) = 0 elsewhere, using the recursion relation below.
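The slide's own equation is not preserved in this transcript; a standard form of this Viterbi recursion, assuming the self-loop/advance/skip transitions of the model above, is

$$P(i,j) \;=\; \mathrm{Prob}(i,j)\cdot\max\Big[\; p(j \mid j)\,P(i-1,j),\;\; p(j \mid j-1)\,P(i-1,j-1),\;\; p(j \mid j-2)\,P(i-1,j-2) \;\Big],$$

where $p(j \mid j')$ is the probability of the transition from state $j'$ to state $j$ and $\mathrm{Prob}(i,j)$ is the normal density of the previous slide.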

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

Taking the log of the terms in the previous equation, dropping constant terms, multiplying by -2, and assuming zero off-diagonal covariance terms yields the recursion relation below, where D(i, j) is treated as a cumulative distance measure.
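Again, the slide's equation is not preserved; applying $-2\ln$ to the recursion above turns products into sums and the max into a min, which under the stated assumptions gives a relation of the form

$$D(i,j) \;=\; \sum_{k} \frac{(v_{ik}-m_{jk})^2}{\sigma_{jk}^2} \;+\; \min\Big[\; D(i-1,j) + t_{j,j},\;\; D(i-1,j-1) + t_{j,j-1},\;\; D(i-1,j-2) + t_{j,j-2} \;\Big],$$

where $v_{ik}$, $m_{jk}$, and $\sigma_{jk}^2$ are components of $V_i$, $M_j$, and the diagonal of $\Sigma_j$, and $t_{j,j'} = -2\ln p(j \mid j')$ absorbs the transition probabilities.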

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

Further, assume equal variances and equal transition probabilities, and include an index k indicating the prototype:

$$D(i,j;k) \;=\; d(i,j;k) \;+\; \min\big[\; D(i-1,j;k),\;\; D(i-1,j-1;k),\;\; D(i-1,j-2;k) \;\big],$$

where d(i, j; k) is the distance between feature vector i of the unknown and feature vector j of prototype k.
Note: since the log is a monotonically increasing function of its argument, and changing sign converts a maximizing relation into a minimizing one, this distance recursion leads to the same decisions as the probability recursion, apart from the simplifying assumptions.
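An illustrative implementation of this simplified recursion (my own sketch, not code from the slides), using squared Euclidean distance for d(i, j; k):

```python
# DTW via the simplified distance recursion:
#   D(i, j) = d(i, j) + min[D(i-1, j), D(i-1, j-1), D(i-1, j-2)]
import numpy as np

def dtw_distance(V, M):
    """Cumulative DTW distance between unknown V (I x d) and prototype M (J x d),
    with squared Euclidean local distance d(i, j)."""
    I, J = len(V), len(M)
    D = np.full((I, J), np.inf)
    D[0, 0] = np.sum((V[0] - M[0]) ** 2)      # start condition, as P(0, 0) above
    for i in range(1, I):
        for j in range(J):
            d_ij = np.sum((V[i] - M[j]) ** 2)
            preds = [D[i - 1, j]]             # self-loop (stretch)
            if j >= 1:
                preds.append(D[i - 1, j - 1]) # advance one state
            if j >= 2:
                preds.append(D[i - 1, j - 2]) # skip a state (compress)
            D[i, j] = d_ij + min(preds)
    return D[I - 1, J - 1]                    # best path ends at (I-1, J-1)

# Recognition = choosing the prototype k with the smallest cumulative distance:
V = np.random.randn(40, 13)                   # unknown utterance features
prototypes = [np.random.randn(35, 13), np.random.randn(50, 13)]
k_best = int(np.argmin([dtw_distance(V, M) for M in prototypes]))
```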

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

This derivation shows the simplifying assumptions made in going from a probabilistic model to a greatly simplified distance model.
Currently, most commercial and research string-matching systems use probabilistic models, and the Hidden Markov Model (HMM) is probably the dominant one.
As computing power has increased over the years, more complex and primarily probabilistic models requiring large training corpora have come into use.

Dynamic Time Warping (DTW): non-linear/elastic matching, Viterbi algorithm

In his research at IBM's T.J. Watson Research Center, your instructor worked in both the speech recognition and the pen computing/handwriting recognition groups:
- founding member of the speech group (once over 50 workers)
- spearheaded development of the ThinkWrite handwriting recognizer in IBM's pen-enabled ThinkPad product in the early 1990s

The data in both the speech and online handwriting problems are time sequences:
- Speech is recorded as a time waveform and usually transformed via frequency analysis into a sequence of spectral time samples
- Online handwriting is captured as a time sequence of x-y coordinates describing the trajectory of the handwriting (see the sketch below)
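A small sketch of both data types as time sequences (the signals and names are my own illustrative stand-ins, not from the slides):

```python
# Both modalities as time sequences suitable for DTW matching.
import numpy as np

t = np.linspace(0.0, 1.0, 16000)                  # 1 s at 16 kHz
speech_waveform = np.sin(2 * np.pi * 440 * t)     # time waveform (a 440 Hz tone)
# Frequency analysis: one spectral time sample per 16 ms frame
frames = speech_waveform[:16000 - 16000 % 256].reshape(-1, 256)
spectral_samples = np.abs(np.fft.rfft(frames, axis=1))   # shape (62, 129)

# Online handwriting: a time sequence of (x, y) pen-trajectory coordinates
t2 = np.linspace(0.0, 1.0, 200)
handwriting = np.stack([np.cos(4 * np.pi * t2),          # x(t)
                        np.sin(4 * np.pi * t2)], axis=1) # y(t); shape (200, 2)
```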