CHAPTER 8 DISCRIMINATIVE CLASSIFIERS HIDDEN MARKOV MODELS.


Generative vs. Discriminative

The Perceptron Model

Example: Spam

Binary Decision Rule

Online Perceptron Training

Perceptron Training Illustration

Properties of Perceptrons

Issues with Perceptrons

Reasoning over Time

Often, we want to reason about a sequence of observations. Examples include speech recognition, robot localization, and user attention. To do so, we need to introduce time into our models. The basic approach is hidden Markov models (HMMs); the more general approach is dynamic Bayes’ nets.

Markov Models

Conditional Independence

Weather Example

Mini-Forward Algorithm

Example

Stationary Distributions

If we simulate the chain long enough, what happens? Uncertainty accumulates, and eventually we have no idea what the state is. For most chains, the distribution we end up in is independent of the initial distribution; it is called the stationary distribution of the chain. As a result, we can usually only predict a short time out.
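The convergence described above can be checked numerically. The following is a minimal sketch, assuming a hypothetical two-state weather chain (sun, rain) whose transition probabilities are made up for illustration; the function names `step` and `simulate` are likewise not from the slides.

```python
def step(dist, transition):
    """Advance the chain one step: new[j] = sum_i dist[i] * transition[i][j]."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def simulate(dist, transition, steps):
    """Apply `step` repeatedly, starting from the distribution `dist`."""
    for _ in range(steps):
        dist = step(dist, transition)
    return dist


# Hypothetical chain: row i holds P(next state | current state i).
# State 0 = sun, state 1 = rain. The numbers are illustrative only.
T = [[0.9, 0.1],
     [0.3, 0.7]]

from_sun = simulate([1.0, 0.0], T, 100)   # start certain of sun
from_rain = simulate([0.0, 1.0], T, 100)  # start certain of rain
print(from_sun, from_rain)
```

Both starting points end up at approximately the same distribution, (0.75, 0.25) for this particular chain, which is what "independent of the initial distribution" means in practice.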

Example: Web Link Analysis

Mini-Viterbi Algorithm

Hidden Markov Models

Example

Conditional Independence

HMM Applications

Forward Algorithm

Viterbi Algorithm

Viterbi Example

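Viterbi Properties

The Viterbi algorithm is designed for computing the most likely hidden state sequence given a sequence of observations in a hidden Markov model. It makes two passes: a forward pass that computes, for each state at each time step, the probability of the best path ending there, and a backward pass that follows the stored back-pointers to reconstruct the maximum-probability sequence.

What is the time complexity? O(d^2 n), where d is the number of hidden states and n is the length of the observation sequence. Why is this exciting? Enumerating every possible state sequence would cost on the order of d^n. Many extensions of the basic Viterbi algorithm have been developed for other models with similar local structure, syntactic parsing for instance.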
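The two passes can be sketched in a few lines. This is an illustrative implementation, not the one used in the slides: the function signature, the umbrella-style example, and all probabilities below are assumptions chosen for the demonstration, and the observation sequence is assumed to be non-empty.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its joint probability with the observations)."""
    # Forward pass. best[t][s] is the probability of the best path ending in s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s] is the predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] * trans_p[r][s]) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score * emit_p[s][observations[t]]
            back[t][s] = prev

    # Backward pass. Start from the best final state and follow the back-pointers.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical model: hidden weather, observed umbrella use. Numbers are illustrative.
states = ("rain", "sun")
start_p = {"rain": 0.5, "sun": 0.5}
trans_p = {"rain": {"rain": 0.7, "sun": 0.3},
           "sun": {"rain": 0.3, "sun": 0.7}}
emit_p = {"rain": {"umbrella": 0.9, "none": 0.1},
          "sun": {"umbrella": 0.2, "none": 0.8}}

path, prob = viterbi(["umbrella", "umbrella", "none"], states, start_p, trans_p, emit_p)
print(path, prob)  # path is ['rain', 'rain', 'sun']
```

The nested loops over `t`, `s`, and `r` make the O(d^2 n) cost visible. For long sequences the products underflow, so practical implementations usually work with log probabilities and sums instead.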

Speech in an Hour

HMMs for Speech

HMMs for Continuous Obs.?

Before: a discrete, finite set of observations. Now: spectral feature vectors are real-valued!

Solution 1: discretization.
Solution 2: continuous emission models, which can be Gaussians, multivariate Gaussians, or mixtures of multivariate Gaussians.

A state is, progressively: a context-independent subphone (~3 per phone); a context-dependent phone (= triphone); a state-tied CD phone.
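To make Solution 2 concrete, here is a hedged sketch of a continuous emission score: the log-density of a real-valued feature vector under a mixture of Gaussians attached to one state. It assumes diagonal covariances to keep the code short (the slides do not specify the covariance structure), and the function names and parameter layout are hypothetical.

```python
import math


def diag_gaussian_logpdf(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance (an assumption)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def mixture_logpdf(x, weights, means, variances):
    """Log-density of x under a Gaussian mixture, computed stably via log-sum-exp."""
    logs = [
        math.log(w) + diag_gaussian_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(value - top) for value in logs))


# Hypothetical two-component mixture over 2-dimensional feature vectors.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [0.5, 0.5]]

print(mixture_logpdf([0.1, -0.2], weights, means, variances))
```

In an HMM, a value like this replaces the discrete emission table entry for a state when the forward or Viterbi recurrence is evaluated in the log domain.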

ASR Lexicon: Markov Models

Viterbi with 2 Words + Unif. LM

Conclusion

Perceptron: a discriminative model, an alternative to generative models like Naïve Bayes. It has a simple classification rule based on a weight vector and a simple online learning algorithm, guaranteed to converge if the training set is linearly separable.

Hidden Markov Models: a special kind of Bayesian network designed for reasoning about sequences of hidden states. They allow polynomial-time inference for the most likely state sequence (Viterbi) and for marginalization (Forward-Backward), and have many applications.
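As a recap of the perceptron half of the chapter, the classification rule and the online update can be sketched as follows. This is a minimal illustration under stated assumptions: labels are +1 and -1, ties at zero activation go to +1, and the tiny spam-style dataset (a bias feature plus two made-up word counts) is hypothetical.

```python
def predict(weights, features):
    """Binary decision rule: the sign of the dot product of weights and features."""
    activation = sum(w * f for w, f in zip(weights, features))
    return 1 if activation >= 0 else -1


def train_perceptron(examples, num_features, max_epochs=100):
    """Online training: on each mistake, add label * features to the weights."""
    weights = [0.0] * num_features
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            if predict(weights, features) != label:
                weights = [w + label * f for w, f in zip(weights, features)]
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors, so training has converged
            break
    return weights


# Hypothetical features: [bias, count of "free", count of "hello"]; +1 = spam, -1 = ham.
examples = [
    ([1, 2, 0], 1),
    ([1, 0, 1], -1),
    ([1, 3, 0], 1),
    ([1, 0, 2], -1),
]

weights = train_perceptron(examples, num_features=3)
print(weights)
```

Because this toy dataset is linearly separable, the loop stops after a pass with no mistakes; on non-separable data it would simply run until `max_epochs`.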