SPEECH RECOGNITION Kunal Shalia and Dima Smirnov.

What is Speech Recognition?  • Speech Recognition is the ability of a machine or program to identify words and phrases in spoken language and convert them to a machine-readable format.  • Speech Recognition vs. Voice Recognition: speech recognition determines what was said, while voice (speaker) recognition identifies who said it.

Speech Recognition Demonstration

Early Automatic SR Systems  • Based on the theory of acoustic phonetics  • Describes how phonetic elements are realized in speech  • Compared input speech to reference patterns  • Trajectories along the first and second formant frequencies for the numbers 1 through 9 and “oh”  • Used in the first speech recognizer, built by Bell Laboratories in 1952

The Development of SR  • 1950s  • RCA Laboratories – recognizing 10 syllables spoken by a single speaker  • MIT Lincoln Lab – speaker-independent 10-vowel recognition  • 1960s  • Kyoto University – speech segmenter  • University College London – first to use a statistical model of allowable phoneme sequences in the English language  • RCA Laboratories – non-uniform time scale instead of speech segmentation  • 1970s  • Carnegie Mellon – graph search based on a beam algorithm

The Two Schools of SR  • Two schools of thought on applying ASR to commercial applications developed in the 1970s  • IBM  • Speaker-dependent  • Converted sentences into letters and words  • Transcription – focus on the probabilities associated with the structure of the language model  • N-gram model  • AT&T  • Speaker-independent  • Emphasis on the acoustic model over the language model

Markov Models  • A stochastic model in which each state depends only on the previous state in time.  • The simplest Markov model is the Markov chain, which undergoes transitions from one state to another through a random process.  • Markov Property: the probability of the next state depends only on the current state, not on the full history of earlier states.
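The chain described above can be sketched in a few lines. The two states and their transition probabilities below are illustrative assumptions, not taken from the slides:

```python
import random

# A minimal Markov chain sketch: two made-up states and a
# transition table whose rows sum to 1.
STATES = ["Rainy", "Sunny"]
TRANSITIONS = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}

def sample_chain(start, steps, rng=random.Random(0)):
    """Walk the chain: each new state depends only on the current one."""
    state, path = start, [start]
    for _ in range(steps):
        nxt = TRANSITIONS[state]
        state = rng.choices(list(nxt), weights=list(nxt.values()))[0]
        path.append(state)
    return path

print(sample_chain("Sunny", 5))  # a 6-element state sequence
```

Note that the sampler only ever looks at the current state when drawing the next one; that locality is exactly the Markov property.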

Hidden Markov Models  • A Hidden Markov Model (HMM) is a Markov model with unobserved (hidden) states.  • In a Markov model the states are directly visible to the observer, while in an HMM the state is not directly visible but the output, which depends on the state, is visible.

Elements of an HMM  • There is a finite number of states N, and each state possesses some measurable, distinctive properties.  • At each clock time t, a new state is entered based upon a transition probability distribution that depends on the previous state (the Markovian property).  • After each transition, an observation output symbol is produced according to the probability distribution of the current state.

Urn and Ball Example  • We assume that there are N glass urns in a room.  • Each urn contains a large quantity of colored balls, with M distinct colors in total.  • A genie is in the room and randomly chooses the initial urn.  • A ball is then chosen at random, its color recorded, and the ball replaced in the same urn.  • A new urn is selected according to a random procedure associated with the current urn.

Urn and Ball Example  • Each state corresponds to a specific urn  • A color probability distribution is defined for each state; the urn (state) sequence is hidden, and only the colors are observed
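A minimal simulation of the urn-and-ball process above. The number of urns, the colors, and all probabilities are made-up illustrative values:

```python
import random

N_URNS = 3
COLORS = ["red", "green", "blue"]          # M = 3 distinct colors
# Probability of drawing each color from each urn (rows sum to 1).
EMISSION = [
    [0.6, 0.3, 0.1],   # urn 0
    [0.1, 0.7, 0.2],   # urn 1
    [0.3, 0.3, 0.4],   # urn 2
]
# Probability of moving from the current urn to each next urn.
TRANSITION = [
    [0.5, 0.3, 0.2],
    [0.2, 0.5, 0.3],
    [0.3, 0.2, 0.5],
]

def generate(T, rng=random.Random(1)):
    """Return the observed colors and the hidden urn sequence."""
    urn = rng.randrange(N_URNS)            # the genie picks the first urn
    colors, urns = [], []
    for _ in range(T):
        urns.append(urn)
        colors.append(rng.choices(COLORS, weights=EMISSION[urn])[0])
        urn = rng.choices(range(N_URNS), weights=TRANSITION[urn])[0]
    return colors, urns

obs, hidden = generate(10)
print(obs)      # visible to the observer
print(hidden)   # hidden state (urn) sequence
```

The observer sees only `obs`; recovering anything about `hidden` from it is precisely what the HMM machinery on the following slides is for.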

Coin Toss Example  • You are in a room with a barrier and you cannot see what is happening on the other side.  • On the other side, another person is performing a coin (or multiple-coin) tossing experiment.  • You won't know which coin is being tossed, but you will receive the result of each flip. Thus a sequence of HIDDEN coin tosses is performed, and you can only observe the results.

One coin toss

Two coins being tossed

Three coins being tossed

HMM Notation
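The content of this notation slide did not survive in the transcript; as a sketch, the standard (Rabiner) notation that the following problems rely on is:

```latex
\begin{align*}
N &: \text{number of states}, \qquad M : \text{number of distinct observation symbols} \\
A &= \{a_{ij}\}, \quad a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i) \quad \text{(state transition probabilities)} \\
B &= \{b_j(k)\}, \quad b_j(k) = P(o_t = v_k \mid q_t = S_j) \quad \text{(observation probabilities)} \\
\pi &= \{\pi_i\}, \quad \pi_i = P(q_1 = S_i) \quad \text{(initial state distribution)} \\
\lambda &= (A, B, \pi) \quad \text{(compact notation for the whole model)}
\end{align*}
```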

The Three Problems for HMMs  • 1. Given the observation sequence O = (o1 ... oT) and a model λ = (A, B, π), how do we efficiently compute P(O | λ), the probability of the observation sequence given the model?  • 2. Given the observation sequence O = (o1 ... oT) and a model λ = (A, B, π), how do we choose a corresponding state sequence q = (q1 ... qT) that is optimal in some sense (i.e., best “explains” the observations)?  • 3. How do we adjust the model parameters λ = (A, B, π) to maximize P(O | λ)?
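Problems 1 and 2 have classical dynamic-programming solutions: the forward algorithm and the Viterbi algorithm. A sketch of both, using a toy two-state, two-symbol model whose numbers are illustrative assumptions:

```python
import numpy as np

# Toy model (illustrative values): 2 states, 2 observation symbols.
A  = np.array([[0.7, 0.3],
               [0.4, 0.6]])          # a_ij = P(next state j | state i)
B  = np.array([[0.9, 0.1],
               [0.2, 0.8]])          # b_j(k) = P(symbol k | state j)
pi = np.array([0.6, 0.4])            # initial state distribution

def forward(obs):
    """Problem 1: compute P(O | lambda) with the forward algorithm."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # sum over previous states
    return alpha.sum()

def viterbi(obs):
    """Problem 2: most likely hidden state sequence (in log space)."""
    delta = np.log(pi) + np.log(B[:, obs[0]])
    back = []
    for o in obs[1:]:
        scores = delta[:, None] + np.log(A)   # scores[i, j]: best path ending i -> j
        back.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) + np.log(B[:, o])
    path = [int(delta.argmax())]              # trace the best path backwards
    for ptr in reversed(back):
        path.append(int(ptr[path[-1]]))
    return path[::-1]

obs = [0, 1, 1, 0]
print(forward(obs))   # P(O | lambda)
print(viterbi(obs))   # most likely state sequence
```

Problem 3, re-estimating λ itself, is handled by the Baum-Welch algorithm discussed later in these slides.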

3 Types of HMM  • Ergodic Model  • Left to Right Model  • Parallel Left to Right Model

Ergodic Model  • In an ergodic model it is possible to reach any state from any other state.

Left to Right (Bakis) Model  • As time increases, the state index increases or stays the same.
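Concretely, a left-to-right model's transition matrix is upper triangular: there is zero probability of moving back to a lower-indexed state. The values below are an illustrative example, not from the slides:

```python
import numpy as np

# A 4-state left-to-right (Bakis) transition matrix: a_ij = 0 for j < i,
# so the state index can only increase or stay the same.
A = np.array([
    [0.6, 0.3, 0.1, 0.0],
    [0.0, 0.7, 0.2, 0.1],
    [0.0, 0.0, 0.8, 0.2],
    [0.0, 0.0, 0.0, 1.0],   # final (absorbing) state
])

assert np.allclose(A.sum(axis=1), 1.0)   # each row is a probability distribution
assert np.allclose(A, np.triu(A))        # upper triangular: no backward jumps
```

This structure matches speech well, since the sounds of a word unfold in a fixed order.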

Parallel Left to Right Model  • A left to right model in which there are several parallel paths through the states.

HMM in SR  • 1980s – shift to a rigorous statistical framework  • HMMs can model the variability in speech  • Markov chains represent linguistic structure together with a set of probability distributions  • The Baum-Welch algorithm finds the unknown model parameters  • Hidden Markov models merged with finite-state networks
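The Baum-Welch re-estimation mentioned above can be sketched for a discrete HMM as a forward-backward pass followed by a parameter update. The toy parameters and observation sequence below are illustrative assumptions:

```python
import numpy as np

def baum_welch_step(A, B, pi, obs):
    """One Baum-Welch (forward-backward) re-estimation step for a discrete HMM."""
    N, T = A.shape[0], len(obs)
    # Forward pass: alpha[t, i] = P(o_1 .. o_t, q_t = i | lambda)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    # Backward pass: beta[t, i] = P(o_{t+1} .. o_T | q_t = i, lambda)
    beta = np.zeros((T, N))
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    evidence = alpha[-1].sum()                      # P(O | lambda)
    gamma = alpha * beta / evidence                 # per-time state posteriors
    # xi[t, i, j]: posterior probability of the transition i -> j at time t
    xi = (alpha[:-1, :, None] * A[None] *
          (B[:, obs[1:]].T * beta[1:])[:, None, :]) / evidence
    # Re-estimated parameters
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[np.array(obs) == k].sum(axis=0) / gamma.sum(axis=0)
    return new_A, new_B, new_pi, evidence

# Toy model: 2 states, 2 observation symbols (illustrative values).
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.6, 0.4])
A, B, pi, p = baum_welch_step(A, B, pi, [0, 0, 1, 1, 0])
print(p)  # P(O | lambda) under the pre-update parameters
```

Iterating this step is the training procedure: each update is guaranteed not to decrease the likelihood of the training observations.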

Speech Recognition Today  • Developments in algorithms and data storage models have enabled efficient storage and search over much larger vocabularies  • Modern applications  • Military  • Health care  • Telephony  • Computing