Published by Barnaby Johnston. Modified over 9 years ago.

Midterm Review
Spoken Language Processing
Prof. Andrew Rosenberg

Lecture 1 – Overview
Applications
– speech recognition
– speech synthesis
– other applications: indexing, language id, etc.
Information in speech
– words
– speaker identity
– speaker state
– discourse acts

Lecture 2 – From Sounds to Language
Differences between orthography and sounds
Phonetic symbol sets
– e.g. IPA, ARPAbet
Vocal organs
– articulators
Classes of sounds
Coarticulation

Lecture 3 – Spoken Dialog Systems
Maxims of Conversational Implicature
Dialog System Architecture
– Speech Recognition
– Dialog Management
– Response Generation
– Speech Synthesis
Dialog Strategies

Lecture 4 – Acoustics of Speech
Phone Recognition
Prosody
Speech Waveforms
Analog to Digital Conversion
Nyquist Rate
Pitch Doubling and Halving
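
As a study aid for the Nyquist Rate item, here is a minimal sketch of the sampling rule: a signal whose highest frequency is f must be sampled at a rate of at least 2f, and a component above half the sampling rate shows up at a lower, aliased frequency. The function names are hypothetical and not taken from the lecture.

```python
def nyquist_rate(max_frequency_hz):
    """Minimum sampling rate needed to represent content up to max_frequency_hz."""
    return 2.0 * max_frequency_hz


def aliased_frequency(frequency_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling at sample_rate_hz.

    The tone folds to its distance from the nearest integer multiple of the
    sampling rate, so anything above sample_rate_hz / 2 is misrepresented.
    """
    nearest_multiple = round(frequency_hz / sample_rate_hz) * sample_rate_hz
    return abs(frequency_hz - nearest_multiple)


# Telephone-band speech up to 4 kHz needs at least 8 kHz sampling.
print(nyquist_rate(4000))
# A 5 kHz tone sampled at 8 kHz is indistinguishable from a 3 kHz tone.
print(aliased_frequency(5000, 8000))
```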

Lecture 5 – Speech Recognition Overview
History of Speech Recognition
– Rule based recognition
– Dynamic Time Warping
– Statistical Modeling
What qualities make speech recognition difficult?
Noisy Channel Model
Training and Test Corpora
Word Error Rate
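
Word Error Rate is usually computed as the word-level edit distance (substitutions + deletions + insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below assumes whitespace tokenization and a non-empty reference; the function name is illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()

    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete every reference word
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert every hypothesis word

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dist[i - 1][j] + 1
            insertion = dist[i][j - 1] + 1
            dist[i][j] = min(substitution, deletion, insertion)

    return dist[-1][-1] / len(ref)


# One deleted word out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.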

Lecture 6 – Fast Fourier Transform
Multiplying Polynomials
Divide-and-Conquer for multiplying polynomials
Relationship between multiplying polynomials and cosine transform
Complex roots of unity
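
The link between polynomial multiplication and the FFT can be sketched as follows: evaluate both polynomials at the complex roots of unity using a divide-and-conquer split into even and odd coefficients, multiply the values pointwise, then interpolate back. This is a minimal recursive radix-2 version, assuming integer coefficients listed from the constant term upward; it is not necessarily the formulation used in the lecture.

```python
import cmath


def fft(coeffs, invert=False):
    """Recursive radix-2 FFT; len(coeffs) must be a power of two."""
    n = len(coeffs)
    if n == 1:
        return list(coeffs)
    even = fft(coeffs[0::2], invert)  # even-index coefficients
    odd = fft(coeffs[1::2], invert)   # odd-index coefficients
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)  # k-th power of an n-th root of unity
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def multiply_polynomials(p, q):
    """Multiply two integer-coefficient polynomials via pointwise products."""
    result_len = len(p) + len(q) - 1
    size = 1
    while size < result_len:
        size *= 2
    values_p = fft(list(p) + [0] * (size - len(p)))
    values_q = fft(list(q) + [0] * (size - len(q)))
    product = fft([x * y for x, y in zip(values_p, values_q)], invert=True)
    # The inverse transform needs a 1/size scaling; rounding removes float noise.
    return [round((c / size).real) for c in product[:result_len]]


# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
print(multiply_polynomials([1, 2], [3, 4]))
```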

Lecture 7 – MFCC
What is the MFCC used for?
Overlapping Windows
Mel Frequency Spectrogram
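
The Overlapping Windows item refers to cutting the waveform into short frames that share samples with their neighbors before computing features. A minimal sketch, assuming the signal is a plain list of samples and that a trailing partial frame is dropped; the name and parameters are illustrative. A typical configuration is 25 ms frames with a 10 ms hop.

```python
def frame_signal(samples, frame_length, hop_length):
    """Split samples into frames of frame_length that start every hop_length samples.

    Consecutive frames overlap by frame_length - hop_length samples.
    """
    last_start = len(samples) - frame_length
    return [
        samples[start:start + frame_length]
        for start in range(0, last_start + 1, hop_length)
    ]


# Ten samples, frames of four samples, a new frame every two samples.
for frame in frame_signal(list(range(10)), frame_length=4, hop_length=2):
    print(frame)
```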

Lecture 8 – Statistical Modeling
Probabilities
– Bayes Rule
– Bayesians vs. Frequentists
Maximum Likelihood Estimation
Multinomial Distribution
– Bernoulli Distribution
Gaussian Distribution
– Multidimensional Gaussian
Difference between Classification, Clustering, Regression
Black Swans and the Long Tail
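
For review, Maximum Likelihood Estimation of a one-dimensional Gaussian reduces to the sample mean and the variance with a 1/N divisor (the biased estimator, not the 1/(N-1) one). A minimal sketch with a hypothetical function name:

```python
def gaussian_mle(samples):
    """Maximum likelihood estimates (mean, variance) for a 1-D Gaussian."""
    n = len(samples)
    mean = sum(samples) / n
    # The MLE variance divides by n, not n - 1.
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance


mean, variance = gaussian_mle([1.0, 2.0, 3.0])
print(mean, variance)
```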

Lecture 9 – Acoustic Modeling
What does an Acoustic Model do?
Gaussian Mixture Model
Potential Problems
– Inconsistent Numbers of Gaussians
– Singularities
Training Acoustic Models

Lecture 10 – Hidden Markov Model
The Markov Assumption
Difference between states and observations
Finite State Automata
Decoding using Viterbi
Forced Alignment
Flat Start
Silence
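
Viterbi decoding finds the single most likely state sequence for a sequence of observations by keeping, for each state and time step, the best path probability and a back-pointer. The sketch below uses a toy two-state model (silence vs. speech, with coarse energy levels as observations); all states, observations, and probabilities are made up for illustration. Real decoders work with log probabilities to avoid underflow.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability)."""
    # best[t][s] = probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda r: best[t - 1][r] * trans_p[r][s])
            best[t][s] = best[t - 1][prev] * trans_p[prev][s] * emit_p[s][observations[t]]
            back[t][s] = prev  # remember where the best path came from

    # Trace back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical toy model, not from the lecture.
states = ["sil", "speech"]
start_p = {"sil": 0.6, "speech": 0.4}
trans_p = {
    "sil": {"sil": 0.7, "speech": 0.3},
    "speech": {"sil": 0.4, "speech": 0.6},
}
emit_p = {
    "sil": {"low": 0.5, "mid": 0.4, "high": 0.1},
    "speech": {"low": 0.1, "mid": 0.3, "high": 0.6},
}

path, prob = viterbi(["low", "mid", "high"], states, start_p, trans_p, emit_p)
print(path, prob)
```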

Lecture 11 – Pronunciation Modeling
Dictionary
Finite State Automata
Use in speech recognition
Using morphology for pronunciation modeling
Grapheme to Phoneme Conversion
– Letter to Sound rules
Machine Learning for G-to-P

Lecture 12 – Language Modeling
Using a Context Free Grammar to define a set of recognized sequences of words
– Terminals, non-terminals, start symbol
N-Gram models
– Mathematical underpinnings
– Theoretical background
How a “word” is defined
Learning n-gram statistics
Terminology
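
Learning n-gram statistics by maximum likelihood amounts to counting: for a bigram model, P(w2 | w1) = count(w1 w2) / count(w1). A minimal unsmoothed sketch, assuming whitespace tokenization and the sentence-boundary markers `<s>` and `</s>`; the function name is hypothetical.

```python
from collections import Counter


def bigram_mle(sentences):
    """Unsmoothed bigram estimates: {(w1, w2): count(w1 w2) / count(w1)}."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        context_counts.update(tokens[:-1])  # every token that has a successor
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


probs = bigram_mle(["i am sam", "sam i am"])
print(probs[("i", "am")])    # "i" is always followed by "am"
print(probs[("<s>", "i")])   # one of the two sentences starts with "i"
```

Unseen bigrams get no entry here, which is the zero-probability problem that smoothing is meant to address.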

Next Class
Midterm Exam
Reading: J&M Chapter 4