1 LING 696B: Final thoughts on nonparametric methods, Overview of speech processing.

1 LING 696B: Final thoughts on nonparametric methods, Overview of speech processing

2 For those taking the class for credit
Talk to me sometime about what you are planning to do (term project / homeworks)
My OH: TR 2:00-3:00

3 Review: inductive inference from last time
[Diagram: Old data → Hypothesis (estimation); Hypothesis → New data (prediction); Old data → New data directly (interpolation/smoothing)]

4 Example from last time: Transductive SVM
Generalization can also depend on other new data (see demo; a stand-in sketch follows)
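
scikit-learn does not ship a transductive SVM, so the sketch below uses self-training around an ordinary SVC as a stand-in (an assumption, not the demo from class); it illustrates the same point: unlabeled test points influence the final decision boundary.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)),   # cluster for class 0
               rng.normal(2, 1, (50, 2))])   # cluster for class 1
y = np.full(100, -1)                         # -1 marks "unlabeled"
y[:10], y[50:60] = 0, 1                      # a few labeled points per cluster

model = SelfTrainingClassifier(SVC(probability=True, gamma="scale"))
model.fit(X, y)                              # unlabeled points shape the boundary
print(model.predict(X[:5]))
```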

5 Example from last time: Gaussian process
Infinite feed-forward neural net:
- Hidden: h_j(x) = tanh(Σ_i v_ij x_i + a_j)
- Output: o_k(x) = Σ_j w_jk h_j(x) + b_k
- Weights: v_ij, w_jk; biases: a_j, b_k
Don't train the network with backprop: let the weights be random, and the network becomes a Gaussian process model
Another non-parametric machine (see demo)
Hidden units can be thought of as complex kernel extensions -- simple kernels work too
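
A minimal sketch of the random-weight network above, assuming scalar inputs, a single output, and Gaussian priors on the weights (all assumptions; the slide does not fix these): with the output weights scaled by 1/sqrt(H), the outputs at a fixed set of inputs approach a joint Gaussian as H grows, which is exactly the Gaussian process view.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_net(x, H=10000, sigma_v=2.0, sigma_w=1.0):
    """One random draw of the network, evaluated at the inputs x (shape [N])."""
    v = rng.normal(0, sigma_v, H)               # input-to-hidden weights v_ij
    a = rng.normal(0, sigma_v, H)               # hidden biases a_j
    w = rng.normal(0, sigma_w / np.sqrt(H), H)  # output weights, scaled so variance stays finite
    h = np.tanh(np.outer(x, v) + a)             # hidden activations h_j(x)
    return h @ w                                # output o(x)

x = np.linspace(-3, 3, 50)
draws = np.stack([random_net(x) for _ in range(500)])  # 500 random networks
K = np.cov(draws.T)  # empirical covariance across draws ~ the induced GP kernel
print(K.shape)       # (50, 50)
```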

6 Making non-parametric methods more analogy-like
Function approximation: predict y ∈ Y from (x_1, y_1), …, (x_N, y_N) and a new x ∈ X
Building blocks of the predictor: kernel functions K(x_1, x_2), the similarity between x_1 and x_2 (see the sketch below)
This is not yet “analogy” -- x ∈ R^n has no structure (data points are just vectors)
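
One concrete way to build a predictor out of a kernel is Nadaraya-Watson regression (my choice for illustration; the slide leaves the predictor unspecified): the prediction at a new x is a similarity-weighted average of the training labels.

```python
import numpy as np

def rbf(x1, x2, gamma=1.0):
    """Gaussian (RBF) kernel: similarity between x1 and x2."""
    return np.exp(-gamma * (x1 - x2) ** 2)

def predict(x_new, X_train, y_train, gamma=1.0):
    w = rbf(x_new, X_train, gamma)       # similarity to every training point
    return np.dot(w, y_train) / w.sum()  # similarity-weighted average of labels

X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 0.8, 0.9, 0.1])
print(predict(1.5, X, y))  # interpolates between the nearby y values
```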

7 Making non-parametric methods more analogy-like
What if the input x has some structure?
- Example: x_1, x_2 are sequences
Extension: choose kernel functions sensitive to the structure of x_1, x_2, e.g. string kernels: K_t(x_1, x_2) = number of common subsequences of length t
Finding the “right” metric requires some understanding of the structure
- Example: probability kernels K(x_1, x_2) = Σ_h p(x_1|h) p(x_2|h) p(h)
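
A minimal string-kernel sketch. The slide counts common subsequences of length t; for brevity this version counts common contiguous substrings of length t (the “t-spectrum” kernel, a simpler member of the same family), which still gives a structure-sensitive similarity between sequences.

```python
from collections import Counter

def spectrum_kernel(s1, s2, t=2):
    """Inner product in the space of length-t substring counts."""
    grams1 = Counter(s1[i:i + t] for i in range(len(s1) - t + 1))
    grams2 = Counter(s2[i:i + t] for i in range(len(s2) - t + 1))
    return sum(c * grams2[g] for g, c in grams1.items())

print(spectrum_kernel("phonology", "phonetics", t=2))  # shares "ph", "ho", "on"
```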

9 Making non-parametric methods more analogy-like
What if the output y has some structure?
Make the error function sensitive to the structure of y (computationally intensive)
These extensions have made the non-parametric, discriminative methods (e.g. SVM) “outperform” other ones in many tasks
One exception: speech

12 Final thoughts on non-parametric models
Machine: most non-parametric methods look like
  minimize (error + constant * complexity)
People: are often able to generalize without relying on explicit rules
- Connectionist propaganda often sells this, yet is unable to control either the error or the complexity
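
To make the objective concrete, here is a minimal ridge-regression sketch (my choice of example; any regularized learner has this shape): squared error as the “error” term and lam * ||w||^2 as the “constant * complexity” term, with the closed-form minimizer w = (X'X + lam*I)^{-1} X'y.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Minimize ||Xw - y||^2 + lam * ||w||^2 in closed form."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
w_true = np.array([1.0, -2.0, 0.0, 0.5, 3.0])
y = X @ w_true + rng.normal(scale=0.1, size=100)
print(ridge_fit(X, y, lam=0.1))  # close to w_true; a larger lam shrinks w toward 0
```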

15 Final thoughts on non-parametric models
Why not build explicit similarity/analogy models with non-parametric methods?
Term project idea: find some experimental data in the literature, and build a model that “outperforms” neural nets
- Maybe “outperforming” isn’t the right goal
- How does the model help us understand people?

16 Moving on to phonology
“these problems do not arise when phonetic transcription is understood in the terms outlined above, that is, not as a direct record of the speech signal, but rather as a representation of what the speaker of a language takes to be the phonetic properties of an utterance…” -- SPE p. 294

17 Alternatives to feature/segment representations?
Exemplar people
- Yet convincing arguments for real alternatives are few
Coleman paper: maybe we should explore more “realistic” representations by looking at acoustics
- This is often hard, as seen in many years of research on speech recognition

18 Ladefoged’s experiment
“There was once a young rat named Arthur, who could never take the trouble to make up his mind.”
“There was once a young rat named Arthur, who could never take the trouble to make up his mind.” (superimposed with the word “dot”)
Where is “dot”?

19 A very quick tour of speech processing
Dimension reduction: finding a basis for speech signals
- Most often used: Fourier basis (sinusoids)
- Orthogonal vs. overcomplete bases
Short-time processing assumption: taking snapshots over time (see the sketch below)
- No perfect snapshot: you lose either time or frequency resolution
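
A minimal short-time Fourier analysis sketch (frame length and hop are my choices, typical for 16 kHz speech): window the signal into overlapping frames and transform each one. Longer windows buy frequency resolution at the cost of time resolution, and vice versa.

```python
import numpy as np

def stft(signal, frame_len=400, hop=160):
    """Spectral snapshots: 25 ms frames every 10 ms at a 16 kHz rate."""
    window = np.hanning(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append(np.fft.rfft(signal[start:start + frame_len] * window))
    return np.array(frames)          # shape: [n_frames, n_freq_bins]

sr = 16000
t = np.arange(sr) / sr               # one second of "audio"
x = np.sin(2 * np.pi * 440 * t)      # a 440 Hz tone as a stand-in signal
print(stft(x).shape)                 # (98, 201)
```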

20 A zoo of signal representations
LPC/reflection coefficients
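
A minimal LPC sketch using the autocorrelation method (real systems use the Levinson-Durbin recursion for speed, and derive reflection coefficients from it): predict each sample from the previous p samples and solve the Toeplitz normal equations for the predictor coefficients.

```python
import numpy as np

def lpc(signal, order=10):
    """Linear-prediction coefficients via the autocorrelation method."""
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(signal[:len(signal) - k], signal[k:])
                  for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], with R[i][j] = r[|i-j|].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

rng = np.random.default_rng(2)
x = rng.normal(size=4000)       # white noise as a stand-in signal
print(lpc(x, order=4))          # near 0 for noise; structured for real speech
```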

21 A zoo of signal representations
Mel-frequency filterbank / cepstra
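
A minimal MFCC sketch (assuming librosa is available; any MFCC implementation works): warp the short-time spectrum onto the perceptual mel scale with a triangular filterbank, take logs, then a DCT to get the cepstra. librosa wraps the whole pipeline in one call.

```python
import numpy as np
import librosa  # assumption: librosa is installed

sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 200 * t).astype(np.float32)  # stand-in for speech
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # mel filterbank -> log -> DCT
print(mfcc.shape)                                   # (13, n_frames)
```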

22 A zoo of signal representations
PLP/RASTA spectra/cepstra for “linguistics”

23 The perceptual relevance of distance metrics
People do not need information from all frequency bands to recover the linguistic content
- Example: low-pass, high-pass and band-pass filtered speech (a filtering sketch follows)
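
A minimal sketch of the filtering manipulation (Butterworth filters via scipy, with cutoff frequencies of my choosing): the same utterance stays intelligible under fairly aggressive low-, high-, or band-pass filtering.

```python
import numpy as np
from scipy.signal import butter, lfilter

def filter_signal(x, sr, kind, *cutoffs):
    """4th-order Butterworth filter; kind is 'lowpass', 'highpass', or 'bandpass'."""
    nyq = sr / 2.0
    b, a = butter(4, [c / nyq for c in cutoffs], btype=kind)
    return lfilter(b, a, x)

sr = 16000
rng = np.random.default_rng(3)
x = rng.normal(size=sr)                            # stand-in for a speech signal
low = filter_signal(x, sr, "lowpass", 500)         # keep only below 500 Hz
high = filter_signal(x, sr, "highpass", 2000)      # keep only above 2 kHz
band = filter_signal(x, sr, "bandpass", 500, 2000) # keep the 500 Hz - 2 kHz band
```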

24 Extending distance metrics to sequences
Dynamic time warping
- A template-based method
- Depends on the distance metric between single frames
- Often requires many heuristics (large literature)
- See the example below
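
A minimal dynamic time warping sketch (the standard textbook recursion; the frame distance here is plain Euclidean, one of many possible choices): fill a cumulative-cost table where each cell adds the local frame distance to the cheapest of the three allowed predecessors.

```python
import numpy as np

def dtw(seq1, seq2):
    """DTW distance between two sequences of feature frames."""
    n, m = len(seq1), len(seq2)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(seq1[i - 1] - seq2[j - 1])  # frame distance
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[n, m]

a = np.sin(np.linspace(0, 2 * np.pi, 40))[:, None]  # a "template"
b = np.sin(np.linspace(0, 2 * np.pi, 55))[:, None]  # same shape, time-warped
print(dtw(a, b))   # small despite the length mismatch
```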