Optimal Adaptation for Statistical Classifiers
Xiao Li
Motivation
  Problem
    A statistical classifier works well when the test set matches the data distribution of the training set
    It is difficult to obtain a large amount of matched training data
  A case study: vowel classification
    Target test set: pure vowel articulation by specific speakers
    Available training set: conversational speech from a large number of speakers
Adaptation Methodology
  1. Extract vowel segments from conversational speech to form a training set
  2. Extract features and label the classes
  3. Train speaker-independent models on this training set
  4. Ask a speaker to articulate a few seconds of vowels for each class
  5. Adapt the classifier on this small amount of speaker-dependent, pure vowel data
Two Classifiers
  Gaussian mixture models (GMM)
    Generative models
    Training objective: maximum likelihood via EM
  Neural networks (NN)
    Multilayer perceptrons
    Training objectives:
      Least-squares error
      Minimum relative entropy
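The two NN training objectives named above can be compared on a small batch of outputs. The following is a minimal sketch, assuming one-hot class targets and network outputs that are already normalized to probabilities; the function names and the `eps` clipping constant are illustrative and not taken from the slides.

```python
import numpy as np

def least_square_error(targets, outputs):
    """Mean over samples of 0.5 * squared error between targets and outputs."""
    return 0.5 * np.sum((targets - outputs) ** 2, axis=1).mean()

def relative_entropy(targets, outputs, eps=1e-12):
    """Mean over samples of KL(targets || outputs).

    eps (an assumed constant) only guards against log(0); terms with a
    zero target contribute nothing to the sum.
    """
    t = np.clip(targets, eps, 1.0)
    y = np.clip(outputs, eps, 1.0)
    return np.sum(targets * (np.log(t) - np.log(y)), axis=1).mean()
```

With one-hot targets the relative entropy reduces to the usual cross-entropy, so both objectives are zero when the outputs match the targets exactly.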
MLLR for GMM Adaptation
  Maximum Likelihood Linear Regression
    Apply a linear transformation to the Gaussian means
    Use the same transformation for all Gaussians in the mixture of the same class
  Adaptation objective
    Find the transformation matrices that maximize the likelihood via EM
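The mean transformation can be estimated with a short EM loop. The following is a minimal sketch for one class, assuming diagonal covariances and an affine transform of the form A·mu + b stored as W = [b, A]; these choices, the function names, and the use of a pseudo-inverse are assumptions for illustration, not details given on the slide.

```python
import numpy as np

def mllr_mean_transform(X, weights, means, variances, n_iter=10):
    """Estimate W = [b, A] (shape d x (d+1)) shared by all Gaussians of one class.

    X: (T, d) adaptation frames; weights: (M,); means, variances: (M, d).
    Diagonal covariances are assumed so each row of W has a closed-form update.
    """
    T, d = X.shape
    M = means.shape[0]
    xi = np.hstack([np.ones((M, 1)), means])       # extended means [1, mu]
    W = np.hstack([np.zeros((d, 1)), np.eye(d)])   # start from the identity transform
    for _ in range(n_iter):
        adapted = xi @ W.T                          # current adapted means, (M, d)
        # E-step: component posteriors under the adapted means
        diff = X[:, None, :] - adapted[None, :, :]  # (T, M, d)
        log_p = (np.log(weights)
                 - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                 - 0.5 * np.sum(diff ** 2 / variances, axis=2))
        log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
        gamma = np.exp(log_p)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: weighted least squares, solved row by row
        occ = gamma.sum(axis=0)                     # occupancy per component, (M,)
        first = gamma.T @ X                         # posterior-weighted data sums, (M, d)
        for i in range(d):
            inv_var = 1.0 / variances[:, i]
            G = (xi * (occ * inv_var)[:, None]).T @ xi
            k = xi.T @ (first[:, i] * inv_var)
            # pinv tolerates a rank-deficient G (fewer than d+1 components)
            W[i] = np.linalg.pinv(G) @ k
    return W

def adapt_means(W, means):
    """Apply the estimated transform to the speaker-independent means."""
    return means @ W[:, 1:].T + W[:, 0]
```

Because one W is shared by every component of the class, only d·(d+1) parameters are estimated, which is what makes adaptation feasible on a few seconds of speech.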
NN Adaptation
  Idea: fix the nonlinear mapping and update only the last layer, which acts as a linear classifier
  Two alternative methods with different objectives
    1. Minimum relative entropy
       Optimization method: gradient descent
    2. Optimal hyperplane
       Optimization method: support vector machine
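The first method can be sketched as gradient descent on the output layer alone. This is a minimal sketch, assuming a single tanh hidden layer, a softmax output, and integer class labels; the architecture, learning rate, step count, and all names are assumptions for illustration rather than the settings used in the experiments.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def hidden_features(X, W1, b1):
    """Frozen nonlinear mapping learned on the speaker-independent data."""
    return np.tanh(X @ W1 + b1)

def mean_relative_entropy(X, y, W1, b1, W2, b2):
    """With one-hot targets, relative entropy equals the cross-entropy."""
    P = softmax(hidden_features(X, W1, b1) @ W2 + b2)
    return -np.mean(np.log(P[np.arange(len(y)), y]))

def adapt_last_layer(X, y, W1, b1, W2, b2, lr=0.1, n_steps=200):
    """Update only (W2, b2) on the adaptation data; W1 and b1 stay fixed."""
    H = hidden_features(X, W1, b1)          # computed once, never updated
    Y = np.eye(W2.shape[1])[y]              # one-hot targets
    W2, b2 = W2.copy(), b2.copy()
    for _ in range(n_steps):
        P = softmax(H @ W2 + b2)
        grad_z = (P - Y) / len(X)           # gradient w.r.t. output pre-activations
        W2 -= lr * (H.T @ grad_z)
        b2 -= lr * grad_z.sum(axis=0)
    return W2, b2
```

Since the hidden features are fixed, the objective is convex in the last-layer weights, so plain gradient descent with a small step size is sufficient here.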
Vowel Classification Experiments
  Databases
    Database A: speaker-independent conversational speech
    Database B: sustained vowel recordings from 6 speakers, with different energy and pitch
  Method
    1. Train speaker-independent classifiers on Database A
    2. Adapt the classifiers on a small set of samples per speaker from Database B
    3. Test on the rest of Database B
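Steps 2 and 3 of the method amount to a per-speaker split of Database B. The following is a minimal sketch of such a split, assuming each sample is a (speaker_id, label, features) tuple; `n_adapt` is a hypothetical parameter, since the number of adaptation samples per speaker is not given here.

```python
import random
from collections import defaultdict

def split_per_speaker(samples, n_adapt, seed=0):
    """Split samples into per-speaker adaptation and test sets.

    samples: list of (speaker_id, label, features) tuples (assumed layout).
    n_adapt: hypothetical number of adaptation samples kept per speaker.
    Returns two dicts keyed by speaker_id: (adapt, test).
    """
    rng = random.Random(seed)
    by_speaker = defaultdict(list)
    for sample in samples:
        by_speaker[sample[0]].append(sample)
    adapt, test = {}, {}
    for speaker, items in by_speaker.items():
        items = items[:]                 # leave the caller's list untouched
        rng.shuffle(items)
        adapt[speaker] = items[:n_adapt]
        test[speaker] = items[n_adapt:]  # everything else is held out for testing
    return adapt, test
```

Each speaker's classifier would then be adapted on `adapt[speaker]` and evaluated on `test[speaker]`, so no test sample is seen during adaptation.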