Optimal Adaptation for Statistical Classifiers
Xiao Li



Motivation

Problem
- A statistical classifier works well when the test set matches the data distribution of the training set.
- It is difficult to obtain a large amount of matched training data.

A case study: vowel classification
- Target test set: pure vowel articulation from specific speakers.
- Available training set: conversational speech from a great number of speakers.

Adaptation Methodology

1. Extract vowel segments from conversational speech to form a training set.
2. Perform feature extraction and class labeling.
3. Train speaker-independent models on this training set.
4. Ask a speaker to articulate a few seconds of vowels for each class.
5. Adapt the classifier on this small amount of speaker-dependent, pure-vowel data.

Two Classifiers

Gaussian mixture models (GMMs)
- Generative models.
- Training objective: maximum likelihood, via EM.

Neural networks (NNs)
- Multilayer perceptrons.
- Training objectives: least-squares error or minimum relative entropy.
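The generative side of this comparison can be sketched in a few lines. Below is a minimal stand-in for the GMM classifier described on this slide, simplified (as an assumption, not the talk's actual system) to a single diagonal Gaussian per class fit by maximum likelihood, classifying by the highest class log-likelihood:

```python
import numpy as np

class GaussianClassifier:
    """Minimal generative classifier: one diagonal Gaussian per class,
    a single-component simplification of the GMMs described above."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # ML estimates of per-class mean and (diagonal) variance
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.var = np.array([X[y == c].var(axis=0) + 1e-6 for c in self.classes_])
        return self

    def log_likelihood(self, X):
        # log N(x | mu_c, diag(var_c)) for every sample and class
        diff = X[:, None, :] - self.mu[None, :, :]
        return -0.5 * np.sum(diff**2 / self.var + np.log(2 * np.pi * self.var), axis=2)

    def predict(self, X):
        return self.classes_[np.argmax(self.log_likelihood(X), axis=1)]

# Toy check on two well-separated synthetic classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
clf = GaussianClassifier().fit(X, y)
print((clf.predict(X) == y).mean())  # near-perfect on this toy data
```

A full GMM would replace each class density with a mixture fit by EM, but the decision rule (pick the class with the highest likelihood) is the same.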

MLLR for GMM Adaptation

Maximum Likelihood Linear Regression
- Apply a linear transformation to the Gaussian means.
- The same transformation is shared by all Gaussians in a class's mixture.

Adaptation objective
- Find the transformation matrix that maximizes the likelihood of the adaptation data, estimated via EM.
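The core of MLLR is estimating one affine transform W that maps every extended mean [mu_c; 1] to W [mu_c; 1]. As an illustrative sketch (assuming identity covariances and one Gaussian per class, so the ML estimate reduces to a least-squares fit rather than the full EM update in the talk):

```python
import numpy as np

def mllr_transform(X_adapt, labels, means):
    """Estimate a single affine transform W shared across classes,
    mu_c -> W @ [mu_c; 1], by maximum likelihood on adaptation data.
    With identity covariances (a simplifying assumption here) this
    is a least-squares problem in W."""
    # Extended mean xi = [mu_c; 1] for each adaptation frame's class
    Xi = np.hstack([means[labels], np.ones((len(labels), 1))])   # (n, d+1)
    # Solve min_W sum_n || x_n - W xi_n ||^2
    W, *_ = np.linalg.lstsq(Xi, X_adapt, rcond=None)
    return W.T                                                   # (d, d+1)

# Toy check: the "speaker" shifts every mean by +2
means = np.array([[0., 0.], [5., 5.]])
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, 200)
X_adapt = means[labels] + 2.0 + rng.normal(0, 0.1, (200, 2))
W = mllr_transform(X_adapt, labels, means)
adapted = (W @ np.hstack([means, np.ones((2, 1))]).T).T
print(adapted)  # should be close to [[2, 2], [7, 7]]
```

Sharing one W across classes is what makes MLLR data-efficient: a few seconds of adaptation speech constrain a single small matrix instead of every mean separately.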

NN Adaptation

Idea: fix the nonlinear mapping and update only the last, linear layer of the classifier.

Two alternative methods with different objectives:
1. Minimum relative entropy; optimization method: gradient descent.
2. Optimal hyperplane; optimization method: support vector machine.
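The first alternative can be sketched directly: with the hidden activations frozen (precomputed), minimizing relative entropy between targets and softmax outputs is ordinary gradient descent on the output weights. This is an illustrative sketch, not the talk's exact update:

```python
import numpy as np

def adapt_last_layer(hidden, Y, W, lr=0.5, steps=200):
    """Fix the nonlinear feature mapping (hidden activations are given)
    and update only the final linear layer W by gradient descent on the
    relative-entropy (cross-entropy) objective."""
    for _ in range(steps):
        logits = hidden @ W
        P = np.exp(logits - logits.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)        # softmax outputs
        grad = hidden.T @ (P - Y) / len(Y)       # gradient of cross-entropy w.r.t. W
        W = W - lr * grad
    return W

# Toy check: hidden features that already separate two classes
rng = np.random.default_rng(2)
hidden = np.vstack([rng.normal(-1, 0.3, (40, 3)), rng.normal(1, 0.3, (40, 3))])
Y = np.zeros((80, 2)); Y[:40, 0] = 1; Y[40:, 1] = 1
W = adapt_last_layer(hidden, Y, W=np.zeros((3, 2)))
acc = ((hidden @ W).argmax(axis=1) == Y.argmax(axis=1)).mean()
print(acc)  # accuracy on the adaptation data
```

The second alternative would instead train an SVM on the same frozen hidden features, replacing the entropy objective with a maximum-margin (optimal hyperplane) one.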

Vowel Classification Experiments

Databases
- Database A: speaker-independent conversational speech.
- Database B: sustained vowel recordings from 6 speakers, with varying energy and pitch.

Method
1. Train speaker-independent classifiers on Database A.
2. Adapt the classifiers on a small subset of Database B (a few samples per speaker).
3. Test on the rest of Database B.