Pattern Classification via Density Estimation


Pattern Classification via Density Estimation. ECE 539, Jin Woo Yim, 12/14/2010

Motivation: In class, we discussed the MAP classifier and the ML classifier as part of pattern classification. In practice, the densities underlying the training sets are often unknown. We discussed many approaches for dealing with this situation, but the most fundamental one is to estimate the densities from the training sets themselves. The estimated densities can then be used in a decision rule; when there are 2 classes, for example, this is binary hypothesis testing.

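How to estimate density? Histogram Density Estimator: the most intuitive way. Example: feature dimension = 2, number of bins = 16. With 100 training samples, the number of training samples falling in the ith bin, at position (3,2), is 15.
[Figure: feature space divided into bins labelled Bin 1, Bin 2, Bin 3, ..., Bin 16; the ith bin is at position (3,2).]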
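The counting step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the project: the slide gives neither the feature range nor the bin size, so the unit square [0, 1)^2, the 4 x 4 grid, and the function name `histogram_density` are all hypothetical. The density estimate in a bin is the bin count divided by (number of samples x bin area).

```python
def histogram_density(samples, lo, hi, bins_per_dim):
    """2-D histogram density estimate on the square [lo, hi)^2.

    Returns (counts, density), both indexed as [i][j] with 0-based bin indices.
    Samples outside the square are ignored in the counts but still included
    in n, so the estimate integrates to the fraction of samples inside.
    """
    width = (hi - lo) / bins_per_dim
    counts = [[0] * bins_per_dim for _ in range(bins_per_dim)]
    for x, y in samples:
        i = int((x - lo) // width)
        j = int((y - lo) // width)
        if 0 <= i < bins_per_dim and 0 <= j < bins_per_dim:
            counts[i][j] += 1
    n = len(samples)
    area = width * width
    density = [[c / (n * area) for c in row] for row in counts]
    return counts, density


# Assumed toy data mirroring the slide: 100 samples, 15 of them in bin (3,2)
# (1-based), which is index [2][1] here. The remaining 85 sit in the first bin.
samples = [(0.625, 0.375)] * 15 + [(0.125, 0.125)] * 85
counts, density = histogram_density(samples, 0.0, 1.0, 4)
print(counts[2][1])   # 15
print(density[2][1])  # 15 / (100 * 0.0625) = 2.4
```

With a different feature range the bin area changes, and the estimate 15 / (100 x area) changes with it; only the count of 15 out of 100 comes from the slide.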

How good is this estimate? It will be assessed using the Mean Squared Error (the expected value of the squared L2 norm of the difference between the true density and the estimated density). The Mean Squared Error can be decomposed into a bias term and a variance term. These terms depend on the dimension of the feature space, the number of training samples, and the number of bins.
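The decomposition can be written out explicitly. What follows is the standard pointwise identity, not a derivation taken from the slides. The symbols p (true density), \hat{p}_n (estimate from n samples), B_i (the bin containing x), V (bin volume), k_i (count in B_i), and P_i (probability mass of B_i) are notation introduced here, and the binomial statement assumes the training samples are i.i.d.

```latex
\mathrm{MSE}(x) = \mathbb{E}\big[(\hat{p}_n(x) - p(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{p}_n(x)] - p(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathrm{Var}\big[\hat{p}_n(x)\big]}_{\text{variance}}

% Histogram estimator for x in bin B_i:
\hat{p}_n(x) = \frac{k_i}{nV}, \qquad
k_i \sim \mathrm{Binomial}(n, P_i), \qquad
P_i = \int_{B_i} p(u)\,du

\mathbb{E}[\hat{p}_n(x)] = \frac{P_i}{V}, \qquad
\mathrm{Var}[\hat{p}_n(x)] = \frac{P_i(1 - P_i)}{nV^2}
```

Under these assumptions, the bias depends only on how well the bin average P_i / V approximates p(x), so it shrinks as bins get smaller. The variance shrinks as n grows but increases as V shrinks. Integrating MSE(x) over x gives the L2-norm criterion described above.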

Comparison between Theory and Practice: Write MATLAB code to estimate a Gaussian density (and possibly other densities) from n Gaussian training samples. Observe how the estimated density performs as the number of samples and the number of feature dimensions increase. Compare the simulated results with the theoretical results.
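A minimal sketch of such an experiment is given below, written in Python rather than MATLAB and restricted to one feature dimension for brevity. Everything in it is an assumption made for illustration: the standard normal as the true density, the range [-4, 4], 16 bins, 30 trials, and the function names. It estimates the mean integrated squared error by averaging over repeated draws of the training set.

```python
import math
import random


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """True density used as the reference: N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def histogram_estimate(samples, lo, hi, num_bins):
    """1-D histogram density estimate: count / (n * bin width) per bin."""
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in samples:
        k = int((x - lo) // width)
        if 0 <= k < num_bins:
            counts[k] += 1
    n = len(samples)
    return [c / (n * width) for c in counts]


def integrated_squared_error(density, lo, hi, points_per_bin=20):
    """Midpoint-rule approximation of the integral of (estimate - true)^2 on [lo, hi)."""
    width = (hi - lo) / len(density)
    step = width / points_per_bin
    total = 0.0
    for k, value in enumerate(density):
        for m in range(points_per_bin):
            x = lo + k * width + (m + 0.5) * step
            total += (value - gaussian_pdf(x)) ** 2 * step
    return total


def mean_ise(n, num_bins, trials=30, lo=-4.0, hi=4.0, seed=0):
    """Average the integrated squared error over independent training sets of size n."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
        density = histogram_estimate(samples, lo, hi, num_bins)
        errors.append(integrated_squared_error(density, lo, hi))
    return sum(errors) / trials


if __name__ == "__main__":
    for n in (50, 500, 5000):
        print(n, mean_ise(n, num_bins=16))
```

With the number of bins held fixed, the variance part of the error should fall roughly as 1/n while the bias part stays constant, so the printed error is expected to decrease and then level off. Extending the sketch to more feature dimensions requires a multi-dimensional bin index and a bin volume in place of the bin width.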