ECE 471/571 – Pattern Recognition Lecture 4 – Parametric Estimation


ECE 471/571 – Pattern Recognition
Lecture 4 – Parametric Estimation
Hairong Qi, Gonzalez Family Professor
Electrical Engineering and Computer Science, University of Tennessee, Knoxville
http://www.eecs.utk.edu/faculty/qi
Email: hqi@utk.edu

Pattern Classification
- Statistical Approach
  - Supervised
    - Basic concepts: Bayesian decision rule (MPP, LR, Discriminant)
    - Parameter estimation (ML, BL)
    - Non-parametric learning (kNN)
    - LDF (Perceptron)
    - NN (BP)
    - Support Vector Machine
    - Deep Learning (DL)
  - Unsupervised
    - Basic concepts: Distance
    - Agglomerative method
    - k-means
    - Winner-takes-all
    - Kohonen maps
    - Mean-shift
- Non-Statistical Approach
  - Decision-tree
  - Syntactic approach

Cutting across both branches:
- Dimensionality Reduction: FLD, PCA
- Performance Evaluation: ROC curve (TP, TN, FN, FP), cross validation
- Stochastic Methods: local opt (GD), global opt (SA, GA)
- Classifier Fusion: majority voting, NB, BKS

Bayes Decision Rule

Maximum Posterior Probability: if P(ω1|x) > P(ω2|x), then x belongs to class 1; otherwise, class 2.

Likelihood Ratio: if l(x) = p(x|ω1)/p(x|ω2) > P(ω2)/P(ω1), then x belongs to class 1; otherwise, class 2.

Discriminant Function: gi(x) = ln p(x|ωi) + ln P(ωi), with three Gaussian cases:
- Case 1: Σi = σ²I – Minimum Euclidean Distance (Linear Machine)
- Case 2: Σi = Σ – Minimum Mahalanobis Distance (Linear Machine)
- Case 3: Σi arbitrary – Quadratic classifier
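As a concrete illustration, here is a minimal numpy sketch (function names are my own, not from the lecture) of the Case 3 discriminant gi(x) = -½ ln|Σi| − ½ (x−μi)ᵀΣi⁻¹(x−μi) + ln P(ωi); Cases 1 and 2 fall out by constraining Σi:

```python
import numpy as np

def quadratic_discriminant(x, mu, sigma, prior):
    """Case 3 discriminant g_i(x) for one class with arbitrary covariance."""
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)        # numerically stable ln|Sigma|
    maha = diff @ np.linalg.solve(sigma, diff)  # (x-mu)^T Sigma^{-1} (x-mu)
    return -0.5 * logdet - 0.5 * maha + np.log(prior)

def classify(x, class_params):
    """class_params: list of (mu, sigma, prior) per class; returns argmax_i g_i(x)."""
    scores = [quadratic_discriminant(x, mu, sigma, prior)
              for mu, sigma, prior in class_params]
    return int(np.argmax(scores))
```

With Σi = σ²I the logdet and prior terms are shared (for equal priors) and the rule reduces to nearest Euclidean mean; with a shared Σ it reduces to nearest Mahalanobis mean.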

Multivariate Normal Density

p(x) = 1 / ((2π)^(d/2) |Σ|^(1/2)) · exp( −½ (x−μ)ᵀ Σ⁻¹ (x−μ) )

where x is a d-dimensional feature vector, μ is the mean vector, and Σ is the d×d covariance matrix.
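A minimal sketch of evaluating this density with numpy (a hand-rolled version of what scipy.stats.multivariate_normal.pdf computes):

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Evaluate the d-dimensional Gaussian density p(x) at one point x."""
    d = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)        # ln|Sigma|
    maha = diff @ np.linalg.solve(sigma, diff)  # Mahalanobis term
    log_p = -0.5 * (d * np.log(2 * np.pi) + logdet + maha)
    return np.exp(log_p)                        # exponentiate last for stability
```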

Estimating Normal Densities

Given training samples {x1, ..., xn} from one class, calculate μ and Σ from the data:

μ̂ = (1/n) Σₖ xₖ
Σ̂ = (1/n) Σₖ (xₖ − μ̂)(xₖ − μ̂)ᵀ
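A minimal sketch of these two estimates, assuming the samples are stored row-wise in a numpy array:

```python
import numpy as np

def estimate_gaussian(X):
    """X: (n, d) array, one sample per row. Returns (mu_hat, sigma_hat)."""
    n = X.shape[0]
    mu = X.mean(axis=0)           # sample mean, shape (d,)
    diff = X - mu
    sigma = (diff.T @ diff) / n   # ML estimate divides by n (np.cov defaults to n-1)
    return mu, sigma
```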

Covariance

Σij = E[(xi − μi)(xj − μj)] measures how features i and j vary together. The covariance matrix Σ is symmetric and positive semidefinite; its diagonal entries are the variances of the individual features, and its off-diagonal entries capture the correlation between pairs of features.

*Why? – Maximum Likelihood Estimation

Treat the samples D = {x1, ..., xn} as fixed and compare the "likelihood" of candidate parameter values: assuming i.i.d. samples, the likelihood of θ with respect to D is p(D|θ) = Πₖ p(xₖ|θ). The ML estimate θ̂ is the value of θ that maximizes this likelihood, i.e., the value under which the observed training samples are most probable.

*Derivation

It is easier to work with the log-likelihood l(θ) = ln p(D|θ) = Σₖ ln p(xₖ|θ), which has the same maximizer. A necessary condition for the maximum is ∇θ l(θ) = 0. For a Gaussian with unknown mean μ, setting the gradient with respect to μ to zero gives

Σₖ Σ⁻¹(xₖ − μ̂) = 0  ⟹  μ̂ = (1/n) Σₖ xₖ,

and solving for an unknown covariance likewise yields Σ̂ = (1/n) Σₖ (xₖ − μ̂)(xₖ − μ̂)ᵀ – the sample mean and sample covariance used two slides earlier.
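A minimal numerical check of the derivation on hypothetical 1-D data: a grid search over the log-likelihood should peak at the sample mean.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=200)   # hypothetical 1-D samples

def log_likelihood(mu, x, sigma=1.0):
    """Sum of ln N(x_k; mu, sigma^2) over all samples."""
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (x - mu) ** 2 / (2 * sigma**2))

grid = np.linspace(0.0, 4.0, 401)
best = grid[np.argmax([log_likelihood(m, x) for m in grid])]
print(best, x.mean())   # the two values agree to grid resolution
```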

* Derivative of a Quadratic Form

The ML derivation uses two matrix-calculus identities:

∂(aᵀx)/∂x = a
∂(xᵀAx)/∂x = (A + Aᵀ)x, which reduces to 2Ax when A is symmetric.

With A = Σ⁻¹ (symmetric), differentiating (xₖ − μ)ᵀ Σ⁻¹ (xₖ − μ) with respect to μ gives −2Σ⁻¹(xₖ − μ), which leads directly to the sample-mean solution above.
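A minimal finite-difference check of the identity, using a hypothetical symmetric test matrix:

```python
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 1.0]])  # symmetric test matrix
x = np.array([1.0, -1.0])
f = lambda v: v @ A @ v                 # quadratic form x^T A x

eps = 1e-6
numeric = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                    for e in np.eye(2)])  # central differences per coordinate
analytic = 2 * A @ x                      # (A + A^T)x = 2Ax for symmetric A
print(np.allclose(numeric, analytic))     # True
```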

**Why? – Bayesian Estimation

Maximum likelihood estimation: the parameters are fixed but unknown. Find the value of θ that best agrees with, or supports, the actually observed training samples – the likelihood of θ with respect to the set of samples.

Bayesian estimation: treat the parameters as random variables themselves. A prior density over θ is converted by the training samples into a posterior density p(θ|D).

** The pdf of the parameter (μ) is Gaussian

Assume the only unknown parameter is the mean μ, with a Gaussian prior p(μ) = N(μ0, σ0²) encoding our best prior guess μ0 and its uncertainty σ0, and a Gaussian class-conditional density p(x|μ) = N(μ, σ²) with σ² known. Because the Gaussian prior is conjugate to the Gaussian likelihood, the posterior p(μ|D) is again Gaussian, N(μn, σn²).

** Derivation

By Bayes' rule, p(μ|D) ∝ p(D|μ) p(μ) = [Πₖ p(xₖ|μ)] p(μ). Every factor is Gaussian in μ, so the exponent is quadratic in μ; completing the square gives p(μ|D) = N(μn, σn²) with

μn = (n σ0² / (n σ0² + σ²)) m̂n + (σ² / (n σ0² + σ²)) μ0
1/σn² = n/σ² + 1/σ0²

where m̂n = (1/n) Σₖ xₖ is the sample mean.

** μn and σn

μn is our best guess for μ after observing n samples; σn measures our uncertainty about this guess.

Behavior of Bayesian learning:
- The larger n is, the smaller σn becomes – each additional observation decreases our uncertainty about the true value of μ.
- As n approaches infinity, p(μ|D) becomes more and more sharply peaked, approaching a Dirac delta function.
- μn is a linear combination of the sample mean m̂n and the prior mean μ0, with weights that shift toward the sample mean as n grows.
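A minimal sketch of this behavior using the update formulas above, with known σ and hypothetical prior values; μn converges to the true mean and σn shrinks toward zero as n grows:

```python
import numpy as np

def posterior(x, mu0, sigma0, sigma):
    """Posterior N(mu_n, sigma_n^2) for the mean, given samples x."""
    n = len(x)
    mn = x.mean()                                   # sample mean m_hat_n
    mu_n = (n * sigma0**2 * mn + sigma**2 * mu0) / (n * sigma0**2 + sigma**2)
    var_n = 1.0 / (n / sigma**2 + 1.0 / sigma0**2)  # 1/sigma_n^2 = n/s^2 + 1/s0^2
    return mu_n, np.sqrt(var_n)

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=1000)       # true mu = 5, known sigma = 2
for n in (1, 10, 100, 1000):
    mu_n, s_n = posterior(x[:n], mu0=0.0, sigma0=10.0, sigma=2.0)
    print(n, round(mu_n, 3), round(s_n, 3))         # mu_n -> 5, sigma_n -> 0
```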