Nonparametric Density Estimation – k-nearest neighbor (kNN) 02/20/17

ECE 471/571 – Lecture 10
Nonparametric Density Estimation – k-nearest neighbor (kNN)
02/20/17

Different Approaches - More Detail
Pattern Classification
- Statistical Approach
  - Supervised
    - Basic concepts: Bayesian decision rule (MPP, LR, discriminant functions)
    - Parametric learning (ML, BL)
    - Non-parametric learning (kNN)
    - NN (Perceptron, BP)
  - Unsupervised
    - Basic concepts: distance
    - Agglomerative method
    - k-means
    - Winner-take-all
    - Kohonen maps
- Syntactic Approach
Supporting topics:
- Dimensionality reduction: Fisher's linear discriminant, K-L transform (PCA)
- Performance evaluation: ROC curve; TP, TN, FN, FP
- Stochastic methods: local optimization (GD), global optimization (SA, GA)
ECE471/571, Hairong Qi

Bayes Decision Rule (Recap)
- Maximum posterior probability
- Likelihood ratio
- Discriminant function: the three cases
- Estimating Gaussian and two-modal Gaussian densities
- Dimensionality reduction
- Performance evaluation and the ROC curve

Motivation
Estimate the density functions directly from the data, without assuming that the pdf has a particular parametric form.

*Problems with Parzen Windows
- The window function is discontinuous -> replace it with a Gaussian window
- The choice of the window width h
- Still another issue: the cell volume is fixed

kNN (k-Nearest Neighbor)
To estimate p(x) from n samples, we can center a cell at x and let it grow until it contains k_n samples, where k_n can be some function of n. If V is the volume of that cell, the estimate is p(x) ≈ (k_n / n) / V. Normally, we let k_n = √n.
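Below is a minimal sketch of this estimate in Python, assuming the cell is a Euclidean hypersphere and k_n = √n; the names `hypersphere_volume` and `knn_density` and the toy Gaussian example are illustrative, not from the lecture.

```python
# kNN density estimate: grow a hypersphere around x until it encloses k_n
# samples, then set p(x) ~ (k_n / n) / V, with k_n = sqrt(n).
import numpy as np
from math import gamma, pi, sqrt

def hypersphere_volume(radius, d):
    """Volume of a d-dimensional ball with the given radius."""
    return (pi ** (d / 2) / gamma(d / 2 + 1)) * radius ** d

def knn_density(x, samples):
    """Estimate p(x) from the n x d array `samples` using the kNN rule."""
    n, d = samples.shape
    k = max(1, int(round(sqrt(n))))          # k_n = sqrt(n)
    dists = np.linalg.norm(samples - x, axis=1)
    radius = np.sort(dists)[k - 1]           # radius that just encloses k samples
    V = hypersphere_volume(radius, d)
    return (k / n) / V

# Example: estimate a standard bivariate normal density at the origin
rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 2))
print(knn_density(np.zeros(2), data))        # should be near 1/(2*pi) ~ 0.159
```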

kNN in Classification
Given c training sets from c classes, the total number of samples is n = n_1 + n_2 + … + n_c. *Given a point x at which we wish to determine the statistics, we find the hypersphere of volume V that just encloses k points from the combined set. If, within that volume, k_m of those points belong to class m, then we estimate the class-conditional density by p(x|ω_m) ≈ k_m / (n_m V). Since the priors can be estimated as P(ω_m) ≈ n_m / n and the mixture density as p(x) ≈ k / (nV), the posterior follows as P(ω_m|x) ≈ k_m / k.
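A small sketch of this per-class estimate, under the assumption that the cell is a Euclidean hypersphere enclosing the k nearest pooled samples; `knn_class_densities` and its arguments are hypothetical names used only for illustration.

```python
# Per-class kNN density estimate: the hypersphere around x that just encloses
# k pooled samples has volume V; if k_m of those samples carry label m, then
# p(x | class m) ~ k_m / (n_m * V).
import numpy as np
from math import gamma, pi

def ball_volume(r, d):
    return (pi ** (d / 2) / gamma(d / 2 + 1)) * r ** d

def knn_class_densities(x, X, y, k):
    """X: pooled n x d samples, y: their class labels, k: neighborhood size."""
    dists = np.linalg.norm(X - x, axis=1)
    order = np.argsort(dists)
    V = ball_volume(dists[order[k - 1]], X.shape[1])   # volume enclosing k points
    labels_in_ball = y[order[:k]]
    densities = {}
    for m in np.unique(y):
        k_m = np.sum(labels_in_ball == m)
        n_m = np.sum(y == m)
        densities[m] = k_m / (n_m * V)                 # estimate of p(x | class m)
    return densities
```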

kNN Classification Rule
The decision rule tells us to look in the neighborhood of the unknown feature vector that contains its k nearest samples. If, within that neighborhood, more of those samples lie in class i than in any other class, we assign the unknown to class i; this is equivalent to picking the class with the largest k_i, i.e., the largest estimated posterior k_i/k.
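The rule thus reduces to a majority vote among the k nearest training samples. A hedged sketch follows, assuming a Euclidean metric; `knn_classify` and the toy data are illustrative only.

```python
# kNN decision rule: majority vote among the k nearest training samples.
import numpy as np
from collections import Counter

def knn_classify(x, X_train, y_train, k=5):
    """Assign x to the class most common among its k nearest neighbors."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest_labels = y_train[np.argsort(dists)[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Example usage with two toy classes
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(knn_classify(np.array([2.5, 2.5]), X, y, k=5))   # likely class 1
```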

Problems
- What kind of distance should be used to measure "nearest"? The Euclidean metric is a reasonable choice.
- Computational burden: the distance from the unknown sample to every stored training sample must be computed.
- Massive storage burden: all training samples must be kept.

Computational Complexity of kNN
- High cost in both space (storage) and time (search)
- Algorithms for reducing the computational burden:
  - Computing partial distances
  - Prestructuring
  - Editing the stored prototypes

Method 1 - Partial Distance
Calculate the distance using only some subset r of the full d dimensions. If this partial distance is already too large, we do not need to compute the remaining terms: for a metric such as the Euclidean distance, the partial distance can only grow as more dimensions are added, so what we know about the subspace is indicative of the full space.
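A minimal sketch of the idea for nearest-neighbor search with squared Euclidean distances, where a candidate is abandoned as soon as its partial sum exceeds the best distance seen so far; the function name `nn_partial_distance` is an assumption.

```python
# Partial-distance search: accumulate the squared Euclidean distance one
# dimension at a time and abandon a candidate once the partial sum already
# exceeds the best full distance found so far.
import numpy as np

def nn_partial_distance(x, X_train):
    """Index of the nearest training sample, with early abandonment."""
    best_idx, best_d2 = -1, np.inf
    for i, s in enumerate(X_train):
        d2 = 0.0
        for xj, sj in zip(x, s):
            d2 += (xj - sj) ** 2      # partial squared distance over the first r dims
            if d2 >= best_d2:         # already worse than the best candidate:
                break                 # remaining dimensions can only increase it
        else:                         # loop finished: full distance beats the best so far
            best_idx, best_d2 = i, d2
    return best_idx
```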

Method 2 - Prestructuring
Prestructure the samples in the training set into a search tree in which samples are selectively linked. Distances are calculated between the test sample and only a few stored "root" samples, and we then consider only the samples linked to the closest root. Finding the closest distance is no longer guaranteed: the true nearest neighbor may lie in a branch that is never searched.
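A two-level sketch of prestructuring, assuming the "roots" are simply a random subset of the training samples and every sample is linked to its nearest root; this is one possible realization of the idea, not the specific search tree used in the lecture.

```python
# Two-level prestructuring: pick a few root samples, link every training
# sample to its nearest root, and at query time search only the bucket of
# the nearest root (approximate nearest neighbor).
import numpy as np

def build_buckets(X_train, n_roots=10, seed=0):
    rng = np.random.default_rng(seed)
    roots = X_train[rng.choice(len(X_train), n_roots, replace=False)]
    # link every training sample to its closest root
    link = np.argmin(
        np.linalg.norm(X_train[:, None, :] - roots[None, :, :], axis=2), axis=1)
    return roots, link

def approx_nn(x, X_train, roots, link):
    r = np.argmin(np.linalg.norm(roots - x, axis=1))   # nearest root only
    bucket = np.flatnonzero(link == r)                 # samples linked to that root
    return bucket[np.argmin(np.linalg.norm(X_train[bucket] - x, axis=1))]
```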

Method 3 – Editing/Pruning/Condensing
Eliminate "useless" samples during training; for example, eliminate samples that are surrounded by training points of the same category label. The goal is to shrink the stored set while leaving the decision boundaries unchanged.
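One possible realization of this editing step: discard any sample whose nearest neighbors (among the remaining samples) all carry its own label. The specific criterion and the name `edit_training_set` are assumptions for illustration.

```python
# Editing sketch: drop training samples whose k nearest neighbors all share
# their own label, i.e. samples in the interior of their class region that
# do not shape the decision boundary.
import numpy as np

def edit_training_set(X, y, k=3):
    keep = np.ones(len(X), dtype=bool)
    for i in range(len(X)):
        dists = np.linalg.norm(X - X[i], axis=1)
        dists[i] = np.inf                           # exclude the sample itself
        neighbor_labels = y[np.argsort(dists)[:k]]
        if np.all(neighbor_labels == y[i]):         # surrounded by its own class
            keep[i] = False                         # safe to discard
    return X[keep], y[keep]
```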

Discussion
- The three methods above can be combined
- Other concerns:
  - The choice of k
  - The choice of metric

Distance (Metrics) Used by kNN
Properties of a metric D:
- Nonnegativity: D(a,b) >= 0
- Reflexivity: D(a,b) = 0 iff a = b
- Symmetry: D(a,b) = D(b,a)
- Triangle inequality: D(a,b) + D(b,c) >= D(a,c)
Different metrics - the Minkowski family L_k(a,b) = (sum_{i=1..d} |a_i - b_i|^k)^(1/k):
- k = 1: Manhattan distance (city-block distance)
- k = 2: Euclidean distance
- k -> infinity: the maximum of the projected distances onto each of the d coordinate axes (the L_inf, or Chebyshev, metric)
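A short sketch of the Minkowski family and its special cases; `minkowski` is an illustrative helper, not a function defined in the lecture.

```python
# Minkowski metric L_k(a,b) = (sum_i |a_i - b_i|^k)^(1/k):
# k = 1 is the Manhattan (city-block) distance, k = 2 the Euclidean distance,
# and k -> infinity the maximum coordinate-wise (Chebyshev) distance.
import numpy as np

def minkowski(a, b, k):
    if np.isinf(k):
        return np.max(np.abs(a - b))              # L_inf: max projected distance
    return np.sum(np.abs(a - b) ** k) ** (1.0 / k)

a, b = np.array([0.0, 0.0]), np.array([3.0, 4.0])
print(minkowski(a, b, 1))        # 7.0  (Manhattan)
print(minkowski(a, b, 2))        # 5.0  (Euclidean)
print(minkowski(a, b, np.inf))   # 4.0  (Chebyshev / L_inf)
```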

Visualizing the Distances