1 Nearest Neighbor Learning
Greg Grudic (Notes borrowed from Thomas G. Dietterich and Tom Mitchell)
Intro AI
2 Nearest Neighbor Algorithm
Given training data $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, define a distance metric between points in input space. Common measures are:
–Euclidean (squared): $d(\mathbf{x}, \mathbf{x}') = \sum_{j=1}^{d} (x_j - x'_j)^2$
–Weighted Euclidean: $d(\mathbf{x}, \mathbf{x}') = \sum_{j=1}^{d} w_j (x_j - x'_j)^2$
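A minimal NumPy sketch of the two measures; the per-feature weights `w` in the weighted version are assumed to be chosen by the user:

```python
import numpy as np

def squared_euclidean(x, xp):
    """Squared Euclidean distance between two feature vectors."""
    d = np.asarray(x) - np.asarray(xp)
    return float(np.dot(d, d))

def weighted_squared_euclidean(x, xp, w):
    """Weighted squared Euclidean distance; w holds one
    nonnegative, user-chosen weight per feature."""
    d = np.asarray(x) - np.asarray(xp)
    return float(np.dot(np.asarray(w), d * d))
```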
3 K-Nearest Neighbor Model
Given a test point $\mathbf{x}$, find the K nearest training inputs to $\mathbf{x}$ under the distance metric. Denote these points and their labels as $(\mathbf{x}_{(1)}, y_{(1)}), \ldots, (\mathbf{x}_{(K)}, y_{(K)})$.
4 K-Nearest Neighbor Model
Regression: $\hat{y} = \frac{1}{K} \sum_{k=1}^{K} y_{(k)}$
Classification (majority vote): $\hat{y} = \operatorname{argmax}_{c} \sum_{k=1}^{K} \mathbf{1}[y_{(k)} = c]$
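A sketch of both prediction rules, assuming training inputs `X` (an N×d NumPy array) and targets `y`; the helper names are illustrative:

```python
import numpy as np
from collections import Counter

def knn_indices(X, x, k):
    """Indices of the k training points nearest to x
    under squared Euclidean distance."""
    dists = np.sum((X - x) ** 2, axis=1)
    return np.argsort(dists)[:k]

def knn_regress(X, y, x, k):
    """Regression: average the targets of the k nearest neighbors."""
    return y[knn_indices(X, x, k)].mean()

def knn_classify(X, y, x, k):
    """Classification: majority vote among the k nearest neighbors."""
    votes = y[knn_indices(X, x, k)]
    return Counter(votes.tolist()).most_common(1)[0][0]
```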
5 K-Nearest Neighbor Model: Weighted by Distance
Weight each neighbor by its inverse squared distance to the test point, $w_k = 1 / d(\mathbf{x}, \mathbf{x}_{(k)})^2$.
Regression: $\hat{y} = \frac{\sum_{k=1}^{K} w_k\, y_{(k)}}{\sum_{k=1}^{K} w_k}$
Classification: $\hat{y} = \operatorname{argmax}_{c} \sum_{k=1}^{K} w_k\, \mathbf{1}[y_{(k)} = c]$
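A sketch of the distance-weighted variant with the inverse squared-distance weights above; the small `eps` is an added guard against a zero distance when the test point coincides with a training point:

```python
import numpy as np

def knn_weighted_regress(X, y, x, k, eps=1e-12):
    """Distance-weighted k-NN regression: closer neighbors
    get proportionally larger weights."""
    dists = np.sum((X - x) ** 2, axis=1)
    nn = np.argsort(dists)[:k]
    w = 1.0 / (dists[nn] + eps)        # inverse squared distance
    return float(np.dot(w, y[nn]) / w.sum())

def knn_weighted_classify(X, y, x, k, eps=1e-12):
    """Distance-weighted vote: sum the weights going to each class."""
    dists = np.sum((X - x) ** 2, axis=1)
    nn = np.argsort(dists)[:k]
    w = 1.0 / (dists[nn] + eps)
    classes = np.unique(y[nn])
    scores = [w[y[nn] == c].sum() for c in classes]
    return classes[int(np.argmax(scores))]
```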
6 Picking K (and )
Given N training examples, use N-fold (leave-one-out) cross validation:
–Search over K = 1, 2, 3, …, Kmax; choose the search size Kmax based on compute constraints
–Calculate the average error for each K: predict the class of each training point using all other points to build the model, then average over all training examples
Pick the K that minimizes the cross-validation error.
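A self-contained sketch of this leave-one-out search over K; the helper name `loo_pick_k` is illustrative, and the pairwise distances are computed once and reused for every K:

```python
import numpy as np
from collections import Counter

def loo_pick_k(X, y, k_max):
    """Leave-one-out cross validation over K = 1..k_max:
    classify each training point using all other points,
    and return the K with the lowest average error."""
    n = len(X)
    # pairwise squared Euclidean distances between training points
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)       # a point is never its own neighbor
    order = np.argsort(d2, axis=1)     # each row: other points, nearest first
    errors = []
    for k in range(1, k_max + 1):
        mistakes = 0
        for i in range(n):
            votes = y[order[i, :k]]
            pred = Counter(votes.tolist()).most_common(1)[0][0]
            mistakes += (pred != y[i])
        errors.append(mistakes / n)
    return int(np.argmin(errors)) + 1  # best K (K is 1-indexed)
```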
7 Class Decision Boundaries: The Voronoi Diagram
Each line segment is equidistant between points in opposite classes. The more points, the more complex the boundaries.
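One way to see these boundaries is to rasterize the 1-NN decision regions on a dense grid, so that region borders trace the Voronoi segments between opposite-class points; a sketch assuming 2-D inputs `X` and integer labels `y`:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_1nn_regions(X, y, resolution=200):
    """Color each grid cell by the class of its nearest
    training point, revealing the Voronoi decision boundaries."""
    x_min, y_min = X.min(axis=0) - 1.0
    x_max, y_max = X.max(axis=0) + 1.0
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, resolution),
                         np.linspace(y_min, y_max, resolution))
    grid = np.c_[xx.ravel(), yy.ravel()]
    # nearest training point for every grid location
    d2 = ((grid[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    labels = y[np.argmin(d2, axis=1)].reshape(xx.shape)
    plt.contourf(xx, yy, labels, alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    plt.show()
```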
8 K-Nearest Neighbor Algorithm Characteristics
Universal Approximator
–Can model any many-to-one mapping arbitrarily well
Curse of Dimensionality: can be easily fooled in high-dimensional spaces
–Dimensionality reduction techniques are often used
Model can be slow to evaluate for large training sets
–kd-trees can help (low-dimensional problems); see the sketch after the next slide
–Cover Trees (high-dimensional): http://hunch.net/~jl/projects/cover_tree/cover_tree.html
–Selectively storing data points also helps
9 kd-trees
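A minimal usage sketch with SciPy's cKDTree, one widely used kd-tree implementation: build the tree once over the training inputs, then each neighbor query is fast (roughly logarithmic in N for low-dimensional data):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 3))   # 10k training points in 3-D
tree = cKDTree(X)                      # build the kd-tree once

x = np.zeros(3)                        # a test point
dists, idx = tree.query(x, k=5)        # its 5 nearest neighbors
print(idx, dists)
```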