1
Nearest-Neighbor Classifiers Sec 4.7
2
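5 minutes of math... Definition: a metric function is a function $d(x, y)$ that obeys the following properties: Identity: $d(x, y) = 0$ if and only if $x = y$. Symmetry: $d(x, y) = d(y, x)$. Triangle inequality: $d(x, z) \le d(x, y) + d(y, z)$.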
3
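5 minutes of math... Examples: Euclidean distance: $d(x, y) = \sqrt{\sum_i (x_i - y_i)^2}$. * Note: omitting the square root no longer gives a true metric (the triangle inequality can fail), but it preserves the ordering of distances, so it usually won’t change our nearest-neighbor results.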
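A minimal Java sketch of the Euclidean distance on this slide, with and without the square root. The class and method names are illustrative assumptions, not taken from the slides.

```java
// Illustrative sketch; EuclideanDistance and its method names are hypothetical.
final class EuclideanDistance {
    /** Sum of squared coordinate differences (no square root). */
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; ++i) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Euclidean distance: square root of the squared distance. */
    static double distance(double[] a, double[] b) {
        return Math.sqrt(squaredDistance(a, b));
    }

    public static void main(String[] args) {
        double[] query = {0.0, 0.0};
        double[] near = {3.0, 4.0};
        double[] far = {6.0, 8.0};
        System.out.println(distance(query, near));        // 5.0
        System.out.println(squaredDistance(query, near)); // 25.0
        // Both versions agree on which point is nearer to the query.
        System.out.println(distance(query, near) < distance(query, far));               // true
        System.out.println(squaredDistance(query, near) < squaredDistance(query, far)); // true
    }
}
```

Because the square root is monotonic, comparing squared distances picks the same nearest point while skipping the `Math.sqrt` call.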
4
5 minutes of math... Examples: Manhattan (taxicab) distance: $d(x, y) = \sum_i |x_i - y_i|$. Distance travelled along a grid between two points; no diagonals allowed.
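The taxicab distance can be sketched in Java as a sum of absolute coordinate differences. The class name is an assumption made for illustration.

```java
// Illustrative sketch; ManhattanDistance is a hypothetical name.
final class ManhattanDistance {
    /** Sum of absolute coordinate differences (grid distance, no diagonals). */
    static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; ++i) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // 3 blocks east plus 4 blocks north.
        System.out.println(distance(new double[]{0.0, 0.0}, new double[]{3.0, 4.0})); // 7.0
    }
}
```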
5
5 minutes of math... Examples: What if some attribute is categorical? Typical answer is 0/1 distance: for each attribute, add 1 if the instances differ in that attribute, else 0. (To make Daniel happy, assuming categories are encoded as int codes: int d = 0; for (int i = 0; i < xa.length; ++i) { d += (xa[i] != xb[i]) ? 1 : 0; } )
6
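Distances in classification. Nearest neighbor: find the nearest instance to the query point in feature space and return the class of that instance. Simplest possible distance-based classifier. With more notation: given training data $(x_1, y_1), \dots, (x_N, y_N)$ and a query point $x$, let $i^* = \arg\min_i d(x, x_i)$ and return $y_{i^*}$. The distance function is anything appropriate to your data.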
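The nearest-neighbor rule can be sketched in Java as follows. This is a minimal illustration under stated assumptions: numeric features, string class labels, and squared Euclidean distance as the distance function. The class and method names are hypothetical.

```java
// Illustrative sketch; NearestNeighbor and its members are hypothetical names.
final class NearestNeighbor {
    private final double[][] points; // training instances
    private final String[] labels;   // class label of each instance

    /** "Training" just stores the data. */
    NearestNeighbor(double[][] points, String[] labels) {
        if (points.length == 0 || points.length != labels.length) {
            throw new IllegalArgumentException("need one label per training point");
        }
        this.points = points;
        this.labels = labels;
    }

    /** Scans every stored instance and returns the label of the closest one. */
    String classify(double[] query) {
        int best = 0;
        double bestDist = squaredDistance(points[0], query);
        for (int i = 1; i < points.length; ++i) {
            double d = squaredDistance(points[i], query);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return labels[best];
    }

    // Assumed distance; any function appropriate to the data could be swapped in.
    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; ++i) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}
```

In this sketch the constructor only stores the data, while each call to `classify` visits all N stored instances.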
7
Properties of NN Training time of NN? Classification time? Geometry of model? [Figure: distances d between a query point and training instances, marking which instance the query is closer to]
10
Eventually...
11
NN miscellany Slight generalization: k-nearest neighbors (k-NN). Find the k training instances closest to the query point. Vote among them for the label. Q: How does this affect the system? Q: Why does it work?
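A minimal Java sketch of the k-NN vote, assuming numeric features, string labels, and squared Euclidean distance. All names are illustrative. Ties go to whichever label reaches the top count first while scanning neighbors from nearest to farthest; that tie-break is a choice made for this sketch, not something the slides specify.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; KNearestNeighbors is a hypothetical name.
final class KNearestNeighbors {
    static String classify(double[][] points, String[] labels, double[] query, int k) {
        if (k < 1 || k > points.length || points.length != labels.length) {
            throw new IllegalArgumentException("need 1 <= k <= N and one label per point");
        }
        // Sort training indices by distance to the query point.
        Integer[] order = new Integer[points.length];
        for (int i = 0; i < order.length; ++i) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(
                (Integer i) -> squaredDistance(points[i], query)));

        // Count votes among the k nearest instances.
        Map<String, Integer> votes = new HashMap<>();
        String bestLabel = null;
        int bestCount = 0;
        for (int n = 0; n < k; ++n) {
            String label = labels[order[n]];
            int count = votes.merge(label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                bestLabel = label;
            }
        }
        return bestLabel;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; ++i) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}
```

With k = 1 this reduces to the plain nearest-neighbor rule. Larger k lets more distant instances outvote a few close ones.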
12
Geometry of k-NN [Figure: query point with a distance labeled d(7)]
13
Exercise Show that k-NN does something reasonable: Assume binary data. Let X be the query point and X’ be any k-neighbor of X. Let p = Pr[Class(X’) == Class(X)] (p > 1/2). What is Pr[X receives correct label]? What happens as k grows? But there are tradeoffs... Let V(k,N) = volume of the sphere enclosing the k neighbors of X, assuming N points in the data set. For fixed N, what happens to V(k,N) as k grows? For fixed k, what happens to V(k,N) as N grows? What about the radius of V(k,N)?
14
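Exercise What is Pr[X receives correct label]? Assuming each of the k neighbors agrees with Class(X) independently with probability p, and k is odd, the majority vote is correct when more than half of them agree: $\Pr[\text{correct}] = \sum_{j=(k+1)/2}^{k} \binom{k}{j} p^j (1-p)^{k-j}$.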
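To get a feel for the answer, the probability that a majority of k neighbors carries the correct label can be computed numerically. The Java sketch below assumes the neighbors' labels are independent, each correct with probability p (an idealization), and treats a tie as incorrect. Names are illustrative.

```java
// Illustrative sketch; MajorityVoteProbability is a hypothetical name.
final class MajorityVoteProbability {
    /** P(more than half of k independent votes are correct), each correct with probability p. */
    static double probabilityCorrect(int k, double p) {
        double total = 0.0;
        for (int j = k / 2 + 1; j <= k; ++j) {
            total += binomial(k, j) * Math.pow(p, j) * Math.pow(1.0 - p, k - j);
        }
        return total;
    }

    /** Binomial coefficient C(n, r), computed in floating point. */
    static double binomial(int n, int r) {
        double result = 1.0;
        for (int i = 1; i <= r; ++i) {
            result *= (double) (n - r + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        // Print how the probability changes as k grows, for p = 0.6.
        for (int k : new int[]{1, 3, 11, 51, 101}) {
            System.out.println("k=" + k + "  Pr[correct]=" + probabilityCorrect(k, 0.6));
        }
    }
}
```

For p = 0.6 the value is 0.6 at k = 1 and 0.648 at k = 3 (up to floating-point rounding), and it keeps rising toward 1 as k grows.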
15
Exercise What happens as k → ∞? Theorem: in the limit of large k, the binomial distribution is well approximated by the Gaussian: $\mathrm{Binomial}(k, p) \approx \mathcal{N}(kp,\ kp(1-p))$.
16
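Exercise So: $\Pr[\text{correct}] \approx \Phi\left(\frac{kp - k/2}{\sqrt{kp(1-p)}}\right) = \Phi\left(\frac{\sqrt{k}\,(p - 1/2)}{\sqrt{p(1-p)}}\right)$, where $\Phi$ is the standard normal CDF. Since p > 1/2, the argument grows like $\sqrt{k}$, so the probability tends to 1 as k → ∞.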
17
NN miscellany Gotcha: unscaled dimensions. What happens if one axis is measured in microns and one in light-years? Usual trick is to scale each axis to the [0,1] range. (Sometimes [-1,1] is useful as well.)
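The scaling trick can be sketched in Java as per-axis min-max scaling. This is a minimal illustration with hypothetical names. A constant axis is mapped to 0 to avoid dividing by zero, which is a choice made for this sketch.

```java
// Illustrative sketch; MinMaxScaler is a hypothetical name.
final class MinMaxScaler {
    /** Returns a copy of data with every column rescaled to the [0, 1] range. */
    static double[][] scale(double[][] data) {
        int rows = data.length;
        int cols = data[0].length;
        double[] min = new double[cols];
        double[] max = new double[cols];
        java.util.Arrays.fill(min, Double.POSITIVE_INFINITY);
        java.util.Arrays.fill(max, Double.NEGATIVE_INFINITY);
        // First pass: per-axis minimum and maximum.
        for (double[] row : data) {
            for (int j = 0; j < cols; ++j) {
                min[j] = Math.min(min[j], row[j]);
                max[j] = Math.max(max[j], row[j]);
            }
        }
        // Second pass: (x - min) / (max - min) on each axis.
        double[][] scaled = new double[rows][cols];
        for (int i = 0; i < rows; ++i) {
            for (int j = 0; j < cols; ++j) {
                double range = max[j] - min[j];
                scaled[i][j] = (range == 0.0) ? 0.0 : (data[i][j] - min[j]) / range;
            }
        }
        return scaled;
    }
}
```

Query points would need to be scaled with the same per-axis minimum and maximum as the training data, otherwise distances between them are not comparable.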