Instance-Based Learning: IB1 and IBK (small section in Chapter 20)
1-Nearest Neighbor
Basic distance function between attribute values:
–If real, the absolute value of the difference.
–If nominal, d(v1, v2) = 1 if v1 ≠ v2, else 0.
The distance between two instances is the square root of the sum of the squared per-attribute distances.
Usually normalize real-valued distances for fairness among attributes.
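Below is a minimal sketch of the distance computation just described, with real-valued attributes normalized by their range. The function names, the list-of-values instance representation, and the range-based normalization are assumptions for illustration, not taken from the chapter.

```python
import math

def attribute_distance(v1, v2, is_real, value_range=1.0):
    """Distance between two values of a single attribute (assumed helper)."""
    if is_real:
        # Real-valued: absolute difference, normalized by the attribute's range
        # so that no single attribute dominates the sum.
        return abs(v1 - v2) / value_range if value_range else 0.0
    # Nominal: 1 if the values differ, 0 if they match.
    return 0.0 if v1 == v2 else 1.0

def instance_distance(x, y, is_real_flags, ranges):
    """Square root of the sum of squared per-attribute distances."""
    total = 0.0
    for v1, v2, is_real, rng in zip(x, y, is_real_flags, ranges):
        d = attribute_distance(v1, v2, is_real, rng)
        total += d * d
    return math.sqrt(total)

# Example: one real attribute (range 100) and one nominal attribute.
print(instance_distance([25.0, "red"], [75.0, "blue"],
                        is_real_flags=[True, False], ranges=[100.0, 1.0]))
```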
Prediction
For an instance x, let y be the closest instance to x in the training set.
Predict that the class of x is the class of y.
On some data sets, this is the best algorithm.
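A short sketch of this prediction rule. The training set is assumed to be a list of (instance, class) pairs, and `distance` can be any pairwise distance function, e.g. the instance_distance sketch above.

```python
def predict_1nn(x, training_set, distance):
    """Predict the class of x as the class of its closest training instance y."""
    y, cls = min(training_set, key=lambda pair: distance(x, pair[0]))
    return cls
```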
Voronoi Diagram
For each point, draw the boundary of all points closest to it. Each point’s sphere of influence is convex. If the data are noisy, this can be bad. http://www.cs.cornell.edu/Info/People/chew/Delaunay.html - nice applet.
Problems and solutions
Noise
–Remove bad examples
–Use voting
Bad distance measure
–Use probability class vector
Memory
–Remove unneeded examples
Voting schemes
K nearest neighbor
–Let all of the closest k neighbors vote (use odd k).
Kernel K(x,y), a similarity function
–Let everyone vote, with weight decreasing according to K(x,y).
–Ex: K(x,y) = e^(-distance(x,y)^2)
–Ex: K(x,y) = inner product of x and y
–Ex: K(x,y) = inner product of f(x) and f(y), where f is some mapping of x and y into R^n.
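The sketch below shows both voting schemes, using the Gaussian kernel e^(-distance(x,y)^2) from the first example. The function names and the (instance, class) list representation are illustrative assumptions.

```python
import math
from collections import defaultdict

def predict_knn(x, training_set, k, distance):
    """Majority vote among the k nearest neighbors (use odd k to avoid ties)."""
    neighbors = sorted(training_set, key=lambda pair: distance(x, pair[0]))[:k]
    votes = defaultdict(int)
    for _, cls in neighbors:
        votes[cls] += 1
    return max(votes, key=votes.get)

def predict_kernel(x, training_set, distance):
    """Everyone votes, weighted by K(x, y) = exp(-distance(x, y)^2)."""
    votes = defaultdict(float)
    for y, cls in training_set:
        votes[cls] += math.exp(-distance(x, y) ** 2)
    return max(votes, key=votes.get)
```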
Choosing the parameter k
Divide the data into train and test.
Run multiple values of k on the training set.
Choose the k that does best on the test set.
NOT – you have now used the test data to pick k.
Internal Cross-validation
This can be used for selecting any parameter.
Divide the data into train and test.
Now do 10-fold CV on the training data to determine the appropriate value of k.
Note: never touch the test data.
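A rough sketch of internal cross-validation for choosing k, assuming the predict_knn helper and (instance, class) representation from the earlier sketches; the simple striped fold construction is an arbitrary choice for brevity.

```python
def choose_k_by_internal_cv(training_set, candidate_ks, distance, n_folds=10):
    """Pick k by 10-fold CV on the training data only; the test set is never used."""
    folds = [training_set[i::n_folds] for i in range(n_folds)]  # striped folds
    best_k, best_acc = None, -1.0
    for k in candidate_ks:
        correct, total = 0, 0
        for i in range(n_folds):
            held_out = folds[i]
            cv_train = [p for j, fold in enumerate(folds) if j != i for p in fold]
            for x, cls in held_out:
                correct += (predict_knn(x, cv_train, k, distance) == cls)
                total += 1
        accuracy = correct / total
        if accuracy > best_acc:
            best_k, best_acc = k, accuracy
    return best_k
```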
Probability Class Vector
Let A be an attribute with values v1, v2, ..., vn.
Suppose the classes are C1, C2, ..., Ck.
The probability class vector for vi is:
PCV(vi) = ( P(C1 | A = vi), P(C2 | A = vi), ..., P(Ck | A = vi) )
Distance(vi, vj) = the distance between the two probability class vectors.
PCV
If an attribute is irrelevant and v and v’ are two of its values, then PCV(v) ≈ PCV(v’), so the distance will be close to 0. This discounts irrelevant attributes.
It also works for real-valued attributes, after binning. Binning is a way to make real values symbolic: simply break the data into k bins (k = 5 or 10 seems to work), or use decision trees.
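A small sketch of how a PCV could be estimated from class counts and used as a distance between nominal values. The build_pcv and pcv_distance names and the (value, class) pair input are assumptions for illustration; values of an irrelevant attribute end up with nearly identical vectors, so their distance is near 0.

```python
import math
from collections import Counter, defaultdict

def build_pcv(value_class_pairs, classes):
    """Map each attribute value v to its probability class vector P(Cj | A = v)."""
    counts = defaultdict(Counter)
    for value, cls in value_class_pairs:
        counts[value][cls] += 1
    pcv = {}
    for value, cls_counts in counts.items():
        total = sum(cls_counts.values())
        pcv[value] = [cls_counts[c] / total for c in classes]
    return pcv

def pcv_distance(pcv, v1, v2):
    """Distance between two attribute values = distance between their PCVs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pcv[v1], pcv[v2])))
```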
Regression by NN
If 1-NN, use the value of the nearest example.
If k-NN, interpolate the values of the k nearest neighbors.
Kernel methods work too. You avoid the choice of k, but hide it in the choice of kernel function.
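A brief sketch of both regression variants, reusing the Gaussian kernel weighting from the voting sketch; the (instance, target value) pair representation and the distance argument are again illustrative assumptions.

```python
import math

def regress_knn(x, training_set, k, distance):
    """Predict the mean target value of the k nearest neighbors."""
    neighbors = sorted(training_set, key=lambda pair: distance(x, pair[0]))[:k]
    return sum(value for _, value in neighbors) / k

def regress_kernel(x, training_set, distance):
    """Kernel-weighted average of all target values (no explicit k to choose)."""
    weights = [math.exp(-distance(x, y) ** 2) for y, _ in training_set]
    total = sum(weights)
    return sum(w * value for w, (_, value) in zip(weights, training_set)) / total
```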
Summary
NN works for multi-class problems and regression.
Sometimes called the “poor man’s neural net”.
With enough data, its error rate is at most twice the Bayes-optimal error rate.
Misled by bad examples and bad features.
Separates classes via piecewise linear boundaries.