1
Instance-Based Learning (k-Nearest Neighbors)
2
The k-Nearest Neighbor Algorithm (1)
All instances correspond to points in the n-dimensional space.
Given an unknown tuple, the k-NN classifier searches the pattern space for the k training tuples that are closest to the unknown tuple.
The nearest neighbor is defined in terms of Euclidean distance, where a_r(x) denotes the r-th attribute of instance x:
d(x_i, x_j) = √(Σ_{r=1}^{n} (a_r(x_i) - a_r(x_j))²)
The target function may be discrete- or real-valued.
For a discrete-valued target, k-NN returns the most common value among the k training examples nearest to x_q.
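A minimal sketch of the neighbor search in Python (NumPy-based; the function name k_nearest and the array shapes are illustrative assumptions, not from the slides):

```python
import numpy as np

def k_nearest(X_train, x_q, k):
    """Return the indices of the k training points closest to the
    query x_q under Euclidean distance.
    X_train: (m, n) array of training points; x_q: (n,) query point."""
    dists = np.sqrt(((X_train - x_q) ** 2).sum(axis=1))  # d(x_q, x_i) for every training point
    return np.argsort(dists)[:k]                         # indices of the k smallest distances
```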
3
The k-Nearest Neighbor Algorithm (2)
Key idea: just store all training examples.
Nearest neighbor: given query instance x_q, first locate the nearest training example x_n, then estimate f(x_q) = f(x_n).
k-nearest neighbor: given x_q, take a vote among its k nearest neighbors (if the target function is discrete-valued), or take the mean of the f values of the k nearest neighbors (if real-valued):
f̂(x_q) = Σ_{i=1}^{k} f(x_i) / k
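Both estimates from this slide, as a hedged Python sketch (the name knn_predict and the discrete flag are illustrative; training data is assumed to be in NumPy arrays):

```python
from collections import Counter
import numpy as np

def knn_predict(X_train, y_train, x_q, k, discrete=True):
    """k-NN estimate of f(x_q): majority vote among the k nearest
    neighbors for a discrete target, mean of their f values otherwise."""
    idx = np.argsort(np.linalg.norm(X_train - x_q, axis=1))[:k]
    if discrete:
        return Counter(y_train[i] for i in idx).most_common(1)[0][0]  # most common value
    return sum(y_train[i] for i in idx) / k  # mean of the f values of the k neighbors
```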
4
Lazy vs Eager Learning
Instance-based learning: lazy evaluation.
Decision-tree and Bayesian classification: eager evaluation.
Key difference: a lazy learner may consider the query instance x_q when deciding how to generalize beyond the training data D; an eager learner cannot, since it has already committed to a global approximation before seeing the query.
Efficiency: lazy learners spend less time training but more time predicting.
Accuracy: a lazy learner effectively uses a richer hypothesis space, since it uses many local linear functions to form its implicit global approximation to the target function; an eager learner must commit to a single hypothesis that covers the entire instance space.
5
When to Consider Nearest Neighbors
Instances map to points in R^n.
Fewer than 20 attributes per instance, typically normalized.
Lots of training data.
Advantages: training is very fast; can learn complex target functions; does not lose information.
Disadvantages: slow at query time (presorting and indexing training samples into search trees reduces query time, as in the sketch below); easily fooled by irrelevant attributes.
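One way to index the training set, sketched with SciPy's kd-tree (assumes SciPy is available; the slide says only "search trees", so the choice of a kd-tree here is an assumption):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X_train = rng.random((10_000, 5))   # 10,000 training points in R^5
tree = cKDTree(X_train)             # presort/index once, at training time

x_q = rng.random(5)
dists, idx = tree.query(x_q, k=3)   # fast lookup of the 3 nearest neighbors
```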
6
How to Determine a Good Value for K
Determined experimentally:
Start with K = 1 and use a test set to estimate the error rate of the classifier.
Repeat with K = K + 1.
Choose the value of K for which the error rate is minimum.
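A sketch of that procedure (it reuses the hypothetical knn_predict helper from the earlier sketch and assumes a held-out test set; the cap k_max is an illustrative assumption):

```python
def choose_k(X_train, y_train, X_test, y_test, k_max=25):
    """Start with K = 1, measure the test-set error rate, repeat with
    K = K + 1, and return the K with the minimum error rate."""
    best_k, best_err = 1, float("inf")
    for k in range(1, k_max + 1):
        errors = sum(knn_predict(X_train, y_train, x, k) != y
                     for x, y in zip(X_test, y_test))
        err = errors / len(y_test)   # error rate of the K-NN classifier on the test set
        if err < best_err:
            best_k, best_err = k, err
    return best_k
```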
7
Definition of Voronoi Diagram
The decision surface induced by 1-NN for a typical set of training examples.
[Figure: Voronoi diagram around query point x_q, with positive (+) and negative (-) training examples]
8
1-Nearest Neighbor
[Figure: query point q_f and its single nearest neighbor q_i]
9
3-Nearest Neighbors
[Figure: query point q_f and its 3 nearest neighbors: 2 of class x, 1 of class o]
10
7-Nearest Neighbors
[Figure: query point q_f and its 7 nearest neighbors: 3 of class x, 4 of class o]
11
Distance-Weighted k-NN
Give more weight to neighbors closer to the query point:
f̂(x_q) = Σ_{i=1}^{k} w_i f(x_i) / Σ_{i=1}^{k} w_i
where w_i = K(d(x_q, x_i)) and d(x_q, x_i) is the distance between x_q and x_i.
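A sketch of the weighted estimate for a real-valued target, using an inverse-distance kernel as one common choice of K (the slide leaves the kernel unspecified, so that choice and the eps guard are assumptions):

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_q, k, eps=1e-12):
    """Distance-weighted k-NN for a real-valued target:
    f_hat(x_q) = sum(w_i * f(x_i)) / sum(w_i), with w_i = K(d(x_q, x_i)).
    X_train, y_train are NumPy arrays."""
    d = np.linalg.norm(X_train - x_q, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + eps)   # inverse-distance kernel; eps guards against d = 0
    return float(np.dot(w, y_train[idx]) / w.sum())
```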
12
Curse of Dimensionality
Imagine instances described by 20 attributes, of which only 3 are relevant to the target function. Curse of dimensionality: the nearest neighbor is easily misled when the instance space is high-dimensional.
One approach: stretch the j-th axis by weight z_j, where z_1, ..., z_n are chosen to minimize prediction error. Use cross-validation to automatically choose the weights z_1, ..., z_n. Note that setting z_j to zero eliminates that dimension altogether (feature subset selection). A sketch of the stretched distance follows.
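A minimal sketch of the stretched distance (the weight vector z is assumed to be supplied by cross-validation, which is not shown here):

```python
import numpy as np

def stretched_distance(x_q, x_i, z):
    """Euclidean distance after stretching the j-th axis by weight z_j.
    Setting z_j = 0 eliminates dimension j (feature subset selection)."""
    return np.linalg.norm(z * (x_q - x_i))
```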
13
Bibliography
Data Mining: Concepts and Techniques, J. Han & M. Kamber, Morgan Kaufmann, 2001 (Sect. 7.7.1).
Machine Learning, Tom Mitchell, McGraw-Hill, 1997 (Chapter 8).