1 Lazy Learning k-Nearest Neighbour Motivation: availability of large amounts of processing power improves our ability to tune k-NN classifiers

2 What is Lazy Learning? Compare ANNs with CBR or k-NN classifiers
– Artificial Neural Networks are eager learners: training examples are compiled into a model at training time and are not available at runtime
– CBR or k-Nearest Neighbour classifiers are lazy: little offline learning is done and the work is deferred to runtime
Compare the conventional use of lazy vs. eager evaluation in computer science

3 Outline
Classification problems
Classification techniques
k-Nearest Neighbour
– condensing the training set
– feature selection
– feature weighting
Ensemble techniques in ML

4 Classification problems An exemplar is characterised by a set of features; decide the class to which the exemplar belongs. Compare regression problems: the exemplar is again characterised by a set of features, but the task is to decide the value of a continuous output (dependent) variable.

5 Classifying apples and pears To what class does this belong?

6 Distance/Similarity Function For query q and training set X (described by features F) compute d(x, q) for each x ∈ X, where
d(x, q) = Σ_{f ∈ F} w_f · δ(x_f, q_f)
and where δ(x_f, q_f) = |x_f − q_f| for numeric features, and δ(x_f, q_f) = 0 if x_f = q_f, 1 otherwise, for symbolic features (w_f is the weight of feature f).
The category of q is decided by its k Nearest Neighbours.
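A minimal sketch of this distance-and-voting scheme in Python (the function names, the 0/1 overlap for symbolic features, and the apple/pear toy data are illustrative assumptions, not taken from the slides):

```python
def distance(x, q, weights=None):
    """Weighted distance d(x, q) summed over the features of x and q.
    Numeric features contribute |x_f - q_f|; symbolic features contribute
    0 if the values match and 1 otherwise."""
    if weights is None:
        weights = [1.0] * len(q)
    total = 0.0
    for w, xf, qf in zip(weights, x, q):
        if isinstance(xf, (int, float)) and isinstance(qf, (int, float)):
            total += w * abs(xf - qf)                 # numeric feature
        else:
            total += w * (0.0 if xf == qf else 1.0)   # symbolic feature
    return total

def knn_classify(X, y, q, k=3, weights=None):
    """Assign q the majority class among its k nearest neighbours in X."""
    nearest = sorted(range(len(X)), key=lambda i: distance(X[i], q, weights))[:k]
    votes = {}
    for i in nearest:
        votes[y[i]] = votes.get(y[i], 0) + 1
    return max(votes, key=votes.get)

# Toy usage: two numeric features (e.g. width, height), apple vs. pear.
X = [(7.0, 7.5), (7.2, 7.0), (6.0, 9.5), (5.8, 10.0)]
y = ['apple', 'apple', 'pear', 'pear']
print(knn_classify(X, y, q=(6.1, 9.0), k=3))          # -> pear
```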

7 k-NN and Noise 1-NN is easy to implement
– but susceptible to noise: a misclassification every time a noisy pattern is retrieved
k-NN with k ≥ 3 will overcome this

8 e.g. Pregnancy prediction http://svr-www.eng.cam.ac.uk/projects/qamc/

9 e.g. MVT Machine Vision for inspection of PCBs –components present or absent –solder joints good or bad

10 Components present? (figure: example images of an absent and a present component)

11 Characterise image as a set of features

12 Classification techniques
Artificial Neural Networks
– also good for non-linear regression
– black box: development is tricky and users do not know what is going on
Decision Trees
– built using induction (information-theoretic analysis)
k-Nearest Neighbour classifiers
– keep the training examples, find the k nearest at run time

13 Dimension reduction in k-NN
Not all features are required
– noisy features are a hindrance
Some examples are redundant
– retrieval time depends on the number of examples
Goal: reduce the m training examples described by p features to n covering examples described by the q best features

14 Condensed NN
D: set of training samples. Find E ⊆ D such that the NN rule used with E is as good as with D.
choose x ∈ D randomly; D ← D \ {x}; E ← {x}
DO
  learning? ← FALSE
  FOR EACH x ∈ D
    classify x by NN using E
    IF classification incorrect THEN E ← E ∪ {x}; D ← D \ {x}; learning? ← TRUE
WHILE (learning? ≠ FALSE)
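A sketch of this procedure in Python, assuming a generic `dist` function such as the weighted distance above (the names `condensed_nn`, `remaining`, etc. are mine):

```python
import random

def condensed_nn(D, labels, dist):
    """Hart-style Condensed NN: retain a subset E of the training data such
    that every discarded example is still classified correctly by 1-NN on E.
    Returns the indices of the retained examples."""
    remaining = list(range(len(D)))
    random.shuffle(remaining)                  # plain CNN: random presentation order
    E = [remaining.pop()]                      # seed E with one example
    learning = True
    while learning:
        learning = False
        still_remaining = []
        for i in remaining:
            nearest = min(E, key=lambda j: dist(D[i], D[j]))
            if labels[nearest] != labels[i]:   # misclassified: absorb i into E
                E.append(i)
                learning = True
            else:
                still_remaining.append(i)
        remaining = still_remaining
    return E
```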

15 Condensed NN (figure: 100 examples, 2 categories, different CNN solutions)

16 Improving Condensed NN
Sort the data based on distance to the nearest unlike neighbour (points A and B in the diagram); see the sketch below
– identifies exemplars near the decision surface
– in the diagram, B is more useful than A
Plain CNN gives different outcomes depending on data order
– that's a bad thing in an algorithm
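A hedged sketch of that ordering (function names are mine): compute each example's distance to its nearest unlike neighbour and present the data to the condensing step in that order, replacing the random shuffle in the `condensed_nn` sketch above.

```python
def nun_distance(i, D, labels, dist):
    """Distance from example i to its nearest unlike neighbour (NUN):
    the closest example carrying a different class label."""
    return min(dist(D[i], D[j]) for j in range(len(D)) if labels[j] != labels[i])

def nun_order(D, labels, dist):
    """Indices of D sorted so that examples lying closest to the decision
    surface (smallest NUN distance) are presented first."""
    return sorted(range(len(D)), key=lambda i: nun_distance(i, D, labels, dist))
```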

17 Condensed NN (figure: 100 examples, 2 categories; different CNN solutions vs. CNN using the NUN ordering)

18 Feature selection
Irrelevant features are noise
– they make classification harder
Extra features add to the computation cost
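As one illustration of discarding irrelevant features, here is a hedged sketch of a greedy wrapper approach (an illustrative choice, not something the slides prescribe): features are added one at a time while the leave-one-out accuracy of the k-NN classifier improves. It reuses `knn_classify` from the earlier sketch.

```python
def loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of k-NN using only the given feature subset."""
    correct = 0
    for i in range(len(X)):
        train_X = [[X[j][f] for f in features] for j in range(len(X)) if j != i]
        train_y = [y[j] for j in range(len(X)) if j != i]
        query = [X[i][f] for f in features]
        if knn_classify(train_X, train_y, query, k) == y[i]:
            correct += 1
    return correct / len(X)

def forward_selection(X, y, n_features, k=3):
    """Greedily add the feature that most improves leave-one-out accuracy;
    stop when no addition helps."""
    selected, remaining = [], list(range(n_features))
    while remaining:
        best = max(remaining, key=lambda f: loo_accuracy(X, y, selected + [f], k))
        if selected and loo_accuracy(X, y, selected + [best], k) <= loo_accuracy(X, y, selected, k):
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```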

19 Ensemble techniques
For the user with more machine cycles than they know what to do with
(diagram: several Classifiers feed a Combiner that produces the Outcome)
Build several classifiers
– different training subsets
– different feature subsets
Aggregate results
– voting, e.g. each classifier's vote weighted by its generalisation error
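A hedged sketch of such an ensemble (the subsampling fractions and the plain majority vote are illustrative assumptions; weighting votes by each member's generalisation error would be a refinement). It reuses `knn_classify` from the earlier sketch.

```python
import random

def build_ensemble(X, y, n_members=10, example_frac=0.7, feature_frac=0.7):
    """Each ensemble member is just a random subset of training examples
    plus a random subset of features; k-NN needs no further training."""
    n, p = len(X), len(X[0])
    members = []
    for _ in range(n_members):
        rows = random.sample(range(n), max(1, int(example_frac * n)))
        feats = random.sample(range(p), max(1, int(feature_frac * p)))
        members.append((rows, feats))
    return members

def ensemble_classify(X, y, members, q, k=3):
    """Combine the members' predictions for q by simple majority vote."""
    votes = {}
    for rows, feats in members:
        member_X = [[X[i][f] for f in feats] for i in rows]
        member_y = [y[i] for i in rows]
        member_q = [q[f] for f in feats]
        label = knn_classify(member_X, member_y, member_q, k)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```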

20 Conclusions Finding a covering set of training data –very good solutions exist Compare with results of Ensemble techniques

