
Instance Based Learning

Nearest Neighbor
Remember all your data. When someone asks a question:
–Find the nearest old data point
–Return the answer associated with it
In order to say which point is nearest, we have to define what we mean by "near". Typically, we use the Euclidean distance between two points. For nominal attributes, the distance is set to 1 if the values are different and 0 if they are equal.
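As a rough illustration, here is a minimal sketch of such a distance in Python; the function name, the feature layout, and handling nominal attributes via an index set are assumptions made for this example, not part of the lecture.

```python
import math

def distance(x, y, nominal=frozenset()):
    """Distance between two examples: squared differences for numeric
    features, a 0/1 mismatch for nominal ones (indices listed in `nominal`)."""
    total = 0.0
    for j, (a, b) in enumerate(zip(x, y)):
        if j in nominal:
            total += 0.0 if a == b else 1.0   # nominal: 0 if equal, 1 if different
        else:
            total += (a - b) ** 2             # numeric: squared Euclidean term
    return math.sqrt(total)

# Example with one numeric and one nominal feature:
print(distance([0.3, "red"], [0.5, "blue"], nominal={1}))   # ≈ 1.02
```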

Predicting Bankruptcy

Now, let's say we have a new person with R (the ratio of earnings to expenses) equal to 0.3 and L (the number of late payments on credit cards over the past year) equal to 2. What y value should we predict? The nearest old data point is a "no", so our answer would be "no".

Scaling
The naïve Euclidean distance isn't always appropriate. Consider the case where we have two features describing a car:
–f1 = weight in pounds
–f2 = number of cylinders
Any effect of f2 will be completely lost because of the relative scales. So, rescale the inputs to put all of the features on about equal footing.
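One common choice, used here purely as an illustration (the lecture may use a different normalization), is to map each feature onto the range [0, 1]:

```python
def rescale(data):
    """Min-max scale each column of `data` (a list of feature lists)
    so that every feature ends up in the range [0, 1]."""
    cols = list(zip(*data))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in data]

cars = [[2800, 4], [4200, 8], [3300, 6]]   # [weight in pounds, cylinders]
print(rescale(cars))   # both features now vary over a comparable range
```

Dividing by each feature's standard deviation instead of its range is an equally common choice.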

Time and Space
Learning is fast:
–We just have to remember the training data, so the space required is n.
Answering a query takes longer. If we do it naively, then for each point in our training set (and there are n of them) we compute the distance to the query point, which takes about m computations, since there are m features to compare. So, overall, this takes about m * n time.
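A minimal sketch of that naive query loop (plain Python; the function and variable names and the toy data are assumptions for illustration):

```python
def nearest_label(query, X, y):
    """Naive nearest-neighbor query: about n * m work for n points, m features."""
    best_i, best_d = None, float("inf")
    for i, x in enumerate(X):                            # n points
        d = sum((a - b) ** 2 for a, b in zip(x, query))  # m features
        if d < best_d:
            best_i, best_d = i, d
    return y[best_i]

X = [[0.2, 1], [0.7, 5], [0.4, 2]]
y = ["no", "yes", "no"]
print(nearest_label([0.3, 2], X, y))   # -> "no"
```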

Noise
Someone with an apparently healthy financial record goes bankrupt.

Remedy: K-Nearest Neighbors
k-nearest neighbor algorithm:
–Just like the old algorithm, except that when we get a query, we search for the k closest points to the query point and output what the majority says (see the sketch below).
–In this case, we've chosen k to be 3.
–The three closest points consist of two "no"s and a "yes", so our answer would be "no".
Find the optimal k using cross-validation.
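A minimal k-NN sketch in Python; the names and the toy data are assumptions for illustration (this is not the lecture's bankruptcy data):

```python
from collections import Counter

def knn_predict(query, X, y, k=3):
    """Return the majority label among the k nearest training points."""
    order = sorted(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], query)))
    votes = [y[i] for i in order[:k]]
    return Counter(votes).most_common(1)[0][0]

X = [[0.2, 1], [0.3, 3], [0.35, 2.5], [0.4, 2], [0.8, 6]]
y = ["no", "no", "yes", "no", "yes"]
print(knn_predict([0.3, 2], X, y, k=3))   # two "no"s and one "yes" -> "no"
```

Choosing k by cross-validation simply wraps this in a loop over candidate values of k, scoring each on held-out folds.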

Other Variants
IB2: save memory, speed up classification
–Work incrementally
–Only incorporate misclassified instances
–Problem: noisy data gets incorporated
IB3: deal with noise
–Discard instances that don't perform well
–Keep a record of the number of correct and incorrect classification decisions that each exemplar makes.
–Two predetermined thresholds are set on the success ratio. If an exemplar's performance falls below the lower threshold, it is deleted; if it exceeds the upper threshold, it is used for prediction.

Instance-based learning: IB2
IB2: save memory, speed up classification
–Work incrementally
–Only incorporate misclassified instances
–Problem: noisy data gets incorporated
Data: “Who buys gold jewelry”
(25,60,no) (45,60,no) (50,75,no) (50,100,no) (50,120,no) (70,110,yes) (85,140,yes) (30,260,yes) (25,400,yes) (45,350,yes) (50,275,yes) (60,260,yes)

Instance-based learning: IB2
Data (in processing order):
–(25,60,no)
–(85,140,yes)
–(45,60,no)
–(30,260,yes)
–(50,75,no)
–(50,120,no)
–(70,110,yes)
–(25,400,yes)
–(50,100,no)
–(45,350,yes)
–(50,275,yes)
–(60,260,yes)
This is the final answer: of these, we memorize only 5 points. But let's build the classifier step by step.

Instance-based learning: IB2
Data:
–(25,60,no)
The first instance is simply memorized, since there is nothing yet to classify it against.

Instance-based learning: IB2
Data:
–(25,60,no)
–(85,140,yes)
Since so far the model has only the first instance memorized, this second instance gets wrongly classified. So, we memorize it as well.

Instance-based learning: IB2
Data:
–(25,60,no)
–(85,140,yes)
–(45,60,no)
So far the model has the first two instances memorized. The third instance gets properly classified, since it happens to be closer to the first. So, we don't memorize it.

Instance-based learning: IB2
Data:
–(25,60,no)
–(85,140,yes)
–(45,60,no)
–(30,260,yes)
So far the model has the first two instances memorized. The fourth instance gets properly classified, since it happens to be closer to the second. So, we don't memorize it.

Instance-based learning: IB2
Data:
–(25,60,no)
–(85,140,yes)
–(45,60,no)
–(30,260,yes)
–(50,75,no)
So far the model has the first two instances memorized. The fifth instance gets properly classified, since it happens to be closer to the first. So, we don't memorize it.

Instance-based learning: IB2
Data:
–(25,60,no)
–(85,140,yes)
–(45,60,no)
–(30,260,yes)
–(50,75,no)
–(50,120,no)
So far the model has the first two instances memorized. The sixth instance gets wrongly classified, since it happens to be closer to the second. So, we memorize it.

Instance-based learning: IB2
Continuing in a similar way, we finally get the figure on the right.
–The colored points are the ones that get memorized.
This is the final answer, i.e., we memorize only these 5 points.
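A minimal sketch of this IB2 loop in Python (a rough illustration, not the lecture's code; plain Euclidean distance on the raw attribute values is assumed, so the exact set of memorized points may differ slightly from the slide's figure depending on ties and scaling):

```python
import math

def ib2(data):
    """IB2: memorize an instance only if the instances stored so far misclassify it."""
    memory = [data[0]]                     # the first instance is always stored
    for x1, x2, label in data[1:]:
        nearest = min(memory, key=lambda m: math.dist((x1, x2), (m[0], m[1])))
        if nearest[2] != label:            # misclassified -> add it to memory
            memory.append((x1, x2, label))
    return memory

data = [(25,60,"no"), (85,140,"yes"), (45,60,"no"), (30,260,"yes"),
        (50,75,"no"), (50,120,"no"), (70,110,"yes"), (25,400,"yes"),
        (50,100,"no"), (45,350,"yes"), (50,275,"yes"), (60,260,"yes")]
print(ib2(data))   # the memorized exemplars, in the order they were added
```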

Instance-based learning: IB3
IB3: deal with noise
–Discard instances that don't perform well
–Keep a record of the number of correct and incorrect classification decisions that each exemplar makes.
–Two predetermined thresholds are set on the success ratio.
–An instance is used for training if the number of incorrect classifications is ≤ the first (lower) threshold, and the number of correct classifications is ≥ the second (upper) threshold.

Instance-based learning: IB3
Suppose the lower threshold is 0 and the upper threshold is 1. Shuffle the data first:
–(25,60,no)
–(85,140,yes)
–(45,60,no)
–(30,260,yes)
–(50,75,no)
–(50,120,no)
–(70,110,yes)
–(25,400,yes)
–(50,100,no)
–(45,350,yes)
–(50,275,yes)
–(60,260,yes)

Instance-based learning: IB3
Suppose the lower threshold is 0 and the upper threshold is 1. Shuffle the data first. Each pair [i, c] below records the number of incorrect (i) and correct (c) classification decisions made by that instance:
–(25,60,no) [1,1]
–(85,140,yes) [1,1]
–(45,60,no) [0,1]
–(30,260,yes) [0,2]
–(50,75,no) [0,1]
–(50,120,no) [0,1]
–(70,110,yes) [0,0]
–(25,400,yes) [0,1]
–(50,100,no) [0,0]
–(45,350,yes) [0,0]
–(50,275,yes) [0,1]
–(60,260,yes) [0,0]

Instance-based learning: IB3
The points that will be used in classification are exactly those with 0 incorrect and at least 1 correct decision:
–(45,60,no) [0,1]
–(30,260,yes) [0,2]
–(50,75,no) [0,1]
–(50,120,no) [0,1]
–(25,400,yes) [0,1]
–(50,275,yes) [0,1]
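A rough sketch of that final filtering step (assuming, as above, that the bracketed pairs are [incorrect, correct] counts; this interpretation is an inference, not something the slide states explicitly):

```python
# Each exemplar: (x1, x2, label, incorrect_count, correct_count)
records = [
    (25, 60, "no", 1, 1), (85, 140, "yes", 1, 1), (45, 60, "no", 0, 1),
    (30, 260, "yes", 0, 2), (50, 75, "no", 0, 1), (50, 120, "no", 0, 1),
    (70, 110, "yes", 0, 0), (25, 400, "yes", 0, 1), (50, 100, "no", 0, 0),
    (45, 350, "yes", 0, 0), (50, 275, "yes", 0, 1), (60, 260, "yes", 0, 0),
]

LOWER, UPPER = 0, 1   # thresholds from the slide

# Keep an exemplar only if incorrect <= LOWER and correct >= UPPER
kept = [(x1, x2, label) for x1, x2, label, inc, cor in records
        if inc <= LOWER and cor >= UPPER]
print(kept)   # the six points listed on the slide
```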

Rectangular generalizations
When a new exemplar is classified correctly, it is generalized by simply merging it with the nearest exemplar. The nearest exemplar may be either a single instance or a hyperrectangle.
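A minimal sketch of that merge in Python (the representation and names are assumptions for illustration; a stored single instance is treated as a degenerate rectangle):

```python
def merge(rect, point):
    """Grow an axis-aligned hyperrectangle (lo, hi bounds per dimension)
    so that it also covers `point`."""
    lo, hi = rect
    new_lo = tuple(min(l, p) for l, p in zip(lo, point))
    new_hi = tuple(max(h, p) for h, p in zip(hi, point))
    return (new_lo, new_hi)

# A stored instance (a degenerate rectangle) absorbs a correctly
# classified new instance of the same class:
rect = ((25, 60), (25, 60))
print(merge(rect, (45, 60)))   # ((25, 60), (45, 60))
```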

Rectangular generalizations
Data:
–(25,60,no)
–(85,140,yes)
–(45,60,no)
–(30,260,yes)
–(50,75,no)
–(50,120,no)
–(70,110,yes)
–(25,400,yes)
–(50,100,no)
–(45,350,yes)
–(50,275,yes)
–(60,260,yes)

Classification
(Figure: exemplar rectangles of Class 1 and Class 2, with the separation line between them.)
If the new instance lies within a rectangle, output that rectangle's class.
If the new instance lies in the overlap of several rectangles, output the class of the rectangle whose center is closest to the new data instance.
If the new instance lies outside all of the rectangles, output the class of the rectangle that is closest to the data instance.
The distance of a point from a rectangle is:
1. If the instance lies within the rectangle, d = 0.
2. If it lies outside, d = the distance to the closest part of the rectangle, i.e. to some point on the rectangle's boundary.
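A minimal sketch of the point-to-rectangle distance and the nearest-rectangle rule (the 2-D representation and all names are assumptions for illustration; the overlap tie-break by rectangle centers is omitted):

```python
import math

def rect_distance(point, rect):
    """Distance from a point to an axis-aligned rectangle given as (lo, hi) bounds:
    0 if the point is inside, otherwise the distance to the nearest boundary point."""
    lo, hi = rect
    gaps = [max(l - p, 0, p - h) for p, l, h in zip(point, lo, hi)]
    return math.hypot(*gaps)

def classify(point, labelled_rects):
    """`labelled_rects` is a list of (rect, label); return the nearest rectangle's label."""
    return min(labelled_rects, key=lambda r: rect_distance(point, r[0]))[1]

rects = [(((25, 60), (50, 120)), "no"), (((60, 110), (85, 400)), "yes")]
print(rect_distance((55, 130), ((25, 60), (50, 120))))   # ≈ 11.18 (point is outside)
print(classify((55, 130), rects))                        # -> "yes"
```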