Instance Based Learning


Machine Learning: Instance Based Learning (Bai Xiao)

Instance-Based Learning
Unlike most learning algorithms, instance-based methods do not construct an explicit abstract generalization; they classify new instances by direct comparison and similarity to stored training instances.
Training can be very easy: just memorize the training instances.
Testing can be very expensive: it may require detailed comparison to all stored training instances.
Also known as: k-nearest neighbor, locally weighted regression, radial basis functions, case-based reasoning, lazy learning.

Similarity/Distance Metrics
Instance-based methods assume a function for determining the similarity or distance between any two instances.
For continuous feature vectors, Euclidean distance is the generic choice:
d(x, y) = \sqrt{ \sum_{p=1}^{n} \big( a_p(x) - a_p(y) \big)^2 }
where a_p(x) is the value of the p-th feature of instance x.
For discrete features, the distance between two values is 0 if they are the same and 1 if they differ (i.e., Hamming distance for bit vectors).
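As a minimal sketch (the function names and toy vectors are illustrative, not from the slides), the two metrics above can be written as:

```python
import numpy as np

def euclidean_distance(x, y):
    """d(x, y) = sqrt(sum_p (a_p(x) - a_p(y))^2) for continuous features."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.sqrt(np.sum((x - y) ** 2))

def hamming_distance(x, y):
    """Count of positions where discrete feature values differ (0/1 per feature)."""
    return sum(a != b for a, b in zip(x, y))

print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))            # 5.0
print(hamming_distance(['red', 'small'], ['red', 'large']))  # 1
```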

K-Nearest Neighbor
Calculate the distance between the test point and every training instance.
Pick the k closest training examples and assign the test instance to the most common category among these nearest neighbors.
Voting over multiple neighbors reduces susceptibility to noise.
An odd value of k is usually used to avoid ties.
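A minimal, self-contained sketch of the algorithm just described, assuming brute-force Euclidean distance and majority voting (the helper name and toy data are illustrative):

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_query, k=5):
    """Classify x_query by majority vote among its k nearest training points."""
    dists = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))  # Euclidean distances
    nearest = np.argsort(dists)[:k]                          # indices of k closest
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage: two well-separated clusters labeled 0 and 1
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_classify(X, y, np.array([0.5, 0.5]), k=3))  # -> 0
```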

5-Nearest Neighbor Example (figure not reproduced in the transcript)

Implicit Classification Function
Although it need not be computed explicitly, the learned classification rule is defined by the regions of feature space closest to each training example.
For 1-nearest neighbor with Euclidean distance, the Voronoi diagram gives the convex polyhedra that segment the space into the regions closest to each point.

Efficient Indexing
Linear search for the nearest neighbors is inefficient for large training sets, so indexing structures can be built to speed up testing.
For Euclidean distance, a kd-tree can be built that reduces the expected time to find the nearest neighbor to O(log n) in the number of training examples. Internal nodes branch on threshold tests of individual features, and leaves terminate at nearest neighbors.
Other indexing structures are possible for other metrics or for string data, e.g., an inverted index for text retrieval.
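For illustration, SciPy's cKDTree is one off-the-shelf kd-tree implementation; the random data set and query below are stand-ins, not from the slides:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X_train = rng.random((10_000, 3))      # 10,000 random points in 3-D
tree = cKDTree(X_train)                # build the kd-tree once

query = rng.random(3)
dists, idx = tree.query(query, k=5)    # expected O(log n) nearest-neighbor lookup
print(idx, dists)                      # indices and distances of the 5 closest points
```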

Nearest Neighbor Variations
kNN can also estimate the value of a real-valued function (regression) by taking the mean function value of the k nearest neighbors to the query point.
Alternatively, all training examples can contribute to classifying a test instance by giving each one a vote weighted by the inverse square of its distance from the test instance.
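A sketch of the second variation, inverse-square distance weighting over all training examples (sometimes called Shepard's method), shown here for regression; the function name and epsilon guard are illustrative choices:

```python
import numpy as np

def inverse_square_predict(X_train, y_train, x_query, eps=1e-12):
    """Weight EVERY training target by 1/d^2 from the query and average."""
    d2 = ((X_train - x_query) ** 2).sum(axis=1)
    if d2.min() < eps:                 # query coincides with a training point
        return float(y_train[np.argmin(d2)])
    w = 1.0 / d2                       # weight = inverse square of distance
    return float(np.dot(w, y_train) / w.sum())

# Toy usage: nearby targets dominate the weighted average
X = np.array([[0.0], [1.0], [10.0]])
y = np.array([0.0, 1.0, 100.0])
print(inverse_square_predict(X, y, np.array([0.9])))  # close to 1.0
```

For classification, the same weights would instead be summed per class, and the class with the largest total weight wins.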

Locally Weighted Regression
kNN forms a local approximation to the target function f for each query point x_q. Why not form an explicit approximation f'(x) for the region surrounding x_q?
Fit a linear function to the k nearest neighbors of x_q by minimizing the distance-based error over those neighbors, e.g.
E(x_q) = \frac{1}{2} \sum_{x \in kNN(x_q)} \big( f(x) - f'(x) \big)^2
and update the weights of the linear function by gradient descent.
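The slide suggests gradient descent; the sketch below instead uses a closed-form least-squares fit, which minimizes the same squared error over the k neighbors and keeps the example short. Names and toy data are illustrative:

```python
import numpy as np

def locally_weighted_predict(X_train, y_train, x_query, k=10):
    """Fit f'(x) = w0 + w . x to the k nearest neighbors of x_query by
    least squares, then evaluate the local model at x_query."""
    d = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))
    nn = np.argsort(d)[:k]                          # indices of the k closest points
    A = np.hstack([np.ones((k, 1)), X_train[nn]])   # design matrix with intercept
    w, *_ = np.linalg.lstsq(A, y_train[nn], rcond=None)
    return float(np.concatenate(([1.0], x_query)) @ w)

# Toy usage: a local linear fit tracks a smooth function near the query
X = np.linspace(0, 10, 200)[:, None]
y = np.sin(X).ravel()
print(locally_weighted_predict(X, y, np.array([5.0]), k=15))  # ~ sin(5)
```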

Radial Basis Functions
The learned target function is a weighted combination of kernel functions, each based on the distance to an instance x_u:
f'(x) = w_0 + \sum_{u=1}^{k} w_u \, K_u(d(x_u, x))
One common choice for K_u is the Gaussian kernel
K_u(d(x_u, x)) = \exp\!\left( -\frac{d^2(x_u, x)}{2\sigma_u^2} \right)

Radial Basis Function Training
Choose the centers x_u by scattering them uniformly throughout the instance space, or place them at (a sample of) the training instances.
Training is two-stage: first choose the center and variance defining each K_u; then hold the K_u fixed and train the linear output layer.
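A sketch of the two-stage procedure above, assuming centers drawn randomly from the training set and a single shared Gaussian width sigma (both are illustrative simplifications, not prescribed by the slides):

```python
import numpy as np

def train_rbf(X, y, n_centers=10, sigma=1.0, seed=0):
    """Stage 1: pick centers x_u from the training set (randomly here).
    Stage 2: hold the Gaussian kernels K_u fixed and fit the linear
    output weights (w_0, w_1, ..., w_k) by least squares."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=n_centers, replace=False)]

    def phi(Xq):
        # Squared distances from every query point to every center
        d2 = ((Xq[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        K = np.exp(-d2 / (2 * sigma ** 2))            # Gaussian kernel values K_u
        return np.hstack([np.ones((len(Xq), 1)), K])  # prepend column for w_0

    w, *_ = np.linalg.lstsq(phi(X), y, rcond=None)    # train the output layer
    return lambda Xq: phi(Xq) @ w

# Toy usage: approximate y = sin(x) on [0, 2*pi]
X = np.linspace(0, 2 * np.pi, 50)[:, None]
y = np.sin(X).ravel()
f_hat = train_rbf(X, y, n_centers=12, sigma=0.7)
print(float(f_hat(np.array([[np.pi / 2]]))))          # close to 1.0
```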

Lazy and Eager Learning
Lazy: wait for the query before generalizing, e.g., k-nearest neighbor.
Eager: generalize before seeing any query, e.g., naïve Bayes, decision trees.
An eager learner must commit to a single global approximation; a lazy learner can create many local approximations, one per query.