Slide 1: Classification using instance-based learning
Advanced Knowledge Management, 3 March 2000

Slide 2: Introduction

- Lazy vs. eager learning
- Notion of similarity (proximity, the inverse of distance)
- k-Nearest Neighbor (k-NN) method
  - Distance-weighted k-NN
  - Axis stretching
- Locally weighted regression
- Case-based reasoning
- Summary: lazy vs. eager learning; classification methods (compared to decision trees)
- Reference: Chapter 8 of Machine Learning, Tom Mitchell, 1997

Slide 3: Instance-based Learning

Key idea: in contrast to learning methods that construct a general, explicit description of the target function as training examples are provided, instance-based learning constructs an approximation of the target function only when a new instance must be classified.

Training consists simply of storing all training examples $\langle x, f(x) \rangle$, where $x$ describes the attributes of each instance and $f(x)$ denotes its class (or value). To classify, use a nearest-neighbor method: given a query instance $x_q$, first locate the nearest (most similar) training example $x_n$, then estimate $\hat{f}(x_q) \leftarrow f(x_n)$.

Slide 4: Need to Consider

1) Similarity (how to calculate distance)
2) The number (and weighting) of similar (near) instances

Example: a simple 2-D case in which each instance is described by only two values (its x, y co-ordinates) and the class is either + or −.

Slide 5: Similarity

Similarity: Euclidean distance. More precisely, let an arbitrary instance $x$ be described by the feature vector (set of attributes) $\langle a_1(x), a_2(x), \ldots, a_n(x) \rangle$, where $a_r(x)$ denotes the value of the $r$-th attribute of instance $x$. Then the distance between two instances $x_i$ and $x_j$ is defined to be

$d(x_i, x_j) = \sqrt{\sum_{r=1}^{n} \left( a_r(x_i) - a_r(x_j) \right)^2}$
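As a concrete illustration, here is a minimal Python sketch of this distance function. Representing an instance as a plain sequence of numeric attribute values is our assumption, not something fixed by the slides.

```python
import math

def euclidean_distance(x_i, x_j):
    """d(x_i, x_j): Euclidean distance between two instances.

    Instances are plain sequences of numbers, so x_i[r] plays the
    role of a_r(x_i) in the slide's notation.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

# Example: euclidean_distance((1, 2), (4, 6)) == 5.0
```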

Slide 6: k-Nearest Neighbor

k-Nearest Neighbor: given a query instance $x_q$, take a vote among its k nearest neighbors to decide its class (return the most common value of $f$ among the k training examples nearest to $x_q$).

[Figure: in the 2-D example, simple 1-NN assigns the query the class +, while 5-NN assigns it the class −.]

Advantage: voting over k neighbors overcomes class-label noise in the training set.

Slide 7: Algorithms for k-NN

Consider the case of learning a discrete-valued target function of the form $f : \mathbb{R}^n \to V$, where $V = \{v_1, \ldots, v_s\}$.

Training algorithm: for each training example $\langle x, f(x) \rangle$, add the example to the list training_examples.

Key advantage: instead of estimating the target function once for the entire data set (which can lead to complex and not necessarily accurate functions), IBL estimates the required function locally and differently for each new instance to be classified.

Slide 8: Classification Algorithm

Given a query instance $x_q$ to be classified:

1. Let $x_1, \ldots, x_k$ denote the k instances from training_examples that are nearest to $x_q$.
2. Return

$\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} \delta(v, f(x_i))$

where $\delta(a, b) = 1$ if $a = b$ and $0$ otherwise, and $\arg\max_x f(x)$ returns the value of $x$ that maximises $f(x)$.
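A minimal Python sketch of this algorithm, reusing euclidean_distance from the Slide 5 sketch; representing training_examples as a list of (x, f_x) pairs is our assumption.

```python
from collections import Counter

def knn_classify(training_examples, x_q, k=5):
    """Majority vote among the k stored examples nearest to x_q.

    training_examples is a list of (x, f_x) pairs; Counter's internal
    ordering breaks ties between equally common classes.
    """
    neighbors = sorted(training_examples,
                       key=lambda ex: euclidean_distance(ex[0], x_q))[:k]
    votes = Counter(f_x for _, f_x in neighbors)  # sum of delta(v, f(x_i)) per class
    return votes.most_common(1)[0][0]             # argmax over classes v
```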

Slide 9: Lazy vs. Eager Learning

The k-NN method does not form an explicit hypothesis regarding the target classification function; it simply computes the classification of each new query instance as needed.

Implied hypothesis: a Voronoi diagram shows the shape of the implied decision surface for the simple 1-NN case. The decision surface is a combination of convex polyhedra surrounding each of the training examples: for every training example, its polyhedron indicates the set of query points whose classification will be completely determined by that training example.

Slide 10: Continuous vs. Discrete-valued Functions (Classes)

k-NN works well for discrete-valued target functions, and the idea extends directly to continuous (real-valued) functions. In this case we can take the mean of the $f$ values of the k nearest neighbors:

$\hat{f}(x_q) \leftarrow \frac{1}{k} \sum_{i=1}^{k} f(x_i)$
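The change to the discrete classifier sketch is one line: average the neighbors' values instead of voting (again reusing euclidean_distance and the (x, f_x) pair layout, which are our assumptions).

```python
def knn_regress(training_examples, x_q, k=5):
    """Estimate a real-valued f(x_q) as the mean f-value of the k nearest neighbors."""
    neighbors = sorted(training_examples,
                       key=lambda ex: euclidean_distance(ex[0], x_q))[:k]
    return sum(f_x for _, f_x in neighbors) / k
```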

Slide 11: When to Consider Nearest Neighbor?

- Instances map to points in $\mathbb{R}^n$
- A moderate number of attributes (e.g. fewer than 20 attributes per instance)
- Lots of training data
- The target function is complex but can be approximated by separate, simple local approximations

Note that efficient indexing methods (e.g. kd-trees) exist to make querying fast, as sketched below.
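For instance, SciPy's kd-tree answers nearest-neighbor queries quickly once built. This snippet is purely illustrative and not part of the original slides; the data and query point are made up.

```python
import numpy as np
from scipy.spatial import KDTree

rng = np.random.default_rng(0)
X = rng.random((10_000, 3))          # 10,000 training points in R^3
tree = KDTree(X)                     # build the index once, up front
dists, idx = tree.query([0.5, 0.5, 0.5], k=5)  # 5 nearest neighbors of the query
```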

Slide 12: Distance-weighted k-NN

We might want to weight nearer neighbors more heavily:

$\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} w_i \, \delta(v, f(x_i))$, where $w_i \equiv \frac{1}{d(x_q, x_i)^2}$

and $d(x_q, x_i)$ is the distance between $x_q$ and $x_i$. For continuous functions:

$\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} w_i \, f(x_i)}{\sum_{i=1}^{k} w_i}$

Note that it may now make more sense to use all training examples instead of just the k nearest.
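A sketch of the continuous-valued, distance-weighted variant (same assumed data layout as before; the zero-distance guard is our addition, needed because $w_i = 1/d^2$ is undefined when the query coincides with a stored example):

```python
def weighted_knn_regress(training_examples, x_q, k=5):
    """Distance-weighted estimate: sum(w_i * f(x_i)) / sum(w_i), with w_i = 1/d^2."""
    neighbors = sorted(training_examples,
                       key=lambda ex: euclidean_distance(ex[0], x_q))[:k]
    for x_i, f_x in neighbors:
        # An exact match would make 1/d^2 divide by zero; return f(x_i) directly.
        if euclidean_distance(x_i, x_q) == 0:
            return f_x
    weights = [1.0 / euclidean_distance(x_i, x_q) ** 2 for x_i, _ in neighbors]
    return sum(w * f_x for w, (_, f_x) in zip(weights, neighbors)) / sum(weights)
```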

Slide 13: Curse of Dimensionality

Imagine instances described by 20 attributes of which only 2 are relevant to the target function: instances that have identical values for the two relevant attributes may nevertheless be distant from one another in the 20-dimensional space. This is the curse of dimensionality: nearest neighbor is easily misled when X is high-dimensional (compare to decision trees).

One approach is to weight each attribute differently, using the training data (a sketch follows this list):
1) Stretch the $j$-th axis by a weight $z_j$, where $z_1, \ldots, z_n$ are chosen to minimise prediction error.
2) Use cross-validation to choose the weights $z_1, \ldots, z_n$ automatically.
3) Note that setting $z_j$ to zero eliminates dimension $j$ altogether.
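A minimal sketch of axis stretching as a weighted distance; choosing the weight vector z (e.g. by cross-validation) is left outside the snippet.

```python
import math

def stretched_distance(x_i, x_j, z):
    """Euclidean distance after stretching the j-th axis by weight z[j].

    Setting z[j] = 0 drops attribute j from the distance entirely; the
    z vector itself would be chosen to minimise prediction error.
    """
    return math.sqrt(sum((zj * (a - b)) ** 2 for zj, a, b in zip(z, x_i, x_j)))

# Example: with z = (1, 1, 0) the third attribute is ignored.
```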

Slide 14: Locally-weighted Regression

Basic idea: k-NN forms a local approximation to $f$ for each query point $x_q$. Why not form an explicit approximation $\hat{f}(x)$ for the region surrounding $x_q$? Fit a linear function to the k nearest neighbors (or a quadratic, ...), thus producing a "piecewise approximation" to $f$.
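A minimal NumPy sketch of one variant, fitting a kernel-weighted linear model around the query point. The Gaussian kernel and its width are our choices for illustration, not prescribed by the slides.

```python
import numpy as np

def locally_weighted_predict(X, y, x_q, kernel_width=1.0):
    """Fit a linear model weighted toward x_q and return its prediction there."""
    X1 = np.hstack([np.ones((len(X), 1)), X])     # add an intercept column
    d2 = np.sum((X - x_q) ** 2, axis=1)
    w = np.exp(-d2 / (2 * kernel_width ** 2))     # kernel K(d(x_q, x))
    sw = np.sqrt(w)
    # Weighted least squares: minimise sum_i w_i (y_i - beta . x1_i)^2.
    beta, *_ = np.linalg.lstsq(sw[:, None] * X1, sw * y, rcond=None)
    return np.concatenate(([1.0], x_q)) @ beta
```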

Slide 15: [Figure: training data fit by simple regression (f1) and by locally weighted, piecewise regression (f2, f3, f4), with the predicted value at the query point shown for both methods.]

Slide 16: Error Criteria

There are several choices of error to minimise, e.g. the squared error over the k nearest neighbors:

$E_1(x_q) \equiv \frac{1}{2} \sum_{x \in\, k\text{ nearest nbrs of } x_q} \left( f(x) - \hat{f}(x) \right)^2$

or the distance-weighted squared error over the entire set $D$ of training examples:

$E_2(x_q) \equiv \frac{1}{2} \sum_{x \in D} \left( f(x) - \hat{f}(x) \right)^2 K(d(x_q, x))$

where $K$ is a kernel function that decreases with distance.
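Expressed directly in Python (a sketch; f and f_hat as callables, and kernel as any decreasing function of distance, are our naming choices):

```python
def e1(nearest_neighbors, f, f_hat):
    """Criterion E1: squared error over the k nearest neighbors of x_q."""
    return 0.5 * sum((f(x) - f_hat(x)) ** 2 for x in nearest_neighbors)

def e2(D, f, f_hat, x_q, kernel):
    """Criterion E2: distance-weighted squared error over the whole data set D."""
    return 0.5 * sum((f(x) - f_hat(x)) ** 2 * kernel(euclidean_distance(x_q, x))
                     for x in D)
```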

Slide 17: Case-Based Reasoning

Instance-based learning can be applied even when $X \neq \mathbb{R}^n$; however, in this case we need different "distance" metrics. For example, case-based reasoning is instance-based learning applied to instances with symbolic logic descriptions:

((user-complaint error53-on-shutdown)
 (cpu-model PowerPC)
 (operating-system Windows)
 (network-connection PCIA)
 (memory 48meg)
 (installed-applications Excel Netscape VirusScan)
 (disk 1gig)
 (likely-cause ???))
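One crude way to get a similarity over such symbolic frames is the fraction of shared slots on which two cases agree; a real CBR system would use a much richer, knowledge-based metric. A hypothetical sketch:

```python
def case_similarity(case_a, case_b):
    """Fraction of shared attribute slots on which two cases agree.

    Cases are dicts mirroring the frame above, e.g.
    {"cpu-model": "PowerPC", "operating-system": "Windows", ...}.
    """
    shared = set(case_a) & set(case_b)
    if not shared:
        return 0.0
    return sum(case_a[s] == case_b[s] for s in shared) / len(shared)
```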

Slide 18: Summary

- Lazy vs. eager learning
- Notion of similarity (proximity, the inverse of distance)
- k-Nearest Neighbor (k-NN) method
- Distance-weighted k-NN
- Axis stretching (curse of dimensionality)
- Locally weighted regression
- Case-based reasoning
- Comparison to classification using decision trees

Slide 19: Next Lecture

Monday 6 March 2000, Dr. Yike Guo: Bayesian classification and Bayesian networks. Bring your Bayesian slides.