Machine Learning: UNIT-4 CHAPTER-1

Machine Learning: UNIT-4 CHAPTER-1: Instance-Based Learning (IBL). By A. Naga Raju

General Description. IBL methods learn by simply storing the presented training data. When a new query instance is encountered, a set of similar, related instances is retrieved from memory and used to classify the query. IBL approaches can construct a different approximation to the target function for each distinct query; that is, they build local rather than global approximations. IBL methods can also use complex symbolic representations for instances; when they do, the approach is called Case-Based Reasoning (CBR).

Advantages and Disadvantages of IBL Methods. Advantage: IBL methods are particularly well suited to problems in which the target function is very complex but can still be described by a collection of less complex local approximations. Disadvantage I: the cost of classifying new instances can be high, since most of the computation takes place at classification time rather than at training time. Disadvantage II: many IBL approaches consider all attributes of the instances when computing similarity, so they are very sensitive to the curse of dimensionality.

k-Nearest Neighbour Learning
Assumption: all instances x correspond to points in the n-dimensional space R^n, described by x = <a_1(x), a_2(x), ..., a_n(x)>.
Distance measure: Euclidean distance, $d(x_i, x_j) = \sqrt{\sum_{r=1}^{n} \big(a_r(x_i) - a_r(x_j)\big)^2}$.
Training algorithm: for each training example <x, f(x)>, add the example to the list training_examples.
Classification algorithm: given a query instance x_q to be classified, let x_1, ..., x_k be the k instances from training_examples that are nearest to x_q, and return
$\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} \delta(v, f(x_i))$,
where $\delta(a, b) = 1$ if a = b and $\delta(a, b) = 0$ otherwise.
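A minimal Python sketch of the training and classification steps above, under the Euclidean-distance setting of the slide; the function names and the toy data are illustrative, not part of the original material:

```python
import math
from collections import Counter

def euclidean(xi, xj):
    # d(xi, xj) = sqrt(sum_r (a_r(xi) - a_r(xj))^2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def knn_classify(training_examples, xq, k=3):
    # "Training" is just storing (x, f(x)) pairs; all work happens at query time.
    neighbours = sorted(training_examples, key=lambda ex: euclidean(ex[0], xq))[:k]
    # Majority vote among the k nearest neighbours: argmax over v of sum_i delta(v, f(x_i)).
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy usage:
data = [((1.0, 1.0), '+'), ((1.2, 0.8), '+'), ((5.0, 5.0), '-'), ((5.5, 4.5), '-')]
print(knn_classify(data, (1.1, 0.9), k=3))   # '+'
```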

Example (figure): a 2-D plot of training instances labelled + and -, a query point x_q, and the decision surface for 1-NN; the 1-NN prediction for x_q is +.

Distance-Weighted Nearest Neighbour. k-NN can be refined by weighting the contribution of the k neighbours according to their distance from the query point x_q, giving greater weight to closer neighbours. To do so, replace the last line of the classification algorithm with $\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} w_i \, \delta(v, f(x_i))$, where $w_i = 1 / d(x_q, x_i)^2$.
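A sketch of the distance-weighted rule; the small eps guarding against a zero distance (when x_q coincides with a training instance) is an implementation choice added here, not something stated on the slide:

```python
import math
from collections import defaultdict

def euclidean(xi, xj):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def weighted_knn_classify(training_examples, xq, k=3, eps=1e-12):
    neighbours = sorted(training_examples, key=lambda ex: euclidean(ex[0], xq))[:k]
    votes = defaultdict(float)
    for x, label in neighbours:
        # w_i = 1 / d(x_q, x_i)^2, so closer neighbours contribute more to the vote.
        votes[label] += 1.0 / (euclidean(x, xq) ** 2 + eps)
    return max(votes, key=votes.get)
```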

Remarks on k-NN. k-NN can be used for regression as well as classification. It is robust to noise and is generally quite a good classifier. Its main disadvantage is that it uses all attributes when computing distances between instances. Solution 1: weight the attributes differently (use cross-validation to determine the weights), as sketched below. Solution 2: eliminate the least relevant attributes (again, use cross-validation to determine which attributes to eliminate).
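One possible realization of Solution 1 is a per-attribute weighted Euclidean distance whose weights would be tuned by cross-validation; the weights in the example below are purely illustrative:

```python
import math

def weighted_euclidean(xi, xj, attr_weights):
    # Attributes with larger weights dominate the distance; a weight of 0 drops an attribute
    # entirely, which makes Solution 2 (attribute elimination) a special case of Solution 1.
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(attr_weights, xi, xj)))

# Example: the second attribute is judged (e.g. by cross-validation) twice as relevant,
# and the third attribute is treated as irrelevant.
print(weighted_euclidean((1.0, 2.0, 9.0), (1.5, 2.5, 3.0), attr_weights=(1.0, 2.0, 0.0)))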

Locally Weighted Regression Locally weighted regression generalizes nearest-neighbour approaches by constructing an explicit approximation to f over a local region surrounding xq. In such approaches, the contribution of each training example is weighted by its distance to the query point.

An Example: Locally Weighted Linear Regression. Here f is approximated by a linear function $\hat{f}(x) = w_0 + w_1 a_1(x) + \dots + w_n a_n(x)$. Gradient descent can be used to find the coefficients w_0, w_1, ..., w_n that minimize some error function. The error function, however, should differ from the global one used when training a neural network, since we want a local fit around x_q. Different possibilities: (1) minimize the squared error over just the k nearest neighbours; (2) minimize the squared error over the entire training set, but weight the contribution of each example by some decreasing function K of its distance from x_q; (3) combine 1 and 2. A sketch of possibility 2 follows.
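This sketch implements possibility 2 (weight every training example's squared error by a kernel of its distance to x_q), but solves it in closed form with weighted least squares instead of the gradient descent mentioned on the slide; numpy, the Gaussian kernel, and the bandwidth value are assumptions made for the example:

```python
import numpy as np

def locally_weighted_linear_prediction(X, y, xq, bandwidth=1.0):
    """Predict f(xq) by fitting w0 + w1*a1 + ... + wn*an with every example weighted
    by a Gaussian kernel K of its distance to xq."""
    X = np.asarray(X, dtype=float)            # shape (m, n)
    y = np.asarray(y, dtype=float)            # shape (m,)
    xq = np.asarray(xq, dtype=float)          # shape (n,)
    dists = np.linalg.norm(X - xq, axis=1)
    K = np.exp(-(dists ** 2) / (2 * bandwidth ** 2))   # decreasing function of distance
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])      # prepend bias column for w0
    W = np.diag(K)
    # Weighted least squares: solve (Xb^T W Xb) w = Xb^T W y.
    w = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)
    return np.array([1.0, *xq]) @ w

# Toy usage: noiseless y = 2x, queried near x = 1.5
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 2.0, 4.0, 6.0]
print(locally_weighted_linear_prediction(X, y, [1.5]))   # close to 3.0
```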

Radial Basis Functions (RBF). Approximating function: $\hat{f}(x) = w_0 + \sum_{u=1}^{k} w_u K_u(d(x_u, x))$, where $K_u(d(x_u, x))$ is a kernel function that decreases as the distance $d(x_u, x)$ increases (e.g., a Gaussian), and k is a user-defined constant that specifies the number of kernel functions to include. Although $\hat{f}(x)$ is a global approximation to f(x), the contribution of each kernel function is localized. RBF networks can be implemented as neural networks and trained with a very efficient two-step algorithm: (1) find the parameters of the kernel functions (e.g., use the EM algorithm); (2) learn the linear weights of the kernel functions.
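A sketch of the two-step procedure: kernel centres are placed with a simple k-means loop (standing in for the EM step mentioned on the slide), and the linear weights w_0, ..., w_k are then fit by least squares; numpy, the Gaussian kernels with a fixed width, and all helper names are assumptions made for the example:

```python
import numpy as np

def fit_rbf(X, y, k=3, width=1.0, iters=20, seed=0):
    """Two-step RBF fit: (1) choose k kernel centres x_u, (2) solve for w_0..w_k."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)

    # Step 1: simple k-means to place the kernel centres.
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2), axis=1)
        for u in range(k):
            if np.any(labels == u):
                centres[u] = X[labels == u].mean(axis=0)

    # Step 2: Gaussian kernels K_u(d(x_u, x)) plus a bias column, then linear least squares.
    def design(Xq):
        d = np.linalg.norm(Xq[:, None, :] - centres[None, :, :], axis=2)
        Phi = np.exp(-(d ** 2) / (2 * width ** 2))
        return np.hstack([np.ones((len(Xq), 1)), Phi])

    w, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return lambda Xq: design(np.asarray(Xq, dtype=float)) @ w

# Toy usage: approximate sin(x) on [0, 6] with 5 kernels.
X = np.linspace(0, 6, 30).reshape(-1, 1)
y = np.sin(X).ravel()
rbf = fit_rbf(X, y, k=5)
print(rbf([[1.0], [3.0]]))   # roughly sin(1.0) and sin(3.0)
```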

Case-Based Reasoning (CBR). CBR is similar to k-NN methods in that: (1) both are lazy learning methods that defer generalization until a query arrives; (2) both classify new query instances by analyzing similar stored instances while ignoring instances that are very different from the query. However, CBR differs from k-NN in that instances are not represented as real-valued points but with a rich symbolic representation. CBR can thus be applied to complex conceptual problems such as the design of mechanical devices or legal reasoning.

Lazy versus Eager Learning. Lazy methods: k-NN, locally weighted regression, CBR. Eager methods: RBF networks and all the other methods studied in the course so far. Differences in computation time: lazy methods learn quickly (they only store the data) but classify slowly; eager methods learn slowly but classify quickly. Differences in classification approach: lazy methods effectively search a larger hypothesis space than eager methods, because they use many different local functions to form their implicit global approximation to the target function, whereas eager methods must commit at training time to a single global approximation.