
© 2003, G. Tecuci, Learning Agents Laboratory. Learning Agents Laboratory, Computer Science Department, George Mason University. Prof. Gheorghe Tecuci. 9. Instance-Based Learning

Overview: Exemplar-based representation of concepts. The k-nearest neighbor algorithm. Discussion. Lazy learning versus eager learning. Recommended reading.

Concept representation. Let us consider a set of concepts C = {c1, c2, ..., cn} covering a universe of instances I. Each concept ci represents a subset of I. How is a concept usually represented? How does one test whether an object 'a' is an instance of a concept ci?

Intensional representation of concepts. How is a concept usually represented? Usually, a concept is represented intensionally, by a description that covers the positive examples of the concept and does not cover the negative examples. How does one test whether an object 'a' is an instance of a concept ci? The set of instances represented by a concept ci is the set of instances covered by the description of ci. Therefore, testing whether an object 'a' is an instance of a concept ci reduces to testing whether the description of ci is more general than the description of 'a'. How could we represent a concept extensionally, without specifying all its instances?
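To make the generality test concrete, here is a minimal sketch. It assumes the simplest case, where instances and concepts are described by attribute-value pairs and a concept description may use the wildcard "?" for an attribute whose value does not matter; these representational choices are illustrative and not prescribed by the slides.

```python
# Sketch: testing whether an object is an instance of a concept that is
# represented intensionally by an attribute-value description.
# Assumption: a description maps each attribute either to a required value
# or to the wildcard "?" meaning "any value is acceptable".

def covers(concept_description, instance_description):
    """Return True if the concept description is more general than
    (i.e., covers) the description of the instance."""
    for attribute, required in concept_description.items():
        if required != "?" and instance_description.get(attribute) != required:
            return False
    return True

# Hypothetical example: the concept "red objects" covers a red square.
c = {"color": "red", "shape": "?"}
a = {"color": "red", "shape": "square"}
print(covers(c, a))   # True
```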

Exemplar-based representation of concepts. A concept ci may be represented extensionally by: a collection of exemplars ci = {ei1, ei2, ...}, a similarity estimation function f, and a threshold value θ. An instance 'a' belongs to the concept ci if 'a' is similar to an exemplar eij of ci and this similarity exceeds the threshold, that is, f(a, eij) > θ. How could a concept ci be generalized in this representation?
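A small sketch of this exemplar-based membership test, assuming numeric feature vectors and taking the similarity function f to be the inverse of the Euclidean distance; both the data and this particular choice of f are illustrative assumptions.

```python
import math

# Sketch: a concept c_i represented extensionally by a list of exemplars,
# a similarity function f, and a threshold theta.

def similarity(x, y):
    """Assumed similarity: inverse of (1 + Euclidean distance)."""
    d = math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
    return 1.0 / (1.0 + d)

def belongs_to(a, exemplars, theta):
    """'a' belongs to the concept if its similarity to at least one
    stored exemplar exceeds the threshold theta."""
    return any(similarity(a, e) > theta for e in exemplars)

c_i = [(1.0, 1.0), (1.2, 0.9)]              # exemplars of concept c_i
print(belongs_to((1.1, 1.0), c_i, 0.5))     # True: close to an exemplar
print(belongs_to((5.0, 5.0), c_i, 0.5))     # False: far from all exemplars
```

Note how the two generalization operations described on the next slide act on this sketch: appending a new exemplar to the list, or lowering theta, can only enlarge the set of instances for which belongs_to returns True.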

Generalization in exemplar-based representations. How could a concept ci be generalized in this representation? Generalizing the concept ci may be achieved by adding a new exemplar, or by decreasing the threshold θ. Why are these generalization operations? Is there an alternative to using a threshold value θ for classifying an instance?

Prediction with exemplar-based representations. Let us consider a set of concepts C = {c1, c2, ..., cn} covering a universe of instances I. Each concept ci is represented extensionally as a collection of exemplars ci = {ei1, ei2, ...}. Let 'a' be an instance to classify. How do we decide to which concept 'a' belongs? Different answers to this question lead to different learning methods.

Prediction (cont.). Let 'a' be an instance to classify into one of the classes {c1, c2, ..., cn}. How do we decide to which concept it belongs? Method 1: 'a' belongs to the concept ci if 'a' is similar to an exemplar eij of ci and this similarity is greater than the similarity between 'a' and any exemplar of any other concept (1-nearest neighbor). What is a potential problem with 1-nearest neighbor? Hint: think of an exemplar that is not typical.

Prediction (cont.). How could the problem with Method 1 be alleviated? Use more than one exemplar. Method 2: consider the k most similar exemplars; 'a' belongs to the concept ci that contains most of the k exemplars (k-nearest neighbor). What is a potential problem with k-nearest neighbor? Hint: think of the intuition behind instance-based learning.

Prediction (cont.). How could the problem with Method 2 be alleviated? Weight the exemplars. Method 3: consider the k most similar exemplars, but weight their contribution to the class of 'a' by their distance to 'a', giving greater weight to the closest neighbors (distance-weighted nearest neighbor).
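A sketch of the distance-weighted vote over the k nearest exemplars; the 1/d² weighting used here is a common choice but is an assumption, since the slide does not fix a particular weighting function.

```python
import math
from collections import defaultdict

def weighted_knn_classify(a, examples, k):
    """examples: list of (feature_vector, class_label) pairs.
    Returns the class whose k nearest exemplars carry the largest total
    weight, where an exemplar at distance d contributes weight 1/d**2."""
    def dist(x, y):
        return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

    neighbors = sorted(examples, key=lambda ex: dist(a, ex[0]))[:k]
    votes = defaultdict(float)
    for features, label in neighbors:
        d = dist(a, features)
        if d == 0.0:                  # exact match: return its class directly
            return label
        votes[label] += 1.0 / (d ** 2)
    return max(votes, key=votes.get)

# Hypothetical data: the two nearby "+" exemplars dominate the distant "-".
data = [((1.0, 1.0), "+"), ((1.2, 0.8), "+"), ((5.0, 5.0), "-")]
print(weighted_knn_classify((1.1, 0.9), data, k=3))   # "+"
```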

Overview: Exemplar-based representation of concepts. The k-nearest neighbor algorithm. Discussion. Lazy learning versus eager learning. Recommended reading.

The k-nearest neighbor algorithm.
Training algorithm: each example is represented as a feature-value vector; for each training example (eik, ci), add eik to the exemplars of ci.
Classification algorithm: let 'a' be an instance to classify; find the k exemplars most similar to 'a'; assign 'a' to the concept that contains most of these k exemplars.
Each example is represented using the feature-vector representation ei = (a1 = vi1, a2 = vi2, ..., an = vin). The distance between two examples ei and ej is the Euclidean distance d(ei, ej) = sqrt( Σk (vik − vjk)² ).
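The following is a minimal, self-contained sketch of the training and classification steps just described: training only stores the labeled feature vectors, and classification takes a majority vote among the k exemplars closest in Euclidean distance. The class name and the data are hypothetical.

```python
import math
from collections import Counter

class KNearestNeighbor:
    """Sketch of the lazy k-NN learner: training just stores the examples;
    all of the real work happens at classification time."""

    def __init__(self, k=3):
        self.k = k
        self.exemplars = []           # list of (feature_vector, class_label)

    def train(self, examples):
        # Training step: simply add each (e_ik, c_i) pair to the exemplars.
        self.exemplars.extend(examples)

    @staticmethod
    def _distance(x, y):
        # Euclidean distance between two feature vectors.
        return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

    def classify(self, a):
        # Find the k most similar exemplars and take a majority vote.
        nearest = sorted(self.exemplars,
                         key=lambda ex: self._distance(a, ex[0]))[:self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical two-class data set.
knn = KNearestNeighbor(k=3)
knn.train([((1.0, 1.0), "+"), ((1.1, 0.9), "+"), ((5.0, 5.0), "-"),
           ((5.2, 4.8), "-"), ((4.9, 5.1), "-")])
print(knn.classify((1.05, 1.0)))      # "+"
```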

Nearest neighbor algorithms: illustration. [Figure: a query instance q1 plotted among the stored exemplars. With 1-nearest neighbor, q1 is assigned to the concept represented by its nearest exemplar e1; with 5 nearest neighbors, q1 is classified as negative.]

Overview: Exemplar-based representation of concepts. The k-nearest neighbor algorithm. Discussion. Lazy learning versus eager learning. Recommended reading.

Nearest neighbor algorithms: inductive bias. What is the inductive bias of the k-nearest neighbor algorithm? The assumption that the classification of an instance 'a' will be most similar to the classification of other instances that are nearby in Euclidean space.

Application issues. What are some practical issues in applying the k-nearest neighbor algorithms? Because the algorithm delays all processing until a new classification or prediction is required, significant processing is needed to make each prediction. Because the distance between instances is based on all the attributes, less relevant attributes, and even irrelevant ones, influence the classification of a new instance. Because the algorithm is based on a distance function, the attribute values must be such that a distance can be computed. How can these problems be alleviated?

Application issue: the use of the attributes. The classification of an example is based on all the attributes, independently of their relevance; even the irrelevant attributes are used. How can this problem be alleviated? Weight the contribution of each attribute based on its relevance. How can the relevance of an attribute be determined? Use an approach similar to cross-validation. How? One possible reading is sketched below.
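One illustrative reading of "an approach similar to cross-validation" is the following sketch: score candidate attribute-weight vectors by their leave-one-out accuracy over the stored exemplars and keep the best-scoring one. The candidate weights and the data below are arbitrary assumptions, and 1-nearest neighbor is used only to keep the sketch short.

```python
import math

def weighted_distance(x, y, w):
    """Euclidean distance with per-attribute weights w."""
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))

def leave_one_out_accuracy(examples, w):
    """Score a weight vector: classify each stored example from all the
    others with 1-NN and count the correct predictions."""
    correct = 0
    for i, (x, label) in enumerate(examples):
        others = examples[:i] + examples[i + 1:]
        nearest = min(others, key=lambda ex: weighted_distance(x, ex[0], w))
        correct += (nearest[1] == label)
    return correct / len(examples)

# Hypothetical data: the first attribute is relevant, the second is noise.
examples = [((1.0, 7.0), "+"), ((1.1, 2.0), "+"),
            ((5.0, 6.9), "-"), ((5.1, 1.5), "-")]
candidates = [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
best = max(candidates, key=lambda w: leave_one_out_accuracy(examples, w))
print(best)    # (1.0, 0.0): the irrelevant attribute is weighted out
```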

Application issue: processing for classification. Because the algorithm delays all processing until a new classification or prediction is required, significant processing is needed to make the prediction. How can this problem be alleviated? Use indexing techniques that facilitate the identification of the nearest neighbors, at some additional cost in memory. How? Use trees in which the leaves are exemplars, nearby exemplars are stored at nearby nodes, and the internal nodes sort the query to the relevant leaf by testing selected attributes; an illustrative sketch follows.
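A kd-tree is one such indexing structure. The sketch below uses scipy.spatial.cKDTree purely as an illustration; the slides do not prescribe a particular library or tree variant. The tree is built once, which costs extra memory and preprocessing, and each query is then routed toward nearby leaves instead of scanning every stored exemplar.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical exemplars and their class labels.
exemplars = np.array([[1.0, 1.0], [1.1, 0.9], [5.0, 5.0], [5.2, 4.8]])
labels = ["+", "+", "-", "-"]

tree = cKDTree(exemplars)             # build the index once

def classify(a, k=3):
    # The tree routes the query toward nearby leaves rather than
    # computing the distance to every stored exemplar.
    _, idx = tree.query(a, k=k)
    votes = [labels[i] for i in np.atleast_1d(idx)]
    return max(set(votes), key=votes.count)

print(classify([1.05, 1.0]))          # "+"
```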

Instance-based learning: discussion. What are the advantages of instance-based learning algorithms? What are the disadvantages of instance-based learning algorithms?

Instance-based learning: advantages. Complex concepts can be modeled using the simpler descriptions of the individual examples. Information present in the training examples is never lost, because the examples themselves are stored explicitly.

Instance-based learning: disadvantages. Labeling new instances is inefficient, because all processing is done at prediction time. It is difficult to determine an appropriate distance function, especially when examples are represented as complex symbolic expressions. Irrelevant features have a negative impact on the distance metric.

Lazy learning versus eager learning. Lazy learning: defer the decision of how to generalize beyond the training data until each new query instance is encountered. Eager learning: generalize beyond the training data before observing the new query, committing at training time to the learned concept. How do the two types of learning compare in terms of computation time? Lazy learners require less computation time during training and more at prediction time.

Exercise. Suggest a lazy version of the eager decision tree learning algorithm ID3. What are the advantages and disadvantages of your lazy algorithm compared to the original eager algorithm?

Recommended reading. Mitchell T.M., Machine Learning, Chapter 8: Instance-Based Learning, McGraw Hill, 1997. Kibler D., Aha D., Learning Representative Exemplars of Concepts: An Initial Case Study, in J.W. Shavlik and T.G. Dietterich (eds.), Readings in Machine Learning, Morgan Kaufmann, 1990.