CSC 196k Semester Project: Instance Based Learning
Weka Assignment 2
Glynis Hawley
Agenda
- Background: Instance-Based Learning
- Project Requirements
- Data
- Progress
- Conclusions
- References
Background: Instance Based Learning
- Learning/classification is based on information stored in a "set" of examples
- No rules or decision trees
- A "new" instance is classified by its similarity to one (or more) stored example(s), e.g. nearest neighbor
IBL Algorithm research by David W. Aha
Two papers helpful in understanding this assignment:
- "Instance-based Learning Algorithms", David W. Aha, Dennis Kibler, Marc K. Albert, 1991
- "Tolerating noisy, irrelevant and novel attributes in instance-based learning algorithms", David W. Aha, 1992
Three algorithms: IB1, IB2, IB3
IB1: Instance-Based Learner version 1
- Similar to the nearest-neighbor algorithm, with two differences: it normalizes all attributes to the range [0,1], and it handles missing attributes
- Training: stores all instances from the training set
- Classification: searches all stored instances for the nearest neighbor
- High computational and storage expense
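The training and classification steps above can be sketched in plain Java. This is an illustration, not Weka's actual IB1 API: the class and method names are mine, and it assumes numeric attributes, min/max normalization to [0,1], and squared Euclidean distance.

```java
import java.util.*;

// Sketch of IB1-style classification: normalize each attribute to [0,1]
// using per-attribute min/max from the training set, then return the
// class of the nearest stored instance.
public class IB1Sketch {
    static double[] mins, maxs;

    // "Training": record per-attribute ranges; all instances are kept.
    static void fit(double[][] X) {
        int d = X[0].length;
        mins = new double[d]; maxs = new double[d];
        Arrays.fill(mins, Double.POSITIVE_INFINITY);
        Arrays.fill(maxs, Double.NEGATIVE_INFINITY);
        for (double[] x : X)
            for (int j = 0; j < d; j++) {
                mins[j] = Math.min(mins[j], x[j]);
                maxs[j] = Math.max(maxs[j], x[j]);
            }
    }

    static double[] normalize(double[] x) {
        double[] z = new double[x.length];
        for (int j = 0; j < x.length; j++) {
            double range = maxs[j] - mins[j];
            z[j] = range == 0 ? 0 : (x[j] - mins[j]) / range;
        }
        return z;
    }

    // Classification scans every stored instance: the source of the
    // high computational expense noted above.
    static int classify(double[][] X, int[] y, double[] q) {
        double[] qn = normalize(q);
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < X.length; i++) {
            double[] xn = normalize(X[i]);
            double dist = 0;
            for (int j = 0; j < qn.length; j++)
                dist += (qn[j] - xn[j]) * (qn[j] - xn[j]);
            if (dist < bestDist) { bestDist = dist; best = i; }
        }
        return y[best];
    }

    public static void main(String[] args) {
        double[][] X = {{1, 10}, {2, 20}, {9, 90}};
        int[] y = {0, 0, 1};
        fit(X);
        // Nearest stored instance to {8, 85} is {9, 90}, so class 1.
        System.out.println(classify(X, y, new double[]{8, 85})); // prints 1
    }
}
```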
IB2: Instance-Based Learner version 2
- Attempts to reduce storage requirements and computational complexity
- Saves only misclassified instances
- Algorithm:
  stored instances = {}
  for each instance in the training set:
      tentatively classify the instance by its nearest stored instance
      if classification != true class, add the instance to the stored set
- Tends to accumulate noisy instances
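The training loop above can be sketched as follows. This is a minimal plain-Java illustration with hypothetical names (the actual project modifies Weka's IB1 class); it assumes attributes already normalized to [0,1]:

```java
import java.util.*;

// Sketch of IB2 training: keep only instances that the current store
// misclassifies, so dense same-class regions are stored sparsely.
public class IB2Sketch {
    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int j = 0; j < a.length; j++) s += (a[j] - b[j]) * (a[j] - b[j]);
        return s;
    }

    // Class of the nearest instance currently in the store (store must be non-empty).
    static int nearestClass(List<double[]> store, List<Integer> labels, double[] q) {
        int best = -1;
        double bd = Double.POSITIVE_INFINITY;
        for (int i = 0; i < store.size(); i++) {
            double d = dist(store.get(i), q);
            if (d < bd) { bd = d; best = i; }
        }
        return labels.get(best);
    }

    public static void main(String[] args) {
        // Two well-separated clusters; later members of each cluster are
        // classified correctly by the first, so they are not stored.
        double[][] X = {{0.1, 0.1}, {0.15, 0.1}, {0.9, 0.9}, {0.85, 0.95}, {0.6, 0.6}};
        int[] y = {0, 0, 1, 1, 1};
        List<double[]> store = new ArrayList<>();
        List<Integer> labels = new ArrayList<>();
        for (int i = 0; i < X.length; i++) {
            // Add the first instance, or any instance the store gets wrong.
            if (store.isEmpty() || nearestClass(store, labels, X[i]) != y[i]) {
                store.add(X[i]);
                labels.add(y[i]);
            }
        }
        System.out.println(store.size()); // prints 2
    }
}
```

A noisy (mislabeled) instance would always be misclassified by its neighbors and therefore always stored, which is the accumulation problem the slide notes.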
IB3: Instance-Based Learner version 3
- Tracks the performance of each exemplar
- Uses only those that are "good enough" (performance exceeds some upper threshold)
- Discards those that are "not good enough" (performance falls below some lower threshold)
- Exemplars "in between" are retained but not yet used
- Performance statistics are updated whenever an exemplar is the nearest neighbor to a "new" instance
- Performance and storage better than IB1 and IB2
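One way to make the "good enough" / "not good enough" thresholds concrete is the significance test Aha describes: accept an exemplar when the lower bound of a confidence interval on its classification accuracy exceeds the upper bound of the interval on its class's observed frequency, and drop it when the reverse holds. The sketch below uses a Wilson-style proportion interval; the exact formula and the z values used here are illustrative assumptions, not a transcription of Aha's code:

```java
// Sketch of IB3's acceptance/dropping test via confidence intervals on
// proportions. Acceptance typically uses a stricter level than dropping,
// so exemplars linger "in between" until the evidence is decisive.
public class IB3Sketch {
    // Wilson-style interval bounds for a proportion p observed over n trials.
    static double[] interval(double p, int n, double z) {
        double z2 = z * z;
        double center = (p + z2 / (2.0 * n)) / (1 + z2 / n);
        double half = z * Math.sqrt(p * (1 - p) / n + z2 / (4.0 * n * n)) / (1 + z2 / n);
        return new double[]{center - half, center + half};
    }

    // Accept: accuracy CI lower bound > class-frequency CI upper bound.
    static boolean accept(double acc, int nAcc, double freq, int nFreq, double z) {
        return interval(acc, nAcc, z)[0] > interval(freq, nFreq, z)[1];
    }

    // Drop: accuracy CI upper bound < class-frequency CI lower bound.
    static boolean drop(double acc, int nAcc, double freq, int nFreq, double z) {
        return interval(acc, nAcc, z)[1] < interval(freq, nFreq, z)[0];
    }

    public static void main(String[] args) {
        // An exemplar right 18/20 times, for a class seen 10% of the time
        // over 100 instances, is clearly better than chance: accepted.
        System.out.println(accept(0.9, 20, 0.10, 100, 1.645)); // prints true
        // An exemplar right 0/20 times is dropped at a looser level.
        System.out.println(drop(0.0, 20, 0.10, 100, 1.15)); // prints true
    }
}
```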
Aha's Results
Results are averaged over 50 trials. [1:274], [2:57]
The Weka IBL Project
- Implement IB2 and IB3
- Compare their performance with that of IB1 and C4.5 (the Weka version is called J48)
- Data:
  - iris data, for initial testing of IB2
  - LED data
  - glass data
LED Dataset
Synthetic dataset created with led-creator.c [3]
- 8 attributes: the 7 segments of the display (each 0 or 1) and the class (digits 0 through 9)
- Inputs: number of instances to be created, seed, and % noise per attribute
- 10% noise means each bit has a 10% chance of being flipped
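The noise model above can be illustrated directly: each of the 7 segment bits is flipped independently with probability equal to the noise level. The sketch below uses a standard 7-segment encoding of the digits, which is a plausible assumption and not necessarily the exact table in led-creator.c:

```java
import java.util.Random;

// Sketch of LED instance generation with per-attribute bit-flip noise.
public class LedNoiseSketch {
    // Segment patterns for digits 0-9 (segment ordering is an assumption).
    static final int[][] DIGITS = {
        {1,1,1,0,1,1,1}, // 0
        {0,0,1,0,0,1,0}, // 1
        {1,0,1,1,1,0,1}, // 2
        {1,0,1,1,0,1,1}, // 3
        {0,1,1,1,0,1,0}, // 4
        {1,1,0,1,0,1,1}, // 5
        {1,1,0,1,1,1,1}, // 6
        {1,0,1,0,0,1,0}, // 7
        {1,1,1,1,1,1,1}, // 8
        {1,1,1,1,0,1,1}  // 9
    };

    // Each segment flips independently with the given noise probability.
    static int[] noisyInstance(int digit, double noise, Random rng) {
        int[] segs = DIGITS[digit].clone();
        for (int j = 0; j < segs.length; j++)
            if (rng.nextDouble() < noise) segs[j] ^= 1;
        return segs;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int flips = 0, total = 0;
        for (int i = 0; i < 10000; i++) {
            int d = rng.nextInt(10);
            int[] segs = noisyInstance(d, 0.10, rng);
            for (int j = 0; j < 7; j++) {
                if (segs[j] != DIGITS[d][j]) flips++;
                total++;
            }
        }
        // Empirical flip rate should be close to the 10% noise setting.
        System.out.printf("empirical flip rate: %.3f%n", (double) flips / total);
    }
}
```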
Glass Identification Dataset
214 instances:
- 163 window glass (building windows and vehicle windows)
  - 87 float processed: 70 building windows, 17 vehicle windows
  - 76 non-float processed: 76 building windows, 0 vehicle windows
- 51 non-window glass: 13 containers, 9 tableware, 29 headlamps
Progress Report - Accomplished
- Implemented IB2 by modifying the IB1 class methods buildClassifier() and updateClassifier()
- Preliminary testing with the iris data
- Compared accuracy of IB1, IB2, and C4.5 on LED data:
  - 10 sets of 700 instances each, with 10% noise
  - training set = first 200 instances of each set
  - testing set = last 500 instances of each set
Compare with David Aha's results [2:52] (over 50 trials): IB1: 70.5 ± 0.4%, IB2: 62.5 ± 0.6%
Progress Report - To Do
- Implement IB3: more involved than IB2, and even more difficult when you don't know Java
- Test accuracy of IB3 on LED data to compare with that of IB1, IB2, and C4.5
- Test accuracy of IB1, IB2, IB3, and C4.5 on the glass data
Conclusions
- Thus far, comparisons of IB1 and IB2 are similar to David Aha's results.
- Weka assignments (except perhaps #1):
  - are somewhat vague
  - require some research to determine what the actual project requirements should be
  - are valuable in building an understanding of the algorithms and their design
References
[1] Aha, David W. 1992. Tolerating noisy, irrelevant and novel attributes in instance-based learning algorithms. International Journal of Man-Machine Studies 36(2): 267-287.
[2] Aha, David W., Dennis Kibler, and Marc K. Albert. 1991. Instance-based learning algorithms. Machine Learning 6: 37-66.
[3] led-creator.c (LED display dataset generator).