Fruit Image Recognition with Weka

Fruit Image Recognition with Weka
Ahmet Sapan 17693, Itır Ege Deger 19334, Mehmet Fazıl Tuncay 17528

Aim & Method
Goal: find the most accurate classifier, and the most effective features, for classifying fruits from the given image features.
The method uses deep-learning techniques for feature extraction and classification.
Various classification methods are tested in order to achieve the best results.

Method#1 – 1031 attributes, ZeroR, and cross-validation with 10 folds
ZeroR is the simplest classification method: it relies only on the target and ignores all predictors, always predicting the most frequent target value (in our case the most frequent ClassId).
Our aim was to find the features that affect the classifier's accuracy, but ZeroR ignores all of them, so it serves only as a baseline.
Of 7720 instances in total, 83 were correctly classified. Accuracy: 1.07%.
Classification took very little time.
Also tested with only the first 1024 features plus ClassId; the accuracy was the same.
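For reference, a minimal sketch of this baseline using Weka's Java API. The file name fruits.arff and the position of ClassId as the last attribute are assumptions; the slides do not state them.

import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.rules.ZeroR;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class ZeroRBaseline {
    public static void main(String[] args) throws Exception {
        // Load the ARFF file (hypothetical name; the slides do not give one).
        Instances data = DataSource.read("fruits.arff");
        // Assumption: ClassId is the last attribute.
        data.setClassIndex(data.numAttributes() - 1);

        // ZeroR ignores every predictor and always predicts the majority class.
        ZeroR baseline = new ZeroR();

        // Method#1: 10-fold cross-validation.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(baseline, data, 10, new Random(1));
        System.out.printf("ZeroR accuracy: %.2f%%%n", eval.pctCorrect());
    }
}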

Method#2 – 1031 attributes, J48, and cross-validation with 10 folds
J48 is Weka's class for generating a pruned or unpruned C4.5 decision tree.
A pruned tree is used; classification is made with confidence factor (C) = 0.25.
Of 7720 instances in total, 6244 were correctly classified. Accuracy: 80.88%.
Method#3 – 1031 attributes, J48, and use training set
Uses only the training set: the classifier is tested on the same data it learned from.
7578 instances correctly classified. Accuracy: 98.16%.
This might be overfitting; accuracy on unseen data might be poor, so the result is not very reliable.
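The J48 runs in Methods #2 and #3 can be sketched the same way (same assumptions as above: fruits.arff, ClassId last). The confidence factor of 0.25 stated on the slide is also Weka's default.

import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class J48Runs {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("fruits.arff"); // hypothetical file name
        data.setClassIndex(data.numAttributes() - 1);    // assumption: ClassId is last

        J48 tree = new J48();
        tree.setConfidenceFactor(0.25f); // pruning confidence, as on the slide

        // Method#2: 10-fold cross-validation.
        Evaluation cv = new Evaluation(data);
        cv.crossValidateModel(tree, data, 10, new Random(1));
        System.out.printf("CV accuracy: %.2f%%%n", cv.pctCorrect());

        // Method#3: evaluate on the training data itself (optimistic, prone to overfitting).
        tree.buildClassifier(data);
        Evaluation train = new Evaluation(data);
        train.evaluateModel(tree, data);
        System.out.printf("Training-set accuracy: %.2f%%%n", train.pctCorrect());
    }
}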

Figure 1. Visualization of classification errors (using the training set, accuracy: 98%).

Method#4 – 1025 attributes, J48, and cross-validation with 10 folds
The features MediaId, Family, Genus, Date, Latitude, and Longitude are removed.
1617 correctly classified instances. Accuracy: 20.94%.
Adding MediaId back left the accuracy unchanged, so MediaId has no effect.
Method#5 – 1024 features + Family + ClassId, J48, and cross-validation with 10 folds
4470 correctly classified instances. Accuracy: 57.90%.
Almost 40 percentage points higher than Method#4.
(Methods #4 through #9 all reuse the same attribute-filtering pattern; see the sketch below.)
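The attribute subsets in Methods #4 through #9 can be produced with Weka's Remove filter. A sketch, assuming the metadata columns carry the names used on the slides and that ClassId stays last after filtering:

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

public class DropMetadata {
    // Remove the named attributes and return the filtered copy.
    static Instances drop(Instances data, String... names) throws Exception {
        StringBuilder indices = new StringBuilder();
        for (String name : names) {
            if (indices.length() > 0) indices.append(',');
            indices.append(data.attribute(name).index() + 1); // Remove uses 1-based indices
        }
        Remove rm = new Remove();
        rm.setAttributeIndices(indices.toString());
        rm.setInputFormat(data);
        return Filter.useFilter(data, rm);
    }

    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("fruits.arff"); // hypothetical file name
        // Method#4: keep only the 1024 image features and ClassId.
        Instances reduced = drop(data, "MediaId", "Family", "Genus",
                                 "Date", "Latitude", "Longitude");
        reduced.setClassIndex(reduced.numAttributes() - 1); // assumption: ClassId is last
        System.out.println(reduced.numAttributes() + " attributes remain");
    }
}

Method#5 keeps Family by simply leaving it out of the drop list; Methods #6 through #9 vary the list in the same way.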

Figure 2. Visualization of classification errors (for Method#4, accuracy: 21%).

Method#6 – 1024 features + Genus + ClassId, J48, and cross-validation with 10 folds
6233 correctly classified instances. Accuracy: 80.73%.
Almost the same accuracy as when all the features were included.
Tree size: 1176. «Genus» is distinctive and divides the decision tree efficiently.
Method#7 – 1024 features + Date + ClassId, J48, and cross-validation with 10 folds
Accuracy: 45.14%, about 24 percentage points higher than with the 1024 features alone.
Tree size: 3684.

Method#8 – 1024 features + Latitude + ClassId, J48, and cross-validation with 10 folds
1616 correctly classified instances. Accuracy: 20.93%.
Almost the same accuracy as with only the first 1024 features: Latitude has no effect.
Method#9 – 1024 features + Longitude + ClassId, J48, and cross-validation with 10 folds
Accuracy: 20.93%. Same as Latitude: no effect.

Method#10 – 1031 features, confirmation with BestFirst
Up to now «Genus» looks like the feature that boosts accuracy the most.
To confirm this, the attribute evaluator «CfsSubsetEval» and the search method «BestFirst» are used: together they identify a subset of attributes that are highly correlated with the target while not being strongly correlated with one another.
The search selected «Genus», confirming that it is correlated with the target but not strongly with the other attributes.
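The same selection can be run through the Java API (or equivalently in the Explorer's Select attributes tab). A sketch under the same assumptions as before:

import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.BestFirst;
import weka.attributeSelection.CfsSubsetEval;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SelectAttributes {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("fruits.arff"); // hypothetical file name
        data.setClassIndex(data.numAttributes() - 1);    // assumption: ClassId is last

        // CFS scores subsets that correlate with the class but not with each other;
        // BestFirst searches the space of attribute subsets.
        AttributeSelection selector = new AttributeSelection();
        selector.setEvaluator(new CfsSubsetEval());
        selector.setSearch(new BestFirst());
        selector.SelectAttributes(data); // note: Weka's method name is capitalized

        // Selected attribute indices (0-based, with the class index appended last).
        for (int i : selector.selectedAttributes()) {
            System.out.println(data.attribute(i).name());
        }
    }
}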

Conclusion
Best classifier: J48.
Best test method: cross-validation.
Distinctive feature that most affects accuracy: Genus.
MediaId, Latitude, and Longitude have no effect on accuracy.
Date and Family have a considerable effect.
