Fruit Image Recognition with Weka

Fruit Image Recognition with Weka
Ahmet Sapan 17693, Itır Ege Deger 19334, Mehmet Fazıl Tuncay 17528

Aim & Method
Goal: find the most accurate classifier, and the most effective features, for classifying fruits from the given image features.
The method uses deep-learning techniques for feature extraction and classification.
Various classification methods are tested in order to achieve the best results.

Method#1 – 1031 attributes, ZeroR, and cross-validation with 10 folds
ZeroR is the simplest classification method: it relies only on the target and ignores all predictors, always predicting the most frequent target value (in our case the most frequent ClassId).
Our aim was to find the features that affect the classifier's accuracy, but ZeroR ignores all of them, so it serves only as a baseline.
Of 7720 instances in total, 83 were correctly classified. Accuracy: 1.07%.
Classification took very little time.
Also tested with only the first 1024 features plus ClassId; the accuracy was the same.
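For reference, a minimal sketch of this baseline using Weka's Java API. The file name fruits.arff and the position of ClassId as the last attribute are assumptions; the slides do not state them.

import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.rules.ZeroR;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class ZeroRBaseline {
    public static void main(String[] args) throws Exception {
        // Load the ARFF file (hypothetical name; the slides do not give one).
        Instances data = DataSource.read("fruits.arff");
        // Assumption: ClassId is the last attribute.
        data.setClassIndex(data.numAttributes() - 1);

        // ZeroR ignores every predictor and always predicts the majority class.
        ZeroR baseline = new ZeroR();

        // Method#1: 10-fold cross-validation.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(baseline, data, 10, new Random(1));
        System.out.printf("ZeroR accuracy: %.2f%%%n", eval.pctCorrect());
    }
}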

Method#2 – 1031 attributes, J48, and cross-validation with 10 folds
J48 is Weka's class for generating a pruned or unpruned C4.5 decision tree.
A pruned tree is used; classification is made with confidence factor (C) = 0.25.
Of 7720 instances in total, 6244 were correctly classified. Accuracy: 80.88%.
Method#3 – 1031 attributes, J48, and use training set
Uses only the training set: the classifier is tested on the same data it learned from.
7578 instances correctly classified. Accuracy: 98.16%.
This might be overfitting; accuracy on unseen data might be poor, so the result is not very reliable.
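The J48 runs in Methods #2 and #3 can be sketched the same way (same assumptions as above: fruits.arff, ClassId last). The confidence factor of 0.25 stated on the slide is also Weka's default.

import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class J48Runs {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("fruits.arff"); // hypothetical file name
        data.setClassIndex(data.numAttributes() - 1);    // assumption: ClassId is last

        J48 tree = new J48();
        tree.setConfidenceFactor(0.25f); // pruning confidence, as on the slide

        // Method#2: 10-fold cross-validation.
        Evaluation cv = new Evaluation(data);
        cv.crossValidateModel(tree, data, 10, new Random(1));
        System.out.printf("CV accuracy: %.2f%%%n", cv.pctCorrect());

        // Method#3: evaluate on the training data itself (optimistic, prone to overfitting).
        tree.buildClassifier(data);
        Evaluation train = new Evaluation(data);
        train.evaluateModel(tree, data);
        System.out.printf("Training-set accuracy: %.2f%%%n", train.pctCorrect());
    }
}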

Figure 1. Visualization of classification errors (using the training set, accuracy: 98%).

Method#4 – 1025 attributes, J48, and cross-validation with 10 folds
The features MediaId, Family, Genus, Date, Latitude, and Longitude are removed.
1617 correctly classified instances. Accuracy: 20.94%.
Adding MediaId back left the accuracy unchanged, so MediaId has no effect.
Method#5 – 1024 features + Family + ClassId, J48, and cross-validation with 10 folds
4470 correctly classified instances. Accuracy: 57.90%.
Almost 40 percentage points higher than Method#4.
(Methods #4 through #9 all reuse the same attribute-filtering pattern; see the sketch below.)
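The attribute subsets in Methods #4 through #9 can be produced with Weka's Remove filter. A sketch, assuming the metadata columns carry the names used on the slides and that ClassId stays last after filtering:

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

public class DropMetadata {
    // Remove the named attributes and return the filtered copy.
    static Instances drop(Instances data, String... names) throws Exception {
        StringBuilder indices = new StringBuilder();
        for (String name : names) {
            if (indices.length() > 0) indices.append(',');
            indices.append(data.attribute(name).index() + 1); // Remove uses 1-based indices
        }
        Remove rm = new Remove();
        rm.setAttributeIndices(indices.toString());
        rm.setInputFormat(data);
        return Filter.useFilter(data, rm);
    }

    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("fruits.arff"); // hypothetical file name
        // Method#4: keep only the 1024 image features and ClassId.
        Instances reduced = drop(data, "MediaId", "Family", "Genus",
                                 "Date", "Latitude", "Longitude");
        reduced.setClassIndex(reduced.numAttributes() - 1); // assumption: ClassId is last
        System.out.println(reduced.numAttributes() + " attributes remain");
    }
}

Method#5 keeps Family by simply leaving it out of the drop list; Methods #6 through #9 vary the list in the same way.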

Figure 2. Visualization of classification errors (for Method#4, accuracy: 21%).

Method#6 – 1024 features + Genus + ClassId, J48, and cross-validation with 10 folds
6233 correctly classified instances. Accuracy: 80.73%.
Almost the same accuracy as when all the features were included.
Tree size: 1176. «Genus» is distinctive and divides the decision tree efficiently.
Method#7 – 1024 features + Date + ClassId, J48, and cross-validation with 10 folds
Accuracy: 45.14%, about 24 percentage points higher than with the 1024 features alone.
Tree size: 3684.

Method#8 – 1024 features + Latitude + ClassId, J48, and cross-validation with 10 folds
1616 correctly classified instances. Accuracy: 20.93%.
Almost the same accuracy as with only the first 1024 features: Latitude has no effect.
Method#9 – 1024 features + Longitude + ClassId, J48, and cross-validation with 10 folds
Accuracy: 20.93%. Same as Latitude: no effect.

Method#10 – 1031 features, confirmation with BestFirst
Up to now «Genus» looks like the feature that boosts accuracy the most.
To confirm this, the attribute evaluator «CfsSubsetEval» and the search method «BestFirst» are used: together they identify a subset of attributes that are highly correlated with the target while not being strongly correlated with one another.
The search selected «Genus», confirming that it is correlated with the target but not strongly with the other attributes.
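The same selection can be run through the Java API (or equivalently in the Explorer's Select attributes tab). A sketch under the same assumptions as before:

import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.BestFirst;
import weka.attributeSelection.CfsSubsetEval;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SelectAttributes {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("fruits.arff"); // hypothetical file name
        data.setClassIndex(data.numAttributes() - 1);    // assumption: ClassId is last

        // CFS scores subsets that correlate with the class but not with each other;
        // BestFirst searches the space of attribute subsets.
        AttributeSelection selector = new AttributeSelection();
        selector.setEvaluator(new CfsSubsetEval());
        selector.setSearch(new BestFirst());
        selector.SelectAttributes(data); // note: Weka's method name is capitalized

        // Selected attribute indices (0-based, with the class index appended last).
        for (int i : selector.selectedAttributes()) {
            System.out.println(data.attribute(i).name());
        }
    }
}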

Conclusion
Best classifier: J48.
Best test method: cross-validation.
Distinctive feature that most affects accuracy: Genus.
MediaId, Latitude, and Longitude have no effect on accuracy.
Date and Family have a considerable effect.
