Weka: Experimenter and Knowledge Flow interfaces Neil Mac Parthaláin

Slides:



Advertisements
Similar presentations
Machine Learning Homework
Advertisements

Florida International University COP 4770 Introduction of Weka.
Test Case Management and Results Tracking System October 2008 D E L I V E R I N G Q U A L I T Y (Short Version)
Dori Baldwin Student Info Mgmt Coordinator Kent ISD.
Welcome to E-Prime E-Prime refers to the Experimenter’s Prime (best) development studio for the creation of computerized behavioral research. E-Prime is.
Department of Computer Science, University of Waikato, New Zealand Eibe Frank WEKA: A Machine Learning Toolkit The Explorer Classification and Regression.
WEKA (sumber: Machine Learning with WEKA). What is WEKA? Weka is a collection of machine learning algorithms for data mining tasks. Weka contains.
WEKA - Experimenter (sumber: WEKA Explorer user Guide for Version 3-5-5)
Department of Computer Science, University of Waikato, New Zealand Eibe Frank WEKA: A Machine Learning Toolkit The Explorer Classification and Regression.
Academic Advisor: Dr. Yuval Elovici Technical Advisor: Dr. Lidror Troyansky ADD Presentation.
March 25, 2004Columbia University1 Machine Learning with Weka Lokesh S. Shrestha.
Microsoft Visio is diagramming software for Microsoft Windows. It uses vector graphics to create diagrams. The 2007 Standard and Professional editions.
An Extended Introduction to WEKA. Data Mining Process.
1 Statistical Learning Introduction to Weka Michel Galley Artificial Intelligence class November 2, 2006.
+ Doing More with Less : Student Modeling and Performance Prediction with Reduced Content Models Yun Huang, University of Pittsburgh Yanbo Xu, Carnegie.
1 © Goharian & Grossman 2003 Introduction to Data Mining (CS 422) Fall 2010.
 The Weka The Weka is an well known bird of New Zealand..  W(aikato) E(nvironment) for K(nowlegde) A(nalysis)  Developed by the University of Waikato.
Hands-on Introduction to Visual Basic.NET Programming Right from the Start with Visual Basic.NET 1/e 6.
I. Pribela, M. Ivanović Neum, Content Automated assessment Testovid system Test generator Module generators Conclusion.
CLassification TESTING Testing classifier accuracy
CS1100: Access Reports A (Very) Short Tutorial on Microsoft Access Report Construction Created By Martin Schedlbauer With contributions from Matthew Ekstrand-Abueg.
WEKA – Knowledge Flow & Simple CLI
WEKA - Explorer (sumber: WEKA Explorer user Guide for Version 3-5-5)
WEKA and Machine Learning Algorithms. Algorithm Types Classification (supervised) Given -> A set of classified examples “instances” Produce -> A way of.
Appendix: The WEKA Data Mining Software
1 Research Groups : KEEL: A Software Tool to Assess Evolutionary Algorithms for Data Mining Problems SCI 2 SMetrology and Models Intelligent.
An Introduction to Designing and Executing Workflows with Taverna Katy Wolstencroft University of Manchester.
1 1 Slide Evaluation. 2 2 n Interactive decision tree construction Load segmentchallenge.arff; look at dataset Load segmentchallenge.arff; look at dataset.
Hands-on predictive models and machine learning for software Foutse Khomh, Queen’s University Segla Kpodjedo, École Polytechnique de Montreal PASED - Canadian.
1 Running Clustering Algorithm in Weka Presented by Rachsuda Jiamthapthaksin Computer Science Department University of Houston.
Department of Computer Science, University of Waikato, New Zealand Eibe Frank WEKA: A Machine Learning Toolkit The Explorer Classification and Regression.
Touchstone Automation’s DART ™ (Data Analysis and Reporting Tool)
An Introduction to Designing and Executing Workflows with Taverna Aleksandra Pawlik materials by: Katy Wolstencroft University of Manchester.
For ITCS 6265/8265 Fall 2009 TA: Fei Xu UNC Charlotte.
1 1 Slide Using Weka. 2 2 Slide Data Mining Using Weka n What’s Data Mining? We are overwhelmed with data We are overwhelmed with data Data mining is.
Feature Selection Benjamin Biesinger - Manuel Maly - Patrick Zwickl.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Externally growing self-organizing maps and its application to database visualization and exploration.
Genesys Shell development Input-side development progress.
Weka – A Machine Learning Toolkit October 2, 2008 Keum-Sung Hwang.
WEKA Machine Learning Toolbox. You can install Weka on your computer from
Weka Just do it Free and Open Source ML Suite Ian Witten & Eibe Frank University of Waikato New Zealand.
Machine Learning Tutorial-2. Recall, Precision, F-measure, Accuracy Ch. 5.
W E K A Waikato Environment for Knowledge Aquisition.
An Exercise in Machine Learning
***Classification Model*** Hosam Al-Samarraie, PhD. CITM-USM.
WEKA's Knowledge Flow Interface Data Mining Knowledge Discovery in Databases ELIE TCHEIMEGNI Department of Computer Science Bowie State University, MD.
Introduction to CADStat. CADStat and R R is a powerful and free statistical package [
Machine Learning in Practice Lecture 9 Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute.
In part from: Yizhou Sun 2008 An Introduction to WEKA Explorer.
Machine Learning in Practice Lecture 9 Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute.
WEKA: A Practical Machine Learning Tool WEKA : A Practical Machine Learning Tool.
Department of Computer Science, University of Waikato, New Zealand Eibe Frank WEKA: A Machine Learning Toolkit The Explorer Classification and Regression.
Waikato Environment for Knowledge Analysis
WEKA.
Sampath Jayarathna Cal Poly Pomona
Machine Learning with WEKA
Machine Learning with WEKA
Learning about Taxes with Intuit ProFile
Hands-on Introduction to Visual Basic .NET
Welcome to E-Prime E-Prime refers to the Experimenter’s Prime (best) development studio for the creation of computerized behavioral research. E-Prime is.
Weka Package Weka package is open source data mining software written in Java. Weka can be applied to your dataset from the GUI, the command line or called.
Machine Learning with Weka
Opening Weka Select Weka from Start Menu Select Explorer Fall 2003
Learning about Taxes with Intuit ProFile
Machine Learning with Weka
Machine Learning with WEKA
Lecture 10 – Introduction to Weka
Data Mining CSCI 307, Spring 2019 Lecture 7
Data Mining CSCI 307, Spring 2019 Lecture 8
Presentation transcript:

Weka: Experimenter and Knowledge Flow interfaces Neil Mac Parthaláin

What is different from Explorer? Experimenter is used for ‘batches’ of experiments Can only be used for Classification and Regression problems* Results are generated in a different way – Explorer: % correct = (sum of correctly classified instances for all test folds)/(Total No. of instances in dataset) – Experimenter: % correct = Average of correctly predicted over all folds *Possible to perform attribute selection but not covered here

Experimenter- Overview Compare the performance of different learning schemes easily Allows better analysis than Explorer Results: write-to-file or database Evaluation: cross-validation, learning curve, or hold-out Ability to iterate over different parameter settings Statistical significance tests “for free”!

Experimenter - Overview The interface essentially has three ‘panes’: – Setup: Configure experiments – Run: Generate results files – Analyse: Analyse the results of the experiments

Always use ‘corrected’ T-Tester! Use this to decide how you compare results

Lots of different ways to compare results!

Knowledge Flow Interface “Visual: drag-and-drop” user interface for WEKA - intuitive Java-Beans-based Can do everything that Explorer does (plus a bit more), but not as comprehensively as Experimenter Data sources, classifiers, etc. are beans and can be connected graphically Data “flows” through modules: e.g., “data source” ->“filter” ->“classifier”-> “evaluator” KF layouts can be saved and re-used later

Knowledge Flow: An Example What we want to do: – Take a dataset – Do some attribute selection – Perform some classification on the reduced data using 10 fold CV – Examine the subsets selected for each CV fold – Visualise the results in text format and ROC

Getting Started

A few ‘hidden’ steps…

Add the Classifier learner

…and the performance evaluator

TextViewers can be used for visualisation of results as well as examining the processes – more later…

‘Right-clicking’ on each ‘block’ allows you to configure it as well as ‘wire-up’ to others…

Connect: dataSet to CrossValidationFoldMaker

Continue to ‘wire-up’ each ‘block’…

…and so on

To see the results output: ‘dump’ the text to TextViewer…

When you have finished ‘wiring-up’, it’s time to configure each of the components/blocks…

Set the path/filename(s) of the datasets you would like to load…

Once all is configured, you are ready to start…

Once all experiments have finished, we can visualise the results…

Output is similar to that of console window of Explorer

But there are also ways save these results if we want to keep them for later…

TextViewer components are also useful for ‘looking-inside’ processes…

For example: attribute selection….

It is also possible to visualise data in a similar way to Explorer…e.g. ROC/threshold curves

Some problems you may encounter… Often caused by incorrectly defined.arff files… - too many attribs defined in the header - Incorrectly types Be aware also that WEKA labels the dataset by whatever name you put in field!

Some problems you may encounter… You may experience an error related to Java heap size if: -The initial heap size is too-small -You load a large dataset -Attempt to run a large number of experiments Can be fixed by initialising the JVM with a large initial heap size: Java –Xmx2048m...

Write your own algorithms… WEKA is Open Source! Much of the work is already done for you Take advantage of the WEKA framework Writing code and contributing to the WEKA project now easier than before see: +to+WEKA%3F

Conclusion Experimenter and Knowledge Flow: – offer useful and flexible ways to perform a range of batches of experiments – Beware of the way in which results are generated! – KF is particularly useful for visualisation – Experimenter more suited to learning Just a snapshot of capabilities of WEKA! Want more info? me (or Richard) These slides available at: