1
WEKA
2
Weka: the bird
Copyright: Martin Kramer (mkramer@wxs.nl)
9/22/2018 University of Waikato
6
Hamilton
7
WEKA: the software
- Waikato Environment for Knowledge Analysis
- Collection of state-of-the-art machine learning algorithms and data processing tools implemented in Java
  - Released under the GPL
- Support for the whole process of experimental data mining
  - Preparation of input data
  - Statistical evaluation of learning schemes
  - Visualization of input data and the result of learning
- Used for education, research and applications
- Complements “Data Mining” by Witten & Frank
8
Main Features
- 49 data preprocessing tools
- 76 classification/regression algorithms
- 8 clustering algorithms
- 15 attribute/subset evaluators + 10 search algorithms for feature selection
- 3 algorithms for finding association rules
- 3 graphical user interfaces
  - “The Explorer” (exploratory data analysis)
  - “The Experimenter” (experimental environment)
  - “The KnowledgeFlow” (new process model inspired interface)
9
History
- Project funded by the NZ government since 1993
  - Develop a state-of-the-art workbench of data mining tools
  - Explore fielded applications
  - Develop new fundamental methods
10
History - timeline
- Late 1992 - funding was applied for by Ian Witten
- Development of the interface and infrastructure
  - WEKA acronym coined by Geoff Holmes
  - WEKA’s file format “ARFF” was created by Andrew Donkin
  - ARFF was rumored to stand for Andrew’s Ridiculous File Format
- Sometime in [year missing] - first internal release of WEKA
  - TCL/TK user interface + learning algorithms written mostly in C
  - Very much beta software
  - Changes for the b1 release included (among others):
    - “Ambiguous and Unsupported menu commands removed.”
    - “Crashing processes handled (in most cases :-)”
- October [year missing] - first public release of WEKA (v 2.1)
11
History - timeline
- July 1997 - WEKA 2.2
  - Schemes: 1R, T2, K*, M5, M5Class, IB1-4, FOIL, PEBLS, support for C5
  - Included a facility (based on Unix makefiles) for configuring and running large-scale experiments
- Early [year missing] - decision was made to rewrite WEKA in Java
  - Originated from code written by Eibe Frank for his PhD
  - Originally codenamed JAWS (JAva Weka System)
- May [year missing] - WEKA 2.3
  - Last release of the TCL/TK-based system
- Mid [year missing] - WEKA 3 (100% Java) released
  - Version to complement the Data Mining book
  - Development version (including GUI)
12
Back then…
14
Today:
15
Explorer: pre-processing the data
- Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
- Data can also be read from a URL or from an SQL database (using JDBC)
- Pre-processing tools in WEKA are called “filters”
- WEKA contains filters for:
  - Discretization, normalization, resampling, attribute selection, attribute combination, …
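As an illustration of the ARFF format and of what a filter does, the following Python sketch parses a tiny dense ARFF document and rescales its numeric attributes to the range [0, 1], which is the idea behind a normalization filter. This is not WEKA's Java API: `parse_arff`, `normalize` and the toy weather data are hypothetical names introduced here, and the parser assumes a small subset of ARFF (no quoted values, sparse rows or missing values).

```python
# Sketch only: a toy ARFF reader plus a normalization "filter".
# parse_arff and normalize are hypothetical helpers, not part of WEKA.

ARFF_TEXT = """\
% toy data set, made up for this example
@relation weather
@attribute temperature numeric
@attribute humidity numeric
@attribute play {yes,no}
@data
85,85,no
64,65,yes
72,95,no
"""


def parse_arff(text):
    """Return (attributes, rows) for a simple dense ARFF document."""
    attributes = []  # list of (name, kind), kind is "numeric" or "nominal"
    rows = []
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            rows.append([float(v) if kind == "numeric" else v
                         for v, (_, kind) in zip(values, attributes)])
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            numeric = spec.lower() in ("numeric", "real", "integer")
            attributes.append((name, "numeric" if numeric else "nominal"))
        elif lower.startswith("@data"):
            in_data = True
    return attributes, rows


def normalize(attributes, rows):
    """Rescale every numeric attribute to [0, 1]; leave nominal ones alone."""
    result = [list(row) for row in rows]  # do not modify the input rows
    for j, (_, kind) in enumerate(attributes):
        if kind != "numeric":
            continue
        column = [row[j] for row in rows]
        low, high = min(column), max(column)
        span = high - low
        for row in result:
            row[j] = (row[j] - low) / span if span else 0.0
    return result


attributes, rows = parse_arff(ARFF_TEXT)
normalized = normalize(attributes, rows)
for row in normalized:
    print(row)
```

The filter takes a data set and returns a new data set with the same attributes, so several such steps can be chained before a learning scheme sees the data.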
16
Explorer: building classification models
- “Classifiers” in WEKA are models for predicting nominal or numeric quantities
- Implemented schemes include:
  - Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …
- “Meta”-classifiers include:
  - Bagging, boosting, stacking, error-correcting output codes, data cleansing, …
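The difference between a classifier and a meta-classifier can be sketched in a few lines of Python. The example below wraps a 1-nearest-neighbour scheme (an instance-based classifier) in bagging: each base model is trained on a bootstrap sample and the ensemble predicts by majority vote. It is a simplified illustration, not WEKA's implementation; the function names, the toy data and the choice of 25 models are assumptions made for this sketch.

```python
# Sketch only: bagging as a "meta" scheme around a 1-NN base classifier.
# All names and the toy data are hypothetical, not WEKA's API.
import random
from collections import Counter


def nearest_neighbour(train):
    """Build a 1-NN classifier from a list of (features, label) pairs."""
    def predict(x):
        def squared_distance(item):
            features, _ = item
            return sum((a - b) ** 2 for a, b in zip(features, x))
        return min(train, key=squared_distance)[1]
    return predict


def bagging(train, build, n_models=25, seed=0):
    """Train n_models base classifiers on bootstrap samples of train."""
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    models = []
    for _ in range(n_models):
        # Bootstrap sample: same size as train, drawn with replacement.
        sample = [rng.choice(train) for _ in train]
        models.append(build(sample))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]  # majority vote
    return predict


train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")]

single = nearest_neighbour(train)
ensemble = bagging(train, nearest_neighbour)
print(single((1.1, 1.0)), ensemble((1.1, 1.0)), ensemble((5.1, 5.0)))
```

Because `bagging` only needs a function that builds a model from training data, any other base scheme could be passed in place of `nearest_neighbour`.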
17
Explorer: classification
26
KnowledgeFlow: process flows
27
KnowledgeFlow: batch processing
29
KnowledgeFlow: incremental processing
30
Experimenter
33
Impact - downloads
34
Projects based on WEKA
- Incorporate/wrap WEKA
  - GRB Tool Shed - a tool to aid gamma ray burst research
  - YALE - facility for large scale ML experiments
  - GATE - NLP workbench with a WEKA interface
  - Judge - document clustering and classification
- Extend/modify WEKA
  - BioWeka - extension library for knowledge discovery in biology
  - WekaMetal - meta learning extension to WEKA
  - Weka-Parallel - parallel processing for WEKA
  - Grid Weka - grid computing using WEKA
  - Weka-CG - computational genetics tool library
35
The WEKA Project Today
- FRST funding for the next two years
  - Goal of the project remains the same
- People
  - 6 staff
  - 2 postdocs
  - 3 PhD students
  - 3 MSc students
  - 2 research programmers
36
The Future
- Continue to develop and support WEKA
- MOA (Massive Online Analysis)
  - Framework that supports learning from data streams
  - Facilities for data generation, experimental analysis, learning algorithms, etc.
  - The Moa (another native NZ bird) is not only flightless, like the Weka, but also extinct
  - First public release, probably this Christmas, or perhaps Thanksgiving (as it’s just another turkey)
- MILK
  - Multi-Instance Learning Kit
- Proper
  - Propositionalization toolbox for WEKA