For ITCS 6265/8265, Fall 2009. TA: Fei Xu, UNC Charlotte
Contents: What is Weka? ARFF data format. Interface: Explorer, … Troubleshooting.
WEKA: the bird. Copyright: Martin Kramer
Weka: Data Mining Software. Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. Weka is open-source software written in Java and issued under the GNU General Public License.
WEKA only deals with “flat” files. Weka has its own file format, the Attribute-Relation File Format (ARFF). An ARFF file consists of a header section and a data section. Supported attribute types: numeric, nominal, string, date. Details at:
Example attributes: mpg, cylinders, displacement, horsepower, weight, acceleration, year, origin {1,2,3} % 1 = usa; 2 = europe; 3 =
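The header/data layout described above can be sketched in plain Java without the Weka library. This is a minimal illustration only: the class name `ArffSketch`, the relation name `cars`, the numeric types given to `mpg` and `cylinders`, and the two data rows are all assumptions made for the example and are not taken from the slides. Only the `@relation`, `@attribute`, and `@data` keywords and the `{1,2,3}` nominal declaration for `origin` follow the ARFF format itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ArffSketch {

    // Builds ARFF text: a header (@relation plus one @attribute line per column)
    // followed by a data section with one comma-separated row per instance.
    static String buildArff(String relation, Map<String, String> attributes, List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            sb.append("@attribute ").append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        sb.append("\n@data\n");
        for (String[] row : rows) {
            // Every row must supply exactly one value per declared attribute.
            if (row.length != attributes.size()) {
                throw new IllegalArgumentException("row length does not match attribute count");
            }
            sb.append(String.join(",", row)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the attributes in declaration order.
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("mpg", "numeric");        // assumed type
        attrs.put("cylinders", "numeric");  // assumed type
        attrs.put("origin", "{1,2,3}");     // nominal attribute, as on the slide

        // Hypothetical rows, invented for illustration.
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"18.0", "8", "1"});
        rows.add(new String[] {"27.0", "4", "3"});

        System.out.print(buildArff("cars", attrs, rows));
    }
}
```

Saving the printed text to a file with an `.arff` extension should give something the Explorer can open, provided the declared types match the real data.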
Explorer interface. The Explorer covers: pre-processing the data; building “classifiers” (demo); clustering data; finding associations; attribute selection; data visualization.
Other interfaces. Experimenter: makes it easy to compare the performance of different learning schemes. Knowledge Flow: Java-Beans-based interface for setting up and running machine learning experiments. Command line interface. More at:
Troubleshooting: OutOfMemoryException. Find “RunWeka.ini” under the Weka installation directory (the default location on Windows is “C:\Program Files\Weka-3-6”). Find “maxheap” and change the value to a suitable size, for example 512M. More at:
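To check how much heap a JVM actually gets, a small standalone Java program can print the limit reported by `Runtime.getRuntime().maxMemory()`. This is a sketch under assumptions: the class name `HeapCheck` is hypothetical, and it reports the limit of the JVM it runs in, not of a Weka session started through RunWeka.ini.

```java
class HeapCheck {

    // Converts a byte count to whole mebibytes (rounding down).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // maxMemory() returns the maximum amount of memory the JVM will try to use.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap available to this JVM: " + toMiB(maxBytes) + " MiB");
    }
}
```

Running it as `java -Xmx512m HeapCheck` should report a value close to 512 MiB. The figure can come out slightly lower than the requested size, depending on the JVM.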