1 Running Clustering Algorithm in Weka Presented by Rachsuda Jiamthapthaksin Computer Science Department University of Houston.

2 What is Weka?
Data mining software in Java
– Supervised learning (classification)
– Unsupervised learning (clustering)
Tools
– Exploration
– Visualization
– Experiment
– Statistical summary

3 Download Weka
– Windows (weka-3-5-6jre.exe)
– Linux

4 Getting Started

5 Memory Limitation in Weka
Run the GUI Chooser from the DOS prompt and set the maximum Java heap size with -Xmx to raise the memory limit:
C:\>java -Xmx128m -classpath .;/progra~1/weka-3-5/weka.jar weka.gui.GUIChooser

6 Weka GUI

7 Explorer

8 Open Files (.csv, .arff)

9 Dataset’s Description Attributes Dataset’s statistics

10 Remove Class Attribute Non-class attributes
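The run information on the results slides records this step in the relation name as Remove-R5 (and later Remove-R1-2,5): the listed attribute positions are dropped before clustering. What that does to the data can be sketched in plain Python. This is only an illustration, not Weka's implementation; the helper names `parse_range` and `remove_columns` are hypothetical, and the sketch handles only numeric, 1-based positions and ranges.

```python
def parse_range(spec):
    """Expand a 1-based range list such as "1-2,5" into a set of column numbers."""
    cols = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cols.update(range(int(lo), int(hi) + 1))
        else:
            cols.add(int(part))
    return cols


def remove_columns(rows, spec):
    """Return copies of the rows without the columns named in spec (1-based)."""
    drop = parse_range(spec)
    return [[v for i, v in enumerate(row, start=1) if i not in drop]
            for row in rows]


# First and last iris instances, with the class label in column 5.
iris_rows = [
    [5.1, 3.5, 1.4, 0.2, "Iris-setosa"],
    [5.9, 3.0, 5.1, 1.8, "Iris-virginica"],
]

print(remove_columns(iris_rows, "5"))      # class attribute removed
print(remove_columns(iris_rows, "1-2,5"))  # only petallength, petalwidth kept
```

Dropping column 5 leaves the four measurements; dropping 1-2,5 leaves the two petal attributes used on the later slides.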

11 Select A Clustering Algorithm

12 Select A Clustering Algorithm

13 Select A Clustering Algorithm

14 Parameters’ Setting

15 Run A Clustering Algorithm

16 DBSCAN Results
=== Run information ===
Scheme: weka.clusterers.DBScan -E 0.9 -M 6 -I weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase -D weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclidianDataObject
Relation: iris-weka.filters.unsupervised.attribute.Remove-R5
Instances: 150
Attributes: 4
  sepallength
  sepalwidth
  petallength
  petalwidth
Test mode: evaluate on training data
=== Model and evaluation on training set ===
DBScan clustering results
========================================================================================
Clustered DataObjects: 150
Number of attributes: 4
Epsilon: 0.9; minPoints: 6
Index: weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase
Distance-type: weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclidianDataObject
Number of generated clusters: 1
Elapsed time: .06
(  0.) 5.1,3.5,1.4,0.2 --> 0
(  1.) 4.9,3,1.4,0.2 --> 0
(  2.) 4.7,3.2,1.3,0.2 --> 0
(  3.) 4.6,3.1,1.5,0.2 --> 0
(  4.) 5,3.6,1.4,0.2 --> 0
…
(146.) 6.3,2.5,5,1.9 --> 0
(147.) 6.5,3,5.2,2 --> 0
(148.) 6.2,3.4,5.4,2.3 --> 0
(149.) 5.9,3,5.1,1.8 --> 0
Clustered Instances
(100%)
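To make the roles of Epsilon (-E) and minPoints (-M) concrete, here is a minimal pure-Python sketch of the DBSCAN idea. It is not Weka's code. It assumes raw Euclidean distance and counts a point as its own neighbour, so it need not reproduce the numbers above. The toy points and the names `dbscan`, `region_query` and `NOISE` are made up for illustration.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_points):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_points:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_points:  # core point: keep growing
                queue.extend(j_neighbours)
    return labels


# Hypothetical toy data: two dense groups and one isolated point.
points = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (9.0, 1.0)]
print(dbscan(points, eps=0.3, min_points=3))
# prints [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

A larger eps or a smaller min_points merges more points into fewer clusters, which is one way to read the single cluster reported for Epsilon 0.9 and minPoints 6.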

17 Simplify A Tested Dataset

18 Simplify A Tested Dataset

19 Parameters’ Setting

20 DBSCAN Clustering Results
=== Run information ===
Scheme: weka.clusterers.DBScan -E 0.3 -M 50 -I weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase -D weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclidianDataObject
Relation: iris-weka.filters.unsupervised.attribute.Remove-R1-2,5
Instances: 150
Attributes: 2
  petallength
  petalwidth
Test mode: evaluate on training data
=== Model and evaluation on training set ===
DBScan clustering results
========================================================================================
Clustered DataObjects: 150
Number of attributes: 2
Epsilon: 0.3; minPoints: 50
Index: weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase
Distance-type: weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclidianDataObject
Number of generated clusters: 2
Elapsed time: .03
(  0.) 1.4,0.2 --> 0
(  1.) 1.4,0.2 --> 0
(  2.) 1.3,0.2 --> 0
(  3.) 1.5,0.2 --> 0
…
(146.) 5,1.9 --> 1
(147.) 5.2,2 --> 1
(148.) 5.4,2.3 --> 1
(149.) 5.1,1.8 --> 1
Clustered Instances
0 50 ( 33%)
( 67%)

21 Run k-Means in Weka

22 Parameters’ Setting

23 k-Means Clustering Results
=== Run information ===
Scheme: weka.clusterers.SimpleKMeans -N 2 -S 10
Relation: iris-weka.filters.unsupervised.attribute.Remove-R1-2,5
Instances: 150
Attributes: 2
  petallength
  petalwidth
Test mode: evaluate on training data
=== Model and evaluation on training set ===
kMeans
======
Number of iterations: 6
Within cluster sum of squared errors:
Cluster centroids:
Cluster 0
  Mean/Mode:
  Std Devs:
Cluster 1
  Mean/Mode:
  Std Devs:
Clustered Instances
( 67%)
1 50 ( 33%)
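The quantities in this report (number of clusters -N, random seed -S, iterations, within-cluster sum of squared errors, centroids) can be illustrated with a small pure-Python k-means sketch. It is an assumption-laden illustration rather than SimpleKMeans itself: it uses raw Euclidean distance and Python's own random generator, so seed 10 here does not reproduce Weka's run. The function name `kmeans` is hypothetical. The eight sample rows are the petallength/petalwidth values listed on the DBSCAN slide.

```python
import math
import random


def kmeans(points, k, seed, max_iter=100):
    """Lloyd's algorithm: returns (centroids, assignment, sse, iterations)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k starting centroids at random
    assignment = None
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assign every point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster: converged
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    sse = sum(math.dist(p, centroids[a]) ** 2
              for p, a in zip(points, assignment))
    return centroids, assignment, sse, iterations


# (petallength, petalwidth) for instances 0-3 and 146-149 of the iris data.
sample = [(1.4, 0.2), (1.4, 0.2), (1.3, 0.2), (1.5, 0.2),
          (5.0, 1.9), (5.2, 2.0), (5.4, 2.3), (5.1, 1.8)]

centroids, assignment, sse, iterations = kmeans(sample, k=2, seed=10)
print("Number of iterations:", iterations)
print("Within cluster sum of squared errors:", round(sse, 4))
print("Cluster centroids:", centroids)
print("Assignment:", assignment)
```

Which group ends up as cluster 0 depends on the starting centroids, so the cluster ids are arbitrary labels; only the grouping itself is meaningful.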

24 ArffViewer: Convert Dataset’s Extension

25 Open A Dataset’s file

26 Select A Dataset’s File

27 View the Dataset

28 Manipulate the Dataset (Optional)

29 Save As .arff File
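For readers who want to see what the saved file looks like, the sketch below converts CSV text to ARFF text outside Weka. It is a simplified illustration, not what ArffViewer does internally. The function names are hypothetical, and it assumes a header row, no missing values, and attribute names and values without spaces or quotes. Columns whose values all parse as numbers become numeric attributes; every other column becomes a nominal attribute.

```python
import csv
import io


def is_number(text):
    """True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Convert CSV text (with a header row) into ARFF text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = [f"@relation {relation}", ""]
    for col, name in enumerate(header):
        values = [row[col] for row in data]
        if all(is_number(v) for v in values):
            lines.append(f"@attribute {name} numeric")
        else:
            labels = ",".join(sorted(set(values)))
            lines.append(f"@attribute {name} {{{labels}}}")
    lines += ["", "@data"] + [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# Two illustrative iris rows; the column names are assumptions.
csv_text = (
    "petallength,petalwidth,class\n"
    "1.4,0.2,Iris-setosa\n"
    "5.1,1.8,Iris-virginica\n"
)
arff_text = csv_to_arff(csv_text, relation="iris")
print(arff_text)
```

The result starts with an @relation line, lists one @attribute line per column, and puts the unchanged data rows after @data.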

30 Weka Documentation