1 Running Clustering Algorithm in Weka Presented by Rachsuda Jiamthapthaksin Computer Science Department University of Houston.


1 1 Running Clustering Algorithm in Weka Presented by Rachsuda Jiamthapthaksin Computer Science Department University of Houston

2 2 What is Weka?
Data mining software in Java
– Supervised learning (classification)
– Unsupervised learning (clustering)
Tools
– Exploration
– Visualization
– Experiment
– Statistical summary

3 3 Download Weka
http://www.cs.waikato.ac.nz/ml/weka/
– Windows (weka-3-5-6jre.exe)
– Linux

4 4 Getting Started

5 5 Memory Limitation in Weka
Run the GUI Chooser from a DOS prompt to increase the memory available to Weka:
C:\>java -Xmx128m -classpath .;/progra~1/weka-3-5/weka.jar weka.gui.GUIChooser
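The -Xmx option in the command above sets the JVM's maximum heap size. A minimal sketch for checking which limit a JVM actually received follows; the class HeapCheck is a hypothetical helper and not part of Weka, and the reported figure may be somewhat below the requested value depending on the JVM.

```java
// Hypothetical helper (not part of Weka): reports the JVM's maximum heap size.
class HeapCheck {
    /** Returns the maximum heap the JVM will attempt to use, in whole megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Launch with the same -Xmx flag used for Weka to compare the values.
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it with the same -Xmx flag used for the GUI Chooser shows roughly the heap size Weka will be allowed to use.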

6 6 Weka GUI

7 7 Explorer

8 8 Open Files (.csv, .arff)

9 9 Dataset’s Description
– Attributes
– Dataset’s statistics

10 10 Remove Class Attribute
– Non-class attributes

11 11 Select A Clustering Algorithm

12 12 Select A Clustering Algorithm

13 13 Select A Clustering Algorithm

14 14 Parameters’ Setting

15 15 Run A Clustering Algorithm

16 16 DBSCAN Results
=== Run information ===
Scheme: weka.clusterers.DBScan -E 0.9 -M 6 -I weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase -D weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclidianDataObject
Relation: iris-weka.filters.unsupervised.attribute.Remove-R5
Instances: 150
Attributes: 4
  sepallength
  sepalwidth
  petallength
  petalwidth
Test mode: evaluate on training data
=== Model and evaluation on training set ===
DBScan clustering results
========================================================================================
Clustered DataObjects: 150
Number of attributes: 4
Epsilon: 0.9; minPoints: 6
Index: weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase
Distance-type: weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclidianDataObject
Number of generated clusters: 1
Elapsed time: .06
( 0.) 5.1,3.5,1.4,0.2 --> 0
( 1.) 4.9,3,1.4,0.2 --> 0
( 2.) 4.7,3.2,1.3,0.2 --> 0
( 3.) 4.6,3.1,1.5,0.2 --> 0
( 4.) 5,3.6,1.4,0.2 --> 0
…
(146.) 6.3,2.5,5,1.9 --> 0
(147.) 6.5,3,5.2,2 --> 0
(148.) 6.2,3.4,5.4,2.3 --> 0
(149.) 5.9,3,5.1,1.8 --> 0
Clustered Instances
0 150 (100%)

17 17 Simplify A Tested Dataset

18 18 Simplify A Tested Dataset

19 19 Parameters’ Setting

20 20 DBSCAN Clustering Results
=== Run information ===
Scheme: weka.clusterers.DBScan -E 0.3 -M 50 -I weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase -D weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclidianDataObject
Relation: iris-weka.filters.unsupervised.attribute.Remove-R1-2,5
Instances: 150
Attributes: 2
  petallength
  petalwidth
Test mode: evaluate on training data
=== Model and evaluation on training set ===
DBScan clustering results
========================================================================================
Clustered DataObjects: 150
Number of attributes: 2
Epsilon: 0.3; minPoints: 50
Index: weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase
Distance-type: weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclidianDataObject
Number of generated clusters: 2
Elapsed time: .03
( 0.) 1.4,0.2 --> 0
( 1.) 1.4,0.2 --> 0
( 2.) 1.3,0.2 --> 0
( 3.) 1.5,0.2 --> 0
…
(146.) 5,1.9 --> 1
(147.) 5.2,2 --> 1
(148.) 5.4,2.3 --> 1
(149.) 5.1,1.8 --> 1
Clustered Instances
0 50 ( 33%)
1 100 ( 67%)
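The two DBSCAN runs in this deck differ in epsilon and minPoints (0.9 and 6 gave one cluster; 0.3 and 50 gave two). To make the role of those two parameters concrete, here is a minimal plain-Java sketch of the DBSCAN idea. It is not Weka's DBScan class: the name DbscanSketch, the use of raw (unnormalized) Euclidean distance, and counting a point as its own neighbor are assumptions made for illustration, so it is not expected to reproduce the numbers above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative sketch only; this is not Weka's DBScan implementation.
class DbscanSketch {
    static final int NOISE = -1;
    private static final int UNVISITED = -2;

    /** Returns one label per point: a cluster index starting at 0, or NOISE. */
    static int[] cluster(double[][] points, double epsilon, int minPoints) {
        int[] labels = new int[points.length];
        Arrays.fill(labels, UNVISITED);
        int nextCluster = 0;
        for (int i = 0; i < points.length; i++) {
            if (labels[i] != UNVISITED) continue;
            List<Integer> neighbors = neighborsOf(points, i, epsilon);
            if (neighbors.size() < minPoints) {
                labels[i] = NOISE; // may later be claimed as a border point
                continue;
            }
            // Point i is a core point: start a new cluster and grow it.
            labels[i] = nextCluster;
            Deque<Integer> frontier = new ArrayDeque<>(neighbors);
            while (!frontier.isEmpty()) {
                int j = frontier.poll();
                if (labels[j] == NOISE) labels[j] = nextCluster; // border point
                if (labels[j] != UNVISITED) continue;
                labels[j] = nextCluster;
                List<Integer> reachable = neighborsOf(points, j, epsilon);
                // Only core points extend the cluster further.
                if (reachable.size() >= minPoints) frontier.addAll(reachable);
            }
            nextCluster++;
        }
        return labels;
    }

    /** Indices of all points within epsilon of points[index], including itself. */
    private static List<Integer> neighborsOf(double[][] points, int index, double epsilon) {
        List<Integer> result = new ArrayList<>();
        for (int k = 0; k < points.length; k++) {
            if (distance(points[index], points[k]) <= epsilon) result.add(k);
        }
        return result;
    }

    private static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) sum += (a[d] - b[d]) * (a[d] - b[d]);
        return Math.sqrt(sum);
    }
}
```

With a small epsilon, only tightly packed points form clusters and isolated points are labeled noise; with a large enough epsilon, every point becomes reachable from every other and a single cluster results.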

21 21 Run k-Means in Weka

22 22 Parameters’ Setting

23 23 k-Means Clustering Results
=== Run information ===
Scheme: weka.clusterers.SimpleKMeans -N 2 -S 10
Relation: iris-weka.filters.unsupervised.attribute.Remove-R1-2,5
Instances: 150
Attributes: 2
  petallength
  petalwidth
Test mode: evaluate on training data
=== Model and evaluation on training set ===
kMeans
======
Number of iterations: 6
Within cluster sum of squared errors: 5.179687509974782
Cluster centroids:
Cluster 0
  Mean/Mode: 4.906 1.676
  Std Devs: 0.8256 0.4248
Cluster 1
  Mean/Mode: 1.464 0.244
  Std Devs: 0.1735 0.1072
Clustered Instances
0 100 ( 67%)
1 50 ( 33%)
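The k-Means output reports an iteration count, cluster centroids, and a within-cluster sum of squared errors. The sketch below shows, in plain Java, how those quantities arise in the standard assign-then-update loop. It is not Weka's SimpleKMeans: the class name KMeansSketch, the caller-supplied initial centroids, and the use of raw attribute values are assumptions for illustration, so its numbers are not expected to match the output above.

```java
import java.util.Arrays;

// Illustrative sketch only; this is not Weka's SimpleKMeans implementation.
class KMeansSketch {
    final double[][] centroids;
    final int[] assignments;
    final double sumSquaredError;

    private KMeansSketch(double[][] centroids, int[] assignments, double sumSquaredError) {
        this.centroids = centroids;
        this.assignments = assignments;
        this.sumSquaredError = sumSquaredError;
    }

    static KMeansSketch fit(double[][] points, double[][] initialCentroids, int maxIterations) {
        if (maxIterations < 1) throw new IllegalArgumentException("maxIterations must be >= 1");
        int k = initialCentroids.length;
        int dims = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) centroids[c] = initialCentroids[c].clone();
        int[] assignments = new int[points.length];
        Arrays.fill(assignments, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int best = nearest(points[i], centroids);
                if (best != assignments[i]) {
                    assignments[i] = best;
                    changed = true;
                }
            }
            if (!changed) break; // converged
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignments[i]]++;
                for (int d = 0; d < dims; d++) sums[assignments[i]][d] += points[i][d];
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // empty cluster keeps its old centroid
                for (int d = 0; d < dims; d++) centroids[c][d] = sums[c][d] / counts[c];
            }
        }

        double sse = 0.0;
        for (int i = 0; i < points.length; i++) {
            sse += squaredDistance(points[i], centroids[assignments[i]]);
        }
        return new KMeansSketch(centroids, assignments, sse);
    }

    private static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) best = c;
        }
        return best;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) sum += (a[d] - b[d]) * (a[d] - b[d]);
        return sum;
    }
}
```

A lower sum of squared errors means points sit closer to their assigned centroids, which is why the value is useful for comparing runs with the same number of clusters.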

24 24 ArffViewer: Convert Dataset’s Extension

25 25 Open A Dataset’s file

26 26 Select A Dataset’s File

27 27 View the Dataset

28 28 Manipulate the Dataset (Optional)

29 29 Save As .arff File
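The ArffViewer steps above convert a dataset to ARFF through the GUI. For readers curious about what the saved file contains, here is a minimal plain-Java sketch that builds ARFF text from CSV lines. It does not use Weka: CsvToArffSketch is a hypothetical helper, and it assumes a header row, numeric-only columns, and no quoted or missing values.

```java
import java.util.List;

// Hypothetical helper (not part of Weka): builds ARFF text from simple CSV lines.
class CsvToArffSketch {
    /**
     * Assumes the first line is a header and every column is numeric.
     * Quoted fields, nominal attributes, and missing values are not handled.
     */
    static String toArff(String relationName, List<String> csvLines) {
        StringBuilder arff = new StringBuilder();
        arff.append("@relation ").append(relationName).append("\n\n");
        // One @attribute declaration per CSV column.
        for (String name : csvLines.get(0).split(",")) {
            arff.append("@attribute ").append(name.trim()).append(" numeric\n");
        }
        // Data rows follow the @data marker unchanged.
        arff.append("\n@data\n");
        for (String line : csvLines.subList(1, csvLines.size())) {
            if (!line.trim().isEmpty()) arff.append(line.trim()).append("\n");
        }
        return arff.toString();
    }
}
```

The result has the three parts of an ARFF file: the relation name, one attribute declaration per column, and the data rows.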

30 30 Weka Documentation

