Published by Lorin Johnson. Modified over 9 years ago.
CSCI 347 / CS 4206: Data Mining
Module 05: WEKA
Topic 04: Data Preparation Tools
Explorer: Pre-Processing the Data
Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary.
Data can also be read from a URL or from an SQL database (using JDBC).
Pre-processing tools in WEKA are called “filters”.
WEKA contains filters for discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
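To make two of the filter types named above concrete, here is a minimal Python sketch of min-max normalization and equal-width discretization. It illustrates the general techniques only and is not WEKA's implementation. The function names and the `temperatures` column are hypothetical, chosen for this example.

```python
from typing import List


def normalize(values: List[float]) -> List[float]:
    """Min-max scale values into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values: List[float], bins: int) -> List[int]:
    """Equal-width binning: return a bin index in [0, bins - 1] per value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), bins - 1) for v in values]


# Hypothetical numeric attribute, used only for illustration.
temperatures = [64.0, 68.0, 72.0, 80.0, 85.0]
scaled = normalize(temperatures)
binned = discretize(temperatures, bins=3)
```

Here `scaled` keeps the numeric attribute but maps it to the unit interval, while `binned` turns it into a nominal attribute with three values. Some learning schemes need one form or the other.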
WEKA GUI Chooser
Preprocessing Data in WEKA
WEKA Explorer: Attribute Selection
Panel that can be used to investigate which (subsets of) attributes are the most predictive ones.
Attribute selection methods contain two parts:
  A search method: best-first, forward selection, random, exhaustive, genetic algorithm, ranking
  An evaluation method: correlation-based, wrapper, information gain, chi-squared, …
Very flexible: WEKA allows (almost) arbitrary combinations of these two.
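The sketch below shows one such combination in plain Python: information gain as the evaluation method and ranking as the search method. It is an illustrative sketch for nominal attributes, not WEKA's code. The function names and the toy dataset are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(column: Sequence[str], labels: Sequence[str]) -> float:
    """Evaluation method: entropy reduction from splitting on one attribute."""
    total = len(labels)
    groups: Dict[str, List[str]] = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(data: Dict[str, Sequence[str]],
                    labels: Sequence[str]) -> List[Tuple[str, float]]:
    """Search method: score each attribute on its own, sort best first."""
    scores = [(name, information_gain(col, labels)) for name, col in data.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy dataset: "outlook" determines the class, "windy" does not.
data = {
    "windy": ["true", "false", "true", "false"],
    "outlook": ["sunny", "sunny", "rainy", "rainy"],
}
labels = ["yes", "yes", "no", "no"]
ranking = rank_attributes(data, labels)
```

Ranking scores each attribute independently, so it is cheap but cannot detect attributes that are only predictive in combination. Subset searches such as best-first or forward selection address that case.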
Attribute Selection in WEKA
Explorer: Data Visualization
Visualization is very useful in practice, e.g. it helps to determine the difficulty of the learning problem.
WEKA can visualize single attributes (1-d) and pairs of attributes (2-d).
Color-coded class values.
“Jitter” option to deal with nominal attributes (and to detect “hidden” data points).
“Zoom-in” function.
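The idea behind jitter can be sketched in a few lines of Python. When two nominal attributes are plotted against each other, many instances share the same coordinates and hide one another. Adding a small random offset to each point separates them. This is a sketch of the general idea, not WEKA's implementation. The `jitter` function, its `amount` and `seed` parameters, and the sample points are assumptions made for this example.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def jitter(points: List[Point], amount: float, seed: int = 0) -> List[Point]:
    """Shift each point by a uniform random offset in [-amount, amount] per axis."""
    rng = random.Random(seed)  # fixed seed so the plot is reproducible
    return [(x + rng.uniform(-amount, amount),
             y + rng.uniform(-amount, amount))
            for x, y in points]


# Hypothetical data: nominal values encoded as integers, so instances overlap.
points = [(1.0, 2.0), (1.0, 2.0), (1.0, 2.0), (3.0, 1.0), (3.0, 1.0)]
jittered = jitter(points, amount=0.2)

visible_before = len(set(points))    # overlapping instances collapse into one spot
visible_after = len(set(jittered))   # each instance now has its own position
```

The offset should stay small relative to the spacing between attribute values, so that each jittered point can still be attributed to its original value.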
Data Visualization in WEKA
The Mystery Sound
And what would this be?