Published by Phillip O’Brien. Modified over 9 years ago.
Chapter 4: An Excel-based Data Mining Tool
4.1 The iData Analyzer
4.2 ESX: A Multipurpose Tool for Data Mining
ESX
- Supports supervised learning and unsupervised clustering
- Does not make statistical assumptions
- Deals with missing attribute values
- Applies to categorical and numerical data
- Points out inconsistencies and unusual values
- For supervised classification, ESX can determine the instances and attributes best able to classify new instances.
- For unsupervised clustering, ESX incorporates a globally optimizing evaluation function that favors the best clustering of instances.
4.3 iDAV Format for Data Mining
4.4 A Five-Step Approach for Unsupervised Clustering
Step 1: Enter the Data to Be Mined
Step 2: Perform a Data Mining Session
Step 3: Read and Interpret Summary Results
Step 4: Read and Interpret Individual Class Results
Step 5: Visualize Individual Class Rules
Step 1: Enter the Data to Be Mined
Step 2: Perform a Data Mining Session
Step 3: Read and Interpret Summary Results
- Class Resemblance Scores
- Domain Resemblance Score
  - Attributes, instances, no model
- Domain Predictability
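The slides do not give the similarity function ESX uses, so the following is only a rough sketch of what a resemblance score captures. It assumes a simple stand-in: the similarity of two instances is the fraction of categorical attributes on which they agree, and the resemblance of a group is its mean pairwise similarity. The function names and the sample data are hypothetical.

```python
from itertools import combinations

def similarity(a, b):
    """Fraction of attributes on which two instances agree (assumed stand-in, not ESX's function)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def resemblance(instances):
    """Mean pairwise similarity of a group of instances."""
    pairs = list(combinations(instances, 2))
    if not pairs:
        return 1.0  # a single instance trivially resembles itself
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical clustering of five instances into two classes.
classes = {
    "c1": [("yes", "low", "male"), ("yes", "low", "female"), ("yes", "high", "male")],
    "c2": [("no", "high", "female"), ("no", "high", "male")],
}

# Domain resemblance: computed over all instances, ignoring the class structure.
domain = [inst for members in classes.values() for inst in members]
domain_score = resemblance(domain)

# Class resemblance: computed separately within each class.
class_scores = {name: resemblance(members) for name, members in classes.items()}
```

Under this stand-in, a class whose resemblance score exceeds the domain resemblance score is more cohesive than the data set taken as a whole, which is the comparison the summary results invite.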
Step 4: Read and Interpret Individual Class Results
- Class Predictability is a within-class measure.
- Class Predictiveness is a between-class measure.
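The difference between the two measures is easiest to see as a pair of conditional frequencies. The sketch below assumes the usual reading for a categorical attribute: predictability is the share of a class's instances that have a given value, and predictiveness is the share of instances with that value that fall in the class. The function names and data are hypothetical.

```python
def class_predictability(instances, labels, cls, attr, value):
    """Within-class: fraction of instances in `cls` that have attr == value."""
    in_class = [inst for inst, lab in zip(instances, labels) if lab == cls]
    return sum(inst[attr] == value for inst in in_class) / len(in_class)

def class_predictiveness(instances, labels, cls, attr, value):
    """Between-class: fraction of instances with attr == value that belong to `cls`."""
    with_value = [lab for inst, lab in zip(instances, labels) if inst[attr] == value]
    return sum(lab == cls for lab in with_value) / len(with_value)

# Hypothetical data: one categorical attribute and a class label per instance.
instances = [
    {"income": "high"},
    {"income": "high"},
    {"income": "low"},
    {"income": "low"},
    {"income": "high"},
]
labels = ["buyer", "buyer", "buyer", "buyer", "non-buyer"]

predictability = class_predictability(instances, labels, "buyer", "income", "high")
predictiveness = class_predictiveness(instances, labels, "buyer", "income", "high")
```

Here half of the buyers have a high income (predictability 0.5), while two of the three high-income instances are buyers (predictiveness of about 0.67). The two numbers answer different questions and need not agree.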
Step 5: Visualize Individual Class Rules
4.5 A Six-Step Approach for Supervised Learning
Step 1: Choose an Output Attribute
Step 2: Perform the Mining Session
Step 3: Read and Interpret Summary Results
Step 4: Read and Interpret Test Set Results
Step 5: Read and Interpret Class Results
Step 6: Visualize and Interpret Class Rules
Step 4: Read and Interpret Test Set Results
4.6 Techniques for Generating Rules
1. Choose an attribute.
2. Use the attribute to subdivide instances into classes.
3. For each subclass:
   - If the instances in the subclass satisfy a predefined criterion, generate a defining rule.
   - If not, repeat from step 1.
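The three steps above can be sketched as a short recursive procedure. The slides do not say how the attribute is chosen or what the predefined criterion is, so this sketch assumes two simplifications: attributes are tried in the order given, and a subclass qualifies when the share of its majority class reaches a hypothetical `min_purity` threshold. The data set is also made up.

```python
from collections import Counter

def generate_rules(rows, attributes, target, min_purity=1.0, conditions=()):
    """Split rows on successive attributes until each subset meets the purity criterion."""
    rules = []
    if not attributes:
        return rules  # nothing left to split on; this subset yields no rule
    # Step 1: choose an attribute (assumption: simply the next one in the list).
    attr, rest = attributes[0], attributes[1:]
    # Step 2: use the attribute to subdivide the instances.
    for value in sorted({row[attr] for row in rows}):
        subset = [row for row in rows if row[attr] == value]
        label, count = Counter(row[target] for row in subset).most_common(1)[0]
        new_conditions = conditions + ((attr, value),)
        # Step 3: generate a rule if the criterion holds, otherwise repeat from step 1.
        if count / len(subset) >= min_purity:
            rules.append((new_conditions, label))
        else:
            rules.extend(generate_rules(subset, rest, target, min_purity, new_conditions))
    return rules

# Hypothetical training data.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
rules = generate_rules(rows, ["outlook", "windy"], "play")
for conds, label in rules:
    print(" AND ".join(f"{a} = {v}" for a, v in conds), "=>", label)
```

Each rule is a tuple of attribute-value conditions plus the class it predicts. A real tool would pick the splitting attribute with a quality measure rather than list order.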
4.6 Techniques for Generating Rules
1. Define the scope of the rules.
2. Choose the instances.
3. Set the minimum rule correctness.
4. Define the minimum rule coverage.
5. Choose an attribute significance value.
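To make settings 3 and 4 concrete, the sketch below filters candidate rules by correctness and coverage. The definitions are assumptions, since the slides do not spell them out: correctness is taken as the fraction of instances matching the rule's conditions that belong to the predicted class, and coverage as the fraction of that class's instances the rule matches. The thresholds, function names, and data are hypothetical, and scope, instance selection, and attribute significance are not modeled.

```python
def rule_stats(rows, conditions, target, cls):
    """Return (correctness, coverage) for the rule 'conditions => cls' under the assumed definitions."""
    matched = [r for r in rows if all(r[a] == v for a, v in conditions)]
    in_class = [r for r in rows if r[target] == cls]
    hits = [r for r in matched if r[target] == cls]
    correctness = len(hits) / len(matched) if matched else 0.0
    coverage = len(hits) / len(in_class) if in_class else 0.0
    return correctness, coverage

def accept(rows, conditions, target, cls, min_correctness=0.8, min_coverage=0.5):
    """Keep a rule only if it meets both minimum thresholds."""
    correctness, coverage = rule_stats(rows, conditions, target, cls)
    return correctness >= min_correctness and coverage >= min_coverage

# Hypothetical data set.
rows = [
    {"income": "high", "age": "young", "buys": "yes"},
    {"income": "high", "age": "old", "buys": "yes"},
    {"income": "high", "age": "old", "buys": "no"},
    {"income": "low", "age": "young", "buys": "yes"},
    {"income": "low", "age": "old", "buys": "no"},
]

too_inaccurate = accept(rows, [("income", "high")], "buys", "yes")
too_narrow = accept(rows, [("income", "high"), ("age", "young")], "buys", "yes")
kept = accept(rows, [("age", "young")], "buys", "yes")
```

The first rule fails on correctness (2 of 3 matches are correct), the second fails on coverage (it reaches 1 of 3 buyers), and the third passes both thresholds. Raising either minimum trades the number of rules for their quality.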
4.7 Instance Typicality
Typicality Scores
- Identify prototypical and outlier instances.
- Select a best set of training instances.
- Are used to compute individual instance classification confidence scores.
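As a rough illustration of the first use, the sketch below assumes that an instance's typicality is its average similarity to the other members of its class, with similarity again approximated as the fraction of matching categorical attribute values. Both the similarity stand-in and the data are hypothetical, not taken from ESX.

```python
def similarity(a, b):
    """Fraction of attributes on which two instances agree (assumed stand-in)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def typicality(index, members):
    """Average similarity of members[index] to every other member of the class."""
    target = members[index]
    others = [m for i, m in enumerate(members) if i != index]
    return sum(similarity(target, m) for m in others) / len(others)

# Hypothetical class with five instances.
members = [
    ("yes", "low", "male"),
    ("yes", "low", "female"),
    ("yes", "high", "male"),
    ("no", "low", "male"),
    ("no", "high", "female"),
]

scores = [typicality(i, members) for i in range(len(members))]
prototype = max(range(len(members)), key=lambda i: scores[i])  # most typical instance
outlier = min(range(len(members)), key=lambda i: scores[i])    # least typical instance
```

Sorting instances by such a score puts prototypical instances at the top and likely outliers at the bottom, which is also one plausible way to pick a compact training set.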
4.8 Special Considerations and Features
- Avoid Mining Delays
- The Quick Mine Feature
- Erroneous and Missing Data