Chapter 4: An Excel-based Data Mining Tool
4.1 The iData Analyzer
Figure 4.1 The iDA system architecture
Figure 4.2 A successful installation
4.2 ESX: A Multipurpose Tool for Data Mining
Figure 4.3 An ESX concept hierarchy
4.3 iDAV Format for Data Mining
4.4 A Five-step Approach for Unsupervised Clustering
Step 1: Enter the Data to be Mined
Step 2: Perform a Data Mining Session
Step 3: Read and Interpret Summary Results
Step 4: Read and Interpret Individual Class Results
Step 5: Visualize Individual Class Rules
Step 1: Enter the Data to be Mined
Figure 4.4 The Credit Card Promotion Database
Step 2: Perform a Data Mining Session
Figure 4.5 Unsupervised settings for ESX
Figure 4.6 RuleMaker options
Step 3: Read and Interpret Summary Results
Class Resemblance Scores
Domain Resemblance Score
Domain Predictability
Figure 4.8 Summary statistics for the Acme credit card promotion database
Figure 4.9 Statistics for numerical attributes and common categorical attribute values
Step 4: Read and Interpret Individual Class Results
Class Predictability is a within-class measure. A value of 1 marks a necessary condition for class membership.
Class Predictiveness is a between-class measure. A value of 1 marks a sufficient condition for class membership.
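The two measures can be illustrated with a minimal sketch. It assumes the usual reading of the terms: class predictability is the fraction of a class's instances that have a given attribute value, and class predictiveness is the fraction of all instances with that value that fall in the class. The rows and function names below are hypothetical and are not taken from iDA or ESX.

```python
# Hypothetical toy data: (class label, attribute value) pairs for one
# categorical attribute. The rows are invented for illustration only.
rows = [
    ("Class 1", "Yes"),
    ("Class 1", "Yes"),
    ("Class 1", "No"),
    ("Class 2", "No"),
    ("Class 2", "No"),
]


def class_predictability(rows, cls, value):
    """Within-class measure: share of instances in `cls` that have `value`."""
    in_class = [v for c, v in rows if c == cls]
    return in_class.count(value) / len(in_class)


def class_predictiveness(rows, cls, value):
    """Between-class measure: share of instances with `value` that are in `cls`."""
    with_value = [c for c, v in rows if v == value]
    return with_value.count(cls) / len(with_value)


if __name__ == "__main__":
    for cls, value in [("Class 1", "Yes"), ("Class 2", "No")]:
        print(cls, value,
              round(class_predictability(rows, cls, value), 2),
              round(class_predictiveness(rows, cls, value), 2))
```

In this toy data every "Yes" instance belongs to Class 1, so "Yes" has predictiveness 1 for Class 1 (sufficient), while every Class 2 instance has "No", so "No" has predictability 1 for Class 2 (necessary).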
Figure 4.10 Class 3 summary results
Figure 4.11 Necessary and sufficient attribute values for Class 3
Step 5: Visualize Individual Class Rules
Figure 4.7 Rules for the credit card promotion database
4.5 A Six-Step Approach for Supervised Learning
Step 1: Choose an Output Attribute
Step 2: Perform the Mining Session
Step 3: Read and Interpret Summary Results
Step 4: Read and Interpret Test Set Results
Step 5: Read and Interpret Class Results
Step 6: Visualize and Interpret Class Rules
Step 4: Read and Interpret Test Set Results
Figure 4.12 Test set instance classification
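Test set results are commonly summarized by counting how often each actual class was assigned each predicted class. The sketch below shows that tabulation under the assumption that test results are available as (actual, predicted) pairs; the pairs and function names are hypothetical and do not reproduce the iDA output sheet.

```python
from collections import Counter

# Hypothetical (actual class, predicted class) pairs for a small test set.
results = [
    ("Yes", "Yes"),
    ("Yes", "No"),
    ("No", "No"),
    ("No", "No"),
    ("Yes", "Yes"),
]


def confusion_counts(results):
    """Count test instances per (actual, predicted) combination."""
    return Counter(results)


def accuracy(results):
    """Fraction of test instances whose predicted class equals the actual class."""
    correct = sum(1 for actual, predicted in results if actual == predicted)
    return correct / len(results)


if __name__ == "__main__":
    for (actual, predicted), n in sorted(confusion_counts(results).items()):
        print(f"actual={actual} predicted={predicted}: {n}")
    print("accuracy:", accuracy(results))
```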
4.6 Techniques for Generating Rules
Ref. Figure 4.6
All rules or covering set rules
Define the scope of the rules.
Choose the instances.
Set the minimum rule correctness.
Define the minimum rule coverage.
Choose an attribute significance value.
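The correctness and coverage thresholds can be made concrete with a small sketch. It assumes that rule correctness is the fraction of instances matching the rule's antecedent that belong to the rule's class, and that rule coverage is the fraction of the class's instances that the rule matches. The instances, the example rule, and the threshold values are hypothetical; this is not RuleMaker's implementation.

```python
# Hypothetical instances; attribute names and values are illustrative only.
instances = [
    {"sex": "Female", "age": 35, "cls": "Yes"},
    {"sex": "Female", "age": 40, "cls": "Yes"},
    {"sex": "Male", "age": 45, "cls": "Yes"},
    {"sex": "Female", "age": 50, "cls": "No"},
    {"sex": "Male", "age": 30, "cls": "No"},
    {"sex": "Male", "age": 29, "cls": "Yes"},
]


def rule_stats(instances, antecedent, cls):
    """Return (correctness, coverage) for the rule IF antecedent THEN cls."""
    covered = [x for x in instances if antecedent(x)]
    in_class = [x for x in instances if x["cls"] == cls]
    hits = [x for x in covered if x["cls"] == cls]
    correctness = len(hits) / len(covered) if covered else 0.0
    coverage = len(hits) / len(in_class) if in_class else 0.0
    return correctness, coverage


def passes(correctness, coverage, min_correctness, min_coverage):
    """Keep a rule only if it meets both user-chosen minimums."""
    return correctness >= min_correctness and coverage >= min_coverage


if __name__ == "__main__":
    # Example rule: IF sex = Female THEN class = Yes
    c, v = rule_stats(instances, lambda x: x["sex"] == "Female", "Yes")
    print(f"correctness={c:.2f} coverage={v:.2f}")
    print("kept:", passes(c, v, min_correctness=0.6, min_coverage=0.4))
```

Raising the minimum correctness tends to yield fewer, more precise rules, while raising the minimum coverage discards rules that describe only a small part of a class.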
4.7 Instance Typicality
Typicality Scores
Identify prototypical and outlier instances.
Select a best set of training instances.
Used to compute individual instance classification confidence scores.
Figure 4.13 Instance typicality
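One way to picture a typicality score is as the average similarity of an instance to the other members of its class. The sketch below takes that reading and, as a further assumption, measures similarity as the fraction of categorical attribute values two instances share. The similarity measure ESX actually uses is not given here, and the data and function names are hypothetical.

```python
def similarity(a, b):
    """Assumed measure: fraction of attribute positions with equal values."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def typicality(index, members):
    """Average similarity of members[index] to the other class members."""
    target = members[index]
    others = [m for i, m in enumerate(members) if i != index]
    return sum(similarity(target, o) for o in others) / len(others)


# Hypothetical members of one class, each a tuple of categorical values.
members = [
    ("Yes", "No", "Male"),
    ("Yes", "No", "Male"),
    ("Yes", "Yes", "Male"),
    ("No", "Yes", "Female"),
]

if __name__ == "__main__":
    scores = [typicality(i, members) for i in range(len(members))]
    # Sort from most typical (prototype candidates) to least typical (outliers).
    for i in sorted(range(len(members)), key=lambda i: scores[i], reverse=True):
        print(members[i], round(scores[i], 3))
```

Instances at the top of the sorted list are candidates for prototypes or training instances; the instance at the bottom is the most likely outlier.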
4.8 Special Considerations and Features
Avoid Mining Delays
The Quick Mine Feature
Erroneous and Missing Data