An Excel-based Data Mining Tool Chapter 4. 4.1 The iData Analyzer.


An Excel-based Data Mining Tool Chapter 4

4.1 The iData Analyzer

Figure 4.1 The iDA system architecture

Figure 4.2 A successful installation

4.2 ESX: A Multipurpose Tool for Data Mining
Figure 4.3 An ESX concept hierarchy

4.3 iDAV Format for Data Mining

4.4 A Five-step Approach for Unsupervised Clustering
Step 1: Enter the Data to be Mined
Step 2: Perform a Data Mining Session
Step 3: Read and Interpret Summary Results
Step 4: Read and Interpret Individual Class Results
Step 5: Visualize Individual Class Rules

Step 1: Enter the Data to be Mined
Figure 4.4 The Credit Card Promotion Database

Step 2: Perform a Data Mining Session
Figure 4.5 Unsupervised settings for ESX

Figure 4.6 RuleMaker options

Step 3: Read and Interpret Summary Results
Class Resemblance Scores
Domain Resemblance Score
Domain Predictability

Figure 4.8 Summary statistics for the Acme credit card promotion database

Figure 4.9 Statistics for numerical attributes and common categorical attribute values

Step 4: Read and Interpret Individual Class Results
Class Predictability is a within-class measure.
Class Predictiveness is a between-class measure.
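The two measures can be illustrated with a minimal Python sketch. It assumes the usual reading of the terms: class predictability is the fraction of instances within a class that have a given attribute value, and class predictiveness is the fraction of all instances with that value that fall in the class. The function names and the toy data are hypothetical and are not taken from iDA.

```python
def class_predictability(instances, labels, attribute, value, cls):
    """Within-class measure: share of instances in cls with attribute == value."""
    in_class = [inst for inst, lab in zip(instances, labels) if lab == cls]
    if not in_class:
        return 0.0
    return sum(1 for inst in in_class if inst[attribute] == value) / len(in_class)


def class_predictiveness(instances, labels, attribute, value, cls):
    """Between-class measure: share of instances with attribute == value that are in cls."""
    with_value = [lab for inst, lab in zip(instances, labels)
                  if inst[attribute] == value]
    if not with_value:
        return 0.0
    return sum(1 for lab in with_value if lab == cls) / len(with_value)


# Hypothetical toy data: four instances in two classes.
instances = [
    {"sex": "female", "insurance": "yes"},
    {"sex": "female", "insurance": "no"},
    {"sex": "male", "insurance": "no"},
    {"sex": "male", "insurance": "no"},
]
labels = ["A", "A", "B", "B"]

# insurance = no covers every instance of class B (predictability 1.0),
# but one class A instance shares the value (predictiveness 2/3).
pred_b = class_predictability(instances, labels, "insurance", "no", "B")
prev_b = class_predictiveness(instances, labels, "insurance", "no", "B")
```

A value with predictability 1.0 behaves like a necessary condition for the class, and a value with predictiveness 1.0 like a sufficient one.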

Figure 4.10 Class 3 summary results

Figure 4.11 Necessary and sufficient attribute values for Class 3

Step 5: Visualize Individual Class Rules
Figure 4.7 Rules for the credit card promotion database

4.5 A Six-Step Approach for Supervised Learning
Step 1: Choose an Output Attribute
Step 2: Perform the Mining Session
Step 3: Read and Interpret Summary Results
Step 4: Read and Interpret Test Set Results
Step 5: Read and Interpret Class Results
Step 6: Visualize and Interpret Class Rules

Read and Interpret Test Set Results
Figure 4.12 Test set instance classification

4.6 Techniques for Generating Rules
Simple Procedure for Creating a Best Set of Covering Rules
1. Choose an attribute that best differentiates all domains.
2. Use the attribute to further subdivide instances into classes.
3. For each subclass created in step 2:
3.1 If the instances in the subclass satisfy a predefined criterion, then generate a defining rule for the subclass.
3.2 If the subclass does not satisfy the predefined criterion, then repeat step 1 for the subclass.
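The covering procedure can be sketched in Python as a short recursive function. This is an illustration under stated assumptions, not the method iDA uses: the "best differentiating" attribute is taken to be the one whose split yields the highest weighted majority-class share, and the "predefined criterion" is a minimum purity threshold. The names `best_attribute`, `covering_rules`, and `min_purity` are hypothetical.

```python
from collections import Counter


def purity(labels):
    """Share of the most common class among the labels."""
    return max(Counter(labels).values()) / len(labels)


def split(rows, labels, attr):
    """Group rows and labels by the value each row has for attr."""
    groups = {}
    for row, lab in zip(rows, labels):
        sub_rows, sub_labels = groups.setdefault(row[attr], ([], []))
        sub_rows.append(row)
        sub_labels.append(lab)
    return groups


def best_attribute(rows, labels, attributes):
    """Assumed selection rule: maximize the weighted purity of the split."""
    def score(attr):
        groups = split(rows, labels, attr)
        return sum(max(Counter(g[1]).values()) for g in groups.values()) / len(labels)
    return max(attributes, key=score)


def covering_rules(rows, labels, attributes, min_purity=1.0, conditions=()):
    """Return a list of (conditions, class) rules; attributes must be non-empty."""
    attr = best_attribute(rows, labels, attributes)          # step 1
    remaining = [a for a in attributes if a != attr]
    rules = []
    for value, (sub_rows, sub_labels) in split(rows, labels, attr).items():  # step 2
        cond = conditions + ((attr, value),)
        if purity(sub_labels) >= min_purity or not remaining:  # step 3.1
            rules.append((cond, Counter(sub_labels).most_common(1)[0][0]))
        else:                                                  # step 3.2
            rules += covering_rules(sub_rows, sub_labels, remaining, min_purity, cond)
    return rules


# Hypothetical toy data.
rows = [
    {"sex": "male", "insurance": "no"},
    {"sex": "male", "insurance": "yes"},
    {"sex": "female", "insurance": "no"},
    {"sex": "female", "insurance": "yes"},
]
labels = ["no", "yes", "yes", "yes"]
rules = covering_rules(rows, labels, ["sex", "insurance"])
```

Each rule is a tuple of attribute-value conditions paired with the class it predicts. Lowering `min_purity` stops the recursion earlier and yields fewer, more general rules.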

4.6 Techniques for Generating Rules (RuleMaker)
1. Define the scope of the rules.
2. Choose the instances.
3. Set the minimum rule correctness.
4. Define the minimum rule coverage.
5. Choose an attribute significance value.
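How the correctness and coverage thresholds might filter candidate rules can be sketched in Python. The definitions here are assumptions: correctness is taken as the share of instances matched by a rule that belong to the target class, and coverage as the share of the target class that the rule matches. The function names and default thresholds are hypothetical and do not describe RuleMaker's actual interface.

```python
def rule_matches(rule, instance):
    """A rule is a list of (attribute, value) conditions that must all hold."""
    return all(instance.get(attr) == value for attr, value in rule)


def rule_stats(rule, target_class, instances, labels):
    """Return (correctness, coverage) for a rule, under the assumed definitions."""
    covered = [lab for inst, lab in zip(instances, labels)
               if rule_matches(rule, inst)]
    hits = sum(1 for lab in covered if lab == target_class)
    class_size = sum(1 for lab in labels if lab == target_class)
    correctness = hits / len(covered) if covered else 0.0
    coverage = hits / class_size if class_size else 0.0
    return correctness, coverage


def keep_rule(rule, target_class, instances, labels,
              min_correctness=0.8, min_coverage=0.3):  # hypothetical defaults
    """Keep a rule only if it meets both minimum thresholds."""
    correctness, coverage = rule_stats(rule, target_class, instances, labels)
    return correctness >= min_correctness and coverage >= min_coverage


# Hypothetical toy data: three instances of class A, two of class B.
instances = [
    {"sex": "female", "insurance": "no"},
    {"sex": "female", "insurance": "yes"},
    {"sex": "male", "insurance": "no"},
    {"sex": "female", "insurance": "no"},
    {"sex": "male", "insurance": "no"},
]
labels = ["A", "A", "A", "B", "B"]

rule = [("sex", "female")]
correctness, coverage = rule_stats(rule, "A", instances, labels)
```

Raising the minimum correctness trades coverage for precision: fewer rules survive, but those that do misclassify fewer instances.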

4.7 Instance Typicality
Typicality is the average similarity of an instance to all other instances within its class.
It can be used to:
Identify prototypical and outlier instances.
Select a best set of training instances.
Compute individual instance classification confidence scores.
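The definition of typicality can be made concrete with a small Python sketch. The similarity measure is an assumption chosen for illustration (the fraction of categorical attributes on which two instances agree). The measure ESX actually uses is not stated in the slides, and the names below are hypothetical.

```python
def similarity(a, b):
    """Assumed measure: fraction of attributes on which two instances agree."""
    return sum(1 for key in a if a[key] == b[key]) / len(a)


def typicality(index, class_instances):
    """Average similarity of one instance to all other instances in its class."""
    target = class_instances[index]
    others = [inst for i, inst in enumerate(class_instances) if i != index]
    if not others:
        return 1.0  # convention chosen here for a single-instance class
    return sum(similarity(target, other) for other in others) / len(others)


# Hypothetical class with three instances.
class_instances = [
    {"sex": "female", "insurance": "no", "magazine": "yes"},
    {"sex": "female", "insurance": "no", "magazine": "no"},
    {"sex": "male", "insurance": "yes", "magazine": "no"},
]

scores = [typicality(i, class_instances) for i in range(len(class_instances))]
prototype = max(range(len(scores)), key=scores.__getitem__)  # most typical
outlier = min(range(len(scores)), key=scores.__getitem__)    # least typical
```

Sorting a class by these scores puts prototypical instances first and likely outliers last, which is one way to pick a compact training subset.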

Figure 4.13 Instance typicality

4.8 Special Considerations and Features
Avoid Mining Delays: at some point, copy the original data into another Excel sheet.
The Quick Mine Feature: recommended when the dataset contains more than 2000 instances.
Erroneous and Missing Data: watch for blank lines, data beyond the last column, and invalid characters.