Dr. Chen, Data Mining  A/W & Dr. Chen, Data Mining Chapter 2 Data Mining: A Closer Look Jason C. H. Chen, Ph.D. Professor of MIS School of Business Administration.

Presentation transcript:

Chapter 2
Data Mining: A Closer Look
Jason C. H. Chen, Ph.D., Professor of MIS, School of Business Administration, Gonzaga University, Spokane, WA

2.1 Data Mining Strategies

Figure: A hierarchy of data mining strategies
Data Mining Strategies
- Unsupervised Clustering (no output attributes)
- Supervised Learning
  - Classification: categorical/discrete output (current behavior)
  - Estimation: numeric output
  - Prediction: future outcome (categorical or numeric)
- Market Basket Analysis

Classification
- Learning is supervised.
- The dependent variable is categorical.
- Well-defined classes.
- Current rather than future behavior.
Estimation
- Learning is supervised.
- The dependent variable is numeric.
- Well-defined classes.
- Current rather than future behavior.
Prediction
- The emphasis is on predicting future rather than current outcomes.
- The output attribute may be categorical or numeric.

The Cardiology Patient Dataset

Table 2.1 Cardiology Patient Data

Table 2.2 Most and Least Typical Instances from the Cardiology Domain

A Healthy Class Rule for the Cardiology Patient Dataset
IF 169 <= Maximum Heart Rate <= 202
THEN Concept Class = Healthy
Rule accuracy: 85.07%
Rule coverage: 34.55%
A Sick Class Rule for the Cardiology Patient Dataset
IF Thal = Rev & Chest Pain Type = Asymptomatic
THEN Concept Class = Sick
Rule accuracy: 91.14%
Rule coverage: 52.17%

Unsupervised Clustering
- Determine if concepts can be found in the data.
- Evaluate the likely performance of a supervised model.
- Determine a best set of input attributes for supervised learning.
- Detect outliers.

Market Basket Analysis
- Find interesting relationships among retail products.
- Uses association rule algorithms.
- The results of a market basket analysis help retailers
  - design promotions,
  - arrange shelf or catalog items, and
  - develop cross-marketing strategies.

Supervised Data Mining Techniques

The Credit Card Promotion Database

Table 2.3 The Credit Card Promotion Database

Data file: CreditCardPromotion.xls

A Hypothesis for the Credit Card Promotion Database
A combination of one or more of the dataset attributes differentiates Acme Credit Card Company card holders who have taken advantage of the life insurance promotion from those card holders who have chosen not to participate in the promotional offer.

A Production Rule for the Credit Card Promotion Database
IF Sex = Female & 19 <= Age <= 43
THEN Life Insurance Promotion = Yes
Rule Accuracy: %
Rule Coverage: 66.67%
Question: Can we assume that two-thirds of all females in the specified age range will take advantage of the promotion?
- Rule accuracy is a between-class measure.
- Rule coverage is a within-class measure.
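The two rule measures can be made concrete with a small sketch. It assumes the usual definitions: rule accuracy is the share of instances satisfying the rule's antecedent that belong to the rule's class, and rule coverage is the share of the class's instances that satisfy the antecedent. The records are invented for illustration and are not rows of Table 2.3; `rule_accuracy_and_coverage` is a hypothetical helper name.

```python
def rule_accuracy_and_coverage(instances, antecedent, in_class):
    """Return (accuracy, coverage) for a rule 'IF antecedent THEN class'.

    accuracy = covered instances in the class / all covered instances
    coverage = covered instances in the class / all instances in the class
    """
    covered = [x for x in instances if antecedent(x)]
    class_members = [x for x in instances if in_class(x)]
    covered_in_class = [x for x in covered if in_class(x)]
    accuracy = len(covered_in_class) / len(covered) if covered else 0.0
    coverage = len(covered_in_class) / len(class_members) if class_members else 0.0
    return accuracy, coverage


# Made-up records (assumption: NOT taken from Table 2.3).
records = [
    {"sex": "Female", "age": 25, "life_ins_promo": "Yes"},
    {"sex": "Female", "age": 40, "life_ins_promo": "Yes"},
    {"sex": "Female", "age": 55, "life_ins_promo": "No"},
    {"sex": "Male", "age": 30, "life_ins_promo": "Yes"},
    {"sex": "Male", "age": 45, "life_ins_promo": "No"},
]

# IF Sex = Female & 19 <= Age <= 43 THEN Life Insurance Promotion = Yes
accuracy, coverage = rule_accuracy_and_coverage(
    records,
    antecedent=lambda r: r["sex"] == "Female" and 19 <= r["age"] <= 43,
    in_class=lambda r: r["life_ins_promo"] == "Yes",
)
print(f"accuracy={accuracy:.2%}, coverage={coverage:.2%}")
```

Under these definitions, a coverage of 66.67% says that the rule covers two-thirds of the instances that accepted the promotion. It does not say that two-thirds of females in the age range will accept; that question is about rule accuracy.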

Neural Networks
A neural network is a set of interconnected nodes designed to imitate the functioning of the human brain.
Two phases of operation:
- Learning: training instances are presented at the input layer, and training continues until the network reaches a predetermined minimum error rate.
- Use: with the weights fixed, the network computes output values for new instances.
A major shortcoming of the neural network approach is a lack of explanation about what has been learned.


Table 2.3 The Credit Card Promotion Database (blue: input attributes; red: output attributes)
There are therefore four input nodes and one output node; five hidden-layer nodes were chosen.
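The 4-5-1 layout can be sketched as a single forward pass. This is only an illustration under stated assumptions: the sigmoid activation, the random untrained weights, and the example input values are all hypothetical, and no bias terms or training step are shown.

```python
import math
import random


def sigmoid(x):
    """Logistic activation (an assumption; the slides do not name one)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, output_weights):
    """Forward pass through one hidden layer to a single output node."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)))
        for node_weights in hidden_weights
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))


random.seed(0)
N_INPUT, N_HIDDEN = 4, 5  # four input nodes, five hidden-layer nodes

# Untrained, randomly initialised weights (hypothetical values).
hidden_weights = [
    [random.uniform(-1, 1) for _ in range(N_INPUT)] for _ in range(N_HIDDEN)
]
output_weights = [random.uniform(-1, 1) for _ in range(N_HIDDEN)]

# One instance with four input values scaled to [0, 1] (made up).
instance = [0.4, 1.0, 0.0, 0.35]
output = forward(instance, hidden_weights, output_weights)
print(output)
```

In the learning phase the weights would be adjusted until the error is small enough; afterwards `forward` is all that is needed to score new instances.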

Table: Neural Network Training: Actual and Computed Output

Statistical Regression
Statistical regression is a supervised learning technique that generalizes a set of numeric data by creating a mathematical equation relating one or more input attributes to a single numeric output attribute.
A linear regression model is characterized by an output attribute whose value is determined by a linear sum of weighted input attribute values.
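As a minimal sketch of "a linear sum of weighted input attribute values", the code below fits a one-input linear model by ordinary least squares. The choice of least squares, the single input attribute, and the data points are assumptions made for illustration; `fit_simple_linear_regression` is a hypothetical helper name.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimising the squared error of y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Output attribute as a weighted input plus a constant term."""
    return slope * x + intercept


# Made-up numeric data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_linear_regression(xs, ys)
print(slope, intercept, predict(5.0, slope, intercept))
```

With several input attributes the same idea applies, with one weight per attribute.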

Statistical Regression
Life insurance promotion = (credit card insurance) (sex)
Example: a female who does not have credit card insurance is a likely candidate for the life insurance promotion.
Life insurance promotion = (0) - (0) =
Because the value is close to 1.0, we conclude that the individual is likely to take advantage of the promotional offer.

Association Rules
- Association rule mining techniques are used to discover interesting associations between attributes contained in a database.
- Unlike traditional production rules, association rules can have one or several output attributes.
- An output attribute for one rule can be an input attribute for another rule.
- A popular technique for market basket analysis.
- Problem: we may have several rules with little value.

An Association Rule for the Credit Card Promotion Database
IF Sex = Female & Age = over40 & Credit Card Insurance = No
THEN Life Insurance Promotion = Yes

Clustering Techniques
By applying unsupervised clustering to the instances of the Acme Credit Card Company database, we will find a subset of input attributes that differentiate card holders who have taken advantage of the life insurance promotion from those card holders who have not accepted the promotional offer.

IF Sex = Female & 43 >= Age >= 35 & Credit Card Insurance = No
THEN Class = 3
Rule Accuracy: 100%
Rule Coverage: 66.67%

End here for now!

Evaluating Performance
Performance evaluation is probably the most critical of all the steps in the data mining process.
Three general questions:
1. Will the benefits received from a data mining project more than offset the cost of the data mining process?
2. How do we interpret the results of a data mining session?
3. Can we use the results of a data mining process with confidence?

Evaluating Supervised Learner Models
Supervised learner models are designed to classify, estimate, and/or predict future outcomes.
Applications that depend on classification correctness:
- Develop a model to accept or reject credit card applications
- Develop a model to accept or reject home mortgage applicants
- Develop a model to decide whether or not to drill for oil

Confusion Matrix
- A matrix used to summarize the results of a supervised classification.
- Entries along the main diagonal are correct classifications.
- Entries other than those on the main diagonal are classification errors.
- Classification correctness is best calculated by presenting previously unseen data in the form of a test set to the model being evaluated.
- A confusion matrix is of little use for evaluating supervised learner models offering numeric output.

Table 2.5 A Three-Class Confusion Matrix

        Computed Decision
        C1     C2     C3
C1      C11    C12    C13
C2      C21    C22    C23
C3      C31    C32    C33

Rule 1: Values along the main diagonal represent correct classifications; e.g., C11 represents the total number of class C1 instances correctly classified by the model.
Rule 2: For C2, the entries C21, C22, and C23 are all actually members of C2, but C21 and C23 are incorrectly classified as members of another class.
Rule 3: For C2, the entries C12 and C32 are instances incorrectly classified as members of class C2.
Computation questions: #1
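The three rules can be checked on a small example. The sketch below builds a confusion matrix with rows as actual classes and columns as computed classes, matching the layout of Table 2.5; the class labels and the two label lists are invented, and the helper names are hypothetical.

```python
def confusion_matrix(actual, computed, classes):
    """matrix[i][j] = number of instances of actual class i computed as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, c in zip(actual, computed):
        matrix[index[a]][index[c]] += 1
    return matrix


def correct_rate(matrix):
    """Rule 1: the main diagonal holds the correct classifications."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def missed_members(matrix, i):
    """Rule 2: members of class i classified as some other class (row i, off-diagonal)."""
    return sum(matrix[i]) - matrix[i][i]


def false_members(matrix, j):
    """Rule 3: instances of other classes classified as class j (column j, off-diagonal)."""
    return sum(row[j] for row in matrix) - matrix[j][j]


classes = ["C1", "C2", "C3"]
# Made-up test-set labels.
actual = ["C1", "C1", "C2", "C2", "C2", "C3", "C3", "C1"]
computed = ["C1", "C2", "C2", "C2", "C3", "C3", "C1", "C1"]

matrix = confusion_matrix(actual, computed, classes)
for row in matrix:
    print(row)
print(correct_rate(matrix), missed_members(matrix, 1), false_members(matrix, 1))
```

For class C2 in this toy data, one member is misclassified (the C23 cell) and one outside instance is wrongly assigned to C2 (the C12 cell).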

Table 2.6 A Simple Confusion Matrix
Two-Class Error Analysis
Slide annotations: "The more the better"; "The less the better"

Table 2.7 Two Confusion Matrices Each Showing a 10% Error Rate
Which model is better?

Evaluating Numeric Output
- Mean absolute error
- Mean squared error
- Root mean squared error
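A short sketch of the three error measures, using their standard definitions over paired actual and computed output values. The numbers are made up for illustration.

```python
import math


def mean_absolute_error(actual, computed):
    """Average of |actual - computed|."""
    return sum(abs(a - c) for a, c in zip(actual, computed)) / len(actual)


def mean_squared_error(actual, computed):
    """Average of (actual - computed)^2; penalises large errors more heavily."""
    return sum((a - c) ** 2 for a, c in zip(actual, computed)) / len(actual)


def root_mean_squared_error(actual, computed):
    """Square root of the mean squared error, in the units of the output attribute."""
    return math.sqrt(mean_squared_error(actual, computed))


# Made-up actual and computed output values.
actual = [1.0, 0.0, 1.0, 0.0]
computed = [0.5, 0.5, 1.0, 0.0]

print(mean_absolute_error(actual, computed))
print(mean_squared_error(actual, computed))
print(root_mean_squared_error(actual, computed))
```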

Comparing Models by Measuring Lift
Marketing applications that focus on response rates from mass mailings are less concerned with test set classification error and more interested in building models able to extract biased samples from large populations.
Supervised learner models designed for extracting biased samples from a general population are often evaluated by a measure that comes directly from marketing known as lift.

Computing Lift
C_i is the class of all zero-balance customers who, given the opportunity, will take advantage of the promotional offer.
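Assuming the usual definition of lift as the ratio P(C_i | sample) / P(C_i | population), the computation can be sketched as follows. The counts are invented; they are chosen so that the result is 2.25, the lift value named in the caption of Table 2.9, and they are not claimed to be that table's entries.

```python
def lift(sample_hits, sample_size, population_hits, population_size):
    """Lift = P(C_i | sample) / P(C_i | population).

    Written as a single fraction so the division happens only once.
    """
    return (sample_hits * population_size) / (sample_size * population_hits)


# Hypothetical counts (assumption, not taken from the tables):
population_size = 100_000  # all customers who could be mailed
population_hits = 1_000    # members of C_i in the whole population
sample_size = 24_000       # customers selected by the model
sample_hits = 540          # members of C_i among those selected

print(lift(sample_hits, sample_size, population_hits, population_size))
```

A lift of 1.0 corresponds to mailing at random; values above 1.0 mean the model's sample is richer in likely responders than the population as a whole.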

Figure 2.4 A lift chart (targeted vs. mass mailing)

Table 2.8 Two Confusion Matrices: No Model and an Ideal Model
Table 2.9 Two Confusion Matrices for Alternative Models with Lift Equal to 2.25

Unsupervised Model Evaluation
Evaluating unsupervised data mining is, in general, a more difficult task than supervised evaluation. This is true because the goals of an unsupervised data mining session are frequently not as clear as the goals for supervised learning.
