
Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall

5-1 Data Mining Methods: Classification
- Most frequently used DM method
- Employ supervised learning
- Learn from past data, classify new data
- The output variable is categorical (nominal or ordinal) in nature
- Classification versus regression?
- Classification versus clustering?

5-2 Assessment Methods for Classification
- Predictive accuracy: hit rate
- Speed: model building; predicting
- Robustness
- Scalability
- Interpretability: the level of understanding provided by the model

5-3 Accuracy of Classification Models
In classification problems, the primary source for accuracy estimation is the confusion matrix.
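As a rough illustration of how a confusion matrix yields an accuracy estimate, here is a minimal sketch for a two-class problem. The function names and the eight example labels are made up for illustration and do not come from the slides.

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if a == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


def accuracy(cm):
    """Share of all cases that were classified correctly (the hit rate)."""
    total = sum(cm.values())
    return (cm["TP"] + cm["TN"]) / total if total else 0.0


# Illustrative labels only (1 = positive class, 0 = negative class).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

cm = confusion_matrix(actual, predicted)
print(cm)            # {'TP': 3, 'FP': 1, 'FN': 1, 'TN': 3}
print(accuracy(cm))  # 0.75
```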

5-4 Estimation Methodologies for Classification
- Simple split (or holdout or test sample estimation)
- Split the data into 2 mutually exclusive sets: training (~70%) and testing (~30%)
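The simple split can be sketched in a few lines. This is only one way to do it: the helper name, the fixed seed, and the use of integers as stand-in records are assumptions made for the example.

```python
import random


def holdout_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records, then cut it into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Stand-in data: 4,500 record ids rather than real customer rows.
records = list(range(4500))
train, test = holdout_split(records)

print(len(train), len(test))  # 3150 1350
```

Because the two parts are slices of one shuffled list, every record lands in exactly one of them, which is what "mutually exclusive" requires.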

5-5 Estimation Methodologies for Classification
- k-Fold Cross Validation (rotation estimation)
- Split the data into k mutually exclusive subsets
- Use each subset in turn for testing while using the remaining subsets for training
- Repeat the experiment k times
- Aggregate the test results for a true estimate of prediction accuracy
- Other estimation methodologies
- Area under the ROC curve
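The k-fold procedure might look like the sketch below. The majority-class "model" is a deliberately trivial placeholder so the example stays self-contained, and all names and labels are hypothetical. In practice the records would usually be shuffled before the folds are cut.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs; the k test folds are mutually exclusive."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size


def cross_validate(records, labels, k, fit, predict):
    """Train on k-1 folds, test on the held-out fold, and pool the hits."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(records), k):
        model = fit([records[i] for i in train_idx], [labels[i] for i in train_idx])
        for i in test_idx:
            correct += predict(model, records[i]) == labels[i]
    return correct / len(records)


# Placeholder classifier: always predicts the most common training label.
def fit_majority(train_records, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, record):
    return model


records = list(range(10))                 # stand-in feature rows
labels = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]   # illustrative class labels
score = cross_validate(records, labels, 5, fit_majority, predict_majority)
print(score)  # 0.8
```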

5-6 Estimation Methodologies for Classification: ROC Curve
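The slide's figure is not reproduced in this transcript. As a stand-in, the sketch below shows one common way to trace an ROC curve (false positive rate against true positive rate as the decision threshold is lowered) and to compute the area under it with the trapezoidal rule. The labels and scores are invented, and the code assumes both classes are present.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points, one per distinct score used as a threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Illustrative data: 1 = positive class; a higher score means "more likely positive".
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
print(round(auc(points), 4))  # 0.8889
```

An area of 1.0 corresponds to a perfect ranking of positives above negatives, and 0.5 to a ranking no better than chance.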

5-7 Example: BMW dealership
The dealership is starting a promotional campaign, whereby it is trying to push a two-year extended warranty to its past customers. The dealership has done this before and has gathered 4,500 data points from past sales of extended warranties. The attributes in the data set are:
- Income bracket [0 = $0-$30k, 1 = $31k-$40k, 2 = $41k-$60k, 3 = $61k-$75k, 4 = $76k-$100k, 5 = $101k-$150k, 6 = $151k-$500k, 7 = $501k+]
- Year/month first BMW bought
- Year/month most recent BMW bought
- Whether they responded to the extended warranty offer in the past

5-8 Weka Input file format
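The content of this slide did not survive in the transcript. Weka's usual input format is ARFF: a `@relation` line, one `@attribute` line per column, then `@data` followed by comma-separated rows. The sketch below writes such a file for the four dealership attributes. The relation name, attribute names, YYYYMM date encoding, and the two data rows are all assumptions made up for illustration, not the deck's actual file.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text from (name, type) attribute pairs and data rows."""
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        lines.append(f"@attribute {name} {kind}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical names and encodings for the four attributes listed in the example.
attributes = [
    ("IncomeBracket", "numeric"),   # bracket code 0-7
    ("FirstPurchase", "numeric"),   # assumed YYYYMM encoding
    ("LastPurchase", "numeric"),    # assumed YYYYMM encoding
    ("responded", "{1,0}"),         # nominal class: 1 = responded, 0 = did not
]

# Placeholder rows, not real dealership data.
rows = [
    (4, 200210, 200601, 1),
    (5, 200301, 200601, 0),
]

arff_text = to_arff("bmw_responses", attributes, rows)
print(arff_text)
```

Declaring `responded` as a nominal attribute rather than a numeric one is what lets a classifier treat it as the categorical output variable.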