Cost-Sensitive Classifier Selection
Ross Bettinger, Analytical Consultant, SAS Services
Copyright © 2003, SAS Institute Inc. All rights reserved.

2 Rule-Based Knowledge Extraction
• A typical goal in extracting knowledge from data is to produce a classification rule that assigns class membership to a future event with a specified probability.
• A binary classifier assigns an object to one of two classes. The decision regarding the class assignment is either correct or incorrect, so there are four possible outcomes:
  {Predicted Event, Actual Event} (True Positive)
  {Predicted Event, Actual Nonevent} (False Positive)
  {Predicted Nonevent, Actual Event} (False Negative)
  {Predicted Nonevent, Actual Nonevent} (True Negative)

3 Evaluating Classifier Performance
• Use a 2x2 classification table of predicted vs. actual class membership.
• A critical concept in the discussion of decisions is the definition of an event: an observation or instance, I, is classified into the event class e if the classifier assigns it a probability p̂(e | I) that meets or exceeds a decision threshold θ, 0 ≤ θ ≤ 1.
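A minimal Python sketch of this thresholded decision rule and the resulting 2x2 table may make the four outcomes concrete. The scores, labels, and function names are invented for illustration; they are not from the original presentation:

    # Hypothetical classifier scores p_hat(e | I) and actual labels (1 = event, 0 = nonevent).
    scores = [0.91, 0.85, 0.62, 0.47, 0.33, 0.20]
    labels = [1, 1, 0, 1, 0, 0]

    def classify(p_hat, theta):
        """Assign the event class when the posterior meets the threshold."""
        return 1 if p_hat >= theta else 0

    def confusion_table(scores, labels, theta):
        """2x2 table of predicted vs. actual class membership."""
        tp = fp = fn = tn = 0
        for p_hat, actual in zip(scores, labels):
            predicted = classify(p_hat, theta)
            if predicted == 1 and actual == 1:
                tp += 1          # True Positive
            elif predicted == 1 and actual == 0:
                fp += 1          # False Positive
            elif predicted == 0 and actual == 1:
                fn += 1          # False Negative
            else:
                tn += 1          # True Negative
        return tp, fp, fn, tn

    print(confusion_table(scores, labels, theta=0.5))   # -> (2, 1, 1, 2)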

4 The Cost of a Decision
• Correct decision (true positive rate): TP = p(E | p), the probability of predicting the event for an actual event (positive instance).
• False positive rate: FP = p(E | n), the probability of predicting the event for an actual nonevent.
• Assume that correct decisions incur no cost.
• The theoretical expected cost of misclassifying an instance I is then (following the formulation of Provost and Fawcett, cited below)
  EC = p(p) · (1 − TP) · C(N | p) + p(n) · FP · C(E | n),
  where p(p) and p(n) are the prior probabilities of events and nonevents, C(N | p) is the cost of a false negative, and C(E | n) is the cost of a false positive.
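As a worked sketch, here is that expected-cost computation in Python. The priors and costs are invented (events at 5%, a false negative ten times as costly as a false positive), and the operating point (.29, .70) echoes the ROC plot on the next slides:

    def expected_cost(tp_rate, fp_rate, p_event, cost_fn, cost_fp):
        """Expected misclassification cost at ROC operating point (FP, TP),
        assuming correct decisions cost nothing:
            EC = p(p)*(1 - TP)*C(N|p) + p(n)*FP*C(E|n)
        """
        p_nonevent = 1.0 - p_event
        return p_event * (1.0 - tp_rate) * cost_fn + p_nonevent * fp_rate * cost_fp

    # Hypothetical numbers: 0.05*0.30*10 + 0.95*0.29*1 = 0.4255
    print(expected_cost(tp_rate=0.70, fp_rate=0.29, p_event=0.05,
                        cost_fn=10.0, cost_fp=1.0))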

5 Receiver Operating Characteristic
• Compute a 2x2 classification table for each value of the threshold θ and plot the curve traced by (FP, TP) as θ ranges from 0 to 1.
• This curve is called the "receiver operating characteristic" (ROC); it was developed during World War II to assess the performance of radar receivers in detecting targets accurately.
• The area under the ROC curve (AUC) is defined to be the performance index of interest.
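A sketch of that construction, reusing the invented `scores` and `labels` from the earlier sketch; the AUC is computed by the trapezoidal rule:

    def roc_points(scores, labels):
        """Trace (FP, TP) as the threshold theta sweeps from high to low."""
        pos = sum(labels)                 # actual events
        neg = len(labels) - pos           # actual nonevents
        points = [(0.0, 0.0)]             # theta above every score
        for theta in sorted(set(scores), reverse=True):
            tp = sum(1 for s, y in zip(scores, labels) if s >= theta and y == 1)
            fp = sum(1 for s, y in zip(scores, labels) if s >= theta and y == 0)
            points.append((fp / neg, tp / pos))
        return points                     # ends at (1.0, 1.0)

    def auc(points):
        """Area under the ROC curve by the trapezoidal rule."""
        return sum((x1 - x0) * (y0 + y1) / 2.0
                   for (x0, y0), (x1, y1) in zip(points, points[1:]))

    pts = roc_points(scores, labels)
    print(auc(pts))                       # about 0.889 for the toy data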

6 ROC Plot
[Figure: ROC plot; labeled operating points (FP, TP) = (.29, .70) and (.29, .67).]

7 ROC Curve and Decision Costs
• The ROC curve does not include any class distribution or misclassification cost information in its construction, so it does not give much guidance in the choice among competing classifiers unless one of them clearly dominates all of the others over all values of the threshold θ.
• Overlay class distribution and misclassification cost on the ROC curve using the average cost of a decision.

8 ROC Curve and Decision Costs (cont'd)
• For the 2x2 classification table, the average-cost equation becomes
  EC = p(p) · (1 − TP) · C(N | p) + p(n) · FP · C(E | n).
• At the minimum average cost point, the slope of the ROC curve is
  m = [p(n) · C(E | n)] / [p(p) · C(N | p)].
• The ROC operating point is sensitive to the class distribution and the misclassification costs.
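For the invented priors and costs from the earlier sketches (p(p) = 0.05, C(N | p) = 10, C(E | n) = 1), the minimum-cost slope would be m = (0.95 × 1) / (0.05 × 10) = 1.9; rare, expensive events push the optimal operating point toward the steep lower-left portion of the curve.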

9 Determine ROC Operating Point
• Represent the slope of the ROC curve using adjacent points to form the isoperformance line.
• Compute the slopes between adjacent points, determine which interval contains the minimum-cost slope m, match it to the corresponding classifier point, and read off the associated threshold θ (a sketch follows below).
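A minimal sketch of that matching step, assuming the points are concave (i.e., already on the convex hull) so that adjacent-segment slopes are non-increasing; `pts` is the toy curve built earlier:

    def operating_point(points, m):
        """Return the ROC vertex where the curve's slope crosses the
        isoperformance slope m; `points` must be sorted by FP rate."""
        for i in range(1, len(points)):
            (x0, y0), (x1, y1) = points[i - 1], points[i]
            slope = (y1 - y0) / (x1 - x0) if x1 > x0 else float("inf")
            if slope < m:                 # first segment flatter than m:
                return points[i - 1]      # tangency at the preceding vertex
        return points[-1]                 # m is flatter than every segment

    print(operating_point(pts, m=1.9))    # about (0.0, 0.67) on the toy curve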

10 ROC Convex Hull (Provost and Fawcett, 1997)
• Overlay multiple ROC curves on the same (FP, TP) axes.

11 ROC Convex Hull (cont'd)
• Add the convex hull to the ROC curves.
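A sketch of computing the hull with Andrew's monotone-chain upper-hull scan; adding the anchor points (0, 0) and (1, 1), the trivial always-nonevent and always-event classifiers, is an assumption consistent with Provost and Fawcett's construction:

    def roc_convex_hull(points):
        """Upper convex hull of pooled (FP, TP) points -- the ROCCH."""
        pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
        hull = []
        for p in pts:
            # Pop the last vertex while it lies on or below the chord
            # from hull[-2] to p (a non-right turn is off the upper hull).
            while len(hull) >= 2:
                (x1, y1), (x2, y2) = hull[-2], hull[-1]
                if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) >= 0:
                    hull.pop()
                else:
                    break
            hull.append(p)
        return hull

    print(roc_convex_hull(pts))   # hull of the toy curve from above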

12 ROC Convex Hull (cont'd)
• Add the isoperformance line.

13 Selecting Classifiers Using ROC Method
• The isoperformance line, which is tangent to the ROCCH at the point of minimum expected cost, indicates which classifier to use for a specified combination of class distribution and misclassification costs.
• Furthermore, the ROCCH method indicates the range of slopes over which a particular classifier is optimal with respect to class distribution and costs (a selection sketch follows below).
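Putting the pieces together, a sketch of ROCCH-based selection that assumes the `roc_convex_hull` helper above; the classifier names, ROC points, priors, and costs are all invented:

    def select_classifier(curves, p_event, cost_fn, cost_fp):
        """Pick the classifier owning the ROCCH vertex touched by the
        isoperformance line of slope m = [p(n)*C(E|n)] / [p(p)*C(N|p)];
        `curves` maps classifier name -> list of (FP, TP) tuples."""
        m = ((1.0 - p_event) * cost_fp) / (p_event * cost_fn)
        owner = {pt: name for name, pts in curves.items() for pt in pts}
        hull = roc_convex_hull([pt for pts in curves.values() for pt in pts])
        # Minimizing expected cost is equivalent to maximizing TP - m*FP.
        best = max(hull, key=lambda pt: pt[1] - m * pt[0])
        return owner.get(best, "trivial classifier"), best

    curves = {"neural net": [(0.10, 0.50), (0.30, 0.80)],
              "tree": [(0.05, 0.30), (0.40, 0.85)]}
    print(select_classifier(curves, p_event=0.05, cost_fn=10.0, cost_fp=1.0))
    # -> ('neural net', (0.1, 0.5)) with these invented numbers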

14 Selecting Classifiers Using ROCCH (cont'd)
• Convex hull points and the associated classifiers.

15 Selecting Classifiers Using ROCCH (cont'd)
• Ranges of slopes, points of tangency, and the optimal classifier for each.

16 Selecting Classifiers Using ROCCH (cont'd)
• Classifier and AUC for the German credit ensemble classifiers.

17 Selecting Classifiers Using ROCCH (cont'd)
• Ensemble classifiers for the Catalog Direct Mail data.

18 Selecting Classifiers Using ROCCH (cont'd)
• Classifier and AUC for the Catalog Direct Mail data.

19 Selecting Classifiers Using ROCCH (cont'd)
• Ensemble classifiers for the KDD-98 Cup data.

20 Selecting Classifiers Using ROCCH (cont'd)
• Classifier and AUC for the KDD-98 Cup data.

21 Summary
• The ROCCH methodology for selecting binary classifiers explicitly includes class distribution and misclassification costs in its formulation.
• It is a robust alternative to whole-curve metrics such as AUC, which report global classifier performance but may not identify the best classifier, in the least-cost sense, for the range of operating conditions under which the classifier will actually assign class memberships.

22 Copyright © 2003, SAS Institute Inc. All rights reserved.