Cost-Sensitive Classifier Selection
Ross Bettinger, Analytical Consultant, SAS Services
Copyright © 2003, SAS Institute Inc. All rights reserved.
Rule-Based Knowledge Extraction
A typical goal in extracting knowledge from data is the production of a classification rule that will assign a class membership to a future event with a specified probability.
A binary classifier assigns an object to one of two classes.
The decision regarding the class assignment will be either correct or incorrect, so there are four possible outcomes:
{Predicted Event, Actual Event} (True Positive)
{Predicted Event, Actual Nonevent} (False Positive)
{Predicted Nonevent, Actual Event} (False Negative)
{Predicted Nonevent, Actual Nonevent} (True Negative)
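To make the four outcomes concrete, here is a minimal Python sketch (not part of the original presentation) that labels each (predicted, actual) pair and tallies the counts; the function name and the example data are purely illustrative.

```python
from collections import Counter

def outcome(predicted_event, actual_event):
    """Label a single decision as TP, FP, FN, or TN."""
    if predicted_event and actual_event:
        return "TP"   # {Predicted Event, Actual Event}
    if predicted_event and not actual_event:
        return "FP"   # {Predicted Event, Actual Nonevent}
    if not predicted_event and actual_event:
        return "FN"   # {Predicted Nonevent, Actual Event}
    return "TN"       # {Predicted Nonevent, Actual Nonevent}

# Illustrative predictions and actual outcomes (1 = event, 0 = nonevent)
predicted = [1, 1, 0, 0, 1, 0]
actual    = [1, 0, 1, 0, 1, 0]

counts = Counter(outcome(bool(p), bool(a)) for p, a in zip(predicted, actual))
print(counts)   # Counter({'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1})
```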
Evaluating Classifier Performance
Use a 2x2 classification table of predicted vs. actual class membership.
A critical concept in the discussion of decisions is the definition of an event.
An observation or instance, I, is classified into the event class e if the classifier assigns it a probability p(e|I) that is at least as large as a decision threshold θ, 0 ≤ θ ≤ 1.
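The 2x2 table can be built by thresholding the classifier's scores. The sketch below assumes the decision rule stated above (predict an event when p(e|I) ≥ θ); the scores, labels, and function name are illustrative, not taken from the presentation.

```python
def classification_table(scores, actuals, theta):
    """Build the 2x2 table for threshold theta: an instance is predicted
    to be an event when its score is at least theta."""
    tp = fp = fn = tn = 0
    for score, actual in zip(scores, actuals):
        predicted_event = score >= theta
        if predicted_event and actual == 1:
            tp += 1
        elif predicted_event and actual == 0:
            fp += 1
        elif not predicted_event and actual == 1:
            fn += 1
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

# Illustrative classifier scores p(e|I) and actual classes
scores  = [0.92, 0.75, 0.61, 0.48, 0.33, 0.10]
actuals = [1,    1,    0,    1,    0,    0]
print(classification_table(scores, actuals, theta=0.5))
# {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```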
The Cost of a Decision
Correct decision (true positive rate): TP = p(E|p) = #TP / (#TP + #FN)
False positive rate: FP = p(E|n) = #FP / (#FP + #TN)
Assume that correct decisions incur no cost.
The theoretical expected cost of misclassifying an instance I is
E[cost] = p(p) · (1 − TP) · c(N|p) + p(n) · FP · c(E|n)
where p(p) and p(n) are the prior probabilities of events and nonevents, and c(N|p) and c(E|n) are the costs of a false negative and a false positive, respectively.
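A minimal sketch of the expected-cost calculation under the assumption that correct decisions are free; the formula follows the reconstruction above, and the prior and unit costs below are made-up numbers for illustration (the operating point (FP, TP) = (0.29, 0.70) echoes the ROC plot shown later).

```python
def expected_cost(tp_rate, fp_rate, p_event, cost_fn, cost_fp):
    """Expected misclassification cost per instance, assuming correct
    decisions are free:
        p(p) * (1 - TP) * c(N|p)  +  p(n) * FP * c(E|n)
    """
    p_nonevent = 1.0 - p_event
    return p_event * (1.0 - tp_rate) * cost_fn + p_nonevent * fp_rate * cost_fp

# Illustrative conditions: 5% events, a missed event costs 50, a false alarm costs 1.
print(expected_cost(tp_rate=0.70, fp_rate=0.29, p_event=0.05,
                    cost_fn=50.0, cost_fp=1.0))   # 0.75 + 0.2755 = 1.0255
```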
Receiver Operating Characteristic
Compute a 2x2 classification table for each value of the threshold θ and plot the curve traced by (FP, TP) as θ ranges from 0 to 1.
This curve is called the "receiver operating characteristic" (ROC) and was developed during World War II to assess the performance of radar receivers in detecting targets accurately.
The area under the ROC curve (AUC) is defined to be the performance index of interest.
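A short sketch of tracing the ROC curve and computing AUC by the trapezoid rule; this is not the presentation's code, and the scores below are illustrative (tied scores are not treated specially in this simple version).

```python
def roc_points(scores, actuals):
    """Trace the ROC curve: lower the threshold over the observed scores and
    record the (FP rate, TP rate) pair at each step, from (0, 0) to (1, 1)."""
    pos = sum(actuals)
    neg = len(actuals) - pos
    ranked = sorted(zip(scores, actuals), key=lambda t: -t[0])
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for _, actual in ranked:
        if actual == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

scores  = [0.92, 0.75, 0.61, 0.48, 0.33, 0.10]   # classifier scores p(e|I)
actuals = [1,    1,    0,    1,    0,    0]       # actual classes
pts = roc_points(scores, actuals)
print(pts)        # [(0.0, 0.0), (0.0, 0.333...), ..., (1.0, 1.0)]
print(auc(pts))   # 0.888... for these illustrative scores
```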
ROC Plot
[Figure: ROC curves with operating points at (0.29, 0.70) and (0.29, 0.67)]
ROC Curve and Decision Costs
The ROC curve does not include any class distribution or misclassification cost information in its construction.
It does not give much guidance in the choice among competing classifiers unless one of them clearly dominates all of the others over all values of θ.
Overlay class distribution and misclassification cost on the ROC curve using the average cost of a decision.
ROC Curve and Decision Costs (cont'd)
For the 2x2 classification table, the average cost equation becomes
average cost = p(p) · (1 − TP) · c(N|p) + p(n) · FP · c(E|n)
At the minimum average cost point, the slope of the ROC curve is
dTP/dFP = [p(n) · c(E|n)] / [p(p) · c(N|p)]
The ROC operating point is sensitive to the class distribution and to the misclassification costs.
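The minimum-cost slope can be computed directly from the class priors and the two misclassification costs. The sketch below assumes the slope formula as reconstructed above; the 5%-event prior and the 50:1 cost ratio are illustrative.

```python
def isoperformance_slope(p_event, cost_fn, cost_fp):
    """Slope of the isoperformance (equal average cost) line in ROC space:
        [p(n) * c(E|n)] / [p(p) * c(N|p)]
    ROC points lying on a line with this slope have the same average cost."""
    p_nonevent = 1.0 - p_event
    return (p_nonevent * cost_fp) / (p_event * cost_fn)

# Illustrative conditions: 5% events, a false negative costs 50 times a false positive.
print(isoperformance_slope(p_event=0.05, cost_fn=50.0, cost_fp=1.0))   # 0.38
```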
Determine ROC Operating Point
Represent the slope of the ROC curve using adjacent points to form the isoperformance line.
Compute the slopes between adjacent ROC points, determine the interval containing the minimum-cost slope, match it with the classifier's point on the curve, and find the corresponding threshold θ.
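A sketch of the matching step: walk adjacent ROC points, compute each segment's slope, and stop at the interval whose slope falls to the target isoperformance slope. The points below are illustrative, the function assumes a concave (convex-hull-like) curve, and mapping the chosen point back to a threshold θ would use the threshold recorded for that point (not shown here).

```python
def operating_point(points, target_slope):
    """Walk adjacent ROC points (sorted by FP rate), compute each segment's
    slope, and return the left endpoint of the first segment whose slope is
    at or below the target slope -- the interval containing the target.
    Assumes the points trace a concave (convex-hull-like) ROC curve."""
    pts = sorted(points)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        slope = (y1 - y0) / (x1 - x0) if x1 > x0 else float("inf")
        if slope <= target_slope:
            return (x0, y0)   # minimum-cost vertex for this slope
    return pts[-1]            # flatter than every segment: operate at (1, 1)

# Illustrative concave ROC points and the illustrative slope 0.38 from the earlier sketch.
hull = [(0.0, 0.0), (0.1, 0.55), (0.29, 0.70), (0.6, 0.9), (1.0, 1.0)]
print(operating_point(hull, target_slope=0.38))   # (0.6, 0.9)
```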
ROC Convex Hull (Provost and Fawcett, 1997)
Overlay multiple ROC curves on the same (FP, TP) axes.

ROC Convex Hull (cont'd)
Add the convex hull to the ROC curves.

ROC Convex Hull (cont'd)
Add the isoperformance line.
Selecting Classifiers Using the ROCCH Method
The isoperformance line, which is tangent to the ROCCH at the point of minimum expected cost, indicates which classifier to use for a specified combination of class distribution and misclassification costs.
Furthermore, the ROCCH method indicates the range of slopes over which a particular classifier is optimal with respect to class distribution and costs.
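A sketch of the ROCCH selection idea: pool the (FP, TP) operating points of several classifiers, keep only the upper convex hull, and pick the hull vertex where the hull's slope crosses the isoperformance slope. This is an illustrative reimplementation, not code from the presentation; the classifier names and points are made up, and the trivial "always-nonevent" and "always-event" classifiers are added as the hull's endpoints.

```python
def roc_convex_hull(labeled_points):
    """Upper convex hull (ROCCH) of (FP, TP) points pooled from several
    classifiers.  labeled_points: list of (fp, tp, classifier_name).
    Returns the hull vertices in order of increasing FP rate."""
    pts = sorted(set(labeled_points + [(0.0, 0.0, "always-nonevent"),
                                       (1.0, 1.0, "always-event")]))
    hull = []
    for p in pts:
        # Drop previous vertices that would make the boundary turn upward
        # (a non-concave bend), so only the upper hull remains.
        while len(hull) >= 2:
            (x1, y1, _), (x2, y2, _) = hull[-2], hull[-1]
            x3, y3, _ = p
            if (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def best_classifier(hull, target_slope):
    """Pick the hull vertex (and its classifier) where the hull's slope
    crosses the isoperformance slope -- the minimum-cost tangent point."""
    for (x0, y0, name), (x1, y1, _) in zip(hull, hull[1:]):
        slope = (y1 - y0) / (x1 - x0) if x1 > x0 else float("inf")
        if slope <= target_slope:
            return name, (x0, y0)
    x, y, name = hull[-1]
    return name, (x, y)

# Illustrative operating points from three hypothetical classifiers A, B, C.
points = [(0.10, 0.40, "A"), (0.29, 0.70, "B"), (0.29, 0.67, "C"), (0.55, 0.85, "A")]
hull = roc_convex_hull(points)
print(hull)                                    # point C falls below the hull
print(best_classifier(hull, target_slope=0.38))  # ('A', (0.55, 0.85))
```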
Selecting Classifiers Using ROCCH (cont'd)
Convex hull points and associated classifier.

Selecting Classifiers Using ROCCH (cont'd)
Range of slopes, points of tangency, and classifier.

Selecting Classifiers Using ROCCH (cont'd)
Classifier and AUC for the German credit ensemble classifiers.

Selecting Classifiers Using ROCCH (cont'd)
Ensemble classifiers for Catalog Direct Mail.

Selecting Classifiers Using ROCCH (cont'd)
Classifier and AUC for Catalog Direct Mail.

Selecting Classifiers Using ROCCH (cont'd)
Ensemble classifiers for the KDD-98 Cup.

Selecting Classifiers Using ROCCH (cont'd)
Classifier and AUC for the KDD-98 Cup.
Summary
The ROCCH methodology for selecting binary classifiers explicitly includes class distribution and misclassification costs in its formulation. It is a robust alternative to whole-curve metrics such as AUC, which report global classifier performance but may not indicate the best classifier (in the least-cost sense) for the range of operating conditions under which the classifier will assign class memberships.