CSCI 347 / CS 4206: Data Mining Module 06: Evaluation Topic 07: Cost-Sensitive Measures.


Counting the Cost
● In practice, different types of classification errors often incur different costs
● Examples:
   – Terrorist profiling: “not a terrorist” correct 99.99% of the time
   – Loan decisions
   – Oil-slick detection
   – Fault diagnosis
   – Promotional mailing

Counting the Cost
● The confusion matrix:

                             Predicted class
                             Yes               No
   Actual class   Yes        True positive     False negative
                  No         False positive    True negative

● There are many other types of cost! E.g.: cost of collecting training data

The Kappa Statistic
● Two confusion matrices for a 3-class problem: actual predictor (left) vs. random predictor (right)
● Number of successes: sum of entries in the diagonal (D)
● Kappa statistic:

      kappa = (D_observed - D_random) / (D_perfect - D_random)

   measures relative improvement over a random predictor: 0 means no better than chance, 1 means perfect agreement
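For concreteness, a minimal Python sketch of this computation, assuming the confusion matrix is stored as a nested list (the 3-class counts below are illustrative, not the figures from the original slide):

```python
# Kappa from a confusion matrix: compare the observed diagonal count with the
# diagonal count expected from a random predictor that keeps the same
# row (actual) and column (prediction) totals.

def kappa(confusion):
    total = sum(sum(row) for row in confusion)
    d_observed = sum(confusion[i][i] for i in range(len(confusion)))
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Expected diagonal for a random predictor with the same marginal totals
    d_random = sum(r * c / total for r, c in zip(row_totals, col_totals))
    d_perfect = total
    return (d_observed - d_random) / (d_perfect - d_random)

# Illustrative 3-class confusion matrix (rows = actual, columns = predicted)
conf = [[88, 10, 2],
        [14, 40, 6],
        [18, 10, 12]]
print(round(kappa(conf), 3))
```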

Classification with Costs
● Two cost matrices:
● Success rate is replaced by average cost per prediction
● Cost is given by the appropriate entry in the cost matrix
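A minimal sketch of scoring a classifier by average cost per prediction; since the two cost matrices on the original slide are not reproduced in the transcript, the matrix below is a hypothetical one (rows = actual class, columns = predicted class):

```python
# Average cost per prediction: sum cost-matrix entries over all test instances.

# Hypothetical cost matrix: cost[actual][predicted]; 0 = "no", 1 = "yes".
cost = [[0, 1],    # actual no:  correct costs 0, false positive costs 1
        [10, 0]]   # actual yes: false negative costs 10, correct costs 0

actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

avg_cost = sum(cost[a][p] for a, p in zip(actual, predicted)) / len(actual)
print(avg_cost)  # total cost 11 over 8 predictions -> 1.375
```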

Cost-Sensitive Classification
● Can take costs into account when making predictions
● Basic idea: only predict high-cost class when very confident about prediction
● Given: predicted class probabilities
   – Normally we just predict the most likely class
   – Here, we should make the prediction that minimizes the expected cost
   – Expected cost: dot product of vector of class probabilities and appropriate column in cost matrix
   – Choose column (class) that minimizes expected cost
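A minimal sketch of this expected-cost rule using NumPy, with the same hypothetical cost-matrix convention as above (rows = actual, columns = predicted):

```python
import numpy as np

# Hypothetical cost matrix: cost[actual, predicted].
cost = np.array([[0.0, 1.0],    # actual "no"
                 [10.0, 0.0]])  # actual "yes"

def min_expected_cost_class(class_probs, cost):
    """Pick the predicted class whose cost-matrix column gives the lowest expected cost."""
    expected = class_probs @ cost           # one expected cost per candidate prediction
    return int(np.argmin(expected)), expected

probs = np.array([0.8, 0.2])                # P(no), P(yes) for one instance
label, expected = min_expected_cost_class(probs, cost)
print(label, expected)
# Predicts "yes" (class 1): expected cost 0.8 * 1.0 = 0.8 beats 0.2 * 10.0 = 2.0,
# even though "no" is the most likely class.
```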

Cost-Sensitive Learning
● So far we haven't taken costs into account at training time
● Most learning schemes do not perform cost-sensitive learning
   – They generate the same classifier no matter what costs are assigned to the different classes
   – Example: standard decision tree learner
● Simple methods for cost-sensitive learning:
   – Resampling of instances according to costs
   – Weighting of instances according to costs
● Some schemes can take costs into account by varying a parameter, e.g. naïve Bayes
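A minimal sketch of the weighting and resampling ideas above, assuming hypothetical per-class misclassification costs; it illustrates the general idea rather than any particular scheme's implementation:

```python
import numpy as np

# Hypothetical misclassification cost per actual class: errors on class 1
# ("yes") are ten times as expensive as errors on class 0 ("no").
class_cost = {0: 1.0, 1: 10.0}

labels = np.array([0, 0, 1, 0, 1, 0, 0, 0])

# Weighting: give each training instance a weight proportional to its class cost,
# so a learner that accepts instance weights behaves cost-sensitively.
weights = np.array([class_cost[y] for y in labels])

# Resampling: draw a new training set with selection probability proportional to cost.
rng = np.random.default_rng(0)
resampled_idx = rng.choice(len(labels), size=len(labels), p=weights / weights.sum())

print(weights)
print(labels[resampled_idx])  # expensive class is over-represented in the resample
```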

Lift Charts
● In practice, costs are rarely known
● Decisions are usually made by comparing possible scenarios
● Example: promotional mailout to 1,000,000 households
   – Mail to all; 0.1% respond (1000)
   – Data mining tool identifies subset of 100,000 most promising; 0.4% of these respond (400)
     40% of responses for 10% of cost may pay off
   – Identify subset of 400,000 most promising; 0.2% respond (800)
● A lift chart allows a visual comparison
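As a worked check of the figures above, the implied lift factors can be computed directly (a sketch; the counts are the ones quoted on the slide):

```python
baseline_rate = 1000 / 1_000_000           # 0.1% response when mailing everyone

# (subset size, responses) for the two data-mining scenarios on the slide
scenarios = [(100_000, 400), (400_000, 800)]

for size, responses in scenarios:
    lift = (responses / size) / baseline_rate
    print(f"top {size:,}: response rate {responses / size:.1%}, lift factor {lift:.0f}")
# top 100,000: response rate 0.4%, lift factor 4
# top 400,000: response rate 0.2%, lift factor 2
```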

Generating a Lift Chart
● Sort instances according to predicted probability of being positive
   (the slide shows a table of instances ranked by predicted probability, with the actual class, Yes or No, for each)
● x axis is sample size
● y axis is number of true positives
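A minimal sketch of this construction, using hypothetical predicted probabilities and actual classes (the values from the slide's table are not preserved in the transcript):

```python
# Build lift-chart points: sort by predicted probability of "positive",
# then record (sample size, cumulative true positives) at each cut-off.

# Hypothetical (predicted probability, actual class) pairs.
data = [(0.95, "Yes"), (0.93, "Yes"), (0.93, "No"), (0.88, "Yes"),
        (0.86, "No"), (0.85, "Yes"), (0.82, "No"), (0.80, "Yes")]

data.sort(key=lambda pair: pair[0], reverse=True)

points = []           # (x = sample size, y = number of true positives)
true_positives = 0
for sample_size, (prob, actual) in enumerate(data, start=1):
    if actual == "Yes":
        true_positives += 1
    points.append((sample_size, true_positives))

print(points)
```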

Comparing Models by Measuring Lift

A Hypothetical Lift Chart
● 40% of responses for 10% of cost
● 80% of responses for 40% of cost

ROC Curves
● ROC curves are similar to lift charts
   – Stands for “receiver operating characteristic”
   – Used in signal detection to show tradeoff between hit rate and false alarm rate over noisy channel
● Differences to lift chart:
   – y axis shows percentage of true positives in sample rather than absolute number
   – x axis shows percentage of false positives in sample rather than sample size
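A minimal sketch of turning the same kind of ranked list into ROC points (FP rate on the x axis, TP rate on the y axis), reusing the hypothetical data from the lift-chart sketch above:

```python
# ROC points: sort by predicted probability, then at each cut-off plot
# FP rate (x) against TP rate (y) instead of sample size against TP count.

data = [(0.95, "Yes"), (0.93, "Yes"), (0.93, "No"), (0.88, "Yes"),
        (0.86, "No"), (0.85, "Yes"), (0.82, "No"), (0.80, "Yes")]
data.sort(key=lambda pair: pair[0], reverse=True)

total_pos = sum(1 for _, actual in data if actual == "Yes")
total_neg = len(data) - total_pos

tp = fp = 0
roc_points = [(0.0, 0.0)]
for prob, actual in data:
    if actual == "Yes":
        tp += 1
    else:
        fp += 1
    roc_points.append((fp / total_neg, tp / total_pos))   # (FP rate, TP rate)

print(roc_points)
```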

A Sample ROC Curve
● Jagged curve: one set of test data
● Smooth curve: use cross-validation

Cross-Validation and ROC Curves
● Simple method of getting an ROC curve using cross-validation:
   – Collect probabilities for instances in test folds
   – Sort instances according to probabilities
● This method is implemented in WEKA
● However, this is just one possibility
   – Another possibility is to generate an ROC curve for each fold and average them
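As an illustration of the first method (pool the probabilities from all test folds, then sort), here is a sketch using scikit-learn rather than WEKA; the dataset and classifier are placeholders chosen for the example:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Placeholder data and classifier; any scheme that outputs class probabilities works.
X, y = make_classification(n_samples=500, random_state=0)
clf = LogisticRegression(max_iter=1000)

# Collect predicted probabilities for every instance while it sits in a test fold.
probs = cross_val_predict(clf, X, y, cv=10, method="predict_proba")[:, 1]

# Sort all instances by probability and sweep a threshold to get one pooled ROC curve.
order = np.argsort(-probs)
y_sorted = y[order]
tp_rate = np.cumsum(y_sorted) / y_sorted.sum()
fp_rate = np.cumsum(1 - y_sorted) / (len(y_sorted) - y_sorted.sum())
print(list(zip(fp_rate[:5], tp_rate[:5])))
```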

ROC Curves for Two Schemes
● For a small, focused sample, use method A
● For a larger one, use method B
● In between, choose between A and B with appropriate probabilities

More Measures...
● Percentage of retrieved documents that are relevant: precision = TP/(TP+FP)
● Percentage of relevant documents that are returned: recall = TP/(TP+FN)
● Precision/recall curves have hyperbolic shape
● Summary measures: average precision at 20%, 50% and 80% recall (three-point average recall)
● F-measure = (2 × recall × precision)/(recall + precision)
● sensitivity × specificity = (TP/(TP+FN)) × (TN/(FP+TN))
● Area under the ROC curve (AUC): probability that a randomly chosen positive instance is ranked above a randomly chosen negative one
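A minimal sketch computing these measures from hypothetical confusion-matrix counts, plus an AUC estimate taken directly from its ranking definition (all numbers are illustrative):

```python
# Precision, recall, F-measure and specificity from hypothetical confusion-matrix counts.
TP, FP, FN, TN = 40, 10, 20, 130

precision = TP / (TP + FP)
recall = TP / (TP + FN)                   # also called sensitivity / TP rate
specificity = TN / (FP + TN)
f_measure = 2 * recall * precision / (recall + precision)
print(precision, recall, specificity, f_measure)

# AUC via its probabilistic definition: the chance that a randomly chosen positive
# is ranked above a randomly chosen negative (ties count one half).
pos_scores = [0.9, 0.8, 0.75, 0.3]        # hypothetical scores for positive instances
neg_scores = [0.7, 0.5, 0.3, 0.2]         # hypothetical scores for negative instances
pairs = [(p, n) for p in pos_scores for n in neg_scores]
auc = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs) / len(pairs)
print(auc)
```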

Summary of Some Measures

                              Domain                  Plot                   Explanation
   Lift chart                 Marketing               TP vs. subset size     TP; (TP+FP)/(TP+FP+TN+FN)
   ROC curve                  Communications          TP rate vs. FP rate    TP/(TP+FN); FP/(FP+TN)
   Recall-precision curve     Information retrieval   Recall vs. precision   TP/(TP+FN); TP/(TP+FP)

The Mystery Sound
● Bet you can’t tell what this is…