Performance Measures II

Presentation transcript:

Performance Measures II Prof. Kislaya Prasad BUDT 733 (Spring 2015)

Lift Charts
Lift charts analyze the "class of interest": fraudulent claims, responders to a mailing, patients at risk of a heart attack, … , or customers that prefer regular beer!
The chart builds on the estimated probability of belonging to the class of interest:
Sort all records in descending order of the probability of belonging to the class.
The cumulative number of cases is on the x-axis.
The cumulative number of true positives is on the y-axis.

Sort by Probability of Success
The validation records are sorted in descending order of Prob. for 1 (success). Table columns: Row Id, Predicted Class, Actual Class, Prob. for 1 (success), Log odds, Gender, Married, Income, Age, Cumulative Actual Class.
[Table: 40 validation records sorted by predicted probability of success, from about 0.998 down to about 0.003]
Construct the cumulative of Actual Class and plot it.
20 of the 40 cases in the validation data are successes, so in any M randomly selected cases we will have M/2 successes on average. This reference line is denoted in red.
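A minimal R sketch of this construction, assuming the validation records sit in a data frame valid with columns actual_class (1 = success) and prob_success (the predicted probability); these names are illustrative, not taken from the course script:

```r
# Sort validation records by predicted probability of success, descending
valid <- valid[order(valid$prob_success, decreasing = TRUE), ]

# Cumulative number of true positives after each record
valid$cum_actual <- cumsum(valid$actual_class)

# Reference line: with 20 successes in 40 records, a random ordering
# yields on average M/2 successes among the first M records
valid$cum_reference <- seq_len(nrow(valid)) * sum(valid$actual_class) / nrow(valid)
```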

Lift Chart (Validation Data)
The lift chart is generated in R (see the script for details).
The reference line represents selecting records (in our case customers) using the naïve rule.
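The course script itself is not reproduced on the slide; the following is a sketch of how the chart could be drawn with base R graphics, continuing from the valid data frame sketched above:

```r
# Cumulative lift chart: records ordered by predicted probability on the x-axis,
# cumulative number of true positives on the y-axis
plot(seq_len(nrow(valid)), valid$cum_actual, type = "l", col = "blue",
     xlab = "Cumulative number of cases",
     ylab = "Cumulative number of true positives",
     main = "Lift Chart (Validation Data)")

# Reference line for the naive rule (random selection of records)
abline(a = 0, b = sum(valid$actual_class) / nrow(valid), col = "red")
```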

ROC (Receiver Operating Characteristic) Curve
Developed in the 1950s in signal detection theory to analyze noisy signals.
Characterizes the trade-off between positive hits and false alarms.
The performance of a classifier at a given cutoff is represented as a point on the ROC curve.
Changing the algorithm's threshold, the sample distribution, or the cost matrix changes the location of the point.
Use the curve to decide where to set the cutoff, trading off false positives against false negatives.

ROC (Receiver Operating Characteristic) Curve
The ROC curve plots the pairs (1-specificity, sensitivity), one point per cutoff.
Interesting points on the curve:
(0,0): declare everything to be 0 (cutoff = 1)
(1,1): declare everything to be 1 (cutoff = 0)
(0,1): perfect classification
Diagonal line: random guessing
Below the diagonal line: the prediction is the opposite of the true class, so the model is worse than random guessing!
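For a single cutoff, sensitivity and 1-specificity come straight from the confusion matrix. A minimal sketch, reusing the illustrative valid data frame from the lift-chart example:

```r
cutoff <- 0.5
pred_class <- ifelse(valid$prob_success >= cutoff, 1, 0)

# True positive rate (y-coordinate) and false positive rate (x-coordinate)
sensitivity <- sum(pred_class == 1 & valid$actual_class == 1) / sum(valid$actual_class == 1)
specificity <- sum(pred_class == 0 & valid$actual_class == 0) / sum(valid$actual_class == 0)
fpr <- 1 - specificity
```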

Creating a ROC Curve with R
We have already calculated sensitivity and specificity in our data table example.
Add a 1-specificity column.
Plot the curve with 1-specificity as the x values and sensitivity as the y values.
See the script for details.
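The script is not shown on the slide; here is a hedged base-R sketch of the same steps, sweeping the cutoff over a grid and plotting the resulting (1-specificity, sensitivity) pairs (column names are assumptions carried over from the earlier sketches):

```r
# Evaluate sensitivity and 1-specificity at a grid of cutoffs
cutoffs <- seq(0, 1, by = 0.01)
roc_points <- t(sapply(cutoffs, function(k) {
  pred <- ifelse(valid$prob_success >= k, 1, 0)
  sens <- sum(pred == 1 & valid$actual_class == 1) / sum(valid$actual_class == 1)
  spec <- sum(pred == 0 & valid$actual_class == 0) / sum(valid$actual_class == 0)
  c(fpr = 1 - spec, sensitivity = sens)
}))

# ROC curve with the random-guessing diagonal for reference
plot(roc_points[, "fpr"], roc_points[, "sensitivity"], type = "l",
     xlab = "1 - specificity", ylab = "Sensitivity", main = "ROC Curve")
abline(a = 0, b = 1, lty = 2)
```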

Using ROC for Model Comparison
In this example neither model consistently outperforms the other:
M1 is better for small sensitivity values.
M2 is better for larger sensitivity values.
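When a single summary number is wanted for comparison, the area under the ROC curve (AUC) is commonly used. A base-R sketch using the rank-based (Wilcoxon) identity, assuming two vectors of predicted probabilities p1 and p2 for models M1 and M2 (illustrative names, not from the slides):

```r
# AUC via the Mann-Whitney / Wilcoxon identity:
# the probability that a randomly chosen positive is ranked above a randomly chosen negative
auc <- function(prob, actual) {
  r <- rank(prob)                       # ranks of all predicted probabilities
  n_pos <- sum(actual == 1)
  n_neg <- sum(actual == 0)
  (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc(p1, valid$actual_class)  # AUC for model M1
auc(p2, valid$actual_class)  # AUC for model M2
```

Note that the AUC summarizes the whole curve in one number, so it can hide crossovers like the one on this slide.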

Summary
Performance measures differ depending on the use of the model.
In data mining we are usually more interested in prediction accuracy than in statistical fit.
When predicting numerical values we can use numerous error measures, such as the mean absolute error (MAE) or the root mean squared error (RMSE).
When classifying we again have many options: accuracy, sensitivity, misclassification costs, the ROC curve, etc.
Whenever possible, we should use a validation sample to estimate the performance of the model.
If the sample size is too small to split the data, there are other methods to estimate the prediction accuracy.
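For the numerical-prediction measures mentioned above, each is a one-line computation in R (assuming vectors actual and predicted, names chosen here for illustration):

```r
mae  <- mean(abs(actual - predicted))        # mean absolute error
rmse <- sqrt(mean((actual - predicted)^2))   # root mean squared error
```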

Next
We will introduce two more classification methods!
Naive Bayes: for classification
K-nearest neighbor: for classification and prediction
Both were identified among the top 10 algorithms by the IEEE International Conference on Data Mining (ICDM) in December 2006! (see http://www.cs.umd.edu/~samir/498/10Algorithms-08.pdf)