Assessment of Model Development Techniques and Evaluation Methods for Binary Classification in the Credit Industry
Jennifer Lewis Priestley and Satish Nargundkar
DSI Conference, November 24, 2003
Paper Research Questions
This paper addresses the following two research questions:
1. Does model development technique improve classification accuracy?
2. How will model selection vary based upon the evaluation method used?
Discussion Outline
- Discussion of Modeling Techniques
- Discussion of Model Evaluation Methods
  - Global Classification Rate
  - Loss Function
  - K-S Test
  - ROC Curves
- Empirical Example
Model Development Techniques
Modeling plays an increasingly important role in CRM strategies, across product planning, customer acquisition, customer management, and collections/recovery:
- Target Marketing: Response Models, Risk Models
- Customer Behavioral Models: Usage Models, Attrition Models, Activation Models
- Collections: Recovery Models
- Other Models: Segmentation Models, Bankruptcy Models, Fraud Models
Model Development Techniques
Given that even minimal improvements in model classification accuracy can translate into significant savings or incremental revenue, an entire literature exists on the comparison of model development techniques (e.g., Atiya, 2001; Reichert et al., 1983; West, 2000; Vellido et al., 1993; Zhang et al., 1999).
- Statistical Techniques: Linear Discriminant Analysis, Logistic Analysis, Multiple Regression Analysis
- Non-Statistical Techniques: Neural Networks, Cluster Analysis, Decision Trees
Model Evaluation Methods But, developing the model is really only half the problem. How do you then determine which model is “best”?
Model Evaluation Methods
In the context of binary classification (one of the most common objectives in CRM modeling), one of four outcomes is possible:
1. True positive
2. False positive
3. True negative
4. False negative

                 True Good   True Bad
  Pred. Good        TP          FP
  Pred. Bad         FN          TN
Model Evaluation Methods
If all of these outcomes, specifically the errors, have the same associated costs, then a simple global classification rate is a highly appropriate evaluation method:

                   True Good   True Bad   Total
  Predicted Good      650         200       850
  Predicted Bad        50         100       150
  Total               700         300     1,000

Classification Rate = 75% ((100 + 650) / 1,000)
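As a quick sketch (not from the paper), the global classification rate for the confusion matrix on this slide can be computed directly from the four cell counts:

```python
# Confusion-matrix counts reconstructed from the slide's example:
# 1,000 applicants, with 650 goods and 100 bads classified correctly.
tp = 650  # predicted good, truly good
fp = 200  # predicted good, truly bad
fn = 50   # predicted bad, truly good
tn = 100  # predicted bad, truly bad

total = tp + fp + fn + tn
global_rate = (tp + tn) / total  # correct predictions over all predictions
print(f"Global classification rate: {global_rate:.0%}")  # prints "Global classification rate: 75%"
```

Note that the rate weights all four cells equally, which is exactly the assumption the next slide challenges.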
Model Evaluation Methods
The global classification rate is the most commonly used method (Bernardi and Zhang, 1999), but it fails when the costs of the misclassification errors differ (Type 1 vs. Type 2 errors):
- Model 1 results: Global Classification Rate = 75%; False Positive Rate = 5%; False Negative Rate = 20%
- Model 2 results: Global Classification Rate = 80%; False Positive Rate = 15%; False Negative Rate = 5%
What if the cost of a false positive was great, and the cost of a false negative was negligible? What if it was the other way around?
Model Evaluation Methods
If the misclassification error costs are understood with some certainty, a loss function can be used to select the best model:

  Loss = π₀f₀c₀ + π₁f₁c₁

where πᵢ is the prior probability that an element comes from class i, fᵢ is the probability that an element from class i will be misclassified, and cᵢ is the cost associated with that misclassification error.
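The loss function above can be sketched in a few lines of Python. The class priors, error rates, and costs below are purely illustrative and are not taken from the paper:

```python
def expected_loss(priors, error_rates, costs):
    # Loss = pi_0*f_0*c_0 + pi_1*f_1*c_1, written as a sum over
    # (prior, misclassification probability, cost) per class.
    return sum(p * f * c for p, f, c in zip(priors, error_rates, costs))

# Illustrative numbers only: 70% goods / 30% bads; misclassifying a
# good costs 1 unit, misclassifying a bad (a charge-off) costs 5 units.
loss = expected_loss(priors=[0.7, 0.3], error_rates=[0.05, 0.20], costs=[1, 5])
```

With asymmetric costs like these, a model with a worse global classification rate can still yield a lower expected loss.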
Model Evaluation Methods
An evaluation method that uses the same conceptual foundation as the global classification rate is the Kolmogorov-Smirnov (K-S) Test. In the slide's example, the greatest separation between the cumulative score distributions of the "goods" and "bads" occurs at a cutoff score of .65.
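As a minimal sketch of the K-S statistic (using hypothetical scores, not the paper's data): the statistic is the maximum vertical distance between the empirical cumulative distributions of the goods' and bads' scores, and the cutoff where it occurs is where the model separates the two groups best.

```python
def ks_statistic(good_scores, bad_scores):
    # Evaluate both empirical CDFs at every observed score and
    # return the maximum separation and the cutoff where it occurs.
    best_ks, best_cutoff = 0.0, None
    for c in sorted(set(good_scores) | set(bad_scores)):
        cdf_good = sum(s <= c for s in good_scores) / len(good_scores)
        cdf_bad = sum(s <= c for s in bad_scores) / len(bad_scores)
        if abs(cdf_bad - cdf_good) > best_ks:
            best_ks, best_cutoff = abs(cdf_bad - cdf_good), c
    return best_ks, best_cutoff

# Hypothetical scores: bads cluster low, goods cluster high.
ks, cutoff = ks_statistic(good_scores=[0.55, 0.70, 0.80, 0.90],
                          bad_scores=[0.20, 0.40, 0.65, 0.70])
```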
Model Evaluation Methods What if you don’t have ANY information regarding misclassification error costs…or…the costs are in the eye of the beholder?
Model Evaluation Methods
The area under the ROC (Receiver Operating Characteristics) Curve accounts for all possible outcomes (Swets et al., 2000; Thomas et al., 2002; Hanley and McNeil, 1982, 1983). The curve plots Sensitivity (the true positive rate) against 1 − Specificity (the false positive rate) across all possible cutoffs. The area θ ranges from θ = .5 for a random model (the diagonal) to θ = 1 for a perfect model; real models fall in between, with .5 < θ < 1.
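A sketch of what θ measures, using the Hanley-McNeil rank interpretation and made-up scores: θ equals the probability that a randomly chosen good receives a higher score than a randomly chosen bad, with ties counted as one half.

```python
def auc_theta(good_scores, bad_scores):
    # Probability a random good outscores a random bad; ties count 1/2.
    # theta = 0.5 is a random model; theta = 1 is a perfect model.
    wins = sum((g > b) + 0.5 * (g == b)
               for g in good_scores for b in bad_scores)
    return wins / (len(good_scores) * len(bad_scores))

# Hypothetical scores: one good/bad pair is inverted, so theta < 1.
theta = auc_theta(good_scores=[0.9, 0.8, 0.6], bad_scores=[0.7, 0.3, 0.2])
```

Because θ integrates over every cutoff, it requires no assumption about misclassification costs, which is exactly why it suits the "costs unknown" case on the previous slide.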
Empirical Example So, given this background, the guiding questions of our research were – 1. Does model development technique impact prediction accuracy? 2. How will model selection vary with the evaluation method used?
Empirical Example
We elected to evaluate these questions using a large data set from a pool of car loan applicants. The data set included:
- 14,042 US applicants for car loans between June 1, 1998 and June 30, 1999.
- Of these applicants, 9,442 were considered to have been "good" and 4,600 were considered to be "bad" as of December 31, 1999.
- 65 variables, split into two groups:
  - Transaction variables (miles on the vehicle, selling price, age of vehicle, etc.)
  - Applicant variables (bankruptcies, balances on other loans, number of revolving trades, etc.)
Empirical Example
The LDA and Logistic models were developed using SAS 8.2, while the Neural Network models were developed using Backpack® 4.0. Because there are no accepted guidelines for the number of hidden nodes in Neural Network development (Zhang et al., 1999; Chen and Huang, 2003), we tested a range of hidden nodes from 5 to 50.
Empirical Example
Feed Forward Back Propagation Neural Networks: inputs pass from an input layer through a hidden layer to an output layer. At each node, a combination function (Σ) combines all inputs into a single value, usually as a weighted summation, and a transfer function (S) calculates the node's output value from the combination function's result.
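A single node of the network described on this slide can be sketched as follows. This is a hypothetical illustration: the logistic sigmoid is one common choice of transfer function, and the paper does not specify which one Backpack® uses.

```python
import math

def node_output(inputs, weights, bias=0.0):
    # Combination function: weighted summation of the inputs.
    combined = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Transfer function: logistic sigmoid squashes the sum into (0, 1).
    return 1.0 / (1.0 + math.exp(-combined))

# With a net input of zero, the sigmoid returns exactly 0.5.
out = node_output(inputs=[1.0, -1.0], weights=[0.5, 0.5])
```

In a feed-forward pass, each hidden node applies this computation to the input layer, and the output node applies it again to the hidden nodes' outputs; back propagation then adjusts the weights from the output error.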
Empirical Example - Results

  Technique            Class Rate   Class Rate   Class Rate   Theta    K-S Test
                        "Goods"      "Bads"      "Global"
  LDA                    73.91%       43.40%       59.74%     68.98%     19%
  Logistic               70.54%       59.64%       69.45%     68.00%     24%
  NN-5 Hidden Nodes      63.50%       56.50%       58.88%     63.59%     38%
  NN-10 Hidden Nodes     75.40%       44.50%       55.07%     64.46%     11%
  NN-15 Hidden Nodes     60.10%       62.10%       61.40%     65.89%     24%
  NN-20 Hidden Nodes     62.70%       59.00%       60.29%     65.27%     24%
  NN-25 Hidden Nodes     76.60%       41.90%       53.78%     63.55%     16%
  NN-30 Hidden Nodes     52.70%       68.50%       63.13%     65.74%     22%
  NN-35 Hidden Nodes     60.30%       59.00%       59.46%     63.30%     22%
  NN-40 Hidden Nodes     62.40%       58.30%       59.71%     64.47%     17%
  NN-45 Hidden Nodes     54.10%       65.20%       61.40%     64.50%     31%
  NN-50 Hidden Nodes     53.20%       68.50%       63.27%     65.15%     37%
Conclusions
What were we able to demonstrate?
1. The "best" model depends upon the evaluation method selected;
2. The appropriate evaluation method depends upon situational and data context;
3. No multivariate technique is "best" under all circumstances.