Project
Overview
Project details
Fitness function
    Equal error costs
    Unequal error costs
Open class to begin project work
Project details
Objective (or fitness function)?
Equal Error Costs
If type 1 and type 2 error costs are the same, we can simply minimize the misclassification rate on the hold-out set.
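As a minimal sketch of what "misclassification rate on the hold-out set" means, the snippet below counts the fraction of hold-out cases where the prediction disagrees with the actual label. The function name `misclassification_rate` and the tiny label lists are made up for illustration only; they are not project data.

```python
def misclassification_rate(actual, predicted):
    """Fraction of cases where the predicted label differs from the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one case")
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


# Hypothetical hold-out labels (1 = YES, 0 = NO), invented for this sketch.
actual = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 0, 1, 1, 0, 0, 0, 0]

rate = misclassification_rate(actual, predicted)
print(rate)
```

Note that this measure treats a type 1 error and a type 2 error as equally bad, which is only appropriate when their costs are equal.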
Unequal Error Costs
In many real-life scenarios the costs of type 1 and type 2 errors are not the same…
These two models have the same misclassification rate: which one do you prefer?
Example
[Figure: MODEL A and MODEL B, with type 1 and type 2 errors marked]
Type 2 error: the expected cost of incorrectly classifying someone with cancer (actual YES) as not having cancer (predict NO) is $10000.
Type 1 error: the expected cost of incorrectly classifying someone without cancer (actual NO) as having cancer (predict YES) is $500.
Example
Type 2 error cost = 10000
Type 1 error cost = 500
Revenue for correctly identifying that cancer exists = 4000
Revenue for correctly identifying that cancer doesn't exist = 100
Instead of minimizing misclassification rate, let's maximize expected profit:
expected profit = (revenue correctly predict 1 x #correct predict 1) + (revenue correctly predict 0 x #correct predict 0) - (cost of type 1 error x #type 1 errors) - (cost of type 2 error x #type 2 errors)
Model A:
#correct predict 1 = 5, #correct predict 0 = 5, #type 1 errors = 9, #type 2 errors = 1
expected profit = (4000 x 5) + (100 x 5) - (500 x 9) - (10000 x 1) = 6000
Model B:
#correct predict 1 = 5, #correct predict 0 = 5, #type 1 errors = 1, #type 2 errors = 9
expected profit = (4000 x 5) + (100 x 5) - (500 x 1) - (10000 x 9) = -70000
Model A's expected profit (6000) is far higher than Model B's (-70000), so Model A is preferred under these costs.
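The arithmetic for the two models can be checked with a short sketch. The costs, revenues, and counts are the ones given in the example; the names `Counts`, `expected_profit`, and `error_rate_from_counts` are assumptions introduced here for illustration, not part of the project materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    correct_1: int      # correctly predicted 1 (cancer exists)
    correct_0: int      # correctly predicted 0 (cancer doesn't exist)
    type1_errors: int   # predicted YES, actual NO
    type2_errors: int   # predicted NO, actual YES


# Values taken from the example.
REVENUE_CORRECT_1 = 4000
REVENUE_CORRECT_0 = 100
COST_TYPE1 = 500
COST_TYPE2 = 10000


def expected_profit(c: Counts) -> int:
    """Revenue from correct predictions minus the cost of each error type."""
    return (REVENUE_CORRECT_1 * c.correct_1
            + REVENUE_CORRECT_0 * c.correct_0
            - COST_TYPE1 * c.type1_errors
            - COST_TYPE2 * c.type2_errors)


def error_rate_from_counts(c: Counts) -> float:
    """Misclassification rate: all errors divided by all cases."""
    total = c.correct_1 + c.correct_0 + c.type1_errors + c.type2_errors
    return (c.type1_errors + c.type2_errors) / total


model_a = Counts(correct_1=5, correct_0=5, type1_errors=9, type2_errors=1)
model_b = Counts(correct_1=5, correct_0=5, type1_errors=1, type2_errors=9)

for name, counts in (("Model A", model_a), ("Model B", model_b)):
    print(name, error_rate_from_counts(counts), expected_profit(counts))
```

Both models come out with the same misclassification rate, while the expected profit separates them, which is the reason for using profit as the fitness function when error costs are unequal.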
Project Work