The Prediction Calculator Tool

Imagine you are a salesperson who deals with many customers and has to select which of them to follow up with to close a transaction. Following up comes at a cost: the time you spend with the customer and the marketing materials you present to them. So, you need to pick the follow-ups carefully.

A simple solution is a scorecard tool, which allows the salesperson to assign a score to each customer based on individual attributes. Typically, you would use a threshold, something like "If the total score is at least 70, then the customer is likely to buy a bike, so go on, follow up!"

The Prediction Calculator tool produces such a scorecard. It also helps you find the optimum threshold for using the scorecard: the threshold that minimizes the costs associated with incorrect predictions and maximizes the profits associated with correct predictions. The Prediction Calculator tool can perform only binary predictions. It can predict whether a column will have a certain value or not, but it cannot select between multiple alternatives.

Example

In this example, we will use the Prediction Calculator tool to generate a scorecard that predicts whether a customer is likely to purchase a bike or not, based on demographics.

The following figure shows the operational Prediction Calculator report, which can be used interactively to perform predictions. The total score is compared against the threshold at the top of the report (540, in this example). If the total is equal to or exceeds the threshold, the predicted value for Purchased Bike is True; otherwise, it is False.

As an example, use the calculator to predict whether a new customer will buy a bike or not. Enter the customer's demographics as shown here:
- Married for Marital Status
- Male for Gender
- for Income
- 3 for Children
- Graduate Degree for Education
- Professional for Occupation
- Yes for Home Owner
- 2 for Cars
- 0-1 Miles for Commute Distance
- North America for Region
- 46-55 for Age

The total changes to 603, which exceeds the 558 threshold. Therefore, the prediction is TRUE, and the customer is likely to buy a bike.

Refining the Results

In the previous example, we used the Prediction Calculator to predict, based on demographic information, whether or not a customer will buy a bike. The Prediction Calculator associates a score with each column value. If the sum of these scores for a customer is equal to or exceeds a threshold, then the prediction is positive (the customer will likely buy a bike). If the sum of these scores is less than the threshold, then the prediction is negative.
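The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the column names come from the example, but the score values, the `SCORES` table, and the helper names `total_score` and `predict` are hypothetical. The real tool derives its scores from the training data.

```python
# Hypothetical per-value scores; the real tool computes these from training data.
SCORES = {
    "Commute Distance": {"0-1 Miles": 60, "10+ Miles": 0},
    "Children": {0: 45, 3: 20},
}

def total_score(customer, scores=SCORES):
    """Sum the score attached to each of the customer's column values."""
    return sum(
        scores[column].get(value, 0)
        for column, value in customer.items()
        if column in scores
    )

def predict(customer, threshold, scores=SCORES):
    """Positive prediction when the total is equal to or exceeds the threshold."""
    return total_score(customer, scores) >= threshold

customer = {"Commute Distance": "0-1 Miles", "Children": 3}
print(total_score(customer))            # 80
print(predict(customer, threshold=70))  # True
```

A value that has no entry in the table contributes 0 in this sketch; how the actual tool treats unseen values is not stated in the source.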

The predictions can be classified into the following four categories:
- True negative predictions: a correct negative prediction. The tool predicts that a customer is not a bike buyer and, when you ask the customer, you find out that he or she is indeed not interested in buying a bike.
- True positive predictions: a correct positive prediction. The tool predicts that a customer is a bike buyer, and the customer is indeed interested in buying a bike.
- False positive predictions, also known as Type I errors: an incorrect positive prediction. The tool predicts that a customer is a bike buyer, but when you ask the customer, you find out that he or she is not interested in buying a bike.
- False negative predictions, also known as Type II errors: an incorrect negative prediction. The tool predicts that the customer is not a bike buyer, but you find out later that he or she was actually interested in buying a bike.
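As a rough illustration of these four categories, the following Python sketch labels each (predicted, actual) pair and counts the outcomes. The function name `outcome` and the sample pairs are made up for the example.

```python
from collections import Counter

def outcome(predicted, actual):
    """Name the category of one prediction, given the predicted and actual values."""
    if predicted:
        return "true positive" if actual else "false positive"
    return "false negative" if actual else "true negative"

# Hypothetical (predicted, actual) pairs for five customers.
pairs = [(True, True), (True, False), (False, False), (False, True), (True, True)]
counts = Counter(outcome(p, a) for p, a in pairs)
print(counts["true positive"], counts["false positive"],
      counts["true negative"], counts["false negative"])  # 2 1 1 1
```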

Our goal in using the calculator is to correctly identify as many bike buyers as possible. In this scenario, consider the following:
- A true positive prediction produces value: the profit margin associated with selling a bike.
- A true negative prediction produces neither value nor loss. It saves you the marketing effort on an uninterested customer.
- A false positive prediction may produce some loss: the marketing cost associated with that customer.
- A false negative prediction does not produce value. It may represent a lost opportunity to sell a bike.

The total profit generated by the tool is the total profit margin associated with true positive predictions, minus the total marketing cost associated with false positive predictions.
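The total profit for this scenario can be written as a one-line calculation. The sketch below assumes a fixed margin per sale and a fixed cost per follow-up; the function name and all the numbers are hypothetical.

```python
def bike_campaign_profit(true_positives, false_positives,
                         margin_per_sale, cost_per_follow_up):
    """Margin earned on true positives minus marketing cost spent on false positives."""
    return true_positives * margin_per_sale - false_positives * cost_per_follow_up

# Hypothetical campaign: 40 correct follow-ups, 25 wasted follow-ups.
print(bike_campaign_profit(40, 25, margin_per_sale=100, cost_per_follow_up=15))  # 3625
```

True negatives and false negatives do not appear in the formula because, in this scenario, neither produces a direct gain or loss.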

Suppose that you are using the scorecard to identify high-risk patients. In this case:
- A true negative prediction has a profit of zero.
- A false positive prediction may have some cost associated with extra investigations.
- A false negative prediction has a very serious cost associated with patient risks: the cost of treating a more advanced disease.
- A true positive prediction

You can use the Prediction Report to tune the Prediction Calculator to maximize the profit. The figure shows the Prediction Report tuning tool.

By default, the tool associates a profit of $10 with a true positive prediction and a cost of $10 with a false positive prediction. These defaults represent a direct marketing scenario, where a true positive leads to revenue and a false positive leads to losses related to direct marketing costs. Use this section of the tool to specify your own costs and profits.

The tool computes the optimum threshold for the Prediction Calculator as the threshold that maximizes the profit (revenue from correct predictions, minus costs from incorrect predictions) over the test set. During execution, the tool creates a set of randomly selected table rows for testing purposes.
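Setting aside a random test set can be sketched as follows. This only illustrates the idea of a random holdout: the source does not say what fraction of rows the tool reserves or how it samples them, so `test_fraction`, `seed`, and the helper name `split_rows` are assumptions.

```python
import random

def split_rows(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))  # stand-in for table rows
train, test = split_rows(rows)
print(len(train), len(test))  # 7 3
```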

Take a simple example that considers only Commute Distance and Children. Assume that the test set contains five rows. Also assume that the following things are true:
- A correct prediction (true positive or true negative) has a profit of $10.
- An incorrect prediction (false positive or false negative) has a cost of $10.

If the threshold is set to 524, then any score greater than or equal to 524 generates a positive prediction (correct or incorrect), and any score below 524 generates a negative prediction (correct or incorrect). For a threshold of 524, the test table produces the following:
- Three true positive predictions (rows with IDs 1, 2, and 3), resulting in a total revenue of $30.
- One true negative prediction (row 5), resulting in a total revenue of $10.
- Zero false negative predictions.
- One false positive prediction (row 4), resulting in a total cost of $10.

Therefore, the total profit associated with a score threshold of 524 is $30.
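The arithmetic behind the $30 figure can be checked with a short Python sketch. The outcome counts and the $10 values come from the example above; the dictionary names are only for illustration.

```python
# Outcome counts at a threshold of 524, as listed above.
counts = {"TP": 3, "TN": 1, "FN": 0, "FP": 1}
# Each correct prediction earns $10; each incorrect prediction costs $10.
value = {"TP": 10, "TN": 10, "FN": -10, "FP": -10}

total_profit = sum(counts[k] * value[k] for k in counts)
print(total_profit)  # 30, that is 30 + 10 - 0 - 10
```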

If you repeat this experiment for all distinct score values in the test set, as well as for 0 and 1000 (the minimum and maximum possible scores), the total profit follows the values shown in the table. As a result, the maximum total profit provided by the tool is $30, and it is reached when the threshold is in the range of 221 to 524. Because the test set granularity does not permit comparing values within this range, the tool recommends a threshold of 221 (the first in the range) as the optimum threshold.
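The threshold search described above can be sketched as a simple sweep. This is a minimal illustration, not the tool's actual implementation: the test rows below are invented (they are not the rows from the table), and the function names are hypothetical. Ties are resolved in favor of the lowest threshold, mirroring the "first in the range" behavior.

```python
def profit_at(threshold, rows, gain=10, loss=10):
    """Profit over the test rows when scores >= threshold are predicted positive."""
    total = 0
    for score, actual in rows:
        predicted = score >= threshold
        total += gain if predicted == actual else -loss
    return total

def best_threshold(rows, min_score=0, max_score=1000):
    """Try every distinct score plus the two extremes; keep the first maximum."""
    candidates = sorted({min_score, max_score, *(score for score, _ in rows)})
    best = None
    for threshold in candidates:
        profit = profit_at(threshold, rows)
        if best is None or profit > best[1]:  # strict '>' keeps the lowest tie
            best = (threshold, profit)
    return best

# Hypothetical test set: (score, customer actually bought a bike).
TEST_ROWS = [(800, True), (650, True), (500, True), (400, False), (200, False)]
print(best_threshold(TEST_ROWS))  # (500, 50)
```

With these made-up rows, a threshold of 500 separates buyers from non-buyers perfectly, so all five predictions are correct and the profit is $50.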

The profit starts very low for a low threshold. In this case, the number of false positives is very large. As the score threshold grows, the number of false positives is reduced. As the score threshold grows even further, the number of false negatives increases.

Figure: The evolution of the profit for various thresholds

Figure: The cumulative costs associated with incorrect predictions
