The Prediction Calculator Tool

Presentation transcript:

Imagine you are a salesperson who deals with many customers and must choose which ones to follow up with to close a transaction. Following up comes at a cost: the time you spend with the customer and the marketing materials you present to them. So you need to pick your follow-ups carefully.

A simple solution is a scorecard, which lets the salesperson assign a score to each customer based on individual attributes. Typically, you would use a threshold, something like "If the total score is at least 70, then the customer is likely to buy a bike, so go on, follow up!"

The Prediction Calculator tool produces such a scorecard. It also helps you find the optimum threshold for using the scorecard: the threshold that minimizes the costs associated with incorrect predictions and maximizes the profits associated with correct predictions. The Prediction Calculator tool can perform only binary predictions. It can predict whether a column will have a certain value or not, but it cannot select among multiple alternatives.

Example

In this example, we use the Prediction Calculator tool to generate a scorecard that predicts, based on demographics, whether a customer is likely to purchase a bike.

The Prediction Calculator

The following figure shows the operational Prediction Calculator report, which can be used interactively to perform predictions. The total score is compared against the threshold at the top of the report (540, in this example). If the total reaches or exceeds the threshold, the predicted value for Purchased Bike is True.

As an example, use the calculator to predict whether a new customer will buy a bike. Enter the customer's demographics as shown here:

Married for Marital Status
Male for Gender
97111-127371 for Income
3 for Children
Graduate Degree for Education
Professional for Occupation
Yes for Home Owner
2 for Cars
0-1 Miles for Commute Distance
North America for Region
46-55 for Age

The total changes to 603, which exceeds the 558 threshold. Therefore, the prediction is TRUE, and the customer is likely to buy a bike.

Refining the Results

In the previous example, we used the Prediction Calculator to predict, based on demographic information, whether a customer will buy a bike. The Prediction Calculator associates a score with each column value. If the sum of these scores for a customer equals or exceeds a threshold, the prediction is positive (the customer will likely buy a bike). If the sum is less than the threshold, the prediction is negative.
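The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the score values, the threshold, and the names `SCORECARD`, `THRESHOLD`, `total_score`, and `predict` are all hypothetical, since the real tool derives its scores from the training data.

```python
# Hypothetical per-value scores for two columns; not taken from the tool.
SCORECARD = {
    "Commute Distance": {"0-1 Miles": 300, "10+ Miles": 50},
    "Children": {"0": 250, "3": 100},
}
THRESHOLD = 350  # hypothetical threshold


def total_score(customer, scorecard=SCORECARD):
    """Sum the score attached to each of the customer's column values."""
    return sum(scorecard[column][value] for column, value in customer.items())


def predict(customer, threshold=THRESHOLD):
    """Positive prediction when the total equals or exceeds the threshold."""
    return total_score(customer) >= threshold


near_no_kids = {"Commute Distance": "0-1 Miles", "Children": "0"}
far_three_kids = {"Commute Distance": "10+ Miles", "Children": "3"}

print(total_score(near_no_kids), predict(near_no_kids))      # 550 True
print(total_score(far_three_kids), predict(far_three_kids))  # 150 False
```

A customer whose total lands exactly on the threshold is still predicted positive, which matches the "equals or exceeds" rule.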

The predictions can be classified into the following four categories:

True negative predictions: A correct negative prediction. The tool predicts that a customer is not a bike buyer and, when you ask the customer, you find that he or she is indeed not interested in buying a bike.
True positive predictions: A correct positive prediction. The tool predicts that a customer is a bike buyer, and the customer is indeed interested in buying a bike.
False positive predictions, also known as Type I errors: An incorrect positive prediction. The tool predicts that a customer is a bike buyer, but when you ask the customer, you find that he or she is not interested in buying a bike.
False negative predictions, also known as Type II errors: An incorrect negative prediction. The tool predicts that the customer is not a bike buyer, but you find out later that he or she was actually interested in buying a bike.
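The four categories above depend only on the predicted value and the actual outcome. The sketch below shows one way to label and tally them; the function name `classify` and the sample pairs are hypothetical and are not data from the presentation.

```python
from collections import Counter


def classify(predicted: bool, actual: bool) -> str:
    """Name the category of one prediction given the actual outcome."""
    if predicted and actual:
        return "true positive"
    if predicted and not actual:
        return "false positive"  # Type I error
    if not predicted and actual:
        return "false negative"  # Type II error
    return "true negative"


# Hypothetical (predicted, actual) pairs, for illustration only.
pairs = [(True, True), (True, False), (False, False), (False, True), (True, True)]
tally = Counter(classify(predicted, actual) for predicted, actual in pairs)
print(tally["true positive"], tally["false positive"],
      tally["true negative"], tally["false negative"])  # 2 1 1 1
```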

Our goal in using the calculator is to correctly identify as many bike buyers as possible. In this scenario, consider the following:

A true positive prediction produces value: the profit margin associated with selling a bike.
A true negative prediction produces neither value nor loss; it saves you the marketing effort on an uninterested customer.
A false positive prediction may produce some loss: the marketing cost associated with that customer.
A false negative prediction does not produce value; it may represent a lost opportunity to sell a bike.

The total profit generated by the tool is the total profit margin associated with true positive predictions, minus the total marketing cost associated with false positive predictions.

Suppose instead that you are using the scorecard to identify high-risk patients. In this case:

The profit is zero for a true negative prediction.
A false positive prediction may have some cost associated with extra investigations.
A false negative prediction has a very serious cost associated with patient risks, such as the cost of treating a more advanced disease.
A true positive prediction

You can use the Prediction Report to tune the Prediction Calculator to maximize profit. The figure shows the Prediction Report tuning tool.

By default, the tool associates a profit of $10 with a true positive prediction and a cost of $10 with a false positive prediction. These defaults represent a direct marketing scenario, where a true positive leads to revenue and a false positive leads to losses related to direct marketing costs. Use this section of the tool to specify your own costs and profits.
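As a rough illustration of how these settings feed the profit figure, the sketch below applies the $10 defaults to the direct marketing rule (profit from true positives minus cost of false positives). The function name `direct_marketing_profit` and the prediction counts are hypothetical.

```python
def direct_marketing_profit(true_positives, false_positives,
                            profit_per_tp=10, cost_per_fp=10):
    """Profit from true positives minus marketing cost of false positives.

    The $10 defaults mirror the tool's default settings; pass other
    values to model your own costs and profits.
    """
    return true_positives * profit_per_tp - false_positives * cost_per_fp


# Hypothetical counts: 40 true positives and 25 false positives.
print(direct_marketing_profit(40, 25))                                  # 150
print(direct_marketing_profit(40, 25, profit_per_tp=50, cost_per_fp=5))  # 1875
```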

The tool computes the optimum threshold for the Prediction Calculator as the threshold that maximizes the profit (revenue from correct predictions, minus costs from incorrect predictions) over the test set. During execution, the tool creates a set of randomly selected table rows for testing purposes.
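Setting aside randomly selected rows for testing can be sketched as follows. The presentation does not say how many rows the tool holds out or how it samples them, so the 30% fraction, the fixed seed, and the function name `split_rows` are assumptions made only for this illustration.

```python
import random


def split_rows(rows, test_fraction=0.3, seed=0):
    """Return (training_rows, test_rows) using a random holdout.

    test_fraction and seed are assumed values, not the tool's settings.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


training_rows, test_rows = split_rows(range(10))
print(len(training_rows), len(test_rows))  # 7 3
```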

Take a simple example that considers only Commute Distance and Children. Assume that the test set contains five rows. Also assume that the following things are true:

A correct prediction (true positive or true negative) has a profit of $10.
An incorrect prediction (false positive or false negative) has a cost of $10.

If the threshold is set to 524, then any score greater than or equal to 524 generates a positive prediction (correct or incorrect), and any score below 524 generates a negative prediction (correct or incorrect). For a threshold of 524, the test table produces the following:

Three true positive predictions (rows with IDs 1, 2, and 3), resulting in a total revenue of $30.
One true negative prediction (row 5), resulting in a total revenue of $10.
Zero false negative predictions.
One false positive prediction (row 4), resulting in a total cost of $10.

Therefore, the total profit associated with a score threshold of 524 is $30 + $10 - $10 = $30.
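The arithmetic for the 524 threshold can be checked with a short script. The counts and the $10 amounts come from the example above; the variable names are only illustrative.

```python
GAIN_PER_CORRECT = 10    # profit for a true positive or true negative
COST_PER_INCORRECT = 10  # cost for a false positive or false negative

# Prediction counts stated for a threshold of 524.
counts = {
    "true positive": 3,
    "true negative": 1,
    "false negative": 0,
    "false positive": 1,
}

correct = counts["true positive"] + counts["true negative"]
incorrect = counts["false positive"] + counts["false negative"]
total_profit = correct * GAIN_PER_CORRECT - incorrect * COST_PER_INCORRECT
print(total_profit)  # 30
```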

If you repeat this experiment for all distinct score values in the test set, as well as for 0 and 1000 (the minimum and maximum possible scores), the total profit follows the values shown in the table. As a result, the maximum total profit provided by the tool is $30, reached when the threshold is in the range of 221 to 524. The test set granularity does not permit comparing values within this range, so the tool recommends a threshold of 221 (the first in the range) as the optimum threshold.
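The threshold sweep described above can be sketched as follows. The five rows here are hypothetical and do not reproduce the table from the presentation, so the resulting numbers differ from the 221 to 524 range. Only the procedure is the same: try every distinct score plus 0 and 1000, then keep the first threshold that reaches the maximum profit.

```python
# Hypothetical test rows: (score, actually bought a bike).
ROWS = [(900, True), (750, True), (650, False), (400, True), (300, False)]
GAIN, LOSS = 10, 10  # $10 per correct prediction, $10 per incorrect one


def profit_at(threshold, rows=ROWS):
    """Total profit when scores >= threshold are predicted positive."""
    total = 0
    for score, actual in rows:
        predicted = score >= threshold
        total += GAIN if predicted == actual else -LOSS
    return total


# Candidate thresholds: every distinct score, plus the 0 and 1000 extremes.
candidates = sorted({0, 1000} | {score for score, _ in ROWS})
profits = {t: profit_at(t) for t in candidates}

# max() returns the first of several tied maxima, so the lowest
# threshold with the best profit is the one recommended.
best = max(candidates, key=profits.get)

print(profits)  # {0: 10, 300: 10, 400: 30, 650: 10, 750: 30, 900: 10, 1000: -10}
print(best)     # 400
```

In this made-up data the profit peaks at both 400 and 750, and the sweep returns 400, the first threshold to reach the maximum.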

The profit starts very low for a low threshold. In this case, the number of false positives is very large. As the score threshold grows, the number of false positives is reduced. As the score threshold grows even further, the number of false negatives increases.

Figure: The evolution of the profit for various thresholds.
Figure: The cumulative costs associated with incorrect predictions.