1
Similarity Modeling
Chuck Boucek
Insurance and Actuarial Advisory Services
CAS Seminar on Predictive Modeling – Las Vegas, Nevada
October 11 – 12,
2
Agenda
Overview
Underwriting Model
Similarity Model
  Conceptual framework
  Process
  Model results
Market Deployment
3
Overview
Company’s book of business is a function of its
  Target market
  Underwriting practices
  Location of agencies
Underwriting model is based on company’s own data
Model is sometimes employed to score risks outside of target market
Company approaches to addressing policyholders at extremes of historical data
  Limits on risks that are scored via model
  Manual underwriting of selected groups
  Modeling approach
4
Overview
Business risks of predictive modeling
  Model does not appropriately assess policyholders
  Model deployment worsens policyholder retention
  Model is inappropriately extrapolated to policyholders
Can an underwriting model be used to score all potential policyholders?
How to treat policyholders at extremes of historical data?
Is there a structured way of assessing how similar a policyholder is to the policyholders in the data used to build the model?
5
Underwriting Model – Lift Chart
[Figure: lift chart of predicted and actual relative frequency.]
Source of all graphs: Ernst & Young Insurance and Actuarial Advisory Services
6
Underwriting Model – Predictor Variable #1
[Figure: relative frequency and number of observations by credit variable (0.0 to 1.0). Series: relative frequency, 2 SE, number of observations.]
7
Underwriting Model – Predictor Variable #2
[Figure: relative frequency and number of observations by internal class variable (levels A, B, C). Series: relative frequency, 2 SE, number of observations.]
8
Similarity Model Conceptual framework
Policyholders with greater certainty in the claim frequency prediction should score higher
Generally, there is greater uncertainty in predictions from areas with more sparse data
Want to be able to assess similarity in a multivariate framework
9
Similarity Model Process
Generate one record for every record in modeling database
  Numeric variable will be from a uniform distribution over the range of actual values
  Factor variable will have an equal frequency for each level of the factor
  Variables are limited to those in underwriting model
Create an indicator variable
  1 = Record is from actual data
  0 = Record is from generated data
Perform a logistic regression with the indicator variable as response variable
  Response variable assumed to be binomial
  Logit link function
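The process on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the presenter's implementation: the column names (credit, cls), the simulated "actual" book, and the function names are all hypothetical, and the logistic regression is fit by plain batch gradient ascent rather than the iteratively reweighted least squares a GLM package would normally use, so the coefficients are only approximate.

```python
import math
import random

def make_generated(actual, numeric_cols, factor_cols, rng):
    """Return one generated record per actual record."""
    n = len(actual)
    columns = {}
    for c in numeric_cols:
        lo = min(r[c] for r in actual)
        hi = max(r[c] for r in actual)
        # Uniform over the observed range of the actual values.
        columns[c] = [rng.uniform(lo, hi) for _ in range(n)]
    for c in factor_cols:
        levels = sorted({r[c] for r in actual})
        # Repeat the levels in turn so each appears equally often, then shuffle.
        values = [levels[i % len(levels)] for i in range(n)]
        rng.shuffle(values)
        columns[c] = values
    return [{c: columns[c][i] for c in columns} for i in range(n)]

def encode(row, numeric_cols, factor_levels):
    """Intercept, numeric values, and dummy variables (first level is the base)."""
    x = [1.0] + [float(row[c]) for c in numeric_cols]
    for c, levels in factor_levels.items():
        x += [1.0 if row[c] == lv else 0.0 for lv in levels[1:]]
    return x

def sigmoid(z):
    """Inverse of the logit link, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, steps=300, lr=0.5):
    """Approximate maximum likelihood fit by batch gradient ascent (an assumption)."""
    beta = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(sum(b * v for b, v in zip(beta, xi)))
            for j, v in enumerate(xi):
                grad[j] += err * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

rng = random.Random(0)
# Hypothetical "actual" book: credit variable skewed high, class A most common.
actual = [{"credit": rng.betavariate(5, 2),
           "cls": rng.choices(["A", "B", "C"], weights=[6, 3, 1])[0]}
          for _ in range(500)]
generated = make_generated(actual, ["credit"], ["cls"], rng)

factor_levels = {"cls": sorted({r["cls"] for r in actual})}
rows = actual + generated
y = [1] * len(actual) + [0] * len(generated)   # 1 = actual data, 0 = generated data
X = [encode(r, ["credit"], factor_levels) for r in rows]
beta = fit_logistic(X, y)

def similarity_score(row):
    """Fitted probability that a record comes from the actual data."""
    x = encode(row, ["credit"], factor_levels)
    return sigmoid(sum(b * v for b, v in zip(beta, x)))

score_typical = similarity_score({"credit": 0.8, "cls": "A"})
score_fringe = similarity_score({"credit": 0.1, "cls": "C"})
print(round(score_typical, 2), round(score_fringe, 2))
```

In this simulated book, a record that looks like most of the actual data (high credit value, class A) receives a higher fitted probability than one from a sparse region (low credit value, class C), which is the behavior the conceptual framework asks for.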
10
Similarity Model – Predictor Variable #1
[Figure: two panels showing number of observations by credit variable (0.0 to 1.0). Panel titles: "Credit Variable: Indicator = 1" and "Credit Variable: Indicator = 0".]
11
Similarity Model – Predictor Variable #2
[Figure: two panels showing number of observations by internal class variable (A, B, C). Panel titles: "Internal Class Variable: Indicator = 1" and "Internal Class Variable: Indicator = 0".]
12
Similarity Model Results – Average Scores
[Figure: number of observations and average similarity score by credit variable bin (0-0.1 through 0.9-1). Average scores labeled on the chart: 0.98, 0.88, 0.8, 0.75, 0.66, 0.65, 0.56, 0.55, 0.63, 0.59.]
13
Similarity Model Results – Average Scores
[Figure: number of observations and average similarity score by internal class variable (A, B, C). Average scores labeled on the chart: 0.95, 0.92, 0.8.]
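Summaries like the two "Average Scores" charts can be produced by grouping scored records and averaging the similarity score within each group. The sketch below assumes each record already carries a score field; the field names, the 0.1-wide credit bins, and the four sample records are invented for illustration and are not the presentation's data.

```python
from collections import defaultdict

def average_scores(records, key):
    """Return {group: (number of observations, average score)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for rec in records:
        group = key(rec)
        totals[group][0] += 1
        totals[group][1] += rec["score"]
    return {g: (n, s / n) for g, (n, s) in totals.items()}

def credit_bin(rec):
    """Label the 0.1-wide bin holding a credit value in [0, 1]."""
    b = min(int(rec["credit"] * 10), 9)   # a value of exactly 1.0 joins the top bin
    return f"{b / 10:.1f}-{(b + 1) / 10:.1f}"

# Hypothetical scored records (values invented for illustration only).
scored = [
    {"credit": 0.05, "cls": "C", "score": 0.50},
    {"credit": 0.07, "cls": "B", "score": 0.60},
    {"credit": 0.95, "cls": "A", "score": 0.90},
    {"credit": 0.92, "cls": "A", "score": 1.00},
]
by_class = average_scores(scored, key=lambda rec: rec["cls"])
by_credit = average_scores(scored, key=credit_bin)
print(by_class)
print(by_credit)
```

Reporting the observation count next to each average, as the charts do, makes it easy to see whether low average scores coincide with thinly populated groups.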
14
Similarity Model – Summary
Advantages of this process
  Produces results consistent with desired characteristics
  It is a structured way of sorting risks from the fringes of the distribution
  It is straightforward conceptually
  It employs a Generalized Linear Model
This is not the only process that can be employed
15
Similarity Model – Impact on Lift Chart
[Figure: lift chart of predicted and actual relative frequency for similarity decile 10.]
16
Similarity Model – Impact on Lift Chart
[Figure: lift chart of predicted and actual relative frequency for similarity decile 2.]
17
Similarity Model – Impact on Lift Chart
[Figure: lift chart of predicted and actual relative frequency for similarity decile 1.]
18
Similarity Model Observations
Policyholders in similarity decile 1 should show more variability in actual results
The predicted spread in claim frequency is smaller in decile 10 than in decile 1
The policyholders in decile 10 tend to be alike
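One way to form similarity deciles and compare the predicted spread across them is sketched below. It assumes rank-based deciles with decile 1 holding the lowest similarity scores, which matches how the slides contrast decile 1 with decile 10 but is not stated explicitly in the source. The scores and predicted relative frequencies are invented so that the pattern described above is visible; they are not results from the presentation.

```python
def assign_deciles(scores):
    """Rank-based deciles: 1 = lowest similarity scores, 10 = highest."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    deciles = [0] * len(scores)
    for rank, i in enumerate(order):
        deciles[i] = 1 + (rank * 10) // len(scores)
    return deciles

def predicted_spread(deciles, predicted):
    """Max minus min predicted relative frequency within each similarity decile."""
    lo, hi = {}, {}
    for d, p in zip(deciles, predicted):
        lo[d] = min(lo.get(d, p), p)
        hi[d] = max(hi.get(d, p), p)
    return {d: hi[d] - lo[d] for d in sorted(lo)}

# Hypothetical similarity scores and predicted relative frequencies for 20 policyholders.
scores = [round(0.05 * i, 2) for i in range(1, 21)]
predicted = [0.6, 1.4, 0.7, 1.4, 0.7, 1.3, 0.8, 1.3, 0.8, 1.2,
             0.8, 1.2, 0.9, 1.2, 0.9, 1.1, 0.9, 1.1, 1.0, 1.1]

deciles = assign_deciles(scores)
spread = predicted_spread(deciles, predicted)
print(spread[1], spread[10])
```

The same decile labels can be used to draw a separate lift chart per decile, as on the preceding slides.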
19
Market Deployment
Applications
  Select a cutoff similarity score
    Based on variability in lift chart
    Professional judgment of underwriting staff
  Policyholders with lower similarity scores will receive greater underwriting attention
  Monitor results by similarity group
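A minimal sketch of how a cutoff might be applied and monitored follows. The cutoff value of 0.6, the routing labels, the field names, and the three sample policies are all hypothetical; in practice the cutoff would come from the lift-chart variability and underwriting judgment described above.

```python
CUTOFF = 0.6  # hypothetical cutoff similarity score

def underwriting_route(score, cutoff=CUTOFF):
    """Below the cutoff: refer for added underwriting attention; otherwise use the model."""
    return "refer to underwriter" if score < cutoff else "score via model"

def monitor(policies, cutoff=CUTOFF):
    """Actual and predicted claim frequency per similarity group, weighted by exposure."""
    groups = {}
    for p in policies:
        g = "below cutoff" if p["score"] < cutoff else "at or above cutoff"
        t = groups.setdefault(g, {"exposure": 0.0, "claims": 0.0, "predicted": 0.0})
        t["exposure"] += p["exposure"]
        t["claims"] += p["claims"]
        t["predicted"] += p["predicted_freq"] * p["exposure"]
    return {g: {"actual_freq": t["claims"] / t["exposure"],
                "predicted_freq": t["predicted"] / t["exposure"]}
            for g, t in groups.items()}

# Hypothetical policies (values invented for illustration only).
policies = [
    {"score": 0.4, "exposure": 1.0, "claims": 1, "predicted_freq": 0.5},
    {"score": 0.5, "exposure": 1.0, "claims": 0, "predicted_freq": 0.25},
    {"score": 0.9, "exposure": 2.0, "claims": 1, "predicted_freq": 0.5},
]
routes = [underwriting_route(p["score"]) for p in policies]
results = monitor(policies)
print(routes)
print(results)
```

Tracking actual against predicted frequency separately for each group gives a running check on whether the model performs differently for policyholders who resemble the modeling data less closely.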