Vehicle Ratemaking: Vehicles Need Class Too
CAS Predictive Modeling Seminar, September 2005
Presented by: Rich Moncher – Bristol West; Tom Hettinger – EMB America
2 Vehicle Ratemaking
OUTLINE
- Background
- Vehicle Estimator
  - Initial Estimator
  - Diagnostics
  - Tools
- Vehicle Symbols
- Symbol Relativities
- Summary
PURPOSE: To discuss techniques for performing vehicle symbol analysis within the context of a multivariate framework, including proper tools and diagnostics.
6 Potential for Adverse Selection?
10 Vehicle Classification/Relativity Analysis
Vehicle is critical: it is a major risk driver and accounts for much of the variation in rates.
Two elements:
- Symbols
- Relativities (both in terms of model year and symbol)
Historically, the focus has been on relativities and vehicle age, BUT not on symbols:
- Initial symbols are based on limited data, competitors, bureaus, and judgment.
- Regular reviews of relativities: model year, symbols.
11 Why a Symbol Review?
- Reduce your reliance on third parties.
- Produce assignments you can understand and explain internally.
- Remove potential bias due to inaccurate assignments.
- Customize to meet the experience of your book.
Why does one company up-charge a Ford Taurus after its initial assignment while another down-charges it?
- Difference in underlying books?
- Difference in methodologies?
12 Why a Symbol Review?
Develop better initial assignments. For example, electronic stability control systems (ESCS):
- “The safety agency credited much of the reduction in SUV rollover risk to the increasing availability of electronic stability control systems on SUVs.” – Wall Street Journal
- “The systems are sometimes offered as standard equipment or, as an option, cost several hundred dollars.” – Wall Street Journal
Could two versions of a new SUV (one with and one without ESCS) that both fall into an original-cost-new symbol of 22 ($40K-$45K) be different risks?
If your experience showed that ESCS-enabled vehicles cost x% less to insure, how would you initially assign the symbol?
13 The Analysis
Use statistically credible techniques to develop the most appropriate symbol assignments and relativities.
- Utilize GLM techniques.
- Utilize smoothing, credibility weighting, and clustering techniques.
- Allow for user interaction.
Get the most out of the company’s own data.
- How is the company’s vehicle experience different from other companies’ or rating agencies’ underlying databases?
- How can known vehicle characteristics help you understand the loss potential where data is thin?
15 Issues with Classifying Vehicles
High dimensionality
- Symbol analysis requires a large number of small vehicle units (VINs) as building blocks.
- VINs are the building blocks of vehicle rating, and each has little to no experience.
- Most companies use only two types of classification for vehicles: model year and symbol.
16 Issues with Classifying Vehicles
High correlation
- Vehicle tends to be highly correlated with other rating variables (e.g., deductible, location, age, and limit).
- A multivariate framework is required to handle highly correlated variables.
17 Purpose of Predictive Modeling
To predict a response variable using a series of explanatory variables (or rating factors).
Inputs to the statistical model:
- Dependent/Response: Losses, Claims, Retention
- Independent/Predictors: Age, Symbols, Limit, Model Year, Territory, Credit Score
- Weights: Claims, Exposures, Premium
Model results:
- Parameters
- Validation statistics
18 Predictive Modeling
Response Variable = Systematic Component + Random Component
- Signal (systematic component): function of the rating factors/predictors
- Noise (random component): reflects the stochastic process
Model complexity (number of parameters) runs from the overall mean, through the “best” model, to 1 parameter for each observation.
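The signal/noise split can be written in the usual GLM form. This is a sketch in standard notation rather than a formula from the slides: the symbols y_i, mu_i, epsilon_i, g, x_i, and beta are assumptions introduced here, and the log link is only one common choice.

```latex
\[
  y_i = \mu_i + \varepsilon_i ,
  \qquad
  g(\mu_i) = \mathbf{x}_i^{\top}\boldsymbol{\beta}
\]
% With a log link, the systematic component becomes multiplicative,
% so each rating factor level acts as a relativity:
\[
  \mu_i = \exp\Bigl(\beta_0 + \sum_{j} \beta_j x_{ij}\Bigr)
        = e^{\beta_0} \prod_{j} e^{\beta_j x_{ij}}
\]
```

Here mu_i is the systematic component (signal) and epsilon_i the random component (noise). Adding parameters to beta moves the model from the overall mean toward one parameter per observation.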
19 Vehicle Symbol/Relativity Analysis
Vehicle symbol/relativity analysis is a multi-stage process. How do you isolate the signal from the data? Many techniques are available.
Vehicle Level Data
- Problems
  - Noisy
  - Limited data
Initial Vehicle Estimator
- Choices to be made
  - Raw
  - Standardized
- Isolate the signal
  - By coverage
  - Frequency and severity determined separately
  - Residual corrections
  - Dimensionally smoothed
Final Vehicle Risk Estimator
20 Vehicle Symbol/Relativity Analysis
Vehicle symbol/relativity analysis is a multi-stage process. How do you group the data? Many techniques are available.
Vehicle Symbol Assignments
- Estimators combined.
- Overall estimators clustered to form symbols.
Symbol Relativities
- Relativities calculated for each symbol.
21 Determining the Vehicle Risk Estimator
A variety of methods is required to decipher the different risk drivers by coverage.
- For example, weight impacts BI, Med, and Collision differently.
- This helps us better understand and explain differences.
- It can also help in creating better symbol assignments in the future.
Recommend determining the estimator at the granular level (i.e., frequency and severity by coverage).
Initial Vehicle Estimator -> Final Vehicle Risk Estimator
22 Determining the Vehicle Risk Estimator
The output of this stage is a risk estimator for each vehicle. Determination of the final risk estimator is an iterative process involving:
- GLM
- TOOLS: Dimensional Smoothing, Credibility Weighting, Residual Correction
- DIAGNOSTICS: GLM Tests, Hold-Out Samples, Residual Analysis, P-Values
Initial Vehicle Estimator -> Final Vehicle Risk Estimator
23 Determining the Vehicle Risk Estimator
Where do we start? The analyst has choices for the initial estimator.
Initial Vehicle Estimator -> Final Vehicle Risk Estimator
25 Observed Data
Given the following rating factors:
- Age (a)
- Sex (s)
- Limit (l)
- Vehicle: VIN (v)
Then the initial estimate for the i-th VIN is computed from the observed data for that VIN.
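One way to read a raw (observed) initial estimator is as the claim frequency actually experienced by each VIN. The slide's own formula is not reproduced in this text, so the sketch below is an assumption: the function name `raw_vin_frequency`, the record layout `(vin, claims, exposure)`, and the toy data are all hypothetical.

```python
from collections import defaultdict


def raw_vin_frequency(records):
    """Observed claim frequency per VIN: total claims / total exposure.

    `records` is an iterable of (vin, claims, exposure) tuples.
    This layout is an assumption made for illustration.
    """
    claims = defaultdict(float)
    exposure = defaultdict(float)
    for vin, n_claims, expo in records:
        claims[vin] += n_claims
        exposure[vin] += expo
    # VINs with no exposure are skipped to avoid dividing by zero.
    return {vin: claims[vin] / exposure[vin] for vin in claims if exposure[vin] > 0}


# Toy data: VIN_A has 1 claim on 4 car-years, VIN_B has no claims on 0.5 car-years.
records = [("VIN_A", 1, 2.0), ("VIN_A", 0, 2.0), ("VIN_B", 0, 0.5)]
raw = raw_vin_frequency(records)
```

The estimate for VIN_B rests on half a car-year, which illustrates why a raw estimator of this kind is noisy at VIN level.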
26 Standardized Observed Data
Include basic vehicle factors within the GLM so that standard policy factors are captured correctly.
Data -> GLM:
- Standard policy factors: Policyholder Age, Policyholder Sex, Territory, Limit/Deductible, …
- Vehicle factors: Current Symbols, Make/Model Categories, VIN Groupings
- Residuals
Vehicle factors and residuals -> Final Vehicle Factors
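A simplified way to picture "standardizing" observed data is an actual-to-expected ratio per VIN, where the expected claims come from the non-vehicle factors only. This is a sketch under stated assumptions, not the presenters' procedure: the multiplicative factor table, the base frequency, and the names `standardized_observed`, `policy_factors`, and `base_frequency` are hypothetical, and in practice the factors would come from the fitted GLM.

```python
from collections import defaultdict


def standardized_observed(records, policy_factors, base_frequency):
    """Actual / expected claims per VIN, with 'expected' built from policy factors only.

    Each record is a dict with keys 'vin', 'claims', 'exposure', and 'levels'
    (a mapping from factor name to level). This layout is an assumption.
    """
    actual = defaultdict(float)
    expected = defaultdict(float)
    for rec in records:
        relativity = 1.0
        for factor, level in rec["levels"].items():
            relativity *= policy_factors[factor][level]
        actual[rec["vin"]] += rec["claims"]
        expected[rec["vin"]] += base_frequency * relativity * rec["exposure"]
    return {vin: actual[vin] / expected[vin] for vin in actual if expected[vin] > 0}


# Hypothetical policy relativities (stand-ins for GLM output).
policy_factors = {"age": {"young": 2.0, "adult": 1.0}}
records = [
    {"vin": "VIN_A", "claims": 1, "exposure": 5.0, "levels": {"age": "young"}},
    {"vin": "VIN_A", "claims": 1, "exposure": 10.0, "levels": {"age": "adult"}},
    {"vin": "VIN_B", "claims": 1, "exposure": 5.0, "levels": {"age": "adult"}},
]
ratios = standardized_observed(records, policy_factors, base_frequency=0.1)
```

In this toy data VIN_A's claims are fully explained by its driver mix (ratio near 1), while VIN_B shows twice the expected claims, which is the vehicle effect left over after policy factors are removed.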
28 Standardized Fitted Data
Directly model vehicle estimators within the GLM.
Data -> GLM:
- Standard policy factors: Policyholder Age, Policyholder Sex, …
- Vehicle factors: Vehicle Age, Vehicle Group, Body Data, Performance Data, Crash/Theft Data
- Residuals
Vehicle factors and residuals -> Final Vehicle Factors
29 Standardized Fitted Data with External Data
Performance and body data differentiate among unique VINs:
- Transmission, curb weight, wheelbase, power, torque, engine size, speed, braking distance, turning circle, etc.
Examples:
- As curb weight increases, Property Damage severity increases.
- As curb weight increases, Collision severity decreases.
30 Differentiating a VIN
VIN components: Origin, Make, Vehicle Series, Body Style, Engine Emission, Check Figure, Year, Factory Code, Serial Number
33 Standardized Fitted Data
Redefine the vehicle unit into meaningful concepts.
Given: Age, Sex, Limit, VIN
Then VIN is replaced by:
- Current Symbol
- VIN Cluster
- Body Data
- Performance Data
- Crash/Theft Data
34 Determining the Vehicle Risk Estimator
Need to determine how well the vehicle estimator is performing. A variety of diagnostics can be used.
DIAGNOSTICS: GLM Tests, Hold-Out Samples, Residual Analysis, P-Values
Initial Vehicle Estimator -> Final Vehicle Risk Estimator
35 Hold-Out Samples
- Split data into “Training” and “Test”.
- Create groupings/estimators with the “Training” data.
- Examine “Test” data to see how well groupings perform.
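The three hold-out steps above can be sketched as follows. The 30% test fraction, the fixed seed, the record layout `(group, claims, exposure)`, and the helper names are assumptions chosen for illustration; a real analysis might instead split by policy or by time period.

```python
import random


def split_train_test(records, test_fraction=0.3, seed=0):
    """Randomly split records into (train, test) lists; the seed keeps it reproducible."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def frequency_by_group(records):
    """Claim frequency (claims / exposure) for each group label."""
    claims, exposure = {}, {}
    for group, n_claims, expo in records:
        claims[group] = claims.get(group, 0.0) + n_claims
        exposure[group] = exposure.get(group, 0.0) + expo
    return {g: claims[g] / exposure[g] for g in claims if exposure[g] > 0}


# Toy data: each record is (group assigned to the vehicle, claims, exposure).
records = (
    [("low", 0, 1.0)] * 8 + [("low", 1, 1.0)] * 2
    + [("high", 1, 1.0)] * 5 + [("high", 0, 1.0)] * 5
)
train, test = split_train_test(records)
train_freq = frequency_by_group(train)  # estimators built on "Training"
test_freq = frequency_by_group(test)    # checked against "Test"
```

If the groupings are predictive, the ordering of groups by frequency in `train_freq` should tend to persist in `test_freq`. With only a handful of toy records that comparison is noisy.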
36 P-Values
p-value = probability, under the model, of a frequency at least as extreme as that observed.
- Under the null hypothesis, the p-values should be uniformly spread over [0,1].
- Departures from a uniform spread indicate over-fitting or under-fitting.
- Assume the smoothed statistic is the underlying frequency in each zip code.
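A minimal sketch of this diagnostic, assuming claim counts are Poisson with the modeled (smoothed) expected count as the mean. The Poisson assumption, the one-sided upper tail, and the function names are choices made here, not taken from the slides. Because counts are discrete, the p-values are only approximately uniform even under a correct model.

```python
import math


def poisson_upper_tail(observed_claims, expected_claims):
    """P(N >= observed_claims) for N ~ Poisson(expected_claims)."""
    cdf_below = 0.0
    term = math.exp(-expected_claims)  # P(N = 0)
    for k in range(observed_claims):
        cdf_below += term                  # accumulate P(N = k)
        term *= expected_claims / (k + 1)  # step to P(N = k + 1)
    return max(0.0, 1.0 - cdf_below)


def decile_counts(p_values):
    """Histogram of p-values in ten equal bins over [0, 1]."""
    counts = [0] * 10
    for p in p_values:
        counts[min(int(p * 10), 9)] += 1
    return counts


# Toy units: (observed claims, modeled expected claims).
units = [(0, 0.5), (1, 0.5), (3, 1.2)]
p_values = [poisson_upper_tail(n, mu) for n, mu in units]
histogram = decile_counts(p_values)
```

On a real book, a roughly flat `histogram` is consistent with the null hypothesis, while a strongly uneven one suggests the fit should be revisited.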
37 Residual Analysis
Standardize the data for all factors to see if there is any systematic residual variation.
- Derive the residuals for each VIN.
- Apply multi-dimensional smoothing methods to aid interpretation:
  - Principal components
  - Residual scoring
- Look for systematic patterns in the residuals:
  - Multidimensional residual plots using the VIN characteristics
38 Determining the Vehicle Risk Estimator
Tests will indicate which tools the analyst should consider. A variety of tools is needed to handle different situations.
TOOLS: Dimensional Smoothing, Credibility Weighting, Residual Correction
Initial Vehicle Estimator -> Final Vehicle Risk Estimator
39 Standardized Fitted Data
Diagram: Data -> GLM -> Standard Policy Factors, Vehicle Factors, Residuals
Model complexity (number of parameters):
- Overall mean (underfit)
- “Best” model
- 1 parameter for each observation (overfit)
40 Standardized Fitted Data
Model complexity (number of parameters) runs from the overall mean, through the “best” model, to 1 parameter for each observation.
TOOLS
- Underfit: enhance GLM, credibility-weight, residual correction
- Overfit: revisit GLM, smoothing, residual correction
41 Dimensional Smoothing
Uses knowledge of similar vehicles to enhance estimates of the underlying risk.
- Similarity characteristics are based on the parameters from the GLM.
- Essentially applies dimension-reduction techniques to the VIN characteristics to form a single continuous variable.
- Similar to scoring routines.
- Variates can then be applied to the scores to smooth the estimate.
42 Residual Correction Factors
Check residuals for underlying systematic patterns.
- Ideally, enhance the underlying GLM to better explain the data.
- Alternatively:
  - Band the residuals via smoothing and clustering.
  - Estimate a correction factor.
This effectively creates a new external factor to explain the vehicle residual effect.
43 Credibility Weighting
Can employ standard credibility weighting techniques:
Z * Primary Estimator + (1 - Z) * Secondary Estimator
May want to control the amount of credibility weighting via max/min credibility constraints.
Diagram: two Data -> GLM flows (Standard Policy Factors, Vehicle Factors, Residuals), labelled Standardized Fitted (Underfit) and Standardized Observed.
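The blend above, with max/min constraints on Z, can be sketched as follows. The slide does not say how Z is derived. The form Z = n / (n + k) is one standard choice and is an assumption here, as are the parameter names and the example values.

```python
def credibility(exposure, k, z_min=0.0, z_max=1.0):
    """Credibility factor Z = exposure / (exposure + k), clamped to [z_min, z_max].

    The n / (n + k) form and the constant k are assumptions for illustration.
    """
    z = exposure / (exposure + k)
    return min(max(z, z_min), z_max)


def credibility_weighted(primary, secondary, z):
    """Z * primary estimator + (1 - Z) * secondary estimator."""
    return z * primary + (1.0 - z) * secondary


# A VIN with 300 car-years: uncapped Z would be 0.75, but the cap holds it at 0.7.
z = credibility(exposure=300.0, k=100.0, z_min=0.1, z_max=0.7)
blended = credibility_weighted(primary=1.40, secondary=1.10, z=z)
```

The floor `z_min` guarantees that even a VIN with almost no data keeps some weight on its primary estimator, and the cap `z_max` keeps some weight on the secondary estimator however much data there is.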
44 Determining the Vehicle Symbol Assignment
It is impractical to have a symbol assignment for each and every vehicle.
Use techniques to identify similar risk estimators to be grouped, creating a manageable number of symbol assignments.
- Many choices are available to do this.
- Let statistics help you choose.
- Not practical to do in a GLM/tree/other environment.
Final Vehicle Risk Estimator -> Vehicle Symbol Assignments
45 Creating New Symbol
Combine component estimators to determine a risk measure for each vehicle for use in building symbols:
- BI Frequency, BI Severity -> BI Estimator
- PD Frequency, PD Severity -> PD Estimator
- Comp Frequency, Comp Severity -> Comp Estimator
- Coll Frequency, Coll Severity -> Coll Estimator
Vehicle risk estimators are clustered to form symbols.
Coverage estimators can be further combined if one set of symbols is desired for multiple coverages.
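One plausible way to combine the components is to multiply frequency and severity relativities within a coverage, then take a weighted average across coverages. Both rules, the weights (which might be based on premium or loss share), and all numbers below are assumptions for illustration. The slides do not specify the combination rule.

```python
def coverage_estimator(frequency_rel, severity_rel):
    """Loss-cost relativity for one coverage: frequency x severity (assumed rule)."""
    return frequency_rel * severity_rel


def combined_estimator(coverage_estimators, coverage_weights):
    """Weighted average of coverage estimators, for one set of symbols across coverages."""
    total_weight = sum(coverage_weights[c] for c in coverage_estimators)
    weighted_sum = sum(
        coverage_estimators[c] * coverage_weights[c] for c in coverage_estimators
    )
    return weighted_sum / total_weight


# Hypothetical (frequency, severity) relativities for one vehicle.
components = {
    "BI": (1.10, 0.95),
    "PD": (1.05, 1.20),
    "Comp": (0.90, 1.00),
    "Coll": (1.00, 1.30),
}
by_coverage = {c: coverage_estimator(f, s) for c, (f, s) in components.items()}
weights = {"BI": 0.3, "PD": 0.2, "Comp": 0.1, "Coll": 0.4}  # hypothetical shares
overall = combined_estimator(by_coverage, weights)
```

Keeping `by_coverage` alongside `overall` lets symbols be built either separately by coverage or once for several coverages combined.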
46 Clustering
Clustering is used to produce groupings that are predictive of the future:
- Minimize within-group heterogeneity.
- Maximize cross-group heterogeneity.
Commonly used clustering methods:
- Quantiles
- Equal Weight
- Similarity Methods: Average Linkage, Centroid, Ward's
47 Clustering Methodologies
Quantiles
- Create groups with equal numbers of observations.
Equal Weight
- Create groups which have an equal amount of weight.
Similarity Methods
- Rank the data set by the statistic you wish to cluster.
- Decide which pair of records is the ‘most similar.’
- Group these records.
- Repeat until left with the desired number of groups.
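The quantile and similarity approaches can be sketched in a few lines. The similarity rule used here (merge the two adjacent groups whose weighted means are closest) is a centroid-style simplification chosen for illustration. It is not Ward's or average linkage as named in the slides, and the function names and toy estimates are hypothetical. Weights are assumed to be positive.

```python
def quantile_groups(values, n_groups):
    """Split the ranked values into groups with (nearly) equal numbers of observations."""
    ordered = sorted(values)
    n = len(ordered)
    bounds = [round(i * n / n_groups) for i in range(n_groups + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(n_groups)]


def similarity_groups(estimates, weights, n_groups):
    """Rank the estimates, then repeatedly merge the most similar adjacent pair of groups."""
    ranked = sorted(zip(estimates, weights))
    groups = [[(e, w)] for e, w in ranked]

    def weighted_mean(group):
        return sum(e * w for e, w in group) / sum(w for _, w in group)

    while len(groups) > n_groups:
        gaps = [
            abs(weighted_mean(groups[i + 1]) - weighted_mean(groups[i]))
            for i in range(len(groups) - 1)
        ]
        i = gaps.index(min(gaps))                  # the 'most similar' pair
        groups[i:i + 2] = [groups[i] + groups[i + 1]]  # group these records
    return [[e for e, _ in g] for g in groups]


# Hypothetical vehicle risk estimators with equal weights.
estimates = [0.80, 0.82, 1.00, 1.02, 1.50, 1.55]
by_quantile = quantile_groups(estimates, 3)
by_similarity = similarity_groups(estimates, [1.0] * 6, 3)
```

On this toy data both methods return the same three pairs. They diverge when estimates are unevenly spaced, because quantiles fix the group sizes while the similarity rule follows the gaps in the data.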
49 Determining New Symbol Relativities
Fit a GLM using data grouped by the new vehicle symbols.
Test relativities using standard GLM tests:
- Predictive in the GLM
- Consistent over time in the GLM
- Predictive when tested against other data
Refine symbols/relativities as appropriate:
- Incorporate rules-based restrictions.
- Apply actuarial knowledge.
- Investigate “neighbors” with very different relativities.
Vehicle Symbol Assignments -> Symbol Relativities
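As one possible reading of the "neighbors" check, the sketch below flags adjacent symbol numbers whose fitted relativities differ by more than a chosen ratio. Treating adjacent symbol numbers as neighbors, the 1.25 threshold, the function name, and the example relativities are all assumptions. Relativities are assumed to be positive.

```python
def flag_neighbor_jumps(relativities, max_ratio=1.25):
    """Return (lower, upper) pairs of adjacent symbols whose relativities differ
    by more than max_ratio in either direction."""
    flagged = []
    symbols = sorted(relativities)
    for lower, upper in zip(symbols, symbols[1:]):
        a, b = relativities[lower], relativities[upper]
        if max(a, b) / min(a, b) > max_ratio:
            flagged.append((lower, upper))
    return flagged


# Hypothetical fitted relativities by symbol number.
relativities = {1: 0.80, 2: 0.85, 3: 1.20, 4: 1.15}
flagged = flag_neighbor_jumps(relativities)
```

Here only the step from symbol 2 to symbol 3 is flagged for investigation. A rules-based restriction could be layered on afterwards, for example to enforce a monotonic pattern.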
50 Vehicle Rating - Overview
Accurate estimation of the underlying risk associated with a vehicle is a three-stage process.
Step 1: Obtain a separate estimator by claim type and by frequency and severity for each VIN building block. Combine estimators, as appropriate.
- BI: Frequency Estimator, Severity Estimator -> BI Estimator
- PD: Frequency Estimator, Severity Estimator -> PD Estimator
- Comp: Frequency Estimator, Severity Estimator -> Comp Estimator
- Coll: Frequency Estimator, Severity Estimator -> Coll Estimator
Step 2: Cluster vehicle building blocks to develop symbols, separately by coverage or for several coverages combined. (-> Vehicle Symbols)
Step 3: Determine by-coverage relativities for each symbol group. (-> Symbol Relativities)
51 Summary
- Vehicle is a major driver of risk, so it is critical that companies review symbol assignments and relativities regularly.
- Symbol analysis poses special challenges:
  - High dimensionality
  - Heavy correlation with other rating variables
- Vehicle symbol analysis requires a range of different approaches and tools (as there are different loss drivers by coverage).
- Diagnostics are needed to ensure the best model possible.