Mining Insurance Data to Promote Traffic Safety and Better Match Rates to Risk
2002 CAS Seminar on Ratemaking
Greg Hayward
A Photo of the Data Mart
The Data Mart Opens New Opportunities!!
- 37 terabytes of data storage capacity (3.7 x 10^13 bytes)
Data Mart Issues
- Data: what, when, how often, how long
- Data dictionary: source, codes, edits, validation
- Optimize access speeds
- Quality control
Main Data Mart Menu
A Simple Data Mining Illustration
- Select some variables to explore
- Select some target variables
- Query the Data Mart
- Download the data into an Excel Pivot Table
- Let’s look at some real data in Excel
Step-Wise Regression Identify Transform Search Explore Quality Control Develop Derive Run Regression
Data Mart Provides for More Sophisticated Analysis
- Interactive Multi-Variable Rate Factor Analysis
Concept
- The multi-variable approach produces indicated factors for all rating variables simultaneously, rather than producing each rating variable's set of factors one after another
- The multi-variable approach takes into account the interaction between the different rating variables
Method
- Collect detailed data from the Data Mart
- Calculate single-way indicated factors for each rating variable
- Compare the single-way indicated factor to the current factor and use this to adjust the premium for each combination of rating variables
Method (Cont’d)
- Once we have the adjusted premium for each combination of rating variables, sum up the premium for each specific rating variable and recalculate the indicated factor for that variable
- Keep repeating the steps above until you have equal loss ratios for each rating variable
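The iteration described in the Method slides can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the presenter's actual implementation: the function name `balance_loss_ratios`, the data layout, and the sample figures are all hypothetical, and every level is assumed to have positive premium and positive losses. The returned adjustments are multipliers relative to the current factors, so an indicated factor would be the current factor times its adjustment.

```python
from collections import defaultdict


def balance_loss_ratios(cells, max_iter=200, tol=1e-10):
    """Rescale premium until every level of every rating variable
    shows the same loss ratio.

    cells: list of (levels, premium, loss); levels is a tuple with one
    level per rating variable. Assumes positive premium and losses.
    Returns (adjustments, premium): one dict of cumulative multipliers
    per rating variable, and the adjusted premium per cell.
    """
    n_vars = len(cells[0][0])
    premium = [p for _, p, _ in cells]
    total_loss = sum(loss for _, _, loss in cells)
    adjustments = [defaultdict(lambda: 1.0) for _ in range(n_vars)]

    for _ in range(max_iter):
        worst = 0.0
        for v in range(n_vars):
            # Sum adjusted premium and losses by level of variable v.
            prem_by_level = defaultdict(float)
            loss_by_level = defaultdict(float)
            for (levels, _, loss), prem in zip(cells, premium):
                prem_by_level[levels[v]] += prem
                loss_by_level[levels[v]] += loss
            overall_lr = total_loss / sum(premium)
            # Single-way indicated change: level loss ratio vs. overall.
            change = {
                lvl: (loss_by_level[lvl] / prem_by_level[lvl]) / overall_lr
                for lvl in prem_by_level
            }
            for lvl, c in change.items():
                adjustments[v][lvl] *= c
                worst = max(worst, abs(c - 1.0))
            # Adjust the premium of every combination of rating variables.
            premium = [
                p * change[levels[v]]
                for (levels, _, _), p in zip(cells, premium)
            ]
        if worst < tol:  # loss ratios are equal within tolerance
            break
    return [dict(a) for a in adjustments], premium


# Hypothetical data: (age group, territory), premium, loss.
cells = [
    (("young", "urban"), 120.0, 90.0),
    (("young", "rural"), 80.0, 60.0),
    (("adult", "urban"), 180.0, 110.0),
    (("adult", "rural"), 220.0, 100.0),
]
adjustments, adjusted_premium = balance_loss_ratios(cells)
```

Each pass preserves total premium, because the adjusted premium for a level equals its losses divided by the overall loss ratio. The loop stops once no level needs a further change, which is the "equal loss ratios" stopping rule from the slide.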
Putting the Technology to Use
- Vehicle Safety
- Dangerous Intersections
- Child Passenger Safety
- Teenage Driver Safety
Vehicle Safety Discount
- Collect data for Personal Injury Protection and Medical Payments Coverages for each make, model and body style of vehicle using Vehicle Identification Numbers (VIN)
- Loss ratios at uniform rate levels are calculated for each vehicle type by model year and compared to the average loss ratio for all vehicles of that particular model year to calculate an index
Vehicle Safety Discount (Cont’d)
- The resulting Vehicle Safety discounts assigned to each make, model and body style of vehicle were developed from real-world data on how well all the safety features of the vehicle, in combination, protected the occupants
- Collecting data in this detail allows us to distribute premiums more equitably among our policyholders and to share this information with the auto manufacturers to promote safer vehicles
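The index calculation behind the Vehicle Safety Discount might look something like the sketch below. It is only an illustration: the function name `safety_index`, the record layout, and the sample figures are hypothetical, and the "average loss ratio for all vehicles" of a model year is assumed here to be the premium-weighted aggregate (total losses over total premium), which the slides do not specify.

```python
from collections import defaultdict


def safety_index(records):
    """records: list of (vehicle_type, model_year, premium, loss), with
    premium already restated at uniform rate levels.
    Returns {(vehicle_type, model_year): index}; an index below 1.0 means
    better-than-average loss experience for that model year.
    """
    prem_year = defaultdict(float)
    loss_year = defaultdict(float)
    prem_vehicle = defaultdict(float)
    loss_vehicle = defaultdict(float)
    for vehicle, year, premium, loss in records:
        prem_year[year] += premium
        loss_year[year] += loss
        prem_vehicle[(vehicle, year)] += premium
        loss_vehicle[(vehicle, year)] += loss

    index = {}
    for (vehicle, year), premium in prem_vehicle.items():
        # Assumption: model-year average is the aggregate loss ratio.
        year_lr = loss_year[year] / prem_year[year]
        vehicle_lr = loss_vehicle[(vehicle, year)] / premium
        index[(vehicle, year)] = vehicle_lr / year_lr
    return index


# Hypothetical vehicle types and amounts, for illustration only.
records = [
    ("Sedan A", 2000, 1000.0, 500.0),
    ("SUV B", 2000, 1000.0, 700.0),
]
index = safety_index(records)
```

With these made-up figures the model-year loss ratio is 0.6, so "Sedan A" indexes below 1.0 and "SUV B" above it. How an index maps to a discount level is not stated in the slides and is left out here.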
Dangerous Intersections
- Develop a query to sort the millions of claims by accident location
- Focus on the intersections with the most claims to search for additions and deletions
- Score each claim based on the property damage severity and whether the crash involved injury
- Adjust the score to a common baseline using the company’s market share of vehicles insured (to ZIP Code level)
- Rank order the intersections by score
- Share results with the public, community leaders, and traffic safety engineers
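The scoring and ranking steps for dangerous intersections could be sketched as below. The slides do not give the actual scoring scale, so the damage thresholds, point values, and injury bonus are purely hypothetical, as are the function names, field names, and sample data. Dividing by market share is one assumed way to adjust to a common baseline.

```python
from collections import defaultdict


def claim_score(property_damage, injury):
    """Hypothetical scale: points for damage severity plus an injury bonus."""
    if property_damage >= 5000:
        points = 3
    elif property_damage >= 1000:
        points = 2
    else:
        points = 1
    return points + (2 if injury else 0)


def rank_intersections(claims, market_share_by_zip):
    """Sum market-share-adjusted claim scores by intersection and return
    (intersection, score) pairs, highest score first.
    """
    totals = defaultdict(float)
    for claim in claims:
        share = market_share_by_zip[claim["zip"]]
        score = claim_score(claim["property_damage"], claim["injury"])
        # Assumption: dividing by the company's share of insured vehicles
        # in the ZIP Code scales the score up to an all-insurer baseline.
        totals[claim["intersection"]] += score / share
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical claims and market shares, for illustration only.
claims = [
    {"intersection": "Main & 1st", "zip": "ZIP_A",
     "property_damage": 6000, "injury": True},
    {"intersection": "Main & 1st", "zip": "ZIP_A",
     "property_damage": 500, "injury": False},
    {"intersection": "Oak & Elm", "zip": "ZIP_B",
     "property_damage": 1500, "injury": False},
]
market_share_by_zip = {"ZIP_A": 0.25, "ZIP_B": 0.10}
ranked = rank_intersections(claims, market_share_by_zip)
```

The adjustment matters in this toy data: "Oak & Elm" has only one modest claim, but because the company's share in that ZIP Code is small, its adjusted score comes close to that of "Main & 1st".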
Child Passenger Safety
- Largest single research project devoted exclusively to pediatric motor vehicle injury
- Partnered with the Children’s Hospital of Philadelphia
- Multi-disciplinary research: insurance, medicine, biomechanics, engineering, health education, behavioral science
- Combines our insurance data with in-depth interviews, on-site crash investigations, and computer crash simulations
- Share results with parents, medical providers, auto manufacturers, and NHTSA
Teenage Drivers
- Attempt to study experience and attitude
- Experience: time, practice, and exposure to driving situations
- Attitude: patience, not being aggressive, not taking unsafe risks, not becoming distracted
- Worked with American Driver and Traffic Education Association
- Put Teenage Drivers through agent/parent/driver team program
- Mine the data from the experiment in great detail