Predictive Modeling for Property-Casualty Insurance

Predictive Modeling for Property-Casualty Insurance
James Guszcza, FCAS, MAAA
Peter Wu, FCAS, MAAA
SoCal Actuarial Club, LAX
September 22, 2004

Predictive Modeling: 3 Levels of Discussion
- Strategy: profitable growth; retain the most profitable policyholders
- Methodology: model design (actuarial); modeling process
- Technique: GLM vs. decision trees vs. neural nets…

Methodology vs. Technique
How does data mining need actuarial science?
- Variable creation
- Model design
- Model evaluation
How does actuarial science need data mining?
- Advances in computing and modeling techniques
- Ideas from other fields can be applied to insurance problems

Semantics: DM vs. PM
One connotation: Data Mining (DM) is about knowledge discovery in large industrial databases
- Data exploration techniques (some brute force)
- e.g. discover the strength of credit variables
Predictive Modeling (PM) applies statistical techniques (like regression) after the knowledge discovery phase is completed
- Quantify & synthesize relationships found during knowledge discovery
- e.g. build a credit model

Strategy: Why do Data Mining? Think Baseball!

Bay Area Baseball
In 1999 Billy Beane (general manager of the Oakland Athletics) found a novel use for data mining.
- Not a wealthy team: ranked 12th (out of 14) in payroll
- How to compete with rich teams?
Beane hired a statistics whiz to analyze statistics advocated by baseball guru Bill James, and was able to hire excellent players undervalued by the market.
A year after Beane took over, the A's ranked 2nd!

Implication
Beane quantified how well a player would do: not perfectly, just better than his peers.
Be on the lookout for fields where an expert is required to reach a decision by judgmentally synthesizing quantifiable information across many dimensions. (Sound like insurance underwriting?)
Maybe a predictive model can beat the pro.

Example
Who is worse? And by how much?
- A 20-year-old driver with 1 minor violation who pays his bills on time and was written by your best agent
- A mature driver with a recent accident who has paid his bills late a few times
Unlike the human, the algorithm knows how much weight to give each dimension.
Classic PM strategy: build underwriting models to achieve profitable growth.

Keeping Score: Baseball vs. Insurance
- Billy Beane → CEO who wants to run the next Progressive
- Beane's scouts → Underwriter
- Potential team member → Potential insured
- Bill James' stats → Predictive variables, old or new (e.g. credit)
- Billy Beane's number cruncher → You! (or people on your team)

What is Predictive Modeling?

Three Concepts
- Scoring engines: a "predictive model" by any other name…
- Lift curves: how much worse than average are the policies with the worst scores?
- Out-of-sample tests: how well will the model work in the real world? An unbiased estimate of predictive power.

Classic Application: Scoring Engines
A scoring engine is a formula that classifies or separates policies (or risks, accounts, agents…) into profitable vs. unprofitable, retaining vs. non-retaining…
- A (non-)linear equation f( ) of several predictive variables
- Produces a continuous range of scores: score = f(X1, X2, …, XN)

What "Powers" a Scoring Engine?
score = f(X1, X2, …, XN)
The X1, X2, …, XN are as important as the f( )!
- This is why actuarial expertise is necessary
- A large part of the modeling process consists of variable creation and selection
- It is usually possible to generate hundreds of variables
- This is the steepest part of the learning curve
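To make the idea concrete, here is a minimal sketch of a scoring engine in Python. The variable names and coefficients are hypothetical; in a real engine, f( ) comes from a fitted model, not hand-set weights.

```python
def score(driver_age, late_pay_rate, prior_claims):
    """Toy scoring engine of the form score = f(X1, X2, X3).
    Coefficients are made up for illustration; a real engine
    would estimate them from training data."""
    return (0.50
            - 0.004 * driver_age     # hypothetical: older drivers score better (lower)
            + 0.300 * late_pay_rate  # hypothetical: late payments raise the score
            + 0.100 * prior_claims)  # hypothetical: prior claims raise the score
```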

Model Evaluation: Lift Curves
- Sort the data by score
- Break the dataset into 10 equal pieces
- Best "decile": lowest score → lowest loss ratio
- Worst "decile": highest score → highest loss ratio
- The difference is the "lift"
- Lift = segmentation power; lift → ROI of the modeling project
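A minimal sketch of the decile computation, assuming pandas and hypothetical score and loss ratio inputs:

```python
import pandas as pd

def decile_loss_ratios(scores, loss_ratios):
    """Sort policies by score, cut into 10 equal pieces, and compare
    each decile's average loss ratio to the overall average; the spread
    between the worst and best deciles is the lift."""
    df = pd.DataFrame({"score": scores, "lr": loss_ratios})
    ranks = df["score"].rank(method="first")        # break ties deterministically
    df["decile"] = pd.qcut(ranks, 10, labels=False) + 1
    return df.groupby("decile")["lr"].mean() / df["lr"].mean()
```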

Out-of-Sample Testing
- Randomly divide the data into 3 pieces: training data, test data, validation data
- Use the training data to fit models
- Score the test data to create a lift curve
- Perform the train/test steps iteratively until you have a model you're happy with
- During this iterative phase, the validation data is set aside in a "lock box"
- Once the model has been finalized, score the validation data and produce a lift curve: an unbiased estimate of future performance
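A minimal sketch of the three-way split, assuming a pandas DataFrame of policy records; the 60/20/20 proportions are an illustrative choice:

```python
import numpy as np

def three_way_split(df, seed=0, fracs=(0.6, 0.2, 0.2)):
    """Randomly partition the data into training, test, and validation
    pieces; the validation piece stays in the 'lock box' until the end."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(df))
    n_train = int(fracs[0] * len(df))
    n_test = int(fracs[1] * len(df))
    train = df.iloc[idx[:n_train]]
    test = df.iloc[idx[n_train:n_train + n_test]]
    validation = df.iloc[idx[n_train + n_test:]]
    return train, test, validation
```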

Comparison of Techniques
Models built to detect whether an email message is really spam.
- "Gains charts" from several models → analogous to lift curves; good for a binary target
- All techniques work OK!
- Good variable creation is at least as important as the modeling technique.

Credit Scoring is an Example
All of these concepts apply to credit scoring:
- Knowledge discovery in databases (KDD)
- Scoring engine
- Lift curve evaluation → translates to loss ratio improvement → ROI
- Blind-test validation
Credit scoring has been the insurance industry's segue into data mining.

Applications Beyond Credit
- The classic: profitability scoring model (underwriting/pricing applications)
- Retention models
- Elasticity models
- Cross-sell models
- Lifetime value models
- Agent/agency monitoring
- Target marketing
- Fraud detection
- Customer segmentation: no target variable ("unsupervised learning")

Data Sources
- Company's internal data: policy-level records, loss & premium transactions, agent database, billing, VIN…
- Externally purchased data: credit, CLUE, MVR, census…

The Predictive Modeling Process
- Early: variable creation
- Middle: data exploration & modeling
- Late: analysis & implementation

Variable Creation
- Research possible data sources
- Extract/purchase data
- Check data for quality (QA): messy! (still deep in the mines)
- Create predictive and target variables (see the sketch below)
- An opportunity to quantify tribal wisdom… and come up with new ideas
- Can be a very big task, and the steepest part of the learning curve
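For instance, a minimal sketch of creating a target variable and one behavioral predictor; the column names are hypothetical, and real source systems will differ:

```python
import pandas as pd

def add_core_variables(policies: pd.DataFrame) -> pd.DataFrame:
    """Create a target variable and a behavioral predictor from
    hypothetical policy-level columns."""
    out = policies.copy()
    out["loss_ratio"] = out["incurred_loss"] / out["earned_premium"]   # target variable
    out["late_pay_rate"] = out["late_payments"] / out["bills_mailed"]  # behavioral predictor
    return out
```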

Types of Predictive Variables
- Behavioral: historical claim, billing, credit…
- Policyholder: age/gender, # of employees…
- Policy specifics: vehicle age, construction type…
- Territorial: census, weather…

Data Exploration & Variable Transformation
- 1-way analyses of predictive variables
- Exploratory Data Analysis (EDA)
- Data visualization
- Use EDA to cap/transform predictive variables: extreme values, missing values, etc. (see the sketch below)
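A minimal sketch of one common cap-and-fill transformation, assuming pandas; the quantile cutoffs are illustrative, and in practice EDA guides the choice variable by variable:

```python
import pandas as pd

def cap_and_fill(series: pd.Series, lower_q=0.01, upper_q=0.99) -> pd.Series:
    """Cap extreme values at chosen quantiles and fill missing values
    with the median: one simple EDA-driven transformation."""
    lo, hi = series.quantile([lower_q, upper_q])
    return series.clip(lower=lo, upper=hi).fillna(series.median())
```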

Multivariate Modeling
- Examine correlations among the variables (a screen is sketched below)
- Weed out redundant, weak, and poorly distributed variables
- Model design
- Build candidate models: regression/GLM, decision trees/MARS, neural networks
- Select the final model
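One simple way to begin the weeding is a pairwise correlation screen; a minimal sketch, assuming a pandas DataFrame of numeric predictors and an illustrative 0.9 cutoff:

```python
import numpy as np

def correlated_candidates(df, threshold=0.9):
    """Flag one variable from each highly correlated pair as a
    candidate for removal before multivariate modeling."""
    corr = df.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    return [col for col in upper.columns if (upper[col] > threshold).any()]
```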

Building the Model
- Pare down the collection of predictive variables to a manageable set
- Iterative process: build candidate models on the training data, evaluate on the test data
- Many things to tweak: different target variables, different predictive variables, different modeling techniques
- # of NN nodes and hidden layers; tree splitting rules…

Considerations
- Do the signs/magnitudes of the parameters make sense? Are they statistically significant?
- Is the model biased for/against certain types of policies? States? Policy sizes?
- Does its predictive power hold up for large policies?
- Continuity: are there small changes in input values that might produce large swings in scores? Make sure an agent can't game the system (see the sketch below).
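A minimal sketch of one way to probe continuity; the scoring function and feature names are hypothetical:

```python
def score_swing(score_fn, inputs, feature, delta):
    """Nudge one input by a small delta and measure the score change;
    a large swing from a small nudge invites gaming by agents."""
    nudged = dict(inputs)
    nudged[feature] += delta
    return abs(score_fn(nudged) - score_fn(inputs))
```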

Model Analysis & Implementation
- Perform model analytics: necessary for the client to gain comfort with the model
- Calibrate models: create a user-friendly "scale" (the client dictates it)
- Implement models: programming skills are critical here
- Monitor performance: distribution of scores over time, predictiveness, usage of the model…
- Plan model maintenance

Modeling Techniques: Where Actuarial Science Needs Data Mining

The Greatest Hits
Unsupervised (no target variable):
- Clustering
- Principal Components (dimension reduction)
Supervised (predict a target variable):
- Regression → GLM
- Neural Networks
- MARS: Multivariate Adaptive Regression Splines
- CART: Classification And Regression Trees

Regression and its Relations
- GLM: relax regression's distributional assumptions
  - Logistic regression (binary target)
  - Poisson regression (count target)
- MARS & NN: clever ways of automatically transforming and interacting the input variables
  - Why: sometimes the "true" relationships aren't linear
  - Universal approximators: can model any functional form
- CART is simplified MARS
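A minimal sketch of both GLM flavors using statsmodels; the DataFrame and column names (had_claim, claim_count, driver_age, credit_score) are hypothetical:

```python
import statsmodels.api as sm
import statsmodels.formula.api as smf

def fit_glms(policies):
    """Fit a logistic GLM (binary target) and a Poisson GLM (count target)."""
    logit = smf.glm("had_claim ~ driver_age + credit_score",
                    data=policies, family=sm.families.Binomial()).fit()
    poisson = smf.glm("claim_count ~ driver_age + credit_score",
                      data=policies, family=sm.families.Poisson()).fit()
    return logit, poisson
```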

Neural Net Motivation
- Let X1, X2, X3 be three predictive variables: policy age, historical loss ratio, driver age
- Let Y be the target variable: loss ratio
- A NNET model is a complicated, non-linear function φ such that: φ(X1, X2, X3) ≈ Y

In visual terms… [network diagram of inputs, hidden nodes, and output; not reproduced in this transcript]

NNET Lingo
- Green: "input layer"; red: "hidden layer"; yellow: "output layer"
- The {a, b} numbers are "weights" to be estimated
- The network architecture and the weights, together, constitute the model

In more detail… (reconstructing the slide's equations from the surrounding text)
Z1 = 1 / (1 + exp(-(a10 + a11·X1 + a12·X2 + a13·X3)))
Z2 = 1 / (1 + exp(-(a20 + a21·X1 + a22·X2 + a23·X3)))
Y = 1 / (1 + exp(-(b0 + b1·Z1 + b2·Z2)))

In more detail… The NNET model results from substituting the expressions for Z1 and Z2 in the expression for Y.

In more detail… Notice that the expression for Y has the form of a logistic regression. Similarly with Z1, Z2.

In more detail… You can therefore think of a NNET as a set of logistic regressions embedded in another logistic regression.
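To make that concrete, a minimal sketch of the two-hidden-node network written exactly as nested logistic regressions; the weight layout is illustrative, and in practice the a's and b's are estimated by training:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def nnet(x1, x2, x3, a, b):
    """Hidden nodes Z1 and Z2 are logistic regressions on the inputs;
    the output Y is a logistic regression on Z1 and Z2."""
    z1 = sigmoid(a[0] + a[1] * x1 + a[2] * x2 + a[3] * x3)
    z2 = sigmoid(a[4] + a[5] * x1 + a[6] * x2 + a[7] * x3)
    return sigmoid(b[0] + b[1] * z1 + b[2] * z2)
```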

Universal Approximators
The essential idea: by layering several logistic regressions in this way, we can model any functional form, no matter how many non-linearities or interactions among X1, X2, …, simply by varying the number of nodes and training cycles.
This is why NNETs are sometimes called "universal function approximators".

MARS / CART Motivation
- NNETs use the logistic function to combine variables and automatically model any functional form
- MARS uses an analogous clever idea, its "basis functions", to do the same work
- CART can be viewed as simplified MARS: its basis functions are horizontal step functions
- → NNETs, MARS, and CART are all cousins of classic regression analysis
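A minimal sketch of a MARS-style hinge basis function; the knot and coefficients in the comment are illustrative, since a real MARS fit chooses them from the data:

```python
import numpy as np

def hinge(x, knot):
    """MARS hinge basis function: zero on one side of the knot and
    linear on the other; its mirror image max(0, knot - x) covers
    the opposite side."""
    return np.maximum(0.0, x - knot)

# A MARS-flavored fit is then a linear model on hinge terms, e.g. (illustrative):
# y_hat = c0 + c1 * hinge(driver_age, 25) + c2 * np.maximum(0.0, 25 - driver_age)
```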

References
- For beginners: Data Mining Techniques, by Michael Berry & Gordon Linoff
- For mavens: The Elements of Statistical Learning, by Jerome Friedman, Trevor Hastie, and Robert Tibshirani