Predictive Modeling CAS Reinsurance Seminar May 7, 2007 Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc.
Why Predictive Modeling? Better use of data than traditional methods; advanced methods for dealing with messy data are now available
Data Mining Goes Prime Time
Becoming a Popular Tool in All Industries
Real-Life Insurance Application – The “Boris Gang”
The Predictive Modeling Family: Classical Linear Models, GLMs, and Data Mining
A Casualty Actuary’s Perspective on Data Modeling
The Stone Age: 1914 – …
–Simple deterministic methods
–Use of blunt instruments: the analytical analog of bows and arrows
–Often ad hoc
–Slice and dice data
–Based on empirical data – little use of parametric models
The Pre-Industrial Age: …
–Fit probability distributions to model tails
–Simulation models and numerical methods for variability and uncertainty analysis
–Focus is on underwriting, not claims
The Industrial Age: 1985 – …
–Begin to use computer catastrophe models
The 20th Century: 1990 – …
–European actuaries begin to use GLMs
The Computer Age: 1996 – …
–Begin to discuss data mining at conferences
–At the end of the 20th century, large consulting firms start to build data mining practices
The Current Era – a mixture of the above
–In personal lines, modeling is the rule rather than the exception – often GLM based, though GLMs are evolving to GAMs
–Commercial lines are beginning to embrace modeling
Data Quality: A Data Mining Problem – An actuary reviewing a database
CHAID: The First CAS Data Mining Paper (1990)
A Problem: Nonlinear Functions – An Insurance Nonlinear Function: Provider Bill vs. Probability of Independent Medical Exam
Classical Statistics: Regression – Estimate parameters by fitting the line that minimizes the deviation between actual and fitted values
Generalized Linear Models (GLMs) – Relax the normality assumption (exponential family of distributions); model some kinds of nonlinearity
Common Links for GLMs
The identity link: h(Y) = Y
The log link: h(Y) = ln(Y)
The inverse link: h(Y) = 1/Y
The logit link: h(Y) = ln(Y / (1 − Y))
The probit link: h(Y) = Φ⁻¹(Y), where Φ is the standard normal CDF
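As a concrete illustration, the links above can be written out directly. This is a minimal sketch in Python (the function names are my own); the probit link uses the standard library's normal inverse CDF.

```python
import math
from statistics import NormalDist

# Common GLM link functions h(Y) (names are illustrative, not a library API).
def identity(y):
    return y

def log_link(y):
    return math.log(y)

def inverse_link(y):
    return 1.0 / y

def logit(y):
    # maps a probability in (0, 1) onto the whole real line
    return math.log(y / (1.0 - y))

def probit(y):
    # inverse CDF of the standard normal distribution
    return NormalDist().inv_cdf(y)

def inv_logit(eta):
    # inverse of the logit link: recovers the probability
    return 1.0 / (1.0 + math.exp(-eta))
```

For a logit-link GLM, predictions on the probability scale come from applying `inv_logit` to the linear predictor.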
Major Kinds of Data Mining
Supervised learning – the most common situation
–There is a dependent variable: frequency, loss ratio, fraud/no fraud
–Some methods: regression, CART, some neural networks
Unsupervised learning – no dependent variable; group like records together
–A group of claims with similar characteristics might be more likely to be fraudulent
–Examples: territory assignment, text mining
–Some methods: association rules, k-means clustering, Kohonen neural networks
Desirable Features of a Data Mining Method: any nonlinear relationship can be approximated; the method works when the form of the nonlinearity is unknown; the effect of interactions can be easily determined and incorporated into the model; the method generalizes well on out-of-sample data
The Fraud Surrogates Used as Dependent Variables: Independent Medical Exam (IME) requested; Special Investigation Unit (SIU) referral; IME successful; SIU successful. Data: Detailed Auto Injury Claim Database for Massachusetts, Accident Years ( )
Predictor Variables Claim file variables –Provider bill, Provider type –Injury Derived from claim file variables –Attorneys per zip code –Docs per zip code Using external data –Average household income –Households per zip
Different Kinds of Decision Trees Single Trees (CART, CHAID) Ensemble Trees, a more recent development (TREENET, RANDOM FOREST) –A composite or weighted average of many trees (perhaps 100 or more)
Non-Tree Methods: MARS – Multivariate Adaptive Regression Splines; Neural Networks; Naïve Bayes (baseline); Logistic Regression (baseline)
Illustrations Dependent variables –Paid auto BI losses –Fraud surrogates: IME Requested; SIU Requested Predictors –Provider 2 Bill and other variables –Known to have nonlinear relationship The following illustrations will show how some of the techniques model nonlinearity using these predictor and dependent variables
Classification and Regression Trees (CART) Tree splits are binary. If the variable is numeric, the split is based on R², or the sum or mean of squared errors –For any variable, choose the two-way split of the data that reduces the MSE the most –Do this for all independent variables –Choose the variable that reduces the squared errors the most. When the dependent variable is categorical, other goodness-of-fit measures (Gini index, deviance) are used
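The split search described above can be sketched in a few lines of Python. The data here are invented toy values, not the Massachusetts claim database:

```python
# Sketch of CART's first split on a numeric predictor: try each two-way
# partition and keep the cutpoint that most reduces the sum of squared errors.

def sse(ys):
    """Sum of squared deviations around the group mean."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(xs, ys):
    best_cut, best_sse = None, float("inf")
    for cut in sorted(set(xs))[1:]:          # candidate cutpoints
        left = [y for x, y in zip(xs, ys) if x < cut]
        right = [y for x, y in zip(xs, ys) if x >= cut]
        total = sse(left) + sse(right)
        if total < best_sse:
            best_cut, best_sse = cut, total
    return best_cut, best_sse

bills = [100, 200, 300, 5000, 6000, 7000]   # toy provider bills
paid = [1, 2, 1, 10, 11, 12]                # toy paid losses
cut, err = best_split(bills, paid)          # separates low from high bills
```

On the talk's actual database, this kind of search is what selects $5,021 as the first cutpoint on the provider bill.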
CART – Example of the 1st split on Provider 2 Bill, with Paid as the dependent variable. For the entire database, the total squared deviation of paid losses around the predicted value (i.e., the mean) is 4.95×10¹³. The SSE declines to 4.66×10¹³ after the data are partitioned using $5,021 as the cutpoint. Any other partition of the provider bill produces a larger SSE than 4.66×10¹³. For instance, if a cutpoint of $10,000 is selected, the SSE is 4.76×10¹³.
Continue splitting to get more homogeneous groups at the terminal nodes
CART
Ensemble Trees: Fit More Than One Tree – Fit a series of trees; each tree added improves the fit of the model; average or sum the results of the fits. There are many methods to fit the trees and prevent overfitting: boosting (Iminer Ensemble and Treenet); bagging (Random Forest)
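A minimal boosting sketch of the idea above – each one-split "stump" is fit to the residuals of the ensemble so far, and the weighted predictions are summed. This illustrates the general mechanism, not Treenet's exact algorithm; the data are invented:

```python
# Boosting sketch: fit a sequence of one-split regression trees ("stumps"),
# each to the residuals of the current ensemble, and sum weighted predictions.

def fit_stump(xs, ys):
    """One binary split minimizing SSE; returns a prediction function."""
    best = None
    for cut in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < cut]
        right = [y for x, y in zip(xs, ys) if x >= cut]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, cut, lm, rm)
    _, cut, lm, rm = best
    return lambda x, c=cut, a=lm, b=rm: a if x < c else b

def boost(xs, ys, n_trees=20, rate=0.5):
    pred = [0.0] * len(ys)
    trees = []
    for _ in range(n_trees):
        resid = [y - p for y, p in zip(ys, pred)]   # what is still unexplained
        tree = fit_stump(xs, resid)
        trees.append(tree)
        pred = [p + rate * tree(x) for p, x in zip(pred, xs)]
    return lambda x: sum(rate * t(x) for t in trees)

model = boost([1, 2, 3, 4, 5, 6], [1.0, 1.2, 0.9, 5.0, 5.2, 4.9])
```

The learning rate weights each tree's contribution, which is one of the devices used to prevent overfitting.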
Create a Sequence of Predictions: First Tree Prediction, Second (Weighted) Tree Prediction, …
Treenet Prediction of IME Requested
Random Forest: IME vs. Provider 2 Bill
Neural Networks
Classical Statistics: Regression – One of the most common methods of fitting a function is linear regression, which models the relationship between two variables by fitting a straight line through the points. A lot of infrastructure is available for assessing the appropriateness of the model.
Neural Networks – Also minimize the squared deviation between fitted and actual values; can be viewed as a non-parametric, non-linear regression
Hidden Layer of a Neural Network (Input Transfer Function)
The Activation Function (Transfer Function) – the sigmoid (logistic) function, f(x) = 1 / (1 + e⁻ˣ)
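The sigmoid activation and a one-hidden-node forward pass can be sketched as follows; the weights here are made up for illustration, not fitted values:

```python
import math

# The logistic (sigmoid) activation function used in the hidden layer.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# A single hidden node: squash a weighted input, then weight the hidden
# output (illustrative weights; a real network fits these to data).
def forward(x, w1=0.002, b1=-3.0, w2=1.0, b2=0.0):
    hidden = sigmoid(w1 * x + b1)
    return w2 * hidden + b2
```

The sigmoid is bounded in (0, 1) and equals 0.5 at zero, which is what lets the network bend a straight line into an S-shaped response.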
Neural Network: Provider 2 Bill vs. IME Requested
MARS: Provider 2 Bill vs. IME Requested
How MARS Fits a Nonlinear Function
MARS fits a piecewise regression
–BF1 = max(0, X − 1,401.00)
–BF2 = max(0, 1,401.00 − X)
–BF3 = max(0, X − c), for a second knot c
–Y = b₀ + b₁·BF1 + b₂·BF2 + b₃·BF3, with fitted coefficients on the order of 10⁻³
–BF1, BF2, BF3 are basis functions
MARS uses statistical optimization to find the best basis function(s)
A basis function is similar to a dummy variable in regression – like a combination of a dummy indicator and a linear independent variable
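The hinge pair at the 1,401 knot can be written out directly. The coefficients below are placeholders for illustration, not the fitted values from the talk:

```python
# MARS "hinge" basis functions for the knot at 1,401 on the provider bill.
def bf1(x, knot=1401.0):
    return max(0.0, x - knot)   # active to the right of the knot

def bf2(x, knot=1401.0):
    return max(0.0, knot - x)   # mirror hinge, active to the left

# Piecewise-linear fit: the slope changes at the knot.
# b0, b1, b2 are illustrative placeholders, not the paper's coefficients.
def mars_predict(x, b0=0.1, b1=2e-5, b2=-5e-5):
    return b0 + b1 * bf1(x) + b2 * bf2(x)
```

Because each hinge is zero on one side of the knot, the pair together behaves like a dummy indicator multiplied by a linear term – exactly the "combination" described above.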
The Basis Functions: Category Grouping and Interactions
Injury type 4 (neck sprain) and type 5 (back sprain) increase faster and have higher scores than the other injury types
–BF1 = max(0, X − k), for a knot k
–BF2 = (INJTYPE = 4 OR INJTYPE = 5)
–BF3 = max(0, X − 159) * BF2
–Y = b₀ + b₁·BF1 + b₂·BF2 + b₃·BF3
where X is the provider bill and INJTYPE is the injury type
Interactions: MARS Fit
Baseline Method: Naive Bayes Classifier
Naive Bayes assumes the features (predictor variables) are independent conditional on each category. The probability that an observation X will have a specific set of values for the independent variables is then the product of the conditional probabilities of observing each of the values given the target category c_j, j = 1 to m (m is typically 2).
Naïve Bayes Formula
P(c_j | X) = P(c_j) · Π_i P(x_i | c_j) / P(X), where the denominator P(X) is a constant across categories
Naïve Bayes (cont.) – Handles only classification problems (categorical dependent variables) and only categorical independent variables. Create groupings for numeric variables such as provider bill and age –We split provider bill and age into quintiles (5 groups)
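A minimal naive Bayes sketch on binned (categorical) predictors, matching the setup above; the bill/provider categories, counts, and labels here are invented for illustration:

```python
from collections import defaultdict

# Tiny naive Bayes classifier: prior * product of conditional probabilities,
# with Laplace smoothing so unseen feature values do not zero out the product.
def train(rows, labels):
    prior = defaultdict(int)              # counts per target category
    cond = defaultdict(int)               # counts per (category, slot, value)
    for feats, y in zip(rows, labels):
        prior[y] += 1
        for i, v in enumerate(feats):
            cond[(y, i, v)] += 1
    n = len(labels)

    def predict(feats):
        best, best_p = None, -1.0
        for y, ny in prior.items():
            p = ny / n                    # prior probability of the category
            for i, v in enumerate(feats):
                # +1/+2 Laplace smoothing (assumes ~2 values per feature)
                p *= (cond[(y, i, v)] + 1) / (ny + 2)
            if p > best_p:
                best, best_p = y, p
        return best

    return predict

# Invented training data: (bill quintile, provider type) -> IME surrogate
rows = [("high", "chiro"), ("high", "chiro"), ("low", "md"), ("low", "md")]
labels = ["IME", "IME", "noIME", "noIME"]
predict = train(rows, labels)
```

Because the features are assumed conditionally independent, training reduces to counting, which is why the method is so computationally efficient.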
Naïve Bayes: Probability of Independent Variable Given No IME / IME
Putting it together to make a prediction: plug in specific values for Provider 2 Bill and provider type and get a prediction – chain together the product of the conditional probability of the bill category, the conditional probability of the provider type, and the probability of IME
Advantages/Disadvantages – Computationally efficient; has performed well under many circumstances. The assumption of conditional independence often does not hold; cannot be used for numeric variables without grouping them
Naïve Bayes Predicted IME vs. Provider 2 Bill
True/False Positives and True/False Negatives (Type I and Type II Errors) – The “Confusion” Matrix. Choose a “cut point” in the model score; claims scoring above the cut point are classified “yes”.
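Counting the four cells of the confusion matrix at a chosen cut point is straightforward; a small sketch with invented scores and outcomes:

```python
# Confusion-matrix counts at a chosen cut point: scores above the cut
# point are classified "yes" (positive).
def confusion(scores, actual, cut):
    tp = sum(1 for s, a in zip(scores, actual) if s > cut and a)        # true positives
    fp = sum(1 for s, a in zip(scores, actual) if s > cut and not a)    # false positives (Type I)
    fn = sum(1 for s, a in zip(scores, actual) if s <= cut and a)       # false negatives (Type II)
    tn = sum(1 for s, a in zip(scores, actual) if s <= cut and not a)   # true negatives
    return tp, fp, fn, tn

scores = [0.9, 0.8, 0.4, 0.3, 0.2]            # invented model scores
actual = [True, False, True, False, False]    # invented outcomes
tp, fp, fn, tn = confusion(scores, actual, cut=0.5)
```

Sensitivity is tp / (tp + fn) and specificity is tn / (tn + fp); both shift as the cut point moves.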
ROC Curves and Area Under the ROC Curve – We want good performance on both sensitivity and specificity. Sensitivity and specificity depend on the cut point chosen –Choose a series of different cut points, and compute sensitivity and specificity for each of them –Graph the results, plotting sensitivity vs. 1 − specificity –Compute an overall measure of “lift”: the area under the curve
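The cut-point sweep and the area computation described above can be sketched as follows (trapezoid rule over the ROC points; the scores and outcomes are invented):

```python
# ROC sketch: sweep cut points, collect (1 - specificity, sensitivity)
# pairs, and integrate the area under the curve with the trapezoid rule.
def roc_points(scores, actual):
    pos = sum(actual)
    neg = len(actual) - pos
    pts = [(0.0, 0.0)]
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, a in zip(scores, actual) if s >= cut and a)
        fp = sum(1 for s, a in zip(scores, actual) if s >= cut and not a)
        pts.append((fp / neg, tp / pos))   # (1 - specificity, sensitivity)
    pts.append((1.0, 1.0))
    return pts

def auroc(pts):
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

scores = [0.9, 0.8, 0.4, 0.3, 0.2]   # invented model scores
actual = [1, 1, 0, 1, 0]             # invented outcomes
auc = auroc(roc_points(scores, actual))
```

A model with no lift traces the diagonal (AUROC = 0.5); the 0.701 and 0.612 figures on the following slides are computed the same way on the real data.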
TREENET ROC Curve – IME: AUROC = 0.701
Logistic ROC Curve – SIU: AUROC = 0.612
Ranking of Methods/Software – IME Requested
Some Software Packages That Can Be Used: Excel; Access; free software: R, web-based software; S-Plus (a commercial product similar to R); SPSS; CART/MARS; data mining suites (SAS Enterprise Miner, SPSS Clementine)
References
Derrig, R., and Francis, L., “Distinguishing the Forest from the Trees: A Comparison of Tree Based Data Mining Methods”, CAS Winter Forum, March 2006.
Derrig, R., and Francis, L., “A Comparison of Methods for Predicting Fraud”, Risk Theory Seminar, April 2006.
Francis, L., “Taming Text: An Introduction to Text Mining”, CAS Winter Forum, March 2006.
Francis, L.A., “Neural Networks Demystified”, Casualty Actuarial Society Forum, Winter 2001.
Francis, L.A., “Martian Chronicles: Is MARS Better than Neural Networks?”, Casualty Actuarial Society Forum, Winter 2003.
Dhar, V., Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997.
The web site has some tutorials and presentations.