Free and Cheap Sources of External Data CAS 2007 Predictive Modeling Seminar Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc.


1 Free and Cheap Sources of External Data CAS 2007 Predictive Modeling Seminar Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc. Louise_francis@msn.com www.data-mines.com

2 Objectives Information sharing Introduce some useful sources of data to augment company internal databases Show examples of applications using external data

3 Why Augment Data? For small companies or new lines of business, internal data may not be sufficient Add variables (e.g., demographic and economic) that are not in the internal data

4 Some Kinds of External Data Demographic Geographic Economic –Unemployment rate, avg wage, etc –Financial Market Insurance data Occupational Weather

5 Zip Code Level Data Census Bureau web site, www.census.gov, has a wealth of information May require some processing effort to put into useful format for analysis For a small fee there are vendors who pre-process some of the useful data One of them is zip-codes.com

6 Zip-codes.com

7 Some Useful Variables Average Income Population Average house value # people per house Latitude, longitude –Use to compute distances City, county

8 Distance formula
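
The formula shown on this slide is not preserved in the transcript. As a minimal sketch, assuming the goal is a great-circle distance between two ZIP code centroids given their latitude and longitude, the haversine formula is one common choice. The function name and the Earth-radius constant below are illustrative assumptions, not taken from the presentation.

```python
import math

# Mean Earth radius in miles; an approximation (assumption, not from the slides).
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))
```

A derived variable such as "distance from the insured's ZIP code to the provider's ZIP code" could then be computed by passing the two centroids to `haversine_miles`.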

9 The Data

10 California Auto Data by ZIP BI Exposures BI Losses BI Claims PD Exposures PD Losses PD Claims
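
With exposures, losses, and claim counts available by ZIP code, the usual actuarial ratios can be derived directly. The sketch below uses the standard definitions (frequency = claims / exposures, severity = losses / claims, pure premium = losses / exposures); the function name and the sample figures are made up for illustration and are not from the California data.

```python
def loss_ratios(exposures, losses, claims):
    """Return (frequency, severity, pure_premium) for one ZIP code."""
    frequency = claims / exposures if exposures else 0.0
    severity = losses / claims if claims else 0.0
    pure_premium = losses / exposures if exposures else 0.0
    return frequency, severity, pure_premium


# Made-up records: ZIP -> (BI exposures, BI losses, BI claims)
sample_bi = {
    "90001": (1000.0, 250000.0, 50),
    "90002": (400.0, 60000.0, 10),
}

bi_by_zip = {zip_code: loss_ratios(*row) for zip_code, row in sample_bi.items()}
```

The same function applies unchanged to the PD columns.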

11 CAARP Data CAARP data California Auto Assigned Risk Plan Collected by state Aggregated data Request from Statistical Analysis Division of department

12 California Proposed Changes to Territory Rating

13 Effect of Change by County

14 Effect of Change by Pure Premium Group

15 Effect of Change by Average House Value

16 Effect of Change by Average Income

17 The Data used for Fraud Model Described in “Distinguishing the Forest From the Trees”, Derrig and Francis, 2005 CAS Winter Forum

18 The Fraud Surrogates used as Dependent Variables Independent Medical Exam (IME) requested Special Investigation Unit (SIU) referral –(IME successful) –(SIU successful) Data: Detailed Auto Injury Claim Database for Massachusetts Accident Years (1995-1997)

19 Predictor Variables Claim file variables –Provider bill, Provider type –Injury Derived from claim file variables –Attorneys per zip code –Docs per zip code Using external data –Average household income –Households per zip
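
The derived and external predictors above can be sketched as a group-by followed by a lookup on ZIP code. This is only an illustration: the field names, the sample records, and the reading of "attorneys per zip code" as a count of distinct attorneys appearing on claims in that ZIP are all assumptions, not the definitions used in the Derrig and Francis study.

```python
from collections import defaultdict

# Hypothetical claim records and external ZIP-level data (all values made up).
claims = [
    {"claim_id": 1, "zip": "02101", "attorney": "A"},
    {"claim_id": 2, "zip": "02101", "attorney": "B"},
    {"claim_id": 3, "zip": "02101", "attorney": "A"},
    {"claim_id": 4, "zip": "02139", "attorney": None},
]
external = {"02101": {"avg_household_income": 60000, "households": 5000}}


def attorneys_per_zip(claim_rows):
    """Count distinct attorneys seen on claims in each ZIP code."""
    seen = defaultdict(set)
    for row in claim_rows:
        if row["attorney"] is not None:
            seen[row["zip"]].add(row["attorney"])
    return {zip_code: len(names) for zip_code, names in seen.items()}


def add_zip_features(claim_rows, external_by_zip):
    """Attach the derived count and the external variables to every claim."""
    counts = attorneys_per_zip(claim_rows)
    enriched = []
    for row in claim_rows:
        out = dict(row)
        out["attorneys_in_zip"] = counts.get(row["zip"], 0)
        ext = external_by_zip.get(row["zip"], {})
        # ZIP codes missing from the external file are left as None.
        out["avg_household_income"] = ext.get("avg_household_income")
        out["households"] = ext.get("households")
        enriched.append(out)
    return enriched
```

ZIP codes are kept as strings so that leading zeros, as in Massachusetts ZIP codes, survive the join.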

20 Neural Network Ranking of Variables

21 Variable Importance for IME Requested for 3 Methods

22 Variable Importance (IME) Based on Average of Methods

23 Trends Using External Information People still rely on Masterson's indices and other indices based on the CPI Shortcomings –Hedonic adjustment –Substitution –Imputed rental cost –Geometric chaining –See www.shadowstats.com or Getting Prices Right by Economic Policy Institute and Dean Baker Insurance inflation has typically been much higher than these indications Many need reliable trend indications on smaller segments of their data Trend is another weak link in the modeling process

24 Questions?

