Similarity Modeling Chuck Boucek

Presentation transcript:

Similarity Modeling. Chuck Boucek, Insurance and Actuarial Advisory Services. CAS Seminar on Predictive Modeling, Las Vegas, Nevada, October 11 – 12, 2007. www.ey.com/us/actuarial

Agenda: Overview; Underwriting Model; Similarity Model (conceptual framework, process, model results); Market Deployment.

Overview: A company's book of business is a function of its target market, its underwriting practices, and the location of its agencies. The underwriting model is based on the company's own data, yet the model is sometimes employed to score risks outside of the target market. Company approaches to addressing policyholders at the extremes of historical data: limits on the risks that are scored via the model; manual underwriting of selected groups; a modeling approach.

Overview (continued): Business risks of predictive modeling: the model does not appropriately assess policyholders; model deployment worsens policyholder retention; the model is inappropriately extrapolated to policyholders. Can an underwriting model be used to score all potential policyholders? How should policyholders at the extremes of historical data be treated? Is there a structured way of assessing how similar a policyholder is to the policyholders in the data used to build the model?

[Figure: Underwriting Model – Lift Chart. Predicted vs. actual relative frequency. Source of all graphs: Ernst & Young Insurance and Actuarial Advisory Services.]

[Figure: Underwriting Model – Predictor Variable #1. Relative frequency (with 2 SE bands) and number of observations, by Credit Variable.]

[Figure: Underwriting Model – Predictor Variable #2. Relative frequency (with 2 SE bands) and number of observations, by Internal Class Variable (levels A, B, C).]

Similarity Model – Conceptual framework: Policyholders with greater certainty in the claim frequency prediction should score higher. Generally, there is greater uncertainty in predictions from areas with sparser data. We want to be able to assess similarity in a multivariate framework.

Similarity Model – Process: (1) Generate one record for every record in the modeling database. Each numeric variable is drawn from a uniform distribution over the range of its actual values; each factor variable has an equal frequency for each level of the factor; variables are limited to those in the underwriting model. (2) Create an indicator variable: 1 = record is from actual data, 0 = record is from generated data. (3) Perform a logistic regression with the indicator variable as the response variable. The response variable is assumed to be binomial, with a logit link function.
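The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the presenter's implementation: the two variables (a credit variable in [0, 1] and a class factor with levels A, B, C), their distributions, the sample size, and the gradient-ascent fitting routine are all hypothetical choices made to keep the example self-contained.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
LEVELS = ["A", "B", "C"]

# Hypothetical "actual" modeling data: a credit variable in [0, 1] and a class
# factor with unequal level frequencies (both distributions are assumptions).
actual = [(rng.betavariate(2, 5), rng.choices(LEVELS, weights=[6, 3, 1])[0])
          for _ in range(500)]

# Step 1: one generated record per actual record. Numeric variable is uniform
# over the range of actual values; factor has equal frequency for each level.
lo = min(c for c, _ in actual)
hi = max(c for c, _ in actual)
generated = [(rng.uniform(lo, hi), LEVELS[i % len(LEVELS)])
             for i in range(len(actual))]

def features(credit, cls):
    """Intercept, credit, and dummy variables for classes B and C (A is the base)."""
    return [1.0, credit, float(cls == "B"), float(cls == "C")]

# Step 2: indicator variable, 1 = actual data, 0 = generated data.
rows = ([(features(*r), 1.0) for r in actual]
        + [(features(*r), 0.0) for r in generated])

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Step 3: logistic regression (binomial response, logit link), fitted here by
# plain gradient ascent on the mean log-likelihood.
beta = [0.0] * 4
for _ in range(300):
    grad = [0.0] * 4
    for x, y in rows:
        resid = y - sigmoid(sum(b * v for b, v in zip(beta, x)))
        for j, v in enumerate(x):
            grad[j] += resid * v
    beta = [b + g / len(rows) for b, g in zip(beta, grad)]

def similarity(credit, cls):
    """Fitted probability that a record with these values came from the actual data."""
    return sigmoid(sum(b * v for b, v in zip(beta, features(credit, cls))))

print(similarity(0.2, "A"), similarity(0.8, "C"))
```

In this sketch the fitted probability serves as the similarity score: a record from a well-populated region of the hypothetical data (low credit value, class A) scores higher than one from a sparse region (high credit value, class C). A linear term in the credit variable is a simplification; a production model would likely need a more flexible form.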

[Figure: Similarity Model – Predictor Variable #1. Number of observations by Credit Variable, shown separately for Indicator = 1 and Indicator = 0.]

[Figure: Similarity Model – Predictor Variable #2. Number of observations by Internal Class Variable (levels A, B, C), shown separately for Indicator = 1 and Indicator = 0.]

[Figure: Similarity Model Results – Average Scores. Average similarity score and number of observations by Credit Variable band (0-0.1, 0.1-0.2, 0.2-0.3, 0.3-0.4, 0.4-0.5, 0.5-0.6, 0.6-0.7, 0.7-0.8, 0.8-0.9, 0.9-1). Score labels in the order extracted: 0.98, 0.88, 0.8, 0.75, 0.66, 0.65, 0.56, 0.55, 0.63, 0.59.]

[Figure: Similarity Model Results – Average Scores. Average similarity score and number of observations by Internal Class Variable (levels A, B, C). Score labels in the order extracted: 0.95, 0.92, 0.8.]

Similarity Model – Summary: Advantages of this process: it produces results consistent with the desired characteristics; it is a structured way of sorting risks from the fringes of the distribution; it is straightforward conceptually; it employs a generalized linear model. This is not the only process that can be employed.

[Figure: Similarity Model – Impact on Lift Chart, Similarity Decile 10. Predicted vs. actual relative frequency.]

[Figure: Similarity Model – Impact on Lift Chart, Similarity Decile 2. Predicted vs. actual relative frequency.]

[Figure: Similarity Model – Impact on Lift Chart, Similarity Decile 1. Predicted vs. actual relative frequency.]

Similarity Model – Observations: Policyholders in similarity decile 1 should show more variability in actual results. The predicted spread in claim frequency is smaller in decile 10 than in decile 1. The policyholders in decile 10 tend to be alike.
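One way to set up the decile comparison is sketched below in Python. It assumes decile 1 holds the lowest similarity scores and decile 10 the highest, which matches the observations above but is not stated explicitly in the slides. The function names and the sample inputs are hypothetical, and the summary is simplified to one predicted-vs-actual pair per decile rather than a full lift chart within each decile.

```python
def assign_deciles(scores):
    """Return a decile (1 = lowest scores, 10 = highest) for each score, by rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    deciles = [0] * len(scores)
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // len(scores) + 1
    return deciles

def predicted_vs_actual(deciles, predicted, actual):
    """Mean predicted and mean actual claim frequency within each similarity decile."""
    totals = {}
    for d, p, a in zip(deciles, predicted, actual):
        n, sum_p, sum_a = totals.get(d, (0, 0.0, 0.0))
        totals[d] = (n + 1, sum_p + p, sum_a + a)
    return {d: (sum_p / n, sum_a / n) for d, (n, sum_p, sum_a) in sorted(totals.items())}

# Hypothetical similarity scores, one per policyholder.
scores = [0.95, 0.15, 0.55, 0.35, 0.75, 0.05, 0.85, 0.25, 0.65, 0.45]
deciles = assign_deciles(scores)
print(deciles)
```

Comparing the gap between mean predicted and mean actual frequency across deciles gives a rough numeric counterpart to reading the lift charts side by side.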

Market Deployment – Applications: Select a cutoff similarity score, based on the variability in the lift chart and the professional judgment of underwriting staff. Policyholders with lower similarity scores will receive greater underwriting attention. Monitor results by similarity group.
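A cutoff rule of this kind might look like the following Python sketch. The cutoff value of 0.6, the policy identifiers, the scores, and the group labels are all hypothetical; the slides leave the cutoff to lift-chart variability and underwriting judgment.

```python
from collections import Counter

CUTOFF = 0.6  # hypothetical value; in practice chosen from lift charts and judgment

def underwriting_path(similarity_score, cutoff=CUTOFF):
    """Scores below the cutoff are routed for greater underwriting attention."""
    return "model" if similarity_score >= cutoff else "manual review"

# Hypothetical similarity scores keyed by made-up policy identifiers.
scores = {"P001": 0.95, "P002": 0.58, "P003": 0.72, "P004": 0.31}
paths = {policy: underwriting_path(score) for policy, score in scores.items()}

# Group sizes are a starting point for monitoring results by similarity group.
group_sizes = Counter(paths.values())
print(paths, group_sizes)
```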