Multi-Dimensional Credibility Excess Work Comp Application.

Presentation transcript:

Multi-Dimensional Credibility Excess Work Comp Application

Simplified Version of Least Squares Credibility in General

2 Loss Models Setup
• An individual insured (policyholder) has $n$ iid observations $X_1, \dots, X_n$ whose distribution depends on a parameter $\theta$
• $\theta$ is an instance of a random variable $\Theta$ with density $\pi(\theta)$
• Define $\mu(\theta) = E(X_j \mid \theta)$ and $v(\theta) = \mathrm{Var}(X_j \mid \theta)$
• $\mu(\theta)$ is called the hypothetical mean and $v(\theta)$ is the process variance
  – In classical statistics, $\mu(\theta)$ is called the population mean, but Charles Hewitt, a Bayesian, considered that to be a model construct, not a truly existing entity, and so called it hypothetical, and the terminology has persisted
• Let $\mu = E[\mu(\Theta)]$, $v = E[v(\Theta)]$, $a = \mathrm{Var}[\mu(\Theta)]$
• $v$ is the expected process variance and $a$ is the variance of hypothetical means
• Bühlmann: estimate $\mu(\theta)$ linearly by $a_0 + \sum_j a_j X_j$, minimizing expected squared error
• Answer is $a_0 = (1 - z)\mu$, $a_i = z/n$ for $i > 0$, where $z = n/(n + k)$, $k = v/a$
• Estimates $\mu(\theta)$ by $zX^* + (1 - z)\mu = \mu + z(X^* - \mu) = EX^* + z(X^* - EX^*)$
• We will generalize the left side, but derive the right side
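As a rough illustration of the Bühlmann formula on this slide, the sketch below computes z = n/(n + k) with k = v/a and the resulting estimate. The function name and all numbers are hypothetical, chosen only so the arithmetic is easy to follow; they are not from the presentation.

```python
def buhlmann_estimate(observations, prior_mean, epv, vhm):
    """Return (z, estimate) for the Buhlmann credibility estimate.

    prior_mean is mu, epv is v (expected process variance) and
    vhm is a (variance of hypothetical means).
    """
    n = len(observations)
    sample_mean = sum(observations) / n   # X*
    k = epv / vhm                         # k = v / a
    z = n / (n + k)
    return z, z * sample_mean + (1 - z) * prior_mean


# Hypothetical inputs: four observations with mean 11, k = 8 / 2 = 4.
z, estimate = buhlmann_estimate(
    [12.0, 8.0, 13.0, 11.0], prior_mean=10.0, epv=8.0, vhm=2.0
)
print(z, estimate)
```

With four observations and k = 4 the sample mean and the overall mean get equal weight, so the estimate sits halfway between them.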

3 Simplified Version
• Let $X^*$ be the mean of the $X_j$'s
• Bühlmann's result is to estimate $\mu(\theta)$ by $zX^* + (1 - z)\mu$.
• Derivation of $z$ is much simpler if you start with that instead of $a_0 + \sum_j a_j X_j$.
• Not giving up much by this simplification because best linear estimate of the mean is the sample mean.
• Assumptions imply $\mu(\theta) = X^* + v(\theta)^{1/2}\varepsilon = \mu + a^{1/2}\delta$, where $\varepsilon$ and $\delta$ are independent mean 0, variance 1 deviations.
• Generalize this to having two estimators $X$ and $Y$ of $C$ with expected squared errors of $s^2$ and $t^2$, respectively, where $s$ and $t$ might even be random variables themselves.
• Find $z$ that minimizes $E\{[C - zX + (z - 1)Y]^2\}$

4 Finding z
• Find $z$ that minimizes $E\{[C - zX + (z - 1)Y]^2\}$
• $X = C + s\varepsilon$, $Y = C + t\delta$
• Set derivative to zero
  – $0 = E\{[C - zX + (z - 1)Y][Y - X]\} = E\{[-zs\varepsilon + (z - 1)t\delta][t\delta - s\varepsilon]\} = E[zs^2\varepsilon^2 + (z - 1)t^2\delta^2] = zE[s^2] + (z - 1)E[t^2]$
  – Thus $z = E(t^2) / [E(s^2) + E(t^2)]$
• In the credibility model $E(s^2)$ is the expected process variance and $t^2$ is already a constant: the variance of the hypothetical means
• Also $z = [1/E(s^2)] / [1/E(s^2) + 1/E(t^2)]$, so the weight on $X$ is proportional to the reciprocal of its variance, and similarly for $Y$
• This is a standard statistical result
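The result on this slide can be checked numerically. The following is a minimal sketch, assuming constant s and t and standard normal deviations (the slide only requires mean 0 and variance 1); the function names and the values s = 2, t = 1 are hypothetical. It simulates the expected squared error and compares the formula's z with two nearby weights.

```python
import random


def credibility_weight(expected_s2, expected_t2):
    """Weight on X: z = E(t^2) / (E(s^2) + E(t^2))."""
    return expected_t2 / (expected_s2 + expected_t2)


def mean_squared_error(z, s, t, trials=20000, seed=1):
    """Simulate E{[C - zX - (1 - z)Y]^2} with X = C + s*eps, Y = C + t*delta."""
    rng = random.Random(seed)  # same seed for every z, so comparisons share draws
    total = 0.0
    for _ in range(trials):
        c = rng.gauss(0.0, 1.0)           # the value of C cancels out of the error
        x = c + s * rng.gauss(0.0, 1.0)   # estimator with error variance s^2
        y = c + t * rng.gauss(0.0, 1.0)   # estimator with error variance t^2
        total += (c - z * x - (1 - z) * y) ** 2
    return total / trials


s, t = 2.0, 1.0
z_star = credibility_weight(s ** 2, t ** 2)  # 1 / (4 + 1) = 0.2
for z in (0.1, z_star, 0.3):
    print(z, mean_squared_error(z, s, t))
```

The noisier estimator X gets the smaller weight, and the simulated error is lowest at the formula's z among the three weights tried.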

Excess Pricing for Work Comp Classes

6 Workers Compensation Excess Pricing Model
• Bureau excess prices traditionally based on hazard groups
• Excess potential is very different across hazard groups
  – but also within hazard groups
• Bureau methodology weights injury-type severity distributions by hazard group injury-type frequency splits
• Can do that by class
  – Requires credibility procedure to get class distribution of losses by injury type

7 Severity by Injury Type, Massachusetts: Large Loss Potential Is Driven by Fatal, PT
                               Fatal        PT           Major      Minor     TT
Mean                           $411,287     $896,725     $137,163   $15,826   $12,367
95th Percentile                $1,285,878   $2,566,482   $307,876   $42,187   $49,050
Ratio to TT, Mean              …            …            …          …         …
Ratio to TT, 95th Percentile   …            …            …          …         …

8 Differences in Injury-Type Frequencies Across and Within Hazard Groups: Ratios to Temporary Total
Means
HG   Fatal:TT   PT:TT   Major:TT
1    0.21%      0.33%   6.10%
2    0.28%      0.44%   7.06%
3    0.69%      0.74%   11.61%
4    1.83%      1.44%   27.27%
95th Percentile Class*
HG   Fatal:TT   PT:TT
…    …          0.74%
…    …          1.47%
…    …          2.66%
…    …          2.77%
*95th percentile of larger classes
Hazard group means are very different, but significant variation exists within each hazard group

9 Correlation of Ratios to TT Across Classes, Hazard Group III
         PT     Major   Minor
Fatal    39%    45%     20%
PT              52%     31%
Major                   28%
Use correlations to better estimate class frequencies. Major predictive of fatal and PT.

Credibility Including Correlation

11 Credibility with Correlation
• Denote by V, W, X, Y the class ratios to TT for Fatal, PT, Major, and Minor
• Credibility formula for Fatal for class i:
  – $Ev_i + b(V_i - EV_i) + c(W_i - EW_i) + d(X_i - EX_i) + e(Y_i - EY_i)$
  – Here $Ev_i = EV_i$ is the hazard group mean for Fatal:TT; $b$ is the usual $z$
• Example credibilities for fatal for a class in HG III with 300 TT claims
  – b = 32.6%, c = 5.0%, d = 1.3%, e = 0.2%
• Major frequency is over 15 times fatal
  – so factor of 1.3% is in the ballpark of being like 20% for fatal
• Minor frequency is over 50 times fatal
  – so factor of 0.2% has the impact of a factor of 10% for fatal (assuming differences from mean are of same magnitude as the mean)
• How are these estimated?
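To make the formula and the "ballpark" remarks concrete, here is a small sketch that applies the slide's example weights (b, c, d, e) to one class. The hazard group means and class ratios are hypothetical placeholders, not figures from the presentation, and the function name is an assumption for illustration.

```python
def fatal_estimate(hg_mean, class_ratio, weights):
    """Ev_i + b(V_i - EV_i) + c(W_i - EW_i) + d(X_i - EX_i) + e(Y_i - EY_i)."""
    estimate = hg_mean["fatal"]
    for injury_type, weight in weights.items():
        estimate += weight * (class_ratio[injury_type] - hg_mean[injury_type])
    return estimate


# Example credibilities from the slide (HG III class with 300 TT claims).
weights = {"fatal": 0.326, "pt": 0.050, "major": 0.013, "minor": 0.002}

# Hypothetical ratios to TT, for illustration only.
hg_mean = {"fatal": 0.007, "pt": 0.007, "major": 0.12, "minor": 0.40}
class_ratio = {"fatal": 0.010, "pt": 0.009, "major": 0.15, "minor": 0.45}

estimate = fatal_estimate(hg_mean, class_ratio, weights)

# Rescaling the small weights by the frequency multiples quoted on the slide.
major_equivalent = weights["major"] * 15   # about 20%
minor_equivalent = weights["minor"] * 50   # about 10%
print(estimate, major_equivalent, minor_equivalent)
```

The rescaling line mirrors the slide's reasoning: a deviation in the Major ratio is typically many times larger in absolute size than a deviation in the Fatal ratio, so a small coefficient can still move the estimate noticeably.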

12 Credibility with Correlation
Denote four injury types by V, W, X, and Y. For the i-th class, denote the population mean ratios (i.e., the true conditional, or "hypothetical", means) as $v_i$, $w_i$, $x_i$, and $y_i$. Here these are mean ratios to TT.

13 Notation
We observe each class $i$ for each time period $t$.
Denote by $W_i$ the class sample mean ratio for all time periods weighted by exposures $m_{it}$ (TT claims), where there are $N$ periods of observation. Similarly for V, X, and Y.
Let $m_i$ denote the sum over the time periods $t$ of the $m_{it}$; $m$ is the sum over classes $i$ of the $m_i$.
Then within class $i$, $\mathrm{Var}(W_{it} \mid w_i) = \sigma_{W_i}^2 / m_i$
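The exposure-weighted class sample mean described here can be sketched as follows. The claim counts are hypothetical, and the helper name is an assumption for illustration.

```python
def weighted_class_mean(period_ratios, tt_claims):
    """Exposure-weighted mean of the period ratios W_it, using weights m_it."""
    m_i = sum(tt_claims)  # total TT claims for the class over all periods
    return sum(m * w for m, w in zip(tt_claims, period_ratios)) / m_i


# Hypothetical class observed for N = 3 periods.
tt_claims = [100, 50, 150]   # m_it
pt_claims = [1, 0, 2]        # PT claim counts in the same periods
pt_ratios = [pt / tt for pt, tt in zip(pt_claims, tt_claims)]  # W_it

W_i = weighted_class_mean(pt_ratios, tt_claims)
print(W_i)
```

Weighting each period's ratio by its TT claims gives the same number as dividing total PT claims by total TT claims, here 3 / 300.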

14 Assume a linear model and minimize expected squared error, where expectation is taken across all classes in the hazard group. For PT this can be expressed as minimizing:
$E[(a + bV_i + cW_i + dX_i + eY_i - w_i)^2]$
The coefficients sought are a, b, c, d, and e. Differentiating wrt a gives:
$a = -E(bV_i + cW_i + dX_i + eY_i - w_i)$
Plugging in that for a makes the estimate of $w_i$:
$Ew_i + b(V_i - EV_i) + c(W_i - EW_i) + d(X_i - EX_i) + e(Y_i - EY_i)$

15 We have the estimate of $w_i$:
$Ew_i + b(V_i - EV_i) + c(W_i - EW_i) + d(X_i - EX_i) + e(Y_i - EY_i)$
Since in taking the mean across classes $Ew_i = EW_i$, $c$ is the traditional credibility factor $z$.
The derivative of $E[(a + bV_i + cW_i + dX_i + eY_i - w_i)^2]$ wrt b gives:
$aEV_i + E[V_i(bV_i + cW_i + dX_i + eY_i - w_i)] = 0$
Plugging in for a then yields:
$0 = -E(bV_i + cW_i + dX_i + eY_i - w_i)\,EV_i + E[bV_i^2 + cV_iW_i + dV_iX_i + eV_iY_i - V_iw_i]$
Using $\mathrm{Cov}(X,Y) = E[XY] - EX\,EY$, this can be rearranged to give:
$\mathrm{Cov}(V_i, w_i) = b\,\mathrm{Var}(V_i) + c\,\mathrm{Cov}(V_i, W_i) + d\,\mathrm{Cov}(V_i, X_i) + e\,\mathrm{Cov}(V_i, Y_i)$   (3)
Doing the same for c, d, and e will yield three more equations that look like (3), but with the variance moving over one position each time. Thus you will end up with four equations that can be written as a single matrix equation:

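16
$$\begin{pmatrix} \mathrm{Cov}(V_i, w_i) \\ \mathrm{Cov}(W_i, w_i) \\ \mathrm{Cov}(X_i, w_i) \\ \mathrm{Cov}(Y_i, w_i) \end{pmatrix} = C \begin{pmatrix} b \\ c \\ d \\ e \end{pmatrix}$$
where $C$ is the covariance matrix of the class by injury-type sample means, $\mathrm{Cov}(V_i, Y_i)$ etc.
You need estimates of all covariances, like estimating the EPV and VHM.
But with these you can solve this equation for b, c, d, and e to be used for PT. Repeat for the other injury types.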
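Once the covariances are estimated, finding b, c, d, and e is a four-by-four linear solve. The sketch below uses plain Gaussian elimination on hypothetical covariance values (ordered V, W, X, Y); none of the numbers come from the presentation, and estimating the covariances themselves is not shown. As a sanity check under the usual credibility assumptions, when the cross-covariances are zero the PT weight reduces to VHM / (VHM + EPV/m), the traditional z.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


# Hypothetical covariance matrix C of the class sample means (V, W, X, Y).
C = [
    [4.0, 1.0, 0.5, 0.2],
    [1.0, 3.0, 0.8, 0.3],
    [0.5, 0.8, 5.0, 0.6],
    [0.2, 0.3, 0.6, 6.0],
]
# Hypothetical covariances of each sample mean with the true PT mean w_i.
g = [0.6, 1.2, 0.5, 0.2]
b, c, d, e = solve_linear_system(C, g)

# No-correlation check: Var(W_i) = VHM + EPV/m = 1 + 3 and Cov(W_i, w_i) = VHM = 1.
diag = [[2.0, 0, 0, 0], [0, 4.0, 0, 0], [0, 0, 5.0, 0], [0, 0, 0, 6.0]]
uncorrelated = solve_linear_system(diag, [0.0, 1.0, 0.0, 0.0])
print([b, c, d, e], uncorrelated)
```

In the no-correlation case only the PT weight is nonzero and equals 1 / (1 + 3) = 0.25, which matches the one-dimensional formula.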

17 How’s That Working for You?

18 Comparison to NCCI Hazard Groups: Sum of Squared Errors for PT/TT Ratios, Three Odd Years Predicted from Three Even Years

19 Comparison to NCCI Hazard Groups: Sum of Squared Errors for Injury Type Ratios to TT, Three Odd Years Predicted from Three Even Years
Conclusion: Slight improvement by this measure

20 Other Tests
• Individual class ratios are highly variable
• Grouping classes might show up the effects better
• Quintiles test for a hazard group
  – Group the classes in the hazard group into 5 sets based on ranking predicted ratio of injury count types to TT
  – Look at actual vs. predicted for those sets
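A quintiles test along these lines might be sketched as below. The slide does not say whether the five sets hold equal numbers of classes or equal exposure; this sketch assumes equal numbers of classes and at least five classes, and all data are hypothetical.

```python
def quintile_test(classes):
    """Compare predicted and actual ratios to TT by predicted-ratio quintile.

    classes: list of (predicted_ratio, actual_count, tt_claims) tuples.
    Returns one (predicted_ratio, actual_ratio) pair per quintile.
    """
    ranked = sorted(classes, key=lambda c: c[0])   # rank by predicted ratio
    n = len(ranked)
    results = []
    for q in range(5):
        group = ranked[q * n // 5:(q + 1) * n // 5]
        tt = sum(c[2] for c in group)
        predicted = sum(c[0] * c[2] for c in group) / tt  # TT-weighted prediction
        actual = sum(c[1] for c in group) / tt            # observed count / TT
        results.append((predicted, actual))
    return results


# Hypothetical classes: (predicted PT/TT ratio, actual PT count, TT claims).
classes = [
    (0.004, 1, 200), (0.012, 3, 300), (0.006, 1, 100), (0.009, 2, 250),
    (0.015, 5, 400), (0.005, 0, 150), (0.010, 4, 350), (0.007, 2, 300),
    (0.013, 2, 200), (0.008, 1, 100),
]
results = quintile_test(classes)
for predicted, actual in results:
    print(round(predicted, 5), round(actual, 5))
```

If the predictions carry real information, the actual ratios should tend to rise across the five sets along with the predicted ones.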

21 Hazard Group D Quintiles Test for PT / TT Ratios

22 Sum of squared prediction errors: credibility better except for HG A Fatal and PT

23 Distribution of Credibility Indicated Class Means within Hazard Groups Ratio of PT / TT Counts

24 Distribution of Credibility Indicated Class Means within Hazard Groups Ratio of Major / TT Counts